Users of immersive virtual reality (VR) are often observed to act realistically on social, behavioral, physiological, and subjective levels. However, experimental studies in the field typically collect and analyze metrics independently, which fails to account for the synchronous and multimodal nature of the original human activity. This paper concerns multimodal data capture and analysis in immersive collaborative virtual environments (ICVEs), with the aim of enabling a rich, holistic analysis based on techniques from interaction analysis. We present a reference architecture for collecting multimodal data specifically for immersive VR. It collates multiple components of a user's nonverbal and verbal behavior in a single log file, thereby preserving the temporal relationships between cues. Two case studies describing sequences of immersive avatar-mediated communication (AMC) demonstrate the ability of multimodal data to preserve a rich description of the original mediated social interaction. Analyses of the sequences using techniques from interaction analysis highlight the causal interrelationships between the captured components of human behavior, leading to a deeper understanding of how and why the communication may have unfolded. In presenting our logging architecture, we hope to initiate a discussion of a community-built logging standard, so that practitioners can share data and build better tools to analyze the utility of VR.