An episodic memory is specific to an event that occurred at a particular time and place. However, the elements that constitute the event—the location, the people present, and their actions and goals—might be shared with numerous other similar events. Does the brain preferentially represent certain elements of a remembered event? If so, which elements dominate its neural representation: those that are shared across similar events, or the novel elements that define a specific event? We addressed these questions by using a novel experimental paradigm combined with fMRI. Multiple events were created involving conversations between two individuals using the format of a television chat show. Chat show “hosts” occurred repeatedly across multiple events, whereas the “guests” were unique to only one event. Before learning the conversations, participants were scanned while viewing images or names of the (famous) individuals to be used in the study to obtain person-specific activity patterns. After learning all the conversations over a week, participants were scanned for a second time while they recalled each event multiple times. We found that during recall, person-specific activity patterns within the posterior midline network were reinstated for the hosts of the shows but not the guests, and that reinstatement of the hosts was significantly stronger than the reinstatement of the guests. These findings demonstrate that it is the more generic, familiar, and predictable elements of an event that dominate its neural representation compared with the more idiosyncratic, event-defining, elements.
Our lives progress through a series of unique events. When we remember these events, we reactivate their neural representations. Although the events themselves are unique, defined by the combination of what happened where and when, many will share common elements, such as the same people or location. It is not known how such repeated elements are activated during memory recall. Do the predictable and repeated elements of an event dominate its neural representation? If so, this might provide a structure to retrieve less predictable, more idiosyncratic elements. Alternatively, are the unique elements that distinguish similar events represented more robustly? The current study aimed to address these questions.
When humans experience or recall an event, an event-specific pattern of brain activity—as measured using fMRI—is elicited within regions of the cortex (Raykov, Keidel, Oakhill, & Bird, 2021; Chen et al., 2017; Oedekoven, Keidel, Berens, & Bird, 2017; Lee & Kuhl, 2016; Bird, Keidel, Ing, Horner, & Burgess, 2015; St-Laurent, Abdi, & Buchsbaum, 2015; Kuhl & Chun, 2014). These effects are observed most frequently in the brain's posterior medial network (Cooper & Ritchey, 2020; Ranganath & Ritchey, 2012), and although they reflect the neural representation of an event, it is unclear what drives them. Under the view that episodic memory is holistic in nature (Tulving, 1983), it might be expected that all elements that comprise an event contribute equally. Some recent evidence speaks against that because the spatial context that an event takes place in has been shown to be a major factor in determining its neural representation (Robin, Buchsbaum, & Moscovitch, 2018). In this study, we wanted to broadly compare nonspatial elements that were repeated, familiar, and more predictable with those that were event-unique, unfamiliar, and less expected.
Elements that are repeatedly encountered across events inevitably become more familiar to us and also more predictable. Any viewer of the TV show “Friends” will become familiar with the six main characters and would expect them to feature in any new episode. Predictive coding accounts of perception argue that we generate predictions that serve to “explain away” variance in incoming sensory information (Clark, 2013). Thus, during perception, the activity of neuronal populations that code the predictable features of an event is suppressed, whereas the activity of those that code the unexpected or novel features is enhanced (Sohoglu & Davis, 2020; Aitchison & Lengyel, 2017; Friston, 2005). Importantly, the enhanced elements of our perceptual experience are also encoded better into memory, consistent with the view that “prediction errors” drive new learning (Quent, Henson, & Greve, 2021; Niv & Schoenbaum, 2008). Given that the brain appears to prioritize unfamiliar and unexpected information, we might expect that when we recall an event, it is the less predictable elements that dominate our representation of the event. For example, if we recall a specific episode of “Friends” that involved the unexpected arrival of one of the sisters of one of the main characters, our representation of this individual might be more robust than the others, as her presence is a key identifying element of this particular episode.
Memory recall has also been argued to play a role in updating our internal model, by the “off-line” generation of fictive prediction errors (Barron, Auksztulewicz, & Friston, 2020; see also Hinton, Dayan, Frey, & Neal, 1995). More generally, memory recall has long been thought to involve reinstating both the processes and representations that were active during encoding (Nyberg, Habib, McIntosh, & Tulving, 2000; Morris, Bransford, & Franks, 1977). It is therefore plausible that memory recall, like memory encoding, might be biased toward reinstating representations of the more unexpected elements of events (see Wittkuhn, Chien, Hall-McMaster, & Schuck, 2021).
However, it has been argued that recent repeated experiences are most useful in predicting future experiences because they are more likely to be encountered again (Anderson & Milson, 1989). By contrast, idiosyncratic experiences are poor for making generalizations about the future (Sherman & Turk-Browne, 2020). Behaviorally, it is well established that prior knowledge exerts a strong influence over what aspects of an event are recalled (Popov & Reder, 2020; Smith, Hasinski, & Sederberg, 2013; Poppenk, Köhler, & Moscovitch, 2010; Brewer & Treyens, 1981). Accordingly, we may expect that it is those elements of an event that are more reliably present that will dominate its neural representation.
More generally, it is well established that repeated exposure to the same material is beneficial for memory (Van Strien, Hagenbeek, Stam, Rombouts, & Barkhof, 2005; Glenberg, Smith, & Green, 1977). However, the situation is less clear-cut when the same items are repeatedly encoded in different contexts. Here, although the items may become more familiar and are better recognized, their associations with the contexts that they were experienced in can weaken (Sievers, Bird, & Renoult, 2019; Yassa & Reagh, 2013).
Although repetition can result in better memory, more novel and distinctive items are also better remembered (Hunt, 1995). This memory advantage for more distinct items affects both recollection- and familiarity-based recognition judgments (Kishiyama & Yonelinas, 2003). Furthermore, novelty can act at the level of a stimulus or at the level of whether a particular stimulus is novel within a specific context (Ranganath & Rainer, 2003). Taken together, there is substantial evidence that two broad factors can determine the degree to which elements of an event are more memorable. On the one hand, there are items that are reliably present and predictable, often as a result of having been frequently encountered in the same context. On the other hand, there are items that are more idiosyncratic and unexpected, perhaps only occurring once in a context and therefore uniquely identifying an event. These factors have different effects on memory in different situations; indeed, there are situations where memory is superior both for highly unexpected and highly expected events (Quent, Henson, & Greve, 2021). Our focus in this study is how these factors affect the neural representation of complex naturalistic events. Specifically, if a complex memory can be accurately recalled, is it those elements that are most reliably present that dominate the neural representation or is it the less familiar elements that are uniquely associated with an event?
We first recorded fMRI patterns of activity elicited when participants viewed images or read names of famous celebrities. Following this, participants learned nine fictional conversations between two celebrities set within a television “chat show” format. The repeated elements were the hosts of the shows (three in total), whereas the unique elements were the guests (nine in total). After they had learned all of the conversations, participants then repeatedly recalled them in a second fMRI session. By using the person-specific patterns of activity from Session 1, we were able to examine whether the hosts or the guests were more robustly reinstated when participants recalled the conversations. All analyses were performed on ROIs from the posterior midline (PM) network identified in a previous study of event memory (Robin et al., 2018).
Thirty-one (21 women, 10 men) participants took part in the experiment. Participants were aged 19–30 years (mean = 24 years, SD = 3.54 years) and did not have a history of any psychiatric or neurological disorders. All participants were right-handed and were fluent English speakers. Four participants were excluded from all analyses: two did not complete the experiment, one was excluded because of excessive movement in the scanner, and one for failing to recall three of the guests during the memory screening test. Therefore, data from 27 participants were analyzed. Note that three participants had corrupted audio recordings and were not included in reports of the memory screening test. This sample size is consistent with previous studies examining fMRI pattern effects related to stories or person decoding (Raykov, Keidel, Oakhill, & Bird, 2020; di Oleggio Castello, Halchenko, Guntupalli, Gors, & Gobbini, 2017; Zadbood, Chen, Leong, Norman, & Hasson, 2017). Informed consent was obtained from all participants before the experiment, and they were reimbursed £40 for their time. This project was approved by the Brighton and Sussex Medical School Research Governance and Ethics Committee.
We also collected follow-up data from a group of separate participants who completed a similar task online. This group comprised 37 participants (20 women, 17 men) with a mean age of 24.61 years (±4.94 years). We excluded one participant who failed to learn the conversations after five learning sessions. Therefore, 36 participants from the online study were included in the analyses. Participants received payment through the online recruitment platform Prolific (https://www.prolific.co/) for each of the learning sessions. The project was approved by the University of Sussex Cross School Research Ethics Committee.
We ran a pilot study to select 12 celebrities who would be familiar to our sample (see https://osf.io/zpcv3/). Forty-eight pictures of 12 (six men, six women) famous individuals were used in Session 1. Four different pictures sourced from Google Images were used for each famous individual. The pictures only showed the famous individual and were converted to grayscale using Adobe Photoshop CC19. Additionally, five pictures of nonfamous individuals were used in the experiment.
Nine short fictitious conversations were written by the research team (see https://osf.io/zpcv3/). These were learned after Session 1 and recalled in the scanner in Session 2. The conversations were relatively short (162.5 ± 16.1 words on average) and took the form of chat show conversations using a question-and-answer structure. Each conversation involved two people, a guest and a host, taken from the 12 famous celebrities. On average, a similar number of words were “spoken” by the hosts (mean = 80.7 words) and the guests (mean = 81.7 words). The topic of each conversation was unique (e.g., giving to charity, social media; see https://osf.io/zpcv3/ for the complete transcripts of all conversations). Each conversation was associated with a particular day (Monday, Wednesday, or Friday) and week (Week 1, 2, or 3). Participants were required to learn when each conversation took place, as this was how memory would be cued in the scanner. Each of the three hosts was associated with three conversations that occurred on a specific day of the week (Host 1, Mondays; Host 2, Wednesdays; and Host 3, Fridays; see Figure 1). Each guest was associated with only one of the nine conversations. A pilot experiment with an independent group of participants ensured that there were not substantial differences between the conversations in how memorable and interesting they were (see https://osf.io/zpcv3/). A picture was created using Photoshop to illustrate the context of each conversation and to make them appear more plausible. The picture presented the host and the guest sitting in a TV studio (see Figure 1C). Three background pictures of TV studios were used overall, which were consistent for the three hosts (e.g., Studio 1 for Mondays, Studio 2 for Wednesdays). Three counterbalancing lists were created where the identity of hosts and guests was varied across participants.
Before taking part in the fMRI experiment, participants completed a short online questionnaire establishing their knowledge of the famous individuals used in the experiment (see https://osf.io/zpcv3/). Participants rated the following from 1–5: how familiar they are with the person, how well they can imagine them, how much they know about their career and/or personal life, and whether they like the person. Participants were selected if they reported being familiar (e.g., responded above 3) with the 12 celebrities used in the experiment. Participants were also encouraged to learn more about these celebrities before the experiment. Participants took part in two fMRI sessions spaced approximately 7 days apart (see Figure 1).
Scanning Session 1
In Session 1, images and names of 12 famous individuals were presented in a blocked fMRI design. There were five runs in total. Each run contained 24 blocks of trials, comprising one image and one text block for each of the 12 celebrities. In an image block, participants saw the four different images of the same individual (e.g., Daniel Craig). Each of the four pictures was presented twice within a single block, resulting in eight images per block (see Figure 1). The presentation order within a block was randomized. Each picture was presented for 800 msec with a 200-msec gap, during which a white fixation cross was shown. Each block lasted 8 sec. In a text block, participants saw multiple presentations of the name of 1 of the 12 famous individuals. Each text presentation was 800 msec, and there was a 200-msec ISI. To increase engagement with each stimulus, the font color and text position for each name presentation were varied. Each image and name was presented on a black background, and the order of identities across blocks was randomized.
In between picture and text blocks, participants performed an odd–even number judgment task. This served as an active baseline task. Here, participants saw a sequence of four numbers randomly selected from the range 1–98. Each number was presented for 1.8 sec and was followed by a white fixation cross for 700 msec. Therefore, a block of the odd–even task lasted 10 sec. The odd–even task was followed by a red fixation cross presented for 400 msec signaling the upcoming identity block. Participants performed an odd-ball detection task on the image blocks. In each run, there was an additional image block that included a picture of an unfamiliar individual embedded in the sequence of pictures of a famous individual. These odd-ball blocks could appear at any point in the run apart from the first image block and were not included in the main analyses. Identities were presented in a randomized order.
After completing scanning Session 1, participants were provided with a list with the nine conversations involving the 12 celebrities (see Figure 1). Participants were asked to learn each conversation and were encouraged to visualize each one, using the images provided. Participants were instructed to learn the conversations well and were told that their memory for the conversations would be tested before being allowed to take part in the second scanning session. They were given 4–6 days to learn the conversations.
Following this, participants underwent a screening test to ensure they had learned the conversations before proceeding to the second scanning session. The test was carried out at least 1 day before the scanning session. Participants freely recalled all of the nine conversations in a random order, cued by the week and the day. The experimenter provided feedback on parts of the conversations that the participants had failed to recall, and all participants were asked to review the conversations again before scanning.
Immediately before the second fMRI session, participants also provided subjective ratings on their memory for the conversations. They rated on a scale of 0–100 how vividly and confidently they could remember the conversations. They also rated how engaging they found the conversations.
Scanning Session 2
In Session 2, while in the scanner, participants were asked to recollect the nine conversations they had previously learned. There were six runs in total, and within each run, participants recollected all nine of the conversations. Participants were presented with the time information (e.g., Monday first week) to cue their memory for the specific conversation. The cue was presented for 3 sec and was followed by a 15-sec recollection period. Participants were asked to remember the conversation in as much detail as possible for the 15 sec. Participants were provided with the option to press a button if they failed to recollect the conversation during that particular memory trial (e.g., because of mind-wandering). All such trials were removed from the main analyses. In between memory blocks, participants performed the odd–even number judgment task (described above) for 15 sec, which served as an active baseline. A red fixation cross followed the number judgment task and signaled the upcoming recall block (see Figure 1). Within each run, the order of the conversation memory trials was randomized.
In a follow-up online study, we asked a separate group of participants to learn the nine conversations used in the fMRI experiment and provide additional subjective ratings about the conversations.
Similarly to the fMRI experiment, we initially prescreened participants to be familiar with the 12 celebrities used in the experiment. Participants who were not familiar with any of the 12 celebrities were not included in the main experiment and were not included in the further learning sessions.
Participants completed five learning sessions over a week, each session released on a separate day. During the learning sessions, participants read each of the nine conversations at their leisure. The event cue (e.g., Monday first) and the photoshopped studio picture showing the host and the guest were presented to participants as they read the conversations during each session. After completing the reading session, participants answered multiple-choice questions about the conversations. The multiple-choice questions differed on each learning session and aimed to help participants learn the conversations. The questions concerned the identities of the hosts and guests, what they were talking about, and details from the conversations. After the fifth learning session, participants provided subjective ratings on how confidently and vividly they could remember each of the conversations. Participants also rated how confidently they could remember the host and the guest for each conversation and how important they perceived each one to be for the conversation. Additionally, participants were asked to guess who the host and the guest would be for hypothetical conversations taking place in Week 4 (Monday, Wednesday, and Friday). This was an open-ended question, and participants were instructed to make a guess or to write that they did not know.
All images were acquired on a 3-T Siemens Prisma scanner with a 32-channel head coil. To minimize head movement, soft cushions were inserted into the head coil. Functional images were acquired with a gradient-echo EPI sequence with a multiband acceleration factor of 8 and the following parameters: repetition time = 0.8 sec, echo time = 33.1 msec, 52° flip angle, field of view = 208 × 180 mm, and 72 slices with a slice thickness of 2 mm and isotropic 2-mm voxels. Two SpinEcho fieldmap runs with reversed phase-encode blips, in the anterior-to-posterior and posterior-to-anterior directions, were acquired with the same parameters as the functional images. Separate field maps were acquired for Sessions 1 and 2. A high-resolution T1-weighted image was acquired with a 3-D MPRAGE sequence (repetition time = 2.4 sec, echo time = 2.14 msec, 8° flip angle, field of view = 224 × 224 mm, and 0.8-mm isotropic voxels).
SPM 12 (Wellcome Department of Imaging Neuroscience) was used to preprocess all images, except for the field maps. For each session, we first spatially realigned the functional images to the mean session image. Session-specific field maps were estimated with command line functions from FSL (Smith et al., 2004) and were applied to the motion-corrected data to correct for image distortions (Hutton et al., 2002). The anatomical image was segmented into gray matter, white matter, and cerebrospinal fluid using tissue probability maps, and was coregistered to the mean functional image. The segmented images were used to estimate deformation fields, which were applied to the functional data to transform them to Montreal Neurological Institute space. A 3-mm FWHM Gaussian smoothing kernel was applied to the data, as recommended by previous work showing that a small amount of smoothing can improve the sensitivity of multivoxel pattern analyses (Hendriks, Daniels, Pegado, & Op de Beeck, 2017; Gardumi et al., 2016).
In both scanning sessions, participants completed an odd–even number judgment task that acted as an active baseline. We analyzed accuracy and RTs from both sessions to ensure participants were paying attention throughout the main tasks.
We carried out a post hoc analysis of data from the screening test before the second scanning session. This analysis used the well-established procedure for scoring performance on tests of prose recall (e.g., Wechsler, 1945). The script for each conversation was divided into discrete “idea units,” and a point was allocated for successfully recalling each unit. For each conversation, we scored units spoken by the host and the guest separately. Examples of the full script divided into units and participants' recalled conversations are available online (https://osf.io/zpcv3/).
During the second scanning session, participants could press a button to indicate that they could not retrieve the conversation on a given trial. These trials were not included in further analyses. Participants provided subjective ratings on a scale of 0–100 on how vividly and confidently they could recall the conversations. Participants also rated how familiar and engaging they found the conversations. These ratings were averaged across participants separately for each conversation.
In the online study, a separate group of participants provided subjective confidence and vividness ratings about the same conversations. Participants also rated their confidence in remembering the host and the guest for each conversation, as well as the importance of each to the conversation.
We note that data from the last run of Session 2 from 3 out of the 27 participants were lost because of a technical issue. Therefore, these participants had only five runs rather than six from Session 2. MRI data were analyzed with SPM 12, the CosMoMVPA toolbox (Oosterhof, Connolly, & Haxby, 2016), and custom scripts written in MATLAB (Version 2017b, The MathWorks, Inc.). All analyses were conducted on Montreal Neurological Institute normalized images. The BrainNet Viewer toolbox was used for visualizing the ROIs (see Figure 5A), and bspmview toolbox was used for visualizing the whole-brain parcellations reported in supplementary materials.
We carried out our analyses on regions associated with the PM network (see Introduction). The ROIs were taken from a previous study of multielement event recall by Robin et al. (2018). We used their multifeature ROI, which is a set of regions comprising voxels that could classify between different aspects of events—locations, people, and objects (see Figure 5A). This comprises five different regions: posterior medial cortex (PMC), dorsal medial pFC (MPFC), left and right superior lateral occipital cortex (LOC) extending to the angular gyrus in the lateral parietal cortex (referred to as angular gyrus by Robin and colleagues), and left and right parahippocampal gyrus.
General Linear Models
To estimate activation patterns for later use in the representational similarity analyses (RSAs), we used general linear models (GLMs). In each run from Session 1, a separate regressor for each block was included, such that picture and text blocks were modeled separately (i.e., 24 regressors per run). All trial regressors were entered in a single first-level model, following the least squares all method described in Mumford, Turner, Ashby, and Poldrack (2012). The patterns (t maps) from picture and text blocks for the same identity within each run were then averaged. Across all five runs, this yielded 60 patterns (12 per run), that is, five patterns for each of the 12 famous individuals included in the experiment. In Session 2, each retrieval trial was modeled with a separate regressor (nine per run) in a least squares all approach. This resulted in 54 patterns for the nine conversations used (six per conversation). Separate regressors of no interest for the six motion parameters, a session constant term, and a high-pass filter with a cutoff of 1/128 Hz were included in all GLMs.
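The averaging of picture and text patterns can be illustrated with a short Python sketch. It is not the authors' MATLAB code: the array layout, the function name `average_modalities`, and the assumption that blocks have already been sorted into the same identity order in every run are all hypothetical simplifications.

```python
import numpy as np

def average_modalities(t_maps, identity, n_identities=12):
    """Average the picture and text t maps of each identity within each run.

    t_maps   : array (n_runs, n_blocks, n_voxels), one t map per block
    identity : array (n_blocks,), identity label (0..n_identities-1) of each
               block; assumed identical across runs (hypothetical layout)
    returns  : array (n_runs, n_identities, n_voxels)
    """
    identity = np.asarray(identity)
    return np.stack(
        [t_maps[:, identity == k].mean(axis=1) for k in range(n_identities)],
        axis=1,
    )

# Example with random numbers standing in for t maps:
# 5 runs x 24 blocks (12 picture + 12 text) x 50 voxels.
rng = np.random.default_rng(0)
example_maps = rng.standard_normal((5, 24, 50))
example_labels = np.tile(np.arange(12), 2)  # picture blocks, then text blocks
example_avg = average_modalities(example_maps, example_labels)
```

With five runs and 12 identities, the result holds 5 × 12 = 60 patterns, matching the count given above.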
We used the GLM estimated t maps as inputs to our RSAs. A series of RSAs was carried out to examine whether repeated features are more strongly represented during recollection. Contrast matrices for each analysis are shown in Figure 4. All similarity matrices were estimated using Pearson correlation, and all correlation values were Fisher transformed before computing further contrasts. Group-level one-sample t tests against zero were used to examine the significance of the RSA contrasts (α = .05).
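The contrast logic shared by all of the RSAs can be sketched in a few lines of Python. The function names are illustrative only and do not come from CoSMoMVPA or the authors' scripts. The sketch computes the t statistic alone; the p value would be obtained from a t distribution with n − 1 degrees of freedom.

```python
import numpy as np

def fisher_z(r):
    """Fisher r-to-z transform (inverse hyperbolic tangent)."""
    return np.arctanh(r)

def rsa_contrast(sim, match_mask, mismatch_mask):
    """Mean Fisher-z similarity of matching cells minus mismatching cells."""
    z = fisher_z(np.asarray(sim, dtype=float))
    return z[match_mask].mean() - z[mismatch_mask].mean()

def one_sample_t(values):
    """t statistic for a one-sample test of the group mean against zero."""
    values = np.asarray(values, dtype=float)
    n = values.size
    return values.mean() / (values.std(ddof=1) / np.sqrt(n))
```

In this scheme, each participant contributes one contrast value per ROI, and `one_sample_t` is applied to the vector of contrast values across participants.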
Before investigating whether the hosts and/or the guests were represented in the ROIs during memory retrieval, we first established which ROIs distinguished the identities of the celebrities as well as the conversations. If an ROI could not discriminate between the celebrities when presented in isolation, then it would not make sense to seek evidence for identity-specific reactivation of these patterns during memory retrieval. Similarly, if an ROI could not discriminate between the conversations themselves during retrieval, then it would not make sense to look for reactivation of the identities of the people taking part in the conversations (see Supplementary Figure 5, available online at https://osf.io/zpcv3/). Therefore, the ROIs that we report showing a significant effect of “host” or “guest” reinstatement had to not only pass the significance threshold for these specific analyses but also show significant effects in two additional independent analyses (of “identity” and “conversation,” see below).
First, we examined which regions show reliable identity-specific patterns (see Figure 4A). For this analysis, only patterns from Session 1 were used. For each ROI, spatial patterns of activity (t maps) for each identity were extracted, vectorized, and used to construct an RSA matrix. Patterns for the same identity in odd-numbered runs and even-numbered runs were separately averaged. This resulted in 12 identity patterns estimated from the odd runs and 12 identity patterns estimated from the even runs. The pairwise similarities between the odd-run and even-run patterns were used to produce a 12 × 12 correlation matrix. The resulting RSA matrix represents the neural similarity between the 12 identities. The diagonal values represent the similarity between matching identities across runs, and the 132 off-diagonal values represent the similarity between nonmatching identities. To examine which regions show reliable identity-specific patterns, the mean similarity between matching identities was compared with the mean similarity between nonmatching identities. ROIs that did not show reliable identity-specific patterns were not included in further analyses. Results from all ROIs are available online at https://osf.io/zpcv3/.
Second, we examined whether the remaining ROIs would exhibit reliable conversation-specific effects. Using a similar logic, patterns for each conversation across odd- and even-numbered runs were separately averaged. Their similarity was then computed using Pearson correlation, which resulted in a 9 × 9 similarity matrix. Diagonal values represented the similarity between patterns of matching conversations, whereas off-diagonal values represented the similarity between patterns of mismatching conversations. The mean matching similarity was compared with the mean nonmatching similarity across conversations. ROIs that did not show reliable conversation-specific patterns were also not included in further analyses.
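A minimal Python sketch of this split-half identity analysis is given below. The array layout and function names are assumptions made for illustration, and the data are simulated.

```python
import numpy as np

def split_half_matrix(patterns):
    """Correlate odd-run averages with even-run averages.

    patterns : array (n_runs, n_items, n_voxels)
    returns  : array (n_items, n_items); rows index odd-run patterns,
               columns index even-run patterns
    """
    odd = patterns[0::2].mean(axis=0)   # runs 1, 3, 5, ...
    even = patterns[1::2].mean(axis=0)  # runs 2, 4, ...
    n = odd.shape[0]
    # np.corrcoef stacks both sets of rows; the upper-right block holds
    # the odd-versus-even correlations.
    return np.corrcoef(odd, even)[:n, n:]

def match_vs_mismatch(sim):
    """Mean Fisher-z of diagonal cells minus mean of off-diagonal cells."""
    z = np.arctanh(sim)
    diag = np.eye(sim.shape[0], dtype=bool)
    return z[diag].mean() - z[~diag].mean()

# Simulated example: 5 runs, 12 identities, 500 voxels, with a stable
# identity signal plus independent run-wise noise.
rng = np.random.default_rng(0)
signal = rng.standard_normal((12, 500))
runs = signal[None] + 0.5 * rng.standard_normal((5, 12, 500))
identity_sim = split_half_matrix(runs)
identity_effect = match_vs_mismatch(identity_sim)
```

The same two functions could be applied to the Session 2 data (six runs, nine conversations) to obtain the 9 × 9 conversation matrix.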
In the host RSA, the patterns from Session 1 for the host identities were averaged across all runs. Similarly, all conversation patterns from Session 2 were averaged across all runs. The correlation between the host patterns and the conversation patterns was computed. The correlation matrix was constructed such that the diagonal values represented the correlation between a host identity (e.g., the Jennifer Aniston pattern from Session 1) and the conversations with the matching host. The off-diagonal values represented the correlation between a host identity and mismatching conversations (e.g., where Jennifer Aniston was not the host; see Figure 4C). Note that the host patterns were repeated within the correlation matrix, and therefore, some off-diagonal values were not included in the analyses. The contrast between matching host-to-conversation patterns and mismatching patterns was computed. This was done to examine whether host patterns were reinstated during retrieval. ROIs that did not show a significant host-specific reinstatement were not removed from further analyses, as it was possible that they would show a guest-specific reinstatement.
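The host contrast can be sketched as follows. This is an illustrative Python version under a stated simplification: the text describes a matrix in which host patterns are repeated and some off-diagonal cells are excluded, but it does not specify which cells. The sketch therefore uses a compact 3 × 9 layout (three hosts by nine conversations), which is an assumption and may not reproduce the exact cells used in the original analysis.

```python
import numpy as np

def cross_corr(a, b):
    """Pearson correlations between rows of a (n_a, v) and rows of b (n_b, v)."""
    a = (a - a.mean(axis=1, keepdims=True)) / a.std(axis=1, keepdims=True)
    b = (b - b.mean(axis=1, keepdims=True)) / b.std(axis=1, keepdims=True)
    return a @ b.T / a.shape[1]

def host_contrast(host_patterns, conv_patterns, host_of_conv):
    """Matching minus mismatching host-to-conversation similarity (Fisher z).

    host_patterns : (3, n_voxels) Session 1 patterns, averaged over runs
    conv_patterns : (9, n_voxels) Session 2 patterns, averaged over runs
    host_of_conv  : (9,) index of the host of each conversation
    """
    z = np.arctanh(cross_corr(host_patterns, conv_patterns))  # (3, 9)
    n_hosts = host_patterns.shape[0]
    match = np.arange(n_hosts)[:, None] == np.asarray(host_of_conv)[None, :]
    return z[match].mean() - z[~match].mean()
```

In this layout there are 9 matching cells (each conversation with its own host) and 18 mismatching cells (each conversation with the two other hosts).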
Afterward, the guest identity-specific patterns from Session 1 were extracted and averaged over all runs. The similarity between guest patterns from Session 1 and the conversation patterns from Session 2 was computed. This resulted in a 9 × 9 correlation matrix (see Figure 4D). The diagonal values represent the similarity between matching guest and conversation patterns (e.g., the Michelle Obama Session 1 pattern and the conversation where Michelle Obama was the guest, Monday first). The off-diagonal values that were used as a contrast represented the similarity between a mismatching guest and conversation. To keep the number of contrast values similar to the host analysis described above, we focused on the values representing a mismatch between guest and conversation coming from the same show. For instance, the mismatch values for Michelle Obama, who was a guest on Monday first week, were conversations where she was not the guest but that still happened on Mondays and had the same host (e.g., Monday second week and Monday third week). To examine guest-specific reinstatement, the mean matching guest-to-conversation similarity was compared with the mean mismatching guest-to-conversation similarity.
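The restriction of mismatch values to conversations from the same show can be made concrete with a small Python sketch. Function names and the ordering of conversations (guest i belongs to conversation i, and conversations are grouped by host) are assumptions for illustration.

```python
import numpy as np

def cross_corr(a, b):
    """Pearson correlations between rows of a (n_a, v) and rows of b (n_b, v)."""
    a = (a - a.mean(axis=1, keepdims=True)) / a.std(axis=1, keepdims=True)
    b = (b - b.mean(axis=1, keepdims=True)) / b.std(axis=1, keepdims=True)
    return a @ b.T / a.shape[1]

def same_show_mismatch_mask(host_of_conv):
    """Cells pairing a guest with a different conversation of the same host."""
    host_of_conv = np.asarray(host_of_conv)
    same_show = host_of_conv[:, None] == host_of_conv[None, :]
    return same_show & ~np.eye(host_of_conv.size, dtype=bool)

def guest_contrast(guest_patterns, conv_patterns, host_of_conv):
    """Matching minus same-show mismatching guest-to-conversation similarity.

    guest_patterns : (9, n_voxels), row i is the guest of conversation i
    conv_patterns  : (9, n_voxels)
    """
    z = np.arctanh(cross_corr(guest_patterns, conv_patterns))  # (9, 9)
    match = np.eye(z.shape[0], dtype=bool)
    return z[match].mean() - z[same_show_mismatch_mask(host_of_conv)].mean()
```

With three conversations per host, each guest has two same-show mismatches, giving 9 matching and 18 mismatching values in total.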
Simulations of Host and Guest RSAs
The RSAs for the hosts involve repeatedly correlating the pattern of activity from Session 1 for each of the hosts with three different conversations from Session 2. By contrast, the RSAs for the guests involve unique pairwise correlations between the pattern of activity for each guest and their respective conversation. We wanted to check that this procedure did not bias the analyses to find greater evidence for reinstatement of either the hosts or the guests. We therefore generated simulated patterns for the hosts and guests and modeled the situation where both the host and guest patterns were equally present in the pattern for each conversation.
We first simulated 12 random patterns of the same length as our PMC ROI. These patterns acted as the 12 celebrity patterns from Session 1. Three patterns were taken to represent the hosts, and the other nine were taken to represent the guests. We then simulated nine conversation patterns that were linear combinations of the host and guest patterns as well as noise (the correlation between each conversation pattern and each of its constituent “host” and “guest” patterns was set to 0.2). In the first simulation, we added white noise to the simulated conversation patterns. This allowed us to control the similarity between the simulated identity patterns and the simulated conversation patterns. To match our design, three of the simulated conversation patterns were associated with the same host (but a unique guest). We then ran our planned “host” and “guest” RSA comparisons on the simulated data (as illustrated in Figure 4). We ran these simulations 100,000 times each.
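The logic of the white-noise simulation can be sketched as follows. This is an illustrative reconstruction, not the simulation code used in the study: the voxel count (`N_VOXELS`), the number of iterations, and the specific mixing weights are assumptions. For independent unit-variance patterns, weighting the host and guest by 0.2 each and the noise by sqrt(1 − 2 × 0.2²) gives each conversation an expected correlation of 0.2 with both of its constituent patterns.

```python
import numpy as np

N_VOXELS = 2000   # hypothetical ROI size; the actual PMC voxel count is not assumed here
TARGET_R = 0.2    # expected correlation of a conversation with its host and with its guest
CONV_HOST = np.repeat(np.arange(3), 3)   # conversations 0-2 share host 0, etc.

def simulate_once(rng):
    """Return (host effect, guest effect) for one simulated data set."""
    hosts = rng.standard_normal((3, N_VOXELS))
    guests = rng.standard_normal((9, N_VOXELS))
    noise = rng.standard_normal((9, N_VOXELS))   # white noise
    # Noise weight chosen so that each conversation pattern has unit variance.
    w_noise = np.sqrt(1.0 - 2.0 * TARGET_R**2)
    convs = TARGET_R * hosts[CONV_HOST] + TARGET_R * guests + w_noise * noise

    r_host = np.corrcoef(hosts, convs)[:3, 3:]     # 3 hosts x 9 conversations
    r_guest = np.corrcoef(guests, convs)[:9, 9:]   # 9 guests x 9 conversations

    host_match = np.arange(3)[:, None] == CONV_HOST[None, :]
    same_show = CONV_HOST[:, None] == CONV_HOST[None, :]
    diag = np.eye(9, dtype=bool)

    host_effect = r_host[host_match].mean() - r_host[~host_match].mean()
    guest_effect = r_guest[diag].mean() - r_guest[same_show & ~diag].mean()
    return host_effect, guest_effect

rng = np.random.default_rng(0)
# 500 iterations keep the example fast; the study reports 100,000.
effects = np.array([simulate_once(rng) for _ in range(500)])
print("mean host effect, mean guest effect:", effects.mean(axis=0))
```

If the analysis is unbiased, the two mean effects should both be close to the target correlation and close to each other.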
Next, we ran a further simulation using fMRI data to rule out the possibility that correlated noise in the data would bias the analyses to find greater evidence of reinstatement for either the hosts or the guests. For each subject's fMRI data, we averaged the conversation patterns across the different runs of Session 2 and constructed a correlation matrix between the different conversation patterns. We then Fisher-transformed these correlation matrices and created an average correlation matrix across subjects that represented the similarity between the nine conversations participants were remembering during Session 2. We used this average correlation matrix across conversations to add correlated noise to the simulated conversations. Specifically, we again modeled the conversations as combinations of simulated host and guest patterns. Additionally, we added simulated noise patterns that were drawn from a multivariate Gaussian distribution with a mean of zero and a correlation matrix equal to the empirical average correlation matrix across conversations. We ran each simulation 100,000 times.
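The two steps of the correlated-noise procedure, averaging the correlation matrices in Fisher-z space and drawing noise with that correlational structure, might be sketched as below. The per-subject matrices here are hypothetical stand-ins for the empirical ones, and the helper names `fisher_average` and `correlated_noise` are illustrative only.

```python
import numpy as np

def fisher_average(corr_mats):
    """Average correlation matrices in Fisher-z space; the diagonal is reset to 1."""
    mats = np.asarray(corr_mats, dtype=float)
    off = ~np.eye(mats.shape[-1], dtype=bool)
    z = np.zeros_like(mats)
    z[:, off] = np.arctanh(mats[:, off])   # skip the diagonal, where arctanh(1) is infinite
    avg = np.tanh(z.mean(axis=0))
    np.fill_diagonal(avg, 1.0)
    return avg

def correlated_noise(corr, n_voxels, rng):
    """Draw one noise pattern per conversation; rows correlate according to corr."""
    samples = rng.multivariate_normal(np.zeros(corr.shape[0]), corr, size=n_voxels)
    return samples.T                       # (n_conversations, n_voxels)

# Hypothetical per-subject matrices in which same-show conversations correlate more.
show = np.repeat(np.arange(3), 3)
same_show = show[:, None] == show[None, :]
subject_mats = []
for shift in (0.00, 0.02, 0.04):
    mat = np.where(same_show, 0.30 + shift, 0.10 + shift)
    np.fill_diagonal(mat, 1.0)
    subject_mats.append(mat)

rng = np.random.default_rng(0)
avg_corr = fisher_average(subject_mats)
noise = correlated_noise(avg_corr, n_voxels=20000, rng=rng)
print(np.round(np.corrcoef(noise)[:3, :3], 2))
```

In a full simulation, these noise patterns would replace the white noise that is added to the weighted host and guest patterns. An averaged empirical matrix is not guaranteed to be positive semidefinite, so a real analysis may need to check this before sampling.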
Hosts versus guests.
To examine which regions show differential reinstatement for host and guest identities during retrieval of conversations, we performed a paired t test. Specifically, we contrasted for each subject their host-specific reinstatement effect with their guest-specific reinstatement effect.
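For concreteness, the paired comparison can be written out using only the Python standard library. The per-subject values below are hypothetical, and the function name `paired_t` is illustrative. In practice, a routine such as `scipy.stats.ttest_rel` would also return the p value.

```python
import math
import statistics

def paired_t(host_effects, guest_effects):
    """Paired t statistic and degrees of freedom for per-subject effects."""
    diffs = [h - g for h, g in zip(host_effects, guest_effects)]
    n = len(diffs)
    standard_error = statistics.stdev(diffs) / math.sqrt(n)
    return statistics.mean(diffs) / standard_error, n - 1

# Hypothetical per-subject reinstatement effects (match minus mismatch similarity).
host = [0.30, 0.20, 0.40, 0.10]
guest = [0.10, 0.10, 0.20, 0.00]
t_value, df = paired_t(host, guest)
print(f"t({df}) = {t_value:.2f}")
```

The degrees of freedom equal the number of subjects minus one.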
Time period analyses.
We additionally ran post hoc analyses to examine the time course of the host and guest RSA effects. This examined whether any effects observed in the main analyses evolve over time, and in particular, whether the host effects are stronger during the initial period of recall. To examine the time course of the host and guest analyses, we ran three additional GLMs. We modeled the first, middle, and last third of each of the conversations in separate GLMs. This allowed us to estimate conversation patterns of brain activity that were specific for the beginning, middle, and end of the 15-sec recall period. We ran the host and guest analyses as described previously, but separately for each of the three patterns. We also ran the conversation-specific analysis (see Figure 4B) that tested whether we could observe reliable conversation-specific patterns.
Online (https://osf.io/zpcv3/), we report two additional fMRI analyses. In the first, we ran exploratory RSAs to investigate reinstatement of the hosts and guests during the conversations within a whole-brain parcellation (see Supplementary Figures 6 and 7 at https://osf.io/zpcv3/). The main purpose of this analysis was to establish whether any regions outside our predefined ROIs showed evidence for reinstatement of the guests during recall of the conversations. We therefore report results for the four main analyses (see Figure 4) from 200 functionally defined regions using a parcellation reported in Schaefer et al. (2018). We also ran the four main analyses in a bilateral hippocampal ROI (constructed using parcellations from Ritchey, Montchal, Yonelinas, & Ranganath, 2015, and deposited on https://neurovault.org/). Second, we used intersubject pattern analysis to investigate whether the topics of the conversations were represented in patterns of activity, regardless of the identities of the hosts and guests. This analysis addresses the issue of how narrative information is represented in the brain (see Supplementary Figure 9 at https://osf.io/zpcv3/).
Before taking part in the experiment, participants were asked to rate how familiar they were with the 12 celebrities and how easily they could imagine them (rating data were missing from three participants). Participants, on average, were familiar with the celebrities included in the experiment (mean = 4.07, SD = 1.02, max = 5). One participant initially reported relatively low familiarity with the celebrities (mean = 2.66) and was asked to learn about each of the celebrities by watching videos of them over the course of a week.
In the first scanning session, participants were presented with an additional image block per run that included a single picture of an unfamiliar person embedded in a series of pictures of a famous celebrity. Participants, on average, identified 38% (SD = 19%) of these oddball blocks. A coding error prevented us from recording responses to the oddball blocks that occurred after a block had finished, so this may be an underestimate of the true percentage of oddball blocks detected. Between picture and text blocks, participants performed an odd–even number judgment task that served as an active baseline and attention check. On average, participants made the odd–even judgments with 97.6% accuracy (SD = 0.02, RT = 790 msec, SD = 101). Highly similar performance on the odd–even judgment baseline task was also found during the second scanning session (97%, SD = 0.03, RT = 768 msec, SD = 105).
The memory screening test before the second scanning session revealed that all of the participants had learnt the nine conversations well. All participants correctly identified both the host and the guest for all conversations (one participant failed to identify the guest for three of the conversations and their data were excluded from the study). Participants identified 6.04 (SD = 2.29) idea units spoken by the hosts and 7.03 (SD = 2.04) idea units spoken by the guests, a difference that is highly significant, t(24) = −8.18, p < .001 (see Figure 2). Thus, even though the conversations were constructed so that the hosts and guests spoke the same amount and both asked and responded to questions, participants recalled more dialogue that was spoken by the guests than by the hosts.
Immediately before scanning, participants rated that they could vividly and confidently remember the conversations. Furthermore, they rated them as being familiar and engaging (data available at https://osf.io/zpcv3/).
During the second scanning session, participants could indicate if they did not recall a conversation in a specific trial. Participants rarely reported that they could not recall the specific conversation on a given trial. On average, participants had 1.71 (SD = 1.35) discarded trials out of 54, or 3.16%.
A separate group of participants learned the same conversations online and provided behavioral ratings about their memory. Before completing the learning sessions, participants were screened to be familiar with each of the 12 celebrities. Participants completed five learning sessions. After the fifth session, participants answered multiple-choice questions about the conversations and provided subjective ratings about their memory.
Participants showed overall good learning of the conversations after five sessions, with average accuracy being 95.38% (±5.72). Furthermore, participants had achieved ceiling performance for the host (100% ± 0%) and near ceiling performance for the guest (94.7% ± 9%) memory questions. Participants rated that they could confidently (75.75 ± 19.47) and vividly (75.03 ± 21.42) remember the conversations. When examining all participants, we observed that participants were more confident in their memory for the hosts (94.01 ± 10.41) compared with the guests (89.06 ± 16.92), t(603) = 4.33, p < .001. Interestingly, participants rated the guests as playing a more important role than the hosts in the conversations (71.96 ± 21.77 vs. 67.24 ± 21.84), t(603) = −3.68, p < .001 (see Figure 3A). To further investigate these effects and to provide a better comparison to the participants who took part in the main scanning experiment (who identified both the host and the guest of every conversation), we analyzed data from participants who were able to identify all of the hosts and the guests (25 in total). In this sample, we did not observe any significant differences in host versus guest confidence (93.55 ± 11.15 vs. 92.69 ± 12.42), t(416) = 0.78, p = .43, but the guests were still perceived to be more important than the hosts within each conversation (72.03 ± 22.30 vs. 67.96 ± 23.38), t(416) = −2.65, p = .008 (see Figure 3B). Last, when asked to guess who the hosts and guests would be in a following week, participants guessed 72.9% of the time that the following week would involve the same hosts that had been repeated on the corresponding weekdays in previous weeks. For the guests, participants responded that they did not know who the guest would be for 73.83% of the future conversations, and for 25.2% they guessed that the guest on Week 4 would be a celebrity that was not one of the previously seen celebrities. Only on 1% of trials did one participant guess that a previously seen guest would reappear on Week 4. Therefore, according to this measure, the hosts were more “predictable” than the guests.
We first examined which regions would show reliable identity-specific patterns in Session 1 when the celebrities were presented in isolation as both pictures and names. See https://osf.io/zpcv3/ for results from all ROIs. Reliable identity-specific decoding was observed in the left LOC, t(26) = 3.52, p = .001; PMC, t(26) = 2.47, p = .02; right LOC, t(26) = 2.54, p = .01; and right fusiform cortex, t(26) = 3.62, p = .001 (see Figure 5B). The other regions, left parahippocampal cortex, t(26) = −0.59, p = .55, and MPFC, t(26) = 0.88, p = .38, did not show significant identity-specific patterns. Therefore, the regions taken forward to the next analysis were the PMC, the left and right LOC, and the right fusiform cortex.
We next examined which of the four regions taken forward from the previous analyses would show reliable conversation-specific patterns. See https://osf.io/zpcv3/ for results from all four ROIs. The left LOC, t(26) = 6.10, p < .001; PMC, t(26) = 4.04, p < .001; and right LOC, t(26) = 4.06, p < .001, showed reliable conversation-specific patterns (see Figure 5B). However, the right fusiform cortex ROI did not show significantly higher reliability for matching conversations, t(26) = −0.28, p = .78. Therefore, the regions that were taken forward for further analysis were the PMC and the left and right LOC.
Simulations of host and guest RSAs.
We ran simulations to ensure that our analyses were not statistically biased to find stronger effects for the host identities because the hosts were repeated across three conversations. We simulated conversation patterns that had an equal contribution from both simulated host and simulated guest patterns. The first simulation added white noise to the simulated conversation patterns and ran the same analyses for the hosts and guests as the ones run in the main analyses. The second simulation added noise that had the same correlational structure as the conversation patterns in the fMRI data (see Supplementary Figure 4 at https://osf.io/zpcv3/). Importantly, we did not see any differences in the simulated effects for the host and guest analyses (simulation with correlated noise: host mean = 0.202, SD = 0.005; guest mean = 0.202, SD = 0.005), t(99999) = −0.58, p = .56.
We next examined which ROIs would show host-specific reinstatement. We compared the brain patterns from Session 1 with the conversation patterns from Session 2. Specifically, we took the Session 1 patterns of the identities that would be hosts in Session 2 and computed the similarity between each host identity and the matching conversations. The left LOC, t(26) = 2.34, p = .02; PMC, t(26) = 3.27, p = .003; and right LOC, t(26) = 2.02, p = .05, showed host-specific reinstatement during the conversations (see Figure 5B).
We next examined whether these regions would also show guest-specific reinstatement. None of the ROIs showed a reliable guest-specific reinstatement: left LOC: t(26) = −1.56, p = .12; PMC: t(26) = −1.61, p = .12; right LOC: t(26) = −1.62, p = .11. Surprisingly, these nonsignificant effects are in the opposite direction to that which would show evidence for reinstatement (see Figure 5B).
Hosts versus guests.
We then performed a direct contrast between the host-specific reinstatement and guest-specific reinstatement effects. We observed higher host reinstatement in the left LOC, t(26) = 2.90, p = .007; PMC, t(26) = 3.41, p = .002; and right LOC, t(26) = 2.46, p = .02.
Time course analysis.
In a post hoc analysis, we examined the time course of the host and guest effects reported above. Specifically, we ran three separate GLMs to estimate the first, second, and third part of the conversations separately. This allowed us to estimate patterns for the conversations for the beginning, middle, and end of the recall period. We ran the host and guest analyses as described above separately for each part of the recall period. We further ran the conversation-specific analysis in the three parts separately to ensure that we had enough power to distinguish between the conversations when modeling only a third of the duration (see Figure 6). Throughout the beginning, middle, and end of the events, we observed conversation-specific patterns. We observed host reinstatement in the middle and end of the recall period, but not during the beginning. In contrast, we did not observe guest reinstatement during any point of the recall period. We found that when we modeled only the beginning of the conversations, we could reliably detect conversation-specific patterns in all ROIs (left LOC: t(26) = 3.98, p < .001; PMC: t(26) = 3.14, p = .004; right LOC: t(26) = 2.25, p = .03), but could not observe any reinstatement of the hosts or guests in any of the three ROIs (all ps > .05). When modeling the middle part of the recall, we again found conversation-specific patterns in all ROIs (left LOC: t(26) = 5.47, p < .001; PMC: t(26) = 6.06, p < .001; right LOC: t(26) = 3.86, p < .001). We also found host reinstatement (left LOC: t(26) = 2.71, p = .01; PMC: t(26) = 3.21, p = .003; right LOC: t(26) = 1.98, p = .058), but not guest reinstatement (all ps > .05). When modeling only the last third of the recall, we again found conversation-specific patterns (left LOC: t(26) = 4.28, p < .001; PMC: t(26) = 4.04, p < .001; right LOC: t(26) = 4.06, p < .001), host reinstatement in PMC (left LOC: t(26) = 1.73, p = .09; PMC: t(26) = 2.23, p = .03; right LOC: t(26) = 1.87, p = .07), but no significant guest reinstatement (all ps > .05).
Reinstatement effects for host across time. Here, we show results for the conversation-specific host and guest analysis separately for the first, second, and third part of the recall duration. Bar plot represents means, and error bar shows bootstrapped 95% confidence intervals. Asterisk indicates significance at p < .05.
Rich and detailed episodic memories comprise many individual elements: Some of these elements are common to several different memories, whereas others are unique to one. Using a novel paradigm involving recall of complex, naturalistic events (custom written “chat show” conversations), we found that the neural representation of the events was dominated by the repeated and more predictable elements (the “hosts”) rather than the more idiosyncratic, yet event-defining elements (“the guests”). These effects were found in PMC as well as lateral parietal/occipital cortex. Importantly, analyses of behavioral data showed that participants actually had better memory for dialogue spoken by the guests compared with the hosts and also rated the guests as being more important to the conversations. This shows that the fMRI effects we report are not driven by better memory for the hosts compared with the guests.
Many studies have shown that episodic memories tend to be recalled in a holistic manner, with the retrieval of one element being dependent on retrieval of others (Joensen, Gaskell, & Horner, 2020; Ngo, Horner, Newcombe, & Olson, 2019; Horner & Burgess, 2014). Furthermore, retrieval of one element tends to result in reactivation of the representation of other elements (Horner, Bisby, Bush, Lin, & Burgess, 2015), suggesting that no particular element dominates in the overall representation of the event. However, in these studies, the events are all unique combinations of elements. Our results suggest that when we recall memories of particular events, it is the repeated and predictable elements that serve to frame or scaffold the representation of the memory. It has already been shown that the spatial context of an episode can dominate the representation of episodic memories (Robin et al., 2018), but in the current study, we specifically chose nonspatial elements (well-known people) to be the repeated or novel elements. We therefore speculate that when we recall detailed episodic memories, those elements that are reliably present serve as a form of “context” within which to retrieve the less predictable details that are specific to a particular occasion.
Our study focused on regions of the brain's PM network, and we used ROIs that were identified in a previous study of multielement event recall (Robin et al., 2018). The regions where we found that hosts were represented more strongly than guests—the PMC and lateral occipital/parietal cortex—have been shown to support event-specific patterns of activity in many previous studies (Masís-Obando, Norman, & Baldassano, 2021; Raykov et al., 2021; Reagh & Ranganath, 2021; Bird, 2020; Chen et al., 2017; Oedekoven et al., 2017; Bird et al., 2015; St-Laurent et al., 2015; Kuhl & Chun, 2014). This brain network has been argued to predominantly represent contextual information in a broad sense—not only spatial contexts but also temporal and social contexts (Ranganath & Ritchey, 2012). Similarly, it is associated with integrating multimodal information over long timescales to build mental models of an overarching situation (Yeshurun, Nguyen, & Hasson, 2021; Hasson, Chen, & Honey, 2015). Nevertheless, there is a large body of evidence that these regions also support representations of known individuals and more basic semantic concepts (Fairhall & Caramazza, 2013). Here, we show that although these regions do support representations of famous individuals, when two individuals occur together in an event, it is the person who occurs most reliably within that context who is represented most robustly.
The aim of this study was to contrast those elements that are more generic, familiar, and reliably present with those that are more idiosyncratic, unexpected, and event-defining. Our behavioral results revealed that it was actually the parts of the conversations spoken by the guests that were more memorable. Moreover, a follow-up study found that participants judged the guests to be more important to the conversations compared with the hosts. Given this and the research summarized in the Introduction, it was unclear which elements would dominate the neural representation of the conversations. We, therefore, feel that the finding that the host representations were consistently stronger than the guest representations is an important step to understanding how recalled episodic memories are represented by the brain. Our design does not allow us to further tease apart which factors contribute to the representations of the hosts being dominant. For example, the number of repetitions of the hosts across the conversations is confounded with their predictability. Moreover, it is possible that chat show “hosts” are associated with a qualitatively different structural or social representation. Future work should aim to deconfound the effects of number of repetitions and predictability as well as verify that our results generalize to elements of an episode that do not have strong preexisting contextual associations.
Episodic memory recall is thought to comprise an initial period of memory search and construction followed by a period of memory elaboration (Addis, Wong, & Schacter, 2007; Conway, Pleydell-Pearce, & Whitecross, 2001). It is possible that, in our study, the initial search period of episodic recall was dominated by the representations of the hosts. This is because the hosts were reliably associated with a particular day of the week across three separate conversations (e.g., all Fridays), whereas the guests were only associated with one unique combination of day and week (e.g., Monday Week 3). However, we found no evidence that the representation of the hosts was preferentially active at the beginning of the recall period or that the activation of the representation of the guest became stronger as the recall phase progressed. Furthermore, we observed conversation-specific patterns of activity in all ROIs included in the main analyses. Indeed, we also observed conversation-specific patterns shared across participants who learned the conversations with different hosts and guests (see https://osf.io/zpcv3/). Taken together, these results and our behavioral memory findings suggest that participants were unlikely to be remembering only the hosts during the whole retrieval period.
The guests in the conversations were all well-known celebrities and key elements in each of the events. Nevertheless, the representational similarity between the guest-related patterns of activity in Session 1 and recall of the conversations in Session 2 was not reliably above chance in any of the ROIs. Furthermore, an exploratory analysis of 200 cortical regions and the hippocampus did not reveal any locations where guest-related patterns of activity were reliably activated when recalling the conversations (see https://osf.io/zpcv3/). By running simulations of our analyses, we verified that they were not biased toward detecting reactivation effects in the hosts rather than the guests. It remains a possibility that brain regions supporting representations of the guests during retrieval did not correspond to the shape of the parcellated brain regions used in the exploratory analysis.
It has been argued that a function of episodic memory is to enable the prediction of future events (Barron et al., 2020; Lu, Hasson, & Norman, 2020; Schacter, Benoit, & Szpunar, 2017; see also Sun, Advani, Spruston, Saxe, & Fitzgerald, 2021). Predictive processing accounts of perception and cognition typically emphasize the role of unexpected information in driving attention and new learning (Niv & Schoenbaum, 2008; Friston, 2005). Since memory offers an opportunity for “off-line” learning (Hinton et al., 1995), we might expect that the less typical aspects of an event would be overrepresented when the event is recalled. However, we found the opposite effect: The repeated and more predictable elements of the event dominated its neural representation.
Instead, our results are compatible with the view that information that has been encountered more frequently is likely to be most useful to us in the future (Anderson & Milson, 1989). When recalling an event, activating robust representations of the elements that are most reliably present may help us to generalize our experiences to new situations (Gershman, 2017). To reconcile our findings with predictive coding accounts of cognition, it may be that, during perception, prediction errors drive the learning of any unexpected information, whereas memory recall offers the opportunity to improve our internal model of the world by selectively enhancing those elements that are likely to be encountered again and removing noisy or idiosyncratic components that are unlikely to be repeated (Barron et al., 2020; see also Sun et al., 2021).
In summary, our results show that the elements of an event are not represented equally in memory. Those elements that are most reliably present across similar events—being both more frequently encountered and more predictable—are the ones that are most robustly activated when recalling these events. The brain may use these elements to provide context and scaffold the representation of a remembered event within the PM network. This, in turn, may facilitate the activation of the more idiosyncratic elements that are unique to one specific occasion.
We thank Ediz Sohoglu and Sam Berens for helpful discussions about the study. We would like to thank Jessica Robin for providing us with the multifeature ROI and Charlotte Sutherland for helping to score the behavioral data.
Reprint requests should be sent to Petar P. Raykov, School of Psychology, University of Sussex, Falmer BN1 9QH, UK, or via e-mail: P.Raykov@sussex.ac.uk.
Petar P. Raykov: Conceptualization; Formal analysis; Methodology; Project administration; Visualization; Writing—Original draft; Writing—Review & editing. Konstantinos Bromis: Conceptualization; Formal analysis; Methodology; Project administration. Leah Wickens: Data curation; Project administration. Warrick Roseboom: Conceptualization. Chris M. Bird: Conceptualization; Funding acquisition; Methodology; Project administration; Supervision; Writing—Original draft; Writing—Review & editing.
This project has received funding from the European Research Council (ERC) under the European Union's Horizon 2020 research and innovation programme (https://dx.doi.org/10.13039/100010663), grant number: 819526 to C. M. B. P. P. R. was additionally supported by an Economic and Social Research Council studentship (https://dx.doi.org/10.13039/501100000269), grant number: ES/J500173/1 and fellowship: ES/V012444/1.
Diversity in Citation Practices
Retrospective analysis of the citations in every article published in this journal from 2010 to 2021 reveals a persistent pattern of gender imbalance: Although the proportions of authorship teams (categorized by estimated gender identification of first author/last author) publishing in the Journal of Cognitive Neuroscience (JoCN) during this period were M(an)/M = .407, W(oman)/M = .32, M/W = .115, and W/W = .159, the comparable proportions for the articles that these authorship teams cited were M/M = .549, W/M = .257, M/W = .109, and W/W = .085 (Postle and Fulvio, JoCN, 34:1, pp. 1–3). Consequently, JoCN encourages all authors to consider gender balance explicitly when selecting which articles to cite and gives them the opportunity to report their article's gender citation balance.