Although memory of episodic associations is generally considered to be recollective in nature, it has been suggested that when stimuli are experienced as a unit, familiarity-related processes might contribute to their subsequent associative recognition. Furthermore, intradomain associations are believed to be unitized more readily than interdomain associations. To assess these claims, we tested associative recognition following two types of pair associate learning. In the unimodal task, stimulus pairs were pictures of common objects, whereas in the cross-modal task, stimulus pairs consisted of an object picture and an unrelated environmental sound. At test, participants discriminated intact from recombined pairs while ERPs were recorded. In the unimodal task only, associative recognition was accompanied by a robust frontal deflection reminiscent of a component commonly interpreted as related to familiarity processes. In contrast, ERP correlates of associative recognition observed at more posterior sites, akin to a component that has been related to recollection, were apparent in both tasks. These findings indicate that retrieval of unimodal associations can be supported by familiarity-related processes that are dissociable from recollective processes required for the retrieval of cross-modal associations.
Remembering episodic associations—that two or more stimuli were experienced conjointly—is a vital cognitive function that enables us to reconstruct environments in which we have been present and relive events that we have experienced. One form of access to associative memory is recognition, the judgment that a pair (or larger group) of items currently presented were previously experienced together in a specific episodic context. To test associative recognition memory, participants are usually required to discriminate between intact and recombined pairs of studied stimuli.
One widely accepted model of episodic recognition, dual process theory, posits that recognition is not a unitary entity, but rather comprises two functionally and neurally separable processes: familiarity and recollection. Familiarity refers to the basic feeling of having encountered something or someone without retrieval of additional information, whereas recollection provides additional contextual details about that encounter (Yonelinas, 2002). This distinction is supported by a variety of ERP studies showing that familiarity and recollection are related to two qualitatively distinct ERP components. Familiarity is associated with an early mid-frontal old/new effect between 300 and 500 msec often referred to as FN400, whereas recollection is related to a late positive component, an old/new effect prominent over parietal scalp between 400 and 800 msec (Wilding & Ranganath, 2011; Rugg & Curran, 2007; Mecklinger, 2000).
Although it is generally agreed that recognition of single items can be supported by both recollection and familiarity, the contributions of these processes to associative memory have yet to be determined. It has been proposed that, in associative recognition tasks, recollection is required to reactivate de novo associations between arbitrary items, and that such associative memory is not accessible via familiarity processes (e.g., Hockley & Consoli, 1999; Donaldson & Rugg, 1998; Yonelinas, 1997). However, recent research has suggested that under certain circumstances familiarity might also contribute to associative memory—specifically, when the to-be-associated stimuli are unitized during presentation and are thus perceived and encoded as a single entity (Jäger & Mecklinger, 2009; Rhodes & Donaldson, 2007, 2008; Quamme, Yonelinas, & Norman, 2007; Jäger, Mecklinger, & Kipp, 2006; Yonelinas, Kroll, Dobbins, & Soltani, 1999).
A number of experimental studies have addressed the suggestion that the retrieval of unitized associations can be supported by familiarity, whereas nonunitized associations require recollection for their retrieval. Bastin, van der Linden, Schnakers, Montaldi, and Mayes (2010) report that within-domain (face–face) associative recognition was mainly supported by familiarity, whereas between-domain (face–name) associative recognition required a major contribution of recollection. Other studies have shown that unitization and associative strategies modulate aging effects on associative memory, seemingly by strengthening familiarity, as recollection becomes less effective with aging (Jäger, Mecklinger, & Kliegel, 2010; Naveh-Benjamin, Brav, & Levy, 2007). On the basis of findings from their hemodynamic imaging studies, Staresina and Davachi (2010) and Haskins, Yonelinas, Quamme, and Ranganath (2008) have proposed that unitization enables associative representation formation by perirhinal cortex independently of hippocampal processes. This notion dovetails with suggestions that have been made regarding perirhinal-supported “associative familiarity” (Mayes, Montaldi, & Migo, 2007) and the rapid encoding of unitized items supported by substrates in parahippocampal gyrus (Henke, 2010). In the electrophysiological domain, ERP correlates of familiarity and recollection dissociate unitized and nonunitized associative representations for faces (Jäger et al., 2006) and words (Kriukova, Bridger, & Mecklinger, 2013; Bader, Mecklinger, Hoppstädter, & Meyer, 2010; Wiegand, Bader, & Mecklinger, 2010; Rhodes & Donaldson, 2007, 2008).
These physiological and behavioral findings indicate that distinctions between item and associative memory may be more parametric than binary, depending on the possibility of stimulus unitization. Nevertheless, the exact circumstances under which familiarity contributes to associative recognition require further specification. According to one taxonomy, episodic memory may involve three types of associations, reflecting differing degrees of unitization: intraitem associations, that is, items unitized into one entity (e.g., a compound word); within-domain associations, formed between similar kinds of items that are not remembered as one entity (e.g., two unrelated words); and between-domain associations, formed between different kinds of items or modalities, such as faces and voices (Mayes et al., 2007). This model suggests that intraitem and within-domain associations may be unitized more readily than between-domain associations. In line with this approach, we have found dissociations between the ERP correlates of successful retrieval of unimodal and cross-modal associations in a cued recall paradigm, possibly reflecting the effects of differential degrees of stimulus unitization (Tibon & Levy, 2013).
The current study was designed to address the domain dichotomy proposal and its implications for dual process models of recognition memory using ERPs elicited during an episodic recognition task. As noted above, almost all prior studies of unitization effects on associative recognition have employed word pairs. Unitization of word pairs, whether conventional cases such as “traffic jam” or creatively instructed cases such as “vegetable bible,” defined as a guide to aspiring gardeners (Bader et al., 2010), may not fully model the formation of associations in ecological conditions of audiovisual perception. We therefore employed two types of single-trial pair associate learning for which associative recognition would subsequently be performed. In the unimodal association task, stimulus pairs were color drawings of common objects (tools, animals, food, toys, vehicles, etc.), whereas in the cross-modal association task, stimulus pairs consisted of an object picture and a brief nameable environmental sound (e.g., dog barking, glass breaking, harp arpeggio, etc.). In both cases, the semantic relations between the stimuli comprising the pairs were arbitrary, and the study participants were asked to create an association in which the presented objects interact. At test, intact and recombined pairs were presented, and participants performed an associative recognition judgment task while their EEG was recorded. This paradigm enabled us to examine the time course of associative retrieval of unimodal and cross-modal associations and to differentiate between the processes subserving the retrieval of such associations. We hypothesized that neural correlates of familiarity-based recognition would be observed for unimodal pair associates, which could be more readily unitized at encoding, but not for cross-modal pair associates.
Participants were 33 healthy right-handed (all scored positively on the Edinburgh Handedness Inventory; Oldfield, 1971) young adults (26 women, mean age = 23.30 years, SD = 1.65 years, range = 21–29 years), with normal or corrected-to-normal vision. All were undergraduate students who volunteered in return for academic requirement credit or payment. Informed consent was obtained from all participants for a protocol approved by the Interdisciplinary Center's Institutional Review Board. Four participants were excluded from the analyses: one because of very poor task performance, another because of computer failure during the experiment, and two because of a very low number of trials (n < 8) in one bin after removal of EEG artifacts, leaving 29 participants whose data were analyzed.
One hundred twenty-five environmental sounds were selected for identifiability from a corpus downloaded from various Internet sources. These were converted to mono and adjusted to equal amplitude level using Audacity audio editing software, at 44,100 Hz, 32-bit resolution. The sounds selected to be used in the experiment varied in length as required for identifiability, with a minimum length of 340 msec and a maximum length of 1112 msec (mean = 828.78 msec, SD = 183.3 msec). Three raters who did not take part in the main study were asked to name the candidate sounds. One hundred twenty sounds that were correctly identified by a majority of the raters were used in the experiment. Five additional sounds were used for practice trials and examples.
Stimuli were 360 color drawings of common objects obtained from various Internet sources coming from categories including fruits and vegetables, tools, sporting goods, electrical and electronic devices, animals, furniture, and clothing, each approximately 6–8 cm in on-screen size. Seventeen additional drawings were used for the practice trials and examples.
To form the various experimental conditions, four stimulus lists were created for the encoding phase: Three lists of 120 pictures each and one list of 120 sounds. One list of pictures and the list of sounds were paired alternately with pictures from the other two lists. Thus, two thirds of the pictorial stimuli were counterbalanced across participants and tasks. There were no direct semantic relationships between the stimuli in each pair (e.g., never two animals or two inanimate objects from the same category, nor close relationships such as dog and bone), such that the association to be generated by the participant would be unconfounded by preexisting associations, and its formation would constitute a discrete event, leading to a subsequent episodic memory.
For the retrieval phase, test probes were either intact stimulus pairs, unmodified from study (intact condition), or recombined pairs comprised of two semantically unrelated studied stimuli that did not appear together at study (recombined condition). Four stimulus lists were created to form these conditions. Assignment of all stimuli to condition type (intact/recombined) was counterbalanced across participants.
As a pilot study indicated that performance in the unimodal task was superior to performance in the cross-modal task, we employed different block lengths to match difficulty levels; behavioral results (see below) indicated that this manipulation was essentially successful. The experiment thus consisted of three blocks. In the first and third blocks, each consisting of 60 picture–sound pairs, participants performed the cross-modal pair-associate task. In the second block, consisting of 120 picture–picture pairs, participants performed the unimodal pair-associate task.
Participants were tested individually in a quiet room. Upon arrival at the laboratory, they signed an informed consent form and filled out the Edinburgh Handedness Inventory (Oldfield, 1971). Following EEG electrode cap preparation (described below), participants were seated at a distance of ∼70 cm from a computer monitor. For the cross-modal session, in-ear headphones were fitted. Participants were then told that they would be presented with pairs of stimuli (picture–picture pairs in the unimodal task and sound–picture pairs in the cross-modal task) and were instructed to remember those pairs. They were further instructed to form an association between the stimuli, preferably by using imagery, to enhance their memory. They were told that after 60 pairs (in the cross-modal task) or 120 pairs (in the unimodal task) had been presented, a test phase would ensue, in which stimulus pairs would be presented. They were then to make an associative recognition memory judgment, replying on a confidence scale of 1–5 (1 = definitely didn't appear together, 2 = probably didn't appear together, 3 = don't know, 4 = probably appeared together, 5 = definitely appeared together). Participants were asked to relax and to avoid eye movements and blinks as much as possible.
During the encoding phase of each block, stimulus pairs were presented for 1100 msec, followed by a 700-msec blank screen. This was followed by a screen with the legend “Association?” to which participants were instructed to respond by hitting the “Enter” key once they had generated an association. Next, an 800-msec visual fixation cross appeared, followed by a blank screen for 700 msec and then the next stimulus pair. The first five participants were given unlimited time to respond. For the following participants, we limited RT to 10 sec to keep experiment length similar across participants. The behavioral results (see below) indicated that performance measures did not vary between the two groups. After all the pairs in the block (either 60 pairs in the cross-modal task or 120 pairs in the unimodal task) had been presented, the recognition phase started (Figure 1). In this phase, in the unimodal task, pairs of pictures were presented for 1100 msec. In the cross-modal task, the picture was also presented for 1100 msec. To compensate for the evolution of sound object identity over time, sound onset preceded picture onset by 200 msec. Participants were asked to provide their answers with their right hand using keys of a standard keyboard, marked 1–5 and spaced for comfortable finger placement (using the keys: ALT = 1, D = 2, T = 3, H = 4, N = 5). If a response was not provided before the stimuli disappeared, a blank screen appeared until the participant responded. The response triggered a 700-msec blank screen, followed by an 800-msec visual fixation cross. This was followed by an additional 700-msec blank screen, after which the next pair appeared. A practice block of five trials for the unimodal task and four trials for the cross-modal task was provided at the beginning of the experiment. During this practice session, the experimenter ascertained that the participant understood the nature of the associations to be generated using the stimulus pairs.
Self-paced rest breaks of several minutes' duration were given between experimental sessions.
Electrophysiological Recording Parameters and Data Processing
The EEG was recorded using the Active II system (BioSemi, Amsterdam, the Netherlands) from 64 electrodes mounted in an elastic cap according to the extended 10–20 system. EOG was recorded using four additional external electrodes, located above and below the right eye and on the outer canthi of both eyes. Additionally, one electrode was placed on the tip of the nose, and two electrodes were placed over the left and right mastoid bones, for reference purposes. The ground function during recording was provided by Common Mode Sense (CMS) and Driven Right Leg (DRL) electrodes forming a feedback loop, placed over parieto-occipital scalp. The on-line filter settings of the EEG amplifiers were 0.16–100 Hz. Both EEG and EOG were continuously sampled at 512 Hz and stored for off-line analysis.
Using the Fieldtrip toolbox for Matlab (Oostenveld, Fries, Maris, & Schoffelen, 2011), stimulus-locked ERPs were segmented into epochs starting 500 msec before cue presentation and up to 1000 msec afterward. EEG and EOG channels were then rereferenced to the average of the left and right mastoid channels, band-pass filtered with an off-line cutoff of 0.1–30 Hz, and baseline-adjusted by subtracting the mean amplitude of the prestimulus period (200 msec) of each trial from all the data points in the segment. Because in the cross-modal task the onset of the sound preceded that of the picture by 200 msec, the baseline period started 200 msec before the onset of the sound (i.e., 400 msec before picture onset; see Figure 1). Independent component analysis was employed to remove heartbeat, eye-movement, and blink artifacts (Makeig et al., 1999). Additional trials containing electrode pop artifacts and muscle artifacts were rejected visually. Channels showing drifts and other artifacts in individual trials were replaced with interpolated data from adjacent electrodes.
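The rereferencing and baseline-adjustment steps described above can be illustrated with a minimal sketch. It is written in plain Python rather than Fieldtrip, and the function names, argument layout, and example values are hypothetical and serve only to show the arithmetic; it is not the processing script used in the study.

```python
from statistics import mean


def rereference(channel, left_mastoid, right_mastoid):
    """Subtract the average of the two mastoid signals, sample by sample."""
    return [x - (l + r) / 2.0
            for x, l, r in zip(channel, left_mastoid, right_mastoid)]


def baseline_correct(epoch, times_ms, start_ms, end_ms):
    """Subtract the mean amplitude in [start_ms, end_ms) from every sample.

    times_ms gives the latency of each sample relative to picture onset.
    """
    baseline = [v for v, t in zip(epoch, times_ms) if start_ms <= t < end_ms]
    if not baseline:
        raise ValueError("no samples fall inside the baseline window")
    offset = mean(baseline)
    return [v - offset for v in epoch]


# Toy epoch sampled every 100 msec (illustrative values, not real data).
times = [-200, -100, 0, 100, 200]
epoch = [2.0, 4.0, 5.0, 9.0, 7.0]

# Unimodal task: 200-msec baseline immediately before picture onset.
corrected = baseline_correct(epoch, times, -200, 0)
```

For the cross-modal task, the same function would be called with a window of −400 to −200 msec relative to picture onset, which corresponds to the 200 msec preceding sound onset.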
Mean associative recognition rates and RTs were calculated for each confidence level for both retrieval conditions in each task. Behavioral data for the group of 29 participants whose EEG data were analyzed are shown in Table 1. To analyze the behavioral data, we initially collapsed over correct response choices; thus, for intact pairs “definitely together” and “probably together” responses were classified as correct responses, and for recombined pairs “definitely not together” and “probably not together” responses were classified as correct responses. RT outliers (3 SDs above or below the participant's average in each condition) were removed from behavioral and ERP analyses. We conducted a repeated-measures ANOVA with factors Task (cross-modal, unimodal) and Condition (intact, recombined) as repeated factors. For accuracy rates, this analysis revealed a significant main effect of Task, F(1, 28) = 14.26, p = .001. For RTs, the analysis revealed a significant effect of Task, F(1, 28) = 11.51, p < .01, and of Condition, F(1, 28) = 71, p < .001. This indicates that varying block length was not entirely successful in equating difficulty levels, as responses in the unimodal task were both faster and more accurate than responses in the cross-modal task. However, although differences in difficulty levels might account for overall differences in neural activations between cross-modal and unimodal tasks, there were no significant interactions between task and condition. Therefore, the data collected may be informative regarding our key question of interest—the electrophysiological correlates of successful recognition of unimodal and cross-modal associations. Furthermore, as shown in Table 1, the distribution of response confidence varies across tasks, with more high-confidence responses for unimodal, compared with cross-modal pairs. 
Therefore, to reduce the confounding of ERPs related to retrieval processes by differences in confidence levels, only high-confidence correct responses were included in the ERP analyses. As detailed below, ERP data were examined with mixed-effect models analysis, a method appropriate for unbalanced data, to deal with the differences in the number of high-confidence observations across the two tasks.
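The response classification and RT trimming described above can be sketched as follows. This is an illustration under stated assumptions, not the analysis code: the function names are hypothetical, "high confidence" is taken to mean the "definitely" ratings (1 and 5), and the sample standard deviation is assumed for the 3-SD criterion.

```python
from statistics import mean, stdev


def is_correct(pair_type, rating):
    """Collapse over confidence: ratings 4-5 are correct for intact pairs,
    ratings 1-2 are correct for recombined pairs; 3 (don't know) is never correct."""
    if rating in (4, 5):
        return pair_type == "intact"
    if rating in (1, 2):
        return pair_type == "recombined"
    return False


def is_high_confidence_correct(pair_type, rating):
    """Assumption: high confidence means a 'definitely' response (1 or 5)."""
    return (pair_type == "intact" and rating == 5) or \
           (pair_type == "recombined" and rating == 1)


def trim_rts(rts, n_sd=3.0):
    """Drop RTs more than n_sd standard deviations from the mean of this
    participant-by-condition cell (sample SD assumed)."""
    if len(rts) < 2:
        return list(rts)
    m, s = mean(rts), stdev(rts)
    return [rt for rt in rts if abs(rt - m) <= n_sd * s]
```

Applying `trim_rts` separately to each participant's RTs in each condition mirrors the per-condition criterion stated in the text.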
| ||Definitely .||Probably .||Definitely .||Probably .||Definitely .||Probably .||Definitely .||Probably .|
|Response rate (%)||84.2 (2)||1.8 (0.9)||85.3 (2.4)||4.1 (1.5)||77.8 (2.1)||5.4 (1.4)||75.6 (2.5)||7.5 (1.6)|
|RTs (msec)||1209 (46)||2857 (192)||1475 (65)||3131 (201)||1311 (46)||2592 (124)||1578 (57)||2649 (177)|
Standard errors are given in parentheses. ERP Trials = range of trial numbers per participant per condition.
As mentioned in the Methods section, some participants were tested with a time limit for encoding and some had no time limit. We therefore divided our data into two groups (limit, no limit) and ran the accuracy and RT analyses again, using Group as a between-subject factor. The Group factor did not interact with any of the other factors for either accuracy or RT. Additionally, no main effect of Group was found. Therefore, for ERP analyses we collapsed the two groups to form one data set.
ERP Analyses and Results
Trials were averaged across participants to compute four ERP waveforms: (1) Unimodal–intact, (2) Unimodal–recombined, (3) Cross-modal–intact, and (4) Cross-modal–recombined. As mentioned above, only correct high-confidence responses were included in the analyses.
To allow comparison with our previous results (Tibon & Levy, 2013), we used the same nine electrode clusters we used before, covering left anterior (LA: Fp1, AF3, F1, F3, F5), mid-anterior (MA: Fpz, AFz, Fz), right anterior (RA: Fp2, AF4, F2, F4, F6), left central (LC: FC1, FC3, FC5, C1, C3, C5), mid-central (MC: FCz, Cz), right central (RC: FC2, FC4, FC6, C2, C4, C6), left posterior (LP: CP1, CP3, CP5, P1, P3, P5, PO3), mid-posterior (MP: CPz, Pz, POz), and right posterior (RP: CP2, CP4, CP6, P2, P4, P6, PO4) locations (see Figure 2 for topographical distribution).
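The nine clusters listed above can be represented as a simple lookup table. The sketch below assumes, for illustration only, that a cluster's amplitude is the unweighted mean over its electrodes; the function name and input format are hypothetical.

```python
from statistics import mean

# Electrode clusters exactly as listed in the text.
CLUSTERS = {
    "LA": ["Fp1", "AF3", "F1", "F3", "F5"],
    "MA": ["Fpz", "AFz", "Fz"],
    "RA": ["Fp2", "AF4", "F2", "F4", "F6"],
    "LC": ["FC1", "FC3", "FC5", "C1", "C3", "C5"],
    "MC": ["FCz", "Cz"],
    "RC": ["FC2", "FC4", "FC6", "C2", "C4", "C6"],
    "LP": ["CP1", "CP3", "CP5", "P1", "P3", "P5", "PO3"],
    "MP": ["CPz", "Pz", "POz"],
    "RP": ["CP2", "CP4", "CP6", "P2", "P4", "P6", "PO4"],
}


def cluster_means(amplitude_by_electrode):
    """Average the amplitudes of the electrodes in each cluster.

    amplitude_by_electrode maps an electrode label to a mean amplitude
    (e.g., in microvolts) for one trial or one condition average.
    """
    return {
        name: mean([amplitude_by_electrode[e] for e in electrodes])
        for name, electrodes in CLUSTERS.items()
    }
```

The first letter of each key codes Laterality (L, M, R) and the second codes Anteriority (A, C, P), which matches the two spatial factors used in the statistical analyses.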
In previous studies, modulations of the frontal and parietal old/new effects have been examined in various time windows (e.g., 300–700 msec [Wolk et al., 2009], 300–750 msec [Ecker, Zimmer, Groh-Bordin, & Mecklinger, 2007], 300–800 msec [Kriukova et al., 2013; Speer & Curran, 2007; Wolk et al., 2006; Curran & Dien, 2003; Tsivilis, Otten, & Rugg, 2001], 350–700 msec [Bader et al., 2010; Opitz, 2010], 350–750 msec [Jäger et al., 2006], 350–800 msec [Mollison & Curran, 2012], 400–1200 msec [Senkfor & Van Petten, 1998], 500–1400 msec [Wilding & Rugg, 1996], and 550–900 msec [Graham & Cabeza, 2001]; for reviews, see Wilding & Ranganath, 2011; Rugg & Curran, 2007; Friedman & Johnson, 2000; Mecklinger, 2000). These prior findings indicate that mnemonic effects may be found across a rather long-lasting post-probe presentation span beginning as early as 300 msec and extending for more than 1 sec. We further delineated the time window of interest that appeared most relevant for the probe presentation employed here and the response profiles of the participants in the following fashion: The mean amplitudes of ERPs for both retrieval tasks were computed in 50-msec bins from −200 to 1000 msec relative to picture cue onset and used to conduct separate t tests (intact vs. recombined) in each time window, for each task, for each of the 64 scalp electrodes, at p < .01 (see Rosburg, Mecklinger, & Johansson, 2011, for a similar approach). Time windows in which there were effects apparent for either condition were included in the final analysis; in practice, this encompassed the entire recording period beginning from 400 msec. This analysis was confirmed by visual inspection of the distribution of electrodes showing significant retrieval success differences in both tasks, which also indicated differences starting ∼400 msec after stimulus presentation and extending to the end of the recording epoch.
To avoid confounding the data with motor activation driven by the response, we selected for analyses a time window ranging from 400 to 900 msec after stimulus presentation (about 300 msec before the average RT of the fastest condition; Table 1).
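The window-selection procedure above (mean amplitudes in 50-msec bins, followed by intact versus recombined t tests) can be sketched as follows. The sketch assumes a paired comparison across participants, which the text does not state explicitly, and the function names are hypothetical. Converting the t statistic to a p value is left to a statistics package.

```python
import math
from statistics import mean, stdev


def bin_means(samples, times_ms, start_ms=-200, stop_ms=1000, width_ms=50):
    """Mean amplitude in consecutive bins [lo, lo + width_ms) over the epoch.

    Returns one value per bin; a bin containing no samples yields NaN.
    """
    out = []
    for lo in range(start_ms, stop_ms, width_ms):
        vals = [v for v, t in zip(samples, times_ms) if lo <= t < lo + width_ms]
        out.append(mean(vals) if vals else float("nan"))
    return out


def paired_t(intact, recombined):
    """Paired-samples t statistic (df = n - 1) for one bin at one electrode.

    Each list holds one mean amplitude per participant, in the same order.
    """
    diffs = [a - b for a, b in zip(intact, recombined)]
    n = len(diffs)
    sd = stdev(diffs)
    if sd == 0:
        raise ValueError("differences have zero variance")
    return mean(diffs) / (sd / math.sqrt(n))
```

With the default arguments, `bin_means` returns 24 bins spanning −200 to 1000 msec. In the procedure described above, a test of this kind would be repeated for each bin, each task, and each of the 64 electrodes.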
Figure 3 shows group mean ERPs for each retrieval condition, for the nine electrode clusters in the two tasks. As can be seen, in general, ERPs in both tasks were more negative-going for recombined pairs compared with intact pairs. Nonetheless, whereas in cross-modal pairs the effect was found in central-posterior locations, for unimodal pairs widespread effects were found in all locations, including a striking anterior difference between conditions.
Mixed-effect Models Analysis
To analyze the ERP data, we used a linear mixed-effects models approach, which takes participant-specific variability into account in modeling effects, and can accommodate the repeated-measures study design. Such models can be considered a generalization of ANOVA, but use maximum likelihood estimation instead of sum of squares decomposition. An advantage of such an approach over standard repeated-measures ANOVA is that mixed-effects models are better suited for complex designs (e.g., Bagiella, Sloan, & Heitjan, 2000). Moreover, this approach is particularly recommended for unbalanced data, as in the current case, in which the number of trials in each condition varied because of differences in accuracy rates between conditions. Interindividual differences in EEG amplitude were modeled as a random intercept, which represents each participant's individual “baseline” over and above the effects of the fixed factors. The fixed part of the model includes the Task factor (unimodal, cross-modal), the Condition factor (intact, recombined), and two spatial location factors: Anteriority (anterior, central, and posterior) and Laterality (left, midline, and right). The fixed part of the model further included all possible interactions between these four fixed factors. Model parameters were estimated with the nlme package (Pinheiro, Bates, DebRoy, Sarkar, & the R Core team, 2007) for the software R (freely available at www.R-project.org).
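As a schematic illustration only, a random-intercept model of the kind described above can be written in the following general form. The notation is introduced here for illustration and does not appear in the analysis itself; in particular, the indexing of observations within participants is an assumption.

```latex
y_{ij} = \beta_0 + \mathbf{x}_{ij}^{\top}\boldsymbol{\beta} + u_i + \varepsilon_{ij},
\qquad u_i \sim \mathcal{N}(0, \sigma_u^2),
\qquad \varepsilon_{ij} \sim \mathcal{N}(0, \sigma^2)
```

Here y_{ij} is the mean amplitude for observation j of participant i, x_{ij} codes the fixed factors (Task, Condition, Anteriority, Laterality) and all of their interactions, β holds the corresponding fixed-effect coefficients, u_i is the participant-specific random intercept, and ε_{ij} is the residual error.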
This analysis revealed significant main effects of Task, F(1, 45827) = 42.31, p < .001, Condition, F(1, 45827) = 73.6, p < .001, Anteriority, F(2, 45827) = 444.83, p < .001, and Laterality, F(2, 45827) = 5, p < .01. There were also significant two-way interactions between Task × Condition, F(1, 45827) = 10, p < .01, Task × Anteriority, F(2, 45827) = 23.81, p < .001, and Task × Laterality, F(2, 45827) = 5.25, p < .01, and a significant three-way interaction between Task × Condition × Anteriority, F(2, 45827) = 3.07, p < .05. To further decompose the key three-way interaction we collapsed over the Laterality factor, which did not play a part in this interaction, and ran the analyses separately for each location (anterior, central, and posterior) using the participant as a random factor and the Task, Condition, and Task × Condition interaction as fixed factors. The results of this decomposition are portrayed in Figure 4. For the anterior location, this analysis revealed a significant main effect of Task, F(1, 15265) = 45.79, p < .001, and of Condition, F(1, 15265) = 29.67, p < .001, and a significant interaction between the two, F(1, 15265) = 11.62, p < .001. Decomposition of this interaction indicated that the effect of the Condition factor was apparent in the unimodal task, F(1, 8124) = 44.51, p < .001, but not in the cross-modal task. For the central location, significant main effects of Task, F(1, 15265) = 25.01, p < .001, and Condition, F(1, 15265) = 29.27, p < .001, emerged, whereas for the posterior location, only the Condition factor was significant, F(1, 15265) = 16.73, p < .001. Importantly, there were no significant Task × Condition interactions in either central or posterior locations.
Mixed-effects Model Analysis versus Repeated Measures ANOVA
As we explain above, mixed-effects model analysis appears to be the appropriate mode of inspecting data in which bin size differs between the various conditions. However, because that type of analysis is not yet widespread, we computed an average waveform for each participant in each condition and compared the results of the mixed-effects approach with those of a conventional repeated-measures ANOVA, using the same factors as those used as fixed factors in our mixed-effects analyses (Task, Condition, Anteriority, and Laterality). As in the mixed-effects analysis, a significant three-way Task × Condition × Anteriority interaction emerged, F(1.2, 33.55) = 4.42, p < .05. Decomposition of this interaction similarly indicated that a significant Task × Condition interaction was only apparent at anterior sites, F(1, 28) = 4.49, p < .05. This interaction was due to an effect of Condition in the unimodal task, t(28) = 4.05, p < .001, but not in the cross-modal task. On the other hand, the significant effect of Condition at central sites, F(1, 28) = 9.37, p < .01, and the marginal effect at posterior sites, F(1, 28) = 3.83, p = .06, did not interact with the Task factor.
We extended our ROI analysis by examining topographical differences in the distributions of cross-modal and unimodal tasks' associative retrieval effects using the entire montage of electrodes in all time windows. Differences in amplitude topography suggest that these effects might be mediated by distinct mechanisms (e.g., Allan, Robb, & Rugg, 2000). To directly compare topography of associative retrieval effects in the different tasks, we first calculated the difference waves (success minus failure) for each participant for both tasks. The topography of those differences is shown in Figure 5.
Difference amplitudes were then normalized according to the vector scaling procedure described by McCarthy and Wood (1985), applied within participants, as was suggested by Haig, Gordon, and Hook (1997). The comparison of the normalized difference amplitudes at 400–900 msec in a repeated-measures ANOVA with Task (unimodal/cross-modal) and Location (64 electrodes) as factors revealed a significant effect of Location, F(63, 1764) = 2.64, p < .001, and a significant Task × Location interaction, F(63, 1764) = 1.66, p = .001. This finding may be taken as an indication that different processes contributed to associative recognition success in unimodal and cross-modal tasks.
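Vector scaling, as used above, divides each topography by its vector length (the square root of the sum of squared amplitudes across electrodes), so that overall amplitude differences between tasks are removed and only the shape of the scalp distribution remains. A minimal sketch, with a hypothetical function name and toy values, is given below; it is applied to one participant's difference amplitudes for one task at a time.

```python
import math


def vector_scale(amplitudes):
    """Divide each electrode's amplitude by the vector length across electrodes."""
    length = math.sqrt(sum(a * a for a in amplitudes))
    if length == 0:
        raise ValueError("cannot scale an all-zero topography")
    return [a / length for a in amplitudes]


# Two toy topographies with the same shape but different overall amplitude
# (illustrative values, three electrodes instead of 64).
weak = vector_scale([1.0, 2.0, 2.0])
strong = vector_scale([2.0, 4.0, 4.0])
```

After scaling, `weak` and `strong` are identical, which is why a remaining Task × Location interaction is taken to reflect a difference in distribution rather than in strength.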
In the current study, an associative recognition memory task was employed to explore whether unimodal and cross-modal episodic associations formed during encoding differentially affect familiarity- and recollection-based recognition, as indexed by their putative electrophysiological signatures. Associative recognition of intact stimulus pairs was accompanied by a robust frontal positive deflection compared with recombined pairs in the unimodal condition only. In contrast, ERP correlates of associative recognition observed at more posterior sites were apparent in both conditions. These data provide novel evidence for a multiplicity of processes supporting associative recognition of intra- and interdomain associations for stimuli closely modeling conditions of ecological perception and memory. The findings indicate that, although the recognition of unimodal associations might be able to rely on familiarity-related processes, associative cross-modal recognition necessitates recollection.
The critical assumption underlying the current study was that, in accordance with the domain dichotomy theory, the ability of interactive intradomain associations to be processed in a unitized fashion (Mayes et al., 2007) may promote the contribution of familiarity to associative recognition. In accordance with this prediction, enhancement of an anterior ERP deflection conventionally interpreted as reflecting familiarity was selectively observed for unimodal stimulus pairs. A body of research on unitization, employing either experimentally or preexperimentally established relationships (Kriukova et al., 2013; Bader et al., 2010; Ford, Verfaellie, & Giovanello, 2010; Quamme et al., 2007; Giovanello, Keane, & Verfaellie, 2006) or coherent spatial or semantic representation (Diana, Yonelinas, & Ranganath, 2008; Jäger et al., 2006; Yonelinas et al., 1999), supports this notion of enhanced familiarity for these types of associations. However, Harlow, Mackenzie, and Donaldson (2010) present data that challenge this view. In two experiments, within-domain (two names or two abstract drawings) and between-domain (name + drawing) pairs were studied, and subsequent associative familiarity of intact versus rearranged pairs was assessed either by ROC analysis or by a modified remember/know procedure. In both cases, contrary to the prediction of the domain dichotomy theory, between-domain pairs yielded higher familiarity estimates than within-domain pairs. The discrepancy between the findings of Harlow and colleagues and those of the studies cited above as well as the current study might be understood as resulting from the types of stimuli involved, the encoding tasks employed, or the method of estimating familiarity. In Harlow et al. (2010), stimuli were personal names and abstract images, which might not engender unitization by their mere presentation (as opposed, e.g., to object pictures employed in the current study). Furthermore, in Harlow et al. (2010), participants were instructed to indicate how well the two items go together. In contrast, in the current study, participants were instructed to form a semantic association between the stimuli. Finally, familiarity estimates in Harlow et al. (2010) were assessed using either ROC analysis based on a dual process signal distribution assumption or a subjective familiarity-only exclusion report by participants. In contrast, the current study is focused on neural process dissociations, generally assumed to reflect the familiarity/recollection distinction, but amenable to alternative interpretations, as we will suggest below.
It should be noted that one possible reason for the absence of a frontal old/new effect in the cross-modal condition might be that environmental sounds do not engender ERP old/new effects. However, Cycowicz and Friedman (1999) report an old/new effect for environmental sounds in an item recognition task in the 500–700 msec time window following intentional encoding, which was statistically reliable over parietal scalp, but evident over fronto-central scalp as well (Figure 5 of their study). Those authors suggest that the old/new effect that they found for environmental sounds was familiarity-related. Furthermore, item familiarity was not diagnostic of associative memory in either the unimodal or the cross-modal condition, as all items had been studied. Therefore, the difference between components elicited by the two associative recognition conditions does not seem to be explicable by differences in responses to their component items alone.
In addition to the anterior effect of stronger positivity being elicited by intact than rearranged stimulus pairs in the unimodal task, an associative recognition-related central-posterior modulation was observed in both tasks. Although this later effect is rather broadly distributed and lacks the pronounced parietal maxima associated with the recollection-related late positive component (Wilding & Ranganath, 2011; Rugg & Curran, 2007), retrieval-related modulations with central topographic distribution are commonly reported in associative recognition ERP studies (e.g., Kriukova et al., 2013; Mollison & Curran, 2012; Bader et al., 2010; Rhodes & Donaldson, 2007, 2008) and are interpreted as reflecting recollective processes. In the cross-modal task, in which frontal modulations are not apparent, this central-posterior effect may be most plausibly linked with recollection, required to accurately discriminate between intact and rearranged picture–sound pairs. Posterior associative recognition-related modulation was similar in distribution and amplitude in the unimodal and cross-modal tasks. Therefore, this effect in the unimodal task appears to be dissociated from the more anterior modulation and seemingly indexes recollection-based retrieval in the unimodal condition as well. Indeed, in a study conducted by Diana, Van den Boom, Yonelinas, and Ranganath (2011), a late posterior effect was found for recollection-based responses in both high- and low-unitization conditions, suggesting that unitization did not greatly influence recollection-based recognition. However, it must be noted that in the current unimodal ERPs there was a substantial temporal overlap between the anterior and posterior modulations, with no distinct parietal maxima. The possibility therefore remains that the posterior modulation in the unimodal condition is an extension of the anterior effect and thus also reflects familiarity-related processes.
The latter option is supported by two recent studies (Kriukova et al., 2013; Bader et al., 2010) in which a parietal old/new effect was found only for nonunitized associations, indicating a clear contribution of recollection to associative recognition only for pairs of this kind. The authors surmise that the presence of an early midfrontal old/new effect in the absence of the parietal effect for unitized associations suggests that familiarity alone may have been sufficiently diagnostic for associative recognition of unitized pairs.
An alternative interpretation of the posterior ERP components that differ in response to intact and rearranged stimulus pairs is offered by recent studies (Coane, Balota, Dolan, & Jacoby, 2011; Bader et al., 2010; Wiegand et al., 2010) that have addressed the roles of relative and absolute familiarity in recognition. Following Mandler (1980), these researchers draw a distinction between absolute familiarity—baseline knowledge of an item—and relative (or incremental) familiarity—the relative increase of the familiarity signal compared with the preexperimental baseline as a result of an episodic encounter. This distinction suggests that, for tasks involving thoroughly novel stimuli (such as unfamiliar faces or abstract drawings), the assessment of absolute familiarity is mnemonically diagnostic, whereas in tasks with preexperimentally familiar stimuli (such as words), only relative familiarity is informative. It is proposed (Bader et al., 2010; Wiegand et al., 2010) that this distinction is reflected in differences in the topographical distribution of the early old/new ERP effect, with a mid-frontal effect associated with the assessment of relative familiarity, and a posteriorly distributed effect associated with the assessment of an absolute familiarity signal. In the current study, the onset of the posterior old/new effect was relatively early (∼400 msec), possibly linking it to early familiarity-related modulation, rather than to recollective processes. It is therefore theoretically possible to interpret the current findings as showing that for unimodal associations both relative and absolute familiarity contribute to recognition, whereas only the latter plays a role in the retrieval of cross-modal associations. However, such an interpretation would require greater preexperimental familiarity for the conjunction of the unrelated object picture pairs than for the conjunction of environmental sounds with unrelated object pictures. 
As we constructed all stimulus pairs in both conditions so as to minimize their preexperimental associations, this seems unlikely to be the case. It therefore seems that the anterior effect in the unimodal condition reflects de novo familiarity facilitated by greater ease of unitization whereas the posteriorly distributed effects should be attributed to recollection rather than to absolute familiarity.
Regardless of how the posterior associative modulation in the current study is interpreted, and irrespective of whether the frontal old/new effect for successful unimodal associative retrieval is definitively identified as a familiarity process, the current data indicate the presence of frontally distributed activity that characterizes that condition but is absent in successful cross-modal recognition. At the same time, the two conditions do not differ in posterior activity during that time window. If the differences between conditions were simply a function of difficulty, we would expect to see a graded effect across all scalp sites. The topographical distribution differences between the unimodal and cross-modal tasks suggest that the distinction is a matter of process dissociation rather than of simple strength. We speculate that the contribution to recognition judgments of an additional retrieval process (as indexed by the frontal activity in the unimodal condition) may be precisely what makes unimodal associative recognition easier and more prone to higher-confidence judgments.
Although dissociations in behavioral and functional findings regarding recognition memory are often explained in terms of familiarity and recollection processes, it is possible that a different taxonomy of subprocesses might better capture the cognitive operations that contribute to recognition and other forms of retrieval. In two recent studies (Tibon & Levy, 2013, 2014), we investigated the electrophysiological correlates of associative memory following unimodal pair-associate learning but tested retrieval by cued recall rather than by recognition. Critically, in those studies we identified an early frontal divergence between recall success and failure trials that was not likely to be accounted for by familiarity (because the target needed to be unboundedly generated rather than identified). We therefore proposed that it might reflect frontal lobe-based “working-with-memory” operations (Moscovitch, 1992). For unimodal associations, in which unitized representations might be formed at encoding, frontal mechanisms engaged in retrieval might query medial-temporal lobe representations via pattern completion attempts, continuing until retrieval is successful or a decision is made to cease retrieval efforts. Such operations may lead to retrieval success via “direct access” (Brainerd & Reyna, 2010), reflected in subsequent anterior modulation. Alternatively, successful retrieval may be achieved using recollective processes, reflected by later posterior components. Although in the current recognition paradigm familiarity can account for the frontal associative retrieval modulation, the results may also be understood in light of the abovementioned suggestion that the key dissociation among retrieval processes is not specifically between familiarity and recollection, but rather between direct ecphory and strategic reconstruction. 
Direct ecphory processes might include recognition of single items through familiarity, associative recognition of unitized representations by pattern completion, rapid cued recall of pair associates through direct access (Brainerd & Reyna, 2010), and perhaps even expressions of processing fluency asserted to reflect conceptual priming (Paller, Voss, & Boehm, 2007). On the other hand, strategic reconstruction of encoding episodes may contribute to item recognition judgments supported by recollection, associative recognition in the absence of unitization, and most types of cued recall. Regarding the current results, this taxonomy of retrieval processes may provide one way to understand the emergence of frontal associative retrieval effects in the unimodal condition (in which direct access to the representation of an encoded pair-associate via pattern completion/familiarity may be possible) but not in the cross-modal condition (in which modality differences do not allow for direct ecphory, requiring episodic reconstruction to distinguish between intact and recombined stimulus pairs). Further research is required to determine whether the distinction between direct and strategic retrieval modes provides the optimal characterization of ecphory across forms of memory assessment. Irrespective of the preferred interpretation, the current results provide evidence for the impact of unitization during associative encoding on the processes required for remembering associations.
D. A. L. was supported by Israel Science Foundation grant 611/09. We thank Ayelet Peer, Ilana Elstein, Anna Pintusevych, Valeria Azorina, and Lisa Tabata Laeber for assistance with stimulus preparation and data collection.
Reprint requests should be sent to Daniel A. Levy, School of Psychology and Unit for Applied Neuroscience, The Interdisciplinary Center, Herzliya 46683 Israel, or via e-mail: email@example.com.