Abstract
Mounting evidence linking gaze reinstatement—the recapitulation of encoding-related gaze patterns during retrieval—to behavioral measures of memory suggests that eye movements play an important role in mnemonic processing. Yet, the nature of the gaze scanpath, including its informational content and neural correlates, has remained in question. In this study, we examined eye movement and neural data from a recognition memory task to further elucidate the behavioral and neural bases of functional gaze reinstatement. Consistent with previous work, gaze reinstatement during retrieval of freely viewed scene images was greater than chance and predictive of recognition memory performance. Gaze reinstatement was also associated with viewing of informationally salient image regions at encoding, suggesting that scanpaths may encode and contain high-level scene content. At the brain level, gaze reinstatement was predicted by encoding-related activity in the occipital pole and BG, neural regions associated with visual processing and oculomotor control. Finally, cross-voxel brain pattern similarity analysis revealed overlapping subsequent memory and subsequent gaze reinstatement modulation effects in the parahippocampal place area and hippocampus, in addition to the occipital pole and BG. Together, these findings suggest that encoding-related activity in brain regions associated with scene processing, oculomotor control, and memory supports the formation, and subsequent recapitulation, of functional scanpaths. More broadly, these findings lend support to Scanpath Theory's assertion that eye movements both encode, and are themselves embedded in, mnemonic representations.
INTRODUCTION
The human visual field is limited, requiring us to move our eyes several times a second to explore the world around us. This necessarily sequential process of selecting visual features for fixation and further processing has important implications for memory. Research using eye movement monitoring indicates that, during visual exploration, fixations and saccades support the binding of salient visual features and the relations among them into coherent and lasting memory traces (e.g., Liu, Rosenbaum, & Ryan, 2020; Liu, Shen, Olsen, & Ryan, 2017; for a review, see Wynn, Shen, & Ryan, 2019). Moreover, such memory traces may be stored and subsequently recapitulated as patterns of eye movements or “scanpaths” at retrieval (Noton & Stark, 1971a, 1971b; for a review, see Wynn et al., 2019). Specifically, when presented with a previously encoded stimulus or a cue to retrieve a previously encoded stimulus from memory, humans (and nonhuman primates; see Sakon & Suzuki, 2019) spontaneously reproduce the scanpath enacted during encoding (i.e., gaze reinstatement), and this reinstatement is predictive of mnemonic performance across a variety of tasks (e.g., Wynn, Ryan, & Buchsbaum, 2020; Damiano & Walther, 2019; Wynn, Olsen, Binns, Buchsbaum, & Ryan, 2018; Scholz, Mehlhorn, & Krems, 2016; Laeng, Bloem, D'Ascenzo, & Tommasi, 2014; Olsen, Chiew, Buchsbaum, & Ryan, 2014; Johansson & Johansson, 2013; Foulsham et al., 2012; Laeng & Teodorescu, 2002; for a review, see Wynn et al., 2019). Although there is now considerable evidence supporting a link between gaze reinstatement (i.e., reinstatement of encoding gaze patterns during retrieval) and memory retrieval, investigations regarding the neural correlates of this effect are recent and few (see Bone et al., 2019; Ryals, Wang, Polnaszek, & Voss, 2015), and no study to date has investigated the patterns of neural activity at encoding that predict subsequent gaze reinstatement. Thus, to further elucidate the link between eye movements and memory at the neural level, this study used concurrent eye movement monitoring and fMRI to investigate the neural mechanisms at encoding that predict functional gaze reinstatement (i.e., gaze reinstatement that supports mnemonic performance) at retrieval, in the vein of subsequent memory studies (e.g., Brewer, Zhao, Desmond, Glover, & Gabrieli, 1998; Wagner et al., 1998; for a review, see Hannula & Duff, 2017).
Scanpaths have been proposed to at once contain, and support the retrieval of, spatiotemporal contextual information (Noton & Stark, 1971a, 1971b). According to Noton and Stark's (1971a, 1971b) seminal Scanpath Theory, on which much of the current gaze reinstatement literature is based (see Wynn et al., 2019), scanpaths consist of both image features and the fixations made to them as “an alternating sequence of sensory and motor memory traces.” Consistent with this proposal, research using eye movement monitoring and neuroimaging techniques has established an important role for eye movements in visual memory encoding (for a review, see Ryan, Shen, & Liu, 2020; Meister & Buffalo, 2016). For example, at the behavioral level, recognition memory accuracy is significantly attenuated when eye movements during encoding are restricted (e.g., to a central fixation cross) as opposed to free (e.g., Liu et al., 2020; Damiano & Walther, 2019; Henderson, Williams, & Falk, 2005). At the neural level, restricting viewing to a fixed location during encoding results in attenuated activity in brain regions associated with memory and scene processing including the hippocampus (HPC) and parahippocampal place area (PPA), as well as reduced functional connectivity between these regions and other cortical regions (Liu et al., 2020). When participants are free to explore, however, the number of fixations executed is positively predictive of subsequent memory performance (e.g., Fehlmann et al., 2020; Liu et al., 2017; Olsen et al., 2016; Loftus, 1972) and of activity in the HPC (Liu et al., 2017, 2020; see also Olsen et al., 2016) and medial temporal lobe (Fehlmann et al., 2020), suggesting that eye movements are critically involved in the accumulation and encoding of visual feature information into lasting memory traces. That the relationships between gaze fixations and recognition memory performance (e.g., Wynn, Buchsbaum, & Ryan, 2021; see also Chan, Chan, Lee, & Hsiao, 2018) and between gaze fixations and HPC activity (Liu, Shen, Olsen, & Ryan, 2018) are reduced with age, despite an increase in the number of fixations (e.g., Firestone, Turk-Browne, & Ryan, 2007; Heisz & Ryan, 2011), further suggests that these effects extend beyond the effects of mere attention or interest.
Recent work suggests that eye movements not only play an important role in memory encoding but also actively support memory retrieval. Consistent with the Scanpath Theory, several studies have provided evidence that gaze patterns elicited during stimulus encoding are recapitulated during subsequent retrieval and are predictive of mnemonic performance (e.g., Wynn et al., 2018, 2020; Damiano & Walther, 2019; Scholz et al., 2016; Laeng et al., 2014; Olsen et al., 2014; Johansson & Johansson, 2013; Foulsham et al., 2012; Laeng & Teodorescu, 2002; for a review, see Wynn et al., 2019). In addition to advancing a functional role for eye movements in memory retrieval, this literature has raised intriguing questions regarding the nature of the scanpath and its role in memory. For example, how are scanpaths created, and what information do they contain? To answer these questions, it is necessary not only to relate eye movement and behavioral patterns, as prior research has done, but also, and perhaps more critically, to relate eye movement and neural patterns. Yet, only two studies, to our knowledge, have directly investigated the neural correlates of gaze reinstatement, with both focusing on retrieval-related activity patterns. In the first of these studies, Ryals et al. (2015) demonstrated that trial-level variability in gaze similarity (between previously viewed scenes and novel scenes with similar feature configurations) was associated with activity in the right HPC. Extending this work, Bone et al. (2019) observed that gaze reinstatement (i.e., similarity between participant- and image-specific gaze patterns during encoding and subsequent visualization) was positively correlated with whole-brain neural reinstatement (i.e., similarity between image-specific patterns of brain activity evoked during encoding and subsequent visualization) during a visual imagery task. Considered together, these two studies provide evidence that functional gaze reinstatement is related to neural activity patterns typically associated with memory retrieval, suggesting a common mechanism.
Although there is now some evidence that mnemonic retrieval processes support gaze reinstatement at the neural level, the relationship between gaze reinstatement and encoding-related neural activity has yet to be investigated. Accordingly, this study used the data from Liu et al. (2020) to elucidate the encoding mechanisms that support the formation and subsequent recapitulation of functional scanpaths. Participants encoded intact and scrambled scenes under free or fixed (restricted) viewing conditions (in the scanner) and subsequently completed a recognition memory task with old (i.e., encoded) and new (i.e., lure) images (outside the scanner). Previous analysis of this data revealed that, when compared to free viewing, restricting eye movements reduced activity in the HPC, connectivity between the HPC and other visual and memory regions, and, ultimately, subsequent memory performance (Liu et al., 2020). These findings critically suggest that eye movements and memory encoding are linked at both the behavioral and neural levels. Here, we extend this work further by investigating the extent to which the patterns of eye movements, or scanpaths, that are created at encoding are reinstated at retrieval to support memory performance and also by investigating the neural activity at encoding that predicts the subsequent reinstatement of scanpaths at retrieval.
To this end, we first computed the spatial similarity between encoding and retrieval scanpaths (containing information about fixation location and duration) and used this measure to predict recognition memory accuracy. On the basis of prior evidence of functional gaze reinstatement, we predicted that gaze reinstatement would be both greater than chance and positively correlated with recognition of old images. To further interrogate the nature of information represented in the scanpath, we additionally correlated gaze reinstatement with measures of visual (i.e., stimulus-driven; bottom–up) and informational (i.e., participant-driven; bottom–up and top–down) saliency. Given that prior work has revealed a significant role for top–down features (e.g., meaning, Henderson & Hayes, 2018; scene content, O'Connell & Walther, 2015) in guiding eye movements, above and beyond bottom–up image features (e.g., luminance, contrast; Itti & Koch, 2000), we hypothesized that gaze reinstatement would be related particularly to the viewing of informationally salient image regions. Finally, to uncover the neural correlates of functional gaze reinstatement, we analyzed neural activity patterns at encoding, both across the whole brain and in memory-related ROIs (i.e., HPC, PPA; see Liu et al., 2020), to identify brain regions that (1) predicted subsequent gaze reinstatement at retrieval and (2) showed overlapping subsequent gaze reinstatement and subsequent memory effects. Given that previous work has linked gaze scanpaths, as a critical component of mnemonic representations, to successful encoding and retrieval, we hypothesized that functional gaze reinstatement would be supported by encoding-related neural activity in brain regions associated with visual processing (i.e., ventral visual stream regions) and memory (i.e., medial temporal lobe regions). By linking the neural correlates and behavioral outcomes of gaze reinstatement, this study provides novel evidence in support of Noton and Stark's assertion that scanpaths both serve to encode and are themselves encoded into memory, allowing them to facilitate retrieval via recapitulation and reactivation of informationally salient image features.
METHODS
Participants
Participants were 36 young adults (22 women) aged 18–35 years (M = 23.58 years, SD = 4.17) with normal or corrected-to-normal vision and no history of neurological or psychiatric disorders. All participants were recruited from the University of Toronto and surrounding Toronto area community and were given monetary compensation for their participation in the study. All participants provided written informed consent in accordance with the Research Ethics Board at the Rotman Research Institute at Baycrest Health Sciences.
Stimuli
Stimuli consisted of 864 colored images (500 × 500 pixels), comprising 24 images from each of 36 semantic scene categories (e.g., living room, arena, warehouse), varying along the feature dimensions of size and clutter (six levels per dimension = 36 unique feature level combinations, balanced across conditions).1 Within each scene category, eight images were assigned to the free-viewing encoding condition and eight images were assigned to the fixed-viewing encoding condition; images were randomly assigned to eight fMRI encoding runs (36 images per run per viewing condition). The remaining eight images in each scene category were used as novel lures at retrieval. One hundred forty-four scene images from encoding (72 images per viewing condition from two randomly selected encoding runs) and 72 scene images from retrieval (two per scene category) were scrambled using six levels of tile size (see Figure 1). Thus, in total, 432 intact scene images and 144 scrambled color-tile images were viewed at encoding, balanced across free- and fixed-viewing conditions, and 648 intact scene images (432 old and 216 novel lure) and 216 scrambled color-tile images (144 old and 72 novel lure) were viewed at retrieval. All images were balanced for low-level image properties (e.g., luminance, contrast)2 and counterbalanced across participants (for assignment to experimental/stimulus conditions).
Procedure
In-scan Scene Encoding Task
Participants completed eight encoding runs in the scanner, six containing scene images and two containing scrambled images (run order was randomized within participants3; see Figure 1A). Within each run, participants viewed 72 images, half of which were studied under free-viewing instructions and half of which were studied under fixed-viewing instructions. Before the start of each trial, participants were presented with a fixation cross for 1.72–4.16 sec (exponential distribution, M = 2.63 sec), placed at a random location within a 100-pixel (1.59° visual angle) radius around the center of the screen. The color of the cross indicated the viewing instructions for the following image, with free viewing indicated by a green cross and fixed viewing indicated by a red cross. After presentation of the fixation cross, a scene or scrambled image appeared for 4 sec, during which time participants were instructed to encode as much information as possible. If the image was preceded by a red cross, participants were to maintain fixation on the location of the cross for the duration of image presentation. The length of each run was 500 sec, with 10 and 12.4 sec added to the beginning and end of the run, respectively.
Postscan Scene Recognition Task
After the encoding task, participants were given a 60-min break before completing the retrieval task in a separate testing room. For the retrieval task, participants viewed all 576 images (432 scene images and 144 scrambled color-tile images) from the encoding task along with 288 novel lure images (216 scene images and 72 scrambled color-tile images), divided evenly into six blocks. Before the start of each trial, participants were presented with a fixation cross for 1.5 sec, placed at a random location within a 100-pixel radius around the center of the screen (for old trials, the fixation cross was presented at the same location as during the encoding task). After presentation of the fixation cross, a scene or scrambled image (either old, i.e., previously viewed during encoding, or novel lure) appeared for 4 sec. Participants were given 3 sec to indicate whether the presented image was "old" or "new" and to rate their confidence in that response, via keypress (z = high confidence "old," x = low confidence "old," n = high confidence "new," m = low confidence "new"). To quantify recognition memory for old images, points were assigned to each response as follows: z = 2, x = 1, m = 0, and n = −1.
Eye-tracking Procedure
During the encoding task, monocular eye movements were recorded inside the scanner using the Eyelink 1000 MRI-compatible remote eye tracker with a 1000-Hz sampling rate (SR Research Ltd.). The eye tracker was placed inside the scanner bore (behind the participant's head) and detected the pupil and corneal reflection via a mirror mounted on the head coil. During the retrieval task, monocular eye movements were recorded using the Eyelink II head-mounted eye tracker with a 500-Hz sampling rate (SR Research Ltd.). To ensure successful tracking during both the encoding and retrieval tasks, 9-point calibration was performed before the start of each task. Online manual drift correction to the location of the upcoming fixation cross was performed between trials when necessary. As head movements were restricted in the scanner, drift correction was rarely performed. Saccades greater than 0.5° of visual angle were identified by Eyelink using a velocity threshold of 30°/sec, an acceleration threshold of 8000°/sec², and a saccade onset threshold of 0.15°. Blinks were defined as periods in which the pupil signal was missing for three or more consecutive samples. All remaining samples (not identified as a saccade or blink) were classified as fixations.
MRI Protocol
As specified in Liu et al. (2020), a 3-T Siemens MRI scanner with a standard 32-channel head coil was used to acquire both structural and functional images. For structural T1-weighted high-resolution MRI images, we used a standard 3-D magnetization prepared rapid gradient echo pulse sequence with 170 slices, field of view = 256 × 256 mm, 192 × 256 matrix, 1-mm isotropic resolution, echo time/repetition time = 2.22/200 msec, flip angle = 9°, and scan time = 280 sec. Functional images were obtained using a T2*-weighted EPI acquisition protocol with repetition time = 2000 msec, echo time = 27 msec, flip angle = 70°, and field of view = 192 × 192 mm with 64 × 64 matrix (3 mm × 3 mm in-plane resolution, slice thickness = 3.5 mm with no gap). Two hundred fifty volumes were acquired for each run. Both structural and functional images were acquired in an oblique orientation 30° clockwise to the AC–PC axis. Stimuli were presented with Experiment Builder (SR Research Ltd.) back-projected to a screen (projector resolution: 1024 × 768) and viewed with a mirror mounted on the head coil.
Data Analysis
Gaze Reinstatement Analysis
To quantify the spatial overlap between the gaze patterns elicited by the same participants viewing the same images during encoding and retrieval, we computed gaze reinstatement scores for each image for each participant. Specifically, we computed the Fisher z-transformed Pearson correlation between the duration-weighted fixation density (i.e., heat) map4 (σ = 80) for each image for each participant during encoding and the corresponding density map for the same image being viewed by the same participant during retrieval (“match” similarity; R eyesim package: https://github.com/bbuchsbaum/eyesim; see Figure 1B). Critically, although this measure (“match” similarity) captures the overall similarity between encoding and retrieval gaze patterns, it is possible that such similarity reflects participant-specific (e.g., tendency to view each image from left to right), image-invariant (e.g., tendency to preferentially view the center of the screen) viewing biases. Thus, to control for idiosyncratic viewing tendencies (which were not of particular interest for this study), we additionally computed the similarity between participant- and image-specific retrieval density maps and 50 other randomly selected encoding density maps (within participant, stimulus type, and viewing condition). The resulting 50 scores were averaged to yield a single “mismatch” similarity score for each participant for each image.
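For illustration, the logic of this computation can be sketched in R as follows. This is a minimal sketch using base R with placeholder function and variable names; the actual analysis used the eyesim package, whose API differs.

```r
# Minimal sketch of the match/mismatch gaze similarity computation (base R).
# Function and variable names are illustrative; they are not the eyesim API.

make_density_map <- function(fix_x, fix_y, fix_dur, size = 500, sigma = 80) {
  # Duration-weighted fixation density ("heat") map: one Gaussian (sd = sigma,
  # in pixels) per fixation, weighted by its duration, summed over fixations.
  grid <- seq_len(size)
  map  <- matrix(0, size, size)
  for (i in seq_along(fix_x)) {
    gx  <- dnorm(grid, mean = fix_x[i], sd = sigma)
    gy  <- dnorm(grid, mean = fix_y[i], sd = sigma)
    map <- map + fix_dur[i] * outer(gy, gx)  # rows index y, columns index x
  }
  map
}

similarity <- function(map1, map2) {
  # Fisher z-transformed Pearson correlation between two density maps
  atanh(cor(as.vector(map1), as.vector(map2)))
}

# Usage (per participant, per image):
#   match_z    <- similarity(encoding_map, retrieval_map)
#   mismatch_z <- mean(sapply(random_encoding_maps, similarity, map2 = retrieval_map))
#   gaze_reinstatement <- match_z - mismatch_z
```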
Match and mismatch similarity scores were contrasted using an ANOVA with Similarity Value as the dependent variable and Stimulus Type (scene, scrambled), Viewing Condition (free, fixed), and Similarity Template (match, mismatch) as the independent variables. For all subsequent analyses, gaze reinstatement was reported as the difference between match and mismatch similarity scores, thus reflecting the spatial similarity between encoding and retrieval scanpaths for the same participant viewing the same image, controlling for idiosyncratic viewing biases.
To investigate the effect of gaze reinstatement on mnemonic performance, we ran a linear mixed effects model (LMEM) on trial-level accuracy (coded for a linear effect: high confidence miss = −1, low confidence miss = 0, low confidence hit = 1, high confidence hit = 2) with fixed effects including all interactions of gaze reinstatement (match similarity − mismatch similarity; z scored), stimulus type (scene*, scrambled), and viewing condition (free*, fixed) as well as random effects including random intercepts for participant and image. Backward model comparison (α = .05) was used to determine the most parsimonious model (p values approximated with the lmerTest package; Kuznetsova, Brockhoff, & Christensen, 2017).
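A sketch of this model specification in lmerTest is shown below; the data frame and its column names are simulated placeholders rather than the study's actual data.

```r
# Hedged sketch of the accuracy LMEM; trial_data and its columns are simulated
# placeholders, not the study's data. Reference levels: stim = scene, view = free.
library(lmerTest)

set.seed(1)
trial_data <- data.frame(
  subj   = factor(rep(1:36, each = 24)),
  image  = factor(rep(1:24, times = 36)),
  stim   = factor(sample(c("scene", "scrambled"), 864, replace = TRUE),
                  levels = c("scene", "scrambled")),
  view   = factor(sample(c("free", "fixed"), 864, replace = TRUE),
                  levels = c("free", "fixed")),
  reinst = rnorm(864),                         # z-scored (match - mismatch) similarity
  acc    = sample(-1:2, 864, replace = TRUE)   # -1 = HC miss ... 2 = HC hit
)

full_model <- lmer(acc ~ reinst * stim * view + (1 | subj) + (1 | image),
                   data = trial_data)
summary(full_model)  # lmerTest supplies approximate p values

# Backward model comparison: drop nonsignificant higher-order terms and compare
reduced_model <- update(full_model, . ~ . - reinst:stim:view)
anova(full_model, reduced_model)
```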
Saliency Analysis
To characterize gaze patterns at encoding, and specifically, the type of information encoded into the scanpath, saliency was computed for each image using two methods. First, we used a leave-one-subject-out cross-validation procedure to generate duration-weighted informational saliency (participant data-driven) maps5 for each image using the aggregated fixations of all participants (excluding the participant in question) viewing that image during encoding (mean number of fixations per image = 204, aggregated from all included participants). Second, we used the Saliency Toolbox (Walther & Koch, 2006) to generate visual saliency maps by producing 204 pseudo-fixations for each image based on low-level image properties including color, intensity, and orientation. Critically, whereas the stimulus (Saliency Toolbox)-driven saliency map takes into account primarily bottom–up stimulus features (e.g., luminance, contrast), the participant data-driven saliency map takes into account any features (bottom–up or top–down) that might attract viewing for any reason (e.g., semantic meaning, memory). To quantify the extent to which individual gaze patterns during encoding were guided by salient bottom–up and top–down features, participant- and image-specific encoding gaze patterns were correlated with both the informational (participant data-driven) and visual (stimulus [Saliency Toolbox]-driven) saliency maps in the same manner as the gaze reinstatement analysis described above. This analysis yielded two scores per participant per image reflecting the extent to which fixations at encoding were guided by high-level image features (i.e., informational saliency; based on the data-driven saliency map) and low-level image features (i.e., visual saliency; based on the stimulus-driven saliency map).
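The leave-one-subject-out scoring against the informational saliency map can be sketched as follows, reusing the density map and similarity functions from the earlier sketch. The fixations data frame and its column names are placeholders; the visual saliency map itself comes from the MATLAB Saliency Toolbox and is scored in the same way.

```r
# LOSO informational saliency scoring (illustrative names; reuses
# make_density_map() and similarity() from the earlier sketch).
informational_saliency_score <- function(fixations, target_subj, target_img) {
  # Informational saliency map: duration-weighted density of all OTHER
  # participants' encoding fixations on the same image.
  others  <- subset(fixations, img == target_img & subj != target_subj)
  sal_map <- make_density_map(others$x, others$y, others$dur)

  # The held-out participant's own encoding density map for that image
  own     <- subset(fixations, img == target_img & subj == target_subj)
  own_map <- make_density_map(own$x, own$y, own$dur)

  # Fisher z-transformed correlation, as in the gaze reinstatement analysis
  similarity(own_map, sal_map)
}
```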
To investigate the relationship between encoding gaze patterns and gaze reinstatement, we ran an LMEM on gaze reinstatement with visual and informational saliency scores (z scored) as predictors. To compare the strength of each saliency score in predicting gaze reinstatement, saliency scores were dummy coded (visual saliency = 0, informational saliency = 1). Random intercepts for participant and image were also included in the model.
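One way to implement the dummy-coded comparison described above is a long-format model in which each trial contributes one row per saliency type; this specification is our reading of the description and is shown only as a sketch, reusing the simulated trial_data from the earlier sketch.

```r
# Long-format saliency LMEM sketch (sal_score values are placeholders).
# sal_type: 0 = visual saliency, 1 = informational saliency.
sal_long <- data.frame(
  subj      = rep(trial_data$subj, 2),
  image     = rep(trial_data$image, 2),
  reinst    = rep(trial_data$reinst, 2),
  sal_type  = rep(c(0, 1), each = nrow(trial_data)),
  sal_score = rnorm(2 * nrow(trial_data))
)

# Main effect of sal_score = effect of visual saliency (reference level);
# sal_score:sal_type = change in that effect for informational saliency.
sal_model <- lmer(reinst ~ sal_score * sal_type + (1 | subj) + (1 | image),
                  data = sal_long)
summary(sal_model)
```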
fMRI data preprocessing.
The fMRI preprocessing procedure was previously reported in Liu et al. (2020); for completeness, it is re-presented here. MRI images were processed using SPM12 (Statistical Parametric Mapping, Wellcome Trust Centre for Neuroimaging, University College London; www.fil.ion.ucl.ac.uk/spm/software/spm12/, Version 7487) in the MATLAB environment (The MathWorks, Inc.). Following the standard SPM12 preprocessing procedure, slice timing was first corrected using sinc interpolation with the midpoint slice as the reference slice. Then, all functional images were aligned using a six-parameter linear transformation. Next, for each participant, functional image movement parameters obtained from the alignment procedure, as well as the global signal intensity of these images, were checked manually using the freely available toolbox ART (www.nitrc.org/projects/artifact_detect/) to detect volumes with excessive movement and abrupt signal changes. Volumes indicated as outliers by ART default criteria were excluded later from statistical analyses. Anatomical images were coregistered to the aligned functional images and segmented into white matter, gray matter, cerebrospinal fluid, skull/bones, and soft tissues using SPM12 default six-tissue probability maps. These segmented images were then used to calculate the transformation parameters mapping from the individuals' native space to the Montreal Neurological Institute (MNI) template space. The resulting transformation parameters were used to transform all functional and structural images to the MNI template. For each participant, the quality of coregistration and normalization was checked manually and confirmed by two research assistants. The functional images were finally resampled at a 2 × 2 × 2 mm resolution and smoothed using a Gaussian kernel with an FWHM of 6 mm. The first five fMRI volumes from each run were discarded to allow the magnetization to stabilize to a steady state, resulting in 245 volumes in each run.
fMRI Analysis
Parametric modulation analysis.
To interrogate our main research question, that is, which brain regions' activity during encoding was associated with subsequent gaze reinstatement, we conducted a parametric modulation analysis in SPM12. Specifically, we first added the condition mean activation regressors for the free- and fixed-viewing conditions, by convolving the onset of trials of each condition with the canonical hemodynamic response function in SPM12. We then added the trial-wise gaze reinstatement measure as our interested linear modulator, which was also convolved with the hemodynamic response function. We also added motion parameters, as detailed in Liu et al. (2020), as regressors of no interest. Default high-pass filters with a cutoff of 128 sec and a first-order autoregressive model AR(1) were also applied.
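To make the construction of these regressors concrete, the snippet below builds a condition-mean regressor and a gaze reinstatement parametric modulator for one run outside of SPM12, using an approximation of the canonical double-gamma HRF. The onsets and reinstatement values are placeholders, trials are modeled as simple events, and the timing follows the acquisition parameters reported above (TR = 2 sec, 245 retained volumes).

```r
# Illustrative construction of a parametric modulation regressor (not SPM12 code).
TR     <- 2          # repetition time (s)
n_vols <- 245        # volumes per run after discarding the first five
dt     <- 0.1        # microtime resolution (s)

# Approximation of the canonical double-gamma HRF
t_hrf <- seq(0, 32, by = dt)
hrf   <- dgamma(t_hrf, shape = 6, rate = 1) - dgamma(t_hrf, shape = 16, rate = 1) / 6
hrf   <- hrf / max(hrf)

onsets <- c(12, 30, 52, 75)        # placeholder trial onsets (s)
reinst <- c(0.4, -0.1, 0.7, 0.2)   # placeholder trial-wise gaze reinstatement

stick     <- numeric(n_vols * TR / dt)
modulated <- stick
idx <- round(onsets / dt) + 1
stick[idx]     <- 1                      # condition-mean (event) regressor
modulated[idx] <- reinst - mean(reinst)  # mean-centered parametric modulator

convolve_and_downsample <- function(x) {
  x <- convolve(x, rev(hrf), type = "open")[seq_along(x)]  # convolve with HRF
  x[seq(1, length(x), by = TR / dt)]                       # one value per TR
}
mean_regressor <- convolve_and_downsample(stick)
pmod_regressor <- convolve_and_downsample(modulated)
```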
Using this design matrix, we first estimated the modulation effect of gaze reinstatement at the individual level. These beta estimates, averaged across all scene runs, were then carried to the group-level analyses in which within-participant t tests were used to examine which brain regions showed stronger activity when greater gaze reinstatement was observed. For this analysis, we primarily focused on the free-viewing scene condition as this is the condition in which the gaze reinstatement measure is most meaningful (because participants were allowed to freely move their eyes). In this analysis, the HPC and PPA served as our a priori ROIs (see Supplementary Figure S1 in Liu et al., 2020). As specified in Liu et al. (2020), the HPC ROI for each participant was obtained using the FreeSurfer recon-all function, Version 6.0 (surfer.nmr.mgh.harvard.edu; Fischl, 2012). The PPA ROIs were obtained using the "scene versus scrambled color tile" picture contrast. The MNI coordinates for the peak activation of the PPA were [32, −34, −18] for the right PPA and [−24, −46, −12] for the left PPA. The left and right PPA ROIs contained 293 and 454 voxels, respectively.
To explore whether other brain regions showed gaze reinstatement modulation effects, in addition to the ROI analysis, we also obtained voxel-wise whole-brain results. As an exploratory analysis, we used a relatively lenient threshold of p = .005 with a 10-voxel extension (no correction), which can also facilitate future meta-analyses (Lieberman & Cunningham, 2009).
Brain activation pattern similarity between parametric modulation of gaze reinstatement and subsequent memory.
To understand the extent to which there was similar modulation of brain activity by gaze reinstatement and by subsequent memory, we calculated cross-voxel brain activation pattern similarity between the two parametric modulation effects. This analysis allowed us to test whether the brain activity associated with the two behavioral variables (i.e., trial-wise gaze reinstatement and subsequent memory) shares a similar pattern. First, we obtained subsequent memory modulation effects as detailed in Liu et al. (2020). Specifically, in this subsequent memory effect analysis, we coded subsequent recognition memory for each encoding trial based on participants' hit/miss response and confidence (correct recognition with high confidence = 2, correct recognition with low confidence = 1, missed recognition with low confidence = 0, missed recognition with high confidence = −1). We then used this measure as a linear parametric modulator to find brain regions that showed a subsequent memory effect, that is, stronger activation when trials were subsequently better remembered. We averaged the subsequent memory effect estimates across runs for each participant. We then extracted unthresholded voxel-by-voxel subsequent memory effects and gaze reinstatement effects (i.e., estimated betas) for the HPC and PPA, separately. These beta values were then vectorized, and Pearson correlations were calculated between the two vectors of the two modulation effects for each ROI. Finally, these Pearson correlations were Fisher z transformed to reflect the cross-voxel pattern similarity between the subsequent memory effect and the gaze reinstatement modulation effect.
Although we mainly focused on the brain activation pattern similarity between the two modulation effects in the free-viewing scene condition, we also obtained the same measure for the fixed-viewing scene condition to provide a control condition. If the brain activation pattern modulated by the gaze reinstatement measure is related to memory processing in the free-viewing scene condition, it should show larger-than-zero pattern similarity with the subsequent memory effects, which should also be greater than those in the fixed-viewing scene condition. Therefore, at the group level, we used one-sample t tests to examine whether the similarity z scores in the free-viewing scene condition were larger than zero and used a paired t test to compare the similarity scores against those in the fixed-viewing scene condition.
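The core of this ROI computation, and the group-level tests, can be sketched as follows; the inputs are random placeholders standing in for the beta estimates extracted from the SPM models described above.

```r
# Cross-voxel modulation pattern similarity sketch (placeholder data).
pattern_similarity <- function(beta_mem, beta_gaze) {
  # Vectorize the two ROI beta maps, correlate, and Fisher z-transform
  atanh(cor(as.vector(beta_mem), as.vector(beta_gaze)))
}

# Example with random placeholder "ROI" betas for one participant
set.seed(2)
roi_mem  <- rnorm(300)   # subsequent memory modulation betas (e.g., 300 voxels)
roi_gaze <- rnorm(300)   # gaze reinstatement modulation betas
pattern_similarity(roi_mem, roi_gaze)

# Group level: one similarity z score per participant and condition
z_free  <- rnorm(36, mean = 0.10, sd = 0.2)  # placeholder free-viewing scores
z_fixed <- rnorm(36, mean = 0.00, sd = 0.2)  # placeholder fixed-viewing scores
t.test(z_free, mu = 0)                       # similarity > 0 in free viewing?
t.test(z_free, z_fixed, paired = TRUE)       # free vs. fixed viewing
```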
In addition to the ROI brain activation pattern similarity, we also examined brain activation similarity between subsequent memory and gaze reinstatement for the whole brain in each voxel using a searchlight analysis (The Decoding Toolbox v3.997; Hebart, Görgen, & Haynes, 2015). Specifically, for each voxel, we applied an 8-mm spheric searchlight to calculate the across-voxel (voxels included in this searchlight) brain activation pattern similarity between the subsequent memory effect and the gaze reinstatement modulation effect, using the same procedure detailed above for the ROI analysis. We first generated the brain activation similarity z-score images for the free- and fixed-viewing scene conditions separately for each participant. At the group level, the individual participants' brain activation similarity z-score images were tested against zero for the free-viewing scene condition and compared to the similarity images in the fixed-viewing scene condition using paired t tests. For this whole-brain voxel-wise analysis, we used a threshold of p = .005 with a 10-voxel extension (uncorrected; see Lieberman & Cunningham, 2009).
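Although the searchlight was run with The Decoding Toolbox, its logic can be sketched as follows, assuming two 3-D beta maps and a logical brain mask on a 2-mm grid; all names are placeholders, and the sketch illustrates the procedure rather than the toolbox's implementation.

```r
# Searchlight sketch: for each in-mask voxel, correlate the subsequent memory and
# gaze reinstatement modulation betas across voxels within an 8-mm sphere.
searchlight_similarity <- function(mem_map, gaze_map, mask, vox_size = 2, radius = 8) {
  dims  <- dim(mask)
  r_vox <- ceiling(radius / vox_size)
  offs  <- as.matrix(expand.grid(dx = -r_vox:r_vox, dy = -r_vox:r_vox, dz = -r_vox:r_vox))
  offs  <- offs[sqrt(rowSums(offs^2)) * vox_size <= radius, , drop = FALSE]

  out     <- array(NA_real_, dims)
  centers <- which(mask, arr.ind = TRUE)
  for (i in seq_len(nrow(centers))) {
    nb <- sweep(offs, 2, centers[i, ], "+")              # neighbor coordinates
    ok <- nb[, 1] >= 1 & nb[, 1] <= dims[1] &
          nb[, 2] >= 1 & nb[, 2] <= dims[2] &
          nb[, 3] >= 1 & nb[, 3] <= dims[3]
    nb <- nb[ok, , drop = FALSE]
    nb <- nb[mask[nb], , drop = FALSE]                   # keep in-mask neighbors
    if (nrow(nb) > 2) {
      out[centers[i, , drop = FALSE]] <- atanh(cor(mem_map[nb], gaze_map[nb]))
    }
  }
  out  # similarity z-score image, one value per in-mask voxel
}
```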
RESULTS
Behavioral Results
Results of the ANOVA on recognition memory performance are reported in Liu et al. (2020). In short, a significant Stimulus Type × Viewing Condition interaction indicated that recognition memory was significantly higher in the free-viewing than in the fixed-viewing condition for scene images only, and higher for scenes than for scrambled images under free viewing only (see Figure 2E in Liu et al., 2020).
Eye Movement Results
To determine whether gaze reinstatement was significantly greater than chance, we ran an ANOVA with Similarity Value as the dependent variable and Stimulus Type (scene, scrambled), Viewing Condition (free, fixed), and Similarity Template (match, mismatch) as the independent variables. If individual retrieval gaze patterns are indeed image specific, they should be more similar to the gaze pattern for the same image viewed at encoding (match) than for other images within the same participant, image category, and condition (mismatch). Results of the ANOVA revealed a significant three-way interaction of Similarity Template, Stimulus Type, and Viewing Condition, F(1, 34) = 7.09, p = .012, ηp2 = .17. Post hoc tests of the difference in mean match and mismatch similarity scores indicated that match similarity was significantly greater than mismatch similarity in all conditions and categories [fixed scene: t(69.7) = 2.12, p = .037; fixed scrambled: t(69.7) = 3.60, p = .001; free scene: t(69.7) = 6.22, p < .001, see Figure 2A; free scrambled: t(69.7) = 4.583, p < .001].
To explore the relationship between gaze reinstatement and mnemonic performance, we ran an LMEM on trial-level accuracy with interactions of gaze reinstatement (match similarity − mismatch similarity), stimulus type (scene*, scrambled), and viewing condition (free*, fixed) as fixed effects as well as participant and image as random effects. Results of the final best fit model indicated that accuracy was significantly greater for scenes relative to scrambled images (β = −0.24, SE = 0.03, t = −8.19, p < .001), and this effect was significantly attenuated for fixed viewing (Stimulus Type × Viewing Condition: β = 0.17, SE = 0.03, t = 5.10, p < .001). Accuracy was also significantly greater for free relative to fixed viewing (β = −0.17, SE = 0.16, t = −10.56, p < .001; see Figure 2A), and this effect was significantly attenuated for scrambled images (see Stimulus Type × Viewing Condition). Finally, the model revealed a significant positive effect of gaze reinstatement on accuracy (β = 0.06, SE = 0.01, t = 5.28, p < .001; see Figure 2B) for free-viewing scenes, and this effect was significantly attenuated for fixed viewing (Gaze Reinstatement × Viewing Condition: β = −0.04, SE = 0.14, t = −2.82, p = .005) and for scrambled images (Gaze Reinstatement × Stimulus Type: β = −0.06, SE = 0.16, t = −3.53, p < .001). The addition of number of gaze fixations to the model significantly improved the model fit (χ2 = 15.52, p < .001; see also Liu et al., 2020) but importantly did not abolish the effect of gaze reinstatement. Furthermore, a correlation of mean gaze reinstatement scores and mean cumulative encoding gaze fixations was nonsignificant (r = .049, p = .79), suggesting that these effects were independent.
To determine whether gaze reinstatement (i.e., the extent to which encoding gaze patterns were recapitulated at retrieval) was related to gaze patterns (i.e., the types of information viewed) at encoding, we derived two measures to capture the extent to which individual gaze patterns at encoding reflected “salient” image regions. Given that “saliency” can be defined by both bottom–up (e.g., bright) and top–down (e.g., meaningful) image features, with the latter generally outperforming the former in predictive models (e.g., Henderson & Hayes, 2018; O'Connell & Walther, 2015), we computed two saliency maps for each image using the Saliency Toolbox (visual saliency map, reflecting bottom–up stimulus features) and aggregated participant data (informational saliency map, reflecting bottom–up and top–down features). Gaze patterns for each participant for each image were compared to both the visual and informational saliency maps, yielding two saliency scores. To probe the relationship between encoding gaze patterns and subsequent gaze reinstatement, we ran an LMEM on gaze reinstatement with saliency scores (visual*, informational) as fixed effects and participant and image as random effects. Results of the model revealed a significant effect of saliency on controlled gaze reinstatement (β = 0.10, SE = 0.01, t = 9.36, p < .001), indicating that similarity of individual encoding gaze patterns to the visual saliency map predicted subsequent gaze reinstatement at retrieval. Notably, the saliency effect was significantly increased when the informational saliency map was used in place of the visual saliency map (β = 0.06, SE = 0.01, t = 5.01, p < .001), further indicating that gaze reinstatement is best predicted by encoding gaze patterns that prioritize “salient” image regions, being regions high in bottom–up and/or top–down informational content.
fMRI Results
To answer our main research question regarding the neural activity patterns at encoding that predict subsequent gaze reinstatement (at retrieval), we first examined the brain regions in which activations during encoding were modulated by trial-wise subsequent gaze reinstatement scores (i.e., brain regions that showed stronger activation for trials with higher subsequent gaze reinstatement). Our ROI analyses did not yield significant effects for either the HPC or PPA (ts = −0.31 to 1.13, ps = .26–.76; Figure 3A). However, as evidenced by the whole-brain voxel-wise results (Figure 3B), the occipital poles bilaterally showed a parametric modulation by subsequent gaze reinstatement at p = .005, with a 10-voxel extension (no correction). Two clusters in the BG also showed effects at this threshold. All regions that showed gaze reinstatement modulation effects at this threshold are presented in Table 1.
Table 1. Regions showing gaze reinstatement modulation effects at encoding (whole-brain analysis; MNI coordinates in mm)

| Anatomical Areas | Cluster Size | t Value | p Value | x | y | z |
| --- | --- | --- | --- | --- | --- | --- |
| Precentral_L | 63 | 3.983363 | .000164 | −30 | −4 | 42 |
| Putamen_R | 91 | 3.919645 | .000197 | 28 | 2 | 16 |
| Occipital_Inf_R | 180 | 3.858175 | .000235 | 32 | −94 | −4 |
| Occipital_Mid_L | 145 | 3.834414 | .000251 | −28 | −98 | 2 |
| Putamen_L | 161 | 3.815405 | .000265 | −26 | 12 | 12 |
| Caudate_R | 12 | 3.431526 | .000778 | 14 | 6 | 20 |
| Supp_Motor_Area_L | 12 | 3.112625 | .00184 | −10 | 16 | 50 |
| Putamen_R | 14 | 2.931331 | .002952 | 20 | 8 | 8 |
All clusters survived the threshold of p < .005, with a 10-voxel extension, no correction. The names of the anatomical regions in the table, obtained using the automated anatomical labeling (AAL) toolbox for SPM12, follow the AAL template naming convention (Tzourio-Mazoyer et al., 2002). R/L = right/left hemisphere; Mid = middle; Inf = inferior.
As reported previously by Liu et al. (2020; see Figure 6A), both the PPA and HPC showed a parametric modulation by subsequent memory; that is, the PPA and HPC were activated more strongly for scenes that were later successfully recognized versus forgotten. Although PPA and HPC activation at the mean level were not modulated by subsequent gaze reinstatement, we investigated whether the variation across voxels within each ROI in supporting subsequent memory was similar to the variation of these voxels in supporting subsequent gaze reinstatement. Critically, this cross-voxel brain modulation pattern similarity analysis can reveal whether the pattern of activation of voxels in an ROI contains shared information, or supports the overlap, between subsequent memory and subsequent gaze reinstatement effects. Results of this analysis revealed significant pattern similarity between the two modulation effects, subsequent memory and gaze reinstatement, in both the right PPA and right HPC, t = 2.37 and 3.31, and p = .024 and .002, respectively. The left HPC showed a marginally significant effect, t = 1.88, p = .069, whereas the left PPA similarity effect was not significant, t = 1.41, p = .17 (Figure 4A and B).
Since the occipital pole and BG regions showed stronger mean level activation for trials with greater subsequent gaze reinstatement, we also examined the pattern similarity in the voxel clusters in these two regions. Specifically, we obtained the two voxel clusters in the BG and the occipital pole that survived the threshold of p = .005 (no correction) in the gaze reinstatement modulation analysis (Figure 3B) and then computed the pattern similarity scores as we did for the PPA and HPC (see above). Similar to the PPA and HPC results, the right BG and right occipital pole ROIs showed significant pattern similarity between the subsequent memory and subsequent gaze reinstatement modulation effects, t = 2.45 and 2.36, and p = .02 and .024, respectively. The left ROIs did not show any significant results, p > .05 (Figure 4C and D).
Directly comparing the brain activation pattern similarity between the free- versus fixed-viewing condition revealed greater brain pattern similarity in the free- versus fixed-viewing condition for the right PPA, HPC, and occipital pole regions (t = 3.84, 3.55, and 2.24, and p = .0005, .001, and .032, respectively). The left HPC and a region in the left fusiform gyrus also showed marginally significant effects (t = 2.02 and 1.91, and p = .051 and .065, respectively).
As reported earlier, gaze reinstatement and memory performance were correlated at the behavioral level. Therefore, to ensure that the observed pattern similarity between the subsequent memory and gaze reinstatement modulation effects was specific to brain regions that were important for the scene encoding task, such as the ROIs tested above, and not general to all brain regions (i.e., reflecting the shared variance between the two behavioral measures at the brain level), we employed a searchlight method in which a sphere with a radius of 8 mm was used to obtain the similarity value at each voxel of the brain. As shown in Figure 5A, not all brain regions showed the similarity effect. Instead, two large clusters in both the left and right HPC showed significant similarity between the subsequent memory and gaze reinstatement modulation effects (SPM small volume correction using HPC mask: cluster level pFWE-corr = .004 and .012, cluster size = 248 and 172 voxels). Other regions including regions in the ventral and dorsal visual stream also showed similar patterns. These results confirm that the pattern similarity effect (i.e., the brain manifestation of the shared variance between gaze reinstatement and memory performance) occurred specifically in brain regions that are known to play key roles in visual memory encoding.
To further confirm the specificity of the pattern similarity effect, we conducted the same analysis for the fixed-viewing scene condition, which, consistent with our hypothesis, yielded no significant results in the HPC or in other ventral visual stream regions (Figure 5B). Directly contrasting the pattern similarity between the free- versus fixed-viewing conditions confirmed that the similarity between the subsequent memory and subsequent gaze reinstatement modulation effects was specific to brain regions typically implicated in scene encoding, such as the left and right HPC (SPM small volume correction using HPC mask: cluster level pFWE-corr = .023 and .022, cluster size = 126 and 130 voxels; Figure 5C), and specific to the free-viewing condition (Figure 5A and B).
Notably, the occipital poles showed stronger activation bilaterally for subsequently remembered versus subsequently forgotten trials (embedded brain image [right] in Figure 6) and for trials with stronger subsequent gaze reinstatement (embedded brain image [left] in Figure 6). This region also showed similar cross-voxel modulation patterns for the subsequent memory and gaze reinstatement effects. We thus hypothesized that the activation of this region may mediate the relationship between gaze reinstatement and subsequent memory. To test this prediction, we conducted a mediation analysis in which we examined whether the effect of gaze reinstatement on subsequent memory could be significantly reduced when brain activity in the occipital pole, aggregated across the left and right, was entered as a mediator in the regression analysis. Specifically, for each participant, we first estimated brain activity for each scene image in each condition using the beta-series method (Rissman, Gazzaley, & D'Esposito, 2004). We then extracted the occipital pole activation corresponding to the left and right occipital pole ROIs. Next, at the individual level, we conducted a mediation analysis with the trial-wise gaze reinstatement measure as the predictor (x), occipital pole ROI activation as the mediator (m), and the subsequent memory measure as the outcome variable (y). The regression coefficient a (see Figure 6) was obtained when x was used to predict m, b was obtained when m was used to predict y (while controlling for x), and c′ was obtained when x was used to predict y (while controlling for m). Finally, the coefficients a, b, and c′ were averaged across runs for each participant and then tested at the group level using t tests. In line with our prediction, occipital pole activation partially mediated the effect of gaze reinstatement on subsequent memory (indirect path: t = 1.86, p = .035, one-tailed; Figure 6).
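A simplified sketch of this per-participant mediation, using ordinary regressions over trials and placeholder data, is shown below; the actual analysis averaged coefficients across runs before the group-level tests.

```r
# Per-participant mediation sketch (placeholder data):
# x = trial-wise gaze reinstatement, m = occipital pole beta-series activation,
# y = subsequent memory score.
mediation_paths <- function(x, m, y) {
  a       <- unname(coef(lm(m ~ x))["x"])      # path a: x -> m
  fit_y   <- lm(y ~ m + x)
  b       <- unname(coef(fit_y)["m"])          # path b: m -> y, controlling for x
  c_prime <- unname(coef(fit_y)["x"])          # direct effect: x -> y, controlling for m
  c(a = a, b = b, c_prime = c_prime, indirect = a * b)
}

# Placeholder single-participant trial data
set.seed(3)
x <- rnorm(100)
m <- 0.3 * x + rnorm(100)
y <- 0.3 * m + 0.2 * x + rnorm(100)
mediation_paths(x, m, y)

# Group level: collect one coefficient set per participant (averaged across runs)
# and test the indirect (a * b) path against zero, e.g.,
# t.test(indirect_by_participant, mu = 0, alternative = "greater")  # one-tailed
```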
DISCUSSION
This study explored the neural correlates of functional gaze reinstatement—the recapitulation of encoding-related gaze patterns during retrieval that is significantly predictive of mnemonic performance. Consistent with the Scanpath Theory (Noton & Stark, 1971a, 1971b), research using eye movement monitoring has demonstrated that the spatial overlap between encoding and retrieval gaze patterns is correlated with behavioral performance across a number of memory tasks (e.g., Wynn et al., 2018, 2020; Damiano & Walther, 2019; Scholz et al., 2016; Laeng et al., 2014; Olsen et al., 2014; Johansson & Johansson, 2013; Foulsham et al., 2012; Laeng & Teodorescu, 2002; for a review, see Wynn et al., 2019). Indeed, guided or spontaneous gaze shifts to regions viewed during encoding (i.e., gaze reinstatement) have been proposed to support memory retrieval by reactivating the spatiotemporal encoding context (Wynn et al., 2019). In line with this proposal, recent work using concurrent eye tracking and fMRI has indicated that gaze reinstatement elicits patterns of neural activity typically associated with successful memory retrieval, including HPC activity (Ryals et al., 2015) and whole-brain neural reactivation (Bone et al., 2019). Critically, however, these findings do not speak to the cognitive and neural processes at encoding that support the creation of functional scanpaths. This question is directly relevant to Scanpath Theory, which contends that eye movements not only facilitate memory retrieval but are themselves embedded in the memory trace (Noton & Stark, 1971a, 1971b). Accordingly, this study investigated the neural regions that support the formation and subsequent recapitulation of functional scanpaths. Extending earlier findings, and lending support to Scanpath Theory, here we show for the first time that functional gaze reinstatement is correlated with encoding-related neural activity patterns in brain regions associated with sensory (visual) processing, motor (gaze) control, and memory. Importantly, these findings suggest that, like objects and the relations among them, scanpaths may be bound into memory representations, such that their recapitulation may cue, and facilitate the retrieval of, additional event elements (see Wynn et al., 2019).
Consistent with previous work, this study found evidence of gaze reinstatement that was significantly greater than chance and significantly predictive of recognition memory accuracy when participants freely viewed repeated scenes. In addition, gaze reinstatement (measured during free viewing of scenes at retrieval) was positively associated with encoding-related neural activity in the BG and in the occipital pole. Previous work has linked the BG to voluntary saccade control, particularly when saccades are directed toward salient or rewarding stimuli (for a review, see Gottlieb, Hayhoe, Hikosaka, & Rangel, 2014), and to memory-guided attentional orienting (Goldfarb, Chun, & Phelps, 2016). Dense connections with brain regions involved in memory and oculomotor control including the HPC and FEFs (Shen, Bezgin, Selvam, McIntosh, & Ryan, 2016) make the BG ideally positioned to guide visual attention to informative (i.e., high reward probability) image regions. The occipital pole has been similarly implicated in exogenous orienting (Fernández & Carrasco, 2020) and visual processing, including visual imagery (St-Laurent, Abdi, & Buchsbaum, 2015), partly because of the relationship between neural activity in the occipital pole and gaze measures including fixation duration (Choi & Henderson, 2015) and saccade length (Frey, Nau, & Doeller, 2020).
Notably, the occipital pole region identified in the current study was spatially distinct from the occipital place area seen in other studies (e.g., Bonner & Epstein, 2017; Patai & Spiers, 2017; Dilks, Julian, Paunov, & Kanwisher, 2013), suggesting that it may differentially contribute to scene processing, possibly by guiding visual exploration. Moreover, the identified occipital pole region did not include area V1, suggesting that unlike (the number of) gaze fixations, which modulate activity in early visual regions (Liu et al., 2017), gaze reinstatement does not directly reflect the amount of bottom–up visual input present at encoding. Rather, gaze reinstatement may be related more specifically to the selective sampling and processing of informative regions at encoding (see also Fehlmann et al., 2020). Indeed, during encoding, viewing of informationally salient regions, as defined by participant data-driven saliency maps capturing both low-level and high-level image features, was significantly more predictive of subsequent gaze reinstatement than viewing of visually salient regions, as defined by stimulus (Saliency Toolbox)-driven saliency maps capturing low-level image features. The occipital pole additionally partially mediated the effect of gaze reinstatement on subsequent memory, further suggesting that this region may contribute to mnemonic processes via the formation of gaze scanpaths reflecting informationally salient image regions. Taken together with the neuroimaging results, these findings suggest that viewing, and consequentially, encoding, regions high in informational and/or rewarding content may facilitate the laying down of a scanpath that, when recapitulated, facilitates recognition via comparison of presented visual input with stored features (see Wynn et al., 2019).
To further interrogate the relationship between gaze reinstatement and memory at the neural level, we conducted a pattern similarity analysis to identify brain regions in which neural activity patterns corresponding to gaze reinstatement and those corresponding to subsequent memory covaried. Results of this analysis revealed significant overlap between the subsequent memory and subsequent gaze reinstatement effects in the occipital pole and BG (regions that showed a parametric modulation by subsequent gaze reinstatement) and in the PPA and HPC (regions that showed a parametric modulation by subsequent memory; see Liu et al., 2020). These regions may therefore be important for scene encoding (see Liu et al., 2020), in part through their role in linking the scanpath to the resulting memory representation. Specifically, parametric modulation and pattern similarity effects in the occipital pole and BG suggest that, when informationally salient image features are selected for overt visual attention, those features are encoded into memory along with the fixations made to them, which are subsequently recapitulated during retrieval. Consistent with Scanpath Theory's notion of the scanpath as a sensory–motor memory trace (Noton & Stark, 1971a, 1971b), these findings suggest that eye movements themselves may be part and parcel of the memory representation. The absence of gaze reinstatement-related activity in object- or location-specific processing regions (e.g., PPA, lateral occipital cortex) or low-level visual regions (e.g., V1) further suggests that reinstated scanpaths (at least in the present task) cannot be solely attributed to overlap in bottom–up visual saliency or memory for particularly salient image features. Indeed, recent work from Wang, Baumgartner, Kaule, Hanke, and Pollmann (2019) indicates that simply following a face- or house-related gaze pattern (without seeing a face or house) is sufficient to elicit activity in the FFA or PPA, respectively, suggesting that visual identification is not based solely on visual features but rather can also be supported by efferent oculomotor signals. The present findings further suggest that such signals, serving as a part of the memory representation, may be referenced and used by the HPC, similar to other mnemonic features (e.g., spatial locations, temporal order; Yonelinas, 2013; Davachi, 2006), to cue retrieval of associated elements within memory. That is, although the HPC may not be directly involved in generating or storing the scanpath (which may instead rely on visual and oculomotor regions), similar patterns of HPC activity that predict subsequent gaze reinstatement and subsequent memory suggest that the HPC may index these oculomotor programs, along with other signals, in the service of mnemonic binding and retrieval functions (e.g., relative spatial position coding; see Connor & Knierim, 2017).
Importantly, the finding that the HPC, in particular, similarly codes for subsequent memory and subsequent gaze reinstatement is consistent with its purported role in coordinating sensory and mnemonic representations (see Knapen, 2021). Indeed, early accounts positioned the HPC as the site at which already-parsed information from cortical processors are bound into lasting memory representations (Cohen & Eichenbaum, 1993). The notion that the oculomotor effector trace is included within the HPC representation is aligned with more recent work showcasing the inherent, and reciprocal, connections between the HPC and oculomotor systems. Research using computational modeling and network analyses, for example, indicates that the HPC and FEF are both anatomically and functionally connected (Ryan, Shen, Kacollja, et al., 2020; Shen et al., 2016; for a review, see Ryan, Shen, & Liu, 2020). Indeed, whereas damage to the HPC leads to impairments on several eye-movement-based measures (e.g., Olsen et al., 2015, 2016; Hannula, Ryan, Tranel, & Cohen, 2007; Ryan, Althoff, Whitlow, & Cohen, 2000), disruption of the FEF (via TMS) leads to impairments in memory recall (Wantz et al., 2016). Other work further suggests that visual and mnemonic processes share a similar reference frame, with connectivity between the HPC and V1 showing evidence of retinotopic orientation during both visual stimulation and visual imagery (Knapen, 2021; see also Silson, Zeidman, Knapen, & Baker, 2021). That the HPC may serve as a potential “convergence zone” for binding disparate event elements, including eye movements, is further supported by evidence from intracranial recordings in humans and animals suggesting that the coordination of eye movements with HPC theta rhythms supports memory encoding (Hoffman et al., 2013; Jutras, Fries, & Buffalo, 2013) and retrieval (Kragel et al., 2020) and by evidence of gaze-centric cells in the HPC (and entorhinal cortex; Meister & Buffalo, 2018; Killian, Jutras, & Buffalo, 2012) that respond to a particular gaze location (e.g., Chen & Naya, 2020; Rolls, Robertson, & Georges-François, 1997; for a review, see Nau, Julian, & Doeller, 2018). Extending this work, the present findings suggest that gaze reinstatement and subsequent memory share similar variance in the brain and may be supported by similar HPC mechanisms. Furthermore, these findings critically suggest that reinstated gaze patterns may be recruited and used by the HPC in the service of memory retrieval.
With this study, we provide novel evidence that encoding-related activity in the occipital pole and BG during free viewing of scenes is significantly predictive of subsequent gaze reinstatement, suggesting that scanpaths that are later recapitulated may contain important visuosensory and oculomotor information. Indeed, gaze reinstatement was correlated more strongly with encoding of informationally salient regions than visually salient regions, suggesting that the scanpath carries information related to high-level image content. Critically, visual, oculomotor, and mnemonic ROIs (i.e., occipital pole, BG, PPA, HPC) showed similar patterns of activity corresponding to subsequent memory (see Liu et al., 2020) and subsequent gaze reinstatement, further supporting a common underlying neural mechanism. Lending support to Scanpath Theory, the present results suggest that gaze scanpaths, beyond scaffolding memory retrieval, are themselves embedded in the memory representation (see Cohen & Eichenbaum, 1993), similar to other elements, including spatial and temporal relations (see Yonelinas, 2013; Davachi, 2006), and may be utilized by the HPC to support memory retrieval. Given the nature of the present task, we focused here on the spatial overlap (including fixation location and duration information) between gaze patterns during encoding and retrieval, but future work could also explore how temporal order information embedded in the scanpath may similarly or differentially contribute to memory retrieval. Thus, although further research will be needed to fully elucidate the neural mechanisms supporting functional gaze reinstatement, particularly across different tasks and populations, the current findings spotlight the unique interactions between overt visual attention and memory that extend beyond behavior to the level of the brain. Moreover, these findings speak to the importance of considering, and accounting for, effector systems, including the oculomotor system, in models of memory and cognition more broadly.
Acknowledgments
This work was supported by a Vision: Science to Applications postdoctoral fellowship awarded to Z. X. L.
Reprint requests should be sent to Jennifer D. Ryan, Rotman Research Institute, 3560 Bathurst St., Toronto, ON M6A 2E1, Canada, or via e-mail: [email protected].
Author Contributions
Jordana S. Wynn: Conceptualization; Data curation; Formal analysis; Investigation; Methodology; Validation; Visualization; Writing—Original draft; Writing—Review & editing. Zhong-Xu Liu: Conceptualization; Data curation; Formal analysis; Investigation; Methodology; Validation; Visualization; Writing—Original draft; Writing—Review & editing. Jennifer D. Ryan: Conceptualization; Funding acquisition; Project administration; Resources; Supervision; Writing—Review & editing.
Funding Information
Jennifer D. Ryan, Natural Sciences and Engineering Research Council of Canada (https://dx.doi.org/10.13039/501100000038), grant number: RGPIN-2018-06399. Jennifer D. Ryan, Canadian Institutes of Health Research (https://dx.doi.org/10.13039/501100000026), grant number: MOP126003.
Diversity in Citation Practices
A retrospective analysis of the citations in every article published in this journal from 2010 to 2020 has revealed a persistent pattern of gender imbalance: Although the proportions of authorship teams (categorized by estimated gender identification of first author/last author) publishing in the Journal of Cognitive Neuroscience (JoCN) during this period were M(an)/M = .408, W(oman)/M = .335, M/W = .108, and W/W = .149, the comparable proportions for the articles that these authorship teams cited were M/M = .579, W/M = .243, M/W = .102, and W/W = .076 (Fulvio et al., JoCN, 33:1, pp. 3–7). Consequently, JoCN encourages all authors to consider gender balance explicitly when selecting which articles to cite and gives them the opportunity to report their article's gender citation balance. The authors of this article report its proportions of citations by gender category to be as follows: M/M = .55, W/M = .175, M/W = .075, and W/W = .2.
Notes
For further details regarding stimulus selection and feature equivalence, see Liu et al. (2020).
To achieve luminance and contrast balance, all color RGB images were transferred to NTSC space using the built-in MATLAB function rgb2ntsc.m. Then, the luminance (i.e., the NTSC Y component) and contrast (i.e., the standard deviation of luminance) were obtained for each image, and the mean values were used to balance (i.e., equalize) the luminance and contrast for all images using SHINE toolbox (Willenbockel et al., 2010). Finally, the images were transferred back to their original RGB space using the MATLAB function ntsc2rgb.m.
For further details regarding the randomization procedure, see Liu et al. (2020).
For further details regarding the density map computation, see Wynn et al. (2020).
Reference variable.
Using the same density map computation as the gaze reinstatement analysis, see Wynn et al. (2020).
REFERENCES
Author notes
Equal contribution.