Fundamental to our understanding of learning is the role of attention. We investigated how attention affects two fMRI measures of stimulus-specific memory: repetition suppression (RS) and pattern similarity (PS). RS refers to the decreased fMRI signal when a stimulus is repeated, and it is sensitive to manipulations of attention and task demands. In PS, region-wide voxel-level patterns of responses are evaluated for their similarity across repeated presentations of a stimulus. More similarity across presentations is related to better learning, but the role of attention on PS is not known. Here, we directly compared these measures during the visual repetition of scenes while manipulating attention. Consistent with previous findings, we observed RS in the scene-sensitive parahippocampal place area only when a scene was attended both at initial presentation and upon repetition in subsequent trials, indicating that attention is important for RS. Likewise, we observed greater PS in response to repeated pairs of scenes when both instances of the scene were attended than when either or both were ignored. However, RS and PS did not correlate on either a scene-by-scene or subject-by-subject basis, and PS measures revealed above-chance similarity even when stimuli were ignored. Thus, attention has different effects on RS and PS measures of perceptual repetition.
How do we make sense of a complex visual world? One challenge we face is overcoming an abundance of input from the environment. Because of capacity limits in information processing, our attentional systems select relevant information to process and ignore or inhibit the rest (Chun, Golomb, & Turk-Browne, 2011; Corbetta & Shulman, 2002). Another challenge is to achieve recognition by successfully matching incoming information to existing stored knowledge, which is acquired through learning from experiences (Greenough, Black, & Wallace, 1987).
For such learning to take place, we must strike a balance between preserving existing information (stability) and encoding new information (plasticity; Norman, Newman, & Perotte, 2005). Attentional control plays a critical role at both of these junctures by allocating limited resources in a competition arising from both external stimuli (e.g., during perception) and internal stimuli (e.g., thoughts, memories, etc.). To do this, frontal and parietal regions of an attentional control network inhibit or facilitate the processing of earlier sensory regions (e.g., occipital or temporal cortex; Corbetta, Kincade, & Shulman, 2002; Corbetta & Shulman, 2002; Kanwisher & Wojciulik, 2000; Kastner, Pinsk, De Weerd, Desimone, & Ungerleider, 1999). According to Chun and Johnson's (2011) Perceptual/Reflective Attention and Memory framework, the same sensory regions within the ventral-occipital temporal cortex are responsible for representing both external percepts and long-term memories (Kuhl, Rissman, Chun, & Wagner, 2011; Johnson & Johnson, 2009; Johnson, McDuff, Rugg, & Norman, 2009), but the manner in which attention acts on them may differ. Acquiring reliable measures of visual learning will help us understand the role of attention in perception and memory.
fMRI researchers have developed several tools to index visual learning. The most common measure is “repetition suppression” (RS), also known as MR adaptation (Grill-Spector & Malach, 2001), repetition attenuation (Yi & Chun, 2005), or neural priming (Maccotta & Buckner, 2004). In RS, the neural response to a stimulus is dampened upon repeated presentations (Grill-Spector, Henson, & Martin, 2006). This effect is observed in single-cell recordings (Desimone, 1996; Lueschow, Miller, & Desimone, 1994) as well as fMRI (Yi & Chun, 2005; Henson & Rugg, 2003; Grill-Spector & Malach, 2001; Grill-Spector et al., 1999). A greater RS effect indicates more experience and better learning; indeed, RS effects correlate with subsequent memory for learned stimuli (Turk-Browne, Yi, & Chun, 2006; Yi & Chun, 2005). RS is frequently used to measure perceptual discrimination. For example, release from RS is observed when a novel stimulus is perceived as sufficiently different from habituated stimuli. In this manner, RS has been used to measure how well a particular brain region can discriminate across different stimuli: RS magnitude decreases as stimulus dissimilarity increases (Henson & Rugg, 2003; Grill-Spector & Malach, 2001).
Once considered an implicit, automatic response to repeated exposure, RS in fact can vary with explicit subsequent memory and is modulated by attention (Turk-Browne et al., 2006; Yi & Chun, 2005). Yi and Chun (2005) demonstrated that RS was observed only for repeated images that were attended. In their experiment, subjects viewed pairs of face–scene composite images. They were instructed to attend to either the face or scene dimension of the image on a block of trials and to detect whether the stimulus changed on the attended dimension when two face–scene composites were presented in close succession. Critical images were presented twice during an experimental run. Importantly, these critical images could be either attended or ignored at each of the first and second presentations. For example, a critical scene could appear first while the participant was instructed to attend to scenes and again later when the participant was instructed to attend to faces. The authors found RS in the scene-sensitive parahippocampal place area (PPA) region only when scenes were attended during both presentations. Moreover, the amount of suppression was proportional to behavioral performance on a subsequent memory task, indicating that RS correlated with learning. In summary, this study demonstrated that the RS measure of visual learning is dependent on attention during both encoding and subsequent access. Although RS is a useful measure of learning, its absence in a particular experimental paradigm does not logically indicate that a stimulus was not learned. For example, in the Yi and Chun (2005) study, ignored scenes may have induced learning-related changes in neural circuitry, even if they did not exhibit RS effects or explicit subsequent memory. In fact, just as behavioral measures such as priming may reveal memory traces when recognition tasks do not (Ward, Kuhl, & Chun, in revision), other fMRI measures may reveal learning-related changes when RS is absent.
New fMRI methods to measure stimulus-specific processing and memory continue to emerge, and one of the more promising measures is distributed pattern analysis. In fMRI, this method has been most commonly implemented as “multivoxel pattern analysis,” in which a machine learning algorithm trained to recognize neural patterns of responses to one set of visual stimuli can correctly classify another set of responses to similar stimuli (Norman, Polyn, Detre, & Haxby, 2006). For example, distributed pattern analysis has allowed researchers to distinguish the neural responses to object categories such as shoes, scenes, and bottles (Haxby et al., 2001). Remembered stimuli can be classified in the same manner from the ventral-occipito-temporal cortex (Kuhl et al., 2011; Serences, Ester, Vogel, & Awh, 2009), supporting the idea that the same regions of cortex respond similarly during perception and during reactivation of memories (Chun & Johnson, 2011).
More simply, researchers have compared the similarity of two specific patterns to measure perceptual representation, learning, and memory (Xue et al., 2010; Drucker & Aguirre, 2009; Drucker, Kerr, & Aguirre, 2009). For example, using this type of pattern similarity (PS) analysis, Xue et al. showed that better subsequent memory was linked to a more similar pattern of activity between two repeated presentations. In their study, participants viewed face images that repeated several times through the course of the fMRI scanning period. Following the scan, participants were given a subsequent memory test on faces appearing in the experiment. Memory for a face in this test was predicted by the similarity between the neural responses to the same face across trials of the fMRI scan. The more similar the neural response across repeated presentations, the more likely the participant would subsequently remember that face (Xue et al., 2010).
Like RS, measuring the changes of distributed neural patterns to stimuli over time can provide insight into the mechanisms of visual learning. Unlike RS, in PS less change (higher correlation) from one presentation to the next is what reveals stronger learning: in the Xue et al. study, subsequent memory was better for faces with higher PS during encoding. PS therefore likely provides an index of efficient, veridical encoding, which is presumably driven by more uniformity of neural responses across presentations.
Because PS is relatively new, very few studies have directly compared it with RS as a measure of visual representation and memory. One study compared RS with classification performance from multivoxel pattern analysis, rather than PS per se. When measuring learned responses to orientation gratings, RS and pattern classification indexed similar properties of learning (Sapountzis, Schluppeck, Bowtell, & Peirce, 2010). In early visual areas, they found pattern analysis to be more sensitive to changes in gratings, but both measures were highly correlated.
Other studies computed both pattern classification and PS measures and showed less correspondence with RS in later visual areas (Drucker & Aguirre, 2009; Drucker et al., 2009). RS and pattern analyses reveal different aspects of scene processing for categories versus exemplars (Epstein & Morgan, 2012): RS effects were present for specific images and exemplars but did not emerge for categories. However, classification based on pattern analysis was robust for categories and for exemplars.
Thus far, it is difficult to derive general conclusions about these two measurements of visual learning. A better understanding of the relationship between distributed pattern analyses and RS may help researchers use these tools more effectively to explore unanswered questions about visual learning. And because attention is a critical modulator of visual learning, one way to evaluate these measures is to observe how they perform under different attentional conditions.
In this study, we directly compared RS and PS in a task that examined the role of attention on visual repetition, focusing on PS measures of distributed analyses because PS is most comparable to RS and because of the nature of our selective attention task. To manipulate attention, participants attended to either faces or scenes in composite face–scene images. We focused our fMRI analyses on the BOLD response to scenes in the PPA, which responds selectively to scenes and negligibly to faces. As in Yi and Chun (2005), we repeated scenes across trials while the scenes were either attended or ignored. To distinguish attentional engagement in different learning stages, a factorial design manipulated attention during initial presentation and separately during repetition. To evaluate RS, we measured the amplitude difference between the neural response to the first and second presentations of a scene. To evaluate PS, we correlated the pattern of activity voxel by voxel from the first presentation to the next. We compared each of these values across different conditions of attentional engagement. To our knowledge, this is the first experiment examining the effect of attention on PS.
We predicted RS in the PPA only when stimuli were attended during both presentations, replicating Yi and Chun (2005). Similar to the RS prediction, we also expected to find greater PS in the PPA to two stimuli when both were attended than when neither or just one stimulus was attended. We also examined the relationship between RS and PS to see whether they were affected by attention similarly or differently. We did not collect subsequent memory data for the images, because testing the four attention conditions within a limited time did not permit sufficient power to differentiate hits versus misses. However, extensive prior work demonstrates that behavioral memory is better for attended images (e.g., for reviews, see Chun & Johnson, 2011; Chun & Turk-Browne, 2007), and so here we focus on RS and PS as our dependent measures.
Twenty-five healthy participants (12 women) participated in exchange for $20/hr payment. All participants (age range = 18–30 years) were right-handed, reported normal or corrected vision, and had no history of neurological injury or disease. Each consented in accordance with the Yale University Institutional Review Board. Because of excessive head motion (two participants) or poor behavioral performance (five participants; see Results), seven participants were excluded from analysis, leaving a total of 18 participants.
Procedure and Design
Participants performed a change detection task for a sequence of two briefly presented composite scene–face stimuli. During a block of trials, participants were instructed to attend to either scenes or faces, which were either identical or not within each trial. Treating each trial sequence of two composite stimuli as a single event, we focused our fMRI analyses on the trials in which no change occurred in either the scene or face. To measure PS and RS, we repeated the scenes across separate trials.
There were eight runs with each run consisting of 45 event-related change detection task trials. Every trial presented a sequence of two 450-msec overlapping scene and face composite images, separated by an 800-msec blank interval. All stimuli, faces and scenes, were 7° × 7° grayscale real pictures presented against a black background. A fixation mark was placed 1° above the center (between the two eyes of the face).
There were two types of task blocks: the attend-scene block and the attend-face block. In the attend-scene blocks, participants were instructed to detect a possible change of the scene across the two composite images within each trial while ignoring a possible change of the face. In contrast, in the attend-face block, participants searched for face changes while ignoring possible scene changes. When a scan run began, a red disk (0.1° in diameter) was shown as a fixation mark. At the beginning of each block, participants were cued by a phrase (“ATTEND SCENE” or “ATTEND FACE”) in yellow for 3 sec, followed by a 1-sec blank period. Then, the fixation mark turned into a 0.2° × 0.2° red letter: an “S” for the attend-scene block or “F” for the attend-face block (see Figure 1A and B). In each trial, participants responded via button press (index finger to indicate no change across the attended dimension and middle finger to indicate a change) and were told to respond within 2 sec of seeing the second image. After the response, the fixation letter remained on the screen until the next block to remind participants of the type of category they should attend to. The intertrial interval was jittered from 6 to 10 sec.
As described earlier, some of the scenes were repeated across two separate trials. The RS and PS measures for across-trial repetitions of scenes were based on only those trials in which the two composite scene–face images were identical within the trial sequence (e.g., eliciting a “same” response in the change detection task, regardless of which category was attended). Importantly, when repeated across trials, each of these scenes was paired with a different face. Thus, any associated neural adaptation or change in the pattern of representation should be accounted for by the repetition of scenes between trials, not by low-level features of the composite stimulus (see Figure 1C).
We measured how attention modulated scene-specific learning by comparing the activations of repeated scenes across trials as a function of whether the scene was attended or ignored during the initial presentation or repetition trial. Within each run, 16 scenes were repeated as the primary trials for analysis. During initial presentation, 8 of the 16 scenes were attended in an attend-scene block (novel-attended scene condition, NewAttn), whereas the other eight scenes were ignored in an attend-face block (novel-ignored scene condition, NewIgn). Each scene was repeated five or six trials after its initial presentation. By varying whether the scenes repeated within or across interleaving attend-scene and attend-face blocks, our design yielded four types of repeated scenes with distinct attentional histories: (a) half of the scenes attended initially (NewAttn) were attended again during repetition (AttnAttn) and (b) the other half were ignored during repetition (AttnIgn), and (c) half of the scenes ignored initially (NewIgn) were attended during repetition (IgnAttn) and (d) the other half were ignored again during repetition (IgnIgn). There were four trials for each attentional history condition in each run, for a total of 32 trials per condition across eight runs.
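The factorial structure of the attentional histories and the resulting trial counts can be sketched as follows. This is only an illustrative enumeration, not the authors' stimulus-generation code; the function names are hypothetical, and the constants are taken from the description above.

```python
from itertools import product

# Attention state at the initial presentation and at the repetition (labels from the text).
STATES = ("Attn", "Ign")
SCENES_PER_RUN = 16  # repeated scenes per run, as stated in the text
N_RUNS = 8           # runs per participant, as stated in the text


def attentional_histories():
    """Return the four condition labels, e.g. 'AttnIgn' = attended first, ignored at repetition."""
    return [first + second for first, second in product(STATES, STATES)]


def trials_per_condition():
    """Map each condition to (repeated scenes per run, repeated scenes across all runs)."""
    conditions = attentional_histories()
    per_run = SCENES_PER_RUN // len(conditions)
    return {cond: (per_run, per_run * N_RUNS) for cond in conditions}


for name, (per_run, total) in trials_per_condition().items():
    print(name, per_run, total)
```

Each label encodes the attention state at initial presentation followed by the state at repetition, which is the convention used throughout the analyses.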
Each run had three task blocks indicating which images should be attended. They were arranged in one of two orders: face–scene–face or scene–face–scene. A total of 45 trials per run were spread across the three task blocks. The 45 trials comprised 32 critical trials (i.e., 16 scenes with initial and repeated presentations) and 17 filler trials that were excluded from analyses. These filler trials consisted of the following: (a) eight “target” trials that presented a task-relevant change (e.g., face change in an attend-face block), for which participants were supposed to respond “different”; (b) eight “catch” trials in which the task-irrelevant picture changed (e.g., scene change in an attend-face block), so that participants should respond “same” because the task-relevant image did not change; and (c) one filler trial in which (a), (b), or no change occurred, determined randomly in each run. The catch trials with task-irrelevant change were included to motivate participants to attend selectively to the designated category. The first trial of each task block was always one of these filler trials because we did not want potential task switching confounds (e.g., from attend-scene to attend-face) to interfere with a critical trial. The remaining 14 filler trials were distributed across the run to ensure that critical scenes were repeated from five to six trials after their initial presentation. Note that given these constraints, the three task blocks within each run were not of uniform length. For the two types of runs, the number of trials in each task block was face (17), scene (22), face (6), or scene (6), face (22), scene (17). The block length and task type were counterbalanced within participants.
Because we did not analyze the face responses per se (to ensure sufficient power for the scene manipulations), we allowed faces to be reused four times across eight runs, but never within two consecutive runs. Scenes were never reused across runs. Feedback was given verbally after each run.
This procedure allowed us to measure attentional modulation of both RS and PS by comparing the PPA response of the initial and repeated presentation of the scenes. In AttnAttn and IgnIgn trials in which attention instructions were identical for both scene presentations, RS was defined as the difference between the PPA signal to the trial in which a scene appeared for the first time (“the initial presentation trial”) and the PPA signal elicited by the subsequent trial in which the same scene appeared again (“the repetition trial”). For AttnIgn and IgnAttn trials, the attention instruction differed across scene repetition, and thus, RS would be confounded by well-known attentional enhancement effects. To control for attention in these conditions, RS was computed from novel scenes from the appropriate attention condition matching the attend/ignore instruction for the repeated scenes. That is, when measuring RS in the AttnIgn condition, in which repeated scenes were ignored, the comparison trials were novel, ignored scenes (NewIgn). Likewise, when measuring RS in the IgnAttn condition, in which repeated scenes were attended, the comparison trials were novel, attended scenes (NewAttn). PS was defined the same way across all trial types: it was the correlation between the activity in all voxels of the PPA ROI in response to the initial presentation trial and those voxels' responses to the repetition trial.
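The attention-matched baseline logic for RS can be illustrated with a minimal sketch. It works on condition-mean β values, which is a simplification: for AttnAttn and IgnIgn the text pairs each scene with its own initial presentation trial. The function names and the example values are hypothetical and are not data from this study.

```python
# Novel-scene baseline for each repeated condition. The baseline always matches the
# attention instruction in force on the repetition trial, as described in the text.
BASELINE = {
    "AttnAttn": "NewAttn",
    "IgnAttn": "NewAttn",
    "AttnIgn": "NewIgn",
    "IgnIgn": "NewIgn",
}


def repetition_suppression(mean_beta, condition):
    """RS = novel-scene response minus repeated-scene response (positive = suppression)."""
    return mean_beta[BASELINE[condition]] - mean_beta[condition]


def rs_by_condition(mean_beta):
    """Compute RS for all four attentional-history conditions."""
    return {cond: repetition_suppression(mean_beta, cond) for cond in BASELINE}


# Hypothetical condition means, chosen only to show the call pattern.
example = {
    "NewAttn": 1.0, "NewIgn": 0.5,
    "AttnAttn": 0.75, "IgnAttn": 1.0, "AttnIgn": 0.5, "IgnIgn": 0.5,
}
print(rs_by_condition(example))
```

Matching the baseline to the repetition-trial instruction keeps the attentional enhancement of the PPA response from being mistaken for, or masking, a repetition effect.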
We acquired fMRI data using a 3T Siemens Trio scanner and a 12-channel head coil. The parameters of the functional, gradient-echo T2*-weighted EPI sequence were repetition time of 2 sec, echo time of 25 msec, flip angle of 90°, voxel size of 3.5 × 3.5 × 4 mm³, 34 oblique slices acquired at each repetition time, and 195 volumes acquired during each run. Stimuli were presented through an LCD projector onto a rear projection screen located behind the coil and viewed with angled mirrors. Responses were collected with an MRI-compatible button box.
We discarded the first five volumes to allow for T1 equilibration. Using FreeSurfer 4.0, we corrected functional images for slice acquisition time difference and head movement. Then functional images were normalized and resampled with a 3 × 3 × 3 mm³ voxel size.
As in Yi and Chun (2005), we focused on the PPA as an ROI. We functionally localized the PPA separately within each participant by contrasting the averaged brain activity in attend-scene and attend-face blocks. The most highly scene-selective voxel from each hemisphere of the ventral temporal lobe formed the center of a 27-voxel cube that comprised the ROI. For all participants, the ROIs were localized in the parahippocampal gyrus/collateral sulcus region, consistent with prior studies [mean Talairach coordinates: x = 25.4 (range 18–30), y = −44.5 (range −53 to −38), z = −6.7 (range −10 to −3)].
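A 27-voxel cube is a 3 × 3 × 3 neighborhood around the peak voxel. The sketch below shows one way such an ROI could be built with NumPy. It is an assumption-laden illustration rather than the authors' pipeline: `stat_map` stands for a scene-versus-face contrast volume and `search_mask` for a hypothetical anatomical restriction to one hemisphere's ventral temporal lobe.

```python
import numpy as np


def cube_roi(stat_map, search_mask, radius=1):
    """Return (roi_mask, peak) for a cube centred on the peak of stat_map inside search_mask.

    radius=1 gives a 3 x 3 x 3 cube (27 voxels) unless the peak lies on the volume edge,
    in which case the cube is truncated.
    """
    masked = np.where(search_mask, stat_map, -np.inf)
    peak = tuple(int(i) for i in np.unravel_index(np.argmax(masked), stat_map.shape))
    roi = np.zeros(stat_map.shape, dtype=bool)
    roi[tuple(slice(max(p - radius, 0), p + radius + 1) for p in peak)] = True
    return roi, peak


# Toy volume with a single peak, used only to show the call pattern.
volume = np.zeros((5, 5, 5))
volume[2, 2, 2] = 3.0
roi, peak = cube_roi(volume, np.ones_like(volume, dtype=bool))
print(peak, int(roi.sum()))
```

Running the function once per hemisphere would yield the two cubes that together make up the bilateral ROI.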
For the univariate analyses to measure RS, we first spatially smoothed the data with an 8-mm FWHM Gaussian kernel. We then modeled each trial of the experiment using a canonical hemodynamic response function and extracted the resulting β values. We averaged the β values in each condition in each participant and across hemispheres (post hoc analyses did not reveal any significant effect of hemisphere or any interaction with the other factors). We performed statistical analyses (ANOVA and t tests) on the resulting average β values.
To analyze PS, we first modeled each trial separately using a hemodynamic response function on unsmoothed data. We extracted β values on each trial from individual voxels in the ROIs and transformed the list of voxels in the ROIs into a 54-item vector (27 voxels for each of the left and right PPA ROIs). We then correlated pairs of vectors that corresponded to the first and second presentations of scene stimuli on critical trials (Xue et al., 2010). Note that we did not normalize the β values, but by using β values instead of raw BOLD signal, the values were standardized within each run. To complete statistical analyses on different correlation values calculated across separate conditions, we then z-transformed the r values.
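The PS computation described above can be sketched in a few lines of NumPy. The sketch assumes that the z-transform refers to Fisher's r-to-z transform (the inverse hyperbolic tangent of r); the function names and the simulated β values are hypothetical.

```python
import numpy as np


def roi_vector(left_betas, right_betas):
    """Concatenate left and right PPA voxel betas into a single pattern vector."""
    return np.concatenate([np.ravel(left_betas), np.ravel(right_betas)]).astype(float)


def pattern_similarity(first, second):
    """Fisher z-transformed Pearson correlation between two voxel patterns."""
    r = np.corrcoef(first, second)[0, 1]
    return float(np.arctanh(r))  # Fisher r-to-z (assumed transform)


# Simulated single-trial betas for two 27-voxel ROIs (not real data).
rng = np.random.default_rng(0)
first = roi_vector(rng.normal(size=(3, 3, 3)), rng.normal(size=(3, 3, 3)))
second = first + rng.normal(scale=0.5, size=first.shape)  # noisy repetition of the same pattern
print(first.shape, pattern_similarity(first, second))
```

Because the Pearson correlation is invariant to the mean and scale of each vector, PS computed this way is insensitive to an overall amplitude change between presentations, which is the quantity that RS captures.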
We then conducted several correlation analyses comparing RS and PS. First, we compared the measures on a trial-by-trial basis within participants. To do so, we used the RS and PS score (calculated using the procedure above) for every pair of repeated images for each subject. We then correlated the lists of these two scores separately for each experimental condition, leaving us with four correlation values per subject. Finally, we z-transformed the resulting correlations so that they could be averaged and compared across participants. Next, we compared RS and PS across participants. For our first two between-subject correlation analyses, we correlated the average RS and PS scores in each of the AttnAttn and IgnIgn conditions for each subject. We also correlated PS and RS with regard to the attention manipulation. To do this, we measured the difference in observed RS between ignored trials (NewIgn vs. IgnIgn) and attended trials (NewAttn vs. AttnAttn). We also measured the difference in PS on the same set of trials. We then correlated these two difference scores to determine whether the attention manipulation had a similar effect on RS and PS across participants.
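The within-participant, scene-by-scene analysis can be sketched as below. This is a minimal illustration under stated assumptions: the per-scene RS and PS scores are taken as given, the z-transform is assumed to be Fisher's, and converting the group-mean z back to r is a reporting choice made for this example rather than something stated in the text.

```python
import numpy as np

CONDITIONS = ("AttnAttn", "AttnIgn", "IgnAttn", "IgnIgn")


def within_subject_z(rs_by_condition, ps_by_condition):
    """For one subject, correlate per-scene RS with per-scene PS in each condition (Fisher z)."""
    z = {}
    for cond in CONDITIONS:
        r = np.corrcoef(rs_by_condition[cond], ps_by_condition[cond])[0, 1]
        z[cond] = float(np.arctanh(r))
    return z


def group_average_r(z_per_subject):
    """Average Fisher z values across participants, then map the mean back to the r scale."""
    return float(np.tanh(np.mean(z_per_subject)))


# Toy scores for one subject (hypothetical values, four scenes per condition).
rs = {cond: [1.0, 2.0, 3.0, 4.0] for cond in CONDITIONS}
ps = {cond: [1.0, 3.0, 2.0, 4.0] for cond in CONDITIONS}
z_scores = within_subject_z(rs, ps)
print(z_scores["AttnAttn"], group_average_r([z_scores["AttnAttn"]] * 3))
```

Averaging in z space rather than averaging raw r values avoids the bias introduced by the bounded, skewed sampling distribution of r.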
We excluded five participants from the analysis whose average accuracy across all conditions was less than 65%, suggesting that they did not follow task instructions. Among the remaining participants, performance was better on the attend-scene task than the attend-face task, replicating Yi and Chun (2005). Hit rates, defined as detection of changes in the attended dimension, were higher for scenes (93.2%) than for faces (75.9%), t(17) = 5.11, p < .001. Overall, error rates were also lower when attending to scenes. False alarm rates in the task-irrelevant change trials were lower in the attend-scene task (10.6%) than in the attend-face task (49.6%), t = 6.4, p < .001. However, false alarms in the no-change trials were slightly higher in the attend-scene task (10.9%) than in the attend-face task (6.92%), t = 2.63, p < .018. As explained by Yi and Chun (2005), who found a similar pattern of results, one reason performance may have been better when attending to scenes than to faces is that scenes are more visually heterogeneous than faces, making it easier to detect changes across them.
Our initial objective was to replicate the prior finding that attention is necessary to observe RS (Yi & Chun, 2005). To do so, we first tested whether our attention manipulation was successful. As in prior findings (e.g., Yi & Chun, 2005), there was a main effect of Attention: novel scenes elicited greater PPA activity when attended (attend-scene condition) than when ignored (attend-face condition), NewAttn > NewIgn, t(17) = 5.44, p < .001. Next, to assess an effect of RS, we performed a 3 × 2 ANOVA with the factors Attentional History (new, previously ignored, or previously attended) and Current Presentation (ignored vs. attended), using average values for the NewAttn, NewIgn, AttnAttn, AttnIgn, IgnAttn, and IgnIgn conditions across participants. Consistent with the attentional manipulation above, there was a main effect of Current Presentation, with attended images eliciting more activity than unattended ones: F(1, 17) = 18.48, p < .001. There was no main effect of Attentional History (p > .3). However, in line with Yi and Chun (2005), there was a significant interaction between Attentional History and Current Presentation: F(2, 34) = 3.50, p < .05.
Post hoc t tests demonstrated that this interaction was due to RS: newly attended scenes elicited more activity than scenes that were attended to a second time, NewAttn > AttnAttn, t(17) = 2.20, p < .05. Also, attention during the initial presentation trials was necessary to observe RS. When a scene was ignored the first time it was presented, it elicited as much activity when later attended during the second presentation as a novel scene did, NewAttn versus IgnAttn, t(17) = −0.91, p = .38, ns (see Figure 2). Supporting the hypothesis that attention during the repetition trials is required for RS, there were no significant differences among conditions in which the scene was ignored during initial or second presentation: NewIgn, IgnIgn, and AttnIgn conditions did not show significant differences, all ps > .3.
Our second objective was to assess how PS responds to changes in attention. To do so, we correlated PPA activity from the first and second presentations as described in the methods. A stronger positive correlation indicates more similarity between the patterns of activity across presentations of the scenes. First, we performed a 2 × 2 ANOVA with the factors First Presentation (attended vs. ignored) and Second Presentation (attended vs. ignored) using the correlation values for AttnAttn, AttnIgn, IgnAttn, and IgnIgn pairs. There was a significant main effect of First Presentation, F(1, 17) = 15.88, p < .001, indicating higher PS for a pair of scenes when the initial scene was attended than when it was ignored. We did not find an effect of Second Presentation (p > .5). Comparable to our RS finding, there was a significant interaction between First and Second Presentation, F(1, 17) = 4.50, p < .05. As in the RS findings, this interaction was driven by the AttnAttn condition. PS values were the highest between a pair of identical scenes when they were attended during both presentations, as opposed to just one presentation or neither. AttnAttn showed significantly greater PS than IgnAttn [t(17) = 4.08, p < .001] and IgnIgn [t(17) = 2.99, p < .01]. Other differences in PS values among the four critical conditions were not significant (see Figure 3). The PS differences among the critical conditions could not be accounted for by differences in overall variability of voxel activity within the ROI across the different conditions. Voxel variance was similar for each condition: AttnAttn = 0.054, AttnIgn = 0.065, IgnAttn = 0.053, and IgnIgn = 0.050. These values were not significantly different from each other (all ps > .3).
To determine the extent to which PS measures scene-specific information, we also measured the similarity of responses to two different, random scenes in the experiment, separately for when the scenes were both attended (RandAttn) and when they were both ignored (RandIgn). These two measures did not differ from each other (p = .81), indicating that attention per se does not increase PS. RandAttn PS values were significantly smaller than in the AttnAttn [t(17) = 3.83, p < .005] and AttnIgn [t(17) = 2.25, p < .05] conditions, marginally smaller than IgnIgn [t(17) = 2.01, p = .054], and not significantly different from IgnAttn [t(17) = 0.91, p = .91]. Similarly, RandIgn PS values were significantly smaller than in the AttnAttn [t(17) = 5.57, p < .001], AttnIgn [t(17) = 3.36, p < .005], and IgnIgn [t(17) = 2.74, p < .05] conditions, and not significantly different from IgnAttn [t(17) = 1.48, p = .16]. It is worth noting that the PS value in the IgnIgn condition was significantly higher than in the RandIgn condition, which reveals some effect of ignored stimulus repetition that was not present in the RS measures.
Finally, we set out to compare RS and PS directly, and we tested this relationship in several ways. First, we compared RS and PS on a scene-by-scene basis within participants. If attention-related changes in PS reveal the same underlying mechanisms that drive RS, as measured by fMRI, then the two measures should be correlated in this experiment. We found little evidence of correlation between these measures. On AttnAttn trials, in which both measures were strongest, the correlation between them was not significant: r = 0.065, p = .222. IgnIgn was the only other trial type with the same attention instructions for the first and second scene presentations. In these trials, the correlation was marginally significant: r = 0.101, p = .067. To ensure that the null correlations were not driven by outlying trials, we made scatter plots for individual participants in which each point in the plot represented the PS and RS score for an individual scene for that subject. Indeed, there did not appear to be an RS/PS relationship, and no outliers drove the results (see Figure 4).
We also performed three between-subject correlations of PS and RS. For the first two, we correlated the average RS and PS values for AttnAttn trials and IgnIgn trials across participants. There was no between-subject correlation between RS and PS for AttnAttn trials (r = −0.069, p = .83) or for IgnIgn trials (r = −0.31, p = .27). Next, we performed a between-subject correlation of RS and PS based on the attention manipulation. First, for each subject, we calculated an RS difference score that was the magnitude of the RS effect for the attended trials (NewAttn vs. AttnAttn) minus the magnitude of the RS effect for the unattended trials (NewIgn vs. IgnIgn). Next, we performed a similar calculation for PS: the PS difference score was the PS z score for attended trials minus the PS z score for unattended trials. Finally, we correlated these scores between participants. In line with the other correlations, this one was also not significant (r = −0.28, p = .30). Taken together, even though RS and PS are both sensitive to stimulus repetition and to attentional modulation, the two measures do not meaningfully correlate, suggesting that they measure different aspects of visual repetition and learning.
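The attention-based difference scores and their between-subject correlation can be sketched as follows. The inputs are assumed to be one value per participant (RS magnitude or PS z score for the attended and the unattended trials); the function names and the toy numbers are hypothetical.

```python
import numpy as np


def attention_difference(attended, ignored):
    """Per-subject score for attended trials minus the score for unattended trials."""
    return np.asarray(attended, dtype=float) - np.asarray(ignored, dtype=float)


def between_subject_correlation(rs_attended, rs_ignored, ps_attended, ps_ignored):
    """Pearson r across subjects between the RS and PS attention difference scores."""
    rs_diff = attention_difference(rs_attended, rs_ignored)
    ps_diff = attention_difference(ps_attended, ps_ignored)
    return float(np.corrcoef(rs_diff, ps_diff)[0, 1])


# Toy values for four hypothetical subjects, only to show the call pattern.
r = between_subject_correlation(
    rs_attended=[2.0, 3.0, 4.0, 5.0], rs_ignored=[1.0, 1.0, 1.0, 1.0],
    ps_attended=[1.0, 3.0, 2.0, 4.0], ps_ignored=[0.0, 0.0, 0.0, 0.0],
)
print(r)
```

A reliably positive r would indicate that participants whose RS benefited more from attention also showed a larger attentional gain in PS, which is the relationship the reported null result argues against.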
We directly compared how measures of RS and PS respond to manipulations of attention during visual repetition. Replicating prior findings of RS, fMRI BOLD activity in scene-selective PPA was reduced upon second presentation of a stimulus, but only when both presentations of the stimulus were attended. Also as we hypothesized, PS was sensitive to the attention manipulation: the neural responses to the first and second presentations of a stimulus were more similar when the stimulus was attended both times than when the first stimulus was ignored.
PS has only recently been introduced as a way to measure visual learning. Xue and colleagues related the similarity of the patterns of activity to repeated faces (presented four times in the experiment) to the subsequent memory of the faces (Xue et al., 2010). They found that memory was better for faces that elicited more similar patterns of activity at each presentation. Given that attention is important for memory (Chun & Johnson, 2011; Jonides et al., 2008; Chun & Turk-Browne, 2007) and we found that more attention to presentations in a stimulus pair leads to greater PS, our results are consistent with the Xue et al. findings.
Although both RS and PS responded comparably to manipulations of attention, they did not correlate with each other and thus may measure different aspects of visual learning. RS is sensitive to neural changes with repeated experiences at the single voxel level (Grill-Spector & Malach, 2001). Meanwhile, PS is inherently dependent on intervoxel pattern changes after learning, which are lost when voxels are averaged in RS analyses. In addition, even within RS, different mechanisms of neural adaptation may be at work in different brain regions (Kohn & Movshon, 2004), yielding the same perceived result of a reduced BOLD response following learning. The timecourse of learning also plays a role, as second- or millisecond-level lags from the first presentation to the second may cause RS effects because of fatigue from receptive neurons, whereas long-term potentiation or a similar mechanism likely controls more lasting neural changes that lead to RS observations (Grill-Spector et al., 2006). Also, as in PS, different effects of RS have been observed in early versus later visual regions (Drucker & Aguirre, 2009; Drucker et al., 2009). In the present experiment, it is therefore possible that RS and PS measure different substrates of learning in the PPA.
To understand the relationship between RS and PS measures in the present investigation, it is important to examine PS for two different scenes in addition to identical scenes in the critical conditions. As expected, we found that patterns of activity were more different across two different scenes than they were for any of the same-scene pairs. This result indicates that even when a scene is ignored, the response pattern in the PPA contains information specific to that particular scene (i.e., PS for IgnIgn > PS for RandAttn and RandIgn scenes). This finding further suggests that PS and RS measure different aspects of the neural code, and it raises the possibility that PS is more sensitive to stimulus repetition than RS. Another hypothesis is that PS and RS measure the same neural changes, but that RS is noisier than PS. However, individual scatterplots comparing RS and PS (see Figure 4) as well as PS variability scores are not consistent with this alternative. Future work should reveal whether the PS sensitivity relates to behavioral measures of implicit or explicit learning (e.g., Ward et al., in revision).
Also important, the similarity of responses to two different scenes did not depend on whether the scenes were attended, based on the lack of difference between RandAttn and RandIgn. Thus, although attention increases the fMRI signal averaged across voxels, it does not appear to introduce common voxel patterns of activity across attended, nonidentical stimuli.
Although the results suggest that PS is more sensitive than RS, the change detection task in our design may have dampened the strength of RS here. Because images were repeated within trials for the change detection task, it is likely that some RS occurred between the first and second presentations of the scene within each trial. Indeed, RS has been demonstrated on very short timescales (Sobotka & Ringo, 1996). Because we averaged both presentations within the trial into a single fMRI event, the “initial presentation” value (e.g., the NewAttn value) was underestimated because it included a true initial presentation as well as a second presentation. By the time the same scene repeated several trials later in the experiment, the scene was actually being presented for the third and fourth times. Nevertheless, our conclusions are supported by other studies that have failed to observe RS effects for ignored repetition even when measured across first and second presentations (Eger, Henson, Driver, & Dolan, 2004; Ishai, Pessoa, Bikle, & Ungerleider, 2004). More importantly, all our comparisons of RS and PS across different attentional manipulations remain valid because we fully controlled for repetition within this study. Hence, we can still conclude that under the current experimental conditions, attention modulates both RS and PS, and that PS appears more sensitive to repetition than RS.
In summary, we demonstrated that RS and PS are both sensitive to manipulations of attention during visual repetition. Yet, we found no significant correlation between the measures, in spite of their exhibiting similar averaged PPA responses to the attention manipulation during learning. Further PS analyses demonstrated that PPA patterns likely included scene-specific information, even when the scenes themselves were not attended. Future studies may further elucidate mechanisms of visual learning by determining whether the same relationship between PS and RS persists in different brain regions, for different experimental tasks, and over different timescales. Nonetheless, this investigation confirms that both RS and PS measures should be considered in visual attention and learning studies (Chun & Johnson, 2011).
This work was supported by NIH R01-EY014193 to M. C. and Korea NRF-2010-0018949 to D. Y. We thank Samuel Cartmell for collecting data and Naseem Al-Aidroos and an anonymous reviewer for helpful comments on our earlier draft.
Reprint requests should be sent to Katherine S. Moore, Department of Psychology, Elmhurst College, 190 Prospect Ave., Elmhurst, IL 60126, or via e-mail: firstname.lastname@example.org; email@example.com.