Abstract

Much of what we need to remember consists of sequences of stimuli, experiences, or events. Repeated presentation of a specific sequence establishes a more stable long-term memory, as shown by increased recall accuracy over successive trials of an STM task. Here we used fMRI to study the neural mechanisms that underlie sequence learning in the auditory–verbal domain. Specifically, we track the emergence of neural representations of sequences over the course of learning using multivariate pattern analysis. For this purpose, we use a serial recall task, in which participants have to recall overlapping sequences of letter names, with some of those sequences being repeated and hence learned over the course of the experiment. We show that voxels in the hippocampus come to encode the identity of specific repeated sequences although the letter names were common to all sequences in the experiment. These changes could not have been caused by changes in the overall level of activity or in fMRI signal-to-noise ratios. Hence, the present results go beyond conventional univariate fMRI methods in showing a critical contribution of medial-temporal lobe memory systems to establishing long-term representations of verbal sequences.

INTRODUCTION

There is little value in remembering the digits in a phone number unless you can also remember the order they appeared in. Although a short sequence of, say, six words or digits might be accurately recalled immediately after a single presentation, it is likely to be forgotten very quickly (Brown, 1958). Repeated presentations enable more stable memory representations to be established and allow longer sequences to be learned. That is, information temporarily stored in STM can be transferred to long-term memory via repetitions. What are the neural mechanisms that underlie this process of repetition learning of sequences?

Existing neurophysiological and imaging data suggest that both medial-temporal/hippocampal (MTL/HC)- and BG-based memory systems support learning of sequences with a wide range of different stimuli. Previous studies with rodents and humans have shown that the hippocampus is selectively involved in encoding sequences of odors (Devito & Eichenbaum, 2011; Agster, Fortin, & Eichenbaum, 2002), faces (Ross, Brown, & Stern, 2009; Kumaran & Maguire, 2006), movements (Albouy et al., 2008), and locations (Schendan, Searl, Melrose, & Stern, 2003). Additionally, BG activation has been shown to be correlated with learning sequences of motor movements and spatial locations (Graybiel, 2008; Yin & Knowlton, 2006; Poldrack et al., 2005; White & McDonald, 2002; Miyachi, Hikosaka, Miyashita, Kárádi, & Rand, 1997).

However, previous neuroimaging studies have so far almost exclusively used univariate fMRI subtractions and have therefore only been able to monitor how these brain areas become more or less active as a function of learning. Hence, many of these studies have attempted to identify neural correlates of sequence learning on the basis of correlations between behavioral learning measures and BOLD amplitude (e.g., Ross et al., 2009; Turk-Browne, Scholl, Chun, & Johnson, 2009; Seger & Cincotta, 2006; Lieberman, Chang, Chiao, Bookheimer, & Knowlton, 2004; Schendan et al., 2003). However, even if activation in a given brain area correlates strongly with some measure of learning, this is no guarantee that this area is actually involved in storing or representing the learned information. In the current experiment, our goal was to track the evolution of the neural representations of sequence information over the course of learning using multivariate pattern analysis (MVPA; for reviews, see Kriegeskorte, Goebel, & Bandettini, 2006; Norman, Polyn, Detre, & Haxby, 2006).

We used an immediate serial recall task, where participants have to verbally recall sequences of letter names, with some of those sequences being repeated over the course of the experiment (Figure 1A, B). This task, often referred to as the Hebb (1961) repetition learning task, is commonly seen as a laboratory analogue of sequence learning in a natural environment (e.g., learning sequences of speech sounds that comprise new words; for a review, see Page & Norris, 2009). To ensure that our task was sensitive only to learning order information and not to learning about the individual items being used in the experiment, all sequences were permutations of the same set of eight spoken letter names. The critical sequences were each repeated 12 times over the course of the experiment and were interspersed with filler sequences that were all unique. In common with most multivariate analysis methods, the voxel responses within a given brain area were treated as a pattern of activations (Figure 1C), each corresponding to a single presentation of a repeating sequence. We were then able to track the way that the information contained in these vectors changed over time as a function of learning. We used this procedure to answer two simple questions: First, do any brain regions develop representations that are tuned to repeating sequences, and second, are these representations sequence-specific?

Figure 1. 

(A) Structure of trials. (B) Single trial. (C–E) Multivariate pattern similarity analysis. (C) and (D) indicate the set of pairwise correlations between sequences that enter into the analysis. (D) and (E) show how those correlations are then combined to derive measures of the changes in within- and between-sequence similarity over repetitions. (C) Voxel responses within a given brain area are treated as a vector of activations. Each vector corresponds to a single presentation of a repeating sequence. (D) An example of a pairwise activity correlation matrix for six sequences (labeled A–F), each repeated 3 times. The letters represent different sequences and the numbers indicate the repetition (the nth presentation) of each sequence. Red cells are used to calculate the pairwise correlations between the activity patterns corresponding to the first presentation of each sequence; orange cells for second presentations, and yellow cells for the third presentations. Blue cells are used to calculate the correlation of the activity patterns between successive presentations of the same sequence (the nth presentation of a sequence to presentation n − 1). (E) The means of the resulting between-sequence and between-repetition correlations are then used to calculate two respective slopes over repetitions.

We addressed the first question by calculating the changes in the pairwise correlations between the patterns elicited by successive presentations of each repeated sequence from the second to the twelfth repetition (Figure 1D, blue cells). In any brain region that comes to learn a stable representation of a repeated sequence, the correlation between the activation patterns elicited by successive presentations of that sequence should increase over the course of the experiment. However, it could be the case that the representations that develop are the same for all sequences. That is, what might be learned is a generic representation of sequences in the experiment. To show evidence of sequence-specific representations, we additionally need to show that similarity between the patterns elicited by different sequences does not increase over repetitions. Note that decreasing similarity between different sequences is also not sufficient on its own, because this could also arise if the responses to all sequences are initially the same, but become noisier over repetitions.

Given that the average regional BOLD response has been frequently shown to decrease over repetitions of auditory stimuli (Orfanidou, Marslen-Wilson, & Davis, 2006; Bergerbest, Ghahremani, & Gabrieli, 2004), we have to be sure that changes in similarity are not simply a consequence of an increase in physiological or measurement (fMRI) noise. Only by observing the combination of the two effects (an increase in similarity between repetitions of the same sequence and no increase in similarity between different sequences) can we conclude that a region is learning distinct representations of individual sequences. Such an outcome could not be explained as arising from an overall change in the magnitude of the BOLD response, nor from a change in the proportion of noise in the neural signal in a particular region, because an increase in noise over repetitions would affect both of the slopes.

By combining the two pattern information measures with a “searchlight” approach (Kriegeskorte et al., 2006; see Methods), we were able to identify brain areas that were both sensitive to learning of individual sequences and where distinctive representations emerged as a function of learning. We also performed a conventional whole-brain univariate analysis to ensure that we could replicate previous fMRI studies of sequence learning.

METHODS

Participants

In total, 29 right-handed volunteers (19 women, 20–33 years old) gave informed, written consent for participation in the study after its nature had been explained to them. Participants reported no history of psychiatric or neurological disorders and no current use of any psychoactive medications. Seven participants were later excluded from the study because of excessive motion artifacts in the collected fMRI data (see “fMRI Data Acquisition and Preprocessing” for the exclusion criteria). The study was approved by the Cambridge Local Research Ethics Committee (Cambridge, UK).

Task and Behavioral Measures

In the current study, we used the Hebb (1961) repetition task. Participants performed immediate serial recall of auditorily presented sequences. Recall was always spoken. These conditions are the most informative for studying repetition learning of sequences for several reasons.

First, the Hebb repetition task has been studied extensively in the behavioral literature. Typically, participants' performance on the immediate serial recall of a sequence of items is seen to improve over unannounced repetitions of a given sequence (Fendrich, Healy, & Bourne, 1991; Cunningham, Healy, & Williams, 1984; Schwartz & Bryden, 1971; Hebb, 1961; see Page & Norris, 2009, for a review). Thus, Hebb repetition learning is a paradigmatic example of the transfer of information from short-term to long-term memory and a laboratory analogue of auditory–motor learning for linguistic, musical, or numerical sequences.

Second, considerable research has shown that the modality of presentation strongly influences the manner in which people perceive, learn, and represent information in STM; a number of studies suggest an advantage in the processing of sequential auditory input (see Conway & Christiansen, 2006, for a short review). Many behavioral effects of sequential processing are less pronounced or absent when visual stimuli are used (Conway & Christiansen, 2006; Frankish, 1985, 1989; Crowder, 1986; Wright, Santiago, Sands, Kendrick, & Cook, 1985). Third, the combination of auditory presentation with spoken responses employs processes commonly used in acquiring new phonological sequences in word learning (Page & Norris, 2009). Fourth, previous research has shown that response learning is a part of the repetition learning paradigm (Oberauer & Meyer, 2009; Couture, Lafond, & Tremblay, 2008; Page, Cumming, Norris, Hitch, & McNeil, 2006). However, existing functional imaging data on repetition learning of sequences have been gathered almost exclusively using manual rather than verbal responses (see Kalm, 2010, for a short review).

In our task, participants had to recall sequences of eight auditorily presented monosyllabic letters in the correct order. All sequences consisted of random reorderings of the same eight letters (Q, J, Z, D, L, S, H, N). Sequences therefore differed only in terms of the order in which the letters were presented. Sequences were constructed subject to the following constraints: there was no positional overlap between consecutive sequences; all sequences were controlled to exclude rhyming letters and semantic chunks. Sequences were either repeated over the course of the experiment (repeated sequences, repeated 12 times) or not (unique sequences, presented once). No repeated sequences shared more than two items in the same position. Hence, the primary experimental manipulation was the number of repetitions between current and first presentation. All sequences were presented in blocked triplets, where the first trial was always a unique filler sequence and the last two trials were repeating sequences (Figure 1A). As a result, a single repeating sequence was repeated at every third trial.

A new repeating sequence was introduced after the previous sequence had been repeated 6 times. The first repeating sequence was presented 6 times during a training session before the fMRI experiment to ensure that it had also been presented 6 times at the start of the experiment. Thus, at any given point in the experiment, two repeating sequences were in use concurrently (interleaved across trials), with one of them having been presented fewer than six times and the other more than six times. This ensured that comparisons between sequences at different stages of learning were not confounded with time.

On each trial, participants were presented with a visual fixation cross to indicate the start of the auditory presentation of the sequence. Eight letters were then presented (in a male voice, 500 msec SOA) followed by a cue “?” indicating that they were to verbally recall the sequence exactly as they had just heard it; or a cue “–” indicating not to respond and to wait for 2–10 sec for the next trial (Figure 1B). The letters were spoken by a native English-speaking man and recorded at 44.1 kHz sampling rate and 16 bits per sample. Recordings were made in a soundproof room and the perceptual center of the syllable synchronized to a common onset time such that sequences were heard as rhythmic (Morton, Marcus, & Frankish, 1976). This enabled us to control for the time difference in pronouncing different letters.

In summary, each participant was presented with 216 trials, 54 in each of four scanning runs, in addition to an initial practice session outside the scanner. Participants were not informed that there were different types of trials. Participants only had to recall the sequences on two thirds of the trials to allow the effects of encoding and retrieval to be modeled separately in the imaging analysis. Recall and no-recall trials were pseudorandomly mixed during each scanning run.

For each trial, recall performance was measured as the Levenshtein distance (Levenshtein, 1966) between the presented sequence and the participant's recall. The Levenshtein distance is the smallest number of edit operations (insertion, deletion, or substitution of a single character) that are necessary to modify one string to obtain another string. For a sequence of length n, the Levenshtein distance ranges from 0, when the two sequences are identical, to n, when the two sequences are completely different. For each trial, we calculated a normalized Levenshtein distance score according to the following formula:
\[ \mathrm{LD} = 1 - \frac{\mathrm{Lev}(P, R)}{N} \]
where P is the sequence presented, R is the recall, and N is the number of letters in the presented sequence. The resulting normalized Levenshtein distance score (henceforth LD) ranges between 0 and 1, where a score of 1 indicates that all items were recalled in their original serial positions. This method is preferred over counting responses as correct only when the items are recalled in the same serial position as in the original sequence, because the Levenshtein procedure gives credit for partially correct responses where, for example, participants omit a single item early on in the sequence.
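The scoring can be illustrated with a minimal sketch. It assumes that the normalized score is one minus the Levenshtein distance divided by N, which is consistent with the range described above (1 for perfect recall); the function names `levenshtein` and `ld_score` are illustrative and are not taken from the study's own software.

```python
def levenshtein(a, b):
    """Smallest number of single-character insertions, deletions,
    or substitutions needed to turn string a into string b."""
    prev = list(range(len(b) + 1))  # distances from "" to each prefix of b
    for i, ca in enumerate(a, start=1):
        curr = [i]  # distance from a[:i] to ""
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # delete ca
                            curr[j - 1] + 1,      # insert cb
                            prev[j - 1] + cost))  # substitute or match
        prev = curr
    return prev[-1]


def ld_score(presented, recalled):
    """Assumed normalization: 1 - distance / N, with N = len(presented)."""
    return 1.0 - levenshtein(presented, recalled) / len(presented)


presented = "QJZDLSHN"
recalled = "QZDLSHN"  # the second letter (J) was omitted

# Strict position-by-position scoring credits only the first letter.
strict = sum(p == r for p, r in zip(presented, recalled)) / len(presented)

print(ld_score(presented, recalled))  # 0.875
print(strict)                         # 0.125
```

In this example a single early omission shifts every later letter by one position, so strict positional scoring credits one item out of eight, whereas the edit-distance score penalizes only the one missing item.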

fMRI Data Acquisition and Preprocessing

Participants were scanned at the Medical Research Council Cognition and Brain Sciences Unit (Cambridge, UK) on a 3-T Siemens TIM Trio MRI scanner using a 12-channel head coil. Functional images were collected using 32 slices covering the whole brain (slice thickness = 3 mm, 25% slice gap, in-plane resolution = 3 × 3 mm) with an EPI sequence (repetition time = 2 sec, echo time = 30 msec, flip angle = 78°). In addition, high-resolution MPRAGE structural images were acquired at 1-mm isotropic resolution (see imaging.mrc-cbu.cam.ac.uk/imaging/ImagingSequences for detailed information). Each participant performed four scanning runs; 364 scans were acquired per run, including 16 dummy scans. Stimulus presentation was controlled by DMDX software Version 3 (Forster & Forster, 2003). Visual cues for sequence presentation and recall were rear projected onto a translucent screen outside the bore of the magnet and viewed via a mirror system attached to the head coil. Auditory stimuli were delivered with magnet-safe headphones installed inside ear defenders (NordicNeuroLab, Bergen, Norway, noise attenuation of +30 dB).

All fMRI data were preprocessed and analyzed using SPM5 software (Wellcome Trust Centre for Neuroimaging, London) and custom in-house software. Before analysis, all images were corrected for slice timing, with the middle slice in each scan used as a reference. Images were realigned with respect to the first image using trilinear interpolation, creating a mean realigned image. The mean realigned image was then coregistered with the structural image, and the structural image was normalized to the Montreal Neurological Institute (MNI) average brain using the combined segmentation/normalization procedure in SPM5.

We excluded seven participants from the analysis whose head movement due to speaking in the scanner repeatedly exceeded the following criteria: a translation threshold of 3 mm, rotation threshold of 4°, and between-image difference threshold of 0.1 calculated by dividing the summed squared difference of consecutive images by the squared global mean.

Multivoxel Pattern Analysis

As noted above, information about representations of individual sequences can be detected in activity patterns by combining two statistical measures: first, the correlation between the activity patterns elicited by successive presentations of a single sequence, which should increase as sequence-specific responses develop (blue cells in Figure 1D); second, the correlation between patterns elicited by different sequences, which should decrease as different sequences become more distinctive (red/yellow cells in Figure 1D).

In this analysis, we moved a spherical searchlight (Kriegeskorte et al., 2006) with a 6-mm radius throughout the gray-matter-masked, unsmoothed volumes to select, at each location, a local contiguous set of 186 voxels (3 mm isotropic). In each sphere, we estimated β values for the encoding phase of every trial (216 βs) for every voxel in the sphere (data from the recall period were too noisy because of movement artifacts generated by participants speaking in the scanner). The event regressors were convolved with the canonical hemodynamic response (as defined by the SPM analysis package) and passed through a high-pass filter (128 sec) to remove low-frequency noise. In addition to six motion parameters (corresponding to translations and rotations of the image due to movement in the scanner), scan-specific spike regressors were added to account for large head movements. These regressors modeled extreme interscan movements, which exceeded a translation threshold of 0.5 mm, rotation threshold of 1.33°, and between-image difference threshold of 0.035 calculated by dividing the summed squared difference of consecutive images by the squared global mean. These separate movement spike regressors remove variance due to head movement caused by participants speaking in the scanner during the response phase of the trials.

As a result, we obtained 216 β values for every voxel representing the 216 sequence encoding events over the course of the experiment. A sufficient degree of decorrelation between all of the regressors was ensured by (1) jittering the length of the rest phase (between 2 and 10 sec), (2) varying the length of the recall period (7 sec), and (3) omitting the recall phase for one third of the trials. Two thirds of the β values (144) represented the 12 repetitions of 12 individual sequences, whereas the remaining third represented 72 individual nonrepeating sequences. Thus, for each trial, the β values of the voxels in a searchlight formed one activation vector.

For every searchlight, we computed a Spearman rank correlation between the activity patterns elicited by successive presentations of individual sequences (between-repetition correlation, blue cells in Figure 1D). This was done by correlating the first presentation of sequence A to the second presentation of A, the second presentation to the third presentation, and so forth, for all 12 repeating sequences (blue cells in Figure 1D). As a result, we obtained between-repetition similarity measures from the second to the twelfth repetition for all repeating sequences. Next, we calculated a change in between-repetition similarity by fitting a slope over similarity measures using least-squares linear regression. Similarly, we computed a Spearman rank correlation between the activity vectors of all the repeating sequences presented during the first repetition, second repetition, and so forth, to acquire the change in between-sequence correlation (red–orange–yellow cells in Figure 1D). Finally, we computed a statistic measuring the change in the amount of information about the individual sequences over the experiment. An increase in sequence identity information for a given searchlight was evaluated as a significant interaction between two correlation slopes (between sequences and within-sequence correlation, Figure 1E). The between-sequence pattern similarity change and the interaction coefficient were only calculated for voxels that showed significant between-repetitions pattern similarity increase.
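The two similarity slopes for a single searchlight can be sketched as follows. This is a simplified illustration under stated assumptions, not the study's implementation: `patterns[s][r]` is a hypothetical layout holding the β vector for repetition r of sequence s, all helper names are illustrative, and the Spearman correlation is computed as the Pearson correlation of average ranks. The text does not spell out the exact interaction statistic, so the sketch stops at the two slopes.

```python
from itertools import combinations


def ranks(values):
    """1-based ranks; tied values receive their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    out = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # mean of the 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            out[order[k]] = avg
        i = j + 1
    return out


def pearson(x, y):
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5


def spearman(x, y):
    """Spearman rank correlation as Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))


def slope(ys):
    """Least-squares slope of ys regressed on 0, 1, 2, ..."""
    xs = range(len(ys))
    mx, my = sum(xs) / len(ys), sum(ys) / len(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    return sxy / sxx


def mean(v):
    return sum(v) / len(v)


def similarity_slopes(patterns):
    """Return (between-repetition slope, between-sequence slope)."""
    n_rep = len(patterns[0])
    # Same sequence, successive presentations (blue cells in Figure 1D).
    within = [mean([spearman(p[r - 1], p[r]) for p in patterns])
              for r in range(1, n_rep)]
    # Different sequences at the same repetition number (red to yellow cells).
    between = [mean([spearman(p[r], q[r]) for p, q in combinations(patterns, 2)])
               for r in range(n_rep)]
    return slope(within), slope(between)


# Toy data: two sequences, four repetitions, four voxels each.
seq_a = [[1, 2, 3, 4], [2, 1, 3, 4], [1, 2, 3, 4], [1, 2, 3, 4]]
seq_b = [[1, 2, 3, 4], [1, 2, 4, 3], [4, 3, 2, 1], [4, 3, 2, 1]]
within_slope, between_slope = similarity_slopes([seq_a, seq_b])
```

In the toy data the two sequences start with identical patterns and end with opposite ones, so the between-repetition slope is positive while the between-sequence slope is negative, which is the combination the analysis looks for.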

For every participant, the searchlight analysis resulted in a between-repetition pattern change brain map. We assigned a score of zero to any sphere in which fewer than 33 voxels were inside the individual gray matter volume. These individual images were subsequently normalized to the MNI anatomical template and entered into random-effects analyses (one-sample t tests). Voxels from the random-effects analysis are reported that passed a whole-brain false discovery rate (FDR; Genovese, Lazar, & Nichols, 2002) threshold of p < .05.
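FDR thresholding of a statistical map is commonly implemented with the Benjamini-Hochberg step-up procedure, which is the procedure Genovese et al. (2002) describe. The sketch below shows that procedure on a plain list of p values; it is an illustration only, and the function name `fdr_threshold` and the example values are hypothetical rather than taken from the analysis software used here.

```python
def fdr_threshold(p_values, q=0.05):
    """Benjamini-Hochberg step-up rule.

    Returns the largest sorted p value p_(k) that satisfies
    p_(k) <= (k / m) * q, or None if no p value qualifies.
    Every test with p <= the returned threshold is declared significant.
    """
    m = len(p_values)
    threshold = None
    for k, p in enumerate(sorted(p_values), start=1):
        if p <= k / m * q:
            threshold = p  # keep the largest qualifying p value
    return threshold


# Hypothetical p values for ten voxels.
p_vals = [0.001, 0.008, 0.039, 0.041, 0.042, 0.06, 0.074, 0.205, 0.212, 0.216]
cutoff = fdr_threshold(p_vals, q=0.05)
significant = [p for p in p_vals if cutoff is not None and p <= cutoff]
print(cutoff, significant)  # 0.008 [0.001, 0.008]
```

Unlike a fixed uncorrected threshold, the cutoff adapts to the distribution of p values in the map: several values below .05 in the example do not survive because too few small p values accompany them.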

Univariate Analysis

After preprocessing, the functional images underwent spatial smoothing with a 6-mm FWHM Gaussian kernel. Single-subject statistical contrasts were set up by modeling the encoding phase of each trial with a single regressor by convolving a box-car representation of the onset and duration of each encoding phase with the canonical hemodynamic response as defined by the SPM analysis package. In common with the MVPA analysis, we added additional scan-specific movement and spike regressors to the first level model to account for large head movements.

Univariate repetition and learning contrasts were established by combining every encoding regressor with repetition and learning rate covariates (parametric modulators in SPM). For the repetition contrast, we correlated the sequence repetition number with the BOLD signal response amplitude in a given brain region. This correlation can be negative (brain regions become less active in response to a repeated presentation of the same sequence) or positive (more activity for later repetitions). For the learning contrast, we correlated the trial-by-trial learning parameter with the BOLD signal response amplitude in a given brain region. The learning parameter was calculated by subtracting the Levenshtein score for trial n from the score for the previous presentation of that sequence (n − 1). The average learning rate across participants is shown in Figure 2C.

Figure 2. 

Change of recall performance: (A) repeating sequences, (B) unique sequences, and (C) rate of learning for repeating sequences. Behavioral performance data are only shown for trials on which recall was measured. Error bars show SEM for the variability across the participants.

Contrasts for the main effects of Repetition and Learning were tested by comparing the mean regression coefficient (β) parameter (expressing the effect of repetition and learning covariates) against zero for each participant relative to the residual error in this parameter over participants (i.e., a random effects analysis). For both contrasts, we also needed to rule out changes in the neural response, which are a result of trial-to-trial variability in attention. We therefore corrected the effects of learning by subtracting from the β value for repeating trials a β value for learning in unique filler trials (i.e., a parameter expressing neural changes over time in responses for unique sequences). This ensures that changes in the response for repeated sequences can only arise from increased familiarity of the stimuli because of repetition and not because of attention.

Contrasts of parameter estimates from the least-squares fit of single-subject models were entered into random-effects analyses (one-sample t tests) as in the multivariate analysis (FDR threshold of p < .05).

RESULTS

Behavioral Data

To observe a reliable repetition learning effect, performance must be shown to increase for a repeated sequence relative to nonrepeated controls (unique filler sequences). Hence, slopes of immediate serial recall performance over the course of the experiment were calculated using least-squares linear regression for the repeating sequences and the filler sequences for every subject. A paired t test over participants with slopes for repeated and filler sequences as the dependent measure showed a significant Hebb effect (t(21) = 3.29, p < .003). Separate one-sample t tests showed that the slopes of repeating sequences were significantly different from zero (t(21) = 5.81, p < .001), whereas the slopes of filler sequences were not (t(21) = 0.52, p = .31). Figure 2 shows average slopes over repeating and filler sequences for all participants.
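The logic of this test can be sketched in a few lines: fit a least-squares slope to each participant's recall scores, then compare the slopes for repeated and filler sequences with a paired t statistic. The numbers below are made up for illustration and are not the study's data; `ls_slope` and `paired_t` are hypothetical helper names.

```python
from math import sqrt
from statistics import mean, stdev


def ls_slope(scores):
    """Least-squares slope of scores regressed on presentation number 1..n."""
    xs = list(range(1, len(scores) + 1))
    mx, my = mean(xs), mean(scores)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, scores))
    sxx = sum((x - mx) ** 2 for x in xs)
    return sxy / sxx


def paired_t(a, b):
    """Paired t statistic and degrees of freedom for matched samples."""
    diffs = [x - y for x, y in zip(a, b)]
    t = mean(diffs) / (stdev(diffs) / sqrt(len(diffs)))
    return t, len(diffs) - 1


# Made-up per-participant slopes (one value per participant).
repeated_slopes = [0.03, 0.02, 0.04, 0.01]
filler_slopes = [0.00, 0.01, 0.01, 0.00]

t_stat, df = paired_t(repeated_slopes, filler_slopes)
```

A positive t statistic here means that recall improved faster for repeated than for filler sequences; converting it to a p value requires the t distribution with `df` degrees of freedom.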

Conventional Whole-brain Univariate Analysis of Signal Response Amplitudes

To ensure that we could replicate previous fMRI studies of sequence learning with our task, we first calculated the whole-brain univariate maps for both repetition and learning parameters. Brain activity decreased with repetition bilaterally in a number of brain areas, with peak voxels in superior and middle temporal gyrus and sulcus, putamen, insula, and premotor cortex. No brain areas showed a positive correlation with repetition. Only the right caudate nucleus and left ACC showed a significant positive correlation with learning rate (Figure 3, Table 1). No brain areas showed a negative correlation with the learning parameter.

Figure 3. 

Whole-brain univariate effects of repetition and learning. (A) Repetition, surface view. (B) Repetition, slice view (MNI 10 −4): (1) bilateral BG cluster, (2) bilateral sensory–auditory cluster, (3) bilateral posterior temporal lobe clusters. (C) Learning, slice view (MNI −16 26): (1) right caudate nucleus, (2) left ACC. Whole-brain FDR threshold of p < .05.

Table 1. 

Peak Voxel Coordinates: Whole-brain Univariate Analysis

Cluster Size | t Statistic | x | y | z | H | Region
List Repetition (Negative Correlation) 
92 4.4656 40 −44 −12 inferior temporal gyrus 
4.3649 52 −44 −14 inferior temporal gyrus 
38 4.393 −42 −40 −14 inferior temporal gyrus 
38 4.1821 44 −40 STS 
30 4.1706 −56 superior temporal gyrus 
3.5378 −60 −8 −4 middle temporal gyrus 
549 3.9579 28 −6 putamen 
282 3.7973 40 −16 64 precentral gyrus 
49 3.1135 56 −20 −2 superior temporal gyrus 
 
Learning (Positive Correlation) 
40 4.349 −16 36 24 cingulate cortex 
101 3.879 12 16 caudate nucleus 

Our univariate results are hence in agreement with previous fMRI findings: We observed a decrease in activation within the auditory speech processing network as a response to stimulus repetition (Grill-Spector, Henson, & Martin, 2006; Gabrieli, 1998; Fleischman, Vaidya, Lange, & Gabrieli, 1997) and a positive correlation between striatal activation and learning rate (Turk-Browne et al., 2009; Seger & Cincotta, 2006; Poldrack et al., 2005; Lieberman et al., 2004).

Multivoxel Pattern fMRI

First, we examined whether any brain regions develop representations that are tuned to repeating sequences, as indicated by an increase in between-repetition pattern similarity over the course of the experiment. A whole-brain MVPA searchlight analysis was conducted independently for every voxel neighborhood to yield a whole-brain corrected activation map showing all regions in which the similarity in between-repetition activity patterns increased significantly (Figure 4). A significant increase in pattern similarity was observed in right supramarginal gyrus (t(21) = 3.43, p = .009; Figure 4A, Table 2), bilateral insula (t(21) = 3.42, p = .007; Figure 4B), and left hippocampus (t(21) = 3.38, p < .05; Figure 4C, Table 2). As a control condition, we computed the change in pattern similarity over successive nonrepeating unique sequences. No brain areas showed significant pattern similarity effects over successive unique sequences.

Figure 4. 

Significant increase in the between-repetition pattern similarity in (A) right supramarginal cortex (MNI 58 −40 40), (B) bilateral insula (MNI 32 24 0), and (C) left hippocampus (MNI −28 −22 −12). Whole-brain FDR threshold of p < .05.

Table 2. 

Peak Voxel Coordinates: Increase in Between-repetition Pattern Similarity

Cluster Size  t Statistic  x  y  z  H  Region
219 3.429 52 −40 40 supramarginal gyrus
101 3.415 32 24 insula
75 3.201 −30 28 insula
62 3.382 −28 −22 −12 hippocampus

Next, we sought to establish whether the representations that develop are the same for all sequences or come to encode individual sequences. Evidence of sequence-specific representations requires that the similarity between the patterns elicited by different individual sequences decreases over repetitions. This analysis was carried out only for the voxels that had previously shown a significant between-repetition pattern similarity effect, to rule out the possibility that a decrease in pattern similarity might be caused by an increase in physiological or measurement (fMRI) noise over repetitions. Because the exact locations of the between-repetition effect maxima varied across participants, we used subject-specific anatomical and functional constraints in selecting the coordinates for the between-sequence similarity analysis. First, the coordinates of the individual ROIs had to be located within 16 mm of the group maximum. Second, the coordinates of the individual ROIs had to fall within the same anatomically defined regions as the between-repetition effects (right supramarginal gyrus, left hippocampus, and bilateral insula, as determined from the participant's anatomical scan).
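
The two selection constraints can be expressed compactly. The sketch below assumes that each participant's candidate voxels are given as MNI coordinates in millimeters, each with an effect statistic and a Boolean flag for membership of the anatomical region, and that the voxel with the largest statistic among the eligible candidates is taken as the ROI center. The function name, the input layout, the selection rule, and the example values are hypothetical.

```python
import numpy as np


def select_subject_peak(coords_mm, stats, group_peak_mm, in_region, max_dist_mm=16.0):
    """Return the index of the subject-specific peak voxel, or None.

    A candidate is eligible if it lies within max_dist_mm of the group
    maximum (boundary included here, an assumption) and inside the
    anatomically defined region.
    """
    coords_mm = np.asarray(coords_mm, dtype=float)
    stats = np.asarray(stats, dtype=float)
    in_region = np.asarray(in_region, dtype=bool)

    # Euclidean distance of every candidate from the group maximum.
    dist = np.linalg.norm(coords_mm - np.asarray(group_peak_mm, dtype=float), axis=1)
    eligible = (dist <= max_dist_mm) & in_region
    if not eligible.any():
        return None
    # Largest effect among eligible voxels (assumed selection rule).
    candidates = np.flatnonzero(eligible)
    return int(candidates[np.argmax(stats[candidates])])


# Hypothetical candidates around the left hippocampal group maximum.
group_peak = (-28, -22, -12)
coords = [(-28, -22, -12), (-30, -20, -14), (-50, -22, -12), (-26, -24, -10)]
stats = [2.0, 3.5, 9.0, 3.0]
in_region = [True, True, True, False]

peak = select_subject_peak(coords, stats, group_peak, in_region)
print(peak, coords[peak])
```

In this example the third candidate has the largest statistic but lies 22 mm from the group maximum, and the fourth falls outside the anatomical region, so the second candidate is selected.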

Only the left hippocampus showed a significant decrease in between-sequence similarity that was coupled with an increase in between-repetition similarity (the mean interaction coefficient across subjects was significantly different from zero, t(21) = −4.02, p < .05; Figure 5). As expected, there was considerable between-subject variability in the location of the peak voxels: For five participants, the peak was located in the parahippocampal cortex rather than the hippocampal formation, and three participants did not display any significant interaction effect. Overall, the neural activity patterns in the hippocampus representing different sequences became significantly more distinct from each other as these sequences were repeated. At the same time, patterns elicited by the same sequence became more similar to each other over repetitions, indicating convergence to stable but distinct representations of individual sequences as a function of learning.
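
The group-level statistic reported here has the form of a one-sample t test of the participants' coefficients against zero. As a sketch, assuming one interaction coefficient per participant (how each coefficient is estimated is not shown here, and the values below are made up for illustration):

```python
import numpy as np


def one_sample_t(values, popmean=0.0):
    """One-sample t statistic and its degrees of freedom."""
    values = np.asarray(values, dtype=float)
    n = values.size
    # Standard error of the mean uses the sample SD (ddof=1).
    sem = values.std(ddof=1) / np.sqrt(n)
    return (values.mean() - popmean) / sem, n - 1


# Made-up coefficients for three participants, not data from the study.
t_stat, df = one_sample_t([-1.0, -2.0, -3.0])
print(f"t({df}) = {t_stat:.2f}")  # t(2) = -3.46
```

A one-sample test on 22 coefficients has 21 degrees of freedom, which is consistent with the t(21) values reported in this section.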

Figure 5. 

Slopes of between-repetition (top) and between-sequence (bottom) pattern similarity for (A) right supramarginal cortex (MNI 58 −40 40), (B) right insula (MNI 32 24 0), and (C) left hippocampus (MNI −28 −22 −12). Error bars show SEM for the variability across the participants.

The pattern similarity effects observed above reflect only how repetition modulates the distributed spatial activity patterns, not the overall response amplitude. A comparison of the univariate and multivariate results showed that only the insula had a jointly significant univariate and pattern similarity effect: Its activity decreased significantly with repetition (Figure 3). No other brain area that showed a significant pattern similarity effect also displayed a significant univariate correlation effect. Hence, we can be confident that the emergence of stable patterns does not arise from changes in response amplitude.
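
One way to see why a pattern similarity measure can be insensitive to response amplitude: if similarity is computed as a Pearson correlation across voxels (an assumption in this sketch), then adding a constant to a pattern or multiplying it by a positive gain leaves the similarity unchanged. The patterns below are arbitrary illustrative values.

```python
import numpy as np


def pattern_similarity(a, b):
    """Pearson correlation between two voxel patterns (assumed measure)."""
    return float(np.corrcoef(a, b)[0, 1])


first = np.array([0.2, 1.5, -0.7, 0.9, 0.1])
second = np.array([0.4, 1.1, -0.2, 1.3, -0.3])

baseline = pattern_similarity(first, second)
# Halve the response gain and shift the mean level of the first pattern.
scaled = pattern_similarity(0.5 * first + 2.0, second)
print(np.isclose(baseline, scaled))  # True
```

This invariance covers only changes that are uniform across voxels. It does not replace the empirical comparison with the univariate results described above.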

DISCUSSION

Our results show that brain regions in the temporal lobe, hippocampus, and insula represent repeated verbal sequences. Additionally, voxels in the hippocampus and MTL gradually come to encode the identity of the individual sequences, as shown by decreasing pattern similarity between individual sequences over repetitions. We showed that such changes could not have been caused by an increase in noise (measurement or neurobiological) and that conventional univariate fMRI analysis techniques are not sensitive to such changes in brain activity.

This contribution of MTL/HC in maintaining distinct representations of overlapping sequences is supported by previous research in neurophysiology. HC-lesioned rodents have impaired memory for sequences constructed from a limited set of odors, despite preserved memory for the individual odors (Agster et al., 2002) and rodent HC cells fire differentially to the same sequence of odors when embedded in different sequences (e.g., ABC in MNABCOP vs. WXABCYZ; Ginther, Walsh, & Ramus, 2011). Similarly, using fMRI, Kumaran and Maguire (2006) found that activity in right posterior hippocampus was correlated with a subject-specific behavioral index of sequence learning but only when different repeating sequences were constructed from the same set of faces. Furthermore, neuronal recordings from rodents and humans have shown that hippocampal neural ensemble activity corresponding to successive sequence items becomes gradually correlated (Paz et al., 2010; Manns, Howard, & Eichenbaum, 2007). In summary, previous studies suggest that neurons in the HC encode the temporal order information within a sequence of events or items.

Our results extend existing knowledge in two important ways. First, we have shown how individual sequence representations develop over the course of repetitions in MTL/HC brain areas. Specifically, we have shown that individual representations become more stable over successive repetitions, suggesting that the MTL/HC system encodes both sequence identity and familiarity. Second, we have demonstrated that the same structures that have been shown to play a role in learning sequences of odors and faces (Devito & Eichenbaum, 2011; Ross et al., 2009; Kumaran & Maguire, 2006; Agster et al., 2002; Kesner, Gilbert, & Barua, 2002) are also involved in learning sequences of verbal stimuli. Because our task (the Hebb repetition learning task) is commonly seen as a laboratory analogue of learning the overlapping sequences of speech sounds that make up new words in a natural environment (Page & Norris, 2009), our results bridge the respective neuroimaging literatures of pure sequence learning and human vocabulary acquisition.

Note that lesion studies suggest that intact MTL/HC is not necessary for successful sequence learning. Ergorul and Eichenbaum (2006) showed that, although both HC-lesioned and normal rats were able to learn overlapping sequences, lesioned rats required more training. Similarly, existing neuropsychological data on vocabulary acquisition would suggest a substantial impairment in word learning after bilateral hippocampal lesions (Martins, Guillery-Girard, Jambaqué, Dulac, & Eustache, 2006; Gabrieli, Cohen, & Corkin, 1988), although some learning remains (Gardiner, Brandt, Baddeley, Vargha-Khadem, & Mishkin, 2008; O'Kane, Kensinger, & Corkin, 2004; see Davis & Gaskell, 2009, for a review).

Gagnon, Foster, Turcotte, and Jongenelis (2004) reported data from a patient with a focal lesion of the hippocampal formation who was very impaired on measures of episodic memory, yet showed a Hebb effect, suggesting that learning can take place even with little or no explicit memory of previous recall episodes. However, the task used by Gagnon et al. (2004) included only a single repeating sequence, whereas our task requires encoding of multiple, simultaneously learned overlapping sequences. This suggests that a single sequence can be learned by extrahippocampal systems, whereas the MTL/HC might be necessary for dissociating between multiple overlapping sequences. The stabilizing of individual sequence representations over repetitions that we observed in our study might hence serve as a mechanism that enables the learning of multiple repeating sequences. It would be interesting to know whether individuals with hippocampal damage would be able to acquire multiple overlapping sequences.

Guided by hippocampal anatomy and computational principles, previous research has proposed that the formation of discrete representations of bindings between overlapping items depends on a hippocampal pattern separation process (Diana, Yonelinas, & Ranganath, 2008; Leutgeb, Leutgeb, Moser, & Moser, 2007; Rolls & Kesner, 2006; Kesner et al., 2002; McClelland & Goddard, 1996). Specifically, such pattern separation refers to the transformation of overlapping patterns of cortical input into separable hippocampal representations (Diana et al., 2008). Consistent with these models, we show the emergence of dissociable hippocampal representations of overlapping sequences.

Besides the MTL/HC, several other brain areas such as the posterior temporal lobe and the insula dissociated repeating and nonrepeating overlapping sequences. However, these regions do not seem to distinguish between individual sequences at least over the short period of learning tested in this study. It remains to be seen whether—as previously shown for univariate responses to newly learned spoken words (Davis, Di Betta, Macdonald, & Gaskell, 2009)—temporal lobe representations of verbal sequences show evidence of overnight consolidation.

In summary, our findings contribute to a body of evidence supporting a model in which the hippocampus and MTL encode the temporal order of events or items (Ginther et al., 2011; Paz et al., 2010; Manns et al., 2007; Agster et al., 2002; Fortin, Agster, & Eichenbaum, 2002). We additionally demonstrate that the stability of these representations is increased by repetition and suggest that this underlies the ability to learn multiple overlapping sequences. Our results further extend the evidence that the MTL/HC memory system represents multidimensional relational information (Konkel & Cohen, 2008). The current findings also provide a novel fMRI analysis method for evaluating representational change as a function of repetition, which cannot be achieved with conventional univariate analyses.

Reprint requests should be sent to Kristjan Kalm, MRC Cognition and Brain Sciences Unit, 15 Chaucer Road, Cambridge, CB2 7E, UK, or via e-mail: kristjan.kalm@mrc-cbu.cam.ac.uk.

REFERENCES

Agster, K., Fortin, N., & Eichenbaum, H. (2002). The hippocampus and disambiguation of overlapping sequences. The Journal of Neuroscience, 22, 5760–5768.
Albouy, G., Sterpenich, V., Balteau, E., Vandewalle, G., Desseilles, M., Dang-Vu, T., et al. (2008). Both the hippocampus and striatum are involved in consolidation of motor sequence memory. Neuron, 58, 261–272.
Bergerbest, D., Ghahremani, D., & Gabrieli, J. (2004). Neural correlates of auditory repetition priming: Reduced fMRI activation in the auditory cortex. Journal of Cognitive Neuroscience, 16, 966–977.
Brown, J. (1958). Some tests of the decay theory of immediate memory. The Quarterly Journal of Experimental Psychology, 10, 12–21.
Conway, C., & Christiansen, M. (2006). Statistical learning within and between modalities: Pitting abstract against stimulus-specific representations. Psychological Science, 17, 905–912.
Couture, M., Lafond, D., & Tremblay, S. (2008). Learning correct responses and errors in the Hebb repetition effect: Two faces of the same coin. Journal of Experimental Psychology: Learning, Memory, and Cognition, 34, 524–532.
Crowder, R. (1986). Auditory and temporal factors in the modality effect. Journal of Experimental Psychology: Learning, Memory, and Cognition, 12, 268–278.
Cunningham, T., Healy, A., & Williams, D. (1984). Effects of repetition on short-term retention of order information. Journal of Experimental Psychology: Learning, Memory, and Cognition, 10, 575–597.
Davis, M. H., Di Betta, A. M., Macdonald, M. J. E., & Gaskell, M. G. (2009). Learning and consolidation of novel spoken words. Journal of Cognitive Neuroscience, 21, 803–820.
Davis, M. H., & Gaskell, M. G. (2009). A complementary systems account of word learning: Neural and behavioural evidence. Philosophical Transactions of the Royal Society of London, Series B, Biological Sciences, 364, 3773–3800.
Devito, L., & Eichenbaum, H. (2011). Memory for the order of events in specific sequences: Contributions of the hippocampus and medial prefrontal cortex. The Journal of Neuroscience, 31, 3169–3175.
Diana, R., Yonelinas, A., & Ranganath, C. (2008). High-resolution multi-voxel pattern analysis of category selectivity in the medial temporal lobes. Hippocampus, 18, 536–541.
Ergorul, C., & Eichenbaum, H. (2006). Essential role of the hippocampal formation in rapid learning of higher-order sequential associations. The Journal of Neuroscience, 26, 4111–4117.
Fendrich, D., Healy, A., & Bourne, L. (1991). Long-term repetition effects for motoric and perceptual procedures. Journal of Experimental Psychology: Learning, Memory, and Cognition, 17, 137–151.
Fleischman, D., Vaidya, C., Lange, K., & Gabrieli, J. (1997). A dissociation between perceptual explicit and implicit memory processes. Brain and Cognition, 35, 42–57.
Forster, K., & Forster, J. (2003). DMDX: A Windows display program with millisecond accuracy. Behavior Research Methods, Instruments, & Computers, 35, 116–124.
Fortin, N., Agster, K., & Eichenbaum, H. (2002). Critical role of the hippocampus in memory for sequences of events. Nature Neuroscience, 5, 458–462.
Frankish, C. (1985). Modality specific grouping effects in STM. Journal of Memory and Language, 209, 200–209.
Frankish, C. (1989). Perceptual organization and precategorical acoustic storage. Journal of Experimental Psychology: Learning, Memory, and Cognition, 15, 469–479.
Gabrieli, J. (1998). Cognitive neuroscience of human memory. Annual Review of Psychology, 49, 87–115.
Gabrieli, J. D., Cohen, N. J., & Corkin, S. (1988). The impaired learning of semantic knowledge following bilateral medial temporal-lobe resection. Brain and Cognition, 7, 157–177.
Gagnon, S., Foster, J., Turcotte, J., & Jongenelis, S. (2004). Involvement of the hippocampus in implicit learning of supra-span sequences: The case of SJ. Cognitive Neuropsychology, 21, 867–882.
Gardiner, J. M., Brandt, K. R., Baddeley, A. D., Vargha-Khadem, F., & Mishkin, M. (2008). Charting the acquisition of semantic knowledge in a case of developmental amnesia. Neuropsychologia, 46, 2865–2868.
Genovese, C., Lazar, N., & Nichols, T. (2002). Thresholding of statistical maps in functional neuroimaging using the false discovery rate. Neuroimage, 15, 870–878.
Ginther, M., Walsh, D., & Ramus, S. (2011). Hippocampal neurons encode different episodes in an overlapping sequence of odors task. The Journal of Neuroscience, 31, 2706–2711.
Graybiel, A. (2008). Habits, rituals, and the evaluative brain. Annual Review of Neuroscience, 31, 359–387.
Grill-Spector, K., Henson, R., & Martin, A. (2006). Repetition and the brain: Neural models of stimulus-specific effects. Trends in Cognitive Sciences, 10, 14–23.
Hebb, D. (1961). Distinctive features of learning in the higher animal. In J. F. Delafresnaye (Ed.), Brain mechanisms and learning (pp. 37–46). London and New York: Oxford University Press.
Kalm, K. (2010). Chunk formation in verbal short-term memory. Unpublished doctoral thesis. University of Cambridge.
Kesner, R., Gilbert, P., & Barua, L. (2002). The role of the hippocampus in memory for the temporal order of a sequence of odors. Behavioral Neuroscience, 116, 286–290.
Konkel, A., & Cohen, N. (2008). Relational memory and the hippocampus: Representations and methods. Frontiers in Neuroscience, 2, 22–23.
Kriegeskorte, N., Goebel, R., & Bandettini, P. (2006). Information-based functional brain mapping. Proceedings of the National Academy of Sciences, U.S.A., 103, 3863–3868.
Kumaran, D., & Maguire, E. (2006). The dynamics of hippocampal activation during encoding of overlapping sequences. Neuron, 49, 617–629.
Leutgeb, J., Leutgeb, S., Moser, M., & Moser, E. (2007). Pattern separation in the dentate gyrus and CA3 of the hippocampus. Science, 315, 961–966.
Levenshtein, V. (1966). Binary codes capable of correcting deletions, insertions, and reversals. Soviet Physics-Doklady, 10, 707–710.
Lieberman, M., Chang, G., Chiao, J., Bookheimer, S., & Knowlton, B. (2004). An event-related fMRI study of artificial grammar learning in a balanced chunk strength design. Journal of Cognitive Neuroscience, 16, 427–438.
Manns, J., Howard, M., & Eichenbaum, H. (2007). Gradual changes in hippocampal activity support remembering the order of events. Neuron, 56, 530–540.
Martins, S., Guillery-Girard, B., Jambaqué, I., Dulac, O., & Eustache, F. (2006). How children suffering severe amnesic syndrome acquire new concepts? Neuropsychologia, 44, 2792–2805.
McClelland, J. L., & Goddard, N. H. (1996). Considerations arising from a complementary learning systems perspective on hippocampus and neocortex. Hippocampus, 6, 654–665.
Miyachi, S., Hikosaka, O., Miyashita, K., Kárádi, Z., & Rand, M. (1997). Differential roles of monkey striatum in learning of sequential hand movement. Experimental Brain Research, 115, 1–5.
Morton, J., Marcus, S., & Frankish, C. (1976). Perceptual centers (P-centers). Psychological Review, 83, 405–408.
Norman, K., Polyn, S., Detre, G., & Haxby, J. (2006). Beyond mind-reading: Multi-voxel pattern analysis of fMRI data. Trends in Cognitive Sciences, 10, 424–430.
Oberauer, K., & Meyer, N. (2009). The contributions of encoding, retention, and recall to the Hebb effect. Memory, 17, 774–781.
O'Kane, G., Kensinger, E. A., & Corkin, S. (2004). Evidence for semantic learning in profound amnesia: An investigation with patient H.M. Hippocampus, 14, 417–425.
Orfanidou, E., Marslen-Wilson, W., & Davis, M. (2006). Neural response suppression predicts repetition priming of spoken words and pseudowords. Journal of Cognitive Neuroscience, 18, 1237–1252.
Page, M., Cumming, N., Norris, D., Hitch, G., & McNeil, A. (2006). Repetition learning in the immediate serial recall of visual and auditory materials. Journal of Experimental Psychology: Learning, Memory, and Cognition, 32, 716–733.
Page, M., & Norris, D. (2009). A model linking immediate serial recall, the Hebb repetition effect and the learning of phonological word forms. Philosophical Transactions of the Royal Society of London, Series B, Biological Sciences, 364, 3737–3753.
Paz, R., Gelbard-Sagiv, H., Mukamel, R., Harel, M., Malach, R., & Fried, I. (2010). A neural substrate in the human hippocampus for linking successive events. Proceedings of the National Academy of Sciences, U.S.A., 107, 6046–6051.
Poldrack, R., Sabb, F., Foerde, K., Tom, S., Asarnow, R., Bookheimer, S., et al. (2005). The neural correlates of motor skill automaticity. The Journal of Neuroscience, 25, 5356–5364.
Rolls, E., & Kesner, R. (2006). A computational theory of hippocampal function, and empirical tests of the theory. Progress in Neurobiology, 79, 1–48.
Ross, R., Brown, T., & Stern, C. (2009). The retrieval of learned sequences engages the hippocampus: Evidence from fMRI. Hippocampus, 19, 790–799.
Schendan, H. E., Searl, M. M., Melrose, R. J., & Stern, C. E. (2003). An fMRI study of the role of the medial temporal lobe in implicit and explicit sequence learning. Neuron, 37, 1013–1025.
Schwartz, M., & Bryden, M. (1971). Coding factors in the learning of repeated digit sequences. Journal of Experimental Psychology: General, 87, 331–334.
Seger, C., & Cincotta, C. (2006). Dynamics of frontal, striatal, and hippocampal systems during rule learning. Cerebral Cortex, 16, 1546–1555.
Turk-Browne, N., Scholl, B., Chun, M., & Johnson, M. (2009). Neural evidence of statistical learning: Efficient detection of visual regularities without awareness. Journal of Cognitive Neuroscience, 21, 1934–1945.
White, N., & McDonald, R. (2002). Multiple parallel memory systems in the brain of the rat. Neurobiology of Learning and Memory, 77, 125–184.
Wright, A., Santiago, H., Sands, S., Kendrick, D., & Cook, R. (1985). Memory processing of serial lists by pigeons, monkeys, and people. Science, 229, 287–289.
Yin, H., & Knowlton, B. (2006). The role of the basal ganglia in habit formation. Nature Reviews Neuroscience, 7, 464–476.