Age has a differential effect on cognition, with word retrieval being one of the cognitive domains most affected by aging. This study examined the functional and structural neural correlates of phonological word retrieval in younger and older adults using word and picture rhyme judgment tasks. Although the behavioral performance in the fMRI task was similar for the two age groups, the older adults had increased activation in the right pars triangularis across tasks and in the right pars orbitalis for the word task only. Increased activation together with preserved performance in the older participants would suggest that increased activation was related to compensatory processing. We validated this hypothesis by showing that right pars triangularis activation during correct rhyme judgments was highest in participants who made more errors overall, that is, in those who were most error-prone. Our findings demonstrate that the effects of aging differ in adjacent but distinct right inferior frontal regions. The differential effect of age on word and picture tasks also provides new clues to the level of processing that is most affected by age in speech production tasks. Specifically, we suggest that right inferior frontal activation in older participants is needed to inhibit errors.
With the increasing life expectancy of the world population, a better understanding of age-related cognitive changes and their correlation with changes in brain structure and function is necessary. Age-related cognitive decline tends to occur in specific domains but not in others (Craik & Bialystok, 2006; Salthouse, 2003; Ivnik, Malec, Smith, Tangalos, & Petersen, 1996). For example, whereas some cognitive functions like working memory and long-term memory may decline with age (Park et al., 2002), others like theory of mind (Castelli et al., 2010) and emotion regulation (Charles & Piazza, 2009) remain relatively preserved. Within the domain of language, aging affects various components of language comprehension (see, e.g., Shake & Stine-Morrow, 2011; Noh & Stine-Morrow, 2009; Titone et al., 2006; Wingfield, McCoy, Peelle, Tun, & Cox, 2006; Waters & Caplan, 2001) and language production (Ivnik et al., 1996; Mitrushina & Satz, 1995; see review by Stine-Morrow & Shake, 2009). Nevertheless, studies of the neural correlates of language and aging have primarily focused on language comprehension. Some examples of this include studies of sentence comprehension (Wingfield & Grossman, 2006), syntactic processing (Tyler et al., 2010), speech recognition (Harris, Dubno, Keren, Ahlstrom, & Eckert, 2009), speech perception (Wong et al., 2009), and word recognition (Brassen et al., 2009). The study of the neural correlates of language production in aging, on the other hand, has received less attention. Behavioral studies show that language production is affected by age, with word retrieval as the most affected domain (Marien, Mampaey, Vervaet, Saerens, & De Deyn, 1998; Ivnik et al., 1996; Mitrushina & Satz, 1995; see review by Stine-Morrow & Shake, 2009; but see studies of aging effects on variables related to sentence, rather than word, production, e.g., Kemper, Marquis, & Thompson, 2001). 
This is more likely to be related to difficulties in retrieval than to loss of semantic or phonological word representations (Heine, Ober, & Shenaut, 1999; Burke, Mackay, Worthley, & Wade, 1991). Studies also show that older adults exhibit more cases of tip-of-the-tongue (TOT) in both experimental settings (James & Burke, 2000; Heine et al., 1999; Burke et al., 1991) and in everyday life (Heine et al., 1999; Burke et al., 1991).
In the neuroimaging literature, the effect of aging on speech production has so far been investigated at both the level of brain structure (Stamatakis, Shafto, Williams, Tam, & Tyler, 2011; Shafto, Burke, Stamatakis, Tam, & Tyler, 2007) and functional activation (Shafto, Stamatakis, Tam, & Tyler, 2010; Galdo-Alvarez, Lindin, & Diaz, 2009; Meinzer et al., 2009; Wierenga et al., 2008). Critically, however, the brain areas where aging effects are observed differ according to the task and behavioral measure. When participants were asked to look at pictures of famous people and indicate whether they knew the name of the person or not or whether it was on the “tip-of-their-tongue”, activation in the left insula was lower for older than younger participants during TOT states but not during trials when the participants indicated that they knew the person's name (Shafto et al., 2010). Older participants who made more TOT responses during this task were also found to have lower gray matter (GM) density in the left insula (Shafto et al., 2007) and a reduction of the white matter (WM) integrity of the bilateral superior longitudinal fasciculus (Stamatakis et al., 2011).
In contrast to the association between lower left insula activation and age during the TOT paradigm (Shafto et al., 2010), Wierenga et al. (2008) showed that older compared with younger adults had more extensive right hemispheric activation during a picture naming task, especially in the right inferior frontal gyrus (IFG), where increased activation was positively correlated with better performance, in the context of less activation in the right precentral gyrus. This pattern of increased and decreased activation, together with better performance in the older participants, is consistent with both a compensatory mechanism and inefficient processing. However, although right IFG activation in older participants was found to increase with better performance during picture naming (Wierenga et al., 2008), Meinzer et al. (2009) found that right IFG and middle frontal gyrus (MFG) activations were higher in older adults who had poorer performance on a semantic fluency task, with no effect of aging during a phonological fluency task. Together, these studies demonstrate that the effect of aging on brain activation is critically dependent on both task and performance.
The aim of our study was to investigate the effect of aging on phonological word retrieval. We examined changes in brain structure and function during rhyme and homophone judgments on pictures of objects and written words. Two tasks were performed outside the scanner (rhyme and homophone judgment, using written words) and two performed inside the scanner (rhyme judgment of words and rhyme judgment of pictures). Unlike the TOT paradigm reported above, rhyme and homophone judgments necessitate the retrieval of the phonological form of the word, which is a central component of all models of language production (e.g., see Martin, 2003; Levelt, 1993, 1999; Dell & Oseaghdha, 1992). In addition, these tasks require phonological processing (Geva, Bennett, Warburton, & Patterson, 2011; reviewed in Howard & Franklin, 1990) followed by a finger press response that provides a precise measure of processing time while avoiding movement artifacts that can arise during overt speech. Precise timing measurements were also crucial to investigate the degree to which the effect of age could be accounted for by a decline in processing speed (Salthouse, 1996). By including rhyming conditions on both pictures of objects and their written names, we also aimed to tease apart the level at which aging effects were arising. Specifically, the demands on phonological retrieval from semantic processing are greater for rhyme judgments on pictures than words whereas phonological retrieval from sublexical orthography is greater for words than pictures (see, e.g., Indefrey & Levelt, 2004; Martin, 2003; Glaser & Glaser, 1989).
We examined the following four hypotheses:
1. On the basis of previous studies of aging, we hypothesized that older adults would show more widespread brain activation than younger adults, especially in right frontal areas homologous to left hemisphere language regions (Wierenga et al., 2008). We therefore focused on activations in (a) right IFG (Spreng, Wojtowicz, & Grady, 2010; Meinzer et al., 2009; Wierenga et al., 2008; theoretical models emphasize the importance of right hemispheric homologues, Cabeza, 2002; and of frontal regions, West, 1996) and left IFG, which has been found to be consistently active in fMRI studies of rhyme judgment, where activation is often interpreted as related to phonological processing (Hoeft et al., 2007; Owen, Borowsky, & Sarty, 2004; Poldrack et al., 2001; Lurito, Kareken, Lowe, Chen, & Mathews, 2000; Pugh et al., 1996; Paulesu, Frith, & Frackowiak, 1993), and (b) the left insula (Shafto et al., 2007, 2010).
2. Both fMRI tasks (rhyme judgments on pictures and words) involve phonological retrieval but whereas the picture task is dependent on semantic retrieval, the word task is also supported by direct links between orthography and phonology. As the effect of aging is expected to arise at the level of semantically mediated phonological retrieval, we predicted that there would be a greater impact on the picture task than the word task.
3. We hypothesized that behavioral performance would correlate with age-related increases in fMRI activations. A positive correlation would suggest a compensatory mechanism that allows older subjects with potentially compromised performance (because of deficits in inhibitory mechanisms, see Dempster, 1992; or slower processing speed, see Salthouse, 1996) to compensate with higher activation (Park & Reuter-Lorenz, 2009; Cabeza, 2002), which in turn supports correct behavioral output.
4. Lastly, we hypothesized that reduction in tissue density in the group of older adults would correlate with age-related decline in performance and that reduction in GM density would correlate with age-related increases in activation.
Two groups participated in the study: 12 young adults (four men and eight women; age range = 21–34 years, mean age = 24.6 ± 4.5 years; mean number of years of education = 18.1 ± 2.1) and 19 older adults (8 men and 11 women; age range = 55–71 years, mean age = 64.1 ± 4.8 years; mean number of years of education = 15.1 ± 2.9). All participants had no previous history of neurological, psychiatric, or language disorders, as verified using a medical questionnaire. They were right-handed and native speakers of British English. All participants completed a task measuring nonverbal IQ (Raven's Progressive Matrices; Raven, Court, & Raven, 1987). The two groups did not differ in their performance on Raven's Progressive Matrices (independent samples t test, t = 0.74, p = .47), but the younger participants had significantly more years of education (independent samples t test, t = 3.10, p = .004). Given the group difference in education level, analyses were repeated with and without education level as a covariate. The study was approved by the Cambridge Research Ethics Committee, and all participants read an information sheet and gave written consent.
Imaging Data Acquisition
Imaging was performed using a 3T Siemens (Germany) Magnetom Trio MRI scanner at the Wolfson Brain Imaging Centre in Cambridge. In each of the four functional imaging runs, 234 whole-brain T2*-weighted EPIs (slice thickness = 3.75 mm, 32 axial slices, sequence: interleaved ascending, repetition time = 2 sec, echo time = 30 msec, flip angle = 78°; matrix = 64 × 64, field of view = 192 × 192 mm) were acquired. The first six volumes of each run were treated as dummy pulses and were discarded to allow for T1 equilibrium effects. A magnetization-prepared rapid acquisition gradient echo (MPRAGE) scan was also acquired (repetition time = 2.3 sec, echo time = 2.98 msec, inversion time = 900 msec, flip angle = 9°, field of view = 240 × 256 mm, sagittal plane; slice thickness = 1 mm; 176 slices).
fMRI Behavioral Task
Participants performed a rhyme judgment task on 36 word pairs, out of which 26 pairs rhymed (about 70%) and 10 did not (about 30%). The same set of word pairs was presented as written words in one block and as pictures in another, therefore matching the demands on subarticulatory processing (mean number of letters = 4.25 ± 0.8, range = 3–6; mean number of phonemes = 3.10 ± 0.8, range = 1–4; mean number of syllables = 1.03 ± 0.2, range = 1–2). These words were different from those used for the tasks performed outside the scanner (see below). To prevent the influence of priming on one condition but not the other we exposed all of the subjects to the stimuli during a prescan training and randomized block order across participants. The baseline condition was a visual similarity task where participants had to indicate whether two visual stimuli were identical using a button press. Strings of meaningless symbols (e.g., ) were used in the baseline task for the written words condition, and scrambled pictures were used in the baseline task for the picture condition.
Participants first practiced the tasks outside the scanner. The tasks used in the training session were identical to the ones used during the scan, except that different words and pictures were used. The practice blocks were shorter, containing 20 pairs, 10 of which were rhyming/matching pairs and 10 were non-rhyming/non-matching pairs. To reduce subvocal articulation as much as possible, participants practiced the task until they managed to avoid vocalization and articulatory movements. After completion of the training task and before entering the scanner, participants were shown the pictures that were to be used during the scanning session and were asked to name them aloud. Naming errors were corrected. In the scanner, volunteers performed two rhyme judgment tasks and two visual similarity tasks in four separate runs. They were able to rest between runs. In each trial of the rhyme judgment, participants were presented with either two pictures or two words for 7.3 sec and had to indicate, within this time frame, whether the words rhymed, by pressing one of two buttons. In the baseline conditions, subjects were asked to indicate whether two images were identical. In each trial, the words “yes” and “no,” together with a and an , respectively, appeared at the bottom of the screen, to remind participants that the left button corresponded to a “yes” answer and the right button corresponded to a “no” answer. After the trial, a fixation cross appeared for 1.5 sec, and after every third trial a longer fixation cross appeared for 13 sec. In the last 2 sec of the long fixation period, the color of the cross turned to red, alerting the participant that the next trial was about to start. Each run lasted about 7 min (see Figure 1). The stimuli were presented using E-Prime (version 1.2; Psychology Software Tools, Inc., Pittsburgh, PA), in blocks of similar stimuli, to maximize design efficiency. The order of blocks was randomized and counterbalanced between participants.
The order of trials was pseudorandomized, making sure that the same word/picture did not repeat in two consecutive trials. The side on which each word appeared was counterbalanced between blocks. Images in the baseline tasks appeared equally on the left and right side of the screen. Stimuli were projected onto a back screen with a resolution of 1024 × 768 pixels.
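The trial and fixation timings given for the scanning session can be checked against the stated run length and the number of acquired volumes. The following is a minimal arithmetic sketch, not part of the original analysis. It assumes that the 13 sec fixation replaces (rather than follows) the 1.5 sec fixation after every third trial and that a run contains 36 trials; the function names are illustrative only.

```python
# Timings taken from the task description (seconds).
TRIAL_S = 7.3       # stimulus pair on screen
SHORT_FIX_S = 1.5   # fixation after an ordinary trial
LONG_FIX_S = 13.0   # fixation after every third trial
N_TRIALS = 36       # assumed number of trials per run


def run_duration_s(n_trials=N_TRIALS):
    """Task duration, assuming the long fixation replaces the short one."""
    total = 0.0
    for trial in range(1, n_trials + 1):
        fixation = LONG_FIX_S if trial % 3 == 0 else SHORT_FIX_S
        total += TRIAL_S + fixation
    return total


def scan_duration_s(n_volumes=234, tr_s=2.0, n_dummy=6):
    """Acquisition time left after discarding the dummy volumes."""
    return (n_volumes - n_dummy) * tr_s


if __name__ == "__main__":
    print(f"task: {run_duration_s() / 60:.1f} min")
    print(f"scan: {scan_duration_s() / 60:.1f} min")
```

Under these assumptions the task and the retained volumes both come to roughly seven and a half minutes, which is in line with the reported run length of about 7 min.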
fMRI Imaging Data Processing and Analysis
fMRI data were preprocessed using SPM8 (Wellcome Trust Centre for Neuroimaging, London, UK, www.fil.ion.ucl.ac.uk/spm) implemented in Matlab 2006b (Mathworks, Natick, MA). Motion correction was performed using Realign by first registering all images to the first image (after excluding the six dummy scans) and then registering all to the mean. Functional scans and structural MPRAGE scans were coregistered to each other. MPRAGE images were normalized and segmented into GM, WM, and CSF probability maps, based on the standard Montreal Neurological Institute (MNI) template, using the unified segmentation–normalization algorithm. This procedure combines tissue segmentation, bias correction, and spatial normalization in a single unified model (Ashburner & Friston, 2005). The voxel size in the normalized structural and functional images was 2 × 2 × 2 mm. Normalized GM and WM images were visually inspected for quality of the segmentation–normalization process. The normalization parameters were then applied to the functional images. Data were spatially smoothed using an 8 mm FWHM Gaussian kernel. At the first level, the different tasks, as well as correct and incorrect responses in each task, were modeled as separate conditions. Additionally, the motion parameters from realignment were added as multiple regressors.
At the second level, peak BOLD activations were analyzed in a whole-brain group analysis using a factorial design with two factors (age and task), each with two levels (old and young for age; words and pictures for task). A main effect of Task and the main effects of Age (old > young and young > old) were first examined across the whole brain. A main effect of Age, the simple main effects of Age (within each task) and the interaction between Age and Task were also examined within the areas of interest (IFG bilaterally and left insula). This was achieved by applying an explicit mask, which included the right and left IFG and the left insula, as defined by the Anatomy toolbox (Eickhoff et al., 2005). Results are reported with a threshold of p < .05, family-wise error (FWE) corrected.
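The effects tested in a 2 × 2 factorial design of this kind can be written as contrast weights over the four cells. The sketch below is illustrative only: the cell order, the contrast names, and the example cell means are assumptions made for the illustration and are not taken from the SPM design used in the study.

```python
# Assumed cell order for the illustration (not taken from the study's design).
CELLS = ["old_word", "old_picture", "young_word", "young_picture"]

# Contrast weights over the four cells, in the order given by CELLS.
CONTRASTS = {
    "age_old_gt_young": [1, 1, -1, -1],      # main effect of Age
    "age_young_gt_old": [-1, -1, 1, 1],      # reverse direction
    "task_word_gt_picture": [1, -1, 1, -1],  # words > pictures
    "age_x_task": [1, -1, -1, 1],            # Age x Task interaction
    "age_within_word": [1, 0, -1, 0],        # simple effect of Age, words
    "age_within_picture": [0, 1, 0, -1],     # simple effect of Age, pictures
}


def contrast_value(weights, cell_means):
    """Weighted sum of the cell means for one contrast."""
    return sum(w * m for w, m in zip(weights, cell_means))


if __name__ == "__main__":
    # Made-up cell means, chosen only to show how the weights combine.
    example_means = [4.0, 2.0, 1.0, 1.0]
    for name, weights in CONTRASTS.items():
        print(name, contrast_value(weights, example_means))
```

Each set of weights sums to zero, so a contrast is unaffected by a constant added to all four cells. The interaction contrast is the difference between the two simple effects of Age.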
Next, correlations between activation during correct responses (effect size at the peak voxel) and behavior (success rate and RT during correct trials) were examined in areas that showed an effect of age (older > young or young > older). In cases where a significant correlation was found, we applied Fisher's exact test to examine whether this correlation was different from (1) a correlation in the same area but in a different task (comparing the word task and the picture task), (2) correlation coefficient of the second age group (comparing the older and young adults), and (3) correlations in the same task but in a different area.
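For readers who want to compare two correlation coefficients themselves, one widely used approach for independent samples is the Fisher r-to-z transformation. This is a different technique from the test named in the text and is offered only as a generic sketch; it is not claimed to reproduce the reported p values. It applies to comparisons between the two age groups, whereas comparisons within the same participants (across tasks or areas) involve dependent correlations and would need another test. The function name is hypothetical.

```python
import math


def compare_independent_correlations(r1, n1, r2, n2):
    """Fisher r-to-z comparison of two correlations from independent samples.

    Returns (z, two_sided_p) for the null hypothesis that the two
    population correlations are equal. Requires n1 > 3 and n2 > 3.
    """
    z1 = math.atanh(r1)  # Fisher transform of the first correlation
    z2 = math.atanh(r2)  # Fisher transform of the second correlation
    se = math.sqrt(1.0 / (n1 - 3) + 1.0 / (n2 - 3))
    z = (z1 - z2) / se
    # Two-sided tail probability of the standard normal distribution.
    p = math.erfc(abs(z) / math.sqrt(2.0))
    return z, p


if __name__ == "__main__":
    # Hypothetical inputs, chosen only to show the call.
    z, p = compare_independent_correlations(0.5, 30, 0.1, 30)
    print(f"z = {z:.2f}, two-sided p = {p:.3f}")
```

A one-sided p value, if a directional hypothesis is specified in advance, is half of the two-sided value when the difference is in the predicted direction.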
Voxel Based Morphometry
We used different tasks and materials inside and outside the scanner for reasons related to the type of analysis. For the voxel based morphometry (VBM) analysis, we required tasks in which the overall variability in performance was larger. This was achieved in the tasks that were used outside the scanner, because they were slightly harder than those used inside the scanner (as can be seen in the error rates below). For the fMRI analysis, we required tasks in which (1) the two age groups did not differ in performance and (2) success rates were high. This could be achieved by using simple tasks, in which all participants performed close to ceiling.
Rhyme and homophone judgment tasks were used for behavioral assessment outside the scanner. Performance on these tasks was used to examine behavior–structure relationships using VBM analyses. The tasks were adapted from the Psycholinguistic Assessments of Language Processing in Aphasia (Kay, Coltheart, & Lesser, 1992). In the rhyme judgment task, subjects were asked to determine whether two written words rhymed. For example, “bear” and “chair” rhyme, whereas “food” and “blood” do not. The test had 60 pairs altogether (mean number of letters = 4.14 ± 0.6, range = 3–5; mean number of phonemes = 3.09 ± 0.6, range = 2–4; mean number of syllables = 1.01 ± 0.1, range = 1–2). Half of the rhyming pairs and half of the nonrhyming pairs had orthographically similar endings (e.g., town–gown), whereas the other half had orthographically dissimilar endings (e.g., chair–bear). This way, the tests could not be solved on the basis of orthography alone, ensuring that the subjects had to say the words (internally) to solve the task, thus utilizing phonological retrieval. In the homophone judgment task, subjects had to determine whether two written words sounded the same, that is, whether the words were homophones. For example, “might” and “mite” are homophones, whereas “ear” and “oar” are not. This test had 40 word pairs (mean number of letters = 3.92 ± 0.7, range = 2–5; mean number of phonemes = 2.68 ± 0.8, range = 1–5; mean number of syllables = 1.05 ± 0.2, range = 1–2). Participants performed the homophone and rhyme judgment tasks using written material and without any time limit. Scoring of the task was based on the judgment given to a word pair, with possible answers being correct or incorrect. Hence, every pair judged incorrectly was scored as one error. The percentage of correct responses in each task acted as the dependent variable.
VBM Imaging Data Processing and Analysis
Images were normalized and smoothed as described above. Statistical analyses used multiple regression of the general linear model with proportional scaling. This analysis tested for regional differences in GM or WM after differences at the global level were factored out by including proportional scaling. Explicit masking of the images was performed during statistical analysis. The explicit masks were made from the a priori GM and WM templates thresholded at 0.2.
The main effect of Age was analyzed by comparing the two age groups using an independent samples t test. To look for correlations between whole-brain tissue density and performance among the older adults, behavioral scores (error rate in the homophone and rhyming tasks performed outside the scanner and RT and error rate in the word and picture rhyming tasks performed inside the scanner) were added as separate regressors. These analyses were done separately for the GM and WM images. Next, activation in areas showing age effects and GM density across the entire brain were correlated for the older participants. Activation was estimated at the peak voxel as described in the fMRI analysis section. The extracted activation was entered into a factorial design, with each task separately. We looked for the main effect of tasks and the effects of each task separately, by examining areas that showed correlation between activation and GM density.
In all of these analyses, we searched the whole brain for effects that were significant at p < .05 at local maxima after FWE correction for multiple comparisons. We also conducted the same analyses using the ROIs defined above. We only report clusters that had a minimum of 20 voxels at the statistical threshold used.
Behavioral Data Analysis
Two main variables were measured in the study: RT during correct trials (in the fMRI study only) and success rate. We decided, a priori, to exclude data if (1) a participant had very fast RTs (≤ mean minus 3 standard deviations), because such a response is likely to be an accidental button press or a late response to a previous trial, or (2) the participant's overall performance was at chance level. No data had to be excluded based on these criteria.
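The first exclusion criterion can be expressed as a simple cutoff on the RT distribution. The sketch below is an illustration under stated assumptions: the text does not say whether the mean and standard deviation were computed per participant, per condition, or per group, and the use of the sample standard deviation is also an assumption. The function names are hypothetical.

```python
from statistics import mean, stdev


def fast_rt_cutoff(rts_ms, n_sd=3.0):
    """Cutoff at mean minus n_sd sample standard deviations (assumed form)."""
    return mean(rts_ms) - n_sd * stdev(rts_ms)


def flag_fast_responses(rts_ms, n_sd=3.0):
    """Return one boolean per RT: True if it falls at or below the cutoff."""
    cutoff = fast_rt_cutoff(rts_ms, n_sd)
    return [rt <= cutoff for rt in rts_ms]


if __name__ == "__main__":
    # Made-up RTs in msec: twenty ordinary responses and one very fast one.
    rts = [1500.0] * 20 + [100.0]
    flags = flag_fast_responses(rts)
    print(f"cutoff = {fast_rt_cutoff(rts):.0f} msec, flagged = {sum(flags)}")
```

Note that with small samples a single extreme value inflates the standard deviation, so a 3 SD criterion can only flag a response when enough observations are available.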
One young participant had error rates and RTs in the tasks performed inside the scanner that were more than 2 standard deviations above the group's average scores. The data of this participant were therefore excluded. Behavioral data significantly deviated from normality (Kolmogorov–Smirnov test, p < .05 for all data except for RT of the picture baseline condition in the fMRI task). Therefore, nonparametric tests were subsequently used.
The two groups did not significantly differ in RT or error rate for any of the fMRI tasks (Mann–Whitney test, p > .2 for all tests). Therefore, any difference seen in brain activation cannot simply be attributed to overall differences in task performance (Christoff et al., 2001). In the word task, older participants had an average error rate of 0.44 ± 1.0%, whereas the young participants had an average error rate of 1.2 ± 1.9%. Average RTs were 1594 ± 287 msec and 1567 ± 437 msec for the older and young adults, respectively. In the picture task, average error rate was 3.3 ± 3.7% and average RT was 2276 ± 337 msec for the older adults. For the young adults, average error rate was 3.0 ± 2.6% and average RT was 2118 ± 591 msec.
For the young adults, there was a significant correlation between RTs and error rates for the picture and word rhyme judgment (Kendall's Tau = 0.82, p < .001 for RTs; Kendall's Tau = 0.51, p = .038 for error rate). For the older adults, significant correlation was only found for the RTs (Kendall's Tau = 0.40, p = .008 for RTs; Kendall's Tau = 0.09, p = .341 for error rate).
No data were excluded based on excessive motion. The magnitude of motion was significantly higher in the older adults in all directions (p < .001 for all); therefore, motion parameters were included as regressors in the analyses.
A main effect of Tasks (word and picture rhyming relative to baseline, Table 1) was found in the occipital cortex bilaterally, extending in both hemispheres into the fusiform gyrus and the superior parts of the cerebellum. Smaller clusters were found in the right MFG and in the left, but not right, precentral gyrus and the left inferior parietal region (p < .05 FWE corrected). In the IFG, activation was observed in the left pars triangularis (pTri/BA45; x = −50, y = 44, z = 2, Z score = 4.88; x = −48, y = 40, z = 18, Z score = 4.72). These clusters extended into the left MFG (Figure 2). No simple effects of Task (words > pictures or pictures > words) were found (p < .05 FWE corrected). There were no differences between tasks in any of the right or left IFG regions or in the left insula, even at an uncorrected threshold (p < .001). Because of the difference between the age groups in education level, we repeated the analyses with education level as a covariate. The effects reported here did not change except for an effect for pictures > words in the right fusiform gyrus, when corrected for education level (p < .05 FWE corrected).
|Region|BA|x|y|z|Peak t Value|
|---|---|---|---|---|---|
|Occipital lobe/fusiform gyrus||||||
|R lingual gyrus|18/19/37|16|−90|−4||
|L inferior occipital|18/19/37|−36|−86|−4||
|L inferior parietal|7|−26|−60|40|5.61|
|L inferior parietal|40|−50|−50|40|5.35|
|L IFG (pTri)|45|−48|40|18|5.26|
|L IFG (pTri)|45|−50|44|2|5.48|
Rhyming tasks > baseline; p < .05, FWE corrected. t values are listed only for the peak coordinate in every cluster. x, y, z coordinates are given in MNI space.
MFG = middle frontal gyrus; IFG = inferior frontal gyrus; pTri = pars triangularis.
In a whole-brain search, no significant effects of Age nor significant interactions between Age and Task were found (p < .05 FWE corrected). In the ROI analysis, activation was stronger for older than younger participants across the word and picture conditions in the right pTri (x = 36, y = 24, z = 14, Z score = 4.74; p < .05 FWE corrected). In right pars orbitalis (pOrb/BA47), activation was higher for older than younger participants for the word condition only (x = 40, y = 36, z = −2, Z score = 5.10; p < .05 FWE corrected; Figure 3) and the difference in the effect of aging on words relative to pictures was confirmed by an interaction between age and condition in right pOrb (Z score = 3.65, p < .001 uncorrected) but not in right pTri (p > .05 uncorrected). Both age effects were a result of a positive and significant BOLD response in the older adults in the context of a nonsignificant (p > .001) negative BOLD response in the younger adults (Figure 4). There were no other significant effects of Age in the whole-brain or ROI analyses. These effects did not change when we repeated the analyses with a covariate that factored out the difference in education level between the groups.
Correlations between activation and behavioral performance
Increased activation in right pTri and pOrb for older compared with younger participants was observed in the context of no significant differences in task performance (Mann–Whitney tests, p > .2 for success rate and RTs in all conditions). To explore the functional contribution of the increased right IFG activation in older adults, we correlated activation (during correct trials only) with behavior (accuracy and RTs during correct trials) in both older and younger groups. Activation in the right pOrb did not correlate with performance in any of the conditions or groups. Therefore, the results below focus on details of the correlations in right pTri only.
In both groups, pTri activation during correct trials was higher in those participants who made more errors overall. In the older group, this was found for the word condition (Pearson's r = 0.544, p = .008 for errors; r = 0.262, p = .139 for RTs during correct trials) but not in the picture condition (Pearson's r = 0.268, p = .134 for errors; r = 0.011, p = .482 for RTs during correct trials) and there were no significant differences between words and pictures (Fisher's exact test, p > .05). In the younger group, right pTri activation correlated only with error rate in the picture task (Pearson's r = 0.578, p = .032) but not in the word task (Pearson's r = −0.228, p = .250), and this task difference was significant (Fisher's exact test, p = .050).
A comparison of the correlations in the older and younger participants showed no significant difference in the strength of the correlation for the picture task (Fisher's exact test, p = .374), but the correlation between right pTri activation and performance on the word task was significantly higher in the older than the younger participants (Fisher's exact test, p = .009).
In summary, the most significant finding was that right pTri activation for correct trials in the word task was higher in older participants who made more errors. This finding remains significant after correction for multiple comparisons (Bonferroni correction, α = 0.0125). We note that activation in right pTri showed an additive effect of age (older > younger) and error rates (high > low). This is illustrated in Figure 5.
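The corrected threshold of 0.0125 is what a Bonferroni correction gives for four tests at α = .05. The sketch below illustrates the arithmetic. Which four tests formed the family is not stated, so treating the four older-group correlations (two tasks by two behavioral measures) as the family is an assumption; the p values are the ones reported above for that group.

```python
def bonferroni_alpha(alpha, n_tests):
    """Per-test threshold under a Bonferroni correction."""
    return alpha / n_tests


# Reported p values for the older group (right pTri, correct trials only).
# Assumption: these four tests make up the corrected family.
P_VALUES = {
    "word_errors": 0.008,
    "word_rt": 0.139,
    "picture_errors": 0.134,
    "picture_rt": 0.482,
}


def surviving(p_values, alpha=0.05):
    """Names of the tests whose p value falls below the corrected threshold."""
    threshold = bonferroni_alpha(alpha, len(p_values))
    return [name for name, p in p_values.items() if p < threshold]


if __name__ == "__main__":
    print(bonferroni_alpha(0.05, len(P_VALUES)))
    print(surviving(P_VALUES))
```

Under this assumption, only the correlation between right pTri activation and error rate in the word task falls below the corrected threshold.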
Behavioral Results (Out-of-Scanner Tasks)
On the written rhyme judgment task, performance was significantly better in the older than younger group (average error rate was 1.8 ± 2.2% for the older adults and 3.5 ± 2.2% for the young adults; Mann–Whitney test, p = .039). On the homophone judgment task, the two groups did not differ in their performance (average error rate 1.9 ± 2.7% for older participants and 1.8 ± 2.3% for younger participants; Mann–Whitney test, p = .95).
VBM Imaging Results
After factoring out age group differences at the global level (see Methods), the local effects of Age (older < young) were observed in the WM of the left internal capsule and the corticospinal tract bilaterally (p < .05 FWE corrected for multiple comparisons across the whole brain). There were no significant correlations between local GM or WM density across the entire brain or in the ROIs and performance (in-scanner or out-of-scanner measurements), or between GM density and activation in right pTri or pOrb (word or picture conditions; p > .001 uncorrected).
In this study, the effects of aging on phonological word retrieval were explored by comparing BOLD activation patterns and GM and WM density in two groups of subjects: younger and older adults. The group of older adults showed reductions in WM density and increases in BOLD activation. Outside the scanner, older adults did not significantly differ from younger adults in their performance on the homophone judgment task but were significantly better than the younger participants on the written rhyme judgment task. However, neither GM nor WM density was related to performance on either task.
In the fMRI task, the two groups had similar levels of performance. That is, RT and error rates were similar for the two groups of participants in all tasks. Whereas the younger subjects did not show any activation patterns significantly greater than those of the older participants, the older participants showed significantly greater activation in the right IFG, therefore confirming our initial hypothesis that the most significant effect of aging would be in this brain region and corroborating the findings from previous studies (Wierenga et al., 2008).
Our second hypothesis was that the effect of aging would be greater during the picture task than the word task because the picture task is more dependent on semantically mediated phonological retrieval. This hypothesis was not confirmed. To the contrary, in right pTri, there was no significant difference in the effect of aging between the word and the picture task, and in right pOrb, the effect of aging was more significant during the word task than the picture task. This pattern of effects is not consistent with the effect of aging arising at the level of semantically mediated phonological retrieval. An alternative explanation is that right inferior frontal activation increases when the demands on inhibition are high (Aron, Robbins, & Poldrack, 2004). For example, in a recent study, Lenartowicz, Verbruggen, Logan, and Poldrack (2011) associated right pTri activation with the reprogramming of action plans. Consistent with this conclusion, several language studies have reported right inferior frontal activation in the context of ambiguous semantic information (Dick, Goldin-Meadow, Hasson, Skipper, & Small, 2009; Peelle, Troiani, & Grossman, 2009; see Price, 2010, for a review; Schmidt & Seger, 2009; Snijders et al., 2009). In this context, the increased right inferior frontal activation we observed for correct rhyme judgment trials in older relative to younger subjects might reflect the need to control phonological interference that conflicts with the correct response and must therefore be inhibited. Interference is likely to be greater for words than for pictures because in English the phonology retrieved from semantics is not entirely consistent with the phonology retrieved from sublexical orthography (Plaut, McClelland, Seidenberg, & Patterson, 1996; Coltheart, Curtis, Atkins, & Haller, 1993; Paap & Noel, 1991). 
In our study, stimuli were constructed to ensure that subjects retrieved the phonological form of the word rather than relying on orthography alone to perform the rhyme judgment task. This was achieved by using word pairs that had orthographically dissimilar endings (e.g., bear–chair). This creates interference because the orthographical forms of the words suggest that the words do not rhyme although they actually do. This explanation is in line with the often-cited “inhibition-deficit hypothesis” (Dempster, 1992), which suggests a decline in inhibitory function with age. Such decline was previously documented in various cognitive domains, including reading (Kane, Hasher, Stoltzfus, Zacks, & Connelly, 1994), verbal learning (Persad, Abeles, Zacks, & Denburg, 2002), working memory (Lustig, May, & Hasher, 2001), and reasoning (Viskontas, Morrison, Holyoak, Hummel, & Knowlton, 2004). This explanation further suggests that, to examine the difference between semantically versus nonsemantically mediated phonological retrieval, future studies should use words with superficial orthography, thereby reducing the interference created in our study.
As we did not find a difference between the two age groups in performance, we suggest that age-related activation can overcome a potential inhibition deficit, resulting in normal behavior, similar to previous behavioral studies (see review by Burke, 1997). It may be the case, of course, that there is a certain threshold beyond which additional neuronal activation cannot overcome age-related deficits, resulting in behavioral deficits.
Our third hypothesis concerned the correlation between BOLD activation and behavioral performance. The co-occurrence of overall aging-related increases in the level or extent of brain activation together with equal levels of behavioral performance is consistent with a compensatory mechanism that enables older participants to maintain performance at a level equal to that of the younger participants (Cabeza, 2002; Dempster, 1992). To find evidence for the nature of this compensatory strategy, we correlated activation during correct rhyming trials with the overall performance of the participants. This allowed us to determine whether right IFG activation was higher in participants who made more or fewer errors. We found that for both groups right pTri activation, during correct trials, was positively correlated with the number of errors. Such a correlation is consistent with right inferior frontal activation supporting successful responses in error-prone subjects (participants who had an overall lower level of performance in the task, i.e., a lower percentage of correct trials). This was predicted by the hypothesis above that right inferior frontal activation is serving to control interference from competing responses, particularly in participants who are prone to errors. One way to test this conclusion further would be to investigate whether participants make more errors when TMS is applied to right pTri and right pOrb (see Hartwigsen et al., 2010, for a study of TMS to the right pars opercularis).
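The brain–behavior analysis described above relates each participant's activation on correct trials to that participant's overall error rate. A minimal sketch of one way to compute such an association is given below, using Spearman's rank correlation. The choice of a rank-based coefficient is an assumption made here because error rates are typically skewed; this section does not state which coefficient was used, and the function names are hypothetical.

```python
from math import sqrt


def midranks(values):
    """Return 1-based ranks, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda k: values[k])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j < len(order) and values[order[j]] == values[order[i]]:
            j += 1
        for k in order[i:j]:
            ranks[k] = (i + 1 + j) / 2.0  # mean of ranks i+1 .. j
        i = j
    return ranks


def pearson(a, b):
    """Pearson correlation between two equal-length sequences."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((u - mean_a) * (v - mean_b) for u, v in zip(a, b))
    sd_a = sqrt(sum((u - mean_a) ** 2 for u in a))
    sd_b = sqrt(sum((v - mean_b) ** 2 for v in b))
    if sd_a == 0 or sd_b == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / (sd_a * sd_b)


def spearman(a, b):
    """Spearman rank correlation: Pearson correlation of the midranks."""
    return pearson(midranks(a), midranks(b))
```

Given two hypothetical per-participant lists, `spearman(error_rates, ptri_activation)` returns a value between -1 and 1; a positive value would correspond to the pattern reported above, in which more error-prone participants show higher right pTri activation on correct trials.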
Comparing the correlations across regions and groups elucidated the importance of this activation further. First, the correlations among both groups were specific to the right pTri region, and we found no correlations between right pOrb activation and performance or tissue density. A well-documented anterior–posterior division of Broca's area suggests that the anterior parts of Broca's area, the left pTri and pOrb, are involved in semantic processing (Devlin, Matthews, & Rushworth, 2003; Wagner, Pare-Blagoev, Clark, & Poldrack, 2001; Wagner, Koutstaal, Maril, Schacter, & Buckner, 2000; Poldrack et al., 1999), whereas the more posterior pars opercularis is involved in phonology (reviewed in Price, 2010; Hagoort, 2005; Bookheimer, 2002). We are unaware of any studies that distinguish between the function of the pTri and that of the pOrb in either hemisphere. Our study suggests that, at least in the right hemisphere, the pTri and pOrb might have different roles, and these regions should be explored separately. Second, the correlation between right pTri activation during successful trials and error rate was significantly higher for older adults compared with the younger ones during the word rhyming task. This is in line with our interpretation above that right IFG activation is serving to control interference and serves as a compensatory mechanism that results in successful performance, especially in older participants who are error-prone.
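The second point above compares a correlation obtained in one group with the corresponding correlation in the other group. One standard way to compare two correlations from independent samples is Fisher's r-to-z transformation, sketched below. This is offered as an illustration under stated assumptions, not as the method used in the study: the section does not describe how the group difference was tested, the function name is hypothetical, and the approach assumes Pearson correlations from approximately bivariate normal data.

```python
from math import atanh, erfc, sqrt


def compare_independent_correlations(r1, n1, r2, n2):
    """Fisher r-to-z test for two correlations from independent groups.

    Hypothetical helper for illustration. r1, r2 are correlation
    coefficients; n1, n2 are the group sizes. Returns (z, two-sided p).
    """
    if n1 <= 3 or n2 <= 3:
        raise ValueError("each group needs more than 3 observations")
    # Fisher transformation makes the sampling distribution roughly normal.
    z1, z2 = atanh(r1), atanh(r2)
    # Standard error of the difference between the transformed values.
    se = sqrt(1.0 / (n1 - 3) + 1.0 / (n2 - 3))
    z = (z1 - z2) / se
    p = erfc(abs(z) / sqrt(2))  # two-sided p value from the standard normal
    return z, p
```

For instance, `compare_independent_correlations(r_older, n_older, r_younger, n_younger)` with hypothetical inputs yields a positive z when the correlation is stronger in the first group. Because the standard error depends only on the group sizes, small groups require a large difference in correlations before the test reaches significance.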
Lastly, we hypothesized that declining performance on phonological tasks might be associated with regionally specific reduction in tissue density. This was not found, most likely because of the very low variability in the behavioral scores or because there were no regionally specific differences in GM density when comparing the older and young adults. This suggests that age-related differences in activation cannot be easily explained by regionally specific loss of tissue in older adults.
We did not find age-dependent left insular activation, as shown by Shafto et al. (2007, 2010). The reason for this might lie in the difference in the tasks employed. Shafto et al. used a picture naming task where subjects were asked to indicate whether they knew the person's name or not or whether they experienced a TOT state. This judgment can, at least in some cases, be achieved without retrieving the phonological form of the word. In our study, participants were compelled to retrieve the phonological form of the word to make a correct response. As a result, the studies evoked somewhat different processes. Lastly, we did not detect activation in the right precentral gyrus for either age group, and we were therefore unable to detect the age effect reported by Wierenga et al. (2008). However, the activation in the right precentral gyrus in Wierenga et al. might not be specific to language processing in general or word retrieval in particular, but likely reflects a predominantly motor process related to overt speech. In our study, silent speech without articulation was used, which can explain the discrepancy in the results.
Limitations of the Current Study
One of the limitations of this study is that the two age groups were not matched on level of education, although education level did not influence our results. The majority of previous studies have used education level as a proxy marker of cognitive reserve (Valenzuela & Sachdev, 2006), and cognitive reserve has been, in turn, found to be related to cognitive function (Valenzuela & Sachdev, 2006) and brain structure and function (see Bartres-Faz & Arenaza-Urquijo, 2011, for a review) in aging. This was not reflected in our results. This might be because, in our study, differences between the age groups in education level more likely reflected historical changes in the education system and society in the United Kingdom, rather than reliably reflecting the participants' level of cognitive reserve. Future studies should measure education level or cognitive reserve using methods with higher validity.
A second limitation is the finding that older adults moved more while in the scanner. We tried to account for motion by including motion parameters as regressors in our analyses. However, future studies should try to equate the groups for amount of motion.
Another limitation is that there was time pressure on tasks performed inside the scanner but not on those performed outside the scanner. This can create a possible speed–accuracy trade-off affecting performance in the scanner, but not outside the scanner. However, our data suggest that such a trade-off did not occur: Error rates were actually lower in the in-scanner word rhyme judgment task, compared with the out-of-scanner word rhyme judgment task (Wilcoxon Signed Rank Test, p = .003). The lack of a speed–accuracy trade-off in the in-scanner task can be explained by the fact that the time window in the fMRI task was actually quite long: 7 sec for each trial. In practice, participants performed the task much faster, with the highest RTs on the word task being just under 2.5 sec.
This study provides a novel interpretation of the different levels of neural activation in young and old participants during word retrieval, particularly the increased activation in right IFG regions for older adults. Contrary to our expectations, increased right IFG activation with aging cannot be explained in terms of increased demands on semantically mediated phonological retrieval. If this explanation had been correct, then we would have seen a greater effect of aging on rhyme judgments with pictures than words but instead we saw a greater effect of aging on rhyme judgments with words than pictures. We therefore suggest, in agreement with previous studies, that the effect of aging on right pTri activation reflects the need to control for interference in error-prone subjects. A second novel observation was that activation in the same participants and in the same brain region can demonstrate different aging effects in different tasks. Specifically, the effect of aging in right pOrb was only observed in the word task, not the picture task. Third, we distinguish the effect of aging in two different adjacent regions. In the right pTri, activation increased with age for both the picture and word task but the effect of aging in right pOrb was specific to the word task. This suggests that, at least in the right hemisphere, pTri and pOrb might have different roles, and therefore, these two regions should be investigated independently. Finally, our results highlight the importance of combining behavioral, structural, and functional data when trying to understand aging-related changes in the brain.
We thank Tulasi Marrapu for helping with the analysis and Smriti Agarwal for her comments on the manuscript. S. G. was supported by the Pinsent-Darwin Fellowship, Wingate scholarship, The Cambridge Overseas Trust, and a B'nai Brith Scholarship. P. S. J. was funded by the Cambridge Comprehensive Biomedical Research Centre. E. A. W. received support from the Biomedical Centre Grant to Cambridge from the UK National Institute of Health Research. Imaging was funded by the MRC grant no. G0500874.
Reprint requests should be sent to Sharon Geva, Department of Clinical Neurosciences, University of Cambridge, R3 Neurosciences − Box 83, Addenbrooke's Hospital, Cambridge, CB2 0QQ, UK, or via e-mail: firstname.lastname@example.org.
Present address: Developmental Cognitive Neuroscience Unit, UCL Institute of Child Health, 30 Guilford Street, London, WC1N 1EH, UK.