We investigated how linguistic intention affects the time course of visual word recognition by comparing the brain's electrophysiological response to a word's lexical frequency, a well-established psycholinguistic marker of lexical access, when participants actively retrieve the meaning of the written input (semantic categorization) versus a situation where no language processing is necessary (ink color categorization). In the semantic task, the ERPs elicited by high-frequency words started to diverge from those elicited by low-frequency words as early as 120 msec after stimulus onset. In contrast, when categorizing the colored font of the very same words in the color task, word frequency did not modulate ERPs until some 100 msec later (220 msec poststimulus onset) and did so for a shorter period and with a smaller scalp distribution. The results demonstrate that, although written words indeed elicit automatic recognition processes in the brain, the speed and quality of lexical processing critically depend on the top–down intention to engage in a linguistic task.
It is widely accepted that written words automatically elicit recognition processes in the brain. For instance, the famous Stroop effect demonstrates that, when naming the colored font of visually presented words, the meaning of these words becomes activated despite being irrelevant and even harmful for performance (e.g., MacLeod, 1991; Stroop, 1935). Likewise, the masked priming literature has convincingly shown that briefly presented (subliminal) words affect the brain's response at practically all levels of linguistic processing (e.g., Grainger & Holcomb, 2009; Kinoshita & Lupker, 2003; Forster & Davis, 1984). Findings like these have led to the general notion that access to words and their meaning is an automatic process, engaged by the mere perception of the visual stimulus regardless of whether we actually intend to read the word or not (e.g., Posner & Snyder, 1975)—a property shared (often implicitly) by most models of visual word recognition (e.g., Price & Devlin, 2011; Dehaene, Cohen, Sigman, & Vinckier, 2005; Coltheart, Rastle, Perry, Langdon, & Ziegler, 2001; Grainger & Jacobs, 1996; McClelland & Rumelhart, 1981). However, over the last two decades, the position that reading unfolds entirely in an automatic fashion has been criticized. At the behavioral level, some studies have demonstrated that the Stroop effect can be modulated by context (e.g., Balota & Yap, 2006; Besner, Stolz, & Boutilier, 1997). Along similar lines, neurophysiological research has shown that the language-sensitive N400 ERP can be modulated by top–down influences such as prediction, selective attention, and task demands (e.g., Martens, Ansorge, & Kiefer, 2011; Kiefer & Martens, 2010; Kiefer & Brendel, 2006; DeLong, Urbach, & Kutas, 2005; Federmeier & Kutas, 1999; Holcomb, 1988). These results document that language processing is not fully encapsulated from attentional and goal-directed processes (e.g., Fodor, 1983).
Notwithstanding, when and how top–down processes do affect visual word recognition remain poorly understood, and surprisingly little research is available. With this study, we aim to address the when and how of top–down involvement in reading by investigating the time course of visual word recognition, as measured with the fine-grained temporal resolution of ERPs, when participants have the conscious intention to engage in language processing (semantic categorization) compared to a situation where no such intention is present (color categorization).
According to the traditional view (e.g., Herdman & Takai, 2013; Augustinova, Flaudias, & Ferrand, 2010; Proverbio, Vecchi, & Zani, 2004; Bentin, Mouchetant-Rostaing, Giard, Echallier, & Pernier, 1999; Nobre, Allison, & McCarthy, 1998; Holcomb, 1988), top–down control will only come into play after the initial automatic sensory-driven access to whole-word knowledge. That is, top–down modulations in reading are a late, reactive process, operating as a correction signal aiding response selection but leaving word recognition itself rather unaffected. Described as such, this view on visual word recognition neatly falls within the broadly embraced conceptualization of automatic versus controlled processing (e.g., Posner & Snyder, 1975), where an automatic process is fast and takes effect without a moderating influence of top–down control. However, over the last few decades, this strict dichotomy between automatic and controlled processing has come under strong scrutiny, in particular in the field of vision science. Today, there is ample evidence showing that even the earliest stages in the visual processing stream, classically defined as purely sensory driven, are affected by attention (e.g., Gilbert & Li, 2013; Corbetta & Shulman, 2002; Hillyard, Vogel, & Luck, 1998; Desimone & Duncan, 1995). For instance, in their seminal study, Corbetta, Miezin, Dobmeyer, Shulman, and Petersen (1990) demonstrated that the same visual input resulted in different activation patterns of the extrastriate visual cortex, depending on whether attention was guided toward the shape, color, or velocity attributes of a stimulus.
Findings like these have led to a radical change in the role of attention and goal-directed control in vision—one where the brain interprets input through the dynamical and immediate interplay between stimulus-driven bottom–up and attention-driven top–down processes (e.g., Gilbert & Li, 2013; Gilbert & Sigman, 2007; Bar, 2003; Corbetta & Shulman, 2002; Engel, Fries, & Singer, 2001; Kastner, Pinsk, De Weerd, Desimone, & Ungerleider, 1999; Hillyard et al., 1998; Desimone & Duncan, 1995).
Importantly, in recent years, this view has percolated to other higher-order domains of cognition as well, such as semantic processing and speech production (e.g., Kiefer, 2007, 2012; Strijkers & Costa, 2011; Strijkers, Holcomb, & Costa, 2011; Kiefer & Martens, 2010). In particular, the attentional sensitization model developed by Kiefer and colleagues (Kiefer, 2007, 2012; Kiefer & Martens, 2010) provides a detailed account of how a multitude of top–down factors can affect the construction of meaning in the human mind, both for conscious and unconscious processing. According to this model, an automatic process is not free from attention but is in fact dependent on it in order to configure the cognitive machinery in such a way that an automatic response can (fully) unfold. This is achieved through top–down biasing signals, which, along the lines of the biased competition model of spatial attention (Desimone & Duncan, 1995), will enhance the sensitivity of task-relevant processing pathways and attenuate the sensitivity of the task-irrelevant pathways. In this manner, when a stimulus enters the system, consciously perceived or not, preemptive attention as a function of the context or the perceiver's goals will dictate the type of “automatic” response the stimulus in question will elicit. Kiefer and colleagues provided positive evidence for their model in a series of elegant ERP experiments exploring how subliminal masked priming would be affected by an immediately preceding context (induction task; e.g., Adams & Kiefer, 2012; Martens et al., 2011; Kiefer & Martens, 2010). The authors observed a traditional N400 semantic priming effect only when the prior induction task was semantic in nature (e.g., living/nonliving decision), but not if the induction task was perceptual in nature (e.g., shape decision).
These findings thus suggest that control and automaticity are not two distinct and mutually exclusive concepts but rather interact with each other in a continuous fashion, where attention functions as an important “conductor” for an automatic process to unfold.
In summary, given the demonstrations that visual processing and even unconscious perception of stimuli are strongly affected by proactive top–down modulations, it seems reasonable to assume that, also for the higher-order and conscious skill of reading, top–down processes will not be restricted to late, reactive modulations, as has been traditionally assumed. In fact, a few recent neurophysiological studies of word reading have demonstrated that different task demands elicit modulations at much earlier latencies than typically thought possible (e.g., Chen, Davis, Pulvermuller, & Hauk, 2013; Strijkers, Yum, Grainger, & Holcomb, 2011; Ruz & Nobre, 2008). Ruz and Nobre (2008) showed in a cuing paradigm that a negative ERP deflection peaking around 200 msec after stimulus onset was larger when the cue oriented attention toward orthographic rather than phonological attributes of words (but see Herdman & Takai, 2013). Strijkers, Yum, et al. (2011) reported that, around 170 msec after perceiving a word, the brain's electrical response dissociates between reading aloud and semantic categorization. And similarly, Chen and colleagues (2013) found that task demands altered the neuromagnetic response to words within 150 msec, and they localized this effect to neural sources implicated in language processing. Taken together, these results indicate that the influence of different linguistic goals becomes visible at latencies when word recognition is still ongoing (e.g., Grainger & Holcomb, 2009; Pulvermüller, Shtyrov, & Hauk, 2009; Hauk, Davis, Ford, Pulvermuller, & Marslen-Wilson, 2006; Holcomb & Grainger, 2006; Hauk & Pulvermuller, 2004). Nonetheless, whether these results also signify that the mental operations of word recognition itself are affected and, if so, how they are affected cannot be established on the basis of general task differences. 
Likewise, the relationship between early top–down involvement, on the one hand, and the notion of automatic word recognition, on the other hand, is an unaddressed issue that requires empirical investigation.
In this study, rather than exploring when task effects occur, we investigated the impact of such top–down influences for processing within the word recognition system itself. To do so, we contrasted the electrophysiological signature of a well-known psycholinguistic phenomenon, namely the word frequency effect, in a language-relevant (semantic categorization) and language-irrelevant (color categorization) task (for a similar rationale employed recently in language production, see Strijkers, Holcomb, et al., 2011; see Figure 1). Word frequency refers to how often we encounter and use a word in natural language, with low-frequency words (e.g., couch) typically taking longer to process than high-frequency words (e.g., chair; e.g., Scarborough, Cortese, & Scarborough, 1977). A particularly useful feature of manipulating word frequency in the present context is that it provides a robust and well-established physiological index for tracing the onset of lexical processing (e.g., Grainger, Lopez, Eddy, Dufau, & Holcomb, 2012; Hauk et al., 2006; Hauk & Pulvermuller, 2004; Sereno, Rayner, & Posner, 1998). This permits us to explore when different intentional goals interact with the brain's linguistic processing pathway(s). Moreover, because this variable is sensitive to the initial access of whole-word knowledge, it allows us to tease apart the brain's response associated with automatic processing from that associated with top–down control. That is, although an ERP frequency effect in the nonlinguistic color task would highlight the time course of automatic word access, contrasting the latter with the neurophysiological response to word frequency in the linguistic task would demonstrate the additional contribution (if any) of the top–down intention to retrieve the meaning of words.
Because of this design, in which we compare the same psycholinguistic phenomenon linked to lexical access (word frequency) between two tasks that display distinct linguistic intentions, the following predictions can be brought forward: If reading is instantiated automatically and top–down processing is restricted to late, reactive modulations (e.g., Herdman & Takai, 2013; Augustinova et al., 2010; Proverbio et al., 2004; Bentin et al., 1999; Nobre et al., 1998; Holcomb, 1988), then we should observe a similar onset of the word frequency effect in the semantic and color categorization tasks. It is only after some lexical knowledge has been retrieved that top–down signals can bias selection and decision-making processes and dissociations in the ERP expression of the word frequency effect are expected to emerge. In contrast, if top–down involvement in reading is much more dynamic than typically held and, analogous to the predictions of the attentional sensitization model (Kiefer, 2007, 2012; Kiefer & Martens, 2010), also plays a central role in the recognition process proper, then linguistic intention should modulate the feedforward flow of activation elicited by the perception of a word near simultaneously, predicting interactions of task and word frequency at the earliest latencies.
Twenty French native speakers, all students at Aix-Marseille University (age = 18–25 years), took part in the experiment. All participants were right-handed, had normal or corrected-to-normal vision, and did not report neurological problems. They received monetary compensation for their participation in the experiment.
A total of 270 written words from a wide range of semantic categories were selected from the French Lexique database. One hundred eighty of the words corresponded to the critical items of no-go trials, which appeared in both the semantic categorization and color categorization tasks and which were equally divided (90) between low- and high-frequency items. On average, the high-frequency words had a word frequency that was 70 times higher compared with the low-frequency words (mean word frequency for high-frequency items: 219 occurrences per million; mean word frequency for low-frequency items: 3 occurrences per million). High- and low-frequency words were matched on letter length (high frequency: 6.3 letters; low frequency: 6.3 letters), bigram frequency (high frequency: 3.6; low frequency: 3.5), orthographic neighborhood size (high frequency: 2.32; low frequency: 2.71), and semantic category membership. The remaining 90 words linked to the go trials were equally divided over the two tasks in that 45 were animal names employed in the semantic categorization task and the other 45 were blue-colored words (their font) from different semantic categories used in the color categorization task. The two sets of go items were fully counterbalanced for word frequency (go trials in semantic categorization task [animal names]: 21.1 occurrences per million; go trials in color categorization task [blue-colored words]: 21.4 occurrences per million), letter length (go trials in semantic categorization task: 6.0; go trials in color categorization task: 6.2), and bigram frequency (go trials in semantic categorization: 3.2; go trials in color categorization: 3.5). Finally, none of the words used in the experiment had transparent meaning associations with any of the colors (e.g., stimuli such as grass [green] or sky [blue] were avoided).
Tasks and Design
All participants performed the two different tasks: A go/no-go semantic categorization task, where participants were instructed to push a response button as fast as possible when a visually presented word corresponded to an animal name (20% of the trials), and a go/no-go color categorization task, where participants were instructed to push a response button as fast as possible when the color font of the printed word corresponded to blue (20% of the trials). As mentioned above, although the same 180 no-go items appeared once in each task, the 45 go trials (either animals or “blue” items) were different between tasks to ensure that no response-related trials could appear (in the other task) as no-go trials. Task administration was fully counterbalanced between participants. For both tasks, the font of the written words could appear in any of the following five colors: black, green, red, blue, and magenta. Assignment of a color to a word was done in a completely randomized manner for any given participant and task, with the sole difference that in the semantic task the items associated with go trials (animals) could appear in any of the five colors, whereas in the color task all items linked to go trials were in blue. This was done to ensure that participants would not be able to associate certain go trials for one task with a particular color for the subsequent task and vice versa. The crucial comparison was between the high- and low-frequency items of the no-go trials, which were identical between the two tasks (see Figure 1). A trial in the experiment consisted of the following sequence: (1) a blank screen appeared for 500 msec, (2) a fixation cross appeared for 500 msec indicating the upcoming target, (3) the target word replaced the fixation cross and stayed on screen for 500 msec, and (4) a blank intertrial interval of 2000 msec was presented. All words were presented in the center of a standard computer screen onto a white background.
The EEG was recorded from 64 tin electrodes embedded in an elastic cap (Electrode-Cap International, Eaton, OH) and placed on the scalp of the participants. Additional electrodes were attached below the left eye (LE, to monitor for vertical eye movement/blinks), to the right of the right eye (VE, to monitor for horizontal eye movements), over the left mastoid bone (A1, reference), and over the right mastoid bone (A2, recorded actively to monitor for differential mastoid activity). All EEG electrode impedances were maintained below 5 kΩ (impedance for eye electrodes was less than 10 kΩ, and for the reference electrodes, it was less than 2 kΩ). The EEG was amplified by an SA Bioamplifier (SA International, Encinitas, CA) with a bandpass of 0.01–40 Hz, and the EEG was continuously sampled at a rate of 250 Hz throughout the experiment.
EEG Processing and Analyses
The raw EEG was segmented offline in epochs of 600 msec starting 100 msec before stimulus onset until 500 msec after stimulus onset. All trials containing eye blinks (signals exceeding ±70 μV within an epoch) or containing response errors (2%) were removed prior to averaging. The epochs were low-pass filtered at 20 Hz and averaged together according to the two levels of word frequency (high vs. low) and the two levels of task (semantic vs. color). All averaged data were baseline-corrected, utilizing the 100 msec prior to stimulus presentation. Two different types of statistical analyses were conducted on the ERP data: onset latency analyses over all electrodes to assess when and for how long the word frequency effect manifested in each task, and a more traditional time-window analysis of the mean ERP amplitudes to assess the global ERP morphology elicited by word frequency for each task and to statistically explore potential interactions between the two factors.
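As a rough sketch (not the authors' code), the epoching steps just described, 600-msec epochs from −100 to 500 msec, rejection of signals exceeding ±70 μV, and correction against the 100-msec prestimulus baseline, can be expressed in NumPy as follows; the function name and array layout are illustrative assumptions, and the 20-Hz low-pass filtering step is omitted.

```python
import numpy as np

def epoch_and_clean(eeg, onsets, sfreq=250, tmin=-0.1, tmax=0.5, reject_uv=70.0):
    """Cut epochs around stimulus onsets, reject epochs whose signal
    exceeds +/-70 microvolts (eye blinks), and subtract the 100-msec
    prestimulus baseline. `eeg` is channels x samples (in microvolts);
    `onsets` are sample indices assumed to lie away from the record edges."""
    pre = int(round(-tmin * sfreq))    # 25 samples before onset at 250 Hz
    post = int(round(tmax * sfreq))    # 125 samples after onset at 250 Hz
    kept = []
    for onset in onsets:
        seg = eeg[:, onset - pre:onset + post]
        if np.abs(seg).max() > reject_uv:          # artifact rejection
            continue
        baseline = seg[:, :pre].mean(axis=1, keepdims=True)
        kept.append(seg - baseline)                # baseline correction
    return np.stack(kept) if kept else np.empty((0, eeg.shape[0], pre + post))
```

The cleaned epochs would then be averaged per condition (high- vs. low-frequency, semantic vs. color task) to obtain the ERPs.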
For the onset latency analyses, we performed running t tests at every sampling point (4 msec) for the entire epoch (−100 to 500 msec) between low- and high-frequency items for each task separately. To correct for multiple comparisons, an onset latency for a given experimental contrast was considered significant and reliable if at least a sequence of 15 consecutive t test samples exceeded the .05 significance level (e.g., Strijkers, Costa, & Thierry, 2010; Guthrie & Buchwald, 1991). For the mean amplitude analyses, time windows were based on the results of the onset latency analyses and determined as a function of variations in the mean global field power, computed by averaging the signal of all electrodes over all participants and conditions. In that manner, the following three time windows were identified: [150–250 msec], [250–350 msec], and [350–500 msec]. Furthermore, to reduce noise and obtain a more general assessment of the mean, electrodes were clustered together (linear derivation) dividing the scalp into nine ROIs: left frontocentral (LFC: F3, F5, F7, FC3, FC5, FT7), frontocentral (FC: F1, Fz, F2, FC1, FCz, FC2), right frontocentral (RFC: F4, F6, F8, FC4, FC6, FT8), left centroparietal (LCP: C3, C5, T7, CP3, CP5, TP7), centroparietal (CP: CP1, CPz, CP2, P1, Pz, P2), right centroparietal (RCP: C4, C6, T8, CP4, CP6, TP8), left parietooccipital (LPO: P3, P5, P7, P9, PO7), parietooccipital (PO: PO3, POz, PO4, O1, Oz, O2), and right parietooccipital (RPO: P4, P6, P8, P10, PO8). We performed repeated-measures ANOVAs with Word Frequency (2 levels: high frequency vs. low frequency), Task (2 levels: semantic vs. color), and ROI (9 electrode clusters) as independent variables.
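A minimal sketch of the onset latency procedure (our illustration, not the authors' implementation): paired t tests at every sampling point, with an onset declared at the first sample beginning a run of at least 15 consecutive significant samples (the Guthrie & Buchwald, 1991, criterion). The function name is an assumption, and `t_crit` = 2.093 is the two-tailed .05 cutoff for df = 19, assuming all 20 participants contribute.

```python
import numpy as np

def frequency_effect_onset(low, high, sfreq=250.0, tmin=-0.1,
                           min_run=15, t_crit=2.093):
    """Running paired t tests between the low- and high-frequency ERPs
    (arrays of shape subjects x time). Returns the latency (in seconds,
    relative to stimulus onset) of the first sample that starts a run of
    at least `min_run` consecutive samples with |t| above the two-tailed
    critical value, or None if no such run exists."""
    d = low - high                                  # per-subject difference waves
    n = d.shape[0]
    t = d.mean(axis=0) / (d.std(axis=0, ddof=1) / np.sqrt(n))
    sig = np.abs(t) > t_crit
    run = 0
    for i, s in enumerate(sig):
        run = run + 1 if s else 0
        if run == min_run:
            start = i - min_run + 1
            return tmin + start / sfreq             # onset latency in seconds
    return None
```

Running this once per task on the subject-level data yields the per-task onset latencies compared in the Results.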
Behavioral Response Latencies and Errors
To assess differences in difficulty between the color and word recognition tasks, we ran a repeated-measures ANOVA on the response latencies for the go trials with Task as a within-subject variable. As expected, a main effect of Task was found, F(1, 18) = 67.99, p < .001, indicating that participants were faster in making a go response for the color recognition task (mean response latency: 467 msec) compared with the word recognition task (mean response latency: 594 msec). The number of errors in this study was very low, at around 1% (pooled over false alarms on the no-go trials and false hits on the go trials) for both tasks (color recognition task: 1.2%; word recognition task: 1.4%). These error rates did not differ statistically between tasks (F < 1).
ERPs: Onset Latency Analysis
These analyses revealed very robust differences in the word frequency effect across the two tasks (see Figure 2). In the go/no-go color categorization task, the first reliable (i.e., corrected for multiple comparisons) ERP differences between high- and low-frequency words emerged at the earliest 200 msec after stimulus onset at PO7 and became visible for a larger portion of posterior electrodes by ∼232 msec after stimulus onset (see Figure 2A). In the go/no-go semantic categorization task, in contrast, the first reliable differences elicited by word frequency were already present 120 msec after stimulus onset at central and posterior electrodes (see Figure 2B). Furthermore, a visual comparison of the significance graphs for the two tasks also shows a longer duration and wider distribution of the effect in the word recognition task than in the color task (compare Figure 2A and B).
ERPs: Time Window Analyses
In the first time window (150–250 msec), task goals interacted with the word frequency effect, statistically confirming the differences observed in the onset latency analyses above: there was a significant three-way interaction between Word Frequency, Task, and ROI, F(8, 144) = 3.34, p = .02. To understand this interaction, we performed 2 (Word Frequency) × 9 (ROI) ANOVAs for each task separately. For the semantic task, the ANOVA revealed a significant effect of Word Frequency, F(1, 18) = 5.16, p = .04: low-frequency items displayed more negative-going amplitudes than high-frequency items (see Figure 3A and B). This frequency effect did not differ across scalp regions, given the absence of an interaction with ROI (F < 1; see Figure 3C). Crucially, for the color recognition task, there was neither an effect of Word Frequency (F < 1) nor a significant interaction of this variable with ROI, F(8, 144) = 1.13, p = .35 (see Figure 3).
In the second time window (250–350 msec), we again observed a significant three-way interaction between Word Frequency, Task, and ROI, F(8, 144) = 4.70, p < .01. As in the previous time window, low-frequency items elicited more negative ERP amplitudes than high-frequency items in the semantic task (main effect of Word Frequency: F(1, 18) = 15.47, p < .01), and this effect did not differ significantly across scalp regions (Word Frequency × ROI interaction: F(8, 144) = 1.22, p = .31; see Figure 3). In contrast to the previous time window, however, we now also observed a significant interaction between Word Frequency and ROI for the color task, F(8, 144) = 5.68, p < .01. Separate t tests of the frequency effect in each ROI highlighted a significant frequency effect in the color task for the LPO (p = .03) and PO (p = .03) ROIs, with low-frequency words producing more negative-going amplitudes than high-frequency words, similar to the ERP frequency effect found for semantic categorization (although with a strongly reduced topographical distribution; see Figure 3).
In the final time window (350–500 msec), the three-way interaction between Word Frequency, Task, and ROI was also significant, F(8, 144) = 5.92, p < .01. ANOVAs separated by task revealed that in the semantic task there was a significant interaction between Word Frequency and ROI, F(8, 144) = 4.38, p < .01. Separate t tests of the frequency effect for each ROI indicated that this interaction was likely due to a difference in the magnitude of the frequency effect over ROIs, because Word Frequency elicited significant ERP differences for all ROIs (all ps < .01; see Figure 3). For the color task as well, there was a significant interaction between Word Frequency and ROI, F(8, 144) = 4.04, p < .01. Separate t tests of the frequency effect in each ROI suggested that this interaction was driven by a differential ERP expression of the frequency effect between anterior and posterior electrodes: whereas at the LPO (p = .05) and PO (p = .06) ROIs there was a (marginally) significant frequency effect, with low-frequency items eliciting more negative amplitudes than high-frequency items (similar to the ERP frequency effect for the semantic task), at the RFC (p = .05) ROI the frequency effect reversed in polarity, with high-frequency words giving rise to more negative ERP amplitudes than low-frequency words (see Figure 3).
This study was designed to test how the intention to engage in linguistic processing would affect the cortical dynamics of visual word recognition. With this aim in mind, we compared the brain's electrophysiological response to word frequency in the context of a task that required active retrieval of the meaning of words (semantic go/no-go categorization) versus a task where no linguistic processing was necessary (color go/no-go categorization). The main finding was a surprisingly early interaction between the type of task in which a participant was engaged and the effects of word frequency. During semantic categorization, ERPs elicited by low-frequency words started to display more negative-going amplitudes when compared to those elicited by high-frequency words as early as 120 msec after stimulus onset (see Figures 2 and 3), thereby replicating the earliest neurophysiological manifestations of word frequency reported in the literature (e.g., Pulvermüller et al., 2009; Hauk et al., 2006; Hauk & Pulvermuller, 2004; Sereno et al., 1998). In strong contrast, when responding to the colored font of the written input, the same word frequency comparison did not elicit ERP modulations until some 100 msec later in time (and this difference cannot be attributed to task difficulty given that color categorization was easier/faster than word categorization). Furthermore, the frequency effect had a much shorter duration and more restricted scalp distribution in the latter case (see Figures 2 and 3). This particular electrophysiological pattern documents that, although written words do trigger automatic recognition processes in the brain, a significantly earlier and qualitatively richer neural response to words critically depends on the top–down intention to engage in linguistic processing.
This finding is at odds with the view that top–down processes only affect late stages of reading (e.g., Herdman & Takai, 2013; Augustinova et al., 2010; Proverbio et al., 2004; Bentin et al., 1999; Nobre et al., 1998; Holcomb, 1988). Although the brain's automatic response to whole-word knowledge (i.e., word frequency) occurred around 250 msec after stimulus onset, accessing the same knowledge with the benefits of top–down influences already took effect within 150 msec (compare Figure 2A with B). This calls for a more dynamical account concerning the interplay between automaticity and goal-directed behavior (e.g., Kiefer, 2007; Balota & Yap, 2006; Naccache, Blandin, & Dehaene, 2002)—one where top–down processes modulate the automatic sensory-driven activation of words at the earliest stages of linguistic processing. More precisely, our results point to a framework where external top–down signals bias the goal-relevant processing pathways simultaneously with, or prior to, the sensory-driven activation elicited by the input, as described in the fields of vision (e.g., Gilbert & Li, 2013; Corbetta & Shulman, 2002; Hillyard et al., 1998; Desimone & Duncan, 1995) and recently extended to higher-order cognition (e.g., Kiefer, 2007, 2012; Strijkers & Costa, 2011; Strijkers, Holcomb, et al., 2011; Kiefer & Martens, 2010). According to this view, top–down signals enhance the responsiveness of task-relevant and attended representations by increasing their sensory gain (multiplicative bias) and/or baseline-firing rate (additive bias). In this manner, when a stimulus enters the system, the sensory response of the top–down enhanced representations will be stronger and with a higher signal-to-noise ratio compared to unattended or task-irrelevant features of the input.
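The multiplicative versus additive distinction can be made concrete with a toy firing-rate sketch; the linear response function and all numbers are our own illustrative assumptions, not a model from the study.

```python
import numpy as np

def response(stimulus_drive, gain=1.0, baseline=0.0):
    """Toy linear firing-rate response: top-down attention can scale the
    sensory gain (multiplicative bias) and/or raise the baseline-firing
    rate (additive bias). Purely illustrative."""
    return gain * stimulus_drive + baseline

drive = np.array([0.0, 0.5, 1.0])       # weak to strong sensory input
unattended = response(drive)             # no top-down bias
mult = response(drive, gain=2.0)         # multiplicative (gain) bias
add = response(drive, baseline=0.3)      # additive (baseline) bias

# A gain bias boosts strong inputs more than weak ones, amplifying
# differences between stimuli (e.g., between high- and low-frequency
# words); a baseline bias shifts all responses by the same amount.
diff_mult = mult - unattended            # grows with stimulus strength
diff_add = add - unattended              # constant shift
```

On such an account, a top–down bias on lexical representations would make activation differences between words detectable sooner and more strongly in the ERP signal, in line with the pattern reported here.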
Adopting this proactive top–down account offers an elegant explanation for the present results: If linguistic intention triggers a top–down signal capable of enhancing accessibility to lexical and semantic representations, the sensitivity of the ERP signal to activation differences between words, such as reflected by word frequency, will become visible sooner and more strongly in the semantic categorization task compared to the color categorization task where there will be no benefit from such proactive influences.
At a more conceptual level, the observation that an automatic reading response is not a fixed entity but rather fluctuates as a function of goal-directed intention is in contradiction with the classical temporal and functional dichotomy between automatic (early) and controlled (late) processes (e.g., Posner & Snyder, 1975). Instead, this data pattern confirms one of the core assumptions of the attentional sensitization model, namely, that automatic processing in the brain is not free from top–down control but is in fact dependent on it (e.g., Kiefer, 2007, 2012; Kiefer & Martens, 2010). Although support for this notion has so far mainly come from studies investigating unconscious processing (e.g., Adams & Kiefer, 2012; Martens et al., 2011; Kiefer & Martens, 2010), our data now extend these findings to the conscious domain. There is, however, an interesting difference between the data reported here and those of Kiefer and colleagues: Whereas in the masked priming studies a nonsemantic induction task abolished the automatic response (the N400 semantic priming effect), here the nonlinguistic task nonetheless produced a frequency effect in the ERPs. This is interesting because in the former case the data seem to suggest that preemptive control is a necessary property of an automatic response, whereas in the latter case the results rather point to a moderating (but not decisive) role of preemptive control on automatic processing. One possibility is that our ERP frequency effect in the nonlinguistic task is restricted to lower levels of processing, such as whole-word orthography, and does not percolate to the lexico-semantic representations, which could be reflected in the strong topographical differences of the frequency effects between the color recognition and word recognition tasks (see Figure 3C). In that case, our data are not necessarily that different from the masked priming data, given that there the focus lay on semantic priming.
Another theoretically relevant (and nonmutually exclusive) possibility concerns differences in the level of conscious processing across studies. If we assume that one brain correlate that can differentiate conscious from unconscious perception is the strength and depth of activation elicited by a stimulus (e.g., Dehaene & Changeux, 2011; Dehaene, Changeux, Naccache, Sackur, & Sergent, 2006), then it seems plausible that, in this study, the stimuli eliciting an automatic response, which were consciously perceivable by the participants, had a higher probability of reaching the semantic level of processing compared to the subliminal stimuli presented in the masked priming studies. If so, our data suggest that the speed and strength of an automatic brain response not only depend on the presence of top–down control but are also influenced by the degree of consciousness in the course of perception. Issues such as these will be interesting to explore further in future research in order to extend and constrain the attentional sensitization model. For the present purposes, more important are the similarities in results leading to the same conclusion that automaticity and top–down control are both functionally and temporally closely intertwined processes. Viewing the language system in this manner, as a dynamical device in which fast and efficient word retrieval relies on the immediate and continuous interaction between stimulus-driven and goal-driven activation, offers a straightforward framework for explaining why language-related effects can be observed outside the scope of attention (e.g., Stroop, 1935) but at the same time be modulated early on by top–down factors (e.g., Chen et al., 2013; Strijkers, Yum, et al., 2011; Ruz & Nobre, 2008).
To conclude, we must point out some limitations of our results and future questions that need to be addressed. Although the data convincingly show that initial access to lexico-semantic properties of a perceived word is not solely governed by automatic feedforward activation but is modulated by top–down processes in a proactive fashion, the results do not necessarily place strong theoretical constraints on current models of visual word recognition (e.g., Price & Devlin, 2011; Dehaene et al., 2005; Coltheart et al., 2001; Grainger & Jacobs, 1996; McClelland & Rumelhart, 1981). That is, based on our findings, it is sufficient for those models to acknowledge that the reading pathway(s) in the brain can be facilitated by external preparatory top–down projections, without the need to consider any other modifications concerning the specific flow of activation between the representational layers. Put differently, it might suffice to assume that the top–down intention to engage in language processing enhances access to the reading network as a whole, without actually changing the pathways and activation dynamics of the network itself. To establish whether the role of proactive top–down modulations in reading goes beyond a global effect of “placing the brain in a language state” and can also influence the activation dynamics of specific representations, future research needs to explore how the neurophysiological manifestations of several psycholinguistic phenomena (linked to different processing stages) are affected by different types of linguistic tasks. If such future investigations confirm that proactive top–down involvement in reading, as demonstrated here, can enhance specific lexical and semantic representations, then it could be concluded that the spatiotemporal activation dynamics underpinning visual word recognition can vary as a function of the type of language behavior one intends to perform and the particular linguistic knowledge required to do so.
This would mean that any explanation of key psycholinguistic phenomena has to take into consideration the linguistic intentions and attentional demands engaged in a given context, thus opening up various novel questions with respect to the cognitive and neurobiological dynamics of language processing. This study emphasizes the importance of uncovering the specific role of proactive top–down modulations in reading by demonstrating that, depending on whether or not one has the conscious intention to understand language, the brain “sees” the same words differently.
Kristof Strijkers was supported by the Intra-European Fellowship (FP7-PEOPLE-2012-IEF) of the Marie Curie Actions (grant 302807) awarded by the European Commission, and Jonathan Grainger and Daisy Bertrand were supported by an ERC Advanced Grant 230313.
Reprint requests should be sent to Kristof Strijkers, Centre National de la Recherche Scientifique (CNRS), Laboratoire de Psychologie Cognitive–Université d'Aix-Marseille, 3, place Victor Hugo, 13331 Marseille, France, or via e-mail: Kristof.Strijkers@gmail.com.
Given that all participants performed both tasks and task order was counterbalanced across participants, half of the participants started with the word recognition task and half with the color recognition task. Because task order may affect the ERPs and, in particular, the lexical frequency effect (e.g., Strijkers, Baus, Runnqvist, Fitzpatrick, & Costa, 2013), we first ran a repeated-measures ANOVA including Task Order as a between-subject variable to ensure that this particularity of our design would not affect the results. For none of the time windows (150–250 msec, 250–350 msec, 350–500 msec) did the factor Task Order reach significance (all Fs < 1), nor did it significantly interact with any of the other independent variables (all ps > .09).
Note that these results concur with an early lexical access view in reading. That is, there is an ongoing debate in the field about whether the lexical knowledge of a word becomes available late or early in the course of reading. Proponents of a late access view assume that an orthographic code triggers lexico-semantic representations after about 250–300 msec of processing (e.g., Kutas & Federmeier, 2000, 2011; Grainger & Holcomb, 2009; Holcomb & Grainger, 2006). In contrast, others have demonstrated that lexico-semantic access already takes place within the first 100–150 msec after perceiving a word (e.g., Amsel, Urbach, & Kutas, 2013; Hauk, Coutout, Holden, & Chen, 2012; Hauk et al., 2006; Hauk & Pulvermüller, 2004; Sereno et al., 1998). The present data, with a lexical frequency effect occurring within 150 msec during a semantic task and within 250 msec even during a nonlinguistic task, are in line with the evidence for early lexico-semantic access (but see Laszlo & Federmeier, 2014, for a recent appraisal in favor of the late access view).
Let us stress that we do not wish to argue that the topographical differences of the word frequency effect between tasks necessarily indicate a restricted (whole-word orthography) frequency effect. An equally plausible explanation for the topographic differences observed in this study is that, in the nonlinguistic task, the frequency effect still reflects full processing of a word (including lexico-semantic knowledge), but one that becomes activated more slowly and less strongly (weaker connectivity) compared with word activation in the linguistic task.