Abstract

We describe a novel method for tracking the time course of visual identification processes, here applied to the specific case of letter perception. We combine a new behavioral measure of letter identification times with single-letter ERP recordings. Letter identification processes are considered to take place in those time windows in which the behavioral measure and ERPs are correlated. A first significant correlation was found at occipital electrode sites around 100 msec poststimulus onset that most likely reflects the contribution of low-level feature processing to letter identification. It was followed by a significant correlation at fronto-central sites around 170 msec, which we take to reflect letter-specific identification processes, including retrieval of a phonological code corresponding to the letter name. Finally, significant correlations were obtained around 220 msec at occipital electrode sites that may well be due to the kind of recurrent processing that has been revealed recently by TMS studies. Overall, these results suggest that visual identification processes are likely to be composed of a first (and probably preconscious) burst of visual information processing followed by a second reentrant processing on visual areas that could be critical for the conscious identification of the visual target.

INTRODUCTION

Letters represent a perfect example of the kind of symbol that humans thrive on. Letters are simple, overlearned, and easy-to-control two-dimensional visual patterns associated with a single name. They also carry a certain degree of complexity, owing to interletter similarities and to the large within-letter form variability introduced by case and font variations. They therefore offer a paradigm case for the investigation of visual pattern recognition in humans (for a recent review, see Grainger, Rey, & Dufau, 2008).

Direct recording of brain activity using magnetoencephalography (MEG) or ERPs has provided evidence regarding the early temporal dynamics of the processes involved in letter perception. MEG studies revealed an occipital activation at 100 msec after stimulus onset that was not sensitive to the specific content of the stimulus and that has been interpreted as reflecting low-level visual feature processing (Tarkiainen, Cornelissen, & Salmelin, 2002; Tarkiainen, Helenius, Hansen, Cornelissen, & Salmelin, 1999). Subsequent inferior occipito-temporal activation was found at around 150 msec poststimulus onset and was interpreted as reflecting the earliest stage of stimulus-specific processing.

Similarly, several ERP studies have reported results consistent with these MEG studies indicating that a similar amount of time (i.e., around 150 msec) is needed to reach an item-specific identification activity (e.g., Rey, Dufau, Massol, & Grainger, 2009; Petit, Midgley, Holcomb, & Grainger, 2006; Wong, Gauthier, Woroch, DeBuse, & Curran, 2005). For example, in a masked priming ERP study, Petit et al. (2006) found a difference starting at 110 msec between ERPs of letter targets preceded by primes in the same case compared with targets preceded by primes in the other case (i.e., a mismatch at the level of elementary visual features). After 150 msec, ERPs for targets preceded by different-letter primes in the same case diverged from ERPs for targets preceded by the same letter in the same case (i.e., a mismatch at the level of letter identity or letter name). Wong et al. (2005) also found an early component peaking at 170 msec after stimulus onset (N170, using an average reference) that was enhanced with single familiar characters (such as letters) relative to unknown characters or pseudoletters. Finally, Rey, Dufau, et al. (2009) compared ERPs to letters and matched nonletters (by controlling the information provided at the level of visual features) and found an initial difference between these two conditions starting at 145 msec after stimulus onset.

Overall, these results are consistent with a two-stage hierarchical model of visual object recognition that includes a first elementary level of visual feature processing occurring around 100 msec after stimulus onset and followed around 150 msec by the initiation of item-specific identification processes. As proposed by Rey, Dufau, et al. (2009), this model is also in line with the hierarchical organization of the visual system (e.g., Rolls, 2000, 2007; Van Essen, 2005; Felleman & Van Essen, 1991) and the presence of object- and category-selective neurons in higher visual areas (Quiroga, Reddy, Kreiman, Koch, & Fried, 2005; Kreiman, Koch, & Fried, 2000; Logothetis, Pauls, & Poggio, 1995; Perrett, Rolls, & Caan, 1982). Moreover, simulations done with a hierarchical interactive activation model of letter perception revealed that a model including excitatory interlevel connections (feedforward and feedback) and inhibitory intralevel connections provided a better account of the variability observed on individual letter ERPs around 150 msec after stimulus onset (Rey, Dufau, et al., 2009).

Recent studies using MRI-guided stereotaxic single-pulse TMS have complemented this hierarchical description by showing that visual perception processes could be affected by both early and late stimulations on occipital areas corresponding to V1/V2 (Koivisto, Railo, Revonsuo, Vanni, & Salminen-Vaparanta, 2011; Camprodon, Zohary, Brodbeck, & Pascual-Leone, 2009). For example, using a mammal/bird discrimination task, Camprodon et al. (2009) observed a small but significant decrease in the proportion of correct responses when the TMS pulse occurred at either 100 or 220 msec after stimulus onset. The early effect was interpreted as reflecting a disruption of the initial feed-forward processing sweep, whereas the later effect would reflect the disruption of feedback projections to V1. These results suggest that feedback amplification could be a fundamental feature of visual identification and/or consciousness (Crouzet & Cauchoix, 2011; Dehaene & Changeux, 2011; Lamme, 2006; Bullier, 2001a, 2001b). Indeed, although fast categorization can be obtained on the basis of an initial feed-forward processing sweep that allows the detection of a categorical feature or a broad preidentification of the visual input (e.g., VanRullen, 2007; Thorpe, Fize, & Marlot, 1996), recurrent processing would be important for completing a full identification of the visual target and would be critical for generating a reportable subjective experience (i.e., conscious perception).

Within the general framework of a two-stage hierarchical model of visual perception, the initial feed-forward processing sweep would correspond to preidentification processes that could be sufficient for accurately performing certain experimental tasks (e.g., category discrimination task). The later recurrent processing wave would be needed for a full—conscious (i.e., reportable)—identification of the visual target. The purpose of this study was precisely to further explore the dynamics of these identification processes.

To increase our understanding of the time course of these processes, here we adopt a correlational logic based on the recording of between-item processing variability. Indeed, one way to track identification processes is to work on the processing variance generated by a sample of items, such as the 26 letters of the alphabet in this study. If we have a behavioral index that can be confidently linked to the variance related to letter identification processes, then this identification index can be used to directly track the time course of identification processes through electrophysiological measures that have a fine-grained temporal precision, such as ERPs. If this assumption is correct, identification processes should then be revealed by significant correlations between the behavioral index and the variability of between-letter ERP amplitudes.

Following a psychophysical approach, we recorded single-letter behavioral measures and single-letter ERPs for a small sample of participants (n = 5), each participant being involved in both experiments and having to process a large number of trials per letter (to achieve reliable letter-level measures and the highest signal-to-noise ratio). Our first goal was to obtain a behavioral index of letter identification processes. For that purpose, we used an experimental paradigm combining an immediate naming and a conditional delayed naming task. In the immediate naming task, participants simply had to name as quickly as possible a target letter that was displayed on a computer screen. Then, on each trial, after naming the letter, participants performed a conditional delayed naming task. After a variable delay following their naming response, either a green or a red circle was presented, and participants had to repeat the target letter's name they had just produced, only when they saw a green circle. Two measures were then recorded: immediate naming and delayed naming response times.

The behavioral index of letter identification processes was then calculated on the basis of these two behavioral measures by applying the following logic. The immediate naming measure is assumed to include two main sources of variance that are related to the two main processes involved in letter naming. The first source of variance comes from visual identification processes, and the second source is related to output articulatory processes. Similarly, one can assume that the delayed naming measure is mainly affected by a single source of variance related to output articulatory processes, because the visual identification of the green circle is supposed to be constant across the repeated letters. Having these 26 behavioral measures of immediate naming and delayed naming, it is possible to compute a simple linear regression in which naming times are explained by delayed naming times. Because the naming and delayed naming measures are assumed to share one source of variance that is related to output articulatory processes, the result of the regression will provide an estimate of the naming variance explained by these articulatory output processes. Conversely, the residual values of the regression will provide the remaining variance from naming times that corresponds to visual identification processes. These resulting letter-level residual values can therefore be used as a signature of the variability of interletter identification times. Hereafter, this behavioral index of identification processes will be referred to as the “identification index.”
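
The regression logic described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name identification_index is hypothetical, the residual is taken as observed minus predicted, and the four response times are made-up values rather than data from the study.

```python
from statistics import mean

def identification_index(immediate, delayed):
    """Residuals of a simple linear regression of immediate on delayed naming times.

    Both arguments are equal-length lists of per-letter mean response times (msec).
    Returns, for each letter, observed minus predicted immediate naming time
    (the sign convention is an assumption of this sketch).
    """
    mx, my = mean(delayed), mean(immediate)
    sxx = sum((x - mx) ** 2 for x in delayed)
    sxy = sum((x - mx) * (y - my) for x, y in zip(delayed, immediate))
    slope = sxy / sxx                # least-squares slope
    intercept = my - slope * mx      # least-squares intercept
    return [y - (intercept + slope * x) for x, y in zip(delayed, immediate)]

# Made-up values for four letters, for illustration only.
delayed = [300.0, 310.0, 320.0, 330.0]
immediate = [400.0, 425.0, 415.0, 440.0]
index = identification_index(immediate, delayed)
```

By construction, the residuals sum to zero and are uncorrelated with the delayed naming times, which is what allows them to be read as the part of immediate naming variability not shared with the articulatory output measure.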

Additionally, single-letter ERPs were collected in a standard one-back task that allowed us to record a large number of trials per letter, which is critical for obtaining reliable item-level ERPs. At each point in time and for the 26 averaged letter ERPs, it is then possible to obtain 26 amplitude values and correlate these values with the identification index for each letter. In principle, significant correlations between the two sets of values should reveal when identification processes precisely take place during the time course of visual letter processing.

METHODS

Participants

Five participants (four women and one man, 21–27 years old) took part in both the behavioral and ERP experiments. All participants were native speakers of French and reported normal or corrected-to-normal vision.

Material

Stimuli were the 26 letters of the Roman alphabet displayed in Inconsolata font. They were presented on a 17-in. cathode ray tube monitor as white on an 800 × 600 pixel black background (32 × 24 cm). The experiments were controlled by a PC using E-Prime (Psychology Software Tools, Pittsburgh, PA).

Procedure: Immediate and Delayed Naming Task

Participants were seated in a dimly lit room at a distance of 60 cm from a computer screen controlled by a PC. A trial started with a white fixation point (symbol “+”) displayed at the center of the screen for 300 msec, followed by an empty screen for 300 msec. A white letter then appeared at the center of the screen, and the participant had to say the letter name aloud into a microphone as quickly as possible. The “immediate naming time” corresponds to the delay between the presentation of the letter and the beginning of the vocal response detected by the computer. The letter disappeared immediately after the response and was followed by an empty screen for a randomized delay varying from 1 to 2 sec. On 80% of the trials, a green circle then appeared at the center of the screen, and participants had to name again the previously presented letter as quickly as possible. The “delayed naming time” corresponds to the delay between the presentation of the green circle and the vocal response. On the remaining 20% of the trials, a red circle appeared for 1 sec, and participants had to remain silent. The trial ended with the presentation of an empty screen (i.e., intertrial interval) for 750 msec. The randomized delay between the naming of the letter and the presentation of the circle and the conditional response (i.e., delayed naming was conditional on presentation of the green circle) are two methodological improvements over standard versions of the delayed naming procedure; they were expected to reduce the occurrence of anticipatory responses and improve the quality of the resulting delayed naming measure (see Figure 1A).

Figure 1. 

(A) Procedure for the immediate naming and delayed naming tasks. (B) Procedure for the one-back task during ERP recordings.

For both the immediate naming and delayed naming latencies, responses were recorded using a microphone connected to a PST Serial Response Box (Psychology Software Tools) interfaced with the computer. Each letter appeared 125 times in random order, leading to 26 × 125 = 3250 immediate naming response times per participant. Because 80% of the trials were followed by a green circle, participants had to produce 100 delayed naming responses for each letter, leading to 26 × 100 = 2600 delayed naming response times per participant. The experiment was divided into two sessions that lasted approximately 2.5 hr each. The two sessions comprised 13 and 12 blocks, respectively, one block being composed of 5 × 26 = 130 trials and lasting approximately 11 min. Participants were free to take a break between each of these experimental blocks. This first part of the experiment lasted approximately 5 hr per participant.

Procedure: ERP Task

After completing informed consent, participants were seated comfortably in a sound-attenuated and dimly lit room. They were informed that the experiment was composed of regular and test trials. On regular trials, a letter appeared at the center of the screen, and the task was simply to read it silently. On test trials, a green square was displayed indicating that they would have to compare the previous letter with the upcoming one. In this case, participants were instructed to press the right button if the two items were identical and the left button otherwise (participants held a game pad in their hands, allowing responses to be given with their index finger).

A regular trial started with a fixation point (“+”) for 700 msec, followed by an empty screen for 700 msec, the target letter for 500 msec, and an empty screen for 700 msec (intertrial interval). A test trial started with the presentation of a fixation point (“+”) for 700 msec, followed by an empty screen for 200 msec, a green square for 300 msec, an empty screen for 300 msec, a fixation point (“+”) for 700 msec, an empty screen for 700 msec, the to-be-compared letter for 300 msec, and finally, the words “same or different.” Participants then had to produce their response, and the next trial started after an empty screen of 700 msec (see Figure 1B).

Each letter appeared 100 times in random order during the experiment, leading to 2600 regular trials. Each letter was followed 12 times by a test trial, leading to 26 × 12 = 312 test trials in total (12% of the trials). The experiment was organized in two sessions that lasted approximately 1.5 hr each, with each session organized in 10 blocks. Only the ERPs recorded during the regular trials were retained for the analysis. This second part of the experiment lasted approximately 5 hr per participant (including the EEG preparation).

The EEG activity was recorded continuously through the Active Two BioSemi system from 64 electrodes mounted on an elastic cap (Electro-Cap, Inc., Eaton, OH) that was positioned according to the 10–10 International system (American Clinical Neurophysiology Society, 2006). Two additional electrodes (CMS/DRL near Pz) were used as on-line references (for a reference description, see www.biosemi.com; Schutter, Leitner, Kenemans, & van Honk, 2006). The montage included 10 midline sites (FPz, AFz, Fz, FCz, Cz, CPz, Pz, POz, Oz, and Iz) and 27 sites over each hemisphere (FP1/FP2, AF3/AF4, AF7/AF8, F1/F2, F3/F4, F5/F6, F7/F8, FC1/FC2, FC3/FC4, FC5/FC6, FT7/FT8, C1/C2, C3/C4, C5/C6, T7/T8, CP1/CP2, CP3/CP4, CP5/CP6, TP7/TP8, P1/P2, P3/P4, P5/P6, P7/P8, P9/P10, PO3/PO4, PO7/PO8, and O1/O2). Two additional electrodes were used to monitor eye movements and blinks, placed at lateral canthi, and two additional electrodes were used for an off-line rereferencing (placed behind the ears on mastoid bone). Continuous EEG was digitized at 512 Hz and was filtered off-line (30-Hz low pass, 0.05-Hz high pass, 24 dB/octave). Recordings obtained from left and right mastoid electrodes were used off-line to rereference the scalp recordings to the mean of these electrodes. Data were epoched on letter presentation during regular trials: (−100, 500) msec epoch, (−100, 0) msec baseline.
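
The off-line rereferencing and epoching steps can be illustrated with a minimal Python sketch. It assumes one channel stored as a plain list of samples at 512 Hz and a stimulus onset given as a sample index far enough from the edges of the recording; the function names and the step-like test signal are hypothetical and are not taken from the original analysis pipeline.

```python
from statistics import mean

SFREQ = 512  # sampling rate in Hz, as reported in the text

def rereference_to_mastoids(channel, left_mastoid, right_mastoid):
    """Subtract the mean of the two mastoid recordings from a scalp channel."""
    return [v - (l + r) / 2.0
            for v, l, r in zip(channel, left_mastoid, right_mastoid)]

def baseline_corrected_epoch(signal, onset, sfreq=SFREQ, tmin=-0.1, tmax=0.5):
    """Cut a (tmin, tmax) sec epoch around `onset` and subtract the prestimulus mean."""
    start = onset + round(tmin * sfreq)   # about 100 msec before onset
    stop = onset + round(tmax * sfreq)    # 500 msec after onset
    epoch = signal[start:stop]
    n_baseline = onset - start            # number of prestimulus samples
    baseline = mean(epoch[:n_baseline])
    return [v - baseline for v in epoch]

# Made-up step signal: 5 microvolts before onset, 7 microvolts after.
signal = [5.0] * 500 + [7.0] * 500
epoch = baseline_corrected_epoch(signal, onset=500)
```

With this toy signal, the prestimulus samples become 0 and the poststimulus samples become 2 μV, which is the expected effect of baseline correction.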

Epochs with eye movements, blinks, or electrical activities greater than ±75 μV were rejected from subsequent analyses (7.8% of the trials). An averaged ERP for each letter and each participant was then computed. Finally, ERPs for each letter were averaged over participants by weighting each participant's contribution by the number of remaining trials per letter.
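
As an illustration of the trial-weighted averaging across participants, the following Python sketch computes a weighted grand average for one letter. The function name and the numbers are hypothetical; waveforms are assumed to be equal-length lists of amplitudes in μV.

```python
def weighted_grand_average(erps, n_trials):
    """Average per-participant ERPs for one letter, weighting by trial count.

    erps: list of per-participant waveforms (equal-length lists of amplitudes).
    n_trials: number of artifact-free trials behind each waveform.
    """
    total = sum(n_trials)
    n_samples = len(erps[0])
    return [
        sum(w * erp[t] for erp, w in zip(erps, n_trials)) / total
        for t in range(n_samples)
    ]

# Made-up example: two participants, three time samples.
erps = [[1.0, 2.0, 3.0], [3.0, 4.0, 5.0]]
n_trials = [90, 30]
grand = weighted_grand_average(erps, n_trials)
```

Here the first participant contributes three times as many trials as the second, so the grand average lies closer to the first waveform than a simple mean would.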

RESULTS

Naming and Delayed Naming Tasks

We first excluded immediate naming and delayed naming response times faster than 250 msec or slower than 1000 msec (1.6% of the trials, 492 trials). Then, for each participant, we computed the mean and the standard deviation (SD) of the immediate naming and delayed naming times per letter. For each letter and each participant, we finally excluded response times that deviated from the mean by more than 2.5 SD (2.1% of the trials, 621 trials).
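
The two-step trimming procedure can be sketched as follows for the response times of one participant and one letter. The function name is hypothetical, the use of the sample SD (rather than the population SD) is an assumption, and the response times are made-up values.

```python
from statistics import mean, stdev

def trim_response_times(rts, low=250.0, high=1000.0, n_sd=2.5):
    """Two-step trimming for one participant and one letter.

    Step 1: drop response times outside the absolute bounds [low, high] msec.
    Step 2: drop remaining values more than n_sd SDs away from their mean.
    """
    kept = [rt for rt in rts if low <= rt <= high]
    if len(kept) < 2:
        return kept  # SD is undefined with fewer than two values
    m, sd = mean(kept), stdev(kept)  # sample SD: an assumption of this sketch
    return [rt for rt in kept if abs(rt - m) <= n_sd * sd]

# Made-up response times (msec) containing three outliers.
rts = [200, 420, 430, 440, 450, 460, 470, 480, 490, 500, 510, 900, 1200]
trimmed = trim_response_times(rts)
```

In this example, 200 and 1200 msec are removed by the absolute cutoffs, and 900 msec is then removed by the 2.5 SD criterion computed on the remaining values.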

On the basis of the remaining trials, we calculated a mean response time for each participant and each letter in the immediate naming and delayed naming tasks (see Appendix). To estimate the between-participant consistency in behavioral responses, we computed pairwise correlations between participants using the 26 immediate naming and delayed naming mean response times. The results, displayed in Table 1, indicate a strong between-participant consistency, with all correlations being significant (all ps < .05; minimum/maximum values for immediate naming = .61/.91, minimum/maximum values for delayed naming = .49/.83).

Table 1. Between-participant Correlations for the Immediate Naming (IN) and Delayed Naming (DN) Tasks

        P1          P2          P3          P4
        IN    DN    IN    DN    IN    DN    IN    DN
P2      .61   .57
P3      .91   .83   .64   .49
P4      .80   .77   .82   .65   .80   .71
P5      .78   .59   .74   .59   .71   .55   .75   .72

Mean response times for each letter and for all participants were then computed for the immediate naming and delayed naming tasks. Immediate naming times are plotted as a function of delayed naming measures in Figure 2. A simple linear regression predicting naming times by delayed naming times produced an R² of .79 (F(1, 24) = 92.4, p < .001), indicating that output articulatory processes (captured by the delayed naming measure) are responsible for a large part of the variance in immediate naming latencies.

Figure 2. 

Immediate naming response times (msec) as a function of delayed naming response times (msec).

As explained in the Introduction, to extract the variance related to output processes and retain only the variance related to identification processes, we computed the residuals from the simple linear regression between delayed naming and immediate naming. To obtain these residual values, we calculated for each letter the difference between the value predicted by the linear regression model and the observed value. These residual values are provided in the Appendix (last column on the right), with a minimal value obtained for the letter “U” (−25.1 msec) and a maximal value obtained for the letter “X” (38.2 msec). The variance related to these 26 residual values can therefore be used as a measure of interletter variability in identification processes. We now use this behavioral “identification index” in combination with the measurement of single-letter ERPs to track the time course of identification processes.

Correlation between the Residual Behavioral Values and Single-letter ERPs

As illustrated in Figure 3, from each averaged single-letter ERP, we extracted the ERP amplitude generated by each letter at each electrode site millisecond-by-millisecond from 0 to 250 msec. For example, at 100 msec, the ERP amplitudes for the four displayed single-letter ERPs (i.e., for the letters “A,” “E,” “L,” and “S”) on the FCz electrode are −1.01, −1.58, −2.33, and −1.66 μV, respectively. The 26 resulting amplitude values were then directly correlated with the identification index.

Figure 3. 

Mean ERPs for letters “A,” “E,” “L,” and “S.” At 100 msec, for each single-letter ERP, an ERP amplitude (in μV) can be obtained for each letter providing, over all letters, a measure of interletter ERP variability.

A representation of the scalp distribution of these correlations is shown in Figure 4 (i.e., the correlations between the identification index and single-letter ERP amplitudes calculated at each electrode site along the 0–250 msec time window). A color scale is used to report positive (in red) and negative (in blue) correlations that reach the p = .05 level of significance (with 24 df, significant correlations are obtained for |r| > .388). Three significant correlation time windows were obtained. First, there was a positive correlation on right occipital electrodes between 90 and 110 msec. Second, between 150 and 200 msec, a positive correlation was observed on fronto-central electrodes and a negative correlation on right occipital electrodes. Third, a positive correlation appeared again on right occipital electrodes between 200 and 250 msec.
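
The correlational analysis for a single electrode, together with the significance threshold quoted above, can be sketched in Python as follows. The function names and the toy amplitudes are hypothetical. The critical value of t for a two-tailed test at p = .05 with 24 df (about 2.064) is taken from standard tables, and the threshold on r follows from the relation r = t / sqrt(t² + df).

```python
from math import sqrt
from statistics import mean

def pearson_r(x, y):
    """Pearson correlation between two equal-length lists."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)

def critical_r(t_crit, df):
    """Correlation magnitude corresponding to a critical t value with df degrees of freedom."""
    return t_crit / sqrt(t_crit ** 2 + df)

def correlation_time_course(erps_by_letter, index):
    """Correlate, at each time sample, the per-letter amplitudes with the index.

    erps_by_letter[i][t] is the amplitude for letter i at sample t (one electrode).
    """
    n_samples = len(erps_by_letter[0])
    return [
        pearson_r([erp[t] for erp in erps_by_letter], index)
        for t in range(n_samples)
    ]

threshold = critical_r(2.064, 24)  # close to the .388 criterion used in the text

# Toy example with four "letters" and three time samples (made-up values).
toy_index = [1.0, 2.0, 3.0, 4.0]
toy_erps = [[1.0, 4.0, 1.0], [2.0, 3.0, -1.0], [3.0, 2.0, -1.0], [4.0, 1.0, 1.0]]
rs = correlation_time_course(toy_erps, toy_index)
significant = [abs(r) > threshold for r in rs]
```

Repeating the same computation for every electrode would yield one correlation time course per site, which is the kind of information summarized in a scalp map. Note that this sketch applies the same uncorrected threshold at every time sample.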

Figure 4. 

Scalp distribution of the correlations between the identification index and single-letter ERP amplitudes from 0 to 250 msec.

To obtain a more dynamic view of these effects, the evolution of these correlations during the 0–250 msec time window and for a restricted set of electrodes is shown in Figure 5 (for a subset of occipital electrodes, i.e., P6, P8, PO4, and PO8) and Figure 6 (for a subset of fronto-central electrodes, i.e., F1, Fz, F2, FC1, FCz, and FC2). First, one can note that, within each subset of electrodes, a consistent pattern of correlations across neighboring electrodes is observed. Second, on both occipital and fronto-central electrodes, we obtained similar patterns of correlations during the 0–120 msec time window: There was an absence of correlation between 0 and 50 msec, followed by a positive increase between 50 and 100 msec that reached significance on occipital electrodes only, and a decrease between 100 and 120 msec leading to a null correlation around 120 msec. Third, between 120 and 150 msec, correlations grew in magnitude on both occipital and fronto-central sites and reached significance (p < .05) between 150 and 180 msec. However, for occipital electrodes, these significant correlations were negative, whereas they were positive on fronto-central electrodes. Finally, for occipital electrodes only, correlations rapidly reversed direction and became significantly positive between 200 and 250 msec.

Figure 5. 

Evolution of the correlation between the identification index and single-letter ERP amplitudes on four occipital electrodes (PO4, PO8, P6, and P8) between 0 and 250 msec. Note that p < .05 for |r| > .388 (dotted horizontal lines).

Figure 6. 

Evolution of the correlation between the identification index and single-letter ERP amplitudes on six fronto-central electrodes (F1, Fz, F2, FC1, FCz, and FC2) between 0 and 250 msec. Note that p < .05 for |r| > .388 (dotted horizontal lines).

DISCUSSION

In this study, a behavioral index of interletter identification variability was obtained on the basis of immediate naming and delayed naming response times for individual letters. By correlating this source of variance with the interletter variability of single-letter ERPs along a period ranging from 0 to 250 msec, we observed repeated significant correlations between these two sources of variance that open a new window on the time course of visual identification processes.

The first significant correlations were observed on occipital electrodes between 90 and 110 msec. This time window has been previously attributed to low-level feature processing (e.g., Petit et al., 2006; Tarkiainen et al., 1999, 2002) that would correspond to the first stage of identification processes. The fact that these correlations appeared on occipital electrodes is also consistent with an interpretation in terms of early and elementary visual processing.

A second wave of correlations occurred on both occipital and fronto-central electrodes in the 150–200 msec time window. Again, this is consistent with previous findings that have observed a letter-specific activity starting 150 msec after stimulus onset (e.g., Rey, Dufau, et al., 2009; Wong et al., 2005), suggesting that these correlations would reflect, more specifically, identification processes at the letter level. Note that, at both occipital and fronto-central sites, before reaching significance after 150 msec, correlations became null around 120 msec, suggesting that between the first (90–110 msec) and second (150–200 msec) sets of correlations, the brain activity related to identification processes has moved from one brain region to another (although ERPs do not allow us to obtain precise information about source localization). Finally, the fact that these second correlations appeared within the same time window, together with the fact that they were positive on fronto-central electrodes and negative on occipital electrodes, suggests that they could be related to a common source of variance reflecting a single processing mechanism. Additional analyses reported below are consistent with this claim.

The third wave of correlations appearing between 200 and 250 msec on occipital electrodes may well be due to the kind of recurrent processing that has been revealed recently by TMS studies (Koivisto et al., 2011; Camprodon et al., 2009). Both the electrode localization (i.e., occipital) and the timing are consistent with an interpretation in terms of reentrant signals that would either sustain or amplify the activity in primary visual areas and contribute to the conscious identification of the target letter.

To further test these interpretations, we conducted additional correlations and stepwise regressions between these sources of variance. For that purpose, we used the interletter ERP variability of two representative electrodes (PO8 representing occipital activity and FCz representing fronto-central activity) and at different points in time corresponding approximately to the maximal correlation values that were observed (100, 170, and 225 msec for PO8; 170 msec for FCz). The resulting pattern of correlations is shown in Figure 7. This analysis revealed a triangle of significant correlations between PO8 at 100 (PO8-100) and 225 (PO8-225) msec and FCz at 170 msec (FCz-170). Furthermore, PO8 activity at 170 msec (PO8-170) was significantly correlated with FCz-170 but not correlated with PO8-100 and PO8-225. This pattern of results suggests that the activities observed on occipital electrodes at 100 and 225 msec are strongly related and probably reflect similar visual processes occurring in the primary visual cortex. Similarly, the strong correlation between FCz-170 and PO8-170 suggests that they may represent two sides of the same letter-specific identification process.

Figure 7. 

Correlations between single-letter ERP amplitudes recorded on two representative electrodes of the occipital area (PO8) and fronto-central area (FCz) and for three time points (100 msec, 170 msec, and 225 msec) corresponding to points of maximum correlation with the identification index.

To better understand this pattern of correlations, we further examined the relation between PO8-100/225 and FCz-170. We computed stepwise regressions to determine the respective role of these sources of variance in explaining the variance related to the behavioral identification index. When PO8-100 was entered first in the regression, it significantly explained 23% of the identification index variance (r = .48, F(1, 24) = 7.3, p = .012), and when FCz-170 was entered in second position, it significantly accounted for an additional part of the identification index variance (18.5%, r = .43, F(1, 24) = 5.4, p = .029). Conversely, when FCz-170 was entered first, it explained 37.2% of the identification index variance (r = .61, F(1, 24) = 14.2, p < .001), but when PO8-100 was entered in second position, it did not account for a significant part of the remaining variance (r = .24, F(1, 24) = 1.4, p = .24). These results indicate that the variance related to PO8-100 activity is included in the variance at FCz-170, which contains an independent and additional source of variance that captures variability in letter identification times (i.e., our identification index).
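
The logic of entering two correlated predictors in different orders can be illustrated with a short Python sketch based on the standard formula for R² with two predictors. This is not a reproduction of the analysis reported above: the function names are hypothetical, and the three correlations used in the example are made-up values chosen only to show how a predictor can add little variance when it is entered after a correlated one.

```python
def r2_two_predictors(r_y1, r_y2, r_12):
    """R squared of a regression of y on two predictors, from pairwise correlations."""
    return (r_y1 ** 2 + r_y2 ** 2 - 2 * r_y1 * r_y2 * r_12) / (1 - r_12 ** 2)

def f_simple(r, df_resid):
    """F statistic for a single-predictor regression with correlation r."""
    r2 = r * r
    return r2 / (1 - r2) * df_resid

# Hypothetical correlations: predictor 1 with y, predictor 2 with y, and between predictors.
r_y1, r_y2, r_12 = 0.5, 0.6, 0.7
r2_full = r2_two_predictors(r_y1, r_y2, r_12)

gain_2_after_1 = r2_full - r_y1 ** 2  # extra variance when predictor 2 enters second
gain_1_after_2 = r2_full - r_y2 ** 2  # extra variance when predictor 1 enters second

# Consistency check on a value reported in the text: r = .61 with 24 residual df.
f_check = f_simple(0.61, 24)
```

With these hypothetical values, the second predictor adds about 12% of explained variance when entered after the first, whereas the first adds only about 1% when entered after the second. This asymmetry is of the same kind as the one described for FCz-170 and PO8-100. The last line shows that r = .61 with 24 residual degrees of freedom corresponds to an F of about 14.2, as reported.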

What might be the functional role of this independent and additional source of variance? One possibility is that it is related to some form of phonological processing (i.e., the activation of a letter's name that certainly contributes to the conscious identification of a letter). To test this assumption, we used a simple index of phonological processing, that is, the number of phonemes composing a letter's name, a variable that has been shown to influence word reading performance (e.g., Rey & Schiller, 2005; Rastle & Coltheart, 1998; Rey, Jacobs, Schmidt-Weigand, & Ziegler, 1998). We found that this index of phonological processing was significantly correlated with the behavioral identification index (r = .69) and also with activity at FCz-170 and PO8-170 (r = .42 and r = −.45, respectively). On the other hand, it did not correlate with activity at PO8-100 or at PO8-225 (r = .12 and r = .14, respectively). This pattern of results therefore suggests that phonological processing (i.e., the activation of a letter's name) could play a critical role in letter identification within the 150–200 msec time window.

To summarize, the present set of results is consistent with the following scenario for the time course of letter perception. First, low-level visual feature processing is observed at occipital regions around 100 msec. This brain activity is followed, between 150 and 200 msec and at more anterior regions, by letter identification processes that enable the activation of letter names (phonological processing). Finally, between 200 and 250 msec, recurrent processing modulates activity in primary visual areas and thereby participates in the conscious identification of the letter by amplifying or sustaining the activity in these brain regions.

At a more general level, one can note that the pattern of alternating correlations found in this study does not appear to be consistent with classical views on identification processes. Indeed, rather than observing a smoothly increasing pattern of correlations that would reflect a progressive accumulation of information leading to a recognition state (as would be expected from accumulation models of perception such as the random-walk model [e.g., Ratcliff, 1978] or from interactive-activation-like models [e.g., McClelland & Rumelhart, 1981]), we found an initial rise in these correlations between 50 and 100 msec followed by a fast decrease and a fast increase again during the 100–150 msec time window. This alternating pattern suggests that identification processes might be better conceived of as a rapid flow of neuronal activity moving forward and backward between different brain regions dedicated to low-level visual processing on the one hand and item-specific identification processes on the other. The coactivation of these areas (potentially asynchronous) would contribute to the subjective experience of consciously perceiving a visual shape by creating a transient reverberating neuronal activity. To our knowledge, none of the existing computational architectures describing visual perception processes can readily account for these dynamical interactions at the moment, but this kind of functional constraint may guide the next generation of computational models.

To study the time course of letter perception processes, we used a correlational approach by combining the interitem variability obtained in a behavioral experiment with the interitem variability obtained with ERP measures. The observed pattern of results was not only consistent with the results of previous MEG and ERP studies but also allowed us to combine and integrate these results with those obtained in recent TMS studies, leading to a coherent scenario for the time course of letter perception processes from 0 to 250 msec. Specifying the dynamics of processes taking place after 250 msec will probably involve task- and response-dependent patterns of activity, which will certainly be explored in future research. Finally, this approach could be extended to other types of visual information, such as single words. The larger set of available items (i.e., >26) is likely to increase the richness of the variability indexes. However, because repeating words several times to get reliable ERP measures is probably not a good solution (word repetition may indeed interact with word identification processes, which is less likely to occur for a single letter because of the overlearned nature of these visual shapes), one can instead increase the sample of participants (e.g., up to 100) and use statistical methods that have been developed recently for response time measures to estimate the reliability of single-word ERP data (Courrieu, Brand-D'Abrescia, Peereman, Spieler, & Rey, 2011; Rey, Courrieu, Schmidt-Weigand, & Jacobs, 2009).
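The correlational approach summarized above can be sketched in a few lines. This is only an illustrative sketch: the function name `correlation_time_course`, the data layout (one list of amplitudes per item, one value per time sample), and the toy values are assumptions made for the example and are not taken from the study.

```python
import math


def pearson(x, y):
    """Pearson correlation between two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def correlation_time_course(erp_by_item, index_by_item):
    """Correlate item-level ERP amplitude with a behavioral index at each sample.

    erp_by_item: one list of amplitudes per item (same length for every item).
    index_by_item: one behavioral value per item, in the same item order.
    Returns one correlation coefficient per time sample.
    """
    n_samples = len(erp_by_item[0])
    return [
        pearson([item[t] for item in erp_by_item], index_by_item)
        for t in range(n_samples)
    ]


# Hypothetical toy data: 4 items x 3 time samples at one electrode.
toy_erp = [
    [0.1, 1.0, -2.0],
    [0.3, 2.0, -1.0],
    [0.2, 3.0, -4.0],
    [0.4, 4.0, -3.0],
]
toy_index = [10.0, 20.0, 30.0, 40.0]

for t, r in enumerate(correlation_time_course(toy_erp, toy_index)):
    print(f"sample {t}: r = {r:.2f}")
```

In practice, the same computation would be repeated for each electrode, and time windows in which the coefficients reach significance would be taken as candidate periods of identification-related activity.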

In conclusion, the results of this study provide an integrated scenario about the time course and dynamics of the processes involved in letter perception. Our data are consistent with a two-stage model of letter perception together with recent theoretical propositions about the role of recurrent processing in the conscious perception of visual information. Although our data do not provide precise information about the brain localization of letter identification processes, by assuming that visual identification processes can be divided into two interacting processing stages, we implicitly predict that at least two brain areas should be involved to produce this pattern of alternating activity. Future brain imaging studies will certainly allow us to further test this assumption.

APPENDIX: Mean Response Times in the Immediate Naming and Delayed Naming Tasks for Each Participant and Each Letter


P1        P2        P3        P4        P5        Mean Values  Residual Values
IN   DN   IN   DN   IN   DN   IN   DN   IN   DN   IN   DN
410  365  342  335  483  421  406  393  425  436  413  390     −16.2
495  412  373  350  562  505  475  421  514  502  484  438      −9.1
433  386  359  348  517  468  505  447  418  419  446  414     −15.1
480  416  387  359  544  489  518  449  541  533  494  449     −13.8
417  369  359  340  480  425  419  388  444  461  424  397     −14.5
433  366  357  315  483  431  453  368  437  420  433  380      17.2
435  385  363  351  506  456  451  399  456  443  442  407      −9.8
426  357  348  330  500  445  444  394  428  423  429  390      −0.2
435  370  387  350  513  433  440  406  490  488  453  409      −1.5
432  380  369  347  505  439  440  379  466  459  442  401      −1.8
420  357  390  334  469  412  480  412  509  489  454  401      10.2
425  377  347  330  472  422  423  392  426  419  419  388      −7.5
431  381  352  340  488  422  452  397  425  415  430  391      −0.5
429  379  365  325  496  429  449  381  428  424  433  388       6.5
427  381  357  350  483  436  417  413  449  442  427  404     −20.8
475  394  371  346  526  466  467  412  528  504  473  424      −1.4
438  369  426  354  493  416  493  412  510  502  472  411      14.9
430  369  355  332  475  425  441  383  433  408  427  383       7.2
439  389  353  328  488  429  438  390  429  418  429  391      −1.5
460  401  393  344  510  451  535  450  552  526  490  434       2.2
413  372  353  341  491  454  426  406  479  480  432  411     −25.1
499  412  387  361  575  496  542  460  559  545  512  455      −3.7
507  425  482  394  585  469  623  469  569  522  553  456      35.9
448  366  396  352  522  419  497  383  488  441  470  392      38.2
455  386  437  389  555  464  514  416  507  476  494  426      16.9
471  418  382  349  552  503  532  442  486  473  485  437      −6.8

Values are given in milliseconds (msec). Mean values over all participants are also provided together with the residual values obtained from the linear regression between immediate naming and delayed naming mean values (i.e., the identification index).

IN = immediate naming; DN = delayed naming; P = participant.
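As an illustration of how the residual values in the table relate to the mean values, the sketch below regresses the mean immediate naming times on the mean delayed naming times and takes the residuals. The direction of the regression (IN predicted from DN) is an assumption inferred from the sign and size of the published residuals, and the helper name `ols_residuals` is hypothetical. Because the mean values in the table are rounded to the nearest millisecond, the recomputed residuals can only be expected to approximate the published ones.

```python
# (mean IN, mean DN, published residual) for each row of the table, in msec.
ROWS = [
    (413, 390, -16.2), (484, 438, -9.1), (446, 414, -15.1), (494, 449, -13.8),
    (424, 397, -14.5), (433, 380, 17.2), (442, 407, -9.8), (429, 390, -0.2),
    (453, 409, -1.5), (442, 401, -1.8), (454, 401, 10.2), (419, 388, -7.5),
    (430, 391, -0.5), (433, 388, 6.5), (427, 404, -20.8), (473, 424, -1.4),
    (472, 411, 14.9), (427, 383, 7.2), (429, 391, -1.5), (490, 434, 2.2),
    (432, 411, -25.1), (512, 455, -3.7), (553, 456, 35.9), (470, 392, 38.2),
    (494, 426, 16.9), (485, 437, -6.8),
]


def ols_residuals(x, y):
    """Residuals of the simple least-squares regression y = a + b * x."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    slope = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y)) / sum(
        (xi - mx) ** 2 for xi in x
    )
    intercept = my - slope * mx
    return [yi - (intercept + slope * xi) for xi, yi in zip(x, y)]


mean_in = [row[0] for row in ROWS]
mean_dn = [row[1] for row in ROWS]
published = [row[2] for row in ROWS]

# Assumption: the identification index is the part of IN not predicted by DN.
recomputed = ols_residuals(mean_dn, mean_in)

for (m_in, m_dn, pub), res in zip(ROWS, recomputed):
    print(f"IN={m_in}  DN={m_dn}  published={pub:6.1f}  recomputed={res:6.1f}")
```

A positive residual indicates a letter that was named more slowly in immediate naming than its delayed naming time would predict, that is, a letter with a longer estimated identification time.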

Acknowledgments

This study was funded by the European Research Council (ERC Research grant 230313).

Reprint requests should be sent to Sylvain Madec, Laboratoire de Psychologie Cognitive, Centre National de la Recherche Scientifique, Aix-Marseille Université, Case D, 3 place Victor Hugo, 13331 Marseille Cedex 03, France, or via e-mail: symadec@gmail.com.

REFERENCES

American Clinical Neurophysiology Society. (2006). Guideline 5: Guideline for standard electrode position nomenclature. Journal of Clinical Neurophysiology, 23, 107–110.

Bullier, J. (2001a). Integrated model of visual processing. Brain Research Review, 36, 96–107.

Bullier, J. (2001b). Feedback connections and conscious vision. Trends in Cognitive Sciences, 5, 369–370.

Camprodon, J. A., Zohary, E., Brodbeck, V., & Pascual-Leone, A. (2009). Two phases of V1 activity for visual recognition of natural images. Journal of Cognitive Neuroscience, 22, 1262–1269.

Courrieu, P., Brand-D'Abrescia, M., Peereman, R., Spieler, D., & Rey, A. (2011). Validated intraclass correlation statistics to test item performance models. Behavior Research Methods, 43, 37–55.

Crouzet, S. M., & Cauchoix, M. (2011). When does the visual system need to look back. Journal of Neuroscience, 31, 8706–8707.

Dehaene, S., & Changeux, J.-P. (2011). Experimental and theoretical approaches to conscious processing. Neuron, 70, 200–227.

Felleman, D. J., & Van Essen, D. C. (1991). Distributed hierarchical processing in the primate cerebral cortex. Cerebral Cortex, 1, 1–47.

Grainger, J., Rey, A., & Dufau, S. (2008). Letter perception: From pixels to pandemonium. Trends in Cognitive Sciences, 12, 381–387.

Koivisto, M., Railo, H., Revonsuo, A., Vanni, S., & Salminen-Vaparanta, N. (2011). Recurrent processing in V1/V2 contributes to categorization of natural scenes. Journal of Neuroscience, 31, 2488–2492.

Kreiman, G., Koch, C., & Fried, I. (2000). Category specific visual responses of single neurons in the human medial temporal lobe. Nature Neuroscience, 3, 946–953.

Lamme, V. A. F. (2006). Towards a true neural stance on consciousness. Trends in Cognitive Sciences, 10, 494–501.

Logothetis, N. K., Pauls, J., & Poggio, T. (1995). Shape representation in the inferior temporal cortex of monkeys. Current Biology, 5, 552–563.

McClelland, J. L., & Rumelhart, D. E. (1981). An interactive activation model of context effects in letter perception: Part 1. An account of basic findings. Psychological Review, 88, 375–407.

Perrett, D. I., Rolls, E. T., & Caan, W. (1982). Visual neurons responsive to faces in the monkey temporal cortex. Experimental Brain Research, 47, 329–342.

Petit, J. P., Midgley, K. J., Holcomb, P. J., & Grainger, J. (2006). On the time-course of letter perception: A masked priming ERP investigation. Psychonomic Bulletin & Review, 13, 674–681.

Quiroga, R. Q., Reddy, L., Kreiman, G., Koch, C., & Fried, I. (2005). Invariant visual representation by single neurons in the human brain. Nature, 435, 1102–1107.

Rastle, K., & Coltheart, M. (1998). Whammies and double whammies: The effect of length on nonword reading. Psychonomic Bulletin & Review, 5, 277–282.

Ratcliff, R. (1978). A theory of memory retrieval. Psychological Review, 83, 59–108.

Rey, A., Courrieu, P., Schmidt-Weigand, F., & Jacobs, A. M. (2009). Item performance in visual word recognition. Psychonomic Bulletin & Review, 16, 600–608.

Rey, A., Dufau, S., Massol, S., & Grainger, J. (2009). Testing computational models of letter perception with item-level ERPs. Cognitive Neuropsychology, 26, 7–22.

Rey, A., Jacobs, A. M., Schmidt-Weigand, F., & Ziegler, J. C. (1998). A phoneme effect in visual word recognition. Cognition, 68, B71–B80.

Rey, A., & Schiller, N. O. (2005). Graphemic complexity and multiple print-to-sound associations in visual word recognition. Memory & Cognition, 33, 76–85.

Rolls, E. T. (2000). Functions of the primate temporal lobe cortical visual areas in invariant visual object and face recognition. Neuron, 27, 205–218.

Rolls, E. T. (2007). A computational neuroscience approach to consciousness. Neural Networks, 20, 962–982.

Schutter, D. J. L. G., Leitner, C., Kenemans, J. L., & van Honk, J. (2006). Electrophysiological correlates of cortico-subcortical interaction: A cross-frequency spectral EEG analysis. Clinical Neurophysiology, 117, 381–387.

Tarkiainen, A., Cornelissen, P. L., & Salmelin, R. (2002). Dynamics of visual feature analysis and object-level processing in face versus letter-string perception. Brain, 125, 1125–1136.

Tarkiainen, A., Helenius, P., Hansen, P. C., Cornelissen, P. L., & Salmelin, R. (1999). Dynamics of letter string perception in the human occipitotemporal cortex. Brain, 122, 2119–2131.

Thorpe, S. J., Fize, D., & Marlot, C. (1996). Speed of processing in the human visual system. Nature, 381, 520–522.

Van Essen, D. C. (2005). Corticocortical and thalamo-cortical information flow in the primate visual system. Progress in Brain Research, 149, 173–185.

VanRullen, R. (2007). The power of the feed-forward sweep. Advances in Cognitive Psychology, 3, 167–176.

Wong, A. C., Gauthier, I., Woroch, B., DeBuse, C., & Curran, T. (2005). An early electrophysiological response associated with expertise in letter perception. Cognitive, Affective & Behavioral Neuroscience, 5, 306–318.