Recent fMRI research has demonstrated that letters and numbers are preferentially processed in distinct regions and hemispheres in the visual cortex. In particular, the left visual cortex preferentially processes letters compared with numbers, whereas the right visual cortex preferentially processes numbers compared with letters. Because letters and numbers are cultural inventions and are otherwise physically arbitrary, such a double dissociation is strong evidence for experiential effects on neural architecture. Here, we use the high temporal resolution of ERPs to investigate the temporal dynamics of the neural dissociation between letters and numbers. We show that the divergence between ERP traces to letters and numbers emerges very early in processing. Letters evoked greater N1 waves (latencies 140–170 msec) than did numbers over left occipital channels, whereas numbers evoked greater N1s than letters over the right, suggesting letters and numbers are preferentially processed in opposite hemispheres early in visual encoding. Moreover, strings of letters, but not single letters, elicited greater P2 ERP waves (starting around 250 msec) than numbers did over the left hemisphere, suggesting that the visual cortex is tuned to selectively process combinations of letters, but not numbers, further along in the visual processing stream. Additionally, the processing of both of these culturally defined stimulus types differentiated from similar but unfamiliar visual stimulus forms (false fonts) even earlier in the processing stream (the P1 at 100 msec). These findings imply major cortical specialization processes within the visual system driven by experience with reading and mathematics.
Letters and numbers (Arabic numerals) are basic building blocks of human written communication. Most adults in modern-day society receive extensive training to read and write letters and numbers until they can instantaneously and effortlessly recognize and categorize them (Hamilton, Mirkin, & Polk, 2006; Polk & Farah, 1995; LaBerge & Samuels, 1974; Jonides & Gleitman, 1972). Yet, letters and numbers are cultural inventions, and the distinction between them is physically arbitrary. They are completely meaningless figures to people with no experience with them (e.g., infants or illiterate individuals). Thus, our ability to effortlessly process letters and numbers implies a major functional specialization of the perceptual system driven by extensive experience (Dehaene & Cohen, 2007).
Research has shown that distinct brain regions are recruited during the visual perception of letters and numbers. On the basis of a large number of neuropsychological, electrophysiological, and neuroimaging studies, it is now well established that the visual processing of letters and words selectively recruits a brain region in the left inferior temporal cortex (Schlaggar & McCandliss, 2007; McCandliss, Cohen, & Dehaene, 2003). As illiterate individuals show negligible recruitment of this cortical area when processing such stimuli (Dehaene et al., 2010), these findings suggest that learning to read leads to a change in the visual system (Park, Park, & Polk, 2012). Conversely, a few studies have also indicated that there are brain regions in the ventral visual cortex that are primarily involved in processing numbers (Shum et al., 2013; Roux, Lubrano, Lauwers-Cances, Giussani, & Demonet, 2008; Polk et al., 2002; Polk & Farah, 1998; Allison, McCarthy, Nobre, Puce, & Belger, 1994). Of particular interest, a recent fMRI study has shown a double dissociation between letter and number processing in the two hemispheres (Park, Hebrank, Polk, & Park, 2012). Specifically, letters elicited greater neural activation in the left ventral visual stream, whereas numbers elicited greater neural activation in the right ventral visual stream, again suggesting that experience causes qualitative changes in the visual cortex and its processing.
Although the fMRI study of Park, Hebrank, et al. (2012) demonstrated that letters and numbers are processed in anatomically distinct regions of visual cortex, the temporal dynamics of these dissociable processes are not understood. When does the dissociation for letters and numbers occur over the course of visual processing? Does the dissociation begin at early levels of basic visual encoding, or not until later in the processing stream when visual orthographic processing is influenced by phonological and semantic processing? To address these questions, we investigated the time course of the visual processing of letters and numbers using the high temporal resolution of the ERP technique.
ERP studies of letters and words have collectively shown that the posterior N1 ERP component, particularly at left occipital temporal electrodes, reflects visual expertise for meaningful orthography. The N1 is characterized as an early negative deflection over the occipital scalp around 130–190 msec after stimulus onset. It is the earliest visual ERP component that reliably distinguishes stimulus categories and is thought to reflect a neurophysiological index of visual encoding and discrimination of important visual categories (Rossion, Joyce, Cottrell, & Tarr, 2003; Tanaka, Luu, Weisbrod, & Kiefer, 1999). In the context of visual word processing, the N1 amplitude evoked by letters or words differs significantly from the N1 amplitude evoked by control stimuli such as meaningless symbols, forms, shapes, and scripts from unfamiliar languages (Appelbaum, Liotti, Perez, Fox, & Woldorff, 2009; Mondini et al., 2008; Spironelli & Angrilli, 2007; Brem et al., 2005; Maurer, Brandeis, & McCandliss, 2005; Wong, Gauthier, Woroch, Debuse, & Curran, 2005; Bentin, Mouchetant-Rostaing, Giard, Echallier, & Pernier, 1999; Tarkiainen, Helenius, Hansen, Cornelissen, & Salmelin, 1999; Schendan, Ganis, & Kutas, 1998). These findings suggest that perceptual specialization for visual word forms occurs as early as 130 msec from stimulus onset (but see Pernet et al., 2003, for an argument on dissociation at a later latency) at the earliest categorical visual encoding level and preferentially in the left ventral visual stream.
Although numerous studies have investigated the ERP responses to letters and letter strings, we have little knowledge about the ERP responses to numbers and how they temporally dissociate from ERP responses to letters. There are a few possibilities for the temporal dissociation between letters and numbers. One possibility is that the dissociation occurs early in visual processing, in which case we might expect dissociation in the N1 or other early sensory ERP components over right occipital temporal scalp as suggested by the findings in Park, Hebrank, et al. (2012). Alternatively, as suggested by the triple-code model of numerical cognition (Dehaene, 1992), numbers may initially be processed by both hemispheres during early visual encoding (Dehaene, 1996), and the dissociation may only be apparent at a later stage once phonological or semantic information is processed and can be incorporated. Under this scenario, ERP traces to numbers should diverge from traces to letters not at the N1 level but at a later component. For instance, numbers may show greater activity than letters at the P2 level, a positive deflection starting approximately 250 msec after stimulus onset known to be modulated by phonological or semantic aspects of stimuli (Hauk, Davis, Ford, Pulvermuller, & Marslen-Wilson, 2006; Sereno, Rayner, & Posner, 1998; McCandliss, Posner, & Givon, 1997; Dehaene, 1995).
In this study, we tested the above hypotheses in two experiments. As our primary research focus was on the visual processing of letters and numbers, we used a viewing paradigm of these stimuli involving a completely orthogonal task (attending for infrequent arrow stimuli) to minimize explicit stimulus-specific top–down modulations of early visual processing. In Experiment 1, we investigated whether the ERP traces to the visual processing of short letter strings and number strings would show dissociations at scalp sites over the occipital temporal cortices. We report a clear dissociation between letter and number processing at the N1 level, showing greater N1 amplitude for letters at the left hemisphere sites and greater N1 amplitude for numbers at the right hemisphere sites. We further report a dissociation at the P2 level exclusively over the left hemisphere. In addition, we show that the processing of unfamiliar visual stimulus forms (false fonts) differed from that of both letters and numbers even earlier, starting at the P1 level (a positive deflection around 100 msec) and persisting across much of the epoch. In Experiment 2, we replicate the N1 dissociations with singly presented letters and numbers, but not the other dissociations at other time points.
A total of 21 participants were recruited from the Duke University psychology subject pool. Participants' age ranged from 18.3 to 25.0 years, with a mean of 20.0 years. Eight participants were male. All participants were right-handed, had normal or corrected-to-normal vision, and were neurologically intact. All procedures were approved by the Institutional Review Board of Duke University. Participants were given departmental class credit for their participation in the study.
Stimuli and Task
Four-character strings of letters were created randomly from a set of capital letters “BCGKLSZ,” and four-character strings of numbers were created randomly from a set of Arabic numerals “1234567” (Figure 1). For the letters, only consonants were used to discourage pronunciation. Although our primary research question was how letters and numbers are differentially processed, we also included completely novel visual stimuli (false fonts) to examine how the ERP traces of highly familiar categories of stimuli differ from a completely novel yet physically equivalent set of characters. To do so, four-character strings of false fonts were created from a set of individual false-font stimuli that were each generated by randomly rearranging features of letter and number stimuli (see Figure 1). The letters, numbers, and false fonts were selected to roughly balance the physical properties between the stimulus categories in the number of straight or nearly straight lines, curved segments, enclosures, and joints. A monospace font face (Monaco) was used for all three conditions, and each character subtended about 0.57 × 1.17 degrees of visual angle.
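The random construction of the letter and number strings can be illustrated with a minimal Python sketch. The two character sets are taken from the text; the function name `make_string`, the fixed seed, and the choice to sample with replacement are assumptions made only for illustration, because the text does not state whether a character could repeat within a string.

```python
import random

LETTERS = "BCGKLSZ"  # consonant set given in the text
NUMBERS = "1234567"  # Arabic numeral set given in the text


def make_string(charset, length=4, rng=random):
    """Draw `length` characters at random from `charset`.

    Assumption: characters are sampled with replacement; the text does not
    say whether repeats within a string were allowed.
    """
    return "".join(rng.choice(charset) for _ in range(length))


rng = random.Random(0)  # fixed seed only so that the sketch is reproducible
letter_strings = [make_string(LETTERS, rng=rng) for _ in range(5)]
number_strings = [make_string(NUMBERS, rng=rng) for _ in range(5)]
```

False-font strings are not covered here, because they were built from rearranged visual features rather than from a character set.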
Participants viewed these character strings presented in random order at the center of the screen for 150-msec durations, with stimulus onset asynchronies varying randomly from 600 to 800 msec (uniform distribution). To ensure that participants paid attention to the stimuli, an oddball detection task was imposed. Specifically, four arrowheads either pointing to the left (<<<<) or right (>>>>) occasionally appeared on the screen in the series of presented stimuli, and the participants' task was to attend for these target stimuli and discriminate whether the arrowheads pointed left or right, using their respective left and right index fingers on a game controller. A fixation dot appeared on the center of the screen when no other stimuli were present. The employment of relatively short stimulus onset asynchronies and a visual oddball detection task, along with the exclusive use of the consonants, was to minimize any explicit identification of the stimuli, such as by reading or subvocalizing. In each block of trials, a total of 420 strings (with the three stimulus categories in equal probability) and 30 oddball targets of arrowheads (<<<< or >>>>) were presented. Each participant completed four blocks, which together took a total of about 30 min of recording time.
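As a rough illustration of the trial structure described above, the following Python sketch builds one block of 420 strings and 30 oddball targets with onset asynchronies drawn uniformly from 600 to 800 msec. The function name `build_block` and the category labels are hypothetical, and the split into exactly 140 strings per category is an assumption about how "equal probability" was realized.

```python
import random


def build_block(rng, n_per_category=140, n_targets=30, soa_range=(600.0, 800.0)):
    """Return a list of (onset_ms, label) pairs for one block.

    Assumption: "equal probability" is realized as exactly 140 strings per
    category (3 x 140 = 420), shuffled together with the 30 arrow targets.
    """
    labels = ["letter", "number", "false_font"] * n_per_category
    labels += ["target"] * n_targets
    rng.shuffle(labels)

    schedule = []
    onset = 0.0
    for label in labels:
        schedule.append((onset, label))
        # Stimulus onset asynchrony drawn from a uniform distribution.
        onset += rng.uniform(*soa_range)
    return schedule


block = build_block(random.Random(1))
```

With these assumed counts, one block contains 450 stimuli and, at a mean onset asynchrony of 700 msec, corresponds to roughly 315 sec of stimulation. The sketch does not model breaks between blocks.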
Electrophysiological Recording and Analysis
The EEG was recorded continuously from 64 channels mounted in a customized, elastic electrode cap (Duke64 Waveguard cap layout, Advanced Neuro Technology, the Netherlands). Our custom cap is designed such that the electrodes are equally spaced across the cap, while also providing extended coverage of the head from just above the eyebrows anteriorly to below the inion posteriorly (Woldorff et al., 2002). The vertical EOG was monitored with electrodes below the left eye (referenced to a site above the eye), and the horizontal EOG with electrodes on the outer canthi of each eye. The ground electrode was placed on the left collarbone. Electrode impedances were maintained below 10 kΩ for the EOG channels and below 5 kΩ for all other channels. The EEG recordings were referenced online to the average of all channels. All channels were recorded with no high-pass cutoff and a low-pass cutoff of 138 Hz and were digitized at 512 Hz per channel.
ERP analyses were carried out using EEGLAB (Delorme & Makeig, 2004) and its ERPLAB toolbox (Luck & Lopez-Calderon, 2013) in Matlab R2012a. The continuous EEG data were offline band-pass filtered from 0.01 to 100 Hz. The average of all channels was used as the reference (rather than, e.g., the mastoids) to provide more sensitivity for early visual components at ventrolateral posterior electrode sites (Dien, 1998; Bertrand, Perrin, & Pernier, 1985; Lehmann & Skrandies, 1984), as in a number of other recent ERP studies investigating such activity (Mondini et al., 2008; Spironelli & Angrilli, 2007; Hauk et al., 2006; Brem et al., 2005; Wong et al., 2005; Hauk & Pulvermuller, 2004; Simon, Bernard, Largy, Lalonde, & Rebai, 2004). EEG epochs time-locked to the presentation of letter and number string stimuli were extracted from 200 msec before to 600 msec after stimulus presentation, to which a prestimulus baseline removal was applied. A step-like artifact rejection tool in EEGLAB (threshold = 30 μV; window width = 400 msec; window step = 20 msec) was used to identify any trials in the data contaminated by eye movements or blinks, which were then removed before averaging. On average, 16.9% of trials were removed by artifact rejection. Finally, the individual ERPs were low-pass filtered at 30 Hz, after which statistical analyses and grand averaging of the ERPs were performed.
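The epoching, baseline removal, and step-like artifact screening described above can be sketched in plain Python as follows. This is not the toolbox code used in the study. The function names are hypothetical, and the step criterion shown here (the difference between the mean voltages of the two halves of a moving window) is one commonly used definition that is assumed for illustration; only the numerical parameters (512 Hz, 200 msec prestimulus, 600 msec poststimulus, 30 μV, 400 msec window, 20 msec step) come from the text.

```python
SRATE = 512.0  # Hz, digitization rate given in the text


def ms_to_samples(ms, srate=SRATE):
    """Convert a duration in msec to the nearest whole number of samples."""
    return int(round(ms * srate / 1000.0))


def _mean(values):
    return sum(values) / len(values)


def extract_epoch(continuous, onset, pre_ms=200.0, post_ms=600.0, srate=SRATE):
    """Cut one epoch around sample index `onset` and subtract the
    mean of the prestimulus interval (baseline removal)."""
    n_pre = ms_to_samples(pre_ms, srate)
    n_post = ms_to_samples(post_ms, srate)
    epoch = continuous[onset - n_pre:onset + n_post]
    baseline = _mean(epoch[:n_pre])
    return [v - baseline for v in epoch]


def has_step_artifact(signal, srate=SRATE, threshold_uv=30.0,
                      window_ms=400.0, step_ms=20.0):
    """Flag a trial if, in any moving window, the mean of the second half
    differs from the mean of the first half by more than the threshold.

    Assumption: this half-window definition stands in for the toolbox
    routine named in the text.
    """
    win = ms_to_samples(window_ms, srate)
    step = max(1, ms_to_samples(step_ms, srate))
    half = win // 2
    for start in range(0, len(signal) - win + 1, step):
        first = signal[start:start + half]
        second = signal[start + half:start + win]
        if abs(_mean(second) - _mean(first)) > threshold_uv:
            return True
    return False
```

In such a scheme, the screening would typically be run on the EOG channels of each epoch, and flagged trials would be dropped before averaging.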
Two occipital temporal channels over each hemisphere were selected because they showed the maximal N1 effect across both letter and number strings in a pilot study (n = 12; similar paradigm to the current experiment but with shorter stimulus duration and shorter stimulus onset asynchronies). The more superior pair of the four channels was located closest but slightly (about 0.14 radians) inferior to PO7 and PO8 in the standard 10–20 system. We will refer to these channels as PO7i and PO8i. The more inferior pair of the four channels was located closest but slightly (about 0.11 radians) inferior to PO9 and PO10 in the standard layout. We will refer to these channels as PO9i and PO10i. These channels of interest are shown as white circles on the posterior-perspective topographic maps in Figure 3. The ERP traces from PO7i and PO9i were averaged together (denoted as PO7i/PO9i) to represent the ERPs over the left occipital ROI, and the traces from PO8i and PO10i were likewise averaged together (denoted as PO8i/PO10i) to represent the ERPs over the right occipital ROI.
The effect of stimulus categories on each of the two bilateral occipital ERP responses was assessed using a cluster-based nonparametric method (Maris & Oostenveld, 2007). This approach first identifies consecutive time points that show significant amplitude differences between two (or possibly more) conditions. Then, a maximum cluster-level test statistic (sum of the F statistic within a cluster exceeding a height threshold of p < .05) is evaluated against the null distribution, which is generated from random permutations of the waveforms by randomly assigning category labels to each waveform. Contiguous time regions with α values (false positive rate of the cluster size) below 0.05 were considered statistically significant. The contrasts of (1) letters versus numbers and (2) letters and numbers together versus false fonts were performed.
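The logic of the cluster-based permutation test can be sketched in plain Python for the two-condition, within-participant case. This is a simplified illustration, not the analysis code used in the study. It relies on the fact that, for two paired conditions, the F statistic equals the squared paired t statistic, so cluster mass is computed as the sum of t² over contiguous supra-threshold time points. The function names, the use of random within-participant label swaps (sign flips of the difference wave) to build the null distribution, and the caller-supplied critical value `t_crit` (e.g., 2.086 for 20 degrees of freedom, two-tailed p < .05) are assumptions.

```python
import math
import random


def paired_t(diffs):
    """Paired t statistic computed from per-participant differences."""
    n = len(diffs)
    m = sum(diffs) / n
    var = sum((d - m) ** 2 for d in diffs) / (n - 1)
    if var == 0.0:
        return 0.0 if m == 0.0 else math.copysign(math.inf, m)
    return m / math.sqrt(var / n)


def find_clusters(diff_waves, t_crit):
    """Return (start, stop, mass) for each run of consecutive time points
    with |t| > t_crit; mass is the sum of t**2 (equal to F) in the run."""
    n_time = len(diff_waves[0])
    found, start, mass = [], None, 0.0
    for i in range(n_time):
        t = paired_t([w[i] for w in diff_waves])
        if abs(t) > t_crit:
            if start is None:
                start, mass = i, 0.0
            mass += t * t
        elif start is not None:
            found.append((start, i, mass))
            start = None
    if start is not None:
        found.append((start, n_time, mass))
    return found


def cluster_permutation_test(cond_a, cond_b, t_crit, n_perm=1000, seed=0):
    """Return (start, stop, p) per observed cluster, with p taken from the
    permutation distribution of the maximum cluster mass."""
    rng = random.Random(seed)
    diffs = [[a - b for a, b in zip(wa, wb)] for wa, wb in zip(cond_a, cond_b)]
    observed = find_clusters(diffs, t_crit)
    null_max = []
    for _ in range(n_perm):
        flipped = []
        for wave in diffs:
            # Swapping the two labels within a participant flips the sign.
            sign = rng.choice((-1.0, 1.0))
            flipped.append([sign * v for v in wave])
        masses = [c[2] for c in find_clusters(flipped, t_crit)]
        null_max.append(max(masses, default=0.0))
    results = []
    for start, stop, mass in observed:
        exceed = sum(1 for m in null_max if m >= mass)
        results.append((start, stop, (1 + exceed) / (n_perm + 1)))
    return results


# Synthetic demonstration: 21 participants, 50 time points, and an
# added effect at samples 20-29 in condition A only.
_rng = random.Random(1)
cond_a = [[_rng.gauss(0.0, 1.0) + (3.0 if 20 <= t < 30 else 0.0)
           for t in range(50)] for _ in range(21)]
cond_b = [[_rng.gauss(0.0, 1.0) for _ in range(50)] for _ in range(21)]
results = cluster_permutation_test(cond_a, cond_b, t_crit=2.086, n_perm=200)
```

Each entry of `results` gives the first sample, the sample after the last, and the permutation-based probability of one cluster. Comparing every observed cluster against the distribution of the maximum cluster mass is what controls the false positive rate across time points.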
Results and Discussion
Participants demonstrated a high level of accuracy (98.2% on average) on the oddball (arrow-stimulus) detection task, with a mean RT of 498.3 msec (SD = 32.5 msec), indicating that participants were attentive during stimulus presentation.
Robust ERPs were evoked at the bilateral occipital ROIs by the visually presented strings of letters, numbers, and false fonts (Figure 2). Our primary contrast of interest, letters versus numbers, revealed significant amplitude differences between the two waveforms around the N1 latency in both hemispheres, with greater N1 amplitude for letters than numbers in the left occipital ROI (time window of 133–184 msec, α = 0.001) and with a reversed pattern in the right occipital ROI (111–215 msec, α = 0.006). At longer latencies, significant amplitude differences were found in the left occipital ROI only (283–377 msec, α = 0.006), with letters eliciting greater amplitude than numbers. The greater ERP responses evoked by letters in the left hemisphere and by numbers in the right hemisphere parallel the fMRI findings in Park, Hebrank, et al. (2012). The current results further show that numbers dissociate from letters at the right occipital sites as early as, if not earlier than, letters dissociate from numbers at the left occipital sites, suggesting that preferential processing of numbers occurs very early in the right visual stream. No other time points showed a significant difference between the waveforms for letters and numbers.
The topographic distributions of the ERP brainwaves at the latency of the N1 are illustrated in Figure 3. Similar scalp topography between the letter condition and the number condition suggests some common neural generators for the two stimulus categories. Yet, the difference map topographies revealed marked and focal differentiation in the relative processing at the bilateral occipital sites. Topographic maps at the P2 latency shown in Figure 4 also illustrate similar scalp topography between the two conditions but with a focal peak over the left occipital scalp in the difference maps. Such focal peaks in the inferior posterior electrodes in the difference wave topographic distributions, with little activity over other, more anterior, cortical regions, suggest that the critical neural generators that differentiate letters and numbers are likely to be located in the posterior regions of the brain, which is consistent with the findings in previous source localization studies (Brem et al., 2005, 2006; Tarkiainen et al., 1999).
In contrast to this clear differentiation in waveforms for letters and numbers, false fonts elicited greater ERP amplitudes across much of the entire epoch, starting at the first positive component or P1, in both hemispheres (from around 74 msec in the left and 72 msec in the right ROIs; see Figure 2). The P1 is generally thought to reflect initial processing in the extrastriate visual cortex driven by any visual stimulus (Hillyard & Anllo-Vento, 1998). These results suggest that, even without any explicit task given to participants, visual processing of completely novel stimuli deviates significantly from the processing of highly familiar ones, a point to which we return in the General Discussion.
Experiment 1 used strings of letters and strings of numbers, which may be processed qualitatively differently from single characters (Stevens, McIlraith, Rusk, Niermeyer, & Waller, 2013; James, James, Jobard, Wong, & Gauthier, 2005). For example, according to a theoretical model for written word processing (Dehaene, Cohen, Sigman, & Vinckier, 2005), neurons coding local contours feed into the next level of neurons that encode letter shapes and that detect abstract letters, which in turn feed into the next level of neurons that encode local bigrams and substrings. Accordingly, it is plausible that experience with letters and Arabic numerals may only shape pools of neurons at the level of local bigram and substring processing and that neurons at the lower levels may not become tuned to these culturally defined symbols. If so, ERPs to single letters and numbers would be expected to show little dissociation, if any, compared with the dissociation we observed in Experiment 1 for strings. To test this hypothesis, we conducted a second experiment in which single letters and numbers were presented.
A new group of 20 participants was recruited from the Duke University psychology subject pool for class credit. Data from one participant were discarded, as the participant was unable to stay awake throughout the experiment. The age of the remaining 19 participants ranged from 18.0 to 21.9 years, with a mean of 19.5 years. Ten participants were male. All participants were right-handed, had normal or corrected-to-normal vision, and were neurologically intact. All procedures were approved by the Institutional Review Board of Duke University.
The experimental paradigm was similar to that of Experiment 1, except that single characters from the same sets of chosen letters and numbers, randomly either letter or number, were presented. For this experiment, the oddball targets were correspondingly single arrowheads either pointing to the left (<) or right (>). False fonts were not included in this study as they were not central to this follow-up research question, allowing us to instead increase the number of letter and number trials to raise statistical power for this comparison. A total of 400 single-character stimuli and 40 oddball stimuli were presented in each block, and each participant completed four blocks. Each recording session took about 30 min. All other recording and analytic procedures were identical to those of Experiment 1. The trial rejection rate exceeded 50% in one participant, and thus, this participant was excluded from further analyses. On average, the artifact rejection rate for the remaining participants (n = 18) was about 13.3%. For two participants of the final sample, the button-press behavioral responses were not collected because of software failure.
Results and Discussion
As in Experiment 1, participants were attentive to the stimuli, as demonstrated by high accuracy in detecting the arrowhead direction (mean = 98.1%), with a mean RT of 528.6 msec (SD = 53.0 msec).
Figure 5 illustrates the ERP traces for single letters, single numbers, and their difference waves in the bilateral occipital ROIs. As in the case of character strings (Experiment 1), there were significant amplitude differences between the two waveforms around the N1 latency in both hemispheres, with greater N1 amplitude for single letters than for single numbers in the left occipital ROI (time window of 158–217 msec, α = 0.033) and with a reversed pattern in the right occipital ROI (104–162 msec, α = 0.001). Visual inspection of the right occipital responses suggested slightly earlier peaking of the single-number N1 than of the single-letter N1. To statistically test these latency differences, the latencies marking 50% of the area under the two waveforms (i.e., midpoint latency) within the time window (104–162 msec) were computed in each participant. There were no significant differences in the midpoint latency of N1 evoked by single letters versus single numbers, F(1, 17) = 0.22, p = .645. Even when a wider time window (125–217 msec) was used, the midpoint latencies between the letter and the number conditions did not show a difference, F(1, 17) = 2.85, p = .110, whereas there was a significant difference in the amplitudes between the two, F(1, 17) = 6.29, p = .023. These results suggest that the latency difference between the two conditions in the right occipital site was not reliable or at least that it was secondary to the amplitude differences between the two conditions.
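The midpoint latency measure used above can be illustrated with a short Python sketch. The function name `midpoint_latency` is hypothetical, and two details are assumptions: the area is computed on the rectified (absolute) amplitude, and the latency is reported at the first sample at which the cumulative area reaches half of the total area in the window, without interpolation between samples.

```python
def midpoint_latency(times_ms, amplitudes, window=(104.0, 162.0)):
    """Return the latency that splits the area under the waveform in half.

    Assumptions: the area is taken on absolute amplitudes, and the result is
    the first sample time at which the running area reaches 50% of the total
    area inside `window` (no interpolation between samples).
    """
    lo, hi = window
    inside = [(t, abs(a)) for t, a in zip(times_ms, amplitudes) if lo <= t <= hi]
    total = sum(area for _, area in inside)
    if total == 0.0:
        return None  # flat waveform: the midpoint is undefined
    running = 0.0
    for t, area in inside:
        running += area
        if running >= total / 2.0:
            return t
    return inside[-1][0]


# Toy example: a constant negative deflection sampled every 2 msec.
times = [float(t) for t in range(100, 171, 2)]
flat_n1 = [-1.0 for _ in times]
latency = midpoint_latency(times, flat_n1)
```

Computed separately for each participant and condition, such latencies could then be entered into the same kind of within-participant comparison reported above.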
The topographic distributions of the ERP brainwaves and the difference waves at the N1 latency (Figure 6) also show similar scalp topographies between the two conditions, with spatially focal peaks at the bilateral occipital sites in the difference maps. Again, these results suggest a posterior location for the underlying sources that differentiate single letter and number processing. No other time points showed a significant difference between the two conditions.
These results replicate the key findings from Experiment 1 and indicate that the N1 amplitude is greater for letters than numbers in the left hemisphere and greater for numbers than letters in the right hemisphere. The fact that single letters and numbers dissociate at the level of N1 suggests that the visual cortex is tuned to discriminate at very early processing levels between letters and numbers at their smallest unit. There were, however, some slight differences between the character-string findings of Experiment 1 and the single-character findings of Experiment 2. Mainly, when letters and numbers were presented individually rather than in strings, no indication of a differential P2 was observed. These findings thus suggest that there are some additional differences in the processing for strings of characters versus single characters at a somewhat longer latency, a point to which we return in the General Discussion.
In this study, ERPs to visually presented letters and numbers were examined to investigate the time course of the dissociation between the two categories. As we hypothesized, a dissociation was observed between the ERP traces evoked by letters and by numbers at both the left and right occipital-temporal sites. Specifically, letters elicited significantly greater N1 amplitudes in the left hemisphere, whereas numbers elicited significantly greater N1 amplitudes in the right hemisphere, with otherwise very similar scalp topography elsewhere. Moreover, both letter/number strings and individual letters/numbers elicited similar patterns of dissociation at the N1 level, implying that the observed results are largely independent of the length of the character string. The finding of these electrophysiological effects at this very early latency suggests that adult human visual cortex is tuned to differentially process letters and numbers at one of the earliest encoding levels in the visual stream.
It should be noted that letters and numbers are both highly familiar stimuli. In addition, we minimized possible phonological or semantic processing by only presenting consonant letters, and we minimized top–down processing by using an orthogonal arrow detection task with rapid and randomized stimulus presentation. Thus, it is unlikely that the dissociations observed in the ERP responses to the letters and numbers were because of differential top–down attention or high-level cognitive strategies for these stimuli.
Greater N1 amplitude for letters than numbers at the left occipital temporal sites is consistent with the proposal that the left inferior temporal area in the occipital-temporal sulcus hosts a region specialized in visual word form processing (McCandliss et al., 2003). In fact, the characteristics of the left-lateralized orthography-evoked N1 closely match the activity in the visual word form area typically found in fMRI studies (Dien, 2009). Brem et al. (2006), using both fMRI and ERP techniques, found that fMRI activation in the visual word form area reliably correlated with N1 amplitude, suggesting that the N1 evoked by orthographic stimuli is closely related to the activation in the occipital-temporal cortex. Using a source localization analysis, Maurer et al. (2005) suggested that the inferior occipital N1 evoked by letters and words arises from the basal posterior temporal source cluster in the fusiform gyrus. Along the same line, a magnetoencephalographic study by Tarkiainen et al. (1999) reported greater activity to noise-free words compared with high-noise words or noiseless symbol strings in the left occipital-temporal cortex at around 150 msec. Thus, the greater N1 to letters compared with numbers over left occipital sites is consistent with the idea that these N1 effects reflect an early sensitivity in the visual cortex to visual letter forms over other also highly familiar visual stimuli.
Analogously, the greater N1 to numbers than to letters over the right hemisphere suggests that the right occipital temporal cortex hosts a region that preferentially processes visual number forms. Visual processing of number symbols (e.g., the Arabic numerals tested in this study) has received relatively little attention in the field, so only a handful of studies have investigated the neural correlates of visual number processing. Dehaene (1996) showed that, when participants are engaged in a numerical comparison task, processing numbers in Arabic notation elicited more bilateral N1 activity compared with verbal notation, which elicited strictly left-lateralized N1 activity. This study provided one of the first hints about dissociable processes between letters and numbers. However, it did not allow a systematic comparison between visual processing of letters and numbers because there were differences in the physical characteristics of letters and numbers (e.g., number of characters) and because an explicit task was used that required numerical processing. A few recent studies have shown that some occipital temporal regions preferentially process numbers compared with other physically similar stimuli (Shum et al., 2013; Park, Hebrank, et al., 2012; Roux et al., 2008) and suggest that these extrastriate regions are tuned to respond to visual shapes of Arabic numerals. Of interest, a recent electrocorticographic study of EEG oscillatory activity found a focally localized brain area that was highly selective for the processing of Arabic numerals, and this region was most reliably found in the right inferior temporal gyrus anterior to the occipital temporal incisures in the majority of participants (Shum et al., 2013).
Although the ERP effects observed for strings of letters and numbers were similar to those observed for single letters and numbers, at least at the level of the N1, there were noteworthy differences between the two sets of responses. In particular, an ERP difference in the P2 was only observed in the strings condition, and the scalp distributions between string processing and single-character processing at the N1 level were slightly different (compare Figure 3 and Figure 6). These results suggest that there may be different underlying mechanisms for string processing compared with single character processing, an idea also supported by James et al. (2005).
Previous studies have shown that the phonological and semantic aspects of stimuli modulate left posterior ERPs after the initial visual encoding level reflected by the N1. For example, Dehaene (1995) in a lexical decision task found that ERP waveforms diverge between words and consonant strings or even between different categories of words, starting from around 200 msec from stimulus onset, with the ERP waveforms for words being more positive than those for consonant strings. Hauk et al. (2006), also using a lexical decision task, showed a marked difference in the P2 evoked by words compared with pseudowords (i.e., pronounceable nonwords), with pseudowords eliciting larger amplitudes. From a yet different standpoint, McCandliss et al. (1997) showed that the P2 magnitude difference between English words and consonant strings was much greater (with words eliciting a larger amplitude) in a semantic task compared with a passive viewing task. Furthermore, when the authors trained participants to associate a set of artificial words to meaningful objects, properties, and events, they observed a significant amplitude change in the P2 to these learned stimuli, but not in the N1. In contrast, no training-induced changes in the P2 were observed in untrained artificial words that matched in orthographic regularity to the trained artificial words. These studies suggest that linguistic aspects of the stimuli modulate brain responses at the level of the P2. According to this idea, our results showing a P2 amplitude difference between strings of letters and numbers, but not between single letters and numbers, suggest that the visual cortex may be implicitly extracting phonological or semantic information in some conditions but not others.
On the other hand, it is difficult to imagine that there is an asymmetry in phonological or semantic processing between letters and numbers only in strings and not in single characters. Therefore, the P2 differences in the left hemisphere may not be due to implicit phonological or semantic influences on visual processing in the context of our study. Instead, the P2 in the context of visual word form processing may reflect a later stage of a hierarchy of local combination detectors (Dehaene et al., 2005). This theoretical model, inspired by neurophysiological models of invariant object recognition, proposes a hierarchical organization in which neurons at successive levels detect visual patterns of increasing complexity. Whereas pools of neurons in the lower levels may most effectively process single characters, pools of neurons in the upper levels may most effectively process combinations of characters. Our data fit well with this proposal, as only letter strings, but not single letters, showed a greater P2 compared with numbers. Moreover, according to this idea, our results imply that only letter-string processing (and not number-string processing) in the left visual cortex is subject to this hierarchical organization.
It is also of interest in this study that false fonts elicited greater ERP amplitudes than both letters and numbers across multiple phases of processing. Although these false-font ERP patterns do not explain the hemispheric double dissociation between letters and numbers, which was the more central research question of this study, they may provide important insights for generating further hypotheses about how the visual cortex processes unfamiliar stimuli differently from familiar stimuli. It should first be noted that some previous studies have shown greater ERP responses, at least in the N1 range, to letters and words compared with false fonts, a pattern opposite to our findings (Stevens et al., 2013; Wong et al., 2005). In these studies, however, an identity 1-back task was used with much longer stimulus presentations and longer intertrial intervals compared with our study. Accordingly, one possibility is that the active encoding of known stimuli (e.g., native letters and words) via top–down attentional mechanisms may elicit greater N1 amplitudes than the active encoding of unknown or incomprehensible stimuli. In contrast, when top–down attentional or encoding strategies are minimized, such as in our study, there may be no selective enhancement of the N1 for known stimuli.
Enhanced neural activity to false fonts in our study parallels previous fMRI findings (Park, Hebrank, et al., 2012; Vinckier et al., 2007) but also shows that this enhancement occurs very early (as early as the P1 at 100 msec) in the processing stream. The larger ERP responses to false fonts may potentially be explained by inefficiency in the template-matching process for unfamiliar stimuli. For instance, highly familiar stimuli such as letters and numbers may be detected by generic feature detectors early in the visual processing stream and then be fed rapidly into subsequent processing levels, where letters and numbers are handled by more focal areas in separate hemispheres. In contrast, greater activity around the P1 for the false fonts may reflect the need for more extensive encoding of these unfamiliar stimuli by the generic feature detectors at early levels, resulting in greater neural activity that may propagate through later phases of the processing stream. Thus, the overall enhancement of the neural response to false fonts could result from the extended activity of generic feature detectors that are little influenced by previous experience with false fonts. Consistent with this idea, Park, Park, et al. (2012), in an fMRI study of monozygotic twins, showed smaller experiential influence (i.e., unique environmental effects) in the neural response to false fonts compared with letters, although the magnitude of the neural response to false fonts was greater than that to letters.
In summary, the two electrophysiological experiments presented here show that the human visual cortex exhibits hemispherically differentiated processing for two categories of culturally defined, otherwise arbitrary, symbols, and it does so very early in the processing stream. These findings complement a previous fMRI study (Park, Hebrank, et al., 2012) by showing precisely when during letter and number processing the dissociation occurs, as well as when the processing of both of these culturally defined stimulus types differentiates from that of physically similar but unfamiliar visual stimulus forms (false fonts), thus providing important temporal information that was not afforded by the fMRI study. Our results further suggest that the processing of letter and number strings utilizes neural pathways that are partially differentiated from those processing single letters and numbers. These findings imply a major neural specialization in the early visual cortex driven by extensive experience that is unique to humans. Future studies should explore when in the developmental time frame this specialization occurs and how and why it comes about.
We thank Stefanie Schwartz, Chandra Swanson, and Amber Kunze for their assistance in data collection and Dr. Thad Polk for his invaluable comments on an earlier version of the manuscript. This study was supported by a Duke Fundamental and Translational Neuroscience Postdoctoral Fellowship to J. P., a James McDonnell Scholar Award to E. M. B., and an NIH R01 grant (R01-MH060415) to M. G. W.
Reprint requests should be sent to Joonkoo Park, Center for Cognitive Neuroscience, Duke University, Box 90999, Durham, NC 27708, or via e-mail: email@example.com.