Recognizing a familiar face rapidly is a fundamental human brain function. Here we used scalp EEG to determine the minimal time needed to classify a face as personally familiar or unfamiliar. Go (familiar) and no-go (unfamiliar) responses elicited clear differential waveforms from 210 msec onward, this difference being first observed at right occipito-temporal electrode sites. Similar but delayed (by about 40 msec) responses were observed when go responses were required to the unfamiliar rather than familiar faces, in a second group of participants. In both groups, a small increase of amplitude was also observed on the right hemisphere N170 face-sensitive component for familiar faces. However, unlike the post-200 msec differential go/no-go effect, this effect was unrelated to behavior and disappeared with repetition of unfamiliar faces. These observations indicate that accumulation of evidence within the first 200 msec poststimulus onset is sufficient for the human brain to decide whether a person is familiar based on his or her face, a time frame that puts strong constraints on the time course of face processing.
Humans can easily differentiate a previously seen complex visual pattern such as a face (i.e., a familiar face) from a novel, unfamiliar face. However, the actual speed with which the human brain categorizes a face as familiar remains largely undetermined. A number of behavioral studies have addressed this question (Barragan-Jason, Besson, Ceccaldi, & Barbeau, 2013; Barragan-Jason, Lachat, & Barbeau, 2012; Bruce, Henderson, Newman, & Burton, 2001; Burton, Bruce, & Hancock, 1999; Tong & Nakayama, 1999; O'Toole, Edelman, & Bülthoff, 1998; Hill, Schyns, & Akamatsu, 1997; Bruce, 1982). However, because behavioral RT measures include the time to initiate and execute the motor response, these studies do not allow for a direct assessment of the speed of face familiarity categorization. To this end, the recording of ERPs on the human scalp is the method of choice because it provides an on-line measurement of brain events at a global scale, with a very high temporal resolution.
Unfortunately, electrophysiological studies have so far provided inconsistent data regarding the first robust electrophysiological difference between familiar and unfamiliar faces. During the past decade, ERP studies focused mainly on the N170, an occipito-temporal face-sensitive component peaking between 140 and 180 msec after stimulus onset (Bentin, Allison, Puce, Perez, & McCarthy, 1996; for reviews, see Eimer, 2011; Rossion & Jacques, 2008, 2011). Most studies failed to find any difference between unfamiliar and famous faces (Gosling & Eimer, 2011; Henson et al., 2003; Schweinberger, Pickering, Jentzsch, Burton, & Kaufmann, 2002; Bentin & Deouell, 2000; Eimer, 2000) or experimentally familiarized faces (Kaufmann, Schweinberger, & Burton, 2009; Tanaka, Curran, Porterfield, & Collins, 2006; Rossion et al., 1999) on the N170. Some studies reported such differences, but in opposite directions: a N170 (or M170 in magnetoencephalography [MEG]) increase (Harris & Aguirre, 2008; Caharel, Courtay, Bernard, Lalonde, & Rebaï, 2005; Caharel et al., 2002) or a decrease (Marzi & Viggiano, 2007; Jemel, Pisani, Calabria, Crommelinck, & Bruyer, 2003) for famous faces as compared with unfamiliar faces.
More consistent ERP differences between familiar and unfamiliar faces have been reported at a later time point. Specifically, results from repetition priming paradigms have shown that repeated exposures of familiar faces elicit a larger negative N250 brainwave at inferior temporal sites compared with repetitions of unfamiliar faces (the N250r, “r” for repetition effect). This effect, which starts at about 220–230 msec, is found for famous (Gosling & Eimer, 2011; Pfütze, Sommer, & Schweinberger, 2002; Schweinberger, Pfütze, & Sommer, 1995), or experimentally learned (Pierce et al., 2011; Tanaka et al., 2006) faces and has been taken as evidence for the first activation of familiar face representations in long-term memory (Gordon & Tanaka, 2011; Tanaka et al., 2006; Herzmann, Schweinberger, Sommer, & Jentzsch, 2004; Pfütze et al., 2002; Schweinberger et al., 1995, 2002).
Yet, other studies have rather reported even later effects, in the form of a larger centro-parietal N400 for familiar than unfamiliar faces (Bentin & Deouell, 2000; Eimer, 2000; Paller, Gonsalves, Grabowecky, Bozic, & Yamada, 2000), an effect attributed to the semantic information associated with familiar faces (Bentin & Deouell, 2000; Eimer, 2000; Paller et al., 2000).
The few studies using intracranial recordings in epileptic patients have also reported face familiarity effects with discrepant latency values. For example, Seeck et al. (1997) found that face familiarity could be extracted very fast, as early as 50 msec following stimulus onset. In contrast, in patients performing a famous/unfamiliar face recognition task, the earliest electrophysiological difference between famous and unfamiliar faces was found at approximately 240 msec (N240) after stimulus onset in medial-temporal lobe structures (Barbeau et al., 2008). Using a set of familiar (famous) and unfamiliar faces, Puce, Allison, and McCarthy (1999) found that intracranial N200, P290, P350, and N700 components recorded at face-specific sites were unaffected by face familiarity.
Although disagreements between intracranial studies as to the onset time of face familiarity sensitivity could be accounted for by the differential cortical locations sampled, discrepancies between scalp EEG studies seem more surprising. One possibility for this variability is the type of familiar faces used in the different studies. Usually, famous or experimentally familiarized faces are used (Pierce et al., 2011; Harris & Aguirre, 2008; Tanaka et al., 2006; Henson et al., 2003; Jemel et al., 2003; Pfütze et al., 2002; Schweinberger et al., 1995, 2002; Bentin & Deouell, 2000; Eimer, 2000; Rossion et al., 1999). Yet, there is evidence that personally familiar faces provide a much richer source of information and more robust face representations than other kinds of familiar faces (Carbon, 2008; Herzmann et al., 2004; Tong & Nakayama, 1999), suggesting that personally familiar faces may be more appropriate to disclose robust and consistent effects of face familiarity.
Most importantly, EEG and MEG studies have so far concentrated on a component-based approach, controlling for all factors that differ between familiar and unfamiliar faces, including decision and motor processes. Although this is a methodologically sound approach, it may not be the most sensitive approach to determine the minimal time needed to perform a visual categorization task by means of electrophysiological measures. On the contrary, it may be important to maximize the difference between familiar and unfamiliar face stimuli in terms of decisional and behavioral outputs to determine the time at which electrophysiological waveforms differ reliably on the scalp. Moreover, although differences between visual categories observed on ERP components can be interpreted as reflecting the sensitivity of the system to the differences between these categories, they are not satisfactory if one wants to ensure that sufficient processing has been done to allow categorization. This requires correlating neural activity with the decision of the participant, because it implies that sufficient processing has been done to allow target detection (Romo & Salinas, 1999; Shadlen & Newsome, 1996; Newsome, Britten, & Movshon, 1989).
This latter approach has been used successfully for the past 15 years by Thorpe and colleagues by applying a go/no-go response mode in ERPs during the categorization of object categories (animal, vehicles, faces, etc.) in complex visual scenes (e.g., Rousselet, Mace, & Fabre-Thorpe, 2003; Thorpe & Fabre-Thorpe, 2001; VanRullen & Thorpe, 2001; Thorpe, Fize, & Marlot, 1996; see also Goffaux et al., 2005, for global scene categorization). For instance, VanRullen and Thorpe (2001) asked their participants to lift their finger (go) when an animal (Task 1) or a vehicle (Task 2) was present in the visual scene. The nontarget stimuli used in one task were included in the other task. These authors reported a large differential activity between targets and distractors that emerged on almost all electrode sites (frontal, central, parietal, and occipital) after 150 msec, illustrating the magnitude of the effect. By using this procedure, these authors were not only able to determine the minimal time to make a complex perceptual categorization decision but, contrary to behaviorally unrelated differences observed on early visual components, were also able to clearly establish that sufficient processing has been done to allow categorization.
In this study, we applied this latter approach to a personally familiar/unfamiliar face decision task to obtain robust differences between the two types of faces and to set a threshold to the time needed to actively categorize a face as familiar. The behavioral results of the study have been reported in detail elsewhere (see Ramon, Caharel, & Rossion, 2011), in an article that focused on the detailed analysis of repetition effects and analyses of the minimum RTs for familiar and unfamiliar face categorization.
Fifty-two photographs of full-front faces without glasses and with neutral expression were used. Twenty-six of them were considered as personally familiar face stimuli (including the participants' own faces) as they depicted students who had been attending the same course (Master degree in Psychology) as a small group (total of 31 students) for about 2 years at the time of testing. For each familiar face, a corresponding unfamiliar one, matched for sex, eye, and hair color, was chosen from a larger database of faces. This set of unfamiliar face images was obtained by photographing university students of the same age group under the same conditions as the set of familiar face images. Using Adobe Photoshop 7.0, all images were adjusted so that the pupils were aligned horizontally, and a generic black “sweater” was superimposed on each photograph so that clothing did not vary across stimuli (Figure 1). Furthermore, for each familiar face, a corresponding unfamiliar face was adjusted to have the same size. All face images used in the experiment were equated for mean pixel luminance and contrast using the following procedure in Matlab 7.0. Briefly, each image was first transformed from the RGB to the YUV color space, which contains a luminance component (Y) and two chrominance components (U and V), so that pixel luminance and contrast could be manipulated independently of chrominance information. The pixel luminance and contrast of each image were adjusted by equating the mean and the standard deviation of the Y component's pixel intensity of all images, respectively. Each image was then back-transformed into RGB space. Note that only those pixels belonging to the face depicted were taken into account in this procedure, with the light gray background being equal for all images. The images were presented approximately 250 × 360 pixels in size, comprising a visual angle of ∼5.02° × 7.24°. See Ramon et al. (2011) for additional information regarding the stimulus material.
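The luminance and contrast equating described above can be illustrated with a minimal sketch. It is not the authors' Matlab routine: the BT.601-style YUV coefficients, the function names, and the choice of target mean and standard deviation are assumptions made for illustration, and the sketch does not clip values to the displayable range.

```python
from statistics import fmean, pstdev

# Assumed BT.601-style conversion; the paper does not state which YUV matrix was used.
def rgb_to_yuv(r, g, b):
    y = 0.299 * r + 0.587 * g + 0.114 * b
    return y, 0.492 * (b - y), 0.877 * (r - y)

def yuv_to_rgb(y, u, v):
    r = y + v / 0.877
    b = y + u / 0.492
    g = (y - 0.299 * r - 0.114 * b) / 0.587
    return r, g, b

def equate_luminance(pixels, face_mask, target_mean, target_sd):
    """Rescale Y of face pixels to a target mean and SD; leave U, V and background untouched.

    pixels: list of (r, g, b) tuples; face_mask: list of bools (True = face pixel).
    target_mean / target_sd are hypothetical parameters (e.g., averages over the image set).
    Raises ZeroDivisionError if the face region has zero luminance variance.
    """
    yuv = [rgb_to_yuv(*p) for p in pixels]
    face_y = [y for (y, _, _), is_face in zip(yuv, face_mask) if is_face]
    mean_y, sd_y = fmean(face_y), pstdev(face_y)
    out = []
    for (y, u, v), is_face, original in zip(yuv, face_mask, pixels):
        if is_face:
            y_new = (y - mean_y) / sd_y * target_sd + target_mean
            out.append(yuv_to_rgb(y_new, u, v))
        else:
            out.append(original)  # background pixels are not modified
    return out
```

Because only Y is rescaled, the chrominance components of each face pixel are preserved, which is the property the YUV transformation is used for in the procedure above.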
Participants of this experiment (all paid for participation) were recruited from the same course (Master degree in Psychology) of the University of Louvain, in Louvain-la-Neuve (Belgium). They performed a go/no-go familiarity judgment task, which required speeded responses to individually presented face stimuli by categorizing them as familiar or unfamiliar. Participants of the first group (n = 11, 6 women, 2 left-handed, mean age = 23.64 ± 1.02 years) were instructed to respond when a photograph of a personally familiar face was presented (no response for unfamiliar faces). Participants of the second group (n = 11, 6 women, 1 left-handed, mean age = 23.54 ± 0.93 years) were asked to respond when the face was unfamiliar (no response for familiar faces). Among the 11 participants of the second group, 6 also participated in the previous task (go for familiar faces) because only 17 of the students in total agreed to take part in the study. However, both tasks were completed on different days, on average 60 ± 45 days later (see Ramon et al., 2011, for further discussion of this issue). All of the participants had normal or corrected-to-normal vision.
After electrode cap placement, participants were seated in a light- and sound-attenuated room, at a viewing distance of 100 cm from a computer monitor. Stimuli were displayed using E-prime 2.0, on a light gray background. On each trial, a 200-msec central fixation cross was followed by a face stimulus presented for 100 msec. A blank screen of 200–400 msec (randomized) duration was inserted between the fixation cross and the face stimulus so that the offset of the cross did not coincide with the onset of the face, as well as to reduce anticipation and participants' preparation for action. Trials were separated by an intertrial interval of about 1600 msec (1500–1700 msec; Figure 1). Participants were requested to place the index finger of their dominant hand on a response pad (a plate with a pair of emitter-detector infrared diodes), to indicate the presence of a target stimulus by lifting their finger (go response), and to keep their finger on the response pad if distractors were presented (no-go response). They were instructed to maintain fixation at the center of the screen throughout the trial and respond as accurately and fast as possible. RTs were measured from the onset of the stimulus to the finger lift from the response pad. Participants completed four blocks of 104 trials (416 trials in total). Within each block, the 52 faces were presented first followed by a second (random) presentation of these faces. This procedure enabled postexperimental investigation of potential effects of stimulus repetition (eight repetitions of the entire set of faces separated by three pauses in between blocks).
EEG was recorded from 128 Ag/AgCl electrodes mounted in an electrode cap (Waveguard, ANT, Inc; 2-D map of all electrode positions can be accessed here: www.ant-neuro.com/products/caps/waveguard/layouts/128/). Electrode positions included the standard 10–20 system locations and additional intermediate positions. Vertical and horizontal eye movements were monitored using four additional electrodes placed on the outer canthus of each eye and in the inferior and superior areas of the right orbit. During EEG recording, all electrodes were referenced to a common average reference, and electrode impedances were kept below 10 kΩ. EEG was digitized at a 1000-Hz sampling rate and a digital antialiasing filter of 0.27*sampling rate was applied at recording (at 1000 Hz sampling rate, the usable bandwidth is 0 to ∼270 Hz). EEG data were analyzed using ASA 4.6 (ANT, Inc.) and custom-made routines in Matlab 7.0. After a 0.1 Hz high-pass and 30 Hz low-pass filtering of the raw EEG data, trials contaminated with eye movements or other artifacts (≥ ± 80 μV in −200 to 800 msec) were marked and rejected. When there were many eye-blink artifacts, a correction was applied using a principal component analysis method (Ille, Berg, & Scherg, 2002). Incorrect trials and trials containing EEG artifacts were removed. For each participant, averaged epochs ranging from −200 to 800 msec relative to the onset of the stimulus and containing no EEG artifact were computed for each condition separately and baseline corrected using the 200 msec prestimulus time window. Participants' averages were then rereferenced to a common average reference and grand-averaged for data display of waveforms and topographical maps.
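The epoch-level steps described above (artifact rejection at ±80 μV, averaging, baseline correction over the prestimulus window, and average referencing) can be sketched as follows. This is an illustrative outline under stated assumptions rather than the ASA or Matlab routines actually used: filtering and eye-blink correction are omitted, the data layout (trials × channels × samples as nested lists) and all function names are hypothetical, and the threshold is applied to the raw epoch values.

```python
from statistics import fmean

def has_artifact(trial, threshold_uv=80.0):
    """True if any sample on any channel reaches +/- threshold (in microvolts)."""
    return any(abs(x) >= threshold_uv for channel in trial for x in channel)

def baseline_correct(channel, n_baseline):
    """Subtract the mean of the first n_baseline (prestimulus) samples."""
    base = fmean(channel[:n_baseline])
    return [x - base for x in channel]

def erp_average(trials, n_baseline=200, threshold_uv=80.0):
    """Average artifact-free trials, baseline-correct, then apply an average reference.

    trials: list of trials; each trial is a list of channels; each channel a list of samples.
    n_baseline=200 assumes 1000 Hz sampling and an epoch starting at -200 msec.
    Returns (erp, number_of_trials_kept).
    """
    kept = [t for t in trials if not has_artifact(t, threshold_uv)]
    if not kept:
        raise ValueError("no artifact-free trials")
    n_ch, n_s = len(kept[0]), len(kept[0][0])
    avg = [[fmean(t[c][s] for t in kept) for s in range(n_s)] for c in range(n_ch)]
    avg = [baseline_correct(ch, n_baseline) for ch in avg]
    # Average reference: remove the across-channel mean at every sample.
    ref = [fmean(avg[c][s] for c in range(n_ch)) for s in range(n_s)]
    return [[avg[c][s] - ref[s] for s in range(n_s)] for c in range(n_ch)], len(kept)
```

The order of operations follows the text (rejection, averaging, baseline correction, rereferencing); whether the amplitude criterion was evaluated before or after baseline correction is not stated in the source.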
For both groups of participants, the percentages of correct responses (finger lift on go trials) and mean and median correct RTs were evaluated. An independent two-sample t test was performed to compare percentages of correct responses and RTs in both groups of participants (see Ramon et al., 2011, for a full analysis of behavioral results).
Three types of analyses were performed on the EEG recorded on the scalp following stimulus presentation. First, to obtain an overall view of brain activity during processing of face familiarity, analyses of spatial standard deviation across all channels were performed. This measure, which is typically referred to as the global field power (GFP; Lehmann & Skrandies, 1980), provides a compact description of the signal across the scalp. It is assumed that stronger electric fields lead to larger values and that the peaks coincide with maximum activation of the underlying generator. Here the GFP differences between two conditions were estimated at each time point for each participant individually and were then subjected to intrasubject t tests (df = 10 or 21) between −200 and 500 msec. Differences were considered to be significant if they reached p < .01 for 10 consecutive time points (10 msec). Second, ERP differential waveforms between personally familiar and unfamiliar faces were estimated at each time point for each participant individually and were then submitted to intrasubject t tests (df = 10 or 21) at the p < .01 or .05 level between −200 and 500 msec at five pairs of occipito-temporal electrodes (TPP9/10h, PO9/10, P9/10, PPO9/10h, and POO9/10h). The selection of the electrodes of interest was based on observation of topographical maps where maximal differences were visible between conditions. Differences were considered as significant if they reached p < .01 or .05 for 10 consecutive time points (10 msec). Third, although this was not the main goal of the study, analyses on specific visual ERP potentials, in particular on the P1 (maximal at approximately 100 msec) and on the N170 (maximal at approximately 150 msec) components, were performed.
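The first two analyses share the same logic: compute a per-participant difference at every time point, run a t test across participants, and accept an onset only when significance holds for 10 consecutive samples. A minimal sketch of that logic is given below. The function names are hypothetical, and the critical values are the standard two-tailed t thresholds for p < .01 at df = 10 and df = 21, used here in place of computing exact p values.

```python
from math import sqrt
from statistics import fmean, pstdev, stdev

# Two-tailed critical t values for p < .01 (standard table values).
T_CRIT_P01 = {10: 3.169, 21: 2.831}

def gfp(voltages):
    """Global field power at one time point: spatial SD across all channels."""
    return pstdev(voltages)

def paired_t(differences):
    """t statistic for per-participant condition differences (df = n - 1).

    Raises ZeroDivisionError if all differences are identical.
    """
    n = len(differences)
    return fmean(differences) / (stdev(differences) / sqrt(n))

def first_sustained_onset(t_values, t_crit, min_run=10):
    """Index of the first sample that starts a run of >= min_run significant samples."""
    run = 0
    for i, t in enumerate(t_values):
        run = run + 1 if abs(t) > t_crit else 0
        if run == min_run:
            return i - min_run + 1
    return None

def index_to_latency_ms(index, epoch_start_ms=-200, srate_hz=1000):
    """Convert a sample index to latency relative to stimulus onset."""
    return epoch_start_ms + index * 1000 / srate_hz
```

At the 1000 Hz sampling rate used here, 10 consecutive samples correspond to the 10 msec criterion given in the text, and a run that is interrupted before reaching 10 samples is discarded.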
Amplitude values of these components were measured at five pairs of occipito-temporal and centro-frontal electrodes in the left and right hemisphere where they were the most prominent (for the P1: PO5/6, PO7/8, PO9/10, PPO9/10h, and POO9/10h; for the N170: P9/10, TPP9/10h, PO9/10, PPO9/10h, and POO9/10h). Amplitudes were quantified for each condition as the mean voltage measured within 30 msec windows centered on the grand-averaged peak latencies of the components' maximum. The amplitude values of each component were then subjected to separate repeated-measures ANOVAs with Group as between-subject factor and Familiarity (Familiar vs. Unfamiliar faces), Repetition (between each of the four blocks), Hemisphere, and Electrode as within-subject factors. All effects with two or more degrees of freedom were adjusted for violations of sphericity according to the Greenhouse–Geisser correction. Post hoc Fisher's least significant difference tests were used to compare the conditions two-by-two (all comparisons were performed, i.e., 10 comparisons for five electrodes). Finally, Spearman correlations (nonparametric test) were performed to measure the degree of association between behavioral results and ERP amplitude values recorded at different time windows.
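The amplitude measure described above (mean voltage in a 30 msec window centered on the grand-averaged peak latency) can be sketched as below. The function name, the half-open window convention (peak − 15 msec up to, but not including, peak + 15 msec), and the default epoch start and sampling rate are assumptions for illustration; the source does not state how window edges were handled.

```python
from statistics import fmean

def mean_amplitude(waveform, peak_ms, window_ms=30, epoch_start_ms=-200, srate_hz=1000):
    """Mean voltage in a window of window_ms centered on peak_ms.

    waveform: samples for one electrode, starting at epoch_start_ms.
    The window is half-open, so 30 msec at 1000 Hz covers exactly 30 samples.
    """
    half = window_ms / 2
    first = round((peak_ms - half - epoch_start_ms) * srate_hz / 1000)
    last = round((peak_ms + half - epoch_start_ms) * srate_hz / 1000)
    if first < 0 or last > len(waveform):
        raise ValueError("window falls outside the epoch")
    return fmean(waveform[first:last])
```

For an N170 peak near 150 msec, this averages the samples from 135 up to 165 msec. As a side check on the post hoc tests, five electrodes yield 5 × 4 / 2 = 10 pairwise comparisons, which matches the count given in the text.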
A detailed analysis of the behavioral results has been published separately (Ramon et al., 2011), so it will be only briefly mentioned here. Participants in both Group 1 (go responses = familiar faces) and Group 2 (go responses = unfamiliar faces) performed the speeded go/no-go categorization task successfully [average performance (±SD): 99% (±1.1) and 97% (±2.5), respectively] and relatively rapidly [mean RT (±SD): 463 (±41.4) msec and 555 msec (±56.6), respectively; median RT (±SD): 453 (±38.3) and 535 msec (±51.5), respectively; Figure 2]. Participants responded faster to personally familiar faces (Group 1) compared with unfamiliar (Group 2) faces [mean: t(20) = 4.36; median: t(20) = 4.27, both ps < .0003]. Moreover, participants responded more accurately to personally familiar faces than to unfamiliar faces [t(20) = 2.52, p = .02].
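As a consistency check, the group comparison on mean RTs can be approximately reproduced from the rounded summary values reported above, using an independent two-sample t test with pooled variance (n = 11 per group). The function name is hypothetical, and because the inputs are rounded, the result is expected to be close to, not identical to, the reported t(20) = 4.36.

```python
from math import sqrt

def pooled_t(mean1, sd1, n1, mean2, sd2, n2):
    """Independent two-sample t statistic with pooled variance; returns (t, df)."""
    df = n1 + n2 - 2
    pooled_var = ((n1 - 1) * sd1 ** 2 + (n2 - 1) * sd2 ** 2) / df
    t = (mean2 - mean1) / sqrt(pooled_var * (1 / n1 + 1 / n2))
    return t, df

# Mean RTs (msec) and SDs from the text: Group 1 (familiar targets) vs. Group 2 (unfamiliar targets).
t_mean_rt, df = pooled_t(463, 41.4, 11, 555, 56.6, 11)
print(f"t({df}) = {t_mean_rt:.2f}")
```

The same function applied to the median RTs and accuracy values gives statistics in the same range as those reported, with small deviations attributable to rounding of the published means and standard deviations.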
Analysis Performed on the Whole Set of Participants (Independently of the Task)
Analysis of spatial standard deviation across all channels of the scalp
Using the spatial standard deviation across all electrodes of the scalp (GFP), the first differences between the two differential waveforms (go vs. no-go), across the four blocks (eight repetitions of faces), were observed shortly after 200 msec (Figure 3A). The first significant difference between Familiar and Unfamiliar faces (with p < .01) started at 214 msec (until 500 msec; Figure 3A). Inspection of topographical maps and individual electrode waveforms showed that these differences (Familiar minus Unfamiliar faces) were mainly restricted to occipito-temporal regions (Figure 3B and C). On the basis of this observation, representative channels for each hemisphere at occipito-temporal regions were selected for further analyses.
Time-point analyses at occipito-temporal sites
At occipito-temporal sites, ERP differential waveforms between personally familiar and unfamiliar faces across the four blocks (eight repetitions of faces) were observed as early as 206 msec (with p < .01) until 500 msec over the right occipito-temporal region and at 213 msec (until 355 msec and then between 420 and 500 msec) over the left homologous region (Figure 3B).
Analysis Performed on Each Group of 11 Participants Separately
Analysis of spatial standard deviation across all channels of the scalp
Differences between the two differential waveforms (go vs. no-go) were also analyzed at each time point using the spatial standard deviation across all electrodes of the scalp for each group of participants (across the four blocks with eight repetitions of each face confounded). The first difference emerged earlier and was larger when participants responded to personally familiar faces than to unfamiliar faces (Figures 4A and 5A). These analyses revealed that the first significant difference between Familiar and Unfamiliar faces started at 212 msec for familiar face targets and about 30 msec later for unfamiliar face targets (243 msec).
Time-point analyses at occipito-temporal sites
Time-point analyses on ERP differential waveforms between personally familiar and unfamiliar faces across the four blocks (eight repetitions of faces) showed significant amplitude differences between conditions as early as 208 msec (until 500 msec) over the right occipito-temporal region and 220 msec (until 363 msec) over the left homologous region (Figure 4B) when target stimuli were familiar faces. For unfamiliar face target stimuli, differences between waveforms elicited by familiar and unfamiliar faces emerged at 254 msec in the right hemisphere (until 306 msec; with a second difference from 441 to 500 msec) and 248 msec in the left hemisphere (until 299 msec; a second difference from 306 to 500 msec; Figure 5B).
Analysis of face repetition effects
Visually (Figures 4A and 5A), for familiar face target stimuli, the onset of the difference seemed to be strictly identical when considering only the first block (each face appearing twice) as compared with the average of the four blocks of trials (Figure 4A). In contrast, for categorization of unfamiliar faces, there was an important reduction of the onset time of the go/no-go difference when considering only the first block as compared with the average of the four blocks of trials (Figure 5A). To precisely evaluate the impact of face repetition on the familiarity effects observed shortly after 200 msec, the go/no-go difference was analyzed over right occipito-temporal channels at each time point during each block of trials, with a more liberal statistical criterion (p < .05; 10 consecutive time points significant) to take into account the reduced number of trials and signal-to-noise ratio of such an analysis. For familiar face detection, the difference was significant at 210 msec already in the first block of trials, and the onset of this difference decreased to 185 msec when considering only the very last block (Block 4) of trials (Figure 6). Importantly, within the first block, there was a significant effect at 210 msec when considering only the very first presentation of the (26) familiar faces compared with the second presentation (onset 217 msec; Figure 6). Thus, although there were only very few trials considered in this analysis, the overall time threshold of 210 msec seems to be valid even for the very first presentation of the faces. For the second group, aside from a N170 effect discussed below, the difference emerged reliably only at 261 msec in the first block of trials (Figure 7). At the end of the experiment (Block 4), there was an earlier significant difference (216–227 msec), but the difference between the two conditions was not prolonged and consistent across time windows (Figure 7).
Although the main focus of this study is on the go/no-go differential waveforms, we also report the differences observed on early visual ERP components analyzed in electrophysiological studies of face processing. This complementary analysis was restricted to responses below 200 msec because, from that latency at least, differences at the level of such components could be artificially increased or decreased by the superimposition of decisional and response-related components because of the association of different behavioral responses to familiar and unfamiliar faces in a go/no-go paradigm (e.g., fronto-central and P3 components with peak latency between 200 and 300 msec at fronto-central sites, both generally associated with inhibitory neural processes; see Bokura, Yamaguchi, & Kobayashi, 2001; Falkenstein, Hoormann, & Hohnsbein, 1999; Fallgatter & Strik, 1999; Eimer, 1993; Jodo & Kayama, 1992; Kok, 1986).
The first electrophysiological component was a large positivity (P1) recorded over lateral occipital sites, peaking shortly before 100 msec. Next, the N170 and its positive counterpart on the vertex, the VPP (vertex positive potential; see Joyce & Rossion, 2005; Jeffreys, 1993), peaked at about 150 msec (Figures 4B and 5B).
There were no P1 amplitude differences according to Face Familiarity [F(1, 20) = .97; p = .33], nor any interaction involving this factor [Familiarity × Hemisphere: F(1, 20) = .05, p = .82; Familiarity × Electrode: F(4, 80) = .89; ɛ = .49, p = .4]. There was a significant main effect of Electrode [F(4, 80) = 3.58; ɛ = .43, p < .04] because of slightly smaller amplitudes at two electrodes (PO9/10, POO9/10h) compared with the three other electrodes (p < .05).
The N170 was larger for familiar faces [F(1, 20) = 7.5, p = .013; Figures 4B and 5B] over the right than the left hemisphere [F(1, 20) = 5.04, p = .036] and on posterior (P9/10, PO9/10) compared with more anterior (PPO9/10h, POO9/10h, TPP9/10h) electrodes [F(4, 80) = 4.79; ɛ = .53, p < .012]. These main effects were qualified by a Familiarity × Electrode [F(4, 80) = 3.71; ɛ = .65, p < .021] interaction because the familiarity effect was significant only for the most anterior channels (for TPP9/10h, PO9/10, and P9/10 electrodes, all ps < .016; but on POO9/10h and PPO9/10h electrodes, all ps > .05). A marginally significant Familiarity × Hemisphere [F(1, 20) = 3.8, p = .065] interaction reflects a difference between familiar and unfamiliar faces in the right hemisphere only (p = .005; left hemisphere: p = .38). There was also an effect of Repetition [F(3, 60) = 4.61; ɛ = .92, p < .007] because of smaller N170 amplitude in the first block as compared with each of the next blocks (all ps < .008), which did not differ from each other (ps > .51). However, there was no significant interaction between Repetition and Familiarity [F(3, 60) = 1.42; ɛ = .81, p = .25]. Additional planned comparisons revealed a significant effect of familiarity in the first block [F(1, 20) = 6.98; p = .015] but not in the last three blocks. Furthermore, an effect of Repetition was significant only for unfamiliar faces because of smaller N170 amplitude during the first block as compared with the next three blocks [F(3, 60) = 7.51; p = .001; familiar faces: F(3, 60) = .90; p = .43].
To clarify the link between the small face familiarity effects observed on the N170 and the more robust and prolonged differences observed shortly after 200 msec, correlation analyses were performed on the differences between familiar and unfamiliar faces on the N170 amplitudes with the differential amplitude values within successive 20 msec windows from 200 to 500 msec after stimulus onset. Amplitude values were averaged on five pairs of occipito-temporal (TPP9/10h, PO9/10, P9/10, PPO9/10h, POO9/10h) electrodes. For familiar face targets, the N170 familiarity effect at occipito-temporal electrode sites was correlated with the amplitudes at five consecutive time windows, from 220 to 320 msec in the right hemisphere only (.6 < r < .7; .01 < p < .04). The larger the difference between familiar and unfamiliar faces on the N170, the larger the differences on the amplitude values from 220 to 320 msec (Figure 8A). However, for unfamiliar face targets, no significant correlation was found (all ps > .05). Correlation analyses were also performed between behavioral results (RT values obtained for correct go responses) and the ERP amplitude values (values recorded for correct go responses) on both the N170 temporal window and on successive 20 msec windows from 200 to 500 msec after stimulus onset [values were averaged on five pairs of occipito-temporal (TPP9/10h, PO9/10, P9/10, PPO9/10h, POO9/10h) electrodes]. For familiar face targets, no correlations were found between RTs and N170 differential amplitudes. However, most importantly, RTs were correlated with amplitude at four consecutive windows, that is, from 220 to 300 msec in the right and left occipito-temporal regions (i.e., the shorter the RTs, the more negative the amplitude for go responses; RH: .58 < r < .76; .006 < p < .05; LH: .58 < r < .72; .01 < p < .05; Figure 8B).
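The window-by-window correlation analysis can be outlined with a short sketch: a Spearman coefficient (the Pearson correlation of ranks, with average ranks for ties) computed between one per-participant measure and the per-participant amplitudes in each successive 20 msec window. The function names and data layout are hypothetical, and significance testing is left out.

```python
from math import sqrt
from statistics import fmean

def ranks(values):
    """Ranks starting at 1, with tied values receiving their average rank."""
    order = sorted(range(len(values)), key=values.__getitem__)
    out = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        for k in range(i, j + 1):
            out[order[k]] = (i + j) / 2 + 1
        i = j + 1
    return out

def pearson(x, y):
    mx, my = fmean(x), fmean(y)
    num = sum((a - mx) * (b - my) for a, b in zip(x, y))
    den = sqrt(sum((a - mx) ** 2 for a in x) * sum((b - my) ** 2 for b in y))
    return num / den  # undefined (ZeroDivisionError) if either input is constant

def spearman(x, y):
    return pearson(ranks(x), ranks(y))

# Successive 20 msec windows from 200 to 500 msec after stimulus onset.
WINDOWS_MS = [(start, start + 20) for start in range(200, 500, 20)]

def correlate_by_window(measure, amplitudes_by_window):
    """measure: one value per participant (e.g., RT).
    amplitudes_by_window: dict mapping (start, stop) to per-participant mean amplitudes."""
    return {w: spearman(measure, amps) for w, amps in amplitudes_by_window.items()}
```

With this windowing there are 15 windows in total, so the five consecutive windows from 220 to 320 msec and the four from 220 to 300 msec reported above correspond to contiguous entries of `WINDOWS_MS`.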
In summary, we found that perceptual decisions about the personal familiarity of a face emerge over the right occipito-temporal cortex shortly after 200 msec (208 msec). However, having to categorize faces as unfamiliar significantly delays (by about 30 msec) this decision. Stimulus repetition did not significantly decrease the time threshold for familiar face detection as identified in EEG but had a significant effect on the time threshold for detecting unfamiliar faces. Interestingly, and although this was not the main focus of the study, a larger N170 for familiar than unfamiliar faces (an effect unrelated to behavior) was found only at the first stimulus presentations because the N170 increased following the repetition of unfamiliar faces.
Within 200 msec the visual system has accumulated sufficient evidence to be able to categorize a face as personally familiar. In a first group of participants, the electrical waveforms associated with the familiar (go) and unfamiliar (no-go) responses dissociate neatly, shortly after 200 msec over the right occipito-temporal region. The go responses to familiar faces are associated with a large occipito-temporal response. From 220 msec onward, the amplitude of this response is correlated with behavioral RTs, which take place at least 160 msec later (the earliest responses were found at 380 msec for the first presentation of a face; see Ramon et al., 2011) and on average 240 msec later (mean RTs of 463 msec). Furthermore, this familiarity effect is observed at an earlier time point in the right than in the left hemisphere, where the effect may depend on the activation of stored labels (i.e., names) associated with the familiar faces only. This is in agreement with findings from neuroimaging studies showing larger differences between familiar and unfamiliar faces in the right than in the left occipito-temporal cortex (Ishai, Schmidt, & Boesinger, 2005; Rossion, Schiltz, Robaye, Pirenne, & Crommelinck, 2001; Leveroni et al., 2000; Nakamura et al., 2000), adding a temporal dimension to this lateralized difference.
These results go beyond previous observations of scalp electrophysiological differences between familiar and unfamiliar faces at various time points between stimulus onset and behavioral outputs, as reviewed in the Introduction. This is because, unlike the paradigm used in this study, those used previously were not explicitly designed to elicit differences between the two kinds of visual categories and could not ensure that sufficient processing had been done to allow categorization. Hence, these studies often reported relatively small, brief, and/or inconsistent electrophysiological differences between familiar and unfamiliar faces. Admittedly, the face familiarity effect observed on the N170 component in this study falls in this category. As mentioned in the Introduction, this type of effect has been observed in some studies (e.g., Caharel, d'Arripe, Ramon, Jacques, & Rossion, 2011; Keyes, Brady, Reilly, & Fox, 2010; Harris & Aguirre, 2008; Wild-Wall, Dimigen, & Sommer, 2008; Caharel, Bernard, Lalonde, Fiori, & Rebaï, 2006; Kloth et al., 2006; Caharel et al., 2002, 2005; Herzmann et al., 2004), but an absence of differences (e.g., Gosling & Eimer, 2011; Kaufmann et al., 2009; Schweinberger et al., 2002; Bentin & Deouell, 2000; Eimer, 2000; Rossion et al., 1999) or opposite effects have also been reported (i.e., larger N170 to unfamiliar faces; Todd, Lewis, Meusel, & Zelazo, 2008; Marzi & Viggiano, 2007; Jemel et al., 2003; see Caharel et al., 2011, for further discussions of these discrepancies). Interestingly, we found a larger N170 to familiar faces but only in the first block of trials: Subsequent repetitions of unfamiliar faces increased the N170 to these faces, so that the N170 was no longer larger for familiar faces. This effect may reflect a build-up of familiar representations in memory for the unfamiliar faces used in this study.
Given that face stimuli are often repeated in ERP studies, such an observation may partly account for discrepancies in N170 familiar/unfamiliar face differences observed previously. Regardless of this issue, it is worth noting that there is no clear rationale for expecting the N170, a component reflecting the early activation of a face representation in the human brain, to differ between familiar and unfamiliar faces. The lack of correlation between the N170 familiarity effect and the behavioral face familiarity decision in this study also suggests that an N170 face familiarity effect is not directly related to familiarity decisions. This does not mean that differential processing between familiar and unfamiliar faces does not take place already at this latency. Indeed, one possibility is that early perceptual (N170) processes could be top–down modulated by a preactivation of a familiar face representation (participants knew which familiar faces would be presented) or a newly learned face representation, but that accumulated information at this latency is insufficient to allow decisions based on familiarity.
Regarding previous studies, our observation of a time frame of approximately 200 msec for face familiarity decisions is consistent with ERP studies showing an increased amplitude of the N250 (or N250r) for famous (Pfütze et al., 2002; Schweinberger et al., 1995) or experimentally learned (Pierce et al., 2011; Tanaka et al., 2006) as compared with unfamiliar faces. However, the relation between the N250 familiarity effect and behavioral decisions of face familiarity also remains unknown. Moreover, the effect identified here using a go/no-go paradigm emerges on the positive deflection immediately following the N170 (called the “P2”), clearly before the N250 (Figure 3), which generally peaks between 230 and 300 msec following stimulus onset. One reason why this effect appears earlier here could be the use of personally familiar faces, as compared with the famous faces mainly used in the experiments focusing on the N250. Obviously, another factor is the use of a go/no-go task in this study, which boosts differences between familiar and unfamiliar faces from 200 msec onward (Figure 3).
A time frame of about 200 msec for face familiarity decisions is compatible with the time taken to accumulate evidence allowing various visual face categorization tasks. Although faces can be detected very early (at about 100 msec), as indicated both by saccadic RTs (Crouzet, Kirchner, & Thorpe, 2010) and face sensitivity effects observed at the level of the P1 (e.g., Halgren, Raij, Marinkovic, Jousmaki, & Hari, 2000; Eimer, 1998), these very early effects appear to be related to low-level visual cues, such as the amplitude spectrum (Crouzet & Thorpe, 2011; Rossion & Caharel, 2011, for behavioral and ERP evidence, respectively). In contrast, the N170, with an onset of about 130 msec, marks the interpretation of the visual stimulus as a face, regardless of its low-level visual properties (Rossion & Caharel, 2011; see Rossion & Jacques, 2011, for a review and discussion of this issue). At the peak of the N170 (160–170 msec), sufficient evidence has been accumulated to individualize a face (irrespective of its long-term familiarity), based primarily on its global shape properties (Caharel, Jiang, Blanz, & Rossion, 2009; Jacques & Rossion, 2009; Jacques, d'Arripe, & Rossion, 2007). In this context, an additional duration of 40–50 msec to accumulate evidence to distinguish between a previously seen and a novel individual face, leading to a time frame of about 200 msec (210 msec), seems reasonable. Note that such a time frame may depend on the type of stimuli and the paradigm used. Here, we used personally familiar faces, which are associated with particularly robust face representations (Carbon, 2008; Herzmann et al., 2004; Tong & Nakayama, 1999). Provided that one does not use overexposed (“iconic”) pictures, a slightly slower time frame is expected for the detection of famous faces versus unfamiliar faces, with increased variability between observers.
A second factor is that cropped faces are used in many studies, in particular in unfamiliar individual face repetition paradigms, to avoid matching based only on external features. Here, all faces were presented with external features, which certainly contribute to speeding up the familiarity decisions. Finally, we used a task in which participants knew in advance that they would have to discriminate a limited set of familiar faces from unfamiliar faces, in a binary decision task. Although we used quite a large set of faces, such a paradigm favors the role of top–down factors, and one cannot exclude that a face familiarity decision in an unexpected context (one unexpected familiar face in a crowd of unfamiliar faces, like “the butcher in the bus” phenomenon; Mandler, 1980) may take somewhat longer for the human brain than the time limit of 200 msec as identified here. Nevertheless, it is worth noting that, unlike the repetition effects observed for RTs (Ramon et al., 2011), the onset of the electrophysiological difference between unfamiliar and familiar face categorization (about 200 msec) did not decrease with stimulus repetition when participants had to detect a familiar face (Figures 4A and 6). This observation suggests that when categorizing a face as familiar, the repetition-related decrease of RTs arises because of a compression of the late decisional and motor response components, not the early visual categorization process.
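The latency budget discussed above (N170 peak plus an additional period of evidence accumulation) can be checked with simple interval arithmetic. The Python sketch below is only an illustration: the helper names are hypothetical, and treating the two quoted ranges as bounds that simply add is a simplifying assumption, not a model proposed in this study. The numerical values are those quoted in the text.

```python
# Values quoted in the text, in msec, as (low, high) ranges.
N170_PEAK_MS = (160, 170)           # latency of the N170 peak
EXTRA_ACCUMULATION_MS = (40, 50)    # additional evidence accumulation
REPORTED_ONSET_MS = 210             # onset of the familiarity effect


def add_intervals(a, b):
    """Add two (low, high) ranges bound by bound (hypothetical helper)."""
    return (a[0] + b[0], a[1] + b[1])


def contains(interval, value):
    """True if value lies within the closed (low, high) range."""
    return interval[0] <= value <= interval[1]


familiarity_window = add_intervals(N170_PEAK_MS, EXTRA_ACCUMULATION_MS)

print(familiarity_window)                               # (200, 220)
print(contains(familiarity_window, REPORTED_ONSET_MS))  # True
```

Under this additive assumption, the reported onset falls inside the 200 to 220 msec window implied by the two ranges.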
Interestingly, we also observed that differential go/no-go responses were delayed by about 40 msec and had a slower rise when participants had to respond to unfamiliar face targets as compared with familiar ones. This observation is consistent with behavioral data showing that the categorization of a face as being unfamiliar takes about 60–80 msec longer than familiarity decisions (Ramon et al., 2011). The timing advantage for face familiarity decisions may reflect familiarity-related facilitation of perceptual processing of faces (see, e.g., Goto, Kinoe, Nakashima, & Tobimatsu, 2005). However, if a face can be categorized as familiar in about 200 msec, the system is, from a logical point of view, able to discriminate between familiar and unfamiliar faces already at that latency. In other words, unfamiliar faces are somehow also detected at that latency. Nevertheless, actively categorizing a face as unfamiliar (i.e., a “no” or “rejection” response) appears to require a longer analysis and the accumulation of more evidence before a decision is reached. The present finding of a later onset for differential waveforms associated with familiar and unfamiliar faces when detecting unfamiliar faces suggests that the behavioral effect is due not only to decision-based differences (e.g., participants being more hesitant to respond to an unfamiliar face) but possibly also to slower visual face processing (i.e., more analysis is required for unfamiliar faces, which are processed less efficiently; e.g., Megreya & Burton, 2006) or memory search (which, if involving a serial component, will be terminated earlier if a match is present). The fact that stimulus repetition had a substantial effect on the unfamiliar go decision task is compatible with such accounts.
In conclusion, the visual system requires slightly more than 200 msec to categorize an individual face as being personally familiar, as opposed to unfamiliar, with decision-related visual categorization emerging from right occipito-temporal regions; this latency puts strong constraints on the time course of face categorization in the human brain.
M. R. and B. R. are supported by the Belgian National Fund for Scientific Research (Fonds de la Recherche Scientifique, FNRS). This work was also supported by a PAI/UIAP Grant P7/33 (Pôles d'Attraction Interuniversitaires, Phase 7).
Reprint requests should be sent to Bruno Rossion, Institute of Research in Psychology, Université Catholique de Louvain, 10 Place du Cardinal Mercier, 1348 Louvain-la-Neuve, Belgium, or via e-mail: Bruno.firstname.lastname@example.org.