Evidence from hemodynamic and electrophysiological measures suggests that the processing of emotionally relevant information occurs in a spatially and temporally distributed affective network. ERP studies of emotional stimulus processing frequently report differential responses to emotional stimuli starting around 120 msec. However, the involvement of structures that seem to become activated at earlier latencies (i.e., amygdala and OFC) would allow for more rapid modulations, even in distant cortical areas. Consistent with this notion, recent ERP studies investigating associative learning have provided evidence for rapid modulations in sensory areas earlier than 120 msec, but these studies used simple stimuli, very few stimuli, or both. The present study used high-density whole-head magnetoencephalography to measure brain responses to a multitude of neutral facial stimuli, which were associated with an aversive or neutral odor. Significant emotional modulations were observed at intervals of 50–80 and 130–190 msec in frontal and occipito-temporal regions, respectively. In the absence of contingency awareness and with only two learning instances, a remarkable capacity for emotional learning is observed.
The preferential processing of high-priority stimuli in the environment is an essential function of selective attention. fMRI studies have revealed increased activations by emotional stimuli in occipital, parietal, and inferior temporal cortex (ITC; Junghöfer, Schupp, Stark, & Vaitl, 2005; Sabatinelli, Bradley, Fitzsimmons, & Lang, 2005; Bradley et al., 2003; Vuilleumier, Armony, Driver, & Dolan, 2001). ERP studies have shown that emotional stimuli modulate distinct ERP components between 120–300 and 300–700 msec (Kissler, Herbert, Peyk, & Junghöfer, 2007; Schupp, Flaisch, Stockburger, & Junghöfer, 2006; Junghöfer, Bradley, Elbert, & Lang, 2001). Thus, emotion modulates vision at midlatency and late processing stages, allowing for re-entrant processing and widespread activation, conveying stimulus information to many associative cortical regions.
However, several recent studies indicate that emotion may modulate visual processing through the rapid extraction of relevant features at even earlier stages of processing for both simple geometric gratings (Keil, Stolarova, Moratti, & Ray, 2007; Stolarova, Keil, & Moratti, 2006) and perceptually more complex faces (Morel, Ponz, Mercier, Vuilleumier, & George, 2009; Pizzagalli, Regard, & Lehmann, 1999). The patterns paired with the threatening stimuli elicited an increased C1 visual event-related component between 65- and 90-msec poststimulus (Stolarova et al., 2006), and a recent fMRI study revealed that Gabor gratings paired with shock resulted in increased activity in the primary visual cortex (Padmala & Pessoa, 2008). During aversive conditioning of faces, extrastriate visual cortex responses were also identified earlier than 120 msec (Pizzagalli, Greischar, & Davidson, 2003). Thus, recent findings suggest that emotional relevance supports associative learning in early visual stimulus processing.
Previous experiments either investigated associative learning of simple geometric patterns and/or made use of very few conditioned stimuli (CS; Pizzagalli et al., 2003; Gottfried, Deichmann, Winston, & Dolan, 2002). Thus, in these studies, the salience tagging might have always been on the basis of simple physical feature differences rather than individual visual perceptual features, necessary to differentiate socially relevant stimuli in real life. The present magnetoencephalography (MEG) study investigated the timing and regional distribution of cortical responses to different neutral facial stimuli (n = 104) that were associated with either an aversive odor or humid air. Thus, the discrimination among the CS+ and CS− stimuli was much more difficult than in previous studies, and the large number of faces challenged the system's resolving power and capacity limits. Furthermore, conditioning effects were revealed by comparing identical postconditioning and preconditioning phases, in which face images were also shown in a different viewing angle to examine the hypothesis that associative learning relies on image-based representations (Riesenhuber & Poggio, 2000).
One hypothesis is derived from the perspective of a hierarchically organized parallel visual processing system (Zeki, 1993), in which the flow of visual information is thought to follow a serial retinogeniculostriate pathway, before it reaches temporal visual and amygdaloid structures (Vuilleumier, 2005). As the discrimination of individual faces is presumed to depend on distinct higher-order visual structures implicated in face processing (Kanwisher & Yovel, 2006), associative learning effects may occur comparatively late in the visual processing stream (>120 msec). Alternatively, if the analysis of emotional information is distributed to several interacting cortical (prefrontal regions) and subcortical (i.e., amygdala) structures where relevant information can reach the amygdala quickly via a retino-collicular-pulvinar-amygdala pathway, higher-order visual processing of aversively conditioned faces may be modulated at very early latencies (Johnson, 2005; Vuilleumier, 2005; Bullier, 2001). Anatomically, support for this pathway comes, for example, from fiber-tracking studies in macaque monkeys, showing that the amygdala shares numerous reciprocal connections with virtually every stage of visual processing along the ventral visual stream (Freese & Amaral, 2005). Functionally, this pathway might exist in humans as well, as demonstrated by emotional effects in imaging data (Morris, Öhman, & Dolan, 1998) and blindsight vision (Hamm et al., 2003; De Gelder, Vroomen, Pourtois, & Weiskrantz, 1999). Additionally, lateral PFC and OFC have been suggested to guide recognition in inferotemporal cortex by providing initial guesses and hypotheses, particularly under circumstances of reduced discriminability (Bar et al., 2006).
Twenty-four right-handed adults (12 women and 12 men, mean age = 25.5 years, range = 22–34 years) with normal or corrected-to-normal vision participated in our study. None of the participants had a recorded history of neurological or psychiatric disorders; all were nonsmokers and provided written informed consent to the protocol approved by the review board of the University of Münster.
Stimuli and Apparatus
The CS comprised 208 pictures of 104 different neutral faces (104 frontal and 104 lateral views, 45° angle; matched left and right lateral view and matched male and female gender). Most images (N = 140, lateral and frontal view of 70 individuals) were taken from the Karolinska Directed Emotional Faces set (Lundqvist, Flykt, & Öhman, 1998), complemented by additional pictures from the face database of our own laboratory. Faces were cropped (including hair), converted to grayscale, and equalized in size (6 cm in height) using Adobe Photoshop® Software. Figure 1 shows examples of the stimuli used in the study.
Unconditioned Stimuli (US)
Hydrosulfide (H2S: 0.1% w/v in water) and humid clean air served as US and control stimulus, respectively. A “nonsmelling odor” was selected as the control stimulus to ensure that it is perceived as emotionally neutral and hence behaviorally irrelevant. Self-report measures, obtained at the end of the experiment, confirmed the expected differences in the valence of the odors. Specifically, each of the two odors was evaluated three times on the valence and arousal dimensions of the Self-Assessment Manikin (SAM) scale (Bradley & Lang, 1994), which was scored on a 9-point scale, where the valence scale ranged from 1 (negative) through 5 (neutral) to 9 (positive) and the arousal scale ranged from 1 (relaxed) to 9 (excited). Hydrosulfide (valence: M = 1.94; SD = 0.77, arousal: M = 5.01; SD = 2.03) compared with humid air (valence: M = 5.41; SD = 0.86, arousal: M = 2.84; SD = 1.72) was perceived as significantly more unpleasant (t(23) = −17.26, p < .001) and arousing (t(23) = 4.74, p < .001). The order of odor presentation did not affect ratings of valence and arousal, as neither the main effect nor higher-order interactions involving the Presentation factor reached significance.
Odors were delivered using a custom-made nonferromagnetic stimulator, which generated a continuous stream of compressed air, adjusted to a pressure of 0.3 bar. Two flexible tubes were attached to the stimulator and led to two sealed test tubes containing the two odorants. Another two tubes led from the test tubes to a retaining device placed on the subject's chest. Once activated, the odor emanated from the vents of the tubes, which were oriented toward the nostrils. In several simulations, odor delivery was calibrated such that no perceivable tactile, thermal, or acoustic stimulation occurred. In the conditioning phase, the ISI was 2.5 sec between offset and onset of two consecutively presented odors. Pretesting in a separate sample of subjects revealed that the aversive odor was eliminated almost immediately after its offset. Thus, the chosen ISI ensured that there were no carryover effects of odor between conditioning trials. This is also supported by the results in the SAM rating for the faces (see below).
The main experiment consisted of three phases: a pretraining, a conditioning, and a posttraining phase. The pretraining phase comprised two parts. First, participants were asked to rate all CS faces on a computerized version of the aforementioned SAM scales (Bradley & Lang, 1994). Stimuli were presented for 1000 msec and evaluated on dimensions of emotional meaning and arousal. Second, after the rating, participants were placed in the MEG scanner. They were presented with two repetitions of the CS stimuli (2 × 208 faces) in a fully randomized order, shown for 500 msec (ISI = 500 msec). Participants were instructed to passively view the pictures. A scheme of the experimental paradigm is given in Figure 1.
In the subsequent conditioning phase, 52 frontal and 52 lateral face stimuli were selected from the faces of the 104 individuals. Half of each of these frontal and lateral sets (2 × 26) was paired twice with hydrosulfide (CS+), and the other half (2 × 26) was paired twice with humid air (CS−). Because of the large number of CS used here, this will be referred to as a multi-CS conditioning paradigm. In the pretraining and posttraining phases, these 104 stimuli were presented in the same view as in the conditioning (training) phase. Additionally, another 104 stimuli, showing the same individuals from the respective other viewing angle, were presented in the pretraining and posttraining phases to test for view-independent and view-dependent conditioning effects in the later analysis.
The assignment of CS and US was balanced across participants. After each CS had been shown once, stimulus presentation was repeated, resulting in 208 conditioning trials. The CS was presented for 1000 msec at a central location (such that the nasion appeared at the center of the screen) at a visual angle of 6°. On trials with odor presentation, the CS was presented 1.5 sec after odor onset, and the odor persisted until 500 msec after CS offset. CS order was pseudorandomized, such that no more than three faces were paired with hydrosulfide (or humid air) consecutively. Although the procedure is not a classic forward, backward, or simultaneous conditioning paradigm, it relates to natural conditions in that the perception of odors both precedes the sight of a person and persists, even after the person has left the scene.
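The pseudorandomization constraint (no more than three consecutive faces of the same US type) can be illustrated with a minimal sketch. The function names, the per-pass counts of 52 CS+ and 52 CS− trials, and the restart strategy are assumptions made for illustration; the procedure actually used to generate the trial orders is not described, and this sketch does not sample uniformly from all admissible orders.

```python
import random

def longest_run(labels):
    """Length of the longest run of identical consecutive labels."""
    best, current = 0, 0
    previous = None
    for label in labels:
        current = current + 1 if label == previous else 1
        best = max(best, current)
        previous = label
    return best

def pseudorandom_order(n_plus, n_minus, max_run=3, seed=None):
    """Draw a CS+/CS- trial order with no run longer than max_run (hypothetical helper)."""
    rng = random.Random(seed)
    while True:  # restart in the rare case of a dead end
        left = {"CS+": n_plus, "CS-": n_minus}
        order = []
        while left["CS+"] + left["CS-"] > 0:
            allowed = [
                k for k in ("CS+", "CS-")
                if left[k] > 0
                and not (len(order) >= max_run
                         and all(x == k for x in order[-max_run:]))
            ]
            if not allowed:
                break  # only a forbidden type is left; start over
            # Weight by remaining counts so both types stay balanced
            pick = rng.choices(allowed, weights=[left[k] for k in allowed])[0]
            order.append(pick)
            left[pick] -= 1
        else:
            return order

order = pseudorandom_order(52, 52, max_run=3, seed=1)
```

Running the function twice with different seeds would yield the two passes through the 104 conditioned stimuli.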
Following conditioning, the posttraining phase was identical with respect to the experimental stimulation in the MEG chamber. After recording, the subjects completed the SAM rating for all CS faces again. Although the posttraining included many extinction trials, each CS was repeated only twice.
Between the pretraining, conditioning, and posttraining phases, short pauses of approximately 5 min were interspersed.
Finally, participants had to complete a binary forced-choice CS–US matching task. In this test, 20 pseudorandomly selected CS faces (10 that had been paired with hydrosulfide and 10 that had been paired with humid air) were presented, and participants had to indicate the associated US. To avoid fatigue and loss of motivation because of an excessive overall measurement time, only a pseudorandomly selected subset of all CS was chosen for the CS–US matching task. Overall, it took the participants about 60 min to complete the whole experiment. All behavioral tests were realized on a Dell® computer using Presentation® (Neurobehavioral Systems, Albany, CA).
MEG Recording and Analysis
Visual-evoked magnetic fields were acquired using a 275-channel MEG whole-head sensor system (Omega 275, CTF, VSM Medtech Ltd., Coquitlam, British Columbia, Canada) with first-order axial SQUID gradiometers. To monitor the participant's position in the MEG scanner, landmark coils were attached to the two auditory canals and the nasion. Furthermore, individual head shapes were digitized using a Polhemus 3Space Fasttrack to determine the individual head coordinate system. Continuous MEG data in the frequency band between 0 and 150 Hz were recorded with a sampling rate of 600 Hz. MEG data preprocessing, artifact rejection and correction, averaging, statistics, and visualization were conducted with the Matlab-based EMEGS software (www.emegs.org; Peyk, De Cesarei, & Junghöfer, in press). Responses were down-sampled to 300 Hz and filtered off-line between 0.01 and 148 Hz. To optimize the detection of early and transient brain responses in the higher-frequency range, further low-pass filtering was not applied. Data were aligned to stimulus onset, with an averaging epoch ranging from 200 msec before stimulus to 300 msec after stimulus and baseline-adjusted using a 150-msec pre-stimulus interval. The method for statistical control of artifacts in high-density EEG/MEG data was used for single-trial data editing and artifact rejection (Junghöfer, Elbert, Tucker, & Rockstroh, 2000).
This procedure (1) detects individual channel artifacts, (2) detects global artifacts, (3) replaces artifact-contaminated sensors with spline interpolation statistically weighted on the basis of all remaining sensors, and (4) computes the variance of the signal across trials to document the stability of the averaged waveform. The rejection of artifact-contaminated trials and sensor epochs relies on the calculation of statistical parameters for the absolute measured magnetic field amplitudes over time, their standard deviation over time, and the maximum of their gradient over time (first temporal derivative), and on the determination of boundaries for each of these three parameters. On average, 5 of 275 sensors were interpolated and 9.6% of all trials were rejected by the artifact rejection procedure.
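As an illustration of the epoch handling and of the three per-epoch parameters named above, the following is a minimal sketch for a single sensor epoch. It is not the EMEGS implementation. It assumes that the 150-msec baseline is the interval immediately preceding stimulus onset and that the amplitude parameter is summarized as the maximum absolute value; the function names are hypothetical, and the rejection boundaries themselves are not modeled.

```python
from statistics import mean, pstdev

SFREQ = 300.0            # Hz, sampling rate after down-sampling
EPOCH_START_MS = -200.0  # epoch runs from -200 to +300 msec

def ms_to_index(t_ms):
    """Convert a latency relative to stimulus onset into a sample index."""
    return round((t_ms - EPOCH_START_MS) * SFREQ / 1000.0)

def baseline_adjust(epoch, baseline_ms=150.0):
    """Subtract the mean of the baseline interval (assumed: last 150 msec before onset)."""
    start, stop = ms_to_index(-baseline_ms), ms_to_index(0.0)
    base = mean(epoch[start:stop])
    return [x - base for x in epoch]

def artifact_parameters(epoch):
    """Return (max absolute amplitude, SD over time, max absolute gradient per second)."""
    max_abs = max(abs(x) for x in epoch)
    sd = pstdev(epoch)
    max_grad = max(abs(b - a) for a, b in zip(epoch, epoch[1:])) * SFREQ
    return max_abs, sd, max_grad

# Toy epoch: flat baseline of 1.0, then a step to 3.0 at stimulus onset
toy_epoch = [1.0] * 60 + [3.0] * 90
adjusted = baseline_adjust(toy_epoch)
max_abs, sd, max_grad = artifact_parameters(adjusted)
```

In a full pipeline, each parameter would be compared against a boundary derived from its distribution across sensors and trials, which is the step the cited method defines.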
After averaging, cortical sources of the event-related magnetic fields were estimated using the L2 minimum-norm estimate method (Hämäläinen & Ilmoniemi, 1994). The L2 minimum-norm estimate method is an inverse modeling technique applied to reconstruct the topography of the primary current underlying the magnetic field distribution. It allows for the estimation of distributed neural network activity without a priori assumptions regarding the location and/or number of current sources (Hauk, 2004). In addition, of all possible generator sources, only those exclusively determined by the measured magnetic fields are considered. A spherical shell with 2 × 350 evenly distributed dipoles (azimuthal and polar directions at each of 350 positions; radial dipoles do not generate magnetic fields outside of a sphere) was used as the source model. A source shell radius of 87% of the individually fitted head radius was chosen, roughly corresponding to the gray matter depth. Although the distributed source reconstruction in MEG does not give the precise location of cerebral generators, it allows for a fairly good approximation of cortical generators and corresponding assignment to larger cortical structures. Across all participants and conditions, a Tikhonov regularization parameter k of 0.2 was applied. Topographies of source direction-independent neural activities (the vector length of the estimated source activities at each position) were calculated for each individual participant, condition, and time point on the basis of the averaged magnetic field distributions and the individual sensor positions for each participant and run.
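For orientation, the generic textbook form of a Tikhonov-regularized L2 minimum-norm estimate is given below. The symbols are assumptions introduced here and do not appear in the original: b is the vector of averaged magnetic fields at the sensors for one time point, L is the leadfield matrix of the 2 × 350 dipole shell, I is the identity matrix, and k is the regularization parameter. How the value k = 0.2 is scaled relative to L L^T in the software used is not stated in the text, so the expression should be read as a sketch rather than as the exact computation.

```latex
\hat{\mathbf{j}} = \mathbf{L}^{\top}\left(\mathbf{L}\mathbf{L}^{\top} + k\,\mathbf{I}\right)^{-1}\mathbf{b},
\qquad
a_i = \sqrt{\hat{j}_{i,\theta}^{\,2} + \hat{j}_{i,\varphi}^{\,2}},
\quad i = 1, \dots, 350
```

Here a_i is the direction-independent activity at source position i, that is, the vector length of the estimated azimuthal and polar components at that position.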
Taking advantage of combining high-density MEG with source localization methods, the analysis was based on the neural generator activities obtained for every dipole, time point, and subject. To statistically test for conditioning effects, these data were submitted to a repeated measures ANOVA, including the factors Session (pretraining vs. posttraining), Valence (negative vs. neutral), and View (frontal vs. lateral). As a result of this procedure, a spatio-temporal distribution of statistical values for each dipole over time and across all subjects is obtained. Given a significance threshold of p < .05, only effects matching predefined spatio-temporal criteria survived this analysis step. Specifically, to avoid false-positive results, a statistical effect was considered meaningful only if it emerged in a region composed of at least 10 neighboring neural sources (dipoles) and appeared within a time interval comprising at least 10 consecutive time points (33 msec). On the basis of the results of this spatio-temporal waveform analysis, the identified regions showing a significant difference were further analyzed using post hoc t tests to specify the direction of effects. To assess conditioning effects, brain activity measured in the posttraining phase was compared with that measured in the pretraining phase, that is, two phases in which no odor was present at all. The pretraining and posttraining sessions were identical with respect to their experimental design.
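The spatio-temporal criterion can be sketched as follows. This is one possible reading of the rule, not the procedure implemented in the analysis software: at each time point, significant dipoles are grouped into connected clusters, clusters with fewer than 10 dipoles are discarded, and a dipole is retained only where it stays within such a cluster for at least 10 consecutive samples. The function names, the `neighbors` adjacency structure, and the toy data are hypothetical.

```python
def temporal_runs(row, min_len):
    """Keep only True entries that belong to a run of at least min_len samples."""
    keep = [False] * len(row)
    start = None
    for i, flag in enumerate(list(row) + [False]):  # sentinel closes the last run
        if flag and start is None:
            start = i
        elif not flag and start is not None:
            if i - start >= min_len:
                for j in range(start, i):
                    keep[j] = True
            start = None
    return keep

def spatial_clusters(active, neighbors, min_size):
    """Connected components of the active dipoles with at least min_size members."""
    seen, clusters = set(), []
    for seed in sorted(active):
        if seed in seen:
            continue
        component, stack = set(), [seed]
        while stack:
            d = stack.pop()
            if d in component:
                continue
            component.add(d)
            stack.extend(n for n in neighbors[d] if n in active and n not in component)
        seen |= component
        if len(component) >= min_size:
            clusters.append(component)
    return clusters

def surviving_mask(p_values, neighbors, alpha=0.05, min_dipoles=10, min_samples=10):
    """Boolean dipole x time mask of effects meeting both criteria."""
    n_dipoles, n_times = len(p_values), len(p_values[0])
    mask = [[False] * n_times for _ in range(n_dipoles)]
    for t in range(n_times):
        active = {d for d in range(n_dipoles) if p_values[d][t] < alpha}
        for component in spatial_clusters(active, neighbors, min_dipoles):
            for d in component:
                mask[d][t] = True
    return [temporal_runs(row, min_samples) for row in mask]

# Toy example: 20 dipoles arranged in a chain, 30 time points
neighbors = {d: [n for n in (d - 1, d + 1) if 0 <= n < 20] for d in range(20)}
p_values = [[0.5] * 30 for _ in range(20)]
for d in range(11):                 # 11 neighboring dipoles
    for t in range(5, 17):          # significant for 12 samples: survives
        p_values[d][t] = 0.01
    for t in range(20, 25):         # significant for 5 samples: too short
        p_values[d][t] = 0.01
for d in range(14, 17):             # only 3 neighboring dipoles: too small
    for t in range(30):
        p_values[d][t] = 0.01
mask = surviving_mask(p_values, neighbors)
```

At the 300-Hz sampling rate reported above, 10 consecutive samples correspond to the 33 msec stated in the text.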
In advance of the Results section, it is important to note that an additional analysis explored whether conditioning effects depended on an identical viewing angle in conditioning and testing phases (the identical image of the face) or if image-independent transfer of CS–US associations occurred (respective different viewing angle). This factor was labeled Image (same vs. different). However, no significant conditioning effects were observed in the posttraining period for faces shown in a different orientation than used during conditioning. This was true for behavioral as well as MEG data. Further control analysis on the basis of the mean activity in regions and time intervals showing conditioning effects also revealed no significant differences associated with faces in image-independent conditions. Accordingly, the data suggest that conditioning effects were image-based and, for brevity, the Image factor was not considered in the following analysis.
CS–US Matching Task
In the matching task, participants were asked to indicate which US (hydrosulfide or humid air) was delivered while viewing the CS face. As expected and revealing an intended lack of contingency awareness, overall performance as measured by the sensitivity index d′ (M = 0.07, SD = 0.52) did not exceed chance (one-sample t test [test value = 0]: t(23) = 0.7, p = .49).
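For readers unfamiliar with the sensitivity index, d′ is the difference between the z-transformed hit rate and the z-transformed false-alarm rate. The sketch below treats a "hydrosulfide" response to a CS+ face as a hit and a "hydrosulfide" response to a CS− face as a false alarm. The log-linear correction for extreme rates and the function name are assumptions; the text does not state how the index was computed or whether any correction was applied.

```python
from statistics import NormalDist

def d_prime(hits, misses, false_alarms, correct_rejections):
    """Sensitivity index d' = z(hit rate) - z(false-alarm rate).

    A log-linear correction (add 0.5 to each count, 1 to each total) is
    assumed here to avoid infinite z-scores for rates of 0 or 1.
    """
    hit_rate = (hits + 0.5) / (hits + misses + 1.0)
    fa_rate = (false_alarms + 0.5) / (false_alarms + correct_rejections + 1.0)
    z = NormalDist().inv_cdf
    return z(hit_rate) - z(fa_rate)

# Hypothetical participant: 10 CS+ and 10 CS- faces, responding at chance
chance = d_prime(hits=5, misses=5, false_alarms=5, correct_rejections=5)
# Hypothetical participant with some knowledge of the pairings
aware = d_prime(hits=8, misses=2, false_alarms=2, correct_rejections=8)
```

A value of zero indicates no ability to discriminate CS+ from CS− faces, which is the reference value of the one-sample t test reported above.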
To evaluate conditioning-induced changes in evaluative ratings of the CS faces, a repeated measures ANOVA including the factors Session, View, and US Valence was conducted. Ratings of two participants were excluded from further analysis because of erroneous use of the rating scales. As expected, a significant Session × US Valence interaction was found, F(1, 21) = 5.2, p = .033, indicating a relatively more negative classification of CS+ as compared with the CS− faces. Post hoc t tests for the CSpost minus CSpre differences (CS+post − CS+pre vs. CS−post − CS−pre) revealed that CS+ faces (M = −.011, SD = .23) as compared with CS− faces (M = .12, SD = .28) were perceived to be more negative after conditioning (t(21) = −2.28, p = .033). Thus, despite the large number of CS stimuli, evaluative conditioning effects were observed. Considering the results of the explicit CS–US matching task, conditioning effects may occur implicitly in the absence of awareness about CS–US pairings. Because the matching task probed only a randomly selected subset of CS, it is theoretically possible that, by chance, the task primarily presented faces whose pairings had not been explicitly learned by the subjects. However, as the selection of this subset was completely random and no subject showed a tendency to reliably report previously learned CS–US associations, this issue does not appear to have biased the data. Conditioning had no effect on arousal ratings; hence, these results are not reported.
Conditioning effects were explored by contrasting neural activity for CS+ (hydrosulfide) and CS− (humid air) faces in the posttraining relative to the pretraining condition. As expected, there were no significant differences in processing of the CS during the pretraining condition. In the posttraining condition, conditioning effects were revealed by the global power measure, showing a marked increase in neural activity for faces paired with hydrosulfide compared with faces paired with humid air in an early (50–80 msec) and a midlatency (130–190 msec) time window (Figure 2). Between 50 and 80 msec, neuronal activity associated with CS+ processing was increased over frontal and right occipito-parieto-temporal regions. Later stages of processing (130–190 msec) revealed enhanced neural activity for CS+ processing in more distributed frontal and occipito-parieto-temporal regions.
Area Score: “Early” Conditioning Effects (50–80 msec)
Each of the two source regions revealing conditioning effects was submitted to an overall ANOVA analysis containing the factors of Session (pretraining vs. posttraining), View (frontal vs. lateral), and US Valence (hydrosulfide vs. humid air).
In the frontal region, a significant interaction of Session × US Valence was observed, F(1, 23) = 14.89, p = .001. Whereas CS+ activity in this region significantly increased from pretraining to posttraining session (t(23) = −3.28, p = .003), CS− activity did not change (t(23) = 0.58, p = .571; Figure 3A).
Right occipito-parieto-temporal region
Similar to the frontal region, a significant interaction of Session × US Valence was found, F(1, 23) = 9.86, p = .005. Here, the effect reflected a relative increase in CS+ processing: neural activity for the CS+ did not change between pretraining and posttraining (t(23) = −1.15, p = .264), whereas activity for the CS− decreased significantly (t(23) = 2.53, p = .019; Figure 3B).
To identify the onset of effects found in both regions, a 10-msec sliding window analysis for an interval ranging from 30 to 100 msec was performed. For each interval, the above-described ANOVA was calculated. In the frontal region, the first significant difference between CS+ and CS− processing occurred in the 40- to 50-msec interval, F(1, 23) = 7.87, p = .01, whereas in the occipito-parieto-temporal region, the first significant effect was found in the 50- to 60-msec interval, F(1, 23) = 4.83, p = .038. Thus, effects in the frontal region seem to precede effects in the right occipito-parieto-temporal region.
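The onset analysis can be sketched under two stated assumptions. First, the windows are taken to advance in 10-msec steps (30–40, 40–50, and so on), which matches the intervals reported but is not stated explicitly. Second, the Session × US Valence interaction is computed on activity averaged over View as the squared one-sample t value of the difference of differences, which gives an F value with 1 and n − 1 degrees of freedom for a two-level within-subject interaction. Function names and the toy numbers are hypothetical and are not the study's data.

```python
from statistics import mean, stdev

SFREQ = 300.0            # Hz, sampling rate after down-sampling
EPOCH_START_MS = -200.0  # epoch starts 200 msec before stimulus onset

def ms_to_index(t_ms):
    """Convert a latency relative to stimulus onset into a sample index."""
    return round((t_ms - EPOCH_START_MS) * SFREQ / 1000.0)

def windows(start_ms=30, stop_ms=100, width_ms=10):
    """Successive analysis windows as (start, stop) pairs in msec."""
    return [(t, t + width_ms) for t in range(start_ms, stop_ms, width_ms)]

def window_mean(waveform, t0_ms, t1_ms):
    """Mean regional activity within one window (stop sample excluded)."""
    return mean(waveform[ms_to_index(t0_ms):ms_to_index(t1_ms)])

def interaction_f(plus_pre, plus_post, minus_pre, minus_post):
    """Session x Valence interaction as F = t^2 of (CS+ change) - (CS- change)."""
    diffs = [(pp - p) - (mp - m)
             for p, pp, m, mp in zip(plus_pre, plus_post, minus_pre, minus_post)]
    n = len(diffs)
    t = mean(diffs) / (stdev(diffs) / n ** 0.5)
    return t * t, (1, n - 1)

# Toy data for four hypothetical participants in one window
f_value, df = interaction_f([1, 2, 3, 4], [2, 4, 3, 6], [1, 2, 3, 4], [1, 2, 3, 4])
```

Applying `window_mean` to each participant's regional waveform for every window returned by `windows()` and then calling `interaction_f` per window reproduces the structure of the analysis; the first window with a significant F marks the estimated onset.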
Area Score: “Midlatency” Conditioning Effects (130–190 msec)
Three source regions showing conditioning effects were analyzed using the above-mentioned ANOVA.
Similar to the effects observed in the earlier interval, a highly significant interaction of Session × US Valence was found in a more lateral frontal region, F(1, 23) = 8.89, p = .007. Within this region, CS+ faces (t(23) = −3.24, p = .004) evoked significantly stronger responses than CS− (t(23) = −0.072, p = .943) faces (Figure 4A).
Right and left occipito-parieto-temporal regions
As expected, in these bilateral and more distributed regions, significant differences in the processing of the CS between the pretraining and posttraining conditions were found (right: F(1, 23) = 4.56, p = .044; left: F(1, 23) = 10.04, p = .004).
Within both areas, significant effects were characterized by decreased neural activity in the CS− (right: t(23) = 3.05, p = .006; left: t(23) = 3.5, p = .002) as compared with the CS+ (right: t(23) = 1.06, p = .299; left: t(23) = 0.35, p = .729) condition. Finally, a left superior frontal region failed to reach statistical significance (p > .05) in the second level analysis (Figure 4B).
Aversive associative learning was used to examine facial emotional processing. Using a much larger number of CS stimuli than previous studies, highly significant conditioning effects were observed, indicating that aversive conditioning, unbeknownst to the individual, affects (emotional) face processing to a much greater extent than previously thought. The main finding relates to the role of PFC activation. CS+ compared with CS− processing was associated with enhanced neural activity in PFC encompassing lateral and orbital regions. This PFC activation occurred rather early in the processing stream (∼50 msec), even preceding enhanced CS+ processing in inferotemporal regions implicated in object recognition. Overall, aversive associative learning efficiently alters the processing of socially relevant stimuli and induces short-latency responses in cortical structures related to object recognition, categorization, and emotion processing.
In both time windows, between 50–80 and 130–190 msec, faces associated with hydrosulfide elicited more neural activity in the OFC than did faces associated with humid air. The OFC has been implicated in stimulus–reinforcement learning in macaques, where neurons responded to learning new associations between a neutral and a primary reinforcer (Kringelbach & Rolls, 2004; Thorpe, Rolls, & Maddison, 1983). Furthermore, single-cell recordings in macaques revealed neurons in the OFC that selectively responded to faces, providing a possible neuronal basis for the rapid learning of face–reinforcement associations (Rolls, 2004). Human studies investigating associative learning are consistent with this notion. For instance, in a visuo-olfactory conditioning study, OFC regions were implicated in the storage of previously learned CS–US associations presumably mediated by an OFC-amygdala network (Gottfried & Dolan, 2004). Overall, the present findings are in general agreement with the notion that the OFC is involved in learning new associations, and given its reciprocal connections with the amygdala (Ghashghaei & Barbas, 2002), an early engagement in emotion discrimination is likely.
In addition, faces associated with hydrosulfide relative to humid air also elicited increased activity on the lateral surface of PFC (Figure 3A). Previous animal studies have shown that neuronal activity in lateral PFC and ITC reflects the category of visual stimuli. The ITC seems to be more involved in the visual analysis of currently viewed images, whereas lateral PFC regions rather encode for behaviorally relevant factors, such as category membership (Freedman, Riesenhuber, Poggio, & Miller, 2003). A hallmark function of visual categorization is that perceptually similar objects may be treated as different when they belong to different categories, whereas perceptually dissimilar objects are treated alike when they belong to the same category (Wyttenbach, May, & Hoy, 1996). The present findings imply that the emotional value can serve as a cardinal basis of categorization. Functionally, treating the many faces conditioned with an aversive odor in a similar way seems highly adaptive to activate a motivational orientation of avoidance (Lang, Bradley, & Cuthbert, 1997). Accordingly, it is proposed that lateral prefrontal activation may reflect a fundamental and basic categorization of human experience: the discrimination of “bad” from “good” (Lang et al., 1997; Cacioppo, Crites, Berntson, & Coles, 1993).
In addition, activity in both time windows extends to motor and/or premotor areas (see Figures 3 and 4) of the right hemisphere, although no explicit task was given to the subjects requiring motor reactions, for example, button presses. However, there are studies showing that the supramarginal gyrus is involved in emotional processing as lesions of the right somatosensory areas (including supramarginal gyrus) can lead to deficits in judging emotions from faces (e.g., Adolphs, Damasio, Tranel, Cooper, & Damasio, 2000). Accordingly, it might be that somatosensory areas are activated to increase sensitivity to bodily responses and encoded emotional associations. Alternatively, the activation along the parietal and frontal axis seen in Figures 3 and 4 might reflect activation in attentional networks, that is, directing visuospatial attentional resources toward salient emotional features such as the eyes (Adolphs & Spezio, 2006).
Lateral PFC, orbital PFC, and ITC are reciprocally connected, providing a neural basis for the categorization of visual information (Bar, 2003; Freedman et al., 2003). Furthermore, the OFC is part of an OFC–amygdala–ITC triad, which has extensive reciprocal connections to other limbic areas, premotor areas, and the autonomic motor system (Ghashghaei & Barbas, 2002). Thus, PFC activation may facilitate the detection of danger to avoid harm. It has been suggested that the early activation of PFC may guide visual information processing by providing top–down expectations to the ventral processing stream (Adolphs, 2002). A critical issue in this regard is that PFC activation may precede activation in object-related regions in the temporal cortex, that is, lateral occipital cortex and fusiform gyrus (Bar et al., 2006). The current findings support this notion, as PFC activation preceded temporal cortex activation. However, although latencies of PFC and temporal cortex activations in the later time window (130–190 msec; Figure 4) nicely fit to previously reported latencies for PFC-driven modulation of sensory processing (>120 msec; e.g., Rudrauf et al., 2008; Bar, 2003; Adolphs, 2002), activation in the earlier time window (50–80 msec; Figure 3) indicates PFC-driven sensory modulations already in the initial wave of processing. With regard to existing anatomical pathways, several alternative explanations of this result appear feasible:
First, rapid modulations mediated by limbic structures could contribute to the early effects found here. It has been shown that affect-related information can be conveyed to emotional structures like the amygdala via direct projections from the sensory thalamus with response latencies around 20–30 msec (LeDoux, 2000) and 20–50 msec (Luo et al., 2009; Luo, Holroyd, Jones, Hendler, & Blair, 2007) in the auditory and visual domain, respectively. The amygdala, as part of a widely distributed emotional network, shares numerous, mainly reciprocal connections with, for example, OFC, medial PFC, and the entire ventral visual stream (Freese & Amaral, 2005; Ghashghaei & Barbas, 2002), perhaps facilitating processing in these areas. Thus, initial stimulus categorization could be dominated by evaluation of emotional significance, allowing for rapid preparation of basic motivational systems (LeDoux, 2000) and may even precede full (e.g., view-independent) object recognition. However, in a recent review, Pessoa and Adolphs (2010) challenge the model of a fast thalamo-amygdala pathway in humans for the rapid transmission of emotion-related information. Thus, this pathway might be less likely to produce the early effects seen in the present study.
Second, it is possible that fast geniculo-cortico-cortical pathways are involved. Earliest response latencies in occipital (V1–V4), parietal (MT, MST), and prefrontal (FEF) areas in the macaque can be seen before 65 msec (Bullier, 2001; Lamme & Roelfsema, 2000). In humans, Kirchner, Barbeau, Thorpe, Régis, & Liégeois-Chauvel (2009) reported earliest V1 and FEF responses around 25 and 45 msec, respectively. Thus, sensory information can reach frontal areas extremely fast (<50 msec) through rapid cortico-cortical routes (e.g., long-range association fibers such as longitudinal fasciculus). Our finding further accords with ultrarapid saccadic eye movements (∼120 msec) in an object detection task, suggesting that a first analysis of even complex visual stimuli is provided within less than 100 msec (Kirchner & Thorpe, 2006).
Third, it has been suggested that the remaining nonconscious visual capacities in blindsight patients might be mediated by an extrageniculostriate pathway, “bypassing” V1, involving the superior colliculus and the posterior visual thalamus (Morris, DeGelder, Weiskrantz, & Dolan, 2001). Consistent with the current findings, a recent MEG study revealed increased activity to facial stimuli with emotional expressions at midtemporal visual areas already around 70 msec in a patient with “affective blindsight” (Andino, Menendez, Khateb, Landis, & Pegna, 2009).
Currently, we cannot provide clear evidence in favor of one main pathway or even a network combination of these three candidate pathways accounting for the early effects found in the present study. However, our findings—at least for the early interval—seem difficult to reconcile with a strict hierarchically organized model of visual processing. In such a model, earliest differential CS+ activity at PFC regions should have been observed in the midlatency time interval (>120 msec). The finding of differential CS processing already between 50 and 80 msec suggests that emotionally significant information is distributed to cortical (prefrontal regions) and subcortical structures early in the processing stream (cf. Vuilleumier, 2005; Bullier, 2001).
The finding in the midlatency time window (130–190 msec) is consistent with previous EEG and MEG studies reporting the amplified processing of aversively conditioned faces in this time region (Dolan, Heinze, Hurlemann, & Hinrichs, 2006; Pizzagalli et al., 2003). However, compared with these studies, which relied on a posttraining comparison of CS processing only, the pretraining condition of the present study provided additional information regarding aversive conditioning effects. Specifically, the amplified processing of CS+ as compared with CS− processing can reflect either that the activation for the CS+ is stronger in the posttraining session whereas responses to the CS− seem to be unaffected by conditioning across sessions (CS+post > CS+pre vs. CS−post = CS−pre), or the activation for the CS+ does not change from pretraining to posttraining sessions whereas the responses to the CS− significantly decrease or habituate, respectively (CS+post = CS+pre vs. CS−post < CS−pre). In both cases, activations in the posttraining phase would be stronger for the CS+ compared with the CS− stimuli. Inspection of Figures 3 and 4 reveals evidence for both alternatives: Over anterior frontal regions, CS+ processing seems to be facilitated as compared with pretraining, whereas CS− processing appears attenuated at posterior occipito-temporal regions. Overall, the occurrence of these response patterns might suggest different modes of biasing processing toward the more relevant CS+ in the posttraining phase, possibly reflecting functional differences among brain regions.
Previous studies provided evidence for PFC-driven modulation of sensory processing (e.g., Rudrauf et al., 2008; Bar et al., 2006; Adolphs, 2002). It has been suggested that PFC regions can influence even early perceptual processing in a top–down manner within the first 120 msec (Adolphs, 2002), possibly starting around 80 msec after stimulus onset (Bar et al., 2006) or even earlier, as suggested here. Additionally, lateral prefrontal regions have been implicated in stimulus categorization and the encoding of behaviorally relevant information (Freedman et al., 2003). Furthermore, neurons in the right PFC respond selectively to aversive events in a time range between 120 and 160 msec (Kawasaki et al., 2001). Other studies regarding basic face perception have shown that responses to faces habituate over time if they are not task-relevant (Maurer, Rossion, & McCandliss, 2008). Given that frontal areas are supposed to represent the emotional value of sensory stimuli, we suggest that the activity found in frontal areas in the present study guides the activity in more occipito-temporal areas by providing initial top–down guesses (Bar et al., 2006) and, furthermore, prevents habituation to the CS+ faces (because they are emotionally relevant). Such habituation is, in turn, evident for the CS−, leading to a relative decrease of occipito-temporal CS− activation in the pre–post comparison.
The present findings also parallel previous results regarding enhanced CS+ sensory processing in the early time interval (<100 msec). For instance, previous research using gratings as CS+ stimuli revealed early differential processing of CS+ compared with CS− stimuli in the C1 time window between 65 and 90 msec (Keil et al., 2007; Stolarova et al., 2006). However, whereas these rather simple CS+ stimuli primarily enhanced primary visual cortex activation, the more complex face CS+ evoked more widespread preferential processing in secondary occipito-temporal regions in the present study. Furthermore, as predicted by current models of object recognition (Riesenhuber & Poggio, 2000), enhanced processing of CS+ faces within the early and midlatency intervals appears to be based on view-dependent representations.
Taken together, our results reveal a remarkable capacity for emotional categorization. Pairing each of the 52 CS+ faces with an aversive odor only twice was sufficient to induce associative learning. The large number of CS stimuli effectively prevented contingency awareness. When explicitly asked to indicate CS–US relationships, participants performed at chance level. In contrast, emotional valence ratings clearly differed between CS conditions, as CS+ faces were rated as more unpleasant than CS− faces. Thus, there is a dissociation between explicit representation of CS–US contingencies and implicit CS–US effects revealed by MEG findings and self-report measures. This result is consistent with numerous findings demonstrating implicit associative learning that does not require explicit knowledge of contingencies (see Öhman & Mineka, 2001).
Overall, this study revealed associative learning effects in extrastriate visual as well as prefrontal areas, which occurred in the absence of contingency awareness, with a stimulus set overwhelmingly exceeding working memory capacity and with only two learning instances. The rather early onset of the prefrontal effects (∼40–50 msec) is consistent with nonsequential models of visual processing in which emotionally significant information leads to amplified processing in a distributed cortico-subcortical network early in the processing stream, preceding full visual analysis.
This work was supported by the German Research Association (DFG; SFB-TRR58-C1) and the Academy of Science, Heidelberg.
Reprint requests should be sent to Dr. Markus Junghöfer, Institute for Biomagnetism and Biosignalanalysis, University of Münster, Malmedyweg 15, 48149 Münster, Germany, or via e-mail: firstname.lastname@example.org.
The artifact detection process in EMEGS tests the goodness of interpolation by interpolating a multitude (number of sensors = 275) of test topographies. If many to-be-interpolated sensors fall into one region or if many noisy sensors are positioned at the edge of the sensor coverage, many test topographies are not interpolated with sufficient accuracy and the corresponding trials are rejected from further analysis.
The degrees of freedom of the distributed source model exceed the number of measured MEG sensors. Inverse modeling thus poses an underdetermined problem, and inversion of the leadfield matrix requires regularization. Tikhonov regularization is the most commonly used method for regularizing ill-posed problems.
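As an illustration only, the regularized inversion can be sketched for a generic L2 minimum-norm formulation, in which the source estimate is obtained as Lᵀ(LLᵀ + λI)⁻¹b for a leadfield matrix L and a sensor topography b. The function name `tikhonov_inverse`, the parameter `lam`, and the toy dimensions below are assumptions made for this sketch; they are not taken from the analysis software or the settings used in the present study.

```python
import numpy as np


def tikhonov_inverse(leadfield, lam):
    """Return a Tikhonov-regularized minimum-norm inverse operator.

    leadfield : array of shape (n_sensors, n_sources), with n_sources > n_sensors
    lam       : regularization parameter (hypothetical name), lam > 0
    The result has shape (n_sources, n_sensors).
    """
    n_sensors = leadfield.shape[0]
    # The sensor-space Gram matrix L L^T is small (n_sensors x n_sensors).
    gram = leadfield @ leadfield.T
    # Adding lam * I stabilizes the inversion of an ill-conditioned Gram matrix.
    regularized = gram + lam * np.eye(n_sensors)
    # Solve (L L^T + lam I) X = I instead of forming an explicit inverse.
    return leadfield.T @ np.linalg.solve(regularized, np.eye(n_sensors))


# Toy example with made-up dimensions: 5 sensors, 20 sources.
rng = np.random.default_rng(0)
toy_leadfield = rng.standard_normal((5, 20))
toy_topography = rng.standard_normal(5)
source_estimate = tikhonov_inverse(toy_leadfield, lam=0.1) @ toy_topography
```

Larger values of `lam` shrink the norm of the source estimate at the cost of a poorer fit to the measured topography, so the parameter trades stability against accuracy.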
The sensitivity index d′ was calculated as d′ = z(hit rate) − z(false alarm rate), where z denotes the inverse of the cumulative standard normal distribution (see Green & Swets, 1966).
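A minimal sketch of this computation, using only the Python standard library, is given below. The function name `d_prime` and the example rates are hypothetical and serve only to illustrate the formula; they are not data from the present study.

```python
from statistics import NormalDist


def d_prime(hit_rate, false_alarm_rate):
    """Compute d' = z(hit rate) - z(false alarm rate).

    z is the inverse of the cumulative standard normal distribution.
    Both rates must lie strictly between 0 and 1; rates of exactly
    0 or 1 make inv_cdf raise statistics.StatisticsError.
    """
    z = NormalDist().inv_cdf
    return z(hit_rate) - z(false_alarm_rate)


# Equal hit and false alarm rates correspond to chance performance.
chance_level = d_prime(0.5, 0.5)  # 0.0
# A hypothetical observer with more hits than false alarms.
sensitive_observer = d_prime(0.8, 0.2)
```

A d′ of zero indicates that hits and false alarms occur equally often, that is, performance at chance level.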