We assessed the extent of neural competition for attentional processing resources in early visual cortex between foveally presented task stimuli and peripheral emotional distracter images. Task-relevant and distracting stimuli were shown in rapid serial visual presentation (RSVP) streams to elicit the steady-state visual evoked potential, which serves as an electrophysiological marker of attentional resource allocation in early visual cortex. A task-related RSVP stream of symbolic letters was presented centrally at 15 Hz while distracting RSVP streams were displayed at 4 or 6 Hz in the left and right visual hemifields. These image streams always had neutral content in one visual field and would unpredictably switch from neutral to unpleasant content in the opposite visual field. We found that the steady-state visual evoked potential amplitude was consistently modulated as a function of change in emotional valence in peripheral RSVPs, indicating sensory gain in response to distracting affective content. Importantly, the facilitated processing for emotional content shown in one visual hemifield was not paralleled by any perceptual costs in response to the task-related processing in the center or the neutral image stream in the other visual hemifield. Together, our data provide further evidence for sustained sensory facilitation in favor of emotional distracters. Furthermore, these results are in line with previous reports of a “different hemifield advantage” with low-level visual stimuli and are suggestive of independent processing resources in each cortical hemisphere that operate beyond low-level visual cues, that is, with complex images that impact early stages of visual processing via reentrant feedback loops from higher order processing areas.
Emotional stimuli are often regarded as a privileged stimulus category, whose neural processing in the human brain is prioritized because of their pivotal role in motivation and behavior. However, there is a longstanding debate concerning to what extent emotional cues automatically attract attentional resources when they are presented as task-irrelevant and outside the focus of attention (Todd & Manaligod, 2018; Carretié, 2014; Okon-Singer, Lichtenstein-Vidne, & Cohen, 2013; Vuilleumier, 2005; Pessoa, McKenna, Gutierrez, & Ungerleider, 2002; Vuilleumier, Armony, Driver, & Dolan, 2001).
A critical factor in determining the extent of unattended processing of emotional distracters is thought to be the availability of attentional processing resources. It has been shown that visual processing capacity is limited, and the enhanced neural processing of emotionally arousing cues is typically achieved at the expense of other simultaneously presented stimuli that compete for the limited pool of attentional resources (Jiang, Wu, Saab, Xiao, & Gao, 2018; Deweese, Müller, & Keil, 2016; Mather & Sutherland, 2011; Müller, Andersen, & Keil, 2008; Desimone & Duncan, 1995). Interestingly, spatial positions of unattended emotional distracters and task-relevant stimuli may also influence the competitive interactions between attention and emotion. Specifically, previous research has indicated that when task-irrelevant emotional pictures shared the same spatial location with task-relevant items, for example, when being spatially overlaid by the targets, they frequently exerted a greater attentional capture relative to their neutral counterparts (Santos, Iglesias, Olivares, & Young, 2008; Okon-Singer, Tzelgov, & Henik, 2007; Anderson, Christoff, Panitz, De Rosa, & Gabrieli, 2003). Conversely, when presented spatially separated from the task-related stimuli, either in the periphery (Lichtenstein-Vidne, Henik, & Safadi, 2012; De Cesarei, Codispoti, & Schupp, 2009; Eimer, Holmes, & McGlone, 2003; Holmes, Vuilleumier, & Eimer, 2003) or at fixation (Holmes, Kiss, & Eimer, 2006), unattended distracter images often failed to produce differential neural modulation by affective content, that is, emotionally arousing distracters were not processed preferentially relative to their neutral counterparts.
The mixed evidence might be, at least in part, due to the relative spatial positions of stimuli, which may have resulted in spatial attentional filtering by suppressing the processing of task-irrelevant items (Müller & Hübner, 2002) and possibly by dividing the attentional spotlight between noncontiguous zones of the visual field (Müller, Malinowski, Gruber, & Hillyard, 2003). In concert, dividing attention across left and right visual hemifields to process targets has been consistently shown to be easier than within the same visual hemifield, at least when using simple low-level stimuli (i.e., discs, bars; Störmer, Alvarez, & Cavanagh, 2014; Alvarez, Gill, & Cavanagh, 2012; Awh & Pashler, 2000; Sereno & Kosslyn, 1991). These findings suggest that attentional resources might not be shared across visual hemifields, but instead each cortical hemisphere may have an independent pool of processing resources (Walter, Quigley, & Müller, 2014; Alvarez et al., 2012; Sereno & Kosslyn, 1991), consistent with the different hemifield advantage account (Sereno & Kosslyn, 1991), or as stated in the model of competitive content maps (Franconeri, Alvarez, & Cavanagh, 2013). In turn, this may have important implications for neural competition for attentional processing of more complex semantic stimuli, that is, emotional distracter images, when they are distributed across both visual hemifields.
Over the past decade, new insights have been obtained with respect to the neural competition for attentional resources with unattended visual emotional cues presented in foveal vision and spatially overlapped with task stimuli using steady-state visual evoked potentials (SSVEPs). The SSVEP, an electrophysiological marker of selective attention, is a brain response elicited by a periodically presented stimulus, with its main source generators in early visual areas (Norcia, Appelbaum, Ales, Cottereau, & Rossion, 2015). Three important properties of the SSVEP response render it powerful for directly examining modulatory effects of emotional distracters on attentional processing resources at early stages of perceptual processing. First, SSVEP amplitude increases significantly when the stimulus is attended as compared with when it is unattended (Andersen & Müller, 2010). Second, one can obtain a direct measure of attentional resource allocation to multiple simultaneously presented stimuli by “frequency-tagging” each stimulus at a specific rate and recording its unique SSVEP signal. Third, the SSVEP amplitude differs reliably when an emotional relative to a neutral stimulus is displayed (Schettino, Gundlach, & Müller, 2019; Wieser, Miskovic, & Keil, 2016; Keil et al., 2009, 2010, 2012). Notably, in our previous experiments utilizing frequency-tagging, the SSVEP elicited in response to the flickering foreground task stimuli was significantly reduced when an emotional compared with a neutral background image unexpectedly appeared, signifying a withdrawal of visual processing resources away from the primary task and toward affective image content in the background (Bekhtereva & Müller, 2017a; Müller & Gundlach, 2017; Deweese et al., 2016; Bekhtereva, Craddock, & Müller, 2015; Schönwald & Müller, 2014; Hindi Attar, Andersen, & Müller, 2010; Müller et al., 2008).
Thus, the competitive advantage of emotional relative to neutral visual cues came at the cost of concurrently presented information, supporting the view that visual processing capacity is limited, at least when distracting and task-relevant information shared the same spatial location.
The goal of this study was to provide a direct assessment of neural competition for attentional resources between a centrally displayed visual task and neutral and emotional distracters, which, as opposed to our previous experimental protocols, were presented spatially distributed across both visual hemifields. More specifically, we tested whether neural processing of emotional relative to neutral scenes, when presented as task-irrelevant in both visual hemifields, would result in a sensory gain in favor of affective content at the cost of processing of other simultaneously displayed stimuli. In the present design, visual scenes were shown in a rapid serial visual presentation (RSVP) in the left and right visual hemifield, with a new image displayed at every presentation cycle to elicit the SSVEP. This was done by analogy with our recent experiments indicating the SSVEP sensitivity to affective content in similar study protocols, namely, an increase or decrease in SSVEP amplitude when affective relative to neutral images were shown in an RSVP at the center of the screen (Bekhtereva, Pritschmann, Keil, & Müller, 2018; Bekhtereva & Müller, 2015). Here, neutral valence images were initially shown in both visual hemifields. At an unpredictable time point during each trial, the pictures in one visual hemifield switched from neutral to unpleasant content. The RSVP streams were displayed at 4 and 6 Hz (250 and ∼167 msec per image, respectively). Meanwhile, at fixation an RSVP of symbolic letters was presented, which changed at a 15-Hz rate (i.e., ∼67 msec per letter). Thus, the images in each hemifield and the central RSVP stream each elicited distinct SSVEP responses. The presentation frequencies were chosen based on our previous findings, indicating robust SSVEP amplitude modulations as a function of valence at these rates (Bekhtereva et al., 2018; Bekhtereva & Müller, 2015).
We hypothesized that if emotional content had a competitive advantage in neural competition for attentional resources, then sensory gain elicited by emotionally arousing relative to neutral distracter images would be reflected in a valence-dependent SSVEP amplitude modulation when unpleasant as compared with neutral RSVPs were presented in the periphery. Moreover, if visual processing as measured by SSVEPs were a strictly limited resource, which is shared between cortical hemispheres, then sensory modulation with emotional RSVPs would come at the expense of other stimulus processing. If such limited resource pool sharing were the case, then with a presentation of emotional RSVPs in one of the visual fields, we would in parallel expect greater costs in attentional resources (SSVEP reduction) dedicated to the processing of the concurrent task and in response to a neutral RSVP in the other visual hemifield. Alternatively, if separate attentional resource pools exist for each cortical hemisphere, sensory gain with affective distracters might occur independently for each visual hemifield, without interfering with the processing of the attentional task and the neutral image stream in the opposite visual hemifield.
Thirty-two individuals (27 women, 5 men), with a mean age of 23 years (SD = 4.87 years) and with normal or corrected visual acuity, took part in the study. The number of participants was sufficient to achieve power of .8 based on the smallest effect size from one of the critical tests obtained in our recent study (ηg2 = .15), with a similar RSVP protocol (Bekhtereva et al., 2018). The power analysis was calculated using G*Power software (Faul, Erdfelder, Lang, & Buchner, 2007).
All participants received information about the study's nature and provided their written informed consent before experimental recording. For participation, all participants received either credit points or financial compensation (€8/hr). The study was approved by the ethics committee of the University of Leipzig and conducted in accordance with the Code of Ethics of the World Medical Association.
Fourteen different Amharic characters constituted the task-relevant stimuli, each of which could be presented in red, green, blue, yellow, turquoise, or purple. As task-irrelevant distracters, 80 neutral and 80 unpleasant color picture scenes were selected from the International Affective Picture System (Lang, Bradley, & Cuthbert, 2008) and from the Emotional Picture Set databases (EmoPicS; Wessa et al., 2010).
All pictures were resized to 320 × 240 pixels using the MATLAB (The MathWorks) image processing toolbox. To ascertain similar luminance and contrast across neutral and unpleasant image categories, the mean (representative of the global luminance) and standard deviation (representative of root-mean-square contrast) of the luminance distribution of each picture were quantified on the intensity of pixels ranging from 0 to 1 (normalized RGB [red, green, blue] values, with a minimum value of 0 = black and a maximum value of 1 = white). A two-sample Welch t test did not show any statistically significant differences between unpleasant and neutral pictures for mean luminance, t(153.05) = −0.068, mean difference = −0.0004, 95% CI [−0.01 0.01], d = 0.01, p = .95, or contrast composition, t(153.85) = 0.41, mean difference = 0.001, 95% CI [−0.004 0.007], d = 0.07, p = .68.
Furthermore, unpleasant and neutral pictures were compared on the ratings of subjective image complexity (received from Andreas Keil, University of Florida). The two-sample Welch t test did not reveal any discernible differences between emotionally unpleasant and neutral categories, t(158) = 0.27, mean difference = 0.07, 95% CI [−0.46 0.61], d = 0.04, p = .79.
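The luminance and contrast measures described above can be illustrated with a minimal Python sketch. It is not the MATLAB code used in the study: the function names and the toy pixel values are hypothetical, the use of the population standard deviation for root-mean-square contrast is an assumption, and only the Welch t statistic and its degrees of freedom are computed (no p value).

```python
from statistics import mean, pstdev, variance

def luminance_stats(pixels):
    """Return (mean luminance, RMS contrast) for intensities scaled to 0..1.

    RMS contrast is taken here as the population SD of the pixel
    intensities (an assumption; the text only says "standard deviation").
    """
    return mean(pixels), pstdev(pixels)

def welch_t(a, b):
    """Welch's t statistic and Welch-Satterthwaite degrees of freedom."""
    va, vb = variance(a) / len(a), variance(b) / len(b)
    t = (mean(a) - mean(b)) / (va + vb) ** 0.5
    df = (va + vb) ** 2 / (va ** 2 / (len(a) - 1) + vb ** 2 / (len(b) - 1))
    return t, df

# Hypothetical toy "images" (flattened pixel intensities), not the real stimuli.
neutral = [[0.2, 0.4, 0.6, 0.8], [0.1, 0.3, 0.5, 0.7], [0.3, 0.5, 0.7, 0.9]]
unpleasant = [[0.0, 0.4, 0.6, 1.0], [0.2, 0.2, 0.6, 0.6], [0.4, 0.4, 0.8, 0.8]]

# One luminance value per picture, then compare the two categories.
neutral_lum = [luminance_stats(img)[0] for img in neutral]
unpleasant_lum = [luminance_stats(img)[0] for img in unpleasant]
t_lum, df_lum = welch_t(unpleasant_lum, neutral_lum)
```

The same `welch_t` call would be applied to the per-picture contrast values (the second element returned by `luminance_stats`).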
The task-relevant Amharic letters were presented as an RSVP at the center of a 19-in. computer screen set at a resolution of 1024 × 768 pixels against a black background, 16 bits per pixel color mode, and 60-Hz monitor refresh rate at a viewing distance of 80 cm. A white cross was presented centrally throughout the entire experiment to maintain fixation. Amharic letters were each presented overlaid on a gray square, with an average luminance of 35–45 cd/m², subtending 2.15° of visual angle vertically and horizontally. The RSVP letter stream was periodically displayed at a frequency of 15 Hz to elicit an SSVEP. Hence, every letter was displayed for four frames (∼67 msec).
Participants were instructed to detect the predefined blue Amharic symbol target in the central RSVP stream as accurately and quickly as possible by pressing the “space” bar on a standard “QWERTZ” keyboard, while ignoring all other letters. Simultaneously with the central RSVP, various images of neutral or unpleasant content were displayed in RSVP streams in the periphery, and participants were instructed to disregard them as task irrelevant.
At the beginning of each trial, a centrally presented cue was displayed for a random time interval of 1000–1500 msec. The cue constituted a blue Amharic letter, which served as a target throughout the experiment, whereas four other letters that had the same shape but different color were used as distractors (see Figure 1). All other Amharic letters served as standard stimuli. After the cue offset, stimulation began with the presentation of the RSVP stream of Amharic letters, superimposed with a white fixation cross (0.36° horizontal and 0.36° vertical visual angle). Overall, a sequence of 90 various letter characters was shown in random order for 6000 msec.
In parallel with the presentation of the RSVP letter stream, various images of neutral content were presented peripherally in RSVP streams, to the left and right of the fixation cross (6.62° of visual angle from the center of the fixation cross to the center of the respective image). Each image was shown for 15 or 10 frames of screen refresh (250 and ∼167 msec, respectively), corresponding to 4 and 6 Hz. Picture size in peripheral RSVP streams subtended 8.22° × 5.87° of visual angle, and picture luminance as measured against the screen background was between 20 and 50 cd/m². At a variable time point during a trial, the RSVP stream of neutral picture scenes could change to an RSVP of unpleasant images in either the left or right visual field, whereas the other RSVP always remained neutral (Figure 1). Changes in emotional valence were jittered, occurring randomly and only once during the trial at 2000, 3000, or 4000 msec after trial onset. This was done to counteract any anticipation effects for change in emotional valence in RSVP streams. During every trial, 4- and 6-Hz RSVPs were presented simultaneously, and the order of presentation was counterbalanced across left and right visual fields. Trial presentation duration of 6000 msec corresponded to 24 presentation cycles for 4 Hz and 36 presentation cycles for 6-Hz RSVP, respectively, with a new image displayed every cycle. Across the experiment, each neutral picture was shown 108 times, and each unpleasant image was displayed 36 times in each visual field. Neutral images needed to be repeated more often, given that an RSVP consisting of only neutral images was always presented to one visual hemifield. Furthermore, the images were shown in a randomized order, with the constraint that the same image could not appear twice consecutively. At the end of the RSVP presentation, only the black background with a white fixation cross was presented for an additional 600 msec.
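The timing figures reported for the three RSVP streams follow from the 60-Hz refresh rate and can be checked with a short Python sketch. The constant and function names are hypothetical, and the sketch assumes, for illustration only, that every stream starts at trial onset (0 msec).

```python
REFRESH_HZ = 60          # monitor refresh rate reported in the text
TRIAL_MS = 6000          # duration of one RSVP trial
FRAMES_PER_ITEM = {"letters": 4, "images_6hz": 10, "images_4hz": 15}
CHANGE_TIMES_MS = (2000, 3000, 4000)  # possible times of the valence change

def stream_timing(frames_per_item):
    """Return (rate in Hz, item duration in msec, items per trial)."""
    rate_hz = REFRESH_HZ / frames_per_item
    duration_ms = 1000 * frames_per_item / REFRESH_HZ
    items_per_trial = TRIAL_MS * REFRESH_HZ // (1000 * frames_per_item)
    return rate_hz, duration_ms, items_per_trial

def starts_new_item(time_ms, frames_per_item):
    """True if time_ms is an item onset, assuming the stream starts at 0 msec."""
    frame = time_ms * REFRESH_HZ // 1000
    return frame % frames_per_item == 0

for name, frames in FRAMES_PER_ITEM.items():
    rate, duration, count = stream_timing(frames)
    aligned = all(starts_new_item(t, frames) for t in CHANGE_TIMES_MS)
    print(f"{name}: {rate:g} Hz, {duration:.1f} msec/item, "
          f"{count} items, aligned={aligned}")
```

Under these assumptions, the sketch reproduces the 90 letters, 36 images (6 Hz), and 24 images (4 Hz) per trial, and shows that each possible change time falls on an item onset in all three streams.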
Overall, the experiment consisted of 384 trials subdivided into 12 blocks (32 trials per block), corresponding to four experimental conditions, with 96 trials per condition. Experimental conditions were as follows: (1–2) a change from neutral to unpleasant content occurred in the left visual hemifield with 4 (or 6) Hz RSVP, whereas only neutral distracter images were shown in the right visual hemifield at 6 (or 4) Hz RSVP; (3–4) a change from neutral to unpleasant content occurred in the right visual hemifield with 6 (or 4) Hz RSVP, whereas only neutral distracter images were shown in the left visual hemifield at 4 (or 6) Hz RSVP, respectively.
In each trial, during the first ∼533 msec following stimulus onset, no targets or distractors were presented; that time window served to establish a reliable SSVEP and was later discarded from data analysis (see EEG Recording and Analysis section). All targets and distractors were equally distributed across a time window before and after the jittered timing of the switch in emotional content for each experimental condition (i.e., at 2000, 3000, or 4000 msec) in the following way: (1) ∼533–2000 msec before and ∼2067–5467 msec after the change in content; (2) ∼533–3000 msec before and 3067–5467 msec after the change in content; and (3) ∼533–4000 msec before and 4067–5467 msec after the change in content. Thus, for each experimental condition and timing of the switch associated with it, a total of 10 targets and distractors occurred over the time window of ∼2.5 sec before and ∼2.5 sec after the switch in emotional content, respectively. Across the whole experiment, 240 targets and 240 distractors were presented in total in 50% of the trials. Thus, the other 50% of all trials did not contain any targets or distractors. Targets and distractors embedded in the RSVP stream of symbolic letters were visible for ∼67 msec, and their onsets were separated by at least 800 msec. No events (targets or distracters) were presented in the last ∼533 msec of the trial or immediately during the cycle in which the switch in emotional content occurred. Unlike in our previous experiments, in which a greater number of events were distributed uniformly at a much finer resolution to efficiently study the time course of behavioral costs of visual distraction with emotional stimuli (Bekhtereva & Müller, 2017a; Bekhtereva et al., 2015; Trauer, Andersen, Kotz, & Müller, 2012; Hindi Attar et al., 2010), here we opted for a more crude distribution of targets and distracters, given that the main purpose here was to ensure that participants paid attention to the task at the center of the screen.
All experimental conditions were presented randomized across trials, and participants could take a break after each block. Halfway through the experiment, the responding hand was switched, whereas the starting hand was counterbalanced across all participants. On the day before the EEG recording, participants performed a training session to familiarize themselves with the task. For this practice session, a different set of pictures was used. The presentation flow and timing were controlled using the Cogent toolbox for MATLAB (Cogent, www.vislab.ucl.ac.uk/Cogent/).
Upon completion of the EEG experiment, to ascertain that our preselected images were perceived by participants similarly to our categorization, participants viewed the pictures used in the experiment in randomized order and were asked to evaluate them on the dimensions of affective arousal and valence by means of the Self-Assessment Manikin (SAM) Scale, ranging from 1 (low arousal and unpleasant valence) to 9 (high arousal and pleasant valence; Bradley & Lang, 1994). The rating procedure began with the presentation of an image from the experiment that was briefly displayed for either 250 msec (one cycle of 4-Hz rate) or ∼167 msec (one cycle of 6-Hz rate) and was subsequently masked by its phase-distorted (meaningless) version displayed for the same duration (Bekhtereva et al., 2018). Following that, the SAM rating scale was displayed to collect a rating of arousal and valence for the respective picture. Responses were given with a numeric pad on the keyboard. Overall, the experimental set of images was presented twice, first for 250 msec and subsequently for ∼167 msec, with the order of presentation rate counterbalanced between participants.
EEG Recording and Analysis
Brain electrical activity was recorded using a BioSemi ActiveTwo system at a sampling rate of 512 Hz. Sixty-four Ag/AgCl scalp electrodes were mounted in an elastic cap according to the international 10–20 system (Jasper, 1958). During the recording, two electrodes were used as reference and ground electrodes (CMS [“Common Mode Sense”] and DRL [“Driven Right Leg”]; for details, see www.biosemi.com/faq/cms&drl.htm). We monitored vertical and lateral eye movements with four bipolar electrodes positioned above and below the right eye (vertical EOG) as well as on the outer canthi of each eye (horizontal EOG).
Because the presentation frequencies differed across the three RSVP streams, the onset of each stream was timed such that the three streams would be phase-synchronized at the time of change in emotional content. Thus, the onset of the first unpleasant image in the stream was simultaneous with the onset of an image in the stream that remained neutral as well as with the onset of an Amharic letter in the central stream. Epochs were extracted between 1500 msec before and 1500 msec after the change in content. All experimental trials, with and without events (targets and distracters), entered the analysis. First, linear trends were removed from the data, and an automatic procedure was then applied for every participant to detect epochs contaminated with artifacts by means of the “Statistical Control of Artifacts in Dense Array EEG/MEG Studies” (Junghöfer, Elbert, Tucker, & Rockstroh, 2000). Subsequently, all epochs were visually inspected for artifacts, particularly for nonstereotypical artifacts (e.g., electrode cable movements or extreme voltage jumps), and contaminated epochs were excluded. Following that, data were re-referenced to the average reference. In the next step, to correct for ocular and muscle artifacts, epochs were submitted to an independent component analysis (ICA; Delorme, Palmer, Onton, Oostenveld, & Makeig, 2012). The obtained ICA components were manually screened for components reflecting artifacts (i.e., showing typical topographies of eye artifacts, muscle noise, line noise), and in addition, the SASICA plugin for EEGLAB (Chaumon, Bishop, & Busch, 2015) was used to aid the judgment. Those components identified as artifactual were then pruned from the data. In a final step, trials were averaged for each participant and experimental condition. EEG data preprocessing and analyses were performed with custom-built MATLAB scripts and functions in the EEGLAB toolbox (Delorme & Makeig, 2004).
To determine electrodes with maximum SSVEP amplitudes for statistical analysis, we quantified topographical distributions of the 4, 6, and 15 Hz mean SSVEP amplitudes averaged across experimental conditions and participants by means of discrete Fourier transform, separately for central and peripheral (left/right visual field) locations (see Figure 2A–B). For the central 15 Hz letter-RSVP, SSVEP amplitudes were maximal at two parieto-occipital electrodes, O2 and PO8 (see Figure 2A). Although centrally presented, SSVEP amplitudes for the letter-RSVP exhibited a right cortical hemifield maximum, similar to a previous study in which we also presented that stream at the center of the screen (Hindi Attar & Müller, 2012). For 4- and 6-Hz RSVP image streams shown in the right visual field, SSVEP amplitudes were maximal at electrodes PO7 and O1 (left electrode cluster). For 4- and 6-Hz RSVP image streams shown in the left visual field, the SSVEP amplitudes were most pronounced at electrodes O2 and PO8 (right electrode cluster). Thus, subsequent analysis of SSVEP amplitudes for 15, 4, and 6 Hz was performed based on the averaged amplitudes of two electrodes selected from the respective electrode clusters for each frequency by means of a Fourier transform. Fourier analyses were performed on the time windows from −1500 to −500 msec before and from 500 to 1500 msec after the change in emotional content (or set time marker for neutral RSVP streams without a change in content). These time windows were chosen based on our previous experimental findings with regard to the time point of SSVEP emotional modulation in response to emotional distracter images that typically occurred at ∼500 msec after the onset of an emotional image (Bekhtereva & Müller, 2017a; Müller & Gundlach, 2017; Hindi Attar et al., 2010).
Similar to our earlier studies (Bekhtereva et al., 2018; Bekhtereva & Müller, 2015), we calculated a difference score as the SSVEP amplitude in the time window before the change in emotional content minus the amplitude in the time window after the change. Thus, a positive difference score corresponds to a decrease in SSVEP response in the time window after the change relative to the time window before the change to emotional (unpleasant) content in an RSVP stream.
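The amplitude extraction and the difference score can be sketched in Python as follows. This is an illustration only, not the MATLAB implementation used in the study: the function names and the synthetic sine-wave signals are hypothetical, and the amplitude is obtained from a single-bin discrete Fourier transform of one 1-sec window (512 samples at the 512-Hz sampling rate), in which 4, 6, and 15 Hz each complete an integer number of cycles.

```python
import cmath
import math

FS = 512  # sampling rate in Hz, as reported for the recording

def ssvep_amplitude(samples, freq_hz, fs=FS):
    """Amplitude of the Fourier component at freq_hz (single-bin DFT)."""
    n = len(samples)
    coeff = sum(x * cmath.exp(-2j * math.pi * freq_hz * k / fs)
                for k, x in enumerate(samples))
    return 2 * abs(coeff) / n

def difference_score(before, after, freq_hz):
    """Amplitude before minus amplitude after the change in content."""
    return ssvep_amplitude(before, freq_hz) - ssvep_amplitude(after, freq_hz)

def sine(amplitude, freq_hz, n=FS):
    """Synthetic test signal: a pure sine wave sampled at FS."""
    return [amplitude * math.sin(2 * math.pi * freq_hz * k / FS)
            for k in range(n)]

before = sine(1.0, 6)   # hypothetical 1-sec window before the change
after = sine(0.8, 6)    # hypothetical 1-sec window after the change
score = difference_score(before, after, 6)  # positive: amplitude decreased
```

In the actual analysis, the input would be the trial-averaged signal of each participant and condition, with amplitudes averaged over the two electrodes of the respective cluster.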
For statistical testing, a repeated-measures 2 × 3 × 2 ANOVA with within-subject factors of Location of Content Change (left hemifield/right hemifield), Recording Location (left/right/central), and Frequency Combination (4 and 6 Hz/6 and 4 Hz) was performed on the SSVEP difference scores (time window before minus after the change in emotional content; see above). Where necessary, to decompose significant interactions, we performed post hoc analyses with appropriate correction for multiple comparisons in R v3.4.1 (R Core Team, 2012) using the emmeans package (Lenth, 2018).
To test for the absence of a meaningful effect for a specific condition, equivalence tests were used for the relevant comparisons for SSVEP data. For this purpose, the two 1-sided tests (TOST) procedure, in which upper and lower equivalence bounds were determined based on the smallest effect size of interest, was used to statistically reject the presence of the effects large enough to be considered meaningful and worthwhile to examine (Lakens, 2017). To specify the lower and upper equivalence bounds, the TOST power analysis for a one-sample equivalence test was performed. The analysis indicated that the equivalence bounds, to achieve power of 80% for a sample size of n = 32 and alpha = .05, were [−0.52, 0.52] in Cohen's d (with d = 0.5 representing a “medium” effect size) and were further used for one-sample equivalence tests against zero (μ = 0). Overall, with 32 participants, the experiment had 80% power to detect equivalence with equivalence bounds of d = −0.52 and d = 0.52. Statistical equivalence was warranted when the greater of the two p values from the TOST was smaller than alpha = .05. All calculations were performed using the spreadsheet from Lakens (2017).
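The logic of the one-sample TOST against zero can be sketched in Python. This is not the spreadsheet from Lakens (2017): the function names are hypothetical, the Student t distribution function is approximated by numerical (Simpson) integration so that only the standard library is needed, and the equivalence bounds are expressed in Cohen's d as in the text.

```python
import math
from statistics import mean, stdev

def t_cdf(t, df, steps=2000):
    """Student t CDF via Simpson integration of the density (approximation)."""
    norm = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    pdf = lambda x: norm * (1 + x * x / df) ** (-(df + 1) / 2)
    h = abs(t) / steps                      # steps must be even for Simpson
    area = pdf(0) + pdf(abs(t))
    for i in range(1, steps):
        area += (4 if i % 2 else 2) * pdf(i * h)
    area *= h / 3                           # integral of the density from 0 to |t|
    return 0.5 + math.copysign(area, t)

def tost_one_sample(values, d_bound=0.52, alpha=0.05):
    """One-sample TOST against zero with bounds of -d_bound and +d_bound."""
    n = len(values)
    d = mean(values) / stdev(values)        # observed Cohen's d against zero
    t_lower = (d + d_bound) * math.sqrt(n)  # test against the lower bound
    t_upper = (d - d_bound) * math.sqrt(n)  # test against the upper bound
    p_lower = 1 - t_cdf(t_lower, n - 1)     # H0: effect <= lower bound
    p_upper = t_cdf(t_upper, n - 1)         # H0: effect >= upper bound
    p_tost = max(p_lower, p_upper)          # the greater of the two p values
    return p_tost, p_tost < alpha

# Hypothetical difference scores for 32 participants, centered on zero.
scores = [-1, 1] * 16
p_value, equivalent = tost_one_sample(scores)
```

Equivalence is declared only when both one-sided tests reject, which is why the greater of the two p values is compared with alpha.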
Behavioral Data and SAM Rating Analyses
Only button presses that occurred between 250 and 1000 msec following the onset of a target or distracter event were considered as hits or false alarms, respectively. Button presses that occurred later were considered as misses. d′ was calculated as a measure of sensitivity based on hits and false alarms (Macmillan & Creelman, 2005), and a log linear correction was applied to correct for extreme probabilities of 0% and 100%. Target detection rates, false alarms, d′ scores, as well as RTs were used for statistical analysis and analyzed using a 2 × 2 × 2 repeated-measures ANOVA with within-subject factors of Location of Content Change (left hemifield vs. right hemifield), Frequency Combination (4 and 6 Hz vs. 6 and 4 Hz), and Change Time (before vs. after).
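The response classification and the sensitivity measure can be illustrated with a brief Python sketch. The function names are hypothetical, the inclusive window boundaries are an assumption, and the correction shown is one common form of the log linear correction (adding 0.5 to each count and 1 to each number of events); the text does not specify which variant was applied.

```python
from statistics import NormalDist

def is_valid_response(rt_ms, lower=250, upper=1000):
    """True if a button press falls inside the response window (inclusive)."""
    return lower <= rt_ms <= upper

def d_prime(hits, n_targets, false_alarms, n_distractors):
    """d' = z(hit rate) - z(false alarm rate) with a log linear correction."""
    # Adding 0.5 to counts and 1 to event totals keeps both rates away from
    # 0 and 1, where the z transform would be infinite.
    hit_rate = (hits + 0.5) / (n_targets + 1)
    fa_rate = (false_alarms + 0.5) / (n_distractors + 1)
    z = NormalDist().inv_cdf
    return z(hit_rate) - z(fa_rate)

# Hypothetical counts: perfect performance still yields a finite d'.
sensitivity = d_prime(hits=10, n_targets=10, false_alarms=0, n_distractors=10)
```

Without the correction, the perfect-performance case in the example would produce hit and false alarm rates of 1 and 0 and therefore an undefined d′.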
Mean arousal and valence SAM ratings for images were analyzed by a 2 × 2 repeated-measures ANOVA with the factors of Emotion (unpleasant vs. neutral) and Picture Presentation Time (250 msec vs. 167 msec). To follow up significant interactions and explore differences between experimental conditions, we conducted post hoc t tests using the Holm-Bonferroni correction for multiple comparisons.
Statistical analyses were performed using R v3.4.1 (R Core Team, 2012). For data manipulation, visualization, and statistical tests, the following packages were used: tidyr v0.8.1, afex v0.21-2 (Singmann, Bolker, Westfall, & Aust, 2018), emmeans v1.2.3 (Lenth, 2018), stats v3.3.2, ez v4.4-0 (Lawrence, 2016), lsr v0.5 (Navarro, 2015), Rmisc v1.5, and ggplot2 v2.21 (Wickham, 2009). EEG topographical scalp maps were visualized using R package eegUtils 0.1.15.dev (Craddock; https://github.com/craddm/eegUtils; https://doi.org/10.5281/zenodo.1292901).
Generalized eta-squared (ηg2) and Cohen's d were calculated as measures of standardized effect size (Lakens, 2013; Baguley, 2012; Bakeman, 2005; Olejnik & Algina, 2003). To quantify the Cohen's d measure of effect size, the function cohensD from lsr v0.5 package (Navarro, 2015) was used. Additionally, unstandardized effect sizes and their 95% confidence intervals or standard errors are provided for most relevant comparisons. With repeated-measures ANOVA, Greenhouse–Geisser corrections were applied when the sphericity assumption was violated.
For subjective ratings of image valence, the 2 (Emotion) × 2 (Picture Presentation Time) repeated-measures ANOVA showed a significant main effect of Emotion, F(1, 31) = 430.68, p < .001, ηg2 = .88, with emotionally unpleasant relative to neutral scenes rated as more negative.
A main effect of Picture Presentation Time was also significant, F(1, 31) = 12.67, p = .001, ηg2 = .02, with overall slightly higher values for images displayed for ∼167 msec relative to 250 msec presentation time. These main effects were, however, further qualified by the presence of a significant interaction Emotion × Picture Presentation Time, F(1, 31) = 4.78, p = .036, ηg2 = .004.
The follow-up pairwise comparisons indicated that the valence ratings for neutral pictures were similar, regardless of their presentation time (mean difference = 0.05, 95% CI [−0.02 0.13]; p = .18, d = 0.25); by contrast, unpleasant pictures were perceived as slightly more negative when they were displayed for 250 msec relative to the same images shown for ∼167 msec (mean difference = −0.17, 95% CI [−0.27 −0.08], p = .001, d = 0.67), as depicted in Figure 3A.
Similarly, for arousal ratings, both the main effect of Picture Presentation Time, F(1, 31) = 5.57, p = .02, ηg2 = .007, and the main effect of Emotion, F(1, 31) = 183.71, p < .001, ηg2 = .62, were significant. Furthermore, there was a significant Emotion × Picture Presentation Time interaction, F(1, 31) = 5.05, p = .03, ηg2 = .005. Follow-up post hoc paired t tests revealed that neutral images were evaluated similarly on arousal, irrespective of their presentation time (mean difference = −0.03, 95% CI [−0.18 0.13], p = .74, d = 0.06), whereas emotional scenes had slightly higher arousal values when they were shown for 250 msec as compared with ∼167 msec (mean difference = 0.29, 95% CI [0.09 0.5], p = .01, d = 0.51; see Figure 3B).
Thus, neutral images were rated similarly on valence (Figure 3A) and arousal (Figure 3B) regardless of picture exposure time, whereas unpleasant pictures were perceived as slightly more negative and more arousing when they were displayed for 250 msec relative to ∼167 msec. This effect, however, was very small (mean differences: −0.17 for valence and 0.29 for arousal).
A repeated-measures 2 × 3 × 2 ANOVA with within-subject factors of Location of Content Change (left hemifield/right hemifield), Recording Location (left/right/central), and Frequency Combination (4 and 6 Hz/6 and 4 Hz) revealed no main effect of Frequency Combination, F(1, 31) = 1.21, p = .28, ηg2 = .003, or of Location of Content Change, F(1, 31) = 1.11, p = .3, ηg2 = .003. The Frequency Combination × Location of Content Change and Frequency Combination × Location of Content Change × Recording Location interactions were not significant (Fs < 1.9, ps > .18, ηg2 < .0001). The analysis did show a significant Frequency Combination × Recording Location interaction, F(1.79, 55.36) = 4.92, p = .01, ηg2 = .03; because this interaction is not relevant to our primary experimental question, we do not interpret it further here.
We observed a main effect of Recording Location, F(1.85, 57.38) = 3.95, p = .03, ηg2 = .02, reflecting an overall greater SSVEP amplitude modulation at right and left hemisphere electrodes than at the central location. Importantly, this effect was qualified by a significant Recording Location × Location of Content Change interaction, F(1.50, 46.57) = 8.54, p = .002, ηg2 = .04. As expected, the interaction reflected that the pattern of SSVEP amplitudes varied across the three electrode locations in accordance with the spatial location of the change from neutral to unpleasant images. We followed up this interaction by calculating the simple effects of Recording Location for each level of the factor Location of Content Change (see also Figure 4):
The follow-up contrasts revealed that when an RSVP stream changed from neutral to unpleasant content in the left visual hemifield, the SSVEP decreased significantly more in the hemisphere contralateral to the change in image valence than in the ipsilateral hemisphere (right electrode cluster vs. left electrode cluster, respectively: mean difference = 0.12 μV, SE = 0.04, t(123.94) = 2.78, p = .006, d = 0.39). The SSVEP also decreased more for the right hemisphere relative to the central location (mean difference = 0.12 μV, SE = 0.04, t(123.94) = 2.9, p = .004, d = 0.48) but did not differ between central and left hemisphere electrode locations (mean difference = 0.005 μV, SE = 0.04, t(123.94) = 0.12, p = .9, d = 0.03).
Similarly, as expected, when a change from neutral to unpleasant RSVP occurred in the right visual hemifield, the SSVEP amplitudes decreased more for the left (contralateral) than for the right (ipsilateral) hemisphere (mean difference = −0.13 μV, SE = 0.04, t(123.94) = −3.11, p = .002, d = 0.53); the SSVEP modulation also differed between central and left hemisphere electrode locations (mean difference = 0.14 μV, SE = 0.04, t(123.94) = 3.44, p < .001, d = 0.59), but not between central and right hemisphere electrode locations (mean difference = 0.01 μV, SE = 0.04, t(123.94) = 0.33, p = .74, d = 0.07).
Thus, SSVEP amplitude dropped substantially more in the hemisphere contralateral to the visual hemifield where task-irrelevant RSVP changed from neutral to emotional content. Critically, we were motivated to test whether the observed SSVEP amplitude modulation for the hemisphere contralateral to the visual hemifield with the change in emotional content was paralleled by (1) the reciprocal SSVEP modulation for the central stimuli and (2) the SSVEP amplitude modulation for the hemisphere contralateral to the simultaneous presentation of only neutral images.
Two 1-sample t tests revealed that the SSVEP response modulation to the task (i.e., central letter stream) did not statistically differ from zero when a change in emotional valence occurred in the left (M = 0.01, SD = 0.1, t(31) = 0.7, p = .48, d = 0.1) or right hemifield (M = −0.02, SD = 0.11, t(31) = −1.23, p = .23, d = −0.18). However, given that a nonsignificant result in the Null Hypothesis Significance Test does not provide evidence for the absence of a meaningful effect, we conducted two 1-sample equivalence tests against zero (μ = 0), to examine whether the SSVEP modulation at central location was close enough to zero to be practically equivalent (Seaman & Serlin, 1998).
When a neutral RSVP changed to an unpleasant one in the left hemifield, the TOST procedure indicated that the observed effect size (d = 0.1) for the 15-Hz SSVEP modulation was significantly within the equivalence bounds of d = −0.52 and d = 0.52 (or in raw scores: −0.05 and 0.05), t(31) = −2.38, p = .01, and thus, one can reject effects larger than d = 0.52 and conclude statistical equivalence within those margins. Similarly, when a change in emotional valence occurred in the right hemifield, the amplitude modulation with the observed effect size (d = −0.18) was also statistically within the equivalence bounds of ±0.52 (or in raw scores: −0.06 and 0.06), t(31) = 1.9, p = .03. Taken together, the equivalence tests statistically rejected the presence of meaningful effects (larger than d = 0.52), and thus, the task-related SSVEP amplitude modulations as a function of change in emotional content in the periphery were close enough to zero.
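The logic of the one-sample TOST procedure can be sketched in Python from summary statistics alone. The sketch below assumes that the equivalence bounds are defined in standardized units (d = ±0.52) and converted to raw units with the sample SD. The function names are hypothetical, and the t distribution is integrated numerically only to avoid external dependencies. The inputs are the rounded values reported above, so the output can only approximate the reported statistics.

```python
import math

def t_upper_tail(t, df, steps=2000):
    """P(T > t) for Student's t with df degrees of freedom (Simpson's rule)."""
    c = math.exp(math.lgamma((df + 1) / 2) - math.lgamma(df / 2)) / math.sqrt(df * math.pi)

    def pdf(x):
        return c * (1 + x * x / df) ** (-(df + 1) / 2)

    a = abs(t)
    h = a / steps
    area = pdf(0.0) + pdf(a)
    for i in range(1, steps):
        area += (4 if i % 2 else 2) * pdf(i * h)
    area *= h / 3            # integral of the density from 0 to |t|
    tail = 0.5 - area        # P(T > |t|)
    return tail if t >= 0 else 1.0 - tail

def tost_one_sample(mean, sd, n, d_bound, mu=0.0):
    """Two one-sided tests against mu -/+ d_bound * sd; returns the weaker test."""
    se = sd / math.sqrt(n)
    raw_bound = d_bound * sd                    # bound converted to raw units
    df = n - 1
    t_low = (mean - mu + raw_bound) / se        # H0: effect <= lower bound
    t_high = (mean - mu - raw_bound) / se       # H0: effect >= upper bound
    p_low = t_upper_tail(t_low, df)
    p_high = 1.0 - t_upper_tail(t_high, df)     # P(T < t_high)
    return (t_low, p_low) if p_low > p_high else (t_high, p_high)

# Rounded summary statistics for the 15-Hz modulation, change in the left hemifield
t_val, p_val = tost_one_sample(mean=0.01, sd=0.10, n=32, d_bound=0.52)
```

Equivalence is concluded when the larger of the two one-sided p values, which is the one returned here, falls below the chosen alpha level.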
Neutral RSVP stream
Two 1-sample t tests revealed that the SSVEP in response to the neutral RSVP in the left (M = −0.009, SD = 0.15, t(31) = −0.34, p = .74, d = −0.06) or right visual hemifield (M = 0.02, SD = 0.13, t(31) = 0.8, p = .43, d = 0.15) was not modulated (i.e., did not statistically differ from zero) when a change in emotional valence simultaneously occurred in the opposite visual hemifield. As above, we further calculated two 1-sample equivalence tests against zero (μ = 0) to ensure that these effects were close enough to zero to be practically equivalent. For the neutral RSVP in the left visual field (right hemisphere electrodes), the TOST procedure demonstrated that the observed effect size (d = −0.06) for the SSVEP difference score was significantly within the equivalence bounds of d = −0.52 and d = 0.52 (or in raw scores: −0.08 and 0.08), t(31) = 2.6, p = .007. In the same vein, for the neutral image stream in the right visual field (left hemisphere electrodes), the SSVEP difference score with the observed effect size (d = 0.15) was also statistically within the equivalence bounds of ±0.52 (or in raw scores: −0.07 and 0.07), t(31) = −2.07, p = .02. Thus, the equivalence tests statistically rejected the presence of effects larger than d = 0.52 and confirmed that, during valence change in one of the peripheral RSVPs, SSVEP modulations within the neutral RSVP in the opposite visual field were practically zero.
In Figure 4, the location of content change signifies in which visual hemifield (left or right) a task-irrelevant RSVP changed from neutral to unpleasant content. A change from neutral to unpleasant scenes in either the left or right visual hemifield resulted in a significant decrease in SSVEP amplitudes (positive difference values) in the contralateral hemisphere compared with that in the ipsilateral or central locations (indicated with **p < .01, ***p < .001 on the plot). In parallel with SSVEP amplitude modulation by affective content change, no amplitude modulation was observed for the task-related RSVP (two middle box plots), or neutral RSVP presentation in the opposite visual hemifield (the far left and the far right box plot), which was practically equivalent to zero.
Although target detection rates (82% on average) were slightly higher in the time window after the change in emotional content (main effect of Change Time: F(1, 31) = 5.85, mean difference = 3.75%, 95% CI [0.59 6.92], p = .02, ηg2 = .02; all other effects were nonsignificant: Fs < 2.6, ps > .1, ηg2 < .002), false alarms did not reveal any statistically significant main effects or interactions (Fs < 1.6, ps > .2, ηg2 < .004). Moreover, analysis of the sensitivity index d′ did not result in any reliable statistical differences between conditions (all Fs < 3.4, ps > .07, ηg2 < .007), including a main effect of change time, F(1, 31) = 2.23, p = .14, ηg2 = .004. This, therefore, suggests that a similar sensitivity criterion was used between the time windows before and after the change in emotional content, accounting for the slight bias in responding as a hit in the period after the change in valence.
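For readers less familiar with signal detection measures, the sketch below shows how the sensitivity index d′ and the response criterion c can be computed from hit and false-alarm counts. It assumes a log-linear correction (adding 0.5 to each cell) to keep rates away from 0 and 1; the text does not state whether any such correction was applied, and the function name and counts are hypothetical.

```python
from statistics import NormalDist

def sdt_measures(hits, misses, false_alarms, correct_rejections):
    """Return (d_prime, criterion) from response counts."""
    # Log-linear correction: an assumption of this sketch, not taken from the text
    hit_rate = (hits + 0.5) / (hits + misses + 1)
    fa_rate = (false_alarms + 0.5) / (false_alarms + correct_rejections + 1)
    z = NormalDist().inv_cdf                    # inverse of the standard normal CDF
    d_prime = z(hit_rate) - z(fa_rate)          # sensitivity
    criterion = -0.5 * (z(hit_rate) + z(fa_rate))  # response bias
    return d_prime, criterion

# Hypothetical counts for one participant in one time window
d_prime, criterion = sdt_measures(hits=90, misses=10, false_alarms=10,
                                  correct_rejections=90)
```

Because d′ combines hits and false alarms, a higher hit rate accompanied by a proportionally more liberal criterion can leave d′ unchanged, which is the pattern described above.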
This study investigated the impact of emotionally arousing distracter images shown in the periphery during concurrent processing of a centrally presented sustained visual detection task. First, we tested whether a change in emotional valence from neutral to emotionally unpleasant task-irrelevant RSVP streams in one of the visual hemifields would lead to sensory amplification in early visual cortex. Second, we examined whether a sensory gain with the unpleasant RSVP streams would come at the expense of processing of task-related stimuli or neutral RSVPs presented in the other visual hemifield. By frequency-tagging each RSVP stream, the present experimental design permitted us to independently record SSVEPs to the concurrently presented streams and directly measure the neural competition in early visual areas in the frequency domain while stimuli competed for attentional processing resources. In line with the notion of preferential processing of emotional information and the idea of independent resource capacities for each hemisphere (Franconeri et al., 2013; Sereno & Kosslyn, 1991), we have observed sensory gain for peripheral unpleasant distracters that was not accompanied by a reciprocal attentional resource withdrawal from the visual task or any trade-off cost effects for the simultaneously shown neutral stream in the opposite visual hemifield.
Consistent with the “motivated attention” account (Bradley et al., 2003; Lang, Bradley, & Cuthbert, 1997), according to which emotionally arousing stimuli activate neural circuits mediating their enhanced perceptual processing to maintain adaptive behavior, we observed a robust sensory gain in early visual cortex for highly arousing unpleasant relative to neutral distracter images, indexed by an SSVEP amplitude modulation, in line with previous studies (Schettino et al., 2019; Everaert, Koster, & Joormann, 2018; Leleu et al., 2018; Dzhelyova, Jacques, & Rossion, 2017; Schupp, Schmälzle, Flaisch, Weike, & Hamm, 2012). In particular, when image content switched from neutral to unpleasant valence in either the left or right visual hemifield, this change prompted a significant attenuation of SSVEP amplitude in the contralateral hemisphere. This amplitude modulation pattern fully replicates what we found in our recent experiments presenting affectively laden and neutral RSVPs at ∼6 Hz either as task irrelevant (Bekhtereva, Craddock, Gundlach, & Müller, 2019) or during passive viewing (Bekhtereva et al., 2018; Bekhtereva & Müller, 2015, 2017b). In those studies, contrary to the past observations of an SSVEP enhancement during passive viewing of a single flickering emotional versus neutral picture (Keil et al., 2009, 2012), we consistently found a robust decrease in SSVEP amplitudes for emotional as opposed to neutral RSVP streams presented at ∼6-Hz rates. It is important to note that this SSVEP amplitude modulation with affective picture streams was not driven by their low-level featural composition such as color or spatial frequencies (Bekhtereva & Müller, 2015).
As one of our experiments using a very similar RSVP design and images has previously demonstrated, no differences in SSVEP amplitudes were observed for phase-scrambled versions of neutral and unpleasant image streams, whose content was distorted but whose global properties (amplitude spectrum) were preserved.
Notably, our latest experiments have strongly indicated that such a “reversal” of the SSVEP amplitude modulation (neutral > emotional) in our RSVP design cannot be accounted for by preferential processing of neutral over emotional valence (Bekhtereva et al., 2018, 2019). It is also unlikely to be attributed to a fundamentally different processing mechanism at about 6 Hz. Rather, our simulations with linear modeling have shown that the SSVEP response could be driven by ERP superposition consisting of a linear concatenation of an ERP response to each image in a continuous RSVP stream (Bekhtereva et al., 2018; Capilla, Pazo-Alvarez, Darriba, Campo, & Gross, 2011). More specifically, the presentation of each individual image in an RSVP stream creates an ERP, the amplitude of which differs consistently in response to emotional relative to neutral scenes (Peyk, Schupp, Keil, Elbert, & Junghöfer, 2009). In turn, given the consistent differences in ERP waveforms between emotional and neutral images, a systematic linear superposition of these differential ERPs may potentially lead to amplitude patterns that either decrease (destructive interference between the successive ERP responses) or enhance (constructive interference) the power at the SSVEP frequency response (see Bekhtereva et al., 2018) and thereby result in a destructive interference for emotional images at a 6-Hz rate.
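The superposition idea can be illustrated with a toy simulation. It is not the simulation reported in the cited work, and all parameters are hypothetical. The sketch assumes an ERP made of two Gaussian components, an early one at 100 msec and a late one at 350 msec, with only the late component being larger for emotional images. One such ERP is added at each image onset, and the amplitude at the presentation rate is read from the final second of the summed signal.

```python
import cmath
import math

FS = 600  # sampling rate in Hz (assumed); divisible by both 4 and 6

def gaussian(t, amp, mu, sigma):
    return amp * math.exp(-((t - mu) ** 2) / (2 * sigma ** 2))

def erp_kernel(late_amp, duration=0.6):
    """Toy ERP: early component at 100 msec plus late component at 350 msec."""
    return [gaussian(i / FS, 1.0, 0.10, 0.02) + gaussian(i / FS, late_amp, 0.35, 0.04)
            for i in range(int(duration * FS))]

def ssvep_amplitude(kernel, rate_hz, seconds=4):
    """Superimpose one ERP per image onset; return the amplitude at rate_hz."""
    n_total = seconds * FS
    period = FS // rate_hz                      # samples between image onsets
    signal = [0.0] * n_total
    for onset in range(0, n_total, period):
        for i, value in enumerate(kernel):
            if onset + i < n_total:
                signal[onset + i] += value
    window = signal[-FS:]                       # last second: steady state
    coef = sum(v * cmath.exp(-2j * math.pi * rate_hz * k / FS)
               for k, v in enumerate(window))
    return 2 * abs(coef) / FS

neutral = erp_kernel(late_amp=0.5)      # hypothetical amplitudes
emotional = erp_kernel(late_amp=0.8)    # larger late component only
results = {rate: (ssvep_amplitude(neutral, rate), ssvep_amplitude(emotional, rate))
           for rate in (4, 6)}
```

With these assumed latencies, the two components are separated by 250 msec, which is one full cycle at 4 Hz and one and a half cycles at 6 Hz. The larger late component therefore adds in phase at 4 Hz (emotional > neutral) and in antiphase at 6 Hz (emotional < neutral), even though the underlying ERP difference is identical at both rates. Changing the component latencies changes which rates show constructive or destructive interference.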
Interestingly, the SSVEP amplitude here was attenuated with the presentation of unpleasant content across both 6- and 4-Hz RSVP rates. In our recent studies where emotional (pleasant or unpleasant) as opposed to neutral images were passively viewed or served as task-irrelevant distracters at fixation, the SSVEP response was found to be enhanced with a 4-Hz RSVP rate (250 msec per image) and attenuated at presentation rates of ∼6 Hz (∼167 msec per image; Bekhtereva et al., 2018, 2019). The discrepancy between the previous and present SSVEP modulation patterns, however, might be attributed to the ERP superposition mechanism as outlined above. As previous research indicates, spatial position or visual eccentricity can modulate ERP responses to emotionally arousing images (Bayle, Henaff, & Krolak-Salmon, 2009; De Cesarei et al., 2009; Rigoulot et al., 2008). For example, emotional modulation of ERPs with peripherally presented affectively laden scenes was observed with a delayed latency (at ∼120 msec) compared with the latency for centrally displayed images (Rigoulot et al., 2008). In a similar vein, affective amplitude modulations of early and late ERPs with pleasant, unpleasant, and neutral images were most pronounced at the center and attenuated with increasing eccentricity (De Cesarei et al., 2009). Thus, it is possible that peripheral presentation of emotional images in the present design, as opposed to foveally displayed stimuli in our latest studies, may have led to consistent differential modulations in latencies and amplitudes of valence-sensitive ERP deflections in response to each individual image in the peripheral RSVPs.
In turn, if the SSVEP response mirrors the temporal superposition of ERPs to each image in the stream as described above, these systematic differences in ERPs may ultimately contribute to differential valence-dependent SSVEP modulation with central as compared with peripheral RSVPs, thus potentially leading to destructive interference for peripherally displayed emotional images (emotional < neutral) in both 4- and 6-Hz SSVEP amplitudes as seen in the present experiment. Although it remains uncertain whether a perceptual gist or a more elaborate content identification of the images could be accomplished in this study, the observed valence-dependent SSVEP amplitude modulation is in good accord with previous literature indicating emotional encoding in peripheral vision sometimes with up to 30° eccentricity (Calvo, Gutiérrez-García, & del Líbano, 2015; Calvo, Rodríguez-Chinea, & Fernández-Martín, 2015; Rigoulot, D'Hondt, Honoré, & Sequeira, 2012; Rigoulot et al., 2011; Calvo & Avero, 2008; Calvo, Nummenmaa, & Hyönä, 2008). Moreover, the “reverse” pattern of emotional SSVEP modulations observed here agrees well with the recent findings of Campagnoli et al. (2019), who reported a consistent SSVEP response attenuation when the presentation of a neutral face (baseline) changed to an angry facial expression during the trial.
The second critical question of the current study concerned the competition for processing resources between peripheral neutral and emotional distracter images and a concurrent visual task. At odds with the account of one limited processing capacity or resource pool and our previous studies showing significant perceptual and behavioral interference costs with spatially overlapping task-related and emotional stimuli (Schönwald & Müller, 2014; Hindi Attar et al., 2010; Müller et al., 2008), here we found no evidence for resource sharing effects imposed by competing stimuli in the field of view. Specifically, the SSVEP amplitude modulation in response to unpleasant distracters in one of the visual hemifields was neither paralleled by a withdrawal of attention from the task nor mirrored by any cost effects for processing of a neutral image stream displayed in the other visual hemifield. This was indicated by the absence of SSVEP cost effects for those concurrently presented stimuli. One potential way to interpret the SSVEP modulation with unpleasant images accompanied by no discernible cost effects for other competing stimuli is that sensory gain for emotional content in the periphery may have arisen from independent resource pools that might not be shared across the two visual hemifields (Walter et al., 2014; Franconeri et al., 2013; Sereno & Kosslyn, 1991). As noted earlier, the current experimental design differed significantly from that used in our previous studies (Bekhtereva & Müller, 2017a; Deweese et al., 2016). Here task-relevant stimuli and distracting scenes were for the first time displayed separately across the field of view instead of being presented spatially overlapped, therefore likely creating different conditions for spatial competition among the stimuli (Fuchs, Andersen, Gruber, & Müller, 2008).
The idea of independent processing resources for each hemisphere, as one potential explanatory model for the observed findings, would fit well with a previous report from a study with socially anxious participants who passively viewed flickering emotional and neutral facial expressions assigned to each visual hemifield (Wieser, McTeague, & Keil, 2011). In that experiment, although a sensory gain as indexed by a larger SSVEP amplitude for angry relative to neutral faces was present, there was no evidence for interference with the processing of a concurrent competitor face flickering in the opposite visual hemifield. Thus, it is possible that images presented in the left and right visual fields may tap into separate pools of attentional resources, effectively increasing the resources available to process competing visual objects. However, our study was not designed to test the different hemifield advantage account with emotional and neutral images. Future work should therefore directly investigate this hypothesis in an experimental paradigm that goes beyond the current study protocol by explicitly manipulating the number of target and distractor stimuli presented within as compared with between visual hemifields (Störmer et al., 2014; Walter et al., 2014). Furthermore, the visual eccentricity of emotional images and task difficulty (perceptual load) are known to affect attentional capture with emotional material and, therefore, should be experimentally manipulated in future research to test whether these factors may additionally contribute to sensory gain effects with emotional distracters in similar RSVP paradigms (De Cesarei et al., 2009; Pessoa et al., 2002; Lavie, 1995).
The lack of processing costs at early visual areas was also paralleled in the behavioral data. With an average target accuracy of ∼80%, we did not observe any statistically reliable emotional distraction costs in target detection rate or sensitivity index. Additionally, to ensure that unpleasant and neutral experimental images were indeed perceived as such, we obtained subjective image ratings of valence and arousal from the participants. The ratings corroborated that unpleasant images were perceived as more negative and arousing than neutral images, even when presented for only ∼167 or 250 msec and then masked. Similar to our earlier results (Bekhtereva et al., 2019), image presentation time had a slight impact on picture ratings, somewhat intensifying the subjective perception of negative valence and arousal for unpleasant images when they were flashed for 250 msec as compared with ∼167 msec. That presentation time effect on the ratings for unpleasant images was, however, very small (differences in mean values were 0.17 for valence and 0.29 for arousal) and, therefore, negligible. Together, our findings provide further support for the rapid extraction of affective content from naturalistic scenes (Schettino et al., 2019; Codispoti, Mazzetti, & Bradley, 2009; De Cesarei et al., 2009).
In conclusion, the present experiment provides direct electrocortical assessment of neural competition for attentional resources between peripherally presented neutral and emotional distracter scenes and a centrally displayed visual detection task. First, our findings indicate a significant perceptual bias in response to task-irrelevant unpleasant relative to neutral images as reflected in the valence-dependent SSVEP amplitude modulation. This is consistent with the account that an SSVEP modulation by affective content at low-tier visual areas may result from sustained reentrant feedback from higher order cortical areas that code for semantic image content (Norcia et al., 2015; Keil et al., 2009). Second, at odds with the notion of a single limited processing capacity, we found no evidence for attentional resource sharing between the task stimuli and visual distracters at either the early perceptual or the behavioral level. Instead, our data support the idea of independent resource pools for left and right visual hemifields, which has been consistently found with low-level visual stimuli but has not yet been shown with complex images presented in both visual fields. Future research, however, should explicitly test the different hemifield advantage hypothesis of separate processing resource pools for visual images presented within as opposed to between visual hemifields, using the SSVEP to quantify the cortical engagement associated with simultaneously presented stimuli during emotional perception.
The study was supported by Deutsche Forschungsgemeinschaft [MU972/22-2]. The authors thank Renate Zahn for her support in the data acquisition and Christopher Gundlach for his help with data analyses.
Reprint requests should be sent to Matthias M. Müller, Institute of Psychology, University of Leipzig, Neumarkt 9-19, 04109 Leipzig, Germany, or via e-mail: email@example.com.
IAPS numbers of neutral pictures: 1122, 1350, 1645, 1945, 2036, 2037, 2102, 2191, 2206, 2221, 2235, 2272, 2273, 2357, 2377, 2393, 2396, 2435, 2445, 2525, 2560, 2745, 2749, 2840, 2850, 5120, 5130, 5201, 5530, 5535, 7001, 7010, 7011, 7026, 7030, 7038, 7041, 7062, 7081, 7130, 7136, 7160, 7161, 7165, 7300, 7491, 7495, 7512, 7513, 7546, 7547, 7550, 7560, 7595, 7632, 8010, 8090, 8232, 8250, 8325, 8371, 8620, 9210. EmoPicS numbers of neutral pictures: 119, 121, 123, 124, 126, 127, 128, 135, 139, 141, 148, 161, 162, 176, 191, 352, 375.
IAPS numbers of unpleasant pictures: 1111, 1113, 1200, 1202, 1220, 1300, 2661, 2683, 2691, 2703, 2710, 2730, 2981, 3001, 3019, 3064, 3103, 3110, 3150, 3190, 3195, 3212, 3213, 3230, 3250, 3261, 3350, 3500, 3530, 6021, 6210, 6313, 6550, 6560, 8230, 9002, 9008, 9031, 9040, 9042, 9075, 9140, 9163, 9181, 9250, 9254, 9300, 9342, 9410, 9420, 9433, 9471, 9495, 9570, 9571, 9590, 9594, 9596, 9600, 9623, 9635, 9810, 9902, 9920, 9930, 9940. EmoPicS numbers of unpleasant pictures: 216, 232, 233, 234, 235, 236, 240, 241, 243, 248, 321, 325, 326, 327.