Abstract
Although it is well established that self-related information can rapidly capture our attention and bias cognitive functioning, whether this self-bias can affect language processing remains largely unknown. In addition, there is an ongoing debate as to the functional independence of language processes, notably regarding the syntactic domain. Hence, this study investigated the influence of self-related content on syntactic speech processing. Participants listened to sentences that could contain morphosyntactic anomalies while the masked face identity (self, friend, or unknown faces) was presented for 16 msec preceding the critical word. The language-related ERP components (left anterior negativity [LAN] and P600) appeared for all identity conditions. However, the largest LAN effect followed by a reduced P600 effect was observed for self-faces, whereas a larger LAN with no reduction of the P600 was found for friend faces compared with unknown faces. These data suggest that both early and late syntactic processes can be modulated by self-related content. In addition, alpha power was more suppressed over the left inferior frontal gyrus only when self-faces appeared before the critical word. This may reflect higher semantic demands concomitant to early syntactic operations (around 150–550 msec). Our data also provide further evidence of self-specific response, as reflected by the N250 component. Collectively, our results suggest that identity-related information is rapidly decoded from facial stimuli and may impact core linguistic processes, supporting an interactive view of syntactic processing. This study provides evidence that the self-reference effect can be extended to syntactic processing.
INTRODUCTION
It is well-established that self-referential information can rapidly capture our attention, such as hearing one's own name in a crowded setting, as illustrated by the classic “cocktail party effect” (Moray, 1959). Over the past few years, an increasing body of literature has shown how self-related information is prioritized during different stages of processing, thus biasing different cognitive processes such as perception (Sui, Liu, Mevorach, & Humphreys, 2015), memory (Tanguay et al., 2018), attention (Macrae, Visokomogilski, Golubickis, & Sahraie, 2018), emotion (Herbert, Pauli, & Herbert, 2011), and decision-making (Sui, He, Golubickis, Svensson, & Neil Macrae, 2023). However, less is known as to whether this self-bias may be extended to language processing. Thus, this study investigated the possible interaction between self-reference and language processing.
Self-reference Effect and Cognitive Processing
Current research in psychology and cognitive neuroscience suggests that the self is based on a unique mental representation, both biologically (Qin, Wang, & Northoff, 2020) and socio-culturally shaped (Markus & Kitayama, 2010). Neuroimaging studies have observed that self-referential processing is supported by cortical midline structures, including the medial pFC and the posterior cingulate cortex (Northoff et al., 2006), along with the insula and temporoparietal junction (Qin et al., 2020). From a cognitive perspective, self-representation operates as an integrative mechanism between upcoming information of the social environment and different stages of information processing (Sui & Humphreys, 2015). Accordingly, self-related stimuli, in contrast to non-self-related stimuli, are preferentially processed because of a combination of bottom–up and top–down operations that act as a filter, increasing the emotional salience of the stimuli (Sui & Rothstein, 2019). This view is supported by electrophysiological studies on face recognition that have shown that self and personally familiar faces can capture attention preferentially (e.g., Rubianes et al., 2021; Alzueta, Melcón, Poch, & Capilla, 2019). Furthermore, behavioral studies have described that recognition or recall task performance is more accurate and faster when the stimuli are self-related (e.g., Scheller & Sui, 2022). This phenomenon is often referred to as the self-reference or self-prioritization effect and has generated considerable research interest aside from the psycholinguistic field (Rogers, Kuiper, & Kirker, 1977; for a review, see Cunningham & Turk, 2017).
Of relevance to our aims, ERPs and neural oscillations provide fine-grained details about the temporal course of brain activity. Prior studies on facial identity processing have suggested that the earliest access to self-face recognition in long-term memory (LTM) occurs around 250 msec after the stimulus onset (as reflected by the N250 component), followed by a subsequent late positivity (P3 component), presumably related to higher engagement of attentional resources to self-related stimuli (e.g., Rubianes et al., 2021; Alzueta et al., 2019). Other studies have reported greater activation for self-faces than for other identities at early stages of visual perception, reflected by the N170 component (e.g., Geng, Zhang, Li, Tao, & Xu, 2012; Keyes, Brady, Reilly, & Foxe, 2010). A growing number of studies show that self-reference can be elicited automatically, without explicit awareness and even when the self-related stimulus is displayed as a task-irrelevant distractor (Bola, Paź, Doradzińska, & Nowicka, 2021; Wójcik, Nowicka, Kotlewska, & Nowicka, 2018). Additionally, it is still debated to what extent personally familiar faces benefit from this prioritized processing or whether it is self-specific, as both face identities can activate pre-established representations in LTM (Caharel & Rossion, 2021; Schweinberger & Neumann, 2016; Olivares, Iglesias, Saavedra, Trujillo-Barreto, & Valdés-Sosa, 2015).
Extralinguistic Processes Affecting Language Comprehension
Language processing requires the integration of different subprocesses to effectively comprehend utterances, including acoustic-phonological, syntactic, and semantic processes (Jackendoff & Audring, 2020). An outstanding question in psycholinguistic research is how and when the language processing system integrates different sources of extralinguistic and linguistic inputs (Münster & Knoeferle, 2018; Hagoort, 2017). As for the temporal course of syntactic and semantic processing, three ERP components have been primarily described in the literature: left anterior negativity (LAN), N400, and P600. These components are typically obtained when comparing incorrect, violating, or unexpected material (i.e., words) with the correct one. When an anomaly is detected, the brain circuits underlying the type of process that has been violated (e.g., syntactic, semantic) are presumably boosted, yielding visible electrical fluctuations (ERP components). Electrophysiological responses to correct material are then used as a control and subtracted to better isolate these specific language-related fluctuations (e.g., Urbach & Kutas, 2018; Molinaro, Barber, & Carreiras, 2011). In this regard, sentences with syntactic anomalies (e.g., gender or number violations), compared with syntactically correct sentences, typically elicit a LAN around 300–500 msec after the critical word onset (for a recent review, see Maran, Friederici, & Zaccarella, 2022). Thus, the LAN component is thought to reflect the early detection of a morphosyntactic mismatch, and its amplitude is usually linked to the difficulty of morphosyntactic integration based on the agreement relations for structure-building (Friederici, 2017) or verbal working memory operations (Martín-Loeches, Muñoz, Casado, Melcón, & Fernández-Frías, 2005; Kolk, Chwilla, van Herten, & Oor, 2003).
The LAN probably originates in the left inferior frontal gyrus (IFG; Brodmann's area [BA] 44) and the posterior superior temporal gyrus (STG)/middle temporal gyrus (Herrmann, Maess, Hasting, & Friederici, 2009; Friederici, Wang, Herrmann, Maess, & Oertel, 2000), which underlie the syntactic network (Matchin & Hickok, 2020; Friederici, 2017). In turn, the N400 component, a negative fluctuation normally emerging after semantic incongruences and peaking around 400 msec, is generally considered an index of semantic processing, and its amplitude can be sensitive to a wide range of semantic manipulations in a given context (Kutas & Federmeier, 2011; for an alternative view, see Bornkessel-Schlesewsky & Schlesewsky, 2019). The semantic network is supported by different portions of the left IFG (BA45/47), the angular gyrus, the anterior temporal lobe, and the STG/middle temporal gyrus (Friederici, 2017; Hagoort, 2017; Kutas & Federmeier, 2011). Subsequently, the P600, a positive component customarily appearing after syntactic anomalies, but also following semantic ones, and starting around 500 msec after stimulus onset, is often associated with a later stage of reanalysis/repair processes of the sentence structure by integrating different linguistic and nonlinguistic inputs (Aurnhammer, Delogu, Brouwer, & Crocker, 2023; Leckey & Federmeier, 2020; for an alternative view, see Sassenhagen & Fiebach, 2019). The posterior part of the STG and the superior temporal sulcus probably underlie the P600 (Friederici, 2017).
Although a growing number of studies show how social and emotional information afforded by the visual context is rapidly integrated into sentence-level semantic processing, as evidenced by N400 effects for the speaker's facial information (Maquate, Kissler, & Knoeferle, 2023; Hernández-Gutiérrez et al., 2021), it remains unclear whether first-pass syntactic processing may also be influenced by such extralinguistic information. In this regard, LAN and P600 modulations have been observed for social and emotional information elicited by mood (Verhees, Chwilla, Tromp, & Vissers, 2015), speaker's identity (Xu, Abdel Rahman, & Sommer, 2022), social presence (Hinchcliffe et al., 2020), or emotional words both unmasked (Espuny et al., 2018; Martín-Loeches et al., 2012) and masked (Jiménez-Ortega et al., 2021; Jiménez-Ortega, Espuny, de Tejada, Vargas-Rivero, & Martín-Loeches, 2017). This body of evidence supports the view that syntactic, semantic, and contextual inputs interact directly during the early stages of linguistic processing (e.g., Pulvermüller, Shtyrov, & Hauk, 2009; McClelland, St. John, & Taraban, 1989). However, other studies have reported that the LAN is unaffected by emotional content (e.g., Padrón, Fraga, & Acuña-Fariña, 2020; Fraga, Padrón, Acuña-Fariña, & Díaz-Lago, 2017) or by other processes that recruit attentional resources or increase arousal (Hohlfeld, Martín-Loeches, & Sommer, 2019), favoring the traditional view of encapsulated syntactic processing. This view considers syntax as a module blind to other cognitive processes (e.g., Hauser, Chomsky, & Fitch, 2002; Ferreira & Clifton, 1986).
In the frame of this debate, it appears of interest to study whether the self-reference effect may affect the LAN component, particularly considering that some ERP fluctuations related to this effect, such as the N250 component, overlap or even precede the linguistic component. On the interplay between language processing and self-related content, previous studies have observed that self-relevant scenarios can modulate language processing while participants read two-sentence social vignettes that could contain different emotional valences, regardless of syntactic or semantic violations (Fields & Kuperberg, 2012, 2016). For instance, Fields and Kuperberg (2016) reported an interaction between self-relevance and emotion; particularly, they observed a larger late positive component to emotional words versus neutral words in the self-relevant scenarios. These results suggest that self-reference and emotion may interact during written language processing.
In addition to ERP components, neural oscillations can provide valuable insights into the neural dynamics underlying a broad range of cognitive functions (Meyer, 2018). It is noteworthy that alpha-band activity (8–13 Hz) has been proposed to be involved in neural information processing (Jensen & Mazaheri, 2010). Specifically, alpha activity would reflect functional inhibition of task-irrelevant brain regions, whereas its reduction or suppression in task-relevant regions would reflect the release of these regions, allowing cognitive resources to be allocated according to task demands (Klimesch, 2018). In line with this, Drijvers, Özyürek, and Jensen (2018a, 2018b) observed that alpha activity was more suppressed over the IFG, along with motor and visual cortices, when visual semantic information (i.e., iconic gestures) mismatched the speech, probably reflecting a larger semantic unification load over these task-relevant brain regions. Intriguingly, Alzueta, Melcón, Jensen, and Capilla (2020) found that alpha and beta power decreased when perceiving self-faces relative to familiar and unknown faces. Thus, larger alpha suppression over task-relevant brain regions (i.e., the language processing network) could be expected for self-faces and personally familiar faces.
The Present Study
Although a growing body of research is highlighting the role of social and emotional information on language processing, whether syntactic processes may be sensitive to self-related content remains unexplored, especially when it comes to the early stages of syntactic parsing (as reflected by the LAN component). Hence, this study investigated whether syntactic processing, including the processing of morphosyntactic anomalies, can be affected by self-reference under masked conditions. The self-related stimuli were manipulated through face identity, that is, self, friend, and unknown faces. Facial identities were masked, as self and personally familiar faces convey substantial social and emotional content that can draw attention to salient information even automatically (Ramon & Gobbini, 2017; Sui et al., 2015). In this manner, the unnatural conscious perception of one's own face in a communicative context was avoided while testing the effects of these processes. The paradigm was adapted from previous works (Hernández-Gutiérrez et al., 2021; Jiménez-Ortega et al., 2021; Rubianes et al., 2021).
We hypothesized that the LAN and P600 components would be modulated as a function of masked self-related information (self <> friend <> unknown faces). In this regard, there appear to be two possibilities. In one, a larger LAN followed by a reduced P600 would reflect that early syntactic parsing is influenced by self and familiar faces. This biphasic pattern has been previously reported when emotion-laden words affected morphosyntactic mismatches (e.g., Espuny et al., 2018; Jiménez-Ortega et al., 2017). A second possibility is that the LAN may be reduced or even vanish because of the capture of processing resources by self-related information, a finding that could be compatible with prior results for social and emotional stimuli (e.g., Hinchcliffe et al., 2020; Martín-Loeches et al., 2012). On the contrary, if syntactic processing is modular and encapsulated from other cognitive processes, we would not expect an effect on either the LAN or the P600 components by self-related information. Regarding alpha modulations, we expected a larger alpha suppression over task-relevant brain regions, such as the IFG or temporal regions, driven by self and personally familiar faces.
Our design enabled us to further investigate the neural correlates of face identity processing when perceptual awareness levels are substantially reduced, irrespective of task demands. In line with previous research on both subliminal and supraliminal face processing, we expected that the N250 (instead of the N170) would be larger for self-faces compared with other identities (e.g., Bola et al., 2021; Rubianes et al., 2021; Alzueta et al., 2019).
METHODS
Participants
Thirty-six participants (24 women; mean age = 23.24 years, SD = 4.78) with no history of neurological or cognitive disorders and with self-reported normal or corrected-to-normal vision took part in the study. On the basis of the effect size of prior work (Hernández-Gutiérrez et al., 2023; η2 = .06; effect size f = .25), a power analysis using G*Power software (Faul, Erdfelder, Buchner, & Lang, 2009) indicated that the required total sample size was n = 28 participants (alpha level of .05, power of .80). Thus, 36 participants were recorded so that all stimuli were counterbalanced: sentence structure (3), voice type (2), correctness (2), and face identity (3) (3 × 2 × 2 × 3 = 36). No participants were excluded from the sample. According to the Edinburgh Handedness Inventory (Oldfield, 1971), all participants were right-handed (mean = +88; range = +72 to +100). The study was conducted following the international ethical protocol for human research (Helsinki Declaration of the World Medical Association) and approved by the Ethics Committee of the Faculty of Psychology of the Complutense University of Madrid.
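The two figures reported above (f = .25 from η2 = .06, and 36 counterbalancing cells) can be reproduced with a short sketch. It assumes the standard conversion f = sqrt(η2 / (1 − η2)); the level labels in `factors` are placeholders introduced here for illustration and are not taken from the study materials.

```python
import math
from itertools import product


def eta_squared_to_f(eta_sq: float) -> float:
    """Convert eta squared to Cohen's f using f = sqrt(eta^2 / (1 - eta^2))."""
    return math.sqrt(eta_sq / (1.0 - eta_sq))


# Counterbalancing factors as listed in the text.
# The level names are hypothetical placeholders.
factors = {
    "sentence_structure": ["structure_1", "structure_2", "structure_3"],
    "voice_type": ["voice_type_a", "voice_type_b"],
    "correctness": ["correct", "incorrect"],
    "face_identity": ["self", "friend", "unknown"],
}

# Every combination of factor levels defines one counterbalancing cell.
combinations = list(product(*factors.values()))

effect_size_f = round(eta_squared_to_f(0.06), 2)
print(effect_size_f)      # 0.25
print(len(combinations))  # 36
```

The required sample size itself (n = 28) comes from G*Power and is not recomputed here.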
Design and Stimuli
The study used a within-subject 3 × 2 design in which face identity (3: self, close friend, and unknown) and correctness (2: correct and incorrect sentences) were manipulated.
The auditory linguistic material consisted of 240 Spanish sentences with three different structures validated from previous research (Hernández-Gutiérrez et al., 2021, 2023). Depending on the sentence structure, there could be a mismatch between the noun and the adjective (Structure 1) or between the determiner and the noun (Structures 2 and 3) in terms of number or gender agreement. The set of sentences was spoken with neutral prosody by four different voices (two female and two male). The length of critical words varied between two and five syllables. Psycholinguistic variables such as word frequency, concreteness, imageability, familiarity, and emotional content were controlled across all conditions (for more details, see the supplementary information in Hernández-Gutiérrez et al., 2021). Some examples of the linguistic material are provided in Table 1 (critical words are highlighted in bold).
Structure | Correctness | Example (Spanish with English translation)
---|---|---
Structure 1 (n = 300): [Det]-[N]-[Adj]-[V]-[Prep]-[N] | Correct | 1. a. El pañuelo[Masc/Sing] bordado[Masc/Sing] era de mi abuela.
 | | 1. a. The embroidered[Masc/Sing] cushion[Masc/Sing] belonged to my grandmother.
 | Incorrect | 1. b. El pañuelo[Masc/Sing] bordada[Fem/Sing] era de mi abuela.
 | | 1. b. The embroidered[Fem/Sing] cushion[Masc/Sing] belonged to my grandmother.
Structure 2 (n = 45): [Det]-[N]-[V]-[Det]-[N]-[Adj] | Correct | 2. a. Los turistas habían fotografiado los[Masc/Plur] glaciares[Masc/Plur] árticos.
 | | 2. a. The tourists had photographed the[Masc/Plur] arctic glaciers[Masc/Plur].
 | Incorrect | 2. b. Los turistas habían fotografiado los[Masc/Plur] glaciar[Masc/Sing] árticos.
 | | 2. b. The tourists had photographed the[Masc/Plur] arctic glacier[Masc/Sing].
Structure 3 (n = 45): [Det]-[N]-[V]-[Prep]-[Det]-[N]-[Prep]-[Det]-[N] | Correct | 3. a. Las hojas son recogidas durante el[Masc/Sing] otoño[Masc/Sing] por los barrenderos.
 | | 3. a. The leaves are picked by the sweepers during the[Masc/Sing] autumn[Masc/Sing].
 | Incorrect | 3. b. Las hojas son recogidas durante el[Masc/Sing] otoños[Masc/Plur] por los barrenderos.
 | | 3. b. The leaves are picked by the sweepers during the[Masc/Sing] autumns[Masc/Plur].
Critical words are marked in bold. Note that the noun–adjective order in Spanish is reversed compared with English.
The participants listened to sentences that could contain morphosyntactic anomalies while viewing a scrambled face. The scrambled version was created by using 30 × 40 matrices in Adobe Photoshop. This control stimulus keeps low-level visual features (i.e., pictorial encoding) intact while preventing the identification of facial features (i.e., structural encoding). The face corresponding to each identity was presented for 16 msec, masked by the scrambled face. The facial stimuli were obtained before the experiment from a set of three different photographs of the participant and another set of three different photographs of his/her close friend. These pictures showed a direct gaze and, as far as possible, a neutral facial expression. For the unknown condition, the photographs were taken from the close-friend photos of other participants. This procedure was successfully employed in previous work (Rubianes et al., 2021). All facial stimuli were processed in Adobe Photoshop with the aim of normalizing several parameters (grayscale, black background, contrast, luminance, and facial proportions). At the end of the experiment, all participants confirmed that they did not know the person shown in the unknown condition. In total, each participant was presented with 240 sentences (half of them incorrect) and nine faces masked by the scrambled stimulus (three different photos for each identity). Accordingly, the number of trials for each experimental condition (3 face identity × 2 correctness = 6 conditions) was 40 (240 sentences / 6 = 40 trials).
Procedure
As shown in Figure 1, at the beginning of each trial, the scrambled stimulus appeared in the center of the screen, and the audio presentation started 500 msec later. The face identity was presented for 16 msec, and the previous scrambled stimulus appeared again until the end of the sentence. After 1 sec, the alternatives (correct or incorrect) were presented on each side of the screen (left or right). The critical word emerged 16 msec after the presentation of the face identity. Participants were informed that they would hear sentences while seeing the visual stimulus at the center of the screen and that they would have to judge the correctness of each sentence by pressing one of two buttons on a response box (using either the index or middle fingers). Both presentations of the alternatives on each side of the screen and the response hand were counterbalanced across participants.
Following the EEG recording session, participants carried out a visibility task to test their awareness of the masked faces. This task consisted of 40 trials that followed the same trial procedure as the EEG session, except that participants were asked to report to the experimenter whether they detected anything beyond the visual (scrambled) stimulus and, if so, what they saw. This task is a subjective measure of visibility (Ramsøy & Overgaard, 2004), and it has been employed successfully in previous work using masked adjectives (Jiménez-Ortega et al., 2017, 2021). According to our visibility task, 16 participants reported detecting the shape of a face, but none of them (nor any of the remaining participants) reported being able to recognize the face identity. Indeed, all participants were surprised when it was explained to them that they had seen facial stimuli corresponding to themselves and their friends during the EEG experiment.
EEG Recordings and Analysis
Continuous EEG was recorded using 59 scalp electrodes (EasyCap; Brain Products) following the International 10–20 system. EEG data were recorded with a BrainAmp DC amplifier at a sampling rate of 250 Hz with a band-pass from 0.01 to 100 Hz. All scalp electrodes plus the left mastoid were referenced to the right mastoid during the EEG recording and then rereferenced offline to the average of the right and left mastoids. The impedance of all electrodes was kept below 5 kΩ. The ground electrode was located at AFz. Eye movements were monitored using two vertical (VEOG) and two horizontal (HEOG) electrodes placed above and below the left eye and on the outer canthi of both eyes, respectively.
The EEG was preprocessed using the Brain Vision Analyzer (Brain Products) software. The raw data were filtered offline with a band-pass of 0.1–30 Hz and subsequently segmented into 1200-msec epochs starting 216 msec before the onset of the critical word. Baseline correction was applied from −216 to −16 msec. Because the trigger marks the onset of the critical word, the baseline window ends at −16 msec so that the effect of interest is time-locked to the face presentation (which is followed by the onset of the critical word 16 msec later). Both incorrect and omitted responses were excluded from the analyses. Common artifacts (eye movements or muscle activity) were corrected through infomax independent component analysis (Bell & Sejnowski, 1995). Trials exceeding a threshold of 100 microvolts (μV) in any of the channels were semi-automatically rejected. The mean numbers of valid segments for each condition were as follows: correct sentences for self (M = 29.89; 95% CI [28.45, 31.33]), friend (M = 31.75; 95% CI [30.27, 33.23]), and unknown faces (M = 30.11; 95% CI [28.29, 31.94]), and incorrect sentences for self (M = 26; 95% CI [24.06, 27.94]), friend (M = 28.78; 95% CI [26.93, 30.62]), and unknown faces (M = 29.14; 95% CI [27.07, 31.20]). Finally, separate averages were performed for each condition for the ERP analysis.
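As an illustration of the epoch and baseline timing described above, the following sketch converts the reported latencies into sample counts at 250 Hz and applies baseline correction to a single-channel epoch. The actual preprocessing was done in Brain Vision Analyzer; the function names and the toy epoch here are hypothetical.

```python
SAMPLING_RATE_HZ = 250
MS_PER_SAMPLE = 1000 / SAMPLING_RATE_HZ  # 4 msec per sample at 250 Hz


def ms_to_samples(duration_ms: float) -> int:
    """Convert a duration in msec to a number of samples."""
    return round(duration_ms / MS_PER_SAMPLE)


def baseline_correct(epoch, epoch_start_ms=-216, base_start_ms=-216, base_end_ms=-16):
    """Subtract the mean of the baseline window from every sample.

    `epoch` is a list of voltages for one channel; latencies are given
    relative to critical-word onset (0 msec), as in the text.
    """
    first = ms_to_samples(base_start_ms - epoch_start_ms)
    last = ms_to_samples(base_end_ms - epoch_start_ms)  # exclusive end index
    baseline = epoch[first:last]
    baseline_mean = sum(baseline) / len(baseline)
    return [value - baseline_mean for value in epoch]


n_epoch_samples = ms_to_samples(1200)               # 300 samples per epoch
n_baseline_samples = ms_to_samples(-16 - (-216))    # 50 baseline samples

# Toy epoch: a constant offset of 2 in the baseline, 5 afterwards.
toy_epoch = [2.0] * n_baseline_samples + [5.0] * (n_epoch_samples - n_baseline_samples)
corrected = baseline_correct(toy_epoch)
print(n_epoch_samples, n_baseline_samples, corrected[0], corrected[-1])
```

After correction, the baseline samples of the toy epoch sit at zero and the later samples keep only their difference from the baseline mean.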
The preprocessed EEG data were exported to FieldTrip (Oostenveld, Fries, Maris, & Schoffelen, 2011), an open-source toolbox for MATLAB (R2021b, Mathworks Inc.), for further analyses, namely, time–frequency analysis, source analysis, and cluster-based permutation tests.
Time–Frequency and Source Reconstruction
To compute the oscillatory dynamics contained in the EEG signals, time–frequency representations (TFRs) of spectral power were computed over frequencies ranging from 2 to 30 Hz (steps of 1 Hz). The time window selected was −216 to 1200 msec (steps of 4 msec) with a Hanning window of 354 msec (Mitra & Pesaran, 1999). The average power spectrum over all participants and all channels was then computed for each condition.
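To make the idea of a Hanning-tapered power estimate concrete, the sketch below computes spectral power at a single frequency for one 354-msec segment. It is a simplified stand-in, not the FieldTrip routine used in the study: the normalization by the taper energy is an assumption made here, and the 10-Hz test signal is synthetic.

```python
import cmath
import math


def hann_window(n_samples: int):
    """Symmetric Hann (Hanning) taper of length n_samples."""
    return [0.5 - 0.5 * math.cos(2 * math.pi * i / (n_samples - 1))
            for i in range(n_samples)]


def tapered_power(segment, freq_hz: float, fs_hz: float) -> float:
    """Power of a Hann-tapered segment at one frequency (single-bin DFT).

    Dividing by the taper energy is an illustrative normalization choice.
    """
    taper = hann_window(len(segment))
    coefficient = sum(
        x * w * cmath.exp(-2j * math.pi * freq_hz * i / fs_hz)
        for i, (x, w) in enumerate(zip(segment, taper))
    )
    return abs(coefficient) ** 2 / sum(w * w for w in taper)


fs = 250.0                 # sampling rate reported in the text
n = int(0.354 * fs)        # samples in a 354-msec window
# Synthetic 10-Hz oscillation standing in for alpha-band activity.
segment = [math.sin(2 * math.pi * 10.0 * i / fs) for i in range(n)]

power_10 = tapered_power(segment, 10.0, fs)
power_20 = tapered_power(segment, 20.0, fs)
print(power_10 > power_20)
```

A full TFR would repeat this estimate for each frequency from 2 to 30 Hz while sliding the window along the epoch in 4-msec steps, that is, one sample at a time at 250 Hz.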
To reconstruct the neural sources of the effects observed in the TFR, a spatial beamforming technique was applied, namely, dynamic imaging of coherent sources (Gross et al., 2001). For each condition, this algorithm builds a spatial filter from the cross-spectral density (CSD) matrix to estimate coherent brain regions, thus providing the time courses of their activity. The brain is divided into a regular three-dimensional grid, and the source power at each grid point (voxel) is computed. To generate the forward model, the lead field matrix was computed based on an EEG head model template (boundary element method) and a 5-mm-spaced grid (source model) defined in the coordinate system of the Montreal Neurological Institute (MNI) template brain. Following the oscillatory activity observed in the alpha band, the CSD matrix was computed at 10 Hz from 0 to 1000 msec, as employed in previous studies (e.g., Drijvers et al., 2018a). Thus, the source analysis for each condition was calculated using both CSD and lead field matrices and a common spatial filter containing the covariance of all the conditions to project the data through. The output of source-level statistics was interpolated onto an anatomical MNI brain template.
Cluster-based Permutation Tests
Nonparametric statistics and cluster-based permutation tests were conducted to statistically evaluate the data obtained from the ERP, time–frequency, and source analysis using functions implemented in Fieldtrip (Maris & Oostenveld, 2007). The significance probability is computed from the permutation distribution using the Monte Carlo method and the cluster-based test statistic (for a review, see Sassenhagen & Draschkow, 2019). The permutation distribution was formed by randomly reassigning the values corresponding to each condition across all participants 8000 times. If the p value for each cluster (computed under the permutation distribution of the maximum cluster-level statistic) was smaller than the critical alpha level (.05), it was considered that the two experimental conditions were significantly different. This statistical test was applied to evaluate the difference between experimental conditions in our design: 3 face identity (self, close-friend, and unknown) × 2 correctness (correct and incorrect).
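The logic of the cluster-based permutation test can be sketched for a simplified case: a single channel, a paired (within-subject) contrast, and clusters formed over adjacent time points only. This is not the FieldTrip implementation. The fixed cluster-forming threshold of 2.0, the sign-flipping scheme, the `+1` correction in the p value, and the synthetic data are all assumptions made for illustration.

```python
import math
import random


def t_values(diffs):
    """Paired t statistic per time point; diffs[s][t] = condition A - B for subject s."""
    n = len(diffs)
    result = []
    for t in range(len(diffs[0])):
        column = [subject[t] for subject in diffs]
        mean = sum(column) / n
        variance = sum((v - mean) ** 2 for v in column) / (n - 1)
        result.append(mean / math.sqrt(variance / n) if variance > 0 else 0.0)
    return result


def max_cluster_mass(tvals, threshold):
    """Largest |sum of t| over runs of adjacent, same-sign suprathreshold points."""
    best, run, sign = 0.0, 0.0, 0
    for t in tvals:
        s = (t > threshold) - (t < -threshold)
        if s != 0 and s == sign:
            run += t              # extend the current cluster
        elif s != 0:
            run, sign = t, s      # start a new cluster
        else:
            run, sign = 0.0, 0    # below threshold: close the cluster
        best = max(best, abs(run))
    return best


def cluster_permutation_p(diffs, threshold=2.0, n_perm=1000, seed=0):
    """Monte Carlo p value for the maximum cluster mass.

    Swapping condition labels within a subject is equivalent to flipping
    the sign of that subject's difference wave.
    """
    rng = random.Random(seed)
    observed = max_cluster_mass(t_values(diffs), threshold)
    exceed = 0
    for _ in range(n_perm):
        flipped = []
        for subject in diffs:
            sign = rng.choice((-1, 1))
            flipped.append([sign * v for v in subject])
        if max_cluster_mass(t_values(flipped), threshold) >= observed:
            exceed += 1
    return observed, (exceed + 1) / (n_perm + 1)


# Synthetic data: 12 subjects, 20 time points, an effect at samples 8-12.
data_rng = random.Random(1)
diffs = [[data_rng.gauss(0.0, 1.0) + (2.0 if 8 <= t < 13 else 0.0) for t in range(20)]
         for _ in range(12)]
observed, p_value = cluster_permutation_p(diffs, n_perm=500)
print(p_value < 0.05)
```

Because only the maximum cluster statistic of each permutation enters the null distribution, the test controls the family-wise error rate across all time points without a separate correction per sample.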
When testing early face-related components, a priori time windows including all channels were analyzed based on previous literature (e.g., Rubianes et al., 2021; Alzueta et al., 2019; Olivares et al., 2015): 100–200 msec for the N170 component and 200–300 msec for the N250 component. Cluster-based permutation tests can also statistically assess whether there are differences between conditions based on prior information as to when and where to expect an effect (Meyer, Lamers, Kayhan, Hunnius, & Oostenveld, 2021). Because our linked mastoids reference could have attenuated the effects for the N170 component (Joyce & Rossion, 2005), the data were rereferenced offline to the average of all scalp channels only for the statistical analysis of this component. To calculate the interaction effects, the mean difference between incorrect and correct sentences was computed for each face identity (self, friend, and unknown), including the whole time window and all channels; these differences were then contrasted between identities, with the critical alpha level corrected for multiple comparisons (.05/3 = .016). For the rest of the analyses (including time–frequency and source analysis), all channels and the whole time window were included. Both effect size (Cohen's d; Cohen, 1988) and mean difference were calculated based on the mean of the channels and the latency reported by the cluster permutation tests to estimate the magnitude of the effects in the data.
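The two summary computations mentioned above can be illustrated as follows. The corrected alpha is the plain Bonferroni quotient, which evaluates to approximately .0167 (given as .016 in the text). For Cohen's d, the text does not state which variant was used; the sketch assumes the paired-samples form (mean difference divided by the standard deviation of the differences), and the amplitude values are invented for the example.

```python
import math


def bonferroni_alpha(alpha: float, n_comparisons: int) -> float:
    """Critical alpha after Bonferroni correction."""
    return alpha / n_comparisons


def cohens_d_paired(condition_a, condition_b) -> float:
    """Cohen's d for paired samples: mean difference / SD of the differences.

    This variant is an assumption; other definitions of d exist.
    """
    diffs = [a - b for a, b in zip(condition_a, condition_b)]
    n = len(diffs)
    mean = sum(diffs) / n
    sd = math.sqrt(sum((d - mean) ** 2 for d in diffs) / (n - 1))
    return mean / sd


corrected_alpha = bonferroni_alpha(0.05, 3)
print(round(corrected_alpha, 4))  # 0.0167

# Hypothetical per-participant mean amplitudes (in microvolts) for two conditions.
incorrect = [1.0, 2.0, 3.0, 4.0]
correct = [0.0, 0.0, 2.0, 2.0]
effect_size = cohens_d_paired(incorrect, correct)
print(effect_size)
```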
RESULTS
Behavioral Results
Repeated-measures ANOVAs were performed on both RTs and response accuracy, including the factors Face Identity and Correctness. Accuracy was measured as the percentage of trials in which participants successfully judged whether a sentence was syntactically correct or incorrect (see Table 2). The analysis revealed a significant main effect of Correctness, F(1, 35) = 29.172; p < .001; ηp2 = .455, showing that participants responded more accurately to correct sentences than to incorrect sentences (Δ = 5.185 ± 0.960%; p < .001). However, both the main effect of Face Identity, F(2, 70) = 1.139; p = .326; ηp2 = .032, and the interaction effect between Face Identity and Correctness, F(2, 70) = .541; p = .585; ηp2 = .015, were nonsignificant. Similarly, the ANOVA for RTs revealed a significant main effect of Correctness, F(1, 35) = 15.301; p < .001; ηp2 = .304, indicating that participants responded faster to incorrect sentences than to correct ones (Δ = −0.025 ± 0.006 sec; p < .001). Again, neither the main effect of Face Identity nor the interaction between Face Identity and Correctness was significant, F(2, 70) = 3.201; p = .063; ηp2 = .084; F(2, 70) = .157; p = .855; ηp2 = .004, respectively.
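As a consistency check on the statistics above, partial eta squared can be recovered from each F value and its degrees of freedom using the standard identity ηp2 = (F × df_effect) / (F × df_effect + df_error). The sketch below applies it to the six reported tests; the function name is introduced here for illustration.

```python
def partial_eta_squared(f_value: float, df_effect: int, df_error: int) -> float:
    """Partial eta squared from an F statistic and its degrees of freedom."""
    return (f_value * df_effect) / (f_value * df_effect + df_error)


# (F, df_effect, df_error, reported partial eta squared), as given in the text.
reported_tests = [
    (29.172, 1, 35, 0.455),  # accuracy: Correctness
    (1.139, 2, 70, 0.032),   # accuracy: Face Identity
    (0.541, 2, 70, 0.015),   # accuracy: interaction
    (15.301, 1, 35, 0.304),  # RTs: Correctness
    (3.201, 2, 70, 0.084),   # RTs: Face Identity
    (0.157, 2, 70, 0.004),   # RTs: interaction
]

recomputed = [round(partial_eta_squared(f, df1, df2), 3)
              for f, df1, df2, _ in reported_tests]
print(recomputed)
```

All six recomputed values match the reported effect sizes to three decimal places.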
Measure, mean (SD) | Self: Correct | Self: Incorrect | Friend: Correct | Friend: Incorrect | Unknown: Correct | Unknown: Incorrect
---|---|---|---|---|---|---
Accuracy (%) | 94.93 (7.01) | 89.10 (10.18) | 95.83 (4.43) | 90.56 (9.51) | 94.86 (6.73) | 90.42 (8.81)
RTs (msec) | 379 (18) | 356 (15) | 389 (19) | 364 (17) | 396 (0.21) | 368 (17)
Face Perception-related Components
Correct and incorrect sentences were collapsed when testing the main effects of Face Identity. The cluster permutation tests yielded no differences for the N170 component. In contrast, the analysis of the N250 component revealed a significant difference when self-faces were presented compared with friend faces (negative cluster: p = .008; Δ = −0.293 μV; d = .291) and unknown faces (negative cluster: p = .038; Δ = −0.169 μV; d = .187), whereas the difference between friend and unknown faces did not reach statistical significance (positive cluster: p = .057; Δ = .156 μV; d = .18). The significant differences emerged at approximately 220–280 msec and were more pronounced over parieto-occipital sites (as shown in Figure 2).
Language-related Components
The cluster permutation tests revealed a significant difference between incorrect and correct sentences for each face identity. These effects, in line with the topographic distribution and the latency range shown in Figure 3, were associated with the LAN component when the self (negative cluster: p < .001; Δ = −1.67 μV; d = −0.86), friend (negative cluster: p < .001; Δ = −.78 μV; d = −0.33), and unknown faces were displayed (negative cluster: p = .041; Δ = −.487 μV; d = −0.25). Remarkably, this LAN effect appeared as a long-lasting negativity for self-faces (approximately 240–1020 msec) compared with friend faces (approximately 210–790 msec) and unknown faces (approximately 540–600 msec). Subsequently, a P600 component was observed for the self (positive cluster: p = .036; Δ = 1.41 μV; d = .42), friend (positive cluster: p < .001; Δ = 2.30 μV; d = .86), and unknown faces (positive cluster: p < .001; Δ = 2.34 μV; d = .83).
After identifying both LAN and P600 components for each face identity, an additional analysis was performed to examine whether the LAN and P600 effects differ between face identities (p < .016 corrected for multiple comparisons). Notably, after subtracting correct from incorrect sentences for each identity, the analyses of the LAN effect showed a significant difference for self-faces compared with friend (negative cluster: p = .002; Δ = −1.35 μV; d = −0.52) and unknown faces (negative cluster: p = .006; Δ = −1.84 μV; d = −0.69), as well as between friend and unknown faces (negative cluster: p < .012; Δ = −.72 μV; d = −0.31). When testing for the P600 effects, the analyses revealed significant differences between friend and self faces (positive cluster: p = .003; Δ = 0.85 μV; d = .30) and between unknown and self faces (positive cluster: p < .001; Δ = 1.05 μV; d = .36), whereas no differences were found between friend and unknown faces (positive cluster: p = .17; Δ = .30 μV; d = .11). Taken together, a larger LAN effect followed by a smaller P600 effect was found only for self-faces in contrast to friend and unknown faces, whereas a larger LAN with no reduction of the P600 was observed for friend faces than for unknown faces.
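The interaction contrasts above compare violation effects (incorrect minus correct) between identities against a Bonferroni-corrected threshold. The sketch below illustrates that arithmetic under stated assumptions: the amplitudes are hypothetical values for five participants, not data from this study, and Cohen's d is computed in its paired form (d_z, the mean difference divided by the SD of the differences), which is only one of several variants and may not be the one the authors used. Note that .05/3 is approximately .0167, which the text reports as .016.

```python
from math import sqrt


def bonferroni_alpha(alpha, n_comparisons):
    """Critical alpha after Bonferroni correction for n_comparisons tests."""
    return alpha / n_comparisons


def violation_effect(incorrect, correct):
    """Per-participant violation effect: incorrect minus correct amplitude."""
    return [i - c for i, c in zip(incorrect, correct)]


def cohens_d_paired(effect_a, effect_b):
    """Return the mean paired difference and Cohen's d_z (mean difference / SD of differences)."""
    diffs = [a - b for a, b in zip(effect_a, effect_b)]
    n = len(diffs)
    mean_diff = sum(diffs) / n
    sd_diff = sqrt(sum((d - mean_diff) ** 2 for d in diffs) / (n - 1))
    return mean_diff, mean_diff / sd_diff


# Hypothetical mean amplitudes in microvolts within a LAN window (not data from the study).
self_incorrect = [-2.1, -1.8, -2.6, -1.2, -2.0]
self_correct = [-0.4, -0.3, -0.9, 0.1, -0.5]
friend_incorrect = [-1.2, -0.9, -1.5, -0.6, -1.1]
friend_correct = [-0.5, -0.2, -0.8, 0.0, -0.3]

lan_self = violation_effect(self_incorrect, self_correct)
lan_friend = violation_effect(friend_incorrect, friend_correct)
mean_diff, d_z = cohens_d_paired(lan_self, lan_friend)

print(f"corrected alpha = {bonferroni_alpha(0.05, 3):.4f}")
print(f"self minus friend LAN effect = {mean_diff:.2f} microvolts, d_z = {d_z:.2f}")
```

A negative mean difference here would correspond to a larger (more negative) LAN effect for self-faces than for friend faces, which is the direction reported in the text.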
Alpha Oscillations and Source-level Results
Regarding alpha modulations, the cluster permutation tests showed a significant effect between incorrect and correct sentences only for self-faces over frontocentral sites (negative cluster: p < .001; Δ = −.99; d = −0.41), whereas the contrasts for the friend and unknown faces were nonsignificant (negative cluster: p = .605; Δ = −.18; d = −0.06; negative cluster: p = .121; Δ = −.59; d = −0.22, respectively). Following the same procedure for calculating the interaction effects of both LAN and P600 components (subtracting incorrect vs. correct sentences for each identity at the alpha band), the analyses indicated a significant difference for self-faces compared with the friend (negative cluster: p < .001; Δ = −0.94; d = −0.33) and unknown faces (negative cluster: p = .01; Δ = −0.68; d = −0.35).
When estimating the brain sources related to alpha-band modulations, the whole time window and all scalp channels were included as input for the source analysis. The cluster permutation test at the source level revealed a significant difference only for the self-faces when comparing incorrect versus correct sentences (negative cluster: p = .046). After interpolating the output of this contrast onto a structural MRI (as shown in Figure 4D), the highest peak was found in the left IFG, particularly in BA 47 (p = .011). Hence, when participants had to judge the sentences in terms of syntactic correctness, a stronger alpha suppression was found over the left IFG only when the masked self-faces preceded the critical word.
DISCUSSION
We investigated whether syntactic speech processing can be affected by self-related information by visually presenting masked faces (corresponding to self, friend, or unknown identities). Whereas the largest LAN followed by a reduced P600 effect was found for self-faces, a larger LAN with no reduction of the P600 was found for friend faces, as compared with unknown faces. Our results also showed that the LAN exhibited a frontocentral distribution for self and friend faces, whereas it was mainly left-lateralized for unknown faces. Furthermore, a larger alpha suppression over the left IFG was observed only for self-faces when contrasting syntactic correctness. The possible mechanisms underlying these effects will be discussed below. Collectively, our findings indicate that syntactic processing, even at early stages, may be affected by self-related information without explicit awareness. These data provide further evidence for an interactive view of syntactic language processing, which contrasts with the traditional encapsulated view of syntax (Jiménez-Ortega et al., 2021; Münster & Knoeferle, 2018; Lucchese, Hanna, Autenrieb, Miller, & Pulvermüller, 2017; Pulvermüller et al., 2009).
Our design also enables us to test the neural correlates of face identity processing under reduced levels of awareness. Our data showed that the N170 component was insensitive to the facial identity, whereas the N250 component was the earliest neural marker discriminating self-faces from other identities, in accordance with previous studies using unmasked presentations (e.g., Rubianes et al., 2021; Miyakoshi, Kanayama, Iidaka, & Ohira, 2010). These data provide evidence for the ongoing debate about whether early components related to face perception may reflect self-prioritization over familiarity (Caharel & Rossion, 2021; Schweinberger & Neumann, 2016; Olivares et al., 2015). As such, our results suggest that self-face processing is driven by automatic prioritization mechanisms when accessing LTM representations after facial structural coding, being elicited even under reduced levels of awareness and regardless of task demands.
Behavioral Findings
We found no significant effect of face identity on participants' accuracy. A possible explanation is that the advantage of the self-reference effect on accuracy may be diminished when the self-related stimuli are not explicitly presented. In this regard, previous studies have reported improved accuracy for self-related information (Scheller & Sui, 2022; Macrae et al., 2018; Keyes & Dlugokencka, 2014), whereas other studies using implicit or masked paradigms have failed to find this advantage (Bola et al., 2021; Yaoi, Osaka, & Osaka, 2021; Geng et al., 2012). As for RTs, our results showed that participants responded faster to incorrect than to correct sentences. This result is consistent with previous language studies (e.g., Hernández-Gutiérrez et al., 2023; Hinchcliffe et al., 2020). However, no significant effects were observed involving face identity as a factor. This result contrasts with prior studies on face recognition that observed shorter RTs for self-faces (e.g., Geng et al., 2012). This discrepancy may arise because the response window in our study was set after the end of the sentence, meaning that participants had to wait to provide their response, which may have reduced the differences between conditions.
Face Identity under Reduced Levels of Awareness
According to models of face perception, first-order facial processing involves structural encoding of face configuration after low-level visual analysis, whereas second-order facial processing consists of recognizing face identity through access to face recognition units in LTM (Schweinberger & Neumann, 2016; Gobbini & Haxby, 2007; Bruce & Young, 1986). Our results are consistent with the proposal that the earliest neural marker for self-face recognition occurs during second-order facial processing, as reflected by the N250 component (Rubianes et al., 2021; Woźniak, Kourtis, & Knoblich, 2018; Miyakoshi et al., 2010; Tanaka, Curran, Porterfield, & Collins, 2006). Thus, the pattern observed for self-faces may indicate facilitated access to facial recognition units in LTM (Muñoz et al., 2022; Olivares et al., 2015). In turn, the pattern observed for the N170 is in line with previous studies showing an automatic mechanism detecting face configuration regardless of face identity (e.g., Rubianes et al., 2021; Alzueta et al., 2019; Kotlewska & Nowicka, 2015). Hence, the paradigm used in this study replicates previous work using explicit attentional contexts.
A growing body of evidence is showing that the self-reference effect could emerge automatically by capturing our attention under reduced levels of awareness (i.e., subliminally). When presenting subliminal self-faces as a task-irrelevant stimulus, previous research using a dot-probe task found negative components over parieto-occipital regions related to self-face processing (200–300 msec after stimulus onset), thus biasing attentional mechanisms during the early stages of processing (Bola et al., 2021; Wójcik et al., 2018). Collectively, our data are also consistent with the notion that self-related information can be prioritized at this stage without explicit perceptual awareness, relying on bottom–up mechanisms (low-level attentional capture) and activation of pre-established representation in LTM (Bola et al., 2021; Alzueta et al., 2020; Sui & Rothstein, 2019). Considering the trend observed when comparing familiar and unknown faces during the N250 window (p = .057), further studies are needed to determine to what extent personally familiar faces may also share such preferential access when the levels of perceptual awareness are limited.
On the Interplay between Self-reference and Syntactic Language Processing
One of the main findings of this study is that early syntactic computations can be modulated by self and familiar faces. This result is reflected by long-lasting negativity to morphosyntactic violations—interpreted as a LAN component—only for self and friend faces, along with a frontocentral distribution of this typically left-sided component. This pattern could reflect both initial morphosyntactic operations and the access in parallel to person-related information of self and familiar faces. In this regard, an increased allocation of cognitive resources may occur during first-pass syntactic parsing in the presence of self-relevant information. This interpretation is consistent with long-lasting effects previously reported for self-relevant content, engaging more cognitive resources when processed consciously and unconsciously (Rubianes et al., 2021; Wójcik et al., 2018; Fields & Kuperberg, 2016). On the other hand, prior research has also observed a central distribution of the LAN component because of a higher load on language working memory processes (Martín-Loeches et al., 2005; Kolk et al., 2003). In addition, other studies have found a centroparietal distribution of the LAN component as a result of shifting the processing strategy for solving morphosyntactic violations—toward a heuristic processing style instead of the algorithmic and rule-based strategy (Isen & Means, 1983)—triggered, for instance, by social presence (Hinchcliffe et al., 2020) or masked positively charged words (Jiménez-Ortega et al., 2017).
Notably, a larger LAN followed by a reduced P600 effect was observed only in the presence of self-faces. This biphasic pattern has been previously linked to good versus poor comprehenders (Coulson & Kutas, 2001), verbal working memory operations (Martín-Loeches et al., 2005; Kolk et al., 2003), or a shift in the processing strategy used to solve agreement anomalies (Hinchcliffe et al., 2020; Jiménez-Ortega et al., 2017). Hence, it is possible that implicit, self-relevant content recruited more cognitive resources during first-pass syntactic operations because of low-level attentional capture (i.e., bottom–up mechanisms), thus reducing the processes reflected by the later P600. This interpretation is aligned with similar biphasic patterns observed in previous works when emotion-laden words precede morphosyntactic anomalies (Jiménez-Ortega et al., 2021; Espuny et al., 2018), suggesting that fewer reanalysis/repair processes might be necessary to successfully resolve the morphosyntactic mismatch (Hinchcliffe et al., 2020; van de Meerendonk, Kolk, Vissers, & Chwilla, 2010). Thus, our data suggest that implicit self-referential information can be decoded from visual parameters, prompting more cognitive resources during first-pass parsing processes, yielding a self-referential effect. Furthermore, the results presented here are compatible with the flexibility of early syntactic processes and their interaction with semantic and contextual information (Münster & Knoeferle, 2018; Hagoort, 2017; Pulvermüller et al., 2009).
Intriguingly, a larger alpha suppression was observed over the left IFG, presumably BA 47, only for self-faces when contrasting syntactically correct and incorrect words, as early as approximately 150–550 msec. Different portions of the left IFG have been implicated in several linguistic computations (Matchin & Hickok, 2020; Friederici, 2017; Hagoort, 2017). According to Hagoort (2017), BA 47 in the left IFG is involved in lexical access operations and in unifying the lexical building blocks obtained from memory in parallel with nonlinguistic information, with BA 47 being a key node within the semantic unification network. Following this framework, a possible interpretation of our alpha results is that semantic sentence processing was boosted by the presence of self-relevant information only, or particularly, in the presence of a grammatical violation, which would be straightforward evidence that semantic and syntactic domains interact early during sentence processing (in line with, e.g., Hagoort, 2017; Malaia & Newman, 2015; Pulvermüller et al., 2009). Overall, we found both early semantic and syntactic boosting specifically when one's own face and a morphosyntactic violation co-occur. This evidence suggests that the interplay between semantic and syntactic operations is clearly bidirectional, which is consistent with several linguistic models (e.g., Jackendoff, 2007).
In our view, we are dealing with self-reference effects and not with effects of an incongruence or mismatch between the simultaneous appearance of the self-face and an unknown voice summoning attentional resources. The fact that the face of a close friend was also accompanied by the same unknown voice, but the effects were not as noticeable as for the self-face, implies that the main modulations observed are primarily the consequence of the self-reference effect.
Limitations and Concluding Remarks
Regarding the limitations of this study, it should be noted that the face identity was presented 16 msec before the target word to test the modularity of syntactic processing. Thus, the mere presentation of face identity cannot be isolated because of the pseudosimultaneous presentation of both stimuli (face and critical word) that could lead to a possible mixing between long-lasting, face-related components (i.e., P3) and language-related components (i.e., LAN and P600). From a technical point of view, this question is settled for the linguistic manipulation by comparing both correct and incorrect linguistic material under the same self-referential conditions. Indeed, the modulation of language-related ERP components by self-related information was the main purpose of the present study.
Another limitation of the study is related to the sample, as it was not equally balanced in terms of the number of women and men (24 vs. 12, respectively). Prior research has suggested that there might be sex differences in brain structure and function, probably reflecting differences in neural and cognitive processing (Proverbio, 2023; Cahill, 2006). For instance, it has been suggested that language is more left-lateralized in male individuals while it is more bilaterally distributed in female individuals (for a review, see Ullman, Miranda, & Travers, 2007). However, current research in this regard remains scarce and inconclusive (Sato, 2020). It should also be noted that, to the best of our knowledge, no sex differences have been reported for either the LAN or the P600 components, even if these have been quite extensively studied.
An open question for future studies is to investigate the effects of self-reference in linguistic processing using an ecological approach (e.g., by manipulating the content of the language material along with the speaker's face). In addition, whether self-reference may facilitate or hamper language processing remains to be determined. This could be addressed, for instance, in a behavioral study without the limitations of the ERP procedures, that is, one in which RTs and accuracy are measured time-locked to the occurrence of the linguistic anomaly. Finally, as the linguistic material was presented aurally and combined with the visual presentation of faces, functional connectivity analyses of visual and auditory brain regions (e.g., Keil & Senkowski, 2018) may be another source of potential interest for future studies.
To conclude, our data provide evidence that identity-related information is rapidly decoded from facial cues under masked conditions (especially when it comes to self-identity), driven by automatic prioritization mechanisms. The data presented here indicate that the self-reference effect can be extended to core linguistic computations, as evidenced by the mobilization of cognitive resources during syntactic processing. Overall, this study provides further evidence for an interactive view of language processing in the human brain.
Corresponding authors: Miguel Rubianes, Department of Psychobiology & Behavioral Sciences Methods, Complutense University of Madrid, Campus de Somosaguas, Ctra. de Húmera, Madrid, Spain, 28223, or via e-mail: [email protected]; or Francisco Muñoz, Cognitive Neuroscience Section, UCM-ISCIII Center for Human Evolution and Behavior, Av. Monforte de Lemos, n° 5, pabellón 14, Madrid, Spain, 28029, or via e-mail: [email protected].
Data Availability Statement
The raw data and scripts are available on request.
Author Contributions
Miguel Rubianes: Data curation; Formal analysis; Investigation; Methodology; Software; Visualization; Writing—Original draft; Writing—Review & editing. Linda Drijvers: Data curation; Formal analysis; Investigation; Methodology; Software; Visualization; Writing—Review & editing. Francisco Muñoz: Conceptualization; Data curation; Formal analysis; Funding acquisition; Investigation; Methodology; Project administration; Supervision; Visualization; Writing—Original draft; Writing—Review & editing. Laura Jiménez-Ortega: Conceptualization; Funding acquisition; Investigation; Methodology; Project administration; Writing—Review & editing. Tatiana Almeida-Rivera: Investigation; Writing—Review & editing. José Sánchez-García: Investigation; Writing—Review & editing. Sabela Fondevila: Investigation; Writing—Review & editing. Pilar Casado: Investigation; Writing—Review & editing. Manuel Martín-Loeches: Conceptualization; Data curation; Formal analysis; Funding acquisition; Investigation; Methodology; Project administration; Supervision; Visualization; Writing—Review & editing.
Funding Information
This study was supported by Ministerio de Ciencia, Investigación y Universidades, Programa Estatal de Investigación Científica y Técnica de Excelencia, Spain, grant number: PSI2017–82357-P; Ministerio de Ciencia e Innovación, grant numbers: PID2021-123421NB-I00 and PID2021-124227NB-I00; and by Ministerio de Ciencia, Innovación y Universidades (https://dx.doi.org/10.13039/100014440), grant number: FPU18/02223.
Diversity in Citation Practices
Retrospective analysis of the citations in every article published in this journal from 2010 to 2021 reveals a persistent pattern of gender imbalance: Although the proportions of authorship teams (categorized by estimated gender identification of first author/last author) publishing in the Journal of Cognitive Neuroscience (JoCN) during this period were M(an)/M = .407, W(oman)/M = .32, M/W = .115, and W/W = .159, the comparable proportions for the articles that these authorship teams cited were M/M = .549, W/M = .257, M/W = .109, and W/W = .085 (Postle and Fulvio, JoCN, 34:1, pp. 1–3). Consequently, JoCN encourages all authors to consider gender balance explicitly when selecting which articles to cite and gives them the opportunity to report their article's gender citation balance.