Abstract
Musical expertise has been shown to benefit time perception abilities, with musicians outperforming nonmusicians in several explicit timing tasks. However, it is unclear how musical expertise impacts implicit time perception. Twenty nonmusicians and 15 expert musicians participated in an EEG recording during a passive auditory oddball paradigm with 0.8- and 1.6-sec standard time intervals and deviant intervals that were either played earlier or delayed relative to the standard interval. We first confirmed that, as was the case for nonmusicians, musicians use different neurofunctional processes to support the perception of short (below 1.2 sec) and long (above 1.2 sec) time intervals: Whereas deviance detection for long intervals elicited an N1 component, P2 was associated with deviance detection for short time intervals. Interestingly, musicians did not elicit a contingent negative variation (CNV) for longer intervals but showed additional components of deviance detection such as (i) an attention-related N1 component, even for deviants occurring during short intervals; (ii) an N2 component for above and below 1.2-sec deviance detection; and (iii) a P2 component for above 1.2-sec deviance detection. We propose that the N2 component is a marker of explicit deviance detection and reflects inhibitory control/conflict monitoring of the deviance. This hypothesis was supported by a positive correlation between CNV and N2 amplitudes: The CNV reflects the temporal accumulator and can predict explicit detection of the deviance. In expert musicians, an N2 component is observable without CNV, suggesting that deviance detection is optimized and does not require the temporal accumulator. Overall, this study suggests that musical expertise is associated with optimized implicit time perception.
INTRODUCTION
How do we make sense of physical time? Models such as the scalar expectancy theory (SET), which relies on an “internal clock”, have received rising interest in the timing and time perception literature (Gibbon, 1977). This internal clock is referred to as a pacemaker-counter device. The pacemaker emits pulses that are accumulated in a counter, and this accumulation provides the basis for estimating time intervals, with more pulses leading to longer perceived duration. Other models, such as beat-based models (i.e., dynamic attending theory), which postulate we entrain our internal oscillations to external rhythms, have also been thoroughly used to study timing in a rhythmic context (Henry & Herrmann, 2014; Large & Jones, 1999).
SET posits that the variability in a series of time judgments should increase proportionally with the magnitude of physical time passed (Grondin, 2010). In other words, the variability-to-time ratio should remain constant, which is not the case (Grondin, 2001, 2014). This ratio increases when intervals reach 1.2–1.5 sec (Grondin, Laflamme, & Mioni, 2015; Grondin, 2012, 2014; Gibbon, Malapani, Dale, & Gallistel, 1997). For this reason, researchers often distinguish the processing of sub- and suprasecond intervals (Lewis & Miall, 2003b; Rammsayer & Lima, 1991). It has been suggested that the perception of subsecond intervals is associated with sensory/automatic/motor processing (Thibault, Albouy, & Grondin, 2023; Lewis & Miall, 2003a, 2003b) whereas estimating suprasecond intervals is associated with cognitive and motor functions (Thibault et al., 2023; Brown, Collier, & Night, 2013; Brown, 2006; Zakay & Block, 1997; Macar, Grondin, & Casini, 1994).
Recently, Thibault and colleagues (2023) suggested that different neurofunctional processes are involved during the processing of short (below 1.2 sec) and long (above 1.2 sec) time intervals in nonmusicians. To isolate the brain correlates of timing from the brain activity related to the motor and cognitive functions (such as decision-making) engaged by classical explicit timing tasks, they employed a passive oddball paradigm as an alternative. Comparing the event-related potentials (ERPs) of deviants against standards, they showed that deviants from long empty time intervals (>1.2 sec) elicit an N1 component and a contingent negative variation (CNV) whereas deviants from the short intervals (<1.2 sec) only elicit a P2 component (see Thibault, Vallet, and Grondin [in press] for a summary of the role of these components in the time perception literature). They also demonstrated that, for long intervals, brain areas typically associated with cognitive processing are recruited (parietal and frontal cortices) whereas for short intervals, the brain areas activated are related to sensory/automatic/motor processing (primary auditory cortex, SMA, inferior frontal gyrus, cingulate and parietal cortices).
To date, these different neurofunctional processes involved in the implicit processing of short (below 1.2 sec) and long (above 1.2 sec) time intervals have been identified only in nonmusicians. Considering that musical expertise is beneficial for time perception abilities, one can ask how musical expertise affects the implicit time perception of short and long time intervals. Psychophysics research on time perception has consistently shown the impact of musical expertise and the performance superiority of musicians over nonmusicians in many explicit timing tasks. Indeed, Rammsayer, Buttkus, and Altenmüller (2012) demonstrated that across six different temporal tasks, musicians performed better than nonmusicians in all temporal and rhythm discrimination tasks. Grondin and Killeen (2009) have shown that musicians display decreased variance in their temporal judgments when reproducing time intervals, thus performing better than nonmusicians. Güçlü, Sevinc, and Canbeyli (2011) demonstrated a lower Weber fraction for musicians than nonmusicians for the discrimination of both short and long auditory intervals. In addition, temporal variability in an explicit counting task was much lower for musicians than for nonmusicians with both a 900- and a 1400-msec counting pace (Grondin, Demers, Rioux, Thibault, & Mioni, submitted).
There is limited knowledge on how musicians process timing information in a purely passive paradigm such as in Thibault and colleagues (2023). Izadifar, Formuli, Isham, and Paolini (2023) have studied professional violinists' sense of time in a mental imagery task in which they had to play a violin piece mentally while undergoing fMRI scanning. They observed that musicians generally overestimated time while performing the mental imagery task, which they attributed to increased creativity. They also found high activation in the left cerebellum, which corroborates the cerebellar timing hypothesis of Ivry, Spencer, Zelaznik, and Diedrichsen (2002). The cerebellar timing hypothesis stipulates that the cerebellum contributes to sensorimotor timing functions such as finger tapping or playing an instrument for the subsecond range (Spencer & Ivry, 2021; Ivry et al., 2002). Extrapolating from these findings, sensorimotor timing functions may be enhanced for musicians considering their extended hours of musical practice.
In addition, ERP studies of time perception have already identified distinct neural traces for musicians when compared with nonmusicians for explicit timing tasks. When given rhythm incongruities in explicit time perception tasks, musicians showed shorter P3 latencies and larger P3 amplitudes than nonmusicians, indicating that musicians were more cognitively involved than nonmusicians (Ungan et al., 2013). Comparing percussionists to nonpercussionists (vocalists and controls), Slater, Ashley, Tierney, and Kraus (2018) identified enhanced amplitude for a negativity component around 260–370 msec for off-beat stimuli in favor of percussionists during a continuation drumming task. They hypothesized that the development of rhythm skills enhances inhibitory control in two ways: first, by fine-tuning motor networks with precise, timed movements (Krause, Pollok, & Schnitzler, 2010) and, second, by activating reward-based mechanisms such as predictive processing and conflict monitoring, which are known to be involved in tracking temporal structures (Graybiel, 2005). The enhanced negative component they found around 300 msec had previously been associated with timing error in the context of reward-based predictive processing (Baker & Holroyd, 2011; Miltner, Braun, & Coles, 1997). In their study, Slater and colleagues (2018) observed that amplitude for this predictive processing/conflict monitoring component correlated with behavioral measures of inhibitory control. The amplitude of the negative component was decreased for nonpercussionists, probably reflecting differences in the reward-based processes of timing and inhibitory control.
There is limited understanding of how musicians' time perception differs from that of nonmusicians in an implicit timing task. The purpose of this study is to (1) replicate the results of Thibault and colleagues (2023) with a different population, namely, expert musicians, regarding the distinct neural mechanisms (namely, N1, CNV, and P2 components) supporting deviance detection for short and long time intervals, and (2) investigate the possible effects of musical expertise on the processing of both short and long empty time intervals. First, we hypothesize that, as in nonmusicians, different neurofunctional processes will be involved during the processing of short (below 1.2 sec) and long (above 1.2 sec) time intervals by musicians. Second, on the basis of previous studies, we hypothesize that amplitudes for N1 and P2 will be higher for musicians than nonmusicians (Habibi, Wirantana, & Starr, 2014). We expect to see this difference in both subsecond and suprasecond conditions in a passive oddball task. Third, we hypothesize that musicians will engage in more cognitive processes, such as cognitive inhibition, when confronted with deviant temporal information. Musicians should elicit an additional component, the N200, that is typically observed after an off-beat stimulus in rhythm tasks. The N200 reflects cognitive inhibitory control (Slater et al., 2018), which is not a process that Thibault and colleagues (2023) observed for nonmusicians.
METHODS
Participants
Fifteen healthy expert musicians were recruited for the present study (9 female and 6 male participants, mean age = 29.44 years; SD = 6.82 years; age range = 20–42 years, 14 right-handed and 1 left-handed). Musicians had an average of 9.30 years of musical training, practiced at least one instrument for an average of 18.53 years, and averaged 8 hr of practice per week. Data from controls (20 nonmusicians: 10 female and 10 male participants, mean age = 23.15 years; SD = 1.50 years; age range = 21–27 years, 17 right-handed and 3 left-handed) were taken from Thibault and colleagues (2023). Nonmusicians reported no musical training. In total, 35 healthy individuals contributed to the study. A priori sample size calculations were made using Thibault and colleagues' (2023) previously published data. The effect size at the peak of the P2 component in the nonmusicians' data (d = 0.78) was used to calculate how many participants were needed for the current study. The power analysis showed that 15 (n = 14.93) participants were needed to demonstrate similar effect sizes in the current study (α = .05, power = 0.80).
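As a rough check on the sample size reasoning above, the following minimal sketch approximates the number of participants needed for a two-sided one-sample (or paired) t test. The test type, the normal approximation with a small-sample correction term, and the function name `approx_n_paired_t` are assumptions made for illustration; the text does not state which test or software was used for the power analysis.

```python
import math
from statistics import NormalDist


def approx_n_paired_t(d, alpha=0.05, power=0.80):
    """Approximate n for a two-sided one-sample/paired t test.

    Assumption: normal approximation plus a small-sample correction
    (z_alpha^2 / 2); exact noncentral-t software may differ slightly.
    """
    z = NormalDist()
    z_alpha = z.inv_cdf(1 - alpha / 2)   # two-sided critical value
    z_power = z.inv_cdf(power)           # quantile for the desired power
    n_normal = ((z_alpha + z_power) / d) ** 2
    return n_normal + z_alpha ** 2 / 2


n = approx_n_paired_t(0.78)              # d = 0.78 is taken from the text
print(round(n, 2), math.ceil(n))
```

Under these assumptions, the approximation gives a value slightly below 15, which rounds up to the 15 participants reported.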
All participants reported normal hearing, no history of neurological or psychiatric disorders, gave their informed consent to participate in this study, and received monetary compensation for their participation. All procedures were approved by the ethics committee of the Centre Intégré Universitaire de Santé et de Services Sociaux de la Capitale Nationale: 2021–2156.
Task and Procedure
The task and procedure were the same as in Thibault and colleagues (2023). Participants were scheduled for two sessions lasting approximately an hour and a half each (52 min of experiment and approximately 30 min of EEG preparation and setup). Over the two sessions, participants went through twelve 8-min blocks of an oddball paradigm with empty time intervals.
Each 52-min session consisted of six blocks. Each block was separated by a 3-min break. The task consisted of passive listening to an auditory oddball paradigm, with eyes open and a fixation cross displayed on the screen. For each block, 40 trials of 10 sounds were presented for each condition (below and above 1.2 sec). It is relevant to note that all trials were presented continuously without intertrial intervals, which results in a continuous flow of stimuli (corresponding to a classic oddball paradigm). For each trial, a deviant empty time interval was pseudorandomly placed among the standard empty time intervals. The terms standard and deviant refer to the empty time intervals presented between sounds. In a series of 10 empty time intervals, 9 were “standards” (which corresponded to 0.8 sec in the <1.2-sec condition, and 1.6 sec in the >1.2-sec condition), and one was “deviant” (the empty time interval was either shorter/early or longer/delayed). The deviant empty time intervals were either early (<1.2 sec: 0.70, 0.75 sec; >1.2 sec: 1.4, 1.5 sec) or delayed (<1.2 sec: 0.85, 0.9 sec; >1.2 sec: 1.7, 1.8 sec) as compared with the empty standard interval. There was a repetition of at least four standard time intervals before a deviant time interval was presented. For half of the blocks, the <1.2-sec condition trials were presented first, and for the other half, the >1.2-sec condition trials were presented first. The deviant empty time interval for each sequence of 10 empty time intervals was selected randomly. Empty time intervals, as opposed to filled intervals, were used to avoid auditory fatigue. Furthermore, for the range of durations used in the current experiment, empty intervals are at least as easily discriminated as filled intervals (Grondin, Meilleur-Wells, Ouellette, & Macar, 1998; Grondin, 1993). This resulted in 60 deviants for each condition.
Note that 60 standard trials were randomly selected to perform contrasts between deviant and standard (see below). The participants were instructed to simply listen to the different sounds that would be presented.
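The trial structure described above can be sketched as follows. This is an illustrative reconstruction, not the authors' Presentation script: the interval durations come from the text, whereas the uniform random choice of deviant type, the way the deviant position is drawn, and the names `make_trial` and `make_block` are assumptions.

```python
import random

STANDARD = {"short": 0.8, "long": 1.6}   # seconds, from the text
DEVIANTS = {                             # seconds, from the text
    "short": {"early": [0.70, 0.75], "delayed": [0.85, 0.90]},
    "long": {"early": [1.4, 1.5], "delayed": [1.7, 1.8]},
}
INTERVALS_PER_TRIAL = 10
MIN_LEADING_STANDARDS = 4                # at least four standards before a deviant


def make_trial(condition, rng):
    """Return 10 (duration, label) intervals: 9 standards and 1 deviant."""
    label = rng.choice(["early", "delayed"])         # assumption: uniform choice
    deviant = rng.choice(DEVIANTS[condition][label])
    # Placing the deviant at index 4..9 guarantees four preceding standards,
    # even when trials follow one another without a gap.
    position = rng.randrange(MIN_LEADING_STANDARDS, INTERVALS_PER_TRIAL)
    trial = [(STANDARD[condition], "standard")] * INTERVALS_PER_TRIAL
    trial[position] = (deviant, label)
    return trial


def make_block(condition, n_trials, seed=0):
    """Concatenate trials into one continuous stream of empty intervals."""
    rng = random.Random(seed)
    return [iv for _ in range(n_trials) for iv in make_trial(condition, rng)]


stream = make_block("short", n_trials=40)
print(stream[:10])
```

A balanced design would draw deviant types from a shuffled, pre-counted list rather than independently on each trial; the independent draw here only keeps the sketch short.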
All sounds were 5 kHz and lasted 30 msec, with a 10-msec ascending–descending envelope. Presentation software (Neurobehavioral Systems) was used for the delivery of the experimental protocol and to trigger auditory stimuli. The sounds were presented with Audio-Technica ATH-M50x headphones at 70 dB (SPL). For the six blocks of one given session, the <1.2-sec condition and the >1.2-sec condition were presented in alternation, and then the opposite sequence was adopted for the six blocks of the other session. Auditory ERPs for each condition were then studied at the sensor level.
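For readers who want to reproduce a comparable stimulus, the sketch below synthesizes a 5-kHz, 30-msec tone burst. The frequency and duration come from the text; the audio sampling rate, the linear ramp shape, and the reading of the envelope as a 10-msec rise followed later by a 10-msec fall are assumptions, as is the function name `make_tone`.

```python
import math

SAMPLE_RATE_HZ = 44100   # assumption: audio sampling rate is not given in the text
FREQ_HZ = 5000.0         # from the text
DURATION_S = 0.030       # from the text
RAMP_S = 0.010           # assumption: 10-msec rise and 10-msec fall, linear


def make_tone():
    """Return the tone burst as a list of samples in the range [-1, 1]."""
    n = int(round(DURATION_S * SAMPLE_RATE_HZ))
    ramp_n = int(round(RAMP_S * SAMPLE_RATE_HZ))
    samples = []
    for i in range(n):
        # Gain rises linearly from 0, plateaus at 1, then falls back to 0.
        gain = min(1.0, i / ramp_n, (n - 1 - i) / ramp_n)
        samples.append(gain * math.sin(2 * math.pi * FREQ_HZ * i / SAMPLE_RATE_HZ))
    return samples


tone = make_tone()
print(len(tone))
```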
EEG Recording
The EEG recordings were done with the same system as in Thibault and colleagues (2023). A 64-channel EEG cap with active electrodes (ActiCap, Brain Vision Solutions) was used to capture the electroencephalographic activity with two 32-channel BrainAmp MR Plus amplifiers (Brain Products). The electrodes were positioned according to the standard 10–20 system. The signal was band-pass filtered between direct current and 1000 Hz and digitized at a sampling rate of 1000 Hz.
All channels were referenced with an electrode placed on the nose and with a forehead ground. All electrodes had an impedance of <20 kΩ. EEG data were preprocessed using Brainstorm software (Tadel, Baillet, Mosher, Pantazis, & Leahy, 2011) combined with FieldTrip functions (https://www.fieldtriptoolbox.org/) and MATLAB (MathWorks, https://www.mathworks.com/products/matlab.html). The EEG preprocessing included notch filtering of power line contamination (removing 60, 120, and 180 Hz). A band-pass filter for frequencies of interest (ERPs) between 2 and 16 Hz was applied for the N1-P2 preprocessing and between 1 and 16 Hz for the CNV preprocessing. The filtered data were subjected to independent component analysis using EEGLAB functions (https://sccn.ucsd.edu/eeglab/). Independent component analysis can isolate artifacts such as blinks and eye movements. Using time-course and topographic information, components representing clear ocular artifacts were identified and removed from the filtered data. Each event (deviant and standard) was inspected from −1900 msec to 1900 msec relative to the onset of each sound, and trials for which the signal varied by more than ±150 μV during the trial were excluded. Between 22 and 58 trials per condition were kept for each participant. For each epoch, a baseline correction of 100 msec (−100 to 0 msec) before stimulus onset was performed. Analyses for the CNV include a baseline correction between −100 and 0 msec relative to the onset of the previous standard ERP.
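The epoch rejection and baseline correction steps can be illustrated with a short single-channel sketch. The sampling rate, epoch limits, 150-μV criterion, and baseline window come from the text; treating the criterion as an absolute-amplitude threshold and the helper names (`ms_to_index`, `exceeds_threshold`, `baseline_correct`) are assumptions, and the actual processing was done in Brainstorm rather than with this code.

```python
FS_HZ = 1000              # sampling rate, from the text
EPOCH_START_MS = -1900    # epoch runs from -1900 to 1900 msec, from the text
THRESHOLD_UV = 150.0      # rejection criterion, from the text


def ms_to_index(t_ms):
    """Convert a latency relative to sound onset into a sample index."""
    return int(round((t_ms - EPOCH_START_MS) * FS_HZ / 1000))


def exceeds_threshold(epoch_uv, threshold=THRESHOLD_UV):
    """Assumption: an epoch is rejected if any sample exceeds +/- threshold."""
    return any(abs(v) > threshold for v in epoch_uv)


def baseline_correct(epoch_uv, start_ms=-100, end_ms=0):
    """Subtract the mean of the pre-stimulus window from every sample."""
    i0, i1 = ms_to_index(start_ms), ms_to_index(end_ms)
    baseline = sum(epoch_uv[i0:i1]) / (i1 - i0)
    return [v - baseline for v in epoch_uv]


# Toy epoch: a constant 5-microvolt offset across 3801 samples.
epoch = [5.0] * 3801
if not exceeds_threshold(epoch):
    epoch = baseline_correct(epoch)
print(epoch[ms_to_index(0)])
```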
Statistics
Whole-brain analyses (sensors and sources of deviants vs. standard contrasts) of EEG activity were performed using nonparametric permutation testing and cluster randomization statistics in time and space (1000 permutations) as implemented in FieldTrip (www.fieldtriptoolbox.org/). These analyses were done on the ERP data for a time period covering the N1, P2, and N2 ERP components (see Figure 3). The same analyses (with cluster α = .05) were performed for the CNV period (−500 to 0 msec for delayed conditions; see Figure 2).
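To make the logic of cluster-based permutation testing concrete, here is a deliberately simplified sketch for a single channel, clustering over time only. It is not the FieldTrip implementation, which also clusters across neighboring sensors; the sign-flipping scheme for within-participant difference waves, the cluster-mass statistic, the threshold value, and all function names are assumptions chosen for illustration.

```python
import math
import random


def t_values(diffs):
    """One-sample t value at each time point; diffs[subject][time]."""
    n = len(diffs)
    out = []
    for t in range(len(diffs[0])):
        col = [row[t] for row in diffs]
        mean = sum(col) / n
        var = sum((x - mean) ** 2 for x in col) / (n - 1)
        # Guard against zero variance (degenerate toy data).
        out.append(mean / math.sqrt(var / n) if var > 0 else 0.0)
    return out


def cluster_masses(tvals, threshold):
    """Sum |t| within runs of contiguous same-sign supra-threshold samples."""
    masses, current, sign = [], 0.0, 0
    for t in tvals:
        s = (t > threshold) - (t < -threshold)
        if s != 0 and s == sign:
            current += abs(t)
        else:
            if sign != 0:
                masses.append(current)
            current, sign = (abs(t), s) if s != 0 else (0.0, 0)
    if sign != 0:
        masses.append(current)
    return masses


def cluster_permutation_p(diffs, threshold=2.145, n_perm=1000, seed=0):
    """p value of the largest observed cluster against a sign-flip null.

    threshold=2.145 is roughly the two-sided .05 critical t for 14 df
    (an assumption matching 15 participants).
    """
    rng = random.Random(seed)
    observed = max(cluster_masses(t_values(diffs), threshold), default=0.0)
    count = 0
    for _ in range(n_perm):
        flipped = []
        for row in diffs:
            sign = rng.choice((-1.0, 1.0))   # flip a whole participant's wave
            flipped.append([sign * x for x in row])
        null_max = max(cluster_masses(t_values(flipped), threshold), default=0.0)
        if null_max >= observed:
            count += 1
    return (count + 1) / (n_perm + 1)
```

Comparing each permutation's largest cluster with the observed one is what controls the family-wise error rate across time points in this simplified setting.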
At the sensor level, the average difference between deviants and standards was calculated for each participant (musicians and nonmusicians) in both delayed conditions (short, long) for a 30-msec window surrounding the peak of each ERP component (N1, P2, N2, CNV) on the FCz electrode. ANOVAs with factors Group (musicians, nonmusicians) × Length (delayed short, delayed long) were then performed on each average ERP amplitude (N1, P2, N2, CNV) for both groups.
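The amplitude measure described above (mean deviant-minus-standard difference in a window around the component peak) might be computed along the following lines. The search window, the polarity argument, and the function name are hypothetical; with 1-msec sampling, a half-width of 15 msec yields 31 samples, which only approximates the 30-msec window mentioned in the text.

```python
def mean_amplitude_around_peak(deviant, standard, times_ms, search_ms,
                               polarity, half_width_ms=15):
    """Return (peak latency, mean difference amplitude around the peak).

    deviant, standard: averaged ERPs at one electrode (same length as times_ms).
    search_ms: (start, end) latency range in which to look for the peak (assumed).
    polarity: "negative" for N1/N2/CNV-like peaks, "positive" for P2-like peaks.
    """
    diff = [d - s for d, s in zip(deviant, standard)]
    candidates = [i for i, t in enumerate(times_ms)
                  if search_ms[0] <= t <= search_ms[1]]
    pick = min if polarity == "negative" else max
    peak = pick(candidates, key=lambda i: diff[i])
    peak_t = times_ms[peak]
    window = [diff[i] for i, t in enumerate(times_ms)
              if abs(t - peak_t) <= half_width_ms]
    return peak_t, sum(window) / len(window)


# Toy example: a triangular negative deflection peaking at 150 msec.
times = list(range(-100, 501))
standard_erp = [0.0] * len(times)
deviant_erp = [-max(0.0, 5.0 - abs(t - 150) / 10) for t in times]
print(mean_amplitude_around_peak(deviant_erp, standard_erp, times,
                                 (50, 250), "negative"))
```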
ANOVAs, post hoc Tukey's tests, and correlations were executed using R Version 4.1.2 (2021-11-01; R Core Team, 2010). The correlation package (Makowski, Wiernik, Patil, Lüdecke, & Ben-Shachar, 2022) and the aov function were used. The ANOVAs and correlations used all 20 nonmusicians and 15 musicians for comparison. Alpha was set at .05 for all analyses, and p values were adjusted for multiple comparisons.
RESULTS
Averaged ERPs of all standards across musicians showed the classic auditory frontocentral negativity for the N1 component and a frontocentral positive peak for the auditory P2, both typically centered around the FCz electrode (Figure 1A). For this reason, the FCz electrode was selected for illustration. It is relevant to note that the ERP analyses reported below were conducted at the whole-brain level with corrections for multiple comparisons in time (samples) and space (electrodes/vertices) and were thus not performed solely on electrodes of interest, except for the ANOVAs (Figure 4), which were executed on the FCz electrode.
We first investigated whether a CNV was prompted before deviance detection, as reported in Thibault and colleagues (2023) for nonmusicians. The CNV was calculated only for delayed deviants because it has previously been associated with temporal accumulation and preparation processes (Kononowicz & van Rijn, 2014). We used cluster-corrected (in time and space) nonparametric permutation tests with 1000 permutations and a cluster alpha of .05 for −500 to 0 msec. A CNV was observed only for the long delayed condition for nonmusicians (−500 to 0 msec, p = .006, d = −0.65; see Figure 2B and Thibault et al., 2023). CNV was not significant for the early condition for nonmusicians, and it was not significant for musicians in either condition.
We then investigated the brain responses of the musicians associated with deviance detection for the above and below 1.2-sec time intervals after the onset of the deviant sound. Nonparametric permutation tests with cluster-based correction in time and space were performed separately for the below and above 1.2-sec deviant ERPs against their respective standard ERPs for a time window of −100 to 500 msec post-stimulus onset (see Figure 3).
Significant differences between deviants and standards were observed for the <1.2-sec early condition during the P2 time window (180–244 msec, p = .042, d = 0.30; Figure 3A) and the N2 time window (296–411 msec, p = .034, d = −1.08; Figure 3A). Significant differences were observed for <1.2-sec delayed condition during the N1 time window (46–167 msec, p = .004, d = −0.22; Figure 3B), the P2 time window (208–290 msec, p = .032, d = 1.01; Figure 3B), and during the N2 period (324–421 msec, p = .002, d = −1.84; Figure 3B).
No significant differences between deviants and standards were observed for the N1, P2, and N2 time windows for the >1.2-sec early condition (Figure 3C). For the >1.2-sec delayed condition, significant differences between deviants and standards were found for the N1 (110–194 msec, p = .012, d = −2.54; Figure 3D) and N2 time periods (350–428 msec, p = .024, d = −1.37; Figure 3D). Cluster-corrected topographies were calculated for the significant above and below 1.2-sec ERP contrasts.
We then compared both groups (musicians and nonmusicians) on the amplitudes of the difference waves (N1, P2, N2, and CNV) using ANOVAs. For each participant, we computed the averaged difference in amplitude between deviants and standards for both delayed conditions (short, long) using a 30-msec time window surrounding the peak of each ERP (N1, P2, N2, CNV) on the FCz electrode. ANOVAs with factors Group (musicians, nonmusicians) × Length (short, long) revealed a significant effect of Group for N1, F(1, 66) = 11.843, p = .001, η2 = .10 (Figure 4A); P2, F(1, 66) = 6.794, p = .011, η2 = .08 (Figure 4B); and N2, F(1, 66) = 4.030, p = .049, η2 = .06 (Figure 4C). The amplitude of brain responses was greater for musicians than nonmusicians for N1, P2, and N2. The main effect of Length was significant only for N1, F(1, 66) = 37.030, p < .001, η2 = .17 (Figure 4A), with N1 being greater for longer than for shorter intervals. The Group × Length interaction was significant for P2, F(1, 66) = 13.327, p < .001, η2 = .15 (Figure 4B). Post hoc Tukey's tests revealed that nonmusicians displayed larger P2 amplitudes for short intervals than for long intervals (p = .022), whereas this difference was not significant for musicians (p = .114). The Group difference was not significant for the short interval (p = .883). In contrast, a Group difference in P2 amplitude was found for the long interval (p < .001), with larger amplitudes for musicians. Moreover, the Group × Length interaction was significant for the CNV, F(1, 66) = 6.230, p = .015, η2 = .08 (Figure 4D). Post hoc Tukey's tests revealed that, for nonmusicians, the CNV amplitude was larger for longer than for shorter intervals (p = .022). In addition, in the long condition, the CNV was larger for nonmusicians than for musicians (p = .026). No difference between musicians and nonmusicians was observed for the short condition (p = .921).
No Length effect was observed for P2, F(1, 66) = 0.542, p = .464, η2 = .006 (Figure 4B); N2, F(1, 66) = 0.001, p = .972, η2 < .001 (Figure 4C); and CNV, F(1, 66) = 3.055, p = .085, η2 = .04 (Figure 4D). No Group effect was found for CNV, F(1, 66) = 2.557, p = .115, η2 = .03 (Figure 4D), and no interaction was observed for N1, F(1, 66) = 2.369, p = .129, η2 = .02 (Figure 4A), and N2, F(1, 66) = 0.009, p = .925, η2 < .001 (Figure 4C).
We tested whether the CNV acts as the temporal accumulator that facilitates later cognitive processes such as the N200 (reflecting explicit deviance detection). To do so, we computed the correlation between the amplitudes of these two ERPs (difference wave, deviants minus standards) across all participants. We did not further investigate the association of the N1 and P2 components, as they were both observable for musicians and nonmusicians. A Pearson's correlation test was conducted between the N200 difference wave (delayed conditions) and the CNV difference wave (delayed conditions) to assess the relationship between the two components. The positive correlation was significant (r = .393, p < .001; Figure 5). Note that this correlation was not driven by group differences, as the correlation was nonsignificant for nonmusicians and marginally significant for musicians. A Fisher test for independent samples comparing the two groups' correlations was not significant (z = −0.295, p = .384).
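The comparison of two independent correlations with Fisher's r-to-z transformation can be sketched as follows. The group-wise correlation coefficients and sample sizes are not reported in the text, so the values in the example call are purely hypothetical placeholders, and the function name is an assumption.

```python
import math
from statistics import NormalDist


def fisher_z_compare(r1, n1, r2, n2):
    """Compare two independent Pearson correlations.

    Returns (z, one-tailed p, two-tailed p).
    """
    z1, z2 = math.atanh(r1), math.atanh(r2)        # Fisher r-to-z transform
    se = math.sqrt(1 / (n1 - 3) + 1 / (n2 - 3))    # standard error of z1 - z2
    z = (z1 - z2) / se
    p_one = NormalDist().cdf(-abs(z))
    return z, p_one, 2 * p_one


# Hypothetical inputs only: the per-group r and n are not given in the text.
print(fisher_z_compare(r1=0.25, n1=40, r2=0.32, n2=30))

# The reported z can be converted to a p value directly.
print(round(NormalDist().cdf(-0.295), 3))
```

A z of −0.295 corresponds to a one-tailed p of about .384, which suggests that the reported p value is one-tailed; this reading is an inference, not something the text states.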
DISCUSSION
Confirming the Role of N1 and P2 in Deviance Detection
During the <1.2-sec early and delayed conditions, musicians elicited a P2 component. This result is in line with Thibault and colleagues' (2023) findings for nonmusicians revealing the presence of a significant P2 component for the <1.2-sec delayed condition. We identify this P2 component to be related to the error-positivity (Pe), which is elicited for conscious and easy error processing, peaking around 200–500 msec (Herrmann, Römmler, Ehlis, Heidrich, & Fallgatter, 2004; Yeung & Sanfey, 2004; Falkenstein, Hoormann, Christ, & Hohnsbein, 2000). Short intervals are typically identified as easier to discriminate than longer intervals (Grondin, 2012; Gibbon et al., 1997), which explains why we observe the Pe component in shorter intervals in both groups. According to our findings, although they do not reveal a P2 component for >1.2-sec intervals, musicians have a significantly increased amplitude in the P2 time period for longer intervals relative to nonmusicians (see Figure 4B). This could be interpreted as greater ease in detecting deviance in the >1.2-sec condition when compared with nonmusicians. Studies have also demonstrated that increased arousal toward the task can modulate P2 amplitude, which would be in line with the hypothesis of musicians' increased motivation for the task (Olofsson & Polich, 2007). P2 has also been identified as a component reflecting sensory gating, which is the ability to filter irrelevant information to reach higher cognitive processes (Gjini, Arfken, & Boutros, 2010; Lijffijt et al., 2009). In any case, the P2 component for musicians, just like for nonmusicians, seems to illustrate an easy, automatic, and conscious sensory process of deviance in temporal information. An interesting finding of this study is that musicians process short early deviance, eliciting P2 and N2 components, which was not observed for nonmusicians.
We hypothesize that <1.2-sec early deviants may be easier to process for them relative to nonmusicians. The superiority of musicians over nonmusicians to discriminate early sounds has already been reported in behavioral studies using deviants that arrive earlier (Perna, Pavani, Zampini, & Mazza, 2018). Musicians may be conscious of this early short deviance, consequently eliciting a Pe component, which can then be processed cognitively as shown by the N2 activity (as will be discussed in the next subsection).
The N1 component was observed for the <1.2-sec and >1.2-sec delayed conditions for musicians (see Figure 3). Thibault and colleagues' (2023) study with nonmusicians also identified the N1 component as part of the process for >1.2-sec delayed deviance but did not show this component for <1.2-sec delayed deviants. The difference in the N1 component amplitude between deviants and standards has been associated with attentional processes and the expectation of temporal events (Herbst & Obleser, 2019). Typically, increased attentional allocation is reflected by a larger N1 amplitude for temporal tasks (Thibault et al., 2023; Jones, Hsu, Granjon, & Waszak, 2017; Habibi et al., 2014; Hsu, Hämäläinen, & Waszak, 2013). Enhanced N1 amplitude for musicians in both long and short delayed conditions illustrates that musicians allocate more attention to deviance than nonmusicians in the <1.2-sec condition. Furthermore, the N1 component can also be modulated by the Nd attention component, which is a response superimposed on the auditory N1 (Woods, Alho, & Algazi, 1994; Giard, Perrin, Pernier, & Peronnet, 1988). This results in an apparently increased N1 component, which could explain the effect on amplitude we observe for musicians when compared with nonmusicians (Figure 4A). The Nd testifies to the musicians' enhanced attentional involvement in the task. Musicians have already been reported to exhibit increased N1 amplitudes in cued rhythmic tasks relative to nonmusicians (Shen, Ross, & Alain, 2023). To explain our findings, we argue that musicians rely on supplementary attentional mechanisms for the processing of <1.2-sec intervals when compared with nonmusicians. The additional significant N1 for the short delayed condition in musicians, which is absent in nonmusicians, is in line with musicians performing better and using more of their cognitive functions, such as attention, in temporal and rhythmic tasks (Shen et al., 2023; Vibell, Lim, & Sinnett, 2021).
Indeed, musical training has been shown to generally induce neuroplasticity that enhances attention (Rodrigues, Loureiro, & Caramelli, 2013; Patston, Hogg, & Tippett, 2007; Münte, Altenmüller, & Jäncke, 2002). We postulate that this neuroplasticity is used to enrich musicians' capacity to better identify implicit temporal deviance. However, an alternative hypothesis could be that musicians might be paying more attention when presented with rhythmic stimuli because the stimuli may be deemed pertinent to their musical abilities (McAuley, Henry, & Tuft, 2011). Thus, more attention may not be necessary for musicians to process temporal deviance, but it may be a consequence of their identity as a “musician” and their motivation.
As discussed in Thibault and colleagues (2023), N1 difference in deviants relative to their standard may also be identified as an error-related negativity component (ERN/Ne), which is a frontocentral negativity peaking around 20–130 msec (Vallet, Neige, Mouchet-Mages, Brunelin, & Grondin, 2021; Herrmann et al., 2004). The ERN/Ne component is typically associated with error detection, correction, and compensation (Simons, 2010; Scheffers & Coles, 2000; Vidal, Hasbroucq, Grapperon, & Bonnet, 2000). This adds to the argument that N1 has a more cognitive part, playing an important role in the monitoring of errors.
Similarly, neither musicians nor nonmusicians elicit any ERP to long early deviance (see Figure 3C). Thibault and colleagues (2023) proposed, in line with the pacemaker-counter model of time perception (Gibbon, Church, & Meck, 1984), that in the delayed conditions, participants accumulated more pulses in their accumulator (when compared with the standard presentation), which translated into evidence that an error had occurred. In contrast, in the early deviant condition, the accumulation of pulses does not exceed the standard's pulse count, which makes deviance detection harder. However, musicians can detect deviations in the early <1.2-sec condition, indicating they use an alternative mechanism that does not rely on pulse accumulation but that is still ineffective for longer early durations. We argue that musicians have honed their sensory-gating and error-monitoring mechanisms (shown by the elicited P2) through their extensive training. This refined sensory ability may only be useful for short time intervals, as they are claimed to be easier to discriminate (lower Weber fraction or coefficient of variation) than longer time intervals (Grondin, 2012; Gibbon et al., 1997).
The Inhibitory and Conflict Monitoring Role of N2 in Musicians
A novelty for musicians when comparing their ERPs to those of nonmusicians (Thibault et al., 2023) is the N2 component elicited by deviance processing (see Figure 3). The N2 is a negative ERP that peaks at around 200–350 msec after stimulus onset (Folstein & Van Petten, 2008). Musicians elicited an anterior/central N2 in the <1.2-sec early, <1.2-sec delayed, and >1.2-sec delayed conditions, whereas nonmusicians showed no such component. For musicians, this component is evoked independently of whether the stimulus arrives earlier or later than the anticipated standard. Studies have shown that such a negativity component may be associated with inhibitory cognitive control (Slater et al., 2018), conflict monitoring (Cheng, Chang, Han, & Lee, 2017; Smith, Smith, Provost, & Heathcote, 2010), and control of motor responses (Folstein & Van Petten, 2008). Inhibitory control is the capacity to successfully manage attention and block out external and internal distractions. It is fundamental to our capacity for environmental adaptation and serves as the basis for additional cognitive functions, including learning, thinking, and planning (Chen, Zhou, & Chen, 2020). Alternatively, nonmusicians may not have elicited this component because they had lower arousal and motivation for the task. Indeed, previous studies have observed that the N2 amplitude is modulated by arousal (Rozenkrants & Polich, 2008).
Strong evidence for N2's role in cognitive inhibitory control has been illustrated using go/no-go tasks. In such tasks, a larger N2 is elicited for no-go trials when compared with go trials (Hoyniak, 2017; Bruin & Wijers, 2002). In addition, the N2 also correlates with cognitive load, as it is increased when task difficulty is amplified (Benikos, Johnstone, & Roodenrys, 2013), further establishing its cognitive role. Furthermore, the no-go N2 is increased in participants with low false alarm rates relative to participants with higher false alarm rates, which implies an association between N2's amplitude and a successful inhibitory response (Falkenstein, Hoormann, & Hohnsbein, 1999). The literature extensively documents that musicians typically have much lower false alarm rates and, thus, higher sensitivity in temporal tasks (Chen et al., 2020; Ungan et al., 2013; Jones & Yee, 1997). We argue that the deviant presentation in our oddball task and the no-go trials in a go/no-go task are comparable. The participant noticing the deviance may want to act on this detection but is being instructed to inhibit this reaction (much like one does in the no-go trials). This was qualitatively reported to experimenters: Musicians were frustrated by the deviance occurrence, whereas nonmusicians did not express such frustration.
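The link drawn above between lower false alarm rates and higher sensitivity can be made concrete with the signal detection index d', computed as z(hit rate) minus z(false alarm rate). The sketch below assumes that "sensitivity" is meant in this d' sense; the hit and false alarm rates are hypothetical and are not data from the cited studies.

```python
from statistics import NormalDist


def d_prime(hit_rate, false_alarm_rate):
    """Signal detection sensitivity: z(hit rate) - z(false alarm rate).

    Both rates must lie strictly between 0 and 1, otherwise the inverse
    normal CDF is undefined and NormalDist.inv_cdf raises an error.
    """
    z = NormalDist().inv_cdf  # inverse CDF of the standard normal distribution
    return z(hit_rate) - z(false_alarm_rate)


# Hypothetical listeners with the same hit rate but different false alarm rates.
higher_false_alarms = d_prime(hit_rate=0.80, false_alarm_rate=0.20)
lower_false_alarms = d_prime(hit_rate=0.80, false_alarm_rate=0.05)

print(f"d' with 20% false alarms: {higher_false_alarms:.2f}")
print(f"d' with 5% false alarms: {lower_false_alarms:.2f}")
```

With the hit rate held constant, the listener with fewer false alarms obtains the larger d', which is the sense in which a lower false alarm rate translates into higher sensitivity.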
The N2 observed for musicians may also reflect conflict monitoring when a no-go trial is prompted (Smith et al., 2010). In our case, especially for musicians, a rhythm deviance normally requires adjustments; however, in our task, participants were told to do nothing and listen to the sounds. When a deviant time interval is played, which rarely occurs when compared with the standard interval, conflict monitoring processes are provoked. Donkers and Van Boxtel (2004) demonstrated that the no-go N2 could also be elicited during go trials in a go/no-go task if the probability of go versus no-go trials was changed. In that study, the N2 amplitude was increased for lower-frequency (deviant) trials, which is in line with the conflict monitoring hypothesis. These results are consistent with those of Nieuwenhuis, Yeung, Van Den Wildenberg, and Ridderinkhof (2003), who added that the N2 component should be co-registered with the ERN/Ne component, which we observe for musicians in both long- and short-delayed conditions.
Our results show that musicians use cognitive inhibitory and conflict monitoring processes, along with the earlier components that oversee primary deviance detection (N1 and P2), to process the deviance. In practice, musicians might benefit from this no-go N2, which instructs them not to play along with off-beat (deviant) notes.
The Absence of CNV in Deviance Detection for Musicians
Interestingly, musicians do not elicit a CNV, which has been consistently linked to temporal accumulation in the literature (Kononowicz & Penney, 2016; Macar & Vidal, 2003). In contrast, the CNV was observed in nonmusicians in Thibault and colleagues (2023) for the >1.2-sec delayed condition. We hypothesize that the N2 coupled with N1/ERN and/or P2/Pe may replace the need for CNV for musicians. The inhibitory and/or conflict monitoring anterior N2 component is correlated with the CNV. The CNV has been previously identified as a marker of temporal expectancy and temporal accumulation (Macar, Vidal, & Casini, 1999). This implies that musicians may not need to accumulate as much temporal information as nonmusicians do to perform more cognitive operations (as shown by the N2 component) involving temporal information. They may solely rely on sensory deviance detection and error monitoring components such as N1/ERN (Nieuwenhuis et al., 2003) and/or P2/Pe (Thibault et al., 2023). In any case, an increased CNV was associated with an increased N2, suggesting that temporal accumulation (CNV) facilitates explicit deviance detection (N2). Alternatively, one can consider that participants may encode the stimuli as a train of empty time intervals (a rhythmic sequence). This would also be in line with the dynamic attending model of time perception. In this model, the internal oscillations are entrained to an external rhythm (Large & Jones, 1999). In our study, musicians may elicit the N1, P2, and N2 components in an attempt to entrain and “correct” their internal oscillations to match the length of the new deviant stimulus. More research is needed to link these oscillators and the ERPs observed in the present study.
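The entrainment account above can be sketched as a simple error-correction loop. This is a deliberately simplified linear period-correction rule, not the oscillator equations of Large and Jones (1999); the correction gain (0.5) and the 1.0-sec deviant are hypothetical values chosen for illustration, and only the 0.8-sec standard comes from the paradigm.

```python
def entrain(intervals_s, initial_period_s, gain=0.5):
    """Track an internal period that is nudged toward each heard interval.

    Returns a list of (timing_error, updated_period) pairs, one per interval.
    gain is a hypothetical correction strength between 0 (no correction)
    and 1 (the period jumps fully to the interval just heard).
    """
    period = initial_period_s
    trace = []
    for interval in intervals_s:
        error = interval - period   # positive when the sound arrives later than expected
        period += gain * error      # partial correction toward the heard interval
        trace.append((error, period))
    return trace


# Standard 0.8-sec intervals with one hypothetical delayed deviant (1.0 sec).
sequence_s = [0.8, 0.8, 1.0, 0.8, 0.8]
trace = entrain(sequence_s, initial_period_s=0.8)

for heard, (error, period) in zip(sequence_s, trace):
    print(f"heard {heard:.2f} s  error {error:+.3f} s  period {period:.3f} s")
```

In this toy model, the deviant produces a timing error that pulls the internal period away from the standard, and the following standards produce smaller errors of the opposite sign as the period drifts back. Whether the N1, P2, and N2 components index such correction steps remains an open question, as noted above.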
Conclusion
This study provides evidence for the reproducibility of Thibault and colleagues' (2023) findings related to the ERP components of an oddball temporal task. It shows that <1.2-sec and >1.2-sec time intervals are processed differently. The shorter intervals (<1.2 sec) are processed more automatically, whereas the processing of longer intervals (>1.2 sec) requires greater support from cognitive mechanisms, such as attention.
At a neural level, however, musicians differ significantly from nonmusicians in that their processing of temporal deviance generally includes more ERP components. First, they do elicit N1 components, even in short conditions, reflecting a more cognitively based processing of the temporal deviance. Second, they do not need the contribution of the CNV to accumulate temporal information, as was the case for nonmusicians. Even if musicians do not elicit a CNV (at the group level), they still explicitly detect the temporal deviance. This suggests that (1) they do not need temporal accumulation to detect the deviance or (2) there is a different temporal accumulation mechanism that supports time perception in musicians, which should be investigated in future studies. Third, they can perceive short early deviance more easily than nonmusicians, as demonstrated by the P2 and N2 components. Finally, they elicit an N2 component, which allows them to inhibit their need to respond to the deviance and supports the conflict monitoring occasioned by its detection. Therefore, there are clear neural differences between musicians and nonmusicians in the processing of time, and this would explain the superiority of musicians over nonmusicians, as thoroughly demonstrated in the literature, when it comes to performing behavioral explicit timing tasks.
Corresponding author: Nicola Thibault, École de Psychologie, Université Laval, Québec, G1V 0A6, Canada, or via e-mail: [email protected].
Data Availability Statement
Data and pipelines for musicians' analyses are available via Université Laval's repository link: https://doi.org/10.5683/SP3/U8WYLY.
Author Contributions
Nicola Thibault: Conceptualization; Data curation; Formal analysis; Investigation; Methodology; Validation; Visualization; Writing—Original draft; Writing—Review & editing. Stéphanie D'amours: Investigation; Writing—Review & editing. Philippe Albouy: Conceptualization; Data curation; Formal analysis; Funding acquisition; Methodology; Project administration; Resources; Supervision; Validation; Writing—Review & editing. Simon Grondin: Conceptualization; Formal analysis; Funding acquisition; Methodology; Project administration; Resources; Supervision; Validation; Writing—Review & editing.
Funding Information
This work was supported by the Fonds de Recherche du Québec – Santé (https://dx.doi.org/10.13039/501100000156), grant numbers: 280380 and 329968 to P. A.; Natural Sciences and Engineering Research Council of Canada Discovery (NSERC; https://dx.doi.org/10.13039/501100000038), grant numbers: RGPIN-2019-06162 and RGPIN-2023-03578 to P. A. and to S. G.; an NSERC scholarship to N. T.; and an NSERC Summer scholarship to S. D. A.
Diversity in Citation Practices
Retrospective analysis of the citations in every article published in this journal from 2010 to 2021 reveals a persistent pattern of gender imbalance: Although the proportions of authorship teams (categorized by estimated gender identification of first author/last author) publishing in the Journal of Cognitive Neuroscience (JoCN) during this period were M(an)/M = .407, W(oman)/M = .32, M/W = .115, and W/W = .159, the comparable proportions for the articles that these authorship teams cited were M/M = .549, W/M = .257, M/W = .109, and W/W = .085 (Postle and Fulvio, JoCN, 34:1, pp. 1–3). Consequently, JoCN encourages all authors to consider gender balance explicitly when selecting which articles to cite and gives them the opportunity to report their article's gender citation balance.
REFERENCES
Author notes
These authors contributed equally to this work.