Abstract

Humans are able to find and tap to the beat of musical rhythms varying in complexity from children's songs to modern jazz. Musical beat has no one-to-one relationship with auditory features—it is an abstract perceptual representation that emerges from the interaction between sensory cues and higher-level cognitive organization. Previous investigations have examined the neural basis of beat processing but have not tested the core phenomenon of finding and tapping to the musical beat. To test this, we used fMRI and had musicians find and tap to the beat of rhythms that varied from metrically simple to metrically complex—thus from a strong to a weak beat. Unlike most previous studies, we measured beat tapping performance during scanning and controlled for possible effects of scanner noise on beat perception. Results showed that beat finding and tapping recruited largely overlapping brain regions, including the superior temporal gyrus (STG), premotor cortex, and ventrolateral PFC (VLPFC). Beat tapping activity in STG and VLPFC was correlated with both perception and performance, suggesting that these regions are important for retrieving, selecting, and maintaining the musical beat. In contrast, BG activity was similar across all conditions and was not correlated with either perception or production, suggesting that the BG may be involved in detecting auditory temporal regularity or in associating auditory stimuli with a motor response. Importantly, functional connectivity analyses showed that these systems interact, indicating that more basic sensorimotor mechanisms instantiated in the BG work in tandem with higher-order cognitive mechanisms in PFC.

INTRODUCTION

A defining characteristic of our interactions with music is the ability to identify and move to the “beat” (Large, Fink, & Kelso, 2002; Snyder & Krumhansl, 2001; Parncutt, 1994). The beat is an abstract property of a piece of music, corresponding to the strongest or most salient temporal pulse (Handel, 1989; Lerdahl & Jackendoff, 1983; Cooper & Meyer, 1960). Beat strength or saliency is influenced by multiple acoustic cues, such as duration, intensity, and pitch, that create accents (Snyder & Krumhansl, 2001; Parncutt, 1994; Essens & Povel, 1985; Povel & Essens, 1985). The more temporally regular the accents, the more salient and predictable the beat. Regularly occurring patterns of strong and weak beats are grouped together to create the percept of meter (e.g., waltz or march time; Palmer & Krumhansl, 1990; Handel, 1989). Rhythms with a consistent, predictable meter create a strong beat percept and are easier to remember and reproduce (Chapin et al., 2010; Grahn & Brett, 2007; Essens & Povel, 1985). Musical beat has no direct, one-to-one relationship with specific auditory features—it emerges from the interaction between acoustical cues and higher-level cognitive organization (Handel, 1989; Essens & Povel, 1985; Povel & Essens, 1985). We can even perceive a clear beat in a rhythm when there is no sound present at the beat location (Snyder & Large, 2005), and voluntarily imposing a beat modulates early auditory processing (Iversen, Repp, & Patel, 2009). Because of these features, metrical structure and musical beat can vary from salient to ambiguous, something well understood by composers and listeners—just compare a Sousa march to a jazz improvisation by Coltrane.

Previous neuroimaging studies have examined beat processing, but none has directly assessed the core phenomenon of finding and tapping to the beat of a musical rhythm. “Finding” the beat requires integrating acoustic cues to identify temporal regularity. Thus, understanding the neural basis of beat finding can shed light on more general brain mechanisms that parse incoming auditory information. Tapping to the beat requires using the identified metrical structure to predict upcoming auditory events and to pace movement. Understanding the neural basis of beat tapping can thus inform us about auditory–motor interactions relevant for motor control, music, and speech. A more fundamental question is whether beat finding and tapping are best understood as unique processes or whether they depend on more general neurocognitive mechanisms. Finally, we can ask whether they rely on basic sensorimotor mechanisms, on higher-order cognitive mechanisms, or on an interaction between the two.

Neuroimaging studies of beat processing have consistently shown activity in the BG, and these findings have been interpreted as demonstrating a specific role for these structures in beat identification or tracking (Chapin et al., 2010; Fujioka, Zendel, & Ross, 2010; Grahn & Rowe, 2009; Grahn & Brett, 2007). However, these experiments did not include an active beat tapping condition and thus could not link behavioral measures of beat finding or tapping to BG activity. Most importantly, these studies do not address how listeners identify the beat in more complex rhythms or how they move to the beat.

Previous work in our laboratory has examined brain activity when people tap in synchrony with rhythms that varied in metrical complexity or beat strength (Chen, Penhune, & Zatorre, 2008a, 2008b). We found that auditory association areas, the premotor cortex, and prefrontal regions were recruited during synchronization. Activity in all of these regions was greater for weaker beats, and functional connectivity analyses showed that these regions interact. Additionally, we found that musicians showed better rhythm synchronization and greater neural activity in PFC than nonmusicians (Chen et al., 2008b), perhaps because they have a stronger internal representation of the beat or are better able to hold it in memory (Kung, Tzeng, Hung, & Wu, 2011; Zatorre, Halpern, & Foster, 2010). Taken together, these findings led us to propose that auditory and premotor regions are engaged in integrating auditory information with the motor response and that prefrontal regions might be relevant for retrieving or maintaining the rhythm representation during reproduction (Zatorre, Chen, & Penhune, 2007).

Although our previous experiments manipulated beat strength, they did not directly examine beat processing because people tapped to each sound in the rhythm, rather than to the underlying beat. Therefore, the current fMRI experiment specifically tested musicians' ability to find and tap to the beat of rhythms that varied in metrical complexity or beat strength (Povel & Essens, 1985). We used a sparse-sampling design to measure beat tapping during scanning to link brain activity directly to performance. Furthermore, the stimuli were designed such that any effects of scanner noise on rhythmic processing were controlled. On the basis of our previous work, we hypothesized that auditory, premotor, and prefrontal regions would be engaged in beat finding and tapping, particularly for metrically complex rhythms when beat strength was weak. Another goal of the experiment was to elucidate the role of the BG in beat processing by testing its engagement across a range of beat strengths and by using a beat tapping response.

METHODS

Participants

Participants were trained musicians (instruments included strings, piano, percussion, woodwinds, and brass). Eleven musicians participated in the fMRI experiment (five women; mean age = 24.73 years, SD = 5.18 years, range = 20–38 years; mean years of training = 13.73, SD = 3.13, range = 8–18), and eight in the behavioral pilot study (seven women; mean age = 31.29 years, SD = 3.12 years, range = 26–37 years; mean years of training = 19.18, SD = 8.44, range = 13–27). No participant took part in both the pilot and fMRI studies. All participants were right-handed, neurologically healthy, and had normal hearing. The experimental protocol was approved by the Montreal Neurological Institute (MNI) and Hospital Research Ethics Board. After completing the study, participants were debriefed and compensated for their time.

Experimental Design for fMRI and Behavioral Pilot Studies

In both the behavioral pilot and fMRI experiments, participants were presented with three repetitions of a single rhythm (Figure 1). During the first presentation, they were instructed to listen closely and to try to identify the beat of each rhythm, and during the second and third presentations, they were instructed to tap to their selected beat.

Figure 1. 

Rhythm stimuli. This figure illustrates examples of the four levels of metrical complexity for rhythms in duple (upper half) and triple meter (lower half). All rhythms contained the same number of tones, which were arranged to create four levels of increasing metrical complexity: perfectly metric, strongly metric, metric, and weakly metric. (Left) Schematic depiction of the temporal organization of each rhythm, with the time of sound onset (x) in relation to the beat location (dot). The metrical stress or beat structure of the rhythms is represented along the x axis (S = strong; w = weak). (Right) The same rhythms in musical notation.

The noise generated by the MR acquisition is rhythmic and is known to affect the perception of auditory stimuli (Gaab, Gabrieli, & Glover, 2007a, 2007b). Most previous studies have not controlled for the possible effect of scanner noise on beat perception. Therefore, in the current study, we implemented two complementary design features to minimize any effect of scanner noise (Figure 2). First, we used the sparse sampling technique (Gaab et al., 2007b; Belin, Zatorre, Hoge, Evans, & Pike, 1999), in which stimuli are presented in silence, followed by scan acquisition. Second, we fitted the temporal structure of both the rhythms and the task trials to the temporal parameters of the fMRI acquisition. Thus, the duration of each trial, the scan acquisition, and the intertrial interval were all integer multiples of the smallest interonset interval (IOI) between sounds in the rhythms. Furthermore, the onset of the scanner bursts, the onset of each rhythm, and the onset of each trial all occurred on the predicted pulse of the rhythms.
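This alignment can be verified arithmetically. The following minimal Python sketch, using the smallest IOIs given under Stimuli and Task Conditions and the repetition time given under fMRI Data Acquisition and Analysis, checks that the scan period is an integer multiple of the smallest IOI at each tempo:

```python
# Minimal check that the sparse acquisition stays locked to the rhythmic
# pulse: the repetition time must be an integer multiple of the smallest
# interonset interval (IOI) at each tempo (values from the Methods).
TR_MS = 9360                    # repetition time of the acquisition (msec)
SMALLEST_IOIS_MS = (195, 260)   # fast and slow tempi

for ioi in SMALLEST_IOIS_MS:
    multiple, remainder = divmod(TR_MS, ioi)
    assert remainder == 0, "scanner bursts would drift off the pulse"
    print(f"TR = {multiple} x {ioi} msec")   # 48 x 195; 36 x 260
```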

Figure 2. 

Temporal alignment of the rhythm stimuli with the fMRI sparse sampling protocol. This figure illustrates the timing of the presentation of the rhythmic stimuli in the sparse-sampling protocol used for the fMRI and behavioral pilot studies. Each rhythm/isochronous sequence was presented three times (1 = listen; 2 and 3 = tap), interleaved with image acquisition (fMRI) or scanner noise (behavioral study).

To confirm that these manipulations were effective in minimizing the effect of scanner noise, we conducted a behavioral pilot experiment comparing beat finding and tapping with and without recorded scanner noise (Figure 3A). The pilot experiment used the identical stimuli and trial structure as the fMRI experiment.

Figure 3. 

Behavioral pilot protocol and results. (A) The protocol for the behavioral pilot study, which compared beat tapping performance with scanner noise (Noise) and without scanner noise (Click). Each rhythm sequence was presented three times (1 = listen; 2 and 3 = tap). (B) The data for (1) subjective rating of ease of tapping to the beat, (2) Cor/Total taps, (3) Cor/Predicted taps, (4) magnitude of onset asynchrony, and (5) percent deviation of ITI. Each variable is plotted across the four levels of metricality. Solid lines represent data from the Noise condition, and dotted lines represent data from the Click condition. Data are reported as means ± SE.

Stimuli and Task Conditions

Sixty-eight rhythms were created based on Povel and Essens' rules of metrical organization (Essens & Povel, 1985; Povel & Essens, 1985; Figure 1). Each rhythm was composed of eleven 100-msec woodblock sounds (for examples of the stimuli, visit http://www-psychology.concordia.ca/fac/penhune/index.html). By changing the pattern of IOIs between the sounds, we created rhythms that varied across four levels of metrical regularity, from strongly metrical rhythms, where the beat was easy to identify, to weakly metrical rhythms, where the beat was difficult to identify. To ensure that musicians needed to find the beat for each rhythm and could not simply carry over the beat from the previous item, half of the rhythms were in duple meter and half were in triple meter. There were also two tempi, fast and slow, with smallest IOIs of 195 and 260 msec, respectively.

The rhythm stimuli were developed based on the principle that an important feature influencing metrical strength is the number of sounds that occur at predicted beats for a specific meter (Essens & Povel, 1985; Povel & Essens, 1985). For example, as shown in Figure 1, in the strongly metrical duple meter, 8 of the 11 sounds fall on a predicted beat, whereas in the weakly metrical rhythms, only 5 sounds fall on the beat. Each of the duple rhythms contained 5 eighth notes (195 and 260 msec in the fast and slow tempi, respectively), 3 quarter notes (390 and 520 msec), 1 dotted quarter note (585 and 780 msec), and 1 half note (780 and 1040 msec). Thus, rhythms at the fast and slow tempi were 3.51 and 4.68 sec in duration, with interbeat intervals (IBIs) of 390 and 520 msec, respectively. With the same total number and type of notes, rhythms in the same tempo differed only in their temporal organization and the number of tones that fell on the predicted beat (5, 6, 7, or 8).
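As a check on these figures, the reported durations follow directly from the note inventory; a short Python sketch using the values above:

```python
# Recompute the duple-rhythm durations from the note inventory above.
NOTE_COUNTS = {"eighth": 5, "quarter": 3, "dotted_quarter": 1, "half": 1}
NOTE_MS = {
    "fast": {"eighth": 195, "quarter": 390, "dotted_quarter": 585, "half": 780},
    "slow": {"eighth": 260, "quarter": 520, "dotted_quarter": 780, "half": 1040},
}

for tempo, durations in NOTE_MS.items():
    total = sum(NOTE_COUNTS[note] * durations[note] for note in NOTE_COUNTS)
    print(tempo, total, "msec")   # fast: 3510 msec; slow: 4680 msec
```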

To allow for the fact that some musicians might perceive the duple meter as a quadruple meter, the location of the sounds that fell on the beat was controlled such that the four levels of metrical regularity remained the same in quadruple meter (i.e., IBIs were 780 and 1040 msec in the fast and slow tempi, respectively), with the number of sounds on the beat varying from 5 in the strongly metrical rhythms to 2 in the weakly metrical rhythms.

On the basis of the same rules, the number of tones on the beat in triple meter varied from 7 in the strongly metrical rhythms to 4 in the weakly metrical rhythms. To create strongly metrical triple rhythms without syncopation, the musical durations were changed slightly: each of the strongly metrical triple rhythms contained 5 eighth notes (195 and 260 msec in the fast and slow tempi, respectively), 2 quarter notes (390 and 520 msec), and 3 dotted quarter notes (585 and 780 msec). All of the other triple rhythms used the same durations as the duple rhythms. The IBI for triple rhythms was 585 msec in the fast tempo and 780 msec in the slow tempo.

On each trial, participants were presented with three repetitions of a single rhythm (Figures 3A and 4A). During the first presentation, they were instructed to listen and try to find the beat—the Find Beat condition. During the second and third presentations, they were instructed to tap in synchrony with the beat—the Tap Beat condition. To control for brain activity purely related to the tap response, a control condition was implemented in which participants listened and then tapped to isochronous rhythms (where all IOIs are equal) that matched the number of taps they made in the Tap Beat condition. To do this, for each participant, the number of taps made for each rhythm in the Tap Beat condition was recorded on-line during scanning, and the total rhythm duration was divided by that number. The resulting interval for each rhythm was used to generate an isochronous rhythm, which was presented in the next block of trials. For example, if the total duration of a rhythm was 4680 msec and the participant executed six taps in the Tap Beat condition, then the IOI for the Tap Isochronous condition would be 780 msec (4680/6). As with the rhythm conditions, during the first presentation of the isochronous rhythm, participants were instructed to listen only—the Listen Isochronous condition—and during the second and third presentations, they were asked to tap to each tone—the Tap Isochronous condition. Thus, the Tap Isochronous condition contained the same number of tap responses as the Tap Beat condition and hence controlled for this motor variable within each individual. The Find Beat and Tap Beat conditions were presented in blocks of eight rhythms, followed by a block of eight Listen Isochronous and Tap Isochronous trials based on the preceding rhythms. Finally, a Silence/Rest condition was inserted between each trial of all conditions.
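The yoked control interval described above reduces to a one-line computation; a minimal sketch:

```python
def isochronous_ioi(rhythm_duration_ms: float, n_taps: int) -> float:
    """IOI for the yoked Tap Isochronous control: total rhythm duration
    divided by the number of taps made in the Tap Beat condition."""
    return rhythm_duration_ms / n_taps

# Worked example from the text: 4680-msec rhythm, six taps -> 780 msec.
print(isochronous_ioi(4680, 6))  # 780.0
```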

Figure 4. 

fMRI protocol and behavioral results. (A) Protocol for the fMRI study. Each rhythm/isochronous sequence was presented three times (1 = listen; 2 and 3 = tap). (B) The data for (1) subjective rating of ease of tapping to the beat, (2) Cor/Total taps, (3) Cor/Predicted taps, (4) magnitude of onset asynchrony, and (5) percent deviation of ITI. Each variable is plotted across the four levels of metricality. Data are reported as means ± SE.

Procedure—fMRI Experiment

Familiarization

Two days before the fMRI session, participants were familiarized with the procedure using 16 rhythms not used in the fMRI session. Test trials were the same as in the Noise condition of the pilot experiment, in which participants heard recorded scanner noise between presentations of the rhythm stimuli (see below).

Scan Session

Thirty-two rhythms were used in the fMRI session. Participants completed two runs, each containing four blocks of eight sequences. The Find Beat and Tap Beat conditions were presented in the first and third blocks, and the corresponding Listen Isochronous and Tap Isochronous controls were presented in the second and fourth blocks (see Figure 4A). The number and order of meter types (duple/triple), metrical regularities (strong to weak), and tempi (fast/slow) were counterbalanced across participants. Rhythms were presented binaurally through Siemens MR-compatible pneumatic sound transmission headphones at a sound intensity of 75 dB sound pressure level (as measured with a sound pressure meter), using Presentation software (version 0.8, Neurobehavioral Systems) on a PC. All conditions were performed with eyes closed, and tap responses (key onset and offset times) were collected on-line. After the fMRI session, the 32 rhythms were presented again, and participants rated how easy they found it to tap to the beat of each sequence, using a 7-point scale (1 = very easy to 7 = very difficult), by pressing the corresponding number on the keyboard.

Procedure—Behavioral Pilot Experiment

To test the effect of scanner noise on beat finding and tapping, we compared performance with and without recorded scanner noise (Figure 3A). In the Noise condition, we interleaved recorded scanner noise between the presentations of the rhythms to mimic the sparse sampling protocol. In the no-noise (Click) condition, we interleaved a click (2000 Hz, 5 msec in duration) at the point corresponding to the onset of the scanner noise to control for the temporal reference provided by the noise. At the end of each Click trial, a tone was played to coincide with the end of the scanner noise and the completion of the trial. The order of presentation of the Noise and Click conditions was counterbalanced across subjects. Within the Noise and Click conditions, the number and order of meter types (duple/triple), metrical regularities (strong to weak), and tempi (fast/slow) were counterbalanced. Rhythms were presented at a comfortable intensity level through Sony headphones using Presentation software (version 0.8, Neurobehavioral Systems) on a PC. Participants' tap responses were recorded on-line and scored as described below. In addition, after each trial, participants rated how easy they found it to tap to the beat, using the same 7-point scale described above.

Behavioral Data Analysis

To analyze participants' ability to tap to the beat, the tap onsets from the Tap Beat condition for each rhythm were compared with the onsets of the closest predicted beat. First, tap response data were inspected to identify the beat level at which each participant had tapped (duple vs. quadruple or triple vs. sextuple) to avoid penalizing those who tapped at different levels. Then each tap was scored as correct or incorrect based on a tolerance window of ±20% of the correct interval duration. This is a moderately restrictive window, with previous studies using windows ranging from 10% to 50% (Patel, Iversen, Chen, & Repp, 2005; Drake, Jones, & Baruch, 2000; Parncutt, 1994). Two measures of accuracy were then calculated: Cor/Predicted is the number of correct beat taps divided by the predicted number of beats in each rhythm (depending on the level selected by the participant), and Cor/Total is the number of correct beat taps divided by the total number of taps made. Cor/Predicted tells us how accurate the participant was compared with an absolute criterion. Cor/Total tells us how accurate the participant was compared with their own output.
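A minimal sketch of this scoring scheme (the helper and its example inputs are illustrative, not the authors' code; selection of the tapped beat level is assumed to have been done already):

```python
def score_beat_taps(tap_onsets, beat_onsets, ibi_ms, tol=0.20):
    """Score taps against the closest predicted beat; a tap within
    +/- tol * IBI of a beat counts as correct (illustrative helper)."""
    correct = sum(
        1 for tap in tap_onsets
        if min(abs(tap - beat) for beat in beat_onsets) <= tol * ibi_ms
    )
    cor_total = correct / len(tap_onsets)       # accuracy vs. own output
    cor_predicted = correct / len(beat_onsets)  # accuracy vs. absolute criterion
    return cor_total, cor_predicted

# Hypothetical trial: beats every 520 msec, last tap falls outside the window.
print(score_beat_taps([510, 1050, 1700], [520, 1040, 1560], 520))
```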

The timing of each tap in the sequence was assessed using measures of intertap interval (ITI) and asynchrony. The ITI measures the ability to sustain the inferred metrical structure across the acoustic sequence. We calculated the deviation (in absolute value) of a participant's ITI relative to the nominal IBI, expressed as a percentage score (% ITI deviation); the greater the deviation, the poorer the performance. Asynchrony assesses the ability to time the onset of a motor response to the onset of a nominal beat. For this measure, the absolute value of the asynchrony was calculated because we were interested only in quantifying the amount of phase mismatch, without regard for whether participants were tapping ahead of or lagging behind the nominal beat. Lastly, all dependent variables were calculated for each synchronized tap and averaged across all trials for each rhythm type.
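These two timing measures can be sketched as follows (a hypothetical implementation consistent with the definitions above):

```python
import numpy as np

def pct_iti_deviation(tap_onsets, nominal_ibi_ms):
    """Mean absolute deviation of intertap intervals from the nominal
    IBI, as a percentage; larger values mean poorer performance."""
    itis = np.diff(tap_onsets)
    return 100.0 * np.mean(np.abs(itis - nominal_ibi_ms) / nominal_ibi_ms)

def mean_abs_asynchrony(tap_onsets, beat_onsets):
    """Mean absolute tap-to-beat asynchrony, ignoring whether taps led
    or lagged the nominal beat."""
    beats = np.asarray(beat_onsets, dtype=float)
    return float(np.mean([np.abs(beats - tap).min() for tap in tap_onsets]))
```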

To confirm these measures of beat tapping, we also computed three measures drawn from circular statistics to evaluate the accuracy of tapping to a beat. First, we calculated the synchronization coefficient or vector strength (Chapin et al., 2010; Patel, Iversen, Bregman, & Schultz, 2009; Fisher, 1993), which quantifies how well taps were time-locked to the perceived beat. Synchronization coefficients can range from 0 (no synchronization) to 1 (perfect synchronization). Second, we calculated the relative phase, which is the difference between the tap onset and the expected beat onset at a particular metrical level, normalized by the IBI. We used the absolute value of the relative phase, which can range from 0 to 180 degrees, with 0 indicating no difference (perfect phase synchrony) and 180 indicating antiphase synchrony. Finally, we calculated the angular deviation, a measure of variability in relative phase analogous to a standard deviation. Each dependent variable was calculated for each synchronized tap and averaged across all trials for the Tap Beat and Tap Isochronous conditions.
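The three circular measures can be computed from tap-to-beat phases. The sketch below expresses phases in radians and uses standard definitions of vector strength and angular deviation (e.g., Fisher, 1993); the details may differ from the authors' implementation:

```python
import numpy as np

def circular_stats(tap_onsets, beat_onsets, ibi_ms):
    """Vector strength, mean absolute relative phase (degrees), and
    angular deviation for taps scored against the nearest beat."""
    beats = np.asarray(beat_onsets, dtype=float)
    # Relative phase of each tap, normalized by the IBI (radians).
    nearest = np.array([beats[np.argmin(np.abs(beats - t))] for t in tap_onsets])
    phases = 2.0 * np.pi * (np.asarray(tap_onsets) - nearest) / ibi_ms
    mean_vector = np.mean(np.exp(1j * phases))
    r = np.abs(mean_vector)                    # 0 = none, 1 = perfect sync
    rel_phase_deg = np.degrees(np.abs(np.angle(mean_vector)))  # 0 to 180
    ang_dev = np.sqrt(2.0 * (1.0 - r))         # circular SD analogue (radians)
    return r, rel_phase_deg, ang_dev
```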

fMRI Data Acquisition and Analysis

Scanning was performed on a 1.5-T Siemens Sonata imager. High-resolution T1-weighted anatomical scans were collected for each participant (voxel size = 1 × 1 × 1 mm3, matrix size = 256 × 256). A total of 133 volumes were obtained for each of the two runs of the functional T2*-weighted gradient-echo-planar scans (132 = 16 rhythm sequences × 3 repetitions each, 16 isochronous sequences × 3 repetitions each, 32 silent baseline scans, 4 instruction scans), of which the first volume was discarded. Whole-head interleaved scans (n = 26) were taken, oriented in a direction orthogonal to that of the Sylvian fissure (echo time = 50 msec, repetition time = 9360 msec, voxel size = 5 × 5 × 5 mm3, matrix size = 64 × 64, field of view = 320 mm). A sparse sampling protocol (i.e., a long repetition time) ensured that the BOLD signal of the auditory stimuli would not be contaminated with the BOLD response to the acquisition noise (Belin et al., 1999). Furthermore, this paradigm avoids the behavioral, and thus neural, interactions that may occur when auditory stimuli of a rhythmical nature are processed concurrently with the loud rhythmical scanner noise.

Images from each scan were realigned with the second frame of the first run as reference, motion-corrected using the AFNI software (Cox, 1996), and smoothed using an 8-mm FWHM isotropic Gaussian kernel. For each participant, both anatomical and functional volumes were transformed into standard stereotaxic space (Talairach & Tournoux, 1988) based on the International Consortium for Brain Mapping (ICBM) 152 template (Mazziotta et al., 2001). Statistical analysis of the fMRI data was based on the general linear model (Y = Xβ + ɛ), performed using fMRISTAT (Worsley et al., 2002; available at www.math.mcgill.ca/keith/fmristat). Error (ɛ) and temporal drift are modeled and removed. A design matrix containing the explanatory variables (X) in each column and volume acquisitions in each row is constructed, and the linear model is then fit to the fMRI time series (Y), solving for the parameter estimates (β) in the least squares sense and yielding estimates of effects, standard errors, and t statistics for each contrast for each run. Runs are combined within and then across subjects using a mixed-effects model (Worsley et al., 2002), generating group statistical maps for each contrast of interest.
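The core of this voxelwise model can be sketched in a few lines. The following omits the drift and autocorrelation modeling that fMRISTAT performs and is a simplified stand-in, not the package itself:

```python
import numpy as np

def fit_glm(Y, X):
    """Ordinary least squares fit of Y = X @ beta + eps, voxelwise.
    Y: (n_scans, n_voxels) time series; X: (n_scans, n_regressors)."""
    beta, _, _, _ = np.linalg.lstsq(X, Y, rcond=None)
    resid = Y - X @ beta
    dof = X.shape[0] - np.linalg.matrix_rank(X)
    sigma2 = (resid ** 2).sum(axis=0) / dof     # residual variance per voxel
    return beta, sigma2

def contrast_t(beta, sigma2, X, c):
    """t statistic for contrast vector c: t = c'beta / SE(c'beta)."""
    c = np.asarray(c, dtype=float)
    var_c = sigma2 * (c @ np.linalg.pinv(X.T @ X) @ c)
    return (c @ beta) / np.sqrt(var_c)
```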

For the initial contrasts, we pooled together all trials, collapsed across tempo, meter, and degree of metricality. To determine the brain regions engaged during beat finding, we performed the contrast Find Beat versus Silence. To show that the brain regions engaged during beat finding were not recruited in the control condition, we performed the contrast Listen Isochronous versus Silence. Although it was not possible to equate the number of auditory stimuli between Find Beat and Listen Isochronous, we nonetheless performed a direct contrast between these conditions to clarify the results of the contrasts with Silence. To determine the brain regions engaged during beat tapping, we performed the contrast Tap Beat versus Tap Isochronous. Lastly, to show that the brain regions engaged during beat tapping were not recruited in the control condition, we performed the contrast Tap Isochronous versus Rest (Silence). To determine brain regions commonly recruited by the Find Beat and Tap Beat conditions, a conjunction analysis was performed for the two principal contrasts [Find Beat vs. Listen Isochronous] ∩ [Tap Beat vs. Tap Isochronous]. The conjunction analysis was implemented using the minimum of the t statistics obtained from each contrast (Friston, Penny, & Glaser, 2005). Thus, only those voxels that survived a common threshold were considered significantly activated in the conjunction analysis.
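A minimal sketch of the minimum-statistic conjunction applied to two voxelwise t maps:

```python
import numpy as np

def conjunction_min_t(t_a, t_b, threshold=5.0):
    """Minimum-statistic conjunction: a voxel is jointly active only if
    the smaller of its two t values exceeds the common threshold."""
    t_min = np.minimum(t_a, t_b)
    return t_min, t_min > threshold
```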

For the regression analyses, we pooled together trials collapsed across tempo and meter for the Find Beat and Tap Beat conditions. To determine the brain regions modulated by metricality, we modeled the four levels of beat strength as a linear regressor, where Level 1 represents rhythms that are strongly metrical and Level 4 represents rhythms that are weakly metrical. In addition, we modeled each participant's subjective rating score across the four levels to determine the brain regions related to subjective perception of metrical complexity. Lastly, we modeled each participant's performance score (Cor/Total) across the four levels to determine the brain regions related to beat tapping accuracy. Regressors for subjective beat strength and accuracy were weighted from strong to weak and from worst to best (parallel to the weighting for metric levels). On the basis of the results of the contrasts and regression analyses, % BOLD signal change was extracted from voxels in the superior temporal gyrus (STG) and ventrolateral PFC (VLPFC) and plotted for the four levels of metricality as well as for their respective isochronous control conditions.
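The parametric regressors can be sketched as follows; the trial ordering shown is hypothetical, and mean-centering is one conventional choice for separating the parametric effect from the condition's main effect (the authors' exact coding is not specified):

```python
import numpy as np

# Hypothetical sequence of metric levels across Tap Beat trials
# (1 = strongly metrical ... 4 = weakly metrical).
trial_levels = np.array([1, 4, 2, 3, 1, 2, 4, 3], dtype=float)

# Mean-center so the parametric regressor is orthogonal to the
# condition's main effect before entering it into the design matrix.
metricality_regressor = trial_levels - trial_levels.mean()
```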

We used functional connectivity analyses to determine how the time courses of neural activity in the STG (seed taken from the analysis regressing beat strength; see Table 5) and VLPFC (seed taken from the contrast Tap Beat vs. Tap Isochronous; see Table 2) were correlated with the time course of activity in the rest of the brain. To determine how the functional connectivity of these regions was modulated by the stimulus manipulation, we used a variant of the psychophysiological interactions method proposed by Friston et al. (1997; available at www.math.mcgill.ca/keith/fmristat). We looked for changes in connectivity when participants tapped to the beat of the most strongly metric rhythms compared with the weakly metric rhythms. In modeling the stimulus-modulated changes in temporal coherence, the effects of the stimulus are accounted for such that correlations are between the voxels of interest and not with those of the stimulus already identified from the covariation analysis. Thus, in the general linear model, an interaction product between the stimulus (X) and the reference voxel value (R) is added as a regressor at each time point for every voxel and is solved for: Yij = Xiβ1j + Riβ2j + XiRiβ3j + ɛij, where Yij is the voxel value at each frame i for each voxel j. Slice timing correction is also implemented, and the voxel values at each frame are extracted from native space. The effect, standard error, and t statistic are then estimated using fMRISTAT as described previously.
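A sketch of the interaction design under these definitions (a simplified stand-in for the fMRISTAT implementation; fit_glm and contrast_t refer to the sketch given earlier in this section):

```python
import numpy as np

def ppi_design(stimulus, seed):
    """Design matrix for Y_ij = X_i b1j + R_i b2j + X_i R_i b3j + e_ij:
    intercept, stimulus regressor X, seed time course R, and their
    product, whose coefficient b3 indexes stimulus-modulated coupling."""
    X = np.asarray(stimulus, dtype=float)
    R = np.asarray(seed, dtype=float)
    return np.column_stack([np.ones_like(R), X, R, X * R])

# The interaction effect can then be tested with, e.g.,
# contrast_t(beta, sigma2, design, c=[0, 0, 0, 1]).
```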

All analyses were evaluated using p < .05 (t statistic = 5.0), corrected for multiple comparisons as determined by the minimum of the Bonferroni correction based on Gaussian random field theory and the discrete local maximum method (Worsley, 2005). Regions that were predicted a priori [STG, ventral premotor cortex (vPMC), dorsal premotor cortex (dPMC), PFC, dorsolateral PFC (DLPFC), VLPFC] were evaluated using a false discovery rate set at p < .05. Localization of peak neural activity was classified using anatomical atlases (Schmahmann et al., 1996; Duvernoy, 1991; Talairach & Tournoux, 1988) and/or previously established criteria (Picard & Strick, 1996; Westbury, Zatorre, & Evans, 1996).
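The exact FDR procedure is not specified in the text; a standard Benjamini–Hochberg sketch for thresholding voxelwise p values in the a priori regions would look like this:

```python
import numpy as np

def bh_fdr_mask(p_values, q=0.05):
    """Benjamini-Hochberg step-up procedure: reject all hypotheses with
    p <= the largest p(k) satisfying p(k) <= q * k / m."""
    p = np.asarray(p_values, dtype=float)
    order = np.argsort(p)
    ranked = p[order]
    m = p.size
    below = ranked <= q * (np.arange(1, m + 1) / m)
    mask = np.zeros(m, dtype=bool)
    if below.any():
        k = np.nonzero(below)[0].max()   # largest rank passing the bound
        mask[order[: k + 1]] = True      # reject everything up to p(k)
    return mask
```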

RESULTS

Behavioral Pilot Study

Measures of behavioral performance and subjective ratings were evaluated using two-way repeated-measures ANOVAs comparing the Click and Noise conditions across the four levels of metrical complexity (Figure 3B). Critically, no significant differences between the Click and Noise conditions were obtained for any of the variables analyzed, and there were no significant interactions [F(1, 7) values for all analyses < 1]. As predicted, metrical complexity significantly influenced tapping performance such that participants were less accurate for more metrically complex rhythms [Cor/Total: F(3, 21) = 27.74, p < .001; Cor/Predicted: F(3, 21) = 21.29, p < .001; onset asynchrony: F(3, 21) = 11.26, p < .001; and ITI deviation: F(3, 21) = 12.76, p < .001]. The synchronization coefficient showed a significant decrease across levels of metricality, F(3, 21) = 11.34, p < .001, and significant increases were demonstrated in the mean direction of relative phase, F(3, 21) = 27.81, p < .001, and angular deviation, F(3, 21) = 11.07, p < .001. These results confirm that beat tapping was less consistent for more complex meters, where beat strength was weaker. Consistent with behavioral performance, subjective ratings showed that participants rated strongly metrical rhythms as easier to tap to than weakly metrical rhythms, F(3, 21) = 29.18, p < .001. These results demonstrate that the combined use of a sparse sampling paradigm and the matching of the temporal structure of the rhythms and task trials to the pace of the scanner noise was effective in minimizing the impact of noise on beat finding and tapping.

fMRI Behavioral Results

Tap Beat Condition

Measures of behavioral performance from the Tap Beat condition and the subjective ratings collected after scanning were evaluated using one-way repeated-measures ANOVAs across the four levels of metrical regularity (Figure 4B). The manipulation of metricality significantly influenced tapping performance for both indices of accuracy [Cor/Total: F(3, 30) = 11.13, p < .001; Cor/Predicted: F(3, 30) = 8.61, p < .001] and for ITI deviation, F(3, 30) = 6.74, p = .001; onset asynchrony showed a trend in the same direction but did not reach significance, F(3, 30) = 1.90, p = .152. Analysis of the synchronization coefficient values showed a significant decrease across levels of metricality, F(3, 30) = 5.37, p < .005, confirming that beat tapping was less consistent in the weaker beat conditions (Table 1). Analyses of the mean relative phase and angular deviation showed consistent results, with an increase in phase discrepancy and variability with decreasing metricality [Table 1; relative phase: F(3, 30) = 9.24, p < .001; angular deviation: F(3, 30) = 5.42, p < .005]. Ratings showed that participants also found it easier to tap to the beat when the rhythm was strongly metric than when it was weakly metric, F(3, 30) = 14.42, p < .001. Critically, accuracy (Cor/Total: R2 = .32, p < .001; Cor/Predicted: R2 = .16, p = .008), ITI deviation (R2 = .09, p = .05), and subjective rating (R2 = .23, p = .001) all showed significant linear regressions across levels of metricality, indicating that musicians tapped less precisely to the predicted beats with increasing metrical ambiguity.

Table 1. 

Values of the Circular Statistics for Rhythmic and Isochronous Sequences


Measure                         Sequence           Level 1   Level 2   Level 3   Level 4
Synchronization coefficient     Rhythmic           0.688     0.646     0.622     0.628
                                Isochronous        0.947     0.948     0.949     0.949
Mean relative phase (degrees)   Rhythmic           61.5      68.6      77.1      82.5
                                Isochronous        27.1      27.3      26.8      27.9
Angular deviation (degrees)     Rhythmic           0.863     0.934     0.974     0.964
                                Isochronous        0.328     0.327     0.324     0.322
Number of taps                  Rhythmic           13.6      14.4      13.9      15.1
                                Isochronous        12.9      14.1      13.5      14.6
Rate (msec)                     Rhythmic (ITI)     673.49    648.15    665.00    614.51
                                Isochronous (IOI)  683.86    645.60    663.51    620.57

Table details behavioral measures of beat synchronization for the rhythmic and isochronous sequences in the tapping and finding conditions: Level 1 = perfectly metric; Level 2 = strongly metric; Level 3 = metric; Level 4 = weakly metric.

We also assessed the level of the metrical structure at which people tapped. For the duple meter, approximately 50% of the rhythms were tapped as duple and 50% as quadruple meter, with no significant differences across levels of metricality or between the fast and slow rates (all paired t tests, p > .05). For the triple meter, more than 95% of the sequences were tapped as triple and less than 5% as sextuple meter; the results were again similar across levels of metricality and rates (all paired t tests, p > .05). Finally, people did not appear to tap at different levels within the same rhythm, as confirmed by an analysis of the average ITI across the four levels of metricality, which showed no significant differences, F(3, 30) = 1.6, p > .05. If participants had changed rates within sequences, this would have resulted in differing ITIs, especially for the more complex rhythms, where beat identification is more difficult.

Isochronous Control Conditions

The Listen and Tap Isochronous control conditions were designed to control for the effect of the rate and number of taps made in the Tap Beat condition. To confirm that rate was similar in the two conditions, we compared the IOIs of the auditory stimuli in the Listen Isochronous condition with the ITIs of the Tap Beat condition and found no significant differences between conditions, F(1, 10) < 1, p = .57, or across levels of metricality, F(3, 30) = 2.40, p = .09 (Table 1). When comparing the number of taps made in the Tap Beat and the Tap Isochronous conditions, we found a significant difference between conditions, F(1, 10) = 10.12, p = .01 (Mean Rhythm = 14.25; Mean Isochronous = 13.76), and a significant effect across levels, F(3, 30) = 3.93, p = .018 (Mean Perfectly Metric = 13.2; Mean Strongly Metric = 14.2; Mean Metric = 13.7; Mean Weakly Metric = 14.9). People made slightly fewer taps in the Tap Isochronous condition and fewer taps for the Perfectly Metrical compared with the Weakly Metrical conditions. These differences were relatively small (Tap Beat − Tap Isochronous = 0.49 taps; Weakly Metric − Perfectly Metric = 1.7 taps) and were thus unlikely to result in differential BOLD response for the two conditions.

To assess accuracy, we compared Cor/Total for Tap Isochronous with Tap Beat and found a significant interaction between conditions and metrical levels, F(3, 30) = 11.62, p < .001. In the Tap Beat condition, there was a significant effect of metrical level, such that tapping was less accurate for more weakly metrical rhythms, F(3, 30) = 11.13, p < .001 (see Figure 4). In the Tap Isochronous condition, there was no effect of level, with equal accuracy for all levels, F(3, 30) < 1, p = .938 (mean Cor/Total Tap Isochronous = 88.11% ± 3.99; Perfectly Metric = 87.4; Strongly Metric = 88.3; Metric = 88.1; Weakly Metric = 88.6). Although we might have expected almost perfect accuracy in the Tap Isochronous condition, people missed occasional taps at the beginning or end of the sequences. Importantly, however, analysis of the synchronization coefficient values showed almost perfect synchronization for the Tap Isochronous condition (mean = 0.948), with no significant differences across metrical levels (Table 1), F(3, 30) < 1, p = .867. Analyses of the mean relative phase and angular deviation also showed overall smaller phase discrepancies and lower variability compared with the Tap Beat conditions [for both measures, F(1, 10) > 149.71, ps < .001], with no differences across levels of metricality (Table 1; F(3, 30) < 1 for both measures, ps > .72).

Finally, to assess whether the tempo of the isochronous conditions might differ from the predicted beat of the rhythms, we compared the IOIs of the isochronous conditions to the predicted IBIs of the rhythms across the four levels of metricality. The results showed no differences between the conditions, F(1, 10) < 1, p = .765, or across levels of metricality (for all four levels, paired t(10) < 1.59, ps > .143), indicating that the fit of the beat of the isochronous conditions to the scanner noise was similar to that of the other rhythm conditions.

fMRI Results

Finding the Beat

To identify the basic network of brain regions engaged in beat finding, we contrasted Find Beat both with Silence and with Listen Isochronous. Beat finding recruited a bilateral network of auditory, motor, and prefrontal regions including STG, the caudate nucleus, dPMC, vPMC, the cerebellum, DLPFC, and VLPFC (Figure 5 and Table 2).

Figure 5. 

Brain regions engaged in finding the beat and tapping to the beat. (A) The results for beat finding. (B) The results for listening to an isochronous beat. (C) The results for tapping to the beat. The color bar represents t values; range 10.0–4.0. (a) VLPFC, (b) STG/STS, (c) pre-SMA and SMA, (d) vPMC and dPMC, (e) caudate, (f) cerebellum (Lobules VI and VIIIa). In the graphs at the bottom %BOLD signal change is plotted for voxels of interest in left and right VLPFC and right STG across the four levels of metricality for each condition (Find Beat, Listen Isochronous, Tap Beat, and Tap Isochronous). Data are reported as means ± SE.

Table 2. 

Beat Finding and Tapping

Region | Beat Finding (Find Beat vs. Silence): t, x, y, z | Beat Finding (Find Beat vs. Listen Iso): t, x, y, z | Beat Tapping (Tap Beat vs. Tap Iso): t, x, y, z
L STG 6.17 −64 −40 16         
6.04 −38 −34 14         
5.46 −48 −20 5.53 −44 −10 −8 4.27* −46 −22 
5.18 −48 −34 12     3.66* −46 −36 
5.08 −62 −28         
L ant STG     5.58 −52 −2 −2     
    5.05 −64 −10     
R STG 7.46 62 −34 10         
7.00 62 −26         
6.63 54 −22 4.5* 54 −24 5.53 54 −22 
5.05 42 −36 14     5.27 50 −30 
L VLPFC 6.94 −30 22 −2 6.11 −28 22 −2 6.77 −32 22 −2 
R VLPFC 6.70 32 22 7.44 34 22 −2 6.56 34 22 −2 
5.97 34 20     6.48 40 16 −2 
R lat VLPFC 4.94* 48 20 7.55 50 20 −4 7.22 54 18 −6 
L DLPFC 3.26* −40 32 32         
R DLPFC         4.91* 44 44 12 
R BA 8         5.53 40 26 22 
        5.20 34 20 16 
        4.51* 34 10 46 
L caudate 7.39 −14 10 −2 6.21 −14     
R caudate 7.53 18 14 6.75 18 10     
L vPMC 5.55 −46 20     5.13 −42 26 
R vPMC         6.04 48 22 
        4.58* 38 36 
        3.48* 46 12 48 
L dPMC 4.15* −48 −4 54     3.93* −46 −2 58 
R dPMC 4.67* 50 −2 52     3.73* 42 50 
L vPMC/BA 44 5.45 −54 12         
R vPMC/BA 44 5.26 50 14 24 5.9 56 12 5.77 54 12 
L BA 44         5.23 −54 12 −6 
5.19 50 12 20         
L VIII 8.96 −30 −62 −50     5.45 −28 −66 −50 
7.43 −24 −68 −46     5.76 −32 −60 −40 
R VIII 7.18 30 −64 −50     5.97 30 −62 −56 
L VI 5.5 −28 −64 −26         
5.21 −32 −58 −28         
R VI 5.62 42 −58 −30         
5.61 36 −64 −26         
L Crus I/II 6.12 −12 −76 −34         
L pre-SMA 6.41 −4 60         
L pre-SMA 5.33 −4 16 44 5.10 −2 22 42     
R pre-SMA 6.44 12 52 5.85 14 48     
L IPS 5.07 −30 −50 40         
L thal         5.11 −12 −2 12 
R thal         5.63 −6 12 
R ACC     5.10 12 32 22     

Brain regions recruited during beat finding or tapping. The stereotaxic coordinates of peak activations are given in MNI space, along with peak t values. Brain regions predicted a priori (STG, vPMC, dPMC, VLPFC, DLPFC) are marked with an asterisk (*); these are significant with false discovery rate (FDR) correction.

To identify brain regions engaged in listening to a very simple meter, we compared Listen Isochronous with Silence (Rest). This contrast showed activity in bilateral STG, vPMC, right dPMC, and left DLPFC (Figure 5 and Table 3). No activity in the BG or VLPFC was detected, even at a lower threshold (p < .001, uncorrected). Brain regions significantly more active in the Find Beat versus Listen Isochronous condition included bilateral STG, caudate, and VLPFC (Table 2).

Table 3. 

Listening and Tapping to an Isochronous Sequence

Region | Listen Iso vs. Silence: t, x, y, z | Tap Iso vs. Silence: t, x, y, z
L STG 5.50 −64 −40 16     
5.34 −52 −40 20     
5.03 −40 −36 14 5.15 −40 −36 16 
3.88* −62 −28     
R STG 7.09 62 −34 12 6.71 62 −34 12 
6.13 42 −40 12 5.42 42 −36 14 
L M1     6.22 −38 −22 52 
L vPMC 4.22* −44 20 4.21* −50 10 
R vPMC 4.38* 50 14 28 3.68* 50 
L dPMC     5.69 −46 −16 54 
R dPMC 4.06* 50 −4 52 5.22 52 −2 52 
L DLPFC 3.90* −38 36 32     
3.83* −38 40 32     
L SMA     5.09 −4 62 
L VIII 5.14 −28 −62 −50 5.15 −26 −58 −24 
R VIII 5.20 26 −64 −50 6.02 24 −62 −52 
R VI 5.09 42 −54 −32     
L Crus I/II 5.90 −10 −74 −36     
R Crus I/II 5.90 10 −78 −34 8.77 12 −54 −18 
    7.73 −66 −16 
L put     6.08 −20 −8 −2 
R caudate     5.96 16 20 

Brain regions recruited when listening or tapping to the isochronous control sequence, relative to silence. The stereotaxic coordinates of peak activations are given in MNI space, along with peak t values. Brain regions predicted a priori (STG, vPMC, dPMC, VLPFC, DLPFC) are marked with an asterisk (*); these are significant with FDR correction.

Tapping to the Beat

To identify brain regions specifically engaged in tapping to the beat, we contrasted the Tap Beat condition, which requires tapping to an internally generated beat, with the Tap Isochronous condition, which does not require true beat generation and which controls for the exact number of movements made by each participant in the Tap Beat condition. Regions that were more active in the Tap Beat condition were bilateral STG, dPMC, vPMC, VLPFC, and right DLPFC (Figure 5 and Table 2).

To identify brain regions engaged in execution of the tap response to a simple meter, we compared Tap Isochronous versus Rest (Silence). Regions that were more active in the Tap Isochronous condition were bilateral STG, left M1, bilateral dPMC and vPMC, cerebellar lobule VIII, left putamen, and right caudate (Table 3). No significant activity in VLPFC was observed in this contrast.

Comparing Beat Finding and Beat Tapping

To assess regions that were commonly active for beat finding and tapping, we examined the conjunction of the two principal contrasts [Find Beat vs. Listen Isochronous] ∩ [Tap Beat vs. Tap Isochronous]. Regions commonly active in the two conditions included bilateral STG and VLPFC, bilateral vPMC, and left dPMC (Table 4). To assess regions that differed between conditions, we examined the contrast [Find Beat vs. Listen Isochronous] versus [Tap Beat vs. Tap Isochronous], which revealed no regions that differed significantly between conditions.

Table 4. 

Finding and Tapping to the Beat: Common Activations (Conjunction)

Region | (Find Beat vs. Listen Iso) ∩ (Tap Beat vs. Tap Iso): t, x, y, z
L STG 4.05* −46 −20 
L ant STG 4.35* −52 −8 −2 
R STG 4.51* 54 −24 
L VLPFC 6.09 −30 22 −2 
R VLPFC 6.56 34 22 −2 
R lat VLPFC 7.01 50 18 −6 
L vPMC 3.90* −50 10 20 
R vPMC 4.66* 50 10 18 
4.10* 38 −2 36 
L dPMC 3.95* −46 −2 58 

Brain regions commonly active during beat finding and tapping. The stereotaxic coordinates of peak activations are given in MNI space, along with peak t values. A threshold of t = 5 was considered significant, based on Hayasaka, Phan, Liberzon, Worsley, & Nichols (2004). Brain regions predicted a priori (STG, vPMC, dPMC, VLPFC, DLPFC) are marked with an asterisk (*); these are significant with FDR correction.

Regression Analyses

Behavioral findings showed that as rhythms became more metrically complex, participants both perceived the rhythms as having a weaker beat and were less accurate in tapping to that beat. To identify brain regions whose activity was sensitive to metrical complexity or beat strength, we performed a regression analysis modeling the four levels of metricality. We also conducted regression analyses using individual participants' performance scores and ratings of beat strength as variables. For the Find Beat condition, there were no regions whose activity significantly correlated with level of metricality or with the perceptual or performance measures. For the Tap Beat condition, results showed that activity in right STG and VLPFC increased across levels of metrical complexity (Figure 6 and Table 5). Similar results were obtained when we modeled ratings of beat strength and performance scores (Cor/Total and Cor/Predicted). This confirms that activity in STG and VLPFC was directly related to measures of beat perception and production as well as to the experimenter-defined independent variable of metrical level. Values for the three behavioral variables were significantly correlated (Pearson r: Cor/Total vs. Cor/Predicted = 0.56; Cor/Total vs. Rating = −0.49; Cor/Predicted vs. Rating = −0.47; all ps < .01), likely contributing to the similarity of the results across these analyses. Analyses for ITI deviation and asynchrony showed similar findings but did not reach statistical threshold. No regions showed the opposite pattern of increasing activity with decreasing metrical complexity or stronger perceived beat.

Figure 6. 

Brain regions modulated by temporal complexity. (A) The results of the regression analysis across the four levels of metrical complexity. (B, C) The results of the regression analyses for subjective rating (B) and tapping performance (C). Each participant's subjective rating score and performance score (Cor/Total) across the four levels of metricality were modeled. Regions where neural activity shows a linear relationship with metricality are shown. The color bar represents t values; range from 5.0 to 3.0 for metricality and rating images and from −5.0 to −3.0 for performance images. (a) VLPFC, (b) STG/STS, (c) pre-SMA and SMA, (d) vPMC and dPMC, (g) DLPFC.

Table 5. 

Correlations of Neural Activity with the Beat

Region | Correlation with Beat Strength: t, x, y, z | Correlation with Beat Rating: t, x, y, z | Correlation with Performance: t, x, y, z
L STG 
R STG 4.93* 60 −38 3.90* 58 −38     
4.91* 48 −40 12 4.20* 46 −40 10     
R STS 4.41* 56 −34 −4     −3.74* 50 −24 −6 
L VLPFC         −3.60* −38 18 
R VLPFC 4.89* 38 18 4.58* 38 20 −3.97* 34 24 
    5.33 48 18     
    5.14 50 22 −2     
R DLPFC     4.07* 46 36 26 −5.23 36 40 38 
L BA 8 
R BA 8 5.20 38 24 22 5.15 38 24 22 −6.05 38 22 24 
        −5.05 44 16 32 
R vPMC     4.02* 44 10 28 −4.73* 48 10 44 
        −4.09* 38 −2 42 
L dPMC         −3.70* −44 −4 52 
R dPMC         −4.43* 48 10 50 
        −4.37* 22 58 
        −3.74* 36 58 
        −3.64* 40 −2 56 
R MTG     5.54 56 −34 −4     
L VIII         −5.38 −20 −76 −44 

Results from the three regression analyses, indicating brain regions whose neural activity correlated with beat strength, beat rating, and performance. The stereotaxic coordinates of peak activations are given in MNI space, along with peak t values. Brain regions predicted a priori (STG, vPMC, dPMC, VLPFC, DLPFC) are marked with an asterisk (*); these are significant with FDR correction.

To test whether the linear relationship with beat strength differed for Beat Finding and Tapping in VLPFC and STG, we contrasted the results of the regression analyses between the two conditions in these regions. These results revealed a greater correlation for Beat Tapping in right VLPFC (44, 16, 1; t = 3.11; p < .0002 uncorrected) and right STS (52, −34, −6; t = 3.83; p < .0002 uncorrected) adjacent to the STG location found for Beat Tapping.

To visualize the results of the regression analyses, % BOLD signal change for each condition was extracted from peak voxels identified from the Tap Rhythm versus Tap Isochronous contrast for bilateral VLPFC and from the regression analysis for the right STG. These values were plotted for each condition across the four levels of metrical complexity (Figure 5). The graphs reflect the results of the regression analysis, showing a linear increase for the Tap Beat condition only, in which neural activity in STG and VLPFC increased as a function of increasing metrical complexity. They also show that neither region was modulated by metrical complexity in the Find Beat condition and that VLPFC was not engaged during the Listen Isochronous and Tap Isochronous control conditions.
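
To make the logic of these regression analyses concrete, the following sketch fits a per-participant linear trend of % BOLD signal change across the four metricality levels; a reliably nonzero slope corresponds to the linear relationships reported above. This is an illustrative Python reconstruction with hypothetical numbers, not the analysis pipeline used in the study.

    import numpy as np

    levels = np.array([1.0, 2.0, 3.0, 4.0])  # metrical complexity: simple -> complex

    def complexity_slope(bold):
        """Least-squares slope of % BOLD signal change across the four levels."""
        X = np.column_stack([np.ones_like(levels), levels])  # intercept + level
        beta, *_ = np.linalg.lstsq(X, bold, rcond=None)
        return beta[1]  # > 0: activity rises as metrical complexity increases

    # Hypothetical per-level % BOLD change for one participant
    tap_beat = np.array([0.12, 0.18, 0.25, 0.31])   # rises with complexity
    find_beat = np.array([0.20, 0.21, 0.19, 0.22])  # roughly flat
    print(complexity_slope(tap_beat))   # ~0.064, a positive linear trend
    print(complexity_slope(find_beat))  # ~0.004, no appreciable trend

On this construction, the pattern in Figure 5 would correspond to positive slopes in STG and VLPFC for Tap Beat and near-zero slopes for Find Beat.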

Stimulus-modulated Functional Connectivity Analyses

To evaluate whether neural activity in STG and VLPFC was temporally correlated with activity in the rest of the brain, and to assess whether any correlated activity was modulated by metrical complexity, we performed stimulus-modulated functional connectivity analyses. Voxels in the right STG (60, −38, 8) and right VLPFC (34, 22, −2) were used as seeds in two separate analyses. Both analyses showed that neural activity in right STG and right VLPFC was temporally correlated in the Tap Beat condition and that the correlation was greater for the weakly metrical than for the strongly metrical rhythms (Figure 7 and Table 6). Right STG and VLPFC also showed stimulus-modulated coupling with premotor cortex and the inferior parietal lobule. Very importantly, right VLPFC also showed stimulus-modulated coupling with the right DLPFC and with bilateral BG at the border of the caudate and putamen.
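
The core of a stimulus-modulated connectivity analysis can be sketched as a regression in which the interaction between a seed region's time course and the stimulus condition predicts activity in a target voxel, in the spirit of the psychophysiological interaction approach of Friston et al. (1997). The sketch below is a minimal Python illustration on simulated data; the scan-wise complexity regressor and variable names are assumptions made for illustration, not the study's actual design matrix.

    import numpy as np

    def ppi_design(seed_ts, complexity_ts):
        """Regressors: seed activity, stimulus (metrical complexity), and their
        interaction; the interaction weight indexes stimulus-modulated coupling."""
        seed = seed_ts - seed_ts.mean()
        psych = complexity_ts - complexity_ts.mean()
        return np.column_stack([np.ones_like(seed), seed, psych, seed * psych])

    rng = np.random.default_rng(0)
    seed_ts = rng.standard_normal(200)                # e.g., the right STG seed
    complexity = np.repeat([1.0, 2.0, 3.0, 4.0], 50)  # complexity level per scan
    X = ppi_design(seed_ts, complexity)

    # Fit a target voxel's time series; the weight on the interaction column
    # tests whether seed-target coupling changes with metrical complexity.
    target = rng.standard_normal(200)
    beta, *_ = np.linalg.lstsq(X, target, rcond=None)
    print(beta[3])

Under this framing, a positive interaction weight for a BG target voxel would correspond to the stronger VLPFC coupling observed for the weakly metrical rhythms.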

Figure 7. 

Functional connectivity results. (Top) Brain regions whose activity was more strongly coupled with activity in right VLPFC for the weak as compared with the strong metrical rhythms. (Bottom) Brain regions whose activity was more strongly coupled with activity in the right STG for the weak as compared with the strong metrical rhythms. The color bar represents t values; range, 10.0–5.0.


Table 6. 

Stimulus-modulated Functional Connectivity

Region | Seed: Right STG (60, −38, 8): t (x, y, z) | Seed: Right VLPFC (34, 22, −2): t (x, y, z)
L STG | 6.47 (−56, −34, …) | 5.29 (−38, −38, 14)
  | 6.19 (−60, −44, 12) | 
  | 6.09 (−48, −40, …) | 
  | 5.46 (−52, −20, −2) | 
R STG | 7.74 (52, −26, …) | 6.69 (58, −34, 10)
  |  | 5.77 (44, −36, 12)
L VLPFC | 5.07 (−32, 22, −2) | 9.41 (−30, 20, …)
R VLPFC | 6.17 (44, 26, …) | 
  | 5.00 (34, 22, …) | 
L vPMC |  | 6.65 (−46, …, 26)
R vPMC |  | 7.07 (48, 16, 22)
  |  | 5.05 (34, …, 48)
L dPMC | 5.93 (−38, −20, 50) | 
  | 5.78 (−44, −14, 58) | 
R dPMC | 5.16 (58, −4, 52) | 
L SMC | 5.79 (−28, −28, 58) | 
L IPL | 5.27 (−64, −50, 20) | 5.12 (−36, −52, 46)
R IPL |  | 6.02 (44, −46, 44)
L ACC |  | 6.59 (−10, 28, 26)
R ACC |  | 9.09 (…, 24, 36)
L pre-SMA |  | 6.97 (−6, …, 54)
R DLPFC |  | 6.67 (38, 26, 26)
  |  | 6.06 (36, 38, 20)
L caud/put |  | 6.32 (−14, 12, …)
R caud/put |  | 5.49 (18, …, …)
L VIII |  | 6.56 (−26, −66, −50)
R VIII |  | 5.14 (24, −64, −50)
L VI |  | 5.11 (−26, −64, −28)
R V/VI |  | 5.43 (…, −64, −22)

Results from two stimulus-modulated functional connectivity analyses, indicating brain regions whose neural activity is correlated with, and modulated by, that of the seed voxel (in right STG or right VLPFC). The stereotaxic coordinates of peak activations are given in MNI space, along with peak t values.

DISCUSSION

This experiment examined the brain networks involved in identifying and tapping to the beat of musical rhythms. In contrast to previous experiments that used passive perceptual paradigms or required reproduction of an entire rhythm, here we asked participants to actively find and tap to the underlying beat of rhythms that varied in metrical complexity. Our results showed that beat finding and tapping recruit largely overlapping auditory, motor, and prefrontal regions, including the STG, premotor cortex, and VLPFC. Activity in STG and VLPFC was more strongly modulated by beat strength during beat tapping than during beat finding, with greater activity for more metrically complex rhythms with weaker beats. Furthermore, activity in these regions was negatively correlated with both perceived beat strength and tapping performance. Functional connectivity analyses found that activity in VLPFC and STG showed greater temporal correlation when tapping to weak as compared with strong beats. These analyses also revealed temporal coupling between VLPFC and the BG during tapping to weaker beats. Taken together, our findings suggest that BG mechanisms are engaged in beat finding and tapping, but that their activity is not modulated by beat strength and is not correlated with either beat perception or production. However, when tapping to the beat of more complex rhythms, working memory retrieval mechanisms in VLPFC are recruited and interact with mechanisms in the BG.

Previous neuroimaging studies of beat processing did not control for the effects of scanner noise and either did not collect behavioral measures or did not relate them to brain activity. The results of our behavioral pilot study showed that beat strength ratings and tapping performance were equivalent with and without scanner noise (Figure 3). This confirmed that the combination of sparse sampling and careful design of the metric structure of the rhythm stimuli was successful in controlling for noise interference. For both the pilot and the fMRI experiment, behavioral findings showed that musicians were able to find and tap to the beat of all rhythms but were more accurate in tapping to the beat of metrically simpler rhythms (Figure 4). This is consistent with previous literature showing that tapping to a strong beat is more accurate than tapping to a weak beat (Patel et al., 2005; Essens & Povel, 1985). These results confirm our manipulation of metrical complexity and beat strength and validate the use of the behavioral measures in the regression and stimulus-modulated functional connectivity analyses.

Conjunction analysis showed that beat finding and tapping engaged overlapping regions of STG, PMC, and VLPFC. Contrasts between the two conditions revealed no significant differences. Engagement of STG and PMC is consistent with previous findings showing that these auditory and motor regions are engaged during both listening to and tapping in synchrony with musical rhythms (Chapin et al., 2010; Chen et al., 2008a, 2008b; Karabanov, Blom, Forsman, & Ullén, 2008; Chen, Zatorre, & Penhune, 2006; Bengtsson, Ehrsson, Forssberg, & Ullén, 2004, 2005) as well as other musical tasks (Chen, Rae, & Watkins, 2012; Jancke, 2012; Karabanov et al., 2008; Lahav, Saltzman, & Schlaug, 2007).

Beat finding and tapping also recruited VLPFC, a component of the PFC working memory system. On the basis of work in both humans and monkeys, it has been proposed that VLPFC interacts with posterior sensory regions during active memory retrieval when retrieval requires top–down control or selection among competing options (Kostopoulos & Petrides, 2003, 2008; Cadoret & Petrides, 2007; Kostopoulos, Albanese, & Petrides, 2007; Badre, Poldrack, Pare-Blagoev, Insler, & Wagner, 2005). In our task, as metrical complexity increases, there may be no single beat that best fits a given rhythm. Thus, tapping to the beat of metrically complex rhythms would require active retrieval of the selected beat from among competing options.

This interpretation is consistent with the results of other experiments requiring active memory retrieval in a musical context. Vuust, Roepstorff, Wallentin, Mouridsen, and Ostergaard (2006) showed that VLPFC was active when musicians were required to tap to the primary beat of a complex polyrhythm, a condition somewhat analogous to tapping to the beat of the more complex rhythms in our study. They also showed that performance was correlated with VLPFC activity, consistent with the current findings. In contrast to our results, however, VLPFC in their study was active only when participants tapped to the main rhythm, not during listening. This is likely because they used a single stimulus that was repeated, making retrieval demands minimal during listening. VLPFC has also been shown to be engaged when musicians encode rhythmic sequences (Konoike et al., 2012), when they hold atonal pitch sequences in memory (Schulze, Mueller, & Koelsch, 2011), and during complex auditory imagery tasks (Zatorre et al., 2010; Leaver, Van Lare, Zielinski, Halpern, & Rauschecker, 2009).

In the beat tapping condition, activity in both VLPFC and STG increased as metrical complexity increased and beat strength decreased, and this activity was also correlated with tapping performance (Figures 5 and 6). Furthermore, temporally correlated activity in these regions was greater for weak compared with strong beats (Figure 7). Auditory regions within the STG have been shown to be sensitive to metrical complexity in a number of previous studies (Chen et al., 2008b; Karabanov et al., 2008; Bengtsson et al., 2005). Greater activity in VLPFC when tapping to more complex rhythms is consistent with data showing that this region is engaged when memory retrieval requires greater top–down control (Badre et al., 2005; Petrides, 2005). Very importantly, greater functional connectivity between VLPFC and STG for the weaker beats is consistent with findings showing that active retrieval results in correlated activity in VLPFC and the posterior sensory regions where memories may be stored (Kostopoulos & Petrides, 2008).

In addition to STG, interactions between VLPFC and DLPFC also increased with metrical complexity. In a previous experiment, we found greater engagement of DLPFC when participants tapped to rhythms with weaker beats (Chen et al., 2008b). Synchronization requires people to continuously monitor their motor responses, and monitoring has been shown to specifically engage the DLPFC (Champod & Petrides, 2007, 2010). In our current experiment, tapping to weaker beats may place greater demands on retrieval and monitoring processes, leading to increases in correlated activity between VLPFC and DLPFC.

The BG have long been hypothesized to play a role in motor and/or perceptual timing (Teki, Grube, Sukhbinder, & Griffiths, 2011; Meck, Penney, & Pouthas, 2008; Lewis, Wing, Pope, Praamstra, & Miall, 2004; Rao, Mayer, & Harrington, 2001) and have commonly been found to be active during beat perception (Chapin et al., 2010; Fujioka et al., 2010; Grahn & Brett, 2007, 2009; Grahn & Rowe, 2009). It has also been suggested that the BG may underlie beat synchronization in some nonhuman animals (Patel et al., 2009). Consistent with these data, in the current experiment, activity in bilateral caudate nucleus was greater in the Find Beat condition compared with Silence or the isochronous control condition but did not differ between Find Beat and Tap Beat. This suggests that similar BG resources were recruited during both perception and production. BG activity was also not modulated by beat strength and showed no significant correlation with measures of beat perception or performance. However, activity in the BG was temporally correlated with activity in the VLPFC, with stronger correlation for weaker beats (Figure 7). This indicates that BG mechanisms may play a more basic role in beat processing that interacts with working memory retrieval mechanisms in the frontal lobe.

Studies of beat perception have generally shown greater BG activity when people listen to a strong or predictable beat (Fujioka et al., 2010; Grahn & Rowe, 2009; Grahn & Brett, 2007), once a beat percept has been established (Chapin et al., 2010), or in beat-based compared with interval-based timing tasks (Teki, Grube, Sukhbinder, et al., 2011). A recent study of beat perception showed greater BG activity when people were able to apply an already identified beat to a new rhythm, compared with finding the beat in a new rhythm (Grahn & Rowe, in press). BG dopaminergic mechanisms have been implicated in temporal prediction (Coull, Cheng, & Meck, 2011; Teki, Grube, Sukhbinder, et al., 2011; Grahn & Rowe, 2009); thus, these mechanisms are proposed to be more engaged when the beat is predictable or strong. In the current experiment, this model would predict greater BG activity for highly metrical rhythms with stronger beats and in the Tap Beat compared with the Find Beat condition. We did not find evidence to support these hypotheses. An alternative explanation is based on data linking the BG to preparation or production of well-learned motor responses (Penhune & Steele, 2012; Thorn, Atallah, Howe, & Graybiel, 2010; Doyon et al., 2009; Graybiel, 2008; Szameitat, Shen, & Sterr, 2007). In previous experiments linking BG activity to beat processing, the beat was either highly salient (Grahn & Rowe, 2009; Grahn & Brett, 2007), carried over from a previous item (Grahn & Rowe, in press), or people were actively producing, preparing, or imagining a motor response (Chapin et al., 2010; Fujioka et al., 2010; Chen et al., 2008a, 2008b). In all these cases, the beat is associated with a well-known or primed motor response.

The cerebellum has frequently been implicated in the perception and production of rhythms (Teki, Grube, Sukhbinder, et al., 2011; Grube, Cooper, Chinnery, & Griffiths, 2010; Grahn & Rowe, 2009; Chen et al., 2008a; Sakai et al., 1999; Penhune, Zatorre, & Evans, 1998), but its specific role is not clear. In the current experiment, lobule VIII of the cerebellum was significantly engaged during tapping to the beat of complex compared with isochronous rhythms. This is consistent with the role of the cerebellum in producing complex motor responses and with previous work from our laboratory showing cerebellar engagement during rhythm synchronization (Chen et al., 2008a; Penhune et al., 1998). Activity in lobule VIII was also negatively correlated with beat tapping performance, with greater activity for metrically complex rhythms that were less well performed. This is consistent with a large body of literature implicating the cerebellum in error correction and feedback processing (Shadmehr & Krakauer, 2008; Miall, Christensen, Cain, & Stanley, 2007; Ito, 2000) and with theories proposing a role for the cerebellum in time-keeping processes (Coull et al., 2011; Teki, Grube, & Griffiths, 2011; Ivry & Spencer, 2004; Lewis & Miall, 2003), both of which are necessary to make an accurate response. Finally, activity in lobule VIII was temporally correlated with activity in the VLPFC and modulated by beat strength. Recent anatomical studies in animals and humans have shown connections between lateral cerebellar regions and frontal cortex (Ramnani et al., 2006; Kelly & Strick, 2003), and it has been hypothesized that the cerebellum may play a role in working memory processing or the application of higher-order rules or structures (Balsters, Whelan, Robertson, & Ramnani, in press; Marvel & Desmond, 2010). The current results do not allow us to dissociate the contribution of different cerebellar mechanisms to beat tapping. Future experiments focused on differentiating cerebellar involvement in error correction and time-keeping using perturbation paradigms could be useful in assessing the roles of these two mechanisms.

Cognitive theories propose two general models of beat perception: the dynamic attending model (Chapin et al., 2010; Large & Jones, 1999) and the template model (Essens & Povel, 1985; Povel & Essens, 1985). Both models agree that the beat percept develops over time as listeners derive possible meters and beats based on the accent structure of a piece. The dynamic attending model postulates that predicted beat points focus attention. The template model proposes that accent structure generates a template for a particular metrical grid or beat, which listeners apply to upcoming stimuli. Our paradigm is a better fit with the template model because the stimuli were created based on Povel and Essens' model (1985) and because the task requires listeners to generate a beat template in the Find Beat condition and to apply it in the Tap Beat condition. This paradigm is less well suited to assessing the dynamic attending model, which emphasizes the development of a predicted beat over time. It is possible that the dynamic attending model may better explain the process of beat identification and the template model may better explain how the identified beat is retrieved and applied. Thus, the dynamic attending model may map onto BG dependent processes important for beat finding and prediction, whereas the template model may better describe frontal mechanisms important for active retrieval of a selected beat.
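
As a concrete illustration of the template idea, the sketch below scores candidate beats ("clocks") against a rhythm's accent structure by accumulating evidence against each candidate, loosely following Povel and Essens' (1985) counterevidence scheme, in which clock ticks falling on silence count more heavily against a candidate than ticks falling on unaccented onsets. The rhythm encoding, weights, and candidate periods are illustrative assumptions, not the study's stimuli or the model's exact published parameters.

    # Each rhythm spans `length` time units; `onsets` holds the sounded
    # positions and `accents` the subset carrying perceptual accents.
    def counterevidence(onsets, accents, period, phase, length,
                        w_silence=4, w_unaccented=1):
        """Lower scores mean the candidate clock fits the rhythm better."""
        score = 0
        for tick in range(phase, length, period):
            if tick not in onsets:
                score += w_silence      # clock tick falls on silence
            elif tick not in accents:
                score += w_unaccented   # tick falls on an unaccented onset
        return score

    onsets = {0, 2, 4, 5, 6, 9, 10, 12, 14}
    accents = {0, 4, 6, 9, 12}
    best = min((counterevidence(onsets, accents, p, ph, 16), p, ph)
               for p in (2, 3, 4) for ph in range(p))
    print(best)  # (score, period, phase) of the best-fitting candidate beat

In this toy example the period-4, phase-0 clock accrues the least counterevidence, mirroring how a strongly metrical rhythm yields one clearly dominant beat template, whereas a weakly metrical rhythm leaves several candidates with similar scores.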

In conclusion, the results of this experiment demonstrate that a network including the BG, STG, PMC, and VLPFC is engaged in finding and tapping to the beat of musical rhythms. Within this network, we were able to dissociate the role of the BG, which was equally engaged in all conditions, from that of cortical mechanisms, suggesting that the BG may be involved in detecting auditory temporal regularity or in associating auditory stimuli with a motor response. In contrast, activity in cortical auditory, premotor, and prefrontal regions was modulated by beat strength, suggesting that these regions are important for retrieving, selecting, and maintaining the musical beat. We also found evidence for interaction between these two systems, indicating that more basic sensorimotor mechanisms instantiated in the BG work in tandem with higher-order cognitive mechanisms in auditory association and prefrontal regions. Overall, these results reinforce the larger concept that the brain regions engaged by beat finding and tapping are not unique to music processing but rather rely on more general neural mechanisms important for predicting and integrating auditory information with a motor response, as well as those required for active memory retrieval.

Acknowledgments

The authors would like to acknowledge Marc Bouffard for assistance developing the stimulus presentation and data collection software. We would also like to thank the staff at the McConnell Brain Imaging Centre of McGill University for their assistance with the scanning protocol. Funding for this project comes from the National Science Council of the Republic of China (S. J. K.), Fonds de la recherche en santé du Québec (post-doctoral fellowship to J. L. C.; Chercheur Boursier to V. B. P.), and the Canadian Institutes of Health Research (R. J. Z.).

Reprint requests should be sent to Virginia B. Penhune, Department of Psychology, Concordia University, SP-A 244, 7141 Sherbrooke Street West, Montréal, Québec, Canada H4B 1R6, or via e-mail: virginia.penhune@concordia.ca.

REFERENCES

Badre, D., Poldrack, R., Pare-Blagoev, E., Insler, R., & Wagner, A. (2005). Dissociable controlled retrieval and generalized selection mechanisms in ventrolateral prefrontal cortex. Neuron, 47, 907–918.
Balsters, J. H., Whelan, C. D., Robertson, I. H., & Ramnani, N. (in press). Cerebellum and cognition: Evidence for the encoding of higher order rules. Cerebral Cortex, Epub ahead of print.
Belin, P., Zatorre, R., Hoge, R., Evans, A., & Pike, B. (1999). Event-related fMRI of the auditory cortex. Neuroimage, 10, 417–429.
Bengtsson, S., Ehrsson, H., Forssberg, H., & Ullén, F. (2004). Dissociating brain regions controlling the temporal and ordinal structure of learned movement sequences. European Journal of Neuroscience, 19, 2591–2602.
Bengtsson, S., Ehrsson, H., Forssberg, H., & Ullén, F. (2005). Effector-independent voluntary timing: Behavioral and neuroimaging evidence. European Journal of Neuroscience, 22, 3255–3265.
Cadoret, G., & Petrides, M. (2007). Ventrolateral prefrontal neuronal activity related to active controlled memory retrieval in nonhuman primates. Cerebral Cortex, 17, i27–i40.
Champod, A., & Petrides, M. (2007). Dissociable roles of the posterior parietal and the prefrontal cortex in manipulation and monitoring processes. Proceedings of the National Academy of Sciences, U.S.A., 104, 14837–14842.
Champod, A., & Petrides, M. (2010). Dissociation within the frontoparietal network in verbal working memory: A parametric functional magnetic resonance imaging study. Journal of Neuroscience, 30, 3849–3856.
Chapin, H., Zanto, T., Jantzen, K., Kelso, J., Steinberg, F., & Large, E. (2010). Neural responses to complex auditory rhythms: The role of attending [Electronic version]. Frontiers in Psychology: Auditory Cognitive Neuroscience, 1, 224.
Chen, J., Penhune, V., & Zatorre, R. (2008a). Listening to musical rhythms recruits motor regions of the brain. Cerebral Cortex, 18, 2844–2854.
Chen, J., Penhune, V., & Zatorre, R. (2008b). Moving on time: The brain network for auditory–motor synchronization. Journal of Cognitive Neuroscience, 20, 226–239.
Chen, J., Rae, C., & Watkins, K. (2012). Learning to play a melody: An fMRI study examining the formation of auditory–motor associations. Neuroimage, 59, 1200–1208.
Chen, J., Zatorre, R., & Penhune, V. (2006). Interactions between auditory and dorsal premotor cortex during synchronization to musical rhythms. Neuroimage, 32, 1771–1781.
Cooper, G., & Meyer, L. (1960). The rhythmic structure of music. Chicago: University of Chicago Press.
Coull, J. T., Cheng, R. K., & Meck, W. H. (2011). Neuroanatomical and neurochemical substrates of timing. Neuropsychopharmacology Reviews, 36, 3–25.
Cox, R. W. (1996). AFNI: Software for analysis and visualization of functional magnetic resonance neuroimages. Computers and Biomedical Research, 29, 162–173.
Doyon, J., Bellec, P., Amsel, R., Penhune, V., Monchi, O., Carrier, J., et al. (2009). Contributions of the basal ganglia and functionally related brain structures to motor learning. Behavioral and Brain Research, 199, 61–75.
Drake, C., Jones, M. R., & Baruch, C. (2000). The development of rhythmic attending in auditory sequences: Attunement, referent period, focal attending. Cognition, 77, 251–288.
Duvernoy, H. (1991). The human brain: Surface, three-dimensional sectional anatomy and MRI. New York: Springer-Verlag.
Essens, P., & Povel, D. (1985). Metrical and nonmetrical representations of temporal patterns. Perception and Psychophysics, 37, 1–7.
Fisher, N. (1993). Statistical analysis of circular data. Cambridge: Cambridge University Press.
Friston, K., Buechel, C., Fink, G., Morris, J., Rolls, E., & Dolan, R. (1997). Psychophysiological and modulatory interactions in neuroimaging. Neuroimage, 6, 218–229.
Friston, K., Penny, W., & Glaser, D. (2005). Conjunction revisited. Neuroimage, 25, 661–667.
Fujioka, T., Zendel, B., & Ross, B. (2010). Endogenous neuromagnetic activity for mental hierarchy of timing. Journal of Neuroscience, 30, 3458–3466.
Gaab, N., Gabrieli, J. D., & Glover, G. H. (2007a). Assessing the influence of scanner background noise on auditory processing. I. An fMRI study comparing three experimental designs with varying degrees of scanner noise. Human Brain Mapping, 28, 703–720.
Gaab, N., Gabrieli, J. D., & Glover, G. H. (2007b). Assessing the influence of scanner background noise on auditory processing. II. An fMRI study comparing auditory processing in the absence and presence of recorded scanner noise using a sparse design. Human Brain Mapping, 28, 721–732.
Grahn, J., & Brett, M. (2007). Rhythm and beat perception in motor areas of the brain. Journal of Cognitive Neuroscience, 19, 893–906.
Grahn, J., & Brett, M. (2009). Impairment of beat-based rhythm discrimination in Parkinson's disease. Cortex, 45, 54–61.
Grahn, J., & Rowe, J. (2009). Feeling the beat: Premotor and striatal interactions in musicians and non-musicians during beat perception. Journal of Neuroscience, 29, 7540–7548.
Grahn, J., & Rowe, J. (in press). Finding and feeling the musical beat: Striatal dissociations between detection and prediction of regularity.
Graybiel, A. (2008). Habits, rituals, and the evaluative brain. Annual Review of Neuroscience, 31, 359–387.
Grube, M., Cooper, F. E., Chinnery, P. F., & Griffiths, T. D. (2010). Dissociation of duration-based and beat-based auditory timing in cerebellar degeneration. Proceedings of the National Academy of Sciences, U.S.A., 107, 11597–11601.
Handel, S. (1989). Rhythm. In Listening (p. 384). Cambridge, MA: MIT Press.
Hayasaka, S., Phan, K. L., Liberzon, I., Worsley, K. J., & Nichols, T. E. (2004). Nonstationary cluster-size inference with random field and permutation methods. Neuroimage, 22, 676–687.
Ito, M. (2000). Mechanisms of motor learning in the cerebellum. Brain Research, 886, 237–245.
Iversen, J. R., Repp, B. H., & Patel, A. D. (2009). Top–down control of rhythm perception modulates early auditory responses. Annals of the New York Academy of Sciences, 1169, 58–73.
Ivry, R., & Spencer, R. (2004). The neural representation of time. Current Opinion in Neurobiology, 14, 225–232.
Jancke, L. (2012). The dynamic audio-motor system in pianists. Annals of the New York Academy of Sciences, 1252, 246–252.
Karabanov, A., Blom, O., Forsman, L., & Ullén, F. (2008). The dorsal auditory pathway is involved in performance of both visual and auditory rhythms. Neuroimage, 44, 480–488.
Kelly, R., & Strick, P. (2003). Cerebellar loops with motor cortex and prefrontal cortex of a non-human primate. Journal of Neuroscience, 23, 8432–8444.
Konoike, N., Kotozaki, Y., Miyachi, S., Miyauchi, C. M., Yomogida, Y., Akimoto, Y., et al. (2012). Rhythm information represented in the fronto-parieto-cerebellar motor system. Neuroimage, 63, 328–338.
Kostopoulos, P., Albanese, M., & Petrides, M. (2007). The ventrolateral prefrontal cortex and tactile memory disambiguation in the human brain. Proceedings of the National Academy of Sciences, U.S.A., 104, 10223–10228.
Kostopoulos, P., & Petrides, M. (2003). The mid-ventrolateral prefrontal cortex: Insights into its role in memory retrieval. European Journal of Neuroscience, 17, 1489–1497.
Kostopoulos, P., & Petrides, M. (2008). Left mid-ventrolateral prefrontal cortex: Underlying principles of function. European Journal of Neuroscience, 27, 1037–1049.
Kung, S.-J., Tzeng, O., Hung, D., & Wu, D. (2011). Dynamic allocation of attention to metrical and grouping accents in rhythmic sequences. Experimental Brain Research, 210, 269–282.
Lahav, A., Saltzman, E., & Schlaug, G. (2007). Action representation of sound: Audiomotor recognition network while listening to newly acquired actions. Journal of Neuroscience, 27, 308–314.
Large, E., Fink, P., & Kelso, J. (2002). Tracking simple and complex sequences. Psychological Research, 66, 3–17.
Large, E., & Jones, M. (1999). The dynamics of attending: How people track time-varying events. Psychological Review, 106, 119–159.
Leaver, A., Van Lare, J., Zielinski, B., Halpern, A., & Rauschecker, J. (2009). Brain activation during anticipation of sound sequences. Journal of Neuroscience, 29, 2477–2485.
Lerdahl, F., & Jackendoff, R. (1983). A generative theory of tonal music. Cambridge, MA: MIT Press.
Lewis, P., & Miall, R. (2003). Distinct systems for automatic and cognitively controlled time measurement: Evidence from neuroimaging. Current Opinion in Neurobiology, 13, 250–255.
Lewis, P., Wing, A., Pope, P., Praamstra, P., & Miall, R. (2004). Brain activity correlates differentially with increasing temporal complexity of rhythms during initialisation, synchronisation, and continuation phases of paced finger tapping. Neuropsychologia, 42, 1301–1312.
Marvel, C. L., & Desmond, J. E. (2010). The contributions of cerebro-cerebellar circuitry to executive verbal working memory. Cortex, 46, 880–895.
Mazziotta, J., Toga, A., Evans, A., Fox, P., Lancaster, J., Zilles, K., et al. (2001). A probabilistic atlas and reference system for the human brain: International Consortium for Brain Mapping (ICBM). Philosophical Transactions of the Royal Society of London, Series B, Biological Sciences, 356, 1293–1322.
Meck, W., Penney, T., & Pouthas, V. (2008). Cortico-striatal representation of time in animals and humans. Current Opinion in Neurobiology, 18, 145–152.
Miall, R. C., Christensen, L., Cain, O., & Stanley, J. (2007). Disruption of state estimation in the human lateral cerebellum. PLoS Biology, 5, e316.
Palmer, C., & Krumhansl, C. (1990). Mental representations for musical meter. Journal of Experimental Psychology: Human Perception and Performance, 16, 728–741.
Parncutt, R. (1994). A perceptual model of pulse salience and metrical accent in musical rhythm. Music Perception, 11, 409–464.
Patel, A., Iversen, J., Bregman, M., & Schultz, I. (2009). Experimental evidence for synchronization to a musical beat in a nonhuman animal. Current Biology, 19, 827–830.
Patel, A., Iversen, J., Chen, Y., & Repp, B. (2005). The influence of metricality and modality on synchronization with a beat. Experimental Brain Research, 163, 226–238.
Penhune, V., & Steele, C. (2012). Parallel contributions of cerebellar, striatal and M1 mechanisms to motor sequence learning. Behavioral and Brain Research, 226, 579–591.
Penhune, V., Zatorre, R., & Evans, A. (1998). Cerebellar contributions to motor timing: A PET study of auditory and visual rhythm reproduction. Journal of Cognitive Neuroscience, 10, 752–765.
Petrides, M. (2005). Lateral prefrontal cortex: Architectonic and functional organization. Philosophical Transactions of the Royal Society of London, Series B, Biological Sciences, 360, 781–795.
Picard, N., & Strick, P. (1996). Motor areas of the medial wall: A review of their location and functional activation. Cerebral Cortex, 6, 342–353.
Povel, D., & Essens, P. (1985). Perception of temporal patterns. Music Perception, 2, 411–440.
Ramnani, N., Behrens, T., Johansen-Berg, H., Richter, M., Pinsk, M., Andersson, J., et al. (2006). The evolution of prefrontal inputs to the cortico-pontine system: Diffusion imaging evidence from macaque monkeys and humans. Cerebral Cortex, 16, 811–818.
Rao, S., Mayer, A., & Harrington, D. (2001). The evolution of brain activation during temporal processing. Nature Neuroscience, 4, 317–323.
Sakai, K., Hikosaka, O., Miyachi, S., Ryousuke, T., Tamada, T., Iwata, N., et al. (1999). Neural representation of a rhythm depends on its interval ratio. Journal of Neuroscience, 19, 10074–10081.
Schmahmann, J., Doyon, J., Holmes, C., Makris, N., Petrides, M., Kennedy, D., et al. (1996). An MRI atlas of the human cerebellum in Talairach space. Neuroimage, 3, 122.
Schulze, K., Mueller, K., & Koelsch, S. (2011). Neural correlates of strategy use during auditory working memory in musicians and non-musicians. European Journal of Neuroscience, 33, 189–196.
Shadmehr, R., & Krakauer, J. (2008). A computational neuroanatomy for motor control. Experimental Brain Research, 185, 359–381.
Snyder, J., & Krumhansl, C. (2001). Tapping to ragtime: Cues to pulse finding. Music Perception, 18, 455–489.
Snyder, J., & Large, E. (2005). Gamma-band activity reflects the metric structure of rhythmic tone sequences. Cognitive Brain Research, 24, 117–126.
Szameitat, A. J., Shen, S., & Sterr, A. (2007). Motor imagery of complex everyday movements. An fMRI study. Neuroimage, 34, 702–713.
Talairach, J., & Tournoux, P. (1988). Co-planar stereotaxic atlas of the human brain. New York: Thieme.
Teki, S., Grube, M., & Griffiths, T. (2011). A unified model of time perception accounts for duration-based and beat-based timing mechanisms. Frontiers in Integrative Neuroscience, 5, 90.
Teki, S., Grube, M., Sukhbinder, K., & Griffiths, T. (2011). Distinct neural substrates of duration-based and beat-based auditory timing. Journal of Neuroscience, 31, 3805–3812.
Thorn, C., Atallah, H., Howe, M., & Graybiel, A. (2010). Differential dynamics of activity changes in dorsolateral and dorsomedial striatal loops during learning. Neuron, 66, 781–795.
Vuust, P., Roepstorff, A., Wallentin, M., Mouridsen, K., & Ostergaard, L. (2006). It don't mean a thing … Keeping the rhythm during polyrhythmic tension activates language areas (BA 47). Neuroimage, 31, 832–841.
Westbury, C., Zatorre, R., & Evans, A. (1996). The planum temporale: A re-assessment of its boundaries, area and volume using 3-D in-vivo morphometric techniques. Society for Neuroscience Abstracts, 22, 1858.
Worsley, K. J. (2005). An improved theoretical P value for SPMs based on discrete local maxima. Neuroimage, 28, 1056–1062.
Worsley, K., Liao, C., Aston, J., Petre, V., Duncan, G., Morales, F., et al. (2002). A general statistical analysis for fMRI data. Neuroimage, 15, 1–15.
Zatorre, R., Chen, J., & Penhune, V. (2007). When the brain plays music: Sensory-motor interactions in music perception and production. Nature Reviews Neuroscience, 8, 547–558.
Zatorre, R., Halpern, A., & Foster, N. (2010). Mental reversal of imagined melodies: A role for the posterior parietal cortex. Journal of Cognitive Neuroscience, 22, 775–789.

Author notes

* Now at Institute of Linguistics, Academia Sinica, Taiwan.

Now at Sunnybrook Research Institute, Toronto.