Abstract

To obtain further evidence that action observation can serve as a proxy for action execution and planning in posterior parietal cortex, we scanned participants while they (1) observed two classes of action, vocal communication and oral manipulation, which share the same effector but differ in nature, and (2) rehearsed and listened to nonsense sentences to localize area Spt, thought to be involved in audio-motor transformation during speech. Using this localizer, we found that Spt is specifically activated by vocal communication, indicating that Spt is involved not only in planning speech but also in observing vocal communication actions. In addition, we observed that Spt is distinct from the parietal region most specialized for observing vocal communication, which was revealed by an interaction contrast and is located in area PFm. The latter region, unlike Spt, processes the visual and auditory signals related to others' vocal communication independently. Our findings are consistent with the view that several small regions in temporoparietal cortex, near the ventral part of the supramarginal/angular gyrus border, are involved in the planning of vocal communication actions and are also concerned with the observation of these actions, though their involvement in these two aspects is unequal.

INTRODUCTION

The organization of posterior parietal cortex (PPC) is still widely debated. Although there is little doubt that one important function of human PPC is the planning of upcoming actions, it remains unclear how this function is implemented. Some maintain that the sensorimotor transformations underlying the planning of actions are organized according to the effector to be used in the action (Premereur, Janssen, & Vanduffel, 2015; Konen, Mruczek, Montoya, & Kastner, 2013; Gallivan, McLean, Smith, & Culham, 2011; Cavina-Pratesi et al., 2010; Filimon, 2010; Filimon, Nelson, Huang, & Sereno, 2009; Hinkley, Krubitzer, Padberg, & Disbrow, 2009; Levy, Schluppeck, Heeger, & Glimcher, 2007; Nishimura, Onoe, Morichika, Tsukada, & Isa, 2007; Frey, Vinton, Norlund, & Grafton, 2005; Connolly, Andersen, & Goodale, 2003; Culham et al., 2003; Binkofski et al., 1998). A few (Leone, Heed, Toni, & Medendorp, 2014; Heed, Beurze, Toni, Roder, & Medendorp, 2011) favor the view that the localization of these sensorimotor transformations follows both their motor (effector) and sensory (input from the various senses) aspects. On a third view, the organization depends on the functional nature of the action, as defined by the goal of the action and the way this goal is achieved. We (Ferri, Rizzolatti, & Orban, 2015; Orban, 2015) have repeatedly noted that this debate is fueled to a great extent by the limited number of actions studied thus far, either in single-cell studies, largely using restrained monkeys trained to use their upper limbs, or in fMRI studies requiring that participants remain immobile to avoid susceptibility artifacts.
We have proposed that action observation can serve as a proxy for action execution, at least at the areal level in PPC (Ferri et al., 2015): parietal regions involved in action execution and planning are also involved in the observation of those actions, such that action observation can be used to explore the regional specialization of PPC for actions whose execution is impossible in the scanner for technical reasons. This proposal amounts to a generalization of the mirror principle, at the areal level, to actions other than grasping and manipulation, for which mirror neurons have been described in the monkey (Bonini, Maranesi, Livi, Fogassi, & Rizzolatti, 2014; Gallese, Fadiga, Fogassi, & Rizzolatti, 1996). As a proxy for execution, action observation clearly supports a functional interpretation of PPC organization (Ferri et al., 2015; Abdollahi, Jastorff, & Orban, 2013; Jastorff, Begliomini, Fabbri-Destro, Rizzolatti, & Orban, 2010). Indeed, these studies showed that observation of different functional classes of action activates different parts of PPC. This demonstration has so far been restricted to action classes that depend on visual or somatosensory input for their planning. The first aim of this study was to extend the demonstration to classes of action requiring auditory information in their planning.

Ferri et al. (2015) discussed the arguments supporting the proxy view as well as its limitations, clarifying the need for more direct evidence. Thus far, direct support for the proxy hypothesis has come chiefly from the anterior intraparietal (AIP) area, which has been shown to be activated by both observation and execution of manipulative actions in humans (Frey, Hansen, & Marchal, 2015; Abdollahi et al., 2013; Konen et al., 2013; Cavina-Pratesi et al., 2010; Jastorff et al., 2010; Begliomini, Wall, Smith, & Castiello, 2007; Frey et al., 2005; Culham et al., 2003; Binkofski et al., 1999; Iacoboni et al., 1999) and nonhuman primates (Nelissen & Vanduffel, 2011; Nelissen et al., 2011). Obtaining additional direct evidence for the proxy hypothesis was thus a second goal of this study. Furthermore, the issue of the "quality" of the proxy, that is, how well the spatial profile of the observation activation matches that of execution, was not addressed in Ferri et al. (2015). AIP studies provide little indication about the quality of the proxy, in that some used only a single modality (motor or visual) or, where both were employed, the intent was simply to demonstrate overlap in what is a relatively large region. Hence, the question remains: Does the proxy hypothesis imply that the spatial profiles of action execution and observation coincide precisely, or is there a looser organization in which some overlap between the profiles of the two modalities is sufficient?

To address these issues, we opted for actions that involve auditory input in their planning and require minimal head movements: speech and other types of vocal communication. Indeed, recent studies have shown that speech production regions in PPC can be revealed by silent rehearsal of nonsense sentences (Hickok, Buchsbaum, Humphries, & Muftuler, 2003). These authors proposed that speech production involves the Sylvian parietal-temporal (Spt) area. This region is located in the posterior portion of the planum temporale (Simmonds, Leech, Collins, Redjep, & Wise, 2014; Tremblay, Baroni, & Hasson, 2013; Tremblay, Deschamps, & Gracco, 2013), extending up toward the parietal operculum, and is defined by a combination of passive auditory responses and covert rehearsal of nonsense sentences (Hickok et al., 2003). Spt is also activated by reading and hearing single pseudowords, provided adequate smoothing is applied (Buchsbaum, Olsen, Koch, & Berman, 2005). Impairment of this region in conduction aphasia suggests that it has a role in the control of speech production (Buchsbaum, Ye, & D'Esposito, 2011). Spt has thus been proposed as an auditory–motor interface, binding acoustic representations of speech with articulatory counterparts stored in the frontal cortex, in a manner analogous to visuomotor parietal regions such as AIP or the medial intraparietal area (Buchsbaum & D'Esposito, 2008). Our decision to use vocal communication actions was further supported by Spt's combined sensitivity to speech and music (Hickok et al., 2003), which matches at least partially the range of vocal communication actions observed here. Hence, the strategy of this study was straightforward.
We localized the PPC regions specifically involved in observing vocal communication actions, using the standard strategy of comparing observation of actions performed with the same effectors but pursuing clearly different goals, as developed in earlier studies (Ferri et al., 2015; Abdollahi et al., 2013), and then tested whether this specific vocal communication map included Spt.

METHODS

Participants

Twenty-four volunteers (13 women; mean age = 25 years, range = 20–30 years) participated in the visual and auditory main experiments. Eleven of these also participated in the auditory control experiment. All participants were right-handed, native Italian speakers with normal or corrected-to-normal visual acuity and no history of mental illness or neurological disease. The study was approved by the Ethical Committee of Parma Province, and all volunteers gave written informed consent in accordance with the Helsinki Declaration before the experiment. Participants were paid for their participation in the experiment.

Stimuli and Experimental Conditions

Our experimental procedures are similar to those of earlier studies (Ferri et al., 2015; Abdollahi et al., 2013); hence, only the most important aspects are described.

Visual Main Experiment

Experimental stimuli of the visual main experiment consisted of silent video clips (448 × 336 pixels, 50 frames/sec) showing an actor, viewed from the side, performing (1) communicative mouth actions directed to another person (vocal communication), (2) mouth actions directed toward fruits (oral manipulation), (3) hand actions directed toward objects (object manipulation), and (4) hand actions bringing fruits to the mouth or removing them (hand and mouth manipulation). Each of these four action classes included 16 videos, showing four exemplars of each class and four versions of each exemplar. The vocal communication class included speaking, singing, shouting, and whistling, directed at a conspecific (male or female), as exemplars (Figure 1). Although mouth movements may have been more visible in a frontal view, the lateral view was preferred as it allowed for the introduction of a passive listener, emphasizing the communicative nature of the actions. In these actions, the target person did not react. During the recording of these videos, the sound produced by the actors was recorded but kept separate from the videos for use in a control experiment (see below). The four versions of each exemplar were generated by two actors (male and female) and two target conspecifics (male and female). The oral manipulation class included chewing, spitting, licking, and biting a fruit as exemplars (Figure 1). The four versions were generated by showing two fruits (blueberry or strawberry, suspended in front of the actor for the last two action exemplars) and two actors (male and female). During the vocal communication and oral manipulation actions, the arms of the actor rested alongside the body. In the hand and mouth manipulation, the actor performed spitting, licking, and biting as in the previous class but eating replaced chewing, and the actor used the right hand to move the fruit toward (in licking, biting, and eating) or away from (in spitting) the mouth. 
At the onset of licking, biting, and eating, the arm was positioned such that the actor needed to move only the wrist and the hand to lick the fruit or place it in the mouth. As with the previous action videos, the four versions were generated by two actors (male, female) manipulating two fruits (strawberry and blueberry). In the object manipulation actions, the actors used the right hand to grasp, drag, drop, or push an object. The actions were performed by a male or female actor on a small red cube or a larger blue ball, yielding four versions of each action exemplar. In the object manipulation class, the hands rested on the table near the object when the actions began; hence, these manipulative actions involved only the wrist and fingers.

Figure 1. 

Stimuli: static frames taken from the videos portraying the four exemplars of vocal communication (A) and oral manipulation (B) actions. Green dot = fixation point.

We took extensive precautions to make the videos uniform across the four classes of actions: The same two actors, wearing the same clothing, performed the actions without expressing any particular emotion. The hand actions were performed with the right hand on, or directed to, equal proportions of small and large targets. Efforts were also made to equalize the scenes in which the actions took place: lighting and background were identical, as was the general organization of the scene. The actor stood on the right and kept his or her hands in the middle of the scene when performing the action, with the target on the left. The objects to be manipulated were placed on a table occupying the left part of the scene, as was the person targeted by the interpersonal actions. To substitute for a person or target on the left in the oral manipulation videos, a large box was presented at that location (Figure 1).

Two types of control stimuli were used. The rationale is that an action may be described as an integration of two main components: a figural component (shape of the body) and a motion component (motion vectors of the body). To control both, we used static images taken from the action videos, and “dynamic scrambled” stimuli derived by animating a noise pattern with the motion extracted from the original action video, as described in Ferri et al. (2015). The static images consisted of three frames taken from the beginning, middle, and end of the video to capture the shape of the actor at different stages of the action. These images control for both shape and lower-order static features such as color or spatial frequency. To create “dynamic scrambled” control videos, the local motion vector was computed for each pixel in the image on a frame-by-frame basis. These vectors were subsequently used to animate a random dot texture pattern (isotropic noise image). The resulting videos were further processed in two steps. First, each frame of the video was divided into 128 squares, with increasing sizes toward the edge of the frame (from 0.26° to 1.2° side). The starting frame was randomized for each square, thereby temporally scrambling the global motion pattern. Second, to remove the biological kinematics, the optic flow in each frame was replaced by a uniform translation with mean (averaged over the square) speed and direction equal to that of the optic flow. This procedure eliminated the global perception of a moving human hand, but within each square, local motion remained identical to that in the original video. Mean luminance of “dynamic scrambled” and original videos was identical. Each version of an action exemplar had its corresponding two types of control stimuli. The two control conditions and the action videos represent the three values of the factor presentation mode. 
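The scrambling logic can be sketched in a few lines of numpy. The frame size, the 4 × 4 square layout, and the nearest-neighbor displacement below are simplifications for illustration; the actual stimuli used 128 squares of varying size and the flow fields extracted from the original videos.

```python
import numpy as np

rng = np.random.default_rng(0)

H = W = 64                            # simplified frame size (real videos were larger)
flow = rng.normal(size=(H, W, 2))     # synthetic per-pixel motion vectors (dx, dy)
noise = rng.random((H, W))            # isotropic noise image to be animated

def scramble_flow(flow, n_side=4):
    """Replace the flow inside each square with its mean (a uniform translation),
    removing biological kinematics while preserving mean local speed/direction."""
    out = np.empty_like(flow)
    h, w = flow.shape[0] // n_side, flow.shape[1] // n_side
    for i in range(n_side):
        for j in range(n_side):
            sl = np.s_[i * h:(i + 1) * h, j * w:(j + 1) * w]
            out[sl] = flow[sl].reshape(-1, 2).mean(axis=0)   # mean (dx, dy)
    return out

def advect(img, flow):
    """Shift each pixel of the noise image by the (rounded) local flow vector."""
    H, W = img.shape
    ys, xs = np.mgrid[0:H, 0:W]
    src_y = np.clip(ys - np.round(flow[..., 1]).astype(int), 0, H - 1)
    src_x = np.clip(xs - np.round(flow[..., 0]).astype(int), 0, W - 1)
    return img[src_y, src_x]

uniform = scramble_flow(flow)
frame = advect(noise, uniform)
```

Within each square the scrambled flow is constant and equal to the mean of the original flow there, which is the key property the control stimuli rely on.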
To assess the visual nature of the fMRI signals, we included an additional baseline fixation condition. In this condition, a gray rectangle of the same size and average luminance as the videos was shown. We thereby minimized luminance changes across the conditions, thus keeping the pupil size constant during the experiment.

All videos measured 17.7° by 13.2° and lasted 2.6 sec. The edges were blurred with an elliptical mask (14.3° × 9.6°), leaving the actor and the background of the video unchanged, but blending it gradually and smoothly into the background around the edges. A 0.2° fixation target was shown in all conditions. For action presentations, this fixation target was presented as close as possible to the position where the dynamic changes occurred in the video (Figure 1). For object manipulation actions, it was positioned below the position of the hands; for vocal communication and oral manipulation actions, it was positioned to the left of the mouth; for hand and mouth manipulation actions, it was positioned between the mouth and hand. In the control stimuli, the fixation point was placed at the same position as in the original videos.

The visual main experiment included two MRI sessions (Figure 2A). Each session comprised eight runs testing two action classes: vocal communication and oral manipulation in the first session and object manipulation and hand and mouth manipulation in the second. In each run, the five conditions (two action classes, two controls, and the baseline) were presented in 20.8-sec blocks. This cycle of five conditions was repeated four times for a total duration of 416 sec (Figure 2A). Each experimental block included eight videos of any given class, corresponding to four action exemplars and both genders. The two objects/targets (in the case of communication) alternated between cycles within a run. Similarly, the static and dynamic controls were also alternated: the first and third cycles were devoted to static controls, the second and fourth to dynamic controls. Both the individual videos within a block and the order of the blocks within a cycle were selected pseudorandomly and counterbalanced across runs and participants. Each static block of any given action class used a different frame of the video (start, middle, and end), which alternated over the static blocks of the different runs. During the visual main experiment, the participants were instructed not to move and to fixate the target in the middle of the screen.
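The block structure of one visual run can be summarized in a short sketch; the condition labels are illustrative, not the codes used in the actual experiment.

```python
# Block timing of one visual run: 8 videos of 2.6 s per block, 5 blocks per
# cycle, 4 cycles per run (labels are illustrative placeholders).
VIDEO_S, VIDEOS_PER_BLOCK, N_CYCLES = 2.6, 8, 4
BLOCK_S = VIDEO_S * VIDEOS_PER_BLOCK          # 20.8 s per condition block

def cycle_blocks(cycle_idx):
    # Static controls on cycles 0 and 2, dynamic controls on cycles 1 and 3.
    ctrl = "static" if cycle_idx % 2 == 0 else "dynamic"
    return ["action_1", "action_2", f"{ctrl}_1", f"{ctrl}_2", "baseline"]

run = [block for c in range(N_CYCLES) for block in cycle_blocks(c)]
run_duration = len(run) * BLOCK_S             # 20 blocks of 20.8 s -> 416 s
```

The sketch simply checks that 4 cycles of 5 blocks at 20.8 sec each add up to the 416-sec run reported above.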

Figure 2. 

(A) Run structure in the two sessions of the visual main experiment. (B) Statistical analysis of the visual main experiment: relationship between the three types of SPMs.

Auditory Main Experiment

The intent of the auditory main experiment was to map Spt, following the procedures of Hickok et al. (2003). Auditory stimuli were recorded in a quiet room, using the built-in microphone of the digital video camcorder. They were delivered through an MR-compatible headset connected through air-conduction tubes to a computer located in a room adjoining the MRI scanner. The stimuli were a set (n = 16) of Italian "jabberwocky" sentences (sentences in which content words were replaced with nonsense words), lasting 2.6 sec, like the action videos. There were four experimental conditions (listen, listen + rehearse, rehearse, rest), the trials of which lasted 5 sec and followed each other in a 20-sec cycle. For the "listen" and "listen + rehearse" conditions, the 5-sec period included a 2.6-sec sentence presentation, followed by 2.4 sec of silence during which an MR volume was acquired, while the participant either rested or rehearsed (without articulation) the sentence just heard. In the two other conditions, no stimulus was presented, and the MR volume was acquired in the second half of the 5-sec period, while the participant again either rested or rehearsed the sentence heard in the previous trial. The participants used the cessation of the scanner noise as the signal of a change in condition. These 20-sec cycles were repeated 24 times in a 480-sec run, and eight runs were collected in a single session devoted to this experiment. In each run, all 16 sentences were used, plus half of them a second time. The remaining half was repeated in the next run, so that over the course of two runs the complete set of sentences was presented three times. The experiment started with a short familiarization session to teach participants the differences between the experimental conditions and when to perform them. Before starting the scan session, we optimized the sound volume for each participant. Participants kept their eyes closed during scanning.
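The sparse-sampling trial cycle can be sketched as follows; the condition names follow the text, while the event-tuple layout (condition, onset, stimulus offset, trial end) is an assumption for illustration.

```python
# Sparse-sampling schedule of one auditory run: 24 cycles of four 5-s trials.
TRIAL_S = 5.0
CYCLE_CONDS = ["listen", "listen_rehearse", "rehearse", "rest"]
N_CYCLES = 24

events = []
for c in range(N_CYCLES):
    for i, cond in enumerate(CYCLE_CONDS):
        onset = (c * len(CYCLE_CONDS) + i) * TRIAL_S
        # A stimulus (if any) occupies the first 2.6 s of the trial; the 2.4-s
        # volume acquisition falls in the silent second half.
        events.append((cond, onset, onset + 2.6, onset + TRIAL_S))

run_duration = events[-1][3]    # end of the last trial: 480 s
```

The point of the schedule is that scanner noise never overlaps the sentences, and its cessation cues the condition changes.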

Auditory Control Experiment

The auditory control experiment included three different auditory conditions. The experimental stimuli were the 16 sentences or sounds produced by the actors themselves during the recording of the vocal communication videos used in the visual main experiment (actor's voice). The control stimuli were 16 pure tones ranging in frequency from 2000 to 7500 Hz (pure tones), and the third condition was silence. Each of these three conditions was presented during the first 2.6 sec of a 5-sec period, as in the auditory main experiment, with scanning taking place in the remaining 2.4 sec. The three conditions were presented in blocks lasting 80 sec (16 × 5 sec), each repeated twice in a run totaling 480 sec. Eight runs were collected in a single session. Before starting the scan session, we optimized the sound volume for each participant. Participants kept their eyes closed during scanning.

Presentation and Data Collection

Participants lay supine in the bore of the scanner. Visual stimuli were presented in the frontoparallel plane by means of a head-mounted display (60 Hz refresh rate) with a resolution of 800 horizontal × 600 vertical pixels (Resonance Technology, Inc., Northridge, CA) in each eye. The display was controlled by an ATI Radeon 2400 DX dual-output video card (AMD, Sunnyvale, CA). Sound-attenuating headphones (Resonance Technology, Inc.) were used in all experiments to muffle scanner noise, to give instructions to participants, and to deliver stimuli in the auditory experiments. The presentation of the stimuli was controlled by E-Prime software (Psychology Software Tools, Inc., Sharpsburg, PA). To reduce the amount of head motion during scanning, the participant's head was padded with PolyScan™ vinyl-coated cushions. Throughout the visual scanning sessions, eye movements were recorded with an infrared eye tracking system (60 Hz, Resonance Technology, Inc.).

Scanning was performed using a 3-T MR scanner (GE Discovery MR750, Milwaukee, WI) with an eight-channel parallel receiver coil, located in the University Hospital of the University of Parma. Functional images were acquired using gradient echo-planar imaging with the following parameters: 49 horizontal slices (2.5 mm slice thickness; 0.25 mm gap), repetition time (TR) = 3 sec in the visual experiment (continuous acquisition) and 2.4 sec in the two auditory experiments (sparse acquisition, TR in the second half of the 5-sec trial; see above), echo time (TE) = 30 msec, flip angle = 90°, 96 × 96 matrix, field of view of 240 mm (2.5 × 2.5 mm in-plane resolution), and ASSET factor of 2. The 49 slices contained in each volume covered the entire brain from cerebellum to vertex. A 3-D high-resolution T1-weighted IR-prepared fast SPGR (Bravo) image covering the entire brain was acquired in one of the scanning sessions and used for anatomical reference. Its acquisition parameters were as follows: TE/TR = 3.7/9.2 msec, inversion time = 650 msec, flip angle = 12°, acceleration factor (ARC) = 2; 186 sagittal slices were acquired with 1 × 1 × 1 mm3 resolution. A single scanning session required about 75 min. All runs started with the acquisition of four dummy volumes to ensure that the fMRI signal had reached a steady state.

In total, 2240 volumes were collected in the visual main experiment, 776 volumes in the auditory main experiment, and 480 volumes in the auditory control experiment.

Data Analysis of Experiments

Data analysis was performed using the SPM8 software package (Wellcome Department of Cognitive Neurology, London, UK) running under MATLAB (The MathWorks, Inc., Natick, MA). The 16 runs included in the two sessions of the visual main experiment were preprocessed together. The preprocessing of the three experiments consisted of four steps: realignment of the images, coregistration of the anatomical image and the mean functional image, spatial normalization of all images to a standard stereotaxic space (MNI) with a voxel size of 2 × 2 × 2 mm, and smoothing of the resulting images with an isotropic Gaussian kernel of 6 mm.

Analysis of Visual Main Experiment

For each participant, condition onsets and durations were modeled in a general linear model (GLM). In the two sessions, the design matrix for each run comprised 11 regressors: five modeling the conditions used (two actions, two controls, and the baseline) and six derived from the realignment process (three translations and three rotations). All regressors were convolved with the canonical hemodynamic response function. For the statistical analysis, we calculated contrast images for each participant and performed a second-level random-effects analysis (Holmes & Friston, 1998) for the group of 22 participants (see below). We defined three types of statistical parametric maps (SPMs; Figure 2B). To compute these maps, simple contrasts or interaction contrasts were defined at the first level, whereas conjunctions and maskings were performed at the second level.

The first type of map took the conjunction (conjunction null; Nichols, Brett, Andersson, Wager, & Poline, 2005) of the contrasts comparing the action condition with the static and dynamic control conditions. This conjunction, thresholded at p < .001 uncorrected with a cluster size of 5 voxels and inclusively masked by the contrast of the action condition versus fixation at p < .01 uncorrected (to ensure that the map reflected visually responsive sites), defined the activation map for each action class, as in our previous study (Abdollahi et al., 2013). The relatively liberal threshold was used because the activation maps are only a preliminary analysis preceding the common and specific maps (Figure 2B). Thus, the activation map indicates the network of visually responsive brain regions significantly more activated by the observation of that action class than by the static or the dynamic control conditions.
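The conjunction-and-masking logic can be expressed as boolean operations on voxelwise p-value maps. The sketch below uses synthetic maps and omits the cluster-extent criterion for brevity; the array names and planted "region" are illustrative, not actual data.

```python
import numpy as np

rng = np.random.default_rng(1)
shape = (10, 10, 10)
# Synthetic voxelwise p-values for the three first-level contrasts.
p_act_vs_static = rng.random(shape)
p_act_vs_dynamic = rng.random(shape)
p_act_vs_fix = rng.random(shape)
# Plant a small "region" that survives all thresholds.
for p in (p_act_vs_static, p_act_vs_dynamic, p_act_vs_fix):
    p[2:4, 2:4, 2:4] = 1e-5

def activation_map(p_st, p_dyn, p_fix, p_conj=.001, p_mask=.01):
    """Conjunction null of the two action-vs-control contrasts, inclusively
    masked by action-vs-fixation (cluster-extent step omitted for brevity)."""
    conj = (p_st < p_conj) & (p_dyn < p_conj)   # both contrasts significant
    return conj & (p_fix < p_mask)              # keep visually responsive voxels

amap = activation_map(p_act_vs_static, p_act_vs_dynamic, p_act_vs_fix)
```

Only voxels significant in both action-versus-control contrasts and responsive relative to fixation enter the activation map, which is the guarantee the conjunction null provides.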

The second type of map examined the interaction between action class and presentation mode to determine which regions were activated differentially by the observation of one action class compared with the other. This interaction analysis ensures that reported differences in activity cannot be explained by lower-order factors also present in the control conditions. It can be written as the conjunction of the interaction for the static controls, (a1 − st1) − (a2 − st2), with that for the dynamic controls, (a1 − ds1) − (a2 − ds2), where a, st, and ds denote the action, static, and dynamic scrambled conditions, respectively, 1 is the class of interest, and 2 the other class. The threshold for the interaction was set at p < .001 uncorrected with a cluster size of 5 voxels. To ensure that the interaction was due to strong activity in the action condition of the class of interest (a1) rather than in the control conditions of the second class (st2 and ds2), this contrast was masked with the activation map of the class of interest at p < .01. It was also inclusively masked with the contrast "all action and control conditions versus fixation" at p < .01 uncorrected to ensure that the activation reflected visual responses. Because an interaction guarantees only that the activation by one class exceeds that evoked by the other class, we also exclusively masked the contrast with the activation map of the other class at p < .01 uncorrected, ensuring little or no significant activation for the other class. This combination of the interaction at p < .001 uncorrected with its inclusive and exclusive maskings defines the specific map for a given class: the visually responsive regions activated specifically by observing that action class.
Because we hypothesized that parietal components of the specific map for vocal communication actions might be involved in auditory–motor integration, we used the coordinates of Spt (−51, −43, 20) reported by Buchsbaum et al. (2011) as the center of a 10-mm a priori ROI for small-volume correction (SVC) in the specific vocal communication map (see Table 2). Similarly, to augment the whole-brain analysis of the map specific for oral manipulation, we used the phAIP ellipse (Jastorff et al., 2010) as an a priori ROI for SVC.
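The two interaction contrasts can be expressed as weight vectors over the condition regressors. The regressor ordering and the small helper below are assumptions for illustration, not the actual SPM batch code.

```python
import numpy as np

# Assumed ordering of the condition regressors in one session's design matrix;
# class 1 is the class of interest, class 2 the other class.
regs = ["a1", "st1", "ds1", "a2", "st2", "ds2"]

def weights(expr):
    """Turn a {regressor: weight} dict into a contrast vector over `regs`."""
    return np.array([expr.get(r, 0.0) for r in regs])

# (a1 - st1) - (a2 - st2): interaction with the static controls
c_static = weights({"a1": 1, "st1": -1, "a2": -1, "st2": 1})
# (a1 - ds1) - (a2 - ds2): interaction with the dynamic controls
c_dynamic = weights({"a1": 1, "ds1": -1, "a2": -1, "ds2": 1})
```

Both vectors sum to zero, so any effect shared by the two classes (such as a general visual response) cancels; the specific map then requires both interactions to be significant, plus the inclusive and exclusive maskings performed at the second level.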

The third type of map, the common map, utilizes the conjunction of the activation maps of the two action classes tested in each session to visualize the regions significantly activated by observing both action classes.

Analysis of Auditory Main Experiment

This experiment was intended to localize area Spt in each participant. To that end, a GLM was computed for each participant (the same 22 individuals as in the visual main experiment) with a design matrix comprising listen, listen + rehearse, rehearse, and rest, plus six regressors from the realignment process (three translations and three rotations). All regressors were convolved with the canonical hemodynamic response function. For each participant, we defined the two contrasts comparing listen and rehearse with rest and then computed the conjunction map of these two contrasts. The voxels located within 10 mm of the Spt coordinates (−51, −43, 20), which are based on scanning a large cohort (n = 105) of participants (Buchsbaum et al., 2011), and reaching p < .01 in each of the contrasts of the conjunction were considered to define Spt in that participant.
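This single-participant definition of Spt amounts to a sphere-and-threshold rule, sketched here on toy data; the function name, the voxel list, and the p-values are illustrative.

```python
import numpy as np

SPT_MNI = np.array([-51.0, -43.0, 20.0])   # a priori center (Buchsbaum et al., 2011)

def define_spt(coords, p_listen, p_rehearse, radius=10.0, p_thr=.01):
    """Voxels within `radius` mm of the a priori Spt coordinate reaching
    p < p_thr in BOTH the listen-vs-rest and rehearse-vs-rest contrasts."""
    near = np.linalg.norm(coords - SPT_MNI, axis=1) <= radius
    return near & (p_listen < p_thr) & (p_rehearse < p_thr)

# Toy example: three voxels (mm coordinates) with voxelwise p-values.
coords = np.array([[-51., -43., 20.],    # at the center, passes both contrasts
                   [-49., -45., 22.],    # near, but fails the rehearse contrast
                   [-20., -43., 20.]])   # outside the 10-mm sphere
p_listen = np.array([.001, .005, .001])
p_rehearse = np.array([.002, .5, .001])
spt_mask = define_spt(coords, p_listen, p_rehearse)
```

Only the first toy voxel satisfies all three criteria, so it alone would enter that participant's Spt.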

Analysis of the Auditory Control Experiment

The control experiment was submitted only to an ROI-based, single-participant analysis. Its design matrix included nine regressors: three conditions (actor's voice, pure tones, and silence) and six motion regressors taken from the realignment process. This GLM was created for each individual participant, and the contrast of actor's voice versus pure tones (thresholded at p < .001), masked by actor's voice versus rest, was computed for each participant.

Maps and Profiles

The activation, specific, and common maps were projected (enclosing voxel projection) onto flattened left and right hemispheres of the human PALS B12 atlas (Van Essen, 2005; sumsdb.wustl.edu:8081/sums/directory.do?id=636032) using the Caret software package (Van Essen, 2002; brainvis.wustl.edu/caret). For visualization of the cytoarchitectonic areas, we used the maximum probability map (MPM) of parietal and inferior frontal areas (Caspers et al., 2006; Amunts et al., 1999). These areas are directly available in Caret, and their outline was drawn using the border function of Caret. For attribution of an activation site to a cytoarchitectonic area, the probability map of the area thresholded at 10% was used. The ellipses along the intraparietal sulcus are confidence ellipses defined directly in Caret by Jastorff et al. (2010): putative human AIP (phAIP) area, dorsal intraparietal sulcus anterior (DIPSA) area, and dorsal intraparietal sulcus medial (DIPSM) area. Areas phAIP and DIPSA are the homologues of the anterior (motor) and posterior (visual) parts of macaque AIP (Orban, 2016). The retinotopic MT cluster is also directly available in Caret (Abdollahi et al., 2014).

Activity profiles plot the mean (and standard error), across participants, of the MR signal change relative to the fixation baseline, expressed as percent of average activity, for the various conditions of the experiment. Because group profiles were computed for ROIs functionally defined by contrasts that also entered some of the profiles, a split analysis was employed to avoid circularity: two runs were used to define the local maximum and the remaining six to compute the profile (Ferri et al., 2015). Single-participant profiles were computed using all voxels that reached p < .01 in the contrast of interest and were located within 10 mm either of a group local maximum (LM) or of an a priori coordinate, such as that of Spt (Buchsbaum et al., 2011). Single-participant profiles were averaged across the participants in whom at least 5 voxels met these criteria and subjected to repeated-measures ANOVA.
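The profile computation and the split logic can be sketched on toy data; the ROI signal values, run counts, and condition layout are illustrative, while the percent-signal-change formula follows the description above.

```python
import numpy as np

rng = np.random.default_rng(2)
# Toy per-run mean signal for one ROI: 8 runs x 5 conditions
# (the last column is the fixation baseline).
signal = 100 + rng.normal(0, .5, size=(8, 5))
signal[:, 0] += 2.0                    # condition 1 responds above baseline

def percent_signal_change(runs):
    """MR signal change relative to the fixation baseline, in percent."""
    baseline = runs[:, -1:].mean()
    return 100.0 * (runs.mean(axis=0) - baseline) / baseline

# Split analysis: runs 0-1 would serve to define the local maximum,
# runs 2-7 to compute the statistically independent activity profile.
localizer_runs, profile_runs = signal[:2], signal[2:]
profile = percent_signal_change(profile_runs)
```

Because the baseline is its own reference, the fixation entry of the profile is zero by construction, and the split keeps ROI definition and profile estimation on disjoint data.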

Single-voxel correlations in Spt and PFm were likewise restricted to participants in whom at least 5 voxels of these regions met the above criteria. The activation of single voxels in one contrast was plotted as a function of their activation in a second contrast in single participants (Jastorff & Orban, 2009; Peelen, Wiggett, & Downing, 2006). A significant correlation suggests that a single voxel population is involved in the two contrasts, whereas an anticorrelation suggests that distinct voxel populations are involved. The regression lines were averaged across participants and plotted along with their 95% confidence limits.
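The single-voxel correlation analysis can be sketched as follows, with synthetic voxel activations standing in for the real contrast estimates; the positive relation between the two contrasts is built into the toy data.

```python
import numpy as np

rng = np.random.default_rng(3)

def voxel_regression(act1, act2):
    """Slope and intercept of the regression of contrast-2 activations on
    contrast-1 activations across the voxels of one participant's ROI."""
    slope, intercept = np.polyfit(act1, act2, 1)
    return slope, intercept

# Toy data: 12 participants, 30 ROI voxels each; the two contrasts are
# positively related, as expected if one voxel population drives both.
slopes = []
for _ in range(12):
    act1 = rng.normal(size=30)
    act2 = 0.8 * act1 + rng.normal(scale=.3, size=30)
    slopes.append(voxel_regression(act1, act2)[0])

mean_slope = np.mean(slopes)
# 95% confidence limits on the mean slope across participants
ci_half = 1.96 * np.std(slopes, ddof=1) / np.sqrt(len(slopes))
```

A mean slope reliably above zero would indicate a shared voxel population; a negative slope would instead point to distinct populations for the two contrasts.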

RESULTS

Visual Main Experiment

Two participants did not fixate well, averaging more than 15 saccades/min during scanning of the visual main experiment, and were therefore omitted from the data analysis. The remaining 22 participants included in the GLMs of the main experiments averaged only 5.67 (SD = 1.8) saccades/min in the visual main experiment. The number of saccades did not differ across conditions (F(4, 16) = 0.6, p > .5).

Activation Maps for the Four Classes of Observed Actions

In a whole-brain analysis, we defined the activation map for each action class (Abdollahi et al., 2013) by the conjunction of contrasts comparing action to static and to dynamic controls (see Methods). This yielded four activation maps (Table 1), including one for vocal communication and three for manipulation with different effectors (mouth, hand, and hand to mouth). In the vocal communication map, the activity pattern was largely restricted to the lateral cortex, as shown both on the folded hemispheres (Figure 3A, C) and the corresponding flatmaps (Figure 3B, D). It was relatively symmetrical in occipitotemporal cortex, including the posterior and middle superior temporal gyrus (STG) bilaterally and the MT cluster, predominantly in the left hemisphere (Figure 3B, D). The parietal, premotor, and prefrontal activation sites were predominantly, though not exclusively, left-sided: in the left inferior parietal lobule (IPL), more precisely the ventral tip of PFm, left pars opercularis (BA 44), and left pars triangularis (BA 45), but also in right ventral premotor cortex (Figure 3B, D).
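The conjunction logic above (a voxel counts as activated only if it survives both the action-versus-static and action-versus-dynamic contrasts) can be sketched with a minimum-statistic rule. This is an illustrative sketch with a hypothetical threshold, not the thresholding actually applied in SPM.

```python
import numpy as np

def conjunction_map(t_vs_static, t_vs_dynamic, t_threshold):
    """Conjunction of two contrasts via the minimum statistic: a voxel
    enters the activation map only if its t-value exceeds the threshold
    in BOTH the action-vs-static and action-vs-dynamic contrasts."""
    return np.minimum(t_vs_static, t_vs_dynamic) > t_threshold
```

Thresholding the minimum of the two t-maps is equivalent to requiring significance in each contrast separately.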

Table 1. 

Local Maxima of the SPMs of the Activation Maps

| Area | Vocal Comm. L | Vocal Comm. R | Oral Manip. L | Oral Manip. R | Object Manip. L | Object Manip. R | Hand+Mouth L | Hand+Mouth R |
|---|---|---|---|---|---|---|---|---|
| pMSTv | | 48, −62, 8 | −46, −62, 6 | 46, −62, 4 | −46, −74, 4 | 50, −72, 6 | −46, −74, 4 | 50, −66, 2 |
| Mid STG | −54, −4, −12 | 56, −2, −12 | | | | | | |
| Post STG | −66, −40, 10 | 64, −36, 10 | | | | | | |
| DIPSM | | | | | −24, −66, 58 | 22, −56, 64 | | |
| DIPSA | | | | | −34, −48, 60 | 36, −50, 64 | | |
| phAIP | | | −36, −48, 50 | | −36, −48, 60 | | −34, −50, 52 | 36, −40, 58 |
| PF | | | −58, −20, 26 | 58, −20, 26 | −52, −30, 22 | | −62, −22, 28 | |
| PFm | −62, −48, 20 | | | | | | | |
| BA 45 | −48, 28, 4 | | | | | | | |
| BA 44 | −44, 18, 24 | 40, 10, 26 | | | | | | |
| Premotor | | 40, −4, 52 | | | | | | 40, −6, 56 |

MNI coordinates (x, y, z) in mm; L/R = left/right hemisphere.
Figure 3. 

Activation map of vocal communication observation: SPM showing voxels significant (yellow, see color code) in the activation map in the left (A, B) and right (C, D) hemisphere, displayed on the folded brain (A, C) and flatmap (B, D). Rectangles in B and D indicate the parts of the flatmap shown in Figures 4 and 5. Colored outlines: MPM of inferior parietal and inferior frontal cytoarchitectonic regions (Caspers et al., 2006; Amunts et al., 1999) and the retinotopic MT cluster (Abdollahi et al., 2014). IPS = intraparietal sulcus; CS = central sulcus; preCS = precentral sulcus; postCS = postcentral sulcus; STS = superior temporal sulcus; ITS = inferior temporal sulcus; SF = Sylvian fissure; a = anterior; p = posterior; v = ventral; d = dorsal.


All the manipulative actions (oral manipulation, object manipulation, hand and mouth manipulation) activated the three processing stages typical of action observation (Figure 4, Table 1): occipitotemporal, parietal, and premotor cortex (Rizzolatti, Cattaneo, Fabbri-Destro, & Rozzi, 2014; Caspers, Zilles, Laird, & Eickhoff, 2010; Jastorff et al., 2010). The occipitotemporal activation included the MT cluster in all manipulative actions, especially in the right hemisphere. The parietal activation included, for all observed manipulative actions, the left phAIP region, the proposed homologue of the rostral part of monkey AIP area (Orban, 2016; Vanduffel, Zhu, & Orban, 2014; Georgieva, Peeters, Kolster, Todd, & Orban, 2009). In addition, the left cytoarchitectonic PFt was activated in oral and hand and mouth manipulation. Two bilateral motion-sensitive areas (Sunaert, Van Hecke, Marchal, & Orban, 1999), DIPSM and DIPSA, were activated by observation of the three manipulation classes, either in left or right hemisphere. Only hand and mouth manipulation yielded a significant premotor activation in the right hemisphere.

Figure 4. 

Activation maps of manipulation observation: SPM showing voxels significant (yellow, see color code) in the activation map of oral manipulation (A, B), hand and mouth manipulation (C, D), and object manipulation (E, F) in the left (A, C, E) and right (B, D, F) hemispheres displayed on flatmaps. Same conventions as in Figure 3.


Higher-order Maps

The second type of map, the specific activation map of each action class, was defined by the interaction between action class and presentation mode (see Methods). Such maps were obtained in the first session, but not in the second. The vocal communication specific map (Table 2, Figure 5B, D) revealed a region in the left ventral IPL, straddling the border of area PFm, in addition to a region in left BA 45 and bilateral posterior, middle, and anterior STG sites. The inferior parietal region at the edge of PFm reached p < .001 uncorrected in the interaction, but using the coordinates of Buchsbaum et al. (2011) for Spt (−51, −43, 20), its rostral part reached family-wise error (FWE) correction at the voxel level (Table 2). This site is located near the boundary of PFm, but because 15 of its 22 voxels fell within PFm, it is referred to as PFm. Its activity profile (Figure 5B, inset) indicated that, in the first session, it responded only when participants observed vocal communication, while being completely silent in the second session.
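The defining interaction can be written out as a simple 2 × 2 difference-of-differences contrast. The sketch below is illustrative only; the condition labels are hypothetical shorthand for the vocal communication (VC) and oral manipulation (OM) action and control conditions, not the authors' design-matrix names.

```python
import numpy as np

def interaction_contrast(betas):
    """2 x 2 interaction between action class and presentation mode:
    (vocal-communication action - its control) minus
    (oral-manipulation action - its control), computed voxel-wise."""
    return ((betas["VC_action"] - betas["VC_control"])
            - (betas["OM_action"] - betas["OM_control"]))
```

A positive interaction value marks voxels whose action-versus-control response is larger for vocal communication than for oral manipulation, i.e., voxels specific for observing vocal communication.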

Table 2. 

Local Maxima of the SPM of the Specific Vocal Communication Map (Session 1)

| Site | MNI Left | MNI Right | t | FWE Voxel (Brain) | FWE Cluster (Brain) | FWE Voxel (SVC: Spt −51, −43, 20) | FWE Cluster (SVC) |
|---|---|---|---|---|---|---|---|
| 1 post STG | −50, −44, 12 | | 4.08 | 0.9 | 0.9 | **0.0001** | **0.01** |
| 2 ant STG | −54, 0, −12 | | 4.22 | 0.8 | 0.5 | – | – |
| 3 mid STG | −64, −36, 6 | | 5.20 | 0.1 | **0.02** | – | – |
| 4 mid STG | | 56, −4, −10 | 5.16 | 0.1 | 0.1 | – | – |
| 5 post STG | | 64, −36, 10 | 6.24 | **0.007** | **0.000** | – | – |
| 6 ant STG | | 50, 12, −16 | 3.54 | 0.9 | 0.1 | – | – |
| 7 PFm | −64, −48, 16 | | 4.75 | 0.4 | **0.02** | **0.001**\* | 0.06 |
| 8 BA 45 | −48, 26, 2 | | 3.67 | 0.9 | 0.9 | – | – |

Asterisk indicates that only the rostral part of PFm (−58, −44, 14) was included in the SVC. Bold represents significance at p < .05, FWE-corrected.

Figure 5. 

Vocal communication observation network: comparison of the specific map to the activation map. (A, C) SPM showing voxels significant (yellow, see color code) in the activation map displayed on the flatmap (partial) of left (A) and right (C) hemispheres. (B, D) SPM showing voxels significant (yellow, see color code) in the specific map displayed on the flatmap (partial) of left (B) and right (D) hemispheres. Inset in B: group activity profile of PFm (activation site 7). A and C have the same data as in Figure 3B, D, shown for comparison purposes. Colored outlines: MPM of inferior parietal and inferior frontal cytoarchitectonic regions (Caspers et al., 2006; Amunts et al., 1999). For the numbers, see Table 2.


The specific map of oral manipulation (yellow voxels in Figure 6A, B) included bilateral regions in IPL, mainly PFt and PFop, in addition to left phAIP and a site near the left MT cluster. These regions reached p < .001 uncorrected, but using previous phAIP activations (Ferri et al., 2015) as an a priori location for small-volume correction (SVC), the phAIP site (11 voxels) reached FWE correction (Table 3). For the purpose of comparison, Figure 6(A, B) also shows the specific map for vocal communication, described earlier, in red voxels. Whereas the parietal regions specific for oral manipulation are located in the dorsal part of the rostral IPL, that for vocal communication is situated in its ventral part. The activity profiles (Figure 6C, D) confirm that the phAIP site was specifically activated by observing oral manipulation in Session 1 but was also activated by observing object manipulation and hand and mouth manipulation in Session 2.

Figure 6. 

Oral manipulation specific map, compared with the vocal communication specific map in left (A) and right (B) hemispheres and activity profiles (split plot analysis) of phAIP site for conditions of Session 1 (C) and Session 2 (D). A, B: SPM showing voxels significant (yellow, see color code) in the specific map for oral manipulation displayed on partial flatmaps (for the numbers, see Table 3). For comparison, the specific map of vocal communication is indicated by red voxels (same data as in Figure 5). In A, B: red ellipses: phAIP, other conventions as in Figures 3 and 4; group profiles in C and D were obtained by split analysis. In C, D: C = vocal communication; O = oral manipulation; M = object manipulation; H = hand and mouth manipulation; A = action; D = dynamic control; S = static control (as in Figure 2).


Table 3. 

Local Maxima of the SPM of the Specific Oral Manipulation Map (Session 1)

| Site | MNI Left | MNI Right | t | FWE Voxel (Brain) | FWE Cluster (Brain) | FWE Voxel (SVC: phAIP) | FWE Cluster (SVC) |
|---|---|---|---|---|---|---|---|
| 1 post ITS | −52, −64, 4 | | 4.44 | 0.6 | 0.4 | – | – |
| 2 phAIP | −36, −46, 48 | | 4.05 | 0.9 | 0.9 | **0.01** | **0.02** |
| 3 PF/PFt | −60, −28, 40 | | 6.12 | **0.01** | **0.001** | – | – |
| 4 PFop | −62, −18, 22 | | 4.07 | 0.9 | **0.001** | – | – |
| 5 PFop | | 60, −20, 26 | 4.05 | 0.9 | 0.07 | – | – |

Bold represents significance at p < .05, FWE-corrected.

The common activation map obtained in Session 1 (Figure 7A, B) included no parietal or premotor regions, being limited to the MT cluster and posterior STG bilaterally. The map common to object manipulation and hand and mouth manipulation obtained in Session 2 (Figure 7C, D) included the MT cluster, predominantly on the right side, and a small site near the caudal border of left phAIP.

Figure 7. 

Common maps in left (A, C) and right (B, D) hemispheres for observation of vocal communication and oral manipulation (A, B) and for observation of hand and mouth manipulation and object manipulation (C, D) derived from the first and second sessions of the visual main experiment, respectively. Same conventions as in Figure 4.


Auditory Experiments

The visual main experiment yielded a parietal region, left PFm, that was specifically involved in observing vocal communication. The LM of this region (−64, −48, 16) was about 15 mm from the Spt coordinate of Buchsbaum et al. (2011). This distance suggests that Spt may be distinct from PFm, even though the interaction defining the specific map of vocal communication still reached significance (t = 2.14, p < .02) at the Spt coordinate and the rostral part of PFm was located within 10 mm of Spt (Buchsbaum et al., 2011). To obtain further information about the involvement of Spt in vocal communication and to compare its properties to those of PFm, we performed a single-participant analysis of the auditory main experiment. Spt could be defined in all 22 participants, with the number of voxels defining the area ranging from 1 to 194 (Table 4). These individual sites were all located in cytoarchitectonic PFcm, the part of the IPL extending onto the lower bank of the lateral fissure (Figure 8A), and occupied mainly its ventral and caudal part. The median of these individual locations corresponds closely to the Buchsbaum coordinate (Figure 9A).
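The "about 15 mm" figure follows directly from the Euclidean distance between the two MNI coordinates quoted in the text; a quick check:

```python
import math

def mni_distance(p, q):
    """Euclidean distance in mm between two MNI coordinates."""
    return math.dist(p, q)

pfm_lm = (-64, -48, 16)  # PFm local maximum (this study)
spt = (-51, -43, 20)     # Spt coordinate (Buchsbaum et al., 2011)
# mni_distance(pfm_lm, spt) = sqrt(13**2 + 5**2 + 4**2) = sqrt(210) ≈ 14.5 mm
```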

Table 4. 

Definition of Spt and PFm in Single Participants: Number of Voxels, t and p Values, and Coordinates of LM

| Participant | Spt Voxels | Spt t | Spt p | Spt x, y, z (mm) | PFm Voxels | PFm t | PFm p | PFm x, y, z (mm) |
|---|---|---|---|---|---|---|---|---|
| Ale | 48 | 2.67 | .004 | −50, −40, 10 | 25 | 2.41 | .008 | −60, −44, 16 |
| Ama | 19 | 2.04 | .02 | −56, −40, 22 | 21 | 2.47 | .007 | −62, −48, 12 |
| Amo | 194 | 2.99 | .001 | −52, −44, 16 | 87 | 3.21 | .001 | −60, −54, 22 |
| Bat | 16 | 2.21 | .014 | −48, −42, 16 | | 2.22 | .013 | −70, −46, 12 |
| Bia | | 1.77 | .038 | −42, −46, 20 | | 2.24 | .012 | −68, −56, 14 |
| Dch | | 1.89 | .029 | −52, −44, 24 | | 2.7 | .004 | −68, −40, 16 |
| Dna | 23 | 2.51 | .006 | −52, −46, 22 | 63 | 3.22 | .001 | −60, −44, 16 |
| Dma | | 1.77 | .038 | −54, −38, 26 | 77 | 2.46 | .007 | −60, −52, 20 |
| Fas | 66 | 2.61 | .005 | −56, −42, 22 | 154 | 2.92 | .002 | −66, −46, 24 |
| Mat | | 1.84 | .033 | −46, −38, 20 | | 2.4 | .008 | −64, −52, 24 |
| Mol | | 2.16 | .015 | −50, −46, 24 | 30 | 2.25 | .012 | −68, −44, 20 |
| Old | 126 | 2.73 | .003 | −48, −42, 26 | 40 | 2.56 | .005 | −58, −54, 16 |
| Pal | 28 | 2.19 | .014 | −46, −46, 26 | 40 | 3.61 | .001 | −64, −44 |
| Pet | 11 | 2.13 | .017 | −44, −48, 18 | 271 | 4.23 | .001 | −64, −44, 20 |
| Pie | 18 | 2.81 | .003 | −46, −44, 12 | | 1.96 | .025 | −68, −44 |
| Pin | 24 | 2.3 | .011 | −52, −40, 18 | 12 | 2.98 | .001 | −58, −54, 18 |
| Pla | 125 | 2.75 | .003 | −52, −48, 24 | 19 | 2.73 | .003 | −62, −44 |
| Puz | 12 | 2.27 | .012 | −56, −46, 28 | | 2.51 | .006 | −56, −52, 14 |
| Scm | 26 | 2.47 | .007 | −48, −36, 14 | | 1.76 | .039 | −64, −46, 24 |
| Scs | 16 | 2.63 | .004 | −56, −46, 28 | 65 | 2.76 | .003 | −56, −46, 18 |
| Tra | | 2.42 | .008 | −48, −40, 28 | 69 | 2.93 | .002 | −68, −52, 12 |
| Vig | | 2.53 | .006 | −48, −36, 26 | | 1.82 | .034 | −60, −42, 10 |
| Median | 17 | 2.36 | .01 | −50, −43, 22 | 23 | 2.49 | .005 | −63, −46, 16 |
Figure 8. 

Comparison of Spt and PFm. Location of Spt (A) and PFm (G) sites in 22 single participants. Single-participant activity profiles of Spt and PFm for all conditions (B, H) and the eight exemplar videos (C, I) of the first session of the visual main experiment, for the second session of that experiment (D, J), and for the auditory main (E, K) and control (F, L) experiments.


Figure 9. 

Properties of Spt and PFm voxels. (A) Median location of individual Spt (green star) and PFm (red star) in the left parietotemporal junction. Yellow square: Spt from Buchsbaum et al. (2011); pink outline: Spt-like ICA component of Simmonds et al. (2014); blue and gray outlines: PFm and post STG from the specific map of vocal communication. (B, C) Activation levels of single voxels of PFm (blue) and Spt (red) for vocal communication observation plotted as a function of the level for rehearsal (B) and for actors' voice (C). Mean regression (full lines) and 95% confidence limits (dashed lines) are shown. Mean equations in B: y = 0.39 (SD 0.32) x + 0.19 (SD 0.13) for PFm and y = 0.45 (SD 0.6) x − 0.04 (SD 0.19) for Spt; in C: y = −0.47 (SD 0.72) x + 0.4 (SD 0.70) for PFm and y = 0.84 (SD 0.85) x + 0.04 (SD 0.3) for Spt.


Activity profiles (Figure 8B–F) were computed using data from all three experiments for 19 of the 22 participants (see Methods). The most important profile is that for the first session of the visual main experiment, in which the interaction between action class and presentation mode reached significance (repeated-measures ANOVA, F(1, 18) = 4.81, p < .015), indicating that Spt is indeed also specific for vocal communication. The other profiles indicate that Spt is not involved in the observation of hand or hand and mouth manipulative actions (Figure 8D) but, as expected, is activated by listening to nonsense words and rehearsing them silently (Figure 8E) and also by listening to the actors' vocal communication (Figure 8F).

To compare Spt with PFm, we also analyzed PFm in single participants. Voxels within 10 mm of the group coordinate of PFm (−64, −48, 16) that reached p < .01 in the interaction between class and presentation mode were regarded as defining PFm in the individual participants. PFm was again defined in all 22 participants, with the number of voxels ranging from 2 to 154 (Table 4). All these individual sites were located in the ventral part of PFm or within a few millimeters of it in the neighboring STG (Figure 8G). The segregation of the individual Spt and PFm sites, located in different cytoarchitectonic sectors of IPL and separated by an average of 18 mm across participants, supports the view that Spt and PFm are neighboring yet distinct parietal areas.

The single-participant activity profiles of PFm for the three experiments are shown in Figure 8H–L. The profiles for the two visual sessions (Figure 8H, J) are similar to those obtained in the group analysis (Figure 5B, inset). The responses during rehearsal are weak but present, unlike those during listening (Figure 8K). Combining the 14 conditions of the visual and auditory main experiments into a single ANOVA yielded a significant interaction between ROI and condition (F(1, 15) = 3.02, p < .001), further supporting the distinction between Spt and PFm. However, the profiles for the eight action exemplars of the first visual session indicated neither a difference between ROIs (F(1, 7) = 2.6, p > .1) nor an interaction between condition and ROI (F(1, 7) = 1.55, p > .1). In both areas, the four vocal communication exemplars evoked more activity than the four oral manipulation exemplars, underscoring the involvement of Spt in the observation of vocal communication. Finally, the auditory control experiment identified a further distinction between Spt and PFm: whereas Spt reacts about as strongly to others' nonsense words (listen condition of the auditory main experiment) as to the actors' vocal communication sounds (Figure 8E, F), PFm reacts more strongly to the actors' sounds than to others' nonsense words (Figure 8K, L).

Correlation Analysis in ROIs

Spt is thus involved in speech production, as indexed by its response to nonsense words and their rehearsal, as well as in observing vocal communication. This does not necessarily imply, however, that the same set of voxels responds in the visual and speech conditions. To address this ambiguity, in each participant we plotted the activation levels of individual Spt voxels for the two conditions, whereby a strong correlation between the levels indicates that a single group of voxels subserves both functions (Jastorff & Orban, 2009; Peelen et al., 2006). The individual regression lines were averaged across participants and are shown in Figure 9 (19 participants in B and 9 in C). In Spt, the visual observation responses increased linearly with the rehearsal activation level (Figure 9B). The correlation was significant in 18 of 19 individual participants, as well as in the group (mean r = .66, SD = .05), explaining over 40% of the variance. Hence, a single group of Spt voxels is involved in both rehearsal and observation, strengthening the link between these two behaviors. The same visual observation responses also increased significantly (8 of 9 participants) with the responses to the actors' voice in Spt (Figure 9C; mean r = .56, SD = .08), explaining 32% of the variance. Thus, a single group of voxels defined by its response to nonsense sentences and their rehearsal is activated by observing others communicating vocally and also, to some degree, by hearing them.

In contrast, visual observation responses in PFm still increased significantly (18 of 19 participants) with rehearsal responses (Figure 9B; mean r = .49, SD = .08), but decreased significantly (9 of 9 participants) with the responses to the actors' voice (mean r = −.57, SD = .11), suggesting two distinct populations of voxels, one visual and one auditory (Figure 9C). Although these relationships explained only a quarter to a third of the variance, PFm and Spt displayed significantly different slopes in the relationship between observing and hearing others communicate vocally (paired t test, t = −4.31, p < .005). Thus, PFm is functionally distinct from Spt, even though observation of vocal communication and rehearsal converge onto single groups of voxels in both regions.

DISCUSSION

Our results show that visual signals related to the observation of vocal communication specifically activate a small part of PFm at the transition between the supramarginal and angular gyri, confirming that observed actions activate diverse parts of PPC, including actions in the auditory realm. Using individual ROIs defined by a separate auditory experiment, we found that these visual signals also reach Spt, a parietal node involved in auditory–motor transformation, directly supporting the view that observation serves as a proxy for execution, at least in PPC. Because PFm, the parietal region maximally engaged by observing vocal communication, is spatially and functionally distinct from Spt, these results also suggest that the proxy is only approximate. Taken together, our results support the functional view of PPC organization.

Observing Vocal Communication Activates the Most Ventral Parts of Rostral IPL

The two parietal regions specifically involved in the observation of vocal communicative actions are located in distinct sectors of the supramarginal gyrus (PFm and PFcm). Although we obtained clear evidence of overlap between the two activation sites, their local maxima both at the group and individual levels are distant enough to propose that they are neighboring yet distinct regions. The median location of Spt in this study (Figure 9) corresponds closely to the mean location for over 100 participants reported by Buchsbaum et al. (2011). The maps indicate that Spt is located in PFcm, the extension of IPL onto the lower bank of the lateral fissure, providing evidence that it is in fact a parietal area. In most participants, it occupies the ventral part of PFcm, near the planum temporale, consistent with its involvement in auditory processing as indicated by Simmonds et al. (2014) and the studies of Hickok and colleagues (Buchsbaum et al., 2011; Hickok et al., 2003). As the rostroventral boundary of PFcm is ill defined, given the lack of identified cytoarchitectonic regions in the posterior part of the STG, further work is needed to confirm this identification of Spt as a parietal region. Furthermore, the supramarginal regions PFm and PFcm correspond to parts of primate cortex that have expanded dramatically in humans (Van Essen & Dierker, 2007) and are strongly asymmetric between hemispheres (Van Essen, Glasser, Dierker, Harwell, & Coalson, 2012), in keeping with their putative roles in a typically human behavior.

It is remarkable that small movements of the mouth and cheek, observed laterally during vocal communication, specifically activate these cortical regions. Facial movements during speech and other vocal communicative actions are a direct consequence of the vocal-tract motion that defines speech acoustics (Yehia, Kuratate, & Vatikiotis-Bateson, 2002). Computer vision has shown that facial behavior is a robust predictor of speech acoustics (Yehia, Rubin, & Vatikiotis-Bateson, 1998). In fact, the spectral envelope of speech acoustics can be better estimated from 3-D motion of the face than from motion of the anterior part of the vocal tract (lips, tongue, and jaw). Thus, it is unsurprising that observing vocal communication actions evokes a response in the human IPL. What is not directly evident from those computer studies is the specificity of these activations, because very similar movements of the head and mouth in oral manipulation do not activate these regions. Those computer studies were performed using a frontal view, in which most of the facial movements are out of the frontoparallel plane and would require stereopsis to be properly evaluated. In the absence of stereopsis, as was the case here, a lateral view may provide sufficiently robust information, because movements of the jaw can be accurately tracked in this view. Further work is needed to assess the roles of viewpoint and stereopsis in the observation of communicative actions, as has recently been done for manipulative actions (Ferri et al., 2016).

Spt Is Activated by Observing Vocal Communication

The results of the auditory main experiment, which defined Spt by the conjunction of auditory stimulation and rehearsal, are consistent with other recent studies (Buchsbaum et al., 2011; Hickok, Okada, & Serences, 2009; Okada & Hickok, 2009; Hickok et al., 2003). The Spt region is compromised in patients with conduction aphasia (Buchsbaum et al., 2011), a syndrome characterized by phonological production deficits alongside relatively spared auditory comprehension (Baldo, Klostermann, & Dronkers, 2008; Goodglass, 1992; Benson et al., 1973). Buchsbaum et al. (2011) therefore argued that conduction aphasia is a "dorsal stream disorder." Our study suggests that Spt is located in the rostroventral corner of the cytoarchitectonic IPL regions. This finding supports the view that Spt belongs to the dorsal auditory stream but requires further study, in particular the mapping of cytoarchitectonic regions in the posterior part of the STG.

The results of the visual and auditory main experiments combined demonstrate that Spt, considered an auditory–motor transformation region (Buchsbaum et al., 2011), also receives visual signals related to the actions whose planning it contributes to. The properties of this region make it particularly well suited for demonstrating that observation can serve as a proxy for planning and execution. Although its location is relatively variable across individual participants, it has now been mapped in a large cohort (n = 100) of participants (Buchsbaum et al., 2011), firmly establishing its location at the posterior end of the lateral fissure. It is a relatively small area, as indicated by its sensitivity to smoothing kernel size, requiring a precise match between observation and execution. Finally, it is not speech specific, being activated equally well by sentences and tonal stimuli (Hickok et al., 2003), thereby matching the range of vocal communication actions tested visually. Testing the activity of Spt in each participant using the conditions of the first session of the visual main experiment yielded a significant interaction between action class and presentation mode, the hallmark of a region specific for observing vocal communication. Thus, even though Spt did not appear as a distinct activation site in the specific map of vocal communication observation, our experimental strategy proved successful and provided direct support for the view that observation can serve as a proxy for execution in PPC, not only for manipulative actions but also for an additional class of actions.

A Neighboring Parietal Region Is Also Involved in Observing Vocal Communication

The main parietal region in the specific map of vocal communication was located in left PFm, slightly caudal to Spt. The location of the individual action sites confirms that the functional activation straddles the ventral border of PFm. Technical factors suggest that the definition of this ventral border is only approximate. The cytoarchitectonic definition was obtained from 10 postmortem participants, obviously different from those of this study, for whom Caspers et al. (2006) reported large individual variability. MPMs, moreover, are optimally defined only when areas on both sides of the boundary are included in the MPM, which is not the case for the ventral boundary of PFm (see above).

The PFm region was functionally and spatially distinct from Spt and hosted two populations of voxels, one dominated by the visual aspect of vocal communication and the other by its auditory aspect. Furthermore, listening to the actors' vocal communication was more effective in this area than listening to nonsense words, in contrast to Spt. Finding an additional parietal region specifically involved in the observation of communicative mouth actions is consistent with the results of Simmonds et al. (2014), suggesting that parietal regions other than Spt might be involved in speech production, a function for which PFm is a prime candidate. Speech, however, was only one of four vocal communicative actions tested, and little is known about the parietal regions involved in producing the other vocal communication actions. PFm might be one of the areas involved in the sensory-motor transformations for whistling, singing, or shouting. Indeed, a site (−52, −46, 22) close to our PFm is activated by voluntary shifts in pitch during singing (Zarate, Wood, & Zatorre, 2010). Yet the comparison of the activity profiles for the individual action exemplars (Figure 8) did not provide any evidence for differential involvement of PFm and Spt in different vocal communication actions. This latter finding is consistent with the activation of Spt not only by silent speech (Hickok et al., 2003) but also by auditory and visual processing of rhythms (Karabanov, Blom, Forsman, & Ullen, 2009) and by musical improvisation (Bengtsson, Csikszentmihalyi, & Ullen, 2007). It is worth noting that language, and probably vocal communication in general, appeared relatively recently in evolution, unlike manipulation or locomotion, which are shared with much older ancestors, including monkeys. Thus, the parietal organization of the sensory-motor transformations underlying this behavior may be different and more distributed than that for other actions such as manipulation or climbing.
This would mean that communicative vocal actions are actually not the ideal testbed for the quality of the proxy, as there may be more variability in the overlap between execution and observation regions.

On the other hand, the segregation of Spt and PFm, although not complete, indicates that the spatial activation profiles for execution and observation of vocal communication actions match only partially. One explanation might be the distributed architecture of sensory transformation of vocal communication, as noted above. Alternatively, it may be that action observation is indeed but an imperfect proxy of execution in PPC and that parietal regions involved in execution and observation coincide only partially. In this view, only those regions in which planning and visual signals are equally strong may house the hypothetical mirror neurons for vocal communication. Given the imbalance of activation strength by rehearsal and action observation, both Spt and PFm may contain only a small proportion of mirror neurons. In PFm, this relatively restricted mirror neuron population might be surrounded by relatively large numbers of purely visual, observed action-selective neurons. Such an arrangement might be sufficient to generate action observation signals that are invariant, a property that we (Orban, 2012, 2015) have repeatedly argued is the main reason for the involvement of PPC in action observation. A small set of mirror neurons may suffice as a seed to compel a large number of visual dominant neurons to also become invariant for changes in the viewpoint or posture of the actor. Further work, including that at the neuronal level, is thus needed to test these views, which need not be mutually exclusive.

Other Regions Activated by Observing Vocal Communication

Consistent with previous studies of lip-reading (Okada & Hickok, 2009; Buccino et al., 2004; Calvert et al., 1997), the vocal communication specific map included, in addition to Spt, a speech-related region in the left frontal cortex (BA 45) and several regions bilaterally in STG. The latter are indeed close to the voice regions of Belin, Zatorre, Lafaille, Ahad, and Pike (2000), who described a rostrocaudal string of areas along the STG providing information about the speaker's identity. Although the visual responses observed here are in agreement with the results of Watson, Latinus, Charest, Crabbe, and Belin (2014), they do not support the proposal of Watson et al. (2014) that these regions are "people-selective." Indeed, these regions were active in the specific map, which implies that they are not activated by mouth movements having no communicative goal. Hence, they cannot be considered "people-selective" regions (Watson et al., 2014), as not all dynamic faces will drive them. "Communicator regions" or "audio-visual voice regions" seem more appropriate terms.

The prearticulatory processing stages of speech production in frontal cortex are strongly left-lateralized to classic Broca's area (Brodmann's areas 44 and 45), whereas the primary motor cortical outflow to the many axial muscles controlling speech production, as well as the auditory and somatosensory reafferent feedback, involves both cerebral hemispheres (Hickok & Poeppel, 2007). Observing vocal communication activates left BA 44/45, suggesting the possible presence in these regions of mirror or observed action-selective neurons for speech. Cerri et al. (2015) have recently disputed the existence of mirror neurons in these regions, but the mouth actions participants observed in that study, though communicative in nature, were not vocal.

phAIP Activation by Various Manipulation Actions

An additional aspect of our study was the investigation of the cortical sector specifically activated during observation of actions having the same motor goal but using a different effector. Consistent with the previous literature (Ferri et al., 2015; Abdollahi et al., 2013; Jastorff et al., 2010), area phAIP was specifically involved in observing manipulative actions. The activation of phAIP by observing oral manipulation, hand and mouth manipulation, and object manipulation confirms that phAIP is activated by manipulation independent of the effector employed (Jastorff et al., 2010). Effector independence in phAIP has also been reported in a TMS study of grasping with the left and right hands (Davare, Andres, Clerget, Thonnard, & Olivier, 2007). The present results further indicate that the nature of the object (food or otherwise) is likewise unimportant. On the other hand, it is interesting that the manipulative hand action activation was more caudal than the manipulative mouth action activation in phAIP. Further work is needed to understand the internal organization of phAIP, a region that has greatly expanded in humans. The activation of phAIP by manipulative actions, independent of the effector used, combined with the segregation of the specific maps for vocal communication and oral manipulation, which share the mouth as effector, strongly supports the functional organization interpretation of the PPC.

Conclusions

Our study, the first to investigate the observation of actions that depend on audio-motor transformation, has demonstrated that visual signals related to observed actions reach Spt, an audio-motor transformation site, as well as neighboring parietal regions, confirming the widespread involvement of PPC in action observation.

Acknowledgments

This study was supported by ERC Grant Parietalaction (323606). The authors are grateful to Dr. S. Ferri for help with the analysis.

Reprint requests should be sent to Guy A. Orban, Department of Neuroscience, University of Parma, Via Volturno 39, 43100 Parma, Italy, or via e-mail: guy.orban@med.kuleuven.be.

REFERENCES

Abdollahi, R. O., Jastorff, J., & Orban, G. A. (2013). Common and segregated processing of observed actions in human SPL. Cerebral Cortex, 23, 2734–2753.
Abdollahi, R. O., Kolster, H., Glasser, M. F., Robinson, E. C., Coalson, T. S., Dierker, D., et al. (2014). Correspondences between retinotopic areas and myelin maps in human visual cortex. Neuroimage, 99, 509–524.
Amunts, K., Schleicher, A., Burgel, U., Mohlberg, H., Uylings, H. B., & Zilles, K. (1999). Broca's region revisited: Cytoarchitecture and intersubject variability. Journal of Comparative Neurology, 412, 319–341.
Baldo, J. V., Klostermann, E. C., & Dronkers, N. F. (2008). It's either a cook or a baker: Patients with conduction aphasia get the gist but lose the trace. Brain and Language, 105, 134–140.
Begliomini, C., Wall, M. B., Smith, A. T., & Castiello, U. (2007). Differential cortical activity for precision and whole-hand visually guided grasping in humans. European Journal of Neuroscience, 25, 1245–1252.
Belin, P., Zatorre, R. J., Lafaille, P., Ahad, P., & Pike, B. (2000). Voice-selective areas in human auditory cortex. Nature, 403, 309–312.
Bengtsson, S. L., Csikszentmihalyi, M., & Ullen, F. (2007). Cortical regions involved in the generation of musical structures during improvisation in pianists. Journal of Cognitive Neuroscience, 19, 830–842.
Benson, D. F., Sheremata, W. A., Bouchard, R., Segarra, J. M., Price, D., & Geschwind, N. (1973). Conduction aphasia: A clinicopathological study. Archives of Neurology, 28, 339–346.
Binkofski, F., Buccino, G., Stephan, K. M., Rizzolatti, G., Seitz, R. J., & Freund, H. J. (1999). A parieto-premotor network for object manipulation: Evidence from neuroimaging. Experimental Brain Research, 128, 210–213.
Binkofski, F., Dohle, C., Posse, S., Stephan, K. M., Hefter, H., Seitz, R. J., et al. (1998). Human anterior intraparietal area subserves prehension: A combined lesion and functional MRI activation study. Neurology, 50, 1253–1259.
Bonini, L., Maranesi, M., Livi, A., Fogassi, L., & Rizzolatti, G. (2014). Space-dependent representation of objects and other's action in monkey ventral premotor grasping neurons. Journal of Neuroscience, 34, 4108–4119.
Buccino, G., Lui, F., Canessa, N., Patteri, I., Lagravinese, G., Benuzzi, F., et al. (2004). Neural circuits involved in the recognition of actions performed by nonconspecifics: An fMRI study. Journal of Cognitive Neuroscience, 16, 114–126.
Buchsbaum, B. R., & D'Esposito, M. (2008). The search for the phonological store: From loop to convolution. Journal of Cognitive Neuroscience, 20, 762–778.
Buchsbaum, B. R., Olsen, R. K., Koch, P., & Berman, K. F. (2005). Human dorsal and ventral auditory streams subserve rehearsal-based and echoic processes during verbal working memory. Neuron, 48, 687–697.
Buchsbaum, B. R., Ye, D., & D'Esposito, M. (2011). Recency effects in the inferior parietal lobe during verbal recognition memory. Frontiers in Human Neuroscience, 5, 59.
Calvert, G. A., Bullmore, E. T., Brammer, M. J., Campbell, R., Williams, S. C., McGuire, P. K., et al. (1997). Activation of auditory cortex during silent lipreading. Science, 276, 593–596.
Caspers, S., Geyer, S., Schleicher, A., Mohlberg, H., Amunts, K., & Zilles, K. (2006). The human inferior parietal cortex: Cytoarchitectonic parcellation and interindividual variability. Neuroimage, 33, 430–448.
Caspers, S., Zilles, K., Laird, A. R., & Eickhoff, S. B. (2010). ALE meta-analysis of action observation and imitation in the human brain. Neuroimage, 50, 1148–1167.
Cavina-Pratesi, C., Monaco, S., Fattori, P., Galletti, C., McAdam, T. D., Quinlan, D. J., et al. (2010). Functional magnetic resonance imaging reveals the neural substrates of arm transport and grip formation in reach-to-grasp actions in humans. Journal of Neuroscience, 30, 10306–10323.
Cerri, G., Cabinio, M., Blasi, V., Borroni, P., Iadanza, A., Fava, E., et al. (2015). The mirror neuron system and the strange case of Broca's area. Human Brain Mapping, 36, 1010–1027.
Connolly, J. D., Andersen, R. A., & Goodale, M. A. (2003). fMRI evidence for a 'parietal reach region' in the human brain. Experimental Brain Research, 153, 140–145.
Culham, J. C., Danckert, S. L., DeSouza, J. F., Gati, J. S., Menon, R. S., & Goodale, M. A. (2003). Visually guided grasping produces fMRI activation in dorsal but not ventral stream brain areas. Experimental Brain Research, 153, 180–189.
Davare, M., Andres, M., Clerget, E., Thonnard, J. L., & Olivier, E. (2007). Temporal dissociation between hand shaping and grip force scaling in the anterior intraparietal area. Journal of Neuroscience, 27, 3974–3980.
Ferri, S., Rizzolatti, G., & Orban, G. A. (2015). The organization of the posterior parietal cortex devoted to upper limb actions: An fMRI study. Human Brain Mapping, 36, 3845–3866.
Filimon, F. (2010). Human cortical control of hand movements: Parietofrontal networks for reaching, grasping, and pointing. The Neuroscientist, 16, 388–407.
Filimon, F., Nelson, J. D., Huang, R. S., & Sereno, M. I. (2009). Multiple parietal reach regions in humans: Cortical representations for visual and proprioceptive feedback during on-line reaching. Journal of Neuroscience, 29, 2961–2971.
Frey, S. H., Hansen, M., & Marchal, N. (2015). Grasping with the press of a button: Grasp-selective responses in the human anterior intraparietal sulcus depend on nonarbitrary causal relationships between hand movements and end-effector actions. Journal of Cognitive Neuroscience, 27, 1146–1160.
Frey, S. H., Vinton, D., Norlund, R., & Grafton, S. T. (2005). Cortical topography of human anterior intraparietal cortex active during visually guided grasping. Brain Research. Cognitive Brain Research, 23, 397–405.
Gallese, V., Fadiga, L., Fogassi, L., & Rizzolatti, G. (1996). Action recognition in the premotor cortex. Brain, 119, 593–609.
Gallivan, J. P., McLean, D. A., Smith, F. W., & Culham, J. C. (2011). Decoding effector-dependent and effector-independent movement intentions from human parieto-frontal brain activity. Journal of Neuroscience, 31, 17149–17168.
Georgieva, S., Peeters, R., Kolster, H., Todd, J. T., & Orban, G. A. (2009). The processing of three-dimensional shape from disparity in human brain. Journal of Neuroscience, 29, 727–742.
Goodglass, H. (1992). Diagnosis of conduction aphasia. In S. E. Kohn (Ed.), Conduction aphasia (pp. 39–49). Hillsdale, NJ: Lawrence Erlbaum Associates.
Heed, T., Beurze, S. M., Toni, I., Roder, B., & Medendorp, W. P. (2011). Functional rather than effector-specific organization of human posterior parietal cortex. Journal of Neuroscience, 31, 3066–3076.
Hickok, G., Buchsbaum, B., Humphries, C., & Muftuler, T. (2003). Auditory–motor interaction revealed by fMRI: Speech, music, and working memory in area Spt. Journal of Cognitive Neuroscience, 15, 673–682.
Hickok, G., Okada, K., & Serences, J. T. (2009). Area Spt in the human planum temporale supports sensory-motor integration for speech processing. Journal of Neurophysiology, 101, 2725–2732.
Hickok, G., & Poeppel, D. (2007). The cortical organization of speech processing. Nature Reviews Neuroscience, 8, 393–402.
Hinkley, L. B., Krubitzer, L. A., Padberg, J., & Disbrow, E. A. (2009). Visual-manual exploration and posterior parietal cortex in humans. Journal of Neurophysiology, 102, 3433–3446.
Holmes, A., & Friston, K. (1998). Generalisability, random effects and population inference. Neuroimage, 7, S754.
Iacoboni, M., Woods, R. P., Brass, M., Bekkering, H., Mazziotta, J. C., & Rizzolatti, G. (1999). Cortical mechanisms of human imitation. Science, 286, 2526–2528.
Jastorff, J., Begliomini, C., Fabbri-Destro, M., Rizzolatti, G., & Orban, G. A. (2010). Coding observed motor acts: Different organizational principles in the parietal and premotor cortex of humans. Journal of Neurophysiology, 104, 128–140.
Jastorff, J., & Orban, G. A. (2009). Human functional magnetic resonance imaging reveals separation and integration of shape and motion cues in biological motion processing. Journal of Neuroscience, 29, 7315–7329.
Karabanov, A., Blom, O., Forsman, L., & Ullen, F. (2009). The dorsal auditory pathway is involved in performance of both visual and auditory rhythms. Neuroimage, 44, 480–488.
Konen, C. S., Mruczek, R. E., Montoya, J. L., & Kastner, S. (2013). Functional organization of human posterior parietal cortex: Grasping- and reaching-related activations relative to topographically organized cortex. Journal of Neurophysiology, 109, 2897–2908.
Leone, F. T., Heed, T., Toni, I., & Medendorp, W. P. (2014). Understanding effector selectivity in human posterior parietal cortex by combining information patterns and activation measures. Journal of Neuroscience, 34, 7102–7112.
Levy, I., Schluppeck, D., Heeger, D. J., & Glimcher, P. W. (2007). Specificity of human cortical areas for reaches and saccades. Journal of Neuroscience, 27, 4687–4696.
Nelissen, K., Borra, E., Gerbella, M., Rozzi, S., Luppino, G., Vanduffel, W., et al. (2011). Action observation circuits in the macaque monkey cortex. Journal of Neuroscience, 31, 3743–3756.
Nelissen, K., & Vanduffel, W. (2011). Grasping-related functional magnetic resonance imaging brain responses in the macaque monkey. Journal of Neuroscience, 31, 8220–8229.
Nichols, T., Brett, M., Andersson, J., Wager, T., & Poline, J. B. (2005). Valid conjunction inference with the minimum statistic. Neuroimage, 25, 653–660.
Nishimura, Y., Onoe, H., Morichika, Y., Tsukada, H., & Isa, T. (2007). Activation of parieto-frontal stream during reaching and grasping studied by positron emission tomography in monkeys. Neuroscience Research, 59, 243–250.
Okada, K., & Hickok, G. (2009). Two cortical mechanisms support the integration of visual and auditory speech: A hypothesis and preliminary data. Neuroscience Letters, 452, 219–223.
Orban, G. A. (2012). Lessons from the primate visual system. In Computer vision: ECCV 2012 (Lecture notes in computer science). Berlin: Springer-Verlag.
Orban, G. A. (2015). The mirror system in human and nonhuman primates: Comparative functional imaging studies suggest multiple systems. In New frontiers in mirror neurons research (chap. 7). Oxford: Oxford University Press.
Orban, G. A. (2016). Functional definitions of parietal areas in human and non-human primates. Proceedings of the Royal Society, Series B, Biological Sciences, 283.
Peelen, M. V., Wiggett, A. J., & Downing, P. E. (2006). Patterns of fMRI activity dissociate overlapping functional brain areas that respond to biological motion. Neuron, 49, 815–822.
Premereur, E., Janssen, P., & Vanduffel, W. (2015). Effector specificity in macaque frontal and parietal cortex. Journal of Neuroscience, 35, 3446–3459.
Rizzolatti, G., Cattaneo, L., Fabbri-Destro, M., & Rozzi, S. (2014). Cortical mechanisms underlying the organization of goal-directed actions and mirror neuron-based action understanding. Physiological Reviews, 94, 655–706.
Simmonds, A. J., Leech, R., Collins, C., Redjep, O., & Wise, R. J. (2014). Sensory-motor integration during speech production localizes to both left and right plana temporale. Journal of Neuroscience, 34, 12963–12972.
Sunaert, S., Van Hecke, P., Marchal, G., & Orban, G. A. (1999). Motion-responsive regions of the human brain. Experimental Brain Research, 127, 355–370.
Tremblay, P., Baroni, M., & Hasson, U. (2013). Processing of speech and non-speech sounds in the supratemporal plane: Auditory input preference does not predict sensitivity to statistical structure. Neuroimage, 66, 318–332.
Tremblay, P., Deschamps, I., & Gracco, V. L. (2013). Regional heterogeneity in the processing and the production of speech in the human planum temporale. Cortex, 49, 143–157.
Van Essen, D. C., & Dierker, D. L. (2007). Surface-based and probabilistic atlases of primate cerebral cortex. Neuron, 56, 209–225.
Van Essen, D. C., Drury, H. A., Dickson, J., Harwell, J., Hanlon, D., & Anderson, C. H. (2001). An integrated software suite for surface-based analyses of cerebral cortex. Journal of the American Medical Informatics Association, 8, 443–459.
Van Essen, D. C., Glasser, M. F., Dierker, D. L., Harwell, J., & Coalson, T. (2012). Parcellations and hemispheric asymmetries of human cerebral cortex analyzed on surface-based atlases. Cerebral Cortex, 22, 2241–2262.
Vanduffel, W., Zhu, Q., & Orban, G. A. (2014). Monkey cortex through fMRI glasses. Neuron, 83, 533–550.
Watson, R., Latinus, M., Charest, I., Crabbe, F., & Belin, P. (2014). People-selectivity, audiovisual integration and heteromodality in the superior temporal sulcus. Cortex, 50, 125–136.
Yehia, H., Kuratate, T., & Vatikiotis-Bateson, E. (2002). Linking facial animation, head motion and speech acoustics. Journal of Phonetics, 30, 555–568.
Yehia, H., Rubin, P., & Vatikiotis-Bateson, E. (1998). Quantitative association of vocal-tract and facial behaviour. Speech Communication, 29, 23–43.
Zarate, J. M., Wood, S., & Zatorre, R. J. (2010). Neural networks involved in voluntary and involuntary vocal pitch regulation in experienced singers. Neuropsychologia, 48, 607–618.