Primates use vision to guide their actions in everyday life. Visually guided object grasping is known to rely on a network of cortical areas located in the parietal and premotor cortex. We recorded in the anterior intraparietal area (AIP), an area in the dorsal visual stream that is critical for object grasping and densely connected with the premotor cortex, while monkeys were grasping objects under visual guidance and during passive fixation of videos of grasping actions from the first-person perspective. All AIP neurons in this study responded during grasping execution in the light, that is, became more active after the hand had started to move toward the object and during grasping in the dark. More than half of these AIP neurons responded during the observation of a video of the same grasping actions on a display. Furthermore, these AIP neurons responded as strongly during passive fixation of movements of a hand on a scrambled background and to a lesser extent to a shape appearing within the visual field near the object. Therefore, AIP neurons responding during grasping execution also respond during passive observation of grasping actions and most of them even during passive observation of movements of a simple shape in the visual field.
Primates exhibit an exquisite capability to grasp objects guided by visual information that relies on a network of areas in the parietal and frontal cortex. In the macaque monkey, the anterior intraparietal area (AIP) is the end stage of the dorsal visual stream and is strongly connected, among others, with the ventral premotor cortex (PMv), with the inferior parietal lobule, and with the STS. AIP neurons share many properties with PMv neurons (Fluet, Baumann, & Scherberger, 2010; Baumann, Fluet, & Scherberger, 2009; Murata, Gallese, Luppino, Kaseda, & Sakata, 2000; Sakata, Taira, Murata, & Mine, 1995), and reversible inactivations of AIP or PMv cause similar and profound deficits in grasping (Fogassi et al., 2001; Gallese, Murata, Kaseda, Niki, & Sakata, 1994). Different types of responses can be observed in AIP neurons during grasping, from the object presentation until the prehension and holding of the object. AIP neurons frequently respond to the visual presentation of an object, encoding the orientation (Baumann et al., 2009; Murata et al., 2000; Sakata et al., 1995), the 3-D structure (Srivastava, Orban, De Maziere, & Janssen, 2012; Theys, Srivastava, van Loon, Goffin, & Janssen, 2012), and even the 2-D contours of object images (Romero, Van Dromme, & Janssen, 2012). Other AIP neurons respond during grasp planning and execution (Murata et al., 2000), and in both epochs, the grip type (power grip or precision grip) is represented in the firing rate of AIP neurons (Baumann et al., 2009). It is widely believed that these visual responses in AIP represent an important stage in the visual extraction of object features that can be used to plan the appropriate grip. Many AIP neurons only become active when the hand starts to move toward the object (Murata et al., 2000) during grasping in the light. 
This increase in activity while the monkey receives visual information about its own hand moving toward the object has been related to the visual analysis of the shape of the hand or to the interaction between the moving hand and the object to be grasped: During visually guided grasping, AIP neurons may monitor the grip aperture of the hand to adjust it to the dimensions of the object. However, it has never been demonstrated that neural activity in AIP neurons is driven by the sight of the hand while the monkey is not grasping.
The observation of a grasping action is able to evoke neural responses in the STS (Barraclough, Keith, Xiao, Oram, & Perrett, 2009; Perrett et al., 1989), as well as in PMv (Gallese, Fadiga, Fogassi, & Rizzolatti, 1996) and PFG (Fogassi et al., 2005). Specifically, these last two areas host mirror neurons that fire both when the monkey grasps an object and when it observes an experimenter or another monkey grasping the object (Caggiano et al., 2011; Fogassi et al., 2005; Gallese et al., 1996), whereas STS neurons lack motor properties. Considering the strong connections of AIP with the aforementioned areas, one could ask whether AIP neurons can be driven by the observation of a grasping action, and if so, what aspect of this visual stimulus is driving AIP activity. Note that Fujii, Hihara, and Iriki (2008) observed activity during grasping execution and during grasping observation in the medial bank of the IPS, not in AIP. Furthermore, preliminary data reported in Murata and Ishida (2007) showed that PFG (and possibly AIP) neurons respond to videos of the experimenter's actions.
In this study, we investigated whether AIP neurons active during grasping execution also respond when the animal simply observes a grasping action. Specifically, we tested whether AIP neurons active during grasping execution also respond during passive fixation of a video of the same action, that is, show both activity during action execution and activity during action observation. Moreover, we asked whether the response during grasping observation is related to the visual analysis of the shape of the hand approaching the object and whether AIP activity can also be elicited with simple visual stimuli moving in the visual field. We observed that most AIP neurons active during grasping execution could be activated by pure visual stimulation with a video of the same grasping action. However, the majority of these neurons were also activated by videos of an isolated hand on a scrambled background and even by a shape entering or appearing within the visual field.
Two adult male rhesus monkeys (MY, 9 kg and MD, 8 kg) served as subjects for the experiments. All experimental procedures, surgical techniques, and veterinary care were performed in accordance with the NIH Guide for Care and Use of Laboratory Animals and in accordance with the European Communities Council Directive 2010/63/EU and were approved by the local ethical committee of the KU Leuven.
Surgery, Apparatus, and Recording Procedures
Under isoflurane anesthesia, an MRI-compatible head fixation post and recording chamber (Crist Instruments, Hagerstown, MD) were implanted using dental acrylic and ceramic screws above the left intraparietal sulcus in both monkeys.
During the experiments, the monkey was seated upright in a chair with the head fixed, with the arm contralateral to the recorded hemisphere free while the other arm was kept restrained in a comfortable position. In front of the monkey, a custom-built, vertically rotating carousel was used to present the target objects used to perform the grasping task. Six different objects (a small cylinder [15 × 15 (diameter) mm], a small cube [side 15 mm], a large cylinder [35 × 35 mm], a sphere [diameter 35 mm], a large cube [side 35 mm], and a cylinder with a groove [cylinder 35 × 35 mm; groove dimensions: 35 × 7 × 5 mm]) were pseudorandomly presented one at a time in the same position (28 cm viewing distance, at the chest level, ∼20 cm reaching distance measured from the center of the hand rest position to the center of the objects). The objects drove two different types of grasp depending on their dimensions: a pad-to-side grip (for small objects), which is a subtype of precision grip, and a finger-splayed wrap (Macfarlane & Graziano, 2009), corresponding to a whole-hand grip. Both monkeys used the same grip types. The resting position of the hand, the start of the reach-to-grasp movement, and the lifting of the object were detected by fiber-optic cables. The start of the hand movement was detected as soon as the palm of the hand was 0.3 cm above the resting plane, whereas lifting of the object was detected when the object was lifted by 0.5 cm along the vertical axis.
Behind the carousel was located a display (20-in. monitor equipped with ultrafast P46 phosphor, frame rate 120 Hz, Vision Research Graphics, experimental setup as in Theys, Pani, van Loon, Goffin, & Janssen, 2012) on which videos of grasping actions were presented at a viewing distance of 40 cm (i.e., in extrapersonal space) during the passive fixation of a small spot located in the center of the display. Movie onset and end were registered by a photodiode attached to the lower right corner of the screen detecting the onset of a bright square (occluded and not visible to the monkey) appearing simultaneously with stimulus onset and endpoint. For each of the six objects, movies were recorded during training sessions by a Fujicom camera and a laptop running Streampix software (500 frames per sec, 640 × 480 resolution). The camera was positioned above the monkey's head and recorded the work space that the monkey could see from his position during the task (see individual frames in Figure 1B). Thus, all movies showed grasping actions from the same perspective as the monkey (first-person perspective). One monkey (MY) was recorded while performing a visually guided grasping task. The movies were then edited and modified (Matlab) to produce the sequence of events described in the tasks (see below).
The horizontal and the vertical coordinates of the right eye were monitored using an infrared-based camera system (EyeLink II; SR Research, Ontario, Canada). Eye position signals were sampled at 500 Hz, whereas spiking activity and photodiode pulses were sampled at 20 kHz on a DSP (C6000 series; Texas Instruments, Dallas, TX). Spikes were discriminated on-line using a dual time window discriminator on the DSP and displayed using LabView and custom-built software.
Neural activity of single units was recorded extracellularly by means of “Micro Matrix” microdrives (Thomas Recording, Marburg, Germany) with integrated single-channel preamplifier. Electrodes were quartz-platinum/tungsten fibers (∼0.5–1 MΩ at 1 kHz; 80-μm diameter) referenced to their own guide tube. Neural signals were amplified and filtered between 0.5 and 5 kHz for spikes. Spike discrimination was performed on-line using a dual time window discriminator. That recordings were of single units was confirmed by offline analysis using the Offline Spike sorter software (Plexon, Inc., Dallas, TX).
The monkeys were trained to perform various tasks (Figure 1), which we classify for simplicity as grasping tasks, requiring the interaction of the monkeys with real objects, and observation tasks, requiring the monkeys to fixate a small spot on the monitor.
Two versions of the grasping task were used (Figure 1A): a visually guided grasping task (VGG) and a memory-guided grasping task (MGG). In the VGG, the monkey had to place its right hand in a resting position in complete darkness for a variable time (intertrial interval, 3000–5000 msec). During this time, the carousel rotated to position the test object. An LED located near the bottom of the object was then illuminated, which the monkey had to fixate (keeping the gaze inside a ±2.5-degree fixation window throughout the trial until the object was lifted). After a fixation time of 500 msec, the object was illuminated by a lamp located above the object. After a variable delay (900–1100 msec), an auditory GO cue instructed the monkey to release the rest position, reach, grasp, lift, and hold the object for a variable interval (holding time, 500–900 msec). Only at the end of a correctly completed trial was a drop of juice reward given. During this task, the time delay between the start of the movement and lifting of the object was calculated (reaching–grasping time, RGT).
The MGG differed from the VGG task in that the object was illuminated for only 400 msec. After this time, the light went off, and after a delay of 500–700 msec, an auditory GO cue instructed the monkey to grasp the object in the dark (Figure 1, blue mark around the task).
During the observation tasks, the monkey sat in darkness in the same rest position as in the grasping tasks. All trials started in the same way: After a variable time (intertrial interval, 1000–4000 msec), a fixation spot (0.20 deg) appeared at eye level in the center of the display. The monkey had to fixate it with the hand in the resting position and maintain his gaze around the fixation point throughout the trial inside a ±1.5 degree fixation window. After 500 msec, the stimulus (a movie) started. The fixation point was in the same position for all tests and was superimposed onto the bottom of the object (as in the real grasping task) appearing in the movies. There was no sound in any of the videos tested. Three different tests were run in the observation task.
Observation Task 1: Real Grasping Observation Test
At 700 msec (200 msec in 49% of the cells) after movie onset, a monkey hand appeared in the right lower corner of the movie (roughly corresponding to the starting position of the grasping task; Figure 1B, “Movement on”) and remained stationary for 800 msec (300 msec in 49% of the cells). Then, the hand started moving (Movement on) and reached and grasped the object. In this task, only one grasping action was presented, based on the object to which the neuron gave the strongest response in the VGG task (judged on-line). The distance between the lower right corner of the movie and the center of the object was 15 cm (21.2 deg). The average estimated RGT for all the movies presented was 651.7 ± 272.26 msec, which was longer than the RGT recorded during the VGG of the same objects presented in the movies (494.2 ± 182.5 msec) but nevertheless within the range of the RGTs performed (the movie RGT corresponded to a z value of 0.86 relative to the VGG RGT distribution).
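The comparison between the movie RGT and the executed RGT can be reproduced with a short calculation. The sketch below assumes that the z value was obtained by expressing the mean movie RGT in units of the standard deviation of the VGG RGT distribution; the function name `z_score` is illustrative only.

```python
def z_score(value, reference_mean, reference_sd):
    """Express `value` in standard-deviation units of a reference distribution."""
    if reference_sd <= 0:
        raise ValueError("reference_sd must be positive")
    return (value - reference_mean) / reference_sd


# Mean movie RGT (msec) relative to the VGG RGT distribution (mean, SD in msec),
# using the values reported in the text.
movie_rgt_z = z_score(651.7, reference_mean=494.2, reference_sd=182.5)
print(round(movie_rgt_z, 2))  # 0.86
```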
Observation Task 2: Effectors and Background Test
The sequence of events in this task was the same as in the previous task (Figure 1B) and with the same timing regarding the onset of the effector and the start of the movement, but two different effectors and two different backgrounds were combined (Figure 1C1): The effectors were either an isolated hand (top panels) or an ellipse (with a texture consisting of a scrambled version of the hand [major axis ∼95 mm, minor axis ∼43 mm], contrast equalized, bottom panels), and the background was either the natural (right) or the scrambled background (a scrambled version of the natural background, left). For all four combinations, the same kinematic parameters (speed, trajectory) characterized the effector displacement in the visual field. The kinematic parameters were matched to the real grasping test: We recreated a path for reach-to-grasp for the finger-splayed wrap used on the large objects and one for the pad-to-side grip used for the small objects. We did this because the reach-to-grasp approach movements were very similar for the different objects in the two grip types and because no preshaping of the hand was present in this test. The duration of the RGT was ∼750 msec.
Observation Task 3: Ellipse Test
In this task, an ellipse appeared in one of six possible positions and moved along one of three different axes of motion: vertical, oblique, and horizontal (Figure 1C2). In three conditions, the ellipse appeared in the periphery and moved toward the object, and in the other three conditions, the ellipse appeared on the object and moved toward one of the three peripheral positions (total trajectory lengths: vertical 8.5 cm, diagonal 11 cm, horizontal 7 cm, or 12.1, 15.7, and 10 deg, respectively). The speed and trajectory of the movement were linear, and the duration of the movement was always 500 msec. The background in this case was always the original natural background (hence with the object visible). The sequence of events of this task was the same as in the previous task (Figure 1C).
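The conversion from trajectory length on the display to degrees of visual angle can be sketched as follows. This is a minimal example that assumes the standard formula for a stimulus centered on the line of sight, 2·atan(size / (2·distance)), and the 40 cm viewing distance of the display; the text does not state which formula was used, although this one reproduces the reported values.

```python
import math

VIEWING_DISTANCE_CM = 40.0  # display viewing distance given in the Methods


def visual_angle_deg(size_cm, distance_cm=VIEWING_DISTANCE_CM):
    """Visual angle (deg) subtended by an extent of `size_cm` at `distance_cm`.

    Assumes the extent is centered on the line of sight.
    """
    return math.degrees(2.0 * math.atan(size_cm / (2.0 * distance_cm)))


trajectories_cm = {"vertical": 8.5, "diagonal": 11.0, "horizontal": 7.0}
angles_deg = {axis: round(visual_angle_deg(length), 1)
              for axis, length in trajectories_cm.items()}
print(angles_deg)  # {'vertical': 12.1, 'diagonal': 15.7, 'horizontal': 10.0}
```

The same function applied to the 15 cm distance between the corner of the movie and the object in the real grasping observation test gives 21.2 deg.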
Sequence of Tasks during the Recordings
During the experiment, the VGG task was used as a first task (Figure 1A) to isolate neurons. Up to six objects were alternately presented. As soon as a clear task-modulated response (i.e., difference of activity compared to the intertrial interval, irrespective of the precise epoch of the task) for at least one of the conditions (objects presented) was observed in the on-line histogram, the data collection was started. Two different objects were presented, one preferred (eliciting the highest response as observed on-line) and one nonpreferred. To exclude somatosensory or proprioceptive responses, each neuron was tested before data collection by the experimenter going inside the setup and stimulating the hand, forearm, upper arm, and shoulder of the monkey with light and deep touch and manipulating the joints of the fingers, the wrist, the elbow, and the shoulder (see Rozzi, Ferrari, Bonini, Rizzolatti, & Fogassi, 2008, for a similar procedure). Only neurons not responding to this kind of stimulation were recorded. Roughly between 10 and 15 neurons across all recording sessions were excluded in this way. All neurons were also tested in the MGG task, and these neurons were the topic of this study. This task was, depending on the session, presented just after the VGG task or as the final test after all the observation tasks had been presented. Two objects were presented in this task (the preferred object selected during the VGG task and the nonpreferred), and we selected only cells tested with a minimum of six trials (median number of trials: 10) per condition. The typical order of testing after the VGG test was real grasping observation test (a video of the grasping action with the same object as in the VGG test) or MGG task and then effector and background test, ellipse test (minimum number of trials: 5, median number of trials: 10).
To assess the involvement of a neuron in a task, we defined three epochs of interest for the VGG and MGG tasks: baseline, starting 400 msec before and ending at the light onset; early grasping, starting 50 msec after the monkey removed his hand from the resting position and ending 50 msec before lifting; and late grasping, starting 50 msec before lifting and ending 250 msec after the lifting started. Grasping execution activity was defined as a significant change in spike rate in the early and/or late grasping phase compared to the baseline epoch (before the light onset). In the analysis of the grasping observation task, similar epochs were defined: The baseline epoch started 400 msec before movement onset and lasted until movement onset, the early grasping epoch (comprising the approach to the object and the closure of the hand around it) ran from 50 msec after the hand started to move until the hand was closed around the object (duration of epochs depending on the object grasped in the movie: min 350 msec, max 950 msec, median 550 msec), and the late grasping epoch started ∼100 msec before the start of the lifting and ended 200 msec later. There was no overlap between the two epochs.
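The epoch definitions for the VGG and MGG tasks can be written down compactly. The following is a minimal sketch, not the analysis code used in the study: the function names, the half-open window convention, and the example trial (event times and spike times in msec) are all hypothetical.

```python
def grasp_epochs(light_on, move_on, lift_on):
    """Return (start, end) windows in msec for the three execution epochs."""
    return {
        "baseline": (light_on - 400, light_on),   # 400 msec before light onset
        "early": (move_on + 50, lift_on - 50),    # hand release + 50 to lift - 50
        "late": (lift_on - 50, lift_on + 250),    # lift - 50 to lift + 250
    }


def firing_rate(spike_times, window):
    """Spike rate (spikes/sec) in the half-open window [start, end)."""
    start, end = window
    count = sum(start <= t < end for t in spike_times)
    return 1000.0 * count / (end - start)


# Hypothetical trial: event times and spike times are invented for illustration.
epochs = grasp_epochs(light_on=0, move_on=1200, lift_on=1700)
spikes = [-300, -100, 1300, 1350, 1400, 1500, 1700, 1800, 1900]
rates = {name: firing_rate(spikes, window) for name, window in epochs.items()}
print(rates)  # {'baseline': 5.0, 'early': 10.0, 'late': 10.0}
```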
For the effector-and-background tests, the early and late grasping phases were also defined by the position of the effector on the target object, mimicking the early and late phases of the VGG task. In this case, the epoch durations were both 400 msec, because the stimuli were constructed by video editing. A neuron was considered ellipse responsive if its firing rate was significantly modulated by the presence of the ellipse in the normal background task compared to the baseline for at least one of the two epochs.
In the ellipse test, we calculated a modulation index, defined as (activity for movement toward the object − activity for movement away from the object)/(activity for movement toward the object + activity for movement away from the object). The index was computed separately for the early phase (50–300 msec after effector onset, i.e., before movement onset) and the late phase (50–300 msec after the effector reached its final position, i.e., after the end of the movement).
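The modulation index defined above is a simple contrast ratio. The sketch below assumes non-negative firing rates as input, so that the index is bounded between −1 and 1, and returns NaN when both rates are zero; that edge-case handling is an assumption, not something specified in the text.

```python
import math


def modulation_index(toward, away):
    """(toward - away) / (toward + away) for two firing rates in spikes/sec.

    Positive values indicate a preference for movement toward the object.
    Returns NaN when both rates are zero (assumed convention).
    """
    total = toward + away
    if total == 0:
        return math.nan
    return (toward - away) / total


print(modulation_index(30.0, 10.0))  # 0.5
print(modulation_index(10.0, 30.0))  # -0.5
```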
All statistics were performed on the exact spike counts in the defined epochs. To perform analysis on each cell, we used nonparametric tests (Kruskal–Wallis p < .05, Bonferroni corrected); to perform the analysis at the population level, we used parametric tests on the average firing rate of the grasping epochs. For each trial, the net firing rate was obtained by subtracting the baseline activity from the epochs of interest. Normalization, when performed, was obtained by dividing the net firing rate of each cell by the maximum absolute firing rate of the smoothed average firing rate in the grasping execution or grasping observation epoch (depending on the task). No inversion of the activity was used for population averages. Spike density functions were obtained by using a Gaussian half-kernel of 50 msec and were used for illustration purposes only.
Figure 2A shows anatomical MRIs with glass capillaries inserted into grid positions that were used in this study for both monkeys. The depth measurements on the microdrive and the pattern of gray to white matter transitions indicated that all neurons were recorded in the anterior part (anterior–posterior range of recording positions: 3 and 4 mm for MD and MY, respectively, centered on Horsley–Clarke coordinates 3 and 3.5 mm anterior and 14 and 15 mm lateral) of the lateral bank of the IPS (area AIP). Consistent with previous studies (Baumann et al., 2009), all recordings were performed within 7 mm from the tip of the IPS. The recording area is indicated on the horizontal sections (Figure 2B).
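The net firing rate and the normalization step can be illustrated as follows. This is a sketch under stated assumptions: the function names are hypothetical, the inputs are plain lists of rates in spikes/sec, and the smoothed average trace is taken as given rather than computed with the half-Gaussian kernel.

```python
def net_rates(epoch_rates, baseline_rates):
    """Per-trial net rate: epoch rate minus the same trial's baseline rate."""
    return [e - b for e, b in zip(epoch_rates, baseline_rates)]


def normalize(net_trace, smoothed_mean_trace):
    """Divide a cell's net rates by the peak absolute value of its smoothed
    average rate in the reference epoch. The sign is preserved (no inversion)."""
    peak = max(abs(v) for v in smoothed_mean_trace)
    if peak == 0:
        raise ValueError("smoothed trace is all zeros; cannot normalize")
    return [v / peak for v in net_trace]


# Hypothetical values for one cell.
net = net_rates([20.0, 30.0], [5.0, 10.0])
print(net)                                                 # [15.0, 20.0]
print(normalize([10.0, -5.0, 20.0], [2.0, -40.0, 20.0]))   # [0.25, -0.125, 0.5]
```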
Using the VGG task, we selected on-line 128 neurons (n = 83 in MD, n = 45 in MY) that were significantly modulated during the grasping phase, that is, showed grasping execution-related activity. We use the term “grasping execution activity” simply to describe a modulation in activity after the hand starts to move toward the object. For this study, we considered only neurons recorded both in the VGG and MGG task showing grasping execution activity in both tasks (104 neurons). All these AIP neurons were also tested during grasping observation, in which we presented videos of the same actions in the center of a display located in front of the animal behind the carousel. The results were qualitatively similar for the two animals and were therefore combined.
Activity during Action Execution and Action Observation in AIP
The example neuron in Figure 3A fired strongly when the animal grasped an object both in the light and in the dark (i.e., grasping execution activity in the MGG and VGG tasks) but not to the onset of the light above the object (Figure 3A, left). Interestingly, this neuron also fired during the observation of a video of the same grasping action (first-person perspective) presented on the display (activity aligned on the onset of the movement in the video; Figure 3A, right). Thus, some AIP neurons respond not only during grasping execution but also during the observation of videos of the same grasping actions, which we refer to as grasping observation (GO) activity.
Not all AIP neurons displaying grasping execution activity in the VGG and MGG tasks responded during grasping observation. The example neuron in Figure 3B responded strongly during the movement of the hand toward the object and during object lift (Figure 3B, activity aligned on movement onset). However, it failed to fire during the passive fixation of a movie of the same grasping action presented on the display (Figure 3B, right; nongrasping observation [NGO] neurons, responding only during grasping execution).
Across the population of AIP neurons that were active during VGG and MGG, 59% (61/104) were also active during grasping observation (GO neurons), whereas 43 neurons did not respond during grasping observation (NGO neurons). The average normalized population responses of GO and NGO neurons during the MGG, VGG, and grasping observation task are illustrated in Figure 4. The grasping execution activity of the NGO neurons in the MGG task was not significantly different from that of the GO neurons, t(102) = 1.38, p = .17, but the activity during object presentation was higher in the GO population than in the NGO population, t(102) = 2.18, p = .03.
Previous studies have classified AIP neurons active during grasping in two categories: visual-motor neurons, characterized by a lower firing rate during grasping in the dark than in the light, and motor-dominant neurons that did not show such a difference (Murata et al., 2000; Sakata et al., 1995). In our sample of AIP neurons, 28/61 (46%) GO neurons were more active in the light than in the dark (i.e., were visual-motor), and the remaining 33 neurons (54%) were motor dominant. Similar percentages of visual-motor and motor-dominant neurons were observed in the population of NGO neurons (37% and 63%, respectively, Z test = 1.12, p = .26). At the population level, no difference in average firing rate was found between the MGG and VGG tasks for either GO neurons, t(120) = 1.59, p = .1136, or NGO neurons, t(84) = 0.013, p = .98.
All neurons were initially tested with at least two objects in the VGG task, which allowed us to analyze the responses (both responsiveness and selectivity) to object presentation (i.e., light onset above the object). We found a difference in the percentages of object-responsive but not in that of object-selective neurons between the GO and NGO populations (object responsive: 46% of GO neurons and 26% of NGO neurons, Z test = 2.1079, p = .017; object selective: 23% of GO neurons and 21% of NGO neurons, Z test = 0.24, p = .81). Therefore, GO and NGO neurons showed a qualitatively similar composition in terms of response properties during the grasping phase tested with classical tasks.
Considering that more GO neurons responded to the object presentation and that, on average, a higher level of activity at the moment of object presentation characterized the GO neurons, one could ask whether the grasping observation response in GO neurons could be accounted for by the level of activity at the object presentation. To tackle this issue, we ran a stepwise multiple linear regression using two predictors of the grasping observation responses: the neural activity at the object presentation and the activity during grasping execution in the MGG task. The prediction was significant for the activity during MGG (F(1, 59) = 9.22, p = .004; adjusted R² = .135; standardized beta = .37), but not for the object responses (beta = .17, p = .28). Thus, an account of the GO responses based only on the activity related to the object presentation can be excluded.
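The comparison of proportions between the GO and NGO populations can be sketched with a pooled two-proportion z statistic. The counts below are assumptions inferred from the reported percentages and population sizes (46% of 61 GO neurons ≈ 28; 26% of 43 NGO neurons ≈ 11); the text does not state which variant of the Z test was used, although the pooled form reproduces the reported statistic for object-responsive neurons.

```python
import math


def two_proportion_z(hits_a, n_a, hits_b, n_b):
    """Pooled two-proportion z statistic for hits_a/n_a versus hits_b/n_b."""
    p_a, p_b = hits_a / n_a, hits_b / n_b
    pooled = (hits_a + hits_b) / (n_a + n_b)
    se = math.sqrt(pooled * (1.0 - pooled) * (1.0 / n_a + 1.0 / n_b))
    return (p_a - p_b) / se


# Object-responsive neurons: counts inferred from percentages (assumption).
z_responsive = two_proportion_z(28, 61, 11, 43)
print(round(z_responsive, 2))  # 2.11
```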
Influence of the Visual Aspects of the Effector and of the Background on Grasping Observation Activity in AIP
We asked what kind of “movement properties” these neurons are extracting by degrading the visual scene, that is, by presenting simple motor acts lacking the specific effector used (the hand) and/or the visual context of the grasping action. We recorded from 51 GO AIP neurons with videos of an isolated hand moving toward the object (in which no preshaping of the fingers occurred during prehension) and a simple ellipse moving toward the object, both following the same trajectory as the hand in the video of the real grasping action. These simple movements were presented on two different backgrounds (Effector–Background test; Figure 1C1): the natural background and a scrambled background. All 51 GO neurons tested responded significantly to the video of the isolated hand moving in the visual field against the natural background. More remarkably, the great majority of GO neurons tested (42/51 cells, 82%) were significantly modulated even in the ellipse condition, in which a simple shape moved toward the object, as illustrated by the example neuron in Figure 5B and C. Most GO neurons modulated in the ellipse condition (26/42, 62%) did not even reliably discriminate between videos of the real grasping action, the isolated hand, or the ellipse (Kruskal–Wallis, p > .05; Figure 5C). We did not use simpler stimuli such as gratings and bars, but naturalistic testing of GO neurons confirmed that these neurons could be activated by simple movements of an object toward the to-be-grasped object (data not shown). The remaining 16/42 (38%) ellipse-responsive neurons showed a selectivity for the effector (Kruskal–Wallis, p < .05).
To compare the responses to the three types of video (real grasping, isolated hand, and ellipse), we plotted the normalized average population activity of all GO neurons aligned on movement onset (right) in Figure 5B. Passive fixation of a moving, isolated hand evoked strong responses that did not differ from those evoked by the movement of the actual hand in the real grasping video (Figure 5B, activity aligned on movement onset). However, passive observation of a simple ellipse moving toward the object was sufficient to activate this type of AIP GO neurons, albeit to a lesser degree. The average responses to the real grasping video and to the isolated hand video were significantly different from the response to the ellipse video (repeated-measures ANOVA: F(2, 100) = 7.49, p = .0009; Bonferroni post hoc test: MSE = 47.195, p = .0006 real grasping vs. ellipse, and p = .04, isolated hand vs. ellipse).
The scatterplot in Figure 6 illustrates the average net response of all GO neurons during the observation of the ellipse and the isolated hand videos as a function of the net responses during the observation of the real grasping video. For many GO neurons, ellipse responses were almost as strong as those to the real grasping action (blue data points near the diagonal), but the average response to the ellipse was weaker than that to the videos of the grasping action or isolated hand.
Specifically, it seems that neurons with higher responses during real grasping observation tended to respond less to the ellipse video. Therefore, we divided the population of GO neurons into two groups based on the firing rate during the real grasping video (higher or lower than 15 spikes/sec) and then calculated for each cell the relative response to the ellipse video (higher or lower than 50% of the response to the real grasping video). No significant association between the two categories of neurons was detected (χ²(1, N = 51) = 2.99, p = .08); in other words, neurons that were more active during observation of the real grasping video were not more likely to respond less to the ellipse video. Thus, it seems that there is no clear relationship between the responses to the real grasping video and the responses to the ellipse video.
At the population level, in the normal background condition only, we tested which of two predictors, the response to the isolated hand or the response to the ellipse, could better predict the response to the real grasping. By using a stepwise multiple linear regression, we found that the response to the isolated hand alone was able to explain up to 80% of the variance observed in the real grasping (F(1, 49) = 211.3, p < .001, adjusted R² = .808, standardized beta = .90; ellipse: beta = .029, p = .769). These data show that overall the response of the GO neurons can be driven, at least to some degree, by a simple shape moving in the visual field and that the movement of the fingers to grasp the object is not necessary to drive the responses.
To test whether the visual context in which the movement was performed could affect the activity in GO neurons, we also presented videos of the isolated hand and ellipse against a scrambled background (Figure 1C1). We found that the background exerted a small effect across the population of neurons considered: Specifically, there was a slightly higher response to the normal compared to the scrambled background (repeated-measures ANOVA, F(1, 50) = 4.11, p = .04791, partial eta squared = .07). This small effect was also confirmed at the single neuron level. In fact, most GO neurons (63%) were not significantly affected by the background. As illustrated in Figure 7, the average responses to the isolated hand and to the ellipse on the normal background correlated strongly with the responses to the same stimuli against a scrambled background (r(49) = .96, p < .001, for the isolated hand and r(49) = .88, p < .001, for the ellipse). Therefore, the visual context of the observed action exerted only a small influence on GO neurons.
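The size of the background effect can be related to the reported F statistic with the usual identity for partial eta squared, η²p = (F · df_effect) / (F · df_effect + df_error). The sketch below is only an illustration of that identity and is not taken from the original analysis; applied to F(1, 50) = 4.11 it gives a value of about .076, close to the reported .07.

```python
def partial_eta_squared(f_value, df_effect, df_error):
    """Partial eta squared recovered from an F statistic and its degrees of freedom."""
    if f_value < 0 or df_effect <= 0 or df_error <= 0:
        raise ValueError("F must be non-negative and degrees of freedom positive")
    return (f_value * df_effect) / (f_value * df_effect + df_error)


eta_p2 = partial_eta_squared(4.11, df_effect=1, df_error=50)
print(round(eta_p2, 3))  # 0.076
```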
Responses of GO Neurons to Ellipses Moving in Different Directions
Taken together, the data presented so far demonstrate that for most GO neurons in AIP, the simple movement of a shape (ellipse) against a scrambled background was sufficient to evoke visual responses. The responses of these GO AIP neurons to the ellipse could represent selectivity for the direction of motion (e.g., motion from the lower right corner to the center of the display) or could arise by virtue of a spatial selectivity, that is, be caused by the ellipse entering a region in the visual field. To distinguish between these two possibilities, we tested 29 GO ellipse-responsive neurons with videos in which the ellipse moved against the natural background along three different axes of motion (vertical, oblique, and horizontal) and in two different directions for each axis of motion (movement toward and movement away from the object; Figure 8A) while the monkeys fixated a small spot in the center of the display. In the movement-away conditions, the ellipse appeared in the center of the display (superimposed on the object and on the fixation point) and moved to the edge of the display, whereas in the movement-toward conditions, the ellipse appeared in peripheral vision and moved toward the center of the display.
The largest group of GO neurons (n = 14/29, 48%) gave the greatest response differences between the movement-toward and movement-away conditions for the oblique axis of motion, compared with eight neurons preferring the vertical and another seven preferring the horizontal axis of motion. Figure 8B shows the average activity of all 29 GO neurons for the movement-toward and the movement-away conditions of each neuron's preferred axis of motion (as determined by the greatest response difference between the movement-toward and the movement-away conditions during the movement phase). Before the onset of the movement of the ellipse, the average population activity was higher when the ellipse appeared in the center of the display (movement-away condition) compared with when it appeared in the periphery (movement-toward condition). Once movement began, however, the stimulus preference of this AIP population altered radically, such that the activity became higher in the movement-toward condition compared with the movement-away condition, a selectivity that persisted until 300 msec after movement ceased (Figure 8B, right). Analysis of the two static trial epochs ([50–300 msec] after onset of the ellipse and [50–300 msec] after movement cessation) revealed that GO neurons responded more strongly on average to a shape appearing or located close to the fixation point (and therefore close to the to-be-grasped object, t test p < .01), but during the movement of the ellipse (the dynamic trial epoch [50–300 msec] after movement onset), the average population activity became stronger for movement-toward trials. The strong influence of the position where the ellipse appeared and the reversal of the neural preference (from the movement-away condition to the movement-toward condition) indicate that GO activity in AIP cannot be entirely explained by a selectivity for the direction of motion.
Instead, AIP neurons with grasping observation responses appear to be influenced primarily by spatial position, where an object (or shape) is located in proximity of the to-be-grasped object that the monkey is fixating, combined with a preference for movements toward the object.
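The preferred axis of motion of each neuron was defined by the greatest response difference between the movement-toward and movement-away conditions during the movement phase. The Python sketch below shows one way this selection could be expressed. It assumes that the difference is taken as an absolute value and that each neuron is summarized by one mean firing rate per axis and direction; the names and the data layout are hypothetical.

```python
AXES = ("vertical", "oblique", "horizontal")

def preferred_axis(rates):
    """Return the axis with the largest toward-vs-away rate difference.

    rates maps each axis name to a (toward, away) pair of mean firing
    rates (spikes/sec) measured during the movement phase.
    ASSUMPTION: the absolute difference is used; ties go to the first axis.
    """
    return max(AXES, key=lambda axis: abs(rates[axis][0] - rates[axis][1]))

def count_preferences(population):
    """Tally preferred axes over a list of per-neuron rate dictionaries."""
    counts = {axis: 0 for axis in AXES}
    for rates in population:
        counts[preferred_axis(rates)] += 1
    return counts
```

Applied to a list of 29 such dictionaries, `count_preferences` would return a tally of the same form as the 8 vertical, 14 oblique, and 7 horizontal neurons reported above.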
We quantified the modulation of AIP neuronal responses in the ellipse test using a modulation index (Methods) computed in the early epoch of the trial ([50–300 msec] after the appearance of the ellipse) and in the late epoch of the trial ([50–300 msec] after movement ceased) for the preferred axis of motion of each cell. Individual AIP neurons showed a variety of response patterns, but the arrangement most frequently observed consisted of an initial preference for the ellipse appearing in the center of the display followed by a later preference for movement toward the center of the display (data points in the lower right quadrant of Figure 8C). Although the neural selectivity for movement-toward versus movement-away trials was more balanced in the early trial epoch before the start of the movement (62% preferring the movement-away condition), this preference clearly shifted during the later trial epoch such that the great majority of AIP neurons (25/29, 86%) preferred the movement-toward condition once the ellipse had started to move.
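The exact definition of the modulation index is given in the Methods and is not repeated here. Purely as an illustration, the Python sketch below assumes a conventional contrast index, (toward − away) / (toward + away), computed separately for the early and late epochs, and shows how the proportion of neurons preferring the movement-toward condition and the number of preference reversals could be derived from it. The formula, the function names, and the sign convention are assumptions and may differ from those used in the study.

```python
def modulation_index(toward, away):
    """Contrast index in [-1, 1]; positive means a toward preference.

    ASSUMPTION: (toward - away) / (toward + away) on non-negative mean
    firing rates. The study's own definition (Methods) may differ.
    """
    total = toward + away
    if total == 0:
        return 0.0
    return (toward - away) / total

def fraction_preferring_toward(indices):
    """Share of neurons whose index is positive."""
    return sum(1 for mi in indices if mi > 0) / len(indices)

def preference_reversals(early, late):
    """Count neurons preferring away early (MI < 0) and toward late (MI > 0)."""
    return sum(1 for e, l in zip(early, late) if e < 0 and l > 0)
```

With 29 neurons, a late-epoch result of 25 positive indices corresponds to a fraction of 25/29, or 86%.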
The example neurons in Figure 8C illustrate that, for most AIP neurons, the responses were driven by the spatial position of the ellipse. Only a very small minority of GO neurons (see c88, bottom left) showed robust motion-selective responses. Thus, the responses of GO neurons in AIP to a moving shape appeared to be based on a spatial selectivity combined with a preference for movement toward the object or the center of gaze. Because we searched for responsive neurons using a VGG task in which the monkeys' hand moved from the lower right to the center of gaze, this preference for movement-toward trials may have been partly the result of bias in the search test.
We investigated the responses of AIP neurons during grasping execution and grasping observation and found that most AIP neurons in this study fired during the execution of an action in the dark and during observation of a video of this same action, thus exhibiting mirror-like properties. These neurons were also active in a similar way during observation of movements of an isolated hand on the screen and to a lesser extent during movements of an abstract shape on the screen.
The great majority of AIP neurons in this study showed a modulation in activity when the hand started to move toward the object. Previous studies (Murata et al., 2000; Sakata et al., 1995; Taira, Mine, Georgopoulos, Murata, & Sakata, 1990) have labeled movement-related responses as “motor” or “visuomotor” based on the persistence of the activity during grasping in the dark (in the absence of visual feedback). Here we showed that, although all AIP neurons in this study remained active during grasping in the dark, a large fraction of these neurons could also be activated by pure visual stimulation with a video of a hand (or another shape) moving toward the object that is fixated. Therefore, visual information is sufficient—but not necessary—to activate these AIP neurons. In contrast, other AIP neurons were also active during grasping in the dark but could not be activated by visual stimulation. Thus, AIP houses a variety of neurons that can be distinguished by the presence or absence of responses to action observation.
The convergence of motor-related and observation-related responses on the same neuron is reminiscent of two types of activity in which a match between a motor code and a visual stimulus corresponding to that motor code occurs: rehearsal activity (Cisek & Kalaska, 2004) and mirror activity (Gallese et al., 1996). Cisek and Kalaska (2004) found that neurons in dorsal premotor cortex respond during the execution of reaching movements in the preferred direction, during the observation of cursor movements in the preferred direction, and even in the instructed delay period before cursor movement begins, as if monkeys are mentally rehearsing the movement before it actually starts. Our study was not designed to test whether the GO responses in AIP also represent predictive activity, making it difficult to sustain this hypothesis. Indirect evidence comes from the activity we measured in the ellipse test. The 800-msec epoch before movement onset of the ellipse after it had appeared could be considered as an instructed delay period because the direction of movement was entirely predictable based on the position where the ellipse appeared. We found very little evidence for predictive activity in this epoch. Only two neurons showed strong responses throughout the premovement epoch and during the movement epoch of the same condition (e.g., neuron c88 in Figure 8). In the top right quadrant of Figure 8, all neurons are located close to the main axes, indicating that these neurons responded weakly before movement onset. Thus, the data of the ellipse test suggest that at least the subpopulation of ellipse-responsive AIP neurons did not show predictive coding of the ellipse movements; the close relation of the AIP activity to the onset of movement of the ellipse suggests that the GO responses in AIP were largely sensory.
Although the responses of our AIP population appeared visual during grasping observation, all neurons in this study remained active during grasping in the dark, that is, showed “motor” activity. Motor activity might appear difficult to reconcile with the robust visual responses we measured during grasping observation, unless the activity during grasping in the dark represents an efference copy from premotor areas (Rizzolatti & Luppino, 2001). The possibility exists that the reafference of motor activity is integrated with visual information in these neurons during grasping execution.
The necessity of the presence of a moving stimulus makes the AIP neurons in this study more similar to mirror neurons. Caggiano et al. (2011) showed that mirror neurons in PMv can exhibit both view-dependent (i.e., tuned for a particular viewpoint such as first-person perspective) and view-independent responses to videos of actions. Thus, a specific subpopulation of mirror neurons responds to videos similar to the one we used. Nevertheless, PMv and PFG mirror neurons also encode the goal of the action, responding differentially to, for example, grasping-for-eating compared with grasping-for-placing, or even to partially occluded actions (Bonini et al., 2010; Rizzolatti & Sinigaglia, 2010; Fogassi et al., 2005; Umiltà et al., 2001). Our data are consistent with a role for AIP in the mirror neuron network (Nelissen et al., 2011; Rizzolatti & Sinigaglia, 2010; Oztop & Arbib, 2002) but do not allow us to decide whether GO activity in AIP is in fact related to the mirror neuron system or to action recognition in general. Future studies should test whether GO activity in AIP constitutes an early stage in which the mapping between visual and motor properties occurs to provide input for cortical areas involved in the recognition of biological actions and/or mirror activity (Fleischer, Caggiano, Thier, & Giese, 2013).
Earlier areas in the dorsal visual stream such as the caudal intraparietal area (Sakata, Taira, Kusunoki, Murata, & Tanaka, 1997) or the STS regions may share some of the responses we observed in AIP (Vangeneugden et al., 2011; Barraclough et al., 2009; Perrett et al., 1989), but not the motor activity in the dark. AIP is connected with the F5a sector in PMv and with PFG (Gerbella, Belmalih, Borra, Rozzi, & Luppino, 2011; Belmalih et al., 2009; Borra et al., 2008), and these latter areas are both connected with the F5c sector in PMv. Furthermore, TMS in humans has shown that AIP and PMv causally interact during object grasping (Davare, Rothwell, & Lemon, 2010), and it has been shown that AIP and F5a neurons share very similar visual selectivities for 3-D shape and grasping activity (Theys, Pani, van Loon, Goffin, & Janssen, 2013; Theys, Pani, et al., 2012; Theys, Srivastava, et al., 2012). Interestingly, videos of an isolated grasping hand activate the macaque AIP and F5a, even in the absence of an object, but not F5c (Nelissen et al., 2011; Nelissen, Luppino, Vanduffel, Rizzolatti, & Orban, 2005). Targeted inactivation experiments combined with recordings in PMv should clarify to what extent neurons in PMv and PFG depend upon input from AIP.
Area AIP has been implicated in on-line visual control during grasping (Tunik, Frey, & Grafton, 2005; Rizzolatti & Luppino, 2001; Murata et al., 2000). Specifically, the nonobject-type visual neurons may be involved in visual feedback for the adjustment of the grip around the object (Murata et al., 2000). Could the GO activity in AIP be related to the on-line visual control of the hand configuration to match it with the dimensions of the object? Several of the findings in our study are relevant for this hypothesis: (1) the control experiment with the isolated hand demonstrated that the dynamics of the observed grasping movement (extension of the fingers followed by flexion) rarely affected the AIP responses, (2) most AIP neurons also responded to the moving ellipse (i.e., hand preshaping and detailed aspects of the hand were not necessary), and (3) the videos with the scrambled background (no object present) evoked responses similar to those with a natural background.
Although we cannot be sure that the monkeys were actually using visual information to correct their grasping actions on-line, the AIP responses during action observation could be related to the monitoring of the effector position and axis of movement during grasping based on visual information, possibly in retinotopic coordinates. In support of this idea is the observation that these neurons mostly responded to an ellipse entering the central visual field (close to the fixation point). Consistent with this hypothesis, Lehmann and Scherberger (2009) recently showed robust reach position signals in area AIP that were mainly encoded in retinotopic coordinates. Furthermore, TMS of the posterior parietal cortex in humans (Reichenbach, Bresciani, Peer, Bülthoff, & Thielscher, 2011; Chib, Krutky, Lynch, & Mussa-Ivaldi, 2009; Desmurget et al., 1999) interferes with the on-line control of visually guided reaches. It should be acknowledged that we did not test different positions of the object and the fixation point, and the grasping movements made by our animals were highly overtrained and probably did not require precise on-line visual control. Thus, in the absence of conclusive data, the on-line control hypothesis remains to be tested in future studies.
Our population of GO neurons in AIP also showed a clear preference for movement toward the center of gaze over movement toward the periphery. This bias could have been at least partially a consequence of a selection bias, because all our neurons showed movement-related activity during visually guided grasping, where the animal can see its own hand moving from the rest position in peripheral vision toward the object at the center of gaze. The bias for movement toward the center of gaze may represent a first stage in the encoding of goal-directed motor acts. Motter, Steinmetz, Duffy, and Mountcastle (1987) also showed opponently organized directional preferences (toward or away from the center of gaze) in neighboring area PG, and MT/V5 neurons show a strong bias for diagonal motion in the lower contralateral quadrant (Maunsell & Van Essen, 1987).
To conclude, we found neurons in AIP that match a motor code with a visual stimulus corresponding to that motor code. This match is mostly driven by the vision of a hand mimicking the kinematic characteristics of the real action, but for some of the neurons a simple shape sharing kinematic parameters with the moving hand is able to evoke a response. The combination of motor tasks and detailed visual testing of single neurons will be critical in future studies investigating the neural circuitry underlying object grasping in the primate brain.
This study was supported by Geconcerteerde Onderzoeksacties (GOA 2010/19), Fonds voor Wetenschappelijk Onderzoek Vlaanderen (G.0495.05, G.0713.09), Programmafinanciering (PFV/10/008), Interuniversity Attraction Poles 7/11, and ERC-StG-260607. We thank Sara De Pril, Piet Kayenbergh, Gerrit Meulemans, Stijn Verstraeten, Peter Parker, Marc Depaep, Wouter Depuydt, and Inez Puttemans for assistance and Steve Raiguel for comments on the manuscript.
Reprint requests should be sent to Peter Janssen, Laboratorium voor Neuro- en Psychofysiologie, Leuven Medical School, Herestraat 49, bus 1021, B-3000 Leuven, Belgium, or via e-mail: firstname.lastname@example.org.