The current study explores the neural correlates of action perception and its relation to infants' active experience performing goal-directed actions. Study 1 provided active training with sticky mittens that enables grasping and object manipulation in prereaching 4-month-olds. After training, EEG was recorded while infants observed images of hands grasping toward (congruent) or away from (incongruent) objects. We demonstrate that brief active training facilitates social perception as indexed by larger amplitude of the P400 ERP component to congruent compared with incongruent trials. Study 2 presented 4-month-old infants with passive training in which they observed an experimenter perform goal-directed reaching actions, followed by an identical ERP session to that used in Study 1. The second study did not demonstrate any differentiation between congruent and incongruent trials. These results suggest that (1) active experience alters the brain's response to goal-directed actions performed by others and (2) visual exposure alone is not sufficient in developing the neural networks subserving goal processing during action observation in infancy.
During the first year of life, infants need to develop both their motor repertoire and their ability to process the goals of other people to learn about the physical world and to interact with others. Early in life, these abilities develop in tandem. Behavioral studies demonstrate strong correlations between the ability to perform an action and the ability to process the goal of other people's reaching (Gredebäck & Falck-Ytter, 2015; Daum & Gredeback, 2011; Kanakogi & Itakura, 2011; Kochukhova & Gredebäck, 2010; Brune & Woodward, 2007) and means-end actions (Gredebäck, Stasiewicz, Falck-Ytter, Rosander, & von Hofsten, 2009; Sommerville & Woodward, 2005).
These behavioral tasks have been complemented by studies targeting the neural correlates of action perception in infancy. Many of these studies have focused on mu desynchronization, thought to reflect sensory-motor activity, during action observation in infants who are already proficient reachers (Cannon et al., 2015; Upshaw, Bernier, & Sommerville, 2015; Marshall, Young, & Meltzoff, 2011; Nystrom, Ljunghammar, Rosander, & von Hofsten, 2011; Southgate, Johnson, Osborne, & Csibra, 2009; van Elk, van Schie, Hunnius, Vesper, & Bekkering, 2008). To our knowledge, only one study has investigated the neural correlates of action perception at the onset of reaching between 4 and 6 months old. In this study, Bakker, Daum, Handl, and Gredebäck (2015) related infants' reaching ability to an ERP component, P400, known to index social processing, and more specifically, hypothesized to reflect detection of goal-directed actions and agents early in life (Gredebäck & Daum, 2015; Gredebäck, Melinder, & Daum, 2010; see also, Csibra, Kushnerenko, & Grossman, 2008; Nelson, Moulson, & Richmond, 2006; de Haan, Johnson, & Halit, 2003). Bakker, Daum, et al. (2015) demonstrated larger P400 amplitudes for power grasps directed toward (congruent trials) than away (incongruent trials) from objects, but only for infants who were able to reach proficiently using power grasps (not present in 4-month-olds but observable in 5- and 6-month-olds who have developed good grasping skills). No effect was observed for infants who observed precision grasps, an action that infants are unable to perform at these ages. These results suggest that a close correspondence between the observed action and the action repertoire of the infant is needed to promote the perception of goal-directed actions.
The same P400 ERP component has previously been shown to be sensitive to gaze direction (Senju, Csibra, & Johnson, 2008), pointing hands (Melinder, Konijnenberg, Hermansen, Daum, & Gredebäck, 2015; Gredebäck et al., 2010), give-me gestures (Bakker, Kaduk, Elsner, Juvrud, & Gredebäck, 2015), and goal-directed prosocial agents (Gredebäck et al., 2015) in infants. This ERP component is believed to originate in the STS (Gredebäck & Daum, 2015), an area involved in the perception of biological motion and goal-directed actions (Keysers & Perrett, 2004) such as reaching and looking (Allison, Puce, & McCarthy, 2000).
It is often claimed that infants' own motor development is a driving force in the development of social perception—that we learn to understand others through an embodied process grounded in action (Gredebäck & Falck-Ytter, 2015; Uithol & Paulus, 2014; Marshall & Meltzoff, 2011). However, the abovementioned studies suffer from one basic confound—that of visual experience. Because we visually attend to our own actions (Rosander & von Hofsten, 2011; Flanagan & Johansson, 2003), a large amount of observational experience comes along naturally with enhanced motor skills. As such, correlational studies cannot isolate the involvement of a participant's own motor proficiency from prolonged visual experience observing the same actions (for adult studies that have demonstrated a causal connection between motor cortex activity and social perception, see Cardellicchio, Sinigaglia, & Costantini, 2013; Elsner, D'Ausilio, Gredeback, Falck-Ytter, & Fadiga, 2013; there are also examples of facilitated social cognition after motor learning in the absence of visual experience: Casile & Giese, 2006).
To overcome the motor and visual learning confound in infancy, Sommerville, Woodward, and Needham (2005) relied on a behavioral paradigm in which active action experience and passive observational experience were separated. Sommerville and colleagues were able to induce grasping experience in 3-month-old infants who typically are unable to reach in a goal-directed and functional manner (von Hofsten, 2004). Through the use of Velcro-covered mittens, infants were provided with the opportunity to grasp and lift objects. After a short grasping training with mittens, a habituation paradigm was used to demonstrate that active training facilitates goal processing during action observation. In contrast, 3-month-old infants who were habituated to actions by passively observing others perform them did not demonstrate the same effect. In other words, they were unable to process other people's actions in terms of goals, although their visual experience with reaching was similar to those infants who were able to perform the actions in tandem. The same paradigm has been replicated (Gerson & Woodward, 2014) and used to demonstrate that active training facilitates one's own action exploration (Libertus & Needham, 2010; Needham, Barrett, & Peterman, 2002), face perception (Libertus & Needham, 2011), and the efficiency or rationality of others' actions (Skerry, Carey, & Spelke, 2013). As in the original study, none of these effects were observed after passive observational training.
The current study aims to map the neural correlates of action perception after active training with sticky mittens and compare this with a control condition in which infants are given only visual exposure to similar actions (similar control conditions have previously been used by Gerson & Woodward, 2014; Skerry et al., 2013; Libertus & Needham, 2010, 2011; Sommerville et al., 2005; Needham et al., 2002). We hypothesize that the same ERP component demonstrated to index social perception at the onset of functional grasping (P400; Bakker, Daum, et al., 2015) will differentiate congruent from incongruent grasping actions in prereaching infants after training. This effect should only be observable after active training (Experiment 1) and not present after an equivalent amount of visual exposure to goal-directed grasping actions (Experiment 2). This effect should remain when comparing the two experiments directly (larger difference in P400 amplitudes in Experiment 1 than Experiment 2). We targeted 4-month-olds because infants of this age do not typically (in the absence of sticky mittens, active training) differentiate goal-directed from non-goal-directed actions in this paradigm (Bakker, Daum, et al., 2015). A result consistent with this hypothesis would demonstrate, for the first time, a neural correlate of action perception that is driven by active motor experience (when controlling for visual experience), demonstrating a causal connection between cortical activity, motor experience, and social perception early in life.
It is likely the case that training in general alters brain activity not specifically related to the social domain. To assess this, we also analyze the ERP components Nc, known to index attention early in life (Richards, 2003), and the P100 (Csibra et al., 2008), targeting early perceptual processes. We do not formulate any specific hypothesis with respect to these components in the individual experiments but suggest that any such effect should not be present when comparing the two experiments. In other words, there might be general training effects on both early visual and attentional components, but these should not remain when directly comparing active and passive training.
Fifteen 4-month-old infants (nine girls; mean age = 129 days, SD = 5 days) were included in the final sample. Fourteen additional infants were tested but excluded from the final analysis, as they did not obtain a sufficient number of artifact-free trials (n < 10 trials/condition) because of excessive movements or fussiness (reviews of infant ERP studies have noted that a dropout of 50% is standard; Stets, Stahl, & Reid, 2012). Before the experimental session, parents signed a consent form and were given a gift certificate (value of approximately €10). The study was approved by the local ethics committee and conducted in accordance with the standards specified in the 1964 Declaration of Helsinki.
Procedure and Stimuli
Infants participated in two experimental tasks. The experiment began with a behavioral task, here referred to as the active experience task. Subsequently, we used EEG to record infants' neural responses to sequences of pictures displaying hands and objects. The total duration of the experiment was approximately 30 min.
Active experience task
Infants were seated on their caregiver's lap, in front of a table, supported at their torso by their parents. A pair of mittens was put on infants' hands. The mittens were custom made from ecru-colored fabric, modeled from those used in Needham et al. (2002) and Sommerville et al. (2005). The dorsal side of the mittens was transparent, allowing infants to see their fingers, whereas the palmar side was equipped with Velcro covering the infants' entire palm. To encourage grasping, infants were presented with nine colorful balls (3 × 3 cm) placed one at a time at a reachable distance, approximately 10 cm from the infants' torso. The balls were covered with the corresponding side of the Velcro so that, on contact, they easily adhered to the mittens. In this way, each infant was given an opportunity to lift the ball after making contact with it. In total, infants were given 240 sec to interact with the objects. During this time, infants had several opportunities to interact with the objects, that is, to touch them, manipulate them, and lift them up. When an infant lifted a ball, he or she was given approximately 5 sec to explore the object, after which the ball was removed from the mitten and placed again in front of the child. A new object was presented as soon as the child showed no interest in lifting up the previous one. During the training, the experimenter interacted with the infant by smiling, talking, and praising the child when he or she acted upon the object and when he or she lifted the object. After 240 sec, the experimenter took away the objects and removed the mittens from the infant's hands. The training sessions were video recorded and coded for the number of goal-directed actions performed by the infant. In this case, a goal-directed action was coded if the infant's action resulted in contact with the object, the object was subsequently lifted from the table, and the action was preceded by a visual fixation on the object.
The time interval between gaze at the object and the grasping action could not exceed 3 sec for it to be counted as a goal-directed action. On average, each infant performed 9.7 goal-directed actions (range = 5–16, SD = 3) during the training session.
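The coding rule described above can be written out as a small predicate. This is only an illustrative sketch: the function names, the argument layout, and the example reaches are hypothetical and are not taken from the coding scheme actually used; only the three criteria and the 3-sec limit come from the text.

```python
def is_goal_directed(made_contact, lifted, fixation_s, action_s, max_gap_s=3.0):
    """Apply the three coding criteria to a single reach.

    made_contact: True if the action resulted in contact with the object.
    lifted:       True if the object was subsequently lifted from the table.
    fixation_s:   time (in seconds) of the gaze at the object, or None if the
                  infant did not look at the object before acting.
    action_s:     time (in seconds) at which the grasping action started.
    max_gap_s:    largest allowed gaze-to-grasp interval (3 sec in the text).
    """
    if not (made_contact and lifted):
        return False
    if fixation_s is None:
        return False
    gap = action_s - fixation_s
    # The fixation must precede the action and the gap may not exceed the limit.
    return 0.0 <= gap <= max_gap_s


def count_goal_directed(reaches):
    """Count the reaches in a session that satisfy all criteria."""
    return sum(1 for reach in reaches if is_goal_directed(*reach))


# Hypothetical session: (made_contact, lifted, fixation_s, action_s)
example_session = [
    (True, True, 10.0, 11.5),   # meets all criteria
    (True, True, 20.0, 24.5),   # gap of 4.5 sec is too long
    (True, False, 30.0, 31.0),  # contact without lifting
    (True, True, None, 40.0),   # no preceding fixation
]
n_goal_directed = count_goal_directed(example_session)
```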
The stimuli were presented on a 17-in. computer screen that was rotated 90° (so the vertical extension of the screen was larger than its horizontal counterpart) at a distance of 60 cm from the observing infant. Infants observed sequences of still images of colorful balls appearing on the upper and lower parts of the screen (to avoid lateralization effects) followed by static images of human hands shaped in a whole hand grasp. The hand was directed either toward (congruent condition) or away from (incongruent condition) the previously presented ball. The hand did not move toward the object but remained still. Each infant watched both the congruent and incongruent trials presented in randomized order. All trials began with a 100-msec presentation of two rectangles (6 horizontal × 5 vertical degrees) displayed at the upper and lower parts of the computer screen, 13 vertical degrees apart. This was followed by a fixation cross (duration random within the range of 1300–1750 msec) at the center of the screen followed by a ball presented (for 240 msec) inside an upper or lower rectangle. The whole trial sequence ended with a picture of a grasping hand performing a power grasp (duration of 1000 msec) without the object being present (Figure 1). Stimuli were presented using the software E-Prime 2.0, E-Studio (Psychology Software Tools, Inc., Pittsburgh, PA). These stimuli were repeated until infants demonstrated signs of fussiness. In these instances, a distractor image was displayed on the screen along with an attention-grabbing sound until infants reoriented their attention to the screen, at which time the stimulus presentation commenced once more.
During the EEG recording, we kept possible distractions to a minimum. Light conditions in the laboratory were kept low, and a curtain separated the experimenter and EEG equipment from the infant. The experimenter monitored the infant's gaze direction through a camera. As soon as an infant got distracted and no longer looked at the stimuli, the experimenter used attention-grabbing pictures with a sound to orient the infant's attention back to stimuli. The experiment was terminated when an infant was no longer interested in the stimulus.
EEG recording and analysis
We used a 128-channel HydroCel Geodesic Sensor Net (Electrical Geodesics, Eugene, OR) to record infants' EEGs and EOG. The signal was sampled at 250 Hz, vertex referenced, amplified (EGI Net Amps 300 amplifier), and low-pass filtered at 100 Hz. The EEG signal was postprocessed using a digital filter (0.5–25 Hz) and segmented from 550 msec before the appearance of the hand (including the last 160 msec of the empty rectangles with the fixation cross and the 240-msec-long presentation of the object) until 900 msec after the hand was presented. The electrodes from the most anterior and posterior areas were not included in the final analysis because of high noise caused by poor contact with the scalp (which is common in infant ERP studies; see Gredebäck et al., 2010, 2015). In total, 38 electrodes were excluded from the final analyses. Furthermore, the data were inspected manually, and those electrodes contaminated with artifacts were rejected (a standard procedure in infant ERP work; see Hoehl & Wahl, 2012). We used frontal electrodes to detect eye movement and eye blinks. The data from the missing channels were interpolated from the surrounding electrodes. The trials included in the final data contained no more than 10% of the artifact-contaminated electrodes. The data were baseline corrected with a 350-msec baseline starting from the onset of the analysis interval. This slightly longer than usual baseline (350 compared with 200 msec in Gredebäck et al., 2010) was chosen to obtain more stable ERPs, taking into consideration the very young age of our participants. As this baseline overlaps with the presentation of the object (not informative of condition), this might result in an enhanced negativity over visual areas in the final ERP waveform. This should, however, be the same for all conditions and experiments in this study and cannot account for any of the effects reported below. All remaining trials were merged to individual averages, separated by condition.
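As an illustration of the baseline correction step, the following minimal sketch subtracts the mean of the first 350 msec of a segment from every sample. It is not the software pipeline used for the reported analyses. The function name and the example values are hypothetical, and because 350 msec at 250 Hz corresponds to 87.5 samples, the sketch assumes truncation to 87 samples.

```python
def baseline_correct(samples, sfreq_hz=250.0, baseline_ms=350.0):
    """Subtract the mean of the initial baseline period from one epoch.

    samples:     amplitudes (in microvolts) of one channel for one segment,
                 where the first sample is the onset of the analysis interval
                 (550 msec before the hand appears, per the text).
    sfreq_hz:    sampling rate (250 Hz in the text).
    baseline_ms: baseline length counted from the start of the segment.
    """
    # Assumption: a fractional sample count is truncated (87.5 -> 87).
    n_baseline = int(baseline_ms * sfreq_hz / 1000.0)
    if n_baseline <= 0 or n_baseline > len(samples):
        raise ValueError("baseline window does not fit inside the epoch")
    baseline_mean = sum(samples[:n_baseline]) / n_baseline
    return [value - baseline_mean for value in samples]


# Hypothetical segment: a flat 2 microvolt baseline followed by a 5 microvolt plateau.
raw_epoch = [2.0] * 87 + [5.0] * 10
corrected_epoch = baseline_correct(raw_epoch)
```

After correction, the baseline period averages to zero, so later amplitudes are expressed relative to the pre-hand interval.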
Individual averages that contained at least 10 artifact-free trials were used to create the grand average. Analyses were performed on six channel areas—lower occipital (Electrode numbers 74, 75 [Oz], 82), left posterior temporal (Electrode numbers 65, 66, 67, 69, 70, 71), right posterior temporal (Electrode numbers 76, 77, 83, 84, 89, 90), left central (Electrode numbers 47, 42, 37, 31, 51, 52, 53 [P3], 54), and right central (Electrode numbers 87, 93, 98, 79, 80, 86, 92 [P4], 97)—with a focus on a time interval ranging from 90 to 150 (labeled P100), 300 to 400 (labeled N290), and 400 to 600 (labeled P400) msec after the onset of the stimulus. In addition, the analysis on the frontal area (Electrode numbers 5, 6, 7, 12, 13, 20, 29, 30, 36, 104, 105 [C4], 106, 111, 112, 118) was performed on a time window of 300–700 msec (labeled Nc) after the onset of the stimulus. The choice of the electrodes for P100 and P400 components was based on a similar procedure used in the study by Bakker, Daum, et al. (2015) and visual inspection of the data for the Nc and N290 components.
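The mean amplitude of a component within its time window, averaged over the channels of one area, could be computed along the following lines. This is a sketch under stated assumptions rather than the analysis code used here: the function names and the synthetic data are hypothetical, the segment is assumed to start 550 msec before hand onset at 250 Hz, and a sample counts as inside a window when its time stamp falls within the inclusive bounds.

```python
# Component windows in msec after hand onset, as listed in the text.
WINDOWS_MS = {
    "P100": (90.0, 150.0),
    "N290": (300.0, 400.0),
    "P400": (400.0, 600.0),
    "Nc": (300.0, 700.0),
}


def window_mean(samples, window_ms, epoch_start_ms=-550.0, sfreq_hz=250.0):
    """Mean amplitude of one channel inside a time window (inclusive bounds)."""
    start_ms, end_ms = window_ms
    step_ms = 1000.0 / sfreq_hz
    selected = [
        value
        for index, value in enumerate(samples)
        if start_ms <= epoch_start_ms + index * step_ms <= end_ms
    ]
    if not selected:
        raise ValueError("no samples fall inside the requested window")
    return sum(selected) / len(selected)


def area_mean(channels, window_ms):
    """Average the window means over all channels of one electrode area."""
    means = [window_mean(samples, window_ms) for samples in channels]
    return sum(means) / len(means)


# Synthetic two-channel example: 363 samples cover -550 to 898 msec at 250 Hz.
times_ms = [-550.0 + 4.0 * i for i in range(363)]
channel_a = [3.0 if 400.0 <= t <= 600.0 else 1.0 for t in times_ms]
channel_b = [5.0 if 400.0 <= t <= 600.0 else 1.0 for t in times_ms]
p400_amplitude = area_mean([channel_a, channel_b], WINDOWS_MS["P400"])
p100_amplitude = area_mean([channel_a, channel_b], WINDOWS_MS["P100"])
```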
The average amplitudes within the time window defined above for the P100, N290, and P400 components for posterior temporal and central areas were statistically compared in a 2 × 2 repeated-measures ANOVA, with Condition (congruent, incongruent) and Lateralization (left, right) as within-subject factors. The average amplitude within the time window used to assess the lower occipital P100, the P400, and the frontal Nc was compared using a repeated-measure t test with averaged amplitude over channels as the dependent variable and condition as the independent variable. To check whether the results would replicate the same findings when the time window was matched to previously published work using the same paradigm (Bakker, Daum, et al., 2015), we performed additional analysis between 300 and 600 msec for the component labeled P400. This time window was targeted in the posterior temporal and occipital areas.
Results and Discussion
The average number of presented trials for both conditions was 85. On average, infants provided 35 artifact-free trials: 16 (range = 10–33) for the congruent condition and 18 (range = 10–29) for the incongruent condition.
The analysis revealed a significant difference in the P400 amplitudes for congruent and incongruent trials in the posterior temporal and lower occipital areas (Figure 2). More specifically, for the posterior temporal area, a main effect of Congruency was observed with higher amplitudes in the congruent condition (4.5 μV) than in the incongruent condition (0.45 μV), F(1, 14) = 10.392, p = .006, η2 = 0.426. There was no main effect of Hemisphere, F(1, 14) = 0.056, p = .817, nor was there an interaction effect between Condition and Hemisphere, F(1, 14) = 0.784, p = .391.
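For readers who wish to relate the reported F ratio to its effect size, the sketch below converts an F statistic into partial eta squared. It assumes that the reported η2 is a partial eta squared, which the text does not state explicitly; the function name is hypothetical.

```python
def partial_eta_squared(f_value, df_effect, df_error):
    """Partial eta squared from an F ratio: F * df1 / (F * df1 + df2)."""
    if f_value < 0 or df_effect <= 0 or df_error <= 0:
        raise ValueError("F must be non-negative and both df must be positive")
    return (f_value * df_effect) / (f_value * df_effect + df_error)


# Main effect of Congruency over the posterior temporal area: F(1, 14) = 10.392
eta_p2 = partial_eta_squared(10.392, 1, 14)
```

Under this assumption, the conversion gives approximately 0.426, which agrees with the value reported for this effect.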
ERP amplitudes for the lower central occipital P400 demonstrate higher amplitudes for congruent (2.6 μV) than incongruent (−2.6 μV) trials, t(14) = 3.68, p = .002, d = 0.77. In addition, the analysis for P400 component within left and right central area was performed but did not show any significant difference between the congruent (−1.8 μV) and incongruent (−0.4 μV) trials, t(14) = 1.835, p = .09.
The analysis within the posterior temporal area for N290 ERP component did not reveal a main effect of Congruency, F(1, 14) = 2.215, p = .159, nor did it reveal a main effect of Hemisphere, F(1, 14) = 0.915, p = .355, or an interaction between Congruency and Hemisphere, F(1, 14) = 0.185, p = .674. No significant difference for N290 within the lower occipital area was observed, t(14) = 1.491, p = .158.
Similar to the previous component, the analysis within the posterior temporal area for P100 ERP component did not reveal a main effect of Congruency, F(1, 14) = 3.004, p = .103, a main effect of Hemisphere, F(1, 14) = 1.009, p = .332, or an interaction between Congruency and Hemisphere, F(1, 14) = 0.012, p = .916. No significant difference for P100 within the lower occipital area was observed, t(14) = 1.110, p = .286.
The results demonstrate differential neural activity within the Nc component (Figure 3) located in the frontal area with more negatively pronounced congruent trials (−3 μV) than incongruent trials (1.34 μV), t(14) = −3.696, p = .002, d = 0.48.
The additional analysis on the P400 component within the time window previously used by Bakker, Daum, et al. (2015; 300–600 msec) replicated the findings from the time chosen (400–600 msec) based on the visual inspection of the data. The analysis revealed a significant difference in the P400 amplitudes for congruent (2.5 μV in the posterior temporal area and 2.6 μV in the lower occipital) and incongruent (−1.38 μV in the posterior temporal area and −2.6 μV in the lower occipital) trials in the posterior temporal (F(1, 14) = 9.133, p = .009, η2 = 0.61) and lower occipital areas, t(14) = 3.41, p = .004, d = 0.77. There was no main effect of Hemisphere (F(1, 14) = 0.007, p = .934), nor was there an interaction between Condition and Hemisphere, F(1, 14) = 1.009, p = .332, in posterior temporal region.
These results demonstrate that active training with goal-directed actions in prereaching infants results in changes at the neural level that allow coding of observed goal-directed actions. More specifically, experience grasping and lifting objects elicited substantial differences between congruent and incongruent manual actions at posterior temporal, occipital, and frontal sites. The differential ERP responses include both the hypothesized P400 known to be involved in the processing of goal-directed actions (Bakker, Daum, et al., 2015; Melinder et al., 2015; Gredebäck et al., 2010) and ERP component Nc known to index selective attention (Richards, 2003). No differences were observed for ERP component P100 or for P400 over central sites. Given Experiment 1 alone, it is difficult to assess the nature of this training effect. Experiment 2 provides another form of training in which infants observe an experimenter perform the same reaching and grasping actions, followed by ERP measures of the P100, the P400, and the Nc during observation of goal-directed grasping actions (identical to that used in Experiment 1). If active training with sticky mittens is needed for action perception to emerge, then no differentiation between goal-directed (congruent) and non-goal-directed (incongruent) trials is expected after passive observation.
In Experiment 2, infants were presented with the same stimulus material as in Experiment 1. The only difference between the two experiments was that the active training in Experiment 1 was replaced by passive observation of goal-directed actions performed by the experimenter. The time and the number of goal-directed actions that infants observed were kept constant across experiments.
Thirty 4-month-old infants were included in the study. The final sample consisted of 15 infants (seven girls; mean age = 126 days, SD = 7 days). In addition, 15 infants were not included in the final analysis because of an insufficient number of artifact-free trials (n < 10). Similar to Experiment 1, before participation in the experiment, the parents were informed about the procedure and purpose of the study and signed a consent form. They received a voucher with an approximate value of €10.
Procedure and Stimuli
Similar to Experiment 1, infants performed two tasks: First, they took part in passive observation training followed by the EEG recording. This design allowed us to compare the neural activity of infants attending to the same ERP stimuli after different training opportunities. All aspects of the study were identical to Experiment 1 with one exception: During training, infants observed goal-directed reaching actions that mimicked the average active object engagement from Experiment 1. Each infant attended to 10 goal-directed actions (compared with 9.7 in Experiment 1). The timing of the events matched the 240 sec of active training. All the grasping events were performed in a similar way; the experimenter looked at the child and, if necessary, brought the child's attention to the action by initiating contact and speaking directly to the child (greeting the child or calling the child by name). The experimenter extended the hand and lifted the object up. While holding the object, the experimenter praised her own successful reach in the same way as during the active training, for example, the experimenter said “well done,” “good job,” and so forth. Subsequently, the experimenter put the object back on the table. In the event that the infants did not show interest in the used object, the experimenter exchanged it for a new one and started from the beginning. If the infant did not attend to a grasping event, the action was repeated.
Results and Discussion
On average, infants observed 87 trials in total and provided an average of 26 artifact-free trials: 12 (range = 10–27) for the congruent condition and 14 (range = 10–22) for the incongruent condition. The analysis was performed on P100, N290, P400, and Nc components and revealed no significant differences between conditions or lateralization.
The analysis over posterior temporal area for the P400 component (mean amplitudes of 0.23 μV for congruent condition and 1.12 μV for incongruent condition) did not reveal a main effect of Congruency (F(1, 14) = 0.284, p = .602), a main effect of Hemisphere, F(1, 14) = 2.394, p = .146, or an interaction between Condition and Hemisphere, F(1, 14) = 0.299, p = .593. The analysis for the P400 component over the lower occipital area (mean amplitudes of 3.15 μV for the congruent trials and 1.10 μV for the incongruent trials) did not reveal a significant difference between the conditions (t(14) = 0.836, p = .417). Additional analysis for the P400 component over the central (left, right) area did not indicate a significant difference between the congruent (−1.73 μV) and incongruent (−1.46 μV) trials, t(14) = 0.124, p = .730.
Analysis of the P100 component over the posterior temporal area did not reveal a main effect of Congruency (mean amplitudes of −5.9 μV for the congruent condition and −4.5 μV for the incongruent condition), F(1, 14) = 1.304, p = .273. The analysis did not show a main effect of Hemisphere, F(1, 14) = 0.666, p = .428, or an interaction between Condition and Hemisphere, F(1, 14) = 0.000, p = .991. Finally, the t test analysis for lower occipital area for P100 did not reveal significantly different amplitudes between congruent (−6.1 μV) and incongruent (−5.7 μV) trials (t(14) = −1.057, p = .309).
The analysis of the Nc component did not demonstrate significant differences in neural activity between congruent (−0.2 μV) and incongruent (−0.6 μV) trials, t(14) = 0.355, p = .309. The results from Experiment 2 demonstrate that the effects observed in Experiment 1 were related to active experience and that pure visual exposure to similar actions is not sufficient to elicit the same effect. In general, these findings support the notion that active sticky mittens training enhances social networks dedicated to the processing of goal-directed actions. To further isolate the ERP components that make a strong contribution to the sticky mittens effect, direct comparisons between ERP components from Experiments 1 and 2 were performed.
The analysis over posterior temporal area for the P400 component when using the time window between 300 and 600 msec after stimulus onset (Figure 4; mean amplitudes of −0.6 μV for the congruent condition and 0.4 μV for the incongruent condition) did not reveal a main effect of Congruency (F(1, 14) = 0.452, p = .512), a main effect of Hemisphere, F(1, 14) = 2.718, p = .121, or an interaction between Condition and Hemisphere, F(1, 14) = 2.483, p = .499. The analysis for the P400 component over the lower occipital area (mean amplitudes of 1.4 μV for the congruent trials and 0.3 μV for the incongruent trials) did not reveal a significant difference between the conditions (t(14) = 0.626, p = .541). Additional analysis for the P400 component over the central (left, right) area did not indicate significant differences between the congruent (−1.4 μV) and incongruent (−2.4 μV) trials, t(14) = −0.069, p = .946.
STATISTICAL COMPARISON OF EXPERIMENTS 1 AND 2
To examine the effect of the training across experiments (active training in Experiment 1 and passive training in Experiment 2), we performed additional analysis on the difference score between conditions in the P400 component located over the posterior temporal area. The analysis revealed a significant difference as a function of the experience that infants acquired before EEG testing, demonstrating a larger relative P400 amplitude for the active training (3.9 μV) than the passive training (−1.1 μV) condition, F(1, 28) = 6.381, p = .017, η2 = 0.18. These findings demonstrate that active but not passive training influenced neural responses to congruent and incongruent grasping.
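The logic of this between-experiment comparison can be sketched as follows: a congruent-minus-incongruent difference score is computed for each infant, and the two training groups are then compared with a one-way ANOVA. The function names and the amplitudes are hypothetical and chosen only to keep the arithmetic transparent; they are not data from either experiment, and the sketch does not claim to reproduce the statistical software used.

```python
from statistics import mean


def difference_scores(congruent, incongruent):
    """Per-infant congruent-minus-incongruent amplitude difference."""
    if len(congruent) != len(incongruent):
        raise ValueError("each infant needs one value per condition")
    return [c - i for c, i in zip(congruent, incongruent)]


def one_way_f(group_a, group_b):
    """F ratio and error df for a two-group one-way ANOVA (effect df = 1)."""
    groups = (group_a, group_b)
    grand_mean = mean(group_a + group_b)
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)
    df_error = len(group_a) + len(group_b) - 2
    return ss_between / (ss_within / df_error), df_error


# Hypothetical amplitudes (microvolts) for three infants per training group.
active_diff = difference_scores([5.0, 6.0, 4.0], [1.0, 1.0, 1.0])
passive_diff = difference_scores([2.0, 1.0, 0.0], [1.0, 1.0, 1.0])
f_ratio, df_error = one_way_f(active_diff, passive_diff)
```

With 15 infants per group, as in the two experiments, the error degrees of freedom would be 28, matching the F(1, 28) reported above.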
We performed additional analyses on the difference score between conditions for the two components that were found significant after active training experience. The analysis for the P400 component over the lower occipital area did not reveal significantly different neural activity when comparing active and passive training, F(1, 28) = 2.979, p = .095. We did not find a significant difference between active and passive training when comparing the difference score between conditions within the Nc component, F(1, 28) = 0.80, p = .780.
No correlation between the number of goal-directed actions during active training and the amplitude of the P400 ERP component was observed (b = 0.166, t(14) = 0.526, p = .608).
Additional analysis was performed on the number of trials presented in each condition and experiment. The rationale for this analysis is to assess potential differences in general attention and motivation to participate in the EEG session after active and passive training. As the experiment continues until infants become too fussy to provide additional artifact-free trials, the number of presented trials directly relates to how long infants were attending to the stimuli. An ANOVA with Condition and Type of experience as factors did not demonstrate any significant differences in the attention between conditions, F(1, 28) = 0.771, p = .395, or studies, F(1, 28) = 0.235, p = .635 (for descriptive statistics, see each individual experiment). The analysis within the posterior temporal area for the P100 ERP component did not reveal a main effect of Congruency, F(1, 14) = 3.004, p = .103, a main effect of Hemisphere, F(1, 14) = 1.009, p = .332, or an interaction effect between Congruency and Hemisphere, F(1, 14) = 0.012, p = .916.
The direct comparison between the two experiments demonstrates that infants were equally attentive in both experiments and that the posterior temporal P400 has a central role as a key marker of active action experience after training. This effect can be elicited in 4-month-old infants, 1–2 months before a similar neural correlate of active action experience emerges alongside the typical development of goal-directed reaching. The fact that observed goal-directed actions produce an enhanced P400 relative to observed non-goal-directed action, but only in infants who receive active sticky mittens training, suggests that infants, through sticky mittens training, learn to encode the goal of other people's actions.
The aim of the current study was threefold. First, we aimed to contribute to a better understanding of how infants become proficient in encoding other people's goal-directed actions and the neural mechanisms involved in this process. Second, we aimed to examine which type of experience (active vs. passive) more strongly influences processing of others' actions at a neural level. Finally, given that our design trained infants to perform an action that they are not able to perform in the absence of training, we aimed to look at the causal connection between action production and action perception.
Our findings demonstrate that brief experience producing goal-directed grasping actions, at an age when functional grasping has not yet emerged, enhances the neural circuitry dedicated to processing goal-directed actions performed by others. The neural correlate that indexed this training-induced change in action perception was the P400 ERP component. This neural signature has previously been associated with the processing of goal-directed actions in humans (Bakker, Daum, et al., 2015; Melinder et al., 2015; Gredebäck et al., 2010; Senju et al., 2008) and animate agents (Gredebäck et al., 2015). In light of these prior findings, we believe that infants, through active training, begin to process manual actions as goal directed and come to expect hands to be directed toward objects in the environment.
We have previously argued (Bakker, Daum, et al., 2015) that seeing an object appear either at the top or at the bottom of the screen heightens attention to this location. When a grasping hand later appears in the middle of the screen, infants fixate this location, and covert attention is distributed along the reaching axis. If the axis of attention overlaps with the previously cued location, then the amplitude of the P400 is enlarged, and the action is perceived as goal directed. In essence, the effect observed here with respect to P400 is thought to represent a priming effect similar to what is demonstrated in infants and adults using eye tracking (Daum, Ulber, & Gredebäck, 2013; Daum & Gredebäck, 2011) and response time measures (Driver et al., 1999; Friesen & Kingstone, 1998). Only 4 min of active reaching and grasping training, with an average of 10 object-directed actions, was sufficient to engage priming and social perception networks that typically develop 1–2 months later in life (Bakker, Daum, et al., 2015). The speed of this neural reorganization or enhancement is quite remarkable, particularly given the young age of these infants.
Although ERPs lack spatial resolution, it has been argued that the P400 component originates from the STS, an area known to be involved in social perception and the processing of goal-directed actions (Keysers & Perrett, 2004; Allison et al., 2000). Both fMRI studies of STS activation in adults and ERP studies of P400 amplitudes in infants demonstrate similar signatures. First, both are sensitive to faces (for STS, see Yang, Rosenblau, Keifer, & Pelphrey, 2015; for P400, see Leppänen, Moulson, Vogel-Farley, & Nelson, 2007; de Haan, Pascalis, & Johnson, 2002), the cued direction of gaze (for STS, see Perrett et al., 1985; for P400, see Senju et al., 2008), the goal directedness of human hands (for STS, see Bahnemann, Dziobek, Prehn, Wolf, & Heekeren, 2010; for P400, see Bakker, Daum, et al., 2015; Melinder et al., 2015; Gredebäck et al., 2010), and point-light walkers (for STS, see Grossman et al., 2000; for P400, see Reid, Hoehl, Landt, & Striano, 2008). Second, the infant P400 is closely related to the adult ERP component N170 (Nelson et al., 2006; de Haan et al., 2002), which source localization and joint fMRI/EEG studies have documented as originating from the STS (Dalrymple et al., 2011; Itier & Taylor, 2004; Puce, Allison, Bentin, Gore, & McCarthy, 1998). Our results demonstrate for the first time that P400 is engaged in the establishment of the action production–action perception link and suggest that the STS is the source of this activation.
We believe that the role of the STS area is to detect goal-directed actions from visual input. In terms of the theoretical model of the action perception timeline proposed by Gredebäck and Daum (2015), the STS would be responsible for the very first step of action processing, that is, the initial modulation of attention that occurs within approximately the first 500 msec from the moment of detection of the agent or cue. In the current study, this first step of processing would be the summation of attentional shifts from the object and the congruent hand, indexed by the P400 component. In addition, we argue that the brief training provided by the sticky mittens, although helpful in enhancing the detection of incoming information and drawing infants' attention in the direction of the object, is not sufficient to build up a strong motor representation of the observed action. We believe that, for more advanced processing (involving, e.g., the ability to predict the goal of others' actions), infants need strong and stable motor plans that are built up during regular motor development (Gredebäck & Falck-Ytter, 2015; Gredebäck & Melinder, 2010) associated with massive training (Adolph et al., 2012).
The effects of active motor experience were also visible in a more frontal ERP component, the Nc, known to index selective attention (Richards, 2003). This component is known to be sensitive to surprising events. In our case, the increased negativity of the Nc is more strongly elicited for congruent actions. It is possible that infants see congruent actions as more interesting because these trials signal that the action will be accomplished. Studies that have presented similar stimuli to infants (e.g., hands pointing, grasping, gaze, or goal-directed nonhuman animate agents; see Introduction) do not report Nc differences. One suggestion is that this effect is directly related to active training and, as such, is less visible in other studies that have assessed P400 amplitudes at the onset of functional grasping and pointing later in life (Bakker, Daum, et al., 2015; Melinder et al., 2015; Gredebäck et al., 2010). Interpretations of the Nc effects should, however, be made with caution, as no differential effect (comparing Nc between the two experiments) could be observed. With this caution in mind, another possibility is that the temporal-parietal P400 and the centro-frontal Nc form a functional network that arises from active training with one's own reaching, grasping, and object manipulation and that is involved in the encoding of goal-directed manual actions (P400) and subsequent modulation of attention (Nc). Considering that prior studies that examined older infants did not report Nc effects, it is possible that the attentional component becomes less prominent later in life, once the motor plan needed to perform functional goal-directed actions develops.
It is clear that more studies need to target the Nc component with respect to observation of goal-directed actions to validate our alternative explanations. The involvement of P400 is more clearly expressed, however, both with its presence in Experiment 1 and absence in Experiment 2 and through direct comparisons between the two experiments. We further suggest that this P400 effect is maintained throughout the first year of life and that, in adults, it is expressed as an N170–N190, a transition similar to that previously documented for observation of congruent and incongruent pointing (Melinder et al., 2015; Gredebäck et al., 2010). As demonstrated in the statistical comparison between Experiments 1 and 2, the training effect is only found after active training, in which infants obtain first-hand experience grasping for and manipulating objects. Passive observation training of others' actions does not generate the same, or any, enhancement in neural activity in ERP component P400. This finding speaks for the unique role of active experience in shaping our ability to make sense of other people and the world at large.
One important caveat of our study should be mentioned. Our experiments were designed to compare active and passive experience as they naturally occur in the real world. Specifically, during active training, infants experience a first-person visual perspective on their own actions, whereas during passive training, infants adopt a third-party perspective on others' actions. It is possible that first-person observational experience would be more influential in activating the networks responsible for action understanding. An alternative control for the active experience condition would be to present infants with videos of another infant's first-person reaching experience. However, because of the pragmatic challenges of using such an approach with very young infants, and given the confounds inherent in comparing video and live experiences, this approach was not our first choice.
Although our data support the supremacy of active training over passive training in enhancing processing of goal-directed actions, we acknowledge that both types of experiences (active and passive) contribute to social perception processes in infants' everyday lives. The literature provides evidence, which cannot be dismissed, that visual learning facilitates encoding of other people's goals (Green, Li, Lockman, & Gredebäck, in press; Henrichs, Elsner, Elsner, Wilkinson, & Gredebäck, 2014; Biro, Verschoor, & Coenen, 2011). However, our data show that active experience generates a stronger signal and more in-depth processing in social perception networks, creating a foundation for social learning very early in life. We believe that first-person experience is uniquely powerful because it provides simultaneous visual and motor experience. Thus, it is possible that an understanding of actions that are only visually accessible is acquired with some time lag. This speculation requires more investigation and may be a subject for further studies.
Together, our findings suggest that brief active reaching and grasping experience provides a causal facilitation of social networks and attentional priming with respect to goal-directed manual actions. We learn about others through active engagement with the world, through an embodied process that uses available first-hand experience of reaching and grasping for objects to perceive, encode, and interpret second-hand information obtained while observing others reach and grasp for objects. This active training enhances a social network involving P400, assumed to originate in STS and to be responsible for the detection and processing of goal-directed actions via priming. This study provides the first causal connection between neural networks dedicated to social processing and active action experience during the first 6 months of life.
We are grateful to all the parents and their children participating in this study. In addition, we thank Kahl Hellmer and Tori Wesevich for their help in participant recruitment as well as Malin Karstens for her excellent assistance with the data collection. This work was financed by an ERC StG grant (CACTUS 312292) and a Wallenberg Academy Fellow grant from the Knut and Alice Wallenberg foundation (KWA 2012.0120).
Reprint requests should be sent to Marta Bakker, Uppsala University, Box 1225, 751 42 Uppsala, Sweden, or via e-mail: firstname.lastname@example.org, email@example.com.