We investigated the temporal dynamics and changes in connectivity in the mental rotation network through the application of spatio-temporal support vector machines (SVMs). The spatio-temporal SVM [Mourao-Miranda, J., Friston, K. J., et al. (2007). Dynamic discrimination analysis: A spatial-temporal SVM. Neuroimage, 36, 88–99] is a pattern recognition approach that is suitable for investigating dynamic changes in the brain network during a complex mental task. It does not require a model describing each component of the task and the precise shape of the BOLD impulse response. By defining a time window including a cognitive event, one can use spatio-temporal fMRI observations from two cognitive states to train the SVM. During the training, the SVM finds the discriminating pattern between the two states and produces a discriminating weight vector encompassing both voxels and time (i.e., spatio-temporal maps). We showed that by applying spatio-temporal SVM to an event-related mental rotation experiment, it is possible to discriminate between different degrees of angular disparity (0° vs. 20°, 0° vs. 60°, and 0° vs. 100°), and the discrimination accuracy is correlated with the difference in angular disparity between the conditions. For the comparison with highest accuracy (0° vs. 100°), we evaluated how the most discriminating areas (visual regions, parietal regions, supplementary, and premotor areas) change their behavior over time. The frontal premotor regions became highly discriminating earlier than the superior parietal cortex. There seems to be a parcellation of the parietal regions with an earlier discrimination of the inferior parietal lobe in the mental rotation in relation to the superior parietal. The SVM also identified a network of regions that had a decrease in BOLD responses during the 100° condition in relation to the 0° condition (posterior cingulate, frontal, and superior temporal gyrus). 
This network was also highly discriminating between the two conditions. In addition, we investigated changes in functional connectivity between the most discriminating areas identified by the spatio-temporal SVM. We observed an increase in functional connectivity between almost all areas activated during the 100° condition (bilateral inferior and superior parietal lobe, bilateral premotor area, and SMA) but not between the areas that showed a decrease in BOLD response during the 100° condition.
The mental rotation task is a special case of an imagery task, in that it requires the generation, maintenance, and active manipulation of the presented object. Therefore, it is reasonable to believe that areas involved in different components of the task are differently activated and functionally connected over time. Many studies have previously investigated the neural mechanisms that underlie mental rotation (Schendan & Stern, 2007; Ecker, Brammer, David, & Williams, 2006; Keehner, Guerin, Miller, Turk, & Hegarty, 2006; Levin, Mohamed, & Platek, 2005; Vingerhoets, de Lange, Vandemaele, Deblaere, & Achten, 2002; Harris et al., 2000; Cohen et al., 1996). They showed that the so-called rotation network comprises parietal regions and several areas of the motor system as well as of the visual system. Although the areas involved in the mental rotation task are well established, there are few descriptions of the dynamic changes in this network of brain regions, in terms of activity and connectivity between the regions, during task execution. Previous studies have investigated the specific time course of the hemodynamic response function (HRF) during a mental rotation task in different ROIs using time-resolved fMRI analysis (Ecker et al., 2006; Richter et al., 2000; Richter, Ugurbil, Georgopoulos, & Kim, 1997). This type of analysis is based on correlating the time course of a mental process with the time course of the HRF in predefined ROIs. There are two drawbacks in using this approach. First, one needs to make assumptions about the HRF shape, and the analysis will not work properly for areas in which the response departs from the assumed HRF. Second, one needs an objective criterion to define ROIs a priori. Normally, the ROIs are selected based on the results of a standard general linear model (GLM) analysis that fits a model to the time series at each voxel (Friston et al., 1995); if the model is not correctly specified, some important areas might not be detected. 
A common problem when analyzing event-related (ER) fMRI data, as in the mental rotation task, is that one needs to have a temporally precise model of the BOLD response to detect areas involved in the cognitive task. The situation becomes more difficult for complex tasks that can be split into a number of subcomponents with different latencies and durations. The traditional approach for analyzing ER fMRI data is to use a set of basis functions within the GLM. Choices of basis functions include a Fourier set (Josephs, Turner, & Friston, 1997), a lagged gamma function (Dale & Buckner, 1997), a canonical response function and its partial derivatives (Friston et al., 1998), and a finite impulse response set (Henson, 2001). Henson, Price, Rugg, Turner, and Friston (2002) detected latency differences in ER BOLD responses using a first-order Taylor approximation in terms of the temporal derivative of a canonical HRF. However, it is difficult to accommodate all the possible variability of the ER BOLD response in a single model, and if the model underlying the GLM is incorrectly specified (e.g., if the impulse response does not resemble the canonical form), then the results may be biased. Only a few studies have used model-free approaches to investigate the areas involved in the mental rotation task (Lamm, Windischberger, Moser, & Bauer, 2007; Windischberger, Lamm, Bauer, & Moser, 2003).
The present work aimed to investigate the temporal dynamics and changes in connectivity in the mental rotation network through the application of spatio-temporal support vector machines (SVMs). Using this pattern recognition approach, one can look at the whole brain to detect regions activated by the task over time without a priori spatial or temporal constraints. The use of pattern recognition methods to analyze neuroimaging data has increased significantly in recent years (Kriegeskorte, Goebel, & Bandettini, 2006; Mourao-Miranda, Reynaud, McGlone, Calvert, & Brammer, 2006; Davatzikos et al., 2005; LaConte, Strother, Cherkassky, Anderson, & Hu, 2005; Mourao-Miranda, Bokde, Born, Hampel, & Stetter, 2005; Cox & Savoy, 2003; Mitchell et al., 2003). In these applications, the fMRI scans are treated as spatial patterns, and machine learning methods are used to identify statistical properties of the data that discriminate between brain states (e.g., Task 1 vs. Task 2) or groups of subjects (e.g., patients vs. controls). Using these approaches, researchers are addressing questions about how cognitive states are mapped onto patterns of neural activity and are using this information to make predictions about a subject's mental state from their fMRI scans (i.e., as a “mind reading” approach). Recently, Mourao-Miranda, Friston, and Brammer (2007) have shown that by using the combined spatial and temporal information as input to a pattern recognition method (e.g., the spatio-temporal SVM), one can not only make predictions about a subject's mental state but also perform a dynamic discrimination analysis. The spatio-temporal SVM is a suitable approach to investigate dynamic changes in the brain network during a complex mental task because it does not require a model describing each component of the task and the precise shape of the BOLD impulse response. By defining a time window including the event, one can use spatio-temporal fMRI observations to train the SVM. 
This produces a discriminating weight vector encompassing both voxels and time (i.e., spatio-temporal maps). These spatio-temporal maps can be difficult to interpret because the value in each voxel is a function of both changes that happened in that voxel (univariate effect) and changes in spatial and temporal correlations with the other voxels in the brain (multivariate effect). To increase the interpretability of the spatio-temporal SVM maps, we investigated changes in functional connectivity between the most discriminating areas.
We used data from 11 right-handed female volunteers between 20 and 30 years of age (detailed in Ecker et al., 2006). All participants were in good general health without a history of neurological or psychiatric disorders and exhibited normal eyesight. Written consent was provided by all subjects. The study was approved by the South London and Maudsley NHS Trust Ethics Committee.
The objects used in the mental rotation task were Shepard and Metzler (1971)–like structures (detailed in Ecker et al., 2006). During the task, the stimuli were presented in pairs and were displayed by a computer-controlled projector system on a screen. The three-dimensional objects in each pair were either the same (same pair) or mirror images (different pair). Object pairs were presented in four different experimental conditions according to the angular disparity between the objects: (1) 0° angular disparity (2 identical objects at the same orientation), (2) 20° angular disparity, (3) 60° angular disparity, and (4) 100° angular disparity.
A total of 80 trials (20 trials per condition) were presented in random order with respect to the identical/mirror image condition and the angular disparity (0°, 20°, 60°, and 100°). Half of the trials per condition were same pairs, and the other half were different pairs. No object pair was presented twice. Once the object pair appeared on the screen, subjects were asked to decide whether the objects were the same or mirror images. Subjects were instructed to respond as quickly as possible while keeping errors to a minimum (for details, see Ecker et al., 2006). The nature of the response (same or mirror image) and the RTs were recorded.
Desynchronization of Stimulus Duration and RTs
The data were acquired in an ER fMRI design with constant stimulus duration and constant ISI. In all trial conditions, the three-dimensional objects were presented for a period of 10 sec, followed by a fixation cross (baseline; Figure 1). To keep the stimulus duration constant, the objects did not disappear from the screen after subjects indicated their choice by button press. Instead, objects were presented for an additional period Δt, with Δt = 10 sec − RT. Subjects were instructed to avoid repeated object rotation by focusing their eye gaze on one of the presented objects. Individual trials were separated by an ISI of 5 to 6 sec, after which the HRF should have decayed to the baseline level (Kwong et al., 1992).
Whole-brain gradient echo-planar MR images were acquired using a 1.5-T GE Signa Neuro-optimized System (General Electric, Milwaukee, WI) fitted with 40 mT/m high-speed gradients at the Maudsley Hospital, London. A foam padding and a forehead strap were used to limit head motion. Seven hundred sixty T2*-weighted images depicting BOLD contrast (Kwong et al., 1992; Ogawa & Lee, 1990) were acquired over 25.34 min at each of 25 near-axial, noncontiguous 5-mm-thick planes parallel to the intercommissural (AC–PC) line: TE = 40 msec, TR = 2000 msec, theta = 80°, in-plane resolution = 3.75 mm, interslice gap 0.5 mm. Volumes 1 to 60 (0–119 sec) were acquired during the first resting period, volumes 61 to 700 (120–1399 sec) were acquired during the mental rotation task, and volumes 701 to 760 (1400–1519 sec) were acquired during the second resting period.
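The acquisition timing above can be checked with a few lines of arithmetic. The sketch below is only an illustration: it assumes that the 80 trials tile the task period back to back, which is our reading of the design rather than a statement from the acquisition protocol, and all names are hypothetical.

```python
# Values taken from the acquisition description; the assumption that trials
# tile the task period without gaps is ours.
TR_SEC = 2.0                      # repetition time
TASK_FIRST, TASK_LAST = 61, 700   # task volumes (1-based, inclusive)
N_TRIALS = 80                     # 4 conditions x 20 trials
STIM_SEC = 10.0                   # constant stimulus duration


def volume_onset_sec(volume):
    """Onset time in seconds of a 1-based volume index."""
    return (volume - 1) * TR_SEC


n_task_volumes = TASK_LAST - TASK_FIRST + 1      # volumes acquired during the task
volumes_per_trial = n_task_volumes // N_TRIALS   # volumes available per trial
trial_sec = volumes_per_trial * TR_SEC           # duration of one trial window
implied_isi_sec = trial_sec - STIM_SEC           # what remains after the stimulus
```

Under this assumption each trial spans eight volumes (16 sec), that is, the 10-sec stimulus plus a 6-sec interval.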
The data were preprocessed using SPM2 (Wellcome Department of Imaging Neuroscience, London, UK). The scans were realigned to remove residual motion effects. Slice timing correction was applied, and the scans were transformed into standard space (Talairach & Tournoux, 1988). The data were smoothed in space using an 8-mm Gaussian filter (full-width half-maximum). Finally, a mask was applied to select voxels that contain brain tissue for all subjects.
Singular Value Decomposition
We used singular value decomposition (SVD) to reduce the raw data to its eigenvariates (Mourao-Miranda et al., 2007). The SVD was performed across data for all training subjects. The training and the test data were projected onto the resulting singular vectors or basis (we used all singular vectors or principal components that have eigenvalues different from zero). A description of SVD can be found in the Supplementary material (Appendix A).
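The projection step can be written compactly as follows. This is a sketch in our own notation (the symbols X, U, S, V, and Z do not appear in the original text), and it assumes the standard thin SVD restricted to the nonzero singular values; whether the data were mean-centered beforehand is not stated here.

```latex
% X_train: matrix with one training observation per row
X_{\mathrm{train}} = U S V^{\top}, \qquad
Z_{\mathrm{train}} = X_{\mathrm{train}} V = U S, \qquad
Z_{\mathrm{test}} = X_{\mathrm{test}} V .
% Keeping every component with a nonzero singular value leaves the
% linear kernel on the training data unchanged:
Z_{\mathrm{train}} Z_{\mathrm{train}}^{\top}
  = U S^{2} U^{\top}
  = X_{\mathrm{train}} X_{\mathrm{train}}^{\top} .
```

Because a linear SVM depends on the training data only through these inner products, the reduction lowers the dimensionality without changing the classifier learned from the training set.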
In the current approach, each trial (described in Figure 1) is treated as a single spatio-temporal observation. Thus, the time window runs from the beginning of stimulus presentation until the end of the ISI. This means the number of observations of one condition per subject is the number of trials. The fMRI data are represented by a spatio-temporal observation of size M = Nv × Nt, where Nv is the total number of voxels and Nt is the number of time points included in the time window. A single feature in one observation is defined by yjti, that is, the fMRI data signal of a voxel j at a given time point t in the trial i.
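A minimal sketch of how such observations could be assembled is given below. The function names and the ordering of features (all voxels of the first time point, then all voxels of the second, and so on) are assumptions made for illustration; the original analysis may have stored the data differently.

```python
def build_observations(volumes, trial_onsets, n_timepoints):
    """Flatten each trial window into one spatio-temporal observation.

    volumes      : list of scans, each a list of Nv voxel values
    trial_onsets : 0-based index of the first scan of each trial
    n_timepoints : Nt, the number of scans in the time window
    Returns one list of length M = Nv * Nt per trial.
    """
    observations = []
    for onset in trial_onsets:
        window = volumes[onset:onset + n_timepoints]
        if len(window) < n_timepoints:
            raise ValueError("trial window runs past the end of the data")
        # Concatenate the scans of the window, one time point after another.
        observations.append([value for scan in window for value in scan])
    return observations


def feature_index(voxel_j, time_t, n_voxels):
    """Position of the feature for voxel j at time t (both 0-based)."""
    return time_t * n_voxels + voxel_j


# Toy data: 3 voxels, 6 scans, two trials with Nt = 2 scans each.
toy_volumes = [[10 * t + j for j in range(3)] for t in range(6)]
toy_obs = build_observations(toy_volumes, trial_onsets=[0, 3], n_timepoints=2)
```

In the toy example each observation has M = 3 × 2 = 6 features, and `toy_obs[1][feature_index(2, 1, 3)]` retrieves the value of voxel 2 at the second time point of the second trial.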
Support Vector Machine
SVMs are pattern recognition devices that find functions of the data that facilitate classification. They are based on statistical learning theory (Vapnik, 1995) and have emerged as powerful tools for statistical pattern recognition (Boser, Guyon, & Vapnik, 1992). In the linear formulation, an SVM finds, during the training phase, a hyperplane that separates the examples in the input space according to a class label. The SVM classifier is trained by providing examples of the form <x,c>, where x represents a spatial pattern and c is the class label. Once the decision function is learned from the training data, it can be used to predict the class of a new test example. In the present study, x represents a spatio-temporal observation and c is the task performed (e.g., c = 1 for the 0° condition and c = −1 for the 100° condition). A previous application of the spatio-temporal SVM to a blocked design can be found in Mourao-Miranda et al. (2007).
We used a linear kernel SVM that allows direct extraction of the weight vector as an image (i.e., the discriminating spatio-temporal pattern). The parameter C that controls the trade-off between having zero training errors and allowing misclassifications was fixed at C = 1 for all cases (default value). The SVM toolbox for Matlab was used to perform the classifications (http://ida.first.fraunhofer.de/∼anton/index.html). A brief description of SVM can be found in the Supplementary material (Appendix B).
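For reference, the role of C and of the weight vector can be seen in the standard soft-margin formulation of the linear SVM. The slack variables, the bias b, the coefficients α, and the sample size n are our notation; we assume the common L1-slack form, and Appendix B should be consulted for the description actually used.

```latex
\min_{w,\,b,\,\xi}\ \frac{1}{2}\lVert w\rVert^{2} + C\sum_{i=1}^{n}\xi_{i}
\quad\text{subject to}\quad
c_{i}\,(w^{\top}x_{i} + b) \ge 1 - \xi_{i},\qquad \xi_{i}\ge 0 ,
% decision function and predicted label for a new observation x
f(x) = w^{\top}x + b, \qquad \hat{c} = \operatorname{sign} f(x),
% the weight vector is a weighted sum of the training examples
w = \sum_{i=1}^{n}\alpha_{i}\,c_{i}\,x_{i}, \qquad 0 \le \alpha_{i} \le C .
```

Only examples with nonzero coefficients (the support vectors) contribute to w, and because w has the same dimension as an observation, it can be displayed as an image.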
We evaluated the performance of the classifier using a leave-one-subject-out cross-validation test. In each cross-validation fold, we used all observations from all but one subject (S − 1 of the S subjects) to train the classifier. Subsequently, the class assignment of the observations of the test subject was calculated during the test phase. This procedure was repeated S times, each time leaving out the observations from a different subject (i.e., each subject was left out once). The classifier accuracy was measured by the proportion of observations correctly classified.
In each leave-one-out test, the number of training examples per class (e.g., class 1 = 0° and class 2 = 100°) was the number of training subjects times the number of trials per subject (10 subjects × 20 trials = 200 examples), and the number of test examples per class was 20 (all trials of the test subject).
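The splitting scheme can be sketched as follows. To keep the example self-contained we use a nearest-centroid rule as a stand-in classifier; it is not an SVM and is not the method used in this study. All function names are hypothetical.

```python
def leave_one_subject_out(subject_ids):
    """Yield (held_out_subject, train_indices, test_indices) for each subject."""
    for held_out in sorted(set(subject_ids)):
        train = [i for i, s in enumerate(subject_ids) if s != held_out]
        test = [i for i, s in enumerate(subject_ids) if s == held_out]
        yield held_out, train, test


def nearest_centroid_predict(train_x, train_c, test_x):
    """Stand-in classifier (NOT an SVM): label of the closer class mean."""
    centroids = {}
    for label in set(train_c):
        rows = [x for x, c in zip(train_x, train_c) if c == label]
        centroids[label] = [sum(col) / len(rows) for col in zip(*rows)]

    def sq_dist(a, b):
        return sum((u - v) ** 2 for u, v in zip(a, b))

    return [min(centroids, key=lambda lab: sq_dist(x, centroids[lab]))
            for x in test_x]


def cross_validated_accuracy(x, c, subject_ids, predict=nearest_centroid_predict):
    """Per-fold proportion of held-out observations classified correctly."""
    accuracies = []
    for _, train, test in leave_one_subject_out(subject_ids):
        predicted = predict([x[i] for i in train], [c[i] for i in train],
                            [x[i] for i in test])
        correct = sum(p == c[i] for p, i in zip(predicted, test))
        accuracies.append(correct / len(test))
    return accuracies


# Design described in the text: 11 subjects, 2 classes, 20 trials per class.
design_subjects = [s for s in range(11) for _ in range(40)]
design_labels = [label for _ in range(11) for label in [1] * 20 + [-1] * 20]
_, first_train, first_test = next(leave_one_subject_out(design_subjects))
train_per_class = sum(design_labels[i] == 1 for i in first_train)
test_per_class = sum(design_labels[i] == 1 for i in first_test)
```

With the design above, each fold contains 200 training and 20 test examples per class, matching the counts given in the text. Any classifier with the same `predict(train_x, train_c, test_x)` signature could replace the stand-in.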
Dynamic Discrimination Maps
If the input space is the voxel space (one voxel per dimension), the weight vector (w) normal to the hyperplane will be the direction along which the volumes of the two tasks differ most. Hence, it represents a map of the most discriminating regions (i.e., a discrimination map). Given two classes, Task 1 and Task 2, with the labels +1 and −1, a positive value in the discrimination map means that this voxel had relatively higher activity during Task 1 than during Task 2 in the training examples that contribute most to the overall classification (i.e., the support vectors), and a negative value means relatively lower activity during Task 1 than during Task 2. In our case, the input space covers voxels and time points; consequently, w encodes a spatio-temporal map (w-map) showing, for each voxel, how the discrimination between the tasks changes over time. Because the classifier is multivariate by nature, the combination of all voxels as a whole is identified as a global pattern by which the brain states differ (i.e., the discriminating pattern). Selecting a subset of the most discriminating voxels is a nontrivial problem and has been previously addressed by a permutation approach (Mourao-Miranda et al., 2005). However, the permutation approach is not ideal because it uses a univariate test to threshold the output of a multivariate approach. In the present article, we decided to present nonthresholded w-maps that give a global view of the discriminating pattern. To keep the article to a reasonable length, we have only included the w-map for the classification between 0° and 100°. However, for the interest of readers, we have included the other maps (0° vs. 20° and 0° vs. 60°) in the Supplementary material.
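As an illustration of how a flat weight vector becomes a sequence of maps, the sketch below splits w into one spatial map per time point and extracts the weight time course of a single voxel. It assumes that features are ordered time point by time point; this layout and the function names are ours.

```python
def weight_vector_to_maps(w, n_voxels, n_timepoints):
    """Split a flat weight vector into one spatial map per time point.

    Assumes w lists all voxels of the first time point, then all voxels
    of the second time point, and so on (an assumption of this sketch).
    """
    if len(w) != n_voxels * n_timepoints:
        raise ValueError("len(w) must equal Nv * Nt")
    return [w[t * n_voxels:(t + 1) * n_voxels] for t in range(n_timepoints)]


def voxel_time_course(maps, voxel_j):
    """Weight of one voxel at every time point of the window."""
    return [w_map[voxel_j] for w_map in maps]


# Toy weight vector: 3 voxels, 2 time points.
toy_w = [0.0, 0.1, -0.1, 0.9, -0.7, 0.05]
toy_maps = weight_vector_to_maps(toy_w, n_voxels=3, n_timepoints=2)
toy_course = voxel_time_course(toy_maps, voxel_j=0)
```

With the labels used here (+1 for 0° and −1 for 100°), a voxel whose weight moves from about zero to a large positive value, as voxel 0 does in the toy example, would be read as becoming relatively more active during the 0° condition at the later time point.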
For comparative purposes, we computed a standard mass-univariate t test on a time-point by time-point basis at each voxel. We performed a two-sample t test between the two classes (0° vs. 100°) using the same data that were used to train the SVM (the SVM received all data from all training subjects with no explicit distinction between within- and between-subjects effect); that is, for each voxel and for each time point, we compared the values during Task 1 (e.g., 0° condition) with the values during Task 2 (e.g., 100° condition) regardless of within- and between-subjects effect (i.e., a fixed effect model). To identify the peaks of the t map, we used a threshold of p < .001 (uncorrected). Differences observed between the peaks of the t map and the weight vector are likely to be due to multivariate structure in the data that cannot be detected by the univariate t test. To investigate this hypothesis, we performed a functional connectivity analysis between the most discriminating regions in the SVM weight vector.
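The mass-univariate comparison can be sketched as below. We assume the pooled-variance (Student) form of the two-sample t test, since the variant is not specified in the text; the function names are hypothetical, and thresholding at p < .001 is not included.

```python
import math


def two_sample_t(a, b):
    """Student two-sample t statistic with pooled variance (assumed variant)."""
    na, nb = len(a), len(b)
    mean_a, mean_b = sum(a) / na, sum(b) / nb
    ss_a = sum((v - mean_a) ** 2 for v in a)
    ss_b = sum((v - mean_b) ** 2 for v in b)
    pooled_var = (ss_a + ss_b) / (na + nb - 2)
    # Raises ZeroDivisionError if both samples are constant.
    return (mean_a - mean_b) / math.sqrt(pooled_var * (1.0 / na + 1.0 / nb))


def t_map(obs_class1, obs_class2):
    """t statistic for every feature (voxel x time point) of flat observations."""
    n_features = len(obs_class1[0])
    return [two_sample_t([o[k] for o in obs_class1],
                         [o[k] for o in obs_class2])
            for k in range(n_features)]
```

Each feature is tested in isolation, so, unlike the SVM weight vector, the resulting map cannot reflect correlations between voxels or between time points.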
Functional connectivity is defined as the temporal coherence between spatially remote neurophysiological events (Friston, Frith, Liddle, & Frackowiak, 1993), for example, the correlation between the time series of two voxels or regions. It does not provide any direct insight into how these correlations are mediated. We computed functional connectivity between a subset of the most discriminating regions identified by the spatio-temporal SVM. To keep the results interpretable, we limited the analysis to a subset of regions. The choice of the regions was based on the previous literature describing regions involved in the mental rotation task and regions previously described as the resting state network (i.e., regions that systematically present a decrease of BOLD signal during task performance). Overall, 12 regions were selected for functional connectivity analysis based on previous publications. These regions include motor areas such as the primary motor area (M1), the premotor area (PM), and the SMA (Ecker et al., 2006; Vanrie, Beatse, Wagemans, Sunaert, & Van Hecke, 2002; Vingerhoets et al., 2002; Parsons et al., 1995), parietal regions such as the superior parietal lobe (LPs) and the inferior parietal lobe (LPi) (e.g., Jordan, Heinze, Lutz, Kanowski, & Jäncke, 2001; Barnes et al., 2000), and visual regions such as the fusiform gyrus (FG) (e.g., Schoening et al., 2007) and the middle occipital gyrus (GOm) (e.g., Ecker et al., 2006). In addition, the anterior cingulate was included (e.g., Butler et al., 2006). The selected regions and their respective Talairach coordinates are presented in Table 1. We performed the connectivity analysis using the same time series information that was used to train the spatio-temporal SVM. For each region, we selected the time series of a voxel in the center of the cluster, and we defined a condition-specific time series by concatenating the 20 trials of a specific condition (i.e., the trials used to train the SVM). 
The time window of each trial contained eight time points (the time window runs from the beginning of stimulus presentation until the end of the ISI). The time series was normalized by subtracting the mean and then dividing by the standard deviation of the 160 time points. For each subject, we computed a connectivity matrix for both conditions (0° and 100°). We applied a Wilcoxon signed rank test to find statistically significant changes in connectivity between the selected regions over all subjects.
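The steps of this analysis can be sketched as follows. The function names are hypothetical, the population form of the standard deviation is an assumption (the correlation does not depend on that choice), and only the signed rank statistics are computed, not the p-value.

```python
import math


def zscore(series):
    """Subtract the mean and divide by the (population) standard deviation."""
    n = len(series)
    mean = sum(series) / n
    sd = math.sqrt(sum((v - mean) ** 2 for v in series) / n)
    return [(v - mean) / sd for v in series]


def connectivity_matrix(region_series):
    """Pearson correlation between every pair of regional time series."""
    z = [zscore(s) for s in region_series]
    n = len(z[0])
    return [[sum(a * b for a, b in zip(zi, zj)) / n for zj in z] for zi in z]


def wilcoxon_signed_rank_statistic(x, y):
    """Return (W_plus, W_minus) for paired samples.

    Zero differences are dropped and tied absolute differences receive
    average ranks. No p-value is computed in this sketch.
    """
    diffs = [a - b for a, b in zip(x, y) if a != b]
    order = sorted(range(len(diffs)), key=lambda i: abs(diffs[i]))
    ranks = [0.0] * len(diffs)
    i = 0
    while i < len(order):
        j = i
        while (j + 1 < len(order)
               and abs(diffs[order[j + 1]]) == abs(diffs[order[i]])):
            j += 1
        average_rank = (i + j) / 2.0 + 1.0   # mean of ranks i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = average_rank
        i = j + 1
    w_plus = sum(r for r, d in zip(ranks, diffs) if d > 0)
    w_minus = sum(r for r, d in zip(ranks, diffs) if d < 0)
    return w_plus, w_minus
```

In use, one connectivity matrix would be computed per subject and condition from the 160-point concatenated series; then, for each pair of regions, the per-subject correlations at 100° and at 0° would be passed as the paired samples `x` and `y`.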
|Region|Talairach coordinates (x y z)|
|---|---|
|Regions That Increase BOLD Signal during the Mental Rotation Task||
|Right PM|34 −6 50|
|Left PM|−29 −4 50|
|Right LPs|22 −70 50|
|Left LPs|−22 −70 50|
|Right LPi|38 −40 42|
|Left LPi|−38 −40 42|
|Left M1|−41 −22 50|
|SMA|−8 12 50|
|CA|6 24 34|
|Left FG|−43 −66 −22|
|Right GOm|28 −82 18|
|Left GFi|−42 22 18|
|Regions That Decrease BOLD Signal during the Mental Rotation Task||
|CP|−2 −40 34|
|Right GFs|24 26 50|
|Left GFs|−18 34 50|
|Right GTs|55 −62 26|
|Left GTs|−51 −62 26|
CA = anterior cingulate; GOm = middle occipital gyrus; GFi = inferior frontal gyrus; CP = posterior cingulate; GFs = superior frontal gyrus; GTs = superior temporal gyrus.
In Figure 2, we present the mean accuracy for the SVM trained with spatio-temporal observations. We trained three SVM classifiers to predict if the subject was performing a mental rotation task (with 100°, 60°, or 20° of angular disparity between the figures, respectively) or a control task (with 0° of angular disparity between the figures). Error bars indicate the standard error across 11 leave-one-subject-out cross-validation tests. The accuracy was correlated with the amount of difference in angular disparity between the mental rotation task and the control task: 67.0% for 20° versus 0°, 78.9% for 60° versus 0°, and 89.5% for 100° versus 0°. Because the comparison between 100° versus 0° resulted in the best classification accuracy, we included in the article only the spatio-temporal profile of the SVM weight vector for this comparison.
Weight Vector—Spatio-temporal SVM
In Figure 3, we show the spatio-temporal profile of the SVM weight vector (w-maps) for classifying between 0° and 100°. These maps show how the areas involved in different components of the mental rotation task change their behavior over time. The value of each voxel in the weight vector is proportional to the importance of this voxel in discriminating between the two tasks (0° vs. 100° of angular disparity). The color scale identifies the most discriminating regions for each time point (light/dark blue for negative values, i.e., relatively more activation during 100°, and red/orange for positive values, i.e., relatively more activation during 0°) in relation to the regions with lower discriminating weight (green, cyan, and yellow). Each row corresponds to a different time point during the experimental design. The time resolution is 2 sec (i.e., one TR).
Discriminating Areas between 0 and 2 sec (T1)
The first row shows that during the first 2 sec after the beginning of the task, there were not many areas with high discriminating weights; this is in agreement with reported values of hemodynamic delay.
Discriminating Areas between 2 and 4 sec (T2)
The first discriminating areas appear in the second row. Negative peaks in blue (i.e., increases in BOLD during the 100° condition) include (bilaterally) the LPi, the SMA, the right PM, the anterior cingulate, the right GOm, and (bilaterally) the FG. Positive peaks in red (i.e., decreases in BOLD during the 100° condition or increases in BOLD during the 0° condition) include left M1, posterior cingulate, and inferior mediofrontal cortex. The positive peak in the left M1 corresponds to the motor response for the task with 0° of angular disparity, and it agrees with the subjects' mean RT for the task (1.5 sec) plus the hemodynamic delay (∼2 sec). The negative peaks correspond to the areas involved in the initiation of the mental rotation task.
Discriminating Areas between 4 and 6 sec (T3)
The negative peaks (i.e., increases in BOLD during the 100° condition) during the third TR include the SMA, the PM (bilaterally), the left LPs, the LPi (bilaterally), and the FG. Positive peaks (i.e., decreases in BOLD during the 100° condition) include posterior cingulate and right superior temporal gyrus. Between 4 and 6 sec, the transformation of the spatial coordinates is almost complete, and a response is made, on average, 4.5 sec after stimulus onset. In addition, the presented visual stimuli can still be seen on the screen. Taking into account the hemodynamic delay, regions displaying negative peaks at T3 are therefore primarily involved in the computation of the rotation process and in the preparation of the motor response. Positive peaks would be expected in regions that are suppressed during mental rotation and motor preparation, that is, in regions that normally decrease the BOLD signal with an increase in task load (e.g., posterior cingulate).
Discriminating Areas between 6 and 8 sec (T4)
The negative peaks (i.e., increases in BOLD during the 100° condition) in the fourth TR include the FG bilaterally, the LPs, the PM, and the SMA. The positive peaks (i.e., decreases in BOLD during the 100° condition) include the posterior cingulate, the right postcentral gyrus, and both the superior temporal gyrus and the superior frontal gyrus bilaterally. This period corresponds to the time when the subjects finish the mental rotation task. This result agrees with the subjects' mean RT for the condition with 100° of angular disparity (4.5 sec) plus the time required for a significant hemodynamic response to appear (∼2 sec). Overall, the resulting peaks at T4 are very similar to the peaks at T3. Negative peaks were observed primarily in regions of the rotation network as well as motor areas, whereas positive peaks were observed primarily in regions that normally decrease the BOLD signal with an increase in task load.
Discriminating Areas between 8 and 10 sec (T5)
The positive and the negative peaks in the fifth TR are basically the same as described in the fourth TR but with smaller cluster sizes. The main remaining negative peak is in the LPs bilaterally. At this time, the subjects had already finished the mental rotation task, but the visual stimulus is still being presented.
Discriminating Areas between 10 and 16 sec (T6, T7, and T8)
After 10 sec, there are almost no areas with high discriminating weight. This period corresponds to the 6 sec of interstimulus interval.
We also evaluated the discriminating maps for the other comparisons. In contrast to the 0° versus 100° map, the maps for discriminating between lower angular disparities are noisier, the weights being less concentrated in regions known to play a main role in the task (see Supplementary material).
t Maps and Connectivity Analysis
For purposes of comparison with the standard univariate methods of analyzing fMRI data, we also computed a t map (i.e., two-sample t test) for each time point (Figure 4) and applied a threshold corresponding to p < .001 (uncorrected). Comparing the peaks of the spatio-temporal weight vector with the peaks of the thresholded t maps reveals many differences between them. The red clusters (representing areas relatively more activated during the 0° condition) were not identified by the t test; in addition, the distribution of the blue clusters (representing areas relatively more activated during the 100° condition) is very different in the two maps. The difference between these maps might be due to multivariate structure in the data (intervoxel interactions). To investigate the multivariate structure of the data, we computed the functional connectivity, or the temporal correlation, between the peaks of the w-map for each subject and for each condition and tested for task-related statistical changes in functional connectivity that occurred consistently across subjects. A number of areas belonging to the mental rotation network show a significant increase in correlation during the 100° condition compared with the 0° condition. The results of these analyses are presented in Table 2. We can see that there are basically two networks. Network 1: areas that increased the BOLD signal during the mental rotation task (blue peaks in Figure 3); and Network 2: areas that decreased the BOLD signal during the mental rotation task (red peaks in Figure 3). The results of the functional connectivity analysis showed that areas within each network are well correlated with each other but not with areas of the other network. As we can see from Table 2, the areas of Network 2 did not change their functional connectivity with each other during the 100° condition in relation to the 0° condition. 
For each selected region (i.e., peaks of the SVM weight vector), we present the BOLD signal (percent signal change) for all conditions averaged over all trials and over all subjects. In addition, we present the temporal profile of the weight vector for the comparison between 100° versus 0° and the difference in BOLD response between these conditions (Figure 5A–C).
CA = anterior cingulate; FG L = left fusiform gyrus; GOm R = right middle occipital gyrus; GFi L = left inferior frontal gyrus; CP = posterior cingulate; GFs R = right superior frontal gyrus; GFs L = left superior frontal gyrus; GTs R = right superior temporal gyrus; GTs L = left superior temporal gyrus.
In the present study, spatio-temporal SVM was applied to an ER mental rotation paradigm. The classification accuracy was correlated with the amount of difference in angular disparity between the mental rotation task and the control task: 67.0% for 20° versus 0°, 78.9% for 60° versus 0°, and 89.5% for 100° versus 0°. The SVM was also able to detect brain regions functionally associated with different aspects of the task and to show how the contributions of these regions change over time. This approach differs from conventional, model-dependent techniques, which rely on an a priori postulated model for the time course of activation following stimulation. In addition, we investigated changes in functional connectivity between the most discriminating areas identified by the spatio-temporal SVM. Results of these analyses revealed that functional connectivity increased between almost all areas involved in the mental spatial transformation from 0° to 100° rotation. Interestingly, this increase in connectivity was absent in areas that showed decreases in their BOLD responses during the mental rotation task.
Overall, the regions detected by the spatio-temporal SVM correspond to those regions reported previously during mental rotation paradigms based on model-dependent techniques. These include motor and parietal regions and visual system components (Vingerhoets et al., 2002; Alivisatos & Petrides, 1997; Parsons et al., 1995). Activations in the motor system are generally explained by a possible involvement of motor imagery (Vingerhoets et al., 2002; Parsons et al., 1995); that is, the subjects rotate mental images in the same way as they would physically, using appropriate hand movements. Parietal activation has been traditionally linked to visuospatial processing and has also been observed during both overt and covert movement, which would support the motor-imagery hypothesis (Kosslyn, Thompson, Wraga, & Alpert, 2001). Activation in LPi and LPs has consistently been reported in previous mental rotation studies using three-dimensional cubic structures (Vanrie et al., 2002; Jordan et al., 2001; Barnes et al., 2000; Cohen et al., 1996) and alpha-numeric stimuli (Vingerhoets et al., 2001; Harris et al., 2000; Alivisatos & Petrides, 1997) and during rotation of images of body parts (Kosslyn, DiGirolamo, Thompson, & Alpert, 1998; Parsons et al., 1995).
A few studies have also used model-free approaches to investigate the areas involved in the mental rotation task (Lamm et al., 2007; Windischberger et al., 2003). Windischberger et al. (2003) used exploratory fuzzy cluster analysis to identify brain regions with stimulus-related time courses. They showed that in 3 of 14 subjects, fuzzy cluster analysis allowed the separation of distinct clusters corresponding either to visuospatial processing (i.e., mental rotation of the three-dimensional cube stimuli) or to movement execution (button pressing). Voxels of the first cluster were located in the parietal, occipital, and premotor cortex as well as in anterior parts of the SMA. Voxels activated during the manual task response appeared in the primary somatomotor area (contralateral to the button-pressing hand) and more posterior parts of the SMA. Lamm et al. (2007) developed a new task paradigm to explicitly separate the different cognitive processes required to solve the mental rotation task (encoding, mental rotation proper, and object matching). In addition, they used an fMRI analysis approach that does not require assumptions about the shape of the hemodynamic response (Windischberger et al., 2003). In this approach, separate regressors are constructed for every time point (TR) within the trial, each predicting intensity changes at the corresponding time point. By computing contrasts between these regressors, it is possible to model the cognitive processes associated with the different steps of task solving. These analyses revealed that the mental rotation task involves a number of brain regions associated with visual and motor processing. The LPs was persistently involved in all aspects of task solving, showing highest activation during mental rotation proper. In addition, higher-order visual areas in the occipital and temporal lobes as well as various motor areas were found to be active during the different steps of task solving.
The authors also showed that activation in the dorsolateral premotor areas (dPM) during the mental rotation was not strongly modulated by the processing of spatial information but that activation in the dPM areas was strongest during the mental rotation proper. They suggested that dPM is involved in more generalized processes such as visuospatial attention and movement anticipation.
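The per-time-point regressors described above for Lamm et al. (2007) can be illustrated with a small finite impulse response (FIR) design matrix. This is only a sketch under stated assumptions, not those authors' implementation: the function names `fir_design` and `fir_fit`, the use of scan indices as onsets, and the ordinary least-squares fit are all hypothetical choices made for illustration.

```python
import numpy as np

def fir_design(n_scans, onsets, n_lags):
    """Build one indicator regressor per post-onset time point (TR).

    Column `lag` is 1 at scan `onset + lag` for every trial onset and
    0 elsewhere, so no hemodynamic response shape is assumed.
    """
    X = np.zeros((n_scans, n_lags))
    for onset in onsets:
        for lag in range(n_lags):
            if onset + lag < n_scans:  # ignore lags past the end of the run
                X[onset + lag, lag] = 1.0
    return X

def fir_fit(y, X):
    """Least-squares estimate of the response at each lag (hypothetical helper)."""
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    return beta
```

A contrast between time points is then a weighted sum of the estimated coefficients, for example `beta[2] - beta[0]` to compare the third TR with the first.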
An interesting novel finding provided by the spatio-temporal SVM is that frontal premotor regions appear as discriminating earlier than parietal regions (specifically the superior parietal cortex) during the mental rotation task. By looking at the spatio-temporal discrimination map (Figure 3), it is possible to see that the set of motor regions is highly discriminating shortly after stimulus onset (between 0 and 4 sec after the onset, peaking between 4 and 6 sec). This implies that there is an instantaneous onset in the activity of the motor system, and that this activity is greater during the 100° condition than during the 0° condition. In addition, the discrimination weight of these areas switches off at the same time as the rotational process (between 6 and 8 sec). This seems to support suggestions that parietal activation is less directly linked to the computation of the spatial transformation than formerly hypothesized (Lamm et al., 2007; Ecker et al., 2006). This finding is also in accordance with previous investigations demonstrating a significant correlation between the width of the hemodynamic response and the RT measures in these regions (Ecker et al., 2006; Richter et al., 2000). The data therefore seem to support the hypothesis that several areas of the motor system are directly linked to the rotational process.
Parietal regions, on the other hand, were predominantly found to be discriminating during later stages of the trial time course (between 6 and 10 sec). Furthermore, there seems to be a parcellation of the parietal cortex based on the time course of the weight vector. Whereas the LPi appears as a discriminating region as early as T2 (between 2 and 4 sec), the superior parietal cortex appears mainly at T4/T5 (between 6 and 10 sec). This would suggest a later activation of the superior parietal cortex in relation to the inferior parietal cortex. However, it is important to keep in mind that the value of a voxel in the w-maps does not depend only on the activity in that voxel; it reflects a combination of univariate and multivariate effects. The difference in temporal discrimination between the inferior and the superior parietal cortex was not observed in the temporal evolution of the t maps. This difference between the w-maps and the t maps may reveal components in the parietal cortex that are detected by the SVM because of its multivariate nature but overlooked by the univariate t test. There are direct anatomical connections from LPs to motor cortex (reviewed by Picard & Strick, 2001) that may well be involved in the behavioral response. Parcellation of the parietal lobe during real and imaginary mental rotation has been previously described (Podzebenko, Egan, & Watson, 2005). The authors identified subregions within the posterior parietal cortex that displayed differential functional activation according to stimulus type. Those conditions requiring canonical-mirror judgments consistently recruited the ventrolateral banks of the intraparietal sulcus (IPS), whereas those conditions in which no orientation discrimination was required (the subjects passively viewed a rotating abstract stimulus) consistently recruited the medial bank of the IPS, extending into the superior parietal lobule.
In addition, conditions that require identification of a stimulus' orientation activate, among other areas, the BA 40 and the caudolateral bank of the IPS (Shikata et al., 2001; Orban, Dupont, Vogels, Bormans, & Mortelmans, 1997).
The spatio-temporal SVM revealed very little participation of visual regions as discriminating areas. The visual stimulation was constant from T1 to T5 in both conditions (0° and 100°). This supports the previous conclusion of Ecker et al. (2006) that visual system components do not actively participate in the mental process itself (i.e., computation of the spatial transformation). The only discriminating areas in the visual system were the right GOm and the left FG. The discriminating cluster in the GOm (Talairach coordinates = −43, −66, −22) corresponds to the object-sensitive area in the dorsal occipito-temporal cortex described by Schendan and Stern (2007) as part of the common network activated by mental rotation and object recognition. This area has been suggested to have a role intermediate between typical ventral “what” and dorsal “where” functions (Hasson, Harel, Levy, & Malach, 2003), computing two- and three-dimensional spatial aspects of the objects and relations between objects and space.
Another interesting discriminating region found by the spatio-temporal SVM was the dorsal portion of the anterior cingulate. This region has an important role in mediating attention-requiring processes such as response selection and error monitoring (Botvinick, 2004). Although the anterior cingulate has been found to be activated during a mental rotation task (Schendan & Stern, 2007), its role has not been discussed in detail. By observing the discriminating regions relatively more activated during the 0° condition than the 100° condition, we found a decrease in the BOLD signal in a set of areas that normally show decreases in BOLD signal with an increase in task load. These areas have also previously been described as part of the default brain network. We suspected that these areas do not appear on the t maps because the difference in activation between the 0° and the 100° conditions is small (and therefore not significant) but the high degree of connectivity leads to high discrimination in the SVM maps. The “default network” has been described as a number of regions that consistently decrease activity when subjects engage in goal-directed tasks as compared with a control or resting condition (Fox & Raichle, 2007; Raichle & Snyder, 2007; Raichle et al., 2001). These regions include posterior cingulate, precuneus, and medial prefrontal cortex. The pattern of areas showing task-induced deactivation (TID) is remarkably similar across different studies (Mazoyer et al., 2001; Binder et al., 1999; Shulman et al., 1997), and the common regions include the posterior cingulate cortex, the dorsomedial frontal cortex, the middle and superior frontal gyri, the rostral anterior cingulate gyrus, and the angular gyrus. McKiernan, Kaufman, Kucera-Thompson, and Binder (2003) observed that the magnitude of the TID increased with task difficulty.
Although we observed areas with TID in the present study, the control condition corresponds to the comparison of two objects with 0° of angular disparity, and therefore it is still a task with very low cognitive load rather than a resting state.
The overall impression gained by examining the three discriminating maps (0° vs. 20°, 0° vs. 60°, and 0° vs. 100°; see Supplementary material) is that those involving lower angular disparities (0° vs. 20° and 0° vs. 60°) are noisier than the 0° versus 100° map. In the last map, the weights become more concentrated in those regions known from previous work to be involved in the mental rotation task. It is also noteworthy that the classification accuracy improves markedly with increasing angular disparity. Combining these two sets of observations, we believe that higher accuracy is achieved when the discriminating map is concentrated on those regions with the most important physiological role in the task.
So far, this discussion has mainly addressed the interpretation of the results in the context of mental rotation. There are, however, several methodological issues requiring further consideration. The interpretation of the w-maps is very complex because of the multivariate properties of the SVM approach. The value of a voxel in the weight vector is related to its importance in discriminating the classes (i.e., Task 1 vs. Task 2), and it is a combination of univariate (i.e., mean difference between the classes) and multivariate effects (e.g., temporal correlation between regions). To make the interpretation of the w-maps easier, it is important to compute the t maps for the same contrast as the SVM. If the w-map is quite similar to the t map (as previously shown in Mourao-Miranda et al., 2006, 2007), it means that the main effect driving the SVM at a given voxel is the mean difference between the classes (univariate effect) at that voxel. However, if the maps are dissimilar (as shown in the present article), one can hypothesize that the differences are due to the multivariate components of the data. One possible way to investigate this hypothesis is to evaluate the functional connectivity between the peaks of the w-maps. In Figure 5A–C, we showed the time series (averaged over trials and subjects) for the selected peaks of the w-maps. It is possible to see that in all regions there was a parametric modulation of the BOLD signal by the angular disparity (i.e., by the task difficulty). This parametric modulation was present in areas that increased BOLD signal with the task difficulty as well as in the areas that decreased BOLD signal with the task difficulty. In addition, we also presented the temporal profile of the weight vector for the comparison between 100° versus 0° and the difference in BOLD response between these conditions.
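The comparison between w-maps and t maps discussed above can be sketched as follows. This is a minimal illustration under assumptions that are not taken from the study: the voxelwise statistic is a Welch two-sample t (the actual t test may differ), the similarity measure is a plain Pearson correlation across features, and the names `t_map` and `map_similarity` are hypothetical.

```python
import numpy as np

def t_map(a, b):
    """Voxelwise Welch t statistic for condition b minus condition a.

    a, b: arrays of shape (n_trials, n_features), one row per trial.
    Assumes every feature has non-zero variance in at least one condition.
    """
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    se = np.sqrt(a.var(axis=0, ddof=1) / len(a) + b.var(axis=0, ddof=1) / len(b))
    return (b.mean(axis=0) - a.mean(axis=0)) / se

def map_similarity(w, t):
    """Pearson correlation between a weight map and a t map (flattened)."""
    return float(np.corrcoef(np.ravel(w), np.ravel(t))[0, 1])
```

A similarity close to 1 would suggest that the weights mainly reflect mean differences between the classes, whereas a low similarity is compatible with, but does not by itself prove, a contribution of multivariate effects.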
We used connectivity analysis to aid the interpretability of the differences between the t map and the discriminating map (i.e., w-map). The results of this analysis showed that the peaks in the discriminating volume were part of two networks of regions (Network 1: regions that increased their BOLD signal during the mental rotation task; and Network 2: regions that decreased their BOLD signal with task load). Activations in areas within each network were highly correlated with each other but not with areas of the other network. Additionally, we found an increase in functional connectivity among the regions well established as the main core of the mental rotation network (inferior and superior parietal, premotor, and supplementary regions) during the difficult task but not between areas that presented a negative BOLD response during the mental rotation task (Table 2). Interestingly, the degree of functional connectivity between parietal regions, premotor regions, SMA, and anterior cingulate increases significantly during the most difficult task. Although the SVM analysis detects other discriminating regions such as the M1, the FG, the GOm, and the inferior frontal gyrus, these areas seem to be uncoupled from the mental rotation network. One can therefore conclude that these regions are not directly related to the transformation itself but are more likely to be involved in other aspects of the task (e.g., object recognition, report decision). The fact that M1 does not change its connectivity with the other regions can be considered as evidence that this region is only involved in the motor output of the decision. These results emphasize the advantage of using multivariate methods, such as the SVM, to investigate how multiple regions interact to perform a cognitive task.
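A change in functional connectivity between conditions, as discussed above, can be sketched in a few lines. The sketch assumes that connectivity is measured as the Pearson correlation between regional time series and that conditions are compared on the Fisher z scale; both are common choices but are assumptions here, and the function names are hypothetical. No significance test is included.

```python
import numpy as np

def connectivity(ts):
    """Pearson correlation matrix for time series of shape (n_regions, n_timepoints)."""
    return np.corrcoef(np.asarray(ts, dtype=float))

def fisher_z(r):
    """Fisher z transform; correlations are clipped to keep the result finite."""
    return np.arctanh(np.clip(r, -0.999999, 0.999999))

def connectivity_change(ts_easy, ts_hard):
    """Difference in z-transformed connectivity (hard minus easy condition)."""
    dz = fisher_z(connectivity(ts_hard)) - fisher_z(connectivity(ts_easy))
    np.fill_diagonal(dz, 0.0)  # self-connectivity carries no information
    return dz
```

Positive entries indicate region pairs that are more strongly coupled in the harder condition; in practice each entry would still need a statistical test across subjects.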
Functional connectivity analyses have been previously applied to a mental rotation task (Koshino, Carpenter, Keller, & Just, 2005). In that work, functional connectivity analysis was performed using a number of anatomically defined regions of interest. The correlation coefficient was computed between the mean signals of the activated voxels within the ROIs. In addition, these authors performed an exploratory factor analysis in each condition and found three main common factors (or clusters of regions) across conditions corresponding to executive processing, spatial processing, and lower level visual processing. Our work differs from Koshino et al. (2005) in several respects. First, our main interest was to use the spatio-temporal SVM to perform a dynamic discrimination analysis, that is, to observe how areas involved in different components of the mental rotation task change their behavior over time. Second, we performed the functional connectivity analysis using a subset of the most discriminating areas in an attempt to investigate the multivariate components of regions identified by the spatio-temporal SVM.
Recently, pattern recognition approaches have been applied to analyze fMRI data (Mourao-Miranda et al., 2005, 2006; Davatzikos et al., 2005; LaConte et al., 2005; Cox & Savoy, 2003; Mitchell et al., 2003). The main idea behind the use of pattern recognition methodology is that most brain functions are represented in a distributed mode and therefore it is possible to recognize a pattern of activity as being associated with one cognitive state versus another (Norman, Polyn, Detre, & Haxby, 2006). These approaches can be extended using temporal embedding, for example, the spatio-temporal SVM (Mourao-Miranda et al., 2007). This makes the dynamic aspect of the brain activation an explicit part of the classification problem. Applications of spatio-temporal SVM are especially advantageous for analyzing complex tasks that can be split into a number of subcomponents with different temporal profiles.
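The temporal embedding mentioned above can be illustrated with a short sketch: the volumes in a window after each trial onset are concatenated into one feature vector, a linear classifier is trained, and the weight vector is reshaped into one map per time point. Everything here is an assumption made for illustration rather than the implementation used in the study: the function names are hypothetical, onsets are given as scan indices, and a plain batch subgradient solver for the hinge loss stands in for a proper SVM package.

```python
import numpy as np

def embed_trials(bold, onsets, window):
    """Concatenate `window` consecutive volumes after each onset.

    bold: (n_scans, n_voxels); every onset + window must fit inside the run.
    Returns an array of shape (n_trials, window * n_voxels).
    """
    bold = np.asarray(bold, dtype=float)
    return np.stack([bold[t:t + window].ravel() for t in onsets])

def train_linear_svm(X, y, lam=0.01, lr=0.1, n_iter=500):
    """Batch subgradient descent on the L2-regularized hinge loss.

    y must contain -1/+1 labels. Illustrative stand-in for an SVM solver.
    """
    n = len(y)
    w = np.zeros(X.shape[1])
    b = 0.0
    for _ in range(n_iter):
        active = y * (X @ w + b) < 1.0  # trials violating the margin
        grad_w = lam * w - (y[active, None] * X[active]).sum(axis=0) / n
        grad_b = -y[active].sum() / n
        w -= lr * grad_w
        b -= lr * grad_b
    return w, b

def weight_maps(w, window, n_voxels):
    """Reshape the weight vector into one spatial map per time point."""
    return w.reshape(window, n_voxels)
```

Row `k` of the array returned by `weight_maps` is the discriminating map for the `k`-th time point of the window, which is what allows the weights of a region to be followed over time.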
In summary, we have shown that by applying spatio-temporal SVM to a mental rotation ER experiment, we could detect areas involved in different components of the task as they change their behavior over time. In addition, we investigated changes in functional connectivity between the most discriminating areas identified by the spatio-temporal SVM. We observed an increase in functional connectivity between areas of the mental rotation network (bilateral LPi and LPs, bilateral PM, and SMA) during the difficult task but not between the areas that presented a decrease of the BOLD signal during the mental rotation task (posterior cingulate, frontal, and superior temporal gyrus).