Given their unique connectivity, a primary function of brain networks must be to transfer and integrate information. Therefore, the way in which information is integrated by individual nodes of the network may be an informative aspect of cognitive processing. Here we present a method inspired by telecommunications research that utilizes time–frequency fluctuations of neural activity to infer how information is integrated by individual nodes of the network. We use a queueing theoretical model to interpret empirical data in terms of information processing and integration. In particular, we demonstrate, in participants aged from 6 to 41 years, that the well-known face inversion phenomenon may be explained in terms of information integration. Our model suggests that inverted faces may be associated with shorter and more frequent neural integrative stages, indicating fractured processing and consistent with the notion that inverted faces are perceived by parts. Conversely, our model suggests that upright faces may be associated with a smaller number of sustained episodes of integration, indicating more involved processing, akin to holistic and configural processing. These differences in how upright and inverted faces are processed became more pronounced during development, indicating a gradual specialization for face perception. These effects were robustly expressed in the right fusiform gyrus (all groups), as well as right parahippocampal gyrus (children and adolescents only) and left inferior temporal cortex (adults only).
The complex topology of brain networks suggests that they are optimized for information processing. Anatomical brain networks have been found to possess a whole host of nontrivial properties that may make them ideally suited for information transfer, including economical small-world connectivity (Bullmore & Sporns, 2012; Bassett & Bullmore, 2006), hierarchical modularity (Meunier, Lambiotte, & Bullmore, 2010; Hilgetag, Burns, O'Neill, Scannell, & Young, 2000), and the presence of highly interconnected hubs that potentially serve as a backbone for communication (van den Heuvel & Sporns, 2011; Sporns, Honey, & Kötter, 2007).
To this end, a rich repertoire of techniques has emerged to capture the flow and exchange of information in brain networks. For instance, connectivity-based methods can quantify the role of brain regions in the context of network communication (Mišić, Vakorin, Paus, & McIntosh, 2011; van den Heuvel & Sporns, 2011; Rubinov et al., 2009; Hagmann et al., 2008; Stam, Jones, Nolte, Breakspear, & Scheltens, 2007; Sporns & Zwi, 2004; Sporns, Tononi, & Edelman, 2000). Neural activity may also be treated as an expression of a dynamic system to characterize various aspects of information processing. Entropy-based metrics can be used to estimate the rate at which information is generated by that system (Vakorin, Lippé, & McIntosh, 2011; Vakorin, Mišić, Krakovska, & McIntosh, 2011; Mišić, Mills, Taylor, & McIntosh, 2010; McIntosh, Kovacevic, & Itier, 2008). Likewise, the dynamic range of the system may be studied in terms of the variability of neural activity (Garrett, Kovacevic, McIntosh, & Grady, 2010, 2011, 2012). Finally, time series of neural activity can be segregated into a set of states to study transitions among those states (Freyer, Roberts, Ritter, & Breakspear, 2012; Freyer, Aquino, Robinson, Ritter, & Breakspear, 2009; Pascual-Marqui, Michel, & Lehmann, 1995). Yet despite considerable progress, we lack full understanding of how information is integrated and projected to other nodes in the network.
The role of integration in the context of networked computation is of interest not only in neuroscience but in network science in general. In a recent report, we demonstrated that tools from teletraffic engineering could be adapted to study information flow in brain networks (Mišić, Vakorin, Kovaević, Paus, & McIntosh, 2011). A fundamental statistic for communication networks is the distribution of time between successive message departures at each node (inter-event time), as it reflects the nature of the arrivals at the node as well as how the node processes these arrivals. We showed that the resting-state EEG could be partitioned into a trace of output events, analogous to a departure of a unit of information.
However, information is dynamically routed through brain networks over multiple timescales. Long-term changes such as pruning and myelination that take place during development transform the configuration of anatomical brain networks (Hagmann et al., 2010) and the paths available for information transfer. Likewise, transient interareal functional associations may form spontaneously or in response to external demand, which suggests that information flow can be flexibly redirected (Honey et al., 2009; Stephan et al., 2003; McIntosh et al., 1994). The purpose of this study is to investigate how the reconfiguration of structural and functional networks affects the way in which information is integrated. To study regionally specific dynamics, we used magnetoencephalography (MEG) paired with a beam-forming technique (Cheyne, Bostan, Gaetz, & Pang, 2007; Sekihara, Sahani, & Nagarajan, 2005; Robinson & Vrba, 1999) to estimate activity at several hundred sources in the brain (Figure 1), while children and adults performed a face recognition task with upright and inverted faces. We hypothesized that developmental changes in inter-event times should be observed throughout the brain, as most brain areas experience structural changes because of myelination and synaptic pruning over the age range of our sample (Lenroot & Giedd, 2006; Taylor, 2006; Paus et al., 1999).
A face recognition task allowed us to characterize integration and subsequent output during a cognitive task. We investigated the face inversion effect, whereby recognition of faces is impaired if they are presented upside-down. A similar effect is observed for other types of visual stimuli such as everyday objects, but the impairment is disproportionately large for faces (Yin, 1969). Face inversion is thought to disrupt the configural (Mondloch, Le Grand, & Maurer, 2002; Freire & Lee, 2001; Rhodes, Brake, & Atkinson, 1993) and/or holistic (Farah, Tanaka, & Drain, 1995; Tanaka & Farah, 1993) information about a face, leading to significant behavioural and neural differences in processing of upright and inverted faces, in adults and children (Taylor, Batty, & Itier, 2004). We hypothesized that if integration of information about the relations among features as well as their overall arrangement is affected by face inversion, inter-event time distributions should be different for upright and inverted faces. As processing of faces changes with development and generally shows a different pattern for upright and inverted faces (Baudouin, Gallay, Durand, & Robichon, 2010; Mondloch, Robbins, & Maurer, 2010; Joseph et al., 2006; Itier & Taylor, 2004), this effectively allowed us to test our model on many different data sets. A principal strength of developmental data is that they can be used to confirm models of processing that are based only on adult results (Enns, 1993).
Events were defined as the local maxima of the MEG scalogram (Figure 2), which follows from the physiological interpretation of gross electromagnetic brain signals such as EEG and MEG. Namely, peaks in electromagnetic signal power represent synchronous postsynaptic potentials from a population of neurons. These dendritic currents in turn represent the integration of input from afferent neuronal populations. Inter-event times were fitted with the two-parameter gamma distribution (Figure 3). Our goal was to identify spatial patterns of task-dependent changes in inter-event times that were expressed across multiple age groups, without an a priori ROI. We opted for a multivariate analytic approach (partial least squares [PLS]; McIntosh & Lobaugh, 2004), which allowed us to statistically assess group and task differences simultaneously across all sources and frequencies.
We interpret changes in the inter-event distributions in terms of a particular generative model (the Erlang server), which models how information accumulates at a particular node in a network. We note, however, that finding a Gamma distribution does not imply this particular model and in fact the underlying process cannot be inferred directly from the functional form of the distribution. Therefore, our interpretation of these results in terms of the Erlang model is strictly speculative but allowed a parsimonious explanation of the findings, consistent with the wider literature.
To give a brief overview, the methodological approach is organized as follows. MEG data are transformed into a time–frequency representation. Peaks in MEG signal power are delinated as events that represent bursts of dendritic currents. Inter-event times are fitted with the two-parameter gamma distribution. Using the Erlang model, inter-event times are represented as a series of integrative stages, which correspond to the input of information to a neuronal population from multiple afferent sources.
Seventy-one healthy participants took part in the study, 49 children, aged 6–7 (n = 10, 3 girls), 8–9 (n = 10, 9 female), 10–11 (n = 9, 2 girls), 12–13 (n = 10, 6 girls), 14–16 (n = 10, 3 girls); and 22 adults, aged 20–41 years (mean age = 25.7 years, 9 women). None of the participants wore any metallic implants or had ferromagnetic dental work. All participants reported normal or corrected-to-normal vision. Experiments were performed with the approval of the Research Ethics Board at the Hospital for Sick Children, as well as the informed assent from the children and informed consent by their parents and the adult participants.
Stimuli and Task
Participants performed a simple 1-back task in which they were asked to judge whether each face presented was the same or different compared with the previously presented face. The stimulus set consisted of 240 grayscale photographs of faces of young adults that were unfamiliar to the participants (2.4 × 3° visual angle) with neutral expressions. All faces were without any paraphernalia or facial hair. Male and female faces were equiprobable. In each block of 180 trials, one third of the faces immediately repeated, such that in each block 120 faces were new and 60 were repeated. To avoid any confounding effect of memory, only the trials that contained new presentations of faces were analyzed. Upright faces were presented in one block and inverted faces in the other, with the order of the two blocks counterbalanced across participants.
Stimuli were back-projected via two mirrors to a screen positioned at a viewing distance of 70 cm from the participant. Faces were presented centrally over a uniform black background for 500 msec. The duration of the ISI varied randomly and with equal probability between 1200 and 1500 msec, and during this time a white fixation cross was displayed at the center of the screen. Participants were instructed to respond as quickly and accurately as possible to indicate if a face had repeated. Stimulus presentation and responses were controlled using Presentation software (Neurobehavioral Systems, Inc., Albany, CA).
MEG data were recorded using a 151-channel whole-head CTF system (MEG International Services, Ltd., Coquitlam, BC, Canada). Participants lay supine in a dimly lit magnetically shielded room at the Hospital for Sick Children in Toronto. Before acquisition, three localization coils were placed at the nasion and bilateral preauricular points to help localize the participant's head relative to the MEG sensor array at the start and finish of each block. Motion tolerance was limited to 1 cm, and neuromagnetic activity was digitized at a rate of 625 Hz. Data were filtered off-line with a band-pass of 0.5–100 Hz. Data were epoched into [−100 1500] msec segments time-locked to stimulus onset. Following the MEG recording session, the three fiducial coils were replaced by MRI-visible contrast markers, and 3-D SPGR (T1-weighted) anatomical images were acquired using a 1.5T Signa Advantage system (GE Medical Systems, Markham, ON, Canada).
Event-related beamforming (Cheyne et al., 2007; Sekihara et al., 2005; Robinson & Vrba, 1999) is a 3-D adaptive spatial filter that uses surface field measurements to estimate activity at desired locations in the brain. Individual anatomical MRIs were warped into the standard Talairach space using a nonlinear transform in SPM2. Five hundred twenty-nine source locations were chosen on a grid of size of 5 mm, such that they were uniformly spaced and sufficiently few in number to allow reasonable computation time (Figure 1). These locations were then warped back into the individual participants' brains using the inverse transform. Activity at each target source was estimated as a weighted sum of the surface field measurements. Weight parameters and the orientation of the source dipole were optimized in the least squares sense, such that the average power originating from all other locations was maximally attenuated without any change to the power of the forward solution associated with the target source. The forward solution for the beamformer was modeled by fitting multiple sphere models to the inner skull surface of each participant's MRI using BrainSuite software (Shattuck & Leahy, 2002). The weights were then used to compute single-trial time series for each source.
Inter-event Time Distributions
The output process at each source was defined in terms of the peaks and troughs in the scalograms (Figure 2). Events were identified by searching for all local maxima in the scalogram. To prevent insignificant peaks from being selected, a local neighborhood threshold was set as a ratio (5%) of the range of the scalogram amplitude. The exact choice of threshold did not impact the functional form of the distributions in any significant manner, such that thresholds varying from 2% to 10% gave rise to very similar distributions. The times between successive events were calculated for each participant, condition, source, and wavelet scale.
The statistical properties of the inter-event times at each source were characterized by modeling the functional form of the distribution (Figure 2). EasyFit software (MathWave Technologies) was used to fit the inter-event time data with 30 common distributions by maximum likelihood estimation. The fitted distributions were Beta, Burr (Burr Type 12, or Singh-Maddala), Cauchy (Lorentz), Chi-Square, Dagum (Burr Type 3, or Inverse Burr), Erlang, Error (Exponential Power or Generalized Error), Error Function, Exponential, F Distribution, Fatigue Life (Birnbaum-Saunders), Frechet (Maximum Extreme Value Type 2), Gamma, Generalized Gamma, Gumbel Max (Maximum Extreme Value Type 1), Gumbel Min (Minimum Extreme Value Type 1), Hyperbolic Secant, Inverse Gaussian, Johnson SB, Johnson SU, Kumaraswamy, Levy, Laplace (Double Exponential), Logistic, Log-Gamma, Log-Logistic (Fisk), Lognormal, Nakagami (Nakagami-m), Normal (Gaussian), Pareto-first kind, Pareto-second kind (Lomax), Pearson Type 5 (Inverse Gamma), Pearson Type 6 (Beta dist. of the second kind), Pert, Power Function, Rayleigh, Reciprocal, Rice (Ricean or Nakagami-n), Student's t, Triangular, Uniform, and Weibull.
A simple generative model for a gamma-distributed interevent time is the situation where several stages must be completed between successive events and the waiting time at each of these stages is itself exponentially distributed (Figure 4). Thus, we conceptualize the time between events as a sum of integrative stages. This is illustrated in Figure 4, where the time between successive peaks (the inter-event time) is a sum of k integrative stages (shown in blue). The shape parameter (k) corresponds to the number of stages and the scale parameter (θ) to their mean duration.
What do gamma-distributed inter-event times reveal about brain function? Peaks in electromagnetic signal power arise because of interactions within and between neuronal ensembles. In particular, peaks in signal power are thought to originate from synchronized dendritic currents resulting from synaptic transmission, which act as current dipoles. In the case of magnetic fields measured by MEG, these current dipoles are generated mainly by pyramidal cells oriented perpendicular to the cortical surface. From the perspective of information processing, peaks in MEG signal power signify the integration of input, during which signals from afferent neurons are transmitted across the synapse and integrated, giving rise to dendritic currents in the postsynaptic neurons. Therefore, the stages between successive events may correspond to a series of volleys or bursts of input to that neuronal population.
From a cognitive perspective, these integrative stages correspond to additional input or accumulation of information at a given brain region. For instance, in the case of visual processing, integrative steps may represent putative components or lower-level features of a stimulus, which are integrated to engender a percept. In the specific case of face processing, these integrative stages may relate to how different facial features are integrated to produce a coherent, unified facial percept.
Thus, the parameters of the gamma distributed interevent times reveal how information is integrated to give rise to the observed neural activity. Specifically, the time between successive events can be thought of as a sum of integrative steps and changes in either parameter because of experimental manipulation can be interpreted from this perspective. For instance, if inter-event time distributions recovered from neural activity during task A have a greater shape parameter than during task B, this would suggest that additional integrative steps are required to generate events during task A. These additional integrative stages may correspond to additional input, such as additional bursts or volleys of spikes from afferent neuronal populations. If distributions during task A have a greater scale parameter than during task B, this would indicate that each of the integrative steps proceeds at a slower rate during task A.
Note, however, that this is not the only possible model that can generate gamma statistics. Although queueing network models are often used in cognitive psychology (Liu, 1996), diffusion models are also commonly used to explain gamma distributions (Van Zandt & Ratcliff, 1995; Ratcliff, 1979). However, there is no way to objectively assess which model is most suitable. In this study, we chose to interpret changes in gamma statistics from the perspective of the Erlang model because of its simplicity and because it provides a taxonomy that fits well with the well-established notions of configural and holistic processing of faces.
We treated each of the two fitted gamma distribution parameters k and θ as measures of neural activity and assessed the differences between age groups and conditions for each of these measures, using mean-centering PLS analysis. PLS analysis is a multivariate statistical technique that can be used to relate two “blocks” or sets of variables to each other (McIntosh & Mišić, 2012; McIntosh & Lobaugh, 2004; Lobaugh, West, & McIntosh, 2001; McIntosh, Bookstein, Haxby, & Grady, 1996). In the context of neuroimaging, one set of variables may be exogenous, such as the study design (e.g., age groups and/or conditions), whereas the other may represent a set of endogenous variables, such as neural activity (e.g., k or θ) that varies across one or more dimensions (e.g., sources and frequencies). In this study, we related the differentiation between age groups to source- and frequency-dependent patterns of inter-event time distribution parameters.
In PLS, this is achieved by computing the covariance matrix between the two sets of variables and decomposing this matrix into mutually orthogonal “latent variables” using singular value decomposition (Eckart & Young, 1936). Each latent variable represents a particular relationship between the study design on one hand and neural activity on the other. Specifically, each latent variable is expressed as a vector of design saliences and a vector of source saliences, as well as a scalar singular value (s). The elements of the design salience vectors are interpreted as a contrast between age groups and/or conditions, whereas the source saliences represent a particular pattern of sources and frequencies that maximally express that contrast. The singular value reflects the proportion of covariance between the design variables (groups and conditions) and neuromagnetic variables (sources and frequencies) that is accounted for by each latent variable. This allows effect size to be estimated as the ratio of the square of the singular value associated with that particular latent variable to the sum of all squared singular values derived from the decomposition.
Nonparametric resampling techniques were used to assess the statistical significance and reliability of experimental effects. For each effect, statistical significance was determined using permutation tests. The rows (i.e., the observations) of the neural activity data matrix were randomly reordered (permuted), and the new data were subjected to singular value decomposition as before to obtain a new set of singular values. These singular values are effectively generated under the null hypothesis that there is no association between neural activity and the task. The procedure was repeated 500 times to generate a sampling distribution of singular values under the null hypothesis. Because the singular value is proportional to the magnitude of the effect, a p value is estimated as the probability that singular values from the distribution of permuted samples exceed the singular value from the original, nonpermuted data.
To determine which source saliences were reliable, their standard errors were estimated using bootstrap resampling (Efron & Tibshirani, 1986). Bootstrap samples were generated by random sampling with replacement of participants within conditions (500 replications). Saliences were deemed to be reliable if the 99% confidence interval did not include zero. Under the assumption that the bootstrap distribution is unit normal, this condition holds if and only if the absolute value of the ratio of the salience to its bootstrap-estimated standard error is greater than or equal to 2.57 (Efron & Tibshirani, 1986). Statistical maps were generated using tessellation-based linear interpolation to estimate bootstrap ratios for each voxel. Bootstrap ratios were thresholded across all data points to allow parsimonious identification of latent variables.
Inter-event Time Distributions
Interevent time distributions were fitted with thirty different probability distributions, separately for each participant, condition, source, and frequency. The first finding was that the functional form of the distributions was remarkably consistent across participants and that the two-parameter gamma distribution offered the best fit of all distributions tested. Importantly, the gamma distribution proved to be a good fit at all developmental time points, indicating that the functional form of the inter-event time distribution changed little during healthy brain development. This is illustrated in Figure 5, where the inter-event time histograms and corresponding fits are displayed for five randomly selected participants in each age group, at one representative source and frequency.
However, the shape and scale of the distributions were not identical for all participants and the parameters appeared to be sensitive both to individual differences and maturation (Figure 5). In the following sections, we investigate whether the parameters of the inter-event time distributions systematically varied because of development and/or changes in cognitive processing because of face inversion. We interpret these changes in terms of the Erlang model.
To determine whether brain development was associated with systematic changes in the parameters of inter-event time distributions, we contrasted the six age groups to each other, separately for each condition. During maturation, the scale parameter θ increased at low frequencies and decreased at high frequencies, indicating slower and faster processing stages, respectively. The effect was observed for both inverted (p = .001, singular value, s = 78.66) and upright (p = .001, singular value, s = 71.15) faces, accounting for 63.45% and 58.71% of the covariance between age groups and θ. This is evident in Figure 6, where the ratios of the source saliences to their bootstrap-estimated SEs are generally positive at low frequencies and negative at high frequencies. The spatial distributions of these statistical trends at 5 and 15 Hz are shown in Figure 6. Although the age-dependent increase at low frequencies was particularly reliable in the occipital lobe, the corresponding decrease at higher frequencies was more diffuse and could be observed in most brain regions.
With increasing age the shape parameter k decreased at lower frequencies and increased at higher frequencies, indicating fewer integrative steps at lower frequencies and more integrative steps at higher frequencies. The effect can be observed in the frequency-wise transition from negative to positive bootstrap ratios at 10 Hz (Figure 7). This trend was statistically significant for both inverted (p = .001, s = 7.62) and upright (p = .001, s = 7.23) faces and accounted for 75.58% and 72.96% of the cross-block covariance in the respective runs. As with the scale parameter, the age-driven decrease in the shape parameter at low frequencies was relatively restricted to the occipital lobe, whereas the increase at higher frequencies was robust throughout the brain (Figure 7).
Development and Face Inversion
To investigate the effect of development and face processing simultaneously, runs with upright and inverted faces were contrasted across all age groups. In the youngest group, the scale parameter θ was greater for inverted compared with upright faces, but with maturation this difference became smaller and reversed, such that in adulthood θ was greater for upright faces compared with inverted faces (p = .001, s = 20.21, corresponding to 25.0% of cross-block covariance; Figure 8). This effect was primarily positively expressed in right inferior temporal cortex at 15 Hz and negatively expressed in left inferior temporal cortex at 5 Hz. Although there was some contribution from the superior aspect of the right cerebellum, this is likely to reflect spatial smearing or leakage from inferior temporal sources because of the data-driven MEG source localization (Taylor, Mills, Smith, & Pang, 2008; Itier, Herdman, George, Cheyne, & Taylor, 2006).
The shape parameter k of the fitted gamma distribution displayed a similar trend, in the sense that k was greater for upright compared with inverted faces in the youngest group, but during development this inversion effect gradually reversed so that by late adolescence inverted faces were associated with greater k than upright faces (p = .001, s = 1.79, corresponding to 23.38% of cross-block covariance; Figure 9A). This trend was reliable mainly in the anterior portion of the right parahippocampal gyrus at 25 Hz. Note however that the spatiotemporal pattern associated with this effect was only able to capture face inversion effects in the nonadult groups. In the adult group, a different spatiotemporal pattern differentiated upright and inverted faces (p = .08, s = 1.01, corresponding to 9.02% of cross-block covariance), mainly in the left inferior frontal cortex at 10 Hz (Figure 9B).
Inter-event Time Distributions
The empirical probability distribution of inter-event times most closely resembled a two-parameter gamma distribution, in concordance with previous findings (Mišić, Vakorin, Kovaević, et al., 2011). The fact that the distribution translated from resting-state EEG recordings to MEG recordings with multiple shorter trials is encouraging because they measure the same biophysical process and should display similar dynamics. Interestingly, the functional form of the distribution was the same across groups and tasks, which suggests that even under changing network conditions the integrative dynamics maintain a characteristic mode of operation. Experimental manipulation caused systematic variation in the parameters of the distribution, across regions and individuals, which we discuss in more detail in subsequent sections.
Our analyses revealed that inter-event times in human MEG display robust gamma statistics. In the following discussion, we offer an interpretation of experimentally induced nonstationarities in terms of the Erlang model and information accumulation. We stress that this is a model and that our data do not prove the existence of such a mechanism. The main advantage of the Erlang model is that it allows us to parsimoniously relate changes in gamma statistics to holistic and configural face processing over development.
Neural development was characterized by changes in both the shape and scale of the inter-event distributions. At low frequencies (<10 Hz), increasing age was marked by fewer and slower integrative steps. At higher frequencies (≥10 Hz), the opposite was true, with increasing age marked by more integrative steps that were also faster. The differential effects observed for low and high frequencies are reminiscent of previous literature, which shows that spectrum occupancy shifts during development, with decreased power in low frequencies and increased power in high frequencies (Gasser, Verleger, Bacher, & Sroka, 1988). Note that the scale and shape of a distribution are theoretically independent and the fact that one changes during development does not imply that the other will change as well. Therefore, the dissociation between low and high frequencies on one hand and scale and shape on the other suggests a gradual redistribution of communication over specific bands.
Interestingly, the low-frequency effects appeared to be more regionally specific than the high-frequency effects. In the lower-frequency regime, statistical trends were most prominent in occipital cortex, with additional contribution from superior paracentral regions in the case of the scale parameter. The fact that this particular trajectory was constrained mainly to areas with visual and motor utility suggests reconfiguration constrained by functional neuronatomy. This is not surprising, given that the stimuli used in the experimental protocol were faces, which are likely to engage the ventral visual processing stream, and that participants were required to respond manually to these stimuli. Conversely, in the higher-frequency regime statistical trends were spatially broader and could be observed throughout the brain.
Development and Face Inversion
Inter-event times were also sensitive to face inversion: the statistical analysis revealed an interaction between development and face processing, with the inversion effect becoming more pronounced with increasing age. In addition, the inversion effects identified for both parameters were reversed in the youngest groups compared with the adolescent and adult groups. This type of interaction is also observed with the latency and amplitude of the face-sensitive N170 evoked potential and the M170 evoked field. For example, the adult N170 has both a greater (more negative) amplitude and a longer-latency peak in response to inverted faces (Taylor, Bayless, Mills, & Pang, 2011; Itier & Taylor, 2002; Rossion et al., 1999, 2000; Bentin, Allison, Puce, Perez, & McCarthy, 1996). In young children, the effect is often reversed, with greater amplitudes and longer latencies in response to upright faces (Taylor et al., 2004; Taylor, Edmonds, McCarthy, & Allison, 2001). Likewise, a reversal of the face inversion effect is also observed when measuring the amplitude of the BOLD response in the fusiform gyrus (Passarotti, Smith, DeLano, & Huang, 2007).
The biological cause for the reversal in the face inversion effect during development is still poorly understood. One hypothesis is that the reversal may be the result of distinct rates of development of configural and holistic processing (Maurer, Le Grand, & Mondloch, 2002). For instance, holistic processing is thought to be developed at age 6 (Tanaka, Kay, Grinnell, Stansfield, & Szechter, 1998; Carey & Diamond, 1994), whereas configural processing does not fully develop until late adolescence (Mondloch et al., 2002; Freire & Lee, 2001). Thus, the reversal of the inversion effect during childhood may be caused by an interaction between holistic and configural processing and may signify the point in the developmental trajectory during which the two modes of processing reach a balance.
Task-dependent developmental trajectories were driven mainly by parts of inferior temporal cortex. From the perspective of the Erlang model, maturation was associated with progressively faster integrative stages for upright compared with inverted faces in left inferior temporal cortex and slower integrative stages in right inferior temporal cortex. Face perception has consistently been linked with inferior temporal cortex in general and fusiform gyrus in particular. For example, fMRI studies frequently report increased activation in fusiform gyrus in response to faces compared with other complex visual stimuli (Schiltz & Rossion, 2006; Halgren et al., 1999; Haxby et al., 1999; Kanwisher, McDermott, & Chun, 1997; McCarthy, Puce, Gore, & Allison, 1997; Puce, Allison, Gore, & McCarthy, 1995). Face inversion effects have also been reported in fMRI studies, whereby upright faces elicit greater activation compared with inverted faces (Yovel & Kanwisher, 2005; Haxby et al., 1999; Kanwisher, Tong, & Nakayama, 1998). Increasing lateralization of brain functions with age is well documented, and ERP studies have shown less right-sided dominance for face processing in children (Taylor et al., 2001; Taylor, McCarthy, Seliba, & Degiovanni, 1999). Face processing skills are widely reported to improve throughout childhood and into adulthood (Baudouin et al., 2010; Taylor et al., 2004; Mondloch et al., 2002; Kolb, Wilson, & Taylor, 1992; but see also Crookes & McKone, 2009), so the opposing trends in right and left fusiform gyri may indicate that the right fusiform gyrus becomes increasingly specialized for faces (Passarotti et al., 2007).
The possibility that information about upright faces may be integrated in a slower, more deliberate manner in the right fusiform gyrus is consistent with the notion that faces are perceived in terms of the configuration and relative positions of facial features (configural processing), as well as a coherent whole (holistic processing; Maurer et al., 2002). Both accounts suggest that face perception depends on intensified interareal integration to synthesize a complex percept and our data support this premise, as we found that face perception involves longer integrative steps, as measured by the scale parameter. Similarly, if the relations among facial features are disrupted by inversion, the individual features are perceived as separate parts and are not integrated into a unified whole. As a result, inverted faces elicit neural activity with shorter integrative steps, which we report in this study.
According to our model, maturation was associated with an increasing number of integrative steps for inverted compared with upright faces in right parahippocampal gyrus, an area where activation is commonly observed in face recognition and face imagery paradigms for adults (Chua, Schacter, Rand-Giovannetti, & Sperling, 2007; Powell et al., 2005; Bernard et al., 2004; Ishai, Haxby, & Ungerleider, 2002), as well as face recognition in children (Taylor, Mills, & Pang, 2011). Thus, the prominent interaction between development and face inversion displayed by this region may be because of the fact that participants had to judge whether each stimulus had appeared on the previous trial. This result points to an increasing involvement of right parahippocampal structures in face recognition.
Our data also revealed a salient transition from childhood/adolescence to adulthood. In adulthood, when inverted faces were presented, neural activity in left inferior PFC was associated with additional integrative steps—an effect that was not observed in any other group. A number of studies have noted that this region is sensitive to situations in which face–name associations are formed (Tsukiura & Cabeza, 2008; Sperling et al., 2003; Herholz et al., 2001) as well as face recognition (Bernard et al., 2004). This suggests that development of face processing skills may culminate in the emergence of left inferior PFC as a unique conduit in the fully mature brain that facilitates the recognition of inverted faces via additional integrative stages.
The fact that inverted faces elicited additional integrative stages further supports the notion that they are perceived “by parts.” If each feature of an inverted face is perceived separately, then more discrete stages of integration should be required compared with upright faces, which are consolidated and perceived holistically. This is consistent with the longer latencies widely reported in older children and adults for inverted faces (Taylor, Bayless, et al., 2011; Taylor et al., 2001; Rossion et al., 1999, 2000). To our knowledge, ours is the first study to operationalize information integration and to demonstrate differences in how upright and inverted faces are perceived from this perspective.
Why were face inversion effects for the rate of integration localized to inferior temporal cortices, whereas the effects for the number of integrative steps were localized to “higher” areas, such as the parahippocampal gyrus and PFC? We speculate that this may reflect the hierarchical positions of these regions in the context of object recognition. At the level of inferior temporal cortex, representations of complex visual stimuli such as upright faces and features of inverted faces (e.g., eyes, mouth, etc.) are formulated by integrating low-level visual features from the receptive fields of neurons in striate cortex. Compared with an inverted face, the representation of an upright face must be endowed with additional information about the configuration of individual features, as well as their overall, holistic arrangement, resulting in longer integrative steps.
This study used a working memory recognition task that required participants to compare the representation of each presented stimulus to the working memory representation of the stimulus presented in the previous trial. Therefore, the involvement of parahippocampal gyrus and PFC may reflect this secondary cognitive requirement. For upright faces, the comparison is presumably made between two holistic representations. Conversely, inverted faces necessitate a piecemeal comparison between multiple parts of each face, resulting in a greater number of integrations.
The strength of our analysis depends on two key questions. First, is the present method of delineating output events physiologically valid? A long-standing assumption is that the basic units of information transfer in the CNS are action potentials transmitted by single neurons, whereas our method is based on electromagnetic recordings that reflect the activity of entire populations of neurons. Therefore, peaks in MEG signal power obviously do not carry information in the same way as action potentials. Nevertheless, neuron soma constitute the gray matter nodes of brain networks and in that sense, postsynaptic potentials represent outputs from these nodes. Peaks in signal power are relevant output events because they originate from synchronized post-synaptic potentials.
Second, is our interpretation of gamma-distributed inter-event times the most plausible? There exist a number of attractive conceptual alternatives based on diffusion models. For example, in cognitive psychology RTs can be modeled as biased random walkers (biased by incoming information) crossing a “threshold” boundary (Van Zandt & Ratcliff, 1995; Ratcliff, 1979). Diffusion is also commonly used to model spiking in single neurons (Codling, Plank, & Benhamou, 2008; Smith & Ratcliff, 2004). In integrate-and-fire models, interspike intervals can be thought of as the time required for a random walker to reach an absorbing boundary. In the context of this experiment, it may be possible to formulate such a diffusion model, parameterized by starting position and bias (i.e., rate of information flow). Given the nature of our data, both integration and diffusion seem to be plausible generative mechanisms and the way in which diffusion models the accumulation of information is similar in spirit to the integrative process we have focused on.
By isolating local maxima in the scalogram and treating the time between successive events as a stochastic process, we have been able to model information integration. We have shown that nonstationarities of inter-event times may be meaningfully related to specific cognitive processes and their development. We have shown that this model can help to relate the development of face processing skills (and how they are disrupted by face inversion) to information processing and integration. Our model suggests that inversion may fragment face processing, resulting in a greater number of shorter integrative episodes. Our data shed new light and provide further evidence for the well-known holistic and configural accounts of face processing.
Reprint requests should be sent to Bratislav Mišić, 3560 Bathurst St., Toronto, Ontario, Canada, M6A 2E1, or via e-mail: firstname.lastname@example.org.