One fifth of neurons in the medial-temporal lobe of human epilepsy patients respond selectively to categories of images, such as faces or cars. Here we show that responses of hippocampal neurons are rapidly modified as subjects alternate (over 60 sec) between two tasks: (1) identifying images from a category, or (2) playing a simple video game superimposed on the same images. Category-selective responses, present when a subject identifies categories, are eliminated when the subject shifts to playing the game for 87% of category-selective hippocampal neurons. By contrast, responses in the amygdala are present during both tasks for 72% of category-selective amygdalar neurons. These results suggest that attention to images is required to evoke selective responses from single neurons in the hippocampus, but is not required by neurons in the amygdala.
A key part of human conscious experience is the ability to flexibly deploy cognitive resources to analyze and react to different aspects of the environment, that is, to select different aspects of the world and attend to them. Such selection functions to filter the large amount of sensory information which is available at any instant in time, choosing that small subset which is important to guide behavior, and emphasizing it for further evaluation (Pashler, 1998, Chap. 1). Historically, the mechanisms underlying attentional selection have been envisioned as a specific processing bottleneck, either early (Treisman, 1969; Broadbent, 1958) or late in sensory processing (Norman, 1968; Deutsch & Deutsch, 1963), which limits the information reaching awareness. Recent theoretical accounts have advocated more distributed processes and limitations (Lavie, 2005; Yantis & Johnston, 1990), and these are consonant with neurophysiological recordings showing that attention affects both subcortical and cortical neural responses (Kastner & Pinsk, 2004; Shipp, 2004).
Our current knowledge of attentional mechanisms has been derived largely from noninvasive imaging (Haxby, Courtney, & Clark, 1998) and electrophysiological studies in human subjects (Luck & Girelli, 1998), as well as single-neuron recordings in nonhuman primates (Maunsell & Treue, 2006; Motter, 1998; Luck, Chelazzi, Hillyard, & Desimone, 1997; Moran & Desimone, 1985; Andersen & Mountcastle, 1983). Both noninvasive imaging and scalp recordings indicate those brain areas whose average inputs or activity change as attention is shifted. There are, of course, many patterns of change of single-neuron activity which produce the same average change in an area, or are produced by the same average change in inputs. This study was designed to extend knowledge of attentional mechanisms in human subjects derived from imaging studies by examining changes in single-neuron firing as subjects shifted between a task which focused attention on visual images and a task not involving the images.
Single-neuron activity can also be recorded in nonhuman primates, and indicates the single-neuron correlates of the types of attentional shifts for which these animals can be trained over several months. For human subjects, new tasks with different attentional demands can be learned in 1 min following verbal instruction. The experiments reported here were designed to extend primate recordings by measuring changes in single-neuron firing within the human brain during overt attentional shifts in a newly acquired task.
During task performance, we recorded the firing activity of single neurons within the medial-temporal lobe (MTL). Previous studies reported category-selective visual responses within several MTL structures: the hippocampus, the parahippocampal gyrus, the amygdala, and the entorhinal cortex (Kreiman et al., 2000; Fried, MacDonald, & Wilson, 1997). Here we focus on the influence of selective attention on category-selective responses of single neurons in two of these structures, the hippocampus and the amygdala.
The hippocampus was of particular interest given its reciprocal connections to other brain areas involved in high-level visual processing (Felleman & Van Essen, 1991), as well as its involvement in declarative memory for items and events (Viskontas, Knowlton, Steinmetz, & Fried, 2007; Squire, 2004; Cohen et al., 1999; Squire & Zola, 1996). The amygdala was of interest given its role in the emotional evaluation of faces (Adolphs, Tranel, & Buchanan, 2005; Zald, 2003; Oya, Kawasaki, Howard, & Adolphs, 2002), its potential involvement in a rapid subcortical network for face processing (Pessoa, Japee, Sturman, & Ungerleider, 2006; Whalen et al., 2004; Compton, 2003), as well as its involvement in declarative and nondeclarative memory for emotional events (Adolphs et al., 2005; Phelps, 2004; Richter-Levin, 2004).
By examining changes in single-neuron firing as subjects shifted between two tasks over short time intervals, this design examines how the visual responses of single neurons change as human subjects shift attention in a relatively natural way. When subjects attended to images, we found category selectivity in one fifth of hippocampal and amygdalar neurons, in agreement with a previous report (Kreiman et al., 2000). When attention was shifted between tasks, we found two distinct classes of cellular responses: The category selectivity of neurons in the hippocampus was strongly inhibited by shifting to game play, whereas the category selectivity of neurons in the amygdala was largely unaffected.
Subjects were patients with pharmacologically resistant epilepsy. Extensive noninvasive evaluation did not yield concordant data corresponding to a single epileptogenic focus, and therefore, the patients were stereotactically implanted with up to 10 chronic intracranial depth electrodes for 1 to 2 weeks to determine the origin of their seizures for possible surgical resection. Through the lumen of the electrodes, up to eight platinum–iridium microwires (40 μm diameter) were inserted (Babb, Carr, & Crandall, 1973). All studies conformed with the guidelines of the Medical Institutional Review Board at UCLA. Recordings were obtained from 10 subjects (6 men; 9 right-handed; 17 to 44 years old). The sites of electrode implantation were based exclusively on clinical criteria, and the location was verified by structural magnetic resonance imaging. Individual microwires extended 4 mm from the tip, lying in a cone with an opening angle of less than 45°.
The activity of single neurons was isolated from the electric potential recorded between microwire tips and a nearby reference microwire. This difference was amplified (×10,000) and band-pass filtered (278–5000 Hz) and possible action potential events were isolated using a fixed threshold. The events were sorted to represent the activity of single neurons using the Spiker program (http://ramonycajal.mit.edu/kreiman/academia/spike_sorting.html). Only clusters of events with an initial depolarization followed by a hyperpolarization, minimal repetitions at power line harmonic frequencies, and a decrease in the fraction of interspike intervals near zero delay, were considered to arise from single neurons and were included in further analysis. Each cluster thus contains similar events over the entire recording session and across the changes between tasks.
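The fixed-threshold step described above can be illustrated with a minimal sketch. The text does not give the threshold value, its polarity, or any refractory handling, so the function name `detect_threshold_crossings`, the upward-crossing rule, and the `refractory_samples` parameter are all assumptions made for illustration; this is not the procedure implemented in the Spiker program.

```python
def detect_threshold_crossings(samples, threshold, refractory_samples=0):
    """Return sample indices at which the signal rises through `threshold`.

    Hypothetical sketch: upward crossings only, with an optional dead time
    (in samples) after each detected event.
    """
    events = []
    last_event = None
    for i in range(1, len(samples)):
        # An event is a transition from below the threshold to at or above it.
        crossed = samples[i - 1] < threshold <= samples[i]
        if crossed and (last_event is None or i - last_event > refractory_samples):
            events.append(i)
            last_event = i
    return events


# Toy filtered trace (arbitrary units) with two candidate events.
trace = [0.0, 0.1, 0.9, 0.2, 0.0, 1.2, 1.1, 0.1]
print(detect_threshold_crossings(trace, threshold=0.8))  # -> [2, 5]
```

Candidate events found this way would still need to be sorted into clusters and screened by the waveform, power line, and interspike-interval criteria listed above before being attributed to single neurons.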
The experimental task involves shifting attention between two superimposed video streams, similar to the technique used by Neisser and Becklen (1975) to induce unawareness of one of the streams. One stream consisted of a sequence of images selected randomly from several categories (4 to 7 based on clinical constraints): animals, cars, emotional faces, famous people, everyday objects, patterns, and spatial outdoor scenes. The images and categories were chosen from a set shown in previous studies to yield measurable neural responses (Kreiman et al., 2000). Each image subtended about 10° visual angle and was presented for 1 sec with 500 msec of black screen in between.
While these images were presented, an outline of a video game was presented over the images or the background. The video game (Figure 1) was a modified version of “Asteroids” (Atari, 1979), where the object is to rotate the central cannon and fire at moving rocks by depressing the space bar. During each game, the subjects were asked to either: (game play) play the video game while ignoring the images presented in the background, or (picture identification) press a key when images from one specific image category (faces, or in 4 experiments, outdoor scenes) were presented while ignoring the outline display of the last game, which was automatically replayed by the computer. Each game involved presentation of 30 images and lasted 45 sec.
The subjects alternated between game play and picture identification and would play 10 games per experimental session (with some variability due to clinical constraints). The visual display during each game of picture identification was identical to the display during the preceding game play (although this ordering introduces a potential confound, prior studies show it is unlikely to affect the results here; see Discussion). Each image was presented during multiple games and so the appearance of the game was random with respect to image presentations. On average, each subject participated in two sessions, normally on separate days.
This task was designed to present the same visual stimulus in two conditions and to shift the subject's attention between the images and the game. To verify a shift, normal subjects (n = 4) were asked to simultaneously play the game and identify pictures (patients were not asked to perform the tasks simultaneously as it would be relatively taxing during their postoperative recovery). During simultaneous performance, the error rate while identifying a specific image category was 2% (average number of errors = 15), 5 times higher than when they performed picture category identification alone (average number of errors = 3) during the initial 50 games played (approximately the total number played by the patients). Fisher's exact test rejects (p < .01) the hypothesis that the error rates were the same during simultaneous performance and picture identification.
Background firing rates were determined from the 500 msec prior to image presentation and were averaged over all images. Firing during the 1000 msec of image presentation was examined for category-selective visual responses, that is, whether a neuron responded specifically to one or more image categories during either game play or picture identification. Trials during game play and during picture category identification were analyzed in separate blocks. For picture identification, both correctly and incorrectly performed trials were analyzed together because this is most similar to the analysis during game play, and because the small number of error trials (<1%) would have little effect on the overall results.
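The two analysis windows can be made concrete with a short sketch. It assumes spike times and image onset times are available in seconds; the function names `firing_rate` and `trial_rates` and the half-open window convention are illustrative choices, not details given in the text.

```python
def firing_rate(spike_times, start, stop):
    """Spikes per second within the half-open window [start, stop), in seconds."""
    count = sum(1 for t in spike_times if start <= t < stop)
    return count / (stop - start)


def trial_rates(spike_times, onset, baseline_s=0.5, response_s=1.0):
    """Return (background rate, response rate) for one image presentation.

    Background: the 500 msec before image onset.
    Response: the 1000 msec of image presentation.
    """
    background = firing_rate(spike_times, onset - baseline_s, onset)
    response = firing_rate(spike_times, onset, onset + response_s)
    return background, response


# Toy spike train around an image presented at t = 10.0 sec.
spikes = [9.6, 9.9, 10.1, 10.3, 10.4, 10.6, 10.8, 11.2]
print(trial_rates(spikes, onset=10.0))  # -> (4.0, 5.0)
```

Averaging the first value over all images would give the background rate, and grouping the second value by image category and task block would give the inputs to the selectivity tests.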
A neuron was considered to have a category-selective visual response if there was a significant effect of image category on the average firing rates during 1000 msec of image presentation (one-way analysis of variance [ANOVA], p < .05). The p value for this test (Kleinbaum, 1988) measured the degree of category selectivity. For neurons with category-selective responses, Tukey's method of honest significant differences (Kleinbaum, 1988) was used to identify those pairwise comparisons between categories which were significant.
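As a sketch of the selectivity criterion, the F-ratio of a one-way ANOVA can be computed from per-trial firing rates grouped by image category. The function name `one_way_anova_f` and the toy rates are assumptions for illustration. Converting the F-ratio to a p value requires the F distribution, which the Python standard library does not provide; a statistics package (for example, `scipy.stats.f_oneway`) could be used for that step.

```python
def one_way_anova_f(groups):
    """Return (F, df_between, df_within) for a list of groups of observations."""
    k = len(groups)
    n = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n
    means = [sum(g) / len(g) for g in groups]
    # Variation of the category means around the grand mean.
    ss_between = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, means))
    # Variation of individual trials around their own category mean.
    ss_within = sum((x - m) ** 2 for g, m in zip(groups, means) for x in g)
    df_between = k - 1
    df_within = n - k
    f_ratio = (ss_between / df_between) / (ss_within / df_within)
    return f_ratio, df_between, df_within


# Toy firing rates (sp/sec) for three image categories, three trials each.
rates_by_category = [[1.0, 2.0, 3.0], [2.0, 3.0, 4.0], [5.0, 6.0, 7.0]]
f_ratio, df_b, df_w = one_way_anova_f(rates_by_category)
print(round(f_ratio, 3), df_b, df_w)
```

A neuron would be classed as category selective when the p value associated with this F-ratio falls below .05.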
A neuron was considered to have a general response to presentation of any picture if the firing rate during 1000 msec of image presentation, averaged over all image categories, was significantly different from the average activity in the 500 msec preceding image presentation (unpaired t test, p < .05). A neuron was considered to have a change in baseline firing due to play condition if the average firing rate in the 500 msec preceding image presentation differed significantly between picture identification and game play (unpaired t test, p < .05).
In total, we recorded from 190 individual neurons in the amygdala (59 neurons), entorhinal cortex (38), hippocampus (36), parahippocampal gyrus (22), orbito-frontal cortex (13), supplementary motor area (13), and anterior cingulate gyrus (9) during 20 experiments in 10 patients. The limited number of neurons and category-selective neurons in areas other than the hippocampus and the amygdala prevents firm conclusions from being drawn about these other areas. Neurons in these other areas are discussed in the “Responses in Other Brain Areas” section, but are otherwise not analyzed further here. The average firing rate for neurons in the hippocampus and the amygdala during picture identification was 3.0 spikes/sec (sp/sec), in agreement with a previous report (Kreiman et al., 2000). There was no difference in background firing rate between brain areas (one-way ANOVA, p = .33) or between areas in the likelihood of the background firing rate changing between game play and picture identification (Fisher's exact test, p = .28).
Neurons Showing Category Selectivity
Consistent with a previous report (Kreiman et al., 2000), as the patient performed picture identification, 18 of 95 neurons (19%) showed a significant effect of picture category on the firing rate during 1 sec of image presentation (one-way ANOVA, p < .05; over three times the number expected by chance). Eleven of these selective neurons were in the amygdala (recorded from 5 patients) and seven were in the hippocampus (recorded from 6 patients). No systematic differences of category selectivity between sides of the brain containing or not containing subsequently resected areas were noted and so data from both sides were pooled by brain area.
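The comparison with chance can be worked through numerically. With 95 hippocampal and amygdalar neurons each tested at p < .05, about 4.75 would be expected to pass by chance alone. The binomial tail probability below additionally assumes that neurons are independent, which is an assumption of this sketch and not a calculation reported in the text; the function name `binomial_tail` is likewise illustrative.

```python
from math import comb


def binomial_tail(n, k, p):
    """Probability of at least k successes in n independent trials of probability p."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


n_neurons = 95   # hippocampal (36) plus amygdalar (59) neurons
n_selective = 18
alpha = 0.05

expected_by_chance = n_neurons * alpha
ratio = n_selective / expected_by_chance
tail = binomial_tail(n_neurons, n_selective, alpha)

print(f"expected by chance: {expected_by_chance:.2f}")
print(f"observed / expected: {ratio:.2f}")
print(f"P(at least {n_selective} of {n_neurons} by chance): {tail:.2e}")
```

The observed count is roughly 3.8 times the chance expectation, consistent with the statement that it is over three times the number expected by chance.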
Figure 3B shows the average responses of a separate amygdalar neuron (in a different patient) during presentation of images from the same seven categories. For this neuron, presentation of images of famous people evoked an excitatory response during both picture identification and game play.
Changes in Selectivity between Tasks
These p values can be compared to a chosen significance level (p < .05) to test whether a particular cell has a category-selective response during either picture identification or game play. In the hippocampus, seven neurons had a category-selective response (p < .05) during picture identification, whereas only one had a selective response during game play, and none had a selective response during both tasks.
In the amygdala, 11 neurons had a category-selective response (p < .05) during picture identification and 10 had a selective response during game play. Of these 10, 8 neurons also had a selective response during picture identification, thus 8 neurons had a category-selective response during both picture identification and game play. For these eight neurons, a pair of categories with a significant difference in response during picture identification (Tukey's method of honest significant differences, p < .05) also had a significant difference during game play 75% (18/24) of the time. Amygdalar neurons remain selective in both tasks and the specific categorical differences are largely the same for both tasks.
Also shown in the top panel of Figure 5 are two amygdalar neurons with very different responses among categories, creating much smaller p values than observed for other cells. For both of these neurons, the categories involving faces (emotional faces, famous people) had a very strong response compared to other categories. Although observed occasionally (2 of 95 neurons), the dataset reported here is too small to determine whether this is a separate phenomenon from category selectivity.
Attention devoted to picture identification thus has significantly different effects on category selectivity in the hippocampus and the amygdala. Fisher's exact test rejects (p < .025) the null hypothesis that the percentage of cells with a category-selective response in both picture identification and game play is unaffected by brain region (thus 8 of 59 for amygdala and 0 of 36 for hippocampus). This difference appears graphically in Figure 5. Neurons in the hippocampus have a greater tendency to lie below the diagonal, both for neurons with a significant category-selective response and those which fail to exceed the threshold F-ratio value for this test.
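The regional comparison can be reproduced from the counts given above (8 of 59 amygdalar neurons and 0 of 36 hippocampal neurons selective in both tasks). The sketch below uses the common two-sided definition of Fisher's exact test, summing the probabilities of all tables with the same margins that are no more likely than the observed one; the text does not state which two-sided convention was used, so this choice and the function name `fisher_exact_two_sided` are assumptions.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact test p value for the 2x2 table [[a, b], [c, d]]."""
    row1, row2, col1 = a + b, c + d, a + c
    total_tables = comb(row1 + row2, col1)

    def prob(x):
        # Hypergeometric probability of x counts in the top-left cell.
        return comb(row1, x) * comb(row2, col1 - x) / total_tables

    lowest = max(0, col1 - row2)
    highest = min(row1, col1)
    p_observed = prob(a)
    # Small tolerance so that ties with the observed table are included.
    return sum(
        prob(x)
        for x in range(lowest, highest + 1)
        if prob(x) <= p_observed * (1 + 1e-9)
    )


# Rows: amygdala, hippocampus. Columns: selective in both tasks, not selective in both.
p_value = fisher_exact_two_sided(8, 59 - 8, 0, 36 - 0)
print(f"p = {p_value:.4f}")
```

Under these assumptions the p value falls between .02 and .025, in line with the reported p < .025.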
Background Firing Rate vs. Task Performed
When the firing rates during game play and picture identification were compared for each neuron, 37 of 95 (39%) neurons showed a significant change (t test, p < .05) in background firing rate between these conditions. Eighteen of these (49%) showed an increase in baseline firing during picture identification and 19 (51%) showed a decrease.
Relative Responses to Categories vs. Task Performed
As shown above, changes in category selectivity between picture identification and game play may be due to either increased or decreased firing for particular image categories. Because background firing may also change between task types, an increased firing rate during game play may represent an increased or decreased effective response, depending on whether that firing rate was an increase or decrease relative to the background rate; similarly for a change which is a decreased firing rate during game play.
To better understand these response changes, the relative response for a category was defined as the ratio of the average firing rate during presentation of images from the category to the baseline firing rate for all images during the same task. As noted in Figure 3A, this ratio is defined for each image category and task type, producing a ratio, ρpid during picture identification trials, and ρgp during game play trials, for each neuron and image category. The difference of the logs of these ratios, Δ = log(ρpid) − log(ρgp) then provides a measure of the magnitude and direction of the change in relative response.
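The definition of Δ can be illustrated with a short sketch. The base of the logarithm is not specified in the text, so the natural logarithm is assumed here; the function names are illustrative, and the sketch does not handle zero firing rates, for which the ratio or its logarithm is undefined.

```python
from math import log


def relative_response(category_rate, baseline_rate):
    """Ratio of the firing rate for one image category to the task's baseline rate."""
    return category_rate / baseline_rate


def delta_relative_response(rate_pid, baseline_pid, rate_gp, baseline_gp):
    """Delta = log(rho_pid) - log(rho_gp), using natural logarithms (assumed)."""
    rho_pid = relative_response(rate_pid, baseline_pid)
    rho_gp = relative_response(rate_gp, baseline_gp)
    return log(rho_pid) - log(rho_gp)


# Toy neuron: the category doubles firing over baseline during picture
# identification but leaves it at baseline during game play.
delta = delta_relative_response(rate_pid=6.0, baseline_pid=3.0, rate_gp=3.0, baseline_gp=3.0)
print(round(delta, 3))  # -> 0.693
```

A positive Δ indicates a larger relative response during picture identification, a negative Δ a larger relative response during game play, and a value of zero no change. Because each rate is divided by the baseline for its own task, a shift in background firing between tasks does not by itself change Δ.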
In addition to category-selective responses, a minority of neurons showed a change in firing relative to baseline for presentation of an image from any category—a generalized response. Twenty percent of MTL neurons (19 of 95) showed a significant response during picture identification, 18% (17 of 95) showed a response during game play, 15 of these neurons also having a response during picture identification. Of the 21 neurons having a significant generalized response during either picture identification or game play, 5 had a category-selective visual response in either condition. Having a category-selective visual response and having a general response to image presentation appeared to be independent effects (Fisher's exact test, p = .76). Having a category-selective visual response also appeared independent of a shift in baseline firing rate (Fisher's exact test, p = .14).
Responses in Other Brain Areas
As noted earlier, limited numbers of neurons and category-selective neurons were recorded from brain areas other than the hippocampus and the amygdala. A primary distinction of interest here is whether a brain area has a fraction of neurons retaining category selectivity between tasks which is different from the amygdala, where the largest number of neurons was recorded. For the hippocampus, this fraction was significantly different from that found in the amygdala (Fisher's exact test, p < .025).
| Brain Area | No. Selective in Picture Identification | No. Selective in Game Play | No. Selective in Both Tasks | Total No. Recorded |
|---|---|---|---|---|
| Supplementary motor area | 3 | 0 | 0 | 13 |
One interpretation of this result is that the entorhinal cortex is acting similarly to the hippocampus, with few neurons that retain their category selectivity during both tasks. In addition, considering the entorhinal cortex to provide inputs to the hippocampus (Burwell & Witter, 2002), category selectivity is already modulated by task at the inputs to the hippocampus. A difficulty with this interpretation is the small number of neurons (3) which are category selective during picture identification.
An alternative explanation would be that there are few category-selective neurons in the input and that the hippocampus generates category selectivity, which is appropriate to the task being performed. Although the fraction of entorhinal cortical neurons which are category selective during picture identification is low compared to a previous report (Kreiman et al., 2000 reported 25 of 153, 16.3%, of entorhinal cortical neurons to be category selective), that fraction is not significantly different (Fisher's exact test, p > .05) from the data reported here. Thus, it also appears reasonable to interpret these results for the entorhinal cortex as having recorded fewer category-selective neurons by chance, and then having found none with category selectivity during both tasks, as was the case in the amygdala in a larger number of neurons. Reliably distinguishing between these interpretations, and understanding the broader relationship of these brain areas to one another, will require collection of the responses of a larger number of neurons in these brain areas as tasks are varied.
Regarding the brain areas which failed to show a difference from the amygdala in the fraction of neurons with category selectivity during both tasks, the power to detect these differences is limited due to the limited number of cells recorded. For example, a post hoc power analysis of Fisher's exact test (Erdfelder, Paul, & Buchner, 1996), testing for a change in the fraction of neurons which retain selectivity across tasks from the 0.14 observed in the amygdala to 0.05 or less, with a significance level of .05, shows that the power achieved for a comparison to the brain area with the next largest number of cells, the entorhinal cortex with 38 neurons, is only 3.7%. Thus, it is not possible to conclude that failure to observe a difference in this fraction between brain areas indicates the effect is not present; rather, it may not have shown itself due to chance.
No Response to Planning Key Presses
In agreement with previous recordings of single-neuron activity in human epilepsy patients, we have assumed that pressing a key does not, by itself, create responses in the human hippocampus and the amygdala (Reddy, Quiroga, Wilken, Koch, & Fried, 2006; Rutishauser, Mamelak, & Schuman, 2006; Kreiman et al., 2000; Fried et al., 1997). As an additional check, because the subject is pressing keys more rapidly during game play than during picture identification, we compared firing during 500 msec before and after each keypress for 12 selective neurons (randomly chosen from the 21 selective neurons). Only one neuron showed a significant change in firing rate before and after key depression greater than 0.05 sp/sec, and that neuron showed a change of 0.1 sp/sec during the response to the category of emotional expressions, which evoked responses greater than 3 sp/sec. This control shows that responses are not related simply to planning the keypress (which should result in a response prior to executing the keypress), although it does not exclude more complex forms of motor responses.
Magnitude of Eye Movements during the Two Tasks
Another difference between game play and picture identification is that the subjects may move their eyes more during game play, as there are targets in the periphery. Measurement of eye movements in normal subjects performing the same tasks (ISCAN II, Burlington, MA; frame rate = 60 Hz) shows that the standard deviation of eye position (in degrees from the line perpendicular to the display screen) is 5° during game play (thus from the center of the image to its edge) and 1° during picture identification. Possible implications of this difference are described in the Discussion.
A key finding in the present study is the different effect of changing tasks on single neurons in the human hippocampus and the amygdala. For neurons in the hippocampus, a sizable fraction (19%) has a category-selective visual response when the subject performs picture identification but this is virtually eliminated (15% of those remaining) when the subject shifts to game play. By contrast, for the majority (62%) of amygdala neurons with a category-selective response, the selectivity is present during both picture identification and game play.
This finding argues for the hippocampus processing only those visual stimuli to which the subject is attending. In terms of cognitive models, such as the hybrid model of Vogel, Luck, and Shapiro (1998), the hippocampus would reside in a stage beyond attentive filtering, and only those stimuli which the subject was attending would be available for forming durable and reportable memories (Squire, 2004; Cohen et al., 1999; Squire & Zola, 1996). As shown in separate experiments in normal subjects, subjects are largely unaware of unattended images in a pair of superimposed video streams (Jenkins, Lavie, & Driver, 2005; Wright, Katz, & Hughes, 1993; Rock & Gutman, 1981; Neisser & Becklen, 1975).
The preservation of responses in the amygdala, despite the demands of the task being performed, argues in favor of the amygdala providing preattentive evaluation of the images. This would be consistent with the amygdala being involved in a subcortical network for face recognition (Pessoa et al., 2006; Whalen et al., 2004; Compton, 2003), and with the amygdala providing an emotional assessment of stimuli (Adolphs et al., 2005; Zald, 2003; Oya et al., 2002). More generally, these neurons in the amygdala could serve to draw the attentional focus to important items in the environment.
Insofar as these recordings were performed in human epilepsy patients, there is always the question whether the presence of the disease might account for observed findings. In both these and prior studies (Kreiman et al., 2000), we have not observed a difference in visual response properties between subsequently resected brain areas and areas within the other hemisphere. Although this argues that these responses are representative of the operation of the normal human brain, these are recordings in epilepsy patients whose brains have a significant neurological illness and have been treated over decades with antiepileptic medications, which must qualify the conclusions drawn here. Nonetheless, this is presently the only method of recording single-neuron activity in human subjects which is employed in a sufficient number of cases to generate a reasonable dataset. Response differences between separate parts of the hippocampus (anterior vs. posterior) and the nuclei of the amygdala would be of particular interest, but the number of cells recorded in this study does not permit separation of these groups.
Recent functional magnetic resonance imaging studies suggest that processing in the amygdala is modulated as subjects shift their attention from emotional faces to an orientation discrimination task (Pessoa, Kastner, & Ungerleider, 2002). The exact relationship between the blood oxygenation level-dependent (BOLD) signal and single-unit activity is still poorly understood (Nair, 2005; Logothetis, 2002), although BOLD likely reflects the average inputs or firing activity in a brain area. Considered together with these single-neuron recordings, one possibility is that neurons in the amygdala provide evaluation and firing output which is independent of the visual items attended, even when their inputs vary, creating changes in the BOLD signal.
Because images were always presented in the background during game play prior to being presented as the objects of attention in picture identification, one must consider whether this ordering, given reported dependence of responses in the human MTL on the sequence of presentation (Rutishauser et al., 2006; Fried et al., 1997), could explain the results reported here. Rutishauser et al. (2006) described neurons with increased firing in response to either new or old stimuli, in roughly equal proportions. Such firing would not predict a differential response to categories because each image in each category was presented once as a new image. Rutishauser et al. also did not report any differences between neurons in the amygdala and the hippocampus, a central finding here.
Changing between picture identification and game play involves an overt shift of attention: Both the position of the eyes and the visual objects fixated change. This raises a specific explanation of the changes in hippocampal visual responses, namely, that they could be caused by changes of object position in retinotopic space, and consequently, by changes in position within the hippocampal receptive fields. Under this interpretation, greater eye movements during game play decrease category-selective responses because the images fall outside a receptive field with a central on response (Nowicka & Ringo, 2000; Sobotka, Nowicka, & Ringo, 1997; Sobotka & Ringo, 1996). The observed response changes between tasks would then be a result of an overt shift of attention. The response ratios shown in Figure 6 are not consistent with this interpretation, however, because roughly equal numbers of responses to individual categories increase and decrease when the video game is played. Additionally, the eye movements during game play (standard deviation of 5°) are smaller than the reported widths of primate hippocampal receptive fields (on the order of 20°) (Nowicka & Ringo, 2000).
The attentional shift between game play and picture identification is separate from spatially selective visual attention, such as demonstrated by covert shifts of attention, where the subject maintains eye fixation. It is a general shift between tasks, where each task may create different levels of arousal or vigilance. There is no benefit to the subject in directing any attention to the nonpertinent task; indeed, this would hinder performance in the task which they are currently performing. We thus assume that a majority of attention is directed to the current task.
The recent report of failure of human MTL neurons to respond to images when changes in the images are not detected (Reddy et al., 2006) lends strong support to the view that the response changes described here are due to shifts of attention, if one assumes that a failure to detect a change occurs because it was not attended. That article did not describe differences between the hippocampus and the amygdala, the central finding here. This may be due to differences between change detection and the shifts of selective attention employed here, or due to the smaller number of neurons reported in that article.
The changes in responses in the hippocampus and the amygdala described here, considered both on a single-neuron and population basis, directly demonstrate changes of neuronal activity due to a rapidly and flexibly acquired task at the single-cell level in the human hippocampus, and a contrast between how single neurons in the human hippocampus and the amygdala respond to changes in visual objects which are attended.
I thank the patients for participating in this study and Tony Fields, Itzhak Fried, Eve Isham, and Leila Reddy for technical assistance. I also thank William Banks, Christof Koch, Gabriel Kreiman, Ernst Niebur, Patrick Wilken, and Charles Wilson for useful and informative discussions. This study was supported by the McDonnell-Pew Program in Cognitive Neuroscience (JSMF #20002058).
Reprint requests should be sent to Peter N. Steinmetz, Harrington Department of Bioengineering, Arizona State University, MC 9709, Tempe, AZ 85287, or via e-mail: Peter.Steinmetz@asu.edu.