Abstract

What do we learn when we practice a simple perceptual task? Many studies have suggested that we learn to refine or better select the sensory representations of the task-relevant dimension. Here we show that learning is specific to the trained structural regularities. Specifically, when this structure is modified after training with a fixed temporal structure, performance regresses to pretraining levels, even when the trained stimuli and task are retained. This specificity raises key questions as to the importance of low-level sensory modifications in the learning process. We trained two groups of participants on a two-tone frequency discrimination task for several days. In one group, a fixed reference tone was consistently presented in the first interval (the second tone was higher or lower), and in the other group the same reference tone was consistently presented in the second interval. When, following training, these temporal protocols were switched between groups, performance of both groups regressed to pretraining levels, and further training was needed to attain postlearning performance. ERP measures, taken before and after training, indicated that participants implicitly learned the temporal regularity of the protocol and formed an attentional template that matched the trained structure of information. These results are consistent with Reverse Hierarchy Theory, which posits that even the learning of simple perceptual tasks progresses in a top–down manner and hence can benefit from temporal regularities at the trial level, albeit at the potential cost that learning may be specific to these regularities.

INTRODUCTION

One of the major goals of recent studies in perceptual learning is to map between behavioral modifications and the underlying anatomical sites. This has prompted many studies to characterize the stimulus specificities of the learning process and to inquire whether these specificities match the nature of representations, which are typically probed by tuning curves of single neurons at early sensory cortices (e.g., Sabin, Eddins, & Wright, 2012; Jeter, Dosher, Liu, & Lu, 2010; Spang, Grimsen, Herzog, & Fahle, 2010; Li, Polat, Makous, & Bavelier, 2009; Van Wassenhove & Nagarajan, 2007; Amitay, Hawkey, & Moore, 2005; Seitz & Watanabe, 2005; Yu, Klein, & Levi, 2004; Adini, Sagi, & Tsodyks, 2002; Demany & Semal, 2002; Dorais & Sagi, 1997; Ahissar & Hochstein, 1993, 1996; Levi & Polat, 1996; Polat & Sagi, 1994; Poggio, Fahle, & Edelman, 1992; Karni & Sagi, 1991; see, for a review, Wright & Zhang, 2009). Findings indicate that the early sensory cortices (Schwartz, Maquet, & Frith, 2002; Recanzone, Schreiner, & Merzenich, 1993) and even subcortical sites (Tzounopoulos & Kraus, 2009; Russo, Nicol, Zecker, Hayes, & Kraus, 2005) show training-induced plasticity, although this plasticity may not be easy to induce (Yang & Maunsell, 2004).

Nevertheless, the bottleneck to naïve performance, even of simple perceptual tasks, probably does not reside in the width of the tuning curves of neurons in low-level sensory areas (i.e., the resolution of bottom–up, stimulus-dictated neural activity). Nor can it be attributed to a lack of experience with the specifically trained procedure (Hawkey, Amitay, & Moore, 2004). Although it is clear that the broader context within which training occurs makes a difference, the specific contribution of this context to the learning process and the specificity of learning to this context have received scant attention.

The Reverse Hierarchy Theory (RHT) of perceptual learning (Ahissar, Nahum, Nelken, & Hochstein, 2009; Ahissar & Hochstein, 1997, 2004; Hochstein & Ahissar, 2002) provides a conceptual framework for addressing this question. It argues that perception and hence learning progress in a top–down manner. Thus, as a default strategy, our perceptual system first detects more general structural relations, based on broad, high-level representations, and only then gains access to lower-level representations with refined stimulus specificities. These broad representations provide the context within which subsequent learning takes place. Thus, both the dynamics and the specificity of perceptual learning are expected to depend on broad structural regularities as well as on more local stimulus consistencies (Ahissar et al., 2009). The impact of stimulus regularities on learning dynamics has been investigated in several studies (e.g., Nahum, Daikhin, Lubin, Cohen, & Ahissar, 2010; Nahum, Nelken, & Ahissar, 2008; Yu et al., 2004; Nagarajan, Blake, Wright, Byl, & Merzenich, 1998). However, the impact of structural regularities on the nature of learning and its generalization has not been studied (though see Green & Bavelier, 2003).

To examine the impact of structural regularities, we trained two groups of participants on a two-tone frequency discrimination task (Nahum et al., 2010). Although we presented two tones in each trial, the protocol we used differed from the typical 2-alternative-forced-choice (2AFC) paradigm. In the typical 2AFC paradigm, the reference tone, presented either first or second, is always lower. By contrast, here we used a fixed position for the reference tone, either always first or always second. The nonreference tone, presented at the other position, was either higher or lower. Therefore, unlike the standard 2AFC protocol, the reference tone itself did not indicate the correct response. The informative position, which uniquely dictated the response on each trial, was that of the nonreference tone; that is, the second in the Reference 1st protocol and the first in the Reference 2nd protocol.

Studies of ERP measures during performance of these protocols (Nahum et al., 2010) confirm that listeners are sensitive to their structural differences. Specifically, the temporal position of the ERP component P3, which indicates that a perceptual decision had been made (Polich, 2007; Nieuwenhuis, Aston-Jones, & Cohen, 2005; Mecklinger & Ullsperger, 1993; Donchin & Coles, 1988; Verleger, 1988), followed the informative (nonreference) tone. Thus, P3 was induced ∼300 msec after the second tone in Reference 1st and ∼300 msec after the first tone in Reference 2nd. This coupling meant that, when the reference tone was presented second, participants made an implicit decision even before it was presented. Thus, listeners implicitly compared the first tone with a recently formed internal reference, using the regularity of the position of the informative tone. This implicit decision process was not accompanied by explicit awareness or by an explicit response. Participants' response (button press) followed the second tone of each trial.

The Nahum et al. (2010) study indicated that participants quickly utilized regularity and adapted their implicit strategy accordingly. However, their data did not include an examination of the role of regularity in long-term training. Regularity may be important for the naïve performer and may facilitate initial learning, although it may decline in importance for the trained performer (e.g., Kuai, Zhang, Klein, Levi, & Yu, 2005). Alternatively, regularity may dictate a specific learning process, where improvement is limited to the trained regularity. A “gating” mechanism is implicitly assumed when perceptual learning tasks are administered, typically with the same protocol throughout the training process, as a means to achieve generalized perceptual improvement (e.g., Polat, Ma-Naim, Belkin, & Sagi, 2004). However, if learning proceeds backwards, as proposed by the RHT, the high-level regularities that characterize the trained protocol may dictate the specific pattern of subsequent improvement and limit it to stimuli presented with the trained structural context. The current study was designed to test which of these scenarios has more explanatory weight.

We trained two groups of participants. One group trained with Reference 1st and the other trained with the same reference tone (1000 Hz) presented second on every trial (Reference 2nd). After each group had been trained on its protocol, the protocols were switched between groups, and the groups continued training, allowing us to assess whether learning was structure specific. After practicing with both protocols, both groups were tested with a combined protocol where the reference tone was presented first on odd trials and second on even trials. We examined whether separate training with each of the two protocols transferred to the combined protocol. In addition, we measured ERPs during task performance before and after training to characterize training-induced modifications. Specifically, we tested whether ERP components that reveal implicit detection of protocol structure are strengthened through training. For this purpose, we recorded the task dependent contingent negative variation (CNV), a negative deflection that begins ∼150–200 msec before an expected informative target stimulus (Tecce, 1971, 1972), and P3, which follows the informative tone. We reasoned that if listeners increase their sensitivity to the structure of the trained protocol, perhaps by forming an attentional template that matches the trained regularity, the magnitude of these components should increase. In addition, we assessed whether the faster obligatory N1 (a negative deflection that peaks ∼100 msec after the onset of an auditory stimulus) and P2 (a positive deflection that peaks ∼200 msec after the onset of an auditory stimulus) components are affected by training, as suggested by previous multisession training studies (e.g., Orduña, Liu, Church, Eddins, & Mercado, 2012; Tong, Melara, & Rao, 2009; Atienza, Cantero, & Dominguez-Marin, 2002; Tremblay & Kraus, 2002).

METHODS

Participants

A group of 18 individuals (mean age = 23 ± 2.5 years, 10 women) participated in a multisession training procedure. All participants were naïve performers of psychoacoustic tasks, and none had more than 1 year of formal musical training with a tonal instrument. Participants were recruited via advertisements at the Hebrew University. The study was approved by the Departmental Ethics Committee. Informed consent was obtained from all participants, and all were paid standard student fees for participation.

The Two-tone Frequency Discrimination Task

All participants practiced two-tone frequency discrimination, in which they were asked to indicate which of two sequentially presented 50-msec tones had a higher pitch. This task involved the following protocols:

  • Reference 1st: The first tone had a fixed frequency of 1000 Hz, and the second tone had either a higher or a lower frequency (illustrated in Figure 1A, left).

  • Reference 2nd: The first tone was either higher or lower than the second tone, which had a fixed frequency of 1000 Hz (illustrated in Figure 1A, middle).

  • Combined: Odd trials had Reference 1st structure, and even trials had Reference 2nd structure (illustrated in Figure 1A, right).
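
The three protocols listed above can be summarized in a short sketch. This is an illustration only, not the software used in the study: the function names, the protocol labels, and the assumption that the frequency difference is expressed as a percentage of the 1000-Hz reference are ours.

```python
REFERENCE_HZ = 1000.0  # fixed reference frequency given in the text


def make_trial(protocol, diff_percent, higher, trial_number=1):
    """Return (first_tone_hz, second_tone_hz) for one trial.

    protocol: "reference_1st", "reference_2nd", or "combined" (hypothetical labels)
    diff_percent: frequency difference, assumed to be in percent of the reference
    higher: True if the variable (informative) tone lies above the reference
    trial_number: 1-based trial index, used only by the combined protocol
    """
    sign = 1.0 if higher else -1.0
    variable_hz = REFERENCE_HZ * (1.0 + sign * diff_percent / 100.0)
    if protocol == "combined":
        # Odd trials follow Reference 1st, even trials follow Reference 2nd.
        protocol = "reference_1st" if trial_number % 2 == 1 else "reference_2nd"
    if protocol == "reference_1st":
        return (REFERENCE_HZ, variable_hz)
    if protocol == "reference_2nd":
        return (variable_hz, REFERENCE_HZ)
    raise ValueError(f"unknown protocol: {protocol}")


def correct_response(first_hz, second_hz):
    """Interval (1 or 2) that contains the higher tone."""
    return 1 if first_hz > second_hz else 2
```

In this sketch the reference tone alone never determines `correct_response`; only the tone at the other position does, which is the property that distinguishes these protocols from the typical 2AFC design.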

Figure 1. 

Methodology. (A) Schematic illustrations of the three protocols: (left) Reference 1st, blue bars denote the 1000-Hz tone, which is presented first in all trials; (middle) Reference 2nd, red bars denote the 1000-Hz tone, which is consistently presented second; (right) Combined protocol, formed by interleaving the two trial types. (B) The frequency difference (%) between the informative variable tone and the reference tone in the block of 400 trials of the combined protocol. Odd trials (blue) had Reference 1st structure and even trials (red) had Reference 2nd structure. The same tone sequences, presented separately (although with 100 added trials to each protocol), were also used in the Reference 1st and Reference 2nd ERP protocols: Initial frequency difference was large (20%), and it reached a plateau at ∼60 trials with each type of protocol. (C) A schematic illustration of the full 14-day training procedure: four ERP-while-behaving tests (Days 1, 7, 13, and 14), the first three with the same separate protocols and the last with the combined protocol, and 10 (adaptive) behavioral training days (Days 2–6 and 8–12).


Behavioral Training Sessions

All participants were administered 10 behavioral training sessions and four additional ERP testing sessions (Days 1, 7, 13, and 14), as illustrated in Figure 1C. They were divided into two groups. Group 1 (n = 9) trained with Reference 1st on Days 2–6 and with Reference 2nd on Days 8–12, and Group 2 (n = 9) trained in the reverse order. All participants were administered five 80-trial assessments on each of these training days.

Over the course of the 10 training days, we used the following adaptive procedure. Each assessment began with a 20% frequency difference (above or below the reference frequency, 1000 Hz). The frequency of the variable tone was modified adaptively in a three-down one-up staircase procedure, with a step size that decreased every four reversals (incorrect response after three consecutive correct responses or three consecutive correct responses after one incorrect response) from 4.5 to 2 to 1 to 0.5 to 0.1%. There were 80 trials per assessment. The threshold estimate was calculated as the arithmetic mean of the frequency difference (in percent of standard frequency) in the last seven reversal points. The tones within a pair were separated by a 950-msec ISI. Pleasant visual feedback (a happy face) was provided following correct responses, and unpleasant feedback (a sad face) was provided following incorrect responses. Participants' responses were not time limited. The calculated thresholds were log transformed before the statistical analysis to obtain normally distributed data.
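
The adaptive procedure described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the in-house software: we assume that steps are added or subtracted in percentage points, that a reversal is any change in the direction of the track, and that the difference is floored at a small positive value (`MIN_DIFF`, a hypothetical parameter not given in the text).

```python
STEP_SIZES = [4.5, 2.0, 1.0, 0.5, 0.1]  # percent, as listed in the text
REVERSALS_PER_STEP = 4                   # step size shrinks every four reversals
START_DIFF = 20.0                        # initial frequency difference (percent)
MIN_DIFF = 0.1                           # assumed floor; not specified in the text


def run_staircase(responses):
    """Run a three-down one-up track over a sequence of booleans (True = correct).

    Returns (differences presented on each trial, differences at reversal points).
    """
    diff = START_DIFF
    correct_streak = 0
    last_direction = 0  # -1 = harder, +1 = easier, 0 = no step taken yet
    presented, reversals = [], []
    for correct in responses:
        presented.append(diff)
        direction = 0
        if correct:
            correct_streak += 1
            if correct_streak == 3:  # three consecutive correct: make it harder
                direction = -1
                correct_streak = 0
        else:                        # one incorrect response: make it easier
            correct_streak = 0
            direction = 1
        if direction == 0:
            continue
        if last_direction != 0 and direction != last_direction:
            reversals.append(diff)   # the track changed direction at this level
        last_direction = direction
        step_index = min(len(reversals) // REVERSALS_PER_STEP, len(STEP_SIZES) - 1)
        diff = max(MIN_DIFF, diff + direction * STEP_SIZES[step_index])
    return presented, reversals


def threshold(reversals, n_last=7):
    """Arithmetic mean of the last n_last reversal points."""
    if not reversals:
        raise ValueError("no reversals recorded")
    last = reversals[-n_last:]
    return sum(last) / len(last)
```

An 80-trial assessment would pass 80 responses to `run_staircase`; the resulting threshold would then be log transformed before statistical analysis, as described in the text.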

The tone intensity was 65 dB. All stimuli were presented binaurally through Sennheiser HD-265 linear headphones using a TDT (Tucker Davis Technologies) System III signal generator controlled by in-house software.

ERP Testing Sessions

In the ERP sessions, we used a predetermined easy-to-difficult sequence of stimuli. The frequency differences presented along each block in these sessions were chosen to be the average differences obtained by naïve listeners in each trial during their first adaptive 80-trial assessment. Separate averages were calculated for Reference 1st and Reference 2nd protocols (reaching JNDs of 5.1 ± 0.2% and 6.3 ± 0.4%, respectively, based on previous performance of 25 naïve participants in each protocol [Nahum et al., 2010]). Because ERP blocks were longer than behavioral assessments, nonreference frequencies in Trials 81 onwards were randomly selected from the range of frequencies attained in Trials 61–80 in the first adaptive assessment. The tones within each pair were separated by a 620-msec onset-to-onset interval. The interpair interval was 1.4 sec (i.e., the onset of one trial to onset of the next was 2 sec). Participants were asked to indicate, by pressing “1” or “2” on the keyboard, which of the two tones had a higher pitch. Participants had a limited time for responding (1.4 sec), and there was no feedback.

ERP sessions (Days 1, 7, and 13) contained four blocks of 300 trials each, presented in the following sequence: Reference 1st, Reference 2nd, Reference 1st, Reference 2nd. The last ERP session (Day 14) was composed of three blocks presented in the following sequence: Combined (400 trials), Reference 2nd, Combined (400 trials). The sequence of stimuli used in the combined protocol is shown in Figure 1B. Note that it is composed of the same stimuli used in the Reference 1st and Reference 2nd ERP assessments (although only 200 rather than 300 trials of each structure were administered in this protocol), presented in an interleaved manner.

ERP Measurement and Analysis

Electrophysiological activity was recorded in a sound-attenuated room while participants performed the two-tone frequency discrimination task. Sounds were produced by Matlab and were presented by E-Prime software.

EEG was recorded from 32 active Ag-AgCl electrodes mounted on an elastic cap using the BioSemi ActiveTwo tools. Electrode sites were based on the 10–20 international system. Two additional electrodes were placed over the left and right mastoids. Horizontal EOG was recorded from two electrodes placed at the outer canthi of both eyes. Vertical EOG was recorded from electrodes on the infraorbital and supraorbital regions of the right eye in line with the pupil.

EEG and EOG signals were sampled at 256 Hz, amplified and filtered with an analog band-pass filter of 0.16–100 Hz. Off-line analysis was performed using BrainVision Analyzer software. EEG was referenced to the tip of the nose and was digitally filtered using a band-pass of 1–30 Hz. Artifact rejection was applied to the nonsegmented data according to the following criteria: Any data point with EOG or EEG > ±100 μV was rejected along with the data ± 300 msec around it. In addition, if the difference between the maximum and the minimum amplitudes of two data points within an interval of 50 msec exceeded 100 μV, data ± 200 msec around it were rejected. Finally, if the difference between the maximum and the minimum amplitudes of two data points within an interval of 100 msec was below 0.5 μV, the data point and the data ± 300 msec around it were rejected. Trials containing rejected data points were omitted from further analysis as well as the first three trials of each block. For ERP averaging across trials, the EEG was parsed to 1430-msec epochs, starting 230 msec before the onset of the first stimulus in the trial and then averaged separately for each protocol and for each electrode. The averaged epochs were digitally filtered using a band-pass of 1–12 Hz. The baseline was adjusted by subtracting the mean amplitude of the prestimulus period (230–130 msec before the onset of the first stimulus in the trial) of each ERP from each data point in the epoch.
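
The epoching, averaging, and baseline-adjustment steps above can be illustrated with a short sketch. It is not the BrainVision Analyzer pipeline: artifact rejection and the 1–12 Hz filtering are omitted, the function names are ours, and converting milliseconds to samples by rounding at 256 Hz is an assumption.

```python
SAMPLING_RATE_HZ = 256
EPOCH_START_MS = -230        # epoch begins 230 msec before first-tone onset
EPOCH_LENGTH_MS = 1430
BASELINE_MS = (-230, -130)   # prestimulus baseline window


def ms_to_samples(ms):
    """Convert a duration in msec to a whole number of samples (rounded)."""
    return round(ms * SAMPLING_RATE_HZ / 1000.0)


def extract_epoch(signal, onset_index):
    """Cut one epoch around the sample index of the first tone's onset."""
    start = onset_index + ms_to_samples(EPOCH_START_MS)
    stop = start + ms_to_samples(EPOCH_LENGTH_MS)
    if start < 0 or stop > len(signal):
        raise ValueError("epoch extends beyond the recording")
    return signal[start:stop]


def average_epochs(epochs):
    """Point-by-point mean across equally long epochs."""
    n = len(epochs)
    return [sum(samples) / n for samples in zip(*epochs)]


def baseline_correct(erp):
    """Subtract the mean of the baseline window from every sample."""
    first = ms_to_samples(BASELINE_MS[0] - EPOCH_START_MS)
    last = ms_to_samples(BASELINE_MS[1] - EPOCH_START_MS)
    baseline = sum(erp[first:last]) / (last - first)
    return [value - baseline for value in erp]
```

With these assumptions each epoch contains 366 samples, and the baseline window covers the first 26 of them.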

ERP analysis

ERP analysis was based on the epochs that were recorded with electrode Cz (at the vertex of the scalp) after they were processed as described above. Each assessment of each participant was analyzed separately. To assess the effects of protocol and training on the faster, obligatory (N1, P2) components, the waveforms obtained for each participant for each protocol and day of measurement were additionally high-pass filtered (phase shift-free Butterworth filter, with a slope of 24 dB/octave). The high-pass filter was set to 2.5 Hz to exclude the slower CNV component, which partially overlaps them in time. Peak amplitude and delay to peak of the N1 and P2 components were determined based on the filtered waveforms. The areas of the slow, task-dependent ERP components (CNV, P3) were obtained for each averaged assessment from the unfiltered waveforms, because complete separation of these slower components (300–600 msec) could not be obtained with temporal or spatial filtering. Time windows for obtaining peak amplitudes and calculating component areas were chosen based on the time windows of these components in the grand-averaged response. N1 peaks were determined as the most negative values within the time window of 50–220 msec relative to the onsets of the first and the second stimuli, separately. P2 peaks were determined as the most positive values within the time window of 150–320 msec relative to the onsets of the first and the second stimuli, separately. CNV areas were calculated from the intervals −80–140 msec with respect to the onsets of first and second tones, and P3 areas were calculated from the intervals 230–380 msec with respect to the onsets of the first and second tones. For the CNV component, protocol-specific areas were determined as the difference between the area induced in the response to the informative tone and the area induced during the same time interval (i.e., to the reference tone) in the complementary protocol.
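
The window-based measures described above can be sketched as follows. The time windows are those given in the text; everything else is an assumption made for illustration: the function names, the rounding of window edges to samples at 256 Hz, an epoch that starts 230 msec before the first tone, and the computation of area as the sum of amplitudes multiplied by the sample interval (in μV·msec).

```python
SAMPLING_RATE_HZ = 256
EPOCH_START_MS = -230  # time of the first sample relative to first-tone onset
WINDOWS_MS = {         # relative to the onset of the tone of interest
    "N1": (50, 220),
    "P2": (150, 320),
    "CNV": (-80, 140),
    "P3": (230, 380),
}


def window_indices(tone_onset_ms, window_ms):
    """Sample indices (inclusive) that bound a window around a tone onset."""
    lo = round((tone_onset_ms + window_ms[0] - EPOCH_START_MS) * SAMPLING_RATE_HZ / 1000.0)
    hi = round((tone_onset_ms + window_ms[1] - EPOCH_START_MS) * SAMPLING_RATE_HZ / 1000.0)
    return lo, hi


def peak(erp, tone_onset_ms, component):
    """Return (amplitude, latency in msec after tone onset) for N1 or P2."""
    lo, hi = window_indices(tone_onset_ms, WINDOWS_MS[component])
    segment = erp[lo:hi + 1]
    pick = min if component == "N1" else max  # N1 is negative, P2 is positive
    amplitude = pick(segment)
    index = lo + segment.index(amplitude)
    latency_ms = index * 1000.0 / SAMPLING_RATE_HZ + EPOCH_START_MS - tone_onset_ms
    return amplitude, latency_ms


def area(erp, tone_onset_ms, component):
    """Area of CNV or P3: summed amplitude times the sample interval."""
    lo, hi = window_indices(tone_onset_ms, WINDOWS_MS[component])
    dt_ms = 1000.0 / SAMPLING_RATE_HZ
    return sum(erp[lo:hi + 1]) * dt_ms


def protocol_specific_cnv(erp_protocol, erp_complementary, informative_onset_ms):
    """CNV area at the informative tone minus the same interval in the other protocol."""
    return (area(erp_protocol, informative_onset_ms, "CNV")
            - area(erp_complementary, informative_onset_ms, "CNV"))
```

In the ERP sessions the first tone starts at 0 msec and the second at 620 msec, so, for example, the CNV for Reference 1st would be measured with `informative_onset_ms=620` and that for Reference 2nd with `informative_onset_ms=0`.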

RESULTS

Behavioral Specificity

Two groups of participants (n = 9 in each group) received multisession training. In the first training phase, Group 1 trained with the Reference 1st protocol and Group 2 with the Reference 2nd. In the second phase, the trained protocol was switched between the groups. Both groups were also pretested, “midtested,” and posttested with both protocols, while their ERP responses were recorded on three additional sessions. Consequently, both groups received brief training with each of the two structures during the ERP pretest. Thus, the specificities we report are those attained after the initial test phase (marked “Day 1” in Figure 2).

Figure 2. 

Long-term learning is structure specific. Average thresholds of the two groups during their two multiday training phases. Left: Group 1, first trained with Reference 1st (blue line) protocol. Right: Group 2, first trained with Reference 2nd (red line). The improvement is protocol specific for both structures. Black head schemes under x axis denote days of ERP assessments (Days 1, 7, and 13). Vertical bars denote cross-subject standard errors.


The average performance of each group on each of the training days is shown in Figure 2. The main observation is that initial thresholds for both structures (blue for Reference 1st and red for Reference 2nd) did not depend on whether the tested group had previous training with the complementary structure. Namely, the average threshold of Group 1 at Day 2 was similar to that of Group 2 at Day 8 [t(8) = −1.22, p = .24 in a two-tailed t test], and the average threshold for Group 2 on Day 2 was similar to that obtained by Group 1 on Day 8 [t(8) = 1.05, p = .31 in a two-tailed t test]. Thus, although practice induced learning with the trained structure, improvement did not transfer to performance on the complementary structure. In both groups, the thresholds obtained on the untrained structure (Day 8 for each of the groups) were significantly poorer than those obtained on the last training day (Day 6) with the initially trained structure [Group 1: t(8) = −2.89, p = .02; Group 2: t(8) = −2.85, p = .02; within group paired two-tailed t test between Day 6 and Day 8]. Taken together, these data indicate that learning was structure specific for both protocols.

The results also revealed a marginal tendency toward better performance on Reference 1st than on Reference 2nd [i.e., initial thresholds obtained with Reference 1st were marginally lower than those initially obtained on Reference 2nd; paired, one-tailed t test between Day 2 and Day 8, Group 1: t(8) = −1.42, p = .097; Group 2: t(8) = −1.72, p = .06]. This tendency is consistent with the finding reported in Nahum et al. (2010) that the Reference 1st protocol was performed marginally better than Reference 2nd.

ERP Measures

ERP was measured while participants performed the frequency discrimination task during pretraining (Day 1), midtraining (between the two multisession training phases, Day 7), and posttraining (Day 13). The stimulation sequence was predetermined based on previous results (see Methods). Recording sessions were composed of four blocks (300 trials per block) presented in the following order: Reference 1st, Reference 2nd, Reference 1st, Reference 2nd.

The task-related CNV and P3 components showed an asymmetry between the Reference 1st and Reference 2nd protocols from the very first session. As shown in Figure 3A, both components were produced only around the informative tone in each protocol. As the position of the informative tone differed between protocols, so did the position of these components. CNV, which overlaps N1, was produced in anticipation of the second tone in Reference 1st (blue line more negative than red line, bounded area is shaded between −80 and 140 msec around the onset of the second tone), and before the first tone in Reference 2nd (red line more negative than blue line, bounded area is shaded between −80 and 140 msec around the onset of the first tone). Similarly, P3 was elicited ∼300 msec after the second tone in Reference 1st and ∼300 msec after the first tone in Reference 2nd (Nahum et al., 2010). On Day 1, a similar CNV + P3 pattern was induced around the informative tone in both protocols (see Table 1), although behavioral accuracy was somewhat higher in the Reference 1st protocol [Day 1: 84 ± 3% vs. 79 ± 4% correct; t(16) = 2.37, p = .03 in paired, two-tailed t test]. On Day 7, midway through training, Group 2, which was trained with the Reference 2nd protocol, exhibited significantly better performance on this protocol during the ERP measurement [Group 1: 80 ± 5% vs. Group 2: 91.5 ± 2%: t(15) = −2.15, p = .048 in two-tailed t test]. The accuracy on Reference 1st did not significantly differ between the groups (Group 1: 91 ± 3%, and Group 2: 89 ± 3%). The magnitude of the CNV and P3 tended to be larger around the informative tone on the trained protocol of each group, but the difference was not significant (Table 1).
By the end of the training stage (Day 13), performance under both protocols had significantly improved [Day 13: 93 ± 2% and 90 ± 2% for Reference 1st and Reference 2nd, respectively; t(16) = −2.44, p = .013 and t(16) = −3.186, p = .003 in paired, two-tailed t tests between Day 1 and Day 13]. Nevertheless, performance was marginally better on Reference 1st [paired, two-tailed t test; t(16) = 1.78, p = .094], and as shown in Figure 3B, the pattern of specific anticipation, measured by the area of the CNV preceding the informative tone, became asymmetric and favored this protocol (see Table 1). In fact, training increased the areas of P3 and CNV around the informative tone in the Reference 1st protocol, which was already marginally “favorable” on the first session (see Table 1). CNV around the informative tone in the Reference 2nd protocol did not change significantly (see Table 1), and the P3 that followed it increased only marginally (Table 1). These results indicate that the structural pattern of Reference 1st, which was initially performed marginally better, was also more amenable to implicit learning in multisession training.

Figure 3. 

Practice-induced modifications of the slow ERP components CNV and P3. ERP plots from electrode Cz, averaged across all participants (N = 17; the ERP of one participant was excluded because it was too noisy) and across all trials of each protocol. Short black bars below the x axis denote periods of tone presentation. Vertical lines mark the CNV time window. (A, B) Superposition of Reference 1st (blue) and Reference 2nd (red), recorded in separate blocks. (A) Day 1: CNV, N1, P2, and P3 are labeled for clarity. Note the difference in the dynamics of CNV and P3 components under the two protocols. In Reference 1st, they are produced around the second tone (with CNV preceding it and P3 following it), whereas in Reference 2nd they are produced around the first tone. Thus, in both protocols, they are produced around the variable, informative tone. (B) Day 13: The areas of P3 and of CNV around the informative tone increased with training, mainly for the Reference 1st protocol. (C) Day 14: Superposition of Reference 1st (blue) and Reference 2nd (red), recorded with the combined protocol, in which they were administered in the same block. The structure-specific CNV–P3 pattern is almost eliminated.


Table 1. 

Descriptive Statistics of CNV and P3 Areas before and after Training

| Comparison | Average Values (μVs) | t(16) | p | Verbal Description |
|---|---|---|---|---|
| Day 1: CNV₁ vs. CNV₂ | −28 (±22) vs. −73 (±19) | 1.27 | .22 | Initially CNV shows no preference for the Reference 1st structure |
| Day 1: P3₁ vs. P3₂ | 45 (±18) vs. 73 (±31) | −1.41 | .18 | nor does P3 |
| CNV₁: Day 1 vs. Day 7 | −28 (±22) vs. −38 (±20) | 0.3 | .77 | Following midtraining no significant change in the first CNV |
| CNV₂: Day 1 vs. Day 7 | −73 (±19) vs. −109 (±31) | 1.1 | .3 | or the second CNV |
| P3₁: Day 1 vs. Day 7 | 45 (±18) vs. 72 (±20) | 1.5 | .15 | or the first P3 |
| P3₂: Day 1 vs. Day 7 | 73 (±31) vs. 118 (±32) | 1.64 | .12 | or the second P3 |
| Day 13: CNV₁ vs. CNV₂ | −48 (±17) vs. −127 (±29) | 3.46 | .003 | Following full training CNV is larger for Reference 1st |
| Day 13: P3₁ vs. P3₂ | 79 (±22) vs. 159 (±33) | −3.94 | .001 | So is P3 |
| CNV₁: Day 1 vs. Day 13 | −28 (±22) vs. −48 (±17) | 0.72 | .48 | Training does not significantly increase CNV in Reference 2nd |
| P3₁: Day 1 vs. Day 13 | 45 (±18) vs. 79 (±22) | −2.073 | .06 | Training marginally increases P3 in Reference 2nd |
| CNV₂: Day 1 vs. Day 13 | −73 (±19) vs. −127 (±29) | 1.97 | .06 | Training marginally increases CNV in Reference 1st |
| P3₂: Day 1 vs. Day 13 | 73 (±31) vs. 159 (±33) | −2.58 | .02 | Training significantly increases P3 in Reference 1st |
| CNVavg: Day 13 vs. Day 14 | −88 (±21) vs. −35 (±8) | −2.41 | .03 | In the trained group, CNV measured in separate protocols is significantly larger than the CNV of the combined protocol |
| P3avg: Day 13 vs. Day 14 | 119 (±26) vs. 44 (±13) | 3.65 | .002 | So is P3 |

Indices refer to the within-trial position of the tone the ERP component is related to: 1 for first tone, 2 for second tone, and avg for average across the first and second components.

In addition to increasing the magnitude of both components, training increased their correlation. On Day 1, the areas of CNV and P3 were not significantly correlated (Spearman's correlation: ρ = 0.25, p = .32). After training, however, they became significantly correlated (Day 13, Spearman's correlation: ρ = −0.53, p = .03; the sign is negative because the CNV is a negative-going component, so a larger CNV corresponds to a more negative area), suggesting that more specific anticipation (CNV) is linked with better performance (P3).
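As an illustration of the rank-based statistic reported above, the following minimal sketch computes Spearman's ρ as the Pearson correlation of the ranks. The per-participant values are hypothetical and are not the study's data; the function names are our own, and the sketch does not compute a p value.

```python
def average_ranks(values):
    """Return 1-based ranks, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over a run of tied values.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2.0 + 1.0
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5


def spearman_rho(x, y):
    """Spearman's rho: the Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


# Hypothetical areas for five participants (NOT the study's data).
cnv_area = [-20.0, -55.0, -90.0, -130.0, -160.0]  # more negative = larger CNV
p3_area = [40.0, 95.0, 80.0, 150.0, 170.0]        # more positive = larger P3
rho = spearman_rho(cnv_area, p3_area)
print(round(rho, 2))
```

In this toy sample a larger (more negative) CNV tends to accompany a larger P3, so ρ comes out negative, which is the direction of the Day 13 correlation.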

Following the third ERP assessment (Day 13), participants returned for another ERP session (Day 14; Figure 3C), in which they were tested on the combined protocol, composed of interleaved Reference 1st and Reference 2nd trials. The average behavioral and ERP measures on the odd and even trials were then calculated separately. Performance on the combined protocol was significantly worse than performance on each of the protocols separately [85 ± 3% in Reference 1st - odd trials (Day 14) vs. 93 ± 2% in Reference 1st - separate (Day 13), t(16) = −3.17, p = .006; 77 ± 3% in Reference 2nd - even trials (Day 14) vs. 90 ± 2% in Reference 2nd - separate (Day 13), t(16) = −4.99, p = .0001]. In fact, performance levels were similar to those obtained on Day 1 for each of the structures separately [Day 1: 84 ± 3% in Reference 1st fixed, t(16) = 0.25, p = .80; and 79 ± 4% correct in Reference 2nd fixed, t(16) = −0.76, p = .46]. These results do not rule out some transfer from the separate protocols to the combined protocol, because initial performance on the combined protocol, which is expected to be poorer (Nahum et al., 2010), was not measured in this study. However, they clearly show that if any transfer occurred, it was very limited.

The poor behavioral performance on the combined protocol was reflected in the slow ERP components. The areas of both P3 and CNV were smaller in the combined protocol than in the previous ERP session (Table 1). Moreover, as with behavior, for both structures, the areas of the CNV and P3 components were similar to the areas produced on the first ERP test (Day 1) when these structures were administered separately [Day 14 vs. Day 1, averaged CNV: t(16) = −1.34, p = .20; averaged P3: t(16) = 0.89, p = .39; in paired, two-tailed t tests]. Table 1 summarizes the average values for the CNV and P3 areas, their differences, and the training-induced modifications.
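The paired, two-tailed comparisons reported here rest on the paired t statistic, which the following minimal sketch computes from per-participant differences. The accuracy values are hypothetical and are not the study's data, and `paired_t` is our own helper name. With the 17 participants analyzed in the study, the degrees of freedom would be 16, matching the reported t(16).

```python
import math


def paired_t(before, after):
    """Return (t, df) for a paired t test on the differences after - before."""
    if len(before) != len(after):
        raise ValueError("paired samples must have equal length")
    diffs = [a - b for a, b in zip(after, before)]
    n = len(diffs)
    mean_diff = sum(diffs) / n
    # Unbiased sample variance of the differences (n - 1 in the denominator).
    variance = sum((d - mean_diff) ** 2 for d in diffs) / (n - 1)
    standard_error = math.sqrt(variance / n)
    return mean_diff / standard_error, n - 1


# Hypothetical percent-correct scores for five participants (NOT the study's data).
separate_protocol = [93.0, 90.0, 95.0, 88.0, 94.0]  # e.g., a Day 13-like session
combined_protocol = [85.0, 84.0, 86.0, 83.0, 82.0]  # e.g., a Day 14-like session
t_value, df = paired_t(separate_protocol, combined_protocol)
print(round(t_value, 2), df)
```

A negative t here indicates lower scores in the second (combined) condition; a two-tailed p value would then be read from the t distribution with the returned degrees of freedom.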

To analyze whether the faster obligatory N1 and P2 ERP components were also modified by training, we additionally high-pass filtered the ERP responses (at 2.5 Hz, mainly to separate these components from the CNV, which partially overlaps them in time). The filtered N1 component reached its peak magnitude over the central electrodes, whereas the CNV component had a somewhat posterior distribution, as previously described for the “late CNV” (Bender, Resch, Weisbrod, & Oelkers-Ax, 2004).
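To illustrate why high-pass filtering at 2.5 Hz separates fast components from a slow deflection such as the CNV, the sketch below applies a simple first-order (RC) high-pass filter to a slow and a fast sinusoid. The filter type, the 250-Hz sampling rate, and the test frequencies are assumptions made for illustration only; this section does not specify the filter design that was actually used, and ERP analyses typically rely on zero-phase filters from dedicated toolboxes.

```python
import math


def high_pass(samples, cutoff_hz, sample_rate_hz):
    """Causal first-order (RC) high-pass filter; an illustrative assumption."""
    rc = 1.0 / (2.0 * math.pi * cutoff_hz)
    dt = 1.0 / sample_rate_hz
    alpha = rc / (rc + dt)
    out = [0.0] * len(samples)
    for i in range(1, len(samples)):
        # Pass sample-to-sample changes; let slow trends decay toward zero.
        out[i] = alpha * (out[i - 1] + samples[i] - samples[i - 1])
    return out


SAMPLE_RATE_HZ = 250.0  # hypothetical sampling rate
times = [i / SAMPLE_RATE_HZ for i in range(1000)]  # 4 s of signal

# A slow, CNV-like drift (0.25 Hz) and a fast, N1/P2-like oscillation (10 Hz).
slow_drift = [math.sin(2.0 * math.pi * 0.25 * t) for t in times]
fast_wave = [math.sin(2.0 * math.pi * 10.0 * t) for t in times]

slow_out = high_pass(slow_drift, 2.5, SAMPLE_RATE_HZ)
fast_out = high_pass(fast_wave, 2.5, SAMPLE_RATE_HZ)

# Peak amplitudes after the initial transient (second half of the signal).
slow_peak = max(abs(v) for v in slow_out[500:])
fast_peak = max(abs(v) for v in fast_out[500:])
print(slow_peak < 0.2, fast_peak > 0.8)
```

Under these assumptions the slow drift is strongly attenuated while the fast oscillation is largely preserved, which is the property the filtered analysis relies on.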

As shown in Figure 4, the filtered responses did not contain the slow CNV negative deflection. As expected from previous reports, N1 and P2 had a different adaptation pattern for the reference and nonreference tone (e.g., Daikhin & Ahissar, 2012; Haenschel, Vernon, Dwivedi, Gruzelier, & Baldeweg, 2005). The N1 component presented a larger adaptation to the reference tone. This difference was significant in the response to the second tone in each trial [i.e., the second N1, elicited by the nonreference tone in Reference 1st, was larger than the second N1 elicited by the reference tone in Reference 2nd for all days of ERP measurement: Day 1 t(16) = −3.7, p = .002; Day 7 t(15) = −4.2, p = .001; Day 13 t(16) = −2.52, p = .02; Day 14 t(16) = −2.7, p = .02] and on Day 13 in the response to the first tone as well [t(16) = 3, p = .01]. P2, on the other hand, was larger in response to the reference tone on Day 1 and Day 7, particularly in the first interval [P2first in Reference 1st vs. P2first in Reference 2nd: Day 1 t(16) = 3.64, p = .002; Day 7 t(15) = 5.34, p = .0001]. However, the magnitude of the N1 and P2 components was not modified by training (see Table 2 for descriptive statistics).

Figure 4. 

Fast obligatory components (N1, P2). The grand-averaged (n = 17; the ERP of one participant was excluded because it was too noisy) 2.5-Hz high-pass filtered ERPs recorded from electrode Cz. Short black bars below the x axis denote periods of tone presentation. Vertical lines mark the CNV time window. Horizontal lines enable by-eye comparison of the peaks. Left: Superposition of Reference 1st recorded at Day 1 (solid blue) and Day 13 (dashed blue). Right: Superposition of Reference 2nd recorded at Day 1 (solid red) and Day 13 (dashed red). Neither N1 nor P2 was significantly modified with training.

Table 2. 

Descriptive Statistics of N1 and P2 Peak Amplitudes before and after Training

| Comparison, Day 1 vs. Day 13 | Average Values (μV) | t(16) | p |
|---|---|---|---|
| Reference 1st - N1first | −1.6 (±0.2) vs. −1.4 (±0.2) | −1.9 | .07 |
| Reference 1st - P2first | 1.9 (±0.2) vs. 1.8 (±0.2) | | .32 |
| Reference 1st - N1second | −1.2 (±0.2) vs. −1.4 (±0.1) | 0.95 | .36 |
| Reference 1st - P2second | 0.9 (±0.2) vs. 1 (±0.2) | −0.77 | .46 |
| Reference 2nd - N1first | −1.7 (±0.2) vs. −1.7 (±0.2) | 0.9 | .4 |
| Reference 2nd - P2first | 1.5 (±0.2) vs. 1.6 (±0.1) | −1.3 | .21 |
| Reference 2nd - N1second | −1 (±0.1) vs. −1 (±0.1) | 0.94 | .36 |
| Reference 2nd - P2second | 1 (±0.2) vs. 1 (±0.1) | −0.07 | .94 |

Interestingly, although the average magnitude of the fast components was not modified by training, by Day 13 the single-subject response magnitude became correlated with participants' behavioral accuracy (Reference 1st, Spearman's correlation for N1first: ρ = 0.48, p = .05; for P2first: ρ = −0.63, p = .009; P2second: ρ = −0.58, p = .02; Reference 2nd: N1first: ρ = 0.56, p = .02). This correlation disappeared on Day 14 when the combined protocol was assessed.

DISCUSSION

The main behavioral finding reported here is that perceptual learning of a simple discrimination task was specific to the regularity embedded in the trained structure of information, even when participants were unaware of this structure. In line with this specificity, the pattern of the task-related ERP components CNV and P3 suggests that listeners learned to focus their anticipation around the temporal position of the informative tone, which differed between the two trained structures. With training, the protocol-specific anticipatory pattern was enhanced, and the magnitude of the two task-related ERP components became correlated. These observations suggest that the increased anticipation to the informative tone (CNV) contributed to the increase in perceptual clarity (P3) and to the training-induced reduction in the discrimination thresholds. The strengthening of the anticipatory pattern was particularly evident in the Reference 1st protocol, which was performed slightly better than the Reference 2nd protocol, both before and after training on each of these protocols.

What Do We Learn?

The specificity of the trained structure is in line with learning theories that view perceptual learning as a context-specific experience. The importance of the broader stimulation context is in line with the Gibsonian ecological view (Gibson, 1979) of perceptual learning. Gibson emphasized the limited transfer from lab training with impoverished stimuli to natural environments with their noisiness and complexity, which he claimed to be useful rather than disruptive for human performance and learning. Several recent theories have reached similar conclusions based either on computational considerations such as Bayesian reasoning (e.g., Friston, 2005; Bialek, Nemenman, & Tishby, 2001) or psychophysical results. RHT is the main example of the latter. This theory posits that perceivers first capture the gist of the scene, whether visual (Hochstein & Ahissar, 2002) or auditory (Nahum et al., 2008), and only afterwards proceed to the details. A similar top–down cascade characterizes perceptual learning (Ahissar et al., 2009; Ahissar & Hochstein, 2004). Both the Bayesian view and RHT assume that backward learning is the formation of specific hypotheses rather than the implementation of a general gating mechanism. Thus, in a Bayesian framework, higher levels in the processing hierarchy set specific priors for subsequent learning. It follows that learning of even simple discriminations incorporates the structural context into a minischeme (prior), which may include both the characteristics of the repeated reference and the structural regularity that characterizes the trained protocol.

Interestingly, training did not eliminate the a priori bias favoring the Reference 1st structure. This a priori bias is not accounted for by RHT (or related theories). But its retention in spite of separate training is predicted by RHT. RHT predicts that learning will follow backwards from the high-level regularity that is initially detected. Hence, there is an advantage to structures that are already coded and are easily detected by the perceptual system.

The bias itself may reflect an automatic default mechanism of our perceptual system to incorporate the first tone in a sequence (of two tones in our case) into long-term memory and use it as an “anchor” (prior), representing a crude estimate of the expected stimulus. This hypothesis suggests that when asked to discriminate between two tones, participants take this anchor from the previous trial into account (i.e., a comparison is made between the second tone and a weighted average of the first tone and the “anchor”; Raviv, Ahissar, & Loewenstein, 2012). This hypothesis, based on analysis of behavior when there is no reference tone, naturally predicts the bias favoring Reference 1st structure, because the first tone in this protocol is always the repeated average. Therefore, taking it into account improves performance. Participants' ability to specifically learn Reference 2nd following several training sessions suggests that this default strategy is itself susceptible to change because, otherwise, training on Reference 2nd would have fully transferred to improvement in Reference 1st. The finding that the bias favoring Reference 1st remains after training suggests that the default strategy did not change.
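The anchoring account above can be illustrated with a toy calculation in which the second tone is compared with a weighted average of the first tone and the anchor. This is only a sketch under strong assumptions that are ours rather than the cited model's: the anchor is taken to have converged on the repeated reference, noise is assumed to affect only the memory trace of the first tone, and all frequencies, the weight, and the noise level are hypothetical.

```python
def decision_variable(first, second, anchor, weight):
    """Second tone minus a weighted average of the first tone and the anchor."""
    return second - (weight * first + (1.0 - weight) * anchor)


def signal_to_noise(first, second, anchor, weight, memory_noise_sd):
    """Ratio of the expected decision variable to its noise.

    Assumption: only the memory trace of the first tone is noisy, so the
    noise reaching the decision variable is scaled by the weight.
    """
    signal = abs(decision_variable(first, second, anchor, weight))
    return signal / (weight * memory_noise_sd)


REFERENCE_HZ = 1000.0     # hypothetical reference tone
TARGET_HZ = 1020.0        # hypothetical nonreference tone
ANCHOR_HZ = REFERENCE_HZ  # assumption: anchor equals the repeated reference
NOISE_SD = 10.0           # hypothetical memory noise (Hz)
WEIGHT = 0.5              # hypothetical weight given to the first tone

# Reference 1st: the first tone is the reference.
ref1_anchored = signal_to_noise(REFERENCE_HZ, TARGET_HZ, ANCHOR_HZ, WEIGHT, NOISE_SD)
ref1_plain = signal_to_noise(REFERENCE_HZ, TARGET_HZ, ANCHOR_HZ, 1.0, NOISE_SD)

# Reference 2nd: the first tone is the nonreference tone.
ref2_anchored = signal_to_noise(TARGET_HZ, REFERENCE_HZ, ANCHOR_HZ, WEIGHT, NOISE_SD)
ref2_plain = signal_to_noise(TARGET_HZ, REFERENCE_HZ, ANCHOR_HZ, 1.0, NOISE_SD)

print(ref1_anchored, ref1_plain, ref2_anchored, ref2_plain)
```

Under these assumptions, mixing in the anchor doubles the signal-to-noise ratio in Reference 1st (the anchor agrees with the first tone, so only the noise shrinks) but leaves it unchanged in Reference 2nd (the anchor pulls the first tone toward the reference, shrinking signal and noise alike). This matches the direction of the bias described in the text, although it is not a fit to the data.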

Lessons from ERP

The main training-induced modification we found in the ERP measures was the specific increase in the CNV-P3 pattern around the informative tone, particularly for the Reference 1st protocol. This increase reflects the slow enhancement of the pattern of attention whose basic characteristics were produced quickly, as evidenced by its presence within the first session.

Inspecting the “raw” ERP response (Figure 3) suggests that the P2 component also increased after training. However, because N1 and P2 partially overlap in time with the slower CNV component, we further inspected the response after high-pass filtering (2.5 Hz). As shown in Figure 4, the training-induced modifications were no longer observed. This null finding with respect to P2 is somewhat surprising, because previous studies have found that P2 is modified with multisession practice (e.g., Orduña et al., 2012; Tong et al., 2009; Atienza et al., 2002; Tremblay & Kraus, 2002). However, the mechanisms underlying the training-induced modifications of P2 are not well understood. These changes are not directly related to behavioral improvement and occur even without it (e.g., Sheehan, McArthur, & Bishop, 2005). Although it was hypothesized that the enhancement of P2 may be related to incorporation of the trained (or repeatedly exposed) stimuli into memory, its dissociation from behavioral improvement (Carcagno & Plack, 2011) challenges this interpretation.

Comparison with Protocols in Previous Learning Studies

The behavioral protocols used here differ from prevalent psychophysical protocols in several ways. In a typical 2AFC protocol, the reference stimulus is presented first or second on each trial, and the target stimulus is higher/louder/longer, etc. (e.g., Fitzgerald & Wright, 2011; Banai, Ortiz, Oppenheimer, & Wright, 2010; Wright, Sabin, Zhang, Marrone, & Fitzgerald, 2010; Wright, Wilson, & Sabin, 2010). Thus, although seemingly similar, this protocol is substantially different. First, the positions of the reference and target stimuli are interchangeable, so there is no structural regularity. Second, the identity of each of the stimuli (both reference and target) is sufficient to resolve the task, because the repeated reference is always the lower (or weaker or shorter) stimulus, hence the nonreference (target) stimulus is always higher (or stronger/longer, respectively). Thus, the typical protocol does not contain structural regularities.

Our finding of structural specificity in simple perceptual learning is not only novel but also raises issues as regards generalization. Findings of specificity to stimuli are often interpreted as suggesting a basic modification in their bottom–up representation, thus perhaps leading to potential modifications in the performance of all tasks that use these stimuli (Bejjanki, Beck, Lu, & Pouget, 2011; Seitz, Kim, & Watanabe, 2009; Tsushima, Seitz, & Watanabe, 2008; Watanabe et al., 2002; Poggio et al., 1992; Karni & Sagi, 1991). Specificity to the conjunction of stimuli and the trained behavioral task (Fahle, 1997; Fahle & Morgan, 1996; Ahissar & Hochstein, 1993) has been interpreted as indicating task-specific reweighting procedures (Huang, Lu, & Dosher, 2011; Dosher & Lu, 2009), which perhaps need to be separately implemented for different ranges of stimuli (although the learning of some combinations of the task and the stimuli may be additive rather than multiplicative [Zhang et al., 2010; Xiao et al., 2008] and not all learning stages require active performance [Wright, Sabin, et al., 2010]). The addition of protocol specificity adds a third dimension, suggesting a functionally “infinite” number of combinations of task, stimuli, and structural context, which perhaps need to be learned separately (for a discussion of the clinical implications of the limits of generalization, see Moore, Halliday, & Amitay, 2009).

However, the number of structures that can be easily learned is probably small. Moreover, structure learning seems to require within-trial regularities, with a trial perhaps interpreted by our perceptual system as a basic event. For example, the combined protocol of the two-tone discrimination task, in which the position of the fixed tone is switched on every trial, yielded significantly worse performance, even after learning the constituent structures (Figure 3C). Similarly, even when global structures are introduced in the form of a repeated sequence of references in consecutive trials but the positions of the reference and target stimuli within each trial are interchangeable (i.e., using the typical 2AFC paradigm described above, e.g., Zhang et al., 2008; Kuai et al., 2005), there is no fast learning. Slow gradual learning, which is not specific to the trained sequence of reference stimuli, is seen across sessions. Introducing local structural regularities (i.e., a fixed position of the reference stimulus) would perhaps improve discrimination thresholds and facilitate the learning process, although not its generalization (see related discussion in Aberg & Herzog, 2009). This interpretation is consistent with both visual (Nachmias, 2006) and auditory (Oganian & Ahissar, 2012) studies that have found that discrimination thresholds are significantly lower on the Reference 1st protocol than on the standard 2AFC protocol, where the reference tone is always lower and is presented in either the first or the second interval.

Learning Regularities in Sequences of Auditory Stimuli

Mastering implicit structural regularities of auditory stimuli is known to be a basic mechanism in language acquisition, at the sublexical, lexical, and supralexical levels. In fact, individuals with language (and typically reading) disabilities tend to have poor frequency discrimination abilities (Bishop & McArthur, 2005; Lachmann, Berti, Kujala, & Schröger, 2005; Banai & Ahissar, 2004; McArthur & Bishop, 2004; Ramus et al., 2003; Amitay, Ahissar, & Nelken, 2002; France et al., 2002; Ahissar, Protopapas, Reid, & Merzenich, 2000; Heath, Hogben, & Clark, 1999), with specific problems of regularity detection (Ahissar, Lubin, Putter-Katz, & Banai, 2006; Banai & Ahissar, 2006), including in the Reference 1st protocol (Oganian & Ahissar, 2012). Their deficit in regularity detection and learning may lead to a greater reliance on mechanisms of explicit working memory than the automatic implicit mechanisms found in the general population (Banai & Ahissar, 2010; Ahissar, 2007). It would be of interest to assess whether training individuals with language and reading disabilities with auditory discrimination protocols that include structural regularities would generalize to improved regularity detection and learning in linguistic contexts.

Learning Regularities as a Means of Acquiring Cognitive Skills

The data here show that structural regularities play a major role in simple perceptual learning. When regularities in the structure of information were detected, performance was immediately boosted, and further improvement was specific to these regularities. These findings suggest that becoming an expert in simple perceptual tasks and in complex cognitive tasks like playing chess (Chase & Simon, 1973) or bridge (Ericsson & Lehmann, 1996) may be similar in terms of relying on detected regularities, although the complexity of these regularities differs vastly.

In a work that summarized a broad range of studies of experts in different domains (firemen, clinicians), Kahneman (2011) reached a similar conclusion regarding the conditions, in addition to practice, required for becoming an expert: (1) an environment that contains regularities and (2) an opportunity to detect these regularities.

Obviously, both conditions were met in our two-tone discrimination protocols. Although this learning process is relatively simple and fast, it could perhaps serve as a “toy model” for exploring the processes that underlie skill acquisition.

Implications for the Design of Applied Perceptual Training Procedures

The typical goal of applied training procedures, which inevitably administer only a limited set of examples and often in a very restricted (and artificial) context, is to attain maximal generalization. Our finding of specificity to the trained structural regularities suggests that the ability to generalize to novel contexts may be limited. The complete lack of transfer between protocols may partially have been a consequence of the adaptive (behavioral) or easy-to-difficult (ERP) sequence of stimuli that we used. Such sequences begin with an initially large frequency difference (20%) in all sessions. The easy-to-difficult procedure clarifies cross-trial regularities, because it enhances the initial cross-trial variability of the informative tone, which is presented in a fixed position. At the same time, the stimulus presented in the other position (that of the fixed reference) induces very little cross-trial variability. This enhanced difference can serve as a cue to the perceptual system indicating that the large cross-trial variability in the response (at the informative position) stems from a genuine change in the external stimulus, whereas the small one probably reflects variability because of internal noise alone. Thus, easy-to-difficult protocols may induce fast benefits in performance, at least partially, because regularity becomes more salient. In that case, fast improvement may indicate structure-specific learning that does not generalize to other structural contexts.
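The variability cue described above can be made concrete with a toy easy-to-difficult sequence in the Reference 1st structure. Only the 20% starting difference is taken from the text; the 1000-Hz reference, the shrink factor of 0.8, the number of trials, and the alternation of higher and lower targets are hypothetical choices made for illustration, not the study's procedure.

```python
import statistics


def build_trials(reference_hz, start_percent, shrink_factor, n_trials):
    """Reference-1st trials whose frequency difference shrinks across trials."""
    trials = []
    percent = start_percent
    for i in range(n_trials):
        sign = 1.0 if i % 2 == 0 else -1.0  # alternate higher and lower targets
        target_hz = reference_hz * (1.0 + sign * percent / 100.0)
        trials.append((reference_hz, target_hz))  # (first tone, second tone)
        percent *= shrink_factor
    return trials


trials = build_trials(reference_hz=1000.0, start_percent=20.0,
                      shrink_factor=0.8, n_trials=10)

# Cross-trial standard deviation of the stimulus at each within-trial position.
first_position_sd = statistics.pstdev([first for first, _ in trials])
second_position_sd = statistics.pstdev([second for _, second in trials])
print(first_position_sd, round(second_position_sd, 1))
```

In this sketch the stimulus at the reference position does not vary across trials, whereas the stimulus at the informative position varies widely while the differences are still large, which is the contrast the text proposes as a cue to the structure.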

Adaptive protocols are typically used in perceptual training paradigms because they indeed yield fast improvement. However, if the objective is for training to transfer to other structural contexts, as is typically the case in applied procedures, the nature of fast learning should be carefully evaluated.

Acknowledgments

We thank Sagi Jaffe-Dax for help with the data analyses. This research was supported by the Israel Science Foundation.

Reprint requests should be sent to Merav Ahissar, Department of Psychology and Edmond and Lily Safra Center for Brain Sciences, Hebrew University of Jerusalem, Jerusalem 91905, Israel, or via e-mail: msmerava@gmail.com.

REFERENCES

Aberg
,
K. C.
, &
Herzog
,
M. H.
(
2009
).
Interleaving bisection stimuli—randomly or in sequence—does not disrupt perceptual learning, it just makes it more difficult.
Vision Research
,
49
,
2591
2598
.
Adini
,
Y.
,
Sagi
,
D.
, &
Tsodyks
,
M.
(
2002
).
Context-enabled learning in the human visual system.
Nature
,
415
,
790
793
.
Ahissar
,
M.
(
2007
).
Dyslexia and the anchoring-deficit hypothesis.
Trends in Cognitive Sciences
,
11
,
458
465
.
Ahissar
,
M.
, &
Hochstein
,
S.
(
1993
).
Attentional control of early perceptual learning.
Proceedings of the National Academy of Sciences, U.S.A.
,
90
,
5718
5722
.
Ahissar
,
M.
, &
Hochstein
,
S.
(
1996
).
Learning pop-out detection: Specificities to stimulus characteristics.
Vision Research
,
36
,
3487
3500
.
Ahissar
,
M.
, &
Hochstein
,
S.
(
1997
).
Task difficulty and the specificity of perceptual learning.
Nature
,
387
,
401
406
.
Ahissar
,
M.
, &
Hochstein
,
S.
(
2004
).
The reverse hierarchy theory of visual perceptual learning.
Trends in Cognitive Sciences
,
8
,
457
464
.
Ahissar
,
M.
,
Lubin
,
Y.
,
Putter-Katz
,
H.
, &
Banai
,
K.
(
2006
).
Dyslexia and the failure to form a perceptual anchor.
Nature Neuroscience
,
9
,
1558
1564
.
Ahissar
,
M.
,
Nahum
,
M.
,
Nelken
,
I.
, &
Hochstein
,
S.
(
2009
).
Reverse hierarchies and sensory learning.
Philosophical Transactions of the Royal Society of London, Series B, Biological Sciences
,
364
,
285
299
.
Ahissar
,
M.
,
Protopapas
,
A.
,
Reid
,
M.
, &
Merzenich
,
M. M.
(
2000
).
Auditory processing parallels reading abilities in adults.
Proceedings of the National Academy of Sciences, U.S.A.
,
97
,
6832
6837
.
Amitay
,
S.
,
Ahissar
,
M.
, &
Nelken
,
I.
(
2002
).
Auditory processing deficits in reading disabled adults.
Journal of the Association for Research in Otolaryngology: JARO
,
3
,
302
320
.
Amitay
,
S.
,
Hawkey
,
D. J. C.
, &
Moore
,
D. R.
(
2005
).
Auditory frequency discrimination learning is affected by stimulus variability.
Perception & Psychophysics
,
67
,
691
698
.
Atienza
,
M.
,
Cantero
,
J. L.
, &
Dominguez-Marin
,
E.
(
2002
).
The time course of neural changes underlying auditory perceptual learning.
Learning & Memory
,
9
,
138
150
.
Banai
,
K.
, &
Ahissar
,
M.
(
2004
).
Poor frequency discrimination probes dyslexics with particularly impaired working memory.
Audiology & Neuro-otology
,
9
,
328
340
.
Banai
,
K.
, &
Ahissar
,
M.
(
2006
).
Auditory processing deficits in dyslexia: Task or stimulus related?
Cerebral Cortex
,
16
,
1718
1728
.
Banai
,
K.
, &
Ahissar
,
M.
(
2010
).
On the importance of anchoring and the consequences of its impairment in dyslexia.
Dyslexia
,
16
,
240
257
.
Banai
,
K.
,
Ortiz
,
J. A.
,
Oppenheimer
,
J. D.
, &
Wright
,
B. A.
(
2010
).
Learning two things at once: Differential constraints on the acquisition and consolidation of perceptual learning.
Neuroscience
,
165
,
436
444
.
Bejjanki
,
V. R.
,
Beck
,
J. M.
,
Lu
,
Z.-L.
, &
Pouget
,
A.
(
2011
).
Perceptual learning as improved probabilistic inference in early sensory areas.
Nature Neuroscience
,
14
,
642
648
.
Bender
,
S.
,
Resch
,
F.
,
Weisbrod
,
M.
, &
Oelkers-Ax
,
R.
(
2004
).
Specific task anticipation versus unspecific orienting reaction during early contingent negative variation.
Clinical Neurophysiology
,
115
,
1836
1845
.
Bialek
,
W.
,
Nemenman
,
I.
, &
Tishby
,
N.
(
2001
).
Predictability, complexity, and learning.
Neural Computation
,
13
,
2409
2463
.
Bishop
,
D. V. M.
, &
McArthur
,
G. M.
(
2005
).
Individual differences in auditory processing in specific language impairment: A follow-up study using event-related potentials and behavioural thresholds.
Cortex
,
41
,
327
341
.
Carcagno
,
S.
, &
Plack
,
C. J.
(
2011
).
Subcortical plasticity following perceptual learning in a pitch discrimination task.
Journal of the Association for Research in Otolaryngology: JARO
,
12
,
89
100
.
Chase
,
G.
, &
Simon
,
H. A.
(
1973
).
Perception in chess.
Cognitive Psychology
,
4
,
55
81
.
Daikhin
,
L.
, &
Ahissar
,
M.
(
2012
).
Responses to deviants are modulated by subthreshold variability of the standard.
Psychophysiology
,
49
,
31
42
.
Demany
,
L.
, &
Semal
,
C.
(
2002
).
Learning to perceive pitch differences.
The Journal of the Acoustical Society of America
,
111
,
1377
1388
.
Donchin
,
E.
, &
Coles
,
M. G. H.
(
1988
).
Is the P300 component a manifestation of context updating?
Behavioral and Brain Sciences
,
11
,
357
374
.
Dorais
,
A.
, &
Sagi
,
D.
(
1997
).
Contrast masking effects change with practice.
Vision Research
,
37
,
1725
1733
.
Dosher
,
B. A.
, &
Lu
,
Z.
(
2009
).
Hebbian reweighting on stable representations in perceptual learning.
Learning & Perception
,
1
,
37
58
.
Ericsson
,
K. A.
, &
Lehmann
,
A. C.
(
1996
).
Expert and exceptional performance: Evidence of maximal adaptation to task constraints.
Annual Review of Psychology
,
47
,
273
305
.
Fahle
,
M.
(
1997
).
Specificity of learning curvature, orientation, and vernier discriminations.
Vision Research
,
37
,
1885
1895
.
Fahle
,
M.
, &
Morgan
,
M.
(
1996
).
No transfer of perceptual learning between similar stimuli in the same retinal position.
Current Biology
,
6
,
292
297
.
Fitzgerald
,
M. B.
, &
Wright
,
B. A.
(
2011
).
Perceptual learning and generalization resulting from training on an auditory amplitude-modulation detection task.
The Journal of the Acoustical Society of America
,
129
,
898
906
.
France
,
S. J.
,
Rosner
,
B. S.
,
Hansen
,
P. C.
,
Calvin
,
C.
,
Talcott
,
J. B.
,
Richardson
,
A. J.
,
et al
(
2002
).
Auditory frequency discrimination in adult developmental dyslexics.
Perception & Psychophysics
,
64
,
169
179
.
Friston
,
K.
(
2005
).
A theory of cortical responses.
Philosophical Transactions of the Royal Society of London, Series B, Biological Sciences
,
360
,
815
836
.
Gibson
,
J. J.
(
1979
).
The ecological approach to visual perception.
Boston
:
Houghton Mifflin
.
Green
,
C. S.
, &
Bavelier
,
D.
(
2003
).
Action video game modifies visual selective attention.
Nature
,
423
,
534
537
.
Haenschel
,
C.
,
Vernon
,
D. J.
,
Dwivedi
,
P.
,
Gruzelier
,
J. H.
, &
Baldeweg
,
T.
(
2005
).
Event-related brain potential correlates of human auditory sensory memory-trace formation.
The Journal of Neuroscience
,
25
,
10494
10501
.
Hawkey
,
D. J. C.
,
Amitay
,
S.
, &
Moore
,
D. R.
(
2004
).
Early and rapid perceptual learning.
Nature Neuroscience
,
7
,
1055
1056
.
Heath
,
S. M.
,
Hogben
,
J. H.
, &
Clark
,
C. D.
(
1999
).
Auditory temporal processing in disabled readers with and without oral language delay.
Journal of Child Psychology and Psychiatry, and Allied Disciplines
,
40
,
637
647
.
Hochstein
,
S.
, &
Ahissar
,
M.
(
2002
).
View from the top: Hierarchies and reverse hierarchies in the visual system.
Neuron
,
36
,
791
804
.
Huang
,
C. B.
,
Lu
,
Z. L.
, &
Dosher
,
B. A.
(
2011
).
Co-learning analysis of two perceptual learning tasks with identical input stimuli supports the reweighting hypothesis.
Vision Research
,
61
,
25
32
.
Jeter
,
P. E.
,
Dosher
,
B. A.
,
Liu
,
S. H.
, &
Lu
,
Z. L.
(
2010
).
Specificity of perceptual learning increases with increased training.
Vision Research
,
50
,
1928
1940
.
Kahneman
,
D.
(
2011
).
Thinking fast and slow.
New York
:
Farrar, Straus and Giroux
.
Karni
,
A.
, &
Sagi
,
D.
(
1991
).
Where practice makes perfect in texture discrimination: Evidence for primary visual cortex plasticity.
Proceedings of the National Academy of Sciences, U.S.A.
,
88
,
4966
4970
.
Kuai
,
S. G.
,
Zhang
,
J. Y.
,
Klein
,
S. A.
,
Levi
,
D. M.
, &
Yu
,
C.
(
2005
).
The essential role of stimulus temporal patterning in enabling perceptual learning.
Nature Neuroscience
,
8
,
1497
1499
.
Lachmann
,
T.
,
Berti
,
S.
,
Kujala
,
T.
, &
Schröger
,
E.
(
2005
).
Diagnostic subgroups of developmental dyslexia have different deficits in neural processing of tones and phonemes.
International Journal of Psychophysiology
,
56
,
105
120
.
Levi
,
D. M.
, &
Polat
,
U.
(
1996
).
Neural plasticity in adults with amblyopia.
Proceedings of the National Academy of Sciences, U.S.A.
,
93
,
6830
6834
.
Li
,
R.
,
Polat
,
U.
,
Makous
,
W.
, &
Bavelier
,
D.
(
2009
).
Enhancing the contrast sensitivity function through action video game training.
Nature Neuroscience
,
12
,
549
551
.
McArthur
,
G. M.
, &
Bishop
,
D. V. M.
(
2004
).
Which people with specific language impairment have auditory processing deficits?
Cognitive Neuropsychology
,
21
,
79
94
.
Mecklinger
,
A.
, &
Ullsperger
,
P.
(
1993
).
P3 varies with stimulus categorization rather than probability.
Electroencephalography and Clinical Neurophysiology
,
86
,
395
407
.
Moore
,
D. R.
,
Halliday
,
L. F.
, &
Amitay
,
S.
(
2009
).
Use of auditory learning to manage listening problems in children.
Philosophical Transactions of the Royal Society of London, Series B, Biological Sciences
,
364
,
409
420
.
Nachmias
,
J.
(
2006
).
The role of virtual standards in visual discrimination.
Vision Research
,
46
,
2456
2464
.
Nagarajan
,
S. S.
,
Blake
,
D. T.
,
Wright
,
B. A.
,
Byl
,
N.
, &
Merzenich
,
M. M.
(
1998
).
Practice-related improvements in somatosensory interval discrimination are temporally specific but generalize across skin location, hemisphere, and modality.
The Journal of Neuroscience
,
18
,
1559
1570
.
Nahum
,
M.
,
Daikhin
,
L.
,
Lubin
,
Y.
,
Cohen
,
Y.
, &
Ahissar
,
M.
(
2010
).
From comparison with classification: A cortical tool for boosting perception.
The Journal of Neuroscience
,
30
,
1128
1136
.
Nahum, M., Nelken, I., & Ahissar, M. (2008). Low-level information and high-level perception: The case of speech in noise. PLoS Biology, 6, e126.
Nieuwenhuis, S., Aston-Jones, G., & Cohen, J. D. (2005). Decision making, the P3, and the locus coeruleus-norepinephrine system. Psychological Bulletin, 131, 510–532.
Oganian, Y., & Ahissar, M. (2012). Poor anchoring limits dyslexics' perceptual, memory, and reading skills. Neuropsychologia, 50, 1895–1905.
Orduña, I., Liu, E. H., Church, B. A., Eddins, A. C., & Mercado, E. (2012). Evoked-potential changes following discrimination learning involving complex sounds. Clinical Neurophysiology, 123, 711–719.
Poggio, T., Fahle, M., & Edelman, S. (1992). Fast perceptual learning in visual hyperacuity. Science, 256, 1018–1021.
Polat, U., Ma-Naim, T., Belkin, M., & Sagi, D. (2004). Improving vision in adult amblyopia by perceptual learning. Proceedings of the National Academy of Sciences, U.S.A., 101, 6692–6697.
Polat, U., & Sagi, D. (1994). The architecture of perceptual spatial interactions. Vision Research, 34, 73–78.
Polich, J. (2007). Updating P300: An integrative theory of P3a and P3b. Clinical Neurophysiology, 118, 2128–2148.
Ramus, F., Rosen, S., Dakin, S. C., Day, B. L., Castellote, J. M., White, S., et al. (2003). Theories of developmental dyslexia: Insights from a multiple case study of dyslexic adults. Brain, 126, 841–865.
Raviv, O., Ahissar, M., & Loewenstein, Y. (2012). How recent history affects perception: The normative approach and its heuristic approximation. PLoS Computational Biology, 8, e1002731.
Recanzone, G. H., Schreiner, C. E., & Merzenich, M. M. (1993). Plasticity in the frequency representation of primary auditory cortex following discrimination training in adult owl monkeys. The Journal of Neuroscience, 13, 87–103.
Russo, N. M., Nicol, T. G., Zecker, S. G., Hayes, E. A., & Kraus, N. (2005). Auditory training improves neural timing in the human brainstem. Behavioural Brain Research, 156, 95–103.
Sabin, A. T., Eddins, D. A., & Wright, B. A. (2012). Perceptual learning evidence for tuning to spectrotemporal modulation in the human auditory system. The Journal of Neuroscience, 32, 6542–6549.
Schwartz, S., Maquet, P., & Frith, C. (2002). Neural correlates of perceptual learning: A functional MRI study of visual texture discrimination. Proceedings of the National Academy of Sciences, U.S.A., 99, 17137–17142.
Seitz, A. R., Kim, D., & Watanabe, T. (2009). Rewards evoke learning of unconsciously processed visual stimuli in adult humans. Neuron, 61, 700–707.
Seitz, A., & Watanabe, T. (2005). A unified model for perceptual learning. Trends in Cognitive Sciences, 9, 329–334.
Sheehan, K. A., McArthur, G. M., & Bishop, D. V. M. (2005). Is discrimination training necessary to cause changes in the P2 auditory event-related brain potential to speech sounds? Brain Research, 25, 547–553.
Spang, K., Grimsen, C., Herzog, M. H., & Fahle, M. (2010). Orientation specificity of learning vernier discriminations. Vision Research, 50, 479–485.
Tecce, J. (1971). Contingent negative variation and individual differences. A new approach in brain research. Archives of General Psychiatry, 24, 1–16.
Tecce, J. (1972). Contingent negative variation (CNV) and psychological processes in man. Psychological Bulletin, 77, 73–108.
Tong, Y., Melara, R. D., & Rao, A. (2009). P2 enhancement from auditory discrimination training is associated with improved reaction times. Brain Research, 1297, 80–88.
Tremblay, K. L., & Kraus, N. (2002). Auditory training induces asymmetrical changes in cortical neural activity. Journal of Speech, Language, and Hearing Research, 45, 564–572.
Tsushima, Y., Seitz, A. R., & Watanabe, T. (2008). Task-irrelevant learning occurs only when the irrelevant feature is weak. Current Biology, 18, R516–R517.
Tzounopoulos, T., & Kraus, N. (2009). Learning to encode timing: Mechanisms of plasticity in the auditory brainstem. Neuron, 62, 463–469.
Van Wassenhove, V., & Nagarajan, S. S. (2007). Auditory cortical plasticity in learning to discriminate modulation rate. The Journal of Neuroscience, 27, 2663–2672.
Verleger, R. (1988). Event-related potentials and cognition: A critique of the context updating hypothesis and an alternative interpretation of P3. Behavioral and Brain Sciences, 11, 343–356.
Watanabe, T., Náñez, J. E., Koyama, S., Mukai, I., Liederman, J., & Sasaki, Y. (2002). Greater plasticity in lower-level than higher-level visual motion processing in a passive perceptual learning task. Nature Neuroscience, 5, 1003–1009.
Wright, B. A., Sabin, A. T., Zhang, Y., Marrone, N., & Fitzgerald, M. B. (2010). Enhancing perceptual learning by combining practice with periods of additional sensory stimulation. The Journal of Neuroscience, 30, 12868–12877.
Wright, B. A., Wilson, R. M., & Sabin, A. T. (2010). Generalization lags behind learning on an auditory perceptual task. The Journal of Neuroscience, 30, 11635–11639.
Wright, B. A., & Zhang, Y. (2009). A review of the generalization of auditory learning. Philosophical Transactions of the Royal Society of London, Series B, Biological Sciences, 364, 301–311.
Xiao, L. Q., Zhang, J. Y., Wang, R., Klein, S. A., Levi, D. M., & Yu, C. (2008). Complete transfer of perceptual learning across retinal locations enabled by double training. Current Biology, 18, 1922–1926.
Yang, T., & Maunsell, J. H. R. (2004). The effect of perceptual learning on neuronal responses in monkey visual area V4. The Journal of Neuroscience, 24, 1617–1626.
Yu, C., Klein, S. A., & Levi, D. M. (2004). Perceptual learning in contrast discrimination and the (minimal) role of context. Journal of Vision, 4, 169–182.
Zhang, J. Y., Kuai, S. G., Xiao, L. Q., Klein, S. A., Levi, D. M., & Yu, C. (2008). Stimulus coding rules for perceptual learning. PLoS Biology, 6, e197.
Zhang, J. Y., Zhang, G. L., Xiao, L. Q., Klein, S. A., Levi, D. M., & Yu, C. (2010). Rule-based learning explains visual perceptual learning and its specificity and transfer. The Journal of Neuroscience, 30, 12323–12328.

Author notes

*The first two authors contributed equally to this study.