## Abstract

Language has a complex grammatical system we still have to understand computationally and biologically. However, some evolutionarily ancient mechanisms have been repurposed for grammar so that we can use insight from other taxa into possible circuit-level mechanisms of grammar. Drawing upon recent evidence for the importance of disinhibitory circuits across taxa and brain regions, I suggest a simple circuit that explains the acquisition of core grammatical rules used in 85% of the world's languages: grammatical rules based on sameness/difference relations. This circuit acts as a sameness detector. “Different” items are suppressed through inhibition, but presenting two “identical” items leads to inhibition of inhibition. The items are thus propagated for further processing. This sameness detector thus acts as a feature detector for a grammatical rule. I suggest that having a set of feature detectors for elementary grammatical rules might make language acquisition feasible based on relatively simple computational mechanisms.

## INTRODUCTION

Language acquisition is fast, largely based on positive evidence (or sometimes no evidence at all; Senghas, Kita, & Özyürek, 2004; Goldin-Meadow & Mylander, 1998), goes far beyond what learners hear or see in their environment (Pinker, 1984; Chomsky, 1959), and results in a uniquely complex grammatical system that stands out in the animal kingdom (Yang, 2013; Hauser, Chomsky, & Fitch, 2002). Even seemingly straightforward “memory” problems such as learning the meanings of words hide complexities that call for human-specific grammatical adaptations (Medina, Snedeker, Trueswell, & Gleitman, 2011; Pinker & Jackendoff, 2005). Unsurprisingly, we know very little about the underlying computational mechanisms at the circuit level.

However, some linguistic mechanisms are evolutionarily ancient and have been repurposed for linguistic use (Fitch, 2017; Endress, Cahill, et al., 2009; Endress, Nespor, et al., 2009; Dehaene & Cohen, 2007). In such cases, it might be possible to identify core linguistic mechanisms whose systems-level implementation might be tractable due to its evolutionary history.

Here, I use sameness/difference relations as a case in point. I will first show that many grammatical rules are based on such relations, especially in morphology and phonology, but that similar relations are critical in many other domains and animals, suggesting that they reflect a linguistic core mechanism with evolutionarily ancient roots. I will then suggest that such relations can be computed using a ubiquitous processing motif: disinhibition among neurons or neural populations.

### Sameness/Difference Relations in Language and Other Domains and Animals

Sameness/difference relations are critical for many aspects of linguistic structure, especially in phonology and morphology. For example, some 85% of the world's languages use some form of reduplication (Rubino, 2013). Among many other uses, reduplications can signal changes in word class (e.g., from noun to verb, as in the Marshallese contrast between “takin–sock” and “takinkin–to wear socks”; Moravcsik, 1978), attenuation (as in the Alabama contrast between “kasatka–cold” and “kássatka–cool”; Hardy & Montler, 1988), or intensification; they can mark differences in number (e.g., singular vs. plural), tense (e.g., past vs. present), aspect (e.g., continued vs. repeated occurrence or temporary vs. permanent), size, or case (see Rubino, 2013, and references therein).

Phonological processes also often appeal to sameness/difference relations, with some processes requiring some features to be identical within a relevant constituent and others requiring them to be different. Processes that require identical features include vowel harmony and assimilation. Specifically, in languages with vowel harmony, vowels within words (or smaller domains) need to have one or more features in common (Rose & Walker, 2011). For example, Hungarian words generally have either only back vowels or only front vowels; grammatical suffixes thus come in two varieties, one with back vowels and one with front vowels. Accordingly, the dative suffix is –nak for words like “ablak–window” (resulting in forms like “ablaknak”) and –nek for words like “bíró–judge” (resulting in forms like “bírónek”; Hayes & Londe, 2006). Likewise, in languages with consonant assimilation, consonants must share a feature with other surrounding consonants. For example, in English, “football” might be pronounced as “foopball” because the place of articulation of the [t] at the end of [foot] gets assimilated to the place of articulation of the [b] at the start of “ball”; in contrast, in French, “football” might be pronounced as “foodball” because the voicing feature of the [t] (but not the place feature) gets assimilated to the following [b] (Darcy, Ramus, Christophe, Kinzler, & Dupoux, 2009). Both vowel harmony and assimilation thus introduce sameness relations among phonemes. Listeners use these sameness relations not only in word recognition (Darcy et al., 2009; Mitterer & Blomert, 2003; Suomi, McQueen, & Cutler, 1997) but also as cues to learn new words (Vroomen, Tuomainen, & de Gelder, 1998). Furthermore, sameness relations in the form of vowel harmony often interact with other areas of grammar, such as stress assignment or morphology (Rose & Walker, 2011).

Although vowel harmony and assimilation require sameness relations among phonemic features, other phonological processes impose difference relations. Such processes include the obligatory contour principle (OCP; Frisch, Pierrehumbert, & Broe, 2004; McCarthy, 1986). Initially, the OCP was proposed to account for the observation that, in certain tone languages, tones cannot be repeated within words, but it also applies to other phonological phenomena. For example, in Semitic languages like Arabic and Hebrew, the basic meaning of words is given by their consonantal root; roots like /k t b/ are then transformed into surface forms such as “kataba–he wrote” and “kutiba–it was written” (Frisch et al., 2004). The OCP prevents consonantal roots from having repeated consonants, whereas other morphological processes can “create” (rather than prevent) sameness relations among consonants (Frisch et al., 2004; McCarthy, 1986). Such rules might also interact with other areas of grammar (Yip, 1988), and speakers apply them even when presented with novel nonsense words (e.g., Frisch & Zawaydeh, 2001; Berent & Shimron, 1997).

Sameness relations are also important during language acquisition. Reduplications are prominent in child-directed speech across languages (Ferguson, 1964), and children themselves “invent” forms with reduplicated syllables; these reduplicated forms might be important for acquiring multisyllabic words (Schwartz, Leonard, Wilcox, & Folger, 1980) and syllable-final consonants that would otherwise be lost (Fee & Ingram, 1982).

More generally, sameness relations have been critical for defining the computational complexity of phonological rules (Manaster-Ramer, 1986; Culy, 1985), and in developmental psychology, rules based on sameness relations have been the most prominent assay for studying rule learning in human infants (Marcus, Vijayan, Rao, & Vishton, 1999), to the extent that in a recent meta-analysis of “rule learning” in infancy, rule learning was treated as synonymous with the learning of sameness relations (Rabagliati, Ferguson, & Lew-Williams, 2019).

Sameness relations are also important for other forms of language use. Not only are rhymes and alliterations important in poetry (Fabb, 2015), but many language games that spontaneously arise in children also make extensive use of sameness relations in the form of reduplications (Bagemihl, 1995). For example, in the Chinese May-ka language game, syllables are duplicated, and then the vowel of the first duplicate is replaced by “ay” and the consonant of the second duplicate is replaced by “k”; ma (mother) thus becomes may-ka (Bao, 1990; Yip, 1982).
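The duplication-and-replacement rule of the May-ka game can be written down in a few lines. The following is only a toy sketch: it assumes romanized syllables consisting of a consonant onset followed by a vowel, ignores tone, and the function name `may_ka` is hypothetical.

```python
VOWELS = "aeiou"  # assumption: simple romanized vowels, no tone marks


def may_ka(syllable):
    """Duplicate a consonant+vowel syllable, then replace the vowel of the
    first copy with "ay" and the consonant of the second copy with "k"."""
    onset = syllable.rstrip(VOWELS)   # leading consonant(s)
    vowel = syllable[len(onset):]     # trailing vowel(s)
    if not onset or not vowel:
        raise ValueError("expected a consonant-vowel syllable")
    first_copy = onset + "ay"         # vowel replaced by "ay"
    second_copy = "k" + vowel         # consonant replaced by "k"
    return first_copy + "-" + second_copy


print(may_ka("ma"))  # may-ka
```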

Despite their simplicity, sameness relations thus appear to be a core part of the language faculty.

However, sameness/difference rules are clearly not specific to language. They are crucial for many other aspects of cognition, including motor learning (Brooks, 1986), any comparison of sensory input to predictions or internal state (e.g., novelty detection in the hippocampus; Kumaran & Maguire, 2007) and STM tasks such as delayed match to sample tasks (Cope et al., 2018; Engel & Wang, 2011). Accordingly, grammar-like rules based on sameness/difference relations can be learned in many nonlinguistic domains in humans (Dawson & Gerken, 2009; Endress, Dehaene-Lambertz, & Mehler, 2007; Marcus, Fernandes, & Johnson, 2007; Saffran, Pollak, Seibel, & Shkolnik, 2007) and by many nonhuman animals (Versace, Spierings, Caffini, Ten Cate, & Vallortigara, 2017; Martinho & Kacelnik, 2016; Smirnova, Zorina, Obozova, & Wasserman, 2015; de la Mora & Toro, 2013; Neiworth, 2013; Hauser & Glynn, 2009; Murphy, Mondragón, & Murphy, 2008; Pepperberg, 1987; but see Hupé, 2017; Langbein & Puppe, 2017; van Heijningen, Visser, Zuidema, & ten Cate, 2009), possibly through a specialized sameness detector (Endress, 2013; Endress et al., 2007) that might exist from birth (Gervain, Berent, & Werker, 2012; Gervain, Macagno, Cogoi, Peña, & Mehler, 2008; Antell, Caron, & Myers, 1985). The computations underlying sameness/difference relations thus reflect a core linguistic mechanism whose systems-level implementation might be tractable due to its evolutionary history.

### Disinhibition-based Computations

Here, drawing upon recent evidence stressing the importance of disinhibitory circuits (neurons that inhibit other inhibitory neurons) across a variety of taxa and brain regions (Koyama et al., 2016; Goddard, Mysore, Bryant, Huguenard, & Knudsen, 2014; Hangya et al., 2014; Xu et al., 2013; Mysore & Knudsen, 2012; Chevalier & Deniau, 1990), I suggest a simple circuit that acts as a sameness detector. Disinhibition has been observed in a variety of brain areas (Letzkus et al., 2015; Chevalier & Deniau, 1990), and some interneuron populations specifically inhibit other inhibitory interneurons (Hangya et al., 2014; Xu et al., 2013). Critically, some interneuron types receive both local and long-range input; such interneurons have been found to inhibit other inhibitory interneurons in auditory (Pi et al., 2013), visual (Pfeffer, Xue, He, Huang, & Scanziani, 2013), somatosensory (Lee, Kruglikov, Huang, Fishell, & Rudy, 2013), and prefrontal cortex (Pi et al., 2013), from where they can exert spatially remarkably specific disinhibition on other populations (Zhang et al., 2014). Accordingly, Hangya et al. (2014) argued that this disinhibitory circuit might be a cortical circuit motif. Other authors suggested a more local disinhibitory circuit motif with mutual inhibition among inhibitory neurons (Koyama & Pujala, 2018; Koyama et al., 2016; Goddard et al., 2014; Mysore & Knudsen, 2012).

Disinhibitory circuits have been proposed to account for a variety of cognitive phenomena, including attentional selection (Zhang et al., 2014; van Der Velde & de Kamps, 2001), gain control (Fu et al., 2014), sequential discriminations of stimulus strength (Miller & Wang, 2006; Machens, Romo, & Brody, 2005; but see Barak, Sussillo, Romo, Tsodyks, & Abbott, 2013), categorization of stimuli (Goddard et al., 2014; Mysore & Knudsen, 2012; Kusunoki, Sigala, Nili, Gaffan, & Duncan, 2010), behavioral response selection (Zhao et al., 2019; Jovanic et al., 2016), associative learning (Letzkus et al., 2011), plasticity (Fu, Kaneko, Tang, Alvarez-Buylla, & Stryker, 2015), and social behavior (Marlin, Mitre, D'amour, Chao, & Froemke, 2015; Owen et al., 2013). Here, I suggest that the same biological mechanisms might provide a circuit-level mechanism for a core grammatical computation based on sameness versus difference computations.

### Models of Sameness/Difference Relations

A number of models of how sameness relations might be computed have been proposed in the literature (Cope et al., 2018; Arena et al., 2013; Ludueña & Gros, 2013; Engel & Wang, 2011; J. S. Johnson, Spencer, Luck, & Schöner, 2009; Wen, Ulloa, Husain, Horwitz, & Contreras-Vidal, 2008; Hasselmo & Wyble, 1997; Carpenter & Grossberg, 1987). The underlying principles and assumptions vary substantially across models. Some rely on the fact that repeatedly activated representations suffer some form of neural “fatigue” (Kumaran & Maguire, 2007; Grill-Spector, Henson, & Martin, 2006); others rely on circuitry where the “combined” input from some form of memory and from sensory representations matching (or mismatching) the memory representations must be sufficiently strong (Wen et al., 2008; Hasselmo & Wyble, 1997; Carpenter & Grossberg, 1987) or where the “difference” between input from memory and from sensory representations is the critical variable (Engel & Wang, 2011). Still other models detect reduced levels of inhibition for novel compared with previously encountered items (Cope et al., 2018; J. S. Johnson, Spencer, et al., 2009). I discuss these models in more detail in Supplementary Material 1, where I show that they fall short on at least one of two criteria of grammar learning: They either do not generalize to unseen exemplars or they require labeled counterexamples.

To better illustrate the computational principles underlying the current disinhibition-based circuit, I will first present a version of the model that can detect sameness relations in sequentially presented stimuli. Following this, I will sketch a version of the model that can detect sameness relations in spatially distributed, simultaneously presented stimuli and finally a model that can detect sameness relations in both simultaneously presented stimuli and sequentially presented stimuli.

## RESULTS

### Sameness Detection for Sequential Stimuli

Figure 1A shows a possible disinhibition-based architecture of how sameness might be detected for sequentially presented items. (Model equations are given in Appendix A; an R implementation is available online.) The model comprises two populations of neurons (hereafter “layers”) that encode features of items (e.g., frequency, color, and so on; in Figure 1, the features are represented as geometric shapes).

Figure 1.

A disinhibition-based sameness detector for (A) sequentially and (B) simultaneously presented identical items. The geometric shapes (squares and triangles) stand for populations of neurons that encode features of the items (e.g., frequency, shape); filled shapes are currently active, whereas empty shapes are currently inactive. (A) Units in the “source layer” (bottom gray box) receive (sensory or other) input. Units in the “copy layer” (top gray box) receive one-to-one excitatory input from the source layer. Critically, units from the “inhibition layer” (right gray box) exert tonic inhibition on the copy layer. (A, left) Upon initial presentation of a feature (represented here as a square), all units in the inhibition layer are active. As a result, excitatory input from the source layer is not propagated to the copy layer. (A, top right) Feature-specific inhibition from the source layer to the corresponding units in the inhibition layer shuts down the inhibitory input to the copy layer. If the same item is presented again during the time window of reduced inhibition, input from the source layer is propagated to the copy layer. (A, bottom right) If a new, nonidentical item is presented, the source layer cannot drive the copy layer because the corresponding units in the inhibition layer have not been inhibited. Sameness detection thus proceeds by reading out the copy layer, as only repeated items are propagated to the copy layer. (B) Sameness detection in simultaneously presented, spatially arranged items. The source layer consists of populations of neurons coding for features (arranged in the y direction), but these units encode space as well (arranged in the x direction). Tonically active inhibitory (inter)neurons (small gray box on the right) prevent activation in the copy layer (top gray box). Critically, they receive inhibitory input from those units in the source layer that code for the same feature and excitatory input from units coding for other features. 
For example, units representing squares in the input layer inhibit all units representing squares in the inhibition layer, and excite all other units. (B, left) If the stimuli consist of two identical items (squares), the combined inhibitory input from the identical items in the source layer shuts down the corresponding units in the inhibition layer, which lets identical items “pass through” to the copy layer. (B, right) In contrast, when the stimuli consist of two different items, these singleton features are insufficient to drive the copy population due to inhibition from the inhibition layer.

The “source layer” receives input; input can be sensory or nonsensory, depending on where this circuit is located in the brain. Units in the “copy layer” receive excitatory one-to-one input from units in the source layer that code for the same feature. However, they also receive feature-specific tonic inhibition from an “inhibition layer” (which might consist of interneurons); tonic inhibition has been observed in a variety of brain regions and might subserve functions such as maintaining an appropriate level of excitability or the suppression of undesirable motor programs (Benjamin, Staras, & Kemenes, 2010; Farrant & Nusser, 2005; Semyanov, Walker, Kullmann, & Silver, 2004).

Because of the inhibition from the inhibition layer to the copy layer, input from the source layer is not propagated to the copy layer with a single stimulation. The critical aspect of this circuit is that each feature in the source layer also “inhibits” the corresponding feature in the inhibition layer, which, in turn, reduces inhibitory input to the copy layer for that feature. A similar phenomenon has been observed in auditory fear conditioning, where inhibition of (inhibitory) parvalbumin-positive interneurons allowed for associations between sounds and aversive stimuli to be formed (Letzkus et al., 2011).

Accordingly, once the inhibitory input to the copy layer ceases, there will be a time window during which the excitatory input from the source layer can drive the corresponding units in the copy layer. As a result, only repeated items will be propagated to the copy layer. Any readout mechanism for the copy layer (e.g., a population of thresholded neurons) could thus act as a sameness detector.
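The logic of the sequential circuit can be illustrated with a minimal discrete-time sketch. This is not the R implementation mentioned above, and it does not reproduce the equations in Appendix A: the rectified-linear units, the unit weights, the one-step window of disinhibition, and all function and parameter names are assumptions made for illustration only.

```python
def run_sequential_detector(sequence, n_features, tonic=1.0,
                            w_exc=1.0, w_inh=1.0, w_dis=1.0):
    """Return the copy-layer activation after each item in `sequence`.

    `sequence` is a list of feature indices presented one at a time.
    All weights and the rectified-linear update are illustrative assumptions.
    """
    inhibition = [tonic] * n_features  # tonically active inhibition layer
    trace = []
    for item in sequence:
        # Source layer: one active unit for the presented feature.
        source = [1.0 if f == item else 0.0 for f in range(n_features)]
        # Copy layer is updated first, so it still sees the inhibition
        # left behind by the previous item.
        copy = [max(0.0, w_exc * s - w_inh * h)
                for s, h in zip(source, inhibition)]
        # Then the source layer inhibits the matching inhibition unit,
        # opening a one-step window of reduced inhibition for that feature.
        inhibition = [max(0.0, tonic - w_dis * s) for s in source]
        trace.append(copy)
    return trace


def detects_sameness(sequence, n_features, threshold=0.5):
    """Thresholded readout of the copy layer (a hypothetical readout)."""
    trace = run_sequential_detector(sequence, n_features)
    return any(max(copy) > threshold for copy in trace)


print(detects_sameness([0, 0], n_features=3))  # True: repeated feature
print(detects_sameness([0, 1], n_features=3))  # False: different features
```

In this sketch, a feature reaches the copy layer only on its second consecutive presentation, because only then is the corresponding inhibition unit silenced.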

I simulated this model at various levels of noise; at each noise level, I ran 50 simulations, representing 50 virtual participants. Figure 2 (left) shows that, in the copy layer, activation for repeated features is high, whereas activation for nonrepeated features is low. Repeated items are thus highly discriminable from nonrepeated items. This result is robust to the simulated noise level. A simple disinhibition-based circuit can thus act as a sameness detector that discriminates repeated features from not repeated features.
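A noise sweep of this kind can be sketched as follows. The sketch makes several assumptions that are not taken from the model equations: Gaussian noise is added to every unit on every update, activations are rectified at zero, weights are set to one, and the names `noisy_copy_activation` and `simulate` are hypothetical. It only illustrates the procedure of running 50 virtual participants per noise level.

```python
import random
import statistics


def noisy_copy_activation(second_item, noise_sd, rng, n_features=4):
    """Present feature 0, then `second_item`; return the copy-layer
    activation of `second_item` after the second presentation."""
    def noisy(x):
        # Rectified activation with additive Gaussian noise (an assumption).
        return max(0.0, x + rng.gauss(0.0, noise_sd))

    inhibition = [1.0] * n_features
    copy = [0.0] * n_features
    for item in (0, second_item):
        source = [noisy(1.0 if f == item else 0.0) for f in range(n_features)]
        copy = [noisy(s - h) for s, h in zip(source, inhibition)]
        inhibition = [noisy(1.0 - s) for s in source]
    return copy[second_item]


def simulate(noise_levels, n_participants=50, seed=1):
    """Mean copy-layer activation for repeated vs. nonrepeated items."""
    rng = random.Random(seed)
    results = {}
    for sd in noise_levels:
        repeated = [noisy_copy_activation(0, sd, rng)
                    for _ in range(n_participants)]
        nonrepeated = [noisy_copy_activation(1, sd, rng)
                       for _ in range(n_participants)]
        results[sd] = (statistics.mean(repeated),
                       statistics.mean(nonrepeated))
    return results


for sd, (rep, non) in simulate([0.0, 0.05, 0.1, 0.2]).items():
    print(f"noise sd={sd}: repeated={rep:.2f}, nonrepeated={non:.2f}")
```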

Figure 2.

Activation of repeated or nonrepeated items in the copy layer. The noise level is the standard deviation of normally distributed noise centered at zero. In each curve, the middle line shows the average activation across 50 simulations, representing 50 participants. The shaded areas represent standard errors from the mean. (Top) Activation in the models shown in Figure 1 that detect either sequentially (Figure 1A) or simultaneously presented (Figure 1B) identical items. (Left) In the “sequential” sameness detector (Figure 1A), the activity of repeated items is highly discriminable from that of nonidentical items even for high noise levels. (Right) In the “simultaneous” sameness detector (Figure 1B), the activity of repeated items is highly discriminable from that of nonrepeated items even for high noise levels.

Although the primary goal of this model is to detect when two temporally adjacent items are identical, whether or not it can detect the sameness of two objects with intervening material depends on the time constants of the disinhibitory effects. If disinhibition is sufficiently long lasting, the model will also detect the sameness of two nonadjacent items (e.g., of the two As in the sequence ABA). If so, it would predict that the further two items are separated (in terms of the amount of intervening time and/or the number of intervening items, which might or might not have separable effects), the harder it should become to detect the sameness of the two items. At least in infants, it might be harder to detect nonadjacent repetitions compared with adjacent repetitions (S. P. Johnson, Fernandes, et al., 2009; Kovács & Mehler, 2008, 2009).
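The dependence on the time constant can be made concrete with a small sketch in which inhibition recovers gradually toward its tonic level. The exponential recovery rule, the `recovery` parameter, and the function name are assumptions for illustration; they are not taken from the model equations.

```python
def repetition_strength(n_intervening, recovery, tonic=1.0):
    """Copy-layer activation of the second A in a sequence
    A, X1, ..., Xn, A with `n_intervening` distinct intervening items.

    After being silenced, each inhibition unit moves a fraction `recovery`
    of the way back to its tonic level on every step (an assumption).
    """
    sequence = list(range(n_intervening + 1)) + [0]
    n_features = n_intervening + 1
    inhibition = [tonic] * n_features
    copy = [0.0] * n_features
    for item in sequence:
        source = [1.0 if f == item else 0.0 for f in range(n_features)]
        # Copy layer first, using the inhibition left by earlier items.
        copy = [max(0.0, s - h) for s, h in zip(source, inhibition)]
        # Presented feature silences its inhibition unit; all other
        # inhibition units recover toward the tonic level.
        inhibition = [0.0 if s > 0 else h + recovery * (tonic - h)
                      for s, h in zip(source, inhibition)]
    return copy[0]


for lag in range(4):
    print(lag, round(repetition_strength(lag, recovery=0.3), 3))
```

With slow recovery, the response to the repeated item shrinks as more items intervene; with `recovery=1.0`, inhibition is fully restored after one step and only adjacent repetitions are detected.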

That being said, the separation of two items is unlikely to be the only determinant of how easy it is to detect whether they are the same. For example, in a longer sequence like ABCDEDFGA, the two As are farther apart than the two Ds. Still, it might be easier to detect the sameness of the two As than that of the two Ds despite their greater distance, because initial and final items are more salient than medial items (Benavides-Varela & Mehler, 2015; Endress, Scholl, & Mehler, 2005). As a result, the representations of initial items are likely stronger than those of medial items and might thus create stronger and longer-lasting disinhibition. However, the goal of the current model is just to show that a simple and ubiquitous mechanism such as disinhibition can serve as the basis of a sameness detector, whereas more detailed predictions require a biophysically more realistic model.

### Sameness Detection for Simultaneous Stimuli

In its current stage, the model can detect the sameness of sequentially presented stimuli, but not of spatially distributed, simultaneously presented stimuli, simply because space is not represented. Figure 1B shows a version of the model where items are presented simultaneously rather than sequentially. Again, there is a source layer, a copy layer, and an inhibition layer. The model differs from the sequential model in three critical aspects. First, all layers now represent space. In Figure 1B, the vertical axis represents the features as before, whereas the horizontal axis represents the spatial locations of the items (though space is presumably represented in some topological order in real neuronal populations). This change is necessary so that two simultaneously presented identical objects can be represented.

Second, the connectivity between the source layer and the inhibition layer has been changed. Units in the source layer send (i) inhibitory input to all units in the inhibition layer that code for the same feature across all locations and (ii) excitatory input to all units in the inhibition layer that code for different features; in other words, there is center-surround disinhibition among features. This ensures that, in the copy layer, different-feature input from the source layer stays inhibited, whereas same-feature input is disinhibited.

Third, the sequential model needs to update the activation of the copy layer before that of the inhibition layer; if the inhibition layer were updated first, a single presentation of a feature would be sufficient to produce disinhibition. In contrast, the simultaneous model needs to update the inhibition layer before the copy layer; if the copy layer were updated first, there would be no disinhibition for identical features.
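A minimal sketch of the simultaneous circuit follows. It assumes rectified-linear units, a tonic inhibition level of 2 so that a single item cannot silence its own inhibition unit, and unit weights; these values and the function name are illustrative assumptions rather than the parameters of the actual model. Because source input to the inhibition layer spans all locations, the sketch uses one inhibition value per feature.

```python
def run_simultaneous_detector(stimulus, n_features, tonic=2.0,
                              w_same=1.0, w_diff=1.0):
    """Return copy-layer activation as copy[feature][location].

    `stimulus` lists the feature index shown at each location
    (None for an empty location). Weights are illustrative assumptions.
    """
    n_locations = len(stimulus)
    source = [[1.0 if stimulus[x] == f else 0.0 for x in range(n_locations)]
              for f in range(n_features)]
    total_active = sum(sum(row) for row in source)
    copy = []
    for f in range(n_features):
        n_same = sum(source[f])          # same feature, all locations
        n_other = total_active - n_same  # units coding other features
        # Inhibition layer is updated before the copy layer:
        # same-feature input inhibits it, other-feature input excites it.
        inhibition = max(0.0, tonic - w_same * n_same + w_diff * n_other)
        copy.append([max(0.0, s - inhibition) for s in source[f]])
    return copy


print(run_simultaneous_detector([0, 0], n_features=2))  # identical items pass
print(run_simultaneous_detector([0, 1], n_features=2))  # different items blocked
```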

I simulated this architecture using 50 virtual participants. As shown in Figure 2 (right), identical items are highly discriminable from nonidentical items even at high levels of noise. A simple, disinhibition-based circuit can thus detect sameness relations among simultaneously presented identical objects.

### A Combined Model of Sameness Detection for Simultaneous and Sequential Stimuli

Although the main differences between the sequential and the simultaneous circuit are simply due to how stimuli are presented (i.e., spatial representations and lateral inhibition among features could be added to the sequential model but are not necessary), the different update orders raise the question of whether a combined model can be developed that detects both sequential and simultaneous sameness relations. Practically speaking, sequential and simultaneous presentation might not be as different as they seem. For example, if observers attend simultaneously presented items one after the other (Liu & Becker, 2013; Vogel, Woodman, & Luck, 2006; but see Mance, Becker, & Liu, 2012), we need a sequential model to account for “simultaneous” sameness detection; conversely, if sequential items are placed in some kind of (short-term) memory before being compared, we need a simultaneous model for sameness detection in sequentially presented items. As such, a combined sequential/simultaneous model might be neither necessary nor desirable.

Be that as it may, such a combined model is shown in Figure 3.

Figure 3.

Combined disinhibition-based sameness detector for both sequential and simultaneous sameness relations. As in the simultaneous circuit from Figure 1B, the source layer (bottom left gray box) consists of populations of neurons coding for features (arranged in the y direction) and spatial locations (arranged in the x direction). Tonically active units in the inhibition layer (top right gray box) prevent activation in the copy layer (top left gray box). Units in the inhibition layer receive (i) inhibitory input from the source layer for units coding for the same feature and (ii) excitatory input for units coding for other features, leading to center-surround disinhibition among features and, in the copy layer, to inhibition for different-feature input and disinhibition for same-feature input. Critically and in contrast to the simultaneous model from Figure 1B, units in the source layer do not inhibit units in the inhibition layer that code for features at their own spatial location; they disinhibit features only at other locations. To obtain disinhibition at the spatial location of a given unit, a self-inhibition layer (bottom right gray box) was added that receives one-to-one input from the source layer and that specifically inhibits units in the inhibition layer that code for the same feature at the same spatial location. This delays same-feature/same-location disinhibition to prevent a single sequential presentation of a feature from disinhibiting that feature.

This “combined” sameness detector is similar to the simultaneous sameness detector in that it comprises a source layer, a copy layer, and an inhibition layer and that the copy layer receives excitatory input from the source layer. However, (dis)inhibition is organized differently. The copy layer still receives tonic inhibition from those units in the inhibition layer that code for the same feature and spatial position. Furthermore, each feature of the source layer inhibits the corresponding feature in the inhibition layer across spatial positions (i.e., it disinhibits this feature in the copy layer) and excites all other features.

The critical difference is that disinhibition of features at the same location is delayed. To achieve this delay, I removed the direct connections between source-layer and inhibition-layer units that code for the same feature at the same location (while keeping the center-surround disinhibition at other locations). Instead, I added a “self-disinhibition layer” where each unit (i) receives excitatory input from the corresponding feature and location in the source layer and (ii) sends inhibitory input to all units coding for the same feature (across locations) in the inhibition layer. (Although these modifications might seem to some extent ad hoc, as mentioned above, it is not clear if a combined sequential/simultaneous model is necessary or desirable in the first place.)

As shown in Figure 4, identical items were highly discriminable from nonidentical items in the simultaneous situation across noise levels; in contrast, in the sequential situation, discriminability suffered as noise increased.

Figure 4.

Activation in the copy layer of the combined sequential/simultaneous sameness detector (Figure 3). (Left) In the combined sequential/simultaneous sameness detector, repeated features can be repeated either at the same location or at a different location. Although activation of (same or different location) repeated items is highly discriminable from activation for nonrepeated items for moderate noise levels, discriminability becomes much poorer at high noise levels, when the standard deviation of the noise reaches about 15% of the activation level of active neurons. (Right) The combined sequential/simultaneous sameness detector (Figure 3) shows that the activation in the copy layer is highly discriminable between simultaneously repeated items and nonrepeated items, even for high noise levels.

## DISCUSSION

The current results thus show that a simple and biologically realistic circuit can support a core grammatical computation that is used in more than 80% of the world's languages: grammatical rules based on sameness/difference relationships. In this circuit, nonidentical items are filtered out through tonic inhibition as well as center-surround inhibition. In contrast, when identical items are presented sequentially or simultaneously, inhibition is inhibited; this disinhibition of identical items then allows them to be propagated for further processing.
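
To make this mechanism concrete, the following worked example traces the sequential model of Appendix A.1 under idealized assumptions that are not part of the reported simulations: no noise, $a_I = 1$, weights of exactly $\pm 1$, and an external input of 1 for the presented feature. Feature $f$ is presented twice in a row and is then followed by a different feature $g$:

$\begin{aligned}
t = 1: \quad & S_f = 1, \quad C_f = S_f - I_f(t=0) = 1 - 1 = 0, \quad I_f = a_I - S_f = 0 \\
t = 2: \quad & S_f = 1, \quad C_f = S_f - I_f(t=1) = 1 - 0 = 1, \quad I_f = a_I - S_f = 0 \\
t = 3: \quad & S_g = 1, \quad C_g = S_g - I_g(t=2) = 1 - 1 = 0
\end{aligned}$

The first presentation is blocked by tonic inhibition but silences the inhibitory unit for $f$. The immediate repetition therefore passes through the copy layer, whereas the new feature $g$ still meets its own, unaffected tonic inhibition.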

Unlike previous models of sameness detection (Cope et al., 2018; Arena et al., 2013; Ludueña & Gros, 2013; Engel & Wang, 2011; J. S. Johnson, Spencer, et al., 2009; Wen et al., 2008; Hasselmo & Wyble, 1997; Carpenter & Grossberg, 1987; see Supplementary Material 1), the model satisfies critical criteria of grammar acquisition: (1) It generalizes to unseen stimuli and (2) does not require any labeled counterexamples for learning, simply because this circuit architecture does not require any learning at all.

Once such a sameness detector is available, it can be used for building more complex grammatical rules. For example, after exposure to syllable sequences such as dubaba, 7-month-olds notice that the last two syllables are identical and generalize this sameness relation to new items (Marcus et al., 1999). Critically, they must not only detect the sameness relation between the last two syllables but also associate it with the correct serial position (Gervain et al., 2012; Endress et al., 2007). The output of a sameness detector can form associations with representations of sequential positions or other stimuli (Kabdebon & Dehaene-Lambertz, 2019), allowing learners to acquire more complex, composite rules; the ability to form such composite rules is one of the hallmarks of complex cognition (Hauser & Watumull, 2017; Dehaene, Meyniel, Wacongne, Wang, & Pallier, 2015; Corballis, 2014; Fitch & Martins, 2014).
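
The association between detector output and serial position can be illustrated with a minimal sketch. It is not part of the model: the equality test merely stands in for the circuit's output, the function names are hypothetical, and all syllables other than dubaba are illustrative choices. "Learning" here amounts to storing which positions carried a repetition during exposure.

```python
def sameness_flags(sequence):
    """Stand-in for the detector: True where an item repeats its predecessor."""
    return tuple(a == b for a, b in zip(sequence, sequence[1:]))


def learn_positions(examples):
    """Associate detector output with serial position by storing each flag pattern seen."""
    return {sameness_flags(seq) for seq in examples}


def conforms(rule, sequence):
    """A new sequence fits the rule if its flag pattern occurred during exposure."""
    return sameness_flags(sequence) in rule


if __name__ == "__main__":
    # Exposure items end in a repetition (dubaba is from the text; the rest are illustrative).
    rule = learn_positions([["du", "ba", "ba"], ["li", "go", "go"]])
    print(conforms(rule, ["wo", "fe", "fe"]))  # novel syllables, repetition in final position
    print(conforms(rule, ["wo", "wo", "fe"]))  # novel syllables, repetition in initial position
```

Because the stored pattern refers only to where the repetition occurred and not to which syllables occurred, the rule transfers to syllables never encountered during exposure.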

This, in turn, suggests a fundamentally new view of language acquisition. Learners might be equipped with a possibly large set of detectors, some of them complex, that act as feature detectors for a variety of grammatical rules (Endress, Nespor, et al., 2009). Learning then involves combining these features, potentially through the use of associative mechanisms. This would be consistent with results from formal language theory, where suitable preprocessing (e.g., through feature detectors) can reduce the complexity of the required computational mechanism. For example, a finite state automaton operating on trees can recognize context-free languages (Morgan, 1986), and even humble rules based on sameness relations can be shown to be beyond the reach of context-free grammars (Manaster-Ramer, 1986; Culy, 1985).

Feature detectors for elementary grammatical rules might thus expand the range of grammars that even simple learning mechanisms (such as associative mechanisms) can learn, which, in turn, might make language acquisition feasible using relatively simple computational machinery.

## APPENDIX A: MODEL EQUATIONS

### A.1. Sequential Model

Each feature $f$ is encoded in the source layer, the inhibition layer, and the copy layer. The corresponding activations are $S_f(t)$ for a unit encoding feature $f$ in the source layer, $I_f(t)$ for such a unit in the inhibition layer, and $C_f(t)$ for such a unit in the copy layer. $E_f(t)$ is the external input, and $N(\mu, \sigma)$ is a random value drawn from a normal distribution with mean $\mu$ and standard deviation $\sigma$.

Before stimulation, the activation in the source layer and in the copy layer is initialized to zero (plus noise), whereas the activation in the inhibition layer is initialized to some value $a_I$ (here arbitrarily set to 1):
$\begin{aligned}
S_f(t=0) &\sim N(0, \sigma_{\text{activation}}) \\
C_f(t=0) &\sim N(0, \sigma_{\text{activation}}) \\
I_f(t=0) &\sim N(a_I, \sigma_{\text{activation}})
\end{aligned}$
(1)
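The connection weights between units in the different layers are indicated by $w$: $w^{I,S}$ from the source layer to the inhibition layer, $w^{C,S}$ from the source layer to the copy layer, and $w^{C,I}$ from the inhibition layer to the copy layer. A connection between a source layer unit coding for feature $f$ and a copy layer unit coding for feature $f'$ is indicated by $w_{f',f}^{C,S}$. The weights are given as follows:
$\begin{aligned}
w_{f',f}^{C,S} &\sim \begin{cases} N(1, \sigma_{\text{weight}}) & f = f' \\ 0 & f \neq f' \end{cases} \\
w_{f',f}^{C,I} &\sim \begin{cases} N(-1, \sigma_{\text{weight}}) & f = f' \\ 0 & f \neq f' \end{cases} \\
w_{f',f}^{I,S} &\sim \begin{cases} N(-1, \sigma_{\text{weight}}) & f = f' \\ 0 & f \neq f' \end{cases}
\end{aligned}$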
(2)
At each time step, the activations in the different layers are then updated as follows; as mentioned in the main text, the update order is critical.
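$\begin{aligned}
S_f(t) &= E_f(t) + N(0, \sigma_{\text{activation}}) \\
C_f(t) &= w_f^{C,S} S_f(t) + w_f^{C,I} I_f(t) + N(0, \sigma_{\text{activation}}) \\
I_f(t) &= N(a_I, \sigma_{\text{activation}}) + w_f^{I,S} S_f(t)
\end{aligned}$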
(3)

At the end of each update cycle, the activations are curtailed to be between zero and one.
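
The update cycle of Equations 1 to 3 can be sketched in a few lines of Python. This is an illustrative re-implementation under stated assumptions, not the R code from the supplementary material: the function names are hypothetical, the external input is assumed to be 1 for the presented feature and 0 otherwise, and noise defaults to zero so that the behavior is easy to trace.

```python
import random


def clip01(x):
    """Clip an activation to the range [0, 1]."""
    return max(0.0, min(1.0, x))


def normal(mean, sd, rng):
    """Draw from N(mean, sd); return the mean itself when sd is 0."""
    return mean if sd == 0 else rng.gauss(mean, sd)


def run_sequential(items, n_features, sd_act=0.0, sd_w=0.0, a_i=1.0, seed=0):
    """Present one feature index per time step; return the copy-layer activations."""
    rng = random.Random(seed)
    feats = range(n_features)
    # Equation 2: only same-feature connections are nonzero.
    w_cs = [normal(1.0, sd_w, rng) for _ in feats]   # source -> copy
    w_ci = [normal(-1.0, sd_w, rng) for _ in feats]  # inhibition -> copy
    w_is = [normal(-1.0, sd_w, rng) for _ in feats]  # source -> inhibition
    # Equation 1: tonic activity in the inhibition layer before stimulation.
    inh = [normal(a_i, sd_act, rng) for _ in feats]
    history = []
    for item in items:
        # Equation 3, in the stated order: source, then copy, then inhibition.
        src = [(1.0 if f == item else 0.0) + normal(0.0, sd_act, rng) for f in feats]
        # The copy layer still sees the inhibition left behind by the previous item.
        cpy = [w_cs[f] * src[f] + w_ci[f] * inh[f] + normal(0.0, sd_act, rng)
               for f in feats]
        inh = [normal(a_i, sd_act, rng) + w_is[f] * src[f] for f in feats]
        # Activations are clipped at the end of the update cycle.
        inh = [clip01(x) for x in inh]
        history.append([clip01(x) for x in cpy])
    return history


if __name__ == "__main__":
    # Feature 0 twice in a row, then feature 1.
    for step, copy_layer in enumerate(run_sequential([0, 0, 1], n_features=3)):
        print(step, copy_layer)
```

In the noise-free case, only the second presentation of feature 0 produces activation in the copy layer. The sketch also shows why the update order matters: if the inhibition layer were updated before the copy layer, a single presentation would disinhibit itself within the same cycle.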

### A.2. Simultaneous Model

In the simultaneous model, units represent both features and spatial locations. $S_{f,l}(t)$ is thus the activation of a unit in the source layer that encodes feature $f$ at location $l$, $I_{f,l}(t)$ is the corresponding activation in the inhibition layer, and $C_{f,l}(t)$ is the corresponding activation in the copy layer. $E_{f,l}(t)$ is the external input.

Before stimulation, the activation in the source layer and in the copy layer is initialized to zero (plus noise), whereas the activation in the inhibition layer is initialized to some value $a_I$ (here arbitrarily set to 1):
$\begin{aligned}
S_{f,l}(t=0) &\sim N(0, \sigma_{\text{activation}}) \\
C_{f,l}(t=0) &\sim N(0, \sigma_{\text{activation}}) \\
I_{f,l}(t=0) &\sim N(a_I, \sigma_{\text{activation}})
\end{aligned}$
(4)
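Connection weights now carry indices for both features and spatial locations. For example, a connection between a source layer unit coding for feature $f$ at location $l$ and a copy layer unit coding for feature $f'$ at location $l'$ is indicated by $w_{f',f,l',l}^{C,S}$. The weights are given as follows:
$\begin{aligned}
w_{f',f,l',l}^{C,S} &\sim \begin{cases} N(1, \sigma_{\text{weight}}) & f = f',\ l = l' \\ 0 & \text{otherwise} \end{cases} \\
w_{f',f,l',l}^{C,I} &\sim \begin{cases} N(-1, \sigma_{\text{weight}}) & f = f' \\ 0 & f \neq f' \end{cases} \\
w_{f',f,l',l}^{I,S} &\sim \begin{cases} N(-1, \sigma_{\text{weight}}) & f = f' \\ N(1, \sigma_{\text{weight}}) & f \neq f' \end{cases}
\end{aligned}$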
(5)
At each time step, the activations in the different layers are then updated as follows; as mentioned in the main text, the update order is critical.
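$\begin{aligned}
S_{f,l}(t) &= E_{f,l}(t) + N(0, \sigma_{\text{activation}}) \\
I_{f,l}(t) &= N(a_I, \sigma_{\text{activation}}) + \sum_{f_S, l_S} w_{f,l,f_S,l_S}^{I,S} S_{f_S,l_S}(t) \\
C_{f,l}(t) &= \sum_{f_S, l_S} w_{f,l,f_S,l_S}^{C,S} S_{f_S,l_S}(t) + \sum_{f_I, l_I} w_{f,l,f_I,l_I}^{C,I} I_{f_I,l_I}(t) + N(0, \sigma_{\text{activation}})
\end{aligned}$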
(6)

At the end of each update cycle, the activations are curtailed to be between zero and one.
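
Equations 4 to 6 can be sketched in the same way. Again, this is an illustration under assumptions rather than the supplementary R code: names are hypothetical, presented units receive an external input of 1, noise defaults to zero, and the inhibition-to-copy weights follow Equation 5 as printed (same feature, regardless of location). Because no layer in Equation 6 depends on its own previous state, a single update cycle suffices.

```python
import random
from itertools import product


def clip01(x):
    """Clip an activation to the range [0, 1]."""
    return max(0.0, min(1.0, x))


def normal(mean, sd, rng):
    """Draw from N(mean, sd); return the mean itself when sd is 0."""
    return mean if sd == 0 else rng.gauss(mean, sd)


def make_weights(units, mean_weight, sd_w, rng):
    """Build a weight table keyed by (receiving unit, sending unit)."""
    weights = {}
    for to, frm in product(units, repeat=2):
        mean = mean_weight(to, frm)
        # A mean of 0 denotes an absent connection and receives no noise.
        weights[to, frm] = normal(mean, sd_w, rng) if mean else 0.0
    return weights


def source_to_copy(to, frm):
    return 1.0 if to == frm else 0.0           # same feature and location


def inhibition_to_copy(to, frm):
    return -1.0 if to[0] == frm[0] else 0.0    # same feature (Equation 5 as printed)


def source_to_inhibition(to, frm):
    return -1.0 if to[0] == frm[0] else 1.0    # center-surround across features


def run_simultaneous(shown, n_features, n_locations,
                     sd_act=0.0, sd_w=0.0, a_i=1.0, seed=0):
    """shown: set of (feature, location) pairs presented together.
    Returns the clipped copy-layer activation for every unit."""
    rng = random.Random(seed)
    units = list(product(range(n_features), range(n_locations)))
    w_cs = make_weights(units, source_to_copy, sd_w, rng)
    w_ci = make_weights(units, inhibition_to_copy, sd_w, rng)
    w_is = make_weights(units, source_to_inhibition, sd_w, rng)

    def drive(weights, to, layer):
        return sum(weights[to, frm] * layer[frm] for frm in units)

    # Equation 6, in the stated order: source, then inhibition, then copy.
    src = {u: (1.0 if u in shown else 0.0) + normal(0.0, sd_act, rng) for u in units}
    inh = {u: normal(a_i, sd_act, rng) + drive(w_is, u, src) for u in units}
    cpy = {u: drive(w_cs, u, src) + drive(w_ci, u, inh) + normal(0.0, sd_act, rng)
           for u in units}
    return {u: clip01(x) for u, x in cpy.items()}


if __name__ == "__main__":
    print(run_simultaneous({(0, 0), (0, 1)}, n_features=3, n_locations=2))
    print(run_simultaneous({(0, 0), (1, 1)}, n_features=3, n_locations=2))
```

With two identical features at different locations, both presented units are active in the copy layer; with two different features, the mutual excitation of the inhibition layer keeps the copy layer silent.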

### A.3. Combined Model

The combined sequential/simultaneous model is similar to the simultaneous model in that it comprises a source layer, a copy layer, and an inhibition layer and that the copy layer receives excitatory input from the source layer as well as tonic inhibition from those units in the inhibition layer that code for the same feature and spatial position. Furthermore, each feature of the source layer inhibits the corresponding feature in the inhibition layer across spatial positions and excites all other features. The critical difference between the simultaneous and the combined model is twofold. First, there are no connections between source-layer and inhibition-layer units that code for the same feature at the same location (whereas disinhibition still occurs for other locations). Second, same-location disinhibition of features proceeds through a “self-disinhibition layer” where each unit (1) receives excitatory input from the corresponding feature and location in the source layer and (2) sends inhibitory input to all units coding for the same feature (across locations) in the inhibition layer.

The symbols for the activation in the source, inhibition, and copy layers are the same as in the simultaneous model; activation in the self-disinhibition layer for a unit coding for feature $f$ at location $l$ is designated as $D_{f,l}(t)$ and is initialized using random values around zero.

The symbols for the connection weights are similar to those in the simultaneous model, but the weights reflect the changes above:
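$\begin{aligned}
w_{f',f,l',l}^{C,S} &\sim \begin{cases} N(1, \sigma_{\text{weight}}) & f = f',\ l = l' \\ 0 & \text{otherwise} \end{cases} \\
w_{f',f,l',l}^{D,S} &\sim \begin{cases} N(1, \sigma_{\text{weight}}) & f = f',\ l = l' \\ 0 & \text{otherwise} \end{cases} \\
w_{f',f,l',l}^{C,I} &\sim \begin{cases} N(-1, \sigma_{\text{weight}}) & f = f' \\ 0 & f \neq f' \end{cases} \\
w_{f',f,l',l}^{I,S} &\sim \begin{cases} N(-1, \sigma_{\text{weight}}) & f = f',\ l \neq l' \\ 0 & f = f',\ l = l' \\ N(1, \sigma_{\text{weight}}) & f \neq f' \end{cases} \\
w_{f',f,l',l}^{I,D} &\sim \begin{cases} N(-1, \sigma_{\text{weight}}) & f = f' \\ 0 & f \neq f' \end{cases}
\end{aligned}$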
(7)
At each time step, the activations in the different layers are then updated as follows; again, the update order is critical.
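$\begin{aligned}
S_{f,l}(t) &= E_{f,l}(t) + N(0, \sigma_{\text{activation}}) \\
I_{f,l}(t) &= N(a_I, \sigma_{\text{activation}}) + \sum_{f_S, l_S} w_{f,l,f_S,l_S}^{I,S} S_{f_S,l_S}(t) + \sum_{f_D, l_D} w_{f,l,f_D,l_D}^{I,D} D_{f_D,l_D}(t) \\
C_{f,l}(t) &= \sum_{f_S, l_S} w_{f,l,f_S,l_S}^{C,S} S_{f_S,l_S}(t) + \sum_{f_I, l_I} w_{f,l,f_I,l_I}^{C,I} I_{f_I,l_I}(t) + N(0, \sigma_{\text{activation}}) \\
D_{f,l}(t) &= \sum_{f_S, l_S} w_{f,l,f_S,l_S}^{D,S} S_{f_S,l_S}(t) + N(0, \sigma_{\text{activation}})
\end{aligned}$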
(8)

At the end of each update cycle, the activations are curtailed to be between zero and one.
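
The combined model of Equations 7 and 8 can be sketched along the same lines. As before, this is a hedged illustration rather than the supplementary R code: names are hypothetical, presented units receive an external input of 1, noise defaults to zero, and the same-feature weights are applied across locations as printed in Equation 7. The self-disinhibition layer is updated last, so the inhibition layer only sees it on the following cycle; this is the delay that keeps a single presentation from disinhibiting itself.

```python
import random
from itertools import product


def clip01(x):
    """Clip an activation to the range [0, 1]."""
    return max(0.0, min(1.0, x))


def normal(mean, sd, rng):
    """Draw from N(mean, sd); return the mean itself when sd is 0."""
    return mean if sd == 0 else rng.gauss(mean, sd)


def make_weights(units, mean_weight, sd_w, rng):
    """Build a weight table keyed by (receiving unit, sending unit)."""
    weights = {}
    for to, frm in product(units, repeat=2):
        mean = mean_weight(to, frm)
        weights[to, frm] = normal(mean, sd_w, rng) if mean else 0.0
    return weights


def one_to_one(to, frm):
    """Source -> copy and source -> self-disinhibition: same feature and location."""
    return 1.0 if to == frm else 0.0


def same_feature_inhibition(to, frm):
    """Inhibition -> copy and self-disinhibition -> inhibition: same feature."""
    return -1.0 if to[0] == frm[0] else 0.0


def source_to_inhibition(to, frm):
    if to[0] != frm[0]:
        return 1.0                              # other features excite
    return 0.0 if to[1] == frm[1] else -1.0     # no direct same-location link


def run_combined(frames, n_features, n_locations,
                 sd_act=0.0, sd_w=0.0, a_i=1.0, seed=0):
    """frames: one set of (feature, location) pairs per time step.
    Returns the clipped copy-layer activations for each time step."""
    rng = random.Random(seed)
    units = list(product(range(n_features), range(n_locations)))
    w_cs = make_weights(units, one_to_one, sd_w, rng)
    w_ds = make_weights(units, one_to_one, sd_w, rng)
    w_ci = make_weights(units, same_feature_inhibition, sd_w, rng)
    w_id = make_weights(units, same_feature_inhibition, sd_w, rng)
    w_is = make_weights(units, source_to_inhibition, sd_w, rng)

    def drive(weights, to, layer):
        return sum(weights[to, frm] * layer[frm] for frm in units)

    dis = {u: normal(0.0, sd_act, rng) for u in units}  # random values around zero
    history = []
    for shown in frames:
        # Equation 8, in the stated order: S, I, C, and finally D.
        src = {u: (1.0 if u in shown else 0.0) + normal(0.0, sd_act, rng) for u in units}
        inh = {u: normal(a_i, sd_act, rng) + drive(w_is, u, src) + drive(w_id, u, dis)
               for u in units}
        cpy = {u: drive(w_cs, u, src) + drive(w_ci, u, inh) + normal(0.0, sd_act, rng)
               for u in units}
        # D is carried over (clipped) to the next cycle, where it inhibits I.
        dis = {u: clip01(drive(w_ds, u, src) + normal(0.0, sd_act, rng)) for u in units}
        history.append({u: clip01(x) for u, x in cpy.items()})
    return history
```

For example, `run_combined([{(0, 0)}, {(0, 0)}], 3, 2)` yields no copy-layer activation on the first presentation and full activation of unit `(0, 0)` on the second, whereas `run_combined([{(0, 0), (0, 1)}], 3, 2)` detects a simultaneous repetition within a single cycle.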

Reprint requests should be sent to Ansgar D. Endress, Department of Psychology, City, University of London, Northampton Square, London EC1V 0HB, United Kingdom, or via e-mail: ansgar.endress@m4x.org.

## Notes

1. Supplementary material for this paper can be retrieved as follows: Supplementary Text: Previous biological models of sameness detection (https://doi.org/10.25383/city.10280579). Supplementary Code: R code for the model (https://doi.org/10.25383/city.10280585).

2. Although I model disinhibition across different neural populations, the same computational principles could be implemented using reciprocal inhibition among inhibitory neurons as in earlier models of stimulus selection and categorization (Koyama & Pujala, 2018; Koyama et al., 2016; Goddard et al., 2014; Mysore & Knudsen, 2012). To do so, one would simply replace the inhibitory connections from the source layer to the inhibition layer with inhibition in the source layer that is itself subject to lateral inhibition.

## REFERENCES

Antell, S. E., Caron, A. J., & Myers, R. S. (1985). Perception of relational invariants by newborns. Developmental Psychology, 21, 942–948.

Arena, P., Patané, L., Stornanti, V., Termini, P. S., Zäpf, B., & Strauss, R. (2013). Modeling the insect mushroom bodies: Application to a delayed match-to-sample task. Neural Networks, 41, 202–211.

Bagemihl, B. (1995). Language games and related areas. In J. A. Goldsmith (Ed.), Handbook of phonological theory (1st ed., pp. 697–712). Cambridge, MA: Blackwell.

Bao, Z. (1990). Fanqie languages and reduplication. Linguistic Inquiry, 21, 317–350.

Barak, O., Sussillo, D., Romo, R., Tsodyks, M., & Abbott, L. F. (2013). From fixed points to chaos: Three models of delayed discrimination. Progress in Neurobiology, 103, 214–222.

Benavides-Varela, S., & Mehler, J. (2015). Verbal positional memory in 7-month-olds. Child Development, 86, 209–223.

Benjamin, P. R., Staras, K., & Kemenes, G. (2010). What roles do tonic inhibition and disinhibition play in the control of motor programs? Frontiers in Behavioral Neuroscience, 4, 30.

Berent, I., & Shimron, J. (1997). The representation of Hebrew words: Evidence from the obligatory contour principle. Cognition, 64, 39–72.

Brooks, V. B. (1986). How does the limbic system assist motor learning? A limbic comparator hypothesis. Brain, Behavior and Evolution, 29, 29–53.

Carpenter, G. A., & Grossberg, S. (1987). A massively parallel architecture for a self-organizing neural pattern recognition machine. Computer Vision, Graphics, and Image Processing, 37, 54–115.

Chevalier, G., & Deniau, J. M. (1990). Disinhibition as a basic process in the expression of striatal functions. Trends in Neurosciences, 13, 277–280.

Chomsky, N. (1959). A review of B. F. Skinner's verbal behavior. Language, 35, 26–58.

Cope, A. J., Vasilaki, E., Minors, D., Sabo, C., Marshall, J. A. R., & Barron, A. B. (2018). Abstract concept learning in a simple neural network inspired by the insect brain. PLoS Computational Biology, 14, e1006435.

Corballis, M. C. (2014). The recursive mind: The origins of human language, thought, and civilization. Princeton, NJ: Princeton University Press.

Culy, C. (1985). The complexity of the vocabulary of Bambara. Linguistics and Philosophy, 8, 345–351.

Darcy, I., Ramus, F., Christophe, A., Kinzler, K. D., & Dupoux, E. (2009). Phonological knowledge in compensation for native and non-native assimilation. In F. Kügler, C. Féry, & R. van de Vijver (Eds.), Variation and gradience in phonetics and phonology (pp. 265–309). Berlin: Mouton De Gruyter.

Dawson, C., & Gerken, L. (2009). From domain-generality to domain-sensitivity: 4-month-olds learn an abstract repetition rule in music that 7-month-olds do not. Cognition, 111, 378–382.

de la Mora, D. M., & Toro, J. M. (2013). Rule learning over consonants and vowels in a non-human animal. Cognition, 126, 307–312.

Dehaene, S., & Cohen, L. (2007). Cultural recycling of cortical maps. Neuron, 56, 384–398.

Dehaene, S., Meyniel, F., Wacongne, C., Wang, L., & Pallier, C. (2015). The neural representation of sequences: From transition probabilities to algebraic patterns and linguistic trees. Neuron, 88, 2–19.

Endress, A. D. (2013). Bayesian learning and the psychology of rule induction. Cognition, 127, 159–176.

Endress, A. D., Cahill, D., Block, S., Watumull, J., & Hauser, M. D. (2009). Evidence of an evolutionary precursor to human language affixation in a nonhuman primate. Biology Letters, 5, 749–751.

Endress, A. D., Dehaene-Lambertz, G., & Mehler, J. (2007). Perceptual constraints and the learnability of simple grammars. Cognition, 105, 577–614.

Endress, A. D., Nespor, M., & Mehler, J. (2009). Perceptual and memory constraints on language acquisition. Trends in Cognitive Sciences, 13, 348–353.

Endress, A. D., Scholl, B. J., & Mehler, J. (2005). The role of salience in the extraction of algebraic rules. Journal of Experimental Psychology: General, 134, 406–419.

Engel, T. A., & Wang, X. J. (2011). Same or different? A neural circuit mechanism of similarity-based pattern match decision making. Journal of Neuroscience, 31, 6982–6996.

Fabb, N. (2015). What is poetry? Language and memory in the poems of the world. Cambridge: Cambridge University Press.

Farrant, M., & Nusser, Z. (2005). Variations on an inhibitory theme: Phasic and tonic activation of GABA(A) receptors. Nature Reviews Neuroscience, 6, 215–229.

Fee, J., & Ingram, D. (1982). Reduplication as a strategy of phonological development. Journal of Child Language, 9, 41–54.

Ferguson, C. (1964). Baby talk in six languages. American Anthropologist, 66, 103–114.

Fitch, W. T. (2017). Empirical approaches to the study of language evolution. Psychonomic Bulletin & Review, 24, 3–33.

Fitch, W. T., & Martins, M. D. (2014). Hierarchical processing in music, language, and action: Lashley revisited. Annals of the New York Academy of Sciences, 1316, 87–104.

Frisch, S. A., Pierrehumbert, J. B., & Broe, M. B. (2004). Similarity avoidance and the OCP. Natural Language & Linguistic Theory, 22, 179–228.

Frisch, S. A., & Zawaydeh, B. A. (2001). The psychological reality of OCP-place in Arabic. Language, 77, 91–106.

Fu, Y., Kaneko, M., Tang, Y., […], A., & Stryker, M. P. (2015). A cortical disinhibitory circuit for enhancing adult plasticity. eLife, 4, e05558.

Fu, Y., Tucciarone, J. M., Espinosa, J. S., Sheng, N., Darcy, D. P., Nicoll, R. A., et al. (2014). A cortical circuit for gain control by behavioral state. Cell, 156, 1139–1152.

Gervain, J., Berent, I., & Werker, J. F. (2012). Binding at birth: The newborn brain detects identity relations and sequential position in speech. Journal of Cognitive Neuroscience, 24, 564–574.

Gervain, J., Macagno, F., Cogoi, S., Peña, M., & Mehler, J. (2008). The neonate brain detects speech structure. Proceedings of the National Academy of Sciences, U.S.A., 105, 14222–14227.

Goddard, C. A., Mysore, S. P., Bryant, A. S., Huguenard, J. R., & Knudsen, E. I. (2014). Spatially reciprocal inhibition of inhibition within a stimulus selection network in the avian midbrain. PLoS One, 9, e85865.

Goldin-Meadow, S., & Mylander, C. (1998). Spontaneous sign systems created by deaf children in two cultures. Nature, 391, 279–281.

Grill-Spector, K., Henson, R. N., & Martin, A. (2006). Repetition and the brain: Neural models of stimulus-specific effects. Trends in Cognitive Sciences, 10, 14–23.

Hangya, B., Pi, H. J., Kvitsiani, D., […], S. P., & Kepecs, A. (2014). From circuit motifs to computations: Mapping the behavioral repertoire of cortical interneurons. Current Opinion in Neurobiology, 26, 117–124.

Hardy, H. K., & Montler, T. (1988). Imperfective gemination in Alabama. International Journal of American Linguistics, 54, 399–475.

Hasselmo, M. E., & Wyble, B. P. (1997). Free recall and recognition in a network model of the hippocampus: Simulating effects of scopolamine on human memory function. Behavioural Brain Research, 89, 1–34.

Hauser, M. D., Chomsky, N., & Fitch, W. T. (2002). The faculty of language: What is it, who has it, and how did it evolve? Science, 298, 1569–1579.

Hauser, M. D., & Glynn, D. (2009). Can free-ranging rhesus monkeys (Macaca mulatta) extract artificially created rules comprised of natural vocalizations? Journal of Comparative Psychology, 123, 161–167.

Hauser, M. D., & Watumull, J. (2017). The universal generative faculty: The source of our expressive power in language, mathematics, morality, and music. Journal of Neurolinguistics, 43, 78–94.

Hayes, B., & Londe, Z. C. (2006). Stochastic phonological knowledge: The case of Hungarian vowel harmony. Phonology, 23, 59–104.

Hupé, J.-M. (2017). Comment on “ducklings imprint on the relational concept of ‘same or different’”. Science, 355, 806.

Johnson, J. S., Spencer, J. P., Luck, S. J., & Schöner, G. (2009). A dynamic neural field model of visual working memory and change detection. Psychological Science, 20, 568–577.

Johnson, S. P., Fernandes, K. J., Frank, M. C., Kirkham, N., Marcus, G., Rabagliati, H., et al. (2009). Abstract rule learning for visual sequences in 8- and 11-month-olds. Infancy, 14, 2–18.

Jovanic, T., Schneider-Mizell, C. M., Shao, M., Masson, J. B., Denisov, G., Fetter, R. D., et al. (2016). Competitive disinhibition mediates behavioral choice and sequences in Drosophila. Cell, 167, 858–870.

Kabdebon, C., & Dehaene-Lambertz, G. (2019). Symbolic labeling in 5-month-old human infants. Proceedings of the National Academy of Sciences, U.S.A., 116, 5805–5810.

Kovács, Á. M., & Mehler, J. (2008). Regularity learning in 7-month-old infants under ‘noisy’ conditions: Adjacent repetitions vs. non-adjacent repetitions. Paper presented at the 33rd Boston University Conference on Language Development. Boston, MA.

Kovács, Á. M., & Mehler, J. (2009). Flexible learning of multiple speech structures in bilingual infants. Science, 325, 611–612.

Koyama, M., Minale, F., Shum, J., Nishimura, N., Schaffer, C. B., & Fetcho, J. R. (2016). A circuit motif in the zebrafish hindbrain for a two alternative behavioral choice to turn left or right. eLife, 5, e16808.

Koyama, M., & Pujala, A. (2018). Mutual inhibition of lateral inhibition: A network motif for an elementary computation in the brain. Current Opinion in Neurobiology, 49, 69–74.

Kumaran, D., & Maguire, E. A. (2007). Which computational mechanisms operate in the hippocampus during novelty detection? Hippocampus, 17, 735–748.

Kusunoki, M., Sigala, N., Nili, H., Gaffan, D., & Duncan, J. (2010). Target detection by opponent coding in monkey prefrontal cortex. Journal of Cognitive Neuroscience, 22, 751–760.

Langbein, J., & Puppe, B. (2017). Comment on “ducklings imprint on the relational concept of ‘same or different’”. Science, 355, 806.

Lee, S., Kruglikov, I., Huang, Z. J., Fishell, G., & Rudy, B. (2013). A disinhibitory circuit mediates motor integration in the somatosensory cortex. Nature Neuroscience, 16, 1662–1670.

Letzkus, J. J., Wolff, S. B., & Lüthi, A. (2015). Disinhibition, a circuit mechanism for associative learning and memory. Neuron, 88, 264–276.

Letzkus, J. J., Wolff, S. B., Meyer, E. M., Tovote, P., Courtin, J., Herry, C., et al. (2011). A disinhibitory microcircuit for associative fear learning in the auditory cortex. Nature, 480, 331–335.

Liu, T., & Becker, M. W. (2013). Serial consolidation of orientation information into visual short-term memory. Psychological Science, 24, 1044–1050.

Ludueña, G. A., & Gros, C. (2013). A self-organized neural comparator. Neural Computation, 25, 1006–1028.

Machens, C. K., Romo, R., & Brody, C. D. (2005). Flexible control of mutual inhibition: A neural model of two-interval discrimination. Science, 307, 1121–1124.

Manaster-Ramer, A. (1986). Copying in natural languages, context-freeness, and queue grammars. In Proceedings of the 24th Annual Meeting on Association for Computational Linguistics (pp. 85–89).

Mance, I., Becker, M. W., & Liu, T. (2012). Parallel consolidation of simple features into visual short-term memory. Journal of Experimental Psychology: Human Perception and Performance, 38, 429–438.

Marcus, G. F., Fernandes, K. J., & Johnson, S. P. (2007). Infant rule learning facilitated by speech. Psychological Science, 18, 387–391.

Marcus, G. F., Vijayan, S., Rao, S. B., & Vishton, P. (1999). Rule learning by seven-month-old infants. Science, 283, 77–80.

Marlin, B. J., Mitre, M., D'amour, J. A., Chao, M. V., & Froemke, R. C. (2015). Oxytocin enables maternal behaviour by balancing cortical inhibition. Nature, 520, 499–504.

Martinho, A., III, & Kacelnik, A. (2016). Ducklings imprint on the relational concept of “same or different”. Science, 353, 286–288.

McCarthy, J. J. (1986). OCP effects: Gemination and antigemination. Linguistic Inquiry, 17, 207–263.

Medina, T. N., Snedeker, J., Trueswell, J. C., & Gleitman, L. R. (2011). How words can and cannot be learned by observation. Proceedings of the National Academy of Sciences, U.S.A., 108, 9014–9019.

Miller, P., & Wang, X. J. (2006). Inhibitory control by an integral feedback signal in prefrontal cortex: A model of discrimination between sequential stimuli. Proceedings of the National Academy of Sciences, U.S.A., 103, 201–206.

Mitterer, H., & Blomert, L. (2003). Coping with phonological assimilation in speech perception: Evidence for early compensation. Perception & Psychophysics, 65, 956–969.

Moravcsik, E. (1978). Reduplicative constructions. In J. H. Greenberg (Ed.), Universals of human language: Word structure (Vol. 3, pp. 297–334). Stanford, CA: Stanford University Press.

Morgan, J. L. (1986). From simple input to complex grammar. Cambridge, MA: MIT Press.

Murphy, R. A., Mondragón, E., & Murphy, V. A. (2008). Rule learning by rats. Science, 319, 1849–1851.

Mysore, S. P., & Knudsen, E. I. (2012). Reciprocal inhibition of inhibition: A circuit motif for flexible categorization in stimulus selection. Neuron, 73, 193–205.

Neiworth, J. J. (2013). Chasing sounds. Behavioural Processes, 93, 111–115.

Owen, S. F., Tuncdemir, S. N., […], P. L., Tirko, N. N., Fishell, G., & Tsien, R. W. (2013). Oxytocin enhances hippocampal spike transmission by modulating fast-spiking interneurons. Nature, 500, 458–462.

Pepperberg, I. M. (1987). Acquisition of the same/different concept by an African Grey parrot (Psittacus erithacus): Learning with respect to categories of color, shape, and material. Animal Learning & Behavior, 15, 421–432.

Pfeffer, C. K., Xue, M., He, M., Huang, Z. J., & Scanziani, M. (2013). Inhibition of inhibition in visual cortex: The logic of connections between molecularly distinct interneurons. Nature Neuroscience, 16, 1068–1076.

Pi, H. J., Hangya, B., Kvitsiani, D., Sanders, J. I., Huang, Z. J., & Kepecs, A. (2013). Cortical interneurons that specialize in disinhibitory control. Nature, 503, 521–524.

Pinker, S. (1984). Language learnability and language development. Cambridge, MA: MIT Press.

Pinker, S., & Jackendoff, R. (2005). The faculty of language: What's special about it? Cognition, 95, 201–236.

Rabagliati, H., Ferguson, B., & Lew-Williams, C. (2019). The profile of abstract rule learning in infancy: Meta-analytic and experimental evidence. Developmental Science, 22, e12704.

Rose, S., & Walker, R. (2011). Harmony systems. In J. Goldsmith, J. Riggle, & A. C. L. Yu (Eds.), The handbook of phonological theory (2nd ed., pp. 240–290). Oxford: John Wiley & Sons.

Rubino, C. (2013). Reduplication. In M. S. Dryer & M. Haspelmath (Eds.), The world atlas of language structures online.

Saffran, J. R., Pollak, S. D., Seibel, R. L., & Shkolnik, A. (2007). Dog is a dog is a dog: Infant rule learning is not specific to language. Cognition, 105, 669–680.

Schwartz, R. G., Leonard, L. B., Wilcox, M. J., & Folger, M. K. (1980). Again and again: Reduplication in child phonology. Journal of Child Language, 7, 75–87.

Semyanov, A., Walker, M. C., Kullmann, D. M., & Silver, R. A. (2004). Tonically active GABA A receptors: Modulating gain and maintaining the tone. Trends in Neurosciences, 27, 262–269.

Senghas, A., Kita, S., & Ozyürek, A. (2004). Children creating core properties of language: Evidence from an emerging sign language in Nicaragua. Science, 305, 1779–1782.

Smirnova, A., Zorina, Z., Obozova, T., & Wasserman, E. (2015). Crows spontaneously exhibit analogical reasoning. Current Biology, 25, 256–260.

Suomi, K., McQueen, J. M., & Cutler, A. (1997). Vowel harmony and speech segmentation in Finnish. Journal of Memory and Language, 36, 422–444.

van der Velde, F., & de Kamps, M. (2001). From knowing what to knowing where: Modeling object-based attention with feedback disinhibition of activation. Journal of Cognitive Neuroscience, 13, 479–491.

van Heijningen, C. A., de Visser, J., Zuidema, W., & ten Cate, C. (2009). Simple rules can explain discrimination of putative recursive syntactic structures by a songbird species. Proceedings of the National Academy of Sciences, U.S.A., 106, 20538–20543.

Versace, E., Spierings, M. J., Caffini, M., Ten Cate, C., & Vallortigara, G. (2017). Spontaneous generalization of abstract multimodal patterns in young domestic chicks. Animal Cognition, 20, 521–529.

Vogel, E. K., Woodman, G. F., & Luck, S. J. (2006). The time course of consolidation in visual working memory. Journal of Experimental Psychology: Human Perception and Performance, 32, 1436–1451.

Vroomen, J., Tuomainen, J., & de Gelder, B. (1998). The roles of word stress and vowel harmony in speech segmentation. Journal of Memory and Language, 38, 133–149.

Wen, S., Ulloa, A., Husain, F., Horwitz, B., & Contreras-Vidal, J. L. (2008). Simulated neural dynamics of decision-making in an auditory delayed match-to-sample task. Biological Cybernetics, 99, 15–27.

Xu, H., Jeong, H. Y., Tremblay, R., & Rudy, B. (2013). Neocortical somatostatin-expressing GABAergic interneurons disinhibit the thalamorecipient layer 4. Neuron, 77, 155–167.

Yang, C. (2013). Ontogeny and phylogeny of language. Proceedings of the National Academy of Sciences, U.S.A., 110, 6324–6327.

Yip, M. (1982). Reduplication and C-V skeleta in Chinese secret languages. Linguistic Inquiry, 13, 637–661.

Yip, M. (1988). The obligatory contour principle and phonological rules: A loss of identity. Linguistic Inquiry, 19, 65–100.

Zhang, S., Xu, M., Kamigaki, T., Hoang Do, J. P., Chang, W. C., Jenvay, S., et al. (2014). Selective attention. Long-range and local circuits for top–down modulation of visual cortex processing. Science, 345, 660–665.

Zhao, W., Zhou, P., Gong, C., Ouyang, Z., Wang, J., Zheng, N., et al. (2019). A disinhibitory mechanism biases Drosophila innate light preference. Nature Communications, 10, 546.