Colin Wilson: 1–3 of 3 results
Journal Articles
Linguistic Inquiry (2018) 49 (3): 610–623.
Published: 01 July 2018
Abstract
The lexicon of a natural language does not contain all of the phonological structures that are grammatical. This presents a fundamental challenge to the learner, who must distinguish linguistically significant restrictions from accidental gaps (Fischer-Jørgensen 1952, Halle 1962, Chomsky and Halle 1965, Pierrehumbert 1994, Frisch and Zawaydeh 2001, Iverson and Salmons 2005, Gorman 2013, Hayes and White 2013). The severity of the challenge depends on the size of the lexicon (Pierrehumbert 2001), the number of sounds and their frequency distribution (Sigurd 1968, Tambovtsev and Martindale 2007), and the complexity of the generalizations that learners must entertain (Pierrehumbert 1994, Hayes and Wilson 2008, Kager and Pater 2012, Jardine and Heinz 2016). In this squib, we consider the problem that accidental gaps pose for learning phonotactic grammars stated on a single, surface level of representation. While the monostratal approach to phonology has considerable theoretical and computational appeal (Ellison 1993, Bird and Ellison 1994, Scobbie, Coleman, and Bird 1996, Burzio 2002), little previous research has investigated how purely surface-based phonotactic grammars can be learned from natural lexicons (but cf. Hayes and Wilson 2008, Hayes and White 2013). The empirical basis of our study is the sound pattern of South Bolivian Quechua, with particular focus on the allophonic distribution of high and mid vowels. We show that, in characterizing the vowel distribution, a surface-based analysis must resort to generalizations of greater complexity than are needed in traditional accounts that derive outputs from underlying forms. This exacerbates the learning problem, because complex constraints are more likely to be surface-true by chance (i.e., the structures they prohibit are more likely to be accidentally absent from the lexicon). A comprehensive quantitative analysis of the Quechua lexicon and phonotactic system establishes that many accidental gaps of the relevant complexity level do indeed exist. We propose that, to overcome this problem, surface-based phonotactic models should have two related properties: they should use distinctive features to state constraints at multiple levels of granularity, and they should select constraints of appropriate granularity by statistical comparison of observed and expected frequency distributions. The central idea is that actual gaps typically belong to statistically robust feature-based classes, whereas accidental gaps are more likely to be featurally isolated and to contain independently rare sounds. A maximum-entropy learning model that incorporates these two properties is shown to be effective at distinguishing systematic and accidental gaps in a whole-language phonotactic analysis of Quechua, outperforming minimally different models that lack features or perform nonstatistical induction.
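A minimal sketch of the observed/expected comparison this abstract describes, assuming a toy CV lexicon, a single vowel feature, and an arbitrary O/E cutoff; the segments, data, and threshold are invented for illustration and are not the authors' Quechua materials or model. The toy gap (no uvular before a high vowel) loosely mirrors the kind of vowel distribution mentioned above.

# Illustrative sketch only: a toy observed/expected (O/E) test of the kind the
# abstract describes for separating systematic from accidental gaps. The
# lexicon, feature classes, and cutoff are invented, not the authors' data.

from itertools import product

# Hypothetical segment inventory with one vowel feature, [high].
vowels = {"i": "+high", "u": "+high", "e": "-high", "o": "-high"}
consonants = ["p", "t", "k", "q"]

# Toy "lexicon" of CV syllables; note the absence of "qi" and "qu".
lexicon = ["pi", "pu", "pe", "po", "ti", "tu", "te", "to",
           "ki", "ku", "ke", "ko", "qe", "qo"]

def observed(c, v_class):
    """Count lexicon syllables with consonant c and a vowel in v_class."""
    return sum(1 for syl in lexicon
               if syl[0] == c and vowels[syl[1]] == v_class)

def expected(c, v_class):
    """Expected count if consonant and vowel class combined at chance."""
    n_c = sum(1 for syl in lexicon if syl[0] == c)
    p_v = sum(1 for syl in lexicon if vowels[syl[1]] == v_class) / len(lexicon)
    return n_c * p_v

# Flag candidate constraints whose observed count falls far below expectation;
# combinations that are rare by chance (low expected count) are left alone.
for c, v_class in product(consonants, ["+high", "-high"]):
    o, e = observed(c, v_class), expected(c, v_class)
    if e >= 0.5 and o / e < 0.3:
        print(f"candidate constraint: *[{c}][{v_class} vowel]  (O={o}, E={e:.2f})")

Run on the toy lexicon, only the uvular-plus-high-vowel combination is flagged, while the many other zero or low counts are treated as accidental because their expected counts are also low.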
Journal Articles
Linguistic Inquiry (2012) 43 (1): 97–119.
Published: 01 January 2012
Abstract
A computational model by Hayes and Wilson (2008) seemingly captures a diverse range of phonotactic phenomena without variables, contrasting with the presumptions of many formal theories. Here, we examine the plausibility of this approach by comparing how this architecture and human learners generalize identity restrictions. Whereas humans generalize identity restrictions broadly, to both native and nonnative phonemes, the original model and several related variants failed to generalize to nonnative phonemes. In contrast, a revised model equipped with variables more closely matches human behavior. These findings suggest that, like syntax, phonological grammars are endowed with algebraic relations among variables that support across-the-board generalizations.
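A minimal sketch of the contrast at issue, assuming invented segments: a restriction stated over a variable (ban any two identical consonants) extends automatically to phonemes never encountered in training, whereas a restriction stated over a fixed list of observed segment pairs does not. The segments and the nonnative consonant below are hypothetical.

# Illustrative sketch only: variable-based vs. segment-specific identity
# restrictions. The attested pairs and the nonnative segment are invented.

ATTESTED_IDENTICAL_PAIRS = {("b", "b"), ("d", "d"), ("g", "g")}

def violates_variable_identity(c1, c2):
    """Variable-based restriction *XX: penalize any two identical consonants."""
    return c1 == c2

def violates_listed_identity(c1, c2):
    """Segment-specific restriction: penalize only pairs attested in training."""
    return (c1, c2) in ATTESTED_IDENTICAL_PAIRS

# A nonnative consonant such as "th" is still caught by the variable-based
# constraint, but falls outside the segment-specific one.
print(violates_variable_identity("th", "th"))   # True
print(violates_listed_identity("th", "th"))     # False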
Journal Articles
Linguistic Inquiry (2008) 39 (3): 379–440.
Published: 01 July 2008
Abstract
The study of phonotactics is a central topic in phonology. We propose a theory of phonotactic grammars and a learning algorithm that constructs such grammars from positive evidence. Our grammars consist of constraints that are assigned numerical weights according to the principle of maximum entropy. The grammars assess possible words on the basis of the weighted sum of their constraint violations. The learning algorithm yields grammars that can capture both categorical and gradient phonotactic patterns. The algorithm is not provided with constraints in advance, but uses its own resources to form constraints and weight them. A baseline model, in which Universal Grammar is reduced to a feature set and an SPE-style constraint format, suffices to learn many phonotactic phenomena. In order for the model to learn nonlocal phenomena such as stress and vowel harmony, it must be augmented with autosegmental tiers and metrical grids. Our results thus offer novel, learning-theoretic support for such representations. We apply the model in a variety of learning simulations, showing that the learned grammars capture the distributional generalizations of these languages and accurately predict the findings of a phonotactic experiment.
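As a rough illustration of the scoring mechanism described here, the sketch below computes a weighted sum of constraint violations and converts it to a maximum-entropy probability. The constraints, weights, and candidate words are invented for illustration and are not the paper's grammars or simulation materials.

# Illustrative sketch only: how a maximum-entropy phonotactic grammar scores
# candidate words. Constraints, weights, and candidates below are hypothetical.

import math
import re

# Each constraint is a regular expression over a toy segment string, paired
# with a weight (stipulated here; learned from data in the actual model).
constraints = [
    (r"[ptk]$", 2.0),          # penalize word-final voiceless stops (hypothetical)
    (r"[aeiou][aeiou]", 1.5),  # penalize vowel sequences (hypothetical)
]

def violations(word, pattern):
    """Number of times `pattern` matches in `word` (violation count)."""
    return len(re.findall(pattern, word))

def harmony(word):
    """Weighted sum of constraint violations (lower is better)."""
    return sum(w * violations(word, p) for p, w in constraints)

def maxent_probs(candidates):
    """Maxent distribution: P(x) proportional to exp(-harmony(x))."""
    scores = {x: math.exp(-harmony(x)) for x in candidates}
    z = sum(scores.values())
    return {x: s / z for x, s in scores.items()}

# Well-formed candidates receive higher probability than candidates that
# violate the hypothetical constraints.
print(maxent_probs(["pata", "patak", "paata"]))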