The study of phonotactics is a central topic in phonology. We propose a theory of phonotactic grammars and a learning algorithm that constructs such grammars from positive evidence. Our grammars consist of constraints that are assigned numerical weights according to the principle of maximum entropy. The grammars assess possible words on the basis of the weighted sum of their constraint violations. The learning algorithm yields grammars that can capture both categorical and gradient phonotactic patterns. The algorithm is not provided with constraints in advance, but uses its own resources to form constraints and weight them. A baseline model, in which Universal Grammar is reduced to a feature set and an SPE-style constraint format, suffices to learn many phonotactic phenomena. For the model to learn nonlocal phenomena such as stress and vowel harmony, it must be augmented with autosegmental tiers and metrical grids. Our results thus offer novel, learning-theoretic support for such representations. We apply the model in a variety of learning simulations, showing that the learned grammars capture the distributional generalizations of the languages studied and accurately predict the findings of a phonotactic experiment.
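As a rough illustration of how a weighted sum of constraint violations can assess possible words, the following minimal sketch assumes the standard maximum entropy formulation, in which a form's probability is proportional to the exponential of its negated weighted violation sum. The constraint labels, weights, and candidate forms are hypothetical and chosen only for illustration. Normalizing over a small finite candidate set is a further simplification and is not how the model itself carries out its computations.

```python
import math

def weighted_violations(violations, weights):
    """Weighted sum of a form's constraint violations (lower is better)."""
    return sum(weights[name] * count for name, count in violations.items())

def maxent_probabilities(candidates, weights):
    """Assign each form a probability proportional to exp(-weighted sum).

    Assumption: normalization is over the finite candidate set passed in,
    a simplification made for this sketch.
    """
    scores = {
        form: math.exp(-weighted_violations(viol, weights))
        for form, viol in candidates.items()
    }
    total = sum(scores.values())
    return {form: score / total for form, score in scores.items()}

# Hypothetical constraints and weights, not taken from any learned grammar.
weights = {"*bn-onset": 3.0, "*bl-onset": 0.0}

# Hypothetical forms, each mapped to its violation counts per constraint.
candidates = {
    "blick": {"*bl-onset": 1},
    "bnick": {"*bn-onset": 1},
}

probs = maxent_probabilities(candidates, weights)
for form, p in probs.items():
    print(f"{form}: {p:.3f}")
```

Because the weights are real-valued, the same scoring scheme can express both near-categorical bans (large weights) and gradient dispreferences (small weights).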
We would like to thank two anonymous LI reviewers, Steven Abney, Paul Boersma, Michael Hammond, Robert Kirchner, Robert Malouf, Joe Pater, Donca Steriade, Kie Zuraw, and audiences at the University of Michigan, the University of California at San Diego, the University of Arizona, and UCLA for helpful input on our project.
Special thanks to Jason Eisner for alerting us to the feasibility of using finite state machines to formalize the computations of our model.