Abstract
This article provides a critical assessment of the Gradual Learning Algorithm (GLA) for probabilistic optimality-theoretic (OT) grammars proposed by Boersma and Hayes (2001). We discuss the limitations of a standard algorithm for OT learning and outline how the GLA attempts to overcome these limitations. We point out a number of serious shortcomings of the GLA: (a) Methodologically, the GLA has not been tested on unseen data, although such testing is standard practice in computational language learning. (b) We provide counterexamples, that is, attested data sets that the GLA is not able to learn. (c) Essential algorithmic properties of the GLA (correctness and convergence) have not been proven formally. (d) By modeling frequency distributions in the grammar, the GLA conflates the notions of competence and performance. This leads to serious conceptual problems, as OT crucially relies on the competence/performance distinction.