Abstract

During the course of first language acquisition, children produce linguistic forms that do not conform to adult grammar. In this paper, we introduce a data set and approach for systematically modeling this child-adult grammar divergence. Our corpus consists of child sentences with corrected adult forms. We bridge the gap between these forms with a discriminatively reranked noisy channel model that translates child sentences into equivalent adult utterances. Our method outperforms MT and ESL baselines, reducing child error by 20%. Our model allows us to chart specific aspects of grammar development in longitudinal studies of children, and investigate the hypothesis that children share a common developmental path in language acquisition.

This content is only available as a PDF.
This is an open-access article distributed under the terms of the Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License, which permits you to copy and redistribute in any medium or format, for non-commercial use only, provided that the original work is not remixed, transformed, or built upon, and that appropriate credit to the original source is given. For a full description of the license, please visit https://creativecommons.org/licenses/by/4.0/legalcode.