Abstract

Universal Dependencies (UD) is a framework for morphosyntactic annotation of human language, which to date has been used to create treebanks for more than 100 languages. In this article, we outline the linguistic theory of the UD framework, which draws on a long tradition of typologically oriented grammatical theories. Grammatical relations between words are centrally used to explain how predicate–argument structures are encoded morphosyntactically in different languages, while morphological features and part-of-speech classes give the properties of words. We argue that this theory is a good basis for cross-linguistically consistent annotation of typologically diverse languages in a way that supports computational natural language understanding as well as broader linguistic studies.

Author notes

* The Ohio State University, Department of Linguistics, Columbus, OH 43210, U.S.A. E-mail: demarneffe.1@osu.edu.

** Stanford University, Department of Linguistics, Stanford, CA 94305, U.S.A. E-mail: manning@cs.stanford.edu.

Uppsala University, Department of Linguistics and Philology, Box 635, 75126 Uppsala, Sweden. E-mail: joakim.nivre@lingfil.uu.se.

Charles University, Faculty of Mathematics and Physics, ÚFAL, 11800 Praha, Czechia. E-mail: zeman@ufal.mff.cuni.cz.

This is an open-access article distributed under the terms of the Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License, which permits you to copy and redistribute in any medium or format, for non-commercial use only, provided that the original work is not remixed, transformed, or built upon, and that appropriate credit to the original source is given. For a full description of the license, please visit https://creativecommons.org/licenses/by-nc-nd/4.0/legalcode.