Alexis Nasr
Publisher: Journals Gateway
Computational Linguistics (2016) 42 (1): 55–90.
Published: 01 March 2016
Abstract
Statistical parsers are trained on treebanks that are composed of a few thousand sentences. To limit data sparseness and computational complexity, such parsers make strong independence hypotheses about the decisions that are made to build a syntactic tree. These independence hypotheses yield a decomposition of the syntactic structures into small pieces, which in turn prevents the parser from adequately modeling many lexico-syntactic phenomena, such as selectional constraints and subcategorization frames. Additionally, treebanks are several orders of magnitude too small for many of these lexico-syntactic regularities to be observed. In this article, we propose a solution to both problems: how to account for patterns that exceed the size of the pieces that are modeled in the parser, and how to obtain subcategorization frames and selectional constraints from raw corpora and incorporate them in the parsing process. The proposed method was evaluated on French and on English. The experiments on French showed a 41.6% decrease in selectional constraint violations and a 22% decrease in erroneous subcategorization frame assignments. These figures are lower for English: 16.21% in the first case and 8.83% in the second.