Mark Johnson
Journal Articles
Publisher: Journals Gateway
Computational Linguistics (2007) 33 (4): 477–491.
Published: 01 December 2007
Abstract
This article studies the relationship between weighted context-free grammars (WCFGs), where each production is associated with a positive real-valued weight, and probabilistic context-free grammars (PCFGs), where the weights of the productions associated with a nonterminal are constrained to sum to one. Because the class of WCFGs properly includes the PCFGs, one might expect that WCFGs can describe distributions that PCFGs cannot. However, Z. Chi (1999, Computational Linguistics, 25(1):131–160) and S. P. Abney, D. A. McAllester, and F. Pereira (1999, In Proceedings of the 37th Annual Meeting of the Association for Computational Linguistics, pages 542–549, College Park, MD) proved that every WCFG distribution is equivalent to some PCFG distribution. We extend their results to conditional distributions, and show that every WCFG conditional distribution of parses given strings is also the conditional distribution defined by some PCFG, even when the WCFG's partition function diverges. This shows that any parsing or labeling accuracy improvement from conditional estimation of WCFGs or conditional random fields (CRFs) over joint estimation of PCFGs or hidden Markov models (HMMs) is due to the estimation procedure rather than the change in model class, because PCFGs and HMMs are exactly as expressive as WCFGs and chain-structured CRFs, respectively.
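The joint-distribution equivalence mentioned in the abstract can be illustrated with a small sketch of the standard renormalization construction, under the assumption that the partition function converges. It is an illustration only, not the article's proof, and it does not cover the divergent case that the article's conditional result handles. The toy grammar, its weights, and the function names `partition_function` and `renormalize` are hypothetical and chosen for this example.

```python
def partition_function(rules, nonterminals, iters=1000):
    """Approximate the least fixed point of
    Z_A = sum over rules A -> alpha of w(A -> alpha) * prod of Z_B for nonterminals B in alpha,
    by iterating from Z = 0. Assumes the partition function converges."""
    z = {a: 0.0 for a in nonterminals}
    for _ in range(iters):
        new_z = {a: 0.0 for a in nonterminals}
        for (lhs, rhs), w in rules.items():
            term = w
            for sym in rhs:
                if sym in nonterminals:
                    term *= z[sym]
            new_z[lhs] += term
        z = new_z
    return z


def renormalize(rules, nonterminals):
    """Turn rule weights into rule probabilities:
    p(A -> alpha) = w(A -> alpha) * prod of Z_B for nonterminals B in alpha, divided by Z_A."""
    z = partition_function(rules, nonterminals)
    pcfg = {}
    for (lhs, rhs), w in rules.items():
        p = w
        for sym in rhs:
            if sym in nonterminals:
                p *= z[sym]
        pcfg[(lhs, rhs)] = p / z[lhs]
    return pcfg


# Hypothetical toy WCFG: S -> S S (weight 0.2), S -> a (weight 1.0).
# The weights for S do not sum to one, so this is not already a PCFG.
toy = {("S", ("S", "S")): 0.2, ("S", ("a",)): 1.0}
pcfg = renormalize(toy, {"S"})
print(pcfg)
```

In this sketch the resulting probabilities for S sum to one, and each tree's probability under the renormalized grammar equals its WCFG weight divided by the partition function of S, which is the sense in which the two grammars define the same distribution over trees.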
Journal Articles
Publisher: Journals Gateway
Computational Linguistics (2002) 28 (1): 71–76.
Published: 01 March 2002
Abstract
A data-oriented parsing (DOP) model for statistical parsing associates fragments of linguistic representations with numerical weights, where these weights are estimated by normalizing the empirical frequency of each fragment in a training corpus (see Bod [1998] and references cited therein). This note observes that this estimation method is biased and inconsistent; that is, the estimated distribution does not in general converge on the true distribution as the size of the training corpus increases.