Skip Nav Destination
Close Modal
Update search
NARROW
Format
Journal
TocHeadingTitle
Date
Availability
1-2 of 2
Ivan Titov
Close
Follow your search
Access your saved searches in your account
Would you like to receive an alert when new items match your search?
Sort by
Journal Articles
Publisher: Journals Gateway
Computational Linguistics (2014) 40 (3): 671–685.
Published: 01 September 2014
FIGURES
| View All (4)
Abstract
View articletitled, Improved Estimation of Entropy for Evaluation of Word Sense Induction
View
PDF
for article titled, Improved Estimation of Entropy for Evaluation of Word Sense Induction
Information-theoretic measures are among the most standard techniques for evaluation of clustering methods including word sense induction (WSI) systems. Such measures rely on sample-based estimates of the entropy. However, the standard maximum likelihood estimates of the entropy are heavily biased with the bias dependent on, among other things, the number of clusters and the sample size. This makes the measures unreliable and unfair when the number of clusters produced by different systems vary and the sample size is not exceedingly large. This corresponds exactly to the setting of WSI evaluation where a ground-truth cluster sense number arguably does not exist and the standard evaluation scenarios use a small number of instances of each word to compute the score. We describe more accurate entropy estimators and analyze their performance both in simulations and on evaluation of WSI systems.
Journal Articles
Publisher: Journals Gateway
Computational Linguistics (2013) 39 (4): 949–998.
Published: 01 December 2013
FIGURES
| View All (14)
Abstract
View articletitled, Multilingual Joint Parsing of Syntactic and Semantic Dependencies with a Latent Variable Model
View
PDF
for article titled, Multilingual Joint Parsing of Syntactic and Semantic Dependencies with a Latent Variable Model
Current investigations in data-driven models of parsing have shifted from purely syntactic analysis to richer semantic representations, showing that the successful recovery of the meaning of text requires structured analyses of both its grammar and its semantics. In this article, we report on a joint generative history-based model to predict the most likely derivation of a dependency parser for both syntactic and semantic dependencies, in multiple languages. Because these two dependency structures are not isomorphic, we propose a weak synchronization at the level of meaningful subsequences of the two derivations. These synchronized subsequences encompass decisions about the left side of each individual word. We also propose novel derivations for semantic dependency structures, which are appropriate for the relatively unconstrained nature of these graphs. To train a joint model of these synchronized derivations, we make use of a latent variable model of parsing, the Incremental Sigmoid Belief Network (ISBN) architecture. This architecture induces latent feature representations of the derivations, which are used to discover correlations both within and between the two derivations, providing the first application of ISBNs to a multi-task learning problem. This joint model achieves competitive performance on both syntactic and semantic dependency parsing for several languages. Because of the general nature of the approach, this extension of the ISBN architecture to weakly synchronized syntactic-semantic derivations is also an exemplification of its applicability to other problems where two independent, but related, representations are being learned.