Stefan Riezler
On the Problem of Theoretical Terms in Empirical Computational Linguistics
Computational Linguistics (2014) 40 (1): 235–245.
Published: 01 March 2014
Abstract
Philosophy of science has pointed out a problem of theoretical terms in empirical sciences. This problem arises if all known measuring procedures for a quantity of a theory presuppose the validity of this very theory, because then statements containing theoretical terms are circular. We argue that a similar circularity can happen in empirical computational linguistics, especially in cases where data are manually annotated by experts. We define a criterion of T-non-theoretical grounding as guidance to avoid such circularities, and exemplify how this criterion can be met by crowdsourcing, by task-related data annotation, or by data in the wild. We argue that this criterion should be considered as a necessary condition for an empirical science, in addition to measures for reliability of data annotation.
Query Rewriting Using Monolingual Statistical Machine Translation
Computational Linguistics (2010) 36 (3): 569–582.
Published: 01 September 2010
Abstract
Long queries often suffer from low recall in Web search due to conjunctive term matching. The chances of matching words in relevant documents can be increased by rewriting query terms into new terms with similar statistical properties. We present a comparison of approaches that deploy user query logs to learn rewrites of query terms into terms from the document space. We show that the best results are achieved by adopting the perspective of bridging the “lexical chasm” between queries and documents by translating from a source language of user queries into a target language of Web documents. We train a state-of-the-art statistical machine translation model on query-snippet pairs from user query logs, and extract expansion terms from the query rewrites produced by the monolingual translation system. We show in an extrinsic evaluation in a real-world Web search task that the combination of a query-to-snippet translation model with a query language model achieves improved contextual query expansion compared to a state-of-the-art query expansion model that is trained on the same query log data.
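The core of the method can be pictured in a few lines: given a translation model trained on query-snippet pairs, the expansion terms are simply the novel terms that appear in the model's n-best rewrites of a query. The sketch below is illustrative only; the model object and its translate_nbest interface are hypothetical stand-ins for the trained monolingual SMT system, not an API described in the article.

    # Hypothetical sketch: query expansion from n-best monolingual rewrites.
    # `model.translate_nbest(query, n)` is an assumed interface that yields
    # (rewrite, score) pairs from a query-to-snippet translation model.

    def expansion_terms(query, model, n=10):
        """Collect terms from n-best rewrites that do not occur in the query."""
        original = set(query.split())
        expansions = set()
        for rewrite, score in model.translate_nbest(query, n):
            for term in rewrite.split():
                if term not in original:
                    expansions.add(term)
        return expansions

    # Usage sketch: appending the new terms relaxes conjunctive term
    # matching, raising the chance of hitting words in relevant documents.
    # expanded = query + " " + " ".join(expansion_terms(query, model))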
Wide-Coverage Deep Statistical Parsing Using Automatic Dependency Structure Annotation
Computational Linguistics (2008) 34 (1): 81–124.
Published: 01 March 2008
Abstract
A number of researchers have recently conducted experiments comparing “deep” hand-crafted wide-coverage parsers with “shallow” treebank- and machine-learning-based parsers at the level of dependencies, using simple and automatic methods to convert tree output generated by the shallow parsers into dependencies. In this article, we revisit such experiments, this time using sophisticated automatic LFG f-structure annotation methodologies, with surprising results. We compare various PCFG and history-based parsers to find a baseline parsing system that fits best into our automatic dependency structure annotation technique. This combined system of syntactic parser and dependency structure annotation is compared to two hand-crafted, deep constraint-based parsers, RASP and XLE. We evaluate using dependency-based gold standards and use the Approximate Randomization Test to test the statistical significance of the results. Our experiments show that machine-learning-based shallow grammars augmented with sophisticated automatic dependency annotation technology outperform hand-crafted, deep, wide-coverage constraint grammars. Currently our best system achieves an f-score of 82.73% against the PARC 700 Dependency Bank, a statistically significant improvement of 2.18% over the most recent result of 80.55% for the hand-crafted LFG grammar and XLE parsing system. It also achieves an f-score of 80.23% against the CBS 500 Dependency Bank, a statistically significant 3.66% improvement over the 76.57% achieved by the hand-crafted RASP grammar and parsing system.
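The Approximate Randomization Test used for significance testing here can be stated compactly. The following is a minimal sketch, assuming each parser's output has been reduced to per-sentence (correct, predicted, gold) dependency counts; the data representation and function names are my own for illustration, not taken from the article.

    import random

    def fscore(counts):
        """F-score from per-sentence (correct, predicted, gold) counts."""
        c = sum(x[0] for x in counts)
        p = sum(x[1] for x in counts)
        g = sum(x[2] for x in counts)
        prec = c / p if p else 0.0
        rec = c / g if g else 0.0
        return 2 * prec * rec / (prec + rec) if prec + rec else 0.0

    def approximate_randomization(counts_a, counts_b, trials=10000, seed=0):
        """p-value for the observed f-score difference between two parsers."""
        rng = random.Random(seed)
        observed = abs(fscore(counts_a) - fscore(counts_b))
        at_least_as_large = 0
        for _ in range(trials):
            shuffled_a, shuffled_b = [], []
            for a, b in zip(counts_a, counts_b):
                # Under the null hypothesis the system labels are exchangeable,
                # so swap the two parsers' outputs on each sentence with p=0.5.
                if rng.random() < 0.5:
                    a, b = b, a
                shuffled_a.append(a)
                shuffled_b.append(b)
            if abs(fscore(shuffled_a) - fscore(shuffled_b)) >= observed:
                at_least_as_large += 1
        # Add-one smoothing gives the standard conservative p-value estimate.
        return (at_least_as_large + 1) / (trials + 1)

The design choice worth noting is that randomization operates on raw per-sentence counts rather than per-sentence f-scores, so the statistic recomputed in each trial is the corpus-level f-score, matching how the reported scores are aggregated.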
Computational Linguistics (2006) 32 (3): 439–442.
Published: 01 September 2006