Skip Nav Destination
Close Modal
Update search
NARROW
Format
Journal
Date
Availability
1-5 of 5
Timothy Baldwin
Close
Follow your search
Access your saved searches in your account
Would you like to receive an alert when new items match your search?
Sort by
Journal Articles
Publisher: Journals Gateway
Transactions of the Association for Computational Linguistics (2023) 11: 997–1013.
Published: 15 August 2023
FIGURES
| View All (7)
Abstract
View article
PDF
Despite the subjective nature of semantic textual similarity (STS) and pervasive disagreements in STS annotation, existing benchmarks have used averaged human ratings as gold standard. Averaging masks the true distribution of human opinions on examples of low agreement, and prevents models from capturing the semantic vagueness that the individual ratings represent. In this work, we introduce USTS, the first U ncertainty-aware STS dataset with ∼15,000 Chinese sentence pairs and 150,000 labels, to study collective human opinions in STS. Analysis reveals that neither a scalar nor a single Gaussian fits a set of observed judgments adequately. We further show that current STS models cannot capture the variance caused by human disagreement on individual instances, but rather reflect the predictive confidence over the aggregate dataset.
Journal Articles
Publisher: Journals Gateway
Transactions of the Association for Computational Linguistics (2022) 10: 680–696.
Published: 15 June 2022
FIGURES
Abstract
View article
PDF
State-of-the-art classification and regression models are often not well calibrated, and cannot reliably provide uncertainty estimates, limiting their utility in safety-critical applications such as clinical decision-making. While recent work has focused on calibration of classifiers, there is almost no work in NLP on calibration in a regression setting. In this paper, we quantify the calibration of pre- trained language models for text regression, both intrinsically and extrinsically. We further apply uncertainty estimates to augment training data in low-resource domains. Our experiments on three regression tasks in both self-training and active-learning settings show that uncertainty estimation can be used to increase overall performance and enhance model generalization.
Journal Articles
Publisher: Journals Gateway
Transactions of the Association for Computational Linguistics (2021) 9: 621–640.
Published: 08 July 2021
FIGURES
Abstract
View article
PDF
While pretrained language models (LMs) have driven impressive gains over morpho-syntactic and semantic tasks, their ability to model discourse and pragmatic phenomena is less clear. As a step towards a better understanding of their discourse modeling capabilities, we propose a sentence intrusion detection task. We examine the performance of a broad range of pretrained LMs on this detection task for English. Lacking a dataset for the task, we introduce INSteD, a novel in truder s en te nce d etection dataset, containing 170,000+ documents constructed from English Wikipedia and CNN news articles. Our experiments show that pretrained LMs perform impressively in in-domain evaluation, but experience a substantial drop in the cross-domain setting, indicating limited generalization capacity. Further results over a novel linguistic probe dataset show that there is substantial room for improvement, especially in the cross- domain setting.
Journal Articles
Unsupervised Acquisition of Comprehensive Multiword Lexicons using Competition in an n -gram Lattice
Publisher: Journals Gateway
Transactions of the Association for Computational Linguistics (2017) 5: 455–470.
Published: 01 November 2017
Abstract
View article
PDF
We present a new model for acquiring comprehensive multiword lexicons from large corpora based on competition among n -gram candidates. In contrast to the standard approach of simple ranking by association measure, in our model n -grams are arranged in a lattice structure based on subsumption and overlap relationships, with nodes inhibiting other nodes in their vicinity when they are selected as a lexical item. We show how the configuration of such a lattice can be optimized tractably, and demonstrate using annotations of sampled n -grams that our method consistently outperforms alternatives by at least 0.05 F-score across several corpora and languages.
Journal Articles
Publisher: Journals Gateway
Transactions of the Association for Computational Linguistics (2014) 2: 27–40.
Published: 01 February 2014
Abstract
View article
PDF
Language identification is the task of automatically detecting the language(s) present in a document based on the content of the document. In this work, we address the problem of detecting documents that contain text from more than one language ( multilingual documents). We introduce a method that is able to detect that a document is multilingual, identify the languages present, and estimate their relative proportions. We demonstrate the effectiveness of our method over synthetic data, as well as real-world multilingual documents collected from the web.