Joel Tetreault
Journal Articles
Transactions of the Association for Computational Linguistics (2019) 7: 551–566.
Published: 01 September 2019
Abstract
Until now, grammatical error correction (GEC) has been primarily evaluated on text written by non-native English speakers, with a focus on student essays. This paper enables GEC development on text written by native speakers by providing a new data set and metric. We present a multiple-reference test corpus for GEC that includes 4,000 sentences in two new domains (formal and informal writing by native English speakers) and 2,000 sentences from a diverse set of non-native student writing. We also collect human judgments of several GEC systems on this new test set and perform a meta-evaluation, assessing how reliable automatic metrics are across these domains. We find that commonly used GEC metrics have inconsistent performance across domains, and therefore we propose a new ensemble metric that is robust on all three domains of text.
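The ensemble metric mentioned in this abstract is not specified in the listing; as a loose illustration of one generic way to ensemble automatic metrics (not the paper's actual method), the sketch below averages z-normalized scores from several component metrics. The metric names and scores are hypothetical placeholders.

    # Generic ensembling sketch: average z-normalized component metric scores.
    from statistics import mean, stdev

    def ensemble_score(system_scores):
        """system_scores: dict mapping metric name -> list of per-system scores.
        Returns one ensemble score per system."""
        normalized = []
        for scores in system_scores.values():
            mu, sigma = mean(scores), stdev(scores)
            normalized.append([(s - mu) / sigma for s in scores])
        # Average the normalized metric scores for each system (column-wise).
        return [mean(col) for col in zip(*normalized)]

    # Hypothetical scores for three systems under two component metrics.
    print(ensemble_score({"metric_a": [0.62, 0.55, 0.70],
                          "metric_b": [41.0, 38.5, 44.2]}))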
Journal Articles
Transactions of the Association for Computational Linguistics (2016) 4: 565.
Published: 01 December 2016
Journal Articles
Transactions of the Association for Computational Linguistics (2016) 4: 169–182.
Published: 01 May 2016
Abstract
The field of grammatical error correction (GEC) has grown substantially in recent years, with research directed at both evaluation metrics and improved system performance against those metrics. One unvisited assumption, however, is the reliance of GEC evaluation on error-coded corpora, which contain specific labeled corrections. We examine current practices and show that GEC’s reliance on such corpora unnaturally constrains annotation and automatic evaluation, resulting in (a) sentences that do not sound acceptable to native speakers and (b) system rankings that do not correlate with human judgments. In light of this, we propose an alternate approach that jettisons costly error coding in favor of unannotated, whole-sentence rewrites. We compare the performance of existing metrics over different gold-standard annotations, and show that automatic evaluation with our new annotation scheme has very strong correlation with expert rankings (ρ = 0.82). As a result, we advocate for a fundamental and necessary shift in the goal of GEC, from correcting small, labeled error types, to producing text that has native fluency.
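The ρ = 0.82 reported above is a rank correlation between automatic evaluation scores and expert rankings. As a minimal sketch (not the paper's code, and with hypothetical numbers), Spearman's ρ over a handful of systems could be computed as follows.

    # Correlate an automatic metric's system scores with expert judgments.
    from scipy.stats import spearmanr

    # Hypothetical scores for five GEC systems.
    metric_scores = [0.71, 0.64, 0.58, 0.52, 0.47]   # automatic metric
    expert_scores = [4.5, 4.1, 3.2, 3.4, 2.8]        # human judgments
    rho, p_value = spearmanr(metric_scores, expert_scores)
    print(f"Spearman rho = {rho:.2f} (p = {p_value:.3f})")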
Journal Articles
Transactions of the Association for Computational Linguistics (2016) 4: 61–74.
Published: 01 March 2016
Abstract
This paper presents an empirical study of linguistic formality. We perform an analysis of humans’ perceptions of formality in four different genres. These findings are used to develop a statistical model for predicting formality, which is evaluated under different feature settings and genres. We apply our model to an investigation of formality in online discussion forums, and present findings consistent with theories of formality and linguistic coordination.
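As a generic illustration only (not the authors' model, which this abstract does not detail), a formality predictor can be sketched as a regression over simple surface features; the features, sentences, and ratings below are hypothetical.

    # Sketch: predict a sentence-level formality score from surface features.
    from sklearn.linear_model import Ridge

    def surface_features(sentence):
        tokens = sentence.split()
        return [
            len(tokens),                                        # sentence length
            sum(t[0].isupper() for t in tokens) / len(tokens),  # capitalization rate
            sentence.count("!"),                                # exclamation marks
        ]

    # Hypothetical sentences with hypothetical formality ratings (1-7 scale).
    train = [("Please find the attached report.", 6.5),
             ("hey wanna grab lunch later!!", 1.8),
             ("We appreciate your prompt response.", 6.1),
             ("lol that was so funny", 1.5)]
    X = [surface_features(s) for s, _ in train]
    y = [score for _, score in train]
    model = Ridge(alpha=1.0).fit(X, y)
    print(model.predict([surface_features("Thank you for your consideration.")]))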