Tania Bedrax-Weiss
Towards Question-Answering as an Automatic Metric for Evaluating the Content Quality of a Summary
Open Access. Publisher: Journals Gateway
Transactions of the Association for Computational Linguistics (2021) 9: 774–789.
Published: 02 August 2021
Abstract
A desirable property of a reference-based evaluation metric that measures the content quality of a summary is that it should estimate how much information the summary has in common with a reference. Traditional text-overlap-based metrics such as ROUGE fail to achieve this because they are limited to matching tokens, either lexically or via embeddings. In this work, we propose a metric to evaluate the content quality of a summary using question answering (QA). QA-based methods directly measure a summary's information overlap with a reference, making them fundamentally different from text-overlap metrics. We demonstrate the experimental benefits of QA-based metrics through an analysis of our proposed metric, QAEval. QAEval outperforms current state-of-the-art metrics on most evaluations using benchmark datasets, while being competitive on others due to limitations of state-of-the-art models. Through a careful analysis of each component of QAEval, we identify its performance bottlenecks and estimate that its potential upper-bound performance surpasses all other automatic metrics, approaching that of the gold-standard Pyramid Method.
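The QA-based idea in the abstract — generate question–answer pairs from the reference, try to answer them using only the candidate summary, and score the summary by the fraction answered correctly — can be sketched as follows. This is a toy illustration, not the authors' implementation: the real QAEval metric uses learned question-generation and QA models, whereas `generate_qa_pairs` and `answer_from_summary` below are hypothetical lexical stand-ins.

```python
# Hedged sketch of QA-based summary scoring (toy stand-ins, not QAEval itself):
# 1) derive (question, answer) pairs from the reference summary,
# 2) attempt to answer each question using only the candidate summary,
# 3) score = fraction of questions answered correctly.

def generate_qa_pairs(reference: str):
    # Hypothetical stand-in for a learned question-generation model:
    # emit one QA pair per "X is Y" sentence in the reference.
    pairs = []
    for sent in reference.split("."):
        sent = sent.strip()
        if " is " in sent:
            subject, answer = sent.split(" is ", 1)
            pairs.append((f"What is {subject}?", answer))
    return pairs

def answer_from_summary(question: str, summary: str, gold: str) -> bool:
    # Toy stand-in for a QA model: count the question as answered
    # if the gold answer string appears in the candidate summary.
    return gold.lower() in summary.lower()

def qa_based_score(reference: str, summary: str) -> float:
    # Information overlap as the fraction of reference-derived
    # questions the summary can answer.
    pairs = generate_qa_pairs(reference)
    if not pairs:
        return 0.0
    correct = sum(answer_from_summary(q, summary, a) for q, a in pairs)
    return correct / len(pairs)

reference = "Paris is the capital of France. The Louvre is a museum in Paris."
summary = "The capital of France is Paris."
print(qa_based_score(reference, summary))  # answers 1 of 2 questions -> 0.5
```

Note how this differs from token-overlap metrics like ROUGE: the score is defined over answerable facts rather than matched n-grams, which is the distinction the abstract draws.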