Yumo Xu
1–2 of 2 results
Journal Articles
Document Summarization with Latent Queries
Open Access. Publisher: Journals Gateway
Transactions of the Association for Computational Linguistics (2022) 10: 623–638.
Published: 04 May 2022
Abstract
The availability of large-scale datasets has driven the development of neural models that create generic summaries for single or multiple documents. For query-focused summarization (QFS), labeled training data in the form of queries, documents, and summaries is not readily available. We provide a unified modeling framework for any kind of summarization, under the assumption that all summaries are a response to a query, which is observed in the case of QFS and latent in the case of generic summarization. We model queries as discrete latent variables over document tokens, and learn representations compatible with observed and unobserved query verbalizations. Our framework formulates summarization as a generative process, and jointly optimizes a latent query model and a conditional language model. Despite learning from generic summarization data only, our approach outperforms strong comparison systems across benchmarks, query types, document settings, and target domains.
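The abstract describes the architecture only at a high level: a latent query model over document tokens trained jointly with a conditional language model. The following is a minimal PyTorch sketch of that idea, assuming a Gumbel-softmax relaxation of the discrete token-level query variable and a single-layer Transformer decoder as the conditional language model; all module names, dimensions, and hyperparameters are illustrative and not taken from the paper.

# Minimal sketch of the latent-query idea described in the abstract, not the
# authors' implementation. The Gumbel-softmax relaxation, module names, and
# all dimensions are illustrative assumptions.
import torch
import torch.nn as nn
import torch.nn.functional as F


class LatentQuerySummarizer(nn.Module):
    def __init__(self, vocab_size=32000, d_model=256):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, d_model)
        # Latent query model: scores each document token as query / non-query.
        self.query_scorer = nn.Linear(d_model, 2)
        # Conditional language model: generates the summary from the
        # query-weighted document representation (a single decoder layer here).
        self.decoder = nn.TransformerDecoderLayer(d_model, nhead=4, batch_first=True)
        self.lm_head = nn.Linear(d_model, vocab_size)

    def forward(self, doc_ids, summary_ids):
        doc = self.embed(doc_ids)                        # (B, L_doc, d)
        # Discrete latent query over document tokens, relaxed with Gumbel-softmax
        # so the model stays differentiable during training.
        logits = self.query_scorer(doc)                  # (B, L_doc, 2)
        q = F.gumbel_softmax(logits, tau=1.0, hard=True)[..., 1:]  # (B, L_doc, 1)
        doc_as_query = doc * q                           # keep only "query" tokens
        # Decode the summary conditioned on the latent query representation.
        tgt = self.embed(summary_ids)                    # (B, L_sum, d)
        hidden = self.decoder(tgt, doc_as_query)
        return self.lm_head(hidden), q


# Joint training: the summary cross-entropy trains the conditional LM, and the
# gradient flows through the relaxed latent variable into the query model.
model = LatentQuerySummarizer()
doc = torch.randint(0, 32000, (2, 50))
summ = torch.randint(0, 32000, (2, 12))
logits, q = model(doc, summ[:, :-1])
loss = F.cross_entropy(logits.reshape(-1, logits.size(-1)), summ[:, 1:].reshape(-1))
loss.backward()

Because the single loss reaches both components, this mirrors the joint optimization of a latent query model and a conditional language model that the abstract mentions, without requiring any observed queries at training time.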
Journal Articles
Weakly Supervised Domain Detection
Open Access. Publisher: Journals Gateway
Transactions of the Association for Computational Linguistics (2019) 7: 581–596.
Published: 01 September 2019
Abstract
In this paper we introduce domain detection as a new natural language processing task. We argue that the ability to detect textual segments that are domain-heavy (i.e., sentences or phrases that are representative of and provide evidence for a given domain) could enhance the robustness and portability of various text classification applications. We propose an encoder-detector framework for domain detection and bootstrap classifiers with multiple instance learning. The model is hierarchically organized and suited to multilabel classification. We demonstrate that despite learning with minimal supervision, our model can be applied to text spans of different granularities, languages, and genres. We also showcase the potential of domain detection for text summarization.
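As with the entry above, the abstract sketches the model only in outline: a sentence encoder, a per-sentence domain detector, and multiple instance learning so that only document-level labels are needed. Below is a minimal PyTorch sketch under those assumptions; the mean-pooled word embeddings, GRU encoder, attention-based MIL pooling, and all sizes are illustrative choices, not the authors' implementation.

# Minimal sketch of an encoder-detector model trained with multiple instance
# learning, as described in the abstract; not the authors' implementation.
# The attention-based MIL pooling and all hyperparameters are assumptions.
import torch
import torch.nn as nn


class EncoderDetector(nn.Module):
    def __init__(self, vocab_size=32000, d_model=128, n_domains=7):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, d_model)
        # Encoder: sentence representations (mean-pooled words followed by a
        # sentence-level GRU for document context).
        self.sent_gru = nn.GRU(d_model, d_model, batch_first=True)
        # Detector: per-sentence multi-label domain scores (the output of
        # interest, even though only document labels are seen in training).
        self.detector = nn.Linear(d_model, n_domains)
        # MIL pooling: attention weights decide how much each sentence
        # contributes to the document-level prediction.
        self.attn = nn.Linear(d_model, 1)

    def forward(self, doc_word_ids):
        # doc_word_ids: (batch, n_sents, n_words)
        words = self.embed(doc_word_ids)                      # (B, S, W, d)
        sents = words.mean(dim=2)                             # (B, S, d)
        sents, _ = self.sent_gru(sents)                       # contextualised sentences
        sent_scores = torch.sigmoid(self.detector(sents))     # (B, S, n_domains)
        weights = torch.softmax(self.attn(sents), dim=1)      # (B, S, 1)
        doc_scores = (weights * sent_scores).sum(dim=1)       # (B, n_domains)
        return sent_scores, doc_scores


# Weak supervision: only document-level, multi-label domain labels are used;
# sentence-level domain scores come for free from the detector.
model = EncoderDetector()
docs = torch.randint(0, 32000, (2, 6, 20))        # 2 docs, 6 sentences, 20 words each
doc_labels = torch.randint(0, 2, (2, 7)).float()
sent_scores, doc_scores = model(docs)
loss = nn.functional.binary_cross_entropy(doc_scores, doc_labels)
loss.backward()

The hierarchical organisation (words to sentences to document) and the bag-level loss are what let the detector assign domain scores to individual text spans despite never seeing span-level supervision.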