Johanna D. Moore
Journal Articles
Publisher: Journals Gateway
Computational Linguistics (2011) 37 (3): 489–539.
Published: 01 September 2011
Abstract
In spoken dialog systems, information must be presented sequentially, making it difficult to quickly browse through a large number of options. Recent studies have shown that user satisfaction is negatively correlated with dialog duration, suggesting that systems should be designed to maximize the efficiency of the interactions. Analysis of the logs of 2,000 dialogs between users and nine different dialog systems reveals that a large percentage of the time is spent on the information presentation phase; thus, there is potentially a large pay-off to be gained from optimizing information presentation in spoken dialog systems. This article proposes a method that improves the efficiency of coping with large numbers of diverse options by selecting options and then structuring them based on a model of the user's preferences. This enables the dialog system to automatically determine trade-offs between alternative options that are relevant to the user and present these trade-offs explicitly. Multiple attractive options are thereby structured such that the user can gradually refine her request to find the optimal trade-off. To evaluate and challenge our approach, we conducted a series of experiments that test the effectiveness of the proposed strategy. Experimental results show that basing the content structuring and content selection process on a user model increases the efficiency and effectiveness of the user's interaction. Users complete their tasks more successfully and more quickly. Furthermore, user surveys revealed that participants found that the user-model-based system presents complex trade-offs understandably and increases overall user satisfaction. The experiments also indicate that presenting users with a brief overview of options that do not fit their requirements significantly improves their overview of available options and makes them feel more confident that they have been presented with all relevant options.
Journal Articles
Publisher: Journals Gateway
Computational Linguistics (2010) 36 (2): 159–201.
Published: 01 June 2010
Abstract
Generating responses that take user preferences into account requires adaptation at all levels of the generation process. This article describes a multi-level approach to presenting user-tailored information in spoken dialogues which brings together for the first time multi-attribute decision models, strategic content planning, surface realization that incorporates prosody prediction, and unit selection synthesis that takes the resulting prosodic structure into account. The system selects the most important options to mention and the attributes that are most relevant to choosing between them, based on the user model. Multiple options are selected when each offers a compelling trade-off. To convey these trade-offs, the system employs a novel presentation strategy which straightforwardly lends itself to the determination of information structure, as well as the contents of referring expressions. During surface realization, the prosodic structure is derived from the information structure using Combinatory Categorial Grammar in a way that allows phrase boundaries to be determined in a flexible, data-driven fashion. This approach to choosing pitch accents and edge tones is shown to yield prosodic structures with significantly higher acceptability than baseline prosody prediction models in an expert evaluation. These prosodic structures are then shown to enable perceptibly more natural synthesis using a unit selection voice that aims to produce the target tunes, in comparison to two baseline synthetic voices. An expert evaluation and f0 analysis confirm the superiority of the generator-driven intonation and its contribution to listeners' ratings.