Jinchuan Long
Transactions of the Association for Computational Linguistics (2024) 12: 1578–1596.
Published: 27 November 2024
Abstract
Training a task-oriented dialogue policy with deep reinforcement learning is promising but requires extensive environment exploration, and the large amount of exploration wasted on invalid trajectories makes policy learning inefficient. In this paper, we define dead-end states and argue that they are a major cause of invalid exploration: once a conversation enters a dead-end state, it continues along a dead-end trajectory regardless of the actions taken afterward, until the agent reaches a termination state or the maximum number of turns. We propose a Dead-end Detection and Resurrection (DDR) method that efficiently detects dead-end states and provides a rescue action to guide and correct the exploration direction. To prevent the dialogue policy from repeating its errors, DDR also performs dialogue data augmentation by adding experiences that include dead-end states and their penalties to the experience pool. We first validate the reliability of dead-end detection and then demonstrate the effectiveness and generality of the method across various domains through experiments on four public dialogue datasets.
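To make the mechanism described in the abstract concrete, the following is a minimal sketch of the dead-end detection-and-resurrection idea: flag a state as a dead end, store a penalized transition for the action that led into it, and hand control to a rescue action instead of continuing the doomed trajectory. It assumes a simple threshold-based detector and a generic replay buffer; all names (DeadEndDetector, ReplayBuffer, step_with_ddr, rescue_policy, dead_end_penalty) are illustrative assumptions, not the paper's implementation.

```python
# Illustrative sketch only; the actual DDR method in the paper differs in detail.
import random
from collections import deque


class DeadEndDetector:
    """Hypothetical detector: flags a state as a dead end when its estimated
    probability of ever reaching task success falls below a threshold."""

    def __init__(self, threshold=0.05):
        self.threshold = threshold

    def success_probability(self, state):
        # Placeholder estimate; a learned model would be used in practice.
        return state.get("success_prob", 1.0)

    def is_dead_end(self, state):
        return self.success_probability(state) < self.threshold


class ReplayBuffer:
    """Standard experience pool with capacity-bounded FIFO storage."""

    def __init__(self, capacity=10000):
        self.buffer = deque(maxlen=capacity)

    def add(self, state, action, reward, next_state, done):
        self.buffer.append((state, action, reward, next_state, done))

    def sample(self, batch_size):
        return random.sample(self.buffer, min(batch_size, len(self.buffer)))


def step_with_ddr(state, action, env_step, detector, buffer,
                  rescue_policy, dead_end_penalty=-1.0):
    """Take one environment step; if the next state is a dead end, store a
    penalized transition for the offending action and return a rescue action
    to redirect exploration instead of following the dead-end trajectory."""
    next_state, reward, done = env_step(state, action)

    if detector.is_dead_end(next_state) and not done:
        # Data augmentation: remember that this action led into a dead end.
        buffer.add(state, action, dead_end_penalty, next_state, done)
        # Resurrection: override the policy's next choice with a rescue action.
        next_action = rescue_policy(next_state)
    else:
        buffer.add(state, action, reward, next_state, done)
        next_action = None  # let the dialogue policy pick the next action as usual

    return next_state, next_action, done
```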