Skip Nav Destination
Close Modal
Update search
NARROW
Format
Journal
Date
Availability
1-2 of 2
Han Zhou
Close
Follow your search
Access your saved searches in your account
Would you like to receive an alert when new items match your search?
Sort by
Journal Articles
Publisher: Journals Gateway
Transactions of the Association for Computational Linguistics (2024) 12: 525–542.
Published: 03 May 2024
FIGURES
| View All (9)
Abstract
View article
PDF
Large pretrained language models are widely used in downstream NLP tasks via task- specific fine-tuning, but such procedures can be costly. Recently, Parameter-Efficient Fine-Tuning (PEFT) methods have achieved strong task performance while updating much fewer parameters than full model fine-tuning (FFT). However, it is non-trivial to make informed design choices on the PEFT configurations , such as their architecture, the number of tunable parameters, and even the layers in which the PEFT modules are inserted. Consequently, it is highly likely that the current, manually designed configurations are suboptimal in terms of their performance-efficiency trade-off. Inspired by advances in neural architecture search, we propose AutoPEFT for automatic PEFT configuration selection: We first design an expressive configuration search space with multiple representative PEFT modules as building blocks. Using multi-objective Bayesian optimization in a low-cost setup, we then discover a Pareto-optimal set of configurations with strong performance-cost trade-offs across different numbers of parameters that are also highly transferable across different tasks. Empirically, on GLUE and SuperGLUE tasks, we show that AutoPEFT -discovered configurations significantly outperform existing PEFT methods and are on par or better than FFT without incurring substantial training efficiency costs.
Journal Articles
Publisher: Journals Gateway
Transactions of the Association for Computational Linguistics (2023) 11: 1396–1415.
Published: 16 November 2023
FIGURES
| View All (7)
Abstract
View article
PDF
Creating high-quality annotated data for task-oriented dialog ( ToD ) is known to be notoriously difficult, and the challenges are amplified when the goal is to create equitable, culturally adapted, and large-scale ToD datasets for multiple languages. Therefore, the current datasets are still very scarce and suffer from limitations such as translation-based non-native dialogs with translation artefacts, small scale, or lack of cultural adaptation, among others. In this work, we first take stock of the current landscape of multilingual ToD datasets, offering a systematic overview of their properties and limitations. Aiming to reduce all the detected limitations, we then introduce Multi 3 WOZ , a novel multilingual, multi-domain, multi-parallel ToD dataset. It is large-scale and offers culturally adapted dialogs in 4 languages to enable training and evaluation of multilingual and cross-lingual ToD systems. We describe a complex bottom–up data collection process that yielded the final dataset, and offer the first sets of baseline scores across different ToD -related tasks for future reference, also highlighting its challenging nature.