Table 3

Common subtasks of TST with their corresponding attribute values and datasets. For datasets with multiple attribute-specific corpora, we report the size as the number of sentences in the smallest corpus. We also report whether each dataset is parallel (Pa?).

| Task | Attribute Values | Datasets | Size | Pa? |
| --- | --- | --- | --- | --- |
| Style Features | | | | |
| Formality | Informal ↔ Formal | GYAFC³ (Rao and Tetreault 2018) | 50K | ✓ |
| | | XFORMAL⁴ (Briakou et al. 2021b) | 1K | ✓ |
| Politeness | Impolite → Polite | Politeness⁵ (Madaan et al. 2020) | 1M | ✗ |
| Gender | Masculine ↔ Feminine | Yelp Gender⁶ (Prabhumoye et al. 2018) | 2.5M | ✗ |
| Humor & Romance | Factual ↔ Humorous ↔ Romantic | FlickrStyle⁷ (Gan et al. 2017) | 5K | ✓ |
| Biasedness | Biased → Neutral | Wiki Neutrality⁸ (Pryzant et al. 2020) | 181K | ✓ |
| Toxicity | Offensive → Non-offensive | Twitter (dos Santos, Melnyk, and Padhi 2018) | 58K | ✗ |
| | | Reddit (dos Santos, Melnyk, and Padhi 2018) | 224K | |
| | | Reddit Politics (Tran, Zhang, and Soleymani 2020) | 350K | |
| Authorship | Shakespearean ↔ Modern | Shakespeare (Xu et al. 2012) | 18K | ✓ |
| | Different Bible translators | Bible⁹ (Carlson, Riddell, and Rockmore 2018) | 28M | |
| Simplicity | Complicated → Simple | PWKP (Zhu, Bernhard, and Gurevych 2010) | 108K | ✓ |
| | | Expert (den Bercken, Sips, and Lofi 2019) | 2.2K | ✓ |
| | | MIMIC-III¹⁰ (Weng, Chung, and Szolovits 2019) | 59K | ✗ |
| | | MSD¹¹ (Cao et al. 2020) | 114K | ✓ |
| Engagingness | Plain → Attractive | Math¹² (Koncel-Kedziorski et al. 2016) | <1K | ✓ |
| | | TitleStylist¹³ (Jin et al. 2020a) | 146K | ✗ |
| Content Preferences | | | | |
| Sentiment | Positive ↔ Negative | Yelp¹⁴ (Shen et al. 2017) | 250K | ✗ |
| | | Amazon¹⁵ (He and McAuley 2016) | 277K | |
| Topic | Entertainment ↔ Politics | Yahoo! Answers¹⁶ (Huang et al. 2020) | 153K | ✗ |
| Politics | Democratic ↔ Republican | Political¹⁷ (Voigt et al. 2018) | 540K | ✗ |