Statistics of the datasets used in Gururangan et al. (2020). Sources: ChemProt (Kringelum et al., 2016), RCT (Dernoncourt and Lee, 2017), ACL-ARC (Jurgens et al., 2018), SciERC (Luan et al., 2018), HyperPartisan (Kiesel et al., 2019), AGNews (Zhang et al., 2015), Helpfulness (McAuley et al., 2015), IMDB (Maas et al., 2011). This table is taken from Gururangan et al. (2020).
Domain | Dataset | Label Type | Train | Dev | Test | Classes |
---|---|---|---|---|---|---|
BioMed | ChemProt | relation classification | 4169 | 2427 | 3469 | 13 |
BioMed | RCT | abstract sent. roles | 180040 | 30212 | 30135 | 5 |
CS | ACL-ARC | citation intent | 1688 | 114 | 139 | 6 |
CS | SciERC | relation classification | 3219 | 455 | 974 | 7 |
News | HyperPartisan | partisanship | 515 | 65 | 65 | 2 |
News | AGNews | topic | 115000 | 5000 | 7600 | 4 |
Reviews | Helpfulness | review helpfulness | 115251 | 5000 | 25000 | 2 |
Reviews | IMDB | review sentiment | 20000 | 5000 | 25000 | 2 |