Data Set | Splits (train / dev / test) | Categories | Properties |
---|---|---|---|
Yelp 2013 | 62,522 / 7,773 / 8,671 | • users (1.6k) • products (1.6k) | Categories can be sparse (i.e., there may not be enough reviews for each user/product). |
AAPR | 33,464 / 2,000 / 2,000 | • author (48k) • research area (144) | Authors are sparse and have many category labels. Categories can have multiple labels (e.g., multiple authors, multidisciplinary fields). |
PolMed | 4,500 / 0 / 500 | • politician (505) • media source (2) • audience (2) • political bias (2) | The data set has more categories than the other two. Categories with binary labels may not be diverse enough to be useful. |