mGENRE on the Wikinews-7 unseen languages. Models are trained only on the Mewsli-9 languages (1M datapoints per language: ar, de, en, es, fa, ja, sr, ta, and tr). ‘Can.’ is canonical, ‘N+L’ is ‘name+language’, and ‘L+N’ is ‘language+name’. The ‘M’ suffix (as in ‘L+NM’) indicates marginalization.
Lang. | Can. | N+L | L+N | L+NM |
---|---|---|---|---|
cs | 36.3 | 30.2 | 34.0 | 69.7 |
fr | 62.9 | 57.0 | 53.3 | 73.4 |
it | 44.8 | 43.7 | 42.9 | 56.8 |
pl | 31.9 | 21.2 | 25.6 | 68.8 |
pt | 60.8 | 61.7 | 59.5 | 76.2 |
ru | 34.9 | 32.4 | 35.1 | 65.8 |
zh | 35.1 | 41.1 | 44.0 | 52.8 |
micro-avg | 41.6 | 38.3 | 39.5 | 65.9 |
macro-avg | 43.8 | 41.0 | 42.1 | 66.2 |
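
As a quick consistency check, the macro-average row can be recomputed as the unweighted mean of the seven per-language scores in each column. The sketch below is only illustrative: the names `scores`, `columns`, and `macro_average` are hypothetical and not part of any mGENRE code. The micro-average row is not reproduced here, since it would require per-language instance counts that the table does not list.

```python
# Per-language scores copied from the table above.
# Column order: Can., N+L, L+N, L+NM.
columns = ["Can.", "N+L", "L+N", "L+NM"]
scores = {
    "cs": [36.3, 30.2, 34.0, 69.7],
    "fr": [62.9, 57.0, 53.3, 73.4],
    "it": [44.8, 43.7, 42.9, 56.8],
    "pl": [31.9, 21.2, 25.6, 68.8],
    "pt": [60.8, 61.7, 59.5, 76.2],
    "ru": [34.9, 32.4, 35.1, 65.8],
    "zh": [35.1, 41.1, 44.0, 52.8],
}


def macro_average(per_language):
    """Unweighted mean over languages for each column, rounded to one decimal."""
    n_languages = len(per_language)
    n_columns = len(next(iter(per_language.values())))
    return [
        round(sum(row[i] for row in per_language.values()) / n_languages, 1)
        for i in range(n_columns)
    ]


macro = macro_average(scores)
for name, value in zip(columns, macro):
    print(f"{name}: {value}")
```

The recomputed values agree with the macro-avg row of the table to one decimal place.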