Table 4: mGENRE on the Wikinews-7 unseen languages. Models are trained only on the Mewsli-9 languages (1M datapoints per language: ar, de, en, es, fa, ja, sr, ta, and tr). ‘Can.’ is canonical, ‘N+L’ is ‘name+language’, and ‘L+N’ is the reverse order, ‘language+name’. M indicates marginalization.

Lang.       Can.   N+L    L+N    L+N M
cs          36.3   30.2   34.0   69.7
fr          62.9   57.0   53.3   73.4
it          44.8   43.7   42.9   56.8
pl          31.9   21.2   25.6   68.8
pt          60.8   61.7   59.5   76.2
ru          34.9   32.4   35.1   65.8
zh          35.1   41.1   44.0   52.8

micro-avg   41.6   38.3   39.5   65.9
macro-avg   43.8   41.0   42.1   66.2
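
The macro-avg row can be checked directly from the per-language rows, since a macro-average is the unweighted mean over languages. The following is a minimal sketch, not the authors' evaluation code: the names `scores`, `columns`, and `macro_average` are illustrative, and the values are copied from Table 4. The micro-avg row cannot be reproduced the same way because it weights each language by its number of test instances, which the table does not report.

```python
# Per-language scores copied from Table 4.
# Column order: Can., N+L, L+N, L+N with marginalization (M).
scores = {
    "cs": (36.3, 30.2, 34.0, 69.7),
    "fr": (62.9, 57.0, 53.3, 73.4),
    "it": (44.8, 43.7, 42.9, 56.8),
    "pl": (31.9, 21.2, 25.6, 68.8),
    "pt": (60.8, 61.7, 59.5, 76.2),
    "ru": (34.9, 32.4, 35.1, 65.8),
    "zh": (35.1, 41.1, 44.0, 52.8),
}
columns = ("Can.", "N+L", "L+N", "L+N M")


def macro_average(rows):
    """Unweighted mean of each column across languages, rounded to one decimal."""
    n = len(rows)
    width = len(rows[0])
    return tuple(round(sum(row[i] for row in rows) / n, 1) for i in range(width))


macro = macro_average(list(scores.values()))
for name, value in zip(columns, macro):
    print(f"{name}: {value}")
```

The computed values match the macro-avg row of the table (43.8, 41.0, 42.1, 66.2).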