Table 6: 

F1 scores on different evaluation slices of CustomNews for models trained on data from 2004–18. Numbers in parentheses show the absolute difference from the same model trained on data from 2010–18.

Model      2004–09        2010–18        2019–20
Uniform    34.8 (+6.3)    29.8 (–0.8)    27.4 (–0.4)
Temporal   36.3 (+5.2)    31.1 (–1.0)    28.8 (–0.7)