Table 6: Mention-level F1 scores of 7 multilingual models, each trained on the training data for its own language and tested on the corresponding in-domain test, NRB, and WTS sets.

Model         German            Spanish           Dutch             Finnish           Danish            Croatian          Afrikaans
              test  nrb   wts   test  nrb   wts   test  nrb   wts   test  nrb   wts   test  nrb   wts   test  nrb   wts   test  nrb   wts
Feature-based
BERT-LSTM     78.9  36.4  84.2  85.6  59.9  90.8  84.9  45.4  85.7  76.0  38.9  84.5  76.4  42.6  78.1  78.0  28.4  79.3  76.2  39.7  65.8
+adv          78.2  44.1  82.8  85.0  65.8  90.2  84.3  57.8  83.5  75.1  52.9  81.0  75.4  47.2  76.9  77.5  35.2  75.5  75.7  42.3  63.3
+adv&mask     78.1  47.6  82.9  84.9  72.2  88.7  84.0  62.8  83.5  74.6  54.3  81.8  75.1  48.4  76.6  76.9  36.8  76.7  75.1  52.8  63.1

Fine-tuning
BERT-base     83.8  64.0  93.3  88.0  72.3  93.9  91.8  56.1  92.0  91.3  64.6  91.9  83.6  56.6  86.2  89.7  54.7  95.6  80.4  54.3  91.6
+adv          83.7  68.9  93.6  87.9  75.9  93.9  91.9  58.3  91.8  90.2  66.4  92.5  82.7  58.4  86.5  89.5  57.9  95.5  79.7  60.2  92.1
+a&m&f        83.2  73.3  94.0  87.4  81.6  93.7  91.2  63.6  91.0  89.8  67.4  92.7  82.3  63.1  85.4  88.8  59.6  94.9  79.4  64.2  91.6