Table 8: Language-wise breakdown of Named Entity Recognition results on the CoNLL and MasakhaNER datasets (labeled F1). Values in parentheses are differences from the mBert score for the same row. mBert obtains a score of zero on Amharic because its vocabulary has no entries in the Amharic script.

| Language | mBert | Canine-C | Canine-C + n-grams |
|---|---|---|---|
| CoNLL | | | |
| Dutch | 90.2 | 74.7 (–15.5) | 88.5 (–1.7) |
| English | 91.1 | 79.8 (–11.3) | 89.8 (–1.3) |
| German | 82.5 | 64.1 (–18.4) | 82.1 (–0.4) |
| Spanish | 87.6 | 77.4 (–10.2) | 86.5 (–1.1) |
| Macro Avg | 87.8 | 74.0 (–13.8) | 86.7 (–1.1) |
| MasakhaNER | | | |
| Amharic | 0.0 | 44.6 (+44.6) | 50.0 (+50.0) |
| Hausa | 89.3 | 76.1 (–13.2) | 88.0 (–1.3) |
| Igbo | 84.6 | 75.6 (–9.0) | 85.0 (+0.4) |
| Kinyarwanda | 73.9 | 58.3 (–15.6) | 72.8 (–1.1) |
| Luganda | 80.2 | 69.4 (–10.8) | 79.6 (–0.6) |
| Luo | 75.8 | 63.4 (–12.4) | 74.2 (–1.6) |
| Nigerian Pidgin | 89.8 | 66.6 (–23.2) | 88.7 (–1.1) |
| Swahili | 87.1 | 72.7 (–14.4) | 83.7 (–3.4) |
| Wolof | 64.9 | 60.7 (–4.2) | 66.5 (+1.6) |
| Yorùbá | 78.7 | 67.9 (–10.8) | 79.1 (+0.4) |
| Macro Avg | 72.4 | 65.5 (–6.9) | 76.8 (+4.3) |
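
The Macro Avg rows appear to be unweighted means of the per-language F1 scores, with the parenthesized value being the gap to the mBert average. The sketch below recomputes this for the CoNLL rows under that assumption; the names `conll_f1`, `macro_average`, and `delta_vs_baseline` are illustrative and not from the paper. Because it starts from the rounded scores printed in the table, a result may differ from the reported figure in the last digit.

```python
# Per-language labeled F1 copied from the CoNLL rows of Table 8.
conll_f1 = {
    "Dutch":   {"mBert": 90.2, "Canine-C": 74.7, "Canine-C + n-grams": 88.5},
    "English": {"mBert": 91.1, "Canine-C": 79.8, "Canine-C + n-grams": 89.8},
    "German":  {"mBert": 82.5, "Canine-C": 64.1, "Canine-C + n-grams": 82.1},
    "Spanish": {"mBert": 87.6, "Canine-C": 77.4, "Canine-C + n-grams": 86.5},
}


def macro_average(scores, model):
    """Unweighted mean of one model's F1 over all languages (assumed definition)."""
    values = [row[model] for row in scores.values()]
    return sum(values) / len(values)


def delta_vs_baseline(scores, model, baseline="mBert"):
    """Difference between a model's macro average and the baseline's."""
    return macro_average(scores, model) - macro_average(scores, baseline)


if __name__ == "__main__":
    for model in ("mBert", "Canine-C", "Canine-C + n-grams"):
        avg = macro_average(conll_f1, model)
        gap = delta_vs_baseline(conll_f1, model)
        print(f"{model}: macro avg {avg:.2f}, delta vs mBert {gap:+.2f}")
```

The same two functions apply unchanged to the MasakhaNER rows, where the zero score for Amharic pulls the mBert average down considerably.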