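F1 scores on two kinds of hard-to-identify entities: zero-frequency entities, which do not appear in the training corpus, and long entities of four or more words. Each Δ column gives the difference between the F1 score on that subset and the F1 score on all entities.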
Language | CNN-BiLSTM: all | CNN-BiLSTM: 0-freq | CNN-BiLSTM: 0-freq Δ | CNN-BiLSTM: long | CNN-BiLSTM: long Δ | mBERT-base: all | mBERT-base: 0-freq | mBERT-base: 0-freq Δ | mBERT-base: long | mBERT-base: long Δ | XLM-R-base: all | XLM-R-base: 0-freq | XLM-R-base: 0-freq Δ | XLM-R-base: long | XLM-R-base: long Δ |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
amh | 52.89 | 40.98 | −11.91 | 45.16 | −7.73 | – | – | – | – | – | 70.96 | 68.91 | −2.05 | 64.86 | −6.10 |
hau | 83.70 | 78.52 | −5.18 | 66.21 | −17.49 | 87.34 | 79.41 | −7.93 | 67.67 | −19.67 | 89.44 | 85.48 | −3.96 | 76.06 | −13.38 |
ibo | 78.48 | 70.57 | −7.91 | 53.93 | −24.55 | 85.11 | 78.41 | −6.70 | 60.46 | −24.65 | 84.51 | 77.42 | −7.09 | 59.52 | −24.99 |
kin | 64.61 | 55.89 | −8.72 | 40.00 | −24.61 | 70.98 | 65.57 | −5.41 | 55.39 | −15.59 | 73.93 | 66.54 | −7.39 | 54.96 | −18.97 |
lug | 74.31 | 67.99 | −6.32 | 58.33 | −15.98 | 80.56 | 76.27 | −4.29 | 65.67 | −14.89 | 80.71 | 73.54 | −7.17 | 63.77 | −16.94 |
luo | 66.42 | 58.93 | −7.49 | 54.17 | −12.25 | 72.65 | 72.85 | 0.20 | 66.67 | −5.98 | 75.14 | 72.34 | −2.80 | 69.39 | −5.75 |
pcm | 66.43 | 59.73 | −6.70 | 47.80 | −18.63 | 87.78 | 82.40 | −5.38 | 77.12 | −10.66 | 87.39 | 83.65 | −3.74 | 74.67 | −12.72 |
swa | 79.26 | 64.74 | −14.52 | 44.78 | −34.48 | 86.37 | 78.77 | −7.60 | 45.55 | −40.82 | 87.55 | 80.91 | −6.64 | 53.93 | −33.62 |
wol | 60.43 | 49.03 | −11.40 | 26.92 | −33.51 | 66.10 | 59.54 | −6.56 | 19.05 | −47.05 | 64.38 | 57.21 | −7.17 | 38.89 | −25.49 |
yor | 67.07 | 56.33 | −10.74 | 64.52 | −2.55 | 78.64 | 73.41 | −5.23 | 74.34 | −4.30 | 77.58 | 72.01 | −5.57 | 76.14 | −1.44 |
avg (excl. amh) | 69.36 | 60.27 | −9.09 | 50.18 | −19.18 | 79.50 | 74.07 | −5.43 | 59.10 | −20.40 | 79.15 | 73.80 | −5.36 | 63.22 | −15.94 |
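
The two subsets and the Δ columns can be illustrated with a minimal Python sketch. It is not the authors' evaluation code: the helper names (`is_zero_freq`, `is_long`, `delta`), the exact surface-form match against training entities, and the toy entities are all assumptions made for illustration. Only the four-word threshold and the relation between the Δ columns and the other columns come from the table.

```python
# Illustrative sketch only; helper names and matching rule are assumptions.

def is_zero_freq(entity_tokens, train_entities):
    """True if the entity's surface form never occurs as an entity in training.

    Assumes exact, case-sensitive matching on the token sequence.
    """
    return tuple(entity_tokens) not in train_entities


def is_long(entity_tokens, min_words=4):
    """True if the entity spans four or more words (threshold from the caption)."""
    return len(entity_tokens) >= min_words


def delta(subset_f1, all_f1):
    """Difference between a subset F1 and the F1 on all entities, as in the Δ columns."""
    return round(subset_f1 - all_f1, 2)


# Toy data, made up for this example.
train_entities = {("Lagos",), ("Muhammadu", "Buhari")}
test_entities = [
    ["Lagos"],
    ["Kano"],
    ["Bankin", "Duniya", "na", "Afirka"],
]

zero_freq = [e for e in test_entities if is_zero_freq(e, train_entities)]
long_entities = [e for e in test_entities if is_long(e)]

# Reproduces the amh / CNN-BiLSTM 0-freq Δ cell from the table.
amh_cnn_zero_freq_delta = delta(40.98, 52.89)
```

F1 would then be computed separately on each subset with whatever span-level scorer the evaluation uses, and `delta` applied to the resulting scores.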