Accuracy of binary (mono/poly) and multi-class (poly bands) classifiers using SelfSim and pairCos features on the test sets, compared with a baseline that always predicts the same class and with a classifier that uses only log frequency as a feature. Subscripts denote the layers used.
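| Language | Model | mono/poly SelfSim | mono/poly pairCos | poly bands SelfSim | poly bands pairCos |
|---|---|---|---|---|---|
| en | BERT | 0.76<sub>10</sub> | 0.79<sub>8</sub> | 0.49<sub>10</sub> | 0.46<sub>10</sub> |
| | mBERT | 0.77<sub>8</sub> | 0.75<sub>8</sub> | 0.46<sub>12</sub> | 0.43<sub>12</sub> |
| | ELMo | 0.69<sub>2</sub> | 0.63<sub>3</sub> | 0.37<sub>2</sub> | 0.34<sub>3</sub> |
| | context2vec | 0.61 | 0.61 | 0.34 | 0.31 |
| | Frequency | 0.77 | | 0.41 | |
| fr | Flaubert | 0.58<sub>7</sub> | 0.55<sub>6</sub> | 0.29<sub>8</sub> | 0.27<sub>9</sub> |
| | mBERT | 0.66<sub>9</sub> | 0.64<sub>9</sub> | 0.38<sub>7</sub> | 0.38<sub>8</sub> |
| | Frequency | 0.61 | | 0.37 | |
| es | BETO | 0.70<sub>9</sub> | 0.66<sub>7</sub> | 0.42<sub>6</sub> | 0.48<sub>5</sub> |
| | mBERT | 0.69<sub>11</sub> | 0.64<sub>7</sub> | 0.38<sub>9</sub> | 0.43<sub>7</sub> |
| | Frequency | 0.67 | | 0.41 | |
| el | GreekBERT | 0.70<sub>4</sub> | 0.64<sub>4</sub> | 0.34<sub>4</sub> | 0.38<sub>6</sub> |
| | mBERT | 0.60<sub>7</sub> | 0.65<sub>7</sub> | 0.32<sub>11</sub> | 0.34<sub>9</sub> |
| | Frequency | 0.63 | | 0.35 | |
| Baseline | | 0.50 | | 0.25 | |

The Frequency and Baseline rows report a single accuracy per task, since neither uses SelfSim or pairCos features.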