Impact of training methods on BERT-large models fine-tuned on CoNLL or OntoNotes.
Method | CoNLL Test | CoNLL NRB | CoNLL WTS | OntoNotes Test | OntoNotes NRB | OntoNotes WTS |
---|---|---|---|---|---|---|
BERT-lrg | 92.8 | 75.6 | 98.6 | 89.9 | 75.4 | 95.1 |
+mask | 92.9 | 82.9 | 98.4 | 89.8 | 77.3 | 96.5 |
+freeze | 92.7 | 83.1 | 98.4 | 89.9 | 79.8 | 96.0 |
+adv | 92.7 | 86.1 | 98.3 | 90.1 | 85.8 | 95.2 |
+f&m | 92.8 | 85.5 | 97.8 | 89.9 | 80.6 | 95.9 |
+a&m | 92.8 | 87.7 | 98.1 | 89.7 | 87.6 | 95.9 |
+a&f | 92.7 | 88.4 | 98.2 | 90.0 | 88.1 | 95.7 |
+a&m&f | 92.8 | 89.7 | 97.9 | 89.9 | 88.8 | 95.6 |