Method | CM3 Dev | CM3 Test | CD3 Dev | CD3 Test | C3 Dev | C3 Test |
---|---|---|---|---|---|---|
Random | 27.8 | 27.8 | 26.4 | 26.6 | 27.1 | 27.2 |
Distance-Based Sliding Window (Richardson et al., 2013) | 47.9 | 45.8 | 39.6 | 40.4 | 43.8 | 43.1 |
Co-Matching (Wang et al., 2018) | 47.0 | 48.2 | 55.5 | 51.4 | 51.0 | 49.8 |
BERT (Devlin et al., 2019) | 65.6 | 64.6 | 65.9 | 64.4 | 65.7 | 64.5 |
ERNIE (Sun et al., 2019b) | 63.7 | 63.6 | 67.3 | 64.6 | 65.5 | 64.1 |
BERT-wwm (Cui et al., 2019) | 66.1 | 64.0 | 64.8 | 65.0 | 65.5 | 64.5 |
BERT-wwm-ext (Cui et al., 2019) | 67.9 | 68.0 | 67.7 | 68.9 | 67.8 | 68.5 |
Human Performance* | 96.0 | 93.3 | 98.0 | 98.7 | 97.0 | 96.0 |