Comparison of selected single-model systems on English (W&I+L, CoNLL-2014), Czech (AKCES-GEC), German (Falko-Merlin GEC), and Russian (RULEC-GEC) datasets. AG finetuned is our reimplementation of the model from Náplava and Straka (2019). Note that the models differ vastly in training/fine-tuning data and in size (e.g., Rothe et al. (2021) xxl is 50 times larger than AG finetuned).
| System | Params | English (W&I+L) | English (CoNLL-2014) | Czech (AKCES-GEC) | German (Falko-Merlin) | Russian (RULEC-GEC) |
|---|---|---|---|---|---|---|
| Boyd (2018) | – | – | – | – | 45.22 | – |
| Choe et al. (2019) | – | 63.05 | – | – | – | – |
| Lichtarge et al. (2019) | – | – | 56.8 | – | – | |
| Lichtarge et al. (2020) | – | 66.5 | 62.1 | – | – | – |
| Omelianchuk et al. (2020) | – | 72.4 | 65.3 | – | – | – |
| Rothe et al. (2021) base | 580M | 60.2 | 54.10 | 71.88 | 69.21 | 26.24 |
| Rothe et al. (2021) xxl | 13B | 69.83 | 65.65 | 83.15 | 75.96 | 51.62 |
| Rozovskaya and Roth (2019) | – | – | – | – | – | 21.00 |
| Xu et al. (2019) | – | 63.94 | 60.90 | – | – | – |
| AG finetuned | 210M | 69.00 | 63.40 | 80.17 | 73.71 | 50.20 |