System-level Pearson (r) and Spearman (ρ) correlation between the automatic metric scores and human annotations.
Metric | System-level r | System-level ρ |
---|---|---|
GLEU | 97.37 ± 1.52 | 92.28 ± 6.19 |
I-measure | 95.37 ± 2.16 | 98.66 ± 3.21 |
M² 0.2 | 96.25 ± 1.71 | 93.27 ± 9.45 |
M² 0.5 | 98.28 ± 1.03 | 97.77 ± 4.27 |
M² 1.0 | 95.62 ± 1.81 | 93.22 ± 4.30 |
ERRANT 0.2 | 94.66 ± 2.44 | 91.19 ± 4.76 |
ERRANT 0.5 | 98.28 ± 1.04 | 98.35 ± 4.81 |
ERRANT 1.0 | 95.70 ± 1.80 | 93.61 ± 4.47 |
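
For readers who want to see how a system-level r and ρ of this kind can be obtained, the following is a minimal sketch, not the authors' evaluation code. It assumes one metric score and one human score per system; the five score pairs are hypothetical and are not taken from the table. The table appears to report correlations scaled by 100, so the sketch does the same. It does not reproduce the ± values, whose derivation is not stated here.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation r between two equal-length score lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    std_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    std_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (std_x * std_y)


def average_ranks(values):
    """1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = shared_rank
        i = j + 1
    return ranks


def spearman(x, y):
    """Spearman correlation: Pearson correlation of the rank vectors."""
    return pearson(average_ranks(x), average_ranks(y))


# Hypothetical scores for five systems (illustration only).
metric_scores = [0.31, 0.42, 0.38, 0.55, 0.47]
human_scores = [0.20, 0.35, 0.40, 0.60, 0.45]

r = pearson(metric_scores, human_scores)
rho = spearman(metric_scores, human_scores)
print(f"r = {100 * r:.2f}, rho = {100 * rho:.2f}")
```

Because ρ depends only on the ordering of the systems, a metric can rank systems almost perfectly (high ρ) while its raw scores track the human scores less linearly (lower r), which is one way rows such as I-measure can show ρ above r.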