Human judgment agreement: mean Pearson (r) and Spearman (ρ) correlation between 3 human judgments of 5 sentence versions, at sentence level and system level.
Domain | Sentence level r | Sentence level ρ | System level r | System level ρ |
---|---|---|---|---|
Native Formal | 87.13 | 88.76 | 92.01 | 92.52 |
Native Web Inf. | 80.23 | 81.47 | 95.33 | 91.80 |
Romani | 86.57 | 86.57 | 88.73 | 85.90 |
Second Learners | 78.50 | 79.97 | 96.50 | 97.23 |
Whole Dataset | 79.07 | 80.40 | 96.11 | 95.54 |
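
The caption does not spell out how the mean correlation is computed. A minimal sketch of one plausible reading follows, assuming that the mean is taken over all annotator pairs, that sentence-level agreement correlates the per-sentence-version scores, that system-level agreement correlates each version's score averaged over sentences, and that the table reports the coefficients multiplied by 100. All function names and the toy scores are hypothetical, not taken from the paper.

```python
from itertools import combinations
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def ranks(values):
    """1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = mean_rank
        i = j + 1
    return result


def spearman(x, y):
    """Spearman correlation: Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))


def mean_pairwise(vectors, corr):
    """Average of corr over every pair of annotator score vectors."""
    pairs = list(combinations(vectors, 2))
    return sum(corr(a, b) for a, b in pairs) / len(pairs)


def flatten(judgments):
    """Sentence level (assumed): one score per (sentence, version)."""
    return [score for row in judgments for score in row]


def system_scores(judgments):
    """System level (assumed): mean score of each version over sentences."""
    n = len(judgments)
    return [sum(col) / n for col in zip(*judgments)]


# Hypothetical toy data: 3 annotators x 2 sentences x 5 versions.
annotators = [
    [[1, 2, 3, 4, 5], [2, 2, 3, 5, 4]],
    [[1, 3, 3, 4, 5], [1, 2, 4, 5, 5]],
    [[2, 2, 3, 5, 5], [2, 3, 3, 4, 5]],
]

for label, view in (("sentence", flatten), ("system", system_scores)):
    vectors = [view(a) for a in annotators]
    r = 100 * mean_pairwise(vectors, pearson)
    rho = 100 * mean_pairwise(vectors, spearman)
    print(f"{label} level: r = {r:.2f}, rho = {rho:.2f}")
```

Averaging over sentences before correlating smooths out per-sentence noise, which would be consistent with the system-level figures in the table being mostly higher than the sentence-level ones.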