Table 4 
Conditional Pearson correlation with direct assessment scores for popular and top scoring metrics from the WMT16 Metrics Task for a four-way split of the data set resulting in data samples corresponding to four quality levels (Q1–Q4). † and ‡ indicate, for each column, if the correlation is significantly different from the correlation in Q1 and Q4, respectively. In each column, results for the metrics that are not significantly outperformed by any other metric are marked in bold.
Metric Q1 Q2 Q3 Q4 Q4* All
Meteor 0.198 0.151 0.163 0.514 0.347 0.570 
-TERp-A 0.168 0.113 0.180 0.404 0.287 0.570 
MPEDA 0.200 0.150 0.166 0.512 0.343 0.568 
ROUGE-SU* 0.199 0.118 0.193 0.398 0.252 0.551 
ChrF3 0.218 0.119†‡ 0.139 0.375 0.219 0.541 
NIST-4 0.189 0.109 0.137 0.397 0.246 0.508 
BLEU-4 0.051 0.084 0.136 0.453 0.282 0.488 
-TER 0.051 0.056 0.172†‡ 0.388 0.254 0.462 
-WER 0.031 0.048 0.189†‡ 0.404 0.276 0.456 
-PER 0.115 0.072 0.133 0.351 0.212 0.422 
 
UPF-Cobalt 0.150 0.122 0.170 0.403 0.275 0.566 
CP-Oc(*) 0.164 0.078 0.172 0.431 0.270 0.527 
SP-lNIST 0.198 0.109 0.141 0.392 0.241 0.512 
DP-Oc(*) 0.055 0.072 0.153†‡ 0.349 0.224 0.424 
SR-Or(*) 0.083 0.085 0.062 0.215 0.174 0.371 
 
DPMFcomb 0.204 0.146 0.193 0.443 0.303 0.615 
Metrics-F 0.127 0.172 0.199 0.480 0.327 0.612 
Cobalt-F-comp 0.092 0.160 0.216†‡ 0.469 0.344 0.599 
BEER 0.228 0.119†‡ 0.143 0.384 0.218 0.534 
UoW-ReVal 0.096 0.092 0.163 0.376 0.257 0.525 
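
The per-quartile correlations reported in the table can be illustrated with a minimal sketch. It assumes the four-way split is made by sorting segments on their direct assessment score and cutting the sorted list into four nearly equal parts, with Q1 holding the lowest human scores; both the split rule and the ordering are assumptions, not details confirmed by the caption. The function names `pearson` and `conditional_pearson` are hypothetical, and the sketch covers neither the Q4* column nor the significance tests behind † and ‡.

```python
import math


def pearson(xs, ys):
    """Plain Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return float("nan")  # correlation is undefined for a constant sequence
    return cov / math.sqrt(var_x * var_y)


def conditional_pearson(human, metric, n_bins=4):
    """Correlation of metric scores with human scores within each quality bin.

    Assumption: bins are formed by sorting segments on the human score and
    cutting into n_bins contiguous, nearly equal-sized groups (Q1 = lowest).
    """
    order = sorted(range(len(human)), key=lambda i: human[i])
    result = {}
    for b in range(n_bins):
        lo = b * len(order) // n_bins
        hi = (b + 1) * len(order) // n_bins
        idx = order[lo:hi]
        result[f"Q{b + 1}"] = pearson(
            [human[i] for i in idx], [metric[i] for i in idx]
        )
    # Unconditional correlation over the full sample, as in the "All" column.
    result["All"] = pearson(human, metric)
    return result


# Toy data (invented for illustration, not taken from WMT16).
human_scores = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0]
metric_scores = [0.0, 1.0, 3.0, 2.0, 4.0, 5.0, 7.0, 6.0]
correlations = conditional_pearson(human_scores, metric_scores)
```

On this toy input the overall correlation is high while the within-bin values swing between positive and negative. Restricting the range of human scores can change the correlation substantially, which is why the per-quartile columns need not resemble the All column.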