Table 9 
Absolute Pearson correlation between automatic evaluation scores and human judgments in the MTSummit17 data set. † indicates that the correlation for the neural MT system is significantly different from the correlation for the statistical system. WER*, TER*, and PER* show the correlations for the corresponding metrics after the outliers are eliminated.
Metric      PBMT    NMT
ROUGE-SU*   0.411   0.506
ChrF3       0.400   0.478
BLEU-4      0.403   0.461
NIST-4      0.379   0.464
WER         0.285   0.267
TER         0.270   0.260
PER         0.226   0.236
WER*        0.405   0.461
TER*        0.400   0.463
PER*        0.369   0.451

BEER        0.416   0.511
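
As an illustration of the quantity reported in this table, the following is a minimal sketch of an absolute Pearson correlation between per-segment metric scores and human judgments. Taking the absolute value puts error-rate metrics such as WER, TER, and PER, where lower is better, on the same scale as similarity metrics such as BLEU. The function name abs_pearson and the toy scores are assumptions made for this example; they are not taken from the MTSummit17 data, and the sketch covers neither the outlier removal nor the significance test mentioned in the caption.

```python
import math


def abs_pearson(xs, ys):
    """Absolute Pearson correlation between two equal-length score lists."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length lists with at least 2 items")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sums of centered products and centered squares; the 1/n factors cancel.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return abs(cov / math.sqrt(var_x * var_y))


# Hypothetical toy scores, not drawn from the MTSummit17 data set.
human = [1.0, 2.0, 3.0, 4.0, 5.0]        # human judgments, higher is better
error_rate = [0.9, 0.7, 0.6, 0.3, 0.2]   # WER-like metric, lower is better

# The raw correlation is negative here; the absolute value reports its strength.
print(round(abs_pearson(human, error_rate), 3))
```

Because only the magnitude is kept, a metric that ranks segments in exactly the reverse order of the human judgments scores as highly as one that ranks them identically.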