Table 3: Actual utility of different MBR methods on newstest2021 De→En. Actual utility is computed with respect to reference A. This table is the De→En counterpart of Table 2, which reports En→De.

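Column groups: Automatic Evaluation (Bleu, sBleu, Chrf, Yisi, BL.1, BL.2), Model (logP), Human Eval (MQM ↓).

| Method | Bleu | sBleu | Chrf | Yisi | BL.1 | BL.2 | logP | MQM ↓ |
|---|---|---|---|---|---|---|---|---|
| Human Transl. Ref-B | 29.5 | 30.4 | 57.7 | 82.8 | 38.3 | 75.4 | −23.0 | 0.447 |
| Beam 4 | 33.1 | 34.2 | 61.2 | 84.1 | 41.1 | 75.2 | −6.1 | 0.345 |
| MBR sBleu | 33.3 | 34.7 | 61.1 | 84.1 | 40.1 | 75.0 | −7.1 | 0.323 |
| MBR Chrf | 32.5 | 34.1 | 62.2 | 84.2 | 41.7 | 75.3 | −8.0 | 0.380 |
| MBR Yisi | 32.6 | 33.8 | 60.8 | 84.4 | 41.5 | 75.1 | −7.7 | 0.307 |
| MBR Bleurt v0.1 | 28.2 | 29.7 | 58.5 | 82.9 | 41.9 | 77.3 | −11.8 | 0.302 |
| MBR Bleurt v0.2 | 28.4 | 30.0 | 58.2 | 82.9 | 41.2 | 78.2 | −12.2 | 0.272 |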