Table 5: Actual versus estimated Bleurt v0.2 scores of human references, oracle selection, and MBR on Newstest2021 En→De. This table shows that Bleurt estimates that the oracle method is biased toward a specific human reference.

                   Actual                      Model
Method  Reference  Ref-C    Ref-D    mean      Est.
Human   Ref-C      0.963    0.757    0.860     0.680
        Ref-D      0.756    0.963    0.860     0.677

Oracle  Ref-C      0.827    0.774    0.801     0.709
        Ref-D      0.779    0.828    0.805     0.711
        Ref-C+D    0.810    0.815    0.813     0.719

MBR     BL.2       0.790    0.789    0.790     0.739