Table 3: Detailed consistency results, including surface form consistency (Sur; §3.2), correlation of error (Cor; §3.3), and the combined task-specific metric (Cmb; §3.4). Bold font indicates the best score among automatic outputs. Results that are not statistically significantly worse than the best score in the same column are in italics.
                    Transcript  Translation       Consistency
Model      Params.  ↓WER        ↑BLEU  ↓CharCut   ↓Lex   ↑Sur   ↑Cor   ↑Cmb   ↑Human
Casc       223M     21.6        19.2   47.2       10.36  10.65  0.396  0.474  3.119

DirInd     175M     21.6        11.0   60.3       21.13  5.24   0.346  0.374  2.195
DirMu      124M     23.6        18.4   48.7       13.89  7.07   0.376  0.457  2.715
DirSh      106M     23.6        19.0   47.9       14.71  8.54   0.371  0.464  2.776

2St        122M     22.2        20.1   46.1       9.86   12.08  0.391  0.484  3.170
Tri        141M     22.2        19.9   46.3       9.72   11.54  0.414  0.484  3.192
Concat     106M     21.9        19.2   47.1       12.79  9.60   0.387  0.477  2.875

Reference  –                    100               12.6   13.3                 3.594