Table 6: Degradation of mT5 and ByT5 under various types of noise. “Clean” shows original task performance. Subsequent rows show the delta from “clean” when adding different types of noise. Learnable noise is added in training and eval, while unseen noise only affects eval.

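| Noise | Model | Learnable Noise: XNLI (accuracy) | Learnable Noise: TyDiQA-GoldP (F1) | Unseen Noise: XNLI (accuracy) |
|---|---|---|---|---|
| Clean | mT5 | 81.1 | 85.3 | 81.1 |
| Clean | ByT5 | 79.7 | 87.7 | 79.7 |
| Drop | mT5 | −10.2 | −24.0 | −18.3 |
| Drop | ByT5 | −8.2 | −19.5 | −11.4 |
| Repetitions | mT5 | −8.5 | −9.5 | −12.3 |
| Repetitions | ByT5 | −4.1 | −3.0 | −5.9 |
| Antspeak | mT5 | −32.0 | −27.7 | −34.4 |
| Antspeak | ByT5 | −8.7 | −4.8 | −24.4 |
| Uppercase | mT5 | −7.0 | −8.0 | −8.1 |
| Uppercase | ByT5 | −1.5 | −0.5 | −1.7 |
| Random Case | mT5 | −25.7 | −14.3 | −19.2 |
| Random Case | ByT5 | −1.5 | −0.2 | −5.9 |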
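
Because the noise rows report deltas rather than scores, comparing the two models under a given noise type means adding each delta back to the corresponding “Clean” value. The following is a minimal sketch of that arithmetic, using only the numbers in Table 6; the names `CLEAN`, `DELTAS`, and `absolute_scores` are illustrative and not from the paper.

```python
# Values copied from Table 6. Tuple order in every entry:
# (learnable XNLI accuracy, learnable TyDiQA-GoldP F1, unseen XNLI accuracy).
CLEAN = {
    "mT5": (81.1, 85.3, 81.1),
    "ByT5": (79.7, 87.7, 79.7),
}

DELTAS = {
    "Drop": {"mT5": (-10.2, -24.0, -18.3), "ByT5": (-8.2, -19.5, -11.4)},
    "Repetitions": {"mT5": (-8.5, -9.5, -12.3), "ByT5": (-4.1, -3.0, -5.9)},
    "Antspeak": {"mT5": (-32.0, -27.7, -34.4), "ByT5": (-8.7, -4.8, -24.4)},
    "Uppercase": {"mT5": (-7.0, -8.0, -8.1), "ByT5": (-1.5, -0.5, -1.7)},
    "Random Case": {"mT5": (-25.7, -14.3, -19.2), "ByT5": (-1.5, -0.2, -5.9)},
}


def absolute_scores(clean, deltas):
    """Add each delta to the matching clean score, rounded to one decimal."""
    return {
        noise: {
            model: tuple(round(c + d, 1) for c, d in zip(clean[model], row))
            for model, row in per_model.items()
        }
        for noise, per_model in deltas.items()
    }


ABSOLUTE = absolute_scores(CLEAN, DELTAS)

if __name__ == "__main__":
    for noise, per_model in ABSOLUTE.items():
        for model, scores in per_model.items():
            print(f"{noise:12s} {model:5s} {scores}")
```

Read this way, unseen Antspeak noise leaves mT5 at 46.7 XNLI accuracy and ByT5 at 55.3, so ByT5 ends up ahead despite starting from a lower clean score.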