Table 5: Generalization to unseen languages. Language transfer results when fine-tuning on language pairs that were not included in pre-training. mBART25 uses all 25 languages during pre-training, while the other settings leave at least one language of each pair unseen. For each model, we also show the gap to the mBART25 result.
| Model | Monolingual data | Nl-En | En-Nl | Ar-En | En-Ar | Nl-De | De-Nl |
|---|---|---|---|---|---|---|---|
| Random | None | 34.6 (−8.7) | 29.3 (−5.5) | 27.5 (−10.1) | 16.9 (−4.7) | 21.3 (−6.4) | 20.9 (−5.2) |
| mBART02 | En Ro | 41.4 (−2.9) | 34.5 (−0.3) | 34.9 (−2.7) | 21.2 (−0.4) | 26.1 (−1.6) | 25.4 (−0.7) |
| mBART06 | En Ro Cs It Fr Es | 43.1 (−0.2) | 34.6 (−0.2) | 37.3 (−0.3) | 21.1 (−0.5) | 26.4 (−1.3) | 25.3 (−0.8) |
| mBART25 | All | 43.3 | 34.8 | 37.6 | 21.6 | 27.7 | 26.1 |