Table 3: mT5 vs. ByT5 on three English generation tasks, reporting the best score on the validation set.

| Model | GEM-XSum (BLEU), mT5 | GEM-XSum (BLEU), ByT5 | TweetQA (BLEU-1), mT5 | TweetQA (BLEU-1), ByT5 | DROP (F1 / EM), mT5 | DROP (F1 / EM), ByT5 |
|-------|------|------|------|------|-------------|-------------|
| Small | 6.9  | 9.1  | 54.4 | 65.7 | 40.0 / 38.4 | 66.6 / 65.1 |
| Base  | 8.4  | 11.1 | 61.3 | 68.7 | 47.2 / 45.6 | 72.6 / 71.2 |
| Large | 10.1 | 11.5 | 67.9 | 70.0 | 58.7 / 57.3 | 74.4 / 73.0 |
| XL    | 11.9 | 12.4 | 68.8 | 70.6 | 62.7 / 61.1 | 68.7 / 67.2 |
| XXL   | 14.3 | 15.3 | 70.8 | 72.0 | 71.2 / 69.6 | 80.0 / 78.5 |