mT5 vs. ByT5 on three English generation tasks, reporting the best score on the validation set.
| Model | GEM-XSum (BLEU): mT5 | GEM-XSum (BLEU): ByT5 | TweetQA (BLEU-1): mT5 | TweetQA (BLEU-1): ByT5 | DROP (F1 / EM): mT5 | DROP (F1 / EM): ByT5 |
|---|---|---|---|---|---|---|
| Small | 6.9 | 9.1 | 54.4 | 65.7 | 40.0 / 38.4 | 66.6 / 65.1 |
| Base | 8.4 | 11.1 | 61.3 | 68.7 | 47.2 / 45.6 | 72.6 / 71.2 |
| Large | 10.1 | 11.5 | 67.9 | 70.0 | 58.7 / 57.3 | 74.4 / 73.0 |
| XL | 11.9 | 12.4 | 68.8 | 70.6 | 62.7 / 61.1 | 68.7 / 67.2 |
| XXL | 14.3 | 15.3 | 70.8 | 72.0 | 71.2 / 69.6 | 80.0 / 78.5 |
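For readers unfamiliar with the DROP columns: F1 is a bag-of-tokens overlap score that awards partial credit, while EM requires the normalized prediction to match the gold answer exactly. Below is a minimal sketch of this SQuAD-style token F1 / EM scoring; the official DROP evaluator additionally handles numeric normalization and multi-span answers, so this is illustrative only, not the exact scorer used for the table.

```python
import re
import string
from collections import Counter


def normalize(text: str) -> list[str]:
    """Lowercase, strip punctuation and articles, split on whitespace."""
    text = text.lower()
    text = "".join(ch for ch in text if ch not in string.punctuation)
    text = re.sub(r"\b(a|an|the)\b", " ", text)
    return text.split()


def exact_match(pred: str, gold: str) -> float:
    """EM: 1.0 only if normalized token sequences are identical."""
    return float(normalize(pred) == normalize(gold))


def token_f1(pred: str, gold: str) -> float:
    """F1 over the multiset of normalized tokens (partial credit)."""
    pred_toks, gold_toks = normalize(pred), normalize(gold)
    common = Counter(pred_toks) & Counter(gold_toks)
    overlap = sum(common.values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(pred_toks)
    recall = overlap / len(gold_toks)
    return 2 * precision * recall / (precision + recall)


# A prediction with extra tokens gets partial F1 credit but zero EM.
print(token_f1("four touchdowns", "four"))   # 0.666...
print(exact_match("four touchdowns", "four"))  # 0.0
```

This gap between the two metrics is why F1 is consistently a point or two above EM in the table.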