Table 2: mT5 and ByT5 performance on GLUE and SuperGLUE. For each benchmark, we fine-tune a single model on a mixture of all tasks, select the best checkpoint per task based on validation set performance, and report average validation set scores over all tasks.

Model    GLUE            SuperGLUE
         mT5    ByT5     mT5    ByT5
Small    75.6   80.5     60.2   67.8
Base     83.0   85.3     72.5   74.0
Large    87.6   87.0     81.9   80.4
XL       88.7   87.9     84.7   83.2
XXL      90.7   90.1     89.2   88.6
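
The reporting procedure described in the caption (best checkpoint per task, then an average over tasks) can be sketched as follows. This is a minimal illustration and not the authors' evaluation code: the function name `best_per_task_average`, the layout of the `scores` dictionary, and the example numbers are all assumptions made for the sketch.

```python
from statistics import mean


def best_per_task_average(scores):
    """Select the best checkpoint per task and average the selected scores.

    `scores` is a hypothetical structure mapping
    checkpoint step -> {task name -> validation score}.
    """
    tasks = sorted({task for per_task in scores.values() for task in per_task})
    best = {}
    for task in tasks:
        # Choose the checkpoint with the highest validation score on this task.
        candidates = [step for step in scores if task in scores[step]]
        step = max(candidates, key=lambda s: scores[s][task])
        best[task] = (step, scores[step][task])
    # Report the unweighted mean of the per-task best scores.
    average = mean(score for _, score in best.values())
    return best, average


# Made-up numbers for illustration only; they are not results from the paper.
example = {
    1000: {"task_a": 70.0, "task_b": 60.0},
    2000: {"task_a": 72.0, "task_b": 58.0},
}
best, average = best_per_task_average(example)
```

Because the checkpoint is chosen separately for each task, the averaged score can combine results from different training steps of the same fine-tuned model.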