Table 11: Results of RoBERTa-based experiments on GLUE development sets. Each median result is the median over five runs. Because we re-implemented RoBERTa with a different optimization method, our median performance differs from that reported in Liu et al. (2019b).

                             CoLA              SST-2       RTE         QNLI        MRPC
                             (Matthews Corr.)  (Accuracy)  (Accuracy)  (Accuracy)  (Accuracy/F1)
The median result
RoBERTa, Liu et al., 2019b   68.0              96.4        86.6        94.7        90.9/–
RoBERTa, our run             68.7              96.1        84.8        94.6        89.5/92.3
SSL-Reg (MTP)                69.2              96.3        85.2        94.9        90.0/92.7

The best result
RoBERTa, our run             69.2              96.7        86.6        94.7        90.4/93.1
SSL-Reg (MTP)                70.2              96.7        86.6        95.2        91.4/93.8
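
As a minimal sketch of how the "median" and "best" rows relate to a set of repeated runs, the snippet below summarizes five per-run development-set scores. The scores are hypothetical placeholders, not values from the table, and the helper name `summarize_runs` is an assumption introduced only for illustration.

```python
from statistics import median

# Hypothetical placeholder scores from five fine-tuning runs
# (illustrative only; these are NOT numbers reported in the table).
run_scores = [60.0, 62.5, 61.0, 63.5, 61.5]


def summarize_runs(scores):
    """Return (median, best) over a list of per-run dev-set scores."""
    return median(scores), max(scores)


median_score, best_score = summarize_runs(run_scores)
print(f"median={median_score:.1f} best={best_score:.1f}")
```

Reporting the median alongside the best run makes the comparison less sensitive to a single unusually good or bad seed.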