Table 3: Macro-averaged accuracy of different methods (%). The majority baseline yields 2.2%. Italics indicate the best single-prompt accuracy, and bold indicates the best non-oracle accuracy overall.
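| Prompts | Top1 | Top3 | Top5 | Opti. | Oracle |
|---|---|---|---|---|---|
| BERT-base (Man=22.8) | | | | | |
| Mine | 20.7 | 22.7 | 23.9 | 25.7 | 36.2 |
| Mine+Man | 21.3 | 23.8 | 24.8 | 26.6 | 38.0 |
| Mine+Para | 21.2 | 22.4 | 23.0 | 23.6 | 34.1 |
| Man+Para | 22.8 | 23.8 | 24.6 | 25.0 | 34.9 |
| BERT-large (Man=25.7) | | | | | |
| Mine | 26.4 | 26.3 | 25.9 | 30.1 | 40.7 |
| Mine+Man | 28.1 | 28.3 | 27.3 | 30.7 | 42.2 |
| Mine+Para | 26.2 | 27.1 | 27.0 | 27.1 | 38.3 |
| Man+Para | 25.9 | 27.8 | 28.3 | 28.0 | 39.3 |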
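
The caption reports macro-averaged accuracy but does not define it here. The following is a minimal sketch, assuming the common definition in which accuracy is computed separately for each gold answer class and the per-class values are then averaged with equal weight. The function name `macro_accuracy` and the toy data are illustrative only and do not come from the source.

```python
from collections import defaultdict


def macro_accuracy(gold, pred):
    """Average of per-class accuracies, grouping examples by gold label.

    Assumed definition: each distinct gold label contributes equally,
    regardless of how many examples carry it.
    """
    correct = defaultdict(int)
    total = defaultdict(int)
    for g, p in zip(gold, pred):
        total[g] += 1
        correct[g] += int(g == p)
    return sum(correct[c] / total[c] for c in total) / len(total)


# Toy data (hypothetical): a predictor that always outputs the majority answer.
gold = ["Paris", "Paris", "Paris", "Rome"]
pred = ["Paris", "Paris", "Paris", "Paris"]

micro = sum(g == p for g, p in zip(gold, pred)) / len(gold)
macro = macro_accuracy(gold, pred)
print(f"micro={micro:.2f} macro={macro:.2f}")
```

Under this assumed definition, always predicting the most frequent answer scores well on plain (micro) accuracy but poorly on macro-averaged accuracy, which would be consistent with the low majority baseline reported in the caption.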