Table 3: Left to right: agreement between datasets' original labels and the majority label according to our (discretized) re-annotation; accuracy of the BERT NLI model against the original labels; accuracy of BERT against the re-annotation labels; number of p/h pairs (out of 100) on which all three label sources (original, re-annotation, model prediction) agree on the most likely label. Our analysis in §5.3 is performed only over pairs in ∩.
       Orig./Ours  BERT/Orig  BERT/Ours  |∩|
SNLI   0.790       0.890      0.830      76
MNLI   0.707       0.818      0.687      62
RTE2   0.690       0.460      0.470      36
DNC    0.780       0.900      0.800      74
JOCI   0.651       0.698      0.581      41
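As a minimal sketch (not from the paper), the table's columns can be computed from three per-pair label sequences: pairwise agreement is the fraction of pairs on which two sources give the same label, and |∩| counts the pairs on which all three sources coincide. The label lists below are hypothetical toy data, not the datasets' actual annotations.

```python
def agreement(a, b):
    """Fraction of pairs on which two label sources agree."""
    return sum(x == y for x, y in zip(a, b)) / len(a)

def all_agree(orig, ours, bert):
    """Number of pairs on which all three label sources coincide."""
    return sum(o == u == m for o, u, m in zip(orig, ours, bert))

# Toy example with entailment (E), neutral (N), contradiction (C) labels.
orig = ["E", "N", "C", "E"]  # dataset's original labels
ours = ["E", "N", "N", "E"]  # majority label from re-annotation
bert = ["E", "C", "N", "E"]  # model predictions

print(agreement(orig, ours))          # Orig./Ours column
print(agreement(bert, orig))          # BERT/Orig column
print(agreement(bert, ours))          # BERT/Ours column
print(all_agree(orig, ours, bert))    # |∩| column
```

The analysis restricted to ∩ would then keep only the pairs where all three sources agree before computing any further statistics.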