Table 7: Knowledge and consistency results for the majority baseline, BERT-base, and our model (BERT-ft). The results are averaged over the 25 test relations. Underlined: best performance overall, including ablations. Bold: best performance among BERT-ft and the two baselines (BERT-base, majority).

Model          Accuracy    Consistency  Consistent-Acc
majority       24.4±22.5   100.0±0.0    24.4±22.5
BERT-base      45.6±27.6   58.2±23.9    27.3±24.8
BERT-ft        47.4±27.3   64.0±22.9    33.2±27.0
 -consistency  46.9±27.6   60.9±22.6    30.9±26.3
 -typed        46.5±27.1   62.0±21.2    31.1±25.2
 -MLM          16.9±21.1   80.8±27.1    9.1±11.5