Table 7: Performance comparison in accuracy (%) by category on a subset of the development sets of C3 (*: ≤ 10 annotated instances fall into that category).
| Category | Co-Matching (CM3) | Co-Matching (CD3) | BERT (CM3) | BERT (CD3) | BERT-wwm-ext (CM3) | BERT-wwm-ext (CD3) | Human (CM3) | Human (CD3) |
|---|---|---|---|---|---|---|---|---|
| Matching | 54.6 | 70.4 | 81.8 | 81.5 | 100.0 | 85.2 | 100.0 | 100.0 |
| Prior knowledge | 47.5 | 51.2 | 64.0 | 64.2 | 62.6 | 68.3 | 95.7 | 97.6 |
|   ◇ Linguistic | 49.4 | 49.0 | 67.1 | 62.8 | 61.2 | 68.6 | 97.7 | 100.0 |
|   ◇ Domain-specific* | – | 66.7 | – | 0.0 | – | 0.0 | – | 100.0 |
|   ◇ General world | 46.5 | 53.8 | 57.7 | 66.3 | 64.8 | 70.0 | 93.0 | 96.3 |
|     Arithmetic* | 50.0 | 60.0 | 0.0 | 80.0 | 50.0 | 60.0 | 100.0 | 100.0 |
|     Connotation* | 0.0 | 50.0 | 0.0 | 62.5 | 0.0 | 62.5 | 100.0 | 100.0 |
|     Cause-effect | 47.6 | 55.6 | 57.1 | 55.6 | 66.7 | 66.7 | 95.2 | 100.0 |
|     Implication | 46.7 | 45.5 | 70.0 | 50.0 | 70.0 | 54.6 | 86.7 | 95.5 |
|     Part-whole | 60.0 | 50.0 | 40.0 | 50.0 | 40.0 | 50.0 | 100.0 | 83.3 |
|     Precondition* | 66.7 | 50.0 | 66.7 | 25.0 | 66.7 | 75.0 | 100.0 | 100.0 |
|     Scenario | 40.0 | 61.3 | 40.0 | 80.7 | 60.0 | 83.9 | 100.0 | 96.8 |
|     Other* | – | 0.0 | – | 0.0 | – | 0.0 | – | 100.0 |
| Single sentence | 50.0 | 64.7 | 72.4 | 76.5 | 71.1 | 82.4 | 97.4 | 97.1 |
| Multiple sentences | 47.2 | 51.7 | 58.3 | 64.7 | 61.1 | 68.1 | 94.4 | 98.3 |
| Independent* | 0.0 | – | 50.0 | – | 0.0 | – | 100.0 | – |
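
As a rough illustration of how per-category accuracies like those in Table 7 could be tallied, the sketch below groups annotated instances by category, computes accuracy in percent, and appends `*` to any category with at most 10 instances, mirroring the caption's convention. The `(category, is_correct)` input format, the function name `accuracy_by_category`, and the toy data are all assumptions made for illustration; they are not taken from the paper or its released code.

```python
from collections import defaultdict


def accuracy_by_category(instances, small_threshold=10):
    """Return {category label: accuracy in %} for (category, is_correct) pairs.

    Hypothetical helper: categories with <= small_threshold instances get a
    trailing '*', following the convention in the Table 7 caption.
    """
    totals = defaultdict(int)
    correct = defaultdict(int)
    for category, is_correct in instances:
        totals[category] += 1
        correct[category] += int(is_correct)

    report = {}
    for category, n in totals.items():
        label = category + "*" if n <= small_threshold else category
        report[label] = round(100.0 * correct[category] / n, 1)
    return report


# Toy, made-up annotations (NOT data from the paper).
toy = (
    [("Matching", True)] * 9
    + [("Matching", False)] * 3
    + [("Arithmetic", True), ("Arithmetic", False)]
)
result = accuracy_by_category(toy)
print(result)
```

With so few instances in a starred category, a single question shifts the accuracy by many points, which is presumably why the caption flags those rows.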