ARC Challenge scores compared with other fully or partially explainable approaches trained only on the ARC dataset.
Model | Explainable | Accuracy |
---|---|---|
BERTLarge | No | 35.11 |
IR Solver (Clark et al., 2016) | Yes | 20.26 |
TupleILP (Khot et al., 2017) | Yes | 23.83 |
TableILP (Khashabi et al., 2016) | Yes | 26.97 |
ExplanationLP (Thayaparan et al., 2021) | Yes | 40.21 |
DGEM (Clark et al., 2016) | Partial | 27.11 |
KG2 (Zhang et al., 2018) | Partial | 31.70 |
ET-RR (Ni et al., 2019) | Partial | 36.61 |
Unsupervised AHE (Yadav et al., 2019a) | Partial | 33.87 |
Supervised AHE (Yadav et al., 2019a) | Partial | 34.47 |
AutoRocc (Yadav et al., 2019b) | Partial | 41.24 |
Diff-Explainer (ExplanationLP) | Yes | 42.95 |