Table 8

CoNLL04 evaluation of the plausibility of explanations, which measures the overlap between machine explanations and human annotations. For each method, we report the higher F1 score of the two human annotators.

Approach                 Precision   Recall   F1
Attention                    61.06    30.30   38.94
Saliency Mapping             18.79    39.39   24.43
LIME                         22.14    53.33   30.09
Unsupervised Rationale        5.35    74.55    9.31
SHAP                         18.18    36.36   23.27
CXPlain                      21.21    44.55   27.82
Greedy Adding                33.33    38.03   32.21
Our Approach                 65.15    59.24   58.97