Table 7

TACRED evaluation of the plausibility of explanations, measured as the overlap between machine explanations and human annotations. For each method, we report the higher F1 score of the two obtained against the two human annotators.

Approach                Precision  Recall  F1
Attention               41.39      20.60   26.50
Saliency Mapping        18.73      35.58   23.41
LIME                    14.31      26.03   18.09
Unsupervised Rationale   4.73      69.66    8.30
SHAP                    13.86      22.85   16.79
CXPlain                 28.84      55.06   36.48
Greedy Adding           31.59      33.52   30.16
Our Approach            74.72      61.20   62.05
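
The overlap metric described in the caption can be illustrated with a minimal sketch. The caption does not specify the unit of overlap or how scores are aggregated, so the following are assumptions: explanations are sets of token indices, scores are averaged per example, and the annotator yielding the higher F1 is selected per method. Per-example averaging would be consistent with the reported F1 values not equaling the harmonic mean of the reported precision and recall. The function names and toy data are hypothetical and are not taken from the paper.

```python
from statistics import mean


def overlap_prf(predicted, gold):
    """Precision, recall, and F1 of one predicted token set against one gold set."""
    predicted, gold = set(predicted), set(gold)
    true_positives = len(predicted & gold)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(gold) if gold else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return precision, recall, f1


def method_scores(predictions, annotations):
    """Average per-example (P, R, F1) of a method against a single annotator.

    Assumption: per-example (macro) averaging; the source does not state this.
    """
    per_example = [overlap_prf(p, g) for p, g in zip(predictions, annotations)]
    return tuple(mean(column) for column in zip(*per_example))


def best_annotator_scores(predictions, annotator_sets):
    """Return the (P, R, F1) triple from the annotator that gives the higher F1."""
    return max(
        (method_scores(predictions, annotations) for annotations in annotator_sets),
        key=lambda scores: scores[2],
    )


# Toy data (hypothetical): token indices highlighted for two sentences.
predictions = [{2, 3, 5}, {0, 1}]
annotator_a = [{2, 3}, {1}]
annotator_b = [{3, 5, 6, 7}, {0, 1, 2}]

precision, recall, f1 = best_annotator_scores(predictions, [annotator_a, annotator_b])
print(f"P={precision:.4f} R={recall:.4f} F1={f1:.4f}")
```

Selecting the annotator on F1 alone means the reported precision and recall both come from that same annotator, rather than being maximized independently.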