Performance of the rule-based model on the TACRED test partition. [1] is the set of manually written surface rules of Angeli et al. (2015) coupled with our syntactic rules (see Section 4.1). [2] is the set of rules generated from our explainability classifier’s outputs with gold labels on the training partition. [3] is the set of rules generated from the explainability classifier’s outputs with predicted labels on the test partition. We also evaluate combinations of these rule sets: [2]+[3] contains all rules generated by our approach; [1]+[2]+[3] combines the machine-generated rules with the manually written rules.
Approach | Precision | Recall | F1 |
---|---|---|---|
Baseline | | | |
Manual Rules [1] | 85.93 | 24.24 | 37.81 |
Our Approach | | | |
Rules from Training [2] | 49.39 | 30.26 | 37.52 |
Rules from Test [3] | 59.69 | 55.04 | 57.27 |
Combination of [1] and [2] | 54.12 | 62.95 | 58.20 |
Combination of [1] and [3] | 65.28 | 71.64 | 68.31 |
Combination of [2] and [3] | 56.34 | 40.90 | 47.40 |
Combination of [1], [2], and [3] | 57.36 | 72.00 | 63.85 |
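
As a reading aid, the F1 column can be recomputed from the precision and recall columns. The sketch below assumes F1 is the usual harmonic mean of precision and recall, which the reported values are consistent with up to rounding. The names `f1_score`, `ROWS`, and `max_f1_gap` are hypothetical helpers introduced here for illustration and are not part of the original evaluation code.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (same units as the inputs)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# (precision, recall, reported F1), copied from the table above, in percent.
ROWS = {
    "Manual Rules [1]": (85.93, 24.24, 37.81),
    "Rules from Training [2]": (49.39, 30.26, 37.52),
    "Rules from Test [3]": (59.69, 55.04, 57.27),
    "Combination of [1] and [2]": (54.12, 62.95, 58.20),
    "Combination of [1] and [3]": (65.28, 71.64, 68.31),
    "Combination of [2] and [3]": (56.34, 40.90, 47.40),
    "Combination of [1], [2], and [3]": (57.36, 72.00, 63.85),
}


def max_f1_gap(rows: dict) -> float:
    """Largest absolute difference between recomputed and reported F1."""
    return max(abs(f1_score(p, r) - f1) for p, r, f1 in rows.values())


if __name__ == "__main__":
    for name, (p, r, reported) in ROWS.items():
        print(f"{name}: recomputed {f1_score(p, r):.2f}, reported {reported:.2f}")
```

Small differences in the second decimal place are expected, because the precision and recall shown in the table are themselves rounded.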