Table 9 shows the results (F1 measures) of the two SVM classifiers (at the author and top-source levels, as well as their average) run on both the original and the corrected versions of the gold standard. For a more meaningful comparison with our system, we also computed De Facto's performance at these same two source levels. The results are shown in Table 10, where we also include, as a reference point, the figures obtained from evaluating De Facto on all source levels (corresponding to the F1 rows in Table 8).
Table 9. Baseline performance (F1 measures).

| | CT+ | CT− | PR+ | PS+ | Uu | Macro-A | Micro-A |
|---|---|---|---|---|---|---|---|
| *Original parses* | | | | | | | |
| Author | 0.88 | 0.53 | 0.07 | 0.29 | 0.75 | 0.53 | 0.83 |
| Top sources | 0.92 | 0.69 | 0.51 | 0.50 | 0.57 | 0.66 | 0.86 |
| Average | 0.90 | 0.61 | 0.29 | 0.39 | 0.66 | 0.59 | 0.84 |
| *Corrected parses* | | | | | | | |
| Author | 0.88 | 0.54 | 0.07 | 0.27 | 0.77 | 0.53 | 0.83 |
| Top sources | 0.92 | 0.67 | 0.50 | 0.50 | 0.51 | 0.64 | 0.85 |
| Average | 0.90 | 0.61 | 0.28 | 0.38 | 0.64 | 0.58 | 0.84 |
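For readers unfamiliar with the two averaging schemes in the Macro-A and Micro-A columns: macro-averaging takes the unweighted mean of the per-class F1 scores, whereas micro-averaging pools all individual decisions first, so frequent classes such as CT+ dominate it. A minimal sketch of the distinction, assuming scikit-learn and toy labels rather than the paper's data:

```python
from sklearn.metrics import f1_score

# Hypothetical gold/predicted factuality values over the five classes
# in the tables (illustration only, not the evaluation data).
labels = ["CT+", "CT-", "PR+", "PS+", "Uu"]
gold = ["CT+", "CT+", "CT-", "Uu", "PR+", "PS+", "CT+", "Uu"]
pred = ["CT+", "CT-", "CT-", "Uu", "PR+", "CT+", "CT+", "Uu"]

# Per-class F1, as in the CT+ ... Uu columns (zero_division guards
# classes that the toy predictions never assign).
per_class = f1_score(gold, pred, labels=labels, average=None, zero_division=0)

# Macro-A: unweighted mean of the per-class F1 scores.
macro = f1_score(gold, pred, labels=labels, average="macro", zero_division=0)

# Micro-A: F1 over the pooled counts; for single-label multiclass data
# this equals overall accuracy.
micro = f1_score(gold, pred, labels=labels, average="micro", zero_division=0)

print(dict(zip(labels, per_class.round(2))), round(macro, 2), round(micro, 2))
```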
Table 10. De Facto performance (F1 measures).

| | CT+ | CT− | PR+ | PS+ | Uu | Macro-A | Micro-A |
|---|---|---|---|---|---|---|---|
| *Original parses* | | | | | | | |
| All sources | 0.85 | 0.75 | 0.46 | 0.59 | 0.75 | 0.70 | 0.80 |
| Author | 0.88 | 0.88 *** | 0.67 *** | 0.33 | 0.78 | 0.73 *** | 0.84 * |
| Top sources | 0.90 | 0.79 * | 0.33 * | 0.66 ** | 0.58 | 0.67 | 0.84 |
| Average | 0.89 | 0.84 *** | 0.50 ** | 0.50 * | 0.68 | 0.70 *** | 0.84 |
| *Corrected parses* | | | | | | | |
| All sources | 0.89 | 0.82 | 0.55 | 0.61 | 0.81 | 0.74 | 0.85 |
| Author | 0.90 | 0.91 *** | 0.67 *** | 0.35 | 0.84 ** | 0.75 *** | 0.88 * |
| Top sources | 0.93 | 0.85 ** | 0.53 | 0.67 ** | 0.65 * | 0.74 * | 0.88 |
| Average | 0.92 | 0.88 *** | 0.60 *** | 0.51 * | 0.75 * | 0.75 *** | 0.88 ** |
* p ≤ 0.05; ** p ≤ 0.01; *** p ≤ 0.001
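The stars in Table 10 mark differences from the baseline that are statistically significant at the indicated levels. This excerpt does not specify which test produced them; as one common choice for comparing paired system outputs, the sketch below implements a two-sided approximate randomization test on the macro-averaged F1 difference (function and variable names are hypothetical, not the paper's procedure):

```python
import numpy as np
from sklearn.metrics import f1_score

def approx_randomization_test(gold, pred_a, pred_b, trials=10_000, seed=0):
    """Two-sided approximate randomization test for the difference in
    macro-averaged F1 between two classifiers on the same test items."""
    rng = np.random.default_rng(seed)
    pred_a, pred_b = np.asarray(pred_a), np.asarray(pred_b)
    observed = abs(f1_score(gold, pred_a, average="macro", zero_division=0)
                   - f1_score(gold, pred_b, average="macro", zero_division=0))
    hits = 0
    for _ in range(trials):
        # Under the null hypothesis the two systems are interchangeable,
        # so each item's pair of predictions may be swapped at random.
        swap = rng.random(len(gold)) < 0.5
        a = np.where(swap, pred_b, pred_a)
        b = np.where(swap, pred_a, pred_b)
        diff = abs(f1_score(gold, a, average="macro", zero_division=0)
                   - f1_score(gold, b, average="macro", zero_division=0))
        hits += diff >= observed
    return (hits + 1) / (trials + 1)  # smoothed two-sided p-value

# Hypothetical usage: gold labels plus the paired predictions of the
# baseline SVM and De Facto on the same sentences.
# p = approx_randomization_test(gold, svm_pred, defacto_pred)
```

The returned p-value is the share of random swaps that produce an F1 difference at least as large as the observed one; small values indicate the observed gap is unlikely under the assumption that the two systems are interchangeable.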