The authors of the article “Frame-Semantic Parsing” and a graduate student discovered an error affecting rows 7 and 8 of Table 8. At inference time for argument identification with gold frames, the described model included the gold spans in the candidate set alongside the automatically extracted spans (elaborated in Section 6.1). This created an oracle and artificially inflated the precision, recall, and F1 metrics. The revised metrics are:
Naive decoding: Precision=78.65, Recall=72.85, F1=75.64 (row 7)
Beam search decoding: Precision=80.40, Recall=72.84, F1=76.43 (row 8)
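As a consistency check on the revised figures, the short sketch below recomputes the F-score for each row from the revised precision and recall. It assumes the reported F-score is the balanced F1, that is, the harmonic mean of precision and recall. The function name `f1` and the dictionary `revised` are illustrative only and do not come from the article.

```python
def f1(precision: float, recall: float) -> float:
    """Balanced F-score: harmonic mean of precision and recall (in percent)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Revised precision and recall for rows 7 and 8 of Table 8, as listed above.
revised = {
    "Naive decoding (row 7)": (78.65, 72.85),
    "Beam search decoding (row 8)": (80.40, 72.84),
}

for name, (p, r) in revised.items():
    print(f"{name}: F1 = {f1(p, r):.2f}")
```

Under that assumption, the recomputed values agree with the revised F-scores of 75.64 and 76.43 to two decimal places.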
The inclusion of gold spans also changes the interpretation of Table 9. The results reported there should be interpreted as an oracle comparison of various inference methods, one that uses both automatically extracted candidate spans and gold spans for argument identification.
None of the other results in the article are affected by this error.