Table 4: Summary of the best performance of each family of representations on the various tasks. The evaluation metric is accuracy, except for the phrase type task, for which we report the span-based F1 score, excluding O tags.
| Model Family | VPC Classification (Acc) | LVC Classification (Acc) | NC Literality (Acc) | NC Relations (Acc) | AN Attributes (Acc) | Phrase Type (F1) |
|---|---|---|---|---|---|---|
| Majority Baselines | 23.6 | 43.7 | 72.5 | 50.0 | 50.0 | 26.6 |
| Word Embeddings | 60.5 | 74.6 | 80.4 | 51.2 | 53.8 | 44.0 |
| Contextualized | 90.0 | 82.5 | 91.3 | 54.3 | 65.1 | 64.8 |
| Human | 93.8 | 83.8 | 91.0 | 77.8 | 86.4 | |
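
The caption reports a span-based F1 score that excludes O tags for the phrase type task. The following is a minimal sketch of one common way to compute such a score, not necessarily the exact procedure used for Table 4. It assumes BIO-style tags, exact-match spans, and micro-averaging over all sentences; the helper names `extract_spans` and `span_f1` and the labels `NC` and `VPC` in the usage example are hypothetical.

```python
from typing import List, Set, Tuple

Span = Tuple[int, int, str]  # (start, end_exclusive, label)


def extract_spans(tags: List[str]) -> Set[Span]:
    """Collect labeled spans from one BIO-tagged sentence; O tokens yield no span."""
    spans: Set[Span] = set()
    start, label = None, None
    # A trailing "O" sentinel closes any span still open at the end.
    for i, tag in enumerate(tags + ["O"]):
        if tag.startswith("I-") and label == tag[2:]:
            continue  # the current span keeps going
        if label is not None:
            spans.add((start, i, label))
            start, label = None, None
        if tag.startswith("B-") or tag.startswith("I-"):
            # A stray I- tag without a matching open span starts a new span
            # (an assumption of this sketch).
            start, label = i, tag[2:]
    return spans


def span_f1(gold_sents: List[List[str]], pred_sents: List[List[str]]) -> float:
    """Micro-averaged exact-match span F1 over a corpus, ignoring O tags."""
    gold = {(s,) + sp for s, tags in enumerate(gold_sents) for sp in extract_spans(tags)}
    pred = {(s,) + sp for s, tags in enumerate(pred_sents) for sp in extract_spans(tags)}
    true_positives = len(gold & pred)
    if true_positives == 0:
        return 0.0
    precision = true_positives / len(pred)
    recall = true_positives / len(gold)
    return 2 * precision * recall / (precision + recall)


# Usage: one of the two gold spans is predicted with the wrong boundary.
gold = [["O", "B-NC", "I-NC", "O", "B-VPC", "I-VPC"]]
pred = [["O", "B-NC", "I-NC", "O", "B-VPC", "O"]]
score = span_f1(gold, pred)
```

Because only labeled spans are counted, a model that predicts O everywhere scores 0 under this metric instead of benefiting from the many O tokens, which is the usual motivation for excluding O tags.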