Abstract
We propose to use the visual denotations of linguistic expressions (i.e. the set of images they describe) to define novel denotational similarity metrics, which we show to be at least as beneficial as distributional similarities for two tasks that require semantic inference. To compute these denotational similarities, we construct a denotation graph, i.e. a subsumption hierarchy over constituents and their denotations, based on a large corpus of 30K images and 150K descriptive captions.
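The abstract defines a visual denotation as the set of images an expression describes, but does not spell out the similarity metrics themselves here. As a minimal illustrative sketch only, not the paper's actual metrics, the simplest denotational similarity one could compute over such sets is their Jaccard overlap; the function name and image IDs below are hypothetical:

    # Illustrative sketch: a set-based (Jaccard) similarity over visual
    # denotations, i.e. the sets of images two expressions describe.
    # This is NOT the paper's metric; all names here are hypothetical.

    def jaccard_denotational_similarity(denotation_a: set, denotation_b: set) -> float:
        """Overlap of the image sets (visual denotations) of two expressions."""
        if not denotation_a and not denotation_b:
            return 0.0
        return len(denotation_a & denotation_b) / len(denotation_a | denotation_b)

    # Toy image IDs standing in for the 30K-image corpus.
    den_dog_running = {"img_001", "img_042", "img_107"}
    den_animal_moving = {"img_001", "img_042", "img_107", "img_200"}
    print(jaccard_denotational_similarity(den_dog_running, den_animal_moving))  # 0.75

Because such a similarity is computed over which images the expressions describe rather than over their word co-occurrence statistics, two expressions can be judged similar even when they share no words, which is what distinguishes denotational from distributional similarity.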
©2014 Association for Computational Linguistics.
This is an open-access article distributed under the terms of the Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License, which permits copying and redistribution in any medium or format for non-commercial use only, provided that the original work is not remixed, transformed, or built upon, and that appropriate credit to the original source is given. For a full description of the license, please visit https://creativecommons.org/licenses/by-nc-nd/4.0/legalcode.