The contextual hypernym prediction model is based on BERT (Devlin et al., 2019). The input sentences c_t and c_w are tokenized, prepended with a [CLS] token, and separated by a [SEP] token. The target word t in the first sentence, c_t, and the related word w in the second sentence, c_w, are surrounded by < and > tokens. The class label (hypernym or not) is predicted by feeding the output representation of the [CLS] token through a fully-connected layer followed by a softmax layer.
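
To make the architecture concrete, the following is a minimal sketch of the described setup, assuming the Hugging Face transformers library and a bert-base-uncased checkpoint (the paper does not specify an implementation); the example sentences, the `mark` helper, and the head dimensions are illustrative only.

```python
import torch
import torch.nn as nn
from transformers import BertModel, BertTokenizerFast

tokenizer = BertTokenizerFast.from_pretrained("bert-base-uncased")
encoder = BertModel.from_pretrained("bert-base-uncased")

def mark(sentence: str, word: str) -> str:
    """Surround the first occurrence of `word` with < and > markers."""
    return sentence.replace(word, f"< {word} >", 1)

# Illustrative contexts c_t and c_w for target word t and candidate hypernym w.
c_t = mark("The spaniel chased the ball across the yard.", "spaniel")
c_w = mark("Every dog needs regular exercise.", "dog")

# The tokenizer prepends [CLS] and joins the sentence pair with [SEP].
inputs = tokenizer(c_t, c_w, return_tensors="pt")

class HypernymClassifier(nn.Module):
    def __init__(self, encoder, hidden_size=768, num_classes=2):
        super().__init__()
        self.encoder = encoder
        self.fc = nn.Linear(hidden_size, num_classes)

    def forward(self, **inputs):
        # Output representation of the [CLS] token (position 0).
        cls_repr = self.encoder(**inputs).last_hidden_state[:, 0]
        # Fully-connected layer followed by softmax over the two classes.
        return torch.softmax(self.fc(cls_repr), dim=-1)

model = HypernymClassifier(encoder)
probs = model(**inputs)  # [P(not hypernym), P(hypernym)]
```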