Table 1

List of the embedding models used for the study, together with their hyperparameter settings.

PPMI.w2: 345K window-selected context words; window of width 2; weighted with Positive Pointwise Mutual Information (PPMI) and reduced with Singular Value Decomposition (SVD); subsampling method from Mikolov et al. (2013).

PPMI.synf: 345K syntactically filtered context words; weighted with Positive Pointwise Mutual Information (PPMI) and reduced with Singular Value Decomposition (SVD); subsampling method from Mikolov et al. (2013).

PPMI.synt: 345K syntactically typed context words; weighted with Positive Pointwise Mutual Information (PPMI) and reduced with Singular Value Decomposition (SVD); subsampling method from Mikolov et al. (2013).

GloVe: Window of width 2; subsampling method from Mikolov et al. (2013).

SGNS.w2: Skip-gram with negative sampling; window of width 2; 15 negative examples; trained with the word2vec library (Mikolov et al. 2013).

SGNS.synf: Skip-gram with negative sampling; syntactically filtered context words; 15 negative examples; trained with the word2vecf library (Levy and Goldberg 2014).

SGNS.synt: Skip-gram with negative sampling; syntactically typed context words; 15 negative examples; trained with the word2vecf library (Levy and Goldberg 2014).

FastText: Skip-gram with negative sampling and subword information; window of width 2; 15 negative examples; trained with the fastText library (Bojanowski et al. 2017).

ELMo: Pretrained ELMo embeddings (Peters et al. 2018), available at https://allennlp.org/elmo; original model trained on the 1 Billion Word Benchmark (Chelba et al. 2013).

BERT: Pretrained BERT-Large embeddings (Devlin et al. 2019), available at https://github.com/google-research/bert; model trained on the concatenation of the Books corpus (Zhu et al. 2015) and the English Wikipedia.
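
The PPMI.* rows describe a count-based pipeline: build a word-by-context co-occurrence matrix, weight it with PPMI, and reduce it with truncated SVD. The sketch below illustrates that pipeline; the 300-dimensional output and the helper name are illustrative assumptions rather than the study's exact configuration, and the Mikolov et al. (2013) subsampling of frequent words, which is applied when counting co-occurrences, is omitted here.

```python
# Minimal sketch of a PPMI + SVD count model, assuming a precomputed
# word-by-context co-occurrence matrix. Dimensionality (300) is assumed,
# not taken from the table.
import numpy as np
from scipy.sparse import csr_matrix
from scipy.sparse.linalg import svds

def ppmi_svd_embeddings(cooc: csr_matrix, dim: int = 300) -> np.ndarray:
    """cooc[i, j] = count of context word j within the window of target word i."""
    total = cooc.sum()
    row_sums = np.asarray(cooc.sum(axis=1)).ravel()  # target-word marginals
    col_sums = np.asarray(cooc.sum(axis=0)).ravel()  # context-word marginals

    cooc = cooc.tocoo()
    # PMI(i, j) = log( P(i, j) / (P(i) P(j)) ); keep only positive values (PPMI).
    pmi = np.log((cooc.data * total) / (row_sums[cooc.row] * col_sums[cooc.col]))
    ppmi_data = np.maximum(pmi, 0.0)
    ppmi = csr_matrix((ppmi_data, (cooc.row, cooc.col)), shape=cooc.shape)

    # Truncated SVD: scaled left singular vectors serve as the word embeddings.
    u, s, _ = svds(ppmi, k=dim)
    return u * s  # one row per target word
```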
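For the SGNS.* models, the table reports skip-gram with negative sampling, a window of width 2 (or syntactic contexts), and 15 negative examples, trained with the word2vec and word2vecf libraries. As a rough, runnable stand-in for the SGNS.w2 configuration, the sketch below uses gensim's Word2Vec; the toy corpus, the 300-dimensional vectors, and the subsampling rate are assumptions for illustration only, not the study's setup.

```python
# Illustrative only: the study trained SGNS.w2 with the word2vec library
# (Mikolov et al. 2013); gensim's Word2Vec is used here as a readily
# available stand-in with the hyperparameters reported in the table.
from gensim.models import Word2Vec

# Toy corpus standing in for the study's training corpus
# (one tokenized sentence per list).
sentences = [
    ["the", "cat", "sat", "on", "the", "mat"],
    ["the", "dog", "chased", "the", "cat"],
]

model = Word2Vec(
    sentences,
    vector_size=300,  # dimensionality: assumed, not reported in the table
    window=2,         # window of width 2
    sg=1,             # skip-gram architecture
    negative=15,      # 15 negative examples
    sample=1e-5,      # frequent-word subsampling (Mikolov et al. 2013); rate assumed
    min_count=1,      # keep every word of the toy corpus
)

print(model.wv["cat"].shape)  # (300,)
```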
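The ELMo and BERT embeddings were taken from the pretrained checkpoints linked in the table rather than trained for the study. As an illustration of how contextualized token vectors can be extracted from a comparable BERT-Large checkpoint, the sketch below uses the Hugging Face transformers port ("bert-large-uncased") as an assumed stand-in for the original Google release.

```python
# Illustrative extraction of contextualized token embeddings from a
# BERT-Large checkpoint. The study used the original release at
# https://github.com/google-research/bert; the Hugging Face
# "bert-large-uncased" port is an assumed, roughly equivalent stand-in.
import torch
from transformers import BertModel, BertTokenizer

tokenizer = BertTokenizer.from_pretrained("bert-large-uncased")
model = BertModel.from_pretrained("bert-large-uncased")
model.eval()

inputs = tokenizer("The cat sat on the mat.", return_tensors="pt")
with torch.no_grad():
    outputs = model(**inputs)

# One 1024-dimensional vector per WordPiece token (including [CLS] and [SEP]).
token_vectors = outputs.last_hidden_state.squeeze(0)
print(token_vectors.shape)  # e.g. torch.Size([9, 1024])
```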