Table 1: Specifications of the ELMo variants compared in Sections 4 and 5. "Cont" means the model has continuous outputs; "LN" means layer normalization.
| Model | Input | Sequence Encoder | Output |
|---|---|---|---|
| ELMo | CNN | LSTM | Sampled Softmax |
| ELMo-C (ours) | FastText_cc | LSTM w/ LN | Cont w/ FastText_cc |
| ELMo-A | FastText_cc | LSTM w/ LN | Adaptive Softmax |
| ELMo-Sub | Subword | LSTM w/ LN | Softmax |
| ELMo-C_OneB | FastText_OneB | LSTM w/ LN | Cont w/ FastText_OneB |
| ELMo-C_Rnd | FastText_cc | LSTM w/ LN | Cont w/ Random Embedding |
| ELMo-C_CNN | Trained CNN | LSTM w/ LN | Cont w/ Trained CNN |
| ELMo-C_CNN-CC | Trained CNN | LSTM w/ LN | Cont w/ FastText_cc |
| ELMo-C_CC-CNN | FastText_cc | LSTM w/ LN | Cont w/ Trained CNN |