Table 6: 

Rough perplexity comparisons between our trained language models and models with the same architectures in previous work (a: Gulordava et al., 2018; b: Radford et al., 2019; c: Aina et al., 2019; d: Wolf et al., 2020).

          Ours                     Previous work
          # Params   Perplexity    # Params    Perplexity
LSTM      37M        54.8          72M (a)     52.1
GPT-2     108M       30.2          117M (b)    37.5
BiLSTM    51M        9.0           42M (c)     18.1
BERT      109M       7.2           110M (d)    9.4