Table 1: Parameter counts and evaluation perplexities for the trained language models. For reference, the pre-trained BERT base model from Huggingface reached a perplexity of 9.4 on our evaluation set. Additional perplexity comparisons with comparable models are included in Appendix A.1.

Model    # Parameters    Perplexity
LSTM     37M             54.8
GPT-2    108M            30.2
BiLSTM   51M             9.0
BERT     109M            7.2
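
The paper's evaluation code is not shown here, so the following is only a minimal sketch of how a perplexity for a bidirectional model such as BERT could be computed, as a pseudo-perplexity: each token is masked in turn and the exponentiated mean negative log-likelihood of the true tokens is reported. The model name bert-base-uncased, the example sentence, and the per-sentence aggregation are assumptions, not the paper's actual setup.

import math

import torch
from transformers import AutoModelForMaskedLM, AutoTokenizer

# Assumed checkpoint; the paper only says "BERT base model from Huggingface".
tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModelForMaskedLM.from_pretrained("bert-base-uncased")
model.eval()

def pseudo_perplexity(sentence: str) -> float:
    # Mask each position in turn, score the true token, and
    # exponentiate the mean negative log-likelihood.
    enc = tokenizer(sentence, return_tensors="pt")
    input_ids = enc["input_ids"][0]
    nlls = []
    for i in range(1, input_ids.size(0) - 1):  # skip [CLS] and [SEP]
        masked = input_ids.clone()
        masked[i] = tokenizer.mask_token_id
        with torch.no_grad():
            logits = model(input_ids=masked.unsqueeze(0)).logits
        log_probs = torch.log_softmax(logits[0, i], dim=-1)
        nlls.append(-log_probs[input_ids[i]].item())
    return math.exp(sum(nlls) / len(nlls))

print(pseudo_perplexity("The cat sat on the mat."))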