Table 4 
Evaluation of language models using the AWD-LSTM model (trained with WT2), compared with evaluation by perplexity and by the Taylor exponent.
Model or data set           Perplexity  Taylor exponent  Perplexity from eval-AWD-LSTM

Original Data Set
Wikitext-2 (Preprocessed)   -           0.62 (0.15)      33.81

Shuffled Data Set
Wikitext-2 (1-gram)         -           0.50 (0.02)      7,389.15
Wikitext-2 (2-gram)         -           0.50 (0.02)      2,405.15
Wikitext-2 (5-gram)         -           0.50 (0.02)      559.92
Wikitext-2 (10-gram)        -           0.50 (0.02)      236.49

N-gram Language Model
3-gram                      837.58      0.50 (0.02)      3,730.74
5-gram                      534.98      0.50 (0.02)      7,532.91
linear interpolation        294.72      0.50 (0.02)      1,371.75
Katz backoff 3-gram         285.14      0.50 (0.02)      663.74
Katz backoff 5-gram         357.94      0.50 (0.02)      664.25
Kneser-Ney 3-gram           204.15      0.50 (0.02)      2,562.24
Kneser-Ney 5-gram           215.44      0.50 (0.02)      2,743.65
HPYLM                       184.34      0.50 (0.02)      884.76

Neural Language Model
Simple RNN                  164.51      0.50 (0.02)      645.64
GRU                         96.22       0.52 (0.03)      266.33
QRNN                        74.74       0.52 (0.03)      135.68
LSTM (no regularization)    113.18      0.52 (0.03)      177.12
AWD-LSTM                    64.27       0.58 (0.06)      88.73
AWD-LSTM-Simon              61.59       0.55 (0.05)      130.52
AWD-LSTM-MoS                62.44       0.54 (0.04)      97.89
AWD-LSTM-MoS-Cache          59.21       0.57 (0.07)      164.39
AWD-LSTM-Cache              50.39       0.59 (0.07)      109.02