Table 3: Summary of hyper-parameter tuning. The * indicates a divergence from the proposed NCRF++ setup and empirical findings (Yang and Zhang, 2018).

| Parameter | Value | Parameter | Value |
|---|---|---|---|
| Optimizer | SGD | *LR (token-single) | 0.01 |
| *Batch Size | | *LR (token-multi) | 0.005 |
| LR decay | 0.05 | *LR (morpheme) | 0.01 |
| Epochs | 200 | Dropout | 0.5 |
| Bi-LSTM layers | | *CharCNN window | |
| *Word Emb Dim | 300 | Char Emb Dim | 30 |
| Word Hidden Dim | 200 | *Char Hidden Dim | 70 |
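
As a rough illustration of how the settings in Table 3 might be carried into an experiment script, the sketch below stores them in a plain Python dictionary. The key names and the helper functions `decayed_lr` and `missing_keys` are hypothetical and are not taken from the paper or from NCRF++. Cells with no value in the table are kept as `None` rather than guessed. The inverse-time decay formula is an assumption about how the "LR decay" value is applied; the table itself does not state the schedule.

```python
# Values transcribed from Table 3. None marks a cell that has no value
# in the table as given here; it is deliberately not filled in.
HYPERPARAMS = {
    "optimizer": "SGD",
    "batch_size": None,
    "lr_decay": 0.05,
    "epochs": 200,
    "bilstm_layers": None,
    "word_emb_dim": 300,
    "word_hidden_dim": 200,
    "lr_token_single": 0.01,
    "lr_token_multi": 0.005,
    "lr_morpheme": 0.01,
    "dropout": 0.5,
    "char_cnn_window": None,
    "char_emb_dim": 30,
    "char_hidden_dim": 70,
}


def decayed_lr(init_lr, decay, epoch):
    """Learning rate at a given epoch under inverse-time decay.

    ASSUMPTION: the schedule init_lr / (1 + decay * epoch) is used here
    for illustration only; Table 3 does not specify the formula.
    """
    return init_lr / (1.0 + decay * epoch)


def missing_keys(params):
    """Return the sorted names of settings that have no value."""
    return sorted(name for name, value in params.items() if value is None)


if __name__ == "__main__":
    print("Unspecified settings:", missing_keys(HYPERPARAMS))
    for epoch in (0, 50, 199):
        lr = decayed_lr(
            HYPERPARAMS["lr_token_single"], HYPERPARAMS["lr_decay"], epoch
        )
        print("epoch", epoch, "lr", lr)
```

Under this assumed schedule, a starting rate of 0.01 with a decay of 0.05 falls to half its initial value at epoch 20. Keeping the unspecified cells as `None` lets a script fail early instead of silently training with a guessed value.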