A more informed unknown word model (character n-gram) in combination with the word-level known word model consistently performs better than the alternatives for all four languages in our dataset.
CER = character error rate, WER = word error rate; all values are percentages.

Known Word Model | Unknown Word Model | CER ain | CER grk | CER ybh | CER kwk | WER ain | WER grk | WER ybh | WER kwk
---|---|---|---|---|---|---|---|---|---
CharLM | (not needed) | 0.64 | 1.43 | 6.22 | 3.85 | 4.50 | 6.44 | 16.78 | 22.90
WordLM | Character uniform | 0.64 | 1.42 | 6.12 | 3.95 | 4.50 | 6.39 | 16.71 | 23.11
Ours | Character n-gram | 0.63 | 1.37 | 5.98 | 3.82 | 4.43 | 6.36 | 16.65 | 22.61
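For readers unfamiliar with the two metrics in the table, the following is a minimal sketch of how character and word error rates are commonly computed: the Levenshtein edit distance between system output and reference, summed over the corpus and divided by the total reference length. The function names, the whitespace tokenization for words, and the corpus-level (rather than per-line) averaging are assumptions made for illustration; the excerpt does not specify the exact normalization used to produce the numbers above.

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two sequences (strings or token lists)."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def error_rate(refs, hyps, unit="char"):
    """Corpus-level error rate in percent.

    unit="char" compares character sequences (CER);
    unit="word" compares whitespace-separated tokens (WER).
    Tokenization and averaging are illustrative assumptions.
    """
    total_edits = 0
    total_len = 0
    for ref, hyp in zip(refs, hyps):
        if unit == "word":
            ref, hyp = ref.split(), hyp.split()
        total_edits += edit_distance(ref, hyp)
        total_len += len(ref)
    return 100.0 * total_edits / total_len
```

Under these assumptions, `error_rate(references, outputs, unit="char")` gives a CER and `unit="word"` gives a WER. Because a single wrong character makes the whole word count as an error, WER is typically several times larger than CER, which matches the pattern in the table.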