Our semi-supervised approach improves performance over the baselines (10-fold cross-validation averaged over five randomly seeded runs). “Self-Training” and “Lexical Decoding” refer to experiments where we use these methods independently. “Both” refers to their combination. We highlight the best model for each language.
| Model | % CER ain | % CER grk | % CER ybh | % CER kwk | % WER ain | % WER grk | % WER ybh | % WER kwk |
|---|---|---|---|---|---|---|---|---|
| First-Pass | 1.34 | 3.27 | 8.90 | 7.90 | 6.27 | 15.63 | 31.64 | 38.22 |
| Base | 0.80 | 1.70 | 8.44 | 4.97 | 5.19 | 7.51 | 21.33 | 27.65 |
| *Semi-Supervised* | | | | | | | | |
| Self-Training | 0.82 | 1.45 | 7.20 | 4.00 | 5.31 | 6.47 | 18.09 | 23.98 |
| Lexical Decoding | 0.81 | 1.51 | 7.56 | 4.28 | 5.18 | 6.60 | 19.13 | 25.09 |
| Both | **0.63** | **1.37** | **5.98** | **3.82** | **4.43** | **6.36** | **16.65** | **22.61** |
| Error Reduction | 21% | 19% | 29% | 23% | 15% | 15% | 22% | 18% |
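The "Error Reduction" row matches the relative reduction of the combined model ("Both") over "Base". A minimal sketch of that check, using the character-error-rate values copied from the table (variable names are ours):

```python
# Relative error reduction: (base - both) / base, per language.
# Values are the % CER rows of the table above.
base = {"ain": 0.80, "grk": 1.70, "ybh": 8.44, "kwk": 4.97}  # Base model
both = {"ain": 0.63, "grk": 1.37, "ybh": 5.98, "kwk": 3.82}  # Both (combined)

for lang in base:
    reduction = (base[lang] - both[lang]) / base[lang]
    print(f"{lang}: {reduction:.0%}")
# ain: 21%, grk: 19%, ybh: 29%, kwk: 23%  -- matching the table's row
```

The same computation on the word-error-rate columns reproduces the 15%/15%/22%/18% figures.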