Table 3 

Graphical models consistently outperform n-gram models by a larger margin on sparse words than on non-sparse words, and by a larger margin on polysemous words than on non-polysemous words. The one exception is NB-R, which performs worse relative to Web1T-n-gram-R on polysemous words than on non-polysemous words. For each graphical-model representation, the difference in performance between that representation and Web1T-n-gram-R is shown in parentheses. For each representation, the difference in accuracy between the polysemous and non-polysemous subsets was statistically significant at p < 0.01 under a two-tailed Fisher's exact test; likewise for the sparse vs. non-sparse subsets.


                      polysemous   not polysemous   sparse   not sparse
tokens                     159          4,321          463      12,194
Trad-R                    59.5           78.5         52.5        89.6
Web1T-n-gram-R            68.2           85.3         61.8        94.0
NB-R                      64.5           88.7         57.8        89.4
  (− Web1T-n-gram-R)    (−3.7)         (+3.4)       (−4.0)      (−4.6)
Hmm-Token-R               67.9           83.4         60.2        91.6
  (− Web1T-n-gram-R)    (−0.3)         (−1.9)       (−1.6)      (−2.4)
I-Hmm-Token-R             75.6           85.2         62.9        94.5
  (− Web1T-n-gram-R)    (+7.4)         (−0.1)       (+1.1)      (+0.5)
Lattice-Token-R           70.5           86.9         65.2        94.6
  (− Web1T-n-gram-R)    (+2.3)         (+1.6)       (+3.4)      (+0.6)
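The caption's significance claim can be approximately checked from the published numbers alone. The sketch below, a stdlib-only implementation of the two-tailed Fisher's exact test, reconstructs correct/incorrect counts by rounding accuracy × token count (so the resulting p-values are approximate, not the paper's exact computation); `subset_significance` is a hypothetical helper name, not code from the paper.

```python
from math import comb

def fisher_exact_two_sided(a, b, c, d):
    """Two-tailed Fisher's exact test on the 2x2 table [[a, b], [c, d]].

    p-value = sum of hypergeometric probabilities of every table with
    the same margins that is no more likely than the observed table.
    """
    n = a + b + c + d
    row1, col1 = a + b, a + c
    total = comb(n, col1)
    lo = max(0, col1 - (n - row1))          # feasible range for cell (1,1)
    hi = min(row1, col1)
    probs = {x: comb(row1, x) * comb(n - row1, col1 - x) / total
             for x in range(lo, hi + 1)}
    p_obs = probs[a]
    # Small slack guards against float round-off when comparing probabilities.
    return sum(p for p in probs.values() if p <= p_obs * (1 + 1e-9))

def subset_significance(acc_a, n_a, acc_b, n_b):
    # Rebuild correct/incorrect counts from reported accuracy (%) and
    # token counts; counts are rounded, so the p-value is approximate.
    ca, cb = round(acc_a / 100 * n_a), round(acc_b / 100 * n_b)
    return fisher_exact_two_sided(ca, n_a - ca, cb, n_b - cb)

# I-Hmm-Token-R: polysemous (75.6% of 159) vs. non-polysemous (85.2% of 4,321);
# the caption reports significance at p < 0.01 for each representation.
p = subset_significance(75.6, 159, 85.2, 4321)
print(f"p = {p:.2e}")
```

On the textbook table [[8, 2], [1, 5]] this routine gives p = 400/11440 ≈ 0.035, matching the standard two-tailed Fisher result.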
