Table 3: System performance in Macro-F1 (%) on the test set created via AMT. Models marked with † differ significantly (p < 0.05) from the best system in each task, according to the approximate randomization test (Noreen, 1989).
Systems     Sentences                  Words
            Wiki-en  Wiki-zh      NYT  Wiki-en  Wiki-zh      NYT
Major          1.34     6.14     0.51     1.39    14.95     0.39
L-LDA         27.81    28.94    28.08    24.58    42.67    26.24
HierNet       42.23    29.93    44.74    15.57    24.25    18.27
MilNet        39.30    45.14    29.31    22.11    33.10    23.33

DetNet1ℋ      48.12    51.76    57.06    16.21    26.90    21.61
DetNet2ℋ      54.70    57.60    55.78    27.06    43.82    26.52
DetNet*       58.01    51.28    60.62    26.08    43.18    27.03
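
For readers unfamiliar with the two quantities named in the caption, the following is a minimal sketch of how Macro-F1 and a paired approximate randomization test could be computed. It is an illustration under simplifying assumptions, not the evaluation code behind Table 3: it assumes one label per item (the actual task may assign several domains to an item), and the function names `macro_f1` and `approx_randomization`, the number of trials, and the seed are all hypothetical choices.

```python
import random


def macro_f1(gold, pred, labels):
    """Unweighted mean of per-label F1 scores (single-label assumption)."""
    scores = []
    for label in labels:
        tp = sum(1 for g, p in zip(gold, pred) if g == label and p == label)
        fp = sum(1 for g, p in zip(gold, pred) if g != label and p == label)
        fn = sum(1 for g, p in zip(gold, pred) if g == label and p != label)
        denom = 2 * tp + fp + fn
        # A label that is never gold and never predicted contributes 0.0 here.
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / len(scores)


def approx_randomization(gold, pred_a, pred_b, labels, trials=1000, seed=0):
    """Estimate a two-sided p-value for the Macro-F1 gap between two systems.

    In each trial the two systems' outputs are swapped item by item with
    probability 0.5, and the resulting score gap is compared to the observed one.
    """
    rng = random.Random(seed)
    observed = abs(macro_f1(gold, pred_a, labels) - macro_f1(gold, pred_b, labels))
    at_least_as_large = 0
    for _ in range(trials):
        shuf_a, shuf_b = [], []
        for a, b in zip(pred_a, pred_b):
            if rng.random() < 0.5:
                a, b = b, a
            shuf_a.append(a)
            shuf_b.append(b)
        gap = abs(macro_f1(gold, shuf_a, labels) - macro_f1(gold, shuf_b, labels))
        if gap >= observed:
            at_least_as_large += 1
    # Add-one smoothing keeps the estimate strictly positive.
    return (at_least_as_large + 1) / (trials + 1)
```

Under this sketch, a system would carry the † marker when the returned p-value against the best system in the same column falls below 0.05.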