Table 3: A comparison of a single-task model (ST), a simple multi-task model (SMT), and our complex multi-task model (CMT) in full training and active learning. Models were trained with the cross-entropy (CE) or label smoothing (LS) losses, on all OntoNotes domains.

            Full Training                   Active Learning
            DP              NER             DP              NER
Model       Avg     Best    Avg     Best    Avg     Best    Avg     Best
ST (CE)     87.17   1/6     70.35   0/6     86.43   3/6     74.65   6/6
SMT (CE)    86.94   2/6     67.51   0/6     85.86   2/6     70.31   0/6
CMT (CE)    87.04   3/6     72.79   6/6     85.91   1/6     72.11   0/6
ST (LS)     87.64   0/6     71.31   1/6     88.96   0/6     75.61   2/6
SMT (LS)    86.87   0/6     69.07   0/6     87.53   0/6     73.26   1/6
CMT (LS)    87.98   6/6     72.86   5/6     89.03   6/6     74.44   3/6