Table 2: 
Validation and test results for the structured prediction tasks; each entry is the mean over three random seeds. To preserve test set integrity, we obtain test set results only for the no-distillation baseline (No-KD) and for the structure-distilled BERT that performs best on the validation set (Best-KD); “Err. Red.” reports the test error reduction relative to the No-KD baseline. We report F1 and exact match (EM) for PTB phrase-structure parsing; for dependency parsing, we report unlabeled (UAS) and labeled (LAS) attachment scores. The “Const. OOD” row reports the mean F1 over three out-of-domain corpora: Brown, Genia, and the English Web Treebank (EWT); the validation results exclude Brown, which has no validation set.
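Validation set columns: baselines (No-KD, Seq-KD) and structure-distilled BERTs (L2R-KD, R2L-KD, UF-KD, UG-KD). Test set columns: No-KD, Best-KD, and Err. Red. The first five rows are the parsing tasks.

| Task | Val. No-KD | Val. Seq-KD | Val. L2R-KD | Val. R2L-KD | Val. UF-KD | Val. UG-KD | Test No-KD | Test Best-KD | Test Err. Red. |
|---|---|---|---|---|---|---|---|---|---|
| Const. PTB - F1 | 95.38 | 95.33 | 95.55 | 95.55 | 95.58 | 95.59 | 95.35 | 95.70 | 7.6% |
| Const. PTB - EM | 55.33 | 55.41 | 55.92 | 56.18 | 56.39 | 56.59 | 55.25 | 57.77 | 5.63% |
| Const. OOD - F1 | 86.76 | 86.54 | 87.43 | 87.53 | 87.23 | 87.40 | 89.04 | 89.76 | 6.55% |
| Dep. PTB - UAS | 96.48 | 96.40 | 96.70 | 96.64 | 96.60 | 96.66 | 96.79 | 96.86 | 2.18% |
| Dep. PTB - LAS | 94.65 | 94.56 | 94.90 | 94.80 | 94.79 | 94.83 | 95.13 | 95.23 | 1.99% |
| SRL - OntoNotes | 86.17 | 86.09 | 86.34 | 86.29 | 86.30 | 86.46 | 86.08 | 86.39 | 2.23% |
| Coref. - OntoNotes | 72.53 | 69.27 | 73.74 | 73.49 | 73.79 | 73.33 | 72.71 | 73.69 | 3.58% |
| CCG supertag. probe | 93.69 | 91.59 | 93.97 | 95.21 | 95.13 | 95.21 | 93.88 | 95.2 | 21.57% |
| Probe selectivity | 24.79 | 23.77 | 23.3 | 23.57 | 27.28 | 28.3 | 23.15 | 26.07 | N/A |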
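The caption defines “Err. Red.” only as the test error reduction relative to the No-KD baseline and does not give a formula. A minimal sketch of one common reading, assuming that error is 100 minus the reported score, is shown here; the function name `error_reduction` is hypothetical and the inputs are test-set means copied from the table.

```python
def error_reduction(baseline_score: float, new_score: float) -> float:
    """Relative error reduction in percent.

    Assumption (not stated in the caption): error = 100 - score.
    """
    baseline_error = 100.0 - baseline_score
    new_error = 100.0 - new_score
    return 100.0 * (baseline_error - new_error) / baseline_error


# Test-set means from Table 2, given as (No-KD, Best-KD).
examples = {
    "Dep. PTB - UAS": (96.79, 96.86),
    "CCG supertag. probe": (93.88, 95.2),
}

for task, (no_kd, best_kd) in examples.items():
    print(f"{task}: {error_reduction(no_kd, best_kd):.2f}%")
```

Under this assumption, the two rows above reproduce the reported 2.18% and 21.57%. Recomputing from the rounded means does not match every row exactly (Const. PTB - F1 gives about 7.53% against the reported 7.6%), which may reflect reductions computed before rounding or per seed.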