Table 2: Summarization results on Space. Best system (shown in boldface) significantly outperforms all comparison systems, except where underlined (p < 0.05; paired bootstrap resampling; Koehn, 2004). We exclude Oracle systems from comparisons as they access gold summaries at test time. RL_ASP is the Rouge-L of general summarizers against gold aspect summaries. AC and SC are shorthand for Aspect Coverage and Sentiment Consistency. Subscripts P and R refer to precision and recall, and F1 is their harmonic mean. MSE is mean squared error (lower is better).

| Space [General] | System | R1 | R2 | RL | RL_ASP | AC_P | AC_R | AC_F1 | SC_MSE |
|---|---|---|---|---|---|---|---|---|---|
| Best Review | Centroid_SENTI | 27.36 | 5.81 | 15.15 | 8.77 | .788 | .705 | .744 | .580 |
| | Centroid_BERT | 31.33 | 5.78 | 16.54 | 9.35 | .805 | .701 | .749 | .524 |
| | Oracle_SENTI | 32.14 | 7.52 | 17.43 | 9.29 | .817 | .699 | .753 | .455 |
| | Oracle_BERT | 33.21 | 8.33 | 18.02 | 9.67 | .823 | .777 | .799 | .401 |
| Extract | Random | 26.24 | 3.58 | 14.72 | 11.53 | .799 | .374 | .509 | .592 |
| | LexRank | 29.85 | 5.87 | 17.56 | 11.84 | .840 | .382 | .525 | .518 |
| | LexRank_SENTI | 30.56 | 4.75 | 17.19 | 12.11 | .820 | .441 | .574 | .572 |
| | LexRank_BERT | 31.41 | 5.05 | 18.12 | 13.29 | .823 | .380 | .520 | .500 |
| Abstract | Opinosis | 28.76 | 4.57 | 15.96 | 11.68 | .791 | .446 | .570 | .561 |
| | MeanSum | 34.95 | 7.49 | 19.92 | 14.52 | .845 | .477 | .610 | .479 |
| | Copycat | 36.66 | 8.87 | 20.90 | 14.15 | .840 | .566 | .676 | .446 |
| | QT | 38.66 | 10.22 | 21.90 | 14.26 | .843 | .689 | .758 | .430 |
| | w/o 2-step samp. | 37.82 | 9.13 | 20.10 | 13.88 | .833 | .680 | .748 | .439 |
| | Human Up. Bound | 49.80 | 18.80 | 29.19 | 34.58 | .829 | .862 | .845 | .264 |
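The caption relies on two calculations that are easy to sketch: the harmonic-mean F1 that combines the AC_P and AC_R columns, and the paired bootstrap resampling test (Koehn, 2004) used for the significance claims. The Python below is a minimal illustration under stated assumptions, not the authors' evaluation code. The function names `harmonic_f1` and `paired_bootstrap`, the sample count, and the per-example scores in the usage note are all hypothetical; the table reports only corpus-level averages.

```python
import random


def harmonic_f1(precision, recall):
    """Harmonic mean of precision and recall (0.0 when both are zero)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


def paired_bootstrap(scores_a, scores_b, num_samples=1000, seed=0):
    """Estimate a p-value for 'system A beats system B'.

    scores_a and scores_b are per-example scores on the same test items
    (hypothetical inputs here). Each bootstrap sample draws item indices
    with replacement and compares the two systems on that same draw.
    Returns the fraction of samples in which A does NOT beat B.
    """
    if len(scores_a) != len(scores_b):
        raise ValueError("both systems must be scored on the same items")
    rng = random.Random(seed)
    n = len(scores_a)
    wins = 0
    for _ in range(num_samples):
        idx = [rng.randrange(n) for _ in range(n)]
        total_a = sum(scores_a[i] for i in idx)
        total_b = sum(scores_b[i] for i in idx)
        if total_a > total_b:
            wins += 1
    return 1.0 - wins / num_samples
```

As a check against the table, `harmonic_f1(0.843, 0.689)` rounds to 0.758, the AC_F1 reported for QT. For the significance test, calling `paired_bootstrap` with made-up per-example scores in which system A is better on every item returns 0.0, while two identical score lists return 1.0; a difference would be called significant at the caption's threshold when the returned value is below 0.05.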