Summarization results on Space. The best system (shown in boldface) significantly outperforms all comparison systems, except where underlined (p < 0.05; paired bootstrap resampling; Koehn, 2004). We exclude Oracle systems from comparisons because they access gold summaries at test time. RL_ASP is the ROUGE-L of general summarizers against gold aspect summaries. AC and SC are shorthand for Aspect Coverage and Sentiment Consistency. Subscripts P and R refer to precision and recall, and F1 is their harmonic mean. MSE is mean squared error (lower is better).
| | Space [General] | R1 | R2 | RL | RL_ASP | AC_P | AC_R | AC_F1 | SC_MSE |
|---|---|---|---|---|---|---|---|---|---|
| Best Review | Centroid_SENTI | 27.36 | 5.81 | 15.15 | 8.77 | .788 | .705 | .744 | .580 |
| | Centroid_BERT | 31.33 | 5.78 | 16.54 | 9.35 | .805 | .701 | .749 | .524 |
| | Oracle_SENTI | 32.14 | 7.52 | 17.43 | 9.29 | .817 | .699 | .753 | .455 |
| | Oracle_BERT | 33.21 | 8.33 | 18.02 | 9.67 | .823 | .777 | .799 | .401 |
| Extract | Random | 26.24 | 3.58 | 14.72 | 11.53 | .799 | .374 | .509 | .592 |
| | LexRank | 29.85 | 5.87 | 17.56 | 11.84 | .840 | .382 | .525 | .518 |
| | LexRank_SENTI | 30.56 | 4.75 | 17.19 | 12.11 | .820 | .441 | .574 | .572 |
| | LexRank_BERT | 31.41 | 5.05 | 18.12 | 13.29 | .823 | .380 | .520 | .500 |
| Abstract | Opinosis | 28.76 | 4.57 | 15.96 | 11.68 | .791 | .446 | .570 | .561 |
| | MeanSum | 34.95 | 7.49 | 19.92 | 14.52 | .845 | .477 | .610 | .479 |
| | Copycat | 36.66 | 8.87 | 20.90 | 14.15 | .840 | .566 | .676 | .446 |
| | QT | 38.66 | 10.22 | 21.90 | 14.26 | .843 | .689 | .758 | .430 |
| | w/o 2-step samp. | 37.82 | 9.13 | 20.10 | 13.88 | .833 | .680 | .748 | .439 |
| | Human Up. Bound | 49.80 | 18.80 | 29.19 | 34.58 | .829 | .862 | .845 | .264 |
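
The caption defines F1 as the harmonic mean of precision and recall and cites paired bootstrap resampling for significance. The sketch below illustrates both under stated assumptions: `f1` is checked against AC values from the QT and Human Up. Bound rows, while `paired_bootstrap_p` is a simplified, one-sided variant of the test over per-example scores. The function names, the number of resamples, and the per-example score lists are hypothetical and are not taken from the evaluation behind the table.

```python
import random


def f1(precision, recall):
    """Harmonic mean of precision and recall."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


def paired_bootstrap_p(scores_a, scores_b, n_resamples=1000, seed=0):
    """Estimate how often system A fails to beat system B on resampled test sets.

    scores_a and scores_b hold per-example scores for the same examples,
    in the same order (hypothetical inputs; any per-example metric works).
    """
    if len(scores_a) != len(scores_b):
        raise ValueError("both systems must be scored on the same examples")
    rng = random.Random(seed)
    n = len(scores_a)
    failures = 0
    for _ in range(n_resamples):
        # Draw a test set of the same size, with replacement, and reuse
        # the same indices for both systems (this is the "paired" part).
        idx = [rng.randrange(n) for _ in range(n)]
        total_a = sum(scores_a[i] for i in idx)
        total_b = sum(scores_b[i] for i in idx)
        if total_a <= total_b:
            failures += 1
    return failures / n_resamples


# AC_P and AC_R from the QT row and the Human Up. Bound row of the table.
print(round(f1(0.843, 0.689), 3))  # 0.758
print(round(f1(0.829, 0.862), 3))  # 0.845
```

Under this sketch, a returned value below 0.05 would correspond to the significance threshold quoted in the caption; the exact resampling setup used for the table is not specified there.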