Results of length control on different specified length bins using the DUC-2002 data. Our CMDP framework consistently improves the ROUGE scores of PG and D.GPT2 (p < 0.04, approximate randomization test, for ROUGE-1 and ROUGE-L).
Method | Bin 1 R-1 | Bin 1 R-2 | Bin 1 R-L | Bin 4 R-1 | Bin 4 R-2 | Bin 4 R-L | Bin 7 R-1 | Bin 7 R-2 | Bin 7 R-L | Bin 10 R-1 | Bin 10 R-2 | Bin 10 R-L |
---|---|---|---|---|---|---|---|---|---|---|---|---|
ControlSum | 32.40 | 14.30 | 28.28 | 36.30 | 15.34 | 31.95 | 38.55 | 16.18 | 34.50 | 40.30 | 17.08 | 36.59 |
PG | 27.93 | 12.06 | 24.40 | 31.41 | 12.51 | 27.23 | 31.81 | 12.27 | 27.54 | 31.94 | 11.79 | 28.09 |
PG+CMDP | 35.30 | 17.00 | 31.98 | 37.88 | 17.59 | 34.27 | 39.85 | 18.46 | 36.17 | 40.73 | 17.11 | 37.30 |
D.GPT2 | 31.21 | 13.36 | 27.12 | 36.27 | 15.97 | 31.91 | 38.18 | 16.43 | 33.64 | 40.87 | 17.45 | 36.62 |
D.GPT2+CMDP | 33.09 | 13.48 | 29.74 | 38.41 | 16.55 | 34.59 | 39.65 | 16.77 | 35.79 | 42.05 | 17.77 | 38.35 |
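
The caption reports significance from an approximate randomization test. As a rough illustration of how such a test can be computed, the sketch below implements one common paired, two-sided variant on the mean score difference. The function name, the number of trials, the fixed seed, and the assumption that per-document ROUGE scores are available for both systems are hypothetical choices made for this sketch; the source does not describe the exact setup used.

```python
import random


def approximate_randomization_test(scores_a, scores_b, trials=10000, seed=0):
    """Paired, two-sided approximate randomization test on the mean difference.

    scores_a and scores_b are assumed to hold per-document scores (e.g. ROUGE-1)
    of two systems on the same documents, in the same order.
    Returns an estimated p-value.
    """
    if len(scores_a) != len(scores_b):
        raise ValueError("paired test needs equal-length score lists")
    if not scores_a:
        raise ValueError("score lists must not be empty")

    rng = random.Random(seed)
    n = len(scores_a)
    observed = abs(sum(scores_a) - sum(scores_b)) / n

    at_least_as_extreme = 0
    for _ in range(trials):
        diff = 0.0
        for a, b in zip(scores_a, scores_b):
            # Under the null hypothesis the system labels are exchangeable,
            # so swap each pair's labels with probability 0.5.
            if rng.random() < 0.5:
                a, b = b, a
            diff += a - b
        if abs(diff) / n >= observed:
            at_least_as_extreme += 1

    # Add-one smoothing keeps the estimate strictly positive.
    return (at_least_as_extreme + 1) / (trials + 1)
```

Calling the function with the per-document scores of a baseline and its CMDP variant would give a p-value comparable in spirit to the one quoted in the caption, although the exact figure depends on the number of trials and on the test statistic chosen.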