Table 4
Text summarization results. Shown are ROUGE F-{1,2,L} scores on the test split for the CNN / Daily Mail and the Science Daily datasets. Some settings differ from ours: lines 8–9 show results when training and testing on an anonymized data set, and lines 12–14 use reinforcement learning. The ROUGE scores have a 95% confidence interval within ±0.25 points absolute. For lines 2 and 7, the maximum number of decoder steps during testing is 100. In lines 15–18, L/dR stands for LEAD/decRUM. Replacing ReLU with tanh or removing the update gate in decRUM (line 17) yields a drop in ROUGE of 0.01/0.09/0.25 and 0.36/0.39/0.42 points absolute, respectively.
Line  Model                        ROUGE-1      ROUGE-2      ROUGE-L
1     LEAD (ours)                  36.89        15.92        33.65
2     decRUM 256 (ours)            37.07        16.17        34.07
3     allRUM 360 cov. (ours)       35.01        14.69        32.02
4     encRUM 360 cov. (ours)       36.34        15.24        33.16
5     decRUM 360 cov. (ours)       37.44        16.17        34.23
6     LEAD cov. (ours)             39.11        16.86        35.86
7     decRUM 256 cov. (ours)       39.54        16.92        36.21
8     (Nallapati et al., 2016)     35.46        13.30        32.65
9     (Nallapati et al., 2017)     39.60        16.20        35.30
10    (See et al., 2017)           36.44        15.66        33.42
11    (See et al., 2017) cov.      39.53        17.28        36.38
12    (Narayan et al., 2018)       40.0         18.20        36.60
13    (Celikyilmaz et al., 2018)   41.69        19.47        37.92
14    (Chen and Bansal, 2018)      41.20        18.18        38.79

L/dR ROUGE (on Science Daily)

Line  Model                        ROUGE-1      ROUGE-2      ROUGE-L
15    s2s                          68.83/65.56  61.43/57.24  65.75/62.03
16    sh2s                         56.63/56.13  45.24/44.50  51.75/51.19
17    s2t                          27.33/27.18  10.33/10.56  24.81/24.97
18    oods2s                       32.91/37.01  16.67/22.36  26.75/31.11
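
For readers unfamiliar with the metric, the ROUGE-1 and ROUGE-2 F-scores in the table are F-measures over clipped n-gram overlap between a generated summary and a reference summary. The following is a minimal sketch of that idea only, not the evaluation pipeline used for the table: the function names `ngrams` and `rouge_n_f1` are hypothetical, tokenization is assumed to be lowercased whitespace splitting, and the stemming and other preprocessing done by standard ROUGE tooling are omitted, so its output will not match the reported numbers exactly.

```python
from collections import Counter


def ngrams(tokens, n):
    """Count the n-grams (as tuples) in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def rouge_n_f1(candidate, reference, n=1):
    """F1 over clipped n-gram overlap (simplified ROUGE-N; assumption: whitespace tokens)."""
    cand = ngrams(candidate.lower().split(), n)
    ref = ngrams(reference.lower().split(), n)
    # Counter intersection keeps the minimum count per n-gram (clipping).
    overlap = sum((cand & ref).values())
    if overlap == 0:
        return 0.0
    precision = overlap / sum(cand.values())
    recall = overlap / sum(ref.values())
    return 2 * precision * recall / (precision + recall)


if __name__ == "__main__":
    # Scores in the table are on a 0-100 scale, hence the factor of 100.
    score = rouge_n_f1("the cat", "the cat sat on the mat", n=1)
    print(f"ROUGE-1 F1: {100 * score:.2f}")  # prints "ROUGE-1 F1: 50.00"
```

In the usage line, both candidate unigrams appear in the reference (precision 1.0), while only 2 of the 6 reference unigrams are covered (recall 1/3), which gives an F1 of 0.5. ROUGE-L is based on the longest common subsequence instead and is not covered by this sketch.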