Table 1: Comparison with prior work on the NIST Chinese–English translation task. The evaluation metric is tokenized case-insensitive BLEU. The first three rows are results reported in the respective prior papers. The first two baselines are the results we obtained by running the transformer (Vaswani et al., 2017) and the document transformer (Zhang et al., 2018) on the NIST dataset. The sent-reranker is a variant of our model in which sentences within a document are assumed to be independent. The backtranslation baseline is obtained by training the document transformer on additional synthetic parallel documents generated by backtranslation.
Method                Model                  Proposal  MT06   MT03   MT04   MT05   MT08
(Wang et al., 2017)   RNNsearch              –         37.76  –      –      36.89  27.57
(Kuang et al., 2017)  Transformer + cache    –         48.14  48.05  47.91  48.53  38.38
(Zhang et al., 2018)  Doc-transformer        –         49.69  50.21  49.73  49.46  39.69
Baseline              Sent-transformer       –         47.72  47.21  49.08  46.86  40.18
Baseline              Doc-transformer (q)    –         49.79  49.29  50.17  48.99  41.70
Baseline              Backtranslation (q′)   –         50.77  51.80  51.61  51.81  42.47
Baseline              Sent-reranker          q         51.33  52.23  52.36  51.63  43.63
This work             Doc-reranker           q         51.99  52.77  52.84  51.84  44.17
This work             Doc-reranker           q′        53.63  54.51  54.23  54.86  45.17
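The scores above are tokenized, case-insensitive BLEU. As a minimal sketch of how such a score can be computed (an assumption: the paper does not specify its scoring script here; sacreBLEU with its internal tokenizer disabled and lowercasing enabled is one common way to reproduce tokenized, case-insensitive BLEU, and the toy sentences below are purely illustrative):

```python
import sacrebleu

# Toy, purely illustrative data: one system output and two reference streams
# (the NIST test sets actually provide four references per segment).
hypotheses = ["the cat sat on the mat ."]
ref_stream_1 = ["the cat is sitting on the mat ."]
ref_stream_2 = ["a cat sat on the mat ."]

# Tokenized, case-insensitive BLEU: the inputs are assumed to be pre-tokenized,
# so sacreBLEU's own tokenizer is disabled and both sides are lowercased.
bleu = sacrebleu.corpus_bleu(
    hypotheses,
    [ref_stream_1, ref_stream_2],
    tokenize="none",
    lowercase=True,
)
print(f"BLEU = {bleu.score:.2f}")
```

Scoring a full NIST test set would follow the same pattern, passing the system outputs for all segments and one list per reference stream.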