Table 2:
Results on the MTNT test sets using $P^{S}_{R1-R2}$ synthesized by our approaches. “zero-shot NMT” is the NMT system used for synthesizing $P^{S}_{R1-R2}$. “FT on $P^{S}_{R1-R2}$” denotes configurations in which we sampled 100k sentence pairs from $P^{S}_{R1-R2}$ to fine-tune the vanilla NMT system. The last row is given for reference: the vanilla NMT system fine-tuned on the official MTNT training parallel data. “*” denotes systems significantly better than the “FT on SNI” system, with a p-value < 0.05.
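| System | BLEU (fr→en) | BLEU (en→fr) | BLEU (ja→en) | chrF (en→ja) |
|---|---|---|---|---|
| zero-shot NMT | 21.4 | 22.4 | 3.0 | 0.126 |
| vanilla | 21.6 | 21.7 | 8.1 | 0.174 |
| FT on SNI | 23.1 | 22.3 | 8.2 | 0.164 |
| **#1: $P^{S}_{R1-R2}$ synthesized from $P_{L1-L2}$** | | | | |
| FT on $P^{S}_{R1-R2}$ | 22.0 | 24.2* | 9.0* | 0.174 |
| + $P^{S}_{R1-R2}$ | 23.1 | 24.7* | 9.5* | 0.180* |
| **#2: $P^{S}_{R1-R2}$ synthesized from $M_{R2}$ monolingual data** | | | | |
| FT on $P^{S}_{R1-R2}$ | 26.5* | 26.2* | 9.1* | 0.202* |
| + $P^{S}_{R1-R2}$ | 29.3* | 26.8* | 10.0* | 0.212* |
| **$P^{S}_{R1-R2}$ synthesized by #1 and #2** | | | | |
| + #1 + #2 | 29.0* | 27.2* | 10.4* | 0.213* |
| **With the Reddit training parallel data from MTNT** | | | | |
| FT on MTNT | 29.0* | 27.5* | 9.9* | 0.192* |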
