Term usage percentage (Term%) and BLEU scores of En-De models on terminology test sets (Dinu et al., 2019) provided with correct terminology entries (exact matches on both source and target sides). EDITOR with soft constraints achieves higher BLEU than LevT with soft constraints, and on par or higher BLEU than LevT with hard constraints.
. | Wiktionary . | IATE . | ||
---|---|---|---|---|
. | Term%↑ . | BLEU↑ . | Term%↑ . | BLEU↑ . |
Prior Results | ||||
Base Trans. | 76.9 | 26.0 | 76.3 | 25.8 |
Post18 | 99.5 | 25.8 | 82.0 | 25.3 |
Dinu19 | 93.4 | 26.3 | 94.5 | 26.0 |
Base LevT | 81.1 | 30.2 | 80.3 | 29.0 |
Susanto20 | 100.0 | 31.2 | 100.0 | 30.1 |
Our Results | ||||
LevT | 84.3 | 28.2 | 83.9 | 27.9 |
+ soft constraints | 90.5 | 28.5 | 92.5 | 28.3 |
+ hard constraints | 100.0 | 28.8 | 100.0 | 28.9 |
EDITOR | 83.5 | 28.8 | 83.0 | 27.9 |
+ soft constraints | 96.8 | 29.3 | 97.1 | 28.8 |
+ hard constraints | 99.8 | 29.3 | 100.0 | 28.9 |
. | Wiktionary . | IATE . | ||
---|---|---|---|---|
. | Term%↑ . | BLEU↑ . | Term%↑ . | BLEU↑ . |
Prior Results | ||||
Base Trans. | 76.9 | 26.0 | 76.3 | 25.8 |
Post18 | 99.5 | 25.8 | 82.0 | 25.3 |
Dinu19 | 93.4 | 26.3 | 94.5 | 26.0 |
Base LevT | 81.1 | 30.2 | 80.3 | 29.0 |
Susanto20 | 100.0 | 31.2 | 100.0 | 30.1 |
Our Results | ||||
LevT | 84.3 | 28.2 | 83.9 | 27.9 |
+ soft constraints | 90.5 | 28.5 | 92.5 | 28.3 |
+ hard constraints | 100.0 | 28.8 | 100.0 | 28.9 |
EDITOR | 83.5 | 28.8 | 83.0 | 27.9 |
+ soft constraints | 96.8 | 29.3 | 97.1 | 28.8 |
+ hard constraints | 99.8 | 29.3 | 100.0 | 28.9 |