Table 2: Comparison of Various Evaluation Benchmarks. We compare Flores-101 to a variety of popular, existing translation benchmarks, indicating language coverage, topic diversity, whether many-to-many translation is supported, whether the translations are manually aligned by humans, and whether the tasks of document-level translation or multimodal translation are supported.

| | # Languages | Diverse Topics | Many to Many | Manual Alignments | Document Level | Multimodal |
| --- | --- | --- | --- | --- | --- | --- |
| FLORES v1 (Guzmán et al., 2019) | 2 | ✓ | ✗ | ✓ | ✗ | ✗ |
| AmericasNLI (Ebrahimi et al., 2021) | 10 | ✓ | ✓ | ✓ | ✗ | ✗ |
| ALT (Riza et al., 2016) | 13 | ✓ | ✓ | ✓ | ✗ | ✗ |
| Europarl (Koehn, 2005) | 21 | ✗ | ✓ | ✗ | ✓ | ✗ |
| TICO-19 (Anastasopoulos et al., 2020) | 36 | ✗ | ✓ | ✓ | ✗ | ✗ |
| OPUS-100 (Zhang et al., 2020) | 100 | ✓ | ✓ | ✗ | ✗ | ✗ |
| M2M (Fan et al., 2020) | 100 | ✗ | ✓ | ✗ | ✗ | ✗ |
| Flores-101 | 101 | ✓ | ✓ | ✓ | ✓ | ✓ |