Evaluation of Persian-only (top), English-only (middle), and Persian+English (bottom) models on Persian tasks. Best baseline scores are indicated in bold.

| Setup | Model | Reading Comprehension (all) | Multiple-Choice Question Answering (literature) | Multiple-Choice Question Answering (com-know) | Multiple-Choice Question Answering (math & logic) | Textual Entailment (natural) | Textual Entailment (mnli) | Question Paraphrasing (natural) | Question Paraphrasing (qqp) |
|---|---|---|---|---|---|---|---|---|---|
| trained on Persian | mBERT (base) | 49.0 | 30.1 | 28.7 | 33.8 | 48.7 | 51.6 | 80.4 | 75.3 |
|  | WikiBERT (base) | 39.2 | 36.9 | 30.2 | 34.1 | 52.8 | 52.6 | 80.0 | 75.5 |
|  | ParsBERT (base) | 40.7 | 33.4 | 28.6 | 32.5 | 51.8 | 53.9 | 79.4 | 72.0 |
|  | mT5 (small) | 30.9 | 33.7 | 23.7 | 39.1 | 51.9 | 51.0 | 75.2 | 72.0 |
|  | mT5 (base) | 42.6 | 34.0 | 24.0 | 36.9 | 57.8 | 59.9 | 79.1 | 75.1 |
|  | mT5 (large) | 49.2 | 32.6 | 27.1 | 38.9 | 69.1 | 71.6 | 84.6 | 76.6 |
|  | mT5 (XL) | 70.4 | 33.7 | 27.7 | 38.9 | 77.2 | 74.5 | 88.6 | 80.3 |
| trained on English | mT5 (small) | 33.0 | 20.9 | 25.7 | 28.9 | 45.1 | 55.6 | 73.5 | 75.1 |
|  | mT5 (base) | 53.4 | 23.4 | 23.4 | 24.3 | 44.4 | 43.3 | 83.2 | 81.8 |
|  | mT5 (large) | 67.4 | 27.4 | 33.1 | 25.4 | 46.5 | 54.9 | 88.1 | 86.6 |
|  | mT5 (XL) | 68.2 | 28.3 | 38.6 | 22.0 | 66.2 | 77.8 | 89.2 | 87.0 |
| trained on Per + Eng | mT5 (small) | 45.3 | 30.9 | 24.9 | 36.6 | 53.3 | 56.2 | 77.9 | 71.3 |
|  | mT5 (base) | 63.9 | 32.3 | 24.0 | 37.7 | 57.8 | 63.9 | 80.2 | 73.4 |
|  | mT5 (large) | 73.6 | 30.6 | 28.9 | 38.6 | 70.9 | 72.5 | 85.3 | 78.9 |
|  | mT5 (XL) | 74.7 | 38.0 | 33.7 | 38.0 | 75.5 | 78.7 | 88.2 | 80.3 |
| Human |  | 86.2 | 80.0 | 85.0 | 85.0 | 87.1 | 90.2 | 92.3 | 88.4 |

| Setup | Model | Sentiment, sentence sent. (food) | Sentiment, sentence sent. (movies) | Sentiment, aspect ext. (food) | Sentiment, aspect ext. (movies) | Sentiment, aspect sent. (food) | Sentiment, aspect sent. (movies) | MT Eng → Per (quran) | MT Eng → Per (bible) | MT Eng → Per (qqp) | MT Eng → Per (mizan) | MT Per → Eng (quran) | MT Per → Eng (bible) | MT Per → Eng (qqp) | MT Per → Eng (mizan) |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| trained on our data | mBERT (base) | 55.2 | 48.6 | 87.1 | 73.2 | 53.9 | 34.7 | – | – | – | – | – | – | – | – |
|  | WikiBERT (base) | 52.0 | 58.5 | 91.9 | 78.0 | 56.5 | 41.6 | – | – | – | – | – | – | – | – |
|  | ParsBERT (base) | 59.1 | 56.8 | 91.1 | 76.8 | 53.9 | 37.6 | – | – | – | – | – | – | – | – |
|  | mT5 (small) | 54.6 | 49.4 | 86.4 | 78.6 | 52.4 | 40.6 | 10.2 | 2.1 | 22.2 | 8.4 | 20.6 | 2.5 | 22.9 | 14.6 |
|  | mT5 (base) | 56.6 | 52.9 | 88.6 | 80.5 | 52.9 | 46.5 | 11.4 | 2.1 | 27.3 | 9.4 | 22.8 | 2.5 | 34.6 | 14.9 |
|  | mT5 (large) | 62.9 | 72.5 | 92.2 | 85.0 | 58.1 | 53.5 | 11.9 | 2.1 | 24.8 | 10.6 | 24.7 | 2.4 | 35.1 | 16.4 |
|  | mT5 (XL) | 63.1 | 70.6 | 92.0 | 85.8 | 58.9 | 54.5 | 13.5 | 2.2 | 20.0 | 11.0 | 30.0 | 2.6 | 33.7 | 19.3 |
| trained on English | mT5 (small) | – | – | – | – | – | – | – | – | – | – | 6.6 | 1.9 | 7.7 | 3.7 |
|  | mT5 (base) | – | – | – | – | – | – | – | – | – | – | 11.5 | 2.1 | 14.0 | 5.7 |
|  | mT5 (large) | – | – | – | – | – | – | – | – | – | – | 20.2 | 2.3 | 21.0 | 7.4 |
|  | mT5 (XL) | – | – | – | – | – | – | – | – | – | – | 25.6 | 2.3 | 30.7 | 9.7 |
| trained on Per + Eng | mT5 (small) | – | – | – | – | – | – | – | – | – | – | 19.2 | 2.5 | 25.6 | 12.1 |
|  | mT5 (base) | – | – | – | – | – | – | – | – | – | – | 24.1 | 2.4 | 36.0 | 14.8 |
|  | mT5 (large) | – | – | – | – | – | – | – | – | – | – | 29.9 | 2.6 | 36.5 | 18.1 |
|  | mT5 (XL) | – | – | – | – | – | – | – | – | – | – | 33.4 | 2.6 | 41.0 | 18.2 |
| Human |  | 88.4 | 90.3 | 93.1 | 91.6 | 71.0 | 61.6 | – | – | – | – | – | – | – | – |