Exact Match scores for the highest-accuracy RePAQ configurations compared with recent state-of-the-art systems. The highest score in each column is shown in bold; the highest score from a non-retrieve-and-read model is underlined.
# | Model Type | Model | NaturalQuestions | TriviaQA |
---|---|---|---|---|
1 | Closed-book | T5-11B-SSM (Roberts et al., 2020) | 35.2 | 51.8 |
2 | Closed-book | BART-large (Lewis et al., 2021) | 26.5 | 26.7 |
3 | QA-pair retriever | Dense retriever (Lewis et al., 2021) | 26.7 | 28.9 |
4 | Open-book, retrieve-and-read | RAG-Sequence (Lewis et al., 2020b) | 44.5 | 56.8 |
5 | Open-book, retrieve-and-read | FiD-large, 100 docs (Izacard and Grave, 2021) | 51.4 | **67.6** |
6 | Open-book, phrase index | DensePhrases (Lee et al., 2021) | 40.9 | 50.7 |
7 | Closed-book | BART-large, pre-finetuned on PAQ | 32.7 | 33.2 |
8 | QA-pair retriever | RePAQ (retriever only) | 41.2 | 38.8 |
9 | QA-pair retriever | RePAQ (with reranker) | 47.7 | 50.7 |
10 | QA-pair retriever | RePAQ-multitask (retriever only) | 41.7 | 41.3 |
11 | QA-pair retriever | RePAQ-multitask (with reranker) | 47.6 | 52.1 |
12 | QA-pair retriever | RePAQ-multitask w/ FiD-Large Backoff | **52.3** | 67.3 |