Model | Question Rep. | Dev EM | Dev F1 | Test EM | Test F1
---|---|---|---|---|---
Human | - | 40.2 | 70.1 | 40.3 | 70.0
GPT-3 | - | 12.4 | 33.4 | 10.4 | 31.8
BM25 + DPR Reader | Original | 7.1 | 12.8 | 7.2 | 13.0
BM25 + DPR Reader | AllHistory | 13.6 | 25.0 | 13.8 | 25.2
BM25 + DPR Reader | Rewrites | 15.4 | 32.5 | 15.7 | 31.7
BM25 + FiD | Original | 10.1 | 21.8 | 10.5 | 22.6
BM25 + FiD | AllHistory | 24.1 | 37.2 | 23.4 | 36.1
BM25 + FiD | Rewrites | 24.0 | 41.6 | 24.9 | 41.4
DPR Retriever + DPR Reader | Original | 4.9 | 14.9 | 4.3 | 14.9
DPR Retriever + DPR Reader | AllHistory | 21.0 | 43.4 | 19.4 | 41.1
DPR Retriever + DPR Reader | Rewrites | 17.2 | 36.4 | 16.5 | 35.2
DPR Retriever + FiD | Original | 7.9 | 21.6 | 7.8 | 21.4
DPR Retriever + FiD | AllHistory | 33.0 | 55.3 | 33.4 | 55.8
DPR Retriever + FiD | Rewrites | 23.5 | 44.2 | 24.0 | 44.7