Table 4:

Retrieval results by “slice” on the NQ validation set. For each slice, the table reports the number of queries (Size), the Success@1 (S@1) of ColBERT-QA3 (C3) on NQ, and the improvement of this model over three baselines: ColBERT-QA1 (C1), DPR, and BM25. Each row contains an example query for which C3 retrieves a correct passage and all three baselines fail. Rows below “all” are sorted by increasing delta over BM25.

                    S@1   Delta of C3 over
Slice        Size   C3    C1    DPR   BM25   Example (where C3 outperforms all baselines)
all          8757   53    +5    +8    +30    who sang i won’t give up on you
misc.        1144   45    +2    +7    +21    poems that use the first letter of a word
what         1116   47    +5    +9    +25    what’s the major league baseball record for games won in a row
who          3651   59    +6    +9    +33    who wrote the song ruby sung by kenny rogers
where        712    53    +4    +5    +33    where does the united states keep an emergency stockpile of oil quizlet
when         1818   50    +7    +6    +34    when did wales last win the 6 nations
superlative  628    52    +7    +20   +35    who formed the highest social class of republican and early imperial rome
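
As a rough illustration of how the quantities in Table 4 relate to one another, the sketch below computes the size, S@1, and per-baseline deltas for each slice from per-query top-1 outcomes. The function names, the input layout, and the toy data are assumptions made for illustration and are not taken from the paper; in particular, how queries are assigned to slices is treated as a given input, and slices are allowed to overlap.

```python
def success_at_1(hits):
    """Percentage of queries whose top-ranked passage is correct."""
    return 100.0 * sum(hits) / len(hits)


def slice_report(slices, hits_by_model, target="C3"):
    """Per slice: size, S@1 of the target model, and its delta over each baseline.

    slices: dict mapping slice name -> list of query indices (may overlap)
    hits_by_model: dict mapping model name -> list of booleans, one per query,
        True when that model's top-1 passage is correct
    """
    report = {}
    for name, idxs in slices.items():
        # S@1 of every model, restricted to the queries in this slice.
        scores = {
            model: success_at_1([hits[i] for i in idxs])
            for model, hits in hits_by_model.items()
        }
        report[name] = {
            "size": len(idxs),
            "s@1": scores[target],
            "delta": {
                model: scores[target] - score
                for model, score in scores.items()
                if model != target
            },
        }
    return report


# Toy data, made up for illustration (not results from the paper):
# four queries, one target model and one baseline.
hits = {
    "C3": [True, True, False, True],
    "BM25": [True, False, False, False],
}
slices = {"all": [0, 1, 2, 3], "who": [0, 1]}
report = slice_report(slices, hits)
```

Sorting the resulting rows by `report[name]["delta"]["BM25"]` would give an ordering analogous to the one used for the rows below “all”.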
