Statistics on the native training data sets used in the key experiments. For the WikiNYT corpus, training sizes indicate the number of relevant training examples (articles/prepositions/verbs) in millions. (The entire WikiNYT corpus contains more examples; the table shows the sizes used in the key experiments.) Training sizes for the Web1T corpus are approximate and are based on the corpus size (10^12 words).
Data set | Training size (articles) | Training size (prepositions) | Training size (verb agreement)
---|---|---|---
WikiNYT | 3M | 5M | 1.8M
Web1T | 40,000M | 20,000M | 25,000M