Table 6

Statistics on the native training data sets used in the key experiments. For the WikiNYT corpus, training sizes indicate the number of relevant training examples (articles/prepositions/verbs) in millions. (The entire WikiNYT corpus contains more examples; the table shows the sizes used in the key experiments.) Training sizes for the Web1T corpus are approximate and are based on the corpus size (10^12 words).

Data set   Training sizes
           Articles   Prepositions   Verb agreement
WikiNYT    3M         5M             1.8M
Web1T      40,000M    20,000M        25,000M