Table 1: Statistics of the dataset. Numbers of tokens are after BPE processing.
Dataset     #Sent.   #Tok. (EN)   #Tok. (DE)
NC-v11      226K     6.4M         7.3M
Full        4.17M    109M         118M
News2013    3000     84.7K        95.6K
News2016    2999     88.1K        98.8K