Table 2

Summed absolute correlations averaged over five runs on the Bible corpus for three “scripts”: Chinese, trigram English, and Korean syllables. Results are given for two definitions of “document.” Also shown are the number of distinct characters in the Bible for each “script.”

| Language | Doc. = 1 ch. | Doc. = 6 ch. | # distinct chars. |
| --- | --- | --- | --- |
| Chinese | 5,828 | 14,859 | 3,177 |
| 3-gram English | 6,386 | 15,116 | 3,194 |
| Korean | 8,508 | 18,200 | 1,249 |