Corpus-based measures of morphology defined for this study. These measures are calculated on tokenized data sets before applying any segmentation method.
Measure | Definition |
---|---|
Types | Number of unique word tokens |
TTR | Number of unique word tokens divided by total number of word tokens |
MATTR | Average TTR calculated over a moving window of 500 word tokens |
MLW | Average number of characters per word token |
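The four measures in the table can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the study's own implementation: the function names (`type_count`, `ttr`, `mattr`, `mlw`) are hypothetical, the input is assumed to be an already tokenized list of strings, the moving window is assumed to advance one token at a time, texts shorter than the window are assumed to fall back to plain TTR, and "characters" are counted as Unicode code points.

```python
from collections import Counter


def type_count(tokens):
    """Types: number of unique word tokens."""
    return len(set(tokens))


def ttr(tokens):
    """TTR: unique word tokens divided by total word tokens."""
    return len(set(tokens)) / len(tokens) if tokens else 0.0


def mattr(tokens, window=500):
    """MATTR: mean TTR over every window of `window` consecutive tokens.

    Assumption: the window slides by one token; if the text is no longer
    than the window, the plain TTR of the whole text is returned.
    """
    n = len(tokens)
    if n == 0:
        return 0.0
    if n <= window:
        return ttr(tokens)

    # Counts of each word form inside the current window.
    counts = Counter(tokens[:window])
    total = len(counts) / window

    for i in range(window, n):
        # Drop the token leaving the window on the left.
        leaving = tokens[i - window]
        counts[leaving] -= 1
        if counts[leaving] == 0:
            del counts[leaving]
        # Add the token entering the window on the right.
        counts[tokens[i]] += 1
        total += len(counts) / window

    return total / (n - window + 1)


def mlw(tokens):
    """MLW: average number of characters per word token."""
    return sum(len(t) for t in tokens) / len(tokens) if tokens else 0.0


if __name__ == "__main__":
    # Whitespace splitting stands in for real tokenization here.
    sample = "the cat sat on the mat".split()
    print(type_count(sample), ttr(sample), mattr(sample, window=3), mlw(sample))
```

Keeping a running `Counter` for the window avoids rebuilding a set for every window position, which matters for large corpora with the 500-token window given in the table.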