Table 2 

Impact of global term selection (GTS) criteria on the different text types in the training set (80% of the corpus).


Text type   Total # of terms   # of terms selected in GTS   % of terms removed in GTS
unigram              142,396                       58,423                       58.97
bigram             3,119,422                    1,115,170                       64.25
stanford           7,430,397                    1,618,478                       78.22
AEGIR              5,096,918                    1,312,715                       74.24
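
The last column can be recomputed from the two count columns. The short Python sketch below assumes that the percentage removed is defined as (1 - selected / total) x 100; this definition is inferred from the figures in the table and is not stated in the caption. The helper name `pct_removed` and the dictionary `table` are hypothetical names introduced for illustration.

```python
# Counts copied from Table 2: (total terms, terms selected in GTS).
table = {
    "unigram": (142_396, 58_423),
    "bigram": (3_119_422, 1_115_170),
    "stanford": (7_430_397, 1_618_478),
    "AEGIR": (5_096_918, 1_312_715),
}


def pct_removed(total: int, selected: int) -> float:
    """Percentage of terms discarded, assuming removed = total - selected."""
    return (1 - selected / total) * 100


if __name__ == "__main__":
    for text_type, (total, selected) in table.items():
        # Round to two decimals, matching the precision used in the table.
        print(f"{text_type}: {pct_removed(total, selected):.2f}")
```

Under that assumption, the recomputed values agree with the reported percentages to two decimal places.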
