Descriptive statistics for the Sentiment Classification data sets. r(adj) denotes the ratio of Adjectives to Non-adjectives in an example. θdomain is the mean probability of the topic that is most observed in that domain, which will also serve as our treated topic. b,d,e,k,m are abbreviations for Books, DVD, Electronics, Kitchen, and Movies.
Domain . | Min. r(adj) . | Med. # r(adj) . | Max. # r(adj) . | σ of # r(adj) . | θb . | θd . | θe . | θk . | θm . |
---|---|---|---|---|---|---|---|---|---|
Books | 0.0 | 0.135 | 0.444 | 0.042 | 0.311 | 0.011 | 0.052 | 0.026 | 0.014 |
DVD | 0.0 | 0.138 | 0.425 | 0.042 | 0.014 | 0.045 | 0.045 | 0.016 | 0.225 |
Electronics | 0.0 | 0.136 | 0.461 | 0.049 | 0.010 | 0.065 | 0.080 | 0.039 | 0.003 |
Kitchen | 0.0 | 0.142 | 0.500 | 0.052 | 0.007 | 0.039 | 0.075 | 0.066 | 0.002 |
Movies | 0.0 | 0.138 | 0.666 | 0.0333 | 0.010 | 0.007 | 0.045 | 0.016 | 0.281 |
Domain . | Min. r(adj) . | Med. # r(adj) . | Max. # r(adj) . | σ of # r(adj) . | θb . | θd . | θe . | θk . | θm . |
---|---|---|---|---|---|---|---|---|---|
Books | 0.0 | 0.135 | 0.444 | 0.042 | 0.311 | 0.011 | 0.052 | 0.026 | 0.014 |
DVD | 0.0 | 0.138 | 0.425 | 0.042 | 0.014 | 0.045 | 0.045 | 0.016 | 0.225 |
Electronics | 0.0 | 0.136 | 0.461 | 0.049 | 0.010 | 0.065 | 0.080 | 0.039 | 0.003 |
Kitchen | 0.0 | 0.142 | 0.500 | 0.052 | 0.007 | 0.039 | 0.075 | 0.066 | 0.002 |
Movies | 0.0 | 0.138 | 0.666 | 0.0333 | 0.010 | 0.007 | 0.045 | 0.016 | 0.281 |