Point-biserial correlation between ground truth and similarity methods, multiplied by 1,000
Language | VSM, cosine | VSM, L1 dist. | trigram | LSA 1000 | fastText | RI 1000 | RRI 1000 | Always 0 |
---|---|---|---|---|---|---|---|---|
de | 0.599 | 0.457 | 0.534 | 0.522 | 0.526 | 0.548 | 0.528 | 0.750 |
en | 0.755 | 0.299 | 0.559 | 0.591 | 0.549 | 0.586 | 0.533 | 1.039 |
Combined | 0.653 | 0.402 | 0.543 | 0.546 | 0.534 | 0.561 | 0.530 | 0.850 |
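
For reference, the point-biserial correlation named in the caption is the Pearson correlation between a binary variable and a continuous one. The following is a minimal sketch, assuming the ground truth is coded as 0/1 and each method yields one similarity score per item pair. The function name `point_biserial` and the sample data are hypothetical and are not taken from the source; the sketch does not reproduce the scaling or any other post-processing behind the table values.

```python
import math


def point_biserial(labels, scores):
    """Point-biserial correlation between binary labels (0/1) and scores.

    Hypothetical helper for illustration; equivalent to Pearson's r
    computed on the 0/1 labels and the continuous scores.
    """
    if len(labels) != len(scores):
        raise ValueError("labels and scores must have the same length")

    # Split the scores by ground-truth class.
    group1 = [s for l, s in zip(labels, scores) if l == 1]
    group0 = [s for l, s in zip(labels, scores) if l == 0]
    n1, n0, n = len(group1), len(group0), len(scores)
    if n1 == 0 or n0 == 0:
        raise ValueError("both label classes must be present")

    mean1 = sum(group1) / n1
    mean0 = sum(group0) / n0
    mean_all = sum(scores) / n

    # Population standard deviation of all scores (divides by n).
    std = math.sqrt(sum((s - mean_all) ** 2 for s in scores) / n)
    if std == 0:
        raise ValueError("scores are constant; correlation is undefined")

    # r_pb = (M1 - M0) / s_n * sqrt(n1 * n0 / n^2)
    return (mean1 - mean0) / std * math.sqrt(n1 * n0 / n ** 2)


# Made-up example: two dissimilar pairs (0) and two similar pairs (1).
example_labels = [0, 0, 1, 1]
example_scores = [0.1, 0.2, 0.8, 0.9]
r = point_biserial(example_labels, example_scores)
```

The sketch raises an error when the scores are constant, because the standard deviation in the denominator is then zero and the coefficient is undefined.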