Pearson correlation r and Spearman’s rank correlation ρ on the Argument Facet Similarity (AFS) dataset. ♯ marks results that we reproduced ourselves; § marks results taken from Reimers and Gurevych (2019); Surrogate denotes our proposed method.
Model | Pearson r | Spearman ρ |
---|---|---|
Unsupervised Setting | ||
Avg. Glove embeddings♯ | 32.40 | 34.00 |
Avg. Skip-Thought embeddings♯ | 22.34 | 23.24 |
InferSent-Glove♯ | 24.83 | 25.83 |
Avg. BERT embeddings♯ | 29.15 | 31.45 |
BERT [CLS]♯ | 12.00 | 9.06 |
BERTScore♯ | 45.32 | 33.56 |
DPR♯ | 41.89 | 32.16 |
BLEURT♯ | 45.98 | 44.12 |
Universal Sent Encoder♯ | 44.28 | 43.47 |
Origin | 56.20 | 54.40 |
Surrogatebase | 53.00 | 52.50 |
Surrogatelarge | 54.50 | 54.70 |
Supervised Setting | ||
BERT [CLS]♯ | 35.28 | 36.24 |
BERTbase§ | 77.20 | 74.84 |
SBERTbase§ | 76.57 | 74.13 |
SRoBERTabase♯ | 77.26 | 74.89 |
Surrogatebase | 79.80 | 78.20 |
BERTlarge§ | 78.68 | 76.38 |
SBERTlarge§ | 77.85 | 75.93 |
SRoBERTalarge♯ | 79.03 | 76.92 |
Surrogatelarge | 81.00 | 80.50 |
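
For readers unfamiliar with the two metrics in the table, the following is a minimal sketch of how Pearson r and Spearman ρ can be computed between gold similarity labels and model scores. The `gold` and `pred` lists are hypothetical illustration data, not AFS data, and the scaling by 100 assumes the table reports correlations as percentages; this is not the evaluation script used for the reported numbers.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def average_ranks(values):
    """1-based ranks; tied values share the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to cover the whole group of tied values.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def spearman(x, y):
    """Spearman's rho: Pearson correlation computed on the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


if __name__ == "__main__":
    # Hypothetical gold labels and model similarity scores.
    gold = [0, 1, 2, 3, 4, 5]
    pred = [0.10, 0.05, 0.40, 0.35, 0.80, 0.95]
    print(f"Pearson r x 100:  {100 * pearson(gold, pred):.2f}")
    print(f"Spearman rho x 100: {100 * spearman(gold, pred):.2f}")
```

Pearson r measures linear agreement between the raw scores, while Spearman ρ depends only on their ordering, which is why a model can rank pairs well yet score lower on r (or the reverse).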