Table 2: Spearman rank correlation ρ between the cosine similarity of sentence representations and the gold labels for various Semantic Textual Similarity (STS) tasks under the unsupervised setting. We use *-NLI to denote models additionally trained on NLI datasets. ♯ indicates results reproduced by ourselves; § indicates results taken from Reimers and Gurevych (2019); Surrogate denotes results for our proposed method.

| Model | STS12 | STS13 | STS14 | STS15 | STS16 | STSb | SICK-R | Avg |
|---|---|---|---|---|---|---|---|---|
| *fully unsupervised without human labels* | | | | | | | | |
| Avg. GloVe embeddings§ | 55.14 | 70.66 | 59.73 | 68.25 | 63.66 | 58.02 | 53.76 | 61.32 |
| Avg. Skip-Thought embeddings§ | 57.11 | 71.98 | 61.30 | 70.13 | 65.21 | 59.42 | 55.50 | 62.95 |
| InferSent-GloVe | 52.86 | 66.75 | 62.15 | 72.77 | 66.87 | 68.03 | 65.65 | 65.01 |
| Avg. BERT embeddings§ | 38.78 | 57.98 | 57.98 | 63.15 | 61.06 | 46.35 | 58.40 | 54.81 |
| BERT [CLS] | 20.16 | 30.01 | 20.09 | 36.88 | 38.08 | 16.50 | 42.63 | 29.19 |
| BERTScore | 54.60 | 50.11 | 57.74 | 70.79 | 64.58 | 57.58 | 51.37 | 58.11 |
| DPR | 53.98 | 56.00 | 57.83 | 66.68 | 67.43 | 58.53 | 61.85 | 60.33 |
| BLEURT | 70.16 | 64.97 | 57.41 | 72.91 | 70.01 | 69.81 | 58.46 | 66.25 |
| Universal Sent Encoder | 64.49 | 67.80 | 64.61 | 76.83 | 73.18 | 74.92 | 76.69 | 71.22 |
| | | | | | | | | |
| Origin | 72.41 | 74.30 | 75.45 | 78.45 | 79.93 | 78.47 | 79.49 | 76.93 |
| Surrogate (base) | 70.62 | 72.14 | 72.72 | 76.34 | 75.24 | 74.19 | 77.20 | 74.06 |
| Surrogate (large) | 71.93 | 73.74 | 73.95 | 77.01 | 76.64 | 75.32 | 77.84 | 75.20 |
| | | | | | | | | |
| *partially supervised without human labels but not the same domain* | | | | | | | | |
| InferSent-NLI | 50.48 | 67.75 | 62.15 | 72.77 | 66.87 | 68.03 | 65.65 | 64.81 |
| BERT [CLS]-NLI | 60.35 | 54.97 | 64.92 | 71.49 | 70.49 | 73.25 | 70.79 | 66.61 |
| BERTScore-NLI | 60.89 | 54.64 | 63.96 | 74.35 | 66.67 | 65.65 | 66.01 | 64.60 |
| DPR-NLI | 61.36 | 56.71 | 65.49 | 71.80 | 71.03 | 74.08 | 70.86 | 67.33 |
| BLEURT-NLI | 66.40 | 68.15 | 71.98 | 79.69 | 77.86 | 77.98 | 70.92 | 73.28 |
| Universal Sent Encoder-NLI | 65.55 | 67.95 | 71.47 | 80.81 | 78.70 | 78.41 | 69.31 | 73.17 |
| | | | | | | | | |
| BERT-NLI (base) | 71.07 | 76.81 | 73.29 | 79.56 | 74.58 | 77.10 | 72.65 | 75.01 |
| SBERT-NLI (base)§ | 70.97 | 76.53 | 73.19 | 79.09 | 74.30 | 77.03 | 72.91 | 74.86 |
| SRoBERTa-NLI (base)§ | 71.54 | 72.49 | 70.80 | 78.74 | 73.69 | 77.77 | 74.46 | 74.21 |
| Surrogate-NLI (base) | 74.15 | 76.50 | 72.23 | 81.24 | 78.75 | 79.32 | 78.56 | 77.25 |
| | | | | | | | | |
| BERT-NLI (large) | 71.62 | 77.40 | 72.69 | 78.61 | 75.28 | 77.83 | 72.64 | 75.15 |
| SBERT-NLI (large)§ | 72.27 | 78.46 | 74.90 | 80.99 | 76.25 | 79.23 | 73.75 | 76.55 |
| SRoBERTa-NLI (large)§ | 74.53 | 77.00 | 73.18 | 81.85 | 76.82 | 79.10 | 74.29 | 76.68 |
| Surrogate-NLI (large) | 76.98 | 79.83 | 75.15 | 83.54 | 79.32 | 80.82 | 79.64 | 79.33 |
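
The metric named in the caption can be illustrated with a minimal sketch. It is not the authors' evaluation code. It assumes each sentence pair has already been encoded into two vectors, and that the table reports ρ × 100, which the magnitudes suggest. The function names (`cosine`, `average_ranks`, `spearman`, `sts_score`) and the toy data are hypothetical.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def average_ranks(values):
    """1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks

def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

def spearman(x, y):
    """Spearman's rho: Pearson correlation of the two rank vectors."""
    return pearson(average_ranks(x), average_ranks(y))

def sts_score(emb_a, emb_b, gold):
    """Spearman rho (x100) between per-pair cosine similarity and gold labels."""
    sims = [cosine(u, v) for u, v in zip(emb_a, emb_b)]
    return 100 * spearman(sims, gold)

# Toy data (made up for illustration, not taken from any STS dataset).
emb_a = [[1.0, 0.0], [1.0, 1.0], [0.0, 1.0], [1.0, 2.0]]
emb_b = [[1.0, 0.1], [1.0, 0.0], [1.0, 0.0], [-1.0, -2.0]]
gold = [4.8, 3.0, 1.2, 0.0]
score = sts_score(emb_a, emb_b, gold)
print(round(score, 2))
```

Because Spearman's ρ depends only on ranks, any monotone rescaling of the similarities leaves the score unchanged, so the cosine values need no calibration against the gold label range.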