Table 2: Correlation r and uncertainty prediction quality metrics (Cal, NLPD, and Shp) on three STS datasets (STS-B, EBMSASS, and MedSTS) and an SA rating dataset (Yelp), using SBERT and BERT sentence embeddings with various task-specific layers: Cosine similarity = cosine similarity between the vectors representing S1 and S2; LR = single-layer linear regression; Bayesian LR = Bayesian linear regression; and Sparse GP Regression = sparse Gaussian process regression. n/a indicates that the method does not produce an uncertainty estimate to which the given metric could be applied.

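| Model | STS-B test r | STS-B test Cal | STS-B test NLPD | STS-B test Shp | EBMSASS test r | EBMSASS test Cal | EBMSASS test NLPD | EBMSASS test Shp | MedSTS test r | MedSTS test Cal | MedSTS test NLPD | MedSTS test Shp | Yelp test r | Yelp test Cal | Yelp test NLPD | Yelp test Shp |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| SBERT Cosine similarity | 0.842 | n/a | n/a | n/a | 0.773 | n/a | n/a | n/a | 0.784 | n/a | n/a | n/a | — | n/a | n/a | n/a |
| SBERT LR | 0.835 | n/a | n/a | n/a | 0.743 | n/a | n/a | n/a | 0.776 | n/a | n/a | n/a | 0.666 | n/a | n/a | n/a |
| SBERT Bayesian LR | 0.810 | 0.046 | 0.648 | 1.632 | 0.688 | 0.443 | 1.095 | 2.156 | 0.740 | 0.101 | 0.801 | 2.092 | 0.671 | 0.019 | 0.447 | 0.753 |
| SBERT Sparse GP Regression | 0.847 | 0.065 | 0.614 | 1.621 | 0.788 | 0.195 | 0.541 | 1.627 | 0.781 | 0.073 | 0.499 | 1.453 | 0.689 | 0.049 | 0.573 | 1.507 |
| BERT LR | 0.868 | n/a | n/a | n/a | 0.914 | n/a | n/a | n/a | 0.858 | n/a | n/a | n/a | 0.826 | n/a | n/a | n/a |
| BERT ConvLR | 0.855 | n/a | n/a | n/a | 0.922 | n/a | n/a | n/a | 0.846 | n/a | n/a | n/a | 0.822 | n/a | n/a | n/a |
| BERT Bayesian LR (BBB) | 0.848 | 0.521 | + | 0.005 | 0.914 | 0.669 | 1177.2 | 0.005 | 0.848 | 0.514 | 6594.3 | 0.006 | 0.827 | 0.531 | 3908.6 | 0.083 |
| BERT Bayesian ConvLR (BBB) | 0.849 | 0.495 | 2061.0 | 0.015 | 0.898 | 0.618 | 327.3 | 0.010 | 0.835 | 0.506 | 1037.2 | 0.017 | 0.797 | 1.513 | 119.2 | 0.089 |
| BERT LR MC dropout | 0.868 | 0.181 | 4.659 | 0.215 | 0.921 | 0.054 | 0.036 | 0.140 | 0.859 | 0.163 | 4.118 | 0.168 | 0.827 | 0.267 | 7.285 | 0.153 |
| BERT ConvLR MC dropout | 0.855 | 0.202 | 5.830 | 0.209 | 0.922 | 0.093 | 2.137 | 0.085 | 0.852 | 0.219 | 6.402 | 0.146 | 0.823 | 0.291 | 8.214 | 0.150 |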
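
The caption for Table 2 names the four metrics but does not say how they are computed. As a rough illustration only, the sketch below assumes that each uncertainty-aware model outputs a Gaussian predictive mean and standard deviation per example. It further assumes that r is the Pearson correlation, Cal is a sum of squared gaps between expected and observed confidence levels, NLPD is the average negative log predictive density, and Shp is the mean predictive variance. These definitions, the choice of confidence levels, and all function names are assumptions made for illustration; they are not taken from the table and may differ from what was used to produce its numbers.

```python
import math


def pearson_r(y_true, y_pred):
    """Pearson correlation between gold scores and predicted means."""
    n = len(y_true)
    mean_t = sum(y_true) / n
    mean_p = sum(y_pred) / n
    cov = sum((t - mean_t) * (p - mean_p) for t, p in zip(y_true, y_pred))
    var_t = sum((t - mean_t) ** 2 for t in y_true)
    var_p = sum((p - mean_p) ** 2 for p in y_pred)
    return cov / math.sqrt(var_t * var_p)


def gaussian_nlpd(y_true, mu, sigma):
    """Average negative log density of y under N(mu, sigma^2) (assumed form)."""
    total = 0.0
    for y, m, s in zip(y_true, mu, sigma):
        total += 0.5 * math.log(2 * math.pi * s * s) + (y - m) ** 2 / (2 * s * s)
    return total / len(y_true)


def sharpness(sigma):
    """Mean predictive variance (assumed definition of Shp)."""
    return sum(s * s for s in sigma) / len(sigma)


def calibration_error(y_true, mu, sigma, levels=None):
    """Sum of squared gaps between expected and observed confidence levels.

    For each level p, the observed level is the fraction of examples whose
    Gaussian CDF value at the gold score is at most p (assumed definition).
    """
    if levels is None:
        levels = [j / 10 for j in range(1, 10)]  # hypothetical choice: 0.1 .. 0.9
    cdf_values = [
        0.5 * (1.0 + math.erf((y - m) / (s * math.sqrt(2))))
        for y, m, s in zip(y_true, mu, sigma)
    ]
    error = 0.0
    for p in levels:
        observed = sum(1 for c in cdf_values if c <= p) / len(cdf_values)
        error += (p - observed) ** 2
    return error
```

Under these assumed definitions, lower Cal and NLPD indicate better-calibrated uncertainty, and a smaller Shp indicates tighter predictive distributions. A very large NLPD paired with a very small Shp, as in the BBB rows, would be what one expects when a model reports near-zero variance while its errors are not near zero.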