Table 4: Zero-shot probe results for 3 unsupervised models and one LogMel baseline, each at 3 unit sizes (50, 100, 200). ABX within and across speakers, spot-the-word, and acceptability judgments are all error rates in % (lower is better); chance is 50%.

System       Nb units  S2u                    uLM
                       ABX with.↓  ABX acr.↓  spot-the-word↓  accept. judg.↓

Toplines
ASR+LM                 –           –          3.12            29.02

Baselines
LogMel       50        23.95       35.86      48.52           46.78
LogMel       100       24.33       37.86      48.12           46.83
LogMel       200       25.71       39.65      49.62           47.76

Unsupervised
CPC          50        5.50        7.20       32.18           45.43
CPC          100       5.09        6.55       31.72           44.35
CPC          200       5.18        6.83       37.40           45.19
HuBERT-L6    50        7.37        8.61       32.88           44.06
HuBERT-L6    100       6.00        7.41       31.30           42.94
HuBERT-L6    200       5.99        7.31       36.52           47.03
wav2vec-L14  50        22.30       24.56      51.92           45.75
wav2vec-L14  100       18.16       20.44      50.24           45.97
wav2vec-L14  200       16.59       18.69      44.68           45.70