Evaluation results for debiasing multiple protected attributes using FaRM. Both configurations of FaRM outperform AdS (Basu Roy Chowdhury et al., 2021) in guarding the protected attributes and against intersectional group biases.
Setup (Pan16) | Mention (y): F1↑ | Mention (y): MDL↓ | Age (g1): ΔF1↓ | Age (g1): MDL↑ | Fairness (g1): DP↓ | Fairness (g1): Gap_g^RMS↓ | Gender (g2): ΔF1↓ | Gender (g2): MDL↑ | Fairness (g2): DP↓ | Fairness (g2): Gap_g^RMS↓ | Inter. Groups (g1,g2): ΔF1↓ | Inter. Groups (g1,g2): MDL↑ |
---|---|---|---|---|---|---|---|---|---|---|---|---|
BERT-base (fine-tuned) | 88.6 | 6.8 | 14.9 | 196.4 | 0.06 | 0.009 | 16.5 | 192.0 | 0.04 | 0.014 | 20.7 | 117.2 |
AdS | 88.6 | 5.5 | 2.2 | 231.5 | 0.05 | 0.006 | 1.6 | 230.9 | 0.04 | 0.017 | 9.1 | 118.5 |
FaRM (N-partition) | 87.0 | 13.4 | 0.0 | 234.3 | 0.03 | 0.003 | 0.0 | 234.2 | 0.06 | 0.025 | 0.7 | 468.0 |
FaRM (1-partition) | 86.4 | 15.6 | 0.0 | 234.6 | 0.05 | 0.006 | 0.0 | 234.2 | 0.02 | 0.009 | 0.0 | 467.7 |
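
For readers unfamiliar with the two fairness columns, the sketch below shows one common way to compute a demographic parity (DP) score and a root-mean-square true-positive-rate gap for a binary protected attribute. The definitions are an assumption based on standard usage and are not confirmed by this table; the function names `demographic_parity` and `rms_tpr_gap` and the toy data in the usage note are hypothetical.

```python
import math


def demographic_parity(y_pred, groups):
    """Assumed definition: sum over labels y of
    |P(pred = y | g = a) - P(pred = y | g = b)| for a binary attribute g."""
    a, b = sorted(set(groups))  # requires exactly two group values
    labels = sorted(set(y_pred))

    def rate(g, y):
        # Fraction of group g's predictions that equal label y.
        members = [p for p, gg in zip(y_pred, groups) if gg == g]
        return sum(p == y for p in members) / len(members)

    return sum(abs(rate(a, y) - rate(b, y)) for y in labels)


def rms_tpr_gap(y_true, y_pred, groups):
    """Assumed definition: root mean square, over labels y, of the
    difference in true positive rate between the two groups."""
    a, b = sorted(set(groups))  # requires exactly two group values
    labels = sorted(set(y_true))

    def tpr(g, y):
        # Recall on label y restricted to group g; every (group, label)
        # pair must have at least one example or this divides by zero.
        hits = [p == y for t, p, gg in zip(y_true, y_pred, groups)
                if gg == g and t == y]
        return sum(hits) / len(hits)

    gaps = [tpr(a, y) - tpr(b, y) for y in labels]
    return math.sqrt(sum(d * d for d in gaps) / len(gaps))
```

With the toy inputs `y_true = [1, 1, 0, 0, 1, 1, 0, 0]`, `y_pred = [1, 1, 1, 0, 1, 0, 0, 0]`, and `groups = [0, 0, 0, 0, 1, 1, 1, 1]`, the first function returns 1.0 and the second returns 0.5. Lower values indicate more similar treatment of the two groups, matching the ↓ direction in the table.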