Evaluation results of unconstrained debiasing on the Dial dataset. We report the performance of the DeepMoji (Original), INLP (Ravfogel et al., 2020), and FaRM representations (↑: higher is better; ↓: lower is better). We observe that FaRM achieves the best fairness scores in all setups, while maintaining similar performance on the sentiment classification task.
| Metric | Method | 50% split | 60% split | 70% split | 80% split |
|---|---|---|---|---|---|
| Sentiment Acc. (↑) | Original | 75.5 | 75.5 | 74.4 | 71.9 |
| | INLP | 75.1 | 73.1 | 69.2 | 64.5 |
| | FaRM | 74.8 | 73.2 | 67.3 | 63.5 |
| Race Acc. (↓) | Original | 87.7 | 87.8 | 87.3 | 87.4 |
| | INLP | 69.5 | 82.2 | 80.3 | 69.9 |
| | FaRM | 54.2 | 69.9 | 69.0 | 52.1 |
| DP (↓) | Original | 0.26 | 0.44 | 0.63 | 0.81 |
| | INLP | 0.16 | 0.33 | 0.30 | 0.28 |
| | FaRM | 0.09 | 0.10 | 0.17 | 0.22 |
| $\text{Gap}_g^{\text{RMS}}$ (↓) | Original | 0.15 | 0.24 | 0.33 | 0.41 |
| | INLP | 0.12 | 0.18 | 0.16 | 0.16 |
| | FaRM | 0.09 | 0.10 | 0.12 | 0.14 |