Table 7

The TPR-GAP results for all versions (Balanced, Gentle, and Aggressive) for the three concepts where we have ground truth (Adjectives, Gender, and Race).

Experiment   Adjectives   Gender   Race
Balanced     0.057        0.003    0.002
Gentle       0.074        0.014    0.012
Aggressive   0.012        0.003    0.049