
Social norm bias: residual harms of fairness-aware algorithms


Abstract

Many modern machine learning algorithms mitigate bias by enforcing fairness constraints across coarsely-defined groups related to a sensitive attribute like gender or race. However, these algorithms seldom account for within-group heterogeneity and biases that may disproportionately affect some members of a group. In this work, we characterize Social Norm Bias (SNoB), a subtle but consequential type of algorithmic discrimination that may be exhibited by machine learning models, even when these systems achieve group fairness objectives. We study this issue through the lens of gender bias in occupation classification. We quantify SNoB by measuring how an algorithm’s predictions are associated with conformity to inferred gender norms. When predicting whether an individual belongs to a male-dominated occupation, this framework reveals that “fair” classifiers still favor biographies written in ways that align with inferred masculine norms. We compare SNoB across algorithmic fairness techniques and show that it frequently persists as a residual bias, and that post-processing approaches do not mitigate this type of bias at all.




Availability of data and material

Publicly available at http://aka.ms/biasbios.

Code Availability

Publicly available at http://bit.ly/snobcode.

Notes

  1. Twitter thread started by Dr. Timothy Verstynen: https://twitter.com/tdverstynen/status/1501386481415434245

  2. The dataset is publicly available at http://aka.ms/biasbios and licensed under the MIT License.

  3. Consistent with previous work (De-Arteaga et al. 2019), we used regular expressions to remove the following words from the data: he, she, her, his, him, hers, himself, herself, mr, mrs, ms, ph, dr.

  4. We compute p values for the two-sided test of zero correlation between \(p_c\) and \(r_c\) using SciPy’s spearmanr function (Virtanen et al. 2020). Values marked with \(^*\) and \(^{**}\) indicate that the p value is \( < 0.05\) and \(< 0.01\) respectively.

  5. CoCL (Romanov et al. 2019) is modulated by a hyperparameter \(\lambda \) that determines the strength of the fairness constraint. We use \(\lambda = 2\), which Romanov et al. (2019) finds to have the smallest Gap\(^{{\textsc {RMS}}}\) on the occupation classification task.

  6. We computed the p values using the fdrcorrection method from the statsmodels Python package (Seabold and Perktold 2010).
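The preprocessing described in Note 3 can be reproduced with a word-boundary regular expression. The snippet below is a minimal sketch of that step in Python; the pattern and the helper name `scrub` are our own assumptions, not the exact expression used by De-Arteaga et al. (2019).

```python
import re

# Words removed from each biography, as listed in Note 3.
GENDERED_WORDS = [
    "he", "she", "her", "his", "him", "hers", "himself", "herself",
    "mr", "mrs", "ms", "ph", "dr",
]

# Match any listed word as a whole token, case-insensitively.
_PATTERN = re.compile(r"\b(?:" + "|".join(GENDERED_WORDS) + r")\b",
                      flags=re.IGNORECASE)

def scrub(bio: str) -> str:
    """Delete the listed words and collapse the leftover whitespace."""
    return re.sub(r"\s+", " ", _PATTERN.sub(" ", bio)).strip()
```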

References

  • Adi Y, Kermany E, Belinkov Y, Lavi O, Goldberg Y (2017) Fine-grained analysis of sentence embeddings using auxiliary prediction tasks. In: 5th international conference on learning representations, ICLR 2017, Toulon, France, April 24-26, 2017, Conference Track Proceedings, OpenReview.net, https://openreview.net/forum?id=BJh6Ztuxl

  • Agarwal A, Beygelzimer A, Dudík M, Langford J, Wallach H (2018) A reductions approach to fair classification. In: International conference on machine learning, PMLR, pp 60–69

  • Agius S, Tobler C (2012) Trans and intersex people. Discrimination on the grounds of sex, gender identity and gender expression. Office for Official Publications of the European Union

  • Antoniak M, Mimno D (2021) Bad seeds: evaluating lexical methods for bias measurement. In: Proceedings of the 59th annual meeting of the association for computational linguistics and the 11th international joint conference on natural language processing (Volume 1: Long Papers), Association for Computational Linguistics, Online, pp 1889–1904, https://doi.org/10.18653/v1/2021.acl-long.148

  • Bartl M, Nissim M, Gatt A (2020) Unmasking contextual stereotypes: measuring and mitigating BERT’s gender bias. In: Proceedings of the second workshop on gender bias in natural language processing, pp 1–16

  • Bellamy RK, Dey K, Hind M, Hoffman SC, Houde S, Kannan K, Lohia P, Martino J, Mehta S, Mojsilović A et al (2019) AI Fairness 360: an extensible toolkit for detecting and mitigating algorithmic bias. IBM J Res Dev 63(4/5):1–4


  • Benjamini Y, Hochberg Y (1995) Controlling the false discovery rate: a practical and powerful approach to multiple testing. J R Stat Soc: Ser B (Methodol) 57(1):289–300


  • Bertrand M, Mullainathan S (2004) Are Emily and Greg more employable than Lakisha and Jamal? a field experiment on labor market discrimination. Am Econ Rev 94(4):991–1013


  • Bird S, Dudík M, Edgar R, Horn B, Lutz R, Milan V, Sameki M, Wallach H, Walker K (2020) Fairlearn: a toolkit for assessing and improving fairness in AI. Tech. Rep. MSR-TR-2020-32, Microsoft, https://www.microsoft.com/en-us/research/publication/fairlearn-a-toolkit-for-assessing-and-improving-fairness-in-ai/

  • Blodgett SL, Barocas S, Daumé III H, Wallach H (2020) Language (technology) is power: a critical survey of “bias” in NLP. In: Proceedings of the 58th annual meeting of the association for computational linguistics, pp 5454–5476

  • Blodgett SL, Lopez G, Olteanu A, Sim R, Wallach H (2021) Stereotyping Norwegian salmon: an inventory of pitfalls in fairness benchmark datasets. In: Proceedings of the 59th annual meeting of the association for computational linguistics and the 11th international joint conference on natural language processing (Volume 1: Long Papers), Association for Computational Linguistics, Online, pp 1004–1015, https://doi.org/10.18653/v1/2021.acl-long.81

  • Bogen M, Rieke A (2018) Help wanted: an examination of hiring algorithms, equity, and bias. Upturn, December 7

  • Bojanowski P, Grave E, Joulin A, Mikolov T (2017) Enriching word vectors with subword information. Trans Assoc Comput Linguist 5:135–146


  • Bolukbasi T, Chang KW, Zou JY, Saligrama V, Kalai AT (2016) Man is to computer programmer as woman is to homemaker? debiasing word embeddings. Adv Neural Inf Process Syst 29:4349–4357


  • Bordia S, Bowman SR (2019) Identifying and reducing gender bias in word-level language models. In: Proceedings of the 2019 conference of the North American chapter of the association for computational linguistics: student research workshop, association for computational linguistics, Minneapolis, Minnesota, pp 7–15, https://doi.org/10.18653/v1/N19-3002

  • Buolamwini J, Gebru T (2018) Gender shades: intersectional accuracy disparities in commercial gender classification. In: Friedler SA, Wilson C (eds) Conference on fairness, accountability and transparency, FAT 2018, 23-24 February 2018, New York, NY, USA, PMLR, proceedings of machine learning research, vol 81, pp 77–91, http://proceedings.mlr.press/v81/buolamwini18a.html

  • Butler J (1989) Gender trouble: feminism and the subversion of identity. Routledge, London


  • Caliskan A, Bryson JJ, Narayanan A (2017) Semantics derived automatically from language corpora contain human-like biases. Science 356(6334):183–186


  • Calmon FP, Wei D, Vinzamuri B, Ramamurthy KN, Varshney KR (2017) Optimized pre-processing for discrimination prevention. In: Proceedings of the 31st international conference on neural information processing systems, pp 3995–4004

  • Cao YT, Daumé III H (2019) Toward gender-inclusive coreference resolution. CoRR, arXiv:1910.13913

  • Ceren A, Tekir S (2021) Gender bias in occupation classification from the New York Times obituaries. Dokuz Eylül Üniversitesi Mühendislik Fakültesi Fen ve Mühendislik Dergisi 24(71):425–436


  • Ontario Human Rights Commission (2021) Gender identity and gender expression. http://www.ohrc.on.ca/en/policy-preventing-discrimination-because-gender-identity-and-gender-expression/3-gender-identity-and-gender-expression

  • Crawford JT, Leynes PA, Mayhorn CB, Bink ML (2004) Champagne, beer, or coffee? a corpus of gender-related and neutral words. Behav Res Methods Instrum Comput 36(3):444–458. https://doi.org/10.3758/bf03195592


  • Crenshaw K (1990) Mapping the margins: intersectionality, identity politics, and violence against women of color. Stan L Rev 43:1241


  • Cryan J, Tang S, Zhang X, Metzger M, Zheng H, Zhao BY (2020) Detecting gender stereotypes: Lexicon versus supervised learning methods. In: Proceedings of the 2020 CHI conference on human factors in computing systems, pp 1–11

  • De-Arteaga M, Romanov A, Wallach H, Chayes J, Borgs C, Chouldechova A, Geyik S, Kenthapadi K, Kalai AT (2019) Bias in bios: a case study of semantic representation bias in a high-stakes setting. In: Proceedings of the conference on fairness, accountability, and transparency, pp 120–128

  • Devlin J, Chang M, Lee K, Toutanova K (2019) BERT: pre-training of deep bidirectional transformers for language understanding. In: Burstein J, Doran C, Solorio T (eds) Proceedings of the 2019 conference of the North American chapter of the association for computational linguistics: human language technologies, NAACL-HLT 2019, Minneapolis, MN, USA, June 2–7, 2019, Volume 1 (Long and Short Papers), Association for Computational Linguistics, pp 4171–4186, https://doi.org/10.18653/v1/n19-1423

  • Dwork C, Hardt M, Pitassi T, Reingold O, Zemel R (2012) Fairness through awareness. In: Proceedings of the 3rd innovations in theoretical computer science conference, pp 214–226

  • Dwork C, Immorlica N, Kalai AT, Leiserson M (2018) Decoupled classifiers for group-fair and efficient machine learning. In: Conference on fairness, accountability and transparency, PMLR, pp 119–133

  • Ensmenger N (2015) Beards, sandals, and other signs of rugged individualism: masculine culture within the computing professions. Osiris 30(1):38–65


  • Garg N, Schiebinger L, Jurafsky D, Zou J (2018) Word embeddings quantify 100 years of gender and ethnic stereotypes. Proc Natl Acad Sci 115(16):E3635–E3644


  • Geyik SC, Ambler S, Kenthapadi K (2019) Fairness-aware ranking in search and recommendation systems with application to linkedin talent search. In: Proceedings of the 25th ACM SIGKDD international conference on knowledge discovery and data mining, pp 2221–2231

  • Glick JL, Theall K, Andrinopoulos K, Kendall C (2018) For data’s sake: dilemmas in the measurement of gender minorities. Cult Health Sex 20(12):1362–1377


  • Gonen H, Goldberg Y (2019) Lipstick on a pig: debiasing methods cover up systematic gender biases in word embeddings but do not remove them. In: Burstein J, Doran C, Solorio T (eds) Proceedings of the 2019 conference of the North American chapter of the association for computational linguistics: human language technologies, NAACL-HLT 2019, Minneapolis, MN, USA, June 2-7, 2019, Volume 1 (Long and Short Papers), Association for Computational Linguistics, pp 609–614, https://doi.org/10.18653/v1/n19-1061

  • Hanna A, Denton E, Smart A, Smith-Loud J (2020) Towards a critical race methodology in algorithmic fairness. In: Proceedings of the 2020 conference on fairness, accountability, and transparency, pp 501–512

  • Hardt M, Price E, Srebro N (2016) Equality of opportunity in supervised learning. In: Proceedings of the 30th international conference on neural information processing systems, pp 3323–3331

  • Hébert-Johnson U, Kim M, Reingold O, Rothblum G (2018) Multicalibration: calibration for the (computationally-identifiable) masses. In: International conference on machine learning, PMLR, pp 1939–1948

  • Heilman ME (2001) Description and prescription: how gender stereotypes prevent women’s ascent up the organizational ladder. J Soc Issues 57(4):657–674


  • Heilman ME (2012) Gender stereotypes and workplace bias. Res Organ Behav 32:113–135


  • Hu L, Kohler-Hausmann I (2020) What’s sex got to do with machine learning? In: Proceedings of the 2020 conference on fairness, accountability, and transparency, pp 513

  • Johnson SK, Hekman DR, Chan ET (2016) If there’s only one woman in your candidate pool, there’s statistically no chance she’ll be hired. Harv Bus Rev 26(04):1–7


  • Kamiran F, Calders T (2012) Data preprocessing techniques for classification without discrimination. Knowl Inf Syst 33(1):1–33


  • Kamiran F, Karim A, Zhang X (2012) Decision theory for discrimination-aware classification. In: 2012 IEEE 12th international conference on data mining, IEEE, pp 924–929

  • Kearns MJ, Neel S, Roth A, Wu ZS (2018) Preventing fairness gerrymandering: auditing and learning for subgroup fairness. In: Dy JG, Krause A (eds) Proceedings of the 35th international conference on machine learning, ICML 2018, Stockholmsmässan, Stockholm, Sweden, July 10–15, 2018, PMLR, Proceedings of Machine Learning Research, vol 80, pp 2569–2577, http://proceedings.mlr.press/v80/kearns18a.html

  • Keyes O, May C, Carrell A (2021) You keep using that word: ways of thinking about gender in computing research. Proc ACM Human-Comput Interact 5(CSCW1):1–23


  • Kumar V, Bhotia TS, Kumar V, Chakraborty T (2020) Nurse is closer to woman than surgeon? mitigating gender-biased proximities in word embeddings. Trans Assoc Comput Linguist 8:486–503. https://doi.org/10.1162/tacl_a_00327


  • Kusner MJ, Loftus J, Russell C, Silva R (2017) Counterfactual fairness. In: Advances in neural information processing systems 30 (NIPS 2017)

  • Larson B (2017) Gender as a variable in natural-language processing: ethical considerations. In: Proceedings of the first ACL workshop on ethics in natural language processing, association for computational linguistics, Valencia, Spain, pp 1–11, https://doi.org/10.18653/v1/W17-1601

  • Light JS (1999) When computers were women. Technol Cult 40(3):455–483


  • Lipton Z, McAuley J, Chouldechova A (2018) Does mitigating ML’s impact disparity require treatment disparity? In: Advances in neural information processing systems 31

  • Lohia PK, Ramamurthy KN, Bhide M, Saha D, Varshney KR, Puri R (2019) Bias mitigation post-processing for individual and group fairness. In: ICASSP 2019-2019 IEEE international conference on acoustics, speech and signal processing (ICASSP), IEEE, pp 2847–2851

  • Madera JM, Hebl MR, Martin RC (2009) Gender and letters of recommendation for academia: agentic and communal differences. J Appl Psychol 94(6):1591


  • Mangheni M, Tufan H, Nkengla L, Aman B, Boonabaana B (2019) Gender norms, technology access, and women farmers’ vulnerability to climate change in sub-Saharan Africa. In: Agriculture and ecosystem resilience in Sub-Saharan Africa, Springer, pp 715–728

  • Marx C, Calmon F, Ustun B (2020) Predictive multiplicity in classification. In: International conference on machine learning, PMLR, pp 6765–6774

  • Mikolov T, Grave É, Bojanowski P, Puhrsch C, Joulin A (2018) Advances in pre-training distributed word representations. In: Proceedings of the eleventh international conference on language resources and evaluation (LREC 2018)

  • Mitchell M, Baker D, Moorosi N, Denton E, Hutchinson B, Hanna A, Gebru T, Morgenstern J (2020) Diversity and inclusion metrics in subset selection. In: Proceedings of the AAAI/ACM conference on AI, ethics, and society, pp 117–123

  • Moon R (2014) From gorgeous to grumpy: adjectives, age and gender. Gender Lang 8(1):5–41


  • Nadeem M, Bethke A, Reddy S (2021) Stereoset: measuring stereotypical bias in pretrained language models. In: Zong C, Xia F, Li W, Navigli R (eds) Proceedings of the 59th annual meeting of the association for computational linguistics and the 11th international joint conference on natural language processing, ACL/IJCNLP 2021, (Volume 1: Long Papers), Virtual Event, August 1–6, 2021, Association for Computational Linguistics, pp 5356–5371, https://doi.org/10.18653/v1/2021.acl-long.416

  • Nangia N, Vania C, Bhalerao R, Bowman SR (2020) CrowS-Pairs: a challenge dataset for measuring social biases in masked language models. In: Webber B, Cohn T, He Y, Liu Y (eds) Proceedings of the 2020 conference on empirical methods in natural language processing, EMNLP 2020, Online, November 16–20, 2020, Association for computational linguistics, pp 1953–1967, https://doi.org/10.18653/v1/2020.emnlp-main.154

  • Noble SU (2018) Algorithms of oppression: how search engines reinforce racism. NYU Press, New York


  • Park JH, Shin J, Fung P (2018) Reducing gender bias in abusive language detection. In: Proceedings of the 2018 conference on empirical methods in natural language processing, association for computational linguistics, Brussels, Belgium, pp 2799–2804, https://doi.org/10.18653/v1/D18-1302

  • Peng A, Nushi B, Kıcıman E, Inkpen K, Suri S, Kamar E (2019) What you see is what you get? the impact of representation criteria on human bias in hiring. Proc AAAI Conf Hum Comput Crowdsour 7:125–134


  • Peng A, Nushi B, Kiciman E, Inkpen K, Kamar E (2022) Investigations of performance and bias in human-AI teamwork in hiring. In: Proceedings of the 36th AAAI conference on artificial intelligence (AAAI 2022), AAAI

  • Pleiss G, Raghavan M, Wu F, Kleinberg J, Weinberger KQ (2017) On fairness and calibration. In: Advances in neural information processing systems 30 (NIPS 2017)

  • Raghavan M, Barocas S, Kleinberg J, Levy K (2020) Mitigating bias in algorithmic hiring: evaluating claims and practices. In: Proceedings of the 2020 conference on fairness, accountability, and transparency, pp 469–481

  • Romanov A, De-Arteaga M, Wallach HM, Chayes JT, Borgs C, Chouldechova A, Geyik SC, Kenthapadi K, Rumshisky A, Kalai A (2019) What’s in a name? reducing bias in bios without access to protected attributes. In: Burstein J, Doran C, Solorio T (eds) Proceedings of the 2019 conference of the North American chapter of the association for computational linguistics: human language technologies, NAACL-HLT 2019, Minneapolis, MN, USA, June 2-7, 2019, Volume 1 (Long and Short Papers), Association for computational linguistics, pp 4187–4195, https://doi.org/10.18653/v1/n19-1424

  • Rudinger R, May C, Van Durme B (2017) Social bias in elicited natural language inferences. In: Proceedings of the First ACL workshop on ethics in natural language processing, association for computational linguistics, Valencia, Spain, pp 74–79, https://doi.org/10.18653/v1/W17-1609

  • Rudinger R, Naradowsky J, Leonard B, Durme BV (2018) Gender bias in coreference resolution. In: Walker MA, Ji H, Stent A (eds) Proceedings of the 2018 conference of the North American chapter of the association for computational linguistics: human language technologies, NAACL-HLT, New Orleans, Louisiana, USA, June 1–6, 2018, Volume 2 (Short Papers), Association for Computational Linguistics, pp 8–14, https://doi.org/10.18653/v1/n18-2002

  • Russell B (2012) Perceptions of female offenders: How stereotypes and social norms affect criminal justice responses. Springer Science and Business Media, Berlin


  • Sánchez-Monedero J, Dencik L, Edwards L (2020) What does it mean to ‘solve’ the problem of discrimination in hiring? social, technical and legal perspectives from the UK on automated hiring systems. In: Proceedings of the 2020 conference on fairness, accountability, and transparency, pp 458–468

  • Scheuerman MK, Paul JM, Brubaker JR (2019) How computers see gender. Proc ACM Human-Comput Interact 3(CSCW):1–33. https://doi.org/10.1145/3359246


  • Seabold S, Perktold J (2010) Statsmodels: econometric and statistical modeling with python. In: 9th Python in science conference

  • Sen M, Wasow O (2016) Race as a bundle of sticks: designs that estimate effects of seemingly immutable characteristics. Annu Rev Polit Sci 19:499–522


  • Shields SA (2008) Gender: an intersectionality perspective. Sex Roles 59(5):301–311


  • Snyder K (2015) The resume gap: are different gender styles contributing to tech’s dismal diversity. Fortune Magazine

  • Stark L, Stanhaus A, Anthony DL (2020) “I don’t want someone to watch me while I’m working”: gendered views of facial recognition technology in workplace surveillance. J Am Soc Inf Sci 71(9):1074–1088. https://doi.org/10.1002/asi.24342


  • Swinger N, De-Arteaga M, Heffernan IV NT, Leiserson MD, Kalai AT (2019) What are the biases in my word embedding? In: Proceedings of the 2019 AAAI/ACM conference on AI, ethics, and society, pp 305–311

  • Tang S, Zhang X, Cryan J, Metzger MJ, Zheng H, Zhao BY (2017) Gender bias in the job market: a longitudinal analysis. Proc ACM Human-Comput Interact 1(CSCW):1–19


  • Virtanen P, Gommers R, Oliphant TE, Haberland M, Reddy T, Cournapeau D, Burovski E, Peterson P, Weckesser W, Bright J et al (2020) SciPy 1.0: fundamental algorithms for scientific computing in Python. Nat Methods 17(3):261–272


  • Wagner C, Garcia D, Jadidi M, Strohmaier M (2015) It’s a man’s Wikipedia? assessing gender inequality in an online encyclopedia. In: Proceedings of the international AAAI conference on web and social media, vol 9

  • Wang T, Zhao J, Yatskar M, Chang KW, Ordonez V (2019) Balanced datasets are not enough: Estimating and mitigating gender bias in deep image representations. In: Proceedings of the IEEE/CVF international conference on computer vision, pp 5310–5319

  • Wojcik S, Remy E (2020) The challenges of using machine learning to identify gender in images. https://www.pewresearch.org/internet/2019/09/05/the-challenges-of-using-machine-learning-to-identify-gender-in-images/

  • Wolf T, Debut L, Sanh V, Chaumond J, Delangue C, Moi A, Cistac P, Rault T, Louf R, Funtowicz M, Davison J, Shleifer S, von Platen P, Ma C, Jernite Y, Plu J, Xu C, Scao TL, Gugger S, Drame M, Lhoest Q, Rush AM (2020) Transformers: state-of-the-art natural language processing. In: Proceedings of the 2020 conference on empirical methods in natural language processing: system demonstrations, association for computational linguistics, Online, pp 38–45, https://www.aclweb.org/anthology/2020.emnlp-demos.6

  • Wood W, Eagly AH (2009) Gender identity. Handbook of individual differences in social behavior pp 109–125

  • Zhang BH, Lemoine B, Mitchell M (2018) Mitigating unwanted biases with adversarial learning. In: Proceedings of the 2018 AAAI/ACM conference on AI, ethics, and society, pp 335–340

  • Zhou X, Sap M, Swayamdipta S, Choi Y, Smith NA (2021) Challenges in automated debiasing for toxic language detection. In: Merlo P, Tiedemann J, Tsarfaty R (eds) Proceedings of the 16th conference of the European chapter of the association for computational linguistics: main volume, EACL 2021, Online, April 19–23, 2021, Association for computational linguistics, pp 3143–3155, https://doi.org/10.18653/v1/2021.eacl-main.274


Funding

This work is supported by Microsoft Research and Good Systems, a UT Austin Grand Challenge to develop responsible AI technologies.

Author information


Corresponding author

Correspondence to Myra Cheng.

Ethics declarations

Conflict of interest

The authors declare that they have no conflict of interest.

Additional information

Responsible editor: Toon Calders, Salvatore Ruggieri, Bodo Rosenhahn, Mykola Pechenizkiy and Eirini Ntoutsi.

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Appendices

Appendix A: Gendered words used in classifiers

We provide insight into some of the differences across the classifiers that may be driving the SNoB described in the preceding sections. We define \(\beta _w\) as the weight of a word w based on the value of the classifiers’ coefficients. We focus on the logistic regression classifiers using the BOW and WE representations; because the BERT representations are contextualized, individual words do not have fixed, easily interpretable weights in that model.

For the BOW representation of a biography x, each feature in the input vector \(v_x\) corresponds to a word w in the vocabulary. We define \(\beta _w\) as the value of the corresponding coefficient in the logistic regression classifier. The magnitude of \(\beta _w\) is a measure of the importance of w to the occupation classification, while the sign (positive or negative) of \(\beta _w\) indicates whether w is correlated or anti-correlated with the positive class of the classifier.

For the WE representation, we compute the weight of each word as

$$\begin{aligned} \beta _w = \frac{e_w\cdot W_c}{|e_w||W_c|}, \end{aligned}$$

i.e. the cosine similarity between each word’s fastText word embedding \(e_w\) and the coefficient weight vector \(W_c\) of the WE-representation classifier. Like in the BOW representation, the magnitude of \(\beta _w\) quantifies the word’s importance, while the sign indicates the direction of the association.

If a word w has positive/negative weight for classifier \(Y_c\), then adding w to a biography x increases/decreases the predicted probability \(Y_c(x)\) respectively.
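To make these definitions concrete, the sketch below shows how \(\beta _w\) could be computed for both representations with scikit-learn and NumPy. The names bow_clf, vectorizer, we_clf, and embeddings are hypothetical stand-ins for the fitted logistic regression classifiers, the BOW vectorizer, and the fastText word vectors, none of which are reproduced here.

```python
import numpy as np

def bow_word_weights(bow_clf, vectorizer):
    """BOW: beta_w is the logistic regression coefficient of word w.

    bow_clf is assumed to be a fitted binary sklearn LogisticRegression,
    so bow_clf.coef_ has shape (1, vocabulary_size); vectorizer maps
    words to feature indices.
    """
    vocabulary = vectorizer.get_feature_names_out()
    return dict(zip(vocabulary, bow_clf.coef_[0]))

def we_word_weights(we_clf, embeddings):
    """WE: beta_w is the cosine similarity between the fastText
    embedding e_w of word w and the classifier coefficient vector W_c.
    embeddings is assumed to map each word to its embedding vector.
    """
    W_c = we_clf.coef_[0]
    W_c_norm = np.linalg.norm(W_c)
    return {
        w: float(e_w @ W_c) / (np.linalg.norm(e_w) * W_c_norm)
        for w, e_w in embeddings.items()
    }
```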

Let \(\beta _w(Y_c)\) be the weights for approach \(Y_c\). We examine the words whose weights \(\beta _w\) satisfy

  1. \(|\beta _w(Y_c)| > T\),

  2. \(|\beta _w(G)| > T\),

  3. \(|\beta _w(Y_c)| > T'\cdot |\beta _w(Y_c')|\),

where \(T, T'\) are significance thresholds and \(Y_c, Y_c'\) are two different occupation classification approaches.

Words that satisfy these conditions are not only associated with either masculinity or femininity but are also weighted more highly in approach \(Y_c\) than in \(Y_c'\). Thus, including these gendered words in a biography influences \(Y_c\)’s classification more strongly than that of \(Y_c'\), which suggests that they may contribute more strongly to \(\rho ({\textbf {p}}_C, {\textbf {r}}_C)\) in one approach than in the other. For example, we examined these words for the occupations of surgeon, software engineer, composer, nurse, dietitian, and yoga teacher, which are the six most gender-imbalanced occupations, with \(Y_c = \) {BOW, post-processing (PO)}, \(Y_c' =\) {BOW, decoupled (DE)}, \(T = 0.5\) and \(T' = 0.7\). The words that satisfy these conditions are “miss”, “mom”, “wife”, “mother”, and “husband.” Conversely, with \(Y_c = \) {BOW, DE} and \(Y_c' =\) {BOW, PO}, the words are “girls”, “women”, “gender”, “loves”, “mother”, “romance”, “daughter”, “sister”, and “female.”
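Word lists of this kind can be reproduced by filtering on conditions (1)–(3). The sketch below assumes the \(\beta _w\) dictionaries defined above, here given the hypothetical names weights_yc, weights_yc_prime, and weights_gender, together with the thresholds \(T = 0.5\) and \(T' = 0.7\) used in the example.

```python
def salient_gendered_words(weights_yc, weights_yc_prime, weights_gender,
                           T=0.5, T_prime=0.7):
    """Return words that (1) matter to approach Y_c, (2) are strongly
    gendered according to the gender classifier G, and (3) are weighted
    more heavily in Y_c than in the comparison approach Y_c'."""
    selected = []
    for w, beta in weights_yc.items():
        if w not in weights_gender or w not in weights_yc_prime:
            continue
        if (abs(beta) > T
                and abs(weights_gender[w]) > T
                and abs(beta) > T_prime * abs(weights_yc_prime[w])):
            selected.append(w)
    return sorted(selected)
```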

These gendered words illustrate the multiplicity of gender present in the biographies beyond categorical labels, which standard group fairness interventions do not consider.

Our analysis is limited by the fact that we only consider the individual influence of each word conditioned on the remaining words, while the joint influence of two or more words may also be of relevance.

Appendix B: Analysis on nonbinary dataset

Table 3 Correlation \(r^{{\textsc {nb}}}_{{\textsc {professor}}}\) (first three columns) and \(r_{{\textsc {professor}}}\) (last three columns) across the pre-processing (pre-proc), post-processing (post-proc), and decoupled approaches

We aim to consider how algorithmic fairness approaches affect nonbinary individuals, who are overlooked by group fairness approaches (Keyes et al. 2021). Using the same regular expression as De-Arteaga et al. (2019) to identify biography-format strings, we collected a dataset of biographies that use nonbinary pronouns such as “they”, “xe”, and “hir.” Since “they” frequently refers to more than one person, we manually inspected a sample of 2000 biographies using “they” to identify those that refer to a single individual. professor is the only occupation title with more than 20 such biographies; the other occupations have too few biographies to perform meaningful statistical analysis. We computed \(r^{{\textsc {nb}}}_{{\textsc {professor}}}\), which is analogous to \(r_{{\textsc {professor}}}\), the measure of SNoB for an individual occupation classifier introduced in Sect. 4. While \(r_{{\textsc {professor}}}\) is Spearman’s correlation computed across the biographies in \(S_c\), \(r^{{\textsc {nb}}}_{{\textsc {professor}}}\) is the correlation across the nonbinary biographies in the profession. The results are reported in Table 3. We find that \(r^{{\textsc {nb}}}_{{\textsc {professor}}}\) is positive across different approaches. However, the associated p values are large \((>0.1)\), so these associations are difficult to interpret. This is likely due to the small sample size: while \(r_{{\textsc {professor}}}\) is computed across the 10,677 professor biographies that use “she” pronouns, \(r^{{\textsc {nb}}}_{{\textsc {professor}}}\) is computed across only 21 biographies.
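For reference, the correlation and its p value can be computed with SciPy’s spearmanr, as in Note 4. The sketch below uses randomly generated placeholder arrays in place of the occupation classifier’s predicted probabilities and the inferred gender-norm conformity scores, which are not reproduced here.

```python
import numpy as np
from scipy.stats import spearmanr

rng = np.random.default_rng(0)
n = 21  # number of nonbinary professor biographies in the sample

# Placeholders: in the analysis these would be the classifier's predicted
# probabilities and the gender-norm conformity scores, one per biography.
pred_probs = rng.random(n)
norm_conformity = rng.random(n)

r_nb, p_value = spearmanr(pred_probs, norm_conformity)
print(f"r_nb_professor = {r_nb:.3f}, two-sided p = {p_value:.3f}")
```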

Appendix C: Word weights

In Fig. 6, we plot the weight of each word in the BOW vocabulary in the occupation classifiers and gender classifiers. These weights illuminate some of the mechanisms behind the predictions. Ideally, without SNoB, every point would have a small-magnitude weight in either the occupation or the gender classifier, i.e., lie on either the \(x\)- or \(y\)-axis of Fig. 6. We observe that in the DE approach, words are closer to the \(y\)-axis than in the post-processing approach. This corresponds to the smaller value of \(\rho ({\textbf {p}}_C, {\textbf {r}}_C)\) exhibited by the decoupled approach compared to the post-processing one in Table 2. Note that the post-processed classifier is trained on all of the biographies, while the decoupled classifier is trained only on biographies that use the same pronoun.
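A plot in the style of Fig. 6 can be generated from the \(\beta _w\) dictionaries of Appendix A. The following is a minimal matplotlib sketch, with occ_weights and gender_weights as hypothetical placeholder names for those dictionaries.

```python
import matplotlib.pyplot as plt

def plot_word_weights(occ_weights, gender_weights):
    """Scatter each word's weight in the occupation classifier (x-axis)
    against its weight in the gender classifier (y-axis)."""
    words = sorted(set(occ_weights) & set(gender_weights))
    xs = [occ_weights[w] for w in words]
    ys = [gender_weights[w] for w in words]
    plt.scatter(xs, ys, s=5, alpha=0.5)
    plt.axhline(0.0, linewidth=0.5, color="gray")
    plt.axvline(0.0, linewidth=0.5, color="gray")
    plt.xlabel("weight in occupation classifier")
    plt.ylabel("weight in gender classifier")
    plt.tight_layout()
    plt.show()
```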

Fig. 6

Words’ weights in the occupation and gender classifiers for different approaches in the surgeon occupation. Each point represents a word; its x- and y-positions represent its weight in \(\hat{Y}_c\) and \(G\), respectively. Each point is colored based on its quadrant in the post-processing approach. Many points are closer to the \(y\)-axis in the decoupled approach

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.


About this article


Cite this article

Cheng, M., De-Arteaga, M., Mackey, L. et al. Social norm bias: residual harms of fairness-aware algorithms. Data Min Knowl Disc 37, 1858–1884 (2023). https://doi.org/10.1007/s10618-022-00910-8



  • DOI: https://doi.org/10.1007/s10618-022-00910-8
