
Social norm bias: residual harms of fairness-aware algorithms


Abstract

Many modern machine learning algorithms mitigate bias by enforcing fairness constraints across coarsely-defined groups related to a sensitive attribute like gender or race. However, these algorithms seldom account for within-group heterogeneity and biases that may disproportionately affect some members of a group. In this work, we characterize Social Norm Bias (SNoB), a subtle but consequential type of algorithmic discrimination that may be exhibited by machine learning models, even when these systems achieve group fairness objectives. We study this issue through the lens of gender bias in occupation classification. We quantify SNoB by measuring how an algorithm’s predictions are associated with conformity to inferred gender norms. When predicting whether an individual belongs to a male-dominated occupation, this framework reveals that “fair” classifiers still favor biographies written in ways that align with inferred masculine norms. We compare SNoB across algorithmic fairness techniques and show that it frequently persists as a residual bias, and that post-processing approaches do not mitigate this type of bias at all.




Availability of data and material

Publicly available at http://aka.ms/biasbios.

Code Availability

Publicly available at http://bit.ly/snobcode.

Notes

  1. Twitter thread started by Dr. Timothy Verstynen: https://twitter.com/tdverstynen/status/1501386481415434245

  2. The dataset is publicly available at http://aka.ms/biasbios and licensed under the MIT License.

  3. Consistent with previous work (De-Arteaga et al. 2019), we used regular expressions to remove the following words from the data: he, she, her, his, him, hers, himself, herself, mr, mrs, ms, ph, dr.

  4. We compute p values for the two-sided test of zero correlation between \(p_c\) and \(r_c\) using SciPy’s spearmanr function (Virtanen et al. 2020). Values marked with \(^*\) and \(^{**}\) indicate that the p value is \( < 0.05\) and \(< 0.01\) respectively.

  5. CoCL (Romanov et al. 2019) is modulated by a hyperparameter \(\lambda \) that determines the strength of the fairness constraint. We use \(\lambda = 2\), which Romanov et al. (2019) finds to have the smallest Gap\(^{{\textsc {RMS}}}\) on the occupation classification task.

  6. We computed the p values using the fdrcorrection method from the statsmodels Python package (Seabold and Perktold 2010).
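The preprocessing described in Note 3 can be reproduced with a word-boundary regular expression. The snippet below is a minimal sketch of that step in Python; the pattern and the helper name `scrub` are our own assumptions, not the exact expression used by De-Arteaga et al. (2019).

```python
import re

# Words removed from each biography, as listed in Note 3.
GENDERED_WORDS = [
    "he", "she", "her", "his", "him", "hers", "himself", "herself",
    "mr", "mrs", "ms", "ph", "dr",
]

# Match any listed word as a whole token, case-insensitively.
_PATTERN = re.compile(r"\b(?:" + "|".join(GENDERED_WORDS) + r")\b",
                      flags=re.IGNORECASE)

def scrub(bio: str) -> str:
    """Delete the listed words and collapse the leftover whitespace."""
    return re.sub(r"\s+", " ", _PATTERN.sub(" ", bio)).strip()
```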

References

  • Adi Y, Kermany E, Belinkov Y, Lavi O, Goldberg Y (2017) Fine-grained analysis of sentence embeddings using auxiliary prediction tasks. In: 5th international conference on learning representations, ICLR 2017, Toulon, France, April 24-26, 2017, Conference Track Proceedings, OpenReview.net, https://openreview.net/forum?id=BJh6Ztuxl

  • Agarwal A, Beygelzimer A, Dudík M, Langford J, Wallach H (2018) A reductions approach to fair classification. In: International conference on machine learning, PMLR, pp 60–69

  • Agius S, Tobler C (2012) Trans and intersex people. Discrimination on the grounds of sex, gender identity and gender expression. Office for Official Publications of the European Union

  • Antoniak M, Mimno D (2021) Bad seeds: evaluating lexical methods for bias measurement. In: Proceedings of the 59th annual meeting of the association for computational linguistics and the 11th international joint conference on natural language processing (Volume 1: Long Papers), Association for Computational Linguistics, Online, pp 1889–1904, https://doi.org/10.18653/v1/2021.acl-long.148

  • Bartl M, Nissim M, Gatt A (2020) Unmasking contextual stereotypes: measuring and mitigating BERT’s gender bias. In: Proceedings of the second workshop on gender bias in natural language processing, pp 1–16

  • Bellamy RK, Dey K, Hind M, Hoffman SC, Houde S, Kannan K, Lohia P, Martino J, Mehta S, Mojsilović A et al (2019) AI Fairness 360: an extensible toolkit for detecting and mitigating algorithmic bias. IBM J Res Dev 63(4/5):1–4


  • Benjamini Y, Hochberg Y (1995) Controlling the false discovery rate: a practical and powerful approach to multiple testing. J R Stat Soc: Ser B (Methodol) 57(1):289–300


  • Bertrand M, Mullainathan S (2004) Are Emily and Greg more employable than Lakisha and Jamal? a field experiment on labor market discrimination. Am Econ Rev 94(4):991–1013


  • Bird S, Dudík M, Edgar R, Horn B, Lutz R, Milan V, Sameki M, Wallach H, Walker K (2020) Fairlearn: a toolkit for assessing and improving fairness in AI. Tech. Rep. MSR-TR-2020-32, Microsoft, https://www.microsoft.com/en-us/research/publication/fairlearn-a-toolkit-for-assessing-and-improving-fairness-in-ai/

  • Blodgett SL, Barocas S, Daumé III H, Wallach H (2020) Language (technology) is power: a critical survey of “bias” in NLP. In: Proceedings of the 58th annual meeting of the association for computational linguistics, pp 5454–5476

  • Blodgett SL, Lopez G, Olteanu A, Sim R, Wallach H (2021) Stereotyping Norwegian salmon: an inventory of pitfalls in fairness benchmark datasets. In: Proceedings of the 59th annual meeting of the association for computational linguistics and the 11th international joint conference on natural language processing (Volume 1: Long Papers), Association for Computational Linguistics, Online, pp 1004–1015, https://doi.org/10.18653/v1/2021.acl-long.81

  • Bogen M, Rieke A (2018) Help wanted: an examination of hiring algorithms, equity, and bias. Upturn, December 7

  • Bojanowski P, Grave E, Joulin A, Mikolov T (2017) Enriching word vectors with subword information. Trans Assoc Comput Linguist 5:135–146


  • Bolukbasi T, Chang KW, Zou JY, Saligrama V, Kalai AT (2016) Man is to computer programmer as woman is to homemaker? debiasing word embeddings. Adv Neural Inf Process Syst 29:4349–4357


  • Bordia S, Bowman SR (2019) Identifying and reducing gender bias in word-level language models. In: Proceedings of the 2019 conference of the North American chapter of the association for computational linguistics: student research workshop, association for computational linguistics, Minneapolis, Minnesota, pp 7–15, https://doi.org/10.18653/v1/N19-3002

  • Buolamwini J, Gebru T (2018) Gender shades: intersectional accuracy disparities in commercial gender classification. In: Friedler SA, Wilson C (eds) Conference on fairness, accountability and transparency, FAT 2018, 23-24 February 2018, New York, NY, USA, PMLR, proceedings of machine learning research, vol 81, pp 77–91, http://proceedings.mlr.press/v81/buolamwini18a.html

  • Butler J (1989) Gender trouble: feminism and the subversion of identity. Routledge, London


  • Caliskan A, Bryson JJ, Narayanan A (2017) Semantics derived automatically from language corpora contain human-like biases. Science 356(6334):183–186


  • Calmon FP, Wei D, Vinzamuri B, Ramamurthy KN, Varshney KR (2017) Optimized pre-processing for discrimination prevention. In: Proceedings of the 31st international conference on neural information processing systems, pp 3995–4004

  • Cao YT, Daumé III H (2019) Toward gender-inclusive coreference resolution. CoRR, arXiv:1910.13913

  • Ceren A, Tekir S (2021) Gender bias in occupation classification from the New York Times obituaries. Dokuz Eylül Üniversitesi Mühendislik Fakültesi Fen ve Mühendislik Dergisi 24(71):425–436


  • Ontario Human Rights Commission (2021) Gender identity and gender expression. http://www.ohrc.on.ca/en/policy-preventing-discrimination-because-gender-identity-and-gender-expression/3-gender-identity-and-gender-expression

  • Crawford JT, Leynes PA, Mayhorn CB, Bink ML (2004) Champagne, beer, or coffee? a corpus of gender-related and neutral words. Behav Res Methods Instrum Comput 36(3):444–458. https://doi.org/10.3758/bf03195592


  • Crenshaw K (1990) Mapping the margins: intersectionality, identity politics, and violence against women of color. Stan L Rev 43:1241


  • Cryan J, Tang S, Zhang X, Metzger M, Zheng H, Zhao BY (2020) Detecting gender stereotypes: Lexicon versus supervised learning methods. In: Proceedings of the 2020 CHI conference on human factors in computing systems, pp 1–11

  • De-Arteaga M, Romanov A, Wallach H, Chayes J, Borgs C, Chouldechova A, Geyik S, Kenthapadi K, Kalai AT (2019) Bias in bios: a case study of semantic representation bias in a high-stakes setting. In: Proceedings of the conference on fairness, accountability, and transparency, pp 120–128

  • Devlin J, Chang M, Lee K, Toutanova K (2019) BERT: pre-training of deep bidirectional transformers for language understanding. In: Burstein J, Doran C, Solorio T (eds) Proceedings of the 2019 conference of the North American chapter of the association for computational linguistics: human language technologies, NAACL-HLT 2019, Minneapolis, MN, USA, June 2–7, 2019, Volume 1 (Long and Short Papers), Association for Computational Linguistics, pp 4171–4186, https://doi.org/10.18653/v1/n19-1423

  • Dwork C, Hardt M, Pitassi T, Reingold O, Zemel R (2012) Fairness through awareness. In: Proceedings of the 3rd innovations in theoretical computer science conference, pp 214–226

  • Dwork C, Immorlica N, Kalai AT, Leiserson M (2018) Decoupled classifiers for group-fair and efficient machine learning. In: Conference on fairness, accountability and transparency, PMLR, pp 119–133

  • Ensmenger N (2015) Beards, sandals, and other signs of rugged individualism: masculine culture within the computing professions. Osiris 30(1):38–65


  • Garg N, Schiebinger L, Jurafsky D, Zou J (2018) Word embeddings quantify 100 years of gender and ethnic stereotypes. Proc Natl Acad Sci 115(16):E3635–E3644


  • Geyik SC, Ambler S, Kenthapadi K (2019) Fairness-aware ranking in search and recommendation systems with application to linkedin talent search. In: Proceedings of the 25th ACM SIGKDD international conference on knowledge discovery and data mining, pp 2221–2231

  • Glick JL, Theall K, Andrinopoulos K, Kendall C (2018) For data’s sake: dilemmas in the measurement of gender minorities. Cult Health Sex 20(12):1362–1377


  • Gonen H, Goldberg Y (2019) Lipstick on a pig: debiasing methods cover up systematic gender biases in word embeddings but do not remove them. In: Burstein J, Doran C, Solorio T (eds) Proceedings of the 2019 conference of the North American chapter of the association for computational linguistics: human language technologies, NAACL-HLT 2019, Minneapolis, MN, USA, June 2-7, 2019, Volume 1 (Long and Short Papers), Association for Computational Linguistics, pp 609–614, https://doi.org/10.18653/v1/n19-1061

  • Hanna A, Denton E, Smart A, Smith-Loud J (2020) Towards a critical race methodology in algorithmic fairness. In: Proceedings of the 2020 conference on fairness, accountability, and transparency, pp 501–512

  • Hardt M, Price E, Srebro N (2016) Equality of opportunity in supervised learning. In: Proceedings of the 30th international conference on neural information processing systems, pp 3323–3331

  • Hébert-Johnson U, Kim M, Reingold O, Rothblum G (2018) Multicalibration: calibration for the (computationally-identifiable) masses. In: International conference on machine learning, PMLR, pp 1939–1948

  • Heilman ME (2001) Description and prescription: how gender stereotypes prevent women’s ascent up the organizational ladder. J Soc Issues 57(4):657–674


  • Heilman ME (2012) Gender stereotypes and workplace bias. Res Organ Behav 32:113–135


  • Hu L, Kohler-Hausmann I (2020) What’s sex got to do with machine learning? In: Proceedings of the 2020 conference on fairness, accountability, and transparency, pp 513

  • Johnson SK, Hekman DR, Chan ET (2016) If there’s only one woman in your candidate pool, there’s statistically no chance she’ll be hired. Harv Bus Rev 26(04):1–7


  • Kamiran F, Calders T (2012) Data preprocessing techniques for classification without discrimination. Knowl Inf Syst 33(1):1–33


  • Kamiran F, Karim A, Zhang X (2012) Decision theory for discrimination-aware classification. In: 2012 IEEE 12th international conference on data mining, IEEE, pp 924–929

  • Kearns MJ, Neel S, Roth A, Wu ZS (2018) Preventing fairness gerrymandering: auditing and learning for subgroup fairness. In: Dy JG, Krause A (eds) Proceedings of the 35th international conference on machine learning, ICML 2018, Stockholmsmässan, Stockholm, Sweden, July 10–15, 2018, PMLR, Proceedings of Machine Learning Research, vol 80, pp 2569–2577, http://proceedings.mlr.press/v80/kearns18a.html

  • Keyes O, May C, Carrell A (2021) You keep using that word: ways of thinking about gender in computing research. Proc ACM Human-Comput Interact 5(CSCW1):1–23


  • Kumar V, Bhotia TS, Kumar V, Chakraborty T (2020) Nurse is closer to woman than surgeon? mitigating gender-biased proximities in word embeddings. Trans Assoc Comput Linguist 8:486–503. https://doi.org/10.1162/tacl_a_00327


  • Kusner MJ, Loftus J, Russell C, Silva R (2017) Counterfactual fairness. In: Advances in neural information processing systems 30 (NIPS 2017)

  • Larson B (2017) Gender as a variable in natural-language processing: ethical considerations. In: Proceedings of the first ACL workshop on ethics in natural language processing, association for computational linguistics, Valencia, Spain, pp 1–11, https://doi.org/10.18653/v1/W17-1601

  • Light JS (1999) When computers were women. Technol Cult 40(3):455–483


  • Lipton Z, McAuley J, Chouldechova A (2018) Does mitigating ML’s impact disparity require treatment disparity? In: Advances in neural information processing systems 31

  • Lohia PK, Ramamurthy KN, Bhide M, Saha D, Varshney KR, Puri R (2019) Bias mitigation post-processing for individual and group fairness. In: ICASSP 2019-2019 IEEE international conference on acoustics, speech and signal processing (ICASSP), IEEE, pp 2847–2851

  • Madera JM, Hebl MR, Martin RC (2009) Gender and letters of recommendation for academia: agentic and communal differences. J Appl Psychol 94(6):1591


  • Mangheni M, Tufan H, Nkengla L, Aman B, Boonabaana B (2019) Gender norms, technology access, and women farmers’ vulnerability to climate change in sub-Saharan Africa. In: Agriculture and ecosystem resilience in Sub-Saharan Africa, Springer, pp 715–728

  • Marx C, Calmon F, Ustun B (2020) Predictive multiplicity in classification. In: International conference on machine learning, PMLR, pp 6765–6774

  • Mikolov T, Grave É, Bojanowski P, Puhrsch C, Joulin A (2018) Advances in pre-training distributed word representations. In: Proceedings of the eleventh international conference on language resources and evaluation (LREC 2018)

  • Mitchell M, Baker D, Moorosi N, Denton E, Hutchinson B, Hanna A, Gebru T, Morgenstern J (2020) Diversity and inclusion metrics in subset selection. In: Proceedings of the AAAI/ACM conference on AI, ethics, and society, pp 117–123

  • Moon R (2014) From gorgeous to grumpy: adjectives, age and gender. Gender Lang 8(1):5–41


  • Nadeem M, Bethke A, Reddy S (2021) Stereoset: measuring stereotypical bias in pretrained language models. In: Zong C, Xia F, Li W, Navigli R (eds) Proceedings of the 59th annual meeting of the association for computational linguistics and the 11th international joint conference on natural language processing, ACL/IJCNLP 2021, (Volume 1: Long Papers), Virtual Event, August 1–6, 2021, Association for Computational Linguistics, pp 5356–5371, https://doi.org/10.18653/v1/2021.acl-long.416

  • Nangia N, Vania C, Bhalerao R, Bowman SR (2020) CrowS-Pairs: a challenge dataset for measuring social biases in masked language models. In: Webber B, Cohn T, He Y, Liu Y (eds) Proceedings of the 2020 conference on empirical methods in natural language processing, EMNLP 2020, Online, November 16–20, 2020, Association for computational linguistics, pp 1953–1967, https://doi.org/10.18653/v1/2020.emnlp-main.154

  • Noble SU (2018) Algorithms of oppression: how search engines reinforce racism. NYU Press, New York


  • Park JH, Shin J, Fung P (2018) Reducing gender bias in abusive language detection. In: Proceedings of the 2018 conference on empirical methods in natural language processing, association for computational linguistics, Brussels, Belgium, pp 2799–2804, https://doi.org/10.18653/v1/D18-1302

  • Peng A, Nushi B, Kıcıman E, Inkpen K, Suri S, Kamar E (2019) What you see is what you get? the impact of representation criteria on human bias in hiring. Proc AAAI Conf Hum Comput Crowdsour 7:125–134


  • Peng A, Nushi B, Kiciman E, Inkpen K, Kamar E (2022) Investigations of performance and bias in human-AI teamwork in hiring. In: Proceedings of the 36th AAAI conference on artificial intelligence (AAAI 2022), AAAI

  • Pleiss G, Raghavan M, Wu F, Kleinberg J, Weinberger KQ (2017) On fairness and calibration. In: Advances in neural information processing systems 30 (NIPS 2017)

  • Raghavan M, Barocas S, Kleinberg J, Levy K (2020) Mitigating bias in algorithmic hiring: evaluating claims and practices. In: Proceedings of the 2020 conference on fairness, accountability, and transparency, pp 469–481

  • Romanov A, De-Arteaga M, Wallach HM, Chayes JT, Borgs C, Chouldechova A, Geyik SC, Kenthapadi K, Rumshisky A, Kalai A (2019) What’s in a name? reducing bias in bios without access to protected attributes. In: Burstein J, Doran C, Solorio T (eds) Proceedings of the 2019 conference of the North American chapter of the association for computational linguistics: human language technologies, NAACL-HLT 2019, Minneapolis, MN, USA, June 2-7, 2019, Volume 1 (Long and Short Papers), Association for computational linguistics, pp 4187–4195, https://doi.org/10.18653/v1/n19-1424

  • Rudinger R, May C, Van Durme B (2017) Social bias in elicited natural language inferences. In: Proceedings of the First ACL workshop on ethics in natural language processing, association for computational linguistics, Valencia, Spain, pp 74–79, https://doi.org/10.18653/v1/W17-1609

  • Rudinger R, Naradowsky J, Leonard B, Durme BV (2018) Gender bias in coreference resolution. In: Walker MA, Ji H, Stent A (eds) Proceedings of the 2018 conference of the North American chapter of the association for computational linguistics: human language technologies, NAACL-HLT, New Orleans, Louisiana, USA, June 1–6, 2018, Volume 2 (Short Papers), Association for Computational Linguistics, pp 8–14, https://doi.org/10.18653/v1/n18-2002

  • Russell B (2012) Perceptions of female offenders: How stereotypes and social norms affect criminal justice responses. Springer Science and Business Media, Berlin


  • Sánchez-Monedero J, Dencik L, Edwards L (2020) What does it mean to ‘solve’ the problem of discrimination in hiring? social, technical and legal perspectives from the UK on automated hiring systems. In: Proceedings of the 2020 conference on fairness, accountability, and transparency, pp 458–468

  • Scheuerman MK, Paul JM, Brubaker JR (2019) How computers see gender. Proc ACM Human-Comput Interact 3(CSCW):1–33. https://doi.org/10.1145/3359246


  • Seabold S, Perktold J (2010) Statsmodels: econometric and statistical modeling with python. In: 9th Python in science conference

  • Sen M, Wasow O (2016) Race as a bundle of sticks: designs that estimate effects of seemingly immutable characteristics. Annu Rev Polit Sci 19:499–522


  • Shields SA (2008) Gender: an intersectionality perspective. Sex Roles 59(5):301–311


  • Snyder K (2015) The resume gap: are different gender styles contributing to tech’s dismal diversity. Fortune Magazine

  • Stark L, Stanhaus A, Anthony DL (2020) “I don’t want someone to watch me while I’m working”: gendered views of facial recognition technology in workplace surveillance. J Am Soc Inf Sci 71(9):1074–1088. https://doi.org/10.1002/asi.24342


  • Swinger N, De-Arteaga M, Heffernan IV NT, Leiserson MD, Kalai AT (2019) What are the biases in my word embedding? In: Proceedings of the 2019 AAAI/ACM conference on AI, ethics, and society, pp 305–311

  • Tang S, Zhang X, Cryan J, Metzger MJ, Zheng H, Zhao BY (2017) Gender bias in the job market: a longitudinal analysis. Proc ACM Human-Comput Interact 1(CSCW):1–19


  • Virtanen P, Gommers R, Oliphant TE, Haberland M, Reddy T, Cournapeau D, Burovski E, Peterson P, Weckesser W, Bright J et al (2020) SciPy 1.0: fundamental algorithms for scientific computing in Python. Nat Methods 17(3):261–272


  • Wagner C, Garcia D, Jadidi M, Strohmaier M (2015) It’s a man’s Wikipedia? assessing gender inequality in an online encyclopedia. In: Proceedings of the international AAAI conference on web and social media, vol 9

  • Wang T, Zhao J, Yatskar M, Chang KW, Ordonez V (2019) Balanced datasets are not enough: Estimating and mitigating gender bias in deep image representations. In: Proceedings of the IEEE/CVF international conference on computer vision, pp 5310–5319

  • Wojcik S, Remy E (2020) The challenges of using machine learning to identify gender in images. https://www.pewresearch.org/internet/2019/09/05/the-challenges-of-using-machine-learning-to-identify-gender-in-images/

  • Wolf T, Debut L, Sanh V, Chaumond J, Delangue C, Moi A, Cistac P, Rault T, Louf R, Funtowicz M, Davison J, Shleifer S, von Platen P, Ma C, Jernite Y, Plu J, Xu C, Scao TL, Gugger S, Drame M, Lhoest Q, Rush AM (2020) Transformers: state-of-the-art natural language processing. In: Proceedings of the 2020 conference on empirical methods in natural language processing: system demonstrations, association for computational linguistics, Online, pp 38–45, https://www.aclweb.org/anthology/2020.emnlp-demos.6

  • Wood W, Eagly AH (2009) Gender identity. Handbook of individual differences in social behavior pp 109–125

  • Zhang BH, Lemoine B, Mitchell M (2018) Mitigating unwanted biases with adversarial learning. In: Proceedings of the 2018 AAAI/ACM conference on AI, ethics, and society, pp 335–340

  • Zhou X, Sap M, Swayamdipta S, Choi Y, Smith NA (2021) Challenges in automated debiasing for toxic language detection. In: Merlo P, Tiedemann J, Tsarfaty R (eds) Proceedings of the 16th conference of the European chapter of the association for computational linguistics: main volume, EACL 2021, Online, April 19–23, 2021, Association for computational linguistics, pp 3143–3155, https://doi.org/10.18653/v1/2021.eacl-main.274


Funding

This work is supported by Microsoft Research and Good Systems, a UT Austin Grand Challenge to develop responsible AI technologies.

Author information


Corresponding author

Correspondence to Myra Cheng.

Ethics declarations

Conflict of interest

The authors declare that they have no conflict of interest.

Additional information

Responsible editor: Toon Calders, Salvatore Ruggieri, Bodo Rosenhahn, Mykola Pechenizkiy and Eirini Ntoutsi.

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Appendices

Appendix A: Gendered words used in classifiers

We provide insight into some of the differences across the classifiers that may be driving the SNoB described in the preceding sections. We define \(\beta _w\) as the weight of a word w based on the value of the classifiers’ coefficients. We focus on the logistic regression classifiers using the BOW and WE representations; because the BERT representations are contextualized, individual words do not have fixed, easily interpretable weights in that model.

For the BOW representation of a biography x, each feature in the input vector \(v_x\) corresponds to a word w in the vocabulary. We define \(\beta _w\) as the value of the corresponding coefficient in the logistic regression classifier. The magnitude of \(\beta _w\) is a measure of the importance of w to the occupation classification, while the sign (positive or negative) of \(\beta _w\) indicates whether w is correlated or anti-correlated with the positive class of the classifier.

For the WE representation, we compute the weight of each word as

$$\begin{aligned} \beta _w = \frac{e_w\cdot W_c}{|e_w||W_c|}, \end{aligned}$$

i.e. the cosine similarity between each word’s fastText word embedding \(e_w\) and the coefficient weight vector \(W_c\) of the WE-representation classifier. Like in the BOW representation, the magnitude of \(\beta _w\) quantifies the word’s importance, while the sign indicates the direction of the association.

If a word w has positive/negative weight for classifier \(Y_c\), then adding w to a biography x increases/decreases the predicted probability \(Y_c(x)\) respectively.
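To make these definitions concrete, the sketch below shows how \(\beta _w\) could be computed for both representations with scikit-learn and NumPy. The names bow_clf, vectorizer, we_clf, and embeddings are hypothetical stand-ins for the fitted logistic regression classifiers, the BOW vectorizer, and the fastText word vectors, none of which are reproduced here.

```python
import numpy as np

def bow_word_weights(bow_clf, vectorizer):
    """BOW: beta_w is the logistic regression coefficient of word w.

    bow_clf is assumed to be a fitted binary sklearn LogisticRegression,
    so bow_clf.coef_ has shape (1, vocabulary_size); vectorizer maps
    words to feature indices.
    """
    vocabulary = vectorizer.get_feature_names_out()
    return dict(zip(vocabulary, bow_clf.coef_[0]))

def we_word_weights(we_clf, embeddings):
    """WE: beta_w is the cosine similarity between the fastText
    embedding e_w of word w and the classifier coefficient vector W_c.
    embeddings is assumed to map each word to its embedding vector.
    """
    W_c = we_clf.coef_[0]
    W_c_norm = np.linalg.norm(W_c)
    return {
        w: float(e_w @ W_c) / (np.linalg.norm(e_w) * W_c_norm)
        for w, e_w in embeddings.items()
    }
```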

Let \(\beta _w(Y_c)\) be the weights for approach \(Y_c\). We examine the words whose weights \(\beta _w\) satisfy

  1. \(|\beta _w(Y_c)| > T\),

  2. \(|\beta _w(G)| > T\),

  3. \(|\beta _w(Y_c)| > T'\cdot |\beta _w(Y_c')|\),

where \(T, T'\) are significance thresholds and \(Y_c, Y_c'\) are two different occupation classification approaches.

Words that satisfy these conditions are not only associated with either masculinity or femininity but are also weighted more highly in approach \(Y_c\) than in \(Y_c'\). Thus, including these gendered words in a biography influences \(Y_c\)’s classification more strongly than that of \(Y_c'\), which suggests that they may contribute more strongly to \(\rho ({\textbf {p}}_C, {\textbf {r}}_C)\) in one approach than in the other. For example, we examined these words for the occupations of surgeon, software engineer, composer, nurse, dietitian, and yoga teacher, which are the six most gender-imbalanced occupations, with \(Y_c = \) {BOW, post-processing (PO)}, \(Y_c' =\) {BOW, decoupled (DE)}, \(T = 0.5\) and \(T' = 0.7\). The words that satisfy these conditions are “miss”, “mom”, “wife”, “mother”, and “husband.” Conversely, with \(Y_c = \) {BOW, DE} and \(Y_c' =\) {BOW, PO}, the words are “girls”, “women”, “gender”, “loves”, “mother”, “romance”, “daughter”, “sister”, and “female.”
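Word lists of this kind can be reproduced by filtering on conditions (1)–(3). The sketch below assumes the \(\beta _w\) dictionaries defined above, here given the hypothetical names weights_yc, weights_yc_prime, and weights_gender, together with the thresholds \(T = 0.5\) and \(T' = 0.7\) used in the example.

```python
def salient_gendered_words(weights_yc, weights_yc_prime, weights_gender,
                           T=0.5, T_prime=0.7):
    """Return words that (1) matter to approach Y_c, (2) are strongly
    gendered according to the gender classifier G, and (3) are weighted
    more heavily in Y_c than in the comparison approach Y_c'."""
    selected = []
    for w, beta in weights_yc.items():
        if w not in weights_gender or w not in weights_yc_prime:
            continue
        if (abs(beta) > T
                and abs(weights_gender[w]) > T
                and abs(beta) > T_prime * abs(weights_yc_prime[w])):
            selected.append(w)
    return sorted(selected)
```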

These gendered words illustrate the multiplicity of gender present in the biographies beyond categorical labels, which standard group fairness interventions do not consider.

Our analysis is limited by the fact that we only consider the individual influence of each word conditioned on the remaining words, while the joint influence of two or more words may also be of relevance.

Appendix B: Analysis on nonbinary dataset

Table 3 Correlation \(r^{{\textsc {nb}}}_{{\textsc {professor}}}\) (first three columns) and \(r_{{\textsc {professor}}}\) (last three columns) across the pre-processing (pre-proc), post-processing (post-proc), and decoupled approaches

We aim to consider how algorithmic fairness approaches affect nonbinary individuals, who are overlooked by group fairness approaches (Keyes et al. 2021). Using the same regular expression as De-Arteaga et al. (2019) to identify biography-format strings, we collected a dataset of biographies that use nonbinary pronouns such as “they”, “xe”, and “hir.” Since “they” frequently refers to more than one person, we manually inspected a sample of 2000 biographies using “they” to identify those that refer to a single individual. professor is the only occupation title with more than 20 such biographies; the other occupations have too few biographies to perform meaningful statistical analysis. We computed \(r^{{\textsc {nb}}}_{{\textsc {professor}}}\), which is analogous to \(r_{{\textsc {professor}}}\), the measure of SNoB for an individual occupation classifier introduced in Sect. 4. While \(r_{{\textsc {professor}}}\) is Spearman’s correlation computed across the biographies in \(S_c\), \(r^{{\textsc {nb}}}_{{\textsc {professor}}}\) is the correlation across the nonbinary biographies in the profession. The results are reported in Table 3. We find that \(r^{{\textsc {nb}}}_{{\textsc {professor}}}\) is positive across different approaches. However, the associated p values are large \((>0.1)\), so these associations are difficult to interpret. This is likely due to the small sample size: while \(r_{{\textsc {professor}}}\) is computed across the 10,677 professor biographies that use “she” pronouns, \(r^{{\textsc {nb}}}_{{\textsc {professor}}}\) is computed across only 21 biographies.
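For reference, the correlation and its p value can be computed with SciPy’s spearmanr, as in Note 4. The sketch below uses randomly generated placeholder arrays in place of the occupation classifier’s predicted probabilities and the inferred gender-norm conformity scores, which are not reproduced here.

```python
import numpy as np
from scipy.stats import spearmanr

rng = np.random.default_rng(0)
n = 21  # number of nonbinary professor biographies in the sample

# Placeholders: in the analysis these would be the classifier's predicted
# probabilities and the gender-norm conformity scores, one per biography.
pred_probs = rng.random(n)
norm_conformity = rng.random(n)

r_nb, p_value = spearmanr(pred_probs, norm_conformity)
print(f"r_nb_professor = {r_nb:.3f}, two-sided p = {p_value:.3f}")
```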

Appendix C: Word weights

In Fig. 6, we plot the weight of each word in the BOW vocabulary in the occupation classifiers and gender classifiers. These weights illuminate some of the mechanisms behind the predictions. Ideally, without SNoB, every point would have a small-magnitude weight in either the occupation or the gender classifier, i.e., lie on either the \(x\)- or \(y\)-axis of Fig. 6. We observe that in the DE approach, words are closer to the \(y\)-axis than in the post-processing approach. This corresponds to the smaller value of \(\rho ({\textbf {p}}_C, {\textbf {r}}_C)\) exhibited by the decoupled approach compared to the post-processing one in Table 2. Note that the post-processed classifier is trained on all of the biographies, while the decoupled classifier is trained only on biographies that use the same pronoun.
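A plot in the style of Fig. 6 can be generated from the \(\beta _w\) dictionaries of Appendix A. The following is a minimal matplotlib sketch, with occ_weights and gender_weights as hypothetical placeholder names for those dictionaries.

```python
import matplotlib.pyplot as plt

def plot_word_weights(occ_weights, gender_weights):
    """Scatter each word's weight in the occupation classifier (x-axis)
    against its weight in the gender classifier (y-axis)."""
    words = sorted(set(occ_weights) & set(gender_weights))
    xs = [occ_weights[w] for w in words]
    ys = [gender_weights[w] for w in words]
    plt.scatter(xs, ys, s=5, alpha=0.5)
    plt.axhline(0.0, linewidth=0.5, color="gray")
    plt.axvline(0.0, linewidth=0.5, color="gray")
    plt.xlabel("weight in occupation classifier")
    plt.ylabel("weight in gender classifier")
    plt.tight_layout()
    plt.show()
```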

Fig. 6

Words’ weights in the occupation and gender classifiers for different approaches in the surgeon occupation. Each point represents a word; its x- and y-positions represent its weight in \(\hat{Y}_c\) and \(G\), respectively. Each point is colored based on its quadrant in the post-processing approach. Many points are closer to the \(y\)-axis in the decoupled approach

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.


About this article


Cite this article

Cheng, M., De-Arteaga, M., Mackey, L. et al. Social norm bias: residual harms of fairness-aware algorithms. Data Min Knowl Disc 37, 1858–1884 (2023). https://doi.org/10.1007/s10618-022-00910-8



  • DOI: https://doi.org/10.1007/s10618-022-00910-8
