AI Bias: How AI Systems Learn to Be Unfair

Intermediate 🕐 25 min Lesson 2 of 2
What you'll learn
  • Define AI bias and identify the three main sources according to NIST: computational/statistical bias, human/labeling bias, and systemic/historical bias
  • Describe three documented real-world cases of AI bias causing harm — COMPAS, Amazon's hiring tool, and the Gender Shades facial recognition study — and explain how training data propagates unfairness into deployed systems
  • Explain why bias cannot be fully eliminated through technical means alone, including why competing definitions of fairness mathematically conflict with each other when group base rates differ

The Hiring Tool That Learned to Prefer Men

Starting in 2014, Amazon built, and later quietly scrapped, an internal AI tool designed to automatically rank job candidates. The system was trained on ten years of résumés previously submitted to Amazon, a pool that, like the broader technology industry, was overwhelmingly male.

The AI did not know that gender was irrelevant to job performance. It just learned what "successful Amazon engineer résumés" looked like, and they looked like men's résumés. The system penalized résumés containing the word "women's" — as in "women's chess club" or "women's college." It downgraded graduates from all-women's colleges. It favored action verbs that appeared more often in male-written résumés.

Amazon attempted to patch the bias out of the system but ultimately concluded it could not be confident the tool was gender-neutral in all circumstances. According to Reuters, which reported the story in October 2018, the team behind the project had been disbanded by early 2017.

No one at Amazon intended to build a sexist hiring tool. The engineers trained it to find great candidates. But the AI learned patterns from biased historical data and reproduced those patterns at scale. That is what AI bias looks like in practice.

What Is AI Bias?

AI bias occurs when an AI system produces results that are systematically skewed or unfair because of flawed assumptions in the data, labeling, training, or deployment process. The bias is usually not intentional; it typically results from the AI learning from imperfect human data and imperfect human decisions.

NIST, the U.S. National Institute of Standards and Technology, identifies three main sources of AI bias in its guidance document on identifying and managing bias in AI (Special Publication 1270):

1. Computational and Statistical Bias

The training dataset fails to accurately represent the population the AI will be applied to. If 90% of your training photos show light-skinned faces, the model will tend to be more accurate for light-skinned faces, not because anyone designed it that way, but because it had more examples to learn from. This is also called representation bias or selection bias.
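One simple way to spot this kind of skew is to compare each group's share of the training set with its share of the population the system will serve. The sketch below is illustrative only: the function name `representation_gap`, the group labels, and the 50/50 target population are assumptions made for the example, not figures from any real dataset.

```python
from collections import Counter

def representation_gap(train_groups, population_shares):
    """Share of each group in the training set minus its share in the target population."""
    counts = Counter(train_groups)
    total = len(train_groups)
    return {
        group: counts.get(group, 0) / total - share
        for group, share in population_shares.items()
    }

# Hypothetical data: 90 light-skinned and 10 dark-skinned training photos,
# checked against an assumed 50/50 target population.
train_groups = ["light"] * 90 + ["dark"] * 10
population_shares = {"light": 0.5, "dark": 0.5}

gaps = representation_gap(train_groups, population_shares)
for group, gap in gaps.items():
    # A positive gap means over-represented, a negative gap under-represented.
    print(f"{group}: {gap:+.2f}")
```

A large negative gap does not prove the model will perform worse for that group, but it is a reasonable signal that per-group accuracy should be measured before deployment.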

2. Human and Labeling Bias

People who label data for AI training inject their own subjective perceptions and cultural stereotypes. If annotators are asked to label images as "angry" or "calm," they may unconsciously rate darker-skinned people as angrier — regardless of facial expression. The AI learns from those labels. What began as individual human bias becomes algorithmic bias applied at scale.

3. Systemic and Historical Bias

Historical data accurately reflects a past that was unfair. Training an AI on historical hiring decisions, lending records, or criminal sentencing data means training it on the outcomes of decades of structural discrimination. The AI does not know the data is tainted; it just finds the patterns and reproduces them. Amazon's hiring tool learned from a decade of résumés drawn from a male-dominated applicant pool, and it faithfully reproduced that skew.

NIST researcher Reva Schwartz states: "Context is everything. AI systems do not operate in isolation." NIST emphasizes that bias cannot be solved through purely technical means — it requires multidisciplinary teams that include the communities affected by AI decisions, not just engineers.

Three Documented Cases

COMPAS: Criminal Sentencing (2016)

COMPAS (Correctional Offender Management Profiling for Alternative Sanctions) was a risk assessment tool used by courts across the United States to predict whether a criminal defendant would re-offend. Scores influenced sentencing and parole decisions across multiple states.

In 2016, ProPublica analyzed more than 10,000 criminal cases in Broward County, Florida, and found significant racial disparities in the tool's predictions:

  • Black defendants who did not re-offend were nearly twice as likely to be incorrectly labeled "high risk" as white defendants in the same situation (45% vs. 23%)
  • White defendants who did re-offend were mislabeled "low risk" nearly twice as often as Black defendants (48% vs. 28%)
  • When controlling for other factors, Black defendants were 45% more likely to receive higher risk scores
  • Overall predictive accuracy: 61%
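The first two findings are comparisons of false-positive and false-negative rates between groups. The sketch below shows how such rates are computed from labeled outcomes. The counts are made up for illustration and are not ProPublica's data; the helper names `make_records` and `error_rates` are likewise hypothetical.

```python
def make_records(tp, fp, tn, fn):
    """Build (predicted_high_risk, reoffended) pairs from confusion-matrix counts."""
    return ([(True, True)] * tp + [(True, False)] * fp
            + [(False, False)] * tn + [(False, True)] * fn)

def error_rates(records):
    """Compute false-positive and false-negative rates for one group."""
    fp = sum(1 for pred, actual in records if pred and not actual)
    tn = sum(1 for pred, actual in records if not pred and not actual)
    fn = sum(1 for pred, actual in records if not pred and actual)
    tp = sum(1 for pred, actual in records if pred and actual)
    return {
        # Share of people who did NOT re-offend but were labeled high risk.
        "false_positive_rate": fp / (fp + tn),
        # Share of people who DID re-offend but were labeled low risk.
        "false_negative_rate": fn / (fn + tp),
    }

# Hypothetical counts for two groups (not real COMPAS figures).
group_a = make_records(tp=35, fp=20, tn=30, fn=15)
group_b = make_records(tp=20, fp=12, tn=48, fn=20)

rates_a = error_rates(group_a)
rates_b = error_rates(group_b)
print("Group A:", rates_a)
print("Group B:", rates_b)
```

In this toy example group A has the higher false-positive rate and group B the higher false-negative rate, which is the same shape of disparity the bullets above describe.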

The tool's developer disputed these findings on technical grounds — and the ensuing argument became one of the most important debates in AI fairness (covered in the section below on competing fairness definitions). The Wisconsin Supreme Court ruled in 2016 that COMPAS could be used in sentencing but required explicit caveats about its limitations.

Gender Shades: Facial Recognition (2018)

Researchers Joy Buolamwini of the MIT Media Lab and Timnit Gebru audited commercial facial-analysis systems from three major technology companies. Their paper, "Gender Shades," was presented at the 2018 ACM Conference on Fairness, Accountability, and Transparency. The findings were stark:

  • Error rate for light-skinned men: 0.8%
  • Error rate for dark-skinned women: up to 34.7%

That is a 43-fold difference in error rate between the best-served and worst-served demographic group — not a rounding error. All three commercial systems failed significantly more often on darker-skinned faces and on women.
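The arithmetic behind that ratio is easy to check, and the same few lines show why a single overall error rate can hide a gap like this. The two error rates come from the figures above; the 90/10 benchmark mix is a hypothetical assumption chosen only to illustrate the point.

```python
# Error rates (in percent) as reported above.
error_rate_pct = {"light-skinned men": 0.8, "dark-skinned women": 34.7}

best = min(error_rate_pct.values())
worst = max(error_rate_pct.values())
ratio = worst / best
print(f"Worst-to-best error ratio: {ratio:.1f}x")

# Hypothetical benchmark mix (an assumption, not from the study):
# 90% light-skinned men, 10% dark-skinned women.
mix = {"light-skinned men": 0.9, "dark-skinned women": 0.1}
overall = sum(mix[g] * error_rate_pct[g] for g in mix)
print(f"Overall error rate on that mix: {overall:.2f}%")
```

On a benchmark dominated by the best-served group, the aggregate error rate stays low even though one subgroup is failed about a third of the time, which is why audits report results per subgroup.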

The outcomes were significant. IBM discontinued its facial recognition software in 2020. Amazon announced a moratorium on police use of its Rekognition tool. Microsoft withheld its system from law enforcement agencies. Joy Buolamwini testified before the U.S. Congress in 2019. The documentary Coded Bias (Netflix, 2021) brought this research to mainstream audiences.

Pulse Oximeters: Medical Devices (2020–Present)

Pulse oximeters are medical devices that measure blood oxygen levels through the skin. They are used in hospitals, emergency care, and at home. During the COVID-19 pandemic, a research letter published in the New England Journal of Medicine (2020) found that these devices systematically overestimated oxygen levels in patients with darker skin tones.

When Black patients had readings in what appeared to be the safe range (92–96% oxygen saturation), their actual oxygen saturation was below the clinically dangerous threshold of 88% approximately 12% of the time, compared to 4% for white patients. Errors of this kind can delay or withhold treatment. A 2026 Stanford study published in PNAS found that pulse oximeter racial bias is still causing gaps in follow-up care for Black patients.

This case illustrates that algorithmic bias is not limited to software. Pulse oximeters were calibrated primarily on data from lighter-skinned patients. The bias was in the measurement itself — and that measurement fed into clinical decisions at scale.

Why Bias Cannot Be Fully Eliminated: The Impossibility Problem

Here is one of the most important — and counterintuitive — findings in AI fairness research: even with perfect data, you cannot satisfy all common definitions of fairness simultaneously. This is not a flaw in any particular algorithm. It is a mathematical property of the definitions themselves.

Two independent research teams proved this formally in 2016 and 2017: when two groups have different base rates of an outcome — say, different historical rates of re-offending — no algorithm can simultaneously satisfy all of the following fairness definitions:

| Fairness Definition | What It Means | The Conflict |
|---|---|---|
| Demographic Parity | Positive outcomes (loan approved, parole granted) happen at equal rates across all groups | If groups differ in relevant qualifications, equal approval rates may mean approving less-qualified candidates from one group |
| Equal Opportunity | Among truly qualified candidates, each group has an equal chance of being recognized as qualified | Requires agreement on who "truly qualifies," which may itself be contested and shaped by historical bias |
| Equalized Odds | Equal false-positive rates AND equal false-negative rates across all groups | When base rates differ between groups, an imperfect predictor cannot equalize both error rates and also stay calibrated |
| Calibration | A score of 70% risk means 70% of the people given that score actually have the predicted outcome, and this holds equally for all groups | When group base rates differ, achieving calibration means accepting different error rates across groups by design |

This is exactly what happened with the COMPAS dispute. ProPublica argued Black defendants faced higher false-positive rates — measuring fairness as equalized odds. The tool's developer argued the score was equally accurate for both groups — measuring fairness as calibration. Both analyses were mathematically correct. They were measuring different things. Choosing between these definitions requires value judgments that engineers alone cannot make.
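The conflict can be checked numerically with an identity from the fairness literature (Chouldechova, 2017) that ties a group's false-positive rate to its base rate, the precision of the "high risk" flag (PPV), and the false-negative rate. The sketch below holds PPV and the false-negative rate equal across two groups and lets only the base rate differ. Equal PPV is strictly "predictive parity," used here as a stand-in for calibration of a binary flag. All numbers are hypothetical, and `implied_fpr` is an illustrative name.

```python
def implied_fpr(base_rate, ppv, fnr):
    """False-positive rate forced by a given base rate, PPV, and false-negative rate.

    Identity: FPR = p / (1 - p) * (1 - PPV) / PPV * (1 - FNR)
    """
    return base_rate / (1 - base_rate) * (1 - ppv) / ppv * (1 - fnr)

# Hypothetical groups: same PPV and same false-negative rate,
# but different base rates of the predicted outcome.
fpr_high = implied_fpr(base_rate=0.5, ppv=0.6, fnr=0.3)
fpr_low = implied_fpr(base_rate=0.3, ppv=0.6, fnr=0.3)

print(f"Group with 50% base rate: FPR = {fpr_high:.3f}")
print(f"Group with 30% base rate: FPR = {fpr_low:.3f}")
```

With these inputs the higher-base-rate group is forced to a false-positive rate more than twice that of the other group. Equalizing the false-positive rates would require giving up equal PPV or equal false-negative rates, which is the trade-off at the center of the dispute.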

The implication is significant: bias mitigation is not a problem that engineering can fully solve. Deciding which definition of fairness matters most in a given context — criminal sentencing, credit scoring, medical triage — is a political, ethical, and social decision, not a technical one.

What Regulators Are Doing

AI bias is increasingly subject to legal requirements in major jurisdictions:

  • EU AI Act (mandatory for high-risk systems from August 2026). Article 10 requires providers of high-risk AI to detect, prevent, and mitigate possible biases, and to ensure training data is "sufficiently representative." High-risk categories that explicitly cover AI bias risks include employment and HR decisions, education, credit, law enforcement, and migration — all areas where bias has caused documented harm.
  • NIST AI Risk Management Framework (voluntary U.S. framework, published January 2023). Identifies "harmful bias managed" as a core characteristic of trustworthy AI. Recommends multidisciplinary teams — including affected communities, not just engineers — for bias detection and mitigation throughout the AI lifecycle.
  • New York City Local Law 144 (effective 2023). Requires employers in New York City that use automated employment decision tools to conduct annual independent bias audits and disclose the results publicly. Also requires employers to notify job candidates and employees before AI tools are used in hiring or promotion decisions. This is the first municipal law of its kind in the United States.
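Bias audits of hiring tools typically center on selection rates and impact ratios: each group's selection rate divided by the rate of the most-selected group. The sketch below is a minimal illustration of that calculation. The group names and counts are hypothetical, and the 0.8 cutoff is the "four-fifths" rule of thumb from U.S. employment-selection guidance, used here as an assumption rather than a threshold stated by any of the laws above.

```python
def impact_ratios(selected, applicants):
    """Selection rate of each group divided by the highest group's selection rate."""
    rates = {group: selected[group] / applicants[group] for group in applicants}
    top_rate = max(rates.values())
    return {group: rate / top_rate for group, rate in rates.items()}

# Hypothetical audit data for an automated screening tool.
applicants = {"group_a": 200, "group_b": 150}
selected = {"group_a": 60, "group_b": 27}

ratios = impact_ratios(selected, applicants)
for group, ratio in ratios.items():
    # Assumed rule of thumb: ratios below 0.8 warrant a closer look.
    flag = "review" if ratio < 0.8 else "ok"
    print(f"{group}: impact ratio {ratio:.2f} ({flag})")
```

An impact ratio well below 1.0 does not by itself establish discrimination, but it is the kind of pattern evidence an audit is meant to surface.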

What You Can Do

  • Know when AI is making decisions about you. In the EU, the AI Act requires disclosure when high-risk AI is used in consequential decisions. In the U.S., rights vary by jurisdiction. You can ask whether automated tools were involved in a decision about your loan, job application, healthcare, or insurance — and request human review.
  • In the EU, you have the right to contest automated decisions. GDPR Article 22 restricts decisions based solely on automated processing that have legal or similarly significant effects on you, and it lets you request human intervention and contest the outcome.
  • Document and report disparate outcomes. Pattern evidence is how bias investigations begin. If you and others from your demographic group receive consistently different treatment from an AI system, report it. Regulators — including the FTC and CFPB in the U.S. and data protection authorities in the EU — and civil rights organizations such as the Algorithmic Justice League collect reports that drive investigations and legal action.
  • Support algorithmic transparency. Third-party audits, training data disclosure, and public model documentation help surface bias before deployment. Laws like NYC Local Law 144 exist because advocates pushed for them. Supporting similar legislation in your jurisdiction is one of the most effective levers available to individuals.

The next lesson examines a closely related problem: privacy. When AI systems train on vast amounts of personal data, collect behavioral signals at scale, and power surveillance infrastructure, who controls that data — and what happens when it is misused?

Key takeaways
  • AI bias occurs when a system consistently produces prejudiced results due to flawed assumptions in data collection, labeling, training, or deployment. It is usually not intentional malice, but the effect can be the same
  • The three main sources are: computational bias (unrepresentative training data), human bias (annotators injecting stereotypes into labels), and systemic bias (historical inequities baked into training data)
  • Documented cases include: COMPAS (Black defendants nearly twice as likely to be wrongly flagged as high risk), Amazon's hiring tool (penalized women's résumés after learning from male-dominated historical data), and Gender Shades (facial recognition error rates up to 34.7% for dark-skinned women vs. 0.8% for light-skinned men)
  • Competing fairness definitions — demographic parity, equal opportunity, equalized odds, calibration — mathematically cannot all be satisfied simultaneously when group base rates differ; choosing between them is a values decision, not a technical one
  • The EU AI Act (high-risk obligations from 2026) and NYC Local Law 144 (effective 2023) impose binding bias-related duties, while the NIST AI RMF (voluntary U.S. framework since 2023) recommends bias detection and mitigation; regulatory pressure on AI bias is accelerating