Diagram: Bias in Machine Learning

Understand the stages of machine learning where bias can, and often will, contribute to harm.

Diagram outlining where bias occurs in the machine learning process.

In July I got a tip from Gino Almondo introducing me to the paper A Framework for Understanding Sources of Harm throughout the Machine Learning Life Cycle (Suresh & Guttag, 2021). It contains a very useful diagram outlining where and how bias will affect the outcome of the machine learning process. I could not help but apply my own visual interpretation to it for use in teaching. Feel free to use it to understand why and how so-called AI tools can, and often will, contribute to harm.

I contacted the authors, Harini Suresh and John Guttag, and have their approval for my reimagination of their chart. For an in-depth understanding of the different biases I recommend diving into their paper. The following is a brief overview of what the diagram addresses.

🗒️
For my own walkthrough of the real harms that AI can lead to, I refer you to The Elements of AI Ethics. For a more whimsical introduction to the same topic, I have also produced the diagram If a Hammer was Like AI.

Overview of bias in machine learning

No system is stronger than its weakest link. Suresh & Guttag explain where the weak links in machine learning are, looking at both the data generation process and the process for model building and implementation.

a) Data Generation

Data generation biases, spanning the process from data generation to population definition and sampling, to measurement, to preprocessing (train/test split), to training data and test data (including benchmarks).

The points of bias are:

  • Historical bias. Harmful outcomes arise when people are mistreated because of prejudice contained within the dataset. Note that harm can and will happen even if the data is an accurate representation of the world. This is because real-world prejudice, both current and historical, becomes embedded in software that works faster (essentially accelerating the bias), reaches further and is harder to call out and object to.
👩🏾‍⚖️
Case study: Bloomberg News performed an investigation into Stable Diffusion and found that it exacerbated prejudice in a significant way. For example, women made up only about 3% of the images generated for the keyword “judge”, when in reality 34% of US judges are women. Across OECD countries more than 50% of judges are women. Source: Humans are biased, AI is even worse, Bloomberg 2023.
  • Representation bias. When data fails to represent significant parts of the population, a model may fail to generalize well for the people it is used by or on. The target population in the dataset may not match the use population, or may contain underrepresented groups – where there is less data to learn from. The sampling method may also be limited or uneven (example: less available health data for an uncommon, or rarely researched, condition will lead to poorer performance for that condition). A small sketch of such a representation check follows this list.
  • Measurement bias. When models are used for prediction and understanding, features and labels must be selected, collected and computed for the model to draw conclusions from. These labels may be an oversimplification, or the method and accuracy can vary across groups (examples: creditworthiness reduced to a credit score, crime rates appearing high in an area simply because more law enforcement presence leads to more discovered crime and in turn more presence, or pain assessment varying across groups).
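
To make the representation point concrete, here is a minimal sketch in Python. It is my own illustration, not from Suresh & Guttag's paper; the subgroup labels, shares and threshold are hypothetical. The idea is simply to compare subgroup shares in a dataset against the population the system is meant to serve, and flag groups that fall far short:

```python
from collections import Counter

def representation_report(samples, reference_shares, tolerance=0.5):
    """Compare subgroup shares in a dataset against reference population shares.

    samples: iterable of subgroup labels, one per record.
    reference_shares: dict mapping subgroup label -> expected share in the
        target population (values should sum to roughly 1.0).
    tolerance: flag a subgroup when its observed share falls below
        tolerance * expected share. Threshold is an arbitrary illustration.
    """
    counts = Counter(samples)
    total = sum(counts.values())
    report = {}
    for group, expected in reference_shares.items():
        observed = counts.get(group, 0) / total if total else 0.0
        report[group] = {
            "observed_share": round(observed, 3),
            "expected_share": expected,
            "underrepresented": observed < tolerance * expected,
        }
    return report

# Toy example loosely mirroring the judge statistic cited above:
# roughly 3% women among generated "judge" images vs. 34% of US judges.
labels = ["man"] * 97 + ["woman"] * 3
print(representation_report(labels, {"woman": 0.34, "man": 0.66}))
```

A check like this only surfaces the groups you already thought to count; it says nothing about groups missing from your labels, which is part of why representation bias is so persistent.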

b) Model Building and Implementation

Model building and implementation biases, spanning the process from model definition, model learning and model evaluation, to running the model and producing output, to post-processing, integration into systems and human interpretation.

The points of bias are:

  • Evaluation bias. A model is optimized on its training data, but its quality is often measured on benchmarks. Hence, a misrepresentative benchmark encourages the development and deployment of models that perform well only on the subset of the data represented by the benchmark. The benchmark can itself suffer from historical, representation or measurement bias (example: images of dark-skinned women comprise only 7.4% and 4.4% of the common benchmark datasets Adience and IJB-A respectively, so benchmarking on them fails to discover and penalize underperformance on this part of the population [Buolamwini and Gebru, 2018]). A small sketch of per-group evaluation follows this list.
  • Learning bias. Differences in performance occur for different types of data because of modeling decisions made to influence, for example, accuracy or objectivity. These decisions can erode the influence of underrepresented data, and the model subsequently performs even worse on that data.
  • Aggregation bias. A particular dataset might represent people or groups with different backgrounds, cultures or norms, and any given variable can indicate different things across these groups. A one-size-fits-all model may fail to optimize for any group, or only fit the dominant population (example: models used in hiring processes for sentiment analysis may fail to consider differences in facial expression that can relate to culture or certain disabilities).
  • Deployment bias. Some models end up being used in ways they were never intended. As Suresh & Guttag write: "This often occurs when a system is built and evaluated as if it were fully autonomous, while in reality, it operates in a complicated sociotechnical system moderated by institutional structures and human decision-makers." This is also known as the "framing trap" (Selbst et al., 2019). Such tools can lead to harm through automation or confirmation bias (example: incorrect advice from an AI-based decision support system could impair the performance of radiologists when reading mammograms).
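
As a complement to the evaluation and aggregation points above, the following sketch shows the idea of disaggregated evaluation: reporting performance per subgroup so that a reassuring aggregate score cannot hide underperformance for a smaller group. Again this is my own illustration, not code from the paper, and the group names and numbers are made up:

```python
from collections import defaultdict

def disaggregated_accuracy(y_true, y_pred, groups):
    """Report accuracy per subgroup instead of a single aggregate score.

    y_true, y_pred: lists of labels; groups: list of subgroup identifiers,
    all aligned by index. Group names here are purely illustrative.
    """
    hits = defaultdict(int)
    totals = defaultdict(int)
    for truth, pred, group in zip(y_true, y_pred, groups):
        totals[group] += 1
        hits[group] += int(truth == pred)
    overall = sum(hits.values()) / sum(totals.values())
    per_group = {g: hits[g] / totals[g] for g in totals}
    return overall, per_group

# Toy data: the aggregate score looks middling while one subgroup fails entirely.
y_true = [1, 0, 1, 1, 0, 1, 1, 0]
y_pred = [1, 0, 1, 1, 1, 0, 0, 1]
groups = ["A", "A", "A", "A", "B", "B", "B", "B"]
overall, per_group = disaggregated_accuracy(y_true, y_pred, groups)
print(overall)    # 0.5
print(per_group)  # {'A': 1.0, 'B': 0.0}
```

The same disaggregation applies to benchmarks themselves: if group B barely appears in the benchmark, its poor score never shows up in the headline number.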

How to use the diagram

When building, considering and evaluating systems based on machine learning, part of your responsibility is to understand the ways in which the tool can contribute to harm. Be transparent about these issues, and about the real, resolute work required to uncover and manage them. Sometimes the answer will be that a tool based on machine learning is not the right tool for the job.

References and Further Reading

  • A Framework for Understanding Sources of Harm throughout the Machine Learning Life Cycle (Suresh & Guttag, 2021).
  • Gender Shades: Intersectional Accuracy Disparities in Commercial Gender Classification (Buolamwini & Gebru, 2018), MIT Media Lab. Recent studies demonstrate that machine learning algorithms can discriminate based on classes like race and gender.
  • Fairness and Abstraction in Sociotechnical Systems (Selbst et al., 2019). A key goal of the fair-ML community is to develop machine-learning based systems that, once introduced into a social context, can achieve social and legal outcomes.
  • AI bias may impair radiologist accuracy on mammogram (Radiology). Incorrect advice by an AI-based decision support system could seriously impair the performance of radiologists at every level of expertise when reading mammograms.
  • The Elements of AI Ethics. Let’s talk about harm caused by humans implementing AI.
  • If a hammer was like AI… Computations will “estimate” your aim, tend to miss the nail and push for a different design. Often unnoticeably.
  • Do Artifacts Have Politics? Langdon Winner, Daedalus, Vol. 109, No. 1, Modern Technology: Problem or Opportunity? (Winter 1980), pp. 121–136 (JSTOR).
