White House Report on Algorithmic Fairness

The White House has put out a report on big data and algorithmic fairness (announcement, full report).  From the announcement:

Using case studies on credit lending, employment, higher education, and criminal justice, the report we are releasing today illustrates how big data techniques can be used to detect bias and prevent discrimination. It also demonstrates the risks involved, particularly how technologies can deliberately or inadvertently perpetuate, exacerbate, or mask discrimination.

The table of contents for the report gives a good overview of the issues addressed:

Big Data and Access to Credit
The Problem: Many Americans lack access to affordable credit due to thin or non-existent credit files.
The Big Data Opportunity: Use of big data in lending can increase access to credit for the financially underserved.
The Big Data Challenge: Expanding access to affordable credit while preserving consumer rights that protect against discrimination in credit eligibility decisions.

Big Data and Employment
The Problem: Traditional hiring practices may unnecessarily filter out applicants whose skills match the job opening.
The Big Data Opportunity: Big data can be used to uncover or possibly reduce employment discrimination.
The Big Data Challenge: Promoting fairness, ethics, and mechanisms for mitigating discrimination in employment opportunity.

Big Data and Higher Education
The Problem: Students often face challenges accessing higher education, finding information to help choose the right college, and staying enrolled.
The Big Data Opportunity: Using big data can increase educational opportunities for the students who most need them.
The Big Data Challenge: Administrators must be careful to address the possibility of discrimination in higher education admissions decisions.

Big Data and Criminal Justice
The Problem: In a rapidly evolving world, law enforcement officials are looking for smart ways to use new technologies to increase community safety and trust.
The Big Data Opportunity: Data and algorithms can potentially help law enforcement become more transparent, effective, and efficient.
The Big Data Challenge: The law enforcement community can use new technologies to enhance trust and public safety in the community, especially through measures that promote transparency and accountability and mitigate risks of disparities in treatment and outcomes based on individual characteristics.


NPR: Can Computers Be Racist?

Yes.

As will come as no surprise to readers of this blog, algorithms can make biased decisions. NPR tackles this question in their latest All Tech Considered segment (which I was interviewed for!).

They start by talking to Jacky Alcine, the software engineer who discovered that Google Photos had tagged his friend as an animal:

As Jacky points out: “One could say, ‘Oh, it’s a computer,’ I’m like OK … a computer built by whom? A computer designed by whom? A computer trained by whom?” It’s a short segment, but we go on to talk a bit about how that bias could come about.

What I want to emphasize here is that, while hiring more Black software engineers would likely help, if only by making it more likely that these issues are caught quickly, it is not enough. As Jacky implies, the training data itself is biased. In this case, the bias likely comes from the data including more photos of white people and of animals than of Black people. In other cases, it comes through the labels, which encode past human decisions, racist ones included, that are then purposefully used to guide future decisions.
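To see how the first kind of bias, skewed representation in the training data, plays out, here is a minimal sketch in Python. The features, group sizes, and decision rules are all hypothetical stand-ins for a real image-classification pipeline, chosen only to make the effect visible:

```python
# Minimal sketch of representation bias (all data here is synthetic and
# hypothetical): one group is underrepresented in training, so the model
# learns that group's patterns poorly.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(1)

def sample(n_a, n_b):
    # Group A's label depends on feature 0, group B's on feature 1; a model
    # fit mostly to group A data will largely miss group B's decision rule.
    group = np.concatenate([np.zeros(n_a, dtype=int), np.ones(n_b, dtype=int)])
    x = rng.normal(0, 1, size=(n_a + n_b, 2))
    y = np.where(group == 0, x[:, 0] > 0, x[:, 1] > 0).astype(int)
    return x, y, group

# Training set: group B is badly underrepresented (95% vs. 5%).
X_train, y_train, _ = sample(9_500, 500)
model = LogisticRegression().fit(X_train, y_train)

# Balanced test set: compare error rates per group.
X_test, y_test, g_test = sample(5_000, 5_000)
pred = model.predict(X_test)
for g in (0, 1):
    err = (pred[g_test == g] != y_test[g_test == g]).mean()
    print(f"group {g}: error rate {err:.3f}")
```

Nothing in this pipeline ever mentions group membership, yet the underrepresented group ends up with a far higher error rate, simply because the model had too few examples of it to learn from.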

Consider the automated hiring algorithms now touted by many startups (Jobaline, Hirevue, Gild, …). If an all-white company uses its current employees as training data, i.e., tries to find future employees who are like its current employees, then it is likely to stay an all-white company. That's because the data about its current employees encodes systemic racial bias, such as differences in SAT scores between white and Black test-takers even when controlling for ability. Algorithmic decisions will find and replicate this bias, as the sketch below illustrates.
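To make that feedback loop concrete, here is a minimal sketch in Python. Everything in it is hypothetical: the group labels, the "score" standing in for a biased proxy like the SAT, and the hiring threshold. None of it reflects how any of the named startups actually work.

```python
# Minimal sketch of "find people like our current employees" (all numbers
# and features are hypothetical, and unrelated to any real vendor's product).
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
n = 10_000

# Two groups of applicants with identical ability distributions.
group = rng.integers(0, 2, size=n)        # group 0 or group 1
ability = rng.normal(0, 1, size=n)

# A proxy score (think: SAT) that shortchanges group 1 even at equal ability.
score = ability - 0.8 * group + rng.normal(0, 0.5, size=n)

# "Current employees": past hiring selected on the biased score alone.
hired = (score > 1.0).astype(int)

# Train a model to imitate those past hires. Group is NOT a feature;
# the bias rides in entirely through the proxy score.
model = LogisticRegression().fit(score.reshape(-1, 1), hired)

# Score a fresh applicant pool drawn the same way.
new_group = rng.integers(0, 2, size=n)
new_score = rng.normal(0, 1, size=n) - 0.8 * new_group + rng.normal(0, 0.5, size=n)
picks = model.predict(new_score.reshape(-1, 1))

for g in (0, 1):
    print(f"group {g}: selected at rate {picks[new_group == g].mean():.3f}")
```

Even though group membership is never a feature, the model selects group 1 at a much lower rate, because the biased proxy and the biased past decisions are all it was given to imitate.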

We need to be proactive to keep such biases from influencing algorithmic decisions.