Keynote at ICWSM

I’m deeply honored that the organizers of the 10th ICWSM (the AAAI Conference on Weblogs and Social Media) have invited me to kick off the conference with an opening keynote on May 18. Here’s what I’ll be talking about.

Algorithmic Fairness: From social good to mathematical framework

Machine learning has taken over our world, in more ways than we realize. You might get book recommendations, or an efficient route to your destination, or even a winning strategy for a game of Go. But you might also be admitted to college, granted a loan, or hired for a job based on algorithmically enhanced decision-making. We believe machines are neutral arbiters: cold, calculating entities that always make the right decision, that can see patterns that our human minds can’t or won’t. But are they? Or is decision-making-by-algorithm a way to amplify, extend and make inscrutable the biases and discrimination that are prevalent in society?

To answer these questions, we need to go back — all the way to the original ideas of justice and fairness in society. We also need to go forward — towards a mathematical framework for talking about justice and fairness in machine learning. I will talk about the growing landscape of research in algorithmic fairness: how we can reason systematically about biases in algorithms, and how we can make our algorithms fair(er).

“Investigating the algorithms that govern our lives”

This is the title of a new Columbia Journalism Review article by Chava Gourarie on the role of journalists in explaining the power of algorithms. She goes on to say:

But when it comes to algorithms that can compute what the human mind can’t, that won’t be enough. Journalists who want to report on algorithms must expand their literacy into the areas of computing and data, in order to be equipped to deal with the ever-more-complex algorithms governing our lives.

I’m quoted in this article, as are other researchers, and Moritz Hardt’s Medium article on how big data is unfair is mentioned as well.

As they say, read the rest 🙂

“Racist algorithms” and learned helplessness

Twitter user Dan Hirschman posts another example of search results that are — let’s just say — questionable:

Aside from the problematic search results (and again, this is an image search), what’s interesting about this is the predictable way in which the discussion unfolds.

There’s a standard pattern of discourse that I see when talking about bias in algorithms (I’ll interject commentary in between the elements).

It starts with the example:

Which is usually quickly followed by the retort:

It’s true that if we interpret “racist algorithm” as “algorithm that cackles evilly as it intentionally does racist things”, then an algorithm is not racist. But the usage here is in a Turing-test sense: the algorithm does something that would be considered racist if a human did it. At least in the US, it is not necessary (even for humans) to show racist intent in order for an action to be deemed discriminatory; this is essentially the difference between disparate treatment and disparate impact.

Unlike in France.
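To make the disparate impact half of that distinction concrete, here’s a minimal sketch in Python of the EEOC’s “four-fifths rule”: compare selection rates across groups and flag a ratio below 0.8. The numbers (and function names) are made up for illustration.

```python
# Sketch of the "four-fifths rule" heuristic for flagging disparate impact.
# All figures below are hypothetical.

def selection_rate(selected, total):
    """Fraction of a group's applicants who receive the favorable outcome."""
    return selected / total

# Hypothetical hiring outcomes: 50 of 200 applicants from group A hired,
# versus 90 of 200 from group B.
rate_a = selection_rate(50, 200)   # 0.25
rate_b = selection_rate(90, 200)   # 0.45

ratio = rate_a / rate_b
print(f"Disparate impact ratio: {ratio:.2f}")  # 0.56, well below the 0.8 threshold
```

Notice that intent never appears in the computation; only outcomes do. That’s exactly the contrast with disparate treatment.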

The retort is often followed by “algorithms don’t discriminate, people discriminate”:

and also “garbage in, garbage out”:

This is, strictly speaking, correct. One important source of bias in an algorithm is the training data it’s fed, and that data is of course provided by humans. However, this still points to a problem in how the algorithm is used: it needs better training examples and a better learning procedure. We can’t absolve ourselves of responsibility here, and we can’t absolve the algorithm either.
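To see how faithfully a model reproduces bias baked into its labels, here’s a toy sketch using synthetic data and scikit-learn; everything in it (the data, the 0.8 penalty, the variable names) is invented for illustration.

```python
# "Garbage in, garbage out" in miniature: a model trained on historically
# biased labels learns and reproduces that bias. Purely synthetic data.

import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
n = 10_000

qualification = rng.normal(size=n)      # the "true" signal
group = rng.integers(0, 2, size=n)      # group membership, independent of it

# Biased historical labels: group 1 needed a higher qualification to be hired.
hired = (qualification - 0.8 * group > 0).astype(int)

X = np.column_stack([qualification, group])
model = LogisticRegression(max_iter=1000).fit(X, hired)

# Two equally qualified applicants who differ only in group membership:
applicants = np.array([[0.5, 0], [0.5, 1]])
print(model.predict_proba(applicants)[:, 1])
# The group-1 applicant gets a much lower predicted hiring probability than
# the group-0 applicant: the model learned the bias in the labels, exactly
# as it was trained to.
```

Nothing in the learning procedure misbehaved; the algorithm did its job perfectly, which is precisely why the responsibility stays with the people who chose the data.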

But eventually, we always end up with “data is truth”:

There is a learned helplessness in these responses. The sentiment is, “yes there are problems, but why blame the helpless algorithm, and in any case people are at fault, and plus the world is racist, and you’re trying to be politically correct, and data never lies, and blah blah blah”.

Anything to actually avoid engaging with the issues.

Whenever I’ve had to talk about bias in algorithms, I’ve tried to be careful to emphasize that it’s not that we shouldn’t use algorithms in search, recommendation and decision making. It’s that we often just don’t know how they’re making their decisions to present answers, make recommendations or arrive at conclusions, and it’s this lack of transparency that’s worrisome. Remember, algorithms aren’t just code.

What’s also worrisome is the amplifier effect. Even if “all an algorithm is doing” is reflecting and transmitting biases inherent in society, it’s also amplifying and perpetuating them on a much larger scale than your friendly neighborhood racist. And that’s the bigger issue. As Zeynep Tufekci points out:

That is to say, even if the algorithm isn’t creating bias, it’s creating a feedback loop that has powerful perception effects. Try doing an image search for ‘person’ and look carefully at the results you get.
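To see how a feedback loop can amplify a skew rather than merely reflect it, here’s a toy simulation; the numbers (a 51/49 starting split, a 90% chance of clicking the top result) are invented, and no real search engine is this simple.

```python
# The amplifier effect in miniature: a ranker that puts the historically
# more-clicked result on top turns a small skew into a lopsided one.

import random

random.seed(0)

clicks = {"result_a": 51, "result_b": 49}  # a barely noticeable difference
P_CLICK_TOP = 0.9                          # users mostly click whatever is on top

for _ in range(10_000):
    # The algorithm ranks by past clicks...
    top, bottom = sorted(clicks, key=clicks.get, reverse=True)
    # ...and position, in turn, drives the next click.
    winner = top if random.random() < P_CLICK_TOP else bottom
    clicks[winner] += 1

total = sum(clicks.values())
for result, c in sorted(clicks.items()):
    print(f"{result}: {c / total:.1%} of clicks")
# result_a ends up with roughly 90% of all clicks. The ranking didn't create
# the initial skew, but the feedback loop made it far larger and self-sustaining.
```

The loop never looks at what the results are, only at what was clicked before, which is the point: the bias doesn’t have to be in the code to come out of it.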