Should software (or at least the algorithms contained therein) be subject to FOIA? Case in MI suggests yes: https://t.co/8pjKcndogN
— Nick Diakopoulos (@ndiakopoulos) March 16, 2016
Nick Diakopoulos studies computational and data journalism, and has long been concerned about algorithmic transparency to aid journalism. In the link above, he points to a case in Michigan where the city of Warren was being sued to reveal the formula they used to calculate water and sewer fees.
Thinking about FOIA (Update: FOIA is the Freedom of Information Act) for algorithms (or software) brings up all kinds of interesting issues, legal and technical:
- Suppose we do require that the software be released. Couldn't it just be obfuscated so that we can't really tell what it's doing, except as a black box?
- Suppose we instead require that the algorithm be released. What if it's a learning algorithm that was trained on some data? Releasing the final trained model might tell us what the algorithm is doing, but not why.
- Does it even make sense to release the training data (as Sorelle suggests)? What happens if the algorithm is constantly learning (like an online learning algorithm)? Would we then need to timestamp the data so we can roll back to whichever version is under litigation? (This last suggestion was made by Nick in our Twitter conversation.)
- But suppose the algorithm instead makes use of reinforcement learning, and adapts in response to its environment. How on earth can we capture the entire environment used to influence the algorithm?
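The timestamping idea in the list above can be made concrete with a toy sketch. Assume a trivially simple online learner (here just a running average; the class name and interface are my own invention, not any real audit standard) that logs every update with a timestamp, so the model state in effect at the moment of a disputed decision can be replayed later:

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
from typing import List, Tuple

@dataclass
class AuditedOnlineModel:
    """Toy online learner (a running mean) that logs every update
    with a timestamp, so its state at any past moment can be replayed."""
    log: List[Tuple[datetime, float]] = field(default_factory=list)

    def update(self, x: float, when: datetime) -> None:
        # Record the training example together with when it arrived.
        self.log.append((when, x))

    def state_as_of(self, when: datetime) -> float:
        # Replay only the updates seen up to `when` -- the "rollback".
        seen = [x for t, x in self.log if t <= when]
        return sum(seen) / len(seen) if seen else 0.0

# Usage: reconstruct what the model "knew" at an earlier date.
model = AuditedOnlineModel()
t1 = datetime(2016, 3, 1, tzinfo=timezone.utc)
t2 = datetime(2016, 3, 10, tzinfo=timezone.utc)
model.update(2.0, t1)
model.update(4.0, t2)
print(model.state_as_of(t1))  # 2.0 -- only the first update had arrived
print(model.state_as_of(t2))  # 3.0 -- both updates included
```

For a real system the log would be append-only storage rather than an in-memory list, but the principle is the same: litigation asks about a particular version, and the versions must be recoverable.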
If we replace 'algorithm' with 'human', none of this makes sense. When deciding whether a human decision maker erred in some way, we don't need to know their life story and life experiences. So we shouldn't need to know this for an algorithm either.
But a human can document their decision-making process in a way that’s interpretable by a court. Maybe that’s what we need to require from an algorithmic decision-making process.
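What might such documentation look like for an algorithm? One possibility is a per-decision audit record: each decision is serialized along with its inputs, the model version that produced it, and the factors that drove the outcome. The sketch below is purely illustrative (the field names and function are my own invention, not any legal standard):

```python
import json
from datetime import datetime, timezone

def record_decision(inputs: dict, model_version: str,
                    output: float, reasons: list) -> str:
    """Serialize one decision into a self-describing audit record --
    the machine analogue of a human decision maker's written rationale."""
    record = {
        "timestamp": datetime.now(timezone.utc).isoformat(),
        "model_version": model_version,  # ties the decision to a rollback point
        "inputs": inputs,
        "output": output,
        "reasons": reasons,  # e.g. which factors drove a water/sewer fee
    }
    return json.dumps(record, sort_keys=True)

# Usage: a hypothetical fee calculation leaves an interpretable trail.
rec = record_decision(
    inputs={"water_usage_gallons": 4200},
    model_version="fee-model-2016-03",
    output=84.50,
    reasons=["base_rate", "usage_tier_2"],
)
print(rec)
```

Combined with versioned models, records like this would let a court ask "what did the system decide, on what inputs, and under which version?" without needing the system's entire training history.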