Why AI Is Still Waiting For Its Ethics Transplant

As artificial intelligence reshapes law enforcement, healthcare, education and more, tech firms need to widen their data lens.
Kate Crawford, cofounder of AI Now. Courtesy of AI Now

There’s no lack of reports on the ethics of artificial intelligence. But most of them are lightweight—full of platitudes about “public-private partnerships” and bromides about putting people first. They don’t acknowledge the knotty nature of the social dilemmas AI creates, or how tough it will be to untangle them. The new report from the AI Now Institute isn’t like that. It takes an unblinking look at a tech industry racing to reshape society along AI lines without any guarantee of reliable and fair results.

The report, released two weeks ago, is the brainchild of Kate Crawford and Meredith Whittaker, cofounders of AI Now, a new research institute based out of New York University. Crawford, Whittaker, and their collaborators lay out a research agenda and a policy roadmap in a dense but approachable 35 pages. Their conclusion doesn’t waffle: Our efforts to hold AI to ethical standards to date, they say, have been a flop.

“New ethical frameworks for AI need to move beyond individual responsibility to hold powerful industrial, governmental and military interests accountable as they design and employ AI,” they write. When tech giants build AI products, too often “user consent, privacy and transparency are overlooked in favor of frictionless functionality that supports profit-driven business models based on aggregated data profiles…” Meanwhile, AI systems are being introduced in policing, education, healthcare, and other environments where the misfiring of an algorithm could ruin a life. Is there anything we can do? Crawford sat down with us this week for a discussion of why ethics in AI is still a mess, and what practical steps might change the picture.

Scott Rosenberg: Towards the end of the new report, you come right out and say, “Current framings of AI ethics are failing.” That sounds dire.

Kate Crawford: There’s a lot of talk about how we come up with ethical codes for this field. We still don’t have one. We have a set of what I think are important efforts spearheaded by different organizations, including IEEE, Asilomar, and others. But what we’re seeing now is a real air gap between high-level principles—that are clearly very important—and what is happening on the ground in the day-to-day development of large-scale machine learning systems.

We read all of the existing ethical codes that have been published in the last two years that specifically consider AI and algorithmic systems. Then we looked at the difference between the ideals and what was actually happening. What is most urgently needed now is that these ethical guidelines are accompanied by very strong accountability mechanisms. We can say we want AI systems to be guided by the highest ethical principles, but we have to make sure that there is something at stake. Often when we talk about ethics, we forget to talk about power. People will often have the best of intentions. But we’re seeing a lack of thinking about how real power asymmetries are affecting different communities.

The underlying message of the report seems to be that we may be moving too fast—we’re not taking the time to do this stuff right.

I would probably phrase it differently. Time is a factor, but so is priority. If we spent as much money and hired as many people to think about and work on and empirically test the broader social and economic effects of these systems, we would be coming from a much stronger base. Who is actually creating industry standards that say, ok, this is the basic pre-release trial system you need to go through, this is how you publicly show how you’ve tested your system and with what different types of populations, and these are the confidence bounds you are prepared to put behind your system or product?

These are things we’re used to in the domains of drug testing and other mission-critical systems, even in terms of things like water safety in cities. But it’s only when we see them fail, for example in places like Flint, Michigan, that we realize how much we rely on this infrastructure being tested so it’s safe for everybody. In the case of AI, we don’t have those systems yet. We need to train people to test AI systems, and to create these kinds of safety and fairness mechanisms. That’s something we can do right now. We need to put some urgency behind prioritizing safety and fairness before these systems get deployed on human populations.

You want to get this stuff in place before there’s the AI equivalent of a Flint disaster.

I think it’s essential that we do that.

The tech landscape right now is dominated by a handful of gigantic companies. So how is that going to happen?

This is the core question. As a researcher in this space, I go to the tools that I know. We can actually do an enormous amount by increasing the level and rigor of research into the human and social impacts of these technologies. One place we think we can make a difference: Who gets a seat at the table in the design of these systems? At the moment it’s driven by engineering and computer science experts who are designing systems that touch everything from criminal justice to healthcare to education. But in the same way that we wouldn’t expect a federal judge to optimize a neural network, we shouldn’t be expecting an engineer to understand the workings of the criminal justice system.

So we have a very strong recommendation that the AI industry should be hiring experts from disciplines beyond computer science and engineering and ensuring that those people have decision-making power. What’s not going to be sufficient is bringing in consultants at the end, when you’ve already designed a system and you’re already about to deploy it. If you’re not thinking about the way systemic bias can be propagated through the criminal justice system or predictive policing, then it’s very likely that, if you’re designing a system based on historical data, you’re going to be perpetuating those biases.

Addressing that is much more than a technical fix. It’s not a question of just tweaking the numbers to try and remove systemic inequalities and biases.

That’s a kind of reform-from-inside plan. But right now, the situation looks much more like researchers sit on the outside, they get access to a little data, and they come out with these bombshell studies showing how bad things are. That can build public concern and win media coverage, but how do you make that leap to changing things from inside?

Certainly when we think about the amount of capacity and resourcing in the AI industry right now, this isn’t that hard. We should see this as a baseline safety issue. You’re going to be affecting somebody’s ability to get a job, to get out of jail, to get into university. At the very least we should expect a deep understanding of how these systems can be made fairer, and of how important these decisions are to people’s lives.

I don’t think it’s too big an ask. And I think the most responsible producers of these systems really do want them to work well. This is a question of starting to back those good intentions with strong research and strong safety thresholds. It’s not beyond our capacity. If AI is going to be moving at this rapid pace into our core social institutions, I see it as absolutely essential.

You’re affiliated with Microsoft Research, and Meredith Whittaker is affiliated with Google. Can’t you just walk into the right meetings and say, “Why aren’t we doing this?”

It’s absolutely true that both Meredith and I have a seat at the table in companies that are playing a role here, and that’s part of why these recommendations are coming from a place of knowledge. We understand how these systems are being built, and we can see positive steps that could make them safer and fairer. That’s also why we think it’s really important that we’re working in a context that is independent, and we can also do research outside of technology companies, to help make these systems as sensitive as possible to the complex social terrain they’re starting to move into.

Our report took six months. It’s not just a group of us saying, hey, this is stuff we think and recommend. It comes out of deep consultation with top researchers. The recommendations are achievable, but they’re not easy. They’re not a way of throwing smoke into people’s eyes and saying, “Everything’s fine, we’ve got this handled.” We’re saying, interventions are needed, and they’re urgent.

In the last 18 months we’ve seen a spike in interest in these questions around bias and machine learning, but often it’s being understood very narrowly as a purely technical issue. And it’s not—to understand it we need to widen the lens. To think about how we understand long-term systemic bias, and how that will be perpetuated by systems if we’re not aware of it.

Five years ago, there was this claim that data was neutral. Now that’s been shown not to be the case. But now there’s a new claim—that data can be neutralized! Neither of these things is true. Data will always bear the marks of its history. That is human history, held in those data sets. So if we’re going to try to use that to train a system, to make recommendations or to make autonomous decisions, we need to be deeply aware of how that history has worked. That’s much bigger than a purely technical question.

Speaking of history, at the tail end of the Obama years this kind of research was getting a lot of government support. How optimistic are you about this program now that the Trump administration doesn’t seem as interested?

Government should absolutely be tracking these issues very closely; however, this isn’t just about the US. What’s happening in Europe right now is critically important—what’s happening in India, in China. What’s coming down the pipeline as soon as May next year with GDPR [the European Union’s stringent new data privacy rules]. We’ll continue to do the research we think will guide policy in the future. When and where that gets taken up is not our decision—that’s well above our pay grade. But what we can do is do the best work now, so that when people are making decisions about safety-critical systems, about rights and liberties, about labor and automation, they can make policy based on strong empirical research.

You also call for greater diversity in the teams that make AI, and not just by fields of expertise.

It’s much bigger than just hiring—we have to talk about workplace culture, and we have to talk about how difficult these questions of inclusivity are right now. Particularly in the wake of the James Damore memo, it’s never been more stark how much work needs to be done. If you have rooms that are very homogeneous, that have all had the same life experiences and educational backgrounds and they’re all relatively wealthy, their perspective on the world is going to mirror what they already know. That can be dangerous when we’re making systems that will affect so many diverse populations. So we think it’s absolutely critical to start to make diversity and inclusion matter—to make it something more than just a set of words that are being spoken and invoked at the right time.