When can we call machine learning ‘transparent’?
A critical examination of Google’s Perspective API
Editor’s note: This piece was updated on June 16, 2022 with additional information from Google.
In 2018, I received an angry email from a Google engineer over an article I wrote that described a project he had been working on as “not really transparent.” The project I was critiquing in that article was Google’s Perspective API, a machine learning project that analyzes toxicity in language and was intended to be used in online commenting sections. I’m a researcher of machine learning and online harassment, and at this point, I had been analyzing and following the Perspective project for over two years.
The engineer wrote, “You claim that our code and data is not public, but it is, and has been since the start,” and he was partially correct: some of their code for the project I was critiquing was on GitHub, and so was one-fourth of their data set. But not everything was open sourced for this project, and there wasn’t any meaningful data on their model or its real-world performance.
I’ve been thinking about transparency in technology for a few years now. Too often, software teams create the appearance of transparency without really achieving it. Open sourcing part of a data set is a good first step, and sharing code is good too, but none of that contextualizes or explains what the code does or how it was made. Nor, importantly, does it explain what pitfalls or harms could exist within that code.
What I’m interested in is meaningful transparency: a kind of transparency that is equitable and understandable, particularly for complex technology like machine learning. That’s why I felt it was important to look closely at Google’s Perspective API web documentation. If the goal of Perspective is to make healthier public spaces using machine learning, then the public must know how it works, and that includes a general audience, not just engineers.
First, a little more background on the product. Perspective API tries to detect and mitigate online harassment by rating toxicity and harassment in language. Its machine learning model is trained on commenting data from English-language Wikipedia, The Economist, The New York Times, and The Guardian. To my knowledge, Wikipedia is Perspective’s open source data set, while the data from the news outlets has not been shared publicly but did help influence the model. But there’s no clarity on exactly how the model was trained, with what metrics, or by whom.
It’s important to note that not all harassment data is the same. The linguistic construction of a harassing news article comment can be different from the kinds of arguments someone has on Twitter, Instagram, or Facebook, which in turn differ from harassing posts on Reddit, Metafilter, or sites like Wikipedia. In machine learning, the quality of the data matters, along with the context of where a model will be placed in a product and how users interact with it.
Some questions I had when learning about Perspective included: How does this model understand things like toxic language? Who trained it and where did the data come from? How well does it work in an everyday context, and how do people use it? Measuring toxicity with machine learning isn’t perfect, and Perspective recommends on its website that it be used alongside humans. But is that enforced, and how? What are the feedback mechanisms or harm reduction mechanisms put into place when Perspective is used? Do clients even like it?
When I first visited the Perspective website in 2019, I didn’t find many answers: just a hip, purple layout, an oblique definition of toxicity, some samples of already scored language, and, notably, an interactive demo at the bottom that often produced poor results. Some phrases I entered into the demo were rated as “toxic” when they clearly were not: for example, just the word “Muslims” was rated as over 80% likely to be toxic.
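To make that kind of probing concrete, here is a minimal sketch of how a developer might request toxicity scores and compare how the model treats neutral identity phrases. It is only a sketch: it assumes a Python environment with the requests library, a hypothetical API key issued through Google Cloud, and the request shape described in Perspective’s public developer documentation; the scores it prints depend on whichever version of the model happens to be live.

```python
import requests

# Hypothetical placeholder: real keys come from a Google Cloud project
# with the Perspective (Comment Analyzer) API enabled.
API_KEY = "YOUR_API_KEY"
ENDPOINT = (
    "https://commentanalyzer.googleapis.com/v1alpha1/comments:analyze"
    f"?key={API_KEY}"
)


def toxicity_score(text: str) -> float:
    """Return Perspective's TOXICITY summary score (0.0 to 1.0) for `text`."""
    payload = {
        "comment": {"text": text},
        "languages": ["en"],
        "requestedAttributes": {"TOXICITY": {}},
    }
    response = requests.post(ENDPOINT, json=payload, timeout=10)
    response.raise_for_status()
    body = response.json()
    return body["attributeScores"]["TOXICITY"]["summaryScore"]["value"]


# Probe the model with neutral identity phrases, the way I probed the demo.
for phrase in ["Muslims", "I am a Muslim", "I am a Christian"]:
    print(f"{phrase!r}: {toxicity_score(phrase):.0%} likely toxic")
```

A probe like this shows what the model outputs, but it says nothing about why; that is exactly the gap the rest of this piece is about.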
I could not find who made the project, who maintained it, or anything about its case studies. To learn more about the API itself, I had to click on a link that said “For developers.” It turns out Perspective was developed by a Google research initiative called “Conversation AI,” which wasn’t clear from any documentation on Perspective’s website other than a single sentence on the developer page.
Buried toward the middle of that page was an explanation that the model is trained on “wiki data,” which is open source data from the Wikimedia projects. The page broke down a very in-depth process that a seasoned researcher would understand but that would be incomprehensible to anyone without engineering knowledge.
To its credit, Perspective’s current website has much more information, including links to updated blog posts and model cards. Clicking on “How it Works” now offers a more robust and in-depth explanation of the model and how toxicity is defined, along with case studies. The developer site leads to Kaggle and GitHub pages that appear well-maintained.
It also made an important improvement to its interactive demo: it would no longer show a rating for a phrase below a certain threshold of toxicity. This time, writing “I am a Muslim” brought up no toxicity rating.
But even with this amount of information, Perspective still isn’t sharing anything concrete or legible for non-technical readers. If you can’t code, it’s nearly impossible to understand what Perspective does and doesn’t do well since none of the models, data sets, or experiments are concretely explained in any way. If you can code, it still requires time to analyze Perspective’s findings.
It feels like opacity through engineering speak. There’s no way to tell what works, what doesn’t, and how the product truly functions from reading the materials alone. Even clicking on the case studies provides little analysis of how Perspective was actually used by The New York Times, what the product flow or integration looked like, what went well, what didn’t, and what had to change from one case study to the next. To truly, meaningfully understand how machine learning works, there needs to be analysis not just of the model, the data set, and the code, but also of how it performs once launched and interacting with people.
After a few emails back and forth, I stopped hearing from the Google engineer a few months later. He relocated to another country and, like a lot of corporate tech employees, moved on to new products. Last month, I emailed Perspective for their thoughts. Given that the original engineer said this was a transparent project, does Google still see Perspective that way? I received no response until after this article was published, when a communications rep from Google sent me a list of links to “additional information” about Perspective, most of which I had already read and none of which addressed my central concerns. The rep added that “we are limited by what data our partners are willing to share, but always happy to share where we are able to.”
Perspective continues to market itself as a large, self-contained, and ready-to-use product. But machine learning isn’t a discrete, self-contained entity at all. Machine learning is like salt: it’s not that interesting on its own, but just as salt transforms a dish, machine learning attains meaning as it interacts with data and users, and it changes whatever product, software, or problem it’s working within.
That’s why it’s not enough just to share raw data. Even if data is shared, we still need legible descriptions of that data. What’s in it, who made it (and when?), has it been updated (and when?), and how is it used? This can be articulated in graphs, percentages, and paragraphs. But for there to be true transparency, and to engender trust from users, what must also be shown is how the product and the algorithm react to that data, along with how the model was formed. This can be shown through demos, gifs, plain-language explanations, and, yes, open source code.
I think meaningful transparency consists of three integral pieces: legibility, auditability, and impact-ability. Legibility is whether most audiences can understand it. Auditability builds on legibility, and is whether an outside party can understand a process, data point, or intention well enough to request changes or give feedback. Impact-ability builds on both, and refers to the ability of users or individuals to effect change or influence decision making in a project and then see the ramifications of that change or interaction.
That last part is important. Mols Sauter, a professor, researcher, and author, said in 2019, “What we’ve lost in this rush towards transparency is now we have the ability to just see things, but just seeing something isn’t really having an impact on something.” Questions we should ask: What happens if and when someone leaves feedback, either via a contact form or GitHub? How does the dev team respond to it? Can users see that their feedback is effecting change?
Perhaps the future of transparency in machine learning means shipping products with well-designed archives of explanations, gifs, dials, and contextual examples. For every model card and data sheet: a series of legible graphs to help guide what that analysis means and why it matters. Such an archive would record the changes a product has undergone and take time to explain the product’s processes. These kinds of explanations should be a duty of care.
Today, I have the same questions about Perspective as I did in 2018, and I still want them answered. If we are to truly understand Perspective, to be able to talk about its strengths and weaknesses, there has to be meaningful transparency. Otherwise, we are just guessing alone together in the dark—forced to take Google, and other companies, at their word.
Caroline Sinders is an award-winning design researcher, an artist analyzing technology’s impacts in society, and a Contributing Editor at New_ Public. She’s worked with the Tate Modern, the United Nations, Ars Electronica’s AI Lab, the Harvard Kennedy School, Mozilla and others. She lives between New Orleans and London.
Illustration by Josh Kramer, using MidJourney’s AI-powered text-to-image generator with the prompt “toxicity and harassment in language.”