⚖️📲 Social media punishment does not need to be a Kafkaesque nightmare
Matt Katsaros on procedural justice and our case studies with the Social Media Governance Initiative
We are seeking a Head of External Engagement and a Community Engagement Lead, Local Lab, to join our team.
Not too long ago, I opened TikTok on my phone and got a pop-up saying “Your account was permanently banned due to multiple violations of our Community Guidelines.” I was shocked. Me? I just watched videos and rarely interacted. I reached out via email and was rebuffed with form language: “your actions on TikTok have been found to violate our Youth Safety and Well-Being policy.” Was I being accused of doing something horrendous? I didn’t know, and I still don’t — they won’t tell me anything more.
Despite gestures towards “procedural fairness” on their website, I felt really disrespected and helpless — guilty of some secret crime I wasn’t allowed to know about. But shouldn’t there be another way? If there had been warnings about my account beforehand, or if afterwards I could explain myself to a jury of other TikTok users, I might still be on the platform, or at least I might have felt like a human being throughout the process.
Coincidentally, I just spoke with an expert on this. Matt Katsaros, previously of Facebook and Twitter, leads the Social Media Governance Initiative at The Justice Collaboratory at Yale Law School.
Below, Matt gives a crash course in procedural justice, how it works in social media, and the kinds of interventions that give agency and prevent toxic outcomes. He also explains a bit about two case studies created in a collaboration between the SMGI and New_ Public. One focuses on reputation management systems, like badges and profile signifiers, and the other is about online juries.
–Josh Kramer, New_ Public Head of Editorial
How criminology applies to social media:
I got a job early on at Facebook in 2011, and kind of quickly realized that the Silicon Valley I was working in was quite different from the one my dad worked in in the ’80s, on the very first personal computer operating system, with a handful of friends in a house. After different roles there, I ended up working as a researcher on the product team building trust and safety tools for the company.
I got to do a lot of really interesting research. Some of it brought me into contact with Tom Tyler and Tracey Meares who run The Justice Collaboratory at Yale Law School. They work in the criminal legal arena, which can be policing, it can be courts, it can be incarceration — all sorts of things. And they had been spending decades building this theory, but also empirically testing it in a number of contexts, around procedural justice and building trust and legitimacy in decision-making. Their work tries to understand: Why do people follow rules? Why do people trust decision-makers? What makes for a fair process?
It seemed like much of what they had been developing over decades applies directly to content moderation. In the case of what I was working on, people care whether some comment is left up or taken down. But they care much more about how that decision was made and how they were treated.
Because for a long time, many of the people I worked with were saying, “everyone who breaks the rules is a bad person; we can’t tell them what the rules are because they’ll just use that to get around them,” which is bizarre logic at a fundamental level. If you want people to follow rules, you have to be clear about what the rules are, or else you give people no chance.
Testing procedural justice in social media:
We took those ideas and developed research. We ran experiments. We published a paper showing a correlational relationship between feeling more fairly treated and being less likely to recidivate, or break the rules again. Then we ran an experiment showing there was a causal link.
If you are clearer when you enforce your rules, by telling people what those rules are, fewer people will appeal and fewer people will break the rules again, if only by a small amount. It wasn’t a silver bullet, but the intervention we tested was so minuscule that it demonstrated to me how low the bar was.
So Tom and Tracey created an initiative in their lab to continue that work, called the Social Media Governance Initiative. I left Facebook and went to Twitter to do similar work. At some point, Tom and Tracey offered me the opportunity to come run the initiative, and for the last couple of years I’ve been running the SMGI. The SMGI primarily does new empirical research. We’ve done projects with Facebook, Twitter, and Nextdoor.
We do research. We organize convenings. We’ve done training. And we do some field building. We put out resources, usually aimed at practitioners, trying to connect them with information and resources that might not exist in their company: ideas, research, and theories that exist within the academy and that we try to translate for them.
Who actually breaks rules on platforms and why:
When you talk to people who are breaking rules, the answers are varied. We have surveys where we ask people, why did you post the thing that you posted, that broke Twitter or Facebook's rules? Some people just got too caught up in a heated discussion. Some people literally didn't know that there were rules. Some people had just joined the platform.
I remember an interview with a man in his 70s, a sweet-sounding retired elementary school teacher who had recently joined Twitter, found himself getting really outraged at some political discussion, and said something disparaging about Mitch McConnell. He certainly knows how to behave and how to be respectful. And he had been banned. Whatever rule he violated at Twitter, it was one strike and you’re out, and he didn’t even know that. He thought at some point he was going to be let back on. He was ignorant of what the rules were, because Twitter made it a little difficult.
How do you get in front of people before they actually do this stuff? How do you help that man who just joined Twitter learn more about the rules? Why did he think that that's okay there, whereas in his classroom he would never have said something like that? Perhaps it's because when he joined Twitter, the kind of people he was following, the type of discourse he was being exposed to, showed him that it's okay to say things like that.
I think there is a clear recognition in The Justice Collaboratory’s offline work that reducing crime isn’t the only goal: you also need people who value their community and who feel safe and secure.
The promise of prosocial interventions:
We’re really open to understanding the ways in which platforms can be designed to simultaneously promote people being more civil with each other while reducing the things platforms seem to be most concerned with: hate speech, harassment, bullying, and misinformation.
We’re most excited about finding ways to design platforms to mitigate things like “toxic offensive comments” that don’t necessarily break a platform’s rules, but brush up against them, are rude, and don’t foster productive dialogue. We talked to a lot of people who just said, “I got so heated, I just couldn’t help myself, and I feel like I knew better.” We’d see in the data that a higher proportion of these comments were deleted in the hours and days that followed than comments that were less toxic.
On Twitter, our experiment was pretty simple, and it’s something that exists on a lot of platforms now: an algorithm runs as someone is about to write a comment, and if the comment is above some threshold for offensiveness, something pops up that says, “Are you sure? Do you want to take a moment?”
It was effective. It reduced some of those comments by a little bit. It also has network effects. Reducing one kind of toxic comment also stops people replying back to you with more toxic comments. We saw that as well in our experiment.
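As a rough illustration of the mechanism described above, here is a minimal sketch of a pre-posting nudge, assuming a generic toxicity classifier. The threshold value, the toy scoring function, and the prompt copy are all hypothetical stand-ins, not Twitter’s actual system.

```python
# Minimal sketch of a pre-posting "Are you sure?" nudge.
# The scorer below is a toy stand-in for a real toxicity classifier
# (e.g. an in-house model); the threshold is an illustrative guess.

NUDGE_THRESHOLD = 0.8  # hypothetical score above which the author is asked to pause


def toxicity_score(text: str) -> float:
    """Toy scorer: counts heated phrases and maps them to a 0..1 score."""
    heated_phrases = {"idiot", "stupid", "shut up"}
    hits = sum(1 for phrase in heated_phrases if phrase in text.lower())
    return min(1.0, hits / 2)


def should_nudge(draft: str) -> bool:
    """Return True if the client should show the prompt before posting,
    giving the author a chance to revise or discard the comment."""
    return toxicity_score(draft) >= NUDGE_THRESHOLD


if __name__ == "__main__":
    draft = "you're an idiot, just shut up"
    if should_nudge(draft):
        print("Are you sure? Do you want to take a moment?")
    else:
        print("Posting comment.")
```

The key design choice is that the author keeps the final say: the system surfaces information before posting rather than quietly acting on their behalf afterwards.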
Contrast that with another study we did, with Nextdoor. Once a comment has been posted and detected to be above a certain level of toxicity, they filter it, and it won’t show by default unless you select “view all comments.” That intervention was not actually effective.
If the stance is, “we’re the platform, we make the decisions, we’ll have these algorithms find the content, filter it, and nobody will know,” it doesn’t seem to have the effect that I think platforms are hoping for. The feature is just turned on, and most people won’t even know that anything is being filtered.
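For contrast, here is a minimal sketch of that filter-by-default pattern, again with an assumed classifier score and threshold rather than anything from Nextdoor’s codebase. The comment is still stored and scored, but readers only see it if they actively opt in, which is part of why most people never notice the intervention at all.

```python
# Sketch of filter-by-default: toxic comments stay in the thread data,
# but the client hides them unless the reader clicks "view all comments."
# Threshold and structure are illustrative assumptions.

from dataclasses import dataclass

FILTER_THRESHOLD = 0.8  # hypothetical score above which a comment is collapsed


@dataclass
class Comment:
    author: str
    text: str
    toxicity: float  # assigned by a classifier after the comment is posted


def visible_comments(comments: list[Comment], show_all: bool = False) -> list[Comment]:
    """Return the comments to render; hidden ones require opting in."""
    if show_all:
        return comments
    return [c for c in comments if c.toxicity < FILTER_THRESHOLD]


if __name__ == "__main__":
    thread = [
        Comment("a", "Thanks for organizing the cleanup!", 0.05),
        Comment("b", "This whole neighborhood is run by idiots.", 0.92),
    ]
    print(len(visible_comments(thread)))                  # 1: toxic comment hidden by default
    print(len(visible_comments(thread, show_all=True)))   # 2: reader opted in
```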
Within that framework of design changes, we’re coming to the conclusion that designs that give people choice, information, and autonomy will be more successful than ones that just filter, work through algorithms, or operate on the back end.
The New_ Public and SMGI case studies:
The students in our lab last semester worked with you to focus on designs that many platforms use, not just social media. Reputation management, I think, is a really interesting one, and the jury and tribunal one fits in the content moderation space.
I thought it was pretty interesting that the students were trying to find connections. Within the jury and tribunal one, they were interested in understanding not just how you can use a jury of your peers to make content moderation decisions, but how that can be used systematically to set norms more proactively. How can the decisions a jury makes apply not just to a single instance, but help shape the norms of the community in a participatory way going forward?
Reputation management seems quite effective if used properly. One of the keys is knowing what kinds of behaviors, attitudes, or values you are trying to promote. If you set a system in place and then don’t update it for a couple of years, you might find that it’s being manipulated by people who figure out how to get to the top of the leaderboard or earn that badge, and you might lose track of the thing you’re trying to do. But it seems like when platforms really invest in it, and there are so many examples going back pretty far, it works really well to promote certain prosocial behaviors, which, again, often helps to curb a lot of the stuff you’re hoping to avoid.
Some of the best examples of reputation management systems are not on social media. Airbnb has really good versions of this. They have “Superhost.” They are explicit about the behaviors they’re trying to promote, and you hear from people that they are very interested in earning that badge. It incentivizes people in a particular way that serves the platform really well. It also serves guests really well, because you get responded to quickly, and it serves hosts really well, because they get more economic value: more people book when they do those things.
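To make the “explicit about the behaviors” point concrete, here is a small sketch of how a badge rule might be written down as reviewable criteria. The thresholds and field names are illustrative assumptions, not Airbnb’s actual Superhost requirements; the design point is that the promoted behaviors live in one auditable place that can be revisited before the system gets gamed.

```python
# Sketch of a reputation badge check. All thresholds are illustrative,
# not any platform's real criteria; the point is that the rewarded
# behaviors are explicit and easy to audit and update.

from dataclasses import dataclass


@dataclass
class HostStats:
    response_rate: float      # share of inquiries answered promptly, 0..1
    average_rating: float     # 1..5 stars
    completed_stays: int      # stays hosted in the review period
    cancellation_rate: float  # share of confirmed bookings cancelled, 0..1


# Keeping criteria in one place makes them easy to revise, which helps
# keep the system aligned with the behaviors it is meant to promote.
BADGE_CRITERIA = {
    "min_response_rate": 0.90,
    "min_average_rating": 4.8,
    "min_completed_stays": 10,
    "max_cancellation_rate": 0.01,
}


def earns_badge(stats: HostStats, criteria: dict = BADGE_CRITERIA) -> bool:
    """Return True if the host currently meets every badge criterion."""
    return (
        stats.response_rate >= criteria["min_response_rate"]
        and stats.average_rating >= criteria["min_average_rating"]
        and stats.completed_stays >= criteria["min_completed_stays"]
        and stats.cancellation_rate <= criteria["max_cancellation_rate"]
    )


if __name__ == "__main__":
    host = HostStats(response_rate=0.97, average_rating=4.9,
                     completed_stays=14, cancellation_rate=0.0)
    print(earns_badge(host))  # True
```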
To the extent that platforms are dedicating increasingly smaller resources to these trust and safety teams, I think there is a lot of opportunity to invest in tools and interventions that are explicitly good for business and that will create what we call prosocial communities.
Thanks Matt!
Trying to meet more of my neighbors in my community,
–Josh
It is nice (a sorta vindicated feeling) to see that some of the ideas I have theorised and experimented with in the past, in terms of user-driven and user-motivated content moderation, are being indirectly validated by these field experiments. The work I am referring to is “Proactively Reducing the Hate Intensity of Online Posts via Hate Speech Normalization,” https://arxiv.org/abs/2206.04007.
I mostly study and work in social VR spaces, and have long been advocating for more thorough onboarding (meaningful friction) coupled with restorative justice principles to increase the frequency of high-quality interactions. Of course, this would require more resources on moderation teams and that feels like a pipe dream as it stands. Thanks for the read.