Tackling Twitter abuse

Twitter has an abuse problem, and as this detailed article by Buzzfeed’s Charlie Warzel, which draws on interviews with several (ex-)staff members, shows, it is a problem of the company’s own creation. Allowing abuse has been Twitter’s conscious choice and, despite protestations, the problem can be solved.

The problem of abuse and harassment on Twitter is years old, almost as old as Twitter itself, and each attack brings renewed calls for Twitter to act. The drumbeat of people – usually women, LGBTQ people and people of colour – leaving Twitter because of harassment seems to have increased lately. Less visible are those people who self-censor more and tweet less, for fear that they might become the next target of the Twitter troll army.

Yet every time this conversation comes up, someone will say that this is a societal problem, not a technological one, and that there really isn’t very much more that Twitter can do than what it’s already doing. What we apparently need to do is fix society, and then all the racism, sexism, bigotry and abuse will just magically disappear.

The reason for citing technological difficulties is to punt the discussion of potential solutions into the long grass, because if it’s technologically impossible to solve a societal problem, then we don’t need to actually do anything about the technology. It’s a great way to stifle criticism of the status quo and to take the pressure off Twitter to act. It is also total bollocks. Twitter has created a technological problem and there are technological ways to ameliorate it.

Off the top of my head, I can think of several ways to help solve The Twitter Problem. These aren’t fully fleshed out; they’re just a few thoughts I had whilst falling asleep last night. If I can come up with this without even trying, imagine what Twitter could do if it bothered.

Privacy gradients

The first thing that always comes to mind when I think about social networks and communities is the idea of a privacy gradient. This is what I wrote about privacy gradients in 2010:

The idea of a privacy gradient comes from architecture and refers to the way that public, common spaces are located by the entrance to a building and as you progress through the building the spaces become more private until you reach the most private ‘inner sanctum’. If you think of a house, then the most public part would be the porch (in the UK, a fully or semi-enclosed space around the front door, in the US, it’s often open or screened). The hallway is common space shared by everyone, and spaces like the kitchen and lounge are semi-private. As you progress deeper into the house you end up at the bedroom (and in some cases, the en-suite) which is the most private part of the house.

Understanding the privacy gradient is important, because when buildings ignore privacy gradients, they feel odd. Think about houses where there’s a bedroom directly off the lounge and how uncomfortable that can make visitors feel. I once had a friend who lived in one of the old tenements near Kings Cross, now torn down. To get to his bedroom and the kitchen you had to walk through his flatmate’s bedroom, a deeply uncomfortable act.

I also said, back then:

As one moves along a privacy gradient, one is also moving along a parallel trust gradient. As you invite me deeper into your house, so you are displaying increasing trust in me. […] The same, again, is true on websites. The more we communicate, the stronger our relationship becomes, the more I trust you, the more of myself I am willing to reveal and share.

Six years ago, I thought that Twitter had a basic, but basically sufficient, privacy gradient. And, indeed, it might have been sufficient for the network in 2010, but it is now completely insufficient. Twitter doesn’t really have a gradient, as such, but a limited number of privacy modes:

  • Public account, with potentially unlimited @messages because everyone can see everything you write
  • Open DMs, where anyone can send you a DM, but only you and the sender see them
  • Private account, @messages are limited because only your approved friends see anything to respond to
  • Private DMs, that you can only receive from people you follow

These modes are far too clunky. If you want to reach lots of people, or merely want to be open, you have to have a public account, which means that you are open to an avalanche of @messages from the world and her husband. If you have open DMs you’re risking an avalanche of unsolicited messages, again from the world and her husband. These are ostensibly “private”, but you can’t control who they come from and only you and the sender can see them. And there’s nothing like a bit of pseudo-privacy to encourage abuse from people who feel empowered to be arseholes by the veil of secrecy.

Private accounts limit the number of people who can see your tweets to just those you approve. That reduces unwanted attention, but is also untenable for anyone who wants a broader conversation, or who is a public figure. Private DMs are the most limited form of interaction that Twitter allows.

This isn’t really a sliding scale of privacy; it’s more a choice between on and off, which is a bit of Hobson’s Choice if you want any level of broader discourse. For businesses, celebrities, or even just those of us who are — or, at least, have been — happy to exist online in public, a private account isn’t going to meet our needs. And yet, a fully public account with open-season @messages is fertile ground for abuse.

Some people do, of course, maintain both public and private accounts, which makes sense in some circumstances. But it’s not only a potential duplication of effort, it’s also risky: It’s very easy to post to the wrong account when you are running more than one. And it’s a greater cognitive overhead to run two similar accounts, eg public me and private me, as opposed to two very different accounts, eg me and my cat.

So Twitter needs to create a gentler, longer privacy gradient. This has often been done, by other social networks, by allowing the user to group their friends and send messages only to certain groups. The trouble is, no one actually wants to sit down and spend hours classifying their friends. It’s a ham-fisted solution to a problem that requires something smoother. And I think there is a smoother solution.

Use data smartly to curb abuse

One thing that Twitter has is data. It knows who your social network is. It knows who you follow, who follows you, how long they have followed you, how often they @ you, how often you @ them. It has detailed information about how you interact with your friends. It can analyse that behaviour and it can form a detailed understanding of what “normal behaviour” is for you and your friends.

This kind of network analysis is old hat. People have been digging into social graphs since the data first became available, and there are plenty of people out there who know how to analyse and interpret this kind of data better than I do. But suffice it to say that Twitter has the data, and I suspect the expertise, it needs to perform this sort of analysis.

Network analysis doesn’t just provide information on what “normal” interactions are, it can also point to patterns of abuse. Indeed, anyone who’s been on Twitter long enough can deduce the pattern of an attack on an individual. In no particular order, these sorts of things happen in a dogpile:

  • Target is RTd by someone with a lot of followers
  • Target gets @s from people they do not follow, and who do not follow them
  • Number of @s increases rapidly as the attack spreads
  • Target tries to RT or .@ to draw attention to the attack
  • Target retreats, but the attack continues

These behaviours are clearly different from normal interactions, and it should be possible to design an alert system that throws up a red flag as soon as these behaviours are noted.
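
As a rough illustration only, the second and third signals in that list could be watched with something like the Python sketch below. Every name and number in it (DogpileDetector, the ten-minute window, the threshold of 20 strangers) is a made-up assumption for the sake of the example, not anything Twitter is known to use. It simply counts how many distinct accounts with no follow relationship to the target have @ed them recently.

```python
from collections import deque
from dataclasses import dataclass, field


@dataclass
class Mention:
    sender: str
    timestamp: float  # seconds since an arbitrary epoch


@dataclass
class DogpileDetector:
    """Raises a flag when many strangers @ the target in a short window."""

    # Accounts the target follows or is followed by (assumed to be known).
    known_accounts: set[str]
    window_seconds: float = 600.0   # hypothetical: look at the last 10 minutes
    stranger_threshold: int = 20    # hypothetical: distinct strangers needed
    _recent: deque = field(default_factory=deque)

    def observe(self, mention: Mention) -> bool:
        """Record a mention and return True if the red flag should go up."""
        if mention.sender not in self.known_accounts:
            self._recent.append(mention)
        # Forget stranger mentions that have fallen out of the window.
        cutoff = mention.timestamp - self.window_seconds
        while self._recent and self._recent[0].timestamp < cutoff:
            self._recent.popleft()
        distinct_strangers = {m.sender for m in self._recent}
        return len(distinct_strangers) >= self.stranger_threshold
```

A real system would need far more than a head count, but even this crude version separates a chat among friends from a sudden influx of strangers.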

One challenge is that, on the face of it, an abusive dogpile might look a lot like an enthusiastically positive response to a tweet or a RT by a celebrity. It might also resemble a mostly positive conversation that happens to include some abusive tweets, or a wide-ranging conversation around a popular hashtag.

I suspect, however, that if one were to dig into the details, it would be possible to spot the differences between these scenarios, not least by looking at the kinds of accounts taking part, the language used, whether there is a hashtag involved and what that hashtag is (hashtags can be used to coordinate attacks, so aren’t themselves indicative), the timing of replies, etc. For example, a popular user posing a question and then RTing the best answers is going to have a very specific profile that would be very different to that of abuse.

The right kind of analysis can also help to identify abusers through their behaviour, as they:

  • @ someone they don’t follow and haven’t interacted with before
  • @ someone whose friends they don’t follow
  • @ that person repeatedly and in rapid succession
  • Use abusive language
  • Follow other accounts who are also engaged in, or even inciting, the attack
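
To make that list concrete, here is a minimal Python sketch that turns each of the five behaviours into a yes/no flag and counts how many apply to a given sender. The field names, the count_flags function and the "five @s in an hour" cut-off are all assumptions invented for this example; how each signal would actually be measured is left open.

```python
from dataclasses import dataclass


@dataclass
class SenderSignals:
    """Hypothetical per-sender facts, assumed to be computed elsewhere."""

    follows_target: bool
    has_prior_interaction: bool
    follows_any_of_targets_friends: bool
    mentions_of_target_in_last_hour: int
    uses_abusive_language: bool
    follows_known_attackers: bool


def count_flags(s: SenderSignals, rapid_mentions: int = 5) -> int:
    """Count how many of the five behaviours listed above apply."""
    flags = [
        # @s someone they don't follow and haven't interacted with before
        not s.follows_target and not s.has_prior_interaction,
        # @s someone whose friends they don't follow
        not s.follows_any_of_targets_friends,
        # @s that person repeatedly and in rapid succession
        s.mentions_of_target_in_last_hour >= rapid_mentions,
        # uses abusive language
        s.uses_abusive_language,
        # follows other accounts engaged in, or inciting, the attack
        s.follows_known_attackers,
    ]
    return sum(flags)
```

No single flag means much on its own; the point is that several together start to look like a pattern.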

Maybe the accounts are new sockpuppets that resemble spammers, or maybe they have huge followings, or somewhere in between. Maybe the inciting RT was made innocently by a celebrity whose followers take it all a step too far, or maybe it’s a deliberate attempt to drive someone off Twitter. It doesn’t really matter: Attacks seem to follow similar trajectories and should be detectable in the data.

More importantly, the analysis of your social graph could be used to forestall an attack. I imagine a system where all tweets coming from outside my immediate circle of long-term (say over 30 days) followers, and their long-term followers, are immediately suspect and subject to additional scrutiny before they get to my @ timeline. Perhaps they go through linguistic analysis to look for problematic epithets. As imperfect as such analysis is, as a part of a broader strategy it might well have its place.
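
The "immediate circle" idea could be sketched along the following lines in Python. The followers_since mapping (account, then follower, then the date they started following) is a hypothetical data structure chosen for the example, and the 30-day figure is just the one suggested above.

```python
from datetime import datetime, timedelta

LONG_TERM = timedelta(days=30)  # "say over 30 days"

# Hypothetical shape: followers_since[account][follower] = date the follow began
FollowGraph = dict[str, dict[str, datetime]]


def long_term_followers(graph: FollowGraph, account: str, now: datetime) -> set[str]:
    """Followers of `account` who have followed for longer than LONG_TERM."""
    return {
        follower
        for follower, since in graph.get(account, {}).items()
        if now - since > LONG_TERM
    }


def trusted_circle(graph: FollowGraph, target: str, now: datetime) -> set[str]:
    """The target's long-term followers plus their long-term followers."""
    inner = long_term_followers(graph, target, now)
    circle = set(inner)
    for follower in inner:
        circle |= long_term_followers(graph, follower, now)
    circle.discard(target)
    return circle


def needs_scrutiny(graph: FollowGraph, sender: str, target: str, now: datetime) -> bool:
    """True if a tweet from `sender` to `target` comes from outside the circle."""
    return sender not in trusted_circle(graph, target, now)
```

Anything for which needs_scrutiny returns True would then go on to whatever extra checks the system applies, such as the linguistic analysis mentioned above.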

The system would also look for other signs that an attack was beginning: Are there other @ tweets coming in from outside the target’s friends and friend-of-a-friend network? Are those tweeters related in any way, eg do they follow someone who RTd a tweet by the target, be they clueless celebrity or bigot? Are they responding to or using the same hashtag?

If enough flags were triggered, the system would escalate, either to a human moderator at Twitter (though frankly I think that would be a terrible idea, given how inconsistent human moderation tends to be) or to the next level of automated control. In the automated case, any tweets that look like they might be part of an attack would be quarantined, away from the target’s main timeline. Rather like a spam folder, a user could either glance through them and “unquarantine” good tweets and permanently hide bad ones, or let them be automatically hidden from view without ever seeing them. Any data on false positives from users who do review their quarantine could then be used to help train the system.
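
The spam-folder behaviour might look something like this Python sketch. The Quarantine class and its method names are invented for illustration; the only ideas taken from the description above are that held tweets can be released or hidden, and that releases are worth recording as false positives.

```python
from dataclasses import dataclass, field


@dataclass
class Quarantine:
    """A spam-folder-style holding area for suspect tweets (hypothetical)."""

    held: dict[int, str] = field(default_factory=dict)   # tweet id -> text
    timeline: list[int] = field(default_factory=list)    # ids shown to the user
    hidden: set[int] = field(default_factory=set)        # ids hidden for good
    false_positives: list[str] = field(default_factory=list)

    def hold(self, tweet_id: int, text: str) -> None:
        """Keep a suspect tweet away from the main timeline."""
        self.held[tweet_id] = text

    def release(self, tweet_id: int) -> None:
        """User says the tweet was fine: show it and log a false positive."""
        text = self.held.pop(tweet_id)
        self.timeline.append(tweet_id)
        self.false_positives.append(text)  # could later help train the system

    def hide(self, tweet_id: int) -> None:
        """User confirms the tweet is unwanted: hide it permanently."""
        del self.held[tweet_id]
        self.hidden.add(tweet_id)

    def hide_all(self) -> None:
        """Hide everything still held, without the user ever reading it."""
        self.hidden.update(self.held)
        self.held.clear()
```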

Users should also have control over whether they take part in such a system, and there’d need to be careful thought about appropriate defaults. Users tend not to change defaults, and whilst most new users wouldn’t be likely to need such a system, one wouldn’t necessarily know when one needed it until it was too late. For it to be effective, it would need to be an opt-out system that people have to turn off, rather than on. There would need to be clear communication with users about what such a system would mean, how to activate and deactivate it, and how to use it.

Notification trolling

A troll mitigation system needs to not just focus on preventing abusive content from reaching its target, but also on preventing abuse through notifications. As it stands, people who have notifications turned on get ding-ding-dinged like a rat in an electrified cage during an attack, as one friend put it. The frequency of notifications becomes a part of the attack, not just a side-effect. So there would need to be an emergency brake on those notifications to make sure that someone isn’t swamped by texts, emails and alerts.
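
An emergency brake of this kind is essentially a rate limiter. As a minimal sketch in Python, assuming a made-up limit of ten notifications a minute, it could work like this: notifications are let through until the limit is hit, after which they are silently counted rather than delivered.

```python
from collections import deque


class NotificationBrake:
    """Sliding-window limit on how many notifications reach one user."""

    def __init__(self, max_per_window: int = 10, window_seconds: float = 60.0):
        # Both defaults are hypothetical values chosen for illustration.
        self.max_per_window = max_per_window
        self.window_seconds = window_seconds
        self._sent: deque[float] = deque()  # timestamps of delivered alerts
        self.suppressed = 0                 # how many alerts were held back

    def should_notify(self, timestamp: float) -> bool:
        """Return True if a notification at `timestamp` should be delivered."""
        # Forget deliveries that are older than the window.
        cutoff = timestamp - self.window_seconds
        while self._sent and self._sent[0] <= cutoff:
            self._sent.popleft()
        if len(self._sent) < self.max_per_window:
            self._sent.append(timestamp)
            return True
        self.suppressed += 1
        return False
```

The suppressed count could be shown later as a single summary ("37 more mentions") instead of 37 separate dings.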

So what happens if a user was found to be a part of an actual attack? Perhaps they would receive a warning for the first incident, detailing the problematic behaviours. If they continue those behaviours, their account would be automatically suspended for a period. Persistent offenders would be banned.

Clearly people can set up multiple Twitter accounts very easily, but an automated system would be able to deal with those far more easily than the current system, which relies on people reporting abuses. Equally, brand new accounts could have restrictions, such as not being able to successfully @ message or DM non-followers for 30 days — a new user might be able to send an @ message or DM, but if the recipient isn’t following them, they shouldn’t see it.
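
The new-account restriction is simple enough to state as a rule. In this Python sketch, is_delivered and its parameters are hypothetical names, and the 30-day probation is the figure suggested above: the message is always sent, but a non-follower only sees it once the sender’s account is old enough.

```python
from datetime import datetime, timedelta

PROBATION = timedelta(days=30)  # restriction period for brand new accounts


def is_delivered(
    sender_created: datetime,
    recipient_follows_sender: bool,
    now: datetime,
) -> bool:
    """Decide whether an @ message or DM is shown to its recipient."""
    if recipient_follows_sender:
        # Followed accounts are never restricted, however new they are.
        return True
    # Non-followers only see messages from accounts past their probation.
    return now - sender_created >= PROBATION
```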

Now, I know some people are going to scream censorship over these suggestions, but really, that’s a nonsense. Twitter is under no legal or moral obligation to provide a platform to people who abuse others, and nor am I or any other user under any legal or moral obligation to listen to people who would abuse us. The right to free speech is not the right to be heard or have an audience. The right to free speech does not give people the right to abuse others, nor does anyone have any right to demand my attention. I am free to withhold my attention just as Twitter is free to withhold service to those who break its terms and conditions.

Other objections will be technical. How on earth would this data analysis all be done in real time? Well, most accounts won’t ever need this sort of protection. People with a handful of followers, people who rarely log in, people who rarely tweet and private accounts are unlikely to end up at the epicentre of a Twitter quake. But the accounts of those who might need it could be very lightly monitored for the early signs of trouble, and the full analysis would only kick in if needed. Equally, there are categories of users who are at higher risk of attack, such as women and people of colour, who could perhaps be given more computational attention.

And those who want Twitter’s firehose, the unexpurgated reckon of the unfiltered masses in all its glory, could, obviously, opt out.

Finally, one thing you’ll notice is absent from this blog post is a call for better reporting tools. Ultimately, focusing on users reporting abuse is shifting responsibility for dealing with that abuse on to the target. That is unethical. It is, essentially, the technological equivalent of victim blaming. If I am abused, I do not want to have an easier way to deal with the abusive messages, I want to never see them in the first place. Sure, blocks and mutes can be fed into the system to help train it, but prevention is always better than cure.