About three weeks ago, I started noticing that almost every time I opened up Twitter, the Notifications tab had that oh-so-nice blue dot, the reward for a tweet well done. The flag alerts users that someone has followed, retweeted, or mentioned them—but when I checked the latest activity, my rush of excitement died.
No one real was interacting with me. I was, and still am, in the middle of a Twitter bot follow flood. My hundreds of new bot “friends” weren’t doing me much harm, but the flood made me wonder why Twitter’s spam detection isn’t doing a better job at killing bots.
They’re more than a mere nuisance, after all. Twitter bots have been used to rig elections, silence activists, swindle people out of money, cheat on online games, gain financial advantage over competitors, usurp identities, scalp tickets, and spread misinformation. Last year, social media company Cynk used Twitter bots to make its stock look popular. Automated trading algorithms picked up on the bots’ conversations and started aggressively trading Cynk’s shares, driving the value of the shell company up to $5 billion. The people behind the scam made out like bandits.
When I reached out to Twitter, a spokesperson pointed me to a 2014 blog post describing BotMaker, software Twitter engineers created to eradicate spam. The technical post describes an automated system that analyzes user behavior, flagging accounts that, for example, tweet spammy URLs, and which also relies on (human) users blocking and reporting accounts as spam. When Twitter first announced BotMaker in August 2014, it reported a 40% reduction in spam, as well as shorter delays in detecting spam in the first place.
Since then, it seems, spambots, the Internet’s lurking, ever-shape-shifting citizenry, have evolved to get past Twitter’s anti-bot weaponry. And I am not the only person the bots are after.
Even more strangely, Twitter’s algorithms seemed to be nudging some users, like Quartz executive editor Zach Seward, to follow bots.
A cynic might think the $23 billion company doesn’t want to address bots too rigorously because they make the social network’s user base look more robust. Twitter’s value, after all, is tied to its number of users and the perceived size of its audience.
According to Twitter’s own estimate in a 2014 SEC filing, about 5% of accounts—or the equivalent of roughly 13 million of them—were bots or fakes. Estimates by outsiders have been higher, and Twitter would have an incentive to downplay the number, says Emilio Ferrara, a computer scientist at the University of Southern California, who studies bot culture. “Twitter’s business is selling advertising…[but] bots don’t buy products. They don’t click on ads,” he said.
Dan Tentler, the chief technology officer at cybersecurity firm Carbon Dynamics and a former Twitter contractor, told me he proposed creating a bot detection system when he was working for the social network a few years ago.
Tentler’s bot-defense strategy involved heightened scrutiny of multiple usernames originating from the same IP address, and of IP addresses associated with failed logins. If, for example, 10,000 accounts were being created by a small number of IP addresses, it’s probable those accounts are spam and part of a botnet. Using this kind of data, Twitter’s security team could have hypothetically created a reputation-rating system for new accounts, to help decide if they were real or fraudulent. But the idea was shut down, he told me via Skype.
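To make the idea concrete, here is a minimal sketch of that kind of reputation check, in Python. The threshold, field layout, and function names are my own illustrative assumptions, not anything Twitter or Tentler has published:

```python
from collections import Counter

# Assumed threshold for illustration: more than 50 signups from one IP
# address is treated as suspicious.
SIGNUPS_PER_IP_THRESHOLD = 50

def score_new_accounts(signups):
    """signups: list of (account_id, ip_address) pairs.

    Returns a dict mapping each account_id to 'suspect' or 'ok',
    based solely on how many accounts its IP address created.
    """
    per_ip = Counter(ip for _, ip in signups)
    return {
        account: "suspect" if per_ip[ip] > SIGNUPS_PER_IP_THRESHOLD else "ok"
        for account, ip in signups
    }
```

A real system would fold in more signals (failed logins, account age, behavior over time) rather than rely on a single count, but the core bookkeeping is this simple.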
“This is not rocket science. It’s not even heavy-lifting,” said Tentler, who’s also seen his bot follower count rise recently. “It’s like panning for gold. You’re looking for little nuggets of insight to use to fingerprint bots.”
With the amount of data flowing through Twitter daily, he added, it should be relatively straightforward for a dedicated team of engineers to come up with a working spam filter that quickly learns to identify and nuke these things.
Bot-identifying nuggets of information include data like how often an account tweets original content, who it interacts with, who it follows, and who its followers are. “Bots tend to live in the neighborhood of other bots,” Ferrara told me. So, if an account is followed by a bunch of accounts with eggs for avatars who only retweet, it’s most likely a bot.
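Ferrara’s “neighborhood” heuristic can be sketched as a toy scoring function. The follower attributes and the 50% cutoff below are assumptions I’ve made for illustration, not the researchers’ actual model:

```python
from dataclasses import dataclass

@dataclass
class Follower:
    has_default_avatar: bool   # the classic Twitter "egg"
    retweets_only: bool        # never posts original content

def looks_botlike(followers, cutoff=0.5):
    """Return True if most of an account's followers match the
    egg-avatar, retweet-only profile Ferrara describes."""
    if not followers:
        return False
    suspicious = sum(
        1 for f in followers if f.has_default_avatar and f.retweets_only
    )
    return suspicious / len(followers) > cutoff
```

Real classifiers like Ferrara’s use many more features (tweet timing, content originality, network structure), but the intuition is the same: score an account by the company it keeps.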
Using these signatures, Ferrara and other computer scientists at Indiana University actually built a tool called BotOrNot to help people determine whether an account was real or fake. But it checks one account at a time, which makes vetting a whole follower list tedious.
Good bot detection software would need to scan billions of interactions every second, without affecting what users see from one moment to the next. There’s also no room for false positives: flagging a real person as a bot. All this can make large-scale bot detection difficult. Plus, sometimes bots are good bots, such as a bot that tells you when the International Space Station is overhead, or reminds you about the brutality of slavery, or tells people not to use the term “illegal immigrant.”
BotOrNot gets at the low-hanging fruit. More sophisticated bots can be harder to sniff out. Some mimic other users, taking their content verbatim and tweeting it as if it were their own, making it difficult to tell the difference between real and fake. Others “create” tweets of their own. These can read like gibberish. But those with more powerful artificial brains can look more legit, and are therefore harder to identify. In these cases, you’d need IP address data, login behavior, and how fast a user filled out an account creation form (a human can only type so fast) to figure out whether someone was a bot or not.
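That form-timing signal is easy to picture in code. This is a toy sketch; the 5-characters-per-second bound is my own assumed stand-in for “a human can only type so fast”:

```python
# Assumed upper bound on plausible human typing speed, for illustration.
MAX_HUMAN_CPS = 5.0  # characters per second

def suspiciously_fast(form_chars_typed, seconds_to_submit):
    """True if a signup form was completed faster than a human
    plausibly could have typed it."""
    if seconds_to_submit <= 0:
        return True  # instant submission is a giveaway
    return form_chars_typed / seconds_to_submit > MAX_HUMAN_CPS
```

On its own this signal is weak (a password manager can also fill forms instantly), which is why it would be combined with IP and login data rather than used as a single test.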
This is the kind of information only Twitter has access to, so outsiders like Ferrara and Tentler can’t build scalable bot-detection software themselves. DARPA, the Defense Department’s research arm, has sponsored Grand Challenges hoping to spur more research into bot detection, but we don’t have a killer bot killer yet.
When I pinged Twitter about my growing bot posse, the company didn’t respond. All I had to go on was a statement on the BotMaker blog post saying that Twitter will “try to clean [spam bots] up as soon as possible.” But it’s clear from my and others’ experience that doesn’t always happen. Fusion’s Kashmir Hill recently bought nearly 20,000 followers, most of which appear to be bots, for a fake karaoke business she set up, and Twitter didn’t seem to notice something was amiss. A week after writing about her bot purchase, her Freakin’ Awesome Karaoke Express (a.k.a. F.A.K.E.) still has 19,500 followers.