Draper, with Partners, Fights Terror and Cyberbullying with Research
CAMBRIDGE, MA – Online extremism, cyberbullying and other abuse of social media applications are emerging threats to public safety and welfare. Much of this threat is enabled by the anonymity inherent in online social media, which allows criminals to avoid detection.
“Simple search queries, such as a name search, are often useless,” explains Draper Fellow Christopher Marks. “A more effective approach is to figure out who they are most likely to connect to, and search for them there.” Marks and a team of researchers offer a new approach to the network search problem of finding specific individuals’ social media accounts in their research paper, “Finding Online Extremists in Social Networks.”
The research comes at a time of growing concern about the use of social networks by hate groups, anti-government militias and terror groups like the Islamic State group, also referred to as ISIS, for propaganda, communications and recruitment. Closer to home, cyberbullying has grown into a slow-burning epidemic. A Pew Research Center survey published two years ago found that 70 percent of 18-to-24-year-olds who use the Internet had experienced harassment, and 26 percent of women of that age said they’d been stalked online.
“Often these users get suspended from social networks only to create new accounts and continue their criminal activities,” said Marks. “We use machine learning to predict how suspended users behave when they come back online. Then, we exploit this knowledge to rapidly find their new accounts.”
Suspensions are only a temporary obstacle, however: extremists often open new accounts within days or even hours of being suspended. In response, the research team, which includes Marks, Jytte Klausen at Brandeis University and Tauhid Zaman at MIT’s Sloan School of Management, created a way to identify likely extremists on social networks and learn how they tend to reconnect with other users when opening a new account. Marks’ work was funded by Draper’s internal research and development (IRaD) program.
The research team collected Twitter data from approximately 5,000 ‘seed’ users, who were either known ISIS members or connected to many known ISIS members as friends or followers. For each one, the researchers collected the account profile information and the user account ID number, then gathered the same data for the extremists’ friends and followers on Twitter.
“We looked at the tweets, everything from the tweet text, the time of the post, all hashtags, user mentions, URLs, and images contained in the tweet, and whether the tweet was a retweet of or reply to another tweet. Our data set of profiles grew to over 1.3 million, and our tweet count totaled approximately 4.8 million tweets,” the researchers said.
To spot an extremist in a social network, Marks and his fellow researchers advise looking for telltale signs. “We found that known ISIS users in our data set changed their screen names regularly. Frequent screen name changes provide a means of avoiding tracking and detection, while retaining account information, friends and follower connections, and Twitter posts. We also noted that accounts that exhibit multiple screen name changes had higher suspension rates, which could mean that users are changing their screen names to avoid suspension.”
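The comparison the researchers describe, suspension rates for accounts with multiple screen name changes versus the rest, can be illustrated with a minimal Python sketch. Everything here is an assumption made for illustration: the function name, the threshold of two changes for "multiple," and the toy records are hypothetical and are not taken from the research paper or its data set.

```python
from collections import defaultdict

def suspension_rate_by_name_changes(accounts, multiple_threshold=2):
    """Return the share of suspended accounts in each bucket.

    accounts: iterable of (num_screen_name_changes, was_suspended) pairs.
    multiple_threshold: hypothetical cutoff for "multiple" name changes.
    """
    totals = defaultdict(int)
    suspended = defaultdict(int)
    for changes, was_suspended in accounts:
        bucket = "multiple" if changes >= multiple_threshold else "zero_or_one"
        totals[bucket] += 1
        if was_suspended:
            suspended[bucket] += 1
    return {bucket: suspended[bucket] / totals[bucket] for bucket in totals}

# Toy records, invented for illustration only.
toy_accounts = [(0, False), (1, False), (0, True), (3, True), (5, True), (2, False)]
rates = suspension_rate_by_name_changes(toy_accounts)
print(rates)
```

A higher rate in the "multiple" bucket would match the pattern the researchers report, though a comparison like this shows only correlation, not why users change their names.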
The team used machine learning to produce a method for efficiently finding groups of accounts that are likely to belong to a single user. In one instance, they identified 318 account pairs as belonging to the same user. The researchers learned to predict with some accuracy which former friends a suspended user is likely to reconnect with, especially when that user’s past behavior is included in the analysis. They also quantified how closely the new accounts that extremists open after a suspension resemble their old ones, which helps automate the search for a new account.
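The idea of searching for a returning user among the people they used to connect with can be sketched in a few lines of Python. This is not the team's machine learning method, whose details are not given here. It substitutes a much simpler heuristic, Jaccard overlap between a suspended account's former friends and each candidate account's friends, and all account names and data below are hypothetical.

```python
def jaccard(a, b):
    """Jaccard similarity of two collections: |A ∩ B| / |A ∪ B|."""
    a, b = set(a), set(b)
    union = a | b
    return len(a & b) / len(union) if union else 0.0

def rank_candidates(old_friends, candidates):
    """Rank candidate accounts by friend overlap with the suspended account.

    old_friends: friend IDs of the suspended account.
    candidates: dict mapping candidate account name -> set of friend IDs.
    Returns (score, name) pairs, highest score first, ties broken by name.
    """
    scored = [(jaccard(old_friends, friends), name)
              for name, friends in candidates.items()]
    return sorted(scored, key=lambda pair: (-pair[0], pair[1]))

# Toy data, invented for illustration only.
old_friends = {"u1", "u2", "u3", "u4"}
candidates = {
    "new_a": {"u1", "u2", "u3", "u9"},
    "new_b": {"u7", "u8"},
    "new_c": {"u1", "u4"},
}
ranking = rank_candidates(old_friends, candidates)
for score, name in ranking:
    print(f"{name}: {score:.2f}")
```

A learned model of the kind the article describes could weigh many more signals, such as the user's past behavior, but the ranking step would look broadly similar: score each candidate, then inspect the highest-scoring accounts first.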
Marks points out that while the research team’s analysis focused on extremist groups such as ISIS on Twitter, “the capabilities we developed can generalize to other forms of online criminality and other social media applications.”
Draper combines specific domain expertise and knowledge of how to apply the latest analytics techniques to extract meaningful information from raw data to better understand complex, dynamic processes. Our system design approach encompasses effective organization and processing of large data sets, automated analysis using algorithms and exploitation of results. To facilitate user interaction with these processed data sets, Draper applies advanced techniques to automate understanding and correlation of patterns in the data. Draper’s expertise encompasses machine learning (including deep learning), information fusion from diverse and heterogeneous data sources, optimized coupling of data acquisition and analysis and novel methods for analysis of imagery and video data.