Research: How tech can help identify hate speech videos and impact content moderation

A Boston University professor discusses his team's study of hate attacks organized on 4chan and how their research could impact content moderation.

TechRepublic's Karen Roby talked with Gianluca Stringhini, an assistant professor at Boston University, about new research concerning online hate speech and harassment. The following is an edited transcript of their interview.

Karen Roby: Tell us about your research.

Gianluca Stringhini: We started studying some of these online polarized communities to try and understand, what's their way of operating … and can we actually model how these hate attacks are working?

A couple of years back, we started looking at 4chan and, in particular, at the Politically Incorrect board within 4chan, because these guys are often cited as being the source of a lot of the trouble that's going on on the internet, including these kinds of hate attacks. When Microsoft put the Tay bot online a few years back so that people could chat with it, they're the ones who turned the bot racist in a matter of hours. So we wanted to understand, how are these people operating?

This is kind of challenging, because this platform is very different from other social media platforms. It's anonymous, so there are no accounts, which makes it very difficult to figure out how many people are active on there. It's also ephemeral. Threads don't stay active forever; after a while they get archived and deleted. This creates a lot of disinhibition online, because people tend to behave worse when they're completely anonymous and whatever they say will disappear.

We did this measurement study, basically collecting as much data as possible from this platform, and we started characterizing what these hate attacks look like. We found that oftentimes these hate attacks would target YouTube videos. Someone would find a YouTube video that they thought would be a good target for attacks because, for example, it would espouse some ideas that the community found outrageous or against their own views.

Then they would post the link to the YouTube video on the platform, so on 4chan, with a message along the lines of "you know what to do" or something like that. Basically, the platform explicitly prohibits organizing hate attacks. But this is the way in which they get around it, without explicitly saying what they want to do, other than exactly "you know what to do." That's the code.

After this happens, basically all these anonymous actors will go to the YouTube video and start posting hateful comments. Then they'd come back to 4chan, to the thread that organized it, and start commenting on how the attack is going, what they posted and all of that. What we found is that there is some sort of synchronization between the comments we see on the YouTube video and the comments being posted on 4chan as a reaction to the hate attack.

By using signal processing techniques such as cross-correlation, modeling the comments on 4chan and the comments on YouTube as signals, and looking at the synchronization between these two signals, we can identify whether there is coordinated activity going on. We find an extremely strong correlation between that synchronization and the amount of hate speech that the YouTube video is receiving: the more closely the two sets of comments are synchronized, the more hate speech the video gets.
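As a rough illustration of the idea described above, and not the team's actual code, the sketch below bins comment timestamps from a 4chan thread and from a YouTube video into per-minute counts, then uses normalized cross-correlation to find how strongly, and at what delay, the two activity signals line up. The function names, the 60-second bin size, and the timestamps are all assumptions made up for this example.

```python
import numpy as np

def bin_counts(timestamps, start, end, bin_seconds):
    """Turn comment timestamps (in seconds) into a comments-per-bin signal."""
    edges = np.arange(start, end + bin_seconds, bin_seconds)
    counts, _ = np.histogram(timestamps, bins=edges)
    return counts.astype(float)

def peak_cross_correlation(a, b):
    """Return (peak normalized cross-correlation, lag in bins).

    A positive lag means activity in `a` trails activity in `b`.
    """
    a = a - a.mean()
    b = b - b.mean()
    denom = np.linalg.norm(a) * np.linalg.norm(b)
    if denom == 0:
        return 0.0, 0  # at least one signal is flat, so nothing to correlate
    xcorr = np.correlate(a, b, mode="full") / denom
    lags = np.arange(-(len(b) - 1), len(a))
    best = int(np.argmax(xcorr))
    return float(xcorr[best]), int(lags[best])

# Made-up timestamps: the YouTube comments echo the thread two minutes later.
thread_ts = np.array([0, 10, 20, 65, 70, 300, 310, 320, 330, 600])
youtube_ts = thread_ts + 120

thread_signal = bin_counts(thread_ts, 0, 900, 60)
youtube_signal = bin_counts(youtube_ts, 0, 900, 60)

peak, lag = peak_cross_correlation(youtube_signal, thread_signal)
print(f"peak correlation {peak:.2f} at a lag of {lag} bins")
```

In this toy data the peak falls at a lag of two one-minute bins, matching the two-minute delay built into the YouTube timestamps. A high peak at a short lag is the kind of synchronization the interview describes; real comment streams would be far noisier.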

Karen Roby: You've identified the videos and started to determine a risk value for future videos. Certainly that could be helpful for companies like YouTube and Google, because content moderation is really tricky. 

Gianluca Stringhini: I think the main problem comes from the way content moderation started. It was all about spam detection or removing automated content and so on. Both in the research community and at companies, we've been developing systems for 20-plus years to identify content that is automatically generated and clearly malicious, right? If you think about spam, it's a black-and-white problem. It's either spam or it's not.

When you talk about activity that's human driven and very context dependent, it tends to become very nuanced, and there are many gray areas. This is why at the moment we don't have systems to detect this type of activity that are as accurate as the ones we developed for spam detection, malware detection, and so on. This is why human content moderation is still required.

So really, the problem here becomes, can we actually reduce the number of comments or content that moderators need to look at in a way that facilitates their job? Can we potentially automatically delete some of this content, which is clearly bad, and only have them make a judgment for bad content that is context-dependent or falls within a gray area? Maybe it's culturally dependent.
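One way to picture that division of labor is a simple triage step. The sketch below is only an assumption-laden illustration: it presumes some classifier has already given each comment a score between 0 and 1 (the `score` field and both thresholds are hypothetical), removes only the highest-scoring comments automatically, and queues the gray-area ones for a human moderator.

```python
from typing import NamedTuple

class Comment(NamedTuple):
    text: str
    score: float  # hypothetical classifier score in [0, 1]; higher = more likely hateful

def triage(comments, remove_at=0.95, review_at=0.50):
    """Split comments into auto-remove, human-review, and keep queues."""
    queues = {"remove": [], "review": [], "keep": []}
    for comment in comments:
        if comment.score >= remove_at:
            queues["remove"].append(comment)   # clearly bad: no human needed
        elif comment.score >= review_at:
            queues["review"].append(comment)   # gray area: a moderator decides
        else:
            queues["keep"].append(comment)
    return queues

sample = [
    Comment("placeholder comment A", 0.99),
    Comment("placeholder comment B", 0.70),
    Comment("placeholder comment C", 0.10),
]
queues = triage(sample)
print({name: len(items) for name, items in queues.items()})
```

Where the two thresholds sit is a policy choice: raising `remove_at` sends more content to moderators, while lowering it risks deleting context-dependent comments without review.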

