New Research Suggests Sentiment Analysis is Critical in Content Moderation

At Two Hat, research is the foundation of everything we do. We love to ask big questions and seek even bigger answers. And thanks to a generous grant from Mitacs, we’ve partnered with leading Canadian universities to conduct research into the subjects that we’re most passionate about — from protecting children by detecting child sexual abuse material to developing new and innovative advances in chat moderation.

Most recently, Université Laval student researcher Éloi Brassard-Gourdeau and professor Richard Khoury asked the question “What is the most accurate and effective way to detect toxic (also known as disruptive) behavior in online communities?” Specifically, their hypothesis was:

“While modifying toxic content and keywords to fool filters can be easy, hiding sentiment is harder.”

They wanted to see if sentiment analysis was more effective than keyword detection when identifying disruptive content like abuse and hate speech in online communities.

In “Impact of Sentiment Detection to Recognize Toxic and Subversive Online Comments,” Brassard-Gourdeau and Khoury analyzed over a million online comments using one Reddit and two Wikipedia datasets. The results show that sentiment information helps improve toxicity detection in all cases. In other words, adding the general sentiment of a comment (whether it’s positive or negative) detects toxicity more effectively than keyword analysis alone.

But the real boost came when they used sentiment analysis on subversive language; that is, when users attempted to mask toxic keywords using L337 5p33k, deliberate misspellings, and word substitutions. According to the study, “The introduction of subversion leads to an important drop in the accuracy of toxicity detection in the network that uses the text alone… using sentiment information improved toxicity detection by as much as 3%.”
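To make the idea concrete, here is a minimal toy sketch in Python of why a subverted comment can slip past a keyword filter while its sentiment still gives it away. This is not the model from the paper, and it is not how Community Sift works; the word lists, scores, threshold, and function names are all hypothetical and invented purely for illustration.

```python
import re

# Hypothetical word lists, invented for this illustration only.
BLOCKED_KEYWORDS = {"stupid", "worthless"}
SENTIMENT_LEXICON = {"hate": -2, "awful": -2, "loser": -1,
                     "love": 2, "great": 2, "thanks": 1}


def tokenize(comment):
    """Lowercase the comment and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", comment.lower())


def keyword_hit(comment):
    """Return True if any token exactly matches a blocked keyword."""
    return any(token in BLOCKED_KEYWORDS for token in tokenize(comment))


def sentiment_score(comment):
    """Sum the lexicon scores of known tokens; unknown tokens count as 0."""
    return sum(SENTIMENT_LEXICON.get(token, 0) for token in tokenize(comment))


def flag(comment, threshold=-2):
    """Flag on a keyword hit, or when sentiment is at or below the threshold."""
    return keyword_hit(comment) or sentiment_score(comment) <= threshold


plain = "you are stupid and worthless, i hate you"
subverted = "you are st00pid and w0rthl3ss, i hate you"
friendly = "great game, thanks everyone"

for comment in (plain, subverted, friendly):
    print(comment, "| keyword:", keyword_hit(comment), "| flagged:", flag(comment))
```

In this toy setup, the misspelled keywords in the second comment no longer match the blocklist, but the rest of the sentence still reads as negative, so the combined check flags it. A real system would rely on trained models rather than hand-written word lists.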

You may be asking yourself, why does this matter? With chat moderation becoming more common in games and social apps, more users will find creative ways to subvert filters. Even the smartest content moderation tools on the market (like Two Hat’s Community Sift, which uses a unique AI called Unnatural Language Processing to detect complex manipulations) will find it increasingly difficult to flag disruptive content. As an industry, it’s time we started looking for innovative solutions to a problem that will only get harder over time.

In addition to asking big questions and seeking even bigger answers, we have several foundational philosophies at Two Hat that inform our technology. We believe that computers should do computer work and humans should do human work, and that an ensemble approach is key to exceptional AI.

This study validates our assumption that using multiple data points and multiple models in automated moderation algorithms is critical to boosting accuracy and ensuring a better user experience.

“We are in an exciting time in AI and content moderation,” says Two Hat CEO and founder Chris Priebe. “I am so proud of our students and the hard work they are doing. Every term they are pushing the boundaries of what is possible. Together, we are unlocking more and more pieces to the recipe that will one day make an Internet where people can share without fear of harassment or abuse.”

To learn more, check out the full paper here.

Keep watching this space for more cutting-edge research. And stay tuned for major product updates and product launches from Two Hat in 2019!