New Research Suggests Sentiment Analysis is Critical in Content Moderation

At Two Hat, research is the foundation of everything we do. We love to ask big questions and seek even bigger answers. And thanks to a generous grant from Mitacs, we’ve partnered with leading Canadian universities to conduct research into the subjects that we’re most passionate about — from protecting children by detecting child sexual abuse material to developing new and innovative advances in chat moderation.

Most recently, Université Laval student researcher Éloi Brassard-Gourdeau and professor Richard Khoury asked the question “What is the most accurate and effective way to detect toxic (also known as disruptive) behavior in online communities?” Specifically, their hypothesis was:

“While modifying toxic content and keywords to fool filters can be easy, hiding sentiment is harder.”

They wanted to see if sentiment analysis was more effective than keyword detection when identifying disruptive content like abuse and hate speech in online communities.

Image: Definitions of sentiment analysis, toxicity, subversion, and keywords in content moderation

In Impact of Sentiment Detection to Recognize Toxic and Subversive Online Comments, Brassard-Gourdeau and Khoury analyzed over a million online comments using one Reddit and two Wikipedia datasets. The results show that sentiment information helps improve toxicity detection in all cases. In other words, the general sentiment of a comment — whether it’s positive or negative — is a more effective measure of toxicity than keyword analysis alone.

But the real boost came when they used sentiment analysis on subversive language; that is, when users attempted to mask sentiment using L337 5p33k, deliberate misspellings, and word substitutions. According to the study, “The introduction of subversion leads to an important drop in the accuracy of toxicity detection in the network that uses the text alone… using sentiment information improved toxicity detection by as much as 3%.”
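To make the idea concrete, here is a toy sketch of sentiment-assisted toxicity detection. It is not the paper’s neural model; the word lists, scoring, and threshold below are invented purely for illustration:

```python
# Toy illustration: a keyword filter alone vs. a keyword filter backed by
# a crude sentiment signal. All word lists and thresholds are invented.

TOXIC_KEYWORDS = {"idiot", "trash"}                          # hypothetical blocklist
NEGATIVE_LEXICON = {"hate", "worthless", "stupid", "awful"}  # hypothetical lexicon

def _tokens(comment: str) -> list:
    return [w.strip(".,!?") for w in comment.lower().split()]

def keyword_flag(comment: str) -> bool:
    # The classic approach: match against a list of known toxic terms.
    return any(w in TOXIC_KEYWORDS for w in _tokens(comment))

def sentiment_score(comment: str) -> int:
    # Crude stand-in for a real sentiment model: count negative words.
    return -sum(w in NEGATIVE_LEXICON for w in _tokens(comment))

def is_toxic(comment: str, sentiment_threshold: int = -2) -> bool:
    # Flag if either signal fires: a known keyword OR strongly negative sentiment.
    return keyword_flag(comment) or sentiment_score(comment) <= sentiment_threshold

# "1d1ot" slips past the keyword list, but the surrounding sentiment does not.
subverted = "I hate you, you worthless 1d1ot"
print(keyword_flag(subverted))   # False: obfuscation fooled the filter
print(is_toxic(subverted))       # True: the sentiment signal still fires
```

Obfuscating every negative word in a message is much harder than obfuscating a single slur, which is the intuition behind the paper’s result.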

You may be asking yourself: why does this matter? With chat moderation becoming more common in games and social apps, more users will find creative ways to subvert filters. Even the smartest content moderation tools on the market (like Two Hat’s Community Sift, which uses a unique AI called Unnatural Language Processing to detect complex manipulations) will find it increasingly difficult to flag disruptive content. As an industry, it’s time we started looking for innovative solutions to a problem that will only get harder over time.

In addition to asking big questions and seeking even bigger answers, we have several foundational philosophies at Two Hat that inform our technology. We believe that computers should do computer work and humans should do human work, and that an ensemble approach is key to exceptional AI.

This study validates our assumption that using multiple data points and multiple models in automated moderation algorithms is critical to boosting accuracy and ensuring a better user experience.

“We are in an exciting time in AI and content moderation,” says Two Hat CEO and founder Chris Priebe. “I am so proud of our students and the hard work they are doing. Every term they are pushing the boundaries of what is possible. Together, we are unlocking more and more pieces to the recipe that will one day make an Internet where people can share without fear of harassment or abuse.”

To learn more, check out the full paper here.

Keep watching this space for more cutting-edge research. And stay tuned for major product updates and product launches from Two Hat in 2019!

From Censorship to Diligence: How Chat Moderation is Evolving

In the past, once you figured out there was a problem in your online community, it was probably too late to do much about it. That’s because chat and other social features were often home-grown or off-the-shelf solutions tacked on to games and other communities after the fact, rather than baked into product strategy.

So the go-to solution when there was a problem in a chat community was simply to disallow (formerly known as blacklisting) ‘offensive’ users. But blacklisting alone (words, people, etc.) doesn’t do anything to solve the underlying issues, and it invites accusations of censorship against community managers (i.e. your brand). It was (and for some, still is) an unsustainable approach, and a new way of thinking was needed.

Today, chat and chat moderation are considered and strategized for at the design stage. Community guidelines, policies and the means to educate users as to acceptable chat conduct are established before products ever go to market. There are many reasons for the change, but the biggest may be the global shift to prioritizing a user’s experience with your product, rather than the product itself.

Experience first
The broader shift to experience-first practices has opened the door for brands to leverage chat communities as revenue drivers (see our whitepaper, An Opportunity to Chat, available on VentureBeat, for more on that).

At the same time though, prioritizing chat moderation means brands and community managers need to ask themselves some very tough, complex questions that they didn’t have to ask before.

Will new community members who may not know the rules be subject to the same policies as veteran users? How will you account for variance in ages (8-year-olds communicate differently than 38-year-olds)? What are your moderators going to do if a user threatens someone, or starts talking suicide? Should people be able to report one another? Are bans permanent? Do they carry over to other products, brands or communities?

Answering these questions takes a lot of research, discussion, and forethought. More than anything, it’s essential to be sure that the answers you arrive at, and the community experience you build, move your brand away from being perceived by users as a censor of discussion, and towards being perceived as their diligent partner in creating a great experience.

Diligence is the opposite of censorship
One conversation we often have when discussing chat moderation policies with clients begins with the concern that moderation takes freedom of expression away from users. We try to turn that concern into a more productive discussion about how chat moderation is often misaligned with product, brand, and business strategy. In fact, it is essential for brands to move away from thinking of chat moderation as just a tool for managing risk, and towards the realization that it’s also a way to identify your most influential and profitable users. Why?

Because chat and chat moderation drive clear business improvements in user engagement, retention, and lifetime value. We also know that positive chat experiences contribute to ‘K Factor’, or virality, i.e. the better the chat experience, the more likely a user is to share their satisfaction with a friend.
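For readers unfamiliar with the term, the standard K-factor calculation multiplies the invites each user sends by the rate at which those invites convert; K above 1 means each cohort of users recruits a larger one. The numbers below are purely illustrative:

```python
# K-factor (virality): invites sent per user times invite conversion rate.
# K > 1.0 means organic growth compounds; K < 1.0 means growth stalls
# without paid acquisition. All inputs here are made-up example figures.

def k_factor(invites_per_user: float, conversion_rate: float) -> float:
    return invites_per_user * conversion_rate

# A better chat experience that nudges invites from 5 to 8 per user
# pushes this hypothetical community over the viral threshold.
print(round(k_factor(5, 0.15), 2))  # 0.75, below the viral threshold
print(round(k_factor(8, 0.15), 2))  # 1.2, above it
```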

So then, far from fearing the label of limiting user expression, the discussion your team needs to have about chat moderation is: “How can we encourage and scale the types of chat experiences shared by our most valuable users?”

Instead of just muting those who use bad words, pick out the positive things influential users chat about and see how they inspire others to engage and stick around. Discover what your most valuable, long-term users are chatting about and figure out how to surface those conversations for new and prospective users, sooner, and to greater effect.

Don’t fear the specter of censorship. Embrace the role of chat moderation as a powerful instrument of diligence, a productive business tool, and the backbone for a great user experience.

Will This New AI Model Change How the Industry Moderates User Reports Forever?

Picture this:

You’re a moderator for a popular MMO. You spend hours slumped in front of your computer reviewing a seemingly endless stream of user-generated reports. You close most of them — people like to report their friends as a prank or just to test the report feature. After the 500th junk report, your eyes blur over and you accidentally close two reports containing violent hate speech — and you don’t even realize it. Soon enough, you’re reviewing reports that are weeks old — and what’s the point in taking action after so long? There are so many reports to review, and never enough time…

Doesn’t speak to you? Imagine this instead:

You’ve been playing a popular MMO for months now. You’re a loyal player, committed to the game and your fellow players. Several times a month, you purchase new items for your avatar. Recently, another player has been harassing you and your guild, using racial slurs, and generally disrupting your gameplay. You keep reporting them, but it seems like nothing ever happens – when you log back in the next day, they’re still there. You start to think that the game creators don’t care about you – are they even looking at your reports? You see other players talking about reports on the forum: “No wonder the community is so bad. Reporting doesn’t do anything.” You log on less often; you stop spending money on items. You find a new game with a healthier community. After a few months, you stop logging on entirely.

Still doesn’t resonate? One last try:

You’re the General Manager at a studio that makes a high-performing MMO. Every month your Head of Community delivers reports about player engagement and retention, operating costs, and social media mentions. You notice that operating costs go up while the lifetime value of a user is going down. Your Head of Community wants to hire three new moderators. A story in Wired is being shared on social media — players complain about rampant hate speech and homophobic slurs in the game that appear to go unnoticed. You’re losing money and your brand reputation is suffering — and you’re not happy about it.

The problem with reports
Most social platforms give users the ability to report offensive content. User-generated reports are a critical tool in your moderation arsenal. They surface high-risk content that you would otherwise miss, and they give players a sense of ownership over and engagement in their community.

They’re also one of the biggest time-wasters in content moderation.

Some platforms receive thousands of user reports a day. Up to 70% of those reports don’t require any action from a moderator — yet moderators have to review them all. And the reports that do require action often contain content so obviously offensive that an algorithm should be able to detect it automatically. In the end, the reports that truly need human eyes to make a fair, nuanced decision often get passed over.

Predictive Moderation
For the last two years, we’ve been developing and refining a unique AI model to label and action user reports automatically, mimicking a human moderator’s workflow. We call it Predictive Moderation.

Predictive Moderation is all about efficiency. We want moderation teams to focus on the work that matters — reports that require human review, and retention and engagement-boosting activities with the community.

Two Hat’s technology is built around the philosophy that humans should do human work, and computers should do computer work. With Predictive Moderation, you can train our innovative AI to do just that — ignore reports that a human would ignore, action on reports that a human would action on, and send reports that require human review directly to a moderator.
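That three-way routing can be sketched as a simple confidence-threshold triage. To be clear, the thresholds and the idea of a single probability score are our illustrative assumptions, not a description of Two Hat’s actual model:

```python
# Hypothetical three-way triage for user reports: a model estimates the
# probability that a report needs action, and only uncertain cases reach
# a human moderator. Threshold values here are illustrative.

def triage(action_probability: float,
           close_below: float = 0.2,
           action_above: float = 0.9) -> str:
    if action_probability < close_below:
        return "auto-close"      # confident the report needs no action
    if action_probability > action_above:
        return "auto-action"     # confident the report needs action
    return "human-review"        # ambiguous: route to a moderator

print(triage(0.05))  # auto-close
print(triage(0.95))  # auto-action
print(triage(0.50))  # human-review
```

The payoff is that the large no-action majority of reports is closed automatically, and moderators only ever see the genuinely ambiguous middle band.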

What does this mean for you? A reduced workload, moderators who are protected from having to read high-risk content, and an increase in user loyalty and trust.

Getting started
We recently completed a sleek redesign of our moderation layout (check out the sneak peek!). Clients begin training the AI on their datasets in January. Luckily, training the model is easy — moderators simply review user reports in the new layout, closing reports that don’t require action and actioning those that do.

Image of chat moderation workflow for user-generated reports
Layout subject to change

“User reports are essential to our game, but they take a lot of time to review,” says one of our beta clients. “We are highly interested in smarter ways to work with user reports which could allow us to spend more time on the challenging reports and let the AI take care of the rest.”

Want to save time, money, and resources? 
As we roll out Predictive Moderation to everyone in the new year, expect to see more information including a brand-new feature page, webinars, and blog posts!

In the meantime, do you:

  • Have an in-house user report system?
  • Want to increase engagement and trust on your platform?
  • Want to prevent moderator burnout and turnover?

If you answered yes to all three, you might be the perfect candidate for Predictive Moderation.

Contact us at to start the conversation.

Two Hat CEO and founder Chris Priebe hosts a webinar on Wednesday, February 20th, where he’ll share Two Hat’s vision for the future of content moderation, including a look at how Predictive Moderation is about to change the landscape of chat moderation. Don’t miss it — the first 25 attendees will receive a free Two Hat gift bag!

How Do You Calculate the ROI of Proactive Moderation in Chat?

On Tuesday, October 30th, I’m excited to be talking to Steve Parkis, a senior tech and entertainment executive who drove amazing growth in key products at Disney and Zynga, about how chat has a positive effect on user retention and overall revenue. It would be great to have you join us — you can sign up here.

Until then, I would like to get the conversation started here.

There is a fundamental understanding in online industries that encouraging prosocial, productive interactions and curbing anti-social, disruptive behavior in our online communities are important things to do.

The question I’ve been asking myself lately is this — do we have the numbers to prove that proactive moderation and other approaches are business crucial?

In my experience, our industries (games, apps, social networks, etc.) lack the studies and numbers to prove that encouraging productive interactions and tackling negative ones have a key impact on user engagement, retention, and growth.

This is why I’m on a mission this quarter to create new resources, including a white paper, that will shed light on this matter, and hopefully help as many people as possible in their quest to articulate this connection.

First steps and big questions
We already know that chat and social features are good for business — we have lots of metrics around this — but the key info that we’re missing is the ROI of proactive moderation and other community measures. Here’s where I need your help, please:

  • How have you measured the success of filtering and other approaches to tackle disruptive behavior (think spam, fraud, hate speech, griefing, etc) as it relates to increased user retention and growth in your communities?
  • Have you measured the effects of implementing human and/or automated moderation in your platforms, be it related to usernames, user reports, live chat, forum comments, and more?
  • Why have you measured this?

I believe the way we are currently operating is self-sabotage. By not measuring and surfacing the business benefits of proactive moderation and other measures to tackle anti-social and disruptive behaviour, our departments are usually seen as cost-centers rather than key pieces in revenue generation.

I believe that our efforts are crucial to removing the blockers to growth in our platforms, and also encouraging and fostering stronger user engagement and retention.

Starting the conversation
I’ve talked to many of you and I’m convinced we feel the same way about this and see similar gaps. I invite you to email your comments and thoughts to

Your feedback will help inform my next article as well as my next steps. So what’s in it for you? First, I’ll give you a shoutout (if you want) in the next piece about this topic, and will also give you exclusive access to the resources once they are ready, giving you credit where it’s due. You will also have my deepest gratitude : ) You know you can also count on me for help with any of your projects!

To recap, I would love to hear from you about how you and your company are measuring the return on investment from implementing measures (human and/or technology driven) to curb negative, antisocial behaviour in your platforms.

How are you thinking about this, what are you tracking, and how are you analyzing this data?

Thanks in advance for your input. I look forward to reading it!

Adding Chat to Your Online Platform? First Ask Yourself These 4 Critical Questions

Want to retain users and lower the cost of acquisition on your platform? In 2018, social features including chat, private messaging, usernames, and user profiles are all must-haves in an overstuffed market where user retention is critical to long-term success. Nothing draws a crowd like a crowd — and a crowd of happy, loyal, and welcoming users will always bring in more happy, loyal, and welcoming users.

But there will always be risks involved when adding social features to your platform. A small percentage of users will post unwanted content like hate speech, NSFW images, or abusive language, all of which can cause serious damage to your brand’s reputation.

So while social features are must-haves in 2018, understanding — and mitigating — the risks inherent in adding those features is equally important.

If you’re just getting started with chat moderation (and even if you’ve been doing it for a while), here are four key questions to ask.

1. How much risk is my platform/brand willing to accept?
Every brand is different. Community demographic will usually be a major factor in determining your risk tolerance.

For instance, communities with users under 13 in the US have to be COPPA compliant, so preventing users from sharing PII (personally identifiable information) is essential. Edtech platforms have to mitigate risk by ensuring that they’re CIPA and FERPA compliant.

With legal ramifications to consider, platforms designed for young people will always be far more risk-averse than brands marketed towards more mature audiences.

However, many older, more established brands — even those that cater to adult audiences — will likely be less tolerant of risk than smaller or newer organizations.

Consider your brand’s tone and history. Review your corporate guidelines to understand what your brand stands for. This is a great opportunity to define exactly what kind of an online community you want to create.

2. What kind of content is most dangerous to my platform/brand?
Try this exercise: Imagine that one item (say, a forum post or profile pic) containing pornography was posted on your platform. How would it affect the brand? How would your audience react to seeing pornography on your platform? How would your executive team respond? What would happen if the media/press found out?

Same with PII — for a brand associated with children or teens, this could be monumental. (And if it happens on a platform aimed at users under 13 in the US, a COPPA violation can lead to potentially millions of dollars in fines.)

What about hate speech? Sexual harassment? What is your platform/brand’s definition of abuse or harassment? The better you can define these terms in relation to your brand, the better you will understand what kind of content you need to moderate.

3. How will I communicate my expectations to the community?
Don’t expect your users to automatically know what is and isn’t acceptable on your platform. Post your community guidelines where users can see them. Make sure users have to agree to your guidelines before they can post.

In a recent blog for CMX, Two Hat Director of Community Trust & Safety Carlos Figueiredo explores writing community guidelines you can stick to. In it, he provides an engaging framework for everything from creating effective guidelines from the ground up, to collaborating with your production team to create products that encourage healthy interactions.

4. What tools can I leverage to manage risk and enforce guidelines in my community?
We recommend taking a proactive instead of a reactive approach to managing risk. What does that mean for chat moderation? First, let’s look at the different kinds of chat moderation:

  • Live moderation: Moderators follow live chat in real time and take action as needed. High risk, very expensive, and not a scalable solution.
  • Pre-moderation: Moderators review, then approve or reject all content before it’s posted. Low risk, but slow, expensive, and not scalable.
  • Post-moderation: Moderators review, then approve or reject all content after it’s posted. High-risk option.
  • User reports: Moderators depend on users to report content, then review and approve or reject. High-risk option.

On top of these techniques, there are also different tools you can use to take a proactive approach, including in-house filters (read about the build internally vs buy externally debate), and content moderation solutions like Two Hat’s Community Sift (learn about the difference between a simple profanity filter and a content moderation tool).

So what’s the best option?

Regardless of your risk tolerance, always use a proactive filter. Content moderation solutions like Two Hat’s Community Sift can be tuned to match your risk profile. Younger communities can employ a more restrictive filter, and more mature communities can be more permissive. You can even filter just the topics that matter most. For example, mature communities can allow sexual content while still blocking hate speech.
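Topic-level tuning like that can be pictured as a per-topic policy table. This is a generic sketch, not Community Sift’s actual API or configuration format:

```python
# Illustrative per-topic filter policies tuned to a community's risk
# profile. Topic names and the "allow"/"filter" vocabulary are invented.

POLICY_MATURE = {"hate_speech": "filter", "sexual": "allow", "pii": "filter"}
POLICY_KIDS   = {"hate_speech": "filter", "sexual": "filter", "pii": "filter"}

def route_message(topics_detected: set, policy: dict) -> str:
    # Block the message if any detected topic is filtered under the policy;
    # topics absent from the policy default to "allow".
    if any(policy.get(t, "allow") == "filter" for t in topics_detected):
        return "block"
    return "allow"

print(route_message({"sexual"}, POLICY_MATURE))       # allow
print(route_message({"sexual"}, POLICY_KIDS))         # block
print(route_message({"hate_speech"}, POLICY_MATURE))  # block
```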

By using a proactive filter, you’ve already applied the first layer of risk mitigation. After that, we recommend using a blend of all four kinds of moderation, based on your brand’s unique risk tolerance. Brands that are less concerned about risk can depend mostly on user reports, while more risk-averse platforms can pre- or post-moderate content that they deem potentially risky, but not risky enough to filter automatically.

Once you understand and can articulate your platform/brand’s risk tolerance, you can start to build Terms of Use and community guidelines around it. Display your expectations front and center, use proven tools and techniques to manage risk, and you’ll be well on your way to building a healthy, thriving, and engaged community of users — all without putting your brand’s reputation at risk.

Now, with your brand protected, you can focus on user retention and revenue growth.

Community Sift Available for the Nintendo Switch

Do you remember the first time you heard the Super Mario Bros. theme music?

The soundtrack to a million childhoods, that sprightly 8-bit calypso-inspired theme rarely fails to conjure up cherished adolescent memories.

Are you sitting in an office right now? Try singing it. Do-do-do… do-do-do-do… do-do-do-do-do-do-do-do-do-do… Is your officemate singing along yet?

Of course they are.

One of my best memories as a kid was playing GoldenEye 007… definitely built a love for FPSs. And it was amazing to see how easy the golden gun could destroy friendships.

– Andrew

If you’re reading this, you probably have at least one fond Nintendo-related memory. If you grew up in the 80s, 90s, or 00s, you probably played Super Mario Bros, The Legend of Zelda, Street Fighter, Pokémon … and countless other classic titles. We sure did.

I went to Disney World when I was 11, went back to Brazil with a game boy and Donkey Kong. Played that a lot! That rocked my life.

– Carlos

That’s why we’re so pumped to announce that Community Sift, our high-risk content detection system and moderation tool for social products, is now an approved tool in the Nintendo Switch™ developer portal.

If you’re developing a Nintendo Switch™ game that features UGC (User-Generated Content), Community Sift can help keep your users safe from dangerous content like bullying, harassment, and child exploitation.

My uncle was in the Navy and traveled lots. In one of his trips to the UK he got me an NES, I was the most excited 6-year-old… only to discover it was “region locked” and could not use it at all. A year or so later it was released in Mexico and “Santa” came through.

– Sharon

What was so awesome about those Nintendo games that we grew up playing in our bedrooms, our basements, and our best friend’s living rooms? They were created for everyone to enjoy. Our parents didn’t have to worry about content (ok, maybe Street Fighter freaked them out a little bit).

We connected with friends, siblings, cousins, and neighbors. (With siblings, sometimes the controllers connected with our skulls.) Even though we were competing, we still felt a sense of camaraderie and belonging in the kingdoms of Mushroom and Hyrule.

Me and my two other siblings would always fight for a controller to play Super Mario Bros or Duck Hunt. When we couldn’t agree on whose turn it was on our own and someone (me, the youngest) inevitably ended up in tears, my parents would take the controllers for themselves and make us watch them play.

– Maddy

That’s what the best Nintendo games do — they bring us together, across cultures, languages, and economic and social lines. In this newly connected gaming world, it’s more important than ever that we preserve that sense of connection — and do it with safety in mind.

I grew up in a house without video games. My parents were worried that video games would corrupt my frail mind, so as a kid I had to go to my cousin’s house to experience the wonder of the NES. Of course, every time I would go there, we’d play Super Mario Bros. and Street Fighter until our eyes bugged out of our heads.

– Owen

That’s why we’re so excited that Community Sift is an approved tool in the Nintendo Switch™ developer’s portal. Now, if you’re building a Nintendo game that connects players through UGC, you can ensure that they are just as safe as we were when we were kids.

We believe in a world free of online bullying, harassment, and child exploitation, and we’ve built our product with that vision in mind. With Community Sift, you can protect your users from the really dangerous stuff that puts them at risk.

The moment I discovered that the bushes and the clouds were the same shape, different color in Mario. Mind blown as a kid.


Whether you feature chat, usernames, profile pics, private messages (and more), our dream is to help developers craft safe, connected experiences. And isn’t that what Nintendo is all about?

We can’t wait to help you inspire another generation of dreamers, creators, and players. 

I didn’t grow up with video games in my house. But the year I was ten my family and I spent one glorious summer house-sitting for friends who had an NES in their basement. My little sister and I spent countless hours playing Super Mario Bros and when we got bored with falling down tubes and jumping on Goomba heads, Duck Hunt. I never finished the game because I couldn’t defeat Bowser in the last castle. To this day, it remains one of my greatest disappointments.

– Leah

If you’re developing a game for the Nintendo Switch™ and are interested in using Community Sift to moderate content, we would love to hear from you. Book a free demo here, or get in touch with us at

In parting, we leave you with this — the best and sweetest memory in the office:

Does making my mom play the Bowser castles on an original Nintendo count as a story? Because I was like 7 and he scared me? LOL

– Jenn






Top 6 Reasons You Should Combine Automation and Manual Review in Your Image Moderation Strategy

When you’re putting together an image moderation strategy for your social platform, you have three options:

  1. Automate everything with AI,
  2. Do everything manually with human moderators, or
  3. Combine both approaches for Maximum Moderation Awesomeness™

When consulting with clients and industry partners like PopJam, unsurprisingly, we advocate for option number three.

Here are our top six reasons why:

1. Human beings are, well… human (Part 1)
We get tired, we take breaks, and we don’t work 24/7. Luckily, AI hasn’t gained sentience (yet), so we don’t have to worry (yet) about an algorithm troubling our conscience when we make it work without rest.

2. Human beings are, well… human (Part 2)
In this case, that’s a good thing. Humans are great at making judgments based on context and cultural understanding. An algorithm can find a swastika, but only a human can say with certainty if it’s posted by a troll propagating hate speech or is instead a photo from World War II with historical significance.

3. We’re in a golden age of AI
Artificial intelligence is really, really good at detecting offensive images with near-perfect accuracy. For context, this wasn’t always the case. Even 10 years ago, image scanning technology was overly reliant on “skin tone” analysis, leading to some… interesting false positives.

Babies, being (sometimes) pink, round, and strangely out of proportion, would often trigger false positives.

And while some babies may not be especially adorable, it was a bit cruel to label them “offensive.”

Equally inoffensive but often the cause of false positives were light oak-coloured desks, chair legs, marathon runners, some (but not all) brick walls, and, even more bizarrely, balloons.

Today, the technology has advanced so far that it can distinguish between bikinis, shorts, beach shots, scantily-clad “glamour” photography, and explicit adult material.

4. Human beings are, well… human (Part 3)
As we said, AI doesn’t yet have the capacity for shock, horror, or emotional distress of any kind.

Until our sudden but inevitable overthrow by the machines, go ahead and let AI automatically reject images with a high probability of containing pornography, gore, or anything that could have a lasting effect on your users and your staff.

That way, human mods can focus on human stuff like reviewing user reports and interacting with the community.

5. It’s the easiest way to give your users an unforgettable experience
The social app market is already overcrowded. “The next Instagram” is released every day. In a market where platforms vie to retain users, it’s critical that you ensure positive user experiences.

With AI, you can approve and reject posts in real-time, meaning your users will never have to wait for their images to be reviewed.
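As a sketch of how that real-time routing might work, consider the threshold-based function below. The classifier scores, threshold values, and function name are illustrative assumptions for this post, not Community Sift’s actual API; in practice you would tune thresholds against your own guidelines and a labelled validation set.

```python
def route_image(nudity_score: float, gore_score: float) -> str:
    """Route an uploaded image based on hypothetical classifier scores in [0, 1].

    The 0.90 / 0.10 thresholds are placeholders; tune them for your
    own community guidelines and tolerance for false positives.
    """
    risk = max(nudity_score, gore_score)
    if risk >= 0.90:
        return "reject"         # near-certain violation: block instantly
    if risk <= 0.10:
        return "approve"        # near-certain safe: publish immediately
    return "manual_review"      # grey area: queue for a human moderator

# A clearly safe image is published with no human in the loop...
print(route_image(nudity_score=0.02, gore_score=0.01))  # approve
# ...while a borderline one waits for a moderator.
print(route_image(nudity_score=0.55, gore_score=0.05))  # manual_review
```

The key design point is the middle band: automation handles the confident cases at both ends, and only the ambiguous slice ever reaches a person.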

And with human moderators engaging with the community — liking posts, upvoting images, and promptly reviewing and actioning user reports — your users will feel supported, safe, and heard.

You can’t put a price on that… no wait, you can. It’s called Customer Acquisition Cost (CAC), and it can make or break a business that struggles to retain users.

6. You’re leveraging the best of both worlds
AI is crazy fast, scanning millions of images a day. By contrast, a human moderator can review roughly 2,500 images a day before their eyes start to cross and mistakes pile up. AI is more accurate than ever, but humans provide enhanced precision by understanding context.

A solid image moderation process supported by cutting-edge tech and a bright, well-trained staff? You’re well on your way to Maximum Moderation Awesomeness™.

Want to learn how one social app combines automation with manual review to reduce their workload and increase user engagement? Sign up for our webinar featuring the community team from PopJam!

Optimize Your Image Moderation Process With These Five Best Practices

If you run or moderate a social sharing site or app where users can upload their own images, you know how complex image moderation can be.

We’ve compiled five best practices that will make life a lot easier for you and your moderation team.

1. Create robust internal moderation guidelines
While you’ll probably rely on AI to automatically approve and reject the bulk of submitted images, there will be images that an algorithm misses, or that users have reported as being inappropriate. In those cases, it’s crucial that your moderators are well-trained and have the resources at their disposal to make what can sometimes be difficult decisions.

Remember the controversy surrounding Facebook earlier this year when they released their moderation guidelines to the public? Turns out, their guidelines were so convoluted and thorny that it was near-impossible to follow them with any consistency. (To be fair, Facebook faces unprecedented challenges when it comes to image moderation, including incredibly high volumes and billions of users from all around the world.) There’s a lesson to be learned here, though, which is that internal guidelines should be clear and concise.

Consider — you probably don’t allow pornography on your platform, but how do you feel about bathing suits or lingerie? And what about drugs — where do you draw the line? Do you allow images of pills? Alcohol?

Moderation isn’t a perfect science; there will always be grey areas.

2. Consider context
When you’re deciding whether to approve or reject an image that falls into a grey area, remember to look at everything surrounding the image. What is the user’s intent in posting it? Is their intention to offend? Look at image tags, comments, and previous posts.

3. Be consistent when approving/rejecting images and sanctioning users
Your internal guidelines should ensure that you and your team make consistent, replicable moderation decisions. Consistency is so important because it signals to the community that 1) you’re serious about their health and safety, and 2) you’ve put real thought and attention into your guidelines.

A few suggestions for maintaining consistency:

  • Notify the community publicly if you ever change your moderation guidelines
  • Consider publishing your internal guidelines
  • Host moderator debates over challenging images and ask for as many viewpoints as possible; this will help avoid biased decision-making
  • When rejecting an image (even if it’s done automatically by the algorithm), automate a warning message to the user that includes community guidelines
  • If a user complains about an image rejection or account sanction, take the time to investigate and fully explain why action was taken

4. Map out moderation workflows
Take the time to actually sketch out your moderation workflows on a whiteboard. By mapping out your workflows, you’ll notice any holes in your process.

Here are just a few scenarios to consider:

  • What do you do when a user submits an image that breaks your guidelines? Do you notify them? Sanction their account? Do nothing and let them submit a new image?
  • Do you treat new users differently than returning users (see example workflow for details)?
  • How do you deal with images containing CSAM (child sexual abuse material, formerly referred to as child pornography)?
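One way to whiteboard those branches is to encode them directly, as in the sketch below. The severity labels, the new-user grace rule, and the three-strike threshold are all hypothetical examples of a policy, not a prescribed one; the point is that writing the workflow down exposes the cases you haven’t decided yet.

```python
def handle_violation(severity: str, prior_violations: int, is_new_user: bool) -> str:
    """Return a moderation action for a rejected image.

    Severity labels and thresholds are illustrative placeholders;
    replace them with your own community guidelines.
    """
    if severity == "illegal":
        # e.g. CSAM: always escalate, never silently delete
        return "escalate_to_authorities"
    if is_new_user and prior_violations == 0:
        # first-time newcomers get a warning plus the guidelines
        return "warn_with_guidelines"
    if prior_violations >= 3:
        # hypothetical three-strike rule
        return "suspend_account"
    return "reject_and_notify"
```

Every `return` here corresponds to a box on the whiteboard; an input combination with no clear answer is exactly the kind of hole the mapping exercise is meant to surface.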

Coming across an image that contains illegal content can be deeply disturbing.

5. Have a process to escalate illegal images
The heartbreaking reality of the internet is that it’s easier today for predators to share images than it has ever been. It’s hard to believe that your community members would ever upload CSAM, but it can happen, and you should be prepared.

If you have a Trust & Safety specialist, Compliance Officer, or legal counsel at your company, we recommend that you consult them for their best practices when dealing with illegal imagery. One option to consider is using Microsoft’s PhotoDNA, a free image scanning service that can automatically identify and escalate known child sexual abuse images to the authorities.

You may never find illegal content on your platform, but having an escalation process will ensure that you’re prepared for the worst-case scenario.

On a related note, make sure you’ve also created a wellness plan for your moderators. We’ll be discussing individual wellness plans, and other best practices, in more depth in our Image Moderation 101 webinar on August 22nd. Register today to save your seat for this short, 20-minute chat.

The Role of Image Filtering in Shaping a Healthy Online Community

Digital citizenship, online etiquette, and user behaviour involve many different tools of expression, from texting to photo sharing, and from voice chat to video streaming. In my last article, I wrote about who is responsible for the well-being of players/users online. Many of the points discussed relate directly to the challenges posed by chat communication.

However, those considerations apply equally to image sharing on our social platforms, and to the intent behind it.

Picture this
Online communities that allow users to share images face risks and challenges inherent to the medium: creating and/or sharing images is a popular form of online expression, there’s no shortage of images, and they come in all shapes, flavours, and forms.

Unsurprisingly, you’re bound to encounter images that will challenge your community guidelines (think racy pictures without obvious nudity), while others will simply be unacceptable (for example, pornography, gore, or drug-related imagery).

Fortunately, artificial intelligence has advanced to a point where it can do things that humans cannot; namely, handle incredibly high volumes while maintaining high precision and accuracy.

This is not to say that humans are dispensable. Far from it. We still need human eyes to make the difficult, nuanced decisions that machines alone can’t yet make.

For example, let’s say a user is discussing history with another user and wants to share a historical picture related to hate speech. Without the appropriate context, a machine could simply identify a hateful symbol on a flag and automatically block the image, stopping them from sharing it.

Costs and consequences
Without an automated artificial intelligence system for image filtering, a company is looking at two liabilities:

  • An unsustainable, unscalable model that incurs a manual cost tied to human moderation hours;
  • An increased psychological impact from exposing moderators to excessive amounts of harmful images.

The power of artificial intelligence
Automated image moderation can identify innocuous images and automate their approval. It can also identify key topics (like pornographic content and hateful imagery) with great accuracy and block them in real time, or hold them for manual review.

By using automation, you can remove two things from your moderators’ plates:

  • Context-appropriate images (most images: fun pictures with friends smiling, silly pictures, pets, scenic locations, etc.)
  • Images that are obviously against your community guidelines (think pornography or extremely gory content)

Also, a smart system can route grey-area images to your moderators for manual review, which means far less content to review than in the two scenarios explored above. By leveraging automation, you get less manual work (reduced workload, therefore reduced costs) and less negative impact on your moderation team.
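To make the workload claim concrete, here is a back-of-the-envelope calculation. The 80% auto-approve and 15% auto-reject rates are hypothetical figures chosen only for illustration; real rates depend on your community and classifier.

```python
daily_uploads = 100_000

# Hypothetical automation rates, for illustration only.
auto_approved = int(daily_uploads * 0.80)  # clearly safe images
auto_rejected = int(daily_uploads * 0.15)  # clear guideline violations

# Only the remaining grey area reaches human moderators.
manual_queue = daily_uploads - auto_approved - auto_rejected
print(manual_queue)  # 5000
```

Under these assumed rates, a queue that would overwhelm dozens of full-time reviewers shrinks to a volume a small team can handle, and that team sees far fewer disturbing images along the way.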

Give humans a break
Automated image moderation can also take the emotional burden off of your human moderators. Imagine yourself sitting in front of a computer for hours and hours, reviewing hundreds or even thousands of images, never knowing when your eyes (and mind) will be assaulted by a pornographic or very graphic violent image. Now consider the impact this has week after week.

What if a big part of that work could be handled by an automated system, drastically reducing the workload and, with it, the emotional impact of reviewing offensive content? Why wouldn’t we seek to improve our team’s working conditions and reduce employee burnout and turnover?

This isn’t only crucial for the business. It also means taking better care of your people and supporting them, which is key to company culture.

An invitation
Normally, I talk and write about digital citizenship as it relates to chat and text. Now, I’m excited to be venturing into the world of images and sharing as much valuable insight as I can with all of you. After all, image sharing is an important form of communication and expression in many online communities.

It would be great if you could join me for a short, 20-minute webinar we are offering on Wednesday, August 22nd. I’ll be talking about actionable best practices you can put to good use as well as considering what the future may hold for this space. You can sign up here.

I’m looking forward to seeing you there!

Originally published on LinkedIn by Carlos Figueiredo, Two Hat Director of Community Trust & Safety