Thinking of Building Your Own Chat Filter? Five Reasons You’re Wasting Your Time!

If you’re building an online community, whether a game or social network, flagging and dealing with abusive language and users is critical to success. Back in 2014, a Riot Games study suggested that users who experience abuse their first time in the game are potentially three times more likely to quit and never return.

“Chatting is a major step in our funnel towards creating engaged, paying users. And so, it’s really in Twitch’s best interests — and in the interest of most game dev companies and other social media companies — to make being social on our platform as pleasant and safe as possible.” – Ruth Toner, Twitch

At Two Hat, we found that smart moderation can potentially double user retention. And we’re starting to experience an industry-wide paradigm shift. Today, gaming and social companies realize that if they want to shape healthy, engaged, and ultimately profitable communities, they must employ some kind of chat filter and moderation software.

But that raises the question: should you build it yourself or use an outside vendor? Like anti-virus software, a chat filter is better left to a team dedicated, day in and day out, to keeping the software updated.

Here are a few things to consider before investing a great deal of time and expense in an in-house chat filter.

1. An allow/disallow list doesn’t work because language isn’t binary
Traditionally, most filters use a binary allow/disallow list. The thing is, language isn’t binary. It’s complex and nuanced.

For instance, in many older gaming communities, some swear words will be acceptable, based on context. You could build a RegEx tool to string match input text, and it would have no problem finding an f-bomb. But can it recognize the critical difference between “Go #$%^ yourself” and “That was #$%^ing awesome”?
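To make the limitation concrete, here is a minimal sketch of a RegEx blocklist. The word "heck" stands in for a real swear word, and the pattern and function name are illustrative assumptions rather than anyone's production filter.

```python
import re

# "heck" stands in for a real swear word; the blocklist is purely illustrative.
BLOCKLIST = re.compile(r"\bheck(?:ing|s)?\b", re.IGNORECASE)

def is_flagged(message: str) -> bool:
    """Return True when the message contains a blocklisted word."""
    return BLOCKLIST.search(message) is not None

attack = "Go heck yourself"
praise = "That was hecking awesome"

# Both calls return True: the pattern sees the word, not the intent behind it.
print(is_flagged(attack), is_flagged(praise))
```

The filter treats an attack on another player and an excited compliment as the same event, which is exactly the binary behavior described above.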

What if your players spell a word incorrectly? What if they use l337 5p34k (and they will)? What if they deliberately try to manipulate the filter?

It’s an endless arms race, and your users have way more time on their hands than you do.

Think about the hundreds of different variations of these phrases:

“You should kill yourself / She deserves to die / He needs to drink bleach / etc”
“You are a [insert racial slur here]”

Imagine the time and effort it would take to enter every single variation. Now add misspellings. Now add l337 mapping. Now add the latest slang. Now add the latest latest slang.
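A rough sketch of how quickly the variations pile up, assuming a small, hypothetical substitution table (real players use far more tricks than these six letters):

```python
from itertools import product

# Hypothetical substitution table; real players use many more tricks than these.
LEET = {"a": "a4@", "e": "e3", "i": "i1!", "o": "o0", "s": "s5$", "t": "t7"}

def variants(word: str) -> list[str]:
    """Enumerate every spelling reachable through the LEET table."""
    options = [LEET.get(ch, ch) for ch in word.lower()]
    return ["".join(combo) for combo in product(*options)]

# Reverse lookup so obfuscated text can be folded back to plain letters.
UNLEET = {sub: plain for plain, subs in LEET.items() for sub in subs}

def normalize(text: str) -> str:
    """Replace each known substitute character with its plain letter."""
    return "".join(UNLEET.get(ch, ch) for ch in text.lower())

print(len(variants("idiots")))   # 108 spellings of one six-letter word
print(normalize("1d10t5"))       # idiots
```

Normalizing before matching helps, but it only covers the substitutions you already know about. Spacing tricks, misspellings, and new slang each need their own handling.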

It never ends.

Now, imagine using a filter that has access to billions of lines of chat across dozens of different platforms. By using a third-party filter, you’ll benefit from the network effect, detecting words and phrases you would likely never find on your own.

2. Keep your team focused on building an awesome product — not chasing a few bad actors around the block

“When I think about being a game developer, it’s because we love creating this cool content and features. I wish we could take the time that we put into putting reporting [features] on console, and put that towards a match history system or a replay system instead. It was the exact same people that had to work on both who got re-routed to work on the other.” – Jeff Kaplan, Blizzard Entertainment

Like anything else built in-house, someone has to maintain the filter as well as identify and resolve specific incidents. If your plan is to scale your community, maintaining your own filter will quickly become unmanageable. The dev and engineering teams will end up spending more time keeping the community safe than actually building the community and features.

Compare that with simply tapping into the RESTful API of a service provider that reliably uses AI and human review to keep abusive language definitions current and quickly process billions of reports per day. Imagine letting community managers identify and effectively deal with the few bad actors while the rest of your team relentlessly improves the community itself.
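For a sense of how small the integration surface can be, here is a sketch that prepares a single classification request. The endpoint URL, payload fields, and bearer-token header are all hypothetical and do not describe any particular vendor's API. The request is built but not sent.

```python
import json
import urllib.request

# Hypothetical endpoint and payload fields; check your vendor's API reference.
API_URL = "https://moderation.example.com/v1/classify"

def build_request(user_id: str, text: str, api_key: str) -> urllib.request.Request:
    """Prepare a POST request that asks the service to classify one chat line."""
    body = json.dumps({"user_id": user_id, "text": text}).encode("utf-8")
    return urllib.request.Request(
        API_URL,
        data=body,
        method="POST",
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {api_key}",
        },
    )

request = build_request("player-42", "gg, well played", api_key="YOUR_KEY")
print(request.get_method(), request.full_url)
# Sending it would be: urllib.request.urlopen(request, timeout=2)
```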

3. Moderation without triage means drowning in user reports
There is a lot more to moderation than just filtering abusive chat. Filtering — regardless of how strict or permissive your community may be — is only the first layer of defense against antisocial behavior.

You’ll also need a way for users to report abusive behavior, an algorithm that bubbles the worst reports to the top for faster review, an automated process for escalating especially dangerous (and potentially illegal) content for your moderation team to review, various workflows to accurately and progressively message, warn, mute, and sanction accounts and (hopefully) correct user behavior, a moderation tool with content queues for moderators to actually review UGC, a live chat viewer, an engine to generate business intelligence reports…
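The "bubble the worst reports to the top" piece alone could be sketched as a priority queue. The severity weights and the scoring formula below are assumptions made for illustration. A real triage system would weigh many more signals.

```python
import heapq
from dataclasses import dataclass, field

# Hypothetical severity weights; tune them to your own community guidelines.
SEVERITY = {"spam": 1, "harassment": 5, "threat": 9}

@dataclass(order=True)
class Report:
    priority: int
    report_id: int = field(compare=False)
    reason: str = field(compare=False)

def make_report(report_id: int, reason: str, reporter_count: int) -> Report:
    """Score a report; heapq is a min-heap, so negate to pop the worst first."""
    score = SEVERITY.get(reason, 1) * reporter_count
    return Report(priority=-score, report_id=report_id, reason=reason)

queue: list[Report] = []
heapq.heappush(queue, make_report(1, "spam", reporter_count=4))
heapq.heappush(queue, make_report(2, "threat", reporter_count=1))
heapq.heappush(queue, make_report(3, "harassment", reporter_count=1))

# A single threat outranks spam that four people reported.
review_order = [heapq.heappop(queue).report_id for _ in range(3)]
print(review_order)  # [2, 3, 1]
```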

“Invest in tools so you can focus on building your game with the community.”

That’s Lance Priebe, co-creator of the massively popular kids’ virtual world Club Penguin, sharing one of the biggest lessons he learned as a developer.

Focus on what matters to you, and on what you and your team do best — developing and shipping kickass new game features.

4. It’s obsolete before it ships
The more time and money you can put into your core product — improved game mechanics, new features, world expansions — the better.

Think of it this way. Would you build your own anti-virus software? Of course not. It would be outdated before launch. Researching, reviewing, and fighting the latest malware isn’t your job. Instead, you rely on the experts.

Now, imagine you’ve built your own chat filter and are hosting it locally. Every day, users find new ways around the filter, faster than you can keep up. That means every day you have to spend precious time updating the repository with new expressions. And that means testing and finally deploying the update… and that means an increase in game downtime.

Build your own chat filter, they said. “It’ll be fun,” they said.

This all adds up to a significant loss of resources and time: your time, your team’s time, and your players’ time.

5. Users don’t only chat in English
What if your community uses other languages? Consider the work that you’ll have to put into building an English-only filter. Now, double, triple, quadruple that work when you add Spanish, Portuguese, French, German, etc.

Word-for-word translation might work for simple profanity, but as soon as you venture into colloquial expressions (“let’s bang,” “I’m going to pound you,” etc) it gets messy.

In fact, many languages have complicated grammar rules that make direct translation literally impossible. Creating a chat filter in, say, Spanish, would require the expertise of a native speaker with a deep understanding of the language. That means hiring or outsourcing multiple language experts to build an internal multi-language filter.

And anyone who has ever run a company knows — people are awesome but they’re awfully expensive.

How complex are other languages? German has four grammatical cases and three genders. Finnish has 15 noun cases. And Japanese uses three writing systems (hiragana, katakana, and kanji), all of which can be combined in a single sentence.

TL;DR (because grammar): every language is complex in its own way. Running your English filter through a direct translation tool like Google Translate won’t result in a clean, accurate chat filter. In fact, getting it wrong will likely alienate your community.

Engineering time is too valuable to waste
Is there an engineering team on the planet that has the time (not to mention resources) to maintain an internally-hosted solution?

Dev teams are already overtaxed with overflowing sprint cycles, impossible QA workloads, and resource-depleting deployment processes. Do you really want to maintain another internal tool?

If the answer is “no,” luckily there is a solution — instead of building it yourself, rely on the experts.

Think of it as anti-virus software for your online community.

Talk to the experts
Consider Community Sift by Two Hat Security for your community’s chat filter. Specializing in identification and triage of high-risk and illegal content, we are under contract to process 4 billion messages every day. Since 2012 we have been empowering gaming and social platforms to build healthy, engaged communities by providing cost-effective, purposeful automated moderation.

You’ll be in good company with some of the largest online communities by Supercell, Roblox, Kabam, and many more. Simply call our secure RESTful API to moderate text, usernames, and images in over 20 of the most popular IRL and digital languages, all built and maintained by our on-site team of real live native speakers.

Five Moderation Workflows Proven to Decrease Workload

We get it. When you built your online game, virtual world, or forum for Moomin enthusiasts (you get the idea), you probably didn’t have content queues, workflow escalations, and account bans at the front of your mind. But now that you’ve launched and are acquiring users, it’s time to make sure you get the most out of your content moderation team.

It’s been proven that smart moderation can increase user retention, decrease workload, and protect your brand. And that means more money in your company’s pocket for cool things like new game features, faster bug fixes… and maybe even a slammin’ espresso machine for your hard-working devs.

Based on our experience at Two Hat, and with our clients across the industry — which include some of the biggest online games, virtual worlds, and social apps out there — we’ve prepared a list of five crucial moderation workflows.

Each workflow leverages AI-powered automation to enhance your mods’ efficiency. This gives them the time to do what humans do best: make tough decisions, engage with users, and ultimately build a healthy, thriving community.

Use Progressive Sanctions

At Two Hat, we are big believers in second chances. We all have bad days, and sometimes we bring those bad days online. According to research conducted by Riot Games, the majority of bad behavior doesn’t come from “trolls” — it comes from average users lashing out. In the same study, Riot Games found that players who were clearly informed why their account was suspended — and provided with chat logs as backup — were 70% less likely to misbehave again.

The truth is, users will always make mistakes and break your community guidelines, but the odds are that it’s a one-time thing and they probably won’t offend again.

We all know those parents who constantly threaten their children with repercussions — “If you don’t stop pulling the cat’s tail, I’ll take your Lego away!” but never follow through. Those are the kids who run screaming like banshees down the aisles at Whole Foods. They’ve never been given boundaries. And without boundaries and consequences, we can’t be expected to learn or to change our behavior.

That’s why we highly endorse progressive sanctions. Warnings and temporary muting followed by short-term suspensions that get progressively longer (1 hour, 6 hours, 12 hours, 24 hours, etc) are effective techniques — as long as they’re paired with an explanation.

And you can be gentle at first — sometimes all a user needs is a reminder that someone is watching in order to correct their behavior. Sanctioning doesn’t necessarily mean removing a user from the community — warning and muting can be just as effective as a ban. You can always temporarily turn off chat for bad-tempered users while still allowing them to engage with your platform.

And if that doesn’t work, and users continue to post content that disturbs the community, that’s when progressive suspensions can be useful. As always, ban messages should be paired with clear communication:

“You wrote [X], and as per our Community Guidelines and Terms of Use, your account is suspended for [X amount of time]. Please review the Community Guidelines.”

You can make it fun, too.

“Having a bad day? You wrote [X], which is against the Community Guidelines. How about taking a short break (try watching that video of cats being scared by cucumbers, zoning out to Bob Ross painting happy little trees, or, if you’re so inclined, taking a lavender-scented bubble bath), then joining the community again? We’ll see you in [X amount of time].”

If your system is smart enough, you can set up accurate behavioral triggers to automatically warn, mute, and suspend accounts in real time.

The workflow will vary based on your community and the time limits you set, but it will look something like this:

Warn → Mute → 1 hr suspension → 6 hr suspension → 12 hr suspension → 24 hr suspension → 48 hr suspension → Permanent ban
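As a minimal sketch, the ladder above can be held in a lookup table keyed by a user's prior offense count. The function names are hypothetical, and the message wording follows the suspension template earlier in this section; a production system would also let offenses expire over time.

```python
from typing import Optional

# Ladder taken from the flow above; the durations are examples, not a standard.
LADDER = [
    ("warn", None),
    ("mute", None),
    ("suspend", 1),
    ("suspend", 6),
    ("suspend", 12),
    ("suspend", 24),
    ("suspend", 48),
    ("ban", None),
]

def next_sanction(prior_offenses: int) -> tuple[str, Optional[int]]:
    """Return (action, hours) for a user's next offense, capping at a ban."""
    step = min(prior_offenses, len(LADDER) - 1)
    return LADDER[step]

def sanction_message(action: str, hours: Optional[int], quote: str) -> str:
    """Pair every sanction with an explanation of what triggered it."""
    if action == "suspend":
        outcome = f"your account is suspended for {hours} hour(s)"
    elif action == "ban":
        outcome = "your account is permanently banned"
    elif action == "mute":
        outcome = "your chat is temporarily muted"
    else:
        outcome = "this is a warning"
    return (f'You wrote "{quote}", and as per our Community Guidelines, '
            f"{outcome}. Please review the Community Guidelines.")

action, hours = next_sanction(prior_offenses=3)
print(sanction_message(action, hours, "[X]"))
```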

Use AI to Automate Image Approvals

Every community team knows that reviewing Every. Single. Uploaded. Image. is a royal pain. 99% of images are mind-numbingly innocent (and probably contain cats, because the internet), while the other 1% are, well, shocking. After a while, everything blurs together, and the chances of actually missing that shocking 1% get higher and higher… until your eyes roll back into your head and you slump forward on your keyboard, brain matter leaking out of your ears.

OK, so maybe it’s not that bad.

But scanning image after image manually does take a crazy amount of time, and the emotional labor can be overwhelming and potentially devastating. Imagine scrolling through pic after pic of kittens, and then stumbling over full-frontal nudity. Or worse: unexpected violence and gore. Or the unthinkable: images of child or animal abuse.

All this can lead to stress, burnout, and even PTSD.

It’s in your best interests to automate some of the process. AI today is smarter than it’s ever been. The best algorithms can detect pornography with nearly 100% accuracy, not to mention images containing violence and gore, drugs, and even terrorism.

If you use AI to pre-moderate images, you can tune the dial based on your community’s resilience. Set the system to automatically approve any image with, say, a low risk of being pornography (or gore, drugs, terrorism, etc), while automatically rejecting images with a high risk of being pornography. Then, send anything in the ‘grey zone’ to a pre-moderation queue for your mods to review.

Or, if your user base is older, automatically approve images in the grey zone, and let your users report anything they think is inappropriate. You can also send those borderline images to an optional post-moderation queue for manual review.

This way, you take the responsibility off of both your moderators and your community to find the worst content.

What the flow looks like:

User submits image → AI returns risk probability

  • If safe, automatically approve and post
  • If unsafe, automatically reject
  • If borderline, hold and send to queue for manual pre-moderation (for younger communities), or
  • If borderline, publish and send to queue for optional post-moderation (for older communities)
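The routing above could be sketched as a pair of thresholds. The cut-off values and action names are hypothetical, and the risk score is assumed to be a probability between 0 and 1 returned by whatever image classifier you use.

```python
# Hypothetical thresholds; the risk score is assumed to be a 0-1 probability
# returned by whatever image classifier you use.
APPROVE_BELOW = 0.10
REJECT_ABOVE = 0.90

def route_image(risk: float, younger_community: bool) -> str:
    """Map a risk score to an action following the flow above."""
    if risk < APPROVE_BELOW:
        return "approve"
    if risk > REJECT_ABOVE:
        return "reject"
    # Grey zone: hold for review in younger communities, otherwise publish
    # now and let moderators optionally check it afterwards.
    if younger_community:
        return "hold_for_pre_moderation"
    return "publish_and_post_moderate"

for risk in (0.02, 0.50, 0.97):
    print(risk, route_image(risk, younger_community=True))
```

Widening or narrowing the gap between the two thresholds is the "dial": a wider grey zone sends more images to human review, a narrower one automates more decisions.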

Suicide/Self-Harm Support

For many people, online communities are the safest spaces to share their deepest, darkest feelings. Depending on your community, you may or may not allow users to discuss their struggles with suicidal thoughts and self-injury openly.

Regardless, users who discuss suicide and self-harm are vulnerable and deserve extra attention. Sometimes, just knowing that someone else is listening can be enough.

We recommend that you provide at-risk users with phone or text support lines where they can get help. Ideally, this should be done through an automated messaging system to ensure that users get help in real time. However, you can also send manual messages to establish a dialogue with the user.

Worldwide, there are a few resources that we recommend:

If your community is outside of the US, Canada, or the UK, your local law enforcement agency should have phone numbers or websites that you can reference. In fact, it’s a good idea to build a relationship with local law enforcement; you may need to contact them if you ever need to escalate high-risk scenarios, like a user credibly threatening to harm themselves or others.

We don’t recommend punishing users who discuss their struggles by banning or suspending their accounts. Instead, a gentle warning message can go a long way:

“We noticed that you’ve posted an alarming message. We want you to know that we care, and we’re listening. If you’re feeling sad, considering suicide, or have harmed yourself, please know that there are people out there who can help. Please call [X] or text [X] to talk to a professional.”

When setting up a workflow, keep in mind that a user who mentions suicide or self-harm just once probably doesn’t need an automated message. Instead, tune your workflow to send a message after repeated references to suicide and self-harm. Your definition of “repeated” will vary based on your community, so it’s key that you monitor the workflow closely after setting it up. You will likely need to retune it over time.

Of course, users who encourage other users to kill themselves should receive a different kind of message. Look out for phrases like “kys” (kill yourself) and “go drink bleach,” among others. In these cases, use progressive sanctions to enforce your community guidelines and protect vulnerable users.

What the flow looks like:

  1. User posts content about suicide/self-harm X amount of times
  2. System automatically displays message to user suggesting they contact a support line
  3. If user continues to post content about suicide/self-harm X number of times, send content to a queue for a moderator to manually review for potential escalation
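A minimal sketch of that counter logic follows. The two thresholds are placeholders for the "X" values your community chooses, detection of the content itself is assumed to happen upstream, and a real workflow would likely count mentions within a time window rather than forever.

```python
from collections import Counter

# Hypothetical thresholds: the article leaves "X" to each community.
SUPPORT_MESSAGE_AT = 2
ESCALATE_AT = 4

mentions: Counter = Counter()

def handle_self_harm_mention(user_id: str) -> str:
    """Count a flagged message and decide what the workflow does next."""
    mentions[user_id] += 1
    count = mentions[user_id]
    if count >= ESCALATE_AT:
        return "queue_for_moderator_review"
    if count == SUPPORT_MESSAGE_AT:
        # Sent once, so the user is not flooded with repeated messages.
        return "send_support_line_message"
    return "no_action"

actions = [handle_self_harm_mention("user-7") for _ in range(4)]
print(actions)
```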

Prepare for Breaking News & Trending Topics

We examined this underused moderation flow in a recent webinar. Never underestimate how deeply the latest news and emerging internet trends will affect your community. If you don’t have a process for dealing with conversations surrounding the next natural disaster, political scandal, or even another “covfefe,” you run the risk of alienating your community.

Consider Charlottesville. On August 11th, 2017, marchers from the far right, including white nationalists, neo-Nazis, and members of the KKK, gathered to protest the removal of Confederate monuments throughout the city. The rally soon turned violent, and on August 12th a car plowed into a group of counter-protestors, killing a young woman.

The incident immediately began trending on social media and in news outlets and remained a trending topic for several weeks afterward.

How did your online community react to this news? Was your moderation team prepared to handle conversations about neo-Nazis on your platform?

While not a traditional moderation workflow, we have come up with a “Breaking News & Trending Topics” protocol that can help you and your team stay on top of the latest trends — and ensure that your community remains expressive but civil, even in the face of difficult or controversial topics.

  1. Compile vocabulary: When an incident occurs, compile the relevant vocabulary immediately.
  2. Evaluate: Review how your community is using the vocabulary. If you wouldn’t normally allow users to discuss the KKK, would it be appropriate to allow it based on what’s happening in the world at that moment?
  3. Adjust: Make changes to your chat filter based on your evaluation above.
  4. Validate: Watch live chat to confirm that your assumptions were correct.
  5. Stats & trends: Compile reports about how often or how quickly users use certain language. This can help you prepare for the next incident.
  6. Re-evaluate vocabulary over time: Always review and reassess. Language changes quickly. For example, the terms Googles, Skypes, and Yahoos were used in place of anti-Semitic slurs on Twitter in 2016. Now, in late 2017, they’ve disappeared — what have they been replaced with?
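One way to sketch steps 3, 5, and 6 of the protocol is a temporary filter override that counts usage and expires on its own, which forces a re-evaluation. The class name, the 14-day window, and the example terms are all assumptions made for illustration.

```python
from collections import Counter
from datetime import datetime, timedelta

class TopicOverride:
    """Temporarily allow normally blocked vocabulary, with an expiry date."""

    def __init__(self, terms: set[str], days: int, now: datetime):
        self.terms = {t.lower() for t in terms}
        self.expires = now + timedelta(days=days)
        self.usage: Counter = Counter()

    def allows(self, term: str, now: datetime) -> bool:
        """Record usage (step 5) and report whether the override applies."""
        term = term.lower()
        if term not in self.terms:
            return False
        self.usage[term] += 1
        return now < self.expires

start = datetime(2017, 8, 12)
override = TopicOverride({"rally", "monument"}, days=14, now=start)

print(override.allows("Rally", start + timedelta(days=1)))    # inside the window
print(override.allows("rally", start + timedelta(days=30)))   # window has lapsed
print(override.usage.most_common(1))
```

Because the override lapses by default, someone has to decide deliberately to keep the vocabulary allowed, rather than leaving a temporary exception in place indefinitely.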

Stay diligent, and stay informed. Twitter is your team’s secret weapon. Have your team monitor trending hashtags and follow reputable news sites so you don’t miss anything your community may be talking about.

Provide Positive Feedback

Ever noticed that human beings are really good at punishing bad behavior but often forget to reward positive behavior? It’s a uniquely human trait.

If you’ve implemented the workflows above and are using smart moderation tools that blend automation with human review, your moderation team should have a lot more time on their hands. That means they can do what humans do best — engage with the community.

Positive moderation is a game changer. Not only does it help foster a healthier community, it can also have a huge impact on retention.

Some suggestions:

  • Set aside time every day for moderators to watch live chat to see what the community is talking about and how users are interacting.
  • Engage in purposeful community building — have moderators spend time online interacting in real time with real users.
  • Forget auto-sanctions: Try auto-rewards! Use AI to find key phrases indicating that a user is helping another user, and send them a message thanking them, or even inviting them to collect a reward.
  • Give your users the option to nominate a helpful user, instead of just reporting bad behavior.
  • Create a queue that populates with users who have displayed consistent positive behavior (no recent sanctions, daily logins, no reports, etc) and reach out to them directly in private or public chat to thank them for their contributions.
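The last suggestion could be sketched as a simple filter over per-user statistics. The field names, the 90-day sanction window, and the seven-day login streak are hypothetical and would depend on the data your platform actually records.

```python
from dataclasses import dataclass

@dataclass
class UserStats:
    user_id: str
    login_streak_days: int
    sanctions_last_90_days: int
    reports_against: int

def thank_you_queue(users: list[UserStats], min_streak: int = 7) -> list[str]:
    """Pick users with a clean record and a steady login streak."""
    return [
        u.user_id
        for u in users
        if u.sanctions_last_90_days == 0
        and u.reports_against == 0
        and u.login_streak_days >= min_streak
    ]

users = [
    UserStats("ana", login_streak_days=30, sanctions_last_90_days=0, reports_against=0),
    UserStats("ben", login_streak_days=12, sanctions_last_90_days=1, reports_against=0),
    UserStats("cy", login_streak_days=3, sanctions_last_90_days=0, reports_against=0),
]
print(thank_you_queue(users))  # ['ana']
```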

Any one of these workflows will go a long way towards building a healthy, engaged, loyal community on your platform. Try them all, or just start out with one. Your community (and your team) will thank you.

With our chat filter and moderation software Community Sift, Two Hat has helped companies like Supercell, Roblox, Habbo, Friendbase, and more implement similar workflows and foster healthy, thriving communities.

Interested in learning how we can help your gaming or social platform thrive? Get in touch today!

Want more articles like this? Subscribe to our newsletter and never miss an update!
