In the fourth installment of our Five Layers of Community Protection blog series, we dive deeper into how Automated User Reporting helps organizations build healthier and safer communities by giving users a way to report content that makes them uncomfortable and by helping moderators prioritize actionable items.
Layer Four: User Reporting
We previously learned about Layer 3 and how to leverage User Reputation to more elegantly assess context and deploy the appropriate filter settings.
Layer 4 is focused on creating access for community members to report content they find harmful and/or which should have been blocked. This is both highly empowering to your community, and provides you with valuable feedback on the grey area, the content that requires human expertise, empathy, and insights in order to be fully assessed.
Communicating your organization’s zero-tolerance policy regarding harmful content, enforcing it, and doing so quickly, shows your community you value them and their reporting. This builds credibility and trust. Following up quickly also discourages violators from acting again, as well as their copycats. Prompt resolution also helps protect a brand’s reputation and increases user engagement and retention.
The Long Backlog
Many community managers and moderators, especially those in large organizations, face a backlog of automated user reports. Small organizations may have only one person handling hundreds of reports. This backlog can hinder a team’s ability to address the most heinous reports in a timely manner. Leveraging AI software, in addition to human review and insight, can help moderators triage priority reports and close out any false or non-urgent ones. Moderators can then act quickly on the actionable and most severe reports that rise through the noise.
Making it Easy & Accessible for Members to Report Harmful Content
The content that doesn’t quite meet your cut-off threshold for proactive filtering is the harder-to-moderate grey area. This is the area that allows for healthy conflict and debate as well as for the development of resiliency, but it also brings the opportunity for critical input from your community so you better understand their needs. It’s critical to make it easy and straightforward for your community to report content that has made them feel unsafe. That means adding an intuitive reporting flow to your product that allows your community members to choose from a list of reporting reasons and provide actionable proof for the report they are sending you.
Get to the Critical Reports First
A critical insight we consistently see across verticals like social apps, gaming, and others is that 30% or less of all user reports are actionable. It’s absolutely essential to have a mechanism to sort, triage, and prioritize reports. Clearly, not all reports have the same level of importance or urgency.
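To make the idea of sorting and prioritizing reports concrete, here is a minimal sketch in Python. It is not how any particular product works: the reason labels, the severity weights, and the `reporter_accuracy` field are all assumptions made up for illustration, and a real system would tune them against its own community guidelines.

```python
from dataclasses import dataclass

# Hypothetical severity weights per report reason; higher means more urgent.
SEVERITY = {
    "self_harm": 100,
    "child_safety": 100,
    "threat": 80,
    "hate_speech": 60,
    "harassment": 50,
    "spam": 10,
    "other": 5,
}


@dataclass
class Report:
    report_id: str
    reason: str
    # Assumed signal: share of this reporter's past reports that were actionable (0.0-1.0).
    reporter_accuracy: float


def priority(report: Report) -> float:
    """Score a report: severity of the reason, scaled by how reliable the reporter has been."""
    base = SEVERITY.get(report.reason, SEVERITY["other"])
    # The scale factor never drops below 0.5, so an unreliable reporter cannot bury a severe report.
    return base * (0.5 + 0.5 * report.reporter_accuracy)


def triage(reports: list[Report]) -> list[Report]:
    """Return reports ordered from most to least urgent."""
    return sorted(reports, key=priority, reverse=True)


queue = triage([
    Report("r1", "spam", 0.9),
    Report("r2", "self_harm", 0.2),
    Report("r3", "harassment", 0.7),
])
print([r.report_id for r in queue])  # ['r2', 'r3', 'r1']
```

Even with a low-accuracy reporter, the self-harm report scores 60 and lands at the top of the queue, ahead of harassment (42.5) and spam (9.5). That is the behavior you want from any triage scheme: severity first, reporter history as a secondary signal.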
Closing The Feedback Loop
Following up on reports lets concerned users know that their reports led to meaningful action and encourages them to continue to report behavior that violates the platform’s community guidelines. This helps to build trust by assuring users that you are doing your due diligence to keep your community a safe and inclusive place for all.
It’s important to thank community members for submitting reports and helping you maintain a healthy and safe community. A simple “thank you” can go a long way in building relationships with the members of your community as it shows that you take violations of community guidelines seriously.
Improving Scalability + Sustainability
Successfully applying Layer 2: Classify & Filter means that the majority of harmful content is filtered out in accordance with a platform’s community guidelines. When the worst of the worst content is filtered out, there is less negative behavior for community members to report on. This impacts Layer Four directly, as it leads to a reduction of up to 88%* in the number of user-generated reports. This in turn makes the volume of content moderators need to monitor more scalable and sustainable, and helps to decrease the likelihood of burnout.
Optimizing User Reporting operations empowers community members to take an active role in their safety and the community’s health and helps to build trust with the platform. Leveraging AI helps community managers reduce their workload and prioritize high-risk and sensitive reports. In turn, responding quickly to urgent reports and closing the feedback loop builds trust and credibility.
To find out how else you can better build healthy and thriving communities, read the rest of our Five Layers of Community Protection blog series. You can also request a live demo of our Community Sift platform now to learn how Automated User Reporting can help you protect your online community, address user reports faster and improve your users’ online experiences.
Building healthy and safe digital spaces begins with healthy community managers and moderators. We need to help community managers be mindful and take care of their mental health as they are often exposed to some of the worst of the internet – on a daily basis.
Occupational burnout is an all-too-common result that we, as an industry, must highlight and focus on changing. Identifying job stress and giving employees flexibility to prioritize their wellbeing improves our communities.
We suggest that companies encourage community managers to follow these 5 tips to prioritize their wellness and resilience:
1 – Create a wellness plan
Community managers are often repeatedly exposed to the worst online behaviors and are left feeling emotionally drained at the end of the workday. A wellness plan helps them manage their stress and mentally recharge. This actionable set of activities helps community managers to take wellness breaks throughout the day and to create a buffer between work and their personal lives. Whether it’s taking a walk outside, listening to music, meditating, or talking to family or friends, a wellness plan can help community managers decompress before transitioning to the next moment of their day.
2 – Leverage AI Plus
Community managers monitor for hate speech, graphic images, and other types of high-risk content. Prolonged exposure to traumatic content can severely impact an individual’s mental health and wellbeing. Content moderators can develop symptoms of P.T.S.D., including insomnia, nightmares, anxiety, and auditory hallucinations as a result of consistent exposure to traumatic content.
By proactively leveraging technology to filter content, and so reducing human moderators’ exposure to it, our partners have reduced the workload of their community managers by as much as 88%*. This gives community managers more time to focus on other aspects of their job and protects their wellbeing by minimizing the amount of time they’re exposed to high-risk content.
3 – Be mindful of the types of content you’re moderating for
Rotating the types of content for which each team member is monitoring can help alleviate the negative impact that constant exposure to a singular focus area may cause. Threats of harm and self-harm, racism, sexism, predatory behavior, and child grooming are just a few of the types of content community managers monitor for and are exposed to daily.
4 – Focus on the positive
Most chat, images, and videos in online communities are aligned with the intended experiences of those products. In our experience, about 85% of user-generated content across different verticals is what we classify as low-risk, very positive types of behavior. Think community members discussing matters pertinent to the community, their hobbies and passions, or sharing pictures of their pets. Focusing on the positive side of your community will help you keep this reality in mind, and also remember why you do what you do every day.
One of the ways in which you can focus on the positive aspects of your community is spending time in your product and seeing how community members are engaging, their creativity and passion. Make a point to do that at least once a week with the intent of focusing on the positive side of the community. Similarly, if you leverage a classification and filtering system like Community Sift, you should dedicate time to looking at chat logs that are positive. After either of these activities, you should write down and reflect on 3 to 5 things that were meaningful to you.
5 – Remember you’re making an impact
Monitoring an endless stream of high-risk content can make community managers feel like their work isn’t making an impact. That couldn’t be further from the truth. Their work is directly contributing to the health and safety of social and online play communities. When community managers identify a self-harm threat or protect children from predators, they are immediately making an impact in the life of that individual. In addition to monitoring content, community managers help to ensure that users have a positive and happy experience when engaging with their platform.
According to a 2020 survey conducted by the Anti-Defamation League, 81% of U.S. adults aged 18-45 who played online multiplayer games experienced some form of harassment. Approximately 22% of those community members went on to quit an online platform because of the harassment they experienced. Harassment is an issue actively driving community members away from engaging with their favorite platforms. By helping to create a safe and healthy space, community managers are creating an environment where individuals can make friends, feel like they belong to a community, and have overall positive social interactions without the fear of harassment – while also helping drive the success of the community and overall acquisition and retention metrics. A true win-win.
Help protect the well-being of your community managers. Request a demo today to see how Two Hat’s content moderation platform can reduce your community managers’ workload and exposure to harmful content.
In 2020, social platforms that wish to expand their product and scale their efforts are faced with a critical decision — how will they automate the crucial task of content moderation? As platforms grow from hundreds to thousands to millions of users, that means more usernames, more live chat, and more comments, all of which require some form of moderation. From app store requirements to legal compliance with global legislation, ensuring that all user-generated content is aligned with community guidelines is nothing short of an existential matter.
When it comes to making a technical choice for a content moderation platform, what I hear in consultations and demos can be distilled down to this: engineers want a solution that’s simple to integrate and maintain, and that can scale as their product scales. They are also looking for a solution that’s battle-tested and allows for easy troubleshooting — and that won’t keep them up at night with downtime issues!
“Processing 100 billion online interactions in one month is technically hard to achieve. That is not simply just taking a message and passing it on to users but doing deep textual analysis for over 3 million patterns of harmful things people can say online. It includes building user reputation and knowing if the word on the line above mixed with this line is also bad. Just trying to maintain user reputation for that many people is a very large technical challenge. And to do it all on 20 milliseconds per message is incredible”. Chris Priebe, Two Hat’s CEO and Founder
I caught up with Laurence Brockman, Two Hat’s Vice President of Core Services, and Manisha Eleperuma, our Manager of Development Operations, just as we surpassed the mark of 100 billion pieces of human interactions processed in one month.
I asked them about what developers value in a content moderation platform, the benefits of an API-based service, and the technical challenges and joys of safeguarding hundreds of millions of users globally.
Carlos Figueiredo: Laurence, 100 billion online interactions processed in one month. Wow! Can you tell us about what that means to you and the team, and the journey to getting to that landmark?
“At the core, it’s meant we were able to keep people safe online and let our customers focus on their products and communities. We were there for each of our customers when they needed us most”.
Laurence Brockman: The hardest part for our team was the pace of getting to 100 billion. We tripled the volume in three months! When trying to scale & process that much data in such a short period, you can’t cut any corners. And you know what? I’m pleased to say that it’s been business as usual – even with this immense spike in volume. We took preventative measures along the way, and we focused on key areas to ensure we could scale. Don’t get me wrong, there were a few late nights and a week of crazy refactoring of a system, but our team and our solution delivered. I’m very proud of the team and how they dug in, identified any potential problem areas and jumped right in. At 100 billion, minor problems can become major problems and our priority is to ensure our system is ready to handle those volumes.
“What I find crazy is our system is now processing over 3 billion events every day! That’s six times the volume of Twitter”.
CF: Manisha, what are the biggest challenges and joys of running a service that safeguards hundreds of millions of users globally?
Manisha Eleperuma: I would start off with the joys. I personally feel really proud to be a part of making the internet a safer place. The positive effect that we can have on an individual’s life is immense. We could be stopping a kid from harming themself, we could be saving them from a predator, we could be stopping a friendly conversation turning into a cold battle of hate speech. This is possible because of the safety net that our services provide to online communities. Also, it is very exciting to have some of the technology giants and leaders in the entertainment industry using our services to safeguard their communities.
It is not always easy to provide such top-notch service, and it definitely has its own challenges. We as an Engineering group are maintaining a massive complex system and keeping it up and running with almost zero downtime. We are equipped with monitoring tools to check the system’s health and engineers have to be vigilant for alerts triggered by these tools and promptly act upon any anomalies in the system even during non-business hours. A few months ago, when the pandemic situation was starting to affect the world, the team could foresee an increase in transactions that could potentially start hitting our system.
“This allowed the team to get ahead of the curve and pre-scale some of the infrastructure components to be ready for the new wave so that when traffic increases, it hits smoothly without bringing down the systems”.
Another strenuous exercise that the team often goes through is to maintain the language quality of the system. Incorporating language-specific characteristics into the algorithms is challenging, but exciting to deal with.
CF: Manisha, what are the benefits of using an API-based service? What do developers value the most in a content moderation platform?
ME: In our context, when Two Hat’s Community Sift is performing as a classification tool for a customer, all transactions happen via customer APIs. Each customer API can, based on the customer’s requirements, access different components of our platform without much hassle. For example, certain customers rely on getting the player/user context, their reputation, etc. The APIs that they are using to communicate with our services are easily configurable to fetch all that information from the internal context system, without extra implementation from the customer’s end.
“This API approach has accelerated the integration process as well. We recently had a customer who was integrated with our APIs and went live successfully within a 24 hour period”.
Customers expect reliability and usability in moderation platforms. When a moderator goes through content in a Community Sift queue, we have equipped the moderator with all the necessary data, including player/user information with the context of the conversation, and the history and reputation of the player, which eases decision-making. This is how we support their human moderation efforts. Further, we are happy to say that Two Hat has expanded the paradigm to another level of automated moderation, using AI models that make decisions on behalf of human moderators after they have learned from the moderators’ consistent decisions, which lowers the moderation costs for customers.
CF: Laurence, many of our clients prefer to use our services via a server to server communication, instead of self-hosting a moderation solution. Why is that? What are the benefits of using a service like ours?
LB: Just as any SaaS company will tell you, our systems are able to scale to meet the demand without our customers’ engineers having to worry about it. It also means that as we release new features and functions, our customers don’t have to worry about expensive upgrades or deployments. While all this growth was going on, we also delivered more than 40 new subversion detection capabilities into our core text-classification product.
Would you like to see our content moderation platform in action? Request a demo today.
I recently had the privilege to speak on the keynote gaming panel of the 16th Annual International Bullying Prevention Conference, an event themed Kindness & Compassion: Building Healthy Communities.
The International Bullying Prevention Association is a 501(c)(3) nonprofit organization founded in 2003 when grassroots practitioners and researchers came together to convene the first conference in the US entirely focused on bullying prevention. They host an annual conference in Chicago where attendees can benefit from workshops, poster sessions and TED-inspired sessions which deliver hands-on solutions and theoretical, research-based presentations.
Below, I focus on the sessions and discussions I participated in regarding cyberbullying, and present a brief account of the takeaways I brought back to Canada and Two Hat.
1. User-centric approaches to online safety
A few people on the tech panels referred to the concept of “user-centric safety” — letting users set their boundaries and comfort levels for online interactions. Catherine Teitelbaum, a renowned Global Trust & Safety Executive who heads up Trust & Safety for Twitch, is a big champion of the idea and spoke about how the concept of “safety” varies from person to person. Offering customized control for the user experience, like Twitch does with Automod by empowering channel owners to set their chat filtering standards, is the way of the future.
Online communities are diverse and unique, and often platforms contain many communities with different norms. The ability to tailor chat settings to those unique characteristics is critical.
Wouldn’t it be great for users to be able to choose their safety settings and what they are comfortable with – the same way they can set their privacy settings on online platforms? What if a mother wants to enjoy an online platform with her child, but wants to ensure that they don’t see any sexual language? Perhaps a gamer just wants to relax and play a few rounds without experiencing the violent language that might be the norm in a mature game centered around combat. The more agency and flexibility we give to users and players online, the better we can cater to the different expectations we all have when we log in.
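As a thought experiment, user-chosen safety settings could look something like the Python sketch below. Everything here is hypothetical: the topic names, the 0 to 7 risk scale, and the idea that a classifier hands back a per-topic risk level for each message are assumptions for illustration, not a description of Twitch’s Automod or any other product.

```python
from dataclasses import dataclass, field

# Hypothetical risk scale: 0 = harmless, 7 = most severe.
DEFAULT_MAX_RISK = {"sexual": 4, "violence": 4, "bullying": 3}


@dataclass
class SafetySettings:
    # Highest risk level the user is willing to see, per topic.
    max_risk: dict[str, int] = field(default_factory=lambda: dict(DEFAULT_MAX_RISK))


def is_visible(message_risk: dict[str, int], settings: SafetySettings) -> bool:
    """Show a message only if every classified topic is within the viewer's limits."""
    for topic, level in message_risk.items():
        # Topics the user has not configured fall back to the strictest limit (0).
        limit = settings.max_risk.get(topic, 0)
        if level > limit:
            return False
    return True


# A parent playing with a child, and a player comfortable with combat talk.
parent = SafetySettings(max_risk={"sexual": 0, "violence": 2, "bullying": 1})
combat_player = SafetySettings(max_risk={"sexual": 2, "violence": 6, "bullying": 3})

# Assumed classifier output for one chat line in a combat game.
trash_talk = {"violence": 5}

print(is_visible(trash_talk, parent))         # False
print(is_visible(trash_talk, combat_player))  # True
```

The same message is hidden for one viewer and shown to another, which is the whole point: the platform still sets the outer limits, but within them each user decides what they are comfortable with.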
2. Shared Responsibility, and the Importance of Diverse Voices
The concept of sharing and contributing to the greater good of online safety practices across tech industries also came up. Here at Two Hat we believe that ushering in a new age of content moderation and empowering an Internet that will fulfill its true purpose of connecting human beings is only possible through a shared responsibility approach (which also came up in the conference). We believe it will take the efforts of everyone involved to truly change things for the better. This includes academia, industry, government, and users.
In his 2018 book “Farsighted: How We Make the Decisions That Matter the Most”, Steven Johnson writes about how complex decisions require a comprehensive mapping of all factors involved, and how those decisions are informed by, and benefit greatly from, a diverse set of perspectives. The best, farsighted decisions compile the voices of a variety of people. The intricate human interaction systems we are creating on the Internet require complex decision-making at both the inception and design stage. However, right now those decisions are rarely informed by multi-disciplinary lenses. No wonder we are so shortsighted when it comes to anticipating issues with online behaviour and online harms.
A true, collaborative community of practice is needed. We need that rising tide that floats all boats, as my good friend Dr. Kim Voll says.
3. Empathy as an Antidote
Another good friend, Dr. Sameer Hinduja, was one of the speakers at the conference. Dr. Hinduja is a Professor in the School of Criminology and Criminal Justice at Florida Atlantic University and Co-Director of the Cyberbullying Research Center, and is recognized internationally for his groundbreaking work on the subjects of cyberbullying and safe social media use. You will be hard-pressed to find someone more dedicated to the well-being of others.
He talked about how empathy can be used to prevent bullying, pulling from research and practical applications that have resulted in improvements in peer-to-peer relationships. He stressed the importance of practices that lead youth to go beyond the traditional approach of “being in someone else’s shoes” to feel empathy, and reach a point where they truly value others. This is so important, and it makes me wonder: How can we design human interaction systems online where we perceive each other as valuable individuals and are constantly reminded of our shared humanity? How do we create platforms that discourage solely transactional interaction? How do we bring offline social cues into the online experience? How can we design interaction proxies to reduce friction between users – and ultimately lead us to more positive and productive online spaces? I don’t have all the answers – no one does. But I am encouraged by the work of people like Dr. Hinduja, the Trust and Safety team at Twitch, the incredible Digital Civility efforts of Roblox and my friend Laura Higgins, their Director of Community Safety & Digital Civility, and events like The International Bullying Prevention Conference.
Cyberbullying is one of the many challenges facing online platforms today. Let’s remember that it’s not just cyberbullying – there is a wider umbrella of behaviors that we need to better understand and define, including harassment, reputation tarnishing, doxxing, and more. We need to find a way to facilitate better digital interactions in general, by virtue of how we design online spaces, how we encourage positive and productive exchanges, and understanding that it will take a wider perspective, informed by many lenses, in order to create online spaces that fulfill their true potential.
If you’re reading this, you’re likely in the industry, and you’re definitely a participant in online communities. So what can you do, today, to make a difference? How can industry better collaborate to advance online safety practices?
Earlier this month, Australia’s eSafety Office released their Safety by Design (SbD) Principles. As explained on their website, SbD is an “initiative which places the safety and rights of users at the centre of the design, development and deployment of online products and services.” It outlines three simple but comprehensive principles (service provider responsibilities, user empowerment & autonomy, and transparency & accountability) that social networks can follow to embed user safety into their platform from the design phase and onwards.
With this ground-breaking initiative, Australia has proven itself to be at the forefront of championing innovative approaches to online safety.
I first connected with the eSafety Office back in November 2018, and later had the opportunity to consult on Safety by Design. I was honored to be part of the consultation process and to bring some of my foundational beliefs around content moderation to the table. At Two Hat, we’ve long advocated for a Safety by Design approach to building social networks.
Many of the points in the Safety by Design Principles and the UK’s recent Online Harms white paper support the Trust & Safety practices we’ve been recommending to clients for years, such as leveraging filters and cutting-edge technology to triage user reports. And we’ve heartily embraced new ideas, like transparency reports, which Australia and the UK both strongly recommend in their respective papers.
As I read the SbD overview, I had a few ideas for clear, actionable measures that social networks across the globe can implement today to embrace Safety by Design. The first two fall under SbD Principle 1, and the third under SbD Principle 3.
Under SbD Principle 1: Service provider responsibilities “Put processes in place to detect, surface, flag and remove illegal and harmful conduct, contact and content with the aim of preventing harms before they occur.”
Content filters are no longer a “nice to have” for social networks – today, they’re table stakes. When I first started in the industry, many people assumed that only children’s sites required filters. And until recently, only the most innovative and forward-thinking companies were willing to leverage filters in products designed for older audiences.
That’s all changed – and the good news is that you don’t have to compromise freedom of expression for user safety. Today’s chat filters (like Two Hat’s Community Sift) go beyond allow/disallow lists, and instead allow for intelligent, nuanced filtering of online harms that take into account various factors, including user reputation and context. And they can do it well in multiple languages, too. As a Portuguese and English speaker, this is particularly dear to my heart.
All social networks can and should implement chat, username, image, and video filters today. How they use them, and the extent to which they block, flag, or escalate harms will vary based on community guidelines and audience.
Also under SbD Principle 1: Service provider responsibilities
“Put in place infrastructure that supports internal and external triaging, clear escalation paths and reporting on all user safety concerns, alongside readily accessible mechanisms for users to flag and report concerns and violations at the point that they occur.”
As the first layer of protection and user safety, baseline filters are critical. But users should always be encouraged to report content that slips through the cracks. (Note that when social networks automatically filter the most abusive content, they’ll have fewer reports.)
But what do you do with all of that reported content? Some platforms receive thousands of reports a day. Putting everything from false reports (users testing the system, reporting their friends, etc.) to serious, time-sensitive content like suicide threats and child abuse into the same bucket is inefficient and ineffective.
That’s why we recommend implementing a mechanism to classify and triage reports so moderators purposefully review the high-risk ones first, while automatically closing false reports. We’ve developed technology called Predictive Moderation that does just this. With Predictive Moderation, we can train AI to take the same actions moderators take consistently and reduce manual review by up to 70%.
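To illustrate the general idea of automating only what moderators already decide consistently, here is a deliberately simple Python sketch. It is not how Predictive Moderation is implemented: a plain frequency rule stands in for a trained model, and the signature format, the action names, and both thresholds are assumptions chosen for the example.

```python
from collections import Counter, defaultdict

MIN_DECISIONS = 20    # assumed minimum sample size before automating anything
MIN_AGREEMENT = 0.95  # assumed share of past decisions that must match


def learn_auto_actions(history):
    """history: iterable of (signature, action) pairs from past human moderation.

    Returns {signature: action} for signatures that moderators decided
    the same way often enough to automate.
    """
    by_signature = defaultdict(Counter)
    for signature, action in history:
        by_signature[signature][action] += 1

    auto = {}
    for signature, counts in by_signature.items():
        total = sum(counts.values())
        action, hits = counts.most_common(1)[0]
        if total >= MIN_DECISIONS and hits / total >= MIN_AGREEMENT:
            auto[signature] = action
    return auto


def route(signature, auto_actions):
    """Apply a learned action, or fall back to human review."""
    return auto_actions.get(signature, "human_review")


# Hypothetical history: (report reason, classifier risk band) -> moderator action.
history = (
    [(("spam", "low_risk"), "close")] * 30
    + [(("harassment", "high_risk"), "sanction")] * 12
    + [(("harassment", "high_risk"), "close")] * 8
)
auto_actions = learn_auto_actions(history)

print(route(("spam", "low_risk"), auto_actions))         # close
print(route(("harassment", "high_risk"), auto_actions))  # human_review
```

Low-risk spam reports were closed every time, so they can be closed automatically. High-risk harassment reports split 12 to 8, so they stay with human moderators, which is where the grey area belongs.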
I shared some reporting best practices used by my fellow Fair Play Alliance members during the FPA Summit at GDC earlier this year. You can watch the talk here (starting at 37:30).
There’s a final but no less important benefit to filtering the most abusive content and using AI like Predictive Moderation to triage time-sensitive content. As we’ve learned from seemingly countless news stories recently, content moderation is a deeply challenging discipline, and moderators are too often subject to trauma and even PTSD. All of the practices that the Australian eSafety Office outlines, when done properly, can help protect moderator wellbeing.
Under SbD Principle 3: Transparency and accountability
“Publish an annual assessment of reported abuses on the service, accompanied by the open publication of a meaningful analysis of metrics such as abuse data and reports, the effectiveness of moderation efforts and the extent to which community standards and terms of service are being satisfied through enforcement metrics.”
While transparency reports aren’t mandatory yet, I expect they will be in the future. Both the Australian SbD Principles and the UK Online Harms white paper outline the kinds of data these potential reports might contain.
My recommendation is that social networks start building internal practices today to support these inevitable reports. A few ideas include:
Track the number of user reports filed and their outcome (i.e., how many were closed, how many were actioned, how many resulted in human intervention, etc.)
Log high-risk escalations and their outcome
Leverage technology to generate a percentage breakdown of abusive content posted and filtered
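The ideas above can be sketched in a few lines of Python. This is only a starting point under stated assumptions: the outcome labels (`closed`, `actioned`, `escalated`) and the input shapes are invented for the example, and a real transparency report would draw on whatever your moderation tooling actually logs.

```python
from collections import Counter


def summarize_reports(outcomes):
    """outcomes: list of outcome labels, one per user report."""
    counts = Counter(outcomes)
    total = sum(counts.values())
    percent = {}
    if total:
        percent = {label: round(100 * n / total, 1) for label, n in counts.items()}
    return {
        "total_reports": total,
        "by_outcome": dict(counts),
        "percent_by_outcome": percent,
    }


def filtered_share(posted, filtered):
    """Percentage of posted items that the filter caught."""
    return round(100 * filtered / posted, 1) if posted else 0.0


# Hypothetical outcomes for five user reports.
summary = summarize_reports(["closed", "closed", "actioned", "escalated", "closed"])
print(summary["total_reports"])        # 5
print(summary["percent_by_outcome"])   # {'closed': 60.0, 'actioned': 20.0, 'escalated': 20.0}
print(filtered_share(2000, 150))       # 7.5
```

Keeping even this level of bookkeeping from day one means the numbers already exist when a regulator, or your own community, asks for them.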
Thank you again to the eSafety Office and Commissioner Julie Inman Grant for spearheading this pioneering initiative. We look forward to the next iteration of the Safety by Design framework – and can’t wait to join other online professionals at the #eSafety19 conference in September to discuss how we can all work together to make the internet a safe and inclusive space where everyone is free to share without fear of abuse or harassment.
And if you, like so many of us, are concerned about community health and user safety, I’m currently offering no-cost, no-obligation Community Audits. I will examine your community (or the community of someone you know!), locate areas of potential risk, and provide you with a personalized community analysis, including recommended best practices and tips to maximize positive social interactions and user engagement.
Today, user-generated content features like chat, private messaging, comments, images, and videos are all must-haves in an overstuffed market where user retention is critical to long-term success. Users love to share, and nothing draws a crowd like a crowd — and a crowd of happy, loyal, and welcoming users will always bring in more happy, loyal, and welcoming users.
But as we’ve seen all too often, there is risk involved when you have social features on your platform. You run the risk of users posting offensive content – like hate speech, NSFW images, and harassment – which can cause serious damage to your brand’s reputation.
That’s why understanding the risks when adding social features to your product is also critical to long-term success.
Here are four questions to consider when it comes to user-generated content on your platform.
1. How much risk is my brand willing to accept?
Every brand is different. Community demographic will usually be a major factor in determining your risk tolerance.
Communities with under-13 users in the US have to be COPPA compliant, so preventing them from sharing PII (personally identifiable information) is essential. Edtech platforms should be CIPA and FERPA compliant.
If your users are teens and 18+, you might be less risk-averse, but will still need to define your tolerance for high-risk content.
Consider your brand’s tone and history. Review your corporate guidelines to understand what your brand stands for. This is a great opportunity to define exactly what kind of an online community you want to create.
2. What type of high-risk content is most dangerous to my brand?
Try this exercise: Imagine that just one pornographic post was shared on your platform. How would it affect the brand? How would your audience react? How would your executive team respond? What would happen if the media/press found out?
What about hate speech? Sexual harassment? What is your brand’s definition of abuse or harassment? The better you can define these often vague terms, the better you will understand what kind of content you need to moderate.
3. How will I communicate my expectations to the community?
Don’t expect your users to automatically know what is and isn’t acceptable on your platform. Post your community guidelines where users can see them. And make sure users have to agree to your guidelines before they can post.
4. What content moderation tools and strategies can I leverage to protect my community?
We recommend taking a proactive instead of a reactive approach to managing risk and protecting your community. That means finding the right blend of pre- and post-moderation for your platform, while also combining automated artificial intelligence with real human moderation.
While social features may be inherently risky, remember that they’re also inherently beneficial to your brand and your users. Whether you’re creating a new social platform or adding chat and images to your existing product, nothing engages and delights users more than being part of a positive and healthy online community.
And if you’re not sure where to start – we have good news.
Two Hat is currently offering a no-cost, no-obligation community audit. Our team of industry experts will examine your community, locate high-risk areas, and identify how we can help solve any moderation challenges.
As many of you know, Smyte was recently acquired by Twitter and its services are no longer available, affecting many companies in the industry.
As CEO and founder of Two Hat Security, creators of the chat filter and content moderation solution Community Sift, I would like to assure both our valued customers and the industry at large that we are, and will always remain, committed to user protection and safety. For six years we have worked with many of the largest gaming and social platforms in the world to protect their communities from abuse, harassment, and hate speech.
We will continue to serve our existing clients and welcome the opportunity to work with anyone affected by this unfortunate situation. Our mandate is and will always be to protect the users on behalf of all sites. We are committed to uninterrupted service to those who rely on us.
If you’re in need of a filter to protect your community, we can be reached at firstname.lastname@example.org.
We get it. When you built your online game, virtual world, or forum for Moomin enthusiasts (you get the idea), you probably didn’t have content queues, workflow escalations, and account bans at the front of your mind. But now that you’ve launched and are acquiring users, it’s time to make sure you get the most out of your content moderation team.
Based on our experience at Two Hat, and with our clients across the industry — which include some of the biggest online games, virtual worlds, and social apps out there — we’ve prepared a list of five crucial moderation workflows.
Each workflow leverages AI-powered automation to enhance your mods’ efficiency. This gives them the time to do what humans do best: make tough decisions, engage with users, and ultimately build a healthy, thriving community.
Use Progressive Sanctions
At Two Hat, we are big believers in second chances. We all have bad days, and sometimes we bring those bad days online. According to research conducted by Riot Games, the majority of bad behavior doesn’t come from “trolls” — it comes from average users lashing out. In the same study, Riot Games found that players who were clearly informed why their account was suspended — and provided with chat logs as backup — were 70% less likely to misbehave again.
The truth is, users will always make mistakes and break your community guidelines, but the odds are that it’s a one-time thing and they probably won’t offend again.
We all know those parents who constantly threaten their children with repercussions (“If you don’t stop pulling the cat’s tail, I’ll take your Lego away!”) but never follow through. Those are the kids who run screaming like banshees down the aisles at Whole Foods. They’ve never been given boundaries. And without boundaries and consequences, we can’t be expected to learn or to change our behavior.
That’s why we highly endorse progressive sanctions. Warnings and temporary muting followed by short-term suspensions that get progressively longer (1 hour, 6 hours, 12 hours, 24 hours, etc) are effective techniques — as long as they’re paired with an explanation.
And you can be gentle at first — sometimes all a user needs is a reminder that someone is watching in order to correct their behavior. Sanctioning doesn’t necessarily mean removing a user from the community — warning and muting can be just as effective as a ban. You can always temporarily turn off chat for bad-tempered users while still allowing them to engage with your platform.
And if that doesn’t work, and users continue to post content that disturbs the community, that’s when progressive suspensions can be useful. As always, ban messages should be paired with clear communication:
You can make it fun, too.
“Having a bad day? You wrote [X], which is against the Community Guidelines. How about taking a short break (try watching that video of cats being scared by cucumbers, zoning out to Bob Ross painting happy little trees, or, if you’re so inclined, taking a lavender-scented bubble bath), then joining the community again? We’ll see you in [X amount of time].”
If your system is smart enough, you can set up accurate behavioral triggers to automatically warn, mute, and suspend accounts in real time.
The workflow will vary based on your community and the time limits you set, but it will look something like this:
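One way to picture that flow is as a short ladder lookup. The sketch below is an illustration only, not a description of how Community Sift or any particular product works: the ladder steps, the `Sanction` type, and the `next_sanction` function are all hypothetical names, and the durations simply reuse the example intervals mentioned above.

```python
from dataclasses import dataclass

# Hypothetical ladder: a warning, a mute, then suspensions of increasing
# length in hours. Tune the steps to your own community.
SANCTION_LADDER = [
    ("warn", 0),
    ("mute", 1),
    ("suspend", 1),
    ("suspend", 6),
    ("suspend", 12),
    ("suspend", 24),
]


@dataclass
class Sanction:
    action: str
    hours: int
    message: str


def next_sanction(prior_offenses: int, offending_text: str) -> Sanction:
    """Pick the next step on the ladder and pair it with an explanation."""
    # Repeat offenders stay on the last (longest) step of the ladder.
    step = min(prior_offenses, len(SANCTION_LADDER) - 1)
    action, hours = SANCTION_LADDER[step]

    # Every sanction tells the user exactly what they wrote and why it matters.
    message = (
        f'You wrote "{offending_text}", which is against the Community Guidelines.'
    )
    if action == "warn":
        message += " Please treat this as a friendly reminder."
    elif action == "mute":
        message += f" Chat is turned off for {hours} hour(s); the rest of the platform is still available."
    else:
        message += f" Your account is suspended for {hours} hour(s). We'll see you after that."
    return Sanction(action, hours, message)
```

A first offense returns a warning, a second a one-hour mute, and later offenses climb through the suspension steps, always with the explanatory message attached.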
Every community team knows that reviewing Every. Single. Uploaded. Image. Is a royal pain. 99% of images are mind-numbingly innocent (and probably contain cats, because the internet), while the other 1% is, well, shocking. After a while, everything blurs together, and the chances of actually missing that shocking 1% get higher and higher… until your eyes roll back into your head and you slump forward on your keyboard, brain matter leaking out of your ears.
OK, so maybe it’s not that bad.
But scanning image after image manually does take a crazy amount of time, and the emotional labor can be overwhelming and potentially devastating. Imagine scrolling through pic after pic of kittens, and then stumbling over full-frontal nudity. Or worse: unexpected violence and gore. Or the unthinkable: images of child or animal abuse.
All this can lead to stress, burnout, and even PTSD.
It’s in your best interests to automate some of the process. AI today is smarter than it’s ever been. The best algorithms can detect pornography with nearly 100% accuracy, not to mention images containing violence and gore, drugs, and even terrorism.
If you use AI to pre-moderate images, you can tune the dial based on your community’s resilience. Set the system to automatically approve any image with, say, a low risk of being pornography (or gore, drugs, terrorism, etc), while automatically rejecting images with a high risk of being pornography. Then, send anything in the ‘grey zone’ to a pre-moderation queue for your mods to review.
Or, if your user base is older, automatically approve images in the grey zone, and let your users report anything they think is inappropriate. You can also send those borderline images to an optional post-moderation queue for manual review.
This way, you take the responsibility off of both your moderators and your community to find the worst content.
What the flow looks like:
User submits image → AI returns risk probability → If safe, automatically approve and post → If unsafe, automatically reject → If borderline, hold and send to queue for manual pre-moderation (for younger communities) or → If borderline, publish and send to queue for optional post-moderation (for older communities).
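The routing above can be sketched as a small function. This is a minimal sketch under stated assumptions: the risk score is taken to be a probability between 0 and 1 returned by whatever image classifier you use, and the function name, the two threshold values, and the queue labels are hypothetical placeholders you would tune for your own community.

```python
def triage_image(
    risk: float,
    younger_community: bool,
    approve_below: float = 0.2,  # assumed threshold, tune per community
    reject_above: float = 0.8,   # assumed threshold, tune per community
) -> dict:
    """Route an image based on the risk probability returned by the AI."""
    if not 0.0 <= risk <= 1.0:
        raise ValueError("risk must be a probability between 0 and 1")

    if risk < approve_below:
        # Safe: automatically approve and post.
        return {"publish": True, "queue": None}
    if risk > reject_above:
        # Unsafe: automatically reject.
        return {"publish": False, "queue": None}
    if younger_community:
        # Borderline, younger community: hold for manual pre-moderation.
        return {"publish": False, "queue": "pre_moderation"}
    # Borderline, older community: publish, with optional post-moderation.
    return {"publish": True, "queue": "post_moderation"}
```

Moving the two thresholds closer together shrinks the grey zone and the moderation queue; moving them apart sends more images to human review.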
For many people, online communities are the safest spaces to share their deepest, darkest feelings. Depending on your community, you may or may not allow users to discuss their struggles with suicidal thoughts and self-injury openly.
Regardless, users who discuss suicide and self-harm are vulnerable and deserve extra attention. Sometimes, just knowing that someone else is listening can be enough.
We recommend that you provide at-risk users with phone or text support lines where they can get help. Ideally, this should be done through an automated messaging system to ensure that users get help in real time. However, you can also send manual messages to establish a dialogue with the user.
Worldwide, there are a few resources that we recommend:
If your community is outside of the US, Canada, or the UK, your local law enforcement agency should have phone numbers or websites that you can reference. In fact, it’s a good idea to build a relationship with local law enforcement; you may need to contact them if you ever need to escalate high-risk scenarios, like a user credibly threatening to harm themselves or others.
We don’t recommend punishing users who discuss their struggles by banning or suspending their accounts. Instead, a gentle warning message can go a long way:
“We noticed that you’ve posted an alarming message. We want you to know that we care, and we’re listening. If you’re feeling sad, considering suicide, or have harmed yourself, please know that there are people out there who can help. Please call [X] or text [X] to talk to a professional.”
When setting up a workflow, keep in mind that a user who mentions suicide or self-harm just once probably doesn’t need an automated message. Instead, tune your workflow to send a message after repeated references to suicide and self-harm. Your definition of “repeated” will vary based on your community, so it’s key that you monitor the workflow closely after setting it up. You will likely need to retune it over time.
Of course, users who encourage other users to kill themselves should receive a different kind of message. Look out for phrases like “kys” (kill yourself) and “go drink bleach,” among others. In these cases, use progressive sanctions to enforce your community guidelines and protect vulnerable users.
What the flow looks like:
User posts content about suicide/self-harm X number of times → System automatically displays message to user suggesting they contact a support line → If user continues to post content about suicide/self-harm X number of times, send content to a queue for a moderator to manually review for potential escalation
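As a rough sketch, the flow above comes down to a per-user counter with two thresholds. Everything here is an assumption made for illustration: the class name, the default values of 3 and 6, and the returned action labels are hypothetical, and the code presumes that some upstream filter has already flagged the post as a reference to suicide or self-harm.

```python
from collections import defaultdict


class SelfHarmWorkflow:
    """Counts flagged posts per user and decides what the system should do."""

    def __init__(self, message_after: int = 3, escalate_after: int = 6):
        # Both thresholds are placeholders; "repeated" varies by community.
        self.message_after = message_after
        self.escalate_after = escalate_after
        self.counts = defaultdict(int)

    def record_flagged_post(self, user_id: str) -> str:
        """Call once for each post already flagged as suicide/self-harm."""
        self.counts[user_id] += 1
        seen = self.counts[user_id]

        if seen >= self.escalate_after:
            # The user kept posting: hand off to a human moderator.
            return "queue_for_moderator"
        if seen == self.message_after:
            # First threshold reached: show the support-line message once.
            return "send_support_message"
        return "none"
```

Because the right thresholds are rarely obvious up front, it helps to log the returned actions and review them regularly while retuning.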
Prepare for Breaking News & Trending Topics
We examined this underused moderation flow in a recent webinar. Never underestimate how deeply the latest news and emerging internet trends will affect your community. If you don’t have a process for dealing with conversations surrounding the next natural disaster, political scandal, or even another “covfefe,” you run the risk of alienating your community.
Consider Charlottesville. On August 11th, marchers from the far right, including white nationalists, neo-Nazis, and members of the KKK, gathered to protest the removal of Confederate monuments throughout the city. The rally soon turned violent, and on August 12th a car plowed into a group of counter-protestors, killing a young woman.
The incident immediately began trending on social media and in news outlets and remained a trending topic for several weeks afterward.
How did your online community react to this news? Was your moderation team prepared to handle conversations about neo-Nazis on your platform?
While not a traditional moderation workflow, we have come up with a “Breaking News & Trending Topics” protocol that can help you and your team stay on top of the latest trends — and ensure that your community remains expressive but civil, even in the face of difficult or controversial topics.
Compile vocabulary: When an incident occurs, compile the relevant vocabulary immediately.
Evaluate: Review how your community is using the vocabulary. If you wouldn’t normally allow users to discuss the KKK, would it be appropriate to allow it based on what’s happening in the world at that moment?
Adjust: Make changes to your chat filter based on your evaluation above.
Validate: Watch live chat to confirm that your assumptions were correct.
Stats & trends: Compile reports about how often or how quickly users use certain language. This can help you prepare for the next incident.
Re-evaluate vocabulary over time: Always review and reassess. Language changes quickly. For example, the terms Googles, Skypes, and Yahoos were used in place of racist and anti-Semitic slurs on Twitter in 2016. Now, in late 2017, they’ve disappeared. What have they been replaced with?
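For the "Stats & trends" step, even a very simple count of how often the compiled vocabulary appears per hour can show how quickly a topic is spreading. The sketch below assumes chat messages are available as (timestamp, text) pairs; the function name is hypothetical, and the whitespace tokenization is deliberately naive (single-word terms only, no handling of misspellings or filter evasion).

```python
from collections import Counter
from datetime import datetime


def term_counts_by_hour(messages, vocabulary):
    """Return {hour: Counter(term -> count)} for the tracked vocabulary.

    messages: iterable of (datetime, str) pairs.
    vocabulary: iterable of single-word terms to track.
    """
    tracked = {term.lower() for term in vocabulary}
    buckets = {}
    for timestamp, text in messages:
        # Group messages into the hour in which they were posted.
        hour = timestamp.replace(minute=0, second=0, microsecond=0)
        counter = buckets.setdefault(hour, Counter())
        # Naive tokenization: lowercase, split on whitespace, trim punctuation.
        words = (word.strip(".,!?\"'") for word in text.lower().split())
        counter.update(word for word in words if word in tracked)
    return buckets
```

Comparing the counts for consecutive hours gives a rough sense of whether a term is picking up or fading, which can feed into the reports you compile for the next incident.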
Stay diligent, and stay informed. Twitter is your team’s secret weapon. Have your team monitor trending hashtags and follow reputable news sites so you don’t miss anything your community may be talking about.
Provide Positive Feedback
Ever noticed that human beings are really good at punishing bad behavior but often forget to reward positive behavior? It’s a uniquely human trait.
Positive moderation is a game changer. Not only does it help foster a healthier community, it can also have a huge impact on retention.
Set aside time every day for moderators to watch live chat to see what the community is talking about and how users are interacting.
Engage in purposeful community building — have moderators spend time online interacting in real time with real users.
Forget auto-sanctions: Try auto-rewards! Use AI to find key phrases indicating that a user is helping another user, and send them a message thanking them, or even inviting them to collect a reward.
Give your users the option to nominate a helpful user, instead of just reporting bad behavior.
Create a queue that populates with users who have displayed consistent positive behavior (no recent sanctions, daily logins, no reports, etc) and reach out to them directly in private or public chat to thank them for their contributions.
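The positive-behavior queue described in the last point can be sketched as a simple filter over user activity. The field names, the `positive_queue` function, and the default thresholds (30 clean days, a 7-day login streak) are all hypothetical; they stand in for whatever signals your platform actually records.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class UserActivity:
    user_id: str
    days_since_last_sanction: Optional[int]  # None means never sanctioned
    login_streak_days: int
    reports_against: int


def positive_queue(users, min_clean_days: int = 30, min_streak: int = 7):
    """Return the IDs of users who qualify for a thank-you from a moderator."""

    def qualifies(user: UserActivity) -> bool:
        no_recent_sanctions = (
            user.days_since_last_sanction is None
            or user.days_since_last_sanction >= min_clean_days
        )
        return (
            no_recent_sanctions
            and user.login_streak_days >= min_streak
            and user.reports_against == 0
        )

    return [user.user_id for user in users if qualifies(user)]
```

Moderators can then work through the returned list the same way they would a report queue, except that every item ends with a thank-you instead of a sanction.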
Any one of these workflows will go a long way towards building a healthy, engaged, loyal community on your platform. Try them all, or just start out with one. Your community (and your team) will thank you.
With our chat filter and moderation software Community Sift, Two Hat has helped companies like Supercell, Roblox, Habbo, Friendbase, and more implement similar workflows and foster healthy, thriving communities.
Interested in learning how we can help your gaming or social platform thrive? Get in touch today!
Chances are, you shuddered slightly at the words “online comments.”
Presenting Exhibit A, from a Daily Mail article about puppies:
It gets worse. Presenting Exhibit B, from Twitter:
The internet has so much potential. It connects us across borders, cultural divides, and even languages. And oftentimes that potential is fulfilled. Remember the Arab Spring in 2011? It probably wouldn’t have happened without Twitter connecting activists across the Middle East.
Writers, musicians, and artists can share their art with fans across the globe on platforms like Medium and YouTube.
After the terror attacks in Manchester and London in May and June, many Facebook users used the Safety Check feature to reassure family and friends that they were safe from danger.
Every byte of knowledge that has ever existed is only a few taps away, stored, improbably, inside a device that fits in the palm of a hand. The internet is a powerful tool for making connections, for sharing knowledge, and for conversing with people across the globe.
And yet… virtual conversations are so often reduced to emojis and cat memes. Because who wants to start a real conversation when it’s likely to dissolve into insults and vitriol?
A rich, fulfilling, and enlightened life requires a lot more.
So what’s missing?
Maslow was onto something…
Remember Maslow’s hierarchy of needs? It probably sounds vaguely familiar, but here’s a quick refresher if you’ve forgotten.
Abraham Maslow, who later became a psychology professor at Brandeis University in Massachusetts, published his groundbreaking paper “A Theory of Human Motivation” in 1943. In this seminal paper, he identifies and describes the five basic levels of human needs. Each need forms a solid base under the next. And each basic need, when achieved, leads to the next, creating a pyramid. Years later he expanded on this hierarchy of human needs in the 1954 book Motivation and Personality.
The hierarchy looks like this:
Physiological: The basic physical requirements for human survival, including air, water, and food; then clothing, shelter, and sex.
Safety: Once our physical needs are met, we require safety and security. Safety needs include economic security as well as health and well-being.
Love/belonging: Human beings require a sense of belonging and acceptance from family and social groups.
Esteem: We need to be desired and accepted by others.
Self-actualization: The ultimate. When we self-actualize, we become who we truly are.
According to Maslow, our supporting needs must be met before we can become who we truly are — before we reach self-actualization.
So what does it mean to become yourself? When we self-actualize, we’re more than just animals playing dress-up — we are fulfilling the promise of consciousness. We are human.
Sorry, what does this have to do with the internet?
We don’t stop being human when we go online. The internet is just a new kind of community: the logical evolution of the offline communities that we started forming when modern humans first emerged about 200,000 years ago in Africa. We’ve had many chances to reassess, reevaluate, and modify our offline community etiquette since then, which means that offline communities have a distinct advantage over the internet.
Merriam-Webster’s various definitions of “community” are telling:
people with common interests living in a particular area; an interacting population of various kinds of individuals (such as species) in a common location; a group of people with a common characteristic or interest living together within a larger society
Community is all about interaction and common interests. We gather together in groups, in public and private spaces, to share our passions and express our feelings. So, of course, we expect to experience that same comfort and kinship in our online communities. After all, we’ve already spent nearly a quarter of a million years cultivating strong, resilient communities — and achieving self-actualization.
But the internet has failed us because people are afraid to do just that. Those of us who aspire to online self-actualization are too often drowned out by trolls. Which leaves us with emojis and cat memes — communication without connection.
So how do we bridge that gap between conversation and real connection? How do we reach the pinnacle of Maslow’s hierarchy of needs in the virtual space?
Conversations have needs, too
What if there was a hierarchy of conversation needs using Maslow’s theory as a framework?
On the internet, our basic physical needs are already taken care of so this pyramid starts with safety.
So what do our levels mean?
Safety: Offline, we expect to encounter bullies from time to time. And we can’t get upset when someone drops the occasional f-bomb in public. But we do expect to be safe from targeted harassment, from repeated racial, ethnic, or religious slurs, and from threats against our bodies and our lives. We should expect the same when we’re online.
Social: Once we are safe from harm, we require places where we feel a sense of belonging and acceptance. Social networks, forums, messaging apps, online games — these are all communities where we gather and share.
Esteem: We need to be heard, and we need our voices to be respected.
Self-actualization: The ultimate. When we self-actualize online, we blend the power of community with the blessing of esteem, and we achieve something bigger and better. This is where great conversation happens. This is where user-generated content turns into art. This is where real social change happens.
Problem is, online communities are far too often missing that first level. And without safety, we cannot possibly move onto social.
The real kicker? Over a quarter (27%) of Americans reported that they had self-censored their posts out of fear of harassment.
If we feel so unsafe in our online communities that we stop sharing what matters to us most, we’ve lost the whole point of building communities. We’ve forgotten why they matter.
How did we get here?
There are a few reasons. No one planned the internet; it just happened, site by site and network by network. We didn’t plan for it, so we never created a set of rules.
And the internet is still so young. Think about it: Communities have been around since we started to walk on two feet. The first written language began in Sumeria about 5000 years ago. The printing press was invented 600 years ago. The telegram has been around for 200 years. Even the telephone — one of the greatest modern advances in communication — has a solid 140 years of etiquette development behind it.
The internet as we know it today — with its complex web of disparate communities and user-generated content — is only about 20 years old. And with all due respect to 20-year-olds, it’s still a baby.
We’ve been stumbling around in this virtual space with only a dim light to guide us, which has led to the standardization of some… less-than-desirable behaviors. Kids who grew up playing MOBAs (multiplayer online battle arenas) have come to accept that toxicity is a byproduct of online competition. Those of us who use social media expect to encounter previously unimaginably vile hate speech when we scroll through our feed.
And, of course, we all know to avoid the comments section.
Can self-actualization and online communities co-exist?
Yes. Because why not? We built this thing — so we can fix it.
Three things need to happen if we’re going to move from social to esteem to self-actualization.
Industry-wide paradigm shift
The good news? It’s already happening. Every day there’s a new article about the dangers of cyberbullying and online abuse. More and more social products realize that they can’t allow harassment to run free on their platforms. The German parliament recently backed a plan to fine social networks up to €50 million if they don’t remove hate speech within 24 hours.
Even the Obama Foundation has a new initiative centered around digital citizenship.
As our friend David Ryan Polgar, Chief of Trust & Safety at Friendbase, says:
“Digital citizenship is the safe, savvy, ethical use of social media and technology.”
Safe, savvy, and ethical: As a society, we can do this. We’ve figured out how to do it in our offline communities, so we can do it in our online communities, too.
A big part of the shift includes a newfound focus on bringing empathy back into online interactions. To quote David again:
“There is a person behind that avatar and we often forget that.”
Thoughtful content moderation
The problem with moderation is that it’s no fun. No one wants to comb through thousands of user reports, review millions of potentially horrifying images, or monitor a mind-numbingly long live-chat stream in real time.
Too much noise + no way to prioritize = unhappy and inefficient moderators.
Thoughtful, intentional moderation is all about focus. It’s about giving community managers and moderators the right techniques to sift through content and ensure that the worst stuff — the targeted bullying, the cries for help, the rape threats — is dealt with first.
Automation is a crucial part of that solution. With artificial intelligence getting more powerful every day, instead of forcing their moderation team to review posts unnecessarily, social products can let computers do the heavy lifting first.
The content moderation strategy will be slightly different for every community. But there are a few best practices that every community can adopt:
Know your community resilience. This is a step that too many social products forget to take. Every community has a tolerance level for certain behaviors. Can your community handle the occasional swear word — but not if it’s repeated 10 times? Resilience will tell you where to draw the line.
Use reputation to treat users differently. Behavior tends to repeat itself. If you know that a user posts things that break your community guidelines, you can place tighter restrictions on them. Conversely, you can give engaged users the ability to post more freely. But don’t forget that users are human; everyone deserves the opportunity to learn from their mistakes. Which leads us to our next point…
Use behavior-changing techniques. Strategies include auto-messaging users before they hit “send” on posts that breach community guidelines, and publicly honoring users for their positive behavior.
Let your users choose what they see. The ESRB has the right idea. We all know what “Rated E for Everyone” means — we’ve heard it a million times. So what if we designed systems that allowed users to choose their experience based on a rating? If you have a smart enough system in the background classifying and labeling content, then you can serve users only the content that they’re comfortable seeing.
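To make the last point concrete, here is a minimal sketch of rating-based filtering. It assumes a background system has already labeled each piece of content; the three rating labels are hypothetical (only loosely inspired by ESRB-style tiers), as is the `visible_items` function.

```python
# Hypothetical rating tiers, ordered from most to least restrictive.
RATING_ORDER = ["everyone", "teen", "mature"]


def visible_items(items, user_max_rating: str):
    """Keep only the content at or below the rating the user has chosen.

    items: list of dicts that each carry a "rating" label assigned upstream.
    """
    limit = RATING_ORDER.index(user_max_rating)
    return [
        item for item in items
        if RATING_ORDER.index(item["rating"]) <= limit
    ]
```

A user who picks "teen" would see content labeled "everyone" or "teen", while content labeled "mature" stays hidden for them without being removed for anyone else.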
It all comes back to our hierarchy of conversation needs. If we can provide that first level of safety, we can move beyond emojis and cats — and move onto the next level.
Early digital education
The biggest task ahead of us is also the most important — education. We didn’t have the benefit of 20 years of internet culture, behavior, and standards when we first started to go online. We have those 20 years of mistakes and missteps behind us now.
Which means that we have an opportunity with the next generation of digital citizens to reshape the culture of the internet. In fact, strides are already being made.
It’s a smart move — kids are already engaged when they’re playing a game they love, so it’s a lot easier to slip some education in there. Ivan and his team have even created impressive teaching resources for teachers who lead the clubs.
Google recently launched Be Internet Awesome, a program that teaches young children how to be good digital citizens and explore the internet safely. In the browser game Interland, kids learn how to protect their personal information, be kind to other users, and spot phishing scams and fake sites. And similar to Riot, Google has created curriculum for educators to use in the classroom.
Things are changing. Our kids will likely grow up to be better digital citizens than we ever were. And it’s unlikely that they will tolerate the bullying, harassment, and abuse that we’ve put up with for the last 20 years.
Along with a paradigm shift, thoughtful moderation, and education, if we want change to happen, we have to celebrate our communities. We have to talk about our wins, our successes… and especially our failures. Let’s not beat ourselves up if we don’t get it right the first time. We’re figuring this out.
It’s time for the internet to grow up
Is this the year the internet achieves its full potential? From where most of us in the industry sit, it’s already happening. People are fed up, and they’re ready for a change.
This year, social products have an opportunity to decide what they really want to be. They can be the Wild West, where too many conversations end with a (metaphorical) bullet. Or they can be something better. They can be spaces that nurture humanity — real communities, the kind we’ve been building for the last 200,000 years.
This year, let’s build online communities that honor the potential of the internet.
That meet every level in our hierarchy of needs.
That promote digital citizenship.
That encourage self-actualization.
This year, let’s start the conversation.
At Two Hat Security, we empower social and gaming platforms to build healthy, engaged online communities, all while protecting their brand and their users from high-risk content.
Way back in 2004 (only 13 years ago but several lifetimes in internet years), a Professor of Psychology at Rider University named John Suler wrote a paper called The Online Disinhibition Effect. In it, he identifies the two kinds of online disinhibition:
Benign disinhibition. We’re more likely to open up, show vulnerability, and share our deepest fears. We help others, and we give willingly to strangers on sites like GoFundMe and Kickstarter.
Toxic disinhibition. We’re more likely to harass, abuse, and threaten others when we can’t see their face. We indulge our darkest desires. We hurt people because it’s easy.
Suler identified eight ways in which the internet facilitates both benign and toxic disinhibition. Let’s look at three of them:
Anonymity. Have you ever visited an unfamiliar city and been intoxicated by the fact that no one knew you? You could become anyone you wanted; you could do anything. That kind of anonymity is rarely available in our real lives. Think about how you’re perceived by your family, friends, and co-workers. How often do you have the opportunity to indulge in unexpected — and potentially unwanted — thoughts, opinions, and activities?
Anonymity is a cloak. It allows us to become someone else (for better or worse), if only for the brief time that we’re online. If we’re unkind in our real lives, sometimes we’ll indulge in a bit of kindness online. And if we typically keep our opinions to ourselves, we often shout them all the louder on the internet.
Invisibility. Anonymity is a cloak that renders us—and the people we interact with—invisible. And when we don’t have to look someone in the eye it’s much, much easier to indulge our worst instincts.
“…the opportunity to be physically invisible amplifies the disinhibition effect… Seeing a frown, a shaking head, a sigh, a bored expression, and many other subtle and not so subtle signs of disapproval or indifference can inhibit what people are willing to express…”
Solipsistic Introjection & Dissociative Imagination. When we’re online, it feels like we exist only in our imagination, and the people we talk to are simply voices in our heads. And where do we feel most comfortable saying the kinds of things that we’re too scared to normally say? That’s right—in our heads, where it’s safe.
Just like retreating into our imagination, visiting the internet can be an escape from the overwhelming responsibilities of the real world. Once we’ve associated the internet with the “non-real” world, it’s much easier to say those things we wouldn’t say in real life.
“Online text communication can evolve into an introjected psychological tapestry in which a person’s mind weaves these fantasy role plays, usually unconsciously and with considerable disinhibition.”
The internet has enriched our lives in so many ways. We’re smarter (every single piece of information ever recorded can be accessed on your phone — think about that) and more connected (how many social networks do you belong to?) than ever.
We’re also dumber (how often do you mindlessly scroll through Facebook without actually reading anything?) and more isolated (we’re connected, but how well do we really know each other?).
Given that dichotomy, it makes sense that the internet brings out both the best and the worst in us. Benign disinhibition brings us together — and toxic disinhibition rips us apart.