In 2020, social platforms that wish to expand their product and scale their efforts are faced with a critical decision — how will they automate the crucial task of content moderation? As platforms grow from hundreds to thousands to millions of users, that means more usernames, more live chat, and more comments, all of which require some form of moderation. From app store requirements to legal compliance with global legislation, ensuring that all user-generated content is aligned with community guidelines is nothing short of an existential matter.
When it comes to making a technical choice for a content moderation platform, what I hear in consultations and demos can be distilled down to this: engineers want a solution that’s simple to integrate and maintain, and that can scale as their product scales. They are also looking for a solution that’s battle-tested and allows for easy troubleshooting — and that won’t keep them up at night with downtime issues!
“Processing 100 billion online interactions in one month is technically hard to achieve. That is not simply just taking a message and passing it on to users but doing deep textual analysis for over 3 million patterns of harmful things people can say online. It includes building user reputation and knowing if the word on the line above mixed with this line is also bad. Just trying to maintain user reputation for that many people is a very large technical challenge. And to do it all on 20 milliseconds per message is incredible”. Chris Priebe, Two Hat’s CEO and Founder
I caught up with Laurence Brockman, Two Hat’s Vice President of Core Services, and Manisha Eleperuma, our Manager of Development Operations, just as we surpassed the mark of 100 billion pieces of human interactions processed in one month.
I asked them about what developers value in a content moderation platform, the benefits of an API-based service, and the technical challenges and joys of safeguarding hundreds of millions of users globally.
Carlos Figueiredo: Laurence, 100 billion online interactions processed in one month. Wow! Can you tell us about what that means to you and the team, and the journey to getting to that landmark?
“At the core, it’s meant we were able to keep people safe online and let our customers focus on their products and communities. We were there for each of our customers when they needed us most”.
Laurence Brockman: The hardest part for our team was the pace of getting to 100 billion. We tripled the volume in three months! When trying to scale & process that much data in such a short period, you can’t cut any corners. And you know what? I’m pleased to say that it’s been business as usual – even with this immense spike in volume. We took preventative measures along the way, we focused on key areas to ensure we could scale. Don’t get me wrong, there were few late nights and a week of crazy refactoring a system but our team and our solution delivered. I’m very proud of the team and how they dug in, identified any potential problem areas and jumped right in. At 100 billion, minor problems can become major problems and our priority is to ensure our system is ready to handle those volumes.
“What I find crazy is our system is now processing over 3 billion events every day! That’s six times the volume of Twitter”.
CF: Manisha, what are the biggest challenges and joys of running a service that safeguards hundreds of millions of users globally?
Manisha Eleperuma: I would start off with the joys. I personally feel really proud to be a part of making the internet a safer place. The positive effect that we can have on an individual’s life is immense. We could be stopping a kid from harming themself, we could be saving them from a predator, we could be stopping a friendly conversation turning into a cold battle of hate speech. This is possible because of the safety net that our services provide to online communities. Also, it is very exciting to have some of the technology giants and leaders in the entertainment industry using our services to safeguard their communities.
It is not always easy to provide such top-notch service, and it definitely has its own challenges. We as an Engineering group are maintaining a massive complex system and keeping it up and running with almost zero downtime. We are equipped with monitoring tools to check the system’s health and engineers have to be vigilant for alerts triggered by these tools and promptly act upon any anomalies in the system even during non-business hours. A few months ago, when the pandemic situation was starting to affect the world, the team could foresee an increase in transactions that could potentially start hitting our system.
“This allowed the team to get ahead of the curve and pre-scale some of the infrastructure components to be ready for the new wave so that when traffic increases, it hits smoothly without bringing down the systems”.
Another strenuous exercise that the team often goes through is to maintain the language quality of the system. Incorporating language-specific characteristics into the algorithms is challenging, but exciting to deal with.
CF: Manisha, what are the benefits of using an API-based service? What do developers value the most in a content moderation platform?
ME: In our context, when Two Hat’s Community Sift is performing as a classification tool for a customer, all transactions happen via customer APIs. In every customer API, based on their requirements, it has the capability to access different components of our platform side without much hassle. For example, certain customers rely on getting the player/user context, their reputation, etc. The APIs that they are using to communicate with our services are easily configurable to fetch all that information from the internal context system, without extra implementation from the customer’s end.
“This API approach has accelerated the integration process as well. We recently had a customer who was integrated with our APIs and went live successfully within a 24 hour period”.
Customers expect reliability and usability in moderation platforms. When a moderator goes through content in a Community Sift queue, we have equipped the moderator with all the necessary data, including player/user information with the context of the conversation, history and the reputation of the player which eases decision-making. This is how we support their human moderation efforts. Further, we are happy to say that Two Hat has expanded the paradigm to another level of automated moderation, using AI models that make decisions on behalf of human moderators after it has learned from their consistent decisions, which lowers the moderation costs for customers.
CF: Laurence, many of our clients prefer to use our services via a server to server communication, instead of self-hosting a moderation solution. Why is that? What are the benefits of using a service like ours?
LB: Just as any SaaS company will tell you, our systems are able to scale to meet the demand without our customers’ engineers having to worry about it. It also means that as we release new features and functions, our customers don’t have to worry about expensive upgrades or deployments. While all this growth was going on, we also delivered more than 40 new subversion detection capabilities into our core text-classification product.
Would you like to see our content moderation platform in action? Request a demo today.