Meet the Mayor in a Town of 20 Million Teens

Launched in 2016, Yubo is a social network of more than 20 million users from around the world. Yubo lets users meet new people and connect through live video streaming and chat. Developed and operated by Paris-based Twelve App SAS, the Yubo app is available for free on the App Store and Google Play.

Two Hat’s Community Sift platform powers content moderation for Yubo’s Live Titles, Comments, and Usernames, all in multiple languages. Use cases include detection and moderation of bullying, sexting, drugs/alcohol, fraud, racism, and grooming. Recently, Yubo’s COO, Marc-Antoine Durand, sat down with Two Hat to share his thoughts on building and operating a safe social platform for teens, and where future evolutions in content moderation may lead.


Two Hat: Talk about what it’s like to operate a community of young people from around the globe sharing 7 million comments every day on your platform.

Marc-Antoine Durand: It’s like running a city. You need to have rules and boundaries, and importantly you need to educate users about them, and you have to undertake prevention to keep things from getting out of hand in the first place. You’ll deal with all the bad things that exist elsewhere in society – drug dealing, fraud, prostitution, bullying and harassment, thoughts or attempts at suicide – and you will need a framework of policies and law enforcement to keep your city safe. It’s critical that these services are delivered in real-time.

Marc-Antoine Durand, COO of Yubo

The future safety of the digital world rests upon how willing we are to use behavioral insights to stop the bad from spoiling the good. If a Yubo moderator sees something happening that violates community guidelines or could put someone at risk, they send a warning message to the user. The message might say that their Live feed will be shut down in one minute, or it might warn the user they will be suspended from the app if they don’t change their behavior. We’re the only social video app to do this, and we do it because the best way for young people to learn is in the moment, through real-life experience.

Yubo’s role is always to find a balance between enabling self-expression and freedom of speech and preventing harm. Teenagers are very keen to talk about themselves, are interested in others and want to share the issues that are on their minds, such as relationships and sexuality. This is a normal part of growing up and development at this point in teenagers’ lives. But this needs to be done within a context that is healthy and free from pressure and coercion, for example pressure to share intimate images. Finding a limit or balance between freedom and protection in each case is important to make sure the app is appealing to young people and offers them the space for expression but keeps them as safe as possible.

TH: When Yubo first launched in 2016, content moderation was still quite a nascent industry. What were your solution options at the time, and how was your initial learning curve as a platform operator?

MD: There weren’t many options available then. You could hire a local team of moderators to check comments and label them, but that’s expensive and hard to scale. There was no way our little team of four could manage all that and be proficient in Danish, English, French, Norwegian, Spanish and Swedish all at the same time. So multi-language support was a must-have.

We created our own algorithms to detect images that broke Yubo’s community guidelines and acceptable use policies, but content moderation is a very specialized technical competency and a never-ending job. There were only four of us, and we simply couldn’t do everything required to do it well. As a result, early on, we were targeted by the press as a ‘bad app.’ To win back trust and establish the app as safe and appropriate for young people, we had to start over. Our strategy was to show that we were working hard and fast to improve, and we set out to establish that a small company with the right safety strategy and tools can be just as good at content moderation as any large company, or better.

I applaud Yubo for extensively reworking its safety features to make its platform safer for teens. Altering its age restrictions, improving its real identity policy, setting clear policies around inappropriate content and cyberbullying, and giving users the ability to turn location data off demonstrates that Yubo is taking user safety seriously.

Julie Inman Grant, Australian e-safety Commissioner

TH: What are some of the key content moderation issues on your platform and how do you engage users as part of the solution?

MD: One of the issues every service has is fake user profiles. These are a particular problem in cases of grooming or bullying. To address this, we have created a partnership with a company called Yoti that allows users to certify their identity. So, when you’re talking to somebody, you can see that they have a badge signifying that their identity has been certified, indicating they are ‘who they say they are.’ It’s a voluntary process for users, but if we think a particular profile may be suspicious or unsafe, we can force the user to certify their identity, or they will be removed from the platform.

Real-time intervention by Yubo moderators

The other issues we deal with are often related to the user’s live stream title, which is customizable, and the comments in real-time chats. Very soon after launching, we saw that users were creating sexualized and ‘attention-seeking’ live stream titles not just for fun, but as a strategy to attract more views, for example, with a title such as: “I’m going to flash at 50 views.” People are very good at finding ways to bypass the system by creating variations of words. We realized immediately that we needed a technology to detect and respond to that subversion.

As to engaging users as part of our content moderation, it’s very important to give users who wish to participate an opportunity to help with the app. Users want and value this. When our users report bad or concerning behavior in the app, they give us a very precise reason and good context. They do this because they are very passionate about the service and want to keep it safe. Our job is to gather this feedback and data so that we may learn from it, but also to take action on what users tell us, and to reward those who help us. That’s how this big city functions.

TH: Yubo was referenced as part of the United Kingdom’s Online Harms white paper and consultation — What’s your take on pending duty of care legislation in the UK and elsewhere, and are you concerned that a more restrictive regulatory environment may stifle technical innovation?

MD: I think regulation is good as long as it’s thoughtful and agile enough to adjust to a constantly changing technical environment, and not simply a way to blame apps and social platforms for all the bad things happening in society, because that does not achieve anything. Perhaps most concerning is setting standards that only the Big Tech companies with thousands of moderators and technical infrastructure staff can realistically achieve, which prevents smaller start-ups from being innovative and participating in the ecosystem. Certainly, people spend a lot of time on these platforms and they should not be unregulated, but the government can’t just set rules; it needs to help companies get better at providing safer products and services.

It’s an ecosystem and everyone needs to work together to improve it and keep it as safe as possible, and this includes the wider public and users themselves. So much more is needed in the White Paper about media literacy and about managing offline problems that escalate and are amplified online. Bullying and discrimination, for example, exist in society and strategies are needed in schools, families, and communities to tackle these issues – focusing only on the online world will not deter or prevent them.

In France, by comparison to the UK, we’re very far away from this ideal ecosystem. We’ve started to work on moderation, but really the French government just does whatever Facebook says. No matter where you are, the more regulations you have, the more difficult it will be to start and grow a company, so barriers to innovation and market entry will be higher. That’s just where things are today.

It’s in our DNA to take safety features as far as we can to protect our users.

— Marc-Antoine Durand, COO of Yubo

TH: How do you see Yubo’s approach to content moderation evolving in the future?

MD: We want to build a reputation system for users, the idea being to do what I call pre-moderation, or detecting unsafe users by their history. For that, we need to gather as much data as we can from our users’ live streams, titles, and comments. The plan is to create a method where users are rewarded for good behavior. That’s the future of the app, to reward the good stuff and, for the very small minority who are doing bad stuff, like inappropriate comments or pictures or titles, we’ll engage them and let them know it’s not ok and that they need to change their behavior if they want to stay. So, user reputation as a baseline for moderation. That’s where we are going.


We’re currently offering no-cost, no-obligation Community Consultations for social networks that want an expert consultation on their community moderation practices.

Our Director of Community Trust & Safety will examine your community, locate areas of potential risk, and provide you with a personalized community analysis, including recommended best practices and tips to maximize user engagement.

Sign up using the form below to request your community consultation.

Four Must-Haves for the Internet of the Future

To make the internet of the future a safer and more enjoyable place, it is critical to get a clearly defined minimum standard of Safety by Design established internet-wide. That said, it is important to recognize that “Design for Scale” and “Design for Monetization” are the embedded norms.

Many websites and apps are built to reach live state as a first priority, and forget safety or fail to come back to it until their product is mired in a situation where making it safe is very hard. To that end, it’s important we develop guidelines for startups and SMEs to understand best practices for Safety by Design, and access resources to help them build that way.

The regulation stems from the concept of “Duty of Care”. This is an old concept that says if you are going to make a social space, such as a nightclub, you have a responsibility to ensure it is safe. Likewise, we need to learn from our past mistakes and build out shared standards of best practices so users don’t get hurt in our online social spaces.

We believe that there are four layers of protection every site should have:

1. Clear terms of use
Communities don’t just happen; we create them. In real life, if you add a swing set to a park, the community expectation is that it is a place for kids. As a society, we change our language and behaviour based on that environment. We still have free speech, but we regulate ourselves for the benefit of the kids. The adult equivalent of this scenario is a nightclub; the environment allows for a loosening of behavioural norms, but step out of line with house rules and the establishment’s bouncers deal with you. Likewise, step out of line while online, and there must be consequences.

2. Embedded filters that are situationally appropriate
Many don’t add automated filters because they are afraid of the slippery slope of inhibiting free speech. In so doing they fall down the other slippery slope: doing nothing and allowing harm to continue. For the most part, this is a solved problem. Just as you can buy anti-virus technology, you can buy off-the-shelf solutions that match known signatures of things users say or share. These filters must be on every social platform, app, and web site.

3. Using User Reputation to make smarter decisions
Reward positive users. For those who keep harassing everyone else, take automated action. Two Hat are pioneers of a new technique where you can give all users maximum expression by only filtering the worst abusive content, and then increasing the filter level incrementally on those who harass others. Predictive Moderation based on user reputation is a must.

4. Let users report bad content
If someone has to report something then harm is already done. Everything that users can create needs to be able to be reported. When content is reported, record the moderator decisions (in a pseudonymized, minimized way) and train AI (like our Predictive Moderation) to scale out the easy decision-making and escalate critical issues. Engaging and empowering users to assist in identifying and escalating objectionable content is a must.
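
The reputation-based filtering described in point 3 can be sketched roughly as follows. This is a minimal illustration only, not Two Hat's implementation: the 0 to 10 severity scale, the threshold values, and names such as `UserReputation` and `is_allowed` are all assumptions made for the example.

```python
from dataclasses import dataclass

# Hypothetical severity scale: 0 (harmless) to 10 (worst abuse).
BASE_THRESHOLD = 8  # default: only the worst abusive content is filtered
MIN_THRESHOLD = 3   # strictest filter level, reached by repeat offenders


@dataclass
class UserReputation:
    # Number of moderator-confirmed harassment incidents (assumed signal).
    confirmed_incidents: int = 0

    def filter_threshold(self) -> int:
        """Tighten the filter by one step per confirmed incident."""
        return max(MIN_THRESHOLD, BASE_THRESHOLD - self.confirmed_incidents)


def is_allowed(severity: int, user: UserReputation) -> bool:
    """A message passes only if its severity is below the user's threshold."""
    return severity < user.filter_threshold()


new_user = UserReputation()
repeat_offender = UserReputation(confirmed_incidents=4)

print(is_allowed(5, new_user))         # True: default filter is permissive
print(is_allowed(5, repeat_offender))  # False: filter has been tightened
```

The same message is treated differently depending on who sends it, which is what lets most users keep maximum expression while the filter level rises only for those who harass others.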

Why we must create a better internet
In 2019, the best human intentions paired with the best technology platforms and companies in the world couldn’t stop a terrorist from live-streaming the murder of innocents. We still can’t understand why 1.5 million chose to share it.

What we can do is continue to build and connect datasets and train AI models to get better. We can also find new ways to work together to make the internet a better, safer, place.

We’ll know it’s working when exposure to bullying, hate, abuse, and exploitation no longer feels like the price of admission for being online.

To learn more about Two Hat’s vision for a better internet that’s Safe by Design, download our white paper By Design: 6 Tenets for a Safer Internet.

3 Use Cases for Automated Triggers in Content Moderation

Addressing the challenges of moderating the world’s content requires both artificial intelligence and human interaction, the formula we refer to as AI+HI. In the case of triggers specifically, the essential task is to align key moments of behavior with an automated process triggered by those behaviors.

In simplest terms, that means the ability for technology to intelligently identify not just content, but particular actions, or sets of actions, that are known to cause harm. By setting various triggers and measuring their impact, it is possible to prevent harmful content and messages of bullying or hate speech from ever making it online.

More than this, triggers can be set to increase or reduce different user permissions, such as what they are allowed to share or comment on, and can also be used to escalate contentious or urgent issues to human moderators. The following use cases are based on some of our clients’ approaches for applying automated triggers in managing User Reputation, incidents of encouragement of suicide or self-harm, and live-streamed content similar to the Christchurch terrorist video.

1. Setting Trust Levels to vet new users

The ability to automatically adjust users’ Trust Levels is a key component of our patented User Reputation technology. Consider placing new user accounts into a special moderation workflow in which their posts are pre-moderated before sharing until the user reaches a certain threshold of trust, e.g. five posts have to be approved by moderators before the user moves into the standard workflow for posting. As a user’s trust rating improves, triggers can automatically promote them to new privilege or access levels.

Conversely, triggers can be set to automatically reduce trust levels based on incidents of flagged behavior, which in turn could restrict future ability to share, etc. If your community uses profile recognition (e.g. rankings, stickers, etc.), these could also be publicly applied or removed once a set threshold is met.
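
As a rough sketch of how such a trust-level trigger might look, the example below routes posts to pre-moderation until five have been approved, and drops the account back on a flag. The class and method names are hypothetical, and resetting trust after a single confirmed flag is an assumption made for brevity; a real community would tune both rules.

```python
from dataclasses import dataclass

APPROVALS_REQUIRED = 5  # the "five approved posts" threshold from the example


@dataclass
class Account:
    approved_posts: int = 0
    trusted: bool = False

    def route_post(self) -> str:
        """Decide which workflow the account's next post goes through."""
        return "standard" if self.trusted else "pre_moderation"

    def record_approval(self) -> None:
        """A moderator approved a post; promote once the threshold is met."""
        self.approved_posts += 1
        if self.approved_posts >= APPROVALS_REQUIRED:
            self.trusted = True

    def record_flag(self) -> None:
        """Assumed rule: a confirmed flag sends the account back to vetting."""
        self.approved_posts = 0
        self.trusted = False


account = Account()
for _ in range(APPROVALS_REQUIRED):
    account.record_approval()
print(account.route_post())  # standard

account.record_flag()
print(account.route_post())  # pre_moderation
```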

2. Escalating responses for incidents of self-harm or suicide

Certain strings of text and discussion are known to indicate either a will towards self-harm, or the encouragement of self-harm by another. Incidents of encouragement of self-harm are of particular concern in communities frequented by young people.

In these incidents, triggers could be applied to mute any harassing party, send a written message to at-risk users, escalate the incident to human interaction (e.g. a phone call), or even alert local police or medical professionals in real time to a possible mental health crisis.
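
One way to picture this kind of escalation is as a ladder in which each risk level adds a stronger response. The sketch below is illustrative only: the three risk levels, the action names, and the rule for when each applies are assumptions, and deciding the risk level itself is left to whatever classifier or moderator review a platform uses.

```python
from enum import IntEnum
from typing import List


class Risk(IntEnum):
    LOW = 1
    MEDIUM = 2
    HIGH = 3


def actions_for(risk: Risk, harasser_present: bool) -> List[str]:
    """Return the responses to trigger, mildest first (hypothetical names)."""
    actions = []
    if harasser_present:
        actions.append("mute_harassing_party")
    actions.append("send_support_message")
    if risk >= Risk.MEDIUM:
        actions.append("escalate_to_human_moderator")
    if risk >= Risk.HIGH:
        actions.append("alert_emergency_services")
    return actions


print(actions_for(Risk.LOW, harasser_present=False))
# ['send_support_message']
print(actions_for(Risk.HIGH, harasser_present=True))
# ['mute_harassing_party', 'send_support_message',
#  'escalate_to_human_moderator', 'alert_emergency_services']
```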

3. Identifying and responding to live streaming events in real time

AI can only act on things it has seen many times before (computer vision models require 50,000 examples to do it well). For live streaming events, such as the Christchurch shootings, AI is currently able to detect a gun and threat of violence before the shooter even enters the front door. However (and fortunately), events like the Christchurch shooting haven’t happened enough for AI to really learn from them. But it’s not just one murderous rampage: live streams of bullying, fights, thefts, sexual assaults and even killings are all too common.

To help manage the response to such incidents, triggers can be set that use language signals to escalate content that requires human intervention. For example, an escalation could be set based on text conversations around the live stream: “Is this really happening?” “Is that a real gun?” “OMG he’s shooting.” “Somebody please help,” etc.

In concert with improving AI models for image recognition and violent acts, these triggers could alert human moderators, network operators and law enforcement to events in real time. This, in turn, could prevent future violent live streams from making their way online and limit the virality and reach of content that does; for example, once a stream is identified, an automated trigger prevents users from sharing it.
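
A language-signal trigger of this kind could be sketched as a sliding window over the chat around a stream: if enough alarmed comments arrive within a short period, the stream is escalated to a human. The phrase list, the 60-second window, and the three-message threshold below are assumptions chosen for illustration, and simple substring matching stands in for a real language model.

```python
from collections import deque

# Hypothetical alarm phrases, based on the example comments above.
ALARM_PHRASES = (
    "is this really happening",
    "is that a real gun",
    "he's shooting",
    "somebody please help",
)
WINDOW_SECONDS = 60  # assumed look-back window
MIN_MATCHES = 3      # assumed number of alarmed messages needed to escalate


class StreamChatTrigger:
    def __init__(self) -> None:
        self._hits = deque()  # timestamps of messages that matched a phrase

    def observe(self, timestamp: float, message: str) -> bool:
        """Record a chat message; return True if the stream should escalate."""
        text = message.lower().replace("’", "'")
        if any(phrase in text for phrase in ALARM_PHRASES):
            self._hits.append(timestamp)
        # Drop matches that have fallen out of the look-back window.
        while self._hits and timestamp - self._hits[0] > WINDOW_SECONDS:
            self._hits.popleft()
        return len(self._hits) >= MIN_MATCHES


trigger = StreamChatTrigger()
print(trigger.observe(0, "Is this really happening?"))  # False
print(trigger.observe(5, "nice stream"))                # False
print(trigger.observe(10, "Is that a real gun?"))       # False
print(trigger.observe(12, "OMG he's shooting"))         # True
```

Requiring several matches inside a window, rather than acting on a single comment, is one simple way to reduce false alarms from jokes or isolated remarks.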


For the first time in history, the collective global will exists to make the internet a safer place. By learning to set automated triggers to manage common incidents and workflows, content platforms can ensure faster response times to critical incidents, reduce stress on human moderators, and provide users with a safer, more enjoyable experience.

Witnessing the Dawn of the Internet’s Duty of Care

As I write this, we are a little more than two months removed from the terrorist attacks in Christchurch. Among many things, Christchurch will be remembered as the incident that galvanized world opinion, and more importantly global action, around online safety.

In the last two months, there has been a seismic shift in how we look at internet safety and how content is shared. Governments in London, Sydney, Washington, DC, Paris and Ottawa are considering or introducing new laws, financial penalties and even prison time for those who fail to remove harmful content and do so quickly. Others will follow, and that’s a good thing — securing the internet’s future requires the world’s governments to collectively raise the bar on safety, and cooperate across boundaries.

In order to reach this shared goal, it is essential that technology companies engage fully as partners. We witnessed a huge step forward just last week when Facebook, Amazon, and other tech leaders came out in strong support of the Christchurch Call to Action. Two Hat stands proudly with them.

Clear terms of use, timely actions by social platforms on user reports of extremist content, and transparent public reporting are the building blocks of a safer internet. Two Hat also believes every web site should have baseline filtering for cyberbullying, images of sexual abuse, extremist content, and encouragement of self-harm or suicide.

Crisis protocols for service providers and regulators are essential, as well — we have to get better at managing incidents when they happen. Two Hat also echoes the need for bilateral education initiatives with the goal of helping people become better informed and safer internet users.

In all cases, open collaboration between technology companies, government, not-for-profit organizations, and both public and private researchers will be essential to create an internet of the future that is Safe by Design. AI + HI (artificial intelligence plus human intelligence) is the formula we talk about that can make it happen.

AI+HI is the perfect marriage of machines, which excel at processing billions of units of data quickly, guided by humans, who provide empathy, compassion and critical thinking. Add a shared global understanding of what harmful content is and how we define and categorize it, and we are starting to address online safety in a coordinated way.

New laws and technology solutions to moderate internet content are necessary instruments to help prevent the incitement of violence and the spread of online hate, terror and abuse. Implementing duty of care measures in the UK and around the world requires a purposeful, collective effort to create a healthier and safer internet for everyone.

Our vision of that safer internet will be realized when exposure to hate, abuse, violence and exploitation no longer feels like the price of admission for being online.

The United Kingdom’s new duty of care legislation, the Christchurch Call to Action, and the rise of the world’s collective will move us closer to that day.


Two Hat is currently offering no-cost, no-obligation community audits for anyone who could benefit from a second look at their moderation techniques.

Our Director of Community Trust & Safety will examine your community, locate areas of potential risk, and provide you with a personalized community analysis, including recommended best practices and tips to maximize user engagement. This is a unique opportunity to gain insight into your community from an industry expert.

Book your audit today.

Ask Yourself These 4 Questions If You Allow User-Generated Content on Your Platform

Today, user-generated content like chat, private messaging, comments, images, and videos are all must-haves in an overstuffed market where user retention is critical to long-term success. Users love to share, and nothing draws a crowd like a crowd — and a crowd of happy, loyal, and welcoming users will always bring in more happy, loyal, and welcoming users.

But as we’ve seen all too often, there is risk involved when you have social features on your platform. You run the risk of users posting offensive content – like hate speech, NSFW images, and harassment – which can cause serious damage to your brand’s reputation.

That’s why understanding the risks when adding social features to your product is also critical to long-term success.

Here are four questions to consider when it comes to user-generated content on your platform.

1. How much risk is my brand willing to accept?
Every brand is different. Community demographic will usually be a major factor in determining your risk tolerance.

Communities with under-13 users in the US have to be COPPA compliant, so preventing them from sharing PII (personally identifiable information) is essential. Edtech platforms should be CIPA and FERPA compliant.

If your users are teens and 18+, you might be less risk-averse, but will still need to define your tolerance for high-risk content.

Consider your brand’s tone and history. Review your corporate guidelines to understand what your brand stands for. This is a great opportunity to define exactly what kind of an online community you want to create.

2. What type of high-risk content is most dangerous to my brand?
Try this exercise: Imagine that just one pornographic post was shared on your platform. How would it affect the brand? How would your audience react? How would your executive team respond? What would happen if the media/press found out?

What about hate speech? Sexual harassment? What is your brand’s definition of abuse or harassment? The better you can define these often vague terms, the better you will understand what kind of content you need to moderate.

3. How will I communicate my expectations to the community?
Don’t expect your users to automatically know what is and isn’t acceptable on your platform. Post your community guidelines where users can see them. And make sure users have to agree to your guidelines before they can post.

4. What content moderation tools and strategies can I leverage to protect my community?
We recommend taking a proactive instead of a reactive approach to managing risk and protecting your community. That means finding the right blend of pre- and post-moderation for your platform, while also using a mixture of automated artificial intelligence with real human moderation.

On top of these techniques, there are also different tools you can use to take a proactive approach, including in-house filters (read about the build internally vs buy externally debate), or content moderation solutions like Two Hat’s Community Sift (learn about the difference between a simple profanity filter and a content moderation tool).

Feeling overwhelmed?
While social features may be inherently risky, remember that they’re also inherently beneficial to your brand and your users. Whether you’re creating a new social platform or adding chat and images to your existing product, nothing engages and delights users more than being part of a positive and healthy online community.

And if you’re not sure where to start – we have good news.

Two Hat is currently offering a no-cost, no-obligation community audit. Our team of industry experts will examine your community, locate high-risk areas, and identify how we can help solve any moderation challenges.

It’s a unique opportunity to sit down with our Director of Community Trust & Safety to see how you can mitigate risk in your community.

To book your free audit, fill out the form below and we’ll reach out with next steps!

BC firm’s AI tool battles online abuse, pornography

Two Hat Security CEO Chris Priebe says the extent to which people are harassed online can end up costing businesses big bucks in the long run if companies don’t take the right steps to fight the problem. His Kelowna-based tech firm has been employing artificial-intelligence-powered tools to weed out inappropriate language or abusive content, such as pornographic images, on social networks.