Will This New AI Model Change How the Industry Moderates User Reports Forever?

Picture this:

You’re a moderator for a popular MMO. You spend hours slumped in front of your computer reviewing a seemingly endless stream of user-generated reports. You close most of them — people like to report their friends as a prank or just to test the report feature. After the 500th junk report, your eyes blur over and you accidentally close two reports containing violent hate speech — and you don’t even realize it. Soon enough, you’re reviewing reports that are weeks old — and what’s the point in taking action after so long? There are so many reports to review, and never enough time…

Doesn’t speak to you? Imagine this instead:

You’ve been playing a popular MMO for months now. You’re a loyal player, committed to the game and your fellow players. Several times a month, you purchase new items for your avatar. Recently, another player has been harassing you and your guild, using racial slurs, and generally disrupting your gameplay. You keep reporting them, but it seems like nothing ever happens – when you log back in the next day, they’re still there. You start to think that the game creators don’t care about you – are they even looking at your reports? You see other players talking about reports on the forum: “No wonder the community is so bad. Reporting doesn’t do anything.” You log on less often; you stop spending money on items. You find a new game with a healthier community. After a few months, you stop logging on entirely.

Still doesn’t resonate? One last try:

You’re the General Manager at a studio that makes a high-performing MMO. Every month your Head of Community delivers reports about player engagement and retention, operating costs, and social media mentions. You notice that operating costs are going up while the lifetime value of a user is going down. Your Head of Community wants to hire three new moderators. A story in Wired is being shared on social media — players complain about rampant hate speech and homophobic slurs in the game that appear to go unnoticed. You’re losing money and your brand reputation is suffering — and you’re not happy about it.

The problem with reports
Most social platforms give users the ability to report offensive content. User-generated reports are a critical tool in your moderation arsenal. They surface high-risk content that you would otherwise miss, and they give players a sense of ownership over and engagement in their community.

They’re also one of the biggest time-wasters in content moderation.

Some platforms receive thousands of user reports a day. Up to 70% of those reports don’t require any action — yet a moderator still has to review every one. And the reports that do require action often contain content so obviously offensive that an algorithm should be able to detect it automatically. In the end, the reports that truly need human eyes to make a fair, nuanced decision often get passed over.

Predictive Moderation
For the last two years, we’ve been developing and refining a unique AI model to label and action user reports automatically, mimicking a human moderator’s workflow. We call it Predictive Moderation.

Predictive Moderation is all about efficiency. We want moderation teams to focus on the work that matters — reports that require human review, and retention and engagement-boosting activities with the community.

Two Hat’s technology is built around the philosophy that humans should do human work, and computers should do computer work. With Predictive Moderation, you can train our innovative AI to do just that — ignore reports that a human would ignore, action reports that a human would action, and send reports that require human review directly to a moderator.

What does this mean for you? A reduced workload, moderators who are protected from having to read high-risk content, and an increase in user loyalty and trust.
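To make the split between human work and computer work concrete, here is a minimal sketch of what such a triage step could look like. The function names, labels, and thresholds are illustrative assumptions rather than Two Hat’s actual API; in practice, the classifier is trained on your own moderators’ past decisions.

    # A minimal, illustrative sketch of report triage (hypothetical names, not Two Hat's API).
    from dataclasses import dataclass

    @dataclass
    class Report:
        report_id: str
        reported_text: str

    def classify(report: Report) -> dict:
        """Placeholder for a model trained on past moderator decisions; returns label probabilities."""
        return {"no_action": 0.92, "auto_action": 0.03, "needs_human": 0.05}

    def triage(report: Report, close_at: float = 0.9, action_at: float = 0.9) -> str:
        probs = classify(report)
        if probs["no_action"] >= close_at:
            return "close"           # ignore it, as a human moderator would
        if probs["auto_action"] >= action_at:
            return "auto_action"     # sanction automatically, as a human moderator would
        return "human_review"        # ambiguous: send it to a moderator's queue

    print(triage(Report("r-1001", "he keeps spamming our guild chat")))  # -> "close"

The specific numbers don’t matter; the point is that reports the model is confident about never reach a human, while ambiguous ones always do.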

Getting started
We recently completed a sleek redesign of our moderation layout (check out the sneak peek below!). Clients begin training the AI on their dataset in January. Luckily, training the model is easy — moderators simply review user reports in the new layout, closing reports that don’t require action and actioning the reports that do.

[Image: chat moderation workflow for user-generated reports. Layout subject to change.]

“User reports are essential to our game, but they take a lot of time to review,” says one of our beta clients. “We are highly interested in smarter ways to work with user reports which could allow us to spend more time on the challenging reports and let the AI take care of the rest.”

Want to save time, money, and resources? 
As we roll out Predictive Moderation to everyone in the new year, expect to see more information including a brand-new feature page, webinars, and blog posts!

In the meantime, do you:

  • Have an in-house user report system?
  • Want to increase engagement and trust on your platform?
  • Want to prevent moderator burnout and turnover?

If you answered yes to all three, you might be the perfect candidate for Predictive Moderation.

Contact us at hello@twohat.com to start the conversation.


Two Hat CEO and founder Chris Priebe hosts a webinar on Wednesday, February 20th, where he’ll share Two Hat’s vision for the future of content moderation, including a look at how Predictive Moderation is about to change the landscape of chat moderation. Don’t miss it — the first 25 attendees will receive a free Two Hat gift bag!



Top 6 Reasons You Should Combine Automation and Manual Review in Your Image Moderation Strategy

When you’re putting together an image moderation strategy for your social platform, you have three options:

  1. Automate everything with AI;
  2. Do everything manually with human moderators; or
  3. Combine both approaches for Maximum Moderation Awesomeness™

When consulting with clients and industry partners like PopJam, unsurprisingly, we advocate for option number three.

Here are our top six reasons why:

1. Human beings are, well… human (Part 1)
We get tired, we take breaks, and we don’t work 24/7. Luckily, AI hasn’t gained sentience (yet), so we don’t have to worry (yet) about an algorithm troubling our conscience when we make it work without rest.

2. Human beings are, well… human (Part 2)
In this case, that’s a good thing. Humans are great at making judgments based on context and cultural understanding. An algorithm can find a swastika, but only a human can say with certainty if it’s posted by a troll propagating hate speech or is instead a photo from World War II with historical significance.

3. We’re in a golden age of AI
Artificial intelligence is really, really good at detecting offensive images with near-perfect accuracy. For context, this wasn’t always the case. Even 10 years ago, image scanning technology was overly reliant on “skin tone” analysis, leading to some… interesting false positives.

Babies, being (sometimes) pink, round, and strangely out of proportion, would often trigger false positives.

And while some babies may not be especially adorable, it was a bit cruel to label them “offensive.”

Equally inoffensive but often the cause of false positives were light oak-coloured desks, chair legs, marathon runners, some (but not all) brick walls and, even more bizarrely, balloons.

Today, the technology has advanced so far that it can distinguish between bikinis, shorts, beach shots, scantily-clad “glamour” photography, and explicit adult material.

4. Human beings are, well… human (Part 3)
As we said, AI doesn’t yet have the capacity for shock, horror, or emotional distress of any kind.

Until our sudden inevitable overthrow by the machines, go ahead and let AI automatically reject images with a high probability of containing pornography, gore, or anything that could have a lasting effect on your users and your staff.

That way, human mods can focus on human stuff like reviewing user reports and interacting with the community.
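As a rough sketch of what this hybrid approach can look like in code, the routing below auto-rejects images the model is highly confident about, auto-approves clearly safe ones in real time, and sends everything in between to a human queue. The scoring function and thresholds are hypothetical placeholders, not any particular vendor’s API.

    # Hypothetical hybrid image-moderation routing; the scores and thresholds are illustrative only.
    def scan_image(image_bytes: bytes) -> dict:
        """Placeholder for an image-classification service returning risk scores from 0 to 1."""
        return {"pornography": 0.01, "gore": 0.00, "suggestive": 0.20}

    REJECT_AT = 0.95    # high-confidence offensive content: rejected automatically, no human sees it
    APPROVE_AT = 0.10   # every category below this: approved instantly, no waiting for review

    def route(image_bytes: bytes) -> str:
        scores = scan_image(image_bytes)
        if any(score >= REJECT_AT for score in scores.values()):
            return "reject"
        if all(score < APPROVE_AT for score in scores.values()):
            return "approve"
        return "human_review"   # the ambiguous middle, where context and judgment matter

    print(route(b"example image bytes"))  # -> "human_review" for the placeholder scores above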

5. It’s the easiest way to give your users an unforgettable experience
The social app market is already overcrowded. “The next Instagram” is released every day. In a market where platforms vie to retain users, it’s critical that you ensure positive user experiences.

With AI, you can approve and reject posts in real-time, meaning your users will never have to wait for their images to be reviewed.

And with human moderators engaging with the community — liking posts, upvoting images, and promptly reviewing and actioning user reports — your users will feel supported, safe, and heard.

You can’t put a price on that… no wait, you can. It’s called Customer Acquisition Cost (CAC), and it can make or break a business that struggles to retain users.

6. You’re leveraging the best of both worlds
AI is crazy fast, scanning millions of images a day. By contrast, humans can review about 2,500 images daily before their eyes start to cross and mistakes creep in. AI is more accurate than ever, but humans add precision by understanding context.

A solid image moderation process supported by cutting-edge tech and a bright, well-trained staff? You’re well on your way to Maximum Moderation Awesomeness™.

Want to learn how one social app combines automation with manual review to reduce their workload and increase user engagement? Sign up for our webinar featuring the community team from PopJam!



Connect With Us at the Crimes Against Children Conference in Dallas

For the second year in a row, Two Hat Security will be attending the Crimes Against Children Conference in Dallas, Texas as a sponsor. Founded in 1988, the conference brings together attendees from law enforcement, child protective services, and more to “provid[e] practical and interactive instruction to those fighting crimes against children and helping children heal.”

Last year, more than 4200 professionals attended CACC — a record for the conference and a sign of the growing need for these discussions, workshops, and training sessions.

Two Hat Security founder and CEO Chris Priebe and VP of Product Brad Leitch are hosting two sessions this year. Both sessions will provide investigators with a deeper understanding of the vital role artificial intelligence plays in the future of abuse investigations.

Session 1: Using Artificial Intelligence to Prioritize and Solve Crimes

Tuesday, August 8
1:45 PM – 3:00 PM
Location: Dallas D2

In this session, we explore what recent advances in artificial intelligence and machine learning mean for law enforcement. We’ll discuss how this new technology can be applied in a meaningful way to triage and solve cases faster. This is a non-technical session that will help prepare investigative teams for upcoming technology innovations.

Session 2: Beyond PhotoDNA — Detecting New Child Sexual Abuse Material With CEASE.ai

Wednesday, August 9
8:30 AM – 9:45 AM
Location: City View 8 (Exhibitor Workshop)

Traditionally, PhotoDNA has allowed organizations to detect already categorized child sexual abuse material (CSAM). Sadly, with new digital content being so easy to create and distribute worldwide, investigators have seen an epidemic of brand-new, never-seen CSAM being shared online.

CEASE is an AI model that uses computer vision to detect these new images. Our collaboration with the Royal Canadian Mounted Police has given our data scientists access to a permanent data set of confirmed CSAM, which we are using to train the model.

However, it’s still a work in progress. If you are a member of the law enforcement community or the technology industry, we need your expertise and vast knowledge to help shape this groundbreaking system.

Stop by our booth

We look forward to meeting fellow attendees and discussing potential collaboration and partnership opportunities. Visit us at booth 41 on the 2nd floor, next to Griffeye.

As we learned at the Protecting Innocence Hackathon in July, “If we want to protect the innocence of children, we have a responsibility to be transparent and collaborative.”

You can sign up for both workshops through the conference website. Feel free to email us at hello@twohat.com to set up a meeting.



Three Powerful Lessons We Learned at the Protecting Innocence Hackathon

You rarely hear about them, but every day brave investigators across the globe review the most horrific stories and images you can ever imagine. It’s called child sexual abuse material (known as CSAM in the industry), and it hides in the dark corners of the internet, waiting to be found.

The scope is dizzying. The RCMP-led National Child Exploitation Coordination Centre (NCECC) alone received 27,000 cases in 2016. And right now, it’s nearly impossible for officers to review those cases fast enough to prioritize the ones that require their immediate attention.

That’s why, on July 6th and 7th, volunteers from law enforcement, academia, and the tech industry came together to collaborate on solving this problem, perhaps the biggest problem of our time — how do we quickly, accurately, and efficiently detect online CSAM? Artificial intelligence gets smarter and more refined every day. How can we leverage those breakthroughs to save victimized children and apprehend their abusers?

Along with event co-sponsors the RCMP, Microsoft, and Magnet Forensics, we had a simple goal at the Protecting Innocence Hackathon: to bring together the brightest minds in our respective industries to answer these questions.

We ended up learning a few valuable lessons along the way.

It starts with education

Participants across all three disciplines learned from each other. Attendees from the tech industry and academia were given a crash course in grooming and luring techniques (as well as the psychology behind them) from law enforcement, the people who study them every day.

Make no mistake, these were tough lessons to learn — but with a deeper understanding of how predators attract their victims, we can build smarter, more efficient systems to catch them.

Law enforcement studied the techniques of machine learning and artificial intelligence — which in turn gave them a deeper understanding of the challenges facing data scientists, not to mention the need for robust and permanent datasets.

It’s crucial that we learn from each other. But that’s just the first step.

Nothing important happens without collaboration

Too often our industries are siloed, with every company, university, and agency working on a different project. Bringing professionals together from across these disciplines and encouraging them to share their diverse expertise, without reservations or fear, was a huge accomplishment, and an important lesson.

This isn’t a problem that can be solved alone. This is a 25,000,000-images-a-year problem. This is a problem that crosses industry, cultural, and country lines.

If we want to protect the innocence of children, we have a responsibility to be transparent and collaborative.

Just do it

Education and collaboration are commendable and necessary — but they don’t add up to much without actual results. Once you have the blueprints, you have no excuse not to build.

The great news? The five teams and 60+ participants made real, tangible progress.

Collectively, the teams built the following:

  • A proposed standard for internationally classifying and annotating child sexual exploitation images and text
  • A machine learning aggregation blueprint for both text and image classification
  • Machine learning models to detect sexploitation conversations, as well as image detection for age, anime, indoor and outdoor scenes, nudity, and CSAM

We cannot overstate the importance of these achievements. They are the first steps towards building the most comprehensive and accurate CSAM detection system the world has seen.

Not only that, the proposed global standard for classifying text and images, if accepted, will lead to even more accurate detection.

The future of CSAM detection is now

We actually learned a fourth lesson at the hackathon, perhaps the most powerful of them all: Everyone wants to protect and save children from predators. And they’re willing to work together, despite their differences, to make that happen.

At Two Hat Security, we’re using the knowledge shared by our collaborators to further train our artificial intelligence model CEASE and to refine our grooming and luring detection in Community Sift. And we’ll continue to work alongside our partners and friends in law enforcement, academia, and the tech industry to find smart solutions to big problems.

There are challenges ahead, but if everyone continues to educate, collaborate, and create, projects like CEASE and events like Protecting Innocence can and will make great strides. We hope that the lessons we learned will be applied by any agency, company, or university that hopes to tackle this issue.

Thank you again to our co-sponsors the RCMP, Microsoft, and Magnet Forensics. And to the Chief Enforcers, Code Warriors, and Data Mages who gave their time, their expertise, and their fearlessness to this event — your contributions are invaluable. You’re changing the world.

And to anyone who labors every day, despite the heartbreak, to protect children — thank you. You may work quietly, you may work undercover, and we may never know your names, but we see you. And we promise to support you, in every way we can.



How a Hackathon Hopes to Stop Online Child Exploitation

Every year, the National Center for Missing & Exploited Children reviews 25,000,000 images containing child sexual abuse material (CSAM).

How do you conceptualize a number like 25,000,000? It’s unthinkable.

For perspective, there are just over 24,000,000 people in Australia. The population of a large country — that’s how many vile, horrific, and disturbing images NCMEC had to review in a year.

In 2016, the Internet Watch Foundation found 57,335 URLs containing confirmed child sexual abuse imagery. 57,335 — that’s about the population of a mid-sized city like Watertown, NY.

Still not convinced of the epidemic?

How about this? Over half of the children depicted on those 57,335 URLs were aged 10 or younger.

We’ve all been ten years old. Many of us have ten-year-old children, or nieces, or nephews.

Now that’s unthinkable.

Protecting the helpless

These images aren’t going away.

That’s why we’ve spearheaded a hackathon taking place on July 6th and 7th in Vancouver, British Columbia, Canada. Sponsored by the RCMP, Microsoft, Magnet, and Two Hat Security, the Protecting Innocence Hackathon is an attempt to build a bridge between three diverse disciplines — law enforcement, academia, and the technology sector — for the greater good.

The goal is to work together to build technology and global policy that helps stop online child exploitation.

Teams from across all three industries will gather to work on a variety of projects, including:

  • designing a text classification system to identify child luring conversations
  • training an image classifier to identify child exploitation media
  • coordinating on a global protocol for sharing CSAM evidence between agencies
  • and more…

We are hopeful that by encouraging teamwork and partnerships across these three vital industries, we will come closer to ridding the internet of online child exploitation.

The beauty of a hackathon is that it’s a tried and true method for hacking away at tough problems in a short period. The time box encourages creativity, resourcefulness, critical thinking — and above all, collaboration.

We’re honored to be working on this. And we’re indebted to the RCMP, Microsoft, Magnet, and all the hackers attending for their selfless contributions.

Protecting the innocent

Forget incomprehensible numbers like 25,000,000 or 57,335. We’re doing this for all the ten-year-olds who’ve been robbed of their innocence.

Today, it’s easier than ever for predators to create and share pictures, videos, and stories. And every time those pictures, videos, and stories are shared, the victim is re-victimized.

It gets worse every year. The Internet Watch Foundation found that reports of child sexual abuse imagery rose by 417% between 2013 and 2015.

At Two Hat Security, we’re doing our part to fight the spread of illegal and immoral content. In collaboration with the RCMP and universities across the country, and with a generous grant from Mitacs, we’re building CEASE, an artificial intelligence model that can detect new CSAM.

But we can’t solve this problem alone.

So this July 6th and 7th, we salute the code warriors, chief enforcers, and data mages who are coming together to make a real difference in the world.

We hope you will too.

Just a little sneak peek…

***

Want to know more about CEASE? Read about the project here.

We believe in a world free of online bullying, harassment, and child exploitation. Find out how we’re making that vision a reality with our high-risk content detection system Community Sift.

We work with companies like ROBLOX, Animal Jam, and more to protect their communities from dangerous content.



To Mark Zuckerberg

Re: Building Global Communities

“There are billions of posts, comments and messages across our services each day, and since it’s impossible to review all of them, we review content once it is reported to us. There have been terribly tragic events — like suicides, some live streamed — that perhaps could have been prevented if someone had realized what was happening and reported them sooner. There are cases of bullying and harassment every day, that our team must be alerted to before we can help out. These stories show we must find a way to do more.” — Mark Zuckerberg

This is hard.

I built a company (Two Hat Security) that’s also contracted to process 4 billion chat messages, comments, and photos a day. We specifically look for high-risk content in real-time, such as bullying, harassment, threats of self-harm, and hate speech. It is not easy.

“There are cases of bullying and harassment every day, that our team must be alerted to before we can help out. These stories show we must find a way to do more.”

I must ask — why wait until cases get reported?

If you wait for a report to be filed by someone, haven’t they already been hurt? Some things that are reported can never be unseen. Some, like Amanda Todd, can never have that image retracted. Others post when they are enraged or drunk, and those words, like air, cannot be taken back. The saying goes, “What happens in Vegas stays in Vegas, Facebook, Twitter, and Instagram forever,” so maybe some things should never go live. What if you could proactively create a safe global community for people by preventing (or pausing) personal attacks in real time instead?

This, it appears, is key to creating the next vision point.

“How do we help people build an informed community that exposes us to new ideas and builds common understanding in a world where every person has a voice?”

One of the biggest challenges to free speech online in 2017 is that we allow a small group of toxic trolls the ‘right’ to silence a much larger group of people. Ironically, these users’ claim to free speech often ends up becoming hate speech and harassment, destroying the opportunity for anyone else to speak up, much like bullies in the lunchroom. Why would someone share their deepest thoughts if others would just attack them? The dream of real conversations gets lost beneath a blanket of fear, and instead we get puppy pictures, non-committal thumbs up, and posts that are ‘safe.’ If we want to create an inclusive community, people need to be able to share ideas and information online without fear of abuse from toxic bullies. I applaud your manifesto, as it calls this out and calls us all to work together to achieve this.

But how?

Fourteen years ago, we both set out to change the social networks of our world. We were both entrepreneurial engineers, hacking together experiments using the power of code. It was back in the days of MySpace and Friendster and, later, Orkut. We had to browse to every single friend we had on MySpace just to see if they had written anything new. To solve this I created myTWU, a social stream of all the latest blogs and photos from fellow students, alumni, and sports teams on our internal social tool. Our office was in charge of building online learning, but we realized that education is not about ideas but about community. It was not enough to dump curriculum online for independent study; people needed places of belonging.

A year later “The Facebook” came out. You reached beyond the walls of one University and over time opened it to the world.

So I pivoted. As part of our community, we had a little chat room where you could waddle around and talk to others. It was a skin of a little experiment my brother was running. He was caught by surprise when it grew to a million users, which showed how much users long for community and places of belonging. In those days chat rooms were the dark part of the web, and it was nearly impossible to keep up with the creative ways users tried to hurt each other.

So I was helping my brother code the safety mechanisms for his little social game. That little social game grew to become a global community with over 300 million users, and Disney bought it back in 2007. I remember huddling in my brother’s basement, rapidly building the backend to fix the latest trick to get around the filter. Club Penguin was huge.

After a decade of kids breaking the filter, and of building tools to moderate the millions upon millions of user reports, I had a breakthrough. By then I was in security at Disney, with the job of hacking everything with a Mouse logo on it. In my training, we learned that if someone DDoSes a network or tries to break the system, you find a signature of what they are doing and turn up the firewall against that.

“What if we did that with social networks and social attacks?” I thought.

I’ve spent the last five years building an AI system that applies those ideas of signatures and firewalls to social content. As we process billions of messages with Community Sift, we build reputation scores in real time. We know who the trolls are — they leave digital signatures everywhere they go. Moreover, I can adjust the AI to turn up the sensitivity only where it counts. In doing so, we drastically reduced false positives and kept communication open for the masses, while still detecting the highest-risk content when it matters.
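Here is a toy sketch of that signature-and-reputation idea. The names, scores, and thresholds are invented for illustration; this is not Community Sift’s actual implementation.

    # Toy sketch of reputation-weighted filtering (invented values, not Community Sift's code).
    from collections import defaultdict

    reputation = defaultdict(float)   # running risk score per user

    def message_risk(text: str) -> float:
        """Placeholder for signature matching; returns 0.0 (benign) to 1.0 (severe)."""
        if "<slur>" in text:
            return 1.0
        if "idiot" in text:
            return 0.5
        return 0.0

    def should_filter(user_id: str, text: str) -> bool:
        risk = message_risk(text)
        reputation[user_id] = 0.9 * reputation[user_id] + risk   # decay old behaviour, add new
        # Turn up the sensitivity only where it counts: users with a bad track record face a stricter bar.
        threshold = 0.3 if reputation[user_id] > 1.0 else 0.7
        return risk >= threshold

    print(should_filter("user-42", "you idiot"))  # -> False: a one-off medium-risk message gets through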

I had to build whole new AI algorithms to do this, since traditional methods only hit 90–95%. That is great for most AI tasks, but when it comes to cyberbullying, hate speech, and suicide, the stakes are too high for the current state of the art in NLP.

“To prevent harm, we can build social infrastructure to help our community identify problems before they happen. When someone is thinking of suicide or hurting themselves, we’ve built infrastructure to give their friends and community tools that could save their life.”

Since Two Hat is a security company, we are uniquely positioned to prevent harm with the largest vault of high-risk signatures, like grooming conversations and CSAM (child sexual abuse material). In collaboration with our partners at the RCMP (Royal Canadian Mounted Police), we are developing a system to predict and prevent child exploitation before it happens, complementing the efforts our friends at Microsoft have made with PhotoDNA. With CEASE.ai, we are training AI models to find CSAM, and we have lined up millions of dollars of Ph.D. research to give students world-class experience working with our team.

“Artificial intelligence can help provide a better approach. We are researching systems that can look at photos and videos to flag content our team should review. This is still very early in development, but we have started to have it look at some content, and it already generates about one-third of all reports to the team that reviews content for our community.”

It is incredible what deep learning has accomplished in the last few years. And although we have been able to achieve near-perfect recall in finding pornography with our current work, there is an explosion of new topics we are training on. Further, the subtleties you outline are key.

I look forward to two changes to resolve this:

  1. I call on networks to trust that their users have resilience. It is not imperative to find everything, just the worst. If all content can be sorted from maybe-bad to absolutely-bad, we can draw a line in the sand and say: these things can never be unseen, and those the community will find and report (see the sketch after this list). In so doing we don’t have to wait for technology to reach perfection, nor wait for users to report things we already know are bad. Let computers do what they do well and let humans deal with the rest.
  2. I call on users to be patient. Yes, sometimes in our ambition to prevent harm we may flag a Holocaust photo. We know this is terrible, but we ask for your patience. Computer vision is like a child still learning: a child who sees that image for the first time is deeply impacted and concerned. Join us in reporting these problems and helping train the system to mature and discern.
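Under the assumptions of point 1, the “line in the sand” might look something like the sketch below: score everything from maybe-bad to absolutely-bad, automatically remove what sits above the line, and leave the rest for the community to surface. The scoring function and cutoff value are invented for illustration.

    # Illustrative "line in the sand"; the severity scoring and cutoff are invented.
    CANNOT_BE_UNSEEN = 0.9   # above this line: removed automatically, never shown to anyone

    def severity(post: str) -> float:
        """Placeholder scoring from maybe-bad (near 0.0) to absolutely-bad (near 1.0)."""
        return 0.2

    def handle(post: str) -> str:
        if severity(post) >= CANNOT_BE_UNSEEN:
            return "remove"    # computers do what they do well
        return "publish"       # humans, and the community, deal with the rest

    print(handle("a borderline post"))  # -> "publish"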

However, you are right that many more strides need to happen to get this to where it needs to be. We need to call on the world’s greatest thinkers. Of all the hard problems to solve, our next one is child sexual abuse material (CSAM). Some things cannot be unseen. There are things that, when seen, re-victimize over and over again. We are the first to gain access to hundreds of thousands of CSAM images and to train deep learning models on them with CEASE.ai. We are pouring millions of dollars and putting the best minds on this topic. It is a problem that must be solved.

And before I move on, I want to give a shout-out to your incredible team, whom I have had the chance to volunteer with at hack-a-thons and who have helped me think through how to get this done. Your company’s commitment to social good is outstanding, and your team has helped many other companies and not-for-profits.

“The guiding principles are that the Community Standards should reflect the cultural norms of our community, that each person should see as little objectionable content as possible, and each person should be able to share what they want while being told they cannot share something as little as possible. The approach is to combine creating a large-scale democratic process to determine standards with AI to help enforce them.”

That is cool. I have a couple of the main pieces needed for that already built, if you need them.

“The idea is to give everyone in the community options for how they would like to set the content policy for themselves. Where is your line on nudity? On violence? On graphic content? On profanity?”

I had the chance to swing by Twitter 18 months ago. I took their sample firehose and have been running it through our system. We label each message against 1.8 million of our signatures, and I put together a quick demo of what it would be like if you could turn off the toxicity on Twitter. It shows low-, medium-, and high-risk content. I would not expect to see anything severe on there, as they have recently tried to clean it up.

My suggestion to Twitter was to allow each user the option to choose what they want to see. A global policy would first get rid of clear infractions against the terms of use: content that can never be unseen, such as gore or CSAM. After the global policy is applied, you can then let each user choose their own risk and tolerance levels.
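A small sketch of that two-layer idea, with invented labels and thresholds: the global policy strips out content that can never be unseen, and each user’s own tolerance setting filters the rest.

    # Invented labels and ranks: global policy first, then per-user tolerance.
    RISK_RANK = {"low": 1, "medium": 2, "high": 3, "never_unseen": 4}

    def label_message(text: str) -> str:
        """Placeholder for signature-based labelling; returns one of the keys in RISK_RANK."""
        return "medium"

    def visible_to(text: str, user_tolerance: str) -> bool:
        label = label_message(text)
        if label == "never_unseen":       # global policy: gore, CSAM, etc. are never shown
            return False
        return RISK_RANK[label] <= RISK_RANK[user_tolerance]

    print(visible_to("some tweet", "low"))    # -> False: this user opted out of medium-risk content
    print(visible_to("some tweet", "high"))   # -> True: this user sees everything the policy allows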

We are committed to helping you and the Facebook team with your mission to build a safe, supportive, and inclusive community. We are already discussing ways we can help your team, and we are always open to feedback. Good luck on your journey to connect the world, and I hope we cross paths next time I am in the Valley.

Sincerely,
Chris Priebe
CEO, Two Hat Security

 

Originally published on Medium