The Future of Image Moderation: Why We’re Creating Invisible AI (Part Two)

Yesterday, we announced that Two Hat has acquired image moderation service ImageVision. With the addition of ImageVision’s technology to our existing image recognition tech stack, we’ve boosted our filter accuracy — and are determined to push image moderation to the next level.

Today, Two Hat CEO and founder Chris Priebe discusses why ImageVision was the ideal choice for a technology acquisition— and how he hopes to change the landscape of image moderation in 2019.

We were approached by ImageVision over a year ago. Their founder Steven White has a powerful story that led him to found the company (it’s his to tell so I won’t share). His story resonated with me and my own journey of why I founded Two Hat. He spent over 10 years perfecting his art. His clients included Facebook, Yahoo, Flickr, and Apple. That is 10 years of experience and over $10 million in investment to solve the problem of accurately detecting pornographic images.

Of course 10 years ago we all did things differently. Neural networks weren’t popular yet. Back then, you would look at how much skin tone was in an image. You looked at angles and curves and how they relate to each other. ImageVision made 185 of these hand-coded features.

Later they moved on to neural networks but ImageVision did something amazing. They took their manually coded features and fed both them and the pixels into the neural network. And they got a result different from what everyone else was doing at the time.
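To make the idea of feeding both hand-coded features and pixels into the same network a little more concrete, here is a minimal Python sketch. The skin-tone rule, the function names, and the single `skin_ratio` feature are assumptions invented for this illustration; they are not ImageVision's actual features or architecture.

```python
from typing import List, Tuple

Pixel = Tuple[int, int, int]  # (R, G, B), each 0-255


def looks_like_skin(pixel: Pixel) -> bool:
    """Crude RGB threshold rule, for illustration only (an assumption)."""
    r, g, b = pixel
    return (
        r > 95 and g > 40 and b > 20
        and max(pixel) - min(pixel) > 15
        and abs(r - g) > 15
        and r > g and r > b
    )


def skin_ratio(pixels: List[Pixel]) -> float:
    """Hand-coded feature: fraction of pixels that pass the skin rule."""
    if not pixels:
        return 0.0
    return sum(looks_like_skin(p) for p in pixels) / len(pixels)


def build_model_input(pixels: List[Pixel]) -> List[float]:
    """Concatenate scaled raw pixel values with the hand-coded features."""
    raw = [channel / 255.0 for pixel in pixels for channel in pixel]
    handcrafted = [skin_ratio(pixels)]  # a real system would have many more
    return raw + handcrafted


# A tiny made-up "image" of four pixels.
image = [(220, 170, 140), (30, 60, 200), (210, 160, 130), (10, 10, 10)]
features = build_model_input(image)
print(len(features), features[-1])
```

The point of the sketch is only the last step: the network sees the raw pixels and the expert-designed measurements side by side, so it can learn from both.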

Now here is the reality — there is no way I’m going to hire people to write nearly 200 manually coded features in this modern age. And yet the problem of child sexual abuse imagery is so important that we need to throw every resource we can at it. It’s not good enough to only prevent 90% of exploitation — we need all the resources we can get.

Like describing an elephant

So we did a study. We asked, “What would happen if we took several image detectors and mixed them together? Would they give a better answer than any alone?”

It’s like the story of several blind men describing an elephant. One describes a tail, another a trunk, another a leg. They each think they know what an elephant looks like, but until they start listening to each other they’ll never actually “see” the real elephant. Likewise in AI, some systems are good at finding one kind of problem and another at another problem. What if we trained another model (called an ensemble) to figure out when each of them is right?
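Here is a rough sketch of what training a model over other models can look like. It is a toy, not our production system: the two detectors, their scores, and the logistic-regression meta-model are all assumptions chosen to keep the example small. Detector A only spots one kind of image, detector B only the other, and the ensemble learns how to weigh them.

```python
import math
from typing import List, Tuple

# Each row: (score from detector A, score from detector B), label (1 = NSFW).
# Made-up data: A only spots the first kind of image, B only the second.
TRAINING: List[Tuple[Tuple[float, float], int]] = [
    ((0.9, 0.1), 1), ((0.8, 0.2), 1),   # positives only A catches
    ((0.1, 0.9), 1), ((0.2, 0.8), 1),   # positives only B catches
    ((0.1, 0.1), 0), ((0.2, 0.1), 0),
    ((0.1, 0.2), 0), ((0.2, 0.2), 0),
]


def sigmoid(z: float) -> float:
    return 1.0 / (1.0 + math.exp(-z))


def train_ensemble(rows, steps: int = 2000, lr: float = 1.0):
    """Fit a logistic-regression meta-model on the detectors' scores."""
    weights, bias = [0.0, 0.0], 0.0
    n = len(rows)
    for _ in range(steps):
        grad_w, grad_b = [0.0, 0.0], 0.0
        for scores, label in rows:
            z = bias + sum(w * s for w, s in zip(weights, scores))
            error = sigmoid(z) - label
            for i, s in enumerate(scores):
                grad_w[i] += error * s
            grad_b += error
        # Full-batch gradient descent step.
        weights = [w - lr * g / n for w, g in zip(weights, grad_w)]
        bias -= lr * grad_b / n
    return weights, bias


def ensemble_flags(scores, weights, bias) -> bool:
    """True when the meta-model thinks the image is NSFW."""
    z = bias + sum(w * s for w, s in zip(weights, scores))
    return sigmoid(z) >= 0.5


weights, bias = train_ensemble(TRAINING)
caught_by_a = sum(1 for s, y in TRAINING if y == 1 and s[0] >= 0.5)
caught_by_ensemble = sum(
    1 for s, y in TRAINING if y == 1 and ensemble_flags(s, weights, bias)
)
false_alarms = sum(
    1 for s, y in TRAINING if y == 0 and ensemble_flags(s, weights, bias)
)
print(caught_by_a, caught_by_ensemble, false_alarms)
```

On this toy data, detector A alone catches only half of the positives, while the ensemble learns to listen to whichever detector is confident.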

For our study, we took 30,000 pornographic images and 55,000 clean images. We used ImageVision images since they include many that are really hard to classify; the kind of images you might actually see in real life and not just in a lab experiment. The big cloud providers found between 89% and 98% of the 30,000 pornographic images, while the precision rate was around 95-98% for all of them (precision refers to the proportion of positive identifications that are correct).
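For readers who want to see how those two numbers are calculated, here is a small Python sketch. The counts are hypothetical, picked only to match the size of the study's pornographic set; they are not the results of any real detector.

```python
def precision(true_pos: int, false_pos: int) -> float:
    """Share of flagged images that really were pornographic."""
    flagged = true_pos + false_pos
    return true_pos / flagged if flagged else 0.0


def recall(true_pos: int, false_neg: int) -> float:
    """Share of pornographic images that the detector found."""
    actual = true_pos + false_neg
    return true_pos / actual if actual else 0.0


# Hypothetical detector run on 30,000 pornographic and 55,000 clean images.
true_pos = 28_500   # pornographic images correctly flagged
false_neg = 1_500   # pornographic images missed
false_pos = 1_500   # clean images wrongly flagged

print(f"found (recall): {recall(true_pos, false_neg):.1%}")
print(f"precision:      {precision(true_pos, false_pos):.1%}")
```

The share of images "found" is what is usually called recall; the two measures can move independently, which is why both are reported.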

We were excited that our current system found most of the images, but we wanted to do better.

For the project, we had to create a bunch of weak learners to find CSAM. Detecting CSAM is such a huge problem that we needed to throw everything we could at it. So we ensembled the weak learners all together to see what would happen — and we got another 1% of accuracy, which is huge because the gap from 97% to 100% is the hardest to close.

But how do you close the last 2%? This is where millions of dollars and decades of experience are critical. This is where we must acquire and merge every trick in the book. When we took ImageVision’s work and merged it with our own, we squeezed out another 1%. And that’s why we bought them.

We’re working on a white paper where we’ll present our findings in further detail. Stay tuned for that soon.

The final result

So if we bought ImageVision, not only would we gain 10 years of experience, multiple patents, and over $10 million in technology, but we would be the best NSFW detector in the industry. And if we added that into our CSAM detector (along with age detection, face detection, body part detection, and abuse detection) then we could push that accuracy even closer and hopefully save more kids from the horrors of abuse. Spending money to solve this problem was a no-brainer for us.

Today, we’re on the path to making AI invisible.

Learn more about Priebe’s groundbreaking vision of artificial intelligence in an on-demand webinar. He shares more details about the acquisition and the content moderation trends that will dominate 2019. Register to watch the webinar here.

Further reading:

Part One of The Future of Image Moderation: Why We’re Creating Invisible AI
Official ImageVision acquisition announcement.

The Future of Image Moderation: Why We’re Creating Invisible AI (Part One)

In December and early January, we teased exciting Two Hat news coming your way in the new year. Today, we’re pleased to share our first announcement of 2019 — we have officially acquired ImageVision, an image recognition and visual search company. With the addition of ImageVision’s groundbreaking technology, we are now poised to provide the most accurate NSFW image moderation service in the industry.

We asked Two Hat CEO and founder Chris Priebe to discuss the ambitious technology goals that led to the acquisition. Here is part one of that discussion:

The future of AI is all about quality. Right now the study of images is still young. Anyone can download TensorFlow or PyTorch, feed it a few thousand images and get a model that gets things right 80-90% of the time. People are excited about that because it seems magical: “They fed a bunch of images into a box and it gave an answer that was surprisingly right most of the time!” But even if you get 90% right, you are still getting 10% wrong.

Think of it this way: If you do 10 million images a day that is a million mistakes. A million times someone tried to upload a picture that was innocent and meaningful to them and they had to wait for a human to review it. That is one million images humans need to review. We call those false positives.
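The arithmetic is simple enough to sketch in a few lines of Python. The function name and the extra accuracy levels are just illustrations added here; only the 10 million images and the 90% figure come from the example above.

```python
def daily_mistakes(images_per_day: int, accuracy: float) -> int:
    """Number of images a model gets wrong per day at a given accuracy."""
    return round(images_per_day * (1.0 - accuracy))


for accuracy in (0.90, 0.99, 0.999):
    mistakes = daily_mistakes(10_000_000, accuracy)
    print(f"{accuracy:.1%} accurate: {mistakes:,} mistakes a day")
```

Even at 99.9% accuracy, a platform of that size would still be left with thousands of wrong answers every day.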

Worse than false positives are false negatives, where someone uploads an NSFW (not safe for work) picture or video and it isn’t detected. Hopefully, it was a mature adult who saw it. Even if it was an adult, they weren’t expecting to see adult content, so their trust in the site is in jeopardy. They’re probably less likely to encourage a friend to join them on the site or app.

Worse if it was a child who saw it. Worst of all if it is a graphic depiction of a child being abused.

Protecting children is the goal

That last point is closest to our heart. A few years ago we realized that what really keeps our clients awake at night is the possibility someone will upload child sexual abuse material (CSAM; also known as child exploitive imagery, or CEI, and formerly called child pornography) to their platform. We began a long journey to solve that problem. It began with a hackathon where we gathered some of the largest social networks in the world with international law enforcement and academia all in the same room and attempted to build a solution together.

So AI must mature. We need to get beyond a magical box that’s “good enough” and push it until AI becomes invisible. What do I mean by invisible? For us, that means you don’t even notice that there is a filter because it gets it right every time.

Today, everyone is basically doing the same thing, like what I described earlier — label some NSFW images and throw them at the black box. Some of us are opening up the black box and changing the network design to hotrod the engine, but for the most part it’s a world of “good enough”.

Invisible AI

But in the future, “good enough” will no longer be tolerated. The bar of expectation will rise and people will expect it to just work. From that, we expect companies to hyper-specialize. Models will be trained that do one thing really, really well. Instead of a single model that answers all questions, there will be groups of hyper-specialists with a final arbiter over them deciding how to best blend all their opinions together to make AI invisible.

We want to be at the top of the list for those models. We want to be the best at detecting child abuse, bullying, sextortion, grooming, and racism. We are already top of the market in several of those fields and trusted by many of the largest games and social sharing platforms. But we can do more.

Solving the biggest problems on the internet

That’s why we’ve turned our attention to acquiring. These problems are too big, too important to have a “not built here, not interested” attitude. If someone else has created a model that brings new experience to our answers, then we owe it to our future to embrace every advantage we can get.

Success for me means that one day my children will take for granted all the hard work we’re doing today. That our technology will be invisible.

In part two, Chris discusses why ImageVision was the ideal choice for a technology acquisition— and how he hopes to change the landscape of image moderation in 2019.

Sneak peek:

“It’s like the story of several blind men describing an elephant. One describes a tail, another a trunk, another a leg. They each think they know what an elephant looks like, but until they start listening to each other they’ll never actually “see” the real elephant. Likewise in AI, some systems are good at finding one kind of problem and another at another problem. Could we train another model (called an ensemble) to figure out when each of them is right?”


Read the official ImageVision acquisition announcement

Top 6 Reasons You Should Combine Automation and Manual Review in Your Image Moderation Strategy

When you’re putting together an image moderation strategy for your social platform, you have three options:

  1. Automate everything with AI;
  2. Do everything manually with human moderators, or
  3. Combine both approaches for Maximum Moderation Awesomeness™

When consulting with clients and industry partners like PopJam, unsurprisingly, we advocate for option number three.

Here are our top six reasons why:

1. Human beings are, well… human (Part 1)
We get tired, we take breaks, and we don’t work 24/7. Luckily, AI hasn’t gained sentience (yet), so we don’t have to worry (yet) about an algorithm troubling our conscience when we make it work without rest.

2. Human beings are, well… human (Part 2)
In this case, that’s a good thing. Humans are great at making judgments based on context and cultural understanding. An algorithm can find a swastika, but only a human can say with certainty if it’s posted by a troll propagating hate speech or is instead a photo from World War II with historical significance.

3. We’re in a golden age of AI
Artificial intelligence is really, really good at detecting offensive images with near-perfect accuracy. For context, this wasn’t always the case. Even 10 years ago, image scanning technology was overly reliant on “skin tone” analysis, leading to some… interesting false positives.

Babies, being (sometimes) pink, round, and strangely out of proportion would often trigger false positives.

And while some babies may not be especially adorable, it was a bit cruel to label them “offensive.”

Equally inoffensive but often the cause of false positives were light oak-coloured desks, chair legs, marathon runners, some (but not all) brick walls, and, even more bizarrely, balloons.

Today, the technology has advanced so far that it can distinguish between bikinis, shorts, beach shots, scantily-clad “glamour” photography, and explicit adult material.

4. Human beings are, well… human (Part 3)
As we said, AI doesn’t yet have the capacity for shock, horror, or emotional distress of any kind.

Until our sudden inevitable overthrow by the machines, go ahead and let AI automatically reject images with a high probability of containing pornography, gore, or anything that could have a lasting effect on your users and your staff.

That way, human mods can focus on human stuff like reviewing user reports and interacting with the community.

5. It’s the easiest way to give your users an unforgettable experience
The social app market is already overcrowded. “The next Instagram” is released every day. In a market where platforms vie to retain users, it’s critical that you ensure positive user experiences.

With AI, you can approve and reject posts in real-time, meaning your users will never have to wait for their images to be reviewed.

And with human moderators engaging with the community — liking posts, upvoting images, and promptly reviewing and actioning user reports — your users will feel supported, safe, and heard.

You can’t put a price on that… no wait, you can. It’s called Customer Acquisition Cost (CAC), and it can make or break a business that struggles to retain users.

6. You’re leveraging the best of both worlds
AI is crazy fast, scanning millions of images a day. By contrast, humans can scan about 2500 images daily before their eyes start to cross and they make a lot of mistakes. AI is more accurate than ever, but humans provide enhanced precision by understanding context.

A solid image moderation process supported by cutting-edge tech and a bright, well-trained staff? You’re well on your way to Maximum Moderation Awesomeness™.

Want to learn how one social app combines automation with manual review to reduce their workload and increase user engagement? Sign up for our webinar featuring the community team from PopJam!

Optimize Your Image Moderation Process With These Five Best Practices

If you run or moderate a social sharing site or app where users can upload their own images, you know how complex image moderation can be.

We’ve compiled five best practices that will make you and your moderation team’s lives a lot easier.

1. Create robust internal moderation guidelines
While you’ll probably rely on AI to automatically approve and reject the bulk of submitted images, there will be images that an algorithm misses, or that users have reported as being inappropriate. In those cases, it’s crucial that your moderators are well-trained and have the resources at their disposal to make what can sometimes be difficult decisions.

Remember the controversy surrounding Facebook earlier this year when they released their moderation guidelines to the public? Turns out, their guidelines were so convoluted and thorny that it was near-impossible to follow them with any consistency. (To be fair, Facebook faces unprecedented challenges when it comes to image moderation, including incredibly high volumes and billions of users from all around the world.) There’s a lesson to be learned here, though, which is that internal guidelines should be clear and concise.

Consider — you probably don’t allow pornography on your platform, but how do you feel about bathing suits or lingerie? And what about drugs — where do you draw the line? Do you allow images of pills? Alcohol?

Moderation isn’t a perfect science; there will always be grey areas.

2. Consider context
When you’re deciding whether to approve or reject an image that falls into the grey area, remember to look at everything surrounding the image. What is the user’s intent with posting the image? Is their intention to offend? Look at image tags, comments, and previous posts.

3. Be consistent when approving/rejecting images and sanctioning users
Your internal guidelines should ensure that you and your team make consistent, replicable moderation decisions. Consistency is so important because it signals to the community that 1) you’re serious about their health and safety, and 2) you’ve put real thought and attention into your guidelines.

A few suggestions for maintaining consistency:

  • Notify the community publicly if you ever change your moderation guidelines
  • Consider publishing your internal guidelines
  • Host moderator debates over challenging images and ask for as many viewpoints as possible; this will help avoid biased decision-making
  • When rejecting an image (even if it’s done automatically by the algorithm), automate a warning message to the user that includes community guidelines
  • If a user complains about an image rejection or account sanction, take the time to investigate and fully explain why action was taken

4. Map out moderation workflows
Take the time to actually sketch out your moderation workflows on a whiteboard. By mapping out your workflows, you’ll notice any holes in your process.

Here are just a few scenarios to consider:

  • What do you do when a user submits an image that breaks your guidelines? Do you notify them? Sanction their account? Do nothing and let them submit a new image?
  • Do you treat new users differently than returning users (see example workflow for details)?
  • How do you deal with images containing CSAM (child sexual abuse material; formerly referred to as child pornography)?

Coming across an image that contains illegal content can be deeply disturbing.

5. Have a process to escalate illegal images
The heartbreaking reality of the internet is that it’s easier today for predators to share images than it has ever been. It’s hard to believe that your community members would ever upload CSAM, but it can happen, and you should be prepared.

If you have a Trust & Safety specialist, Compliance Officer, or legal counsel at your company, we recommend that you consult them for their best practices when dealing with illegal imagery. One option to consider is using Microsoft’s PhotoDNA, a free image scanning service that can automatically identify and escalate known child sexual abuse images to the authorities.

You may never find illegal content on your platform, but having an escalation process will ensure that you’re prepared for the worst-case scenario.

On a related note, make sure you’ve also created a wellness plan for your moderators. We’ll be discussing individual wellness plans, and other best practices, in more depth in our Image Moderation 101 webinar on August 22nd. Register today to save your seat for this short, 20-minute chat.

The Role of Image Filtering in Shaping a Healthy Online Community

Digital citizenship, online etiquette, and user behaviour involve many different tools of expression, from texting to photo sharing, and from voice chat to video streaming. In my last article, I wrote about who is responsible for the well-being of players/users online. Many of the points discussed relate directly to the challenges posed by chat communication.

However, those considerations also apply to image sharing on our social platforms, as well as to the intent behind it.

Picture this
Online communities that allow users to share images have to deal with several risks and challenges that come with the very nature of the beast; meaning, creating and/or sharing images is a popular form of online expression, there’s no shortage of images, and they come in all shapes, flavours, and forms.

Unsurprisingly, you’re bound to encounter images that will challenge your community guidelines (think racy pictures without obvious nudity), while others will simply be unacceptable (for example, pornography, gore, or drug-related imagery).

Fortunately, artificial intelligence has advanced to a point where it can do things that humans cannot; namely, handle incredibly high volumes while maintaining high precision and accuracy.

This is not to say that humans are dispensable. Far from that. We still need human eyes to make the difficult, nuanced decisions that machines alone can’t yet make.

For example, let’s say a user is discussing history with another user and wants to share a historical picture related to hate speech. Without the appropriate context, a machine could simply identify a hateful symbol on a flag and automatically block the image, stopping them from sharing it.

Costs and consequences
Without an automated artificial intelligence system for image filtering, a company is looking at two liabilities:

  • An unsustainable, unscalable model that will incur a manual cost connected to human moderation hours;
  • Increased psychological impact of exposing moderators to excessive amounts of harmful images

The power of artificial intelligence
Automated image moderation can identify innocuous images and automate their approval. It can also identify key topics (like pornographic content and hateful imagery) with great accuracy and block them in real time, or hold them for manual review.

By using automation, you can remove two things from your moderators’ plates:

  • Context-appropriate images (most images: fun pictures with friends smiling, silly pictures, pets, scenic locations, etc.)
  • Images that are obviously against your community guidelines (think pornography or extremely gory content)

Also, a smart system can serve up images in the grey area to your moderators for manual review, which means way less content to review than the two scenarios explored above. By leveraging automation you will have less manual work (reduced workload, therefore reduced costs) and less negative impact on your moderation team.
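One way to picture this three-way split is as a pair of thresholds on a risk score. The sketch below is only an illustration: the score range, the threshold values, and the names are assumptions, and every community would tune them to its own guidelines.

```python
from enum import Enum


class Decision(Enum):
    APPROVE = "approve"
    REVIEW = "human_review"
    REJECT = "reject"


def route(risk_score: float,
          approve_below: float = 0.2,
          reject_above: float = 0.9) -> Decision:
    """Send an image to auto-approval, auto-rejection, or a human moderator.

    risk_score is assumed to be a probability-like value between 0 and 1.
    The default thresholds are placeholders, not recommended settings.
    """
    if not 0.0 <= risk_score <= 1.0:
        raise ValueError("risk_score must be between 0 and 1")
    if risk_score >= reject_above:
        return Decision.REJECT
    if risk_score < approve_below:
        return Decision.APPROVE
    return Decision.REVIEW  # the grey area goes to a person


for score in (0.03, 0.55, 0.97):
    print(score, route(score).value)
```

Moving the two thresholds closer together shrinks the manual review queue; moving them apart sends more borderline images to your moderators.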

Give humans a break
Automated image moderation can also take the emotional burden off of your human moderators. Imagine yourself sitting in front of a computer for hours and hours, reviewing hundreds or even thousands of images, never knowing when your eyes (and mind) will be assaulted by a pornographic or very graphic violent image. Now consider the impact this has week after week.

What if a big part of that work can be taken by an automated system, drastically reducing the workload, and with that the emotional impact of reviewing offensive content? Why wouldn’t we seek to improve our team’s working situation and reduce employee burnout and turnover?

It is not only a business-critical thing to do. It also means taking better care of your people and supporting them. This is key to company culture.

An invitation
Normally, I talk and write about digital citizenship as it relates to chat and text. Now, I’m excited to be venturing into the world of images and sharing as much valuable insight as I can with all of you. After all, image sharing is an important form of communication and expression in many online communities.

It would be great if you could join me for a short, 20-minute webinar we are offering on Wednesday, August 22nd. I’ll be talking about actionable best practices you can put to good use as well as considering what the future may hold for this space. You can sign up here.

I’m looking forward to seeing you there!

Originally published on LinkedIn by Carlos Figueiredo, Two Hat Director of Community Trust & Safety

Webinar: Image Moderation 101

Wondering about the latest industry trends in image moderation? Need to keep offensive and unwanted images out of your community — but no idea where to start?

Join us for 20 minutes on Wednesday, August 22 for an intimate chat with Carlos Figueiredo, Two Hat Director of Community Trust & Safety.

In this 20-minute chat, we’ll cover:

  • Why image moderation is business-critical for sharing sites in 2018
  • An exclusive look at our industry-proven best practices
  • A sneak peek at the future of image moderation… will there be robots?

Sign up today to save your seat!

How To Prevent Offensive Images From Appearing in Your Social Platform

If you manage a social platform like an Instagram or a Tumblr, you’ll inevitably face the task of having to remove offensive UGC (user-generated content) from your website, game, or app.

At first, this is simple, with only the occasional inappropriate image or three to remove. Since it seems like such a small issue, you just delete the offending images as needed. However, as your user base grows, so does the percentage of users who refuse to adhere to your terms of use.

There are some fundamental issues with human moderation:

  • It’s expensive. It costs much more to review images manually, as each image needs to be reviewed by flawed human eyes.
  • Moderators get tired and make mistakes. As you throw more pictures at people, they tend to get sick of looking for needles in haystacks and start to get fatigued.
  • Increased risk. If your platform allows for ‘instant publishing’ without an approval step, then you take on the additional risk of exposing users to offensive images.
  • Unmanageable backlogs. The more users you have, the more content you’ll receive. If you’re not careful, you can overload your moderators with massive queues full of stuff to review.
  • Humans aren’t scalable. When you’re throwing human time at the problem, you’re spending human resource dollars on things that aren’t about your future.
  • Stuck in the past. If you’re spending all of your time moderating, you’re wasting precious time reacting to things rather than building for the future.

At Two Hat, we believe in empowering humans to make purposeful decisions with their time and brain power. We built Community Sift to take care of the crappy stuff so you don’t have to. That’s why we’ve worked with leading professionals and partners to provide a service that automatically assesses and prioritizes user-generated content based on probable risk levels.

Do you want to build and maintain your own anti-virus software and virus signatures?

Here’s the thing: you could go and build some sort of image system in-house to evaluate the risk of incoming UGC. But here’s a question for you: would you create your own anti-virus system just to protect yourself from viruses on your computer? Would you make your own project management system just because you need to manage projects? Or would you build a bug-tracking database system just to track bugs? In the case of anti-virus software, that would be kind of nuts. After all, if you create your own anti-virus software, you’re the first one to get infected with new viruses as they emerge. And humans are clever… they create new viruses all the time. We know because that’s what we deal with every day.

Offensive images are much like viruses. Instead of having to manage your own set of threat signatures, you can just use a third-party service and decrease the scope required to keep those images at bay. By using an automated text and image classification system on your user-generated content, you can protect your users at scale, without the need for an army of human moderators leafing through the content.

Here are some offensive image types we can detect:

  • Pornography
  • Graphic Violence
  • Weapons
  • Drugs
  • Custom Topics
Example image analysis result


Some benefits to an automated threat prevention system like Community Sift:

  • Decreased costs. Reduces moderation queues by 90% or more.
  • Increased efficiency. Prioritized queues for purposeful moderation, sorted by risk.
  • Empowers automation. Instead of pre-moderating or reacting after inappropriate images are published, you can let the system filter or prevent the images from being posted in the first place.
  • Increased scalability. You can grow your community without worrying about the scope of work required to moderate the content.
  • Safer than managing it yourself. In the case of Community Sift, we’re assessing images, videos, and text across multiple platforms. You gain a lot from the network effect.
  • Shape the community you want. You can educate your user base proactively. For example, instead of just accepting inbound pornographic images, you can warn the user that they are uploading content that breaks your terms of use. A warning system is one of the most practical ways to encourage positive user behavior in your app.
  • Get back to what matters. Instead of trying to tackle this problem, you can focus on building new features and ideas. Let’s face it… that’s the fun stuff, and that’s where you should be spending your time — coming up with new features for the community that’s gathered together because of your platform.
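To give a feel for what a queue "sorted by risk" means in practice, here is a small Python sketch. It is not how Community Sift is implemented; the class name, the image names, and the risk scores are all made up for illustration.

```python
import heapq
import itertools
from typing import List, Tuple


class ReviewQueue:
    """Moderation queue that always hands out the riskiest image first."""

    def __init__(self) -> None:
        self._heap: List[Tuple[float, int, str]] = []
        self._counter = itertools.count()  # keeps arrival order on ties

    def push(self, image_id: str, risk: float) -> None:
        # heapq is a min-heap, so store the negated risk to pop highest first.
        heapq.heappush(self._heap, (-risk, next(self._counter), image_id))

    def pop(self) -> str:
        _, _, image_id = heapq.heappop(self._heap)
        return image_id

    def __len__(self) -> int:
        return len(self._heap)


queue = ReviewQueue()
queue.push("beach.jpg", 0.35)
queue.push("reported.jpg", 0.92)
queue.push("meme.png", 0.55)

order = [queue.pop() for _ in range(len(queue))]
print(order)
```

With a queue like this, moderators spend their first minutes on the content most likely to hurt the community, rather than on whatever happened to arrive first.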

In the latest release of the Community Sift image classification service, the system has been built from the ground up with our partners using machine learning and artificial intelligence. This new incarnation of the image classifier was trained on millions of images to be able to distinguish between a pornographic photo and a picture of a skin-colored donut, for example.

Classifying images can be tricky. In earlier iterations of our image classification service, the system wrongly believed that plain, glazed donuts and fingernails were pornographic since both image types contained a skin tone color. We’ve since fixed this, and the classifier is now running at a 98.14% detection rate and a 0.32% false positive rate for pornography. The remaining 1.86%? Likely blurry images or pictures taken from a distance.
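To see what those two rates mean for a real queue, it helps to run them against a batch of images. The sketch below does that in Python. The batch size and the assumption that 1% of uploads are pornographic are invented for the example; only the 98.14% and 0.32% figures come from the text above.

```python
def expected_outcomes(total_images: int,
                      prevalence: float,
                      detection_rate: float,
                      false_positive_rate: float) -> dict:
    """Estimate catches, misses, and false alarms for a batch of uploads."""
    bad = round(total_images * prevalence)
    clean = total_images - bad
    caught = round(bad * detection_rate)
    false_alarms = round(clean * false_positive_rate)
    flagged = caught + false_alarms
    return {
        "caught": caught,
        "missed": bad - caught,
        "false_alarms": false_alarms,
        "precision": caught / flagged if flagged else 0.0,
    }


# Assumed: 1,000,000 uploads, 1% of them pornographic.
result = expected_outcomes(1_000_000, 0.01, 0.9814, 0.0032)
print(result)
```

Because clean images usually far outnumber offensive ones, even a very small false positive rate can produce a noticeable number of false alarms, which is why both rates matter.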

On the image spectrum, some content is so severe it will always be filtered — that’s the 98.14%. Some content you will see again and again, and requires that action be taken on the user, like a ban or suspension — that’s when we factor in user reputation. The more high-risk content they post, the closer we look at their content.

Some images are on the lower end of the severity spectrum. In other words, there is less danger if they appear on the site briefly, are reported, and then removed — that’s the 1.86%.

By combining the image classifier with the text classifier, Community Sift can also catch less-overt pornographic content. Some users may post obscene text within a picture instead of an actual photo, while other users might try to sneak in a picture with an innuendo, but with a very graphic text description.
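A simple way to think about combining the two classifiers is to score each part of a post separately and act on the riskiest part. The Python sketch below assumes the scores already exist; the field names, the example values, and the choice of taking the maximum are assumptions for illustration, not a description of how Community Sift blends its classifiers.

```python
from dataclasses import dataclass


@dataclass
class ContentScores:
    image: float          # risk from the image classifier
    embedded_text: float  # risk of any text found inside the picture
    caption: float        # risk of the accompanying text description


def combined_risk(scores: ContentScores) -> float:
    """Treat the post as being as risky as its riskiest part."""
    return max(scores.image, scores.embedded_text, scores.caption)


# Obscene text rendered inside an otherwise harmless picture.
text_in_picture = ContentScores(image=0.05, embedded_text=0.95, caption=0.10)
# Suggestive picture paired with a very graphic description.
graphic_caption = ContentScores(image=0.40, embedded_text=0.00, caption=0.90)

print(combined_risk(text_in_picture), combined_risk(graphic_caption))
```

In both made-up cases the image score alone would have let the post through, while the combined view flags it.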

Keeping on top of incoming user-generated content is a huge amount of work, but it’s absolutely worth the effort. In some of the studies conducted by our Data Science team, we’ve observed that users who engage in social interactions are 3x more likely to continue using your product and less likely to leave your community.

By creating a social platform that allows people to share ideas and information, you have the ability to create connections between people from all around the world.

Community is built through connections between like-minded individuals who bond through shared interests. The relationships between people in a community are strengthened and harder to break when individuals come together through shared beliefs. MMOs like World of Warcraft and Ultima Online mastered the art of gaming communities, resulting in long-term businesses rather than short-term wins.

To learn more about how we help shape healthy online communities, reach out to us anytime. We’d be happy to share more about our vision to create a harassment-free, healthy social web.