The Future of Image Moderation: Why We’re Creating Invisible AI (Part Two)
Yesterday, we announced that Two Hat has acquired image moderation service ImageVision. With the addition of ImageVision’s technology to our existing image recognition tech stack, we’ve boosted our filter accuracy — and are determined to push image moderation to the next level.
Today, Two Hat CEO and founder Chris Priebe discusses why ImageVision was the ideal choice for a technology acquisition, and how he hopes to change the landscape of image moderation in 2019.
We were approached by ImageVision over a year ago. Their founder Steven White has a powerful story that led him to found the company (it’s his to tell, so I won’t share it). His story resonated with me and my own journey of why I founded Two Hat. He spent over 10 years perfecting his art. His clients included Facebook, Yahoo, Flickr, and Apple. That is 10 years of experience and over $10 million in investment put toward solving the problem of accurately detecting pornographic images.
Of course, 10 years ago we all did things differently. Neural networks weren’t popular yet. Back then, you would look at how much skin tone was in an image. You looked at angles and curves and how they related to each other. ImageVision built 185 of these hand-coded features.
Later they moved on to neural networks, but ImageVision did something amazing. They took their manually coded features and fed both those features and the raw pixels into the neural network. And they got a different result from everyone else at the time.
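To make the idea concrete, here is a minimal Python sketch of feeding a hand-coded feature into a model alongside the raw pixels. Everything in it is an assumption for illustration: the `is_skin_tone` rule is a crude RGB heuristic, and the names `skin_fraction` and `build_model_input` are hypothetical. It is not ImageVision's feature set or architecture, just the general shape of the technique.

```python
from typing import List, Tuple

Pixel = Tuple[int, int, int]  # (R, G, B), each channel in 0-255


def is_skin_tone(pixel: Pixel) -> bool:
    """Crude, illustrative RGB rule for 'looks like skin' (an assumption,
    not a production heuristic)."""
    r, g, b = pixel
    return (
        r > 95 and g > 40 and b > 20
        and r > g and r > b
        and (r - min(g, b)) > 15
    )


def skin_fraction(pixels: List[Pixel]) -> float:
    """Hand-coded feature: share of pixels that match the skin-tone rule."""
    if not pixels:
        return 0.0
    return sum(is_skin_tone(p) for p in pixels) / len(pixels)


def build_model_input(pixels: List[Pixel]) -> List[float]:
    """Concatenate normalized raw pixel values with the hand-coded features,
    so a downstream network sees both."""
    raw = [channel / 255.0 for p in pixels for channel in p]
    hand_coded = [skin_fraction(pixels)]  # a real system would add many more
    return raw + hand_coded


if __name__ == "__main__":
    tiny_image = [(200, 150, 120), (10, 10, 10)]  # two pixels
    vector = build_model_input(tiny_image)
    print(len(vector), vector[-1])  # 7 values; the last one is the feature
```

The point of the sketch is the last function: the network's input is the pixels plus the features, rather than one or the other.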
Now here is the reality — there is no way I’m going to hire people to write nearly 200 manually coded features in this modern age. And yet the problem of child sexual abuse imagery is so important that we need to throw every resource we can at it. It’s not good enough to only prevent 90% of exploitation — we need all the resources we can get.
Like describing an elephant
So we did a study. We asked, “What would happen if we took several image detectors and mixed them together? Would they give a better answer than any alone?”
It’s like the story of several blind men describing an elephant. One describes a tail, another a trunk, another a leg. They each think they know what an elephant looks like, but until they start listening to each other they’ll never actually “see” the real elephant. Likewise in AI, some systems are good at finding one kind of problem, while others are good at finding another. What if we trained another model on top of them (an approach called ensembling) to figure out when each of them is right?
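Here is a rough Python sketch of that idea, sometimes called stacking: a small model is trained on the scores of the base detectors rather than on the images themselves. The scores in `SAMPLES` are made up for illustration, and the logistic regression meta-model and the function names are assumptions, not a description of the system used in the study.

```python
import math
from typing import List, Tuple

# Hypothetical scores from two base detectors on the same labeled images:
# (score_a, score_b, label), where label 1 means "should be flagged".
# Detector A catches the first two positives, detector B the next two.
SAMPLES: List[Tuple[float, float, int]] = [
    (0.9, 0.2, 1), (0.8, 0.1, 1),
    (0.2, 0.9, 1), (0.1, 0.8, 1),
    (0.1, 0.1, 0), (0.2, 0.2, 0),
    (0.3, 0.1, 0), (0.1, 0.3, 0),
]


def sigmoid(z: float) -> float:
    return 1.0 / (1.0 + math.exp(-z))


def train_meta_model(
    samples: List[Tuple[float, float, int]],
    steps: int = 5000,
    lr: float = 1.0,
) -> Tuple[float, float, float]:
    """Fit a logistic regression on detector scores with batch gradient descent."""
    w_a = w_b = bias = 0.0
    n = len(samples)
    for _ in range(steps):
        grad_a = grad_b = grad_bias = 0.0
        for a, b, y in samples:
            err = sigmoid(w_a * a + w_b * b + bias) - y
            grad_a += err * a
            grad_b += err * b
            grad_bias += err
        w_a -= lr * grad_a / n
        w_b -= lr * grad_b / n
        bias -= lr * grad_bias / n
    return w_a, w_b, bias


def ensemble_flags(model: Tuple[float, float, float], a: float, b: float) -> bool:
    """Combine both detector scores into a single flag/no-flag decision."""
    w_a, w_b, bias = model
    return sigmoid(w_a * a + w_b * b + bias) >= 0.5


if __name__ == "__main__":
    model = train_meta_model(SAMPLES)
    for a, b, y in SAMPLES:
        print(a, b, y, ensemble_flags(model, a, b))
```

On this toy data, detector A alone (flagging at a score of 0.5 or higher) catches only two of the four positives, and the same is true of detector B. The meta-model learns to trust whichever detector is confident, which is the "listening to each other" part of the elephant story.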
For our study, we took 30,000 pornographic images and 55,000 clean images. We used ImageVision's images because the collection is full of really hard cases: the kind of images you might actually see in real life, not just in a lab experiment. The big cloud providers found between 89% and 98% of the 30,000 pornographic images (their recall), while the precision rate was around 95-98% for all of them (precision refers to the proportion of positive identifications that are correct).
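For readers who want the arithmetic behind those two measures, here is a small Python sketch. The 30,000 figure comes from the study above, and 26,700 is simply 89% of it (the low end of the reported range). The false positive count of 1,400 is a hypothetical number chosen for illustration, since the study's raw counts are not given here.

```python
def recall(true_positives: int, false_negatives: int) -> float:
    """Share of all actual positives that the detector found."""
    return true_positives / (true_positives + false_negatives)


def precision(true_positives: int, false_positives: int) -> float:
    """Share of the detector's positive calls that were correct."""
    return true_positives / (true_positives + false_positives)


if __name__ == "__main__":
    total_positives = 30_000           # pornographic images in the study
    found = 26_700                     # 89% of 30,000
    missed = total_positives - found   # 3,300 images slip through
    wrongly_flagged = 1_400            # hypothetical count of clean images flagged

    print(f"recall:    {recall(found, missed):.3f}")
    print(f"precision: {precision(found, wrongly_flagged):.3f}")
```

With these assumed counts, recall works out to 0.89 and precision to roughly 0.95. A detector can score well on one measure and poorly on the other, which is why both are reported.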
We were excited that our current system found most of the images, but we wanted to do better.
For the CEASE.ai project, we had to create a number of weak learners to find CSAM. Detecting CSAM is such a huge problem that we needed to throw everything we could at it. So we ensembled all the weak learners together to see what would happen, and we got another 1% of accuracy, which is huge because the gap from 97% to 100% is the hardest to close.
But how do you close the last 2%? This is where millions of dollars and decades of experience are critical. This is where we must acquire and merge every trick in the book. When we took ImageVision’s work and merged it with our own, we squeezed out another 1%. And that’s why we bought them.
We’re working on a white paper where we’ll present our findings in further detail. Stay tuned for that soon.
The final result
So if we bought ImageVision, not only would we gain 10 years of experience, multiple patents, and over $10 million in technology, but we would also have the best NSFW detector in the industry. And if we added that to our CSAM detector (along with age detection, face detection, body part detection, and abuse detection), then we could push that accuracy even closer to 100% and hopefully save more kids from the horrors of abuse. Spending money to solve this problem was a no-brainer for us.
Today, we’re on the path to making AI invisible.
Learn more about Priebe’s groundbreaking vision of artificial intelligence in an on-demand webinar. He shares more details about the acquisition, CEASE.ai, and the content moderation trends that will dominate 2019. Register to watch the webinar here.