Recently, I have read commentaries in response to the excellent New York Times series on child sexual abuse. One particular point that was raised inspired me to write this article: the claim that existing technologies are not sophisticated enough to stop predators online, and that artificial intelligence systems alone might provide a solution. In desperate times, when the horrid truth of online child sexual abuse (there’s no such thing as child pornography) and the staggering increase in images and videos being shared are crushing our collective spirits, it’s understandable that we will look for a silver bullet.
While I can’t even begin to understand the suffering victims go through, I have been aware of the horrors children face in their daily lives since I started working in online safety 11 years ago. I have attended two editions of the very important Crimes Against Children Conference in Dallas (2017 and 2019), participated in a law enforcement and industry roundtable in 2017, taken part in a hackathon the same year, and attended an eye-opening workshop in Australia last September, where I learned more about the incredible work of the Canadian Centre for Child Protection. Even after attending all of those events, I still know very little about the subject, and all my respect goes to the people who have been doing the work of keeping children safer for decades now.
What is it going to take to get ahead of this challenge, then? Is it artificial intelligence? Is it better legislation? Is it hiring more moderators? I hope to show you that no single tool or approach (or even a single company, for that matter), be it machine learning or rule-based/keyword-based detection, can keep children safer online on its own.
But I also intend to leave you with your hopes renewed.
Two Hat’s CEO, Chris Priebe, likes to say that we need the right tools to solve the right challenges: if all we have is a hammer, everything looks like a nail, and we will certainly fail to keep users safer online. Online behaviour is multi-faceted and complex, and it must be understood and approached holistically. Understanding user behaviour as a big picture means understanding the relationship between all points of interaction on a platform, from username creation to the text, images, videos, and audio a user posts. Content moderation data points are key as well. How many times did a user post risky content in the last 24 hours? How many times were they reported in the last 24 hours, in the last week, or in the last x days? And how often did those reports result in a moderation action?
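To make those data points concrete, here is a minimal sketch in Python of how per-user counts over a time window might be tracked. It is an illustration only, not a description of any particular product: the class name `UserActivityLog` and the event labels `"risky_post"`, `"reported"`, and `"actioned"` are hypothetical, and a real system would keep this in a database rather than in memory.

```python
from collections import defaultdict
from datetime import datetime, timedelta


class UserActivityLog:
    """Stores timestamped moderation events per user (in memory, for illustration)."""

    def __init__(self):
        # user_id -> event_type -> list of timestamps
        self._events = defaultdict(lambda: defaultdict(list))

    def record(self, user_id, event_type, when):
        # event_type is a hypothetical label: "risky_post", "reported", "actioned"
        self._events[user_id][event_type].append(when)

    def count(self, user_id, event_type, window, now):
        """Number of events of this type within `window` before `now`."""
        cutoff = now - window
        timestamps = self._events[user_id][event_type]
        return sum(1 for t in timestamps if cutoff < t <= now)

    def action_rate(self, user_id, window, now):
        """Share of reports in the window that led to a moderation action."""
        reports = self.count(user_id, "reported", window, now)
        actions = self.count(user_id, "actioned", window, now)
        return actions / reports if reports else 0.0
```

With this in place, the same log answers "how many reports in the last 24 hours" and "how many in the last week" simply by passing a different `timedelta` as the window.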
As we know, language alone can be a very bad proxy for intent. For example, a child asking another child if they are home alone does not necessarily ring an alarm bell. It’s critical to gauge user behaviour across a time span. What are the leading indicators that show you a user is potentially engaging in grooming activities? Can you gauge that by detecting that they have also been requesting personally identifiable information (PII)? That alone is inconclusive, but when you pair PII requests with known phrases used to build trust and normalize child-adult relationships, you have better signals to consider and can go on higher alert. That brings up another crucial point. We need to involve experts who work in this area to help us understand how groomers chat online. Real-world data is absolutely imperative in building data sets, and grooming data is rare. Access to that data can be even rarer.
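The idea of pairing signals over a time span can be sketched roughly as follows. This is a simplified illustration under stated assumptions: the signals are assumed to have already been labelled upstream (by filters, models, or moderators), the labels `"pii_request"` and `"trust_building"` and the alert levels are hypothetical, and a real system would weigh many more indicators with expert input.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass(frozen=True)
class Signal:
    user_id: str
    kind: str        # hypothetical label, e.g. "pii_request" or "trust_building"
    when: datetime


# Hypothetical pair of signals that is more telling together than apart.
PAIRED = {"pii_request", "trust_building"}


def alert_level(signals, user_id, window, now):
    """Return "high" if both paired signals occur in the window, "watch" if one does."""
    cutoff = now - window
    kinds = {
        s.kind
        for s in signals
        if s.user_id == user_id and cutoff < s.when <= now
    }
    if PAIRED <= kinds:      # both signals present in the same time span
        return "high"
    if PAIRED & kinds:       # one signal alone is inconclusive
        return "watch"
    return "none"
```

The output here is only a prioritization hint for human review; neither level is a verdict on its own.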
Artificial intelligence is a critical weapon in our arsenal. However, it can’t be the only weapon. If only users chatted naturally online – then natural language processing would truly change the game. We’ve coined the term unNatural Language Processing to explain how millions of users subvert a filter when they know one is in place. They use Unicode, invisible characters, upside-down characters, l33t spe4k, a mix of capital and lowercase letters, added accents to letters, dots between letters, and coded language. Not to mention the classic manipulations young audiences use: beach, shirt, truck (all words used in context instead of their much less gracious equivalents).
Or sometimes it’s as simple as: “wh4Ts Ur insta?” or “w.h.e.r.e do U liv3?”
All of these things can throw machine learning systems off. And not only that: to account for constant changes in the dynamic landscape of online speech, companies have to retrain those models regularly to keep them relevant. It’s an arms race. When you have a community crisis or a child in peril, you simply don’t have the luxury of time. You need to make filter changes on the spot. Fifteen minutes later can be too late.
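To illustrate how some of the manipulations described above can be undone with rules that are editable on the spot, here is a minimal normalization sketch in Python. It is an assumption-laden example, not the filter any company actually runs: the `LEET` table and the separator list are hypothetical and deliberately tiny, upside-down characters and coded language are not handled, and the blanket digit mapping would mangle legitimate numbers.

```python
import re
import unicodedata

# Hypothetical leet-speak table; real lists are far larger and context-aware.
LEET = str.maketrans({"4": "a", "3": "e", "1": "l", "0": "o",
                      "5": "s", "7": "t", "@": "a", "$": "s"})

# Common zero-width / invisible characters.
INVISIBLE = re.compile(r"[\u200b\u200c\u200d\u2060\ufeff]")

# Single letters joined by separators, e.g. "w.h.e.r.e".
SPACED = re.compile(r"\b(?:[a-z][.\-_*]){2,}[a-z]\b")


def normalize(text):
    """Reduce common obfuscations so downstream rules or models see plainer text."""
    # Fold compatibility forms (e.g. fullwidth letters) and split off accents.
    text = unicodedata.normalize("NFKD", text)
    # Drop combining marks such as added accents.
    text = "".join(ch for ch in text if not unicodedata.combining(ch))
    text = INVISIBLE.sub("", text)
    # Flatten mixed case, then map leet digits and symbols to letters.
    text = text.lower().translate(LEET)
    # Remove separators placed between single letters.
    return SPACED.sub(lambda m: re.sub(r"[.\-_*]", "", m.group(0)), text)


print(normalize("wh4Ts Ur insta?"))       # whats ur insta?
print(normalize("w.h.e.r.e do U liv3?"))  # where do u live?
```

Because the tables are plain data, a moderator-facing tool could add a new substitution in minutes, which is the kind of on-the-spot change a retraining cycle cannot offer.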
Furthermore, how companies design their platforms and user interaction features is an indispensable part of the solution. Is the platform environment conducive to the very behaviours we want to prevent? Why add private chat as a feature? What communication and interaction need does it fulfill? Does it open the door for abuse and, if so, how can that be mitigated?
There are also the human and technical moderation processes and policies companies need to put in place. Keeping users safer online is a complex challenge and, as such, requires the representation of multiple individuals and disciplines within a company: legal counsel, product development, community, customer support, PR, trust & safety, and more. We can’t do this in isolation, and neither can we put all the onus on technology alone.
We need humans – and technology
At this point you can probably guess that we cannot protect children online without the critical expertise of humans. Every platform needs well-trained moderators who know how to follow a solid moderation process and use a suite of tools and techniques to monitor a community and foster a safer environment. Without them, we surely have no hope. They bring a touch of human empathy, understanding, and nuance that machines lack. For instance, A.I. still needs humans to distinguish hoax from reality.
Humans alone are not the answer either. With billions and billions of pieces of user-generated content being shared online every day, manual moderation can never be scalable or sustainable. At Two Hat, we believe in empowering super moderators with the right tools so they can focus on purposeful moderation, letting the machine deal with the obviously good and the obviously bad. Giving moderators access to the right tools for the right moderation activity is crucial.
Filters are only one tool and online safety requires a variety of tools. As one of our clients once said to us in an in-office visit:
“You are a Community Protection Service that empowers us to protect our users… The filter is just a lock on the door, but what you really offer is the lock, video surveillance cameras, alarms, etc.”
This is why we’ve developed the Five Layers of Community Protection, a framework for online community protection that we believe is just one piece of the puzzle when it comes to protecting children online.
Amongst many techniques, we leverage a rule-based, linguistic-pattern-driven system that’s curated by highly trained native language specialists who understand both formal and informal language (as opposed to relying on direct translations from English to other languages) and who carefully review the filter and improve its quality every day. Our algorithms surface trends for the Language & Culture team to review, providing them with a data-driven and systematic approach to QA. We also leverage machine learning in how we help clients process user-generated reports and automate image and video review. And we have an incredible research team that, as I write this, is working on new data science models to help us stay ahead of the curve. My bi-weekly meeting with that team constantly amazes me.
At Two Hat Security, we don’t claim we have all the answers. We believe we bring a piece of the puzzle alongside the very important pieces that our clients and partners provide. After all, they know their platforms and users better than anyone else. We recognize that academia and government also play a key role. It’s a shared responsibility, and we stand with our industry friends. I can especially testify to the amazing dedication of several gaming companies that walk the walk, have amazing teams, and use a combination of internal and external tools to deal with the major problem of child exploitation. They invest resources, time, and millions of dollars to address this. Every day.
When it comes to child protection, a lot of us in the industry believe in collaboration and, ultimately, the non-competitive nature of this mission.
Can we do more? Yes.
And I believe that’s exactly what is going to happen. I’m optimistic about 2020. I have countless conversations going on right now with industry, academia, and government, and everyone wants to act. We are taking on our share of the responsibility by standardizing the terminology around child exploitation behaviours through a cross-discipline collaboration, and by partnering with organizations, law enforcement, and governments around the world to develop technology and processes to combat child exploitation. We will stand by our friends, and we will be relentless in protecting children online. I know you will too. Shoot me a message if you want to talk about this.
Regardless of whether we are connected, if you believe you could benefit from one of the free community audits I have been offering to the industry, please reach out.
I would love to see if I can help you spot any opportunities for positive changes in your content moderation strategy.