User reputation is a powerful tool in the fight against hate speech. If a user has a history of racism, you can prioritize reviewing — and removing — their posts above others.
The way we approach this with Community Sift is to apply a series of lenses to the reported content — internally, we call this ‘classification.’ We assess the content on a sliding scale of risk, note the frequency of user-submitted reports, the context of the message (public vs. large group vs. small group vs. 1:1), and the speaker’s reputation. Note that at this point in the process we have not done anything yet other than label the data. Now it is time to do something with it.
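As a rough illustration of the labeling step, the sketch below records the four signals named above without acting on them. It is not Community Sift's actual schema: the 0 to 10 risk scale, the 0.0 to 1.0 reputation range, and the names `Context`, `Label`, and `classify` are all assumptions made for this example.

```python
from dataclasses import dataclass
from enum import Enum


class Context(Enum):
    """Audience of the message, from widest to narrowest."""
    PUBLIC = "public"
    LARGE_GROUP = "large_group"
    SMALL_GROUP = "small_group"
    ONE_TO_ONE = "one_to_one"


@dataclass(frozen=True)
class Label:
    risk: int          # assumed scale: 0 (harmless) to 10 (severe)
    report_count: int  # number of user-submitted reports
    context: Context   # where the message was posted
    reputation: float  # assumed scale: 0.0 (trusted) to 1.0 (repeat offender)


def classify(risk: int, report_count: int, context: Context, reputation: float) -> Label:
    """Validate the signals and attach them as a label. No action is taken here."""
    if not 0 <= risk <= 10:
        raise ValueError("risk must be between 0 and 10")
    if report_count < 0:
        raise ValueError("report_count cannot be negative")
    if not 0.0 <= reputation <= 1.0:
        raise ValueError("reputation must be between 0.0 and 1.0")
    return Label(risk, report_count, context, reputation)


label = classify(risk=8, report_count=3, context=Context.PUBLIC, reputation=0.9)
```

Keeping the label immutable and separate from any action mirrors the point above: at this stage the data has only been described, not acted upon.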
Step 2 — Take automatic action.
After we label the data, we can place it into three distinct ‘buckets.’ The vast majority (around 95%) will fall under ‘obviously good’, since social media predominantly consists of pictures of kittens, food, and reposted jokes. Just as there is the ‘obviously good,’ however, there is also the ‘obviously bad’. Everything that falls between the two makes up the third bucket, which we return to in Step 3.
In this case, think of the system like anti-virus technology. Every day, people are creating new ways to mess up your computer. Cybersecurity companies dedicate their time to finding the latest malware signatures so that when one comes to you, it is automatically removed. Similarly, our company uses AI to find new social signatures by processing billions of messages across the globe for our human professionals to review. The manual review is critical to reducing false positives. Just like with antivirus technology, you do not want to delete innocuous content on people’s computers, which is a very common mistake.
Now that we have labeled almost everything as either ‘obviously good’ or ‘obviously bad,’ we can prioritize which messages to address first.
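The bucketing in this step could be sketched roughly as follows. The thresholds, the 0 to 10 risk scale, the bucket name `NEEDS_REVIEW`, and the action strings are all hypothetical values chosen for illustration. A real system would tune them against reviewed data.

```python
from enum import Enum


class Bucket(Enum):
    OBVIOUSLY_GOOD = "obviously_good"
    NEEDS_REVIEW = "needs_review"    # assumed name for the middle bucket
    OBVIOUSLY_BAD = "obviously_bad"


# Assumed thresholds on an assumed 0-10 risk scale.
GOOD_MAX_RISK = 2
BAD_MIN_RISK = 9


def assign_bucket(risk: int) -> Bucket:
    """Map a risk score to one of the three buckets."""
    if risk <= GOOD_MAX_RISK:
        return Bucket.OBVIOUSLY_GOOD
    if risk >= BAD_MIN_RISK:
        return Bucket.OBVIOUSLY_BAD
    return Bucket.NEEDS_REVIEW


def automatic_action(bucket: Bucket) -> str:
    """Only the two 'obvious' buckets are handled without a human."""
    actions = {
        Bucket.OBVIOUSLY_GOOD: "approve",
        Bucket.OBVIOUSLY_BAD: "remove",
        Bucket.NEEDS_REVIEW: "queue_for_human",
    }
    return actions[bucket]
```

Keeping the thresholds conservative is one way to limit false positives: anything the system is unsure about goes to a person instead of being removed automatically.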
Step 3 — Create prioritized queues for human action.
Computers are great at finding the good and the bad, but what about all the stuff in the middle? Currently, the best practice is to crowdsource judgment by allowing your users to report content. Human moderation of some kind is key to maintaining and training a quality workflow to eliminate hate speech. The challenge is going to be getting above the noise of bad reports and dealing with the urgent ones right away.
Remember the Stephen Covey model of time management? Instead of only using a simple chronologically sorted list of hate speech reports, we want to provide humans with a streamlined list of items to act on quickly, with the most important items at the top of the list.
The second list focuses on high-risk, time-sensitive content. These are rare events, so this work queue is kept minuscule. Content enters when the system thinks it is high-risk, but cannot be sure; or, when users report content that is right on the border of triggering the conditions necessary for a rating of ‘obviously bad.’ The result is a prioritized queue that humans can stay on top of and remove content from in minutes instead of days.
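A prioritized queue of this kind could be sketched with a heap, as below. The scoring formula and its weights are assumptions made for this example, as are the names `priority` and `ReviewQueue`. The only point is that reviewers always pull the most urgent item first, not the oldest one.

```python
import heapq
import itertools


def priority(risk: int, report_count: int, reputation: float) -> float:
    """Assumed scoring: risk dominates, reports and reputation break ties.

    risk is on an assumed 0-10 scale, and reputation runs from 0.0 (trusted)
    to 1.0 (repeat offender). Reports are capped so that mass-reporting
    alone cannot outrank genuinely high-risk content.
    """
    return risk * 10 + min(report_count, 20) + reputation * 5


class ReviewQueue:
    """Max-priority queue; ties are served in arrival order."""

    def __init__(self) -> None:
        self._heap: list[tuple[float, int, str]] = []
        self._arrival = itertools.count()

    def push(self, message_id: str, score: float) -> None:
        # heapq is a min-heap, so the score is negated to pop the highest first.
        heapq.heappush(self._heap, (-score, next(self._arrival), message_id))

    def pop(self) -> str:
        _, _, message_id = heapq.heappop(self._heap)
        return message_id

    def __len__(self) -> int:
        return len(self._heap)


queue = ReviewQueue()
queue.push("msg-a", priority(risk=8, report_count=2, reputation=0.9))
queue.push("msg-b", priority(risk=5, report_count=30, reputation=0.1))
queue.push("msg-c", priority(risk=8, report_count=2, reputation=0.2))
order = [queue.pop() for _ in range(len(queue))]
```

Here `msg-b` has the most reports but the lowest risk, so it is reviewed last. Between the two equally risky messages, the speaker's reputation decides the order.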
In our case, we invest millions of dollars a year in continual refinement and improvement with human professionals, so product owners don’t have to. We take care of all that complexity so product owners can get back to the fun stuff instead, like making more amazing social products.
Step 4 — Take human action.
Product owners could use crowdsourced, outsourced, or internal moderation to handle these queues, though this depends on the scale and available resources within the team. The important thing is to take action as fast as humanly possible, starting with the questionable content that the computers cannot catch.
Step 5 — Train artificial intelligence based on decisions.
To manage the volume of reported content for a platform like Facebook or Twitter, you need to employ some level of artificial intelligence. By setting up the moderation AI to learn from human decisions, the system becomes increasingly effective at automatically detecting and taking action against emerging issues. The more precise the automation, the faster the response.
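One simple way to picture this feedback loop is a rule table that promotes a content signature to automatic handling once enough human decisions agree on it. This is a deliberately minimal sketch, not a description of how Community Sift trains its system. The class `DecisionLearner`, the notion of a string "signature", and both thresholds are assumptions made for this example.

```python
from collections import defaultdict
from typing import Optional


class DecisionLearner:
    """Promote a signature to automatic action once humans consistently agree."""

    def __init__(self, min_decisions: int = 5, min_agreement: float = 0.9) -> None:
        self.min_decisions = min_decisions  # assumed evidence threshold
        self.min_agreement = min_agreement  # assumed consensus threshold
        # signature -> [times removed, times kept]
        self._counts: dict[str, list[int]] = defaultdict(lambda: [0, 0])

    def record(self, signature: str, removed: bool) -> None:
        """Store one moderator decision for a signature."""
        self._counts[signature][0 if removed else 1] += 1

    def auto_action(self, signature: str) -> Optional[str]:
        """Return 'remove' or 'approve' if the evidence is strong, else None."""
        removed, kept = self._counts.get(signature, (0, 0))
        total = removed + kept
        if total < self.min_decisions:
            return None  # not enough human decisions yet
        if removed / total >= self.min_agreement:
            return "remove"
        if kept / total >= self.min_agreement:
            return "approve"
        return None  # moderators disagree, so keep a human in the loop


learner = DecisionLearner(min_decisions=3, min_agreement=0.9)
for _ in range(3):
    learner.record("sig-1", removed=True)
learner.record("sig-2", removed=True)
learner.record("sig-2", removed=False)
learner.record("sig-2", removed=True)
```

With these settings `sig-1` would now be removed automatically, while `sig-2` stays in the human queue because moderators were split on it.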
After five years of dedicated research in this field, we’ve learned a few tricks.
Machine learning AI is a powerful tool. But when it comes to processing language, it’s far more efficient to use a combination of a well-trained human team working alongside an expert system AI.
By applying the methodology above, it is now within our grasp to remove hate speech from social platforms almost instantly. Prejudice is an issue that affects everyone, and in an increasingly connected world, it affects everyone in real time. We have to get this right.