Three Ways Social Networks Can Embrace Safety by Design Today

Earlier this month, the Australian eSafety Office released its Safety by Design (SbD) Principles. As explained on its website, SbD is an “initiative which places the safety and rights of users at the centre of the design, development and deployment of online products and services.” It outlines three simple but comprehensive principles (service provider responsibilities, user empowerment & autonomy, and transparency & accountability) that social networks can follow to embed user safety into their platforms from the design phase onwards.

With this ground-breaking initiative, Australia has proven itself to be at the forefront of championing innovative approaches to online safety.

I first connected with the eSafety Office back in November 2018, and later had the opportunity to consult on Safety by Design. I was honored to be part of the consultation process and to bring some of my foundational beliefs around content moderation to the table. At Two Hat, we’ve long advocated for a Safety by Design approach to building social networks.

Many of the points in both the Safety by Design Principles and the UK’s recent Online Harms white paper support the Trust & Safety practices we’ve been recommending to clients for years, such as leveraging filters and cutting-edge technology to triage user reports. And we’ve heartily embraced new ideas, like transparency reports, which Australia and the UK both strongly recommend in their respective papers.

As I read the SbD overview, I had a few ideas for clear, actionable measures that social networks across the globe can implement today to embrace Safety by Design. The first two fall under SbD Principle 1, and the third under SbD Principle 3.

Under SbD Principle 1: Service provider responsibilities
“Put processes in place to detect, surface, flag and remove illegal and harmful conduct, contact and content with the aim of preventing harms before they occur.”

Content filters are no longer a “nice to have” for social networks – today, they’re table stakes. When I first started in the industry, many people assumed that only children’s sites required filters. And until recently, only the most innovative and forward-thinking companies were willing to leverage filters in products designed for older audiences.

That’s all changed – and the good news is that you don’t have to compromise freedom of expression for user safety. Today’s chat filters (like Two Hat’s Community Sift) go beyond allow/disallow lists, and instead allow for intelligent, nuanced filtering of online harms that takes into account various factors, including user reputation and context. And they can do it well in multiple languages, too. As a Portuguese and English speaker, I find this particularly dear to my heart.

All social networks can and should implement chat, username, image, and video filters today. How they use them, and the extent to which they block, flag, or escalate harms will vary based on community guidelines and audience.
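To make that concrete, here is a minimal sketch in Python. It is not Two Hat’s implementation – the severity scale, reputation scores, audience labels, and thresholds are all illustrative assumptions – but it shows how a filter decision might weigh message severity against user reputation and audience before choosing to allow, flag, block, or escalate.

```python
from enum import Enum

class Action(Enum):
    ALLOW = "allow"
    FLAG = "flag"          # publish, but queue for human review
    BLOCK = "block"        # prevent the message from being published
    ESCALATE = "escalate"  # route to a priority queue for urgent review

def moderate_message(severity: int, user_reputation: float, audience: str) -> Action:
    """Decide what to do with a single chat message.

    severity        -- 0 (benign) to 10 (severe/illegal), from rules or a classifier
    user_reputation -- 0.0 (history of abuse) to 1.0 (consistently positive)
    audience        -- "under_13", "teen", or "adult"
    """
    # Younger audiences get stricter thresholds, per community guidelines.
    block_threshold = {"under_13": 3, "teen": 5, "adult": 7}[audience]

    # The most severe content is escalated no matter who posted it.
    if severity >= 9:
        return Action.ESCALATE

    # A poor reputation lowers the bar for blocking borderline content.
    adjusted = severity + (1.0 - user_reputation) * 2

    if adjusted >= block_threshold:
        return Action.BLOCK
    if adjusted >= block_threshold - 2:
        return Action.FLAG
    return Action.ALLOW

# A borderline message from a low-reputation user in a teen community gets blocked:
print(moderate_message(severity=4, user_reputation=0.2, audience="teen"))  # Action.BLOCK
```

The same message from a high-reputation user might only be flagged, which is the point: context and history matter, not just the words themselves.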

Also under SbD Principle 1: Service provider responsibilities
“Put in place infrastructure that supports internal and external triaging, clear escalation paths and reporting on all user safety concerns, alongside readily accessible mechanisms for users to flag and report concerns and violations at the point that they occur.”

As the first layer of protection and user safety, baseline filters are critical. But users should always be encouraged to report content that slips through the cracks. (Note that when social networks automatically filter the most abusive content, they’ll have fewer reports.)

But what do you do with all of that reported content? Some platforms receive thousands of reports a day. Putting everything from false reports (users testing the system, reporting their friends, etc.) to serious, time-sensitive content like suicide threats and child abuse into the same bucket is inefficient and ineffective.

That’s why we recommend implementing a mechanism to classify and triage reports so moderators purposefully review the high-risk ones first, while automatically closing false reports. We’ve developed technology called Predictive Moderation that does just this. With Predictive Moderation, we can train AI to take the same actions moderators take consistently and reduce manual review by up to 70%.
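As a rough illustration only – this is not Predictive Moderation itself, and the categories, risk scores, and thresholds are hypothetical – a triage step can be as simple as this: time-sensitive reports jump the queue, likely-false reports close automatically, and everything else is ordered by risk.

```python
import heapq
from dataclasses import dataclass, field

# Time-sensitive categories always jump the queue.
URGENT = {"self_harm", "child_abuse", "credible_threat"}

@dataclass(order=True)
class Report:
    priority: float                            # lower = reviewed sooner
    report_id: str = field(compare=False)
    category: str = field(compare=False)       # e.g. "harassment", "self_harm", "other"
    risk_score: float = field(compare=False)   # 0.0-1.0 from a trained classifier

def triage(reports, queue):
    """Auto-close likely-false reports; push everything else onto a priority queue."""
    auto_closed = []
    for r in reports:
        if r.category not in URGENT and r.risk_score < 0.1:
            auto_closed.append(r.report_id)    # almost certainly a false/retaliatory report
            continue
        # Urgent categories get top priority; the rest are ordered by descending risk.
        r.priority = 0.0 if r.category in URGENT else 1.0 - r.risk_score
        heapq.heappush(queue, r)
    return auto_closed

queue, reports = [], [
    Report(0, "r1", "harassment", 0.8),
    Report(0, "r2", "other", 0.05),
    Report(0, "r3", "self_harm", 0.6),
]
print(triage(reports, queue))           # ['r2'] is closed automatically
print(heapq.heappop(queue).report_id)   # 'r3' (time-sensitive) is reviewed first
```

The classifier that produces the risk score is where the real work lives; the queue itself is the easy part.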

I shared some reporting best practices used by my fellow Fair Play Alliance members during the FPA Summit at GDC earlier this year. You can watch the talk here (starting at 37:30).

There’s a final but no less important benefit to filtering the most abusive content and using AI like Predictive Moderation to triage time-sensitive content. As we’ve learned from seemingly countless news stories recently, content moderation is a deeply challenging discipline, and moderators are too often subject to trauma and even PTSD. All of the practices that the Australian eSafety Office outlines, when done properly, can help protect moderator wellbeing.

Under SbD Principle 3: Transparency and accountability
“Publish an annual assessment of reported abuses on the service, accompanied by the open publication of a meaningful analysis of metrics such as abuse data and reports, the effectiveness of moderation efforts and the extent to which community standards and terms of service are being satisfied through enforcement metrics.”

While transparency reports aren’t mandatory yet, I expect they will be in the future. Both the Australian SbD Principles and the UK Online Harms white paper outline the kinds of data these potential reports might contain.

My recommendation is that social networks start building internal practices today to support these inevitable reports. A few ideas, with a rough sketch after the list, include:

  • Track the number of user reports filed and their outcomes (i.e., how many were closed, how many were actioned, how many resulted in human intervention, etc.)
  • Log high-risk escalations and their outcomes
  • Leverage technology to generate a percentage breakdown of abusive content posted and filtered
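As a purely illustrative sketch – the outcome labels are assumptions, and a real platform would log these events to a database or analytics pipeline rather than in memory – the habit to build is simply tallying outcomes as they happen instead of reconstructing them when a transparency report is due.

```python
from collections import Counter

class TransparencyLog:
    """Minimal in-memory tally of moderation outcomes."""

    def __init__(self):
        self.report_outcomes = Counter()   # "closed_false", "actioned", "human_intervention", ...
        self.escalations = Counter()       # "self_harm", "child_abuse", ...
        self.content = Counter()           # "posted" vs. "filtered"

    def log_report(self, outcome: str):
        self.report_outcomes[outcome] += 1

    def log_escalation(self, category: str):
        self.escalations[category] += 1

    def log_content(self, filtered: bool):
        self.content["filtered" if filtered else "posted"] += 1

    def annual_summary(self) -> dict:
        total = sum(self.content.values()) or 1
        return {
            "user_reports": dict(self.report_outcomes),
            "high_risk_escalations": dict(self.escalations),
            "percent_filtered": round(100.0 * self.content["filtered"] / total, 2),
        }
```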

Thank you again to the eSafety Office and Commissioner Julie Inman-Grant for spearheading this pioneering initiative. We look forward to the next iteration of the Safety by Design framework – and can’t wait to join other online professionals at the #eSafety19 conference in September to discuss how we can all work together to make the internet a safe and inclusive space where everyone is free to share without fear of abuse or harassment.

To read more about Two Hat’s vision for a safer internet, download our new white paper By Design: 6 Tenets for a Safer Internet.

And if you, like so many of us, are concerned about community health and user safety, I’m currently offering no-cost, no-obligation Community Audits. I will examine your community (or a community run by someone you know!), locate areas of potential risk, and provide you with a personalized community analysis, including recommended best practices and tips to maximize positive social interactions and user engagement.



Ask Yourself These 4 Questions If You Allow User-Generated Content on Your Platform

Today, user-generated content features like chat, private messaging, comments, images, and video are must-haves in an overstuffed market where user retention is critical to long-term success. Users love to share, and nothing draws a crowd like a crowd — and a crowd of happy, loyal, and welcoming users will always bring in more happy, loyal, and welcoming users.

But as we’ve seen all too often, social features carry risk: users may post offensive content – like hate speech, NSFW images, and harassment – which can cause serious damage to your brand’s reputation.

That’s why understanding the risks of adding social features to your product is also critical to long-term success.

Here are four questions to consider when it comes to user-generated content on your platform.

1. How much risk is my brand willing to accept?
Every brand is different. Your community’s demographics will usually be a major factor in determining your risk tolerance.

Communities with under-13 users in the US have to be COPPA compliant, so preventing those users from sharing PII (personally identifiable information) is essential. Edtech platforms should also be CIPA and FERPA compliant.
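To illustrate the PII point, here is a deliberately simple sketch. Real PII detection needs far more than a couple of regular expressions (obfuscated forms like “jane at example dot com” are common), but catching the obvious patterns before a message is published is the general idea.

```python
import re

# Obvious patterns only -- real systems also need to handle obfuscated and
# context-dependent disclosures of personal information.
PII_PATTERNS = [
    re.compile(r"[\w.+-]+@[\w-]+\.[\w.-]+"),                                  # email addresses
    re.compile(r"\b(?:\+?1[\s.-]?)?\(?\d{3}\)?[\s.-]?\d{3}[\s.-]?\d{4}\b"),   # US phone numbers
]

def redact_pii(message: str) -> str:
    """Replace anything that looks like PII before the message is published."""
    for pattern in PII_PATTERNS:
        message = pattern.sub("[removed]", message)
    return message

print(redact_pii("email me at kid123@example.com or call 555-123-4567"))
# -> "email me at [removed] or call [removed]"
```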

If your users are teens and adults (18+), you might be less risk-averse, but you will still need to define your tolerance for high-risk content.

Consider your brand’s tone and history. Review your corporate guidelines to understand what your brand stands for. This is a great opportunity to define exactly what kind of an online community you want to create.

2. What type of high-risk content is most dangerous to my brand?
Try this exercise: Imagine that just one pornographic post was shared on your platform. How would it affect the brand? How would your audience react? How would your executive team respond? What would happen if the media/press found out?

What about hate speech? Sexual harassment? What is your brand’s definition of abuse or harassment? The better you can define these often vague terms, the better you will understand what kind of content you need to moderate.

3. How will I communicate my expectations to the community?
Don’t expect your users to automatically know what is and isn’t acceptable on your platform. Post your community guidelines where users can see them. And make sure users have to agree to your guidelines before they can post.

4. What content moderation tools and strategies can I leverage to protect my community?
We recommend taking a proactive instead of a reactive approach to managing risk and protecting your community. That means finding the right blend of pre- and post-moderation for your platform, while also combining automated artificial intelligence with human moderation.

On top of these techniques, there are also different tools you can use to take a proactive approach, including in-house filters (read about the build internally vs buy externally debate), or content moderation solutions like Two Hat’s Community Sift (learn about the difference between a simple profanity filter and a content moderation tool).
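To show what that blend can look like in practice, here is a minimal sketch; the score thresholds are illustrative assumptions, not a recommendation. Confidently bad content is blocked outright, uncertain content waits for a human, borderline content is published but double-checked, and everything else goes straight through.

```python
def handle_post(text: str, violation_score: float, human_queue: list) -> bool:
    """Blend pre- and post-moderation for one piece of user-generated content.

    violation_score -- 0.0-1.0 probability that the content breaks the guidelines,
                       from whatever automated filter or model is in place.
    Returns True if the content is published immediately.
    """
    if violation_score >= 0.9:
        return False                 # pre-moderation: confidently bad, never published
    if violation_score >= 0.5:
        human_queue.append(text)     # uncertain: held back until a human reviews it
        return False
    if violation_score >= 0.2:
        human_queue.append(text)     # post-moderation: published now, checked by a human later
        return True
    return True                      # confidently fine: published, no review needed

review_queue: list = []
print(handle_post("have a great game everyone!", 0.05, review_queue))  # True, no review
print(handle_post("borderline trash talk", 0.35, review_queue))        # True, but queued for review
```

Where the thresholds sit is exactly the risk-tolerance question from section 1: a platform for young children will pre-moderate far more aggressively than one for adults.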

Feeling overwhelmed?
While social features may be inherently risky, remember that they’re also inherently beneficial to your brand and your users. Whether you’re creating a new social platform or adding chat and images to your existing product, nothing engages and delights users more than being part of a positive and healthy online community.

And if you’re not sure where to start – we have good news.

Two Hat is currently offering a no-cost, no-obligation community audit. Our team of industry experts will examine your community, locate high-risk areas, and identify how we can help solve any moderation challenges.

It’s a unique opportunity to sit down with our Director of Community Trust & Safety to see how you can mitigate risk in your community.

To book your free audit, fill out the form below and we’ll reach out with next steps!