3 Takeaways From The 16th International Bullying Prevention Conference

I recently had the privilege to speak on the keynote gaming panel of the 16th Annual International Bullying Prevention Conference, an event themed Kindness & Compassion: Building Healthy Communities.

The International Bullying Prevention Association is a 501(c)(3) nonprofit organization founded in 2003, when grassroots practitioners and researchers came together to convene the first conference in the US entirely focused on bullying prevention. They host an annual conference in Chicago where attendees can benefit from workshops, poster sessions and TED-inspired sessions which deliver hands-on solutions and theoretical, research-based presentations.

Below, I focus on the sessions and discussions I participated in regarding cyberbullying, and present a brief account of the takeaways I brought back to Canada and Two Hat.

1. User-centric approaches to online safety

A few people on the tech panels referred to the concept of “user-centric safety” — letting users set their boundaries and comfort levels for online interactions. Catherine Teitelbaum, a renowned Global Trust & Safety Executive who heads up Trust & Safety for Twitch, is a big champion of the idea and spoke about how the concept of “safety” varies from person to person. Offering customized control for the user experience, like Twitch does with Automod by empowering channel owners to set their chat filtering standards, is the way of the future. 

Online communities are diverse and unique, and often platforms contain many communities with different norms. The ability to tailor chat settings to those unique characteristics is critical.

Wouldn’t it be great for users to be able to choose their safety settings and what they are comfortable with – the same way they can set their privacy settings on online platforms? What if a mother wants to enjoy an online platform with her child, but wants to ensure that they don’t see any sexual language? Perhaps a gamer just wants to relax and play a few rounds without experiencing the violent language that might be the norm in a mature game centered around combat. The more agency and flexibility we give to users and players online, the better we can cater to the different expectations we all have when we log in.
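To make the idea concrete, here is a minimal sketch of what user-set safety levels could look like. It assumes an upstream classifier that labels each message with a severity per topic; the topic names, the 0 to 3 scale, and the `SafetySettings` class are all hypothetical and do not describe Twitch's Automod or any real product.

```python
from dataclasses import dataclass, field

# Hypothetical default: topics the user has not configured allow only mild content.
DEFAULT_LIMIT = 1


@dataclass
class SafetySettings:
    """Per-user comfort levels: topic -> highest severity (0-3) the user accepts."""

    limits: dict[str, int] = field(default_factory=dict)

    def allows(self, labels: dict[str, int]) -> bool:
        """Return True if every labeled topic is within this user's limits."""
        return all(
            severity <= self.limits.get(topic, DEFAULT_LIMIT)
            for topic, severity in labels.items()
        )


# A parent playing alongside a child blocks all sexual language.
parent = SafetySettings(limits={"sexual": 0, "violence": 1})
# A player in a mature combat game is comfortable with violent language.
combat_fan = SafetySettings(limits={"sexual": 1, "violence": 3})

message_labels = {"violence": 2}  # assumed output of an upstream classifier
print(parent.allows(message_labels))
print(combat_fan.allows(message_labels))
```

The same message is hidden for one user and shown to the other, which is the point: the platform labels content once, and each person decides what they see.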

2. Shared Responsibility, and the Importance of Diverse Voices

The concept of sharing and contributing to the greater good of online safety practices across tech industries also came up. Here at Two Hat we believe that ushering in a new age of content moderation and empowering an Internet that will fulfill its true purpose of connecting human beings is only possible through a shared responsibility approach. We believe it will take the efforts of everyone involved to truly change things for the better. This includes academia, industry, government, and users.

In his 2018 book “Farsighted: How We Make the Decisions That Matter the Most”, Steven Johnson writes about how complex decisions require a comprehensive mapping of all the factors involved, and how that mapping is informed by, and benefits greatly from, a diverse set of perspectives. The best, farsighted decisions compile the voices of a variety of people. The intricate human interaction systems we are creating on the Internet require complex decision-making at both the inception and design stages. However, right now those decisions are rarely informed by multi-disciplinary lenses. No wonder we are so shortsighted when it comes to anticipating issues with online behaviour and online harms.

A true, collaborative community of practice is needed. We need that rising tide that floats all boats, as my good friend Dr. Kim Voll says.

3. Empathy as an Antidote

Another good friend, Dr. Sameer Hinduja, was one of the speakers at the conference. Dr. Hinduja is a Professor in the School of Criminology and Criminal Justice at Florida Atlantic University and Co-Director of the Cyberbullying Research Center, and he is recognized internationally for his groundbreaking work on the subjects of cyberbullying and safe social media use. You will be hard-pressed to find someone more dedicated to the well-being of others.

He talked about how empathy can be used to prevent bullying, pulling from research and practical applications that have resulted in improvement in peer-to-peer relationships. He stressed the importance of practices that lead youth to go beyond the traditional approach of “being in someone else’s shoes” to feel empathy, and to reach a point where they truly value others. This is so important, and it makes me wonder: How can we design human interaction systems online where we perceive each other as valuable individuals and are constantly reminded of our shared humanity? How do we create platforms that discourage solely transactional interaction? How do we bring offline social cues into the online experience? How can we design interaction proxies to reduce friction between users – and ultimately lead us to more positive and productive online spaces? I don’t have all the answers – no one does. But I am encouraged by the work of people like Dr. Hinduja, the Trust and Safety team at Twitch, the incredible Digital Civility efforts of Roblox and my friend Laura Higgins, their Director of Community Safety & Digital Civility, and events like the International Bullying Prevention Conference.

Moving Forward

Cyberbullying is one of the many challenges facing online platforms today. Let’s remember that it’s not just cyberbullying – there is a wider umbrella of behaviors that we need to better understand and define, including harassment, reputation tarnishing, doxxing, and more. We need to find a way to facilitate better digital interactions in general: through how we design online spaces, through how we encourage positive and productive exchanges, and by understanding that it will take a wider perspective, informed by many lenses, to create online spaces that fulfill their true potential.

If you’re reading this, you’re likely in the industry, and you’re definitely a participant in online communities. So what can you do, today, to make a difference? How can industry better collaborate to advance online safety practices?

Quora: How big a problem are bullying and harassment of pre-teens and teenagers on social media?

The numbers indicate that cyberbullying and harassment are huge problems for young people on social media. A 2016 report from the Cyberbullying Research Center indicates that 33.8% of students between 12 and 17 were victims of cyberbullying in their lifetime. Conversely, 11.5% of students between 12 and 17 indicated that they had engaged in cyberbullying in their lifetime.

Cyberbullying is different from “traditional” bullying in that it happens 24/7. For victims, there is no escape. It’s not confined to school or the playground. Kids and teens connect through social media, so for many, there is no option to simply go offline.

Even more troubling is the connection between cyberbullying and child exploitation. At Two Hat Security, we’ve identified a cycle in which child predators groom young victims, who are tricked into taking explicit photos which are then shared online; this leads to bullying and harassment from peers and strangers. Finally, the victim suffers from depression, engages in self-harm, and sometimes — tragically — commits suicide. It’s a heartbreaking cycle.

Cyberbullying and online harassment are profoundly dangerous and alarming behaviors with real, often severe and sometimes fatal, consequences for victims.

Social media platforms have options, though. AI-based text and image filters like Community Sift are the first line of defense against cyberbullying. Purposeful, focused moderation of User Generated Content (UGC) is the next step. And finally, education and honest, open discussions about the effects of cyberbullying on real victims are crucial. The more we talk about it, the more comfortable victims will feel speaking out about their experiences.

Originally published on Quora, featured in Huffington Post and Forbes


How to Remove Online Hate Speech in Under 24 Hours

Note: This post was originally published on July 5th, 2016. We’ve updated the content in light of the draft bill presented by the German government on March 14th.

In July of last year, the major players in social media came together as a united front with a pact to remove hate speech within 24 hours. Facebook defines hate speech as “content that attacks people based on their perceived or actual race, ethnicity, religion, sex, gender, sexual orientation, disability or disease.” Hate speech is a serious issue, as it shapes the core beliefs of people all over the globe.

Earlier this week, the German government took their fight against online hate speech one step further. They have proposed a new law that would levy fines up to €50 million against social media companies that failed to remove or block hate speech within 24 hours of a complaint. And the proposed law wouldn’t just affect companies — it would affect individuals as well. Social media companies would be expected to appoint a “responsible contact person.” This individual could be subject to a fine up to €5 million if user complaints aren’t dealt with promptly.

Those are big numbers — the kinds of numbers that could potentially cripple a business.

As professionals with social products, we tend to rally around the shared belief that empowering societies to exchange ideas and information will create a better, more connected world. The rise of the social web has been one of the most inspiring and amazing changes in recent history, impacting humanity for the better.

Unfortunately, like many good things in the world, there tends to be a dark underbelly hidden beneath the surface. While the majority of users use social platforms to share fun content, interesting information and inspirational news, there is a small fraction of users that use these platforms to spread messages of hate.

It is important to make the distinction that we are not talking about complaints, anger, or frustration. We recognize that there is a huge difference between trash talking and harassing specific individuals or groups of people.

We are a protection layer for social products, and we believe everyone should have the power to share without fear of harassment or abuse. We believe that social platforms should be as expressive as possible, where everyone can share thoughts, opinions, and information freely.

We also believe that hate speech does not belong on any social platform. To this end, we want to enable all social platforms to remove hate speech as fast as possible — and not just because they could be subject to a massive fine. As professionals in the social product space, we want everyone to be able to get this right — not just the huge companies like Google.

Smaller companies may be tempted to do this manually, but the task becomes progressively harder to manage with increased scale and growth. Eventually, moderators will be spending every waking moment looking at submissions, making for an inefficient process and slow reaction time.

Instead of removing hate speech within 24 hours, we want to remove it within minutes or even seconds. That is our big, hairy, audacious goal.

Here’s how we approach this vision of ‘instant hate speech removal.’

Step 1 — Label everything.

Full disclosure: traditional filters suck. They have a bad reputation for being overly simplistic, unable to address context, and prone to flagging false positives. Still, leaving it up to users to report all terrible content is unfair to them and bad for your brand. Traditional filters are not adequate for addressing something as complicated as hate speech, so we decided to invest our money into creating something different.

Using the old environmentally friendly adage of “reduce, reuse, recycle (in that specific order)”, we first want to reduce all the noise. Consider movie ratings: all films are rated, and “R” ratings come accompanied by explanations. For instance, “Rated R for extreme language and promotion of genocide.” We want to borrow this approach and apply labels that indicate the level of risk associated with the content.

There are two immediate benefits: First, users can decide what they want to see; and second, we can flag any content above our target threshold. Of course, content that falls under ‘artistic expression’ can be subjective. Films like “Schindler’s List” are hard to watch but do not fall under hate speech, despite touching upon subjects of racism and genocide. On social media, some content may address challenging issues without promoting hate. The rating allows people to prepare themselves for what they are about to see, but we need more information to know if it is hate speech.

In the real world, we might look at the reputation of the individual to gain a better sense of what to expect. Likewise, content on social media does not exist in a vacuum; there are circumstances at play, including the reputation of the speaker. To simulate human judgment, we have built out our system with 119 features to examine the text, context, and reputation. Just looking for words like “nigga” will generate tons of noise, but if you combine that with past expressions of racism and promotions of violence, you can start sifting out the harmless stuff to determine what requires immediate action.

User reputation is a powerful tool in the fight against hate speech. If a user has a history of racism, you can prioritize reviewing — and removing — their posts above others.

The way we approach this with Community Sift is to apply a series of lenses to the reported content — internally, we call this ‘classification.’ We assess the content on a sliding scale of risk, note the frequency of user-submitted reports, the context of the message (public vs. large group vs. small group vs. 1:1), and the speaker’s reputation. Note that at this point in the process we have not done anything yet other than label the data. Now it is time to do something with it.
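The labeling step can be pictured as attaching a small record to every message. The sketch below is a simplified illustration only: it uses four signals rather than the 119 features mentioned above, and the `Label` fields, the 0 to 1 scales, and the weights in `risk_score` are assumptions made for the example, not Community Sift's actual model.

```python
from dataclasses import dataclass
from enum import IntEnum


class Audience(IntEnum):
    """Context of the message, ordered from least to most exposure."""

    ONE_TO_ONE = 1
    SMALL_GROUP = 2
    LARGE_GROUP = 3
    PUBLIC = 4


@dataclass(frozen=True)
class Label:
    severity: float     # 0.0 (harmless) to 1.0 (extreme), from a text classifier
    report_count: int   # how many users reported the message
    audience: Audience  # where the message was posted
    reputation: float   # 0.0 (clean history) to 1.0 (repeated past offences)


def risk_score(label: Label) -> float:
    """Blend the signals into one 0-1 score. The weights are illustrative only."""
    reports = min(label.report_count, 5) / 5     # cap so mass reporting cannot dominate
    exposure = label.audience / Audience.PUBLIC  # 0.25 for 1:1 up to 1.0 for public
    score = (
        0.5 * label.severity
        + 0.2 * label.reputation
        + 0.2 * reports
        + 0.1 * exposure
    )
    return round(score, 3)


# The same wording, labeled in two different situations.
private_first_offence = Label(0.6, 0, Audience.ONE_TO_ONE, 0.0)
public_repeat_offender = Label(0.6, 3, Audience.PUBLIC, 0.9)
print(risk_score(private_first_offence), risk_score(public_repeat_offender))
```

Identical text ends up with very different scores once context and reputation are taken into account, which is what lets the later steps separate noise from content that needs immediate action.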

Step 2 — Take automatic action.


After we label the data, we can place it into three distinct ‘buckets.’ The vast majority (around 95%) will fall under ‘obviously good’, since social media predominantly consists of pictures of kittens, food, and reposted jokes. Just like there is the ‘obviously good,’ however, there is also the ‘obviously bad’.

In this case, think of the system like anti-virus technology. Every day, people are creating new ways to mess up your computer. Cybersecurity companies dedicate their time to finding the latest malware signatures so that when one comes to you, it is automatically removed. Similarly, our company uses AI to find new social signatures by processing billions of messages across the globe for our human professionals to review. The manual review is critical to reducing false positives. Just like with antivirus technology, you do not want to delete innocuous content, lest you repeat the false-positive mistakes that antivirus tools are known to make.

So what is considered ‘obviously bad?’ That will depend on the purpose of the site. Most already have a ‘terms of use’ or ‘community guidelines’ page that defines what the group is for and the rules in place to achieve that goal. When users break the rules, our clients can configure the system to take immediate action with the reported user, such as warning, muting, or banning them. The more we can automate meaningfully here, the better. When seconds matter, speed is of the essence.

Now that we have labeled almost everything as either ‘obviously good’ or ‘obviously bad,’ we can prioritize which messages to address first.
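As a rough sketch of this bucketing step, assume each message already carries a single risk score between 0 and 1. The cut-off values and action names below are placeholders; in practice they would be tuned to each site's own community guidelines.

```python
from enum import Enum


class Bucket(Enum):
    OBVIOUSLY_GOOD = "obviously good"
    OBVIOUSLY_BAD = "obviously bad"
    NEEDS_REVIEW = "needs review"


# Hypothetical cut-offs on a 0-1 risk score.
GOOD_BELOW = 0.2
BAD_FROM = 0.9


def bucket_for(risk: float) -> Bucket:
    """Sort a labeled message into one of the three buckets."""
    if risk < GOOD_BELOW:
        return Bucket.OBVIOUSLY_GOOD
    if risk >= BAD_FROM:
        return Bucket.OBVIOUSLY_BAD
    return Bucket.NEEDS_REVIEW


def automatic_action(bucket: Bucket) -> str:
    """Map a bucket to the action taken without waiting for a human."""
    return {
        Bucket.OBVIOUSLY_GOOD: "publish",
        Bucket.OBVIOUSLY_BAD: "remove and warn user",
        Bucket.NEEDS_REVIEW: "queue for moderator",
    }[bucket]


for risk in (0.05, 0.5, 0.95):
    print(risk, automatic_action(bucket_for(risk)))
```

Only the middle bucket ever reaches a person, so the narrower that band can be made without raising false positives, the faster the overall response.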

Step 3 — Create prioritized queues for human action.

Computers are great at finding the good and the bad, but what about all the stuff in the middle? Currently, the best practice is to crowdsource judgment by allowing your users to report content. Human moderation of some kind is key to maintaining and training a quality workflow to eliminate hate speech. The challenge is going to be getting above the noise of bad reports and dealing with the urgent right now.

Remember the Stephen Covey model of time management? Instead of only using a simple chronologically sorted list of hate speech reports, we want to provide humans with a streamlined list of items to act on quickly, with the most important items at the top of the list.

A simple technique is to have two lists. One list has all the noise of user-reported content. We see that about 80–95% of those reports are junk (one user likes dogs, so they report the person who likes cats). Since we labeled the data in step 1, we know a fair bit about it already: the severity of the content, the intensity of the context, and the person’s reputation. If the community thinks the content violates the terms of use and our label says it is likely bad, chances are, it is bad. Alternatively, if the label says it is fine, then we can wait until more people report it, thus reducing the noise.

The second list focuses on high-risk, time-sensitive content. These are rare events, so this work queue is kept minuscule. Content enters when the system thinks it is high-risk, but cannot be sure; or, when users report content that is right on the border of triggering the conditions necessary for a rating of ‘obviously bad.’ The result is a prioritized queue that humans can stay on top of and remove content from in minutes instead of days.
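The two-list idea can be sketched with a pair of priority queues. This is a simplified illustration: the `ReviewQueues` class, the risk cut-off, and the minimum report count are assumptions for the example, and a production system would weigh more signals than a single risk score.

```python
import heapq
from dataclasses import dataclass, field
from typing import Optional


@dataclass(order=True)
class QueuedItem:
    sort_key: float                         # negative risk, so the riskiest item pops first
    message_id: str = field(compare=False)


class ReviewQueues:
    """Two work lists: a small urgent queue and a larger user-report queue."""

    def __init__(self, urgent_from: float = 0.7, min_reports: int = 3) -> None:
        self.urgent_from = urgent_from      # hypothetical risk cut-off for the urgent list
        self.min_reports = min_reports      # reports needed before lower-risk content is queued
        self.urgent: list[QueuedItem] = []
        self.reported: list[QueuedItem] = []

    def add(self, message_id: str, risk: float, report_count: int) -> str:
        """File a message and return where it went: 'urgent', 'reported' or 'wait'."""
        item = QueuedItem(-risk, message_id)
        if risk >= self.urgent_from:
            heapq.heappush(self.urgent, item)
            return "urgent"
        if report_count >= self.min_reports:
            heapq.heappush(self.reported, item)
            return "reported"
        return "wait"                       # label looks fine and few reports: hold off

    def next_item(self) -> Optional[str]:
        """Pop the riskiest urgent item, falling back to the report list."""
        for queue in (self.urgent, self.reported):
            if queue:
                return heapq.heappop(queue).message_id
        return None


queues = ReviewQueues()
queues.add("m1", risk=0.30, report_count=1)
queues.add("m2", risk=0.50, report_count=4)
queues.add("m3", risk=0.85, report_count=0)
queues.add("m4", risk=0.95, report_count=2)
print([queues.next_item() for _ in range(4)])
```

Moderators always pull from the urgent list first, and a single junk report on low-risk content never reaches them at all.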

In our case, we devote millions of dollars a year to continual refinement and improvement with human professionals, so product owners don’t have to. We take care of all that complexity to get product owners back to the fun stuff instead, like making more amazing social products.

Step 4 — Take human action.

Product owners could use crowdsourced, outsourced, or internal moderation to handle these queues, though this depends on the scale and available resources within the team. The important thing is to take action as fast as humanly possible, starting with the questionable content that the computers cannot catch.

Step 5 — Train artificial intelligence based on decisions.

To manage the volume of reported content for a platform like Facebook or Twitter, you need to employ some level of artificial intelligence. By setting up the moderation AI to learn from human decisions, the system becomes increasingly effective at automatically detecting and taking action against emerging issues. The more precise the automation, the faster the response.
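To show the feedback loop in its simplest form, the sketch below records moderator decisions and uses them to score new messages. It is a toy stand-in (a smoothed per-word removal rate), not the approach described in this post, and the `DecisionLearner` class and its methods are invented for the example.

```python
from collections import Counter


class DecisionLearner:
    """Toy learner: tracks how often moderators removed messages containing each word."""

    def __init__(self) -> None:
        self.removed: Counter = Counter()
        self.seen: Counter = Counter()

    def record(self, text: str, removed: bool) -> None:
        """Store one moderator decision."""
        for token in set(text.lower().split()):
            self.seen[token] += 1
            if removed:
                self.removed[token] += 1

    def removal_rate(self, token: str) -> float:
        """Smoothed share of past messages with this word that moderators removed."""
        token = token.lower()
        return (self.removed[token] + 1) / (self.seen[token] + 2)

    def score(self, text: str) -> float:
        """Score a new message by its most-removed word (0.5 means no evidence)."""
        tokens = set(text.lower().split())
        return max((self.removal_rate(t) for t in tokens), default=0.5)


learner = DecisionLearner()
for _ in range(2):
    learner.record("go away loser", removed=True)
    learner.record("nice game everyone", removed=False)

print(learner.score("what a loser"))
print(learner.score("nice game"))
```

Every decision a moderator makes nudges the scores, so content that humans keep removing rises toward automatic action while content they keep approving stops cluttering the queue.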

After five years of dedicated research in this field, we’ve learned a few tricks.

Machine learning AI is a powerful tool. But when it comes to processing language, it’s far more efficient to use a combination of a well-trained human team working alongside an expert system AI.

By applying the methodology above, it is now within our grasp to remove hate speech from social platforms almost instantly. Prejudice is an issue that affects everyone, and in an increasingly connected global world, it affects everyone in real-time. We have to get this right.

Since Facebook, YouTube, Twitter and Microsoft signed the EU hate speech code back in 2016, more and more product owners have taken up the fight and are looking for ways to combat intolerance in their communities. With this latest announcement by the German government, and the prospect of substantial fines in the future, we wanted to go public with our insights in hopes that someone sees something he or she could apply to a platform right now. In truth, 24 hours just isn’t fast enough, given the damage that racism, threats, and harassment can cause. Luckily, there are ways to prevent hate speech from ever reaching the community.

At Community Sift and Two Hat Security, we have a dream — that all social products have the tools at their disposal to protect their communities. The hardest problems on the internet are the most important to solve. Whether it’s hate speech, child exploitation, or rape threats, we cannot tolerate dangerous or illegal content in our communities.

If we work together, we have a real shot at making the online world a better place. And that’s never been more urgent than it is today.

How We Manage Toxicity for Social Apps and Websites

At Two Hat, we believe the social internet is a positive place with unlimited potential. We also believe bullying and toxicity are causing harm to real people and causing irreparable damage to social products. That’s why we made Community Sift.

We work with leading game studios and social platforms to find and manage toxic behaviours in their communities. We do this in real-time, and (at the time of writing) process over 1 billion messages a month.

Some interesting facts about toxicity in online communities:

  • According to the Fiksu Index, the cost of acquiring a loyal user is now $4.23, making user acquisition one of the biggest costs to a game.
  • Player Behavior in Online Games research published by Riot Games indicates that “players are 320% more likely to quit, the more toxicity they experience.”

Toxicity hurts everyone:

  • An estimated 1% of a new community is toxic. If that is ignored, the best community members leave and toxicity can grow as high as 20%.
  • If a studio spends $1 million launching its game and a handful of toxic users send destructive messages, their investment is at risk.
  • Addressing the problem early will model what the community is for, and what is expected of future members, thus reducing future costs.
  • Behaviour does change. That’s why we’ve created responsive tools that adapt to changing trends and user behaviours. We believe people are coachable and have built our technology with this assumption.
  • Even existing communities see an immediate drop in toxicity with the addition of strong tools.

Here’s a little bit about what Community Sift can do to help:

  • More than a Filter: Unlike products that only look for profanity, we have over 1 million human-validated rules and multiple AI systems to seek out bullying, toxicity, racism, fraud, and more.
  • Emphasis on Reputation: Every user has a bad day. The real problem is users who are consistently damaging the community.
  • Reusable Common Sense: Instead of a simple regex or black/whitelist, we measure severity on a spectrum, from extreme good to extreme bad. You can use the same rules but a different permission level for group chat vs. private chat and for one game vs. another.
  • Industry Veterans: Our team has made games with over 300 million users and managed a wide variety of communities across multiple languages. We are live and battle-tested on top titles, processing over 1 billion messages a month at the time of writing.
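The “same rules, different permission level” idea from the list above can be sketched as one shared rule table plus a per-context ceiling. The phrases, the 0 to 7 scale, and the context names are made up for illustration and are not Community Sift's real rules or scale.

```python
# Hypothetical shared rule set: phrase -> severity from 0 (fine) to 7 (extreme).
RULES = {
    "well played": 0,
    "noob": 3,
    "idiot": 5,
}

# Same rules everywhere, but each context sets its own permission level (also hypothetical).
MAX_SEVERITY = {
    "kids_game/group_chat": 2,
    "mature_game/group_chat": 4,
    "mature_game/private_chat": 6,
}


def severity(message: str) -> int:
    """Return the highest severity among the rules that match the message."""
    text = message.lower()
    return max((level for phrase, level in RULES.items() if phrase in text), default=0)


def is_allowed(message: str, context: str) -> bool:
    """Allow the message if its severity is within the context's permission level."""
    return severity(message) <= MAX_SEVERITY[context]


for context in MAX_SEVERITY:
    print(context, is_allowed("nice try, noob", context))
```

Because severity is measured once and compared against a per-context ceiling, adding a new game or chat type means choosing one number rather than writing a new word list.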

To install Community Sift, you have your backend servers make one simple API call for each message, and we handle all the complexity in our cloud.

When toxic behaviour is found, we can:

  • Hash out the negative parts of a message: e.g. “####ed out message”
  • Educate the user
  • Reward positive users who are consistently helping others
  • Automatically trigger a temporary mute for regular offenders
  • Escalate for internal review when certain conditions like “past history of toxicity” are met
  • Group toxic users on a server together to help protect new users
  • Provide daily stats, BI reports, and analytics
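The first action in the list, hashing out part of a message, is easy to picture in code. The helper below is a minimal sketch that masks whole words from a supplied set; the function name and the word-list approach are assumptions for the example, and real filtering relies on far more context than a word list.

```python
import re


def hash_out(message: str, blocked_words: set[str]) -> str:
    """Replace each blocked word with '#' characters of the same length."""
    if not blocked_words:
        return message
    # Longest words first so longer entries win over shorter ones at the same position.
    alternatives = sorted(blocked_words, key=len, reverse=True)
    pattern = re.compile(
        r"\b(" + "|".join(re.escape(word) for word in alternatives) + r")\b",
        re.IGNORECASE,
    )
    return pattern.sub(lambda match: "#" * len(match.group()), message)


print(hash_out("You are a total Loser today", {"loser"}))
```

Masking only the offending word keeps the rest of the conversation readable, which is gentler on the community than dropping the whole message.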

We’d love to show you how we can help protect your social product. Feel free to book a demo anytime.

Cyberbullying, Bullying, and Online Harassment Facts

What is Bullying?

Bullying is an unacceptable anti-social behavior. Bullying tactics can involve verbal abuse, physical abuse, and psychological harassment towards a victim.

This destructive behavior is learned by a bully from the negative influences that can come from others in their home environment, school, peer groups, and even from certain media exposure.

The good news is that since bullying is a learned behavior, it can also be unlearned.

What is Cyberbullying?

Cyberbullying is a form of bullying behavior that is inflicted online or by phone through text messages, email, instant messaging, chat rooms, website posts, or images sent to a cell phone or personal digital assistant.

Cyberbullying is distinguished from bullying by its anonymity, its anytime accessibility, and the victim’s fear of being punished with the loss of their digital device.

What Are the Signs of Bullying?

The signs of bullying behavior can be things such as the desire to be in control, arrogance, impulsiveness, boastfulness, poor sportsmanship and a lack of empathy.

The signs of a victim of bullying may be a reluctance to go to school, mood changes, loss of appetite, torn clothes, and signs of physical abuse such as bruises.

Often bullying can occur at schools in certain high-risk areas such as the hallways, cafeterias, playgrounds, and buses.

Bullying Facts

  • Bullying is the most common form of violence in our society
  • Nearly 1 in 5 students have indicated that they had been bullied repeatedly over time (two to three times per month or more within the school semester)
  • Bullying impacts approximately 13 million students every year, and some 160,000 students stay home from school each day because of bullying
  • 52% of bullied victims will suffer incidents of repeated bullying for a duration of months at a time
  • 71% of bullying is witnessed by others
  • 61% of face to face bullying is reportedly done by males
  • 68% of cyberbullying is reportedly done by females

How to Stop Bullying

While it may seem like an insurmountable task to completely stop bullying, there are practical ways to counter bullying problems.

Some initiatives to take to help stop bullying and cyberbullying are:

  • Talk about bullying and harassment
  • Help others develop empathy, including those who are doing the bullying
  • Provide anti-bullying training
  • Don’t be a bystander, speak up, report, intervene
  • Don’t delay in acting
  • In school settings, parents need to be involved

Sandy Neeson, a licensed school counsellor at McLean Middle School in Fort Worth, Texas, has said, “To successfully change bullying behavior, you must involve the whole school, from teachers and custodians to cafeteria workers and bus drivers.”

The key is that it takes an entire community to actively stop bullying.

Keeping Kids Safe Online

Digital communication offers many incredible ways to communicate, discover, and share messages with others around the world. However, there are risks inherent in the conversations, social media, games, apps, and websites found online.

What are the best ways to keep kids safe online?

  • As a parent or guardian, educate yourself about the risks kids face online
  • Educate your child, in an age-appropriate way, about those risks
  • Engage with and experience the games, apps, and websites your child uses
  • Activate all child safety settings for your computer’s operating system, search engines, games and game consoles, and cellphones
  • Set rules and enforce rules
  • Establish limits for time online per day
  • Follow your child on social media, but respect their space and be careful not to stalk them
  • Encourage your child to behave as a good online citizen and to maintain a positive reputation
  • Tell your child to NEVER share their passwords or personal information
  • Encourage your child to share any questions they have or any unusual encounters they experience while online
  • Keep devices in a common area of the home
  • Engage a child in conversation about what they are and have been doing on the web
  • Be a model of positive online behavior
  • Encourage and engage in offline activities

You want to make being on the internet a positive experience for your child and for others, teaching them to be positive, responsible online citizens.

Inspiration for this post:

Seven Steps to Good Digital Parenting
Dear Parents – a Message From Miss Florida 2015