Understanding context reduces racial bias in systems that detect online hate speech
When it comes to accurately flagging hate speech on social media, context matters, says a recently published study by the University of Southern California (USC) aimed at reducing errors that could amplify racial bias.
Hate speech detection algorithms, which social media platforms use to stop the spread of hateful speech, often amplify racial bias by blocking inoffensive tweets from black people or other minority groups.
Research has found that artificial intelligence (AI) models were 1.5 times more likely to flag tweets written by African Americans as "offensive" compared with other tweets - in other words, to produce false positives.
The root of the problem is that AI detection models fail to account for context. Specifically, hate speech classifiers are oversensitive to group identifiers such as 'black', 'gay', or 'transgender', which indicate hate speech only in certain contexts.
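The failure mode described above can be illustrated with a deliberately naive sketch: a keyword-based detector that flags any text containing a group identifier, regardless of context. The word list and the `naive_flag` function are invented for demonstration and are not taken from the study.

```python
# Hypothetical sketch: a naive keyword-based detector. It flags any text
# containing a group identifier, ignoring the surrounding context entirely.
# The identifier list is illustrative, not from the USC study.

GROUP_IDENTIFIERS = {"black", "gay", "transgender"}

def naive_flag(text: str) -> bool:
    """Flag text as 'offensive' if it merely contains a group identifier."""
    words = {w.strip(".,!?").lower() for w in text.split()}
    return bool(words & GROUP_IDENTIFIERS)

# An entirely innocuous sentence is flagged - a false positive:
print(naive_flag("I am a proud black woman."))  # True (false positive)
print(naive_flag("Have a nice day."))           # False
```

A detector like this reproduces exactly the bias the researchers describe: any post that self-identifies with a minority group is treated as hate speech.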
Now, a team of researchers from USC has created a hate speech classifier that is more context-sensitive, and less likely to mistake a post containing a group identifier for hate speech.
Brendan Kennedy, a computer science PhD student and co-lead author of the study - published in July 2020 - said: "We want to move hate speech detection closer to being ready for real-world application".
The team hopes that its classifier will repair 'broken' hate speech detection models, meaning those that generate poor predictions when exposed to real-world data such as social media posts or other online text.
"It is key for models to not ignore identifiers, but to match them with the right context", said Professor Xiang Ren, an expert in natural language processing.
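The principle Professor Ren describes - keep the identifiers, but weigh them against their context - can be sketched as a toy extension of a keyword detector. This is not the USC model; the hostile-context word list and the pairing rule below are invented purely to illustrate the idea.

```python
# Toy illustration (not the USC model): a group identifier contributes to a
# hate speech flag only when it co-occurs with hostile context words.
# Both word lists are invented for demonstration.

GROUP_IDENTIFIERS = {"black", "gay", "transgender"}
HOSTILE_CONTEXT = {"hate", "disgusting", "inferior"}

def context_aware_flag(text: str) -> bool:
    """Flag text only if an identifier appears alongside hostile context."""
    words = {w.strip(".,!?").lower() for w in text.split()}
    return bool(words & GROUP_IDENTIFIERS) and bool(words & HOSTILE_CONTEXT)

print(context_aware_flag("I am a proud black woman."))  # False: not flagged
print(context_aware_flag("I hate black people."))       # True: flagged
```

Even this crude pairing rule removes the false positive on the innocuous sentence while still catching the hostile one, which is the behaviour the study's context-sensitive classifier aims for at scale.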
It is vital that members of minority groups, such as those from black and minority ethnic backgrounds, are not inappropriately censored. Voices must be heard, not muted.