Machine-learning tool to reduce toxic talk page interactions
Open, Medium, Public


Build an AI tool to identify occurrences of apparent talk page abuse in the English Wikipedia in real time, building on existing en:WP functions such as tags and edit filters.

Envisaged benefits

An edit filter could warn users before posting that their comment may need to be refactored to be considered appropriate:

  • Cutting down on the number of abusive talk page messages actually posted.

Editors could check recent changes for tagged edits:

  • Bringing much-needed third eyes to talk pages where an editor may be facing sexual harassment or other types of abuse.
  • Improving response times and relieving victims of the burden of having to ask an admin for help.

Prevention of talk page escalation.

Improvement of talk page culture.

Enhanced editor retention.

Some prior discussion of this idea can be found at

As User:Denny pointed out on the Wikimedia-l mailing list yesterday, a similar project has reportedly been run in the League of Legends online gaming community to improve the quality of social interactions, with considerable success: occurrences of verbal abuse in that community are reported to have dropped by more than 40 percent.

Another interesting finding from that project was: "87 percent of online toxicity came from the neutral and positive citizens just having a bad day here or there. [...] We had to change how people thought about online society and change their expectations of what was acceptable." That is something that seems to apply to Wikipedia as well.

The game designers and scientists working on this project started out by compiling a large dataset of interactions community members deemed counterproductive (toxic behaviour, harassment, abuse) and then applied machine learning to this dataset to be able to provide near real-time feedback to participants on the quality of their interaction. (They're also looking at identifying positive, collaborative behaviours.)
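To make the approach concrete, here is a minimal sketch of the "labelled dataset plus machine learning" idea, using a small naive Bayes text classifier. It is not the method the League of Legends team used (the source does not say which models they applied), and everything in it is an assumption for illustration: the toy comments, the labels, the function names and the `WARN_THRESHOLD` value are all hypothetical. A real system would need a far larger dataset and much richer features.

```python
import math
import re
from collections import Counter

# Hypothetical toy dataset: (comment text, label); 1 = flagged, 0 = acceptable.
TRAINING = [
    ("you are an idiot and should leave", 1),
    ("nobody wants you here, get lost idiot", 1),
    ("shut up you worthless troll", 1),
    ("thanks for fixing the citation", 0),
    ("i think the source supports this wording", 0),
    ("could you explain your revert on the talk page", 0),
]

# Hypothetical cut-off: warn only on high scores to limit false positives.
WARN_THRESHOLD = 0.9


def tokenize(text):
    """Lowercase the text and split it into simple word tokens."""
    return re.findall(r"[a-z']+", text.lower())


def train(examples):
    """Count words per label and documents per label."""
    word_counts = {0: Counter(), 1: Counter()}
    doc_counts = Counter()
    for text, label in examples:
        doc_counts[label] += 1
        word_counts[label].update(tokenize(text))
    vocab = set(word_counts[0]) | set(word_counts[1])
    return word_counts, doc_counts, vocab


def flag_probability(model, text):
    """Return the estimated probability that a comment belongs to class 1."""
    word_counts, doc_counts, vocab = model
    total_docs = sum(doc_counts.values())
    log_scores = {}
    for label in (0, 1):
        score = math.log(doc_counts[label] / total_docs)
        total_words = sum(word_counts[label].values())
        for token in tokenize(text):
            if token not in vocab:
                continue  # ignore words never seen in training
            # Laplace (add-one) smoothing avoids zero probabilities.
            count = word_counts[label][token] + 1
            score += math.log(count / (total_words + len(vocab)))
        log_scores[label] = score
    # Convert the two log scores into a normalised probability.
    top = max(log_scores.values())
    weights = {k: math.exp(v - top) for k, v in log_scores.items()}
    return weights[1] / (weights[0] + weights[1])


def should_warn(model, text, threshold=WARN_THRESHOLD):
    """Decide whether a pre-posting warning would be shown."""
    return flag_probability(model, text) >= threshold


if __name__ == "__main__":
    model = train(TRAINING)
    for comment in ("you worthless idiot", "thanks for the citation"):
        print(comment, round(flag_probability(model, comment), 3))
```

Raising the threshold trades missed abuse for fewer false warnings, which is the same tuning problem any production classifier of this kind would face at much larger scale.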

I would love to see the Foundation explore if this approach could be adapted to address the very similar problems in the Wikipedia community. The totality of revision-deleted and oversighted talk page posts in the English Wikipedia could provide an initial dataset, for example; like the League of Legends community, the Foundation could invite outside labs and academic institutes to help analyse this dataset.

There are considerable difficulties involved in building a system sophisticated enough to avoid unacceptable numbers of false positives, but this is a challenge familiar from ClueBot programming, and one the League of Legends team seems to have mastered: "Just classifying words was easy, but what about more advanced linguistics such as whether something was sarcastic or passive-aggressive? What about more positive concepts, like phrases that supported conflict resolution? To tackle the more challenging problems, we wanted to collaborate with world-class labs. We offered the chance to work on these datasets and solve these problems with us. Scientists leapt at the chance to make a difference and the breakthroughs followed. We began to better understand collaboration between strangers, how language evolves over time and the relationship between age and toxicity; surprisingly, there was no link between age and toxicity in online societies."

A successful project of this type could subsequently be offered to other Wikimedia projects as well. It would address a long-standing and much-discussed problem in the English Wikipedia, and put the Foundation at the leading edge of internet culture.

Andreas JN466 20:03, 14 November 2015 (UTC)

This card tracks a proposal from the 2015 Community Wishlist Survey:

This proposal received 6 support votes, and was ranked #83 out of 107 proposals.

Event Timeline

DannyH raised the priority of this task from to Needs Triage.
DannyH updated the task description.
DannyH moved this task to Wishlist 51-on on the Community-Wishlist-Survey-2015 board.
DannyH added a subscriber: DannyH.
DannyH set Security to None.
IMPORTANT: If you are a community developer interested in working on this task: The Wikimedia Hackathon 2016 (Jerusalem, March 31 - April 3) focuses on #Community-Wishlist-Survey projects. There is some budget for sponsoring volunteer developers. THE DEADLINE TO REQUEST TRAVEL SPONSORSHIP IS TODAY, JANUARY 21. Exceptions can be made for developers focusing on Community Wishlist projects until the end of Sunday 24, but not beyond. If you or someone you know is interested, please REGISTER NOW.

I'm foster-parenting this task, since it's come up in our team a few times lately. Having models to estimate social behaviors is complementary to the existing editquality model. The evaluation function will be quite different; for example, many citations in Talk space are not necessarily a good indicator of pro-social behavior.

Such a model could be used by social recommender systems, for example when prioritizing whom to auto-invite to the Teahouse (cf. @notconfusing and @Capt_Swing).

This would probably derive some metrics from [[en:User:Ewitch51]]'s pending research on civility.

Harej triaged this task as Medium priority. Apr 3 2019, 5:07 AM