
[Session] Anti abuse work on wikis
Closed, ResolvedPublic

Description

(Please set yourself as task assignee of this session)

  • Title of session: Anti abuse work on wikis
  • Session description: A workshop on anti abuse work on WMF wikis, giving detail on the current systems, recent work, and new tools being created.
  • Username for contact: @Dreamy_Jazz
  • Session duration (25 or 50 min): 50 mins
  • Session type (presentation, workshop, discussion, etc.): Workshop
  • Language of session (English, Arabic, etc.): English
  • Prerequisites (some Python, etc.): None
  • Any other details to share?:
  • Interested? Add your username below:
  • @Rina.sl

Notes from session:

Anti abuse work on the wikis

Date and time:

Relevant links

Survey - https://wikimedia.qualtrics.com/jfe/form/SV_emruGo866DIuxa6

Agenda

Brief overview of existing anti-abuse efforts across various teams at WMF (10 minutes)
[see slides linked above]
Incident Reporting System - can test this on Beta
Exercise: Identify common themes & areas of work
Exercise: Identify weak spots, places for improvement, project ideas
Review as a group
Hack on projects this weekend!

Goals

Know about different areas of work, why they’re happening
Understand how to contribute
Hear your feedback about where we should improve and what we’re doing well
Connect people

Presenter

  • William Brown - WBrown (WMF) / Dreamy Jazz
  • Kosta Harlan - KHarlan (WMF)

Participants

  • Magioladitis
  • SocialKnowledge

Notes

Temporary accounts

  • protect contributors in some environments
  • Q: potential for abuse?

Global User

MediaModeration

Blocking Improvements
IP info and IPoid

  • look up geolocation and ISP info
  • we're hiding IP addresses, but IP info is displayed for temp accounts

Incident reporting system

  • for users to report harmful content in appropriate place
  • currently in the MTP stage and likely to change; hopefully reaching production later this year

Agent Header

  • deprecated and replaced with User-Agent Client Hints
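Client Hints values arrive in structured headers such as `Sec-CH-UA`. As a rough illustration only (not how MediaWiki itself parses them, and a real parser should follow the Structured Fields spec), a naive parse might look like:

```python
import re

def parse_sec_ch_ua(header: str) -> dict:
    """Naively parse a Sec-CH-UA style header into {brand: version}.
    Illustrative sketch only; real parsing should use a Structured
    Fields (RFC 8941) parser."""
    return {brand: version
            for brand, version in re.findall(r'"([^"]+)";v="([^"]+)"', header)}

example = '"Chromium";v="124", "Google Chrome";v="124", "Not-A.Brand";v="99"'
print(parse_sec_ch_ua(example))
```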

Other things that are being worked on:

SRE - DDOS attack mitigation
upgrades to captcha
automoderator

automatic reversion of edits deemed to be bad by a machine learning model

abuse filter

support the temp account system

Any examples of anti-abuse projects that we should know about?

  • About half of the room works on an anti-abuse tool/product

what's interesting to you about this session?

  • Here to learn and listen - team works on these topics
  • Wikibase Cloud - relevant as this gets more popular
  • Admin on Greek Wikipedia & thesis on ethics/abuse on Wikipedia
  • From the Security team
  • Steward - cross-wiki anti abuse
  • Admin on enwiki, particularly interested in mobile patrolling
  • Steward - tried to work on abuse tools like AbuseFilter but it's hard

Big group exercise (5 minutes) -- what themes, categories, and problem areas do the projects we mentioned relate to?

Themes/Categories:

  • Vandalism
  • Scale
  • conflict of interest editors
  • ad hominem attacks
  • spam bots
  • cross-wiki abuse
  • harassment prevention / mitigation
  • detection
  • sock puppetry
  • improved mitigations
  • collateral damage (esp in particular countries)
  • mass creation of articles by a particular user (e.g. with LLMs)
  • vandalism - measuring how long a bad-faith edit survived before being reverted (just for fun/research purposes)
  • user privacy
  • copyright violations? content moderation?
  • Long term abuse by individuals

Small Group Exercise (20 minutes)
Discuss what projects, capabilities, critiques, and weaknesses are present in each area we identified on the previous slide.

  • Discuss one (or a group of) themes identified above.
  • groups: (red) vandalism - spam bots - sock puppetry
Content issues

Most obvious issues - anyone can address them
Long running discussion topic - lots of tools. ORES etc.
Machine Learning can support this, but not solve it completely.
Spambots - fighting against technical people who know how to circumvent the system.
Captcha helps with spambots. Works to some extent but not fully as expected.
For vandalism it can be easy - for example a blacklist of words. Lots of low-hanging fruit here.
The hard part is subtle vandalism - for example, the capital of a country is wrong. Someone has to manually check whether it is correct; AI can't cross-check information like this. Misinformation/disinformation is harder.
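A word-blacklist check of the kind mentioned above can be sketched in a few lines; the words and the tokenisation here are invented for illustration (real filters such as AbuseFilter support much richer conditions):

```python
# Illustrative word-blacklist check; the word list and matching rule
# are made up for this sketch, not taken from any real filter.
BLACKLIST = {"spamword", "slurword"}

def flags_edit(added_text: str) -> bool:
    """Return True if the added text contains a blacklisted word."""
    words = {w.strip(".,!?").lower() for w in added_text.split()}
    return bool(words & BLACKLIST)
```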
SWViewer for cross-wiki monitoring.
Sometimes abusers use scripts to make rapid abuse. AbuseFilters help but it's not always easy. Can be hard to find people who know how the software works.
Lots of volunteer/third-party tools. Can be hard to know what exists.
Bulgarian Wikipedia has an anti-vandalism bot. Turkish Wikipedia doesn't have one. There should be a centralised tool for this. Especially helpful for smaller Wikipedias.
DDoS - IP reputation. Can this be implemented in anti-vandalism/anti-spam tools to foresee abuse from users?
KH: We're starting an experiment in the next weeks to look at this data to see if it maps to bad content/accounts. Based on that, we might be able to make decisions based on the information. We would like to make this available to other tools. There are some Phab tasks for this, it would just need some work to be done.
AbuseFilter - how would it be helpful? We have an AbuseFilter for new account creating a profile page "I am ...". If we knew it was from a bad reputation IP, we could prevent them from taking the action.
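The IP-reputation gating idea above could be sketched like this; the scores, threshold, and lookup table are entirely hypothetical (a real version would query a reputation service such as IPoid rather than a local dict):

```python
# Hypothetical IP reputation gate. Scores are invented for illustration:
# higher = riskier. A real deployment would query a reputation service.
FAKE_REPUTATION = {
    "198.51.100.7": 0.9,   # example: known bad reputation
    "203.0.113.5": 0.1,    # example: looks fine
}

def allow_action(ip: str, threshold: float = 0.8) -> bool:
    """Allow the action unless the IP's risk score meets the threshold."""
    return FAKE_REPUTATION.get(ip, 0.0) < threshold
```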
Using AI for abuse detection. Falsified images, for example.

(green) ad hominem - protecting privacy - crosswiki abuse - long term abuse - sockpuppetry
Interpersonal issues
  • how do you identify bad actors? esp with temp accounts?
  • dont want to ban people that aren't doing vandalism
  • privacy violations - if we build a fingerprinting solution, where does that data go? Someone with access could create a patch to share the collected data (in violation of privacy)
    • analytically oriented?
    • device fingerprinting solution in particular - to collect device/user information -- specific info on the computer that identifies a particular person
      • esp. where individual people use a shared computer
    • how different would that be to sharing your same IP address?
    • more devices than humans in this room right now
      • public library - provide access to computers, people will log into their accounts on that computer and could be hacked
    • false positives?
  • usually not the people that are good people ;)
  • identification by some means (but not perfect)
  • abuse of wikis - bad behaviour by certain users - how do we find that?
  • we have two options - deny or block - is there a third option?
    • maybe allow suspicious editors to do some minor things but not all
      • this is kind of already there but not easy to use
    • need more documentation
      • are there large projects that we can work on?
    • can we block only certain aspects that make sense to block? to avoid false positives?
    • device fingerprinting
    • easier to maintain
    • abuse filters (lua)
      • are there large projects that we can work on at the Hackathon?
    • consequences (other than block edit)
  • what do we need to improve?
    • catalog of anti-abuse tools
    • have anti abuse histories
    • Reliance on individuals to remember things

(purple) scaled abuse - people testing defenses - collateral damage
Large scale/deliberate attacks on our projects.

Large project ideas
Anti-Abuse LLM
Early detection
Detecting early is important
Hackathon ideas

New capabilities
Ability to thank temp accounts
Where we need to improve
Make AbuseFilter easier to use, and better documented

Themes/summaries

  • AbuseFilter is a common thread - powerful tool
  • Improving global abuse filters?
  • A tool that could make an AbuseFilter based on a series of edits? Send to an LLM or something.

Survey! Take the survey! (QR code link in presentation)
https://wikimedia.qualtrics.com/jfe/form/SV_emruGo866DIuxa6

Questions

Photos

Social

Details

Other Assignee
kostajh

Event Timeline

KCVelaga_WMF updated the task description. (Show Details)
KCVelaga_WMF subscribed.

Hello! 👋 The 2024 Hackathon Program is now open for scheduling! If you are still interested in organizing a session, you can claim a slot on a first-come, first-serve basis by adding your session to the daily program, following these instructions. We look forward to hearing your presentation!

debt triaged this task as Medium priority.Apr 24 2024, 7:40 PM