(Please set yourself as task assignee of this session)
- Title of session: Anti-abuse work on wikis
- Session description: A workshop on anti-abuse work on WMF wikis, giving detail on the current systems, recent work, and new tools being created.
- Username for contact: @Dreamy_Jazz
- Session duration (25 or 50 min): 50 mins
- Session type (presentation, workshop, discussion, etc.): Workshop
- Language of session (English, Arabic, etc.): English
- Prerequisites (some Python, etc.): None
- Any other details to share?:
- Interested? Add your username below:
- @Rina.sl
Notes from session:
Anti-abuse work on the wikis
Date and time:
Relevant links
- Phabricator task: https://phabricator.wikimedia.org/T359343
- Session slides: https://docs.google.com/presentation/d/10dOnsGbF2C3InBUpbp9lCPRSv23XWA9Etd9DhFYrlJk/edit?usp=sharing
Survey - https://wikimedia.qualtrics.com/jfe/form/SV_emruGo866DIuxa6
Agenda
Brief overview of existing anti-abuse efforts across various teams at WMF (10 minutes)
[see slides linked above]
Incident Reporting System - can test this on Beta
Exercise: Identify common themes & areas of work
Exercise: Identify weak spots, places for improvement, project ideas
Review as a group
Hack on projects this weekend!
Goals
Know about different areas of work, why they’re happening
Understand how to contribute
Hear your feedback about where we should improve and what we’re doing well
Connect people
Presenter
- William Brown - WBrown (WMF) / Dreamy Jazz
- Kosta Harlan - KHarlan (WMF)
Participants
- Magioladitis
- SocialKnowledge
Notes
Temporary accounts
- protect contributors in some environments
- Q: potential for abuse?
Global User
MediaModeration
Blocking Improvements
IP info and IPoid
- look up geolocation and ISP info
- IP addresses are being hidden, but their info can still be displayed for temp accounts
Incident reporting system
- for users to report harmful content in appropriate place
- currently in MVP stage and likely to change - hopefully to production later this year
User-Agent header
- deprecated and replaced with User-Agent Client Hints
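For context, a sketch of how User-Agent Client Hints work (the header names are real; the values below are made up): the server opts in via `Accept-CH`, and the browser then sends the hints on subsequent requests instead of a full User-Agent string.

```http
# Server response header opting into extra (high-entropy) hints:
Accept-CH: Sec-CH-UA-Platform-Version, Sec-CH-UA-Model

# Low-entropy hints the browser sends by default on later requests:
Sec-CH-UA: "Chromium";v="124", "Not-A.Brand";v="99"
Sec-CH-UA-Mobile: ?0
Sec-CH-UA-Platform: "Linux"
```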
Other things that are being worked on:
- SRE: DDoS attack mitigation, upgrades to CAPTCHA
- AutoModerator: automatic reversion of edits deemed to be bad edits by a machine learning model
- AbuseFilter: support for the temporary account system
Any examples of anti-abuse projects that we should know about?
- About half of the room works on an anti-abuse tool/product
what's interesting to you about this session?
- Here to learn and listen - team works on these topics
- Wikibase Cloud - relevant as this gets more popular
- Admin on Greek Wikipedia & thesis on ethics/abuse on Wikipedia
- From the Security team
- Steward - cross-wiki anti-abuse work
- Admin on enwiki, particularly interested in mobile patrolling
- Steward - tried to work on abuse tools like AbuseFilter but it's hard
Big group exercise (5 minutes) -- what themes, categories, and problem areas do the projects we mentioned relate to?
Themes/Categories:
- Vandalism
- Scale
- conflict of interest editors
- ad hominem attacks
- spam bots
- cross-wiki abuse
- harassment prevention / mitigation
- detection
- sock puppetry
- improved mitigations
- collateral damage (esp in particular countries)
- mass creation of articles by a particular user (e.g. with LLMs)
- vandalism - measuring how long a bad-faith edit stayed in place before being reverted (just for fun/research purposes)
- user privacy
- copyright violations? content moderation?
- Long term abuse by individuals
Small Group Exercise (20 minutes)
Discuss what projects, capabilities, critiques, and weaknesses are present in each area identified in the previous slide.
- Discuss one (or a group of) themes identified above.
- Groups: (red) vandalism, spam bots, sock puppetry - Content issues
Most obvious issues - anyone can address them
Long running discussion topic - lots of tools. ORES etc.
Machine Learning can support this, but not solve it completely.
Spambots - fighting against technical people who know how to circumvent the system.
Captcha helps with spambots. Works to some extent but not fully as expected.
For vandalism it can be easy - for example a blacklist of words. Lots of low-hanging fruit here.
The hard part is subtle vandalism - for example, the capital of a country being wrong. Someone has to manually check whether it is correct; AI can't cross-check information like this. Misinformation/disinformation is harder.
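The "low-hanging fruit" case above can be sketched as a simple word-blocklist check (the words here are placeholders; real wikis maintain their own lists and regexes, e.g. via AbuseFilter):

```python
import re

# Hypothetical blocklist; the actual words are an assumption for illustration.
BLOCKLIST = {"badword1", "badword2"}

def is_obvious_vandalism(added_text: str) -> bool:
    """Flag an edit if any blocklisted word appears in the added text."""
    words = re.findall(r"\w+", added_text.lower())
    return any(word in BLOCKLIST for word in words)

print(is_obvious_vandalism("This page is badword1!"))            # True
print(is_obvious_vandalism("Paris is the capital of Germany."))  # False
```

Note that the second example (subtle factual vandalism) sails straight through, which is exactly the gap described above.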
SWViewer for cross-wiki monitoring.
Sometimes abusers use scripts to make rapid abuse. AbuseFilters help but it's not always easy. Can be hard to find people who know how the software works.
Lots of volunteer/third-party tools. Can be hard to know what exists.
Bulgarian Wikipedia has an anti-vandalism bot. Turkish Wikipedia doesn't have one. There should be a centralised tool for this. Especially helpful for smaller Wikipedias.
DDoS - IP reputation. Can this be implemented in anti-vandalism/anti-spam tools to foresee abuse from users?
KH: We're starting an experiment in the next weeks to look at this data to see if it maps to bad content/accounts. Based on that, we might be able to make decisions based on the information. We would like to make this available to other tools. There are some Phab tasks for this, it would just need some work to be done.
AbuseFilter - how would it be helpful? We have an AbuseFilter for new account creating a profile page "I am ...". If we knew it was from a bad reputation IP, we could prevent them from taking the action.
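A sketch of what such a rule could look like in AbuseFilter's rule language (the variables `page_namespace`, `user_age`, and `added_lines` and the operator `irlike` exist in AbuseFilter; the thresholds and pattern are made up, and an IP-reputation variable is not currently available to filters):

```
/* New account writing an "I am ..." user page (namespace 2) */
page_namespace == 2 &
user_age < 86400 &          /* account younger than one day */
added_lines irlike "^\s*I am "
```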
Using AI for abuse detection. Falsified images, for example.
(green) ad hominem, protecting privacy, cross-wiki abuse, long term abuse, sockpuppetry - Interpersonal issues
- how do you identify bad actors? esp with temp accounts?
- don't want to ban people who aren't doing vandalism
- privacy violations - if we build a fingerprinting solution, where does that data go? Someone could create a tool and then use it to share the data collected (in violation of privacy)
- analytically oriented?
- device fingerprinting solution in particular - captures device information, i.e. specific info about the computer that identifies a particular person
- especially where individual people use a shared computer
- how different would that be to sharing your same IP address?
- more devices than humans are in this room right now
- public library - provide access to computers, people will log into their accounts on that computer and could be hacked
- false positives?
- usually not the people that are good people ;)
- identification by some means (but not perfect)
- abuse of wikis - bad behaviour by certain users - how do we find that?
- we have two options - deny or block - is there a third option?
- maybe, for minor issues, allow suspicious editors to do some things but not all
- this is kind of already there but not easy to use
- need more documentation
- are there large projects that we can work on?
- can we block only certain aspects that make sense to block? to avoid false positives?
- device fingerprinting
- easier to maintain
- abuse filters (lua)
- are there large projects that we can work on at the Hackathon?
- consequences (other than block edit)
- maybe, for minor issues, allow suspicious editors to do some things but not all
- what do we need to improve?
- catalog of anti-abuse tools
- have anti abuse histories
- Reliance on individuals to remember things
(purple) scaled abuse, people testing defenses, collateral damage - Large-scale/deliberate attacks on our projects
Large project ideas
Anti-Abuse LLM
Early detection
Detecting early is important
Hackathon ideas
New capabilities
Ability to thank temporary accounts
Where we need to improve
Make AbuseFilter easier to use, and better documented
Themes/summaries
- AbuseFilter is a common thread - powerful tool
- Improving global abuse filters?
- A tool that could make an AbuseFilter based on a series of edits? Send to an LLM or something.
Survey! Take the survey! (QR code link in presentation)
https://wikimedia.qualtrics.com/jfe/form/SV_emruGo866DIuxa6