
Algorithmic dangers and transparency -- Best practices
Closed, ResolvedPublic

Authored By
Oct 12 2016, 2:12 PM


Type of activity: Scheduled session
Main topic: T147708: Facilitate Wikidev'17 main topic "Artificial Intelligence to build and navigate content"
Timing: Tuesday, January 10th at 1:10PM PST
Location: Room 2
Stream link:
Back channel: #wikimedia-ai

How do we make sure that our filtering and ranking algorithms do not perpetuate biases or cause other types of social problems? What aspects of AIs should we make transparent, and what are some good strategies for doing so? In this session, we'll develop a call to action and gather resources for a best practices document.

Problem statement

There's no best practices document for not causing problems with your algorithm. What are common problems we can cause? What are users' expectations?

Expected outcome

A document containing prescriptions for transparency around new AI projects. The beginning of a set of guidelines and best practices.

Summary of discussion

There's clear interest, but it seems like we'll probably want a brief summary of the critical algorithms literature as part of a session. We could probably compress a useful overview into less than 10 minutes so that it doesn't dominate the discussion.

Concerns were raised with regard to ORES (@Halfak) and ElasticSearch (@EBernhardson). @Tbayer has been reading some of the recent literature. Generally, interest has been signaled (via tokens and subscriptions) by @Aklapper, @jmatazzoni, @Lydia_Pintscher, @Capt_Swing, @Arlolra, @gpaumier, and @Siznax.

(Updated Nov. 21st, 2016)


Event Timeline

@Mooeypoo has expressed concerns over tracking algorithmic bias around the Edit-Review-Improvements project. I think that @jmatazzoni and @Pginer-WMF should consider discussing this angle of algorithmic work in relation to ERI since it's a serious user-facing concern.

When I presented on my preliminary results regarding anonymous editor bias in damage detection, @Tbayer raised some good counter-points. I wonder if he'd want to bring that perspective to the dev summit.

I've been working on this idea around T148700: JADE: UI/API for reviewing/refuting how ORES classifies you and your stuff

From a blog post I wrote about the idea:

So, I was listening to an NPR show titled "Digging into Facebook's File on You". At some point, there was some casual discussion of laws that some countries in the European Union have re. users' ability to review and correct mistakes in data that is stored about them. This made me realize that ORES needs a good mechanism for you to review how the system classifies you and your stuff.


This is certainly of concern to me in search. I'm starting to evaluate ways to build ML re-ranking systems using user click-through data as the input, and the ability of users to 'poison the well', so to speak, will certainly become an issue to deal with. I'd love to talk about how others are dealing with the issue of training with data from user behaviour.


When I presented on my preliminary results regarding anonymous editor bias in damage detection, @Tbayer raised some good counter-points. I wonder if he'd want to bring that perspective to the dev summit.

I will be at the summit and am happy to participate in a session regarding this topic. (It's not my core work area currently, but I have been interested in it both as a volunteer editor who does quite a bit of patrolling on Wikipedia and Wikidata, now with the help of ORES, and as a Wikipedia research topic in general. I also wrote a review of a related paper recently.)
In any case, I agree it's a worthwhile topic for a session at the summit.


This has been a hot topic in the media recently (e.g. Cathy O'Neil's book Weapons of Math Destruction has gotten a lot of attention from the press). I don't know a lot about it, but I would really enjoy the chance to learn more.

To the owner of this session: Here is the link to the session guidelines page: We encourage you to recruit note-takers (minimum 2, maximum 3), a Remote Moderator, and an Advocate (optional) on the spot before the beginning of your session. Instructions about each role player's task are outlined in the guidelines. The physical version of the role cards will be made available in all the session rooms. Good luck prepping, see you at the summit! :)


I've added the link for the YouTube stream. See you all in person or on IRC tomorrow morning.

Note-taker(s) of this session: Follow the instructions here: After the session, DO NOT FORGET to copy the relevant notes and summary into a new wiki page following the template here: and also link this from the All Session Notes page: The EtherPad links are also now linked from the Schedule page for you!