
Analyze the extent of the bias of damage detection models against anons
Closed, Resolved · Public



Extend this work by exploring the extent of the bias that ORES learns.

Event Timeline

@Halfak I don't understand whether the change described in your blog post was ever implemented. There's a change which seems preparatory, but it was then reverted, and the damaging models still have the is_anon feature.

awight removed Avner as the assignee of this task. Oct 24 2018, 5:11 PM
awight added a subscriber: Avner.

Right. So there's an analysis in the linked paper that shows the bias against anons and how we mitigated some of the issue by switching to a new modeling strategy. It's not clear whether further work is necessary, but I think we can consider this task to be done.

@Halfak Thanks for the pointer to ... a paper my name is on. I see what you're talking about, in section 7.4. Switching from SVM to gradient boosting apparently made a big improvement, but it hasn't made the problem go away. Do you think there's any value in continuing this investigation, for example quantifying how much our algorithm relies on is_anon and how a model would perform if trained without that feature?
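
For reference, one rough way to quantify that would be a simple ablation experiment: train the same kind of gradient boosting model with and without is_anon, then compare overall fitness against the false-positive rate on good-faith anon edits. This is only a minimal sketch with scikit-learn and synthetic placeholder features (is_anon, chars_added, badwords), not the actual ORES/revscoring training pipeline.

```python
# Sketch of an is_anon ablation experiment (synthetic data, not ORES code).
import numpy as np
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
n = 5000

# Placeholder edit features; a real run would use the revscoring feature lists.
is_anon = rng.integers(0, 2, n)
chars_added = rng.normal(50, 20, n)
badwords = rng.poisson(0.3, n) + is_anon * rng.poisson(0.2, n)
# Synthetic "damaging" label, driven by badwords and thus weakly by is_anon.
damaging = (badwords + rng.normal(0, 0.5, n) > 1).astype(int)

X_full = np.column_stack([is_anon, chars_added, badwords])
X_no_anon = X_full[:, 1:]  # same data with the is_anon column dropped

for name, X in [("with is_anon", X_full), ("without is_anon", X_no_anon)]:
    X_tr, X_te, y_tr, y_te, _, anon_te = train_test_split(
        X, damaging, is_anon, test_size=0.25, random_state=0)
    clf = GradientBoostingClassifier(random_state=0).fit(X_tr, y_tr)
    scores = clf.predict_proba(X_te)[:, 1]
    preds = scores > 0.5
    # Overall fitness vs. false-positive rate on non-damaging anon edits.
    fpr_anon = preds[(y_te == 0) & (anon_te == 1)].mean()
    print(f"{name}: AUC={roc_auc_score(y_te, scores):.3f}, "
          f"anon false-positive rate={fpr_anon:.3f}")
```

The interesting comparison is how much AUC drops versus how much the anon false-positive rate drops when the feature is removed; on the real feature set that gap is what a community consultation would have to weigh.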

Right. So I think we might file a new task for that work. I think we will need a community consultation of some sort to make a decision about the inclusion of is_anon and seconds_since_registration, since there will be a tradeoff in model fitness. I've been talking to some researchers about potentially picking that task up; they want to study the process of intersecting algorithmic parameters with people's values. Ping @Bobo.03 :)