
Increased visibility in wiki-replicas for volunteers fighting vandals
Open, Needs Triage, Public

Description

Summary
The Security-Team recently completed an audit of the configuration file maintain-views.yaml, in order to explore whether wiki-replicas pose some privacy risks for the contributors supporting Wikimedia projects. As part of the conclusions, it is recommended that details about vandal fighters be redacted from wiki-replicas logs, as also raised in T241667.

Broader context
Abuse filter logs are somewhat public, depending on the project configuration. The issue at stake here is that the logs create increased visibility for volunteers doing anti-vandalism work, making them potential targets of harassment. While this is more a safety issue than a privacy concern, it is a risk that stems from the data released in wiki-replicas. As such, it cannot be overlooked.

Below is a list of the last 100 anti-vandalism actions on En.WP using Abuse Filter. The volunteers behind these actions are highlighted through this query.

SELECT afh_user_text, afh_timestamp, afh_public_comments, afh_actions
FROM abuse_filter_history
ORDER BY afh_timestamp DESC
LIMIT 100;

Event Timeline

Restricted Application added a subscriber: Aklapper.
sguebo_WMF added a parent task: Restricted Task. Aug 6 2021, 5:49 PM
sguebo_WMF added a parent task: Restricted Task. Aug 6 2021, 5:54 PM
sguebo_WMF moved this task from In Progress to Completed on the Privacy Engineering board.

@sguebo_WMF Is this data visible on the wikis?

@nskaggs do you know if this data is in use by any of the tools?

Is this data visible on the wikis?

All AF modifications are publicly listed in Special:Log/abusefilter (from logging).

More details are in Special:AbuseFilter/history (from abuse_filter_history). This only lists public filters unless you have sufficient rights for private ones.

As stated, I recommend declining this since the users are publicly available on wiki and via the API.

afh_flags, afh_deleted, afh_changed_fields, afh_group could be redacted for hidden filters, but that doesn't help with making the users less visible.

@odimitrijevic I don't. If you want to explore some sampled queries you can look at the research undertaken during the upgrade: https://wikitech.wikimedia.org/wiki/News/Wikireplicas_2020_Redesign/Wiki_Replicas_Cross-DB_Query_Data

@sguebo_WMF Is this data visible on the wikis?

@odimitrijevic -- yes, it is. My take on this ticket is that, since it's tangentially related to T241667, we should either stall it until a decision has been made on that other ticket or close it if we think it's a duplicate.

Is this data visible on the wikis?

All AF modifications are publicly listed in Special:Log/abusefilter (from logging).

More details are in Special:AbuseFilter/history (from abuse_filter_history). This only lists public filters unless you have sufficient rights for private ones.

As stated, I recommend declining this since the users are publicly available on wiki and via the API.

Concur.

afh_flags, afh_deleted, afh_changed_fields, afh_group could be redacted for hidden filters, but that doesn't help with making the users less visible.

afh_flags and afh_deleted are currently visible in the Status column of https://en.wikipedia.org/wiki/Special:AbuseFilter. afh_changed_fields isn't visible for private filters and could be redacted, but doesn't contain much in the first place. afh_group is only ever default unless you're on a Flow wiki, so it could be redacted without losing anything (but is also not worth redacting).
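To make the redaction idea concrete, here is a minimal sketch in Python using an in-memory SQLite table. It is not the actual wiki-replicas view definition. The table is a toy stand-in, the view name `abuse_filter_history_redacted` is hypothetical, and the code assumes that `afh_flags` is a comma-separated string containing `hidden` for private filters and that `afh_filter` holds the filter ID.

```python
import sqlite3

# Toy stand-in for abuse_filter_history. Only a few columns are modelled,
# and the real replica schema may differ (assumption).
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE abuse_filter_history (
    afh_filter INTEGER,
    afh_user_text TEXT,
    afh_timestamp TEXT,
    afh_flags TEXT,
    afh_changed_fields TEXT
);
INSERT INTO abuse_filter_history VALUES
    (1, 'ExampleUserA', '20210801000000', 'enabled', 'af_pattern'),
    (2, 'ExampleUserB', '20210802000000', 'enabled,hidden', 'af_pattern,af_comments');

-- Hypothetical redacted view: blank afh_changed_fields for rows flagged hidden.
-- Wrapping the flags in commas makes the match exact on the 'hidden' flag.
CREATE VIEW abuse_filter_history_redacted AS
SELECT
    afh_filter,
    afh_user_text,
    afh_timestamp,
    afh_flags,
    CASE
        WHEN ',' || afh_flags || ',' LIKE '%,hidden,%' THEN NULL
        ELSE afh_changed_fields
    END AS afh_changed_fields
FROM abuse_filter_history;
""")

rows = conn.execute(
    "SELECT afh_filter, afh_user_text, afh_changed_fields "
    "FROM abuse_filter_history_redacted ORDER BY afh_filter"
).fetchall()
for row in rows:
    print(row)
```

Even with the column blanked for the hidden filter, `afh_user_text` is still returned for every row, which matches the point made above: redacting these fields does not make the users less visible.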


If people are looking to harass anti-vandalism editors, I'm pretty sure they're not going to look in a mostly-obscure database table to find targets. They're going to harass the people who block and revert them, and maybe the people already listed in Special:AbuseFilter. What's next, hiding the blocking administrator in the block log?

If people are looking to harass anti-vandalism editors, I'm pretty sure they're not going to look in a mostly-obscure database table to find targets.

No, you're wrong and too optimistic. The vandals we are targeting, especially LTAs (long-term abusers), are very stubborn and make persistent efforts to evade our attempts to stop them.
They will use anything they can, even if it is obscure and requires professional skills to exploit.
I don't understand why they do it, but either way we need effective countermeasures against them.

Controlling visibility is one of the most important methods.
We cannot make good decisions without knowing what our enemy does and does not know.

However, excessive restriction of visibility harms our regular activity. Keeping openness and transparency as broad as possible is also very important, since our wiki projects are open.

Accordingly, as the Security-Team says in the description, we have drawn the boundary of the restriction with careful consideration of that balance.

Abuse filter logs are somewhat public, depending on the project configuration.

Each project needs to carefully tailor its visibility settings to its own situation (e.g. goals, contributors' skill, readers' level, language and regional culture).
AbuseFilter's logs and related histories are especially critical because AbuseFilter is the most effective and unique tool for anti-vandalism.

Inconsistency between on-wiki visibility and the DB replica is not acceptable to anti-vandalism volunteers. Please don't discourage them.
You have to realize that you're benefiting our enemies.

If you can't strictly control the publishing scope now, you should choose a safer, more limited scope instead of crude settings until a better modification of the functions is developed and rolled out.

In conclusion, the best setting you can apply immediately is to temporarily exclude all data related to abusefilter (e.g. abusefilter, abuse_filter, abuse_filter_log, and abuse_filter_history) from the replica DB,
and to create a Phabricator ticket to establish a method of controlling visibility that matches each project's settings, then re-enable the exports if you need them.
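As a rough illustration of the temporary exclusion proposed here, the sketch below filters a list of exposed table names in Python. The list `exposed` and the helper `exclude_abusefilter` are hypothetical and do not reflect the real contents or format of maintain-views.yaml.

```python
# Hypothetical list of tables exposed by a replica (assumption, not the
# real maintain-views.yaml contents).
exposed = [
    "abuse_filter",
    "abuse_filter_log",
    "abuse_filter_history",
    "logging",
    "page",
    "revision",
]


def exclude_abusefilter(tables):
    """Return the tables whose names do not start with 'abuse_filter'."""
    return [t for t in tables if not t.startswith("abuse_filter")]


remaining = exclude_abusefilter(exposed)
print(remaining)
```

Note that a table-level exclusion like this would not cover the `abusefilter` entries that, per the earlier comment, come from `logging`. Those would need a row-level filter instead, which this sketch does not attempt.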