Page MenuHomePhabricator

Investigate impact of IP Masking on PageTriage
Closed, ResolvedPublic

Description

We need to understand what changes, if any, are required in PageTriage given the IP Masking changes which are coming.

The scope of this investigation should be to understand what technical changes are needed and create tasks for each unit of work.

Findings

PageTriage sets preferences for temporary accounts

Mitigation

New page creation + Draft creation

  • New page creation
    • Currently, only registered users can create new pages on enwiki (the only production site using the extension that I am aware of), so I don't believe ip masking will have an meaningful immediate impact on the new pages feed queue.
  • Draft creation
    • Anonymous users can currently create drafts that populate the afc queue on enwiki (verified by scrolling through the afc queue on the new pages feed). Articles that currently show IP addresses as the creator would instead show temporary user accounts. Drafts from single IP address may spawn multiple temporary accounts over time.

Mitigation
Impacts to draft not look like an immediate problem, so no mitigation is required at this time. After IP masking rolls out, the community will asses the impact to their processes and may request implementation of a "reaveal ip" workflow in the PageTriage extension.

Event Timeline

backbone.js article model (used to present the toolbar and the new pages feed)

direct checks for isAnon()
https://gerrit.wikimedia.org/r/plugins/gitiles/mediawiki/extensions/PageTriage/+/refs/heads/master/modules/ext.pageTriage.util/models/ext.pageTriage.article.js

Impact
Temporary accounts would save a userjs preference to persist new pages feed filter settings. This would not break anything, but I believe it would be of limited benefit.

Mitigation
Since the purpose of the toolbar and the feed is to serve new page patrollers, I believe we could limit these to "named" users, which are registered, non-temporary accounts. isAnon() could be replaced with !isNamed()

onArticleViewFooter hook (used to present the toolbar)

direct check for isRegistered()
https://gerrit.wikimedia.org/r/plugins/gitiles/mediawiki/extensions/PageTriage/+/refs/heads/master/includes/Hooks.php

Impact
Temporary accounts would make it past an early return check in the hook implementation. This would not break anything, but we lose the benefit of the optimization.

Mitigation
Since the purpose of the toolbar is to serve new page patrollers, I believe we could limit this to "named" users, which are registered, non-temporary accounts. isRegsitered() could be replaced with isNamed()

New page creation

Impact
Currently, only registered users can create new pages on enwiki (the only production site using the extension that I am aware of), so I don't believe ip masking will have an meaningful immediate impact on the new pages feed queue.

Draft creation

Impact
Anonymous users can currently create drafts that populate the afc queue on enwiki (verified by scrolling through the afc queue on the new pages feed). Articles that currently show IP addresses as the creator would instead show temporary user accounts. Drafts from single IP address may spawn multiple temporary accounts over time. I do not know if this is a meaningful impact or not.

backbone.js article model (used to present the toolbar and the new pages feed)

direct checks for isAnon()
https://gerrit.wikimedia.org/r/plugins/gitiles/mediawiki/extensions/PageTriage/+/refs/heads/master/modules/ext.pageTriage.util/models/ext.pageTriage.article.js

Impact
Temporary accounts would save a userjs preference to persist new pages feed filter settings. This would not break anything, but I believe it would be of limited benefit.

Mitigation
Since the purpose of the toolbar and the feed is to serve new page patrollers, I believe we could limit these to "named" users, which are registered, non-temporary accounts. isAnon() could be replaced with !isNamed()

onArticleViewFooter hook (used to present the toolbar)

direct check for isRegistered()
https://gerrit.wikimedia.org/r/plugins/gitiles/mediawiki/extensions/PageTriage/+/refs/heads/master/includes/Hooks.php

Impact
Temporary accounts would make it past an early return check in the hook implementation. This would not break anything, but we lose the benefit of the optimization.

Mitigation
Since the purpose of the toolbar is to serve new page patrollers, I believe we could limit this to "named" users, which are registered, non-temporary accounts. isRegsitered() could be replaced with isNamed()

New page creation

Impact
Currently, only registered users can create new pages on enwiki (the only production site using the extension that I am aware of), so I don't believe ip masking will have an meaningful immediate impact on the new pages feed queue.

Draft creation

Impact
Anonymous users can currently create drafts that populate the afc queue on enwiki (verified by scrolling through the afc queue on the new pages feed). Articles that currently show IP addresses as the creator would instead show temporary user accounts. Drafts from single IP address may spawn multiple temporary accounts over time. I do not know if this is a meaningful impact or not.

@Novem_Linguae I'd love to get your input on these draft findings, especially the impact/mitigation parts.

Thanks for taking a look at this.

isAnon() could be replaced with !isNamed()

I suggest leaving the original code for this, instead of changing it. Special:NewPagesFeed is a public page that can be used by any editor. Allowing temp accounts to have their Special:NewPagesFeed filters be remembered between sessions seems harmless and low cost. Kind of like how we're giving them the ability to ping and receive pings. Open to other opinions though.

Noting that there are two isAnon() statements in this file. Don't forget both if we write a patch.

isRegsitered() could be replaced with isNamed()

+1. There's no circumstance where a temp account would have the "patroller" permission, so they would never need to display the Page Curation toolbar. Looks good to change.


Just now I did a codebase search for mw.util.isIPAddress(), isIPv4Address(), isIPv6Address(), mw.isAnon(), mw.util.isTemporaryUser(), isTemp(), isNamed() and didn't find anything additional.

Overall, it looks like IP masking's impact on PageTriage is pretty low. Anything else can probably be addressed through bug reports.

Thanks for taking a look at this.

isAnon() could be replaced with !isNamed()

I suggest leaving the original code for this, instead of changing it. Special:NewPagesFeed is a public page that can be used by any editor. Allowing temp accounts to have their Special:NewPagesFeed filters be remembered between sessions seems harmless and low cost. Kind of like how we're giving them the ability to ping and receive pings. Open to other opinions though.

Thanks @jsn.sherman and @Novem_Linguae. I have a question - are the filters remembered through preferences? i.e. if we let temp users save filters is that going to impact the database tables?

Looks like the save code is...

new mw.Api().saveOption( 'userjs-NewPagesFeedFilterOptions', this.encodeFilterParams() );

The documentation for mw.Api().saveOption() states that it uses https://www.mediawiki.org/wiki/API:Options.

The code for that eventually calls User->saveSettings(), which does touch the database, probably the user_properties table via UserOptionsManager-saveOptionsInternal().

Interesting. I would have guessed that PageTraige used browser localStorage, but I guess not. Using the SQL database makes the sessions per user rather than per browser, which I guess is good.


Anyway, what are your thoughts Niharika and Jason?

  • Are temporary accounts able to access Special:Preferences and set preferences? If not, then we probably shouldn't let them set this.
  • Is there a concern that using preferences for temporary accounts will increase the size of the database too much?

Looks like the save code is...

new mw.Api().saveOption( 'userjs-NewPagesFeedFilterOptions', this.encodeFilterParams() );

The documentation for mw.Api().saveOption() states that it uses https://www.mediawiki.org/wiki/API:Options.

The code for that eventually calls User->saveSettings(), which does touch the database, probably the user_properties table via UserOptionsManager-saveOptionsInternal().

Interesting. I would have guessed that PageTraige used browser localStorage, but I guess not. Using the SQL database makes the sessions per user rather than per browser, which I guess is good.

Thanks for looking this up!

Anyway, what are your thoughts Niharika and Jason?

  • Are temporary accounts able to access Special:Preferences and set preferences? If not, then we probably shouldn't let them set this.

Temporary accounts are not able to access Special:Preferences.

  • Is there a concern that using preferences for temporary accounts will increase the size of the database too much?

I believe that is a concern the SRE folks have expressed.

My suggestion is that it would be better to not let them save preferences for the filters. It's not like they can do any patrolling anyway.
To clarify - they would be able to set filters but they won't be saved between sessions, right? I think that is acceptable unless you disagree.

Yes, I see localStorage in a code search of the repo, so it is probably also saving to the browser.

const stored = JSON.parse( localStorage.getItem( 'ext.pageTriage.settings' ) );

If temporary accounts can't access Special:Preferences, I wonder if mw.Api().saveOption() would even work. Might return a permissions error.

Sounds good. Let's go with Jason's original plan then. "isAnon() could be replaced with !isNamed()". Thanks all.

@Niharika Yeah, my impact statements were flawed; for some reason, I had it in my head that global preferences were unavailable but local preferences are available. This is of course incorrect! I believe my proposed mitigations are still valid though.

@Novem_Linguae The localstorage calls are only in the new vue interface, and those are actually going away in more recent patches.

However, In the new interface, I could easily add localstorage as a backup persistence method for anonymous users on the new pages feed. I would recommend not adding that feature to the backbone new pages feed since it is going away, but rather just limit it to named users as I already suggested.

Impact
Currently, only registered users can create new pages on enwiki (the only production site using the extension that I am aware of), so I don't believe ip masking will have an meaningful immediate impact on the new pages feed queue.

Draft creation

Impact
Anonymous users can currently create drafts that populate the afc queue on enwiki (verified by scrolling through the afc queue on the new pages feed). Articles that currently show IP addresses as the creator would instead show temporary user accounts. Drafts from single IP address may spawn multiple temporary accounts over time. I do not know if this is a meaningful impact or not.

I'm personally somewhat more concerned about this, it would be great to look into building some kind of integration between the new IP address tool (the popup beside each IP address that is provided by the (ipinfo-view-basic) permission) and PageTriage. I'm unsure if just having the temporary account name being visible will be enough information for patrollers?

Also, we might want to consider building tools that surface some (ipinfo-view-full) information in PageTriage (ex: This IP address is in the same range as this other IP who recently also tried to create the same page a month ago etc) for administrators.

[...]

I'm unsure if just having the temporary account name being visible will be enough information for patrollers?

Genuine question: how do we determine that?

Additional info: the process for gaining access to IP addresses is laid out here:
https://foundation.wikimedia.org/wiki/Policy:Access_to_temporary_account_IP_addresses#Minimum_requirements_for_access

[...]

Also, we might want to consider building tools that surface some (ipinfo-view-full) information in PageTriage (ex: This IP address is in the same range as this other IP who recently also tried to create the same page a month ago etc) for administrators.

If I understand things correctly, I think we would need to implement the "reveal ip" button/workflow, which would then surface the ip info workflow. I'm not sure all legal and technical processes we would need to go through to implement that, but I did find this:
https://meta.wikimedia.org/wiki/IP_Editing:_Privacy_Enhancement_and_Abuse_Mitigation#How_will_the_IP_address_reveal_functionality_work?

Change 952279 had a related patch set uploaded (by Jsn.sherman; author: Jsn.sherman):

[mediawiki/extensions/PageTriage@master] new ui persists settings the same way as old ui

https://gerrit.wikimedia.org/r/952279

[...]

I'm unsure if just having the temporary account name being visible will be enough information for patrollers?

Genuine question: how do we determine that?

Additional info: the process for gaining access to IP addresses is laid out here:
https://foundation.wikimedia.org/wiki/Policy:Access_to_temporary_account_IP_addresses#Minimum_requirements_for_access

[...]

Also, we might want to consider building tools that surface some (ipinfo-view-full) information in PageTriage (ex: This IP address is in the same range as this other IP who recently also tried to create the same page a month ago etc) for administrators.

If I understand things correctly, I think we would need to implement the "reveal ip" button/workflow, which would then surface the ip info workflow. I'm not sure all legal and technical processes we would need to go through to implement that, but I did find this:
https://meta.wikimedia.org/wiki/IP_Editing:_Privacy_Enhancement_and_Abuse_Mitigation#How_will_the_IP_address_reveal_functionality_work?

We are building that into MediaWiki so maybe it's not a big lift to do this in PageTriage. I'm out of my depth here though. @Tchanders will you be able to speak to this? And also, is it possible to enable IP Info within PageTriage?

jsn.sherman moved this task from Code review to Done on the Moderator-Tools-Team (Kanban) board.

Marking as resolved since the investigation is complete.

Change 952279 merged by jenkins-bot:

[mediawiki/extensions/PageTriage@master] new ui stores settings the same way as old ui

https://gerrit.wikimedia.org/r/952279