
[TECH][IPM] Investigate ramifications of IP masking on Wikidata related extensions
Closed, Resolved | Public | 1 Estimated Story Points

Description

In T326908 (Update WMDE Engineering-owned products that may be affected by IP Masking), it was identified that upcoming work on IP masking might affect Wikidata-related extensions, in particular:

  • EntitySchema
  • Wikibase
  • WikibaseLexeme
  • WikibaseQualityConstraints

The rollout of this feature appears to be scheduled for October 2023 at the earliest (and it is unlikely to reach Wikidata on that date), but this is subject to change. Additionally, this list of extensions might not be exhaustive, and we should watch out for any additional extensions that might be impacted by IP Masking.

For more information about the IP Masking project, please see: https://meta.wikimedia.org/wiki/IP_Editing:_Privacy_Enhancement_and_Abuse_Mitigation

In this Investigation, we would like the following questions to be addressed:

Product Related

  • How do these features affect change summaries and change dispatching to client wikis?
  • How do they affect changes made from client wikis when adding sitelinks?
  • What implications does IP masking have for editing workflows that involve Wikidata Bridge?

Technical

  • What changes are there to the stable (or perhaps even unstable) MediaWiki interfaces that we use?
  • Which code points in the extensions listed above would be affected?

Event Timeline

Task Prio Notes:

  • Does not affect end users / production (the investigation results do not directly affect users, only the engineers who will have to implement any compatibility fixes)
  • Does affect development efforts
  • Does not affect onboarding efforts
  • Affects additional stakeholders (as this feature will be prepared and deployed by other teams in the movement)
ItamarWMDE renamed this task from Investigate ramifications of IP masking on Wikidata related extensions to [SW] Investigate ramifications of IP masking on Wikidata related extensions. Feb 15 2023, 8:57 AM

Story Writing Notes:

  • Add explanation on IP masking from metawiki.
ItamarWMDE renamed this task from [SW] Investigate ramifications of IP masking on Wikidata related extensions to [IPM] Investigate ramifications of IP masking on Wikidata related extensions. Mar 14 2023, 1:34 PM
ItamarWMDE updated the task description.

Some thoughts I had during the wmhack session (T332079):

Two things that directly come to mind where this might cause trouble are the link item feature (linking Wikidata items from the client) and Wikibase client's bridge. Link item is supposed to be accessible only to logged-in users; both tools will need to "log in" on Wikidata (both use mediawiki.ForeignApi).

An especially interesting case here: the user has no (temporary) account yet, uses one of the tools (shall we allow that?)… and gets a temporary account on Wikidata (does that work, and will the account login propagate to the client wiki?).

I am monitoring this task as I would like to understand the consequences of IP Masking for analytics.

ItamarWMDE renamed this task from [IPM] Investigate ramifications of IP masking on Wikidata related extensions to [TECH][IPM] Investigate ramifications of IP masking on Wikidata related extensions. Aug 2 2023, 2:16 PM

Timebox 16 hrs broken into

  • 8 hr - Investigate places where Wikibase distinguishes between anonymous and registered users
  • 8 hr - Investigate behavior of temporary accounts with cross-wiki API actions
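
As a rough illustration of the first timebox item: once temporary accounts exist, code that distinguishes anonymous from registered users has to handle a third case. The sketch below is Python rather than MediaWiki PHP, and both the `TEMP_PREFIX` marker and the `classify_user` helper are hypothetical; the real naming pattern for temporary accounts comes from MediaWiki configuration and is not stated in this task.

```python
import ipaddress
from enum import Enum

# Hypothetical marker for temporary account names. The real pattern is
# defined by MediaWiki configuration and is not specified in this task.
TEMP_PREFIX = "~"


class UserKind(Enum):
    ANONYMOUS = "anonymous"    # identified publicly by IP address
    TEMPORARY = "temporary"    # cookie-based temporary account
    REGISTERED = "registered"  # regular named account


def classify_user(name: str) -> UserKind:
    """Classify a user name into one of three kinds (illustrative only)."""
    try:
        # Anonymous editors are traditionally named after their IP address.
        ipaddress.ip_address(name)
        return UserKind.ANONYMOUS
    except ValueError:
        pass
    if name.startswith(TEMP_PREFIX):
        return UserKind.TEMPORARY
    return UserKind.REGISTERED
```

Any code path that today only branches on "is this name an IP address?" would fall through to the registered case for temporary accounts, which is the kind of place the investigation is meant to find.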

See the child-tasks T343799 and T343800 for the results. Please take a look and let us know if you're missing something or need the information presented differently or anything.

Also, the upstream implementations seem to still see very active development, so it might make sense to wait for them to settle somewhat and not rush our development:

(this task should probably move on to tech-verification after product is happy)

Thank you! Looking good. Moving this to tech verification.

Results from the subtasks copied with slightly adjusted language to: https://wikitech.wikimedia.org/wiki/WMDE/Wikidata/Reports/2023/2023-09-18_Ramifications_of_IP_masking_on_Wikidata_related_extensions

@Lucas_Werkmeister_WMDE please let me know what you think of the overall report and my adaptation of your parts in particular.

FTR: That writeup took another two hours.

Looks great to me, thank you! (I’ve made a few small edits to the page.)

If I understand IP masking correctly, the goal is to get rid of IPs as public identifiers. Instead we will use cookie-based "temporary accounts" as public identifiers. These are independent of the IP. I have two questions about this:

  1. This seems feasible for browser-based editing. But what about API-based editing? Can non-logged-in users currently edit using our APIs? (I hope not, as it does not sound like a good idea. xD) How will/should this work in the future?
  2. For Wikidata Analytics I got the impression that we will still store IPs internally, so nothing should change from an analytics perspective. Is that correct?

If I understand IP masking correctly, the goal is to get rid of IPs as public identifiers. Instead we will use cookie-based "temporary accounts" as public identifiers. These are independent of the IP. I have two questions about this:

  1. This seems feasible for browser-based editing. But what about API-based editing? Can non-logged-in users currently edit using our APIs? (I hope not, as it does not sound like a good idea. xD) How will/should this work in the future?

Yes, they can. Edits made from entity pages are made via API calls from JavaScript, and that includes edits from users who are not logged in.
The details of how that will work in the future still need to be figured out. I guess the API response will include some info about the created account?
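
To make this concrete, here is a minimal sketch of the parameters a not-logged-in client might send to the action API when setting a label. It only builds the request body and sends nothing. `wbsetlabel` and the `+\` anonymous CSRF token are existing Wikibase/MediaWiki conventions; the helper name `build_anonymous_label_edit` and the example values are illustrative assumptions, and how temporary account creation will show up in the response is not settled in this task.

```python
from urllib.parse import urlencode

# MediaWiki hands out this fixed CSRF token to users who are not logged in.
ANON_CSRF_TOKEN = "+\\"


def build_anonymous_label_edit(entity_id: str, language: str, value: str) -> dict:
    """Build action API parameters for a label edit (hypothetical helper)."""
    return {
        "action": "wbsetlabel",
        "format": "json",
        "id": entity_id,
        "language": language,
        "value": value,
        "token": ANON_CSRF_TOKEN,
    }


# Example values only; the request is built but never sent.
params = build_anonymous_label_edit("Q4115189", "en", "sandbox label")
body = urlencode(params)  # form-encoded POST body
```

Because no session is involved in such a request, whatever identifies the temporary account afterwards would have to come back through the response or its cookies.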

  2. For Wikidata Analytics I got the impression that we will still store IPs internally, so nothing should change from an analytics perspective. Is that correct?

We will still store the IPs internally, including for anti-abuse measures. With temporary accounts, I could see new types of analytics queries coming to the table regarding the behavior of anonymous editors, now that we sometimes have a better way than IPs to associate edits with the same editor over time.

ItamarWMDE claimed this task.

Thanks for the investigation and for summarizing your findings. I'll resolve this, and we can discuss a future followup with product managers.