Page MenuHomePhabricator

IP Masking Considerations: services/parsoid
Closed, ResolvedPublic

Description

IP Masking will affect lots of our products, features, tools, gadgets, etc. This task is for tracking work to update those that are owned by Content-Transform-Team, ahead of IP Masking being enabled on WMF sites.

See T326816: [Epic] Update features for temporary accounts, particularly What will be affected.

A preliminary investigation (T326759) has found that the following may be affected:

  • services/parsoid

Event Timeline

The usage in Parsoid, according to the list of problematic methods in T326759, is limited to a single line:

$ git grep IPUtils
src/Utils/Title.php:use Wikimedia\IPUtils;
src/Utils/Title.php:                    $title = IPUtils::sanitizeIP( $title );

This is in the implementation of Title::newFromText(), which is a more-or-less direction port from Title::newFromText in core (although it was first ported to JavaScript and then reverse ported back to PHP). We should be able to just copy the "fixed" implementation of Title::newFromText() from core and then resolve this task.

There's another usage in core's includes/parser/Parsoid:

$ git grep getName\( includes/parser/Parsoid
includes/parser/Parsoid/Config/PageConfig.php:          return $user ? $user->getName() : null;

This appears to be inside PageConfig::getRevisionUser() which is never actually called inside Parsoid (!).

cscott added a subscriber: ssastry.
MSantos triaged this task as High priority.Aug 15 2024, 2:09 PM
MSantos moved this task from Backlog to Current Deploy Target on the Content-Transform-Team-WIP board.
kostajh subscribed.

@ssastry @cscott could you please provide us with an update on the progress of this task? AFAICT, the usage here does not seem particularly problematic. I don't think you need to change anything about $title = IPUtils::sanitizeIP( $title ); -- there will continue to be user and user_talk pages associated with IP addresses from before temp accounts rollout. And calling $user-?getName() in PageConfig::getRevisionUser() also seems fine.

Change #1138857 had a related patch set uploaded (by C. Scott Ananian; author: C. Scott Ananian):

[mediawiki/services/parsoid@master] Audit Parsoid'd Title::newFromText() against core's TitleParser

https://gerrit.wikimedia.org/r/1138857

Change #1138858 had a related patch set uploaded (by C. Scott Ananian; author: C. Scott Ananian):

[mediawiki/services/parsoid@master] Remove PageConfig::getRevisionUser(), ::getRevisionUserId()

https://gerrit.wikimedia.org/r/1138858

Audited code to confirm that the usage is not problematic; a couple of patches are attached to document this in the code itself.

Change #1138857 merged by jenkins-bot:

[mediawiki/services/parsoid@master] Audit Parsoid's Title::newFromText() against core's TitleParser

https://gerrit.wikimedia.org/r/1138857

Change #1138858 merged by jenkins-bot:

[mediawiki/services/parsoid@master] Remove PageConfig::getRevisionUser(), ::getRevisionUserId()

https://gerrit.wikimedia.org/r/1138858

Change #1139521 had a related patch set uploaded (by C. Scott Ananian; author: C. Scott Ananian):

[mediawiki/vendor@master] Bump wikimedia/parsoid to 0.21.0-a27

https://gerrit.wikimedia.org/r/1139521

Change #1139521 merged by jenkins-bot:

[mediawiki/vendor@master] Bump wikimedia/parsoid to 0.21.0-a27

https://gerrit.wikimedia.org/r/1139521