Page MenuHomePhabricator

Retract revdel'ed Wikidata edits from Wikibase client watchlists
Closed, ResolvedPublicSecurity

Description

Steps to Reproduce:

  • a user adds non-public identifying information about a subject to its Wikidata item
    • the edit is correctly being dispatched by the software to connected Wikibase clients immediately thereafter and appears on watchlists
  • an admin (or oversighter) revdels information from the edit in Wikidata (in particular "edit summary" or "revision text", in order to remove the non-public identifying information from Wikidata)

Actual Results:

  • the edit continues to appear on Wikibase client watchlists after both admin-revdel and oversight-revdel; the displayed text includes the revdel'ed content and there is no possibility to remove it from there

Expected Results:

  • edits that were revdel'ed by admins or oversighters in Wikidata should no longer appear in Wikibase clients, no matter which part of the information of the edit was revdel'ed

Status as of 2020-11-11:

Event Timeline

There are a very large number of changes, so older changes are hidden. Show Older Changes

That seems to be the right thing to do, yes.

I noticed during testing that ChangeLineFormatter (and its caller ChangesListLinesHandler) don't yet support hiding information based on rc_deleted, thus they will also need to be tweaked (but that can be done independently of propagating the deletion bitflag).

Moved back to TODO during story time so we can evaluate what is left and continue here.
@Lydia_Pintscher to comment with the current state according to @hoo

Marius said the status is:

  • Revision visibility change: Code ☑ Deploy to test ☐ Testing ☐ Production ☐
  • Entity deletion (whole page): Code ☐ (somewhat, first change is in gerrit), ...
  • He already uploaded the config change but it got stalled because of other committments

Sketch for handling entity deletion: the repo subscribes to the right MediaWiki hook and sends a single job to the client wiki; in the client wiki, that job loops through all the revisions of the deleted entity (stopping once they exceed the max RC age) and proceeds to delete each of them from the recent changes as if they had been deleted individually.

Todo: check with Marius what his plans for this change were.

Config change is deployed, and recentchanges redaction seems to be working:

Screenshot_2020-08-26 Recent changes - Test Wikipedia.png (100×975 px, 24 KB)

Okay, after a conversation the path forward is clearer now.

  • Create abstract ChangeModificationNotificationJob in new NS (edit: blocked on T261315) creates an abstract parent class of the current job class. Locating the affected recentchanges rows is done in the abstract class; marking them as deleted (rc_deleted column) is done in its subclass, the current job class.
  • Introduce a new client job class, a second subclass of that abstract class. Instead of marking rows as deleted (rc_deleted column), it actually deletes them from the table completely. This matches MediaWiki behavior for page deletion. (There is no provision for resurrecting the rows after a page is undeleted; apparently, MediaWiki doesn’t do this either.) Marius mentioned he already has a local change for this; it’s not a big class, since most of the work happens in the abstract parent class.
  • Introduce a new repo hook handler which watches for page deletion. It iterates through all the revisions of the deleted page (we might have to query the archive table manually for this, unless we can get the hook to run before the page has been fully deleted), stopping once they get older than the max RC age of the client, and creates instances of the new job class on each client wiki. (That is, iterating through the relevant page revisions happens once on the repo, not on each client. The “locate affected recentchanges rows” part of the client job is thus exactly the same as for individual revision deletion.)

In the meantime, the config change for propagating revision visibility can probably be promoted to real Wikidata soon, and eventually the configuration option can be removed altogether and Wikibase will just do this unconditionally.

Appears to be unblocked but still requires review

  • Introduce a new client job class, a second subclass of that abstract class. Instead of marking rows as deleted (rc_deleted column), it actually deletes them from the table completely. This matches MediaWiki behavior for page deletion. (There is no provision for resurrecting the rows after a page is undeleted; apparently, MediaWiki doesn’t do this either.) Marius mentioned he already has a local change for this; it’s not a big class, since most of the work happens in the abstract parent class.

Currently up for review in 624710.

In the meantime, the config change for propagating revision visibility can probably be promoted to real Wikidata soon…

https://gerrit.wikimedia.org/r/c/operations/mediawiki-config/+/629383

…and eventually the configuration option can be removed altogether and Wikibase will just do this unconditionally.

(still to be done)

This chain is now merged, but we’ll need to wait until next week’s train before we can test it in production.

In the meantime, the config change for propagating revision visibility can probably be promoted to real Wikidata soon…

https://gerrit.wikimedia.org/r/c/operations/mediawiki-config/+/629383

Deployed. The original issue report should be resolved with this; maybe one of the subscribed Wikidata admins can test it next time they delete a revision. (We tested it on Test Wikidata, but I don’t think staff should delete revisions on real Wikidata.)

Moving to the Waiting column for the page deletion part, which is waiting for next week’s train. (Removing the propagateChangeVisibility setting from Wikibase, and enabling that part by default, could be done at any time, though.)

TODOs, as far as I understand/remember:

  • config: set $wgWBRepoSettings['propagatePageDeletion'] to true (test and real Wikidata) – all the related code should be merged and deployed by now
  • Wikibase: remove propagateChangeVisibility repo setting, always propagate change visibility
    • config: remove $wgWBRepoSettings['propagatePageDeletion'] once Wikibase no longer uses the setting

Deployed to Test Wikidata and it seems to work. Before deletion:

Two Wikidata changes for Another random test page, both changing P7

After deletion:
One Wikidata change: associated Wikidata item deleted. The other rows are gone.

Gerrit change for removing the propagateChangeVisibility Wikibase setting (always enable it): https://gerrit.wikimedia.org/r/c/mediawiki/extensions/Wikibase/+/636048

Merged.

Next step would be to enable the same config setting on real Wikidata.

https://gerrit.wikimedia.org/r/c/operations/mediawiki-config/+/636453

This is in stalled/waiting waiting for 1.36.0-wmf.15 to be deployed (we are currently on .14)

Still left TODO (as per @Lucas_Werkmeister_WMDE on Mattermost) in the order below:

  1. Deploy config change of propagatePageDeletion to real Wikidata
  2. Enable this unconditionally in Wikibase
  3. Eventually remove the setting from the production config just like propagateChangeVisibility

And also remove propagateChangeVisibility from the production config, since it’s been unconditionally enabled in Wikibase since wmf.15 if I’m not mistaken.

Edit: https://gerrit.wikimedia.org/r/c/operations/mediawiki-config/+/640676

Next step would be to enable the same config setting on real Wikidata.

https://gerrit.wikimedia.org/r/c/operations/mediawiki-config/+/636453

Deployed. This can now be tested on Wikidata – when a whole page is deleted, all of its revisions should vanish from the recent changes of connected wikis. Maybe one of the subscribed admins can test this the next time there’s a page to delete? (Though it would need to be a page that at least used to have a sitelink… not sure how common such a scenario is.)

And also remove propagateChangeVisibility from the production config, since it’s been unconditionally enabled in Wikibase since wmf.15 if I’m not mistaken.

Edit: https://gerrit.wikimedia.org/r/c/operations/mediawiki-config/+/640676

And deployed.

Deployed. This can now be tested on Wikidata – when a whole page is deleted, all of its revisions should vanish from the recent changes of connected wikis. Maybe one of the subscribed admins can test this the next time there’s a page to delete? (Though it would need to be a page that at least used to have a sitelink… not sure how common such a scenario is.)

Maybe to clarify this a bit – the test scenario I have in mind would be:

  • We have a Wikidata item linked to a Wikipedia [or other client wiki] page.
  • The sitelink is removed from the Wikidata item, but the Wikipedia page is not (yet) deleted.
    • At this point, there should be a recent changes entry on the Wikipedia page, if I’m not mistaken. The easiest way to verify this is probably to put the page on your watchlist, but maybe Special:RecentChangesLinked can also help, or you could use Quarry to query the recentchanges table. (IIUC the entry will not appear on the history of the page itself.)
  • The Wikidata item is deleted.
    • Now the recent changes entry on Wikipedia should be gone. (This might take a few minutes.)
  • At this point, the Wikipedia page could be deleted too, I guess.

But I’m not sure when this would happen… for usual spam pages, the Wikipedia page would probably get deleted before the Wikidata item?

ItamarWMDE changed Due Date from Sep 30 2020, 10:00 PM to Dec 7 2020, 9:00 AM.

Two notes from our daily meeting:

  1. We probably can’t verify this very well; we’ll mainly keep an eye on it and see if the admins report any more problems.
  1. Do we backport this? Should it be coordinated with the MediaWiki security releases? I’m not sure how we’ve handled this so far in Wikibase. (That question also applies to T260349, I suppose.)
  1. Do we backport this? Should it be coordinated with the MediaWiki security releases? I’m not sure how we’ve handled this so far in Wikibase. (That question also applies to T260349, I suppose.)

Since Wikibase is not part of WMF mediawiki bundle, we don't follow the mediawiki security release workflow (While I think we can simply backport it once it's made public) for a long-term and clearer workflow for security releases I assume the release strategy hike should handle it (Am I wrong?)

What's the blocker for making this public and applying it on gerrit?

I assume the release strategy hike should handle it (Am I wrong?)

We're on it yes. Which does not mean we have an immediate advice for the campsite crew.

What's the blocker for making this public and applying it on gerrit?

There might be none. To avoid making a mistake, I've asked WMF Security team for a kind consultation before we move on with making this bug and its fix a public knowledge.

What's the blocker for making this public and applying it on gerrit?

There might be none. To avoid making a mistake, I've asked WMF Security team for a kind consultation before we move on with making this bug and its fix a public knowledge.

Any movement here @WMDE-leszek ?
I think we can probably make this public (being consistent with what we have done in the past)

It was originally intended to open it up in January together with T260349 but then AFAIR @Lucas_Werkmeister_WMDE was on the position there is not need to open this bug any time soon (I do not recall the details - the issue does not realistically affect any non-WMF wiki?; that would be an equally good argument for making it public and closing it)

I think we thought that both tasks are unlikely to affect third-party wikis; however, the fixes to T260349 had been deployed as security patches, and started causing issues during train deployments, so it became more important to move those fixes to Gerrit, and making the task public became a part of that. Here, on the other hand, the fixes were always committed to Gerrit, just without making the security impact very clear, so there was no similarly pressing reason to make this task public. That’s not a reason not to make it public, though.

The related code has been merged and included in the Wikibase relase for non-Wikimedia users, issue is fixed on Wikidata/WMF wikis. Nothing left to do here. Opening the task for a public and resolving then.

WMDE-leszek changed the visibility from "Custom Policy" to "Public (No Login Required)".