Wikibase does not purge cached Special:EntityData URLs when revisions or entities are deleted
Closed, ResolvedPublicSecurity
Actions

Assigned To

Authored By

	Lucas_Werkmeister_WMDE
	Aug 13 2020, 12:56 PM

Description

The contents of Wikibase entity pages can be retrieved through the Special:EntityData special page, whose contents are cached by default (for one hour, in production – $wgWBRepoSettings['dataCdnMaxAge']). When a revision is deleted, the contents of Special:EntityData are not purged from the cache, allowing others to still access this information.

For example, on Test Wikidata I added the information that the rebel base’s location is Dantooine, and then removed it again and deleted that revision. But because the Wikibase editing UI loads Special:EntityData by default (since T85499), the data is already in the cache, and can be retrieved from there:

$ curl -v https://test.wikidata.org/wiki/Special:EntityData/Q212688.json?revision=530580 | jq -r .entities.Q212688.claims.P286[0].mainsnak.datavalue.value.id
…
> GET /wiki/Special:EntityData/Q212688.json?revision=530580 HTTP/2
> Host: test.wikidata.org
> user-agent: curl/7.68.0
> accept: */*
> 
…
< HTTP/2 200 
< date: Thu, 13 Aug 2020 12:42:52 GMT
…
< last-modified: Thu, 13 Aug 2020 12:42:45 GMT
< content-type: application/json; charset=UTF-8
< age: 187
< x-cache: cp3050 miss, cp3058 hit/11
< x-cache-status: hit-front
< server-timing: cache;desc="hit-front"
…
< cache-control: private, s-maxage=0, max-age=0, must-revalidate
…
Q212689

I think we also don’t purge cached entity data when a whole page is deleted, but I haven’t tested that yet.

Acceptance Criteria

Pages are purged when entity deletion happens (might already be done not checked)
Pages are purged when revision deletion of specific revisions happens

Inspection notes

Some of the mechanisms/code that were written (and are merged) from T242164 could be used.
No extra extensions are needed for reproducing this (it is core wikibase)
This can't easily be tested with a full cache setup, but could be tested with unit tests etc to make sure the purging code is called correctly.

Patch-For-Review:

Details

Due Date: Oct 7 2020, 10:00 PM
Author Affiliation: Wikimedia Deutschland

	Subject	Repo	Branch	Lines +/-
	SECURITY: Add job to purge entity data on page deletion	mediawiki/extensions/Wikibase	master	+397 -6
	SECURITY: Add EntityDataPurger	mediawiki/extensions/Wikibase	master	+179 -1

Customize query in gerrit

Related Objects

Mentioned In: T285795: Limit languages on EntityStub rdf builders
T271342: 1.36.0-wmf.28 deployment blockers
T242164: Retract revdel'ed Wikidata edits from Wikibase client watchlists
T263177: 1.36.0-wmf.11 deployment blockers
T261486: Plan to scale up termbox service to be able to render the termbox for desktop pageviews
Mentioned Here: T128667: Special:EntityData with flavor is cached but not purged properly
T263177: 1.36.0-wmf.11 deployment blockers
T263519: Wikibase EntityDataRequestHandlerTest::testCacheHeaderIsSetWithRevision fails on Travis CI
T128486: [Story] Make Special:EntityData be up to date after an edit
T14619: Use article path URLs for editing, previewing skins, etc.
T242164: Retract revdel'ed Wikidata edits from Wikibase client watchlists
T85499: wbEntity shouldn't be served on every page load

Event Timeline

There are a very large number of changes, so older changes are hidden. Show Older Changes

We are seeing some fishy travis builds (e.g.) ever since 4c4e0cd was merged.

EntityDataRequestHandlerTest::testCacheHeaderIsSetWithRevision comes back red for 2 runs of the matrix (not sure what makes them distinct) - the Cache-Control header which is expected to say public is private apparently.

This is tracked individually in T263519

• Pablo-WMDE added a subscriber: hoo.Sep 22 2020, 8:03 AM

The first Gerrit change was merged (just after the branch cut, which is probably fortunate); we should configure the new config setting in production, so that the ?id= requests don’t lose caching.

I left a risky patch comment on the next train blockers task, see T263177#6483815; CCing @thcipriani here in case more context is needed. (This is a security task, so I didn’t want to mention the task ID in the comment.)

Lucas_Werkmeister_WMDE updated the task description. (Show Details)Sep 22 2020, 2:57 PM

Also CCing @mmodell who seems to have been designated as the train conductor now.

Lucas_Werkmeister_WMDE updated the task description. (Show Details)Sep 23 2020, 10:03 AM

New version of the security patch to purge URLs (the old one was in T260349#6409396), based on Ic3b2595682, PS7:

0001-SECURITY-Add-EntityDataPurger.patch6 KBDownload

Addshore mentioned this in T261486: Plan to scale up termbox service to be able to render the termbox for desktop pageviews.Sep 25 2020, 11:46 AM

The “move caching responsibilities into EntityDataUriManager” change was abandoned, so here’s a new version of the security change:

0001-SECURITY-Add-EntityDataPurger.patch8 KBDownload

Diff to the previous version (mostly test changes):

diff --git a/repo/includes/Hooks/EntityDataPurger.php b/repo/includes/Hooks/EntityDataPurger.php
index c62b2ad926..aaa63ed94a 100644
--- a/repo/includes/Hooks/EntityDataPurger.php
+++ b/repo/includes/Hooks/EntityDataPurger.php
@@ -55,13 +55,16 @@ public function onArticleRevisionVisibilitySet( $title, $ids, $visibilityChangeM
 			return;
 		}
 
+		$urls = [];
 		foreach ( $ids as $revisionId ) {
-			$this->entityDataUriManager->purgeWebCache(
+			$urls = array_merge( $urls, $this->entityDataUriManager->getPotentiallyCachedUrls(
 				$entityId,
 				// $ids should be int[] but MediaWiki may call with a string[], so cast to int
-				(int)$revisionId,
-				$this->htmlCacheUpdater
-			);
+				(int)$revisionId
+			) );
+		}
+		if ( $urls !== [] ) {
+			$this->htmlCacheUpdater->purgeUrls( $urls );
 		}
    }
 
diff --git a/repo/tests/phpunit/includes/Hooks/EntityDataPurgerTest.php b/repo/tests/phpunit/includes/Hooks/EntityDataPurgerTest.php
index 7936d0b915..a58766c619 100644
--- a/repo/tests/phpunit/includes/Hooks/EntityDataPurgerTest.php
+++ b/repo/tests/phpunit/includes/Hooks/EntityDataPurgerTest.php
@@ -31,8 +31,10 @@ public function testGivenEntityIdLookupReturnsNull_handlerDoesNothing() {
 			->willReturn( null );
 		$entityDataUriManager = $this->createMock( EntityDataUriManager::class );
 		$entityDataUriManager->expects( $this->never() )
-			->method( 'purgeWebCache' );
+			->method( 'getPotentiallyCachedUrls' );
 		$htmlCacheUpdater = $this->createMock( HtmlCacheUpdater::class );
+		$htmlCacheUpdater->expects( $this->never() )
+			->method( 'purgeUrls' );
 		$purger = new EntityDataPurger( $entityIdLookup, $entityDataUriManager, $htmlCacheUpdater );
 
 		$purger->onArticleRevisionVisibilitySet( $title, [ 1, 2, 3 ], [] );
@@ -46,15 +48,49 @@ public function testGivenEntityIdLookupReturnsId_handlerPurgesCache() {
 			->method( 'getEntityIdForTitle' )
 			->with( $title )
 			->willReturn( $entityId );
+		$entityDataUriManager = $this->createMock( EntityDataUriManager::class );
+		$entityDataUriManager->expects( $this->once() )
+			->method( 'getPotentiallyCachedUrls' )
+			->with( $entityId, 1 )
+			->willReturn( [ 'urlA/Q1/1', 'urlB/Q1/1' ] );
 		$htmlCacheUpdater = $this->createMock( HtmlCacheUpdater::class );
+		$htmlCacheUpdater->expects( $this->once() )
+			->method( 'purgeUrls' )
+			->with( [ 'urlA/Q1/1', 'urlB/Q1/1' ] );
+		$purger = new EntityDataPurger( $entityIdLookup, $entityDataUriManager, $htmlCacheUpdater );
+
+		$purger->onArticleRevisionVisibilitySet( $title, [ 1 ], [] );
+	}
+
+	public function testGivenMultipleRevisions_handlerPurgesCacheOnce() {
+		$title = Title::newFromText( 'Item:Q1' );
+		$entityId = new ItemId( 'Q1' );
+		$entityIdLookup = $this->createMock( EntityIdLookup::class );
+		$entityIdLookup->expects( $this->once() )
+			->method( 'getEntityIdForTitle' )
+			->with( $title )
+			->willReturn( $entityId );
 		$entityDataUriManager = $this->createMock( EntityDataUriManager::class );
 		$entityDataUriManager
-			->method( 'purgeWebCache' )
+			->method( 'getPotentiallyCachedUrls' )
 			->withConsecutive(
-				[ $entityId, 1, $htmlCacheUpdater ],
-				[ $entityId, 2, $htmlCacheUpdater ],
-				[ $entityId, 3, $htmlCacheUpdater ]
+				[ $entityId, 1 ],
+				[ $entityId, 2 ],
+				[ $entityId, 3 ]
+			)
+			->willReturnOnConsecutiveCalls(
+				[ 'urlA/Q1/1', 'urlB/Q1/1' ],
+				[ 'urlA/Q1/2', 'urlB/Q1/2' ],
+				[ 'urlA/Q1/3', 'urlB/Q1/3' ]
 			);
+		$htmlCacheUpdater = $this->createMock( HtmlCacheUpdater::class );
+		$htmlCacheUpdater->expects( $this->once() )
+			->method( 'purgeUrls' )
+			->with( [
+				'urlA/Q1/1', 'urlB/Q1/1',
+				'urlA/Q1/2', 'urlB/Q1/2',
+				'urlA/Q1/3', 'urlB/Q1/3',
+			] );
 		$purger = new EntityDataPurger( $entityIdLookup, $entityDataUriManager, $htmlCacheUpdater );
 
 		$purger->onArticleRevisionVisibilitySet( $title, [ 1, 2, 3 ], [] );

Had a look at this locally and it looks good to me

Lucas_Werkmeister_WMDE updated the task description. (Show Details)Sep 29 2020, 10:52 AM

Alright, all the public Gerrit changes have been merged, but only after the wmf.11 branch cut. I’m not convinced backporting them (specifically, Add EntityDataUriManager to WikibaseRepo) is worth the trouble, so I suggest we wait until next week and then deploy the security patch. (Meanwhile, during this week’s train we can monitor the risky change in cache behavior, see T263177#6483815.)

Lucas_Werkmeister_WMDE changed the task status from Open to Stalled.Sep 29 2020, 10:56 AM

Lucas_Werkmeister_WMDE moved this task from Peer Review to Stalled/Waiting on the Wikidata-Campsite (Wikidata-Campsite-Iteration-∞ (On Hold)) board.

Lucas_Werkmeister_WMDE set Due Date to Oct 7 2020, 10:00 PM.

• mmodell mentioned this in T263177: 1.36.0-wmf.11 deployment blockers.Sep 29 2020, 9:49 PM

• mmodell added a parent task: T263177: 1.36.0-wmf.11 deployment blockers.

@Lucas_Werkmeister_WMDE Is this still a train blocker?

I don't think it is...

Nope, no need to block the train on this.

Lucas_Werkmeister_WMDE changed the task status from Stalled to Open.Oct 21 2020, 3:24 PM

Lucas_Werkmeister_WMDE moved this task from Stalled/Waiting to To Do (prioritised from top to bottom) on the Wikidata-Campsite (Wikidata-Campsite-Iteration-∞ (On Hold)) board.

I just deployed the security patch from T260349#6498297 to the wmf.13 and wmf.14 branches; on Test Wikidata (with the same Dantooine test set from the task description) it seemed to work fine (the cache was purged when the revision visibility was changed). I also put the patch under /srv/patches as described in how to deploy security patches.

Lucas_Werkmeister_WMDE updated the task description. (Show Details)Oct 21 2020, 4:05 PM

I guess a good way to verify this would be to look at the revision deletion log – if any new revisions are deleted after Oct 21st, then their data should no longer be cached.

Addshore added a comment.Oct 28 2020, 7:16 PM

This comment was removed by Addshore.

Verified for deleting revisions with:

And the cache is purged in a couple of seconds.

Also verified for restoring revisions with revision 533543

I tried 2 entity deletions, but neither seemed to purge the cache. Didn't spot anything in the logs. (all with Q213483 on testwikidata just now)
Thus I'll bounce this back to TODO for now for further investigation

Addshore added a subscriber: Silvan_WMDE.Oct 28 2020, 7:30 PM

Hm, I think we didn’t implement any entity/page deletion handling yet? It’s in the task title and AC, but maybe we forgot about it while working on the task (or at least I forgot, I believe).

A note from task inspection: We might be able to reuse some code to dig up revisions out of the archive table from the changes related to T242164: Retract revdel'ed Wikidata edits from Wikibase client watchlists.

noarave updated the task description. (Show Details)Nov 3 2020, 3:04 PM

ItamarWMDE claimed this task.Nov 10 2020, 12:57 PM

ItamarWMDE moved this task from To Do (prioritised from top to bottom) to Doing on the Wikidata-Campsite (Wikidata-Campsite-Iteration-∞ (On Hold)) board.

Additional patch to purge urls of deleted items. based on previous security patch:

0001-SECURITY-Add-onArticleDelete-hook-handler-to-EntityD.patch14 KBDownload

ItamarWMDE moved this task from Doing to Peer Review on the Wikidata-Campsite (Wikidata-Campsite-Iteration-∞ (On Hold)) board.Nov 20 2020, 12:36 PM

Looks good to me; I think we can deploy it on Monday (scheduled for the EU window). The change seems to apply without conflict to wmf.16 too, so we can probably backport to both branches, since wmf.18 currently isn’t fully rolled out.

Just cleaned up some whitespace errors so the patch can later be uploaded to Gerrit as-is:

0001-SECURITY-Add-onArticleDelete-hook-handler-to-EntityD.patch14 KBDownload

Ahead of the deployment, I’ve verified that if I undelete the specific revision with the rebel base’s location, then get the entity data (to make sure it’s cached), and then delete the whole item, the cached data stays available. I’ll use the same item (Q212688) to test the deployment later.

I deployed this to wmf.18, but it didn’t seem to work, so I removed it again. I’ll try to test this locally later, and if I can’t figure out what’s going wrong, see how I can add some debug logging.

Alright, here’s a version of the patch with some debug logging added:

T260349-2-debug.patch15 KBDownload

Diff from the non-debug version

diff --git a/repo/includes/Hooks/EntityDataPurger.php b/repo/includes/Hooks/EntityDataPurger.php
index 72c4a34001..612cc697ab 100644
--- a/repo/includes/Hooks/EntityDataPurger.php
+++ b/repo/includes/Hooks/EntityDataPurger.php
@@ -8,6 +8,7 @@
 use Config;
 use HtmlCacheUpdater;
 use MediaWiki\Hook\ArticleRevisionVisibilitySetHook;
+use MediaWiki\Logger\LoggerFactory;
 use MediaWiki\Page\Hook\ArticleDeleteCompleteHook;
 use Title;
 use Wikibase\Lib\Store\EntityIdLookup;
@@ -109,28 +110,52 @@ public function onArticleDeleteComplete(
 		$logEntry,
 		$archivedRevisionCount
 	) {
+		$logger = LoggerFactory::getInstance( 'Wikibase' );
+		$logger->debug( __METHOD__ . ': entry for page {pageId}', [
+			'wikiPage' => $wikiPage,
+			'pageId' => $id,
+			'archivedRevisionCount' => $archivedRevisionCount,
+		] );
 		$title = $wikiPage->getTitle();
 		$entityId = $this->entityIdLookup->getEntityIdForTitle( $title );
 		if ( !$entityId ) {
+			$logger->debug( __METHOD__ . ': exit, no entity ID for title {title}', [
+				'title' => $title->getPrefixedText(),
+			] );
 			return;
 		}
+		$logger->debug( __METHOD__ . ': entity ID is {entityId}', [
+			'entityId' => $entityId->getSerialization(),
+		] );
 
 		// This ensures we would delete also EntityData for non revision based urls of an entity.
 		$urls = $this->entityDataUriManager->getPotentiallyCachedUrls( $entityId );
 
-		foreach ( $this->getArchivedRevisionIds( $title, $id ) as $revisionId ) {
-			$urls = array_merge( $urls, $this->entityDataUriManager->getPotentiallyCachedUrls(
+		foreach ( $this->getArchivedRevisionIds( $title, $id, $logger ) as $revisionId ) {
+			$addedUrls = $this->entityDataUriManager->getPotentiallyCachedUrls(
 				$entityId,
 				(int)$revisionId
-			) );
+			);
+			$logger->debug( __METHOD__ . ': add {count} URLs for revision {revision}', [
+				'revision' => $revisionId,
+				'urls' => $addedUrls,
+				'count' => count( $addedUrls ),
+			] );
+			$urls = array_merge( $urls, $addedUrls );
 		}
 
 		if ( $urls !== [] ) {
+			$logger->debug( __METHOD__ . ': purge {count} URLs', [
+				'urls' => $urls,
+				'count' => count( $urls ),
+			] );
 			$this->htmlCacheUpdater->purgeUrls( $urls );
+		} else {
+			$logger->debug( __METHOD__ . ': no URLs to purge' );
 		}
 	}
 
-	private function getArchivedRevisionIds( Title $title, int $id ): iterable {
+	private function getArchivedRevisionIds( Title $title, int $id, $logger ): iterable {
 		$dbr = $this->loadBalancerFactory
 			->getMainLB()
 			->getConnectionRef( DB_REPLICA );
@@ -141,6 +166,9 @@ private function getArchivedRevisionIds( Title $title, int $id ): iterable {
 			[ 'ar_id' ],
 			$this->batchSize
 		);
+		$logger->debug( __METHOD__ . ': created batch row iterator for batch size {batchSize}', [
+			'batchSize' => $this->batchSize,
+		] );
 
 		$iterator->setFetchColumns( [ 'ar_rev_id' ] );
 		$iterator->addConditions( [
@@ -150,6 +178,10 @@ private function getArchivedRevisionIds( Title $title, int $id ): iterable {
 		] );
 
 		foreach ( $iterator as $batch ) {
+			$logger->debug( __METHOD__ . ': got batch of {count} rows', [
+				'batch' => $batch,
+				'count' => count( $batch ),
+			] );
 			yield from array_map(
 				function ( $row ) {
 					return $row->ar_rev_id;

I’ll try to deploy this to an mwdebug server later and then look at the debug logging.

(Edit: I did not succeed in debugging this locally – as far as I could tell it worked fine.)

Excerpt from the log messages:

Wikibase\Repo\Hooks\EntityDataPurger::onArticleDeleteComplete: entry for page 303258
Wikibase\Repo\Hooks\EntityDataPurger::onArticleDeleteComplete: entity ID is Q212688
Wikibase\Repo\Hooks\EntityDataPurger::getArchivedRevisionIds: created batch row iterator for batch size 100
Wikibase\Repo\Hooks\EntityDataPurger::onArticleDeleteComplete: no URLs to purge

There are no log messages between these. Apparently, the BatchRowIterator yielded zero batches.

Also, later in the same request:

Wikibase\Repo\ChangeModification\DispatchChangeDeletionNotificationJob::run: processed 0 rows but archived 3 revisions for Q212688

I suspect that our assumption – that, by the time the ArticleDeleteComplete hook runs, the archived rows have been committed and replicated – is incorrect. I’ll try manually changing DB_REPLICA to DB_MASTER on mwdebug1001 and see how the behavior changes then…

Yup, with DB_MASTER it works. (See the logs if you want to – most important is “purge 9 URLs”.) I’ll undeploy the change again so we can figure out how to properly solve this in normal campsite work (because reading an unbounded number of rows from the master is probably not ideal).

I suspect that our assumption – that, by the time the ArticleDeleteComplete hook runs, the archived rows have been committed and replicated – is incorrect.

I think I see where we got that assumption wrong. I don’t remember who else was in the call, but I believe we read WikiPage::doDeleteArticleBatched() as something like

// yadda yadda yadda
while ( true ) {
	$done = $this->archiveRevisions( $dbw, $id, $suppress );
	// yadda yadda yadda
	$lbFactory = MediaWikiServices::getInstance()->getDBLoadBalancerFactory();
	$lbFactory->waitForReplication();
	// yadda yadda yadda
}

if ( !$done ) {
	// something something continue as job
} else {
	// we’re done archiving, something something something
	$this->getHookRunner()->onArticleDeleteComplete( /* many arguments */ );
	// some more stuff
}

and so we believed that the waitForReplication() would always take place before the hook is called. But I think we overlooked this bit:

while ( true ) {
	$done = $this->archiveRevisions( $dbw, $id, $suppress );
	if ( $done || !$immediate ) {
		break;
	}
	// buncha code
	$lbFactory = MediaWikiServices::getInstance()->getDBLoadBalancerFactory();
	$lbFactory->waitForReplication();
	// more code
}

On the last “batch” of archived revisions – the one that also runs the hook – the waitForReplication() is skipped, because the loop is exited early. (In fact, the loop is probably almost always exited early, due to $immediate usually being false?) The block that later runs the hook includes an endAtomic(), but does not call waitForReplication(). In effect, the “immediate deletion” code tricked us into thinking that waitForReplication() would always be called before the hook was called, when that’s actually not the case most of the time.

I’ll update the change to add waitForReplication() and see if that helps. (And by “see if that helps”, I probably mean another round of mwdebug testing, because I don’t have a replicated setup on localhost.)

New patches with and without debug logging:

T260349-2.patch14 KBDownload

T260349-2-debug.patch15 KBDownload

Diff to the previous version:

diff --git a/repo/includes/Hooks/EntityDataPurger.php b/repo/includes/Hooks/EntityDataPurger.php
index 72c4a34001..69eaf82e0c 100644
--- a/repo/includes/Hooks/EntityDataPurger.php
+++ b/repo/includes/Hooks/EntityDataPurger.php
@@ -131,6 +131,11 @@ public function onArticleDeleteComplete(
 	}
 
 	private function getArchivedRevisionIds( Title $title, int $id ): iterable {
+		// ensure that we can see the archived rows on the replica
+		$this->loadBalancerFactory->waitForReplication( [
+			'domain' => $this->loadBalancerFactory->getLocalDomainID(),
+		] );
+
 		$dbr = $this->loadBalancerFactory
 			->getMainLB()
 			->getConnectionRef( DB_REPLICA );

That said, I’m now doubting the whole approach of this patch. We should probably move this to a job, similar to the jobs of T242164: Retract revdel'ed Wikidata edits from Wikibase client watchlists, instead of trying to process all of these revisions directly in the web request that’s deleting the page. So the hook handler would schedule a job with the page title/namespace/ID and the number of archived revisions, and then the job would go through the archive table and purge the corresponding URLs (and complain at the end if it processed a different number of archived revisions than stated in its job parameters).

Weird, the waitForReplication() didn’t help either. (I also tried it without the 'domain' part by commenting out that lineon mwdebug1001 directly.) I think it’s time for me to step back for a bit…

Okay, I think I made another false assumption – that endAtomic() commits the transaction, so that it’s now ready for replication. That’s not the case: in general, an endAtomic() may commit the whole transaction (if it’s the outermost atomic level and the whole transaction was started by that atomic section), but in a normal web request, IIUC the most it can do is release a savepoint – the commit only happens later, when the whole request is done. (And since the startAtomic() call here didn’t request the section to be cancelable, and the default is ATOMIC_NOT_CANCELABLE, there shouldn’t even be a savepoint to release.) So this means that with the patches from T260349#6641445, the waitForReplication() didn’t help because the transaction to create the archive rows hadn’t committed yet, and therefore none of the replicas were allowed to see those rows.

That means we must ensure that our code runs after the main transaction has committed, and then also still wait for replication. I think there are several ways to do the first part – a post-send (i.e. default) deferred update, onTransactionCommitOrIdle(), probably some more I’m not aware of – but a job should take care of that part as well, so it’s probably best to go for the job solution. (Even if jobs are run as part of the web request, they run after the main transaction commit, as far as I’m aware.) Again, though – even the job should still wait for replication. (I assume that’s also the explanation for the “DispatchChangeDeletionNotificationJob::run: processed 0 rows but archived 3 revisions for Q212688” log message earlier, because that job doesn’t wait for replication either – ~~though I still don’t understand why the job ran as part of the same web request at all, since $wgJobRunRate is 0 in production.~~ The job didn’t run as part of the same PHP process – the “server” field of that message is “jobrunner.discovery.wmnet” rather than “test.wikidata.org”. It just got the same request ID field, somehow.)

Nice. We discussed the "job solution" a bit and in my understanding it would both cater to the performance concerns doing all this work during the web request, as well as making the topic area of waiting for replication largely irrelevant. Win win (at quite a bit of effort and added complexity).

The job didn’t run as part of the same PHP process [...] It just got the same request ID field, somehow.

I assume this is a feature, cf. https://microservices.io/patterns/observability/distributed-tracing.html but would be nice to see the documentation.

guergana.tzatchkova unsubscribed.Nov 24 2020, 2:41 PM

Alright, the following patch implements the job solution:

T260349-2-alt.patch17 KBDownload

I tested locally that HtmlCacheUpdater::purgeUrls() is called by adding the following line to that method:

file_put_contents( '/tmp/purgedUrls', implode( "\n", $urls ) . "\n", FILE_APPEND );

However, that doesn’t guarantee that the replication is working correctly now, since I don’t have replication set up locally. I’ll try to test this again on mwdebug soon.

Lucas_Werkmeister_WMDE moved this task from Doing to Peer Review on the Wikidata-Campsite (Wikidata-Campsite-Iteration-∞ (On Hold)) board.Nov 24 2020, 7:25 PM

@ItamarWMDE noticed that the test suite failed with that change, so I uploaded a MediaWiki core change to fix that: https://gerrit.wikimedia.org/r/c/mediawiki/core/+/643496. The following version of the patch file just adds a comment to the commit message and streamlines the test code a bit; no non-test code changes.

T260349-2-alt.patch17 KBDownload

That means we must ensure that our code runs after the main transaction has committed, and then also still wait for replication. I think there are several ways to do the first part – a post-send (i.e. default) deferred update, onTransactionCommitOrIdle(), probably some more I’m not aware of – but a job should take care of that part as well, so it’s probably best to go for the job solution. (Even if jobs are run as part of the web request, they run after the main transaction commit, as far as I’m aware.)

@hoo pointed out that if we push the job directly in the hook handler, there’s a possibility (probably not very likely in practice) that a hook runner will pick up the job and start running it before our web request has completed and committed the transaction. Using lazyPush() instead of push() ought to prevent this.

Edit: new version using lazyPush():

T260349-2-alt.patch17 KBDownload

tested locally that HtmlCacheUpdater::purgeUrls() is called by adding the following line

~~Is this method of validating still relevant, with these changes?~~ I was looking at the wrong container. +1 Validated locally and seems to work. Tests still fail as the patch was not merged to master yet.

Lucas_Werkmeister_WMDE moved this task from Peer Review to Doing on the Wikidata-Campsite (Wikidata-Campsite-Iteration-∞ (On Hold)) board.Nov 26 2020, 12:23 PM

In T260349#6648544, @Lucas_Werkmeister_WMDE wrote:

T260349-2-alt.patch17 KBDownload

I deployed this one to wmf.18, and it seems to be working – I deleted the rebel base item again and the entity data URL, which previously had had x-cache-status: hit-local, was gone. (wmf.20 is rolling out later this week, I’ll try to remember to check that the patch isn’t missed.)

Krinkle unsubscribed.Dec 1 2020, 2:25 AM

Lucas_Werkmeister_WMDE moved this task from Doing to Test (Verification) on the Wikidata-Campsite (Wikidata-Campsite-Iteration-∞ (On Hold)) board.Dec 1 2020, 10:20 AM

Once this task is made public, we should leave a comment at T128667: Special:EntityData with flavor is cached but not purged properly. It’s not clear to me whether that task is fully resolved with the changes here – it asks for purging “when action=purge is issued”, and we now purge Special:EntityData on revdel/pagedel and on /wiki/Special:EntityData/Q123?action=purge, but not on /wiki/Q123?action=purge.

WMDE-leszek moved this task from Test (Verification) to Peer Review on the Wikidata-Campsite (Wikidata-Campsite-Iteration-∞ (On Hold)) board.Dec 15 2020, 10:22 AM

WMDE-leszek moved this task from Peer Review to Test (Verification) on the Wikidata-Campsite (Wikidata-Campsite-Iteration-∞ (On Hold)) board.

Lucas_Werkmeister_WMDE mentioned this in T242164: Retract revdel'ed Wikidata edits from Wikibase client watchlists.Dec 16 2020, 10:32 AM

WMDE-leszek moved this task from Test (Verification) to Stalled/Waiting on the Wikidata-Campsite (Wikidata-Campsite-Iteration-∞ (On Hold)) board.Jan 13 2021, 9:26 AM

Noting that 01-T260349.patch no longer applies without 3-way merge:

/usr/bin/git apply --no-3way --check /srv/patches/1.36.0-wmf.27/extensions/Wikibase/01-T260349.patch
error: patch failed: extension-repo.json:448                                                                              
error: extension-repo.json: patch does not apply

This traditionally wouldn't have triggered a complaint, but we've recently moved to using scap apply-patches by default during the train. apply-patches was written to use --no-3way, and discussion of that has led us to realize that there may be cases where a 3-way merge results in a semantically broken result.

I'm proceeding with 1.36.0-wmf.27 under the assumption that this is fine, as it has generally been in the past, but a check on the patch would be appreciated. In future this will likely become a train blocker (and hopefully we can find a way to notify security folk of it in advance).

Thanks. This seems to be a problem. I check to see what's the blocker of making this public, it should not stay like this for that long time.

I’ve looked at the applied commit on deploy1001 (/srv/mediawiki-staging/php-1.36.0-wmf.27/extensions/Wikibase, commit e4054597c9) and it looks fine to me. I’m guessing that some other change on master touched an adjacent hooks line in the extension JSON file; in that case, a 3-way merge is probably semantically correct.

I think this should not linger in production like that as more issues are bound to happen. I make this a blocker of the next train to make sure it doesn't get forgotten by the next branch cut.

Ladsgroup added a parent task: T271342: 1.36.0-wmf.28 deployment blockers.Jan 20 2021, 7:08 PM

The issue has been successfully patched on Wikidata. To our best knowledge the problem does not pose a security risk to Wikibase installation outside of WMF production wiki. Therefore we make the issue, and the fix, public.

The two security patches are now on Gerrit: https://gerrit.wikimedia.org/r/c/mediawiki/extensions/Wikibase/+/658323 and https://gerrit.wikimedia.org/r/c/mediawiki/extensions/Wikibase/+/658324. We can merge them so that they’ll go out with the regular train, and won’t need to be applied as security patches anymore.

WMDE-leszek changed the visibility from "Custom Policy" to "Public (No Login Required)".Jan 25 2021, 4:49 PM

WMDE-leszek changed the edit policy from "Custom Policy" to "All Users".

Maintenance_bot moved this task from incoming to in progress on the Wikidata board.Jan 25 2021, 5:15 PM

Change 658323 merged by jenkins-bot:
[mediawiki/extensions/Wikibase@master] SECURITY: Add EntityDataPurger

https://gerrit.wikimedia.org/r/658323

Change 658324 merged by jenkins-bot:
[mediawiki/extensions/Wikibase@master] SECURITY: Add job to purge entity data on page deletion

https://gerrit.wikimedia.org/r/658324

Lucas_Werkmeister_WMDE moved this task from Stalled/Waiting to Done on the Wikidata-Campsite (Wikidata-Campsite-Iteration-∞ (On Hold)) board.Jan 25 2021, 5:57 PM

ReleaseTaggerBot added a project: MW-1.36-notes (1.36.0-wmf.28; 2021-01-26).Jan 25 2021, 6:00 PM

WMDE-leszek closed this task as Resolved.Jan 25 2021, 6:32 PM

Silvan_WMDE mentioned this in T271342: 1.36.0-wmf.28 deployment blockers.Jan 25 2021, 6:34 PM

Silvan_WMDE removed a parent task: T271342: 1.36.0-wmf.28 deployment blockers.

sbassett lowered the priority of this task from High to Low.Jan 25 2021, 7:18 PM

sbassett moved this task from Watching to Our Part Is Done on the Security-Team board.

Note: I committed the deletion of the two wmf.28 Wikibase patches under /srv/patches on the deployment server (5578144525) since wmf.28 was rolled back and as noted by gerritbot above, https://gerrit.wikimedia.org/r/658323 and https://gerrit.wikimedia.org/r/658324 were merged.

Thanks for this @sbassett and apologies for not cleaning the house ourselves!

Lucas_Werkmeister_WMDE mentioned this in T285795: Limit languages on EntityStub rdf builders.Jun 30 2021, 1:00 PM

	F33920504: 0001-SECURITY-Add-onArticleDelete-hook-handler-to-EntityD.patch
	Nov 20 2020, 2:34 PM

	F33924921: T260349-2-alt.patch
	Nov 25 2020, 1:52 PM

	F33924920: T260349-2-alt.patch
	Nov 25 2020, 1:51 PM

	F33924895: T260349-2-alt.patch
	Nov 25 2020, 1:15 PM

Wikibase does not purge cached Special:EntityData URLs when revisions or entities are deletedClosed, ResolvedPublicSecurityActions

Description

Details

Related Objects

Event Timeline

Wikibase does not purge cached Special:EntityData URLs when revisions or entities are deleted
Closed, ResolvedPublicSecurity
Actions