Page MenuHomePhabricator

Fatal logged after renaming files: "LocalFile.php: Call to a member function purgeEverything() on boolean"
Closed, ResolvedPublicPRODUCTION ERROR

Description

Error

Request URL: Special Move and api.php
Request ID: XREv5wpAMDoAAHPGpY0AAAAM

message
Error from line 1938 of /srv/mediawiki/php-1.34.0-wmf.10/includes/filerepo/file/LocalFile.php:
Call to a member function purgeEverything() on boolean
trace
#0 /srv/mediawiki/php-1.34.0-wmf.10/includes/deferred/AutoCommitUpdate.php(46): LocalFile->{closure}(Wikimedia\Rdbms\DatabaseMysqli, string)
#1 /srv/mediawiki/php-1.34.0-wmf.10/includes/deferred/DeferredUpdates.php(309): AutoCommitUpdate->doUpdate()
#2 /srv/mediawiki/php-1.34.0-wmf.10/includes/deferred/DeferredUpdates.php(265): DeferredUpdates::attemptUpdate(AutoCommitUpdate, Wikimedia\Rdbms\LBFactoryMulti)
#3 /srv/mediawiki/php-1.34.0-wmf.10/includes/deferred/DeferredUpdates.php(217): DeferredUpdates::handleUpdate(AutoCommitUpdate, Wikimedia\Rdbms\LBFactoryMulti, string, integer)
#4 /srv/mediawiki/php-1.34.0-wmf.10/includes/deferred/DeferredUpdates.php(140): DeferredUpdates::handleUpdateQueue(array, string, integer)
#5 /srv/mediawiki/php-1.34.0-wmf.10/includes/MediaWiki.php(600): DeferredUpdates::doUpdates(string, integer)
#6 /srv/mediawiki/php-1.34.0-wmf.10/includes/api/ApiMain.php(548): MediaWiki::preOutputCommit(DerivativeContext)
#7 /srv/mediawiki/php-1.34.0-wmf.10/includes/api/ApiMain.php(502): ApiMain->executeActionWithErrorHandling()
#8 /srv/mediawiki/php-1.34.0-wmf.10/api.php(87): ApiMain->execute()
#9 /srv/mediawiki/w/api.php(3): require(string)
#10 {main}

Impact

The DeferredUpdate callback is interrupted by this fatal error, which means the required updates for these renamed files were not able to run, which means caches and derived data store (e.g. link tables or some such) are likely out of date, these do not automatically recover by other means.

Notes

Regression aligns exactly with the two attempted deployments of wmf.10.

Screenshot 2019-06-24 at 21.55.11.png (1×2 px, 201 KB)

Event Timeline

Restricted Application added a subscriber: Aklapper. · View Herald Transcript

Deferred update hasn't changed in years

https://github.com/wikimedia/mediawiki/commit/3c21f0e8e5459400caf0ea2934976e86fe5918e0

But the code before it was by @D3r1ck01 in https://github.com/wikimedia/mediawiki/commit/21e2d71560cb87191dd80ae0750d0190b45063c1

But I would say it's functionally the same...

function wfFindFile( $title, $options = [] ) {
	return MediaWikiServices::getInstance()->getRepoGroup()->findFile( $title, $options );
}

https://github.com/wikimedia/mediawiki/blame/master/includes/filerepo/file/LocalFile.php#L1909 and https://github.com/wikimedia/mediawiki/blame/master/includes/filerepo/file/LocalFile.php#L1927 are basically exactly the same as in wfFindFile(), no error handling added or lost... So I'm guessing this commit didn't actually "break" it?

Seems a bit race-y

Aside: Would it have been better using $this->getRepo() rather than MediaWikiServices::getInstance()->getRepoGroup() ?

rMW94d8579d2c03: FileRepo: Use Late Static Binding in File static constructors seems the only other change in the code and time range, but it seems harmless.

Seems a bit race-y

With what? The LocalFile objects are looked up in the same transaction that did the move.

Aside: Would it have been better using $this->getRepo() rather than MediaWikiServices::getInstance()->getRepoGroup() ?

They aren't equivalent, the first returns a repo, the second a repo group. ($localRepo = MediaWikiServices::getInstance()->getRepoGroup() is confusing and the variable name should be fixed.) Although file moves only make sense for the local repo so probably wouldn't make much difference.

Tgr triaged this task as Unbreak Now! priority.Jun 25 2019, 12:56 PM

Train blockers are UBN. It's also a nuisance to Commons users per T226473: Caching issues when moving files.

If in doubt, could always revert both and revisit another time.

They aren't equivalent, the first returns a repo, the second a repo group.

That does explain it, actually. The patch replaces wfFindFile (which proxied to LocalRepo::findFile) with $localRepo->findFile (despite the name, $localRepo is RepoGroup, not LocalRepo). RepoGroup::findFile calls Repo::findFile but has an extra layer of in-process cache which LocalFileMoveBatch does not invalidate. So probably something somewhere earlier did a RepoGroup::findFile call for the move target, false got cached, and it gets loaded from the cache in the lines before DeferredUpdates::addUpdate.

Hm, fixing that did not fix the error. (And getting it locally means it's not replag-related.) So I guess that leaves the cache in LocalFile::load (which also does not seem to be reset during move), not sure what triggered a change in behavior though.

Change 519039 had a related patch set uploaded (by Gergő Tisza; owner: Gergő Tisza):
[mediawiki/core@master] Revert "Replace some uses of deprecated wfFindFile() and wfLocalFile()"

https://gerrit.wikimedia.org/r/519039

Change 519040 had a related patch set uploaded (by Krinkle; owner: Gergő Tisza):
[mediawiki/core@wmf/1.34.0-wmf.10] Revert "Replace some uses of deprecated wfFindFile() and wfLocalFile()"

https://gerrit.wikimedia.org/r/519040

Change 519040 merged by jenkins-bot:
[mediawiki/core@wmf/1.34.0-wmf.10] Revert "Replace some uses of deprecated wfFindFile() and wfLocalFile()"

https://gerrit.wikimedia.org/r/519040

Mentioned in SAL (#wikimedia-operations) [2019-06-25T15:56:53Z] <krinkle@deploy1001> Synchronized php-1.34.0-wmf.10/includes/: T226448 / rMW40e725b6502c (duration: 01m 20s)

Mentioned in SAL (#wikimedia-operations) [2019-06-25T15:59:12Z] <krinkle@deploy1001> Synchronized php-1.34.0-wmf.10/maintenance/: T226448 / rMW40e725b6502c (duration: 01m 15s)

Change 519063 had a related patch set uploaded (by Gergő Tisza; owner: Gergő Tisza):
[mediawiki/core@master] Fix LocalFile::move

https://gerrit.wikimedia.org/r/519063

Change 519039 abandoned by Gergő Tisza:
Revert "Replace some uses of deprecated wfFindFile() and wfLocalFile()"

Reason:
Ibb9d6f63 has a proper fix.

https://gerrit.wikimedia.org/r/519039

That does explain it, actually. The patch replaces wfFindFile (which proxied to LocalRepo::findFile) with $localRepo->findFile (despite the name, $localRepo is RepoGroup, not LocalRepo). RepoGroup::findFile calls Repo::findFile but has an extra layer of in-process cache which LocalFileMoveBatch does not invalidate. So probably something somewhere earlier did a RepoGroup::findFile call for the move target, false got cached, and it gets loaded from the cache in the lines before DeferredUpdates::addUpdate.

Apparenly I was confused about this myself. wfFindFile is RepoGroup::findFile, wfLocalFile` is LocalRepo::newFile (not LocalRepo::findFile as indicated above, so that's why my first fix attempt failed). Probably the target file (ie. the image table row) genuinely doesn't exist at this point so the findFile lookup returning false had nothing to do with caching problems. newFile in contrast always returns a file object.

Krinkle assigned this task to Tgr.

Change 519063 merged by jenkins-bot:
[mediawiki/core@master] Fix LocalFile::move

https://gerrit.wikimedia.org/r/519063

Thanking everyone that helped resolved this especially @Tgr for digging and solving the issue. Thanks a lot! Wasn't around for a while but saw all the notifications and hard work!

Sorry about the mess!

mmodell changed the subtype of this task from "Task" to "Production Error".Aug 28 2019, 11:06 PM