Page MenuHomePhabricator

[Skins] Uncaught Error: Call to a member function isLocal() on boolean
Open, LowPublic


Today during chores, I noticed a few of these errors:

PHP Fatal error:  Uncaught Error: Call to a member function isLocal() on boolean in /srv/mediawiki/php-1.35.0-wmf.28/includes/page/WikiFilePage.php:125
Stack trace:
#0 /srv/mediawiki/php-1.35.0-wmf.28/includes/skins/SkinTemplate.php(995): WikiFilePage->isLocal()
#1 /srv/mediawiki/php-1.35.0-wmf.28/includes/skins/SkinTemplate.php(459): SkinTemplate->buildContentNavigationUrls()
#2 /srv/mediawiki/php-1.35.0-wmf.28/skins/MinervaNeue/includes/skins/SkinMinerva.php(143): SkinTemplate->prepareQuickTemplate()

Acceptance Criteria

  • Error does not appear in Logstash

Developer notes

  • There are about 6 reports on the 20th April. Isolated incident?
  • Update: Several more reports on May 2

Event Timeline

nray created this task.Apr 20 2020, 11:58 PM
Restricted Application added subscribers: Masumrezarock100, Aklapper. · View Herald TranscriptApr 20 2020, 11:58 PM
nray updated the task description. (Show Details)Apr 21 2020, 3:44 PM
Krinkle added a subscriber: Krinkle.EditedApr 29 2020, 6:12 PM

@nray It looks like these were captured from the internal syslog channel which is lacking a lot of valueable information you'd need to determine how this is impacting users and how to reproduce it (e.g. which wikis, which URLs, HTTP method, referer, etc.).

The link to Logstash doesn't work unfortunately as it's not the sharable link but your personal session link. I do note from the screenshot that it appears to be based on a permalink to an outdated version of the Reading Web dashboard. The latest uses type:mediawiki instead which has all the added information.

Let me know if you'd like a refresher session with myself and/or CPT on Logstash. A lot has changed in recent years. Happy to help :)

Jdlrobson renamed this task from [Minerva] Uncaught Error: Call to a member function isLocal() on boolean to [Skins] Uncaught Error: Call to a member function isLocal() on boolean.May 1 2020, 3:11 PM
Jdlrobson lowered the priority of this task from Medium to Low.
Jdlrobson updated the task description. (Show Details)
Jdlrobson edited projects, added MediaWiki-Interface; removed MinervaNeue.
Jdlrobson updated the task description. (Show Details)
Jdlrobson added a subscriber: Jdlrobson.

The error occurred in buildContentNavigationUrls inside SkinTemplate so have dropped the Minerva tag.

I assumed it was Minerva-related because the message is only seen with traces that include SkinMinerva in the callstack, or with a mobile url from a post-send stack trace (after the SkinMinerva call is unwound and thus no longer visible). There are no matches with other skins in the call stack. (I've looked for the past 30 days and found none after excluding SkinMinerva and post-send).

The actual point where it blows up is indeed in the core SkinTemplate->buildContentNavigationUrls method, which suggests that either SkinMinerva has prepared the instance in a way that core doesn't expect, or some other extension code that only runs in combination with Minerva (e.g. MobileFrontend or one of its hook) has broken the Skin object indirectly.

nray updated the task description. (Show Details)May 4 2020, 8:41 PM
Jdlrobson added a comment.EditedMay 8 2020, 8:57 PM

In SkinTemplate $page = $this->canUseWikiPage() ? $this->getWikiPage() : false;

getWikiPage comes from ContextSource::getWikiPage this calls $this->getContext()->getWikiPage();. getContext returns a RequestContext (always) so that brings us to RequestContext:: getWikiPage

RequestContext::getWikiPage will always return a WikiPage::factory or throw an exception.

The factory will either return the result of the WikiPageFactory hook or an instance of WikiFilePage, WikiPage or WikiCategoryPage depending on the title so context matters here. WikiCategoryPage and WikiPage isLocal always returns true. WikiFilePage should always return some kind of WikiPage.

The error is occuring inside WikiFilePage so something is going wrong inside WikiFilePage:loadFile and mFile is not being defined.

Looking at the function body the method should account for mFile being defined using the following method:

$this->mFile = $services->getRepoGroup()->getLocalRepo()
				->newFile( $this->mTitle ); // always a File

However the comment there is wrong. It's not always a File.

According to

newFile @return File|null A File, or null if passed an invalid Title

There is no accounting for the situation when this is null.

how null becomes a boolean I'm not sure. It's possible the function signature here is also wrong. call_user_func either calls UnregisteredLocalFile::class, 'newFromTitle'

This feels like it might relate to the recent changes to Title in T246284 so I'm gonna cc @DannyS712 and @Ppchelko @Pchelolo

There was a recent change in 21e2d71560c by Alangi Derick @D3r1ck01 (I9437494de003f40fbe591321da7b42d16bb732d6) which could also account for this.

Jdlrobson added a subscriber: Pchelolo.

how null becomes a boolean I'm not sure. It's possible the function signature here is also wrong.

$mFile has been initialized as a boolean.

Tgr added a comment.May 11 2020, 10:48 AM

When mFile is set via loadFile(), the call ends with $this->mRepo = $this->mFile->getRepo(); which would not pass if mFile weren't a File. So either mFile is set to false via setFile() (doesn't happen in core,m at least, but a type hint wouldn't hurt) or it's never initialized in the first place.

Change 595500 had a related patch set uploaded (by Gergő Tisza; owner: Gergő Tisza):
[mediawiki/core@master] Typehint WikiFilePage::setFile

Tgr added a comment.May 11 2020, 10:58 AM

... or it's never initialized in the first place.

But then isLocal does call loadFile so that can't happen either.

Change 595500 merged by jenkins-bot:
[mediawiki/core@master] Typehint WikiFilePage::setFile

It appears to be passed a boolean. The question is why...

Tgr added a comment.May 29 2020, 5:31 PM

I don't see anything relevant in the logs after May 2.

Jdlrobson closed this task as Resolved.Jun 26 2020, 6:39 PM
Jdlrobson claimed this task.
Tgr added a comment.Sep 10 2020, 10:41 PM

Huh. This seems logically impossible:

  • WikiFilePage->isLocal() calls loadFile() before trying to call isLocal() on mFile (which is apparently a boolean)
  • unless mFileLoaded is true, loadFile() will load the file and then call a method on it (so we'd get a different error if this were a loading failure)
  • mFileLoaded can only be set true by loadFile (which does load a file) and setFile (which is also guaranteed to load the file, we have a typehint for that now)
  • there is nothing that could unset mFile once it is loaded

It's a bit cheap to handwave at opcache corruption, but I don't have any other theory.

I can't speak to the impossible nature of the error, but in terms of opcache-iness, there's two signs that lead me to think it isn't opcache corruption:

  • We have seen opcache cause method calls on the wrong object, typically one "nearby" within the same function. However I've not yet seen it do that on a non-object. This would be a first, to my knowledge. See T245183 for details.
  • This error was seen on multiple servers in the same hour in Eqiad, and is now still seen on several servers in Codfw. The probability of the same exact corruption happening twice in the same hour on different servers seems extremely small (that is, so far I've yet to see any corruption happen on more than one server), much smaller for it to also happen in Codfw some weeks later, on multiple servers, spread out of over three separate days.