Page MenuHomePhabricator

TypeError: Argument 2 passed to Parser::preSaveTransform() must implement interface MediaWiki\Page\PageReference, null given, called in /srv/mediawiki/php-1.38.0-wmf.7/includes/preferences/DefaultPreferencesFactory.php on line
Closed, ResolvedPublic

Description

Job execution exception on wmf.7 happening since Nov 8, 2021 @ 20:06:49.528: https://logstash.wikimedia.org/goto/25b03cc33092b0b14889229636a3fa84

Example:

{
  "_index": "logstash-deploy-2021.11.11",
  "_type": "_doc",
  "_id": "vXCVDn0Bf67HOmRfdPIW",
  "_version": 1,
  "_score": null,
  "_source": {
    "phpversion": "7.2.34-18+0~20210223.60+debian10~1.gbpb21322+wmf3",
    "host": "mw1338",
    "caught_by": "other",
    "channel": "exception",
    "message_checksum": "e33dd19c92cc84abfd56692ca3b123ae",
    "http_method": "POST",
    "@version": 1,
    "timestamp": "2021-11-11T10:40:14+00:00",
    "program": "mediawiki",
    "shard": "s7",
    "tags": [
      "input-kafka-rsyslog-udp-localhost",
      "rsyslog-udp-localhost",
      "kafka",
      "es",
      "es"
    ],
    "@timestamp": "2021-11-11T10:40:14.683Z",
    "monolog_level": 400,
    "logsource": "mw1338",
    "servergroup": "jobrunner",
    "severity": "err",
    "message": "[50ec917e-24cb-403d-a60b-558b5c41e21c] /rpc/RunSingleJob.php   TypeError: Argument 2 passed to Parser::preSaveTransform() must implement interface MediaWiki\\Page\\PageReference, null given, called in /srv/mediawiki/php-1.38.0-wmf.7/includes/preferences/DefaultPreferencesFactory.php on line 595",
    "server": "jobrunner.discovery.wmnet",
    "wiki": "metawiki",
    "exception_url": "/rpc/RunSingleJob.php",
    "level": "ERROR",
    "normalized_message": "[{reqId}] {exception_url}   TypeError: Argument 2 passed to Parser::preSaveTransform() must implement interface MediaWiki\\Page\\PageReference, null given, called in /srv/mediawiki/php-1.38.0-wmf.7/includes/preferences/DefaultPreferencesFactory.php on line ",
    "mwversion": "1.38.0-wmf.7",
    "type": "mediawiki",
    "reqId": "50ec917e-24cb-403d-a60b-558b5c41e21c",
    "referrer": null,
    "facility": "user",
    "url": "/rpc/RunSingleJob.php",
    "exception": {
      "file": "/srv/mediawiki/php-1.38.0-wmf.7/includes/parser/Parser.php:4554",
      "class": "TypeError",
      "message": "Argument 2 passed to Parser::preSaveTransform() must implement interface MediaWiki\\Page\\PageReference, null given, called in /srv/mediawiki/php-1.38.0-wmf.7/includes/preferences/DefaultPreferencesFactory.php on line 595",
      "trace": "from /srv/mediawiki/php-1.38.0-wmf.7/includes/parser/Parser.php(4554)\n#0 /srv/mediawiki/php-1.38.0-wmf.7/includes/preferences/DefaultPreferencesFactory.php(595): Parser->preSaveTransform(string, NULL, User, ParserOptions)\n#1 /srv/mediawiki/php-1.38.0-wmf.7/includes/preferences/DefaultPreferencesFactory.php(238): MediaWiki\\Preferences\\DefaultPreferencesFactory->profilePreferences(User, RequestContext, array)\n#2 /srv/mediawiki/php-1.38.0-wmf.7/extensions/GlobalPreferences/includes/GlobalPreferencesFactory.php(117): MediaWiki\\Preferences\\DefaultPreferencesFactory->getFormDescriptor(User, RequestContext)\n#3 /srv/mediawiki/php-1.38.0-wmf.7/extensions/GlobalPreferences/includes/GlobalPreferencesFactory.php(531): GlobalPreferences\\GlobalPreferencesFactory->getFormDescriptor(User, RequestContext)\n#4 /srv/mediawiki/php-1.38.0-wmf.7/extensions/TheWikipediaLibrary/includes/PreferenceHelper.php(39): GlobalPreferences\\GlobalPreferencesFactory->setGlobalPreferences(User, array, RequestContext)\n#5 /srv/mediawiki/php-1.38.0-wmf.7/extensions/TheWikipediaLibrary/includes/TheWikipediaLibraryHooks.php(137): MediaWiki\\Extension\\TheWikipediaLibrary\\PreferenceHelper::setGlobalPreference(User, string, string)\n#6 /srv/mediawiki/php-1.38.0-wmf.7/includes/deferred/MWCallableUpdate.php(38): TheWikipediaLibraryHooks::{closure}()\n#7 /srv/mediawiki/php-1.38.0-wmf.7/includes/deferred/DeferredUpdates.php(515): MWCallableUpdate->doUpdate()\n#8 /srv/mediawiki/php-1.38.0-wmf.7/includes/deferred/DeferredUpdates.php(391): DeferredUpdates::attemptUpdate(MWCallableUpdate, Wikimedia\\Rdbms\\LBFactoryMulti)\n#9 /srv/mediawiki/php-1.38.0-wmf.7/includes/deferred/DeferredUpdates.php(234): DeferredUpdates::run(MWCallableUpdate, Wikimedia\\Rdbms\\LBFactoryMulti, Monolog\\Logger, BufferingStatsdDataFactory, string)\n#10 /srv/mediawiki/php-1.38.0-wmf.7/includes/deferred/DeferredUpdatesScope.php(267): DeferredUpdates::{closure}(MWCallableUpdate, integer)\n#11 /srv/mediawiki/php-1.38.0-wmf.7/includes/deferred/DeferredUpdatesScope.php(196): DeferredUpdatesScope->processStageQueue(integer, integer, Closure)\n#12 /srv/mediawiki/php-1.38.0-wmf.7/includes/deferred/DeferredUpdates.php(237): DeferredUpdatesScope->processUpdates(integer, Closure)\n#13 /srv/mediawiki/php-1.38.0-wmf.7/includes/deferred/DeferredUpdatesScope.php(267): DeferredUpdates::{closure}(AtomicSectionUpdate, integer)\n#14 /srv/mediawiki/php-1.38.0-wmf.7/includes/deferred/DeferredUpdatesScope.php(196): DeferredUpdatesScope->processStageQueue(integer, integer, Closure)\n#15 /srv/mediawiki/php-1.38.0-wmf.7/includes/deferred/DeferredUpdates.php(242): DeferredUpdatesScope->processUpdates(integer, Closure)\n#16 /srv/mediawiki/php-1.38.0-wmf.7/extensions/EventBus/includes/JobExecutor.php(105): DeferredUpdates::doUpdates()\n#17 /srv/mediawiki/rpc/RunSingleJob.php(76): MediaWiki\\Extension\\EventBus\\JobExecutor->execute(array)\n#18 {main}",
      "code": 0
    }
  },
  "fields": {
    "@timestamp": [
      "2021-11-11T10:40:14.683Z"
    ]
  },
  "highlight": {
    "normalized_message.keyword": [
      "@kibana-highlighted-field@[{reqId}] {exception_url}   TypeError: Argument 2 passed to Parser::preSaveTransform() must implement interface MediaWiki\\Page\\PageReference, null given, called in /srv/mediawiki/php-1.38.0-wmf.7/includes/preferences/DefaultPreferencesFactory.php on line @/kibana-highlighted-field@"
    ],
    "channel.keyword": [
      "@kibana-highlighted-field@exception@/kibana-highlighted-field@"
    ],
    "type": [
      "@kibana-highlighted-field@mediawiki@/kibana-highlighted-field@"
    ]
  },
  "sort": [
    1636627214683
  ]
}

Not super-frequent, 4K occurrences in a few days, but enough to trigger the exception & error monitoring threshold: https://grafana.wikimedia.org/d/000000438/mediawiki-exceptions-alerts?orgId=1&var-datasource=eqiad%20prometheus%2Fops&viewPanel=18&from=1636626110938&to=1636627968819

Screenshot from 2021-11-11 11-55-57.png (976×2 px, 92 KB)

Screenshot from 2021-11-11 11-56-58.png (840×2 px, 87 KB)

Adding a few heuristic tags based on error message (apologies if not accurate).

Event Timeline

Restricted Application added a subscriber: RhinosF1. · View Herald Transcript
Urbanecm added a subscriber: Urbanecm.

This is definitely directly related to TheWikipediaLibrary deployment. @jsn.sherman and the team should be able to fix this. Triaging to High preliminary.

@jcrespo If at any time this starts to be unbearable for infra/humans, do let me know and I'm happy to uninstall the extension. Alternatively you can revert https://gerrit.wikimedia.org/r/c/operations/mediawiki-config/+/736863 yourself.

Thanks for bringing this to my attention. I'm out today, but wanted to acknowledge seeing this. I'll be taking a look at our global preferences helper first thing tomorrow.

Thanks for the quick response- having found a most probable cause, I will remove the potentially unrelated tags.

Update:
I've been unable to reproduce this error in mediawiki vagrant, but it looks to me like the context we're passing into setGlobalPreferences() isn't what's expected.
Right now we're just grabbing RequestContext::getMain() because that's what other examples we found were using and it works in the cases we tested.
The documentation for RequestContext says we should be only do this as a last resort, so I've been looking for other ways to pass the context to the preferences factory.

It looks like we could probably derive a safer value from $wikiPage->getTitle() in our onPageSaveComplete hook, but I'd really like to reproduce the failure state so I can write a test for this.

Change 738494 had a related patch set uploaded (by Urbanecm; author: Urbanecm):

[mediawiki/extensions/TheWikipediaLibrary@master] Ignore bots and system accounts

https://gerrit.wikimedia.org/r/738494

@jsn.sherman Do note this happens in a job context (because the URI is /rpc/RunSingleJob.php, and because mw1338 is a job runner). I've skimmed through beta's logs -- nothing similar appears to happen there.

I don't immediately see a job enqueued by your code, so I digged down deeper the logs to find out which kind of a job gets executed here (via JobExecutor.log). At the logstash URL from the description, 64d0230e-c11b-4644-b8e1-9b8c6cb850af is the reqID of last failed entry, and this is an example of a job (there's a bunch of others, at a first glance I didn't see anything else; see next comment):

2021-11-13 00:13:39 [64d0230e-c11b-4644-b8e1-9b8c6cb850af] mw1338 metawiki 1.38.0-wmf.7 JobExecutor DEBUG: Beginning job execution {"job":"cirrusSearchElasticaWrite Special: method=sendData arguments=[\"content\",[{\"data\":{\"version\":22328067,\"wiki\":\"metawiki\",\"namespace\":0,\"namespace_text\":\"\",\"title\":\"Access to nonpublic personal data policy/yi\",\"timestamp\":\"2021-11-13T00:12:29Z\",\"create_timestamp\":\"2014-01-08T11:23:00Z\",\"redirect\":[],\"incoming_links\":1},\"params\":{\"_id\":\"3340104\",\"_type\":\"\",\"_index\":\"\",\"_cirrus_hints\":{\"BuildDocument_flags\":0,\"noop\":{\"version\":\"documentVersion\",\"incoming_links\":\"within 20%\"}}},\"upsert\":true}]] cluster=codfw createdAt=1636762351 errorCount=0 retryCount=0 requestId=64d0230e-c11b-4644-b8e1-9b8c6cb850af","job_type":"cirrusSearchElasticaWrite"}

It looks it's...cirrusSearchElasticaWrite. I don't think that job causes it (I don't see why CirrusSearch enabled would change anything), however, it provides some important metadata already: it says the error might've happened at https://meta.wikimedia.org/w/index.php?title=Access_to_nonpublic_personal_data_policy/yi&action=history. Looking at the history page, the page was edited at around the time the job started (and almost (?) exactly at the timestamp noted in the job's metadata), and it was edited by FuzzyBot (which makes sense, considering this page is Translate-managed).

My current working theory: Since TheWikipediaLibraryHooks doesn't exclude system accounts nor bot accounts, it gets executed for FuzzyBot as well. It runs in a job runner, because FuzzyBot operates (mainly?) via jobs. Since the bot has millions of edits globally, the extension tries to send the notification, and for some reason, the parser doesn't receive the relevant title, which causes the failure.

Uploaded a patch to ignore system/bot accounts. Note the patch is completely untested – I hope the information provided above will be sufficient for @jsn.sherman to fix this in a proper way.

I just marked https://meta.wikimedia.org/wiki/Global_bans for translation to test the theory from above. It looks that having Translate enabled and marking a page for translation (which should make FuzzyBot do some edits via MediaWiki job) causes the issue -- new errors appeared in logstash the second I finished the operation.

FTR, P17740 (private paste, @jsn.sherman added manually) is a transcript of all jobs executed by my test edit. Error logged to Logstash is the same as in the description already, so not pasting that.

This is really great context. Thank you very much! 🙏

Change 739019 had a related patch set uploaded (by Jsn.sherman; author: Jsn.sherman):

[mediawiki/extensions/TheWikipediaLibrary@master] Avoid TypeError when setting global prefernce

https://gerrit.wikimedia.org/r/739019

Okay, between the stacktrace, the poc patch, and the discussion here, this is what I have in Change 739019 so far:

  • I've excluded members of bot as suggested by @Urbanecm. isSystemAccount() doesn't seem to be an available method, so maybe it used to be a thing that's now gone?
    • I manually tested in wikimedia-vagrant to verify that this works as expected. Bot accounts don't get global preferences fiddled with nor do they get notifications.
  • I also added in a guard clause to the preference helper to check for a wikipage attached to the RequestContext before attempting to set a preference. The preference factory needs a page reference, and based on a quick perusal of the source, this looks like a handy way to exclude some requests that won't have it. I manually tested and verified that this doesn't create any problems in our use cases.
  • I don't have unit tests for these changes written because I haven't come up a good unit test scenario for the error yet.

Change 739019 merged by jenkins-bot:

[mediawiki/extensions/TheWikipediaLibrary@master] Avoid TypeError when setting global prefernce

https://gerrit.wikimedia.org/r/739019

Okay, between the stacktrace, the poc patch, and the discussion here, this is what I have in Change 739019 so far:

  • I've excluded members of bot as suggested by @Urbanecm. isSystemAccount() doesn't seem to be an available method, so maybe it used to be a thing that's now gone?

It's just called User::isSystemUser(), dunno why I ended up with isSystemAccount(). Fixed the PoC patch.

ah, helpful. I even grepped for that, I just got the capitalization wrong. I should always set to case insensitive. Once again, I manually verified that this does not regress anything in mediawiki-vagrant just to be sure. Tiny patch inbound.

Change 739031 had a related patch set uploaded (by Jsn.sherman; author: Jsn.sherman):

[mediawiki/extensions/TheWikipediaLibrary@master] Ignore system accounts

https://gerrit.wikimedia.org/r/739031

Change 739031 merged by jenkins-bot:

[mediawiki/extensions/TheWikipediaLibrary@master] Ignore system accounts

https://gerrit.wikimedia.org/r/739031

All changes except that last one for system accounts made onto this week's train. I'm hoping we'll see the error rate drop with what we have.

All changes except that last one for system accounts made onto this week's train. I'm hoping we'll see the error rate drop with what we have.

Based on current trends, the error rate at the moment seems to be the same (e.g. on metawiki)- not sure if you meant it is supposed to have dropped already or it will be next week: https://logstash.wikimedia.org/goto/485ce2a74d85c779a86574f04afb5a45

This is still causing alerting to produce rate alerts:

PROBLEM - MediaWiki exceptions and fatals per minute for jobrunner on alert1001 is CRITICAL: 295 gt 100 https://wikitech.wikimedia.org/wiki/Application_servers https://grafana.wikimedia.org/d/000000438/mediawiki-alerts?panelId=18&fullscreen&orgId=1&var-datasource=eqiad+prometheus/ops

the error rate at the moment seems to be the same (e.g. on metawiki)

I don't think this is deployed to Meta just yet (Meta still seems to be on .7)

the error rate at the moment seems to be the same (e.g. on metawiki)

I don't think this is deployed to Meta just yet (Meta still seems to be on .7)

Should be in about 2 hours

Thanks for the clarification -I misunderstood as if it may be already deployed-, errors stopped (so far) at Nov 18, 2021 @ 04:29:01.232:
https://logstash.wikimedia.org/goto/8cc92f7dca6d290148ea814f724de5f9

General jobqueue errors got reduced significantly, too and no longer alerting: https://grafana.wikimedia.org/d/000000438/mediawiki-exceptions-alerts?orgId=1&var-datasource=eqiad%20prometheus%2Fops&viewPanel=18&from=1637135668122&to=1637234332225

Thank you all.

Change 738494 abandoned by Urbanecm:

[mediawiki/extensions/TheWikipediaLibrary@master] Ignore bots and system accounts

Reason:

in favor of Ie13e25413a48d32d6d31cda52d3196ee778aee25

https://gerrit.wikimedia.org/r/738494