Page MenuHomePhabricator

Investigate alternatives to MEDIAWIKI_JOB_RUNNER global
Closed, ResolvedPublic

Description

This global is set from /maintenance/runJobs.php and from internal /rpc endpoints, is undocumented, and is non-reliable, as for example it's not set from MediaWiki Special:RunJobs.

There's a few usages of the global.

We're planning to make WMF infrastructure use a REST endpoint for executing the jobs, and setting a global from within a REST handler would be a code smell.

We need to evaluate whether the variable is really needed. If it is, there's some alternative solutions:

  • We're planning on making the JobRunner a service and use it everywhere for executing the jobs, so we can add the capability to check whether the service is instantiated (via MediaWikiServices::peekService). The downside is that we're going to be relying on the fact that only job runners ever instantiate a JobRunner. This fact is true and would likely remain true, but nothing really prevents a non-job-runner to require the service by mistake.
  • Set flag within the service to indicate whether the execution is currently in the JobRunner context. This will probably not work for deferred updates called from within job execution.
  • Set the variable in mw-config depending on whether the execution is happening on jobrunner cluster. This will work for the WMF, but not for other installations, plus I didn't find an elegant way of figuring out which cluster is MW running on

This is an exploratory task and a proposal, keeping the status quo (defining a global) wouldn't be critically horrible either.

Event Timeline

Pchelolo renamed this task from Proposal: Deprecate/Remove MEDIAWIKI_JOB_RUNNER global to Proposal:Investigate alternatives to MEDIAWIKI_JOB_RUNNER global.Feb 11 2020, 5:20 AM
Krinkle renamed this task from Proposal:Investigate alternatives to MEDIAWIKI_JOB_RUNNER global to Investigate alternatives to MEDIAWIKI_JOB_RUNNER global.Feb 18 2020, 1:54 AM

Based on a quick audit of the handful of results, I think it's feasible to simply phase this out - without replacement.

  • [extensions/Translate] PageTranslationHooks.php: This is an ugly hack that has more to do with it being called from a specific job class, and less about jobrunners in general. This is a typical short-cut where some kind of injected flag, new private state, or existing primitive should be used instead.
  • [extensions/Translate] MessageGroupStats.php: An old optimisation for handling DeferredUpdates. Afaik, this can just be removed.
  • [core] ServiceWiring.php for PageEditStash: This is about deciding whether to run PageEditStash::checkCache(), to avoid false negatives in our logs. This is about detecting whether a request is from a real user (e.g. user interface like action=edit that may do stashing). It should ignore all rest edits, api edits, CLI edits and job edits. Detecting index.php should be easy to do via MW_ENTRY_POINT constant. In addition, in order to still have stashing work for VisualEditor (which is a user interface, but uses the API via JavaScript), perhaps introduce an ApiEdit parameter for explicitly opt-ing to stashing. This would also resolve other technical debt there as it we currently have lots of false negatives from the API already, which we only accept today because we don't have this option.
  • [core] runJobs.php: (Defines the constant)
  • [wmf-config] rpc: (Defines the constant)
  • extensions/BlueSpiceFoundation: This combines the Setup.php hook (ExtensionFunctions) with the MEDIAWIKI_JOB_RUNNER constant, to unconditionally queue a job at the beginning of each invocation of maintenance/runJobs.php. I don't know why, but if this is an important use-case we should probably add a hook to that maintenance script instead, or to JobRunner main execute method. Alternatively, BlueSpice seems sufficiently complex that they may want to subclass JobRunner and move the code there, and register their implementation as the main job runner instead. Or they could file a task for the higher-level thing this is meant to accomplish to see if there's another way within our architecture to make that work.
  • [extensions/CirrusSearch]: This is a string for logging purpose only. Trivially inverted. Right now it is coded as cli-or-job: cli, API: api, default: web. It can instead be coded as index.php: web, API: api, default: misc where misc captures CLI, Job and other non-user-facing contexts.
  • [extensions/WikimediaEvents] This is a string for logging purpose only. Trivially inverted.

The only ones that matter here are:

  1. Translate: PageTranslationHooks.php (ugly hack due to lack of state between Translate's jobs and Translate's MovePage hook.)
  2. core: PageEditStash ServiceWiring (detect user vs non-user context).
  3. BlueSpiceFoundation: RunJobsTriggerRunner.php (JobRunner setup hook).

@Krinkle Thanks for the audit! For the most part, I agree with your assessment but I'm not completely sure about BlueSpiceFoundation and how a change there might impact the BlueSpice suite. @Mglaser can you provide more detail here?

[extensions/CirrusSearch]: This is a string for logging purpose only

These are logs, but probably not the logs you are thinking of. These are logs that go to hadoop and feed our analytics and ML pipelines. In this context it's important to be able to filter out logs generated by the job runner infrastructure, as they are primarily concerned with user behaviour.

[extensions/CirrusSearch]: This is a string for logging purpose only

These are logs, but probably not the logs you are thinking of. […]

Continuing at T247129: Remove usage of MEDIAWIKI_JOB_RUNNER from CirrusSearch extension.

CCicalese_WMF subscribed.

Marking as Resolved as it is in the Done column. Feel free to reopen if there is remaining work.