Page MenuHomePhabricator

Get rid of pointless EnqueueJob usage
Closed, ResolvedPublic

Description

The enqueue job is one of the most high-traffic jobs in the queue currently with about 230 jobs/s over the last 3 months. While discussing moving it to the new Kafka-based queue, @aaron mentioned that it was originally added for multi-DC support, but that did not play out well, so now whenever the enqueue job is used, it can be removed and replaced with directly pushing the jobs into the queue. Doing that would significantly lower the JobQueue load and potentially make the jobs get processed faster.

Could you provide some more background on this @aaron and correct me if I misunderstood what you've said?

Event Timeline

They were mentioned in https://www.mediawiki.org/wiki/Requests_for_comment/Master-slave_datacenter_strategy_for_MediaWiki#Job_queuing though it was never set up (partly from people being busy with other things). In general jobs are enqueued on POST requests or from other jobs, all in the master datacenter. In some cases, jobs are enqueued on GET or possibly POST (if the api-promise-nonwrite thing is set up in vlc) in rare cases. This should work in a way where the cross-DC propagation is async, rather than having JobQueue::push() blocking on cross-DC traffic.

AFAIK, the new queue would be able to deal with that, though I'm not 100% sure on how writes initiated in different DC perform. Where are the best docs for this area?

Another use of enqueue job is when you want to enqueue multiple jobs that you *really* prefer to all be enqueued or not at all. Since enqueue job retries several times on failure (configurable too), it's more likely to get the all enqueued than MediaWiki pushing them in one try. These kind of jobs are idempotent, so extra runs don't break. This case is not related to multi-DC. I don't see any callers relying on this.

aaron renamed this task from Get rid of enqueue job to Get rid of pointless EnqueueJob usage.Nov 23 2017, 10:03 AM

AFAIK, the new queue would be able to deal with that, though I'm not 100% sure on how writes initiated in different DC perform.

In the kafka-based queue jobs are always enqueued in the local DC. Then the jobs are asynchronously picked up and executed on the master DC, because they're posted to jobrunner.discovery.wmnet which in turn points to the job runner LVS in the master DC, so usages of the enqueue job for this use-case is unnecessary.

Change 393199 had a related patch set uploaded (by Ppchelko; owner: Ppchelko):
[mediawiki/extensions/FlaggedRevs@master] Remove usage of EnqueueJob

https://gerrit.wikimedia.org/r/393199

Change 393199 merged by jenkins-bot:
[mediawiki/extensions/FlaggedRevs@master] Remove usage of EnqueueJob

https://gerrit.wikimedia.org/r/393199

Grepping the current MW release code on tin, it looks like only FlaggedRevs and Wikibase use this job:

$ grep -iIREn enqueuejob .
./includes/DefaultSettings.php:7432:	'enqueue' => 'EnqueueJob', // local queue for multi-DC setups
./includes/jobqueue/jobs/EnqueueJob.php:36:final class EnqueueJob extends Job {
./includes/jobqueue/jobs/EnqueueJob.php:49:	 * @return EnqueueJob
./includes/jobqueue/jobs/EnqueueJob.php:59:	 * @return EnqueueJob
./autoload.php:434:	'EnqueueJob' => __DIR__ . '/includes/jobqueue/jobs/EnqueueJob.php',
./extensions/FlaggedRevs/backend/FRDependencyUpdate.php:69:				JobQueueGroup::singleton()->push( EnqueueJob::newFromLocalJobs(
./extensions/FlaggedRevs/backend/FlaggableWikiPage.php:345:		JobQueueGroup::singleton()->push( EnqueueJob::newFromLocalJobs(
./extensions/Wikidata/extensions/Wikibase/client/includes/Hooks/DataUpdateHookHandlers.php:7:use EnqueueJob;
./extensions/Wikidata/extensions/Wikibase/client/includes/Hooks/DataUpdateHookHandlers.php:168:		$enqueueJob = EnqueueJob::newFromLocalJobs( $addUsagesForPageJob );
./extensions/Wikidata/extensions/Wikibase/client/includes/Hooks/DataUpdateHookHandlers.php:170:		$this->jobScheduler->lazyPush( $enqueueJob );
./extensions/Wikibase/client/includes/Hooks/DataUpdateHookHandlers.php:7:use EnqueueJob;
./extensions/Wikibase/client/includes/Hooks/DataUpdateHookHandlers.php:168:		$enqueueJob = EnqueueJob::newFromLocalJobs( $addUsagesForPageJob );
./extensions/Wikibase/client/includes/Hooks/DataUpdateHookHandlers.php:170:		$this->jobScheduler->lazyPush( $enqueueJob );

Since FlaggedRevs has been taken care of in Gerrit 393199, that leaves only the Wikibase extension.

Change 393750 had a related patch set uploaded (by Ppchelko; owner: Ppchelko):
[mediawiki/extensions/Wikibase@master] Stop using EnqueueJob for AddUsagesForPageJob

https://gerrit.wikimedia.org/r/393750

Change 393750 merged by jenkins-bot:
[mediawiki/extensions/Wikibase@master] Stop using EnqueueJob for AddUsagesForPageJob

https://gerrit.wikimedia.org/r/393750

thiemowmde triaged this task as Low priority.
thiemowmde removed a project: Wikidata.
thiemowmde subscribed.

@Pchelolo, can this be closed as done? I had a quick look in my local code base as well as on GitHub, but could not find appearances of new EnqueueJob any more.

@thiemowmde Let's wait till this week's media wiki train to deploy all over to verify that we indeed don't have anything posting this job type.

mobrovac raised the priority of this task from Low to Medium.Dec 6 2017, 9:07 AM

Change 395847 had a related patch set uploaded (by Aaron Schulz; owner: Aaron Schulz):
[mediawiki/core@master] Change EnqueueJob docs to discourage obsolete use-cases

https://gerrit.wikimedia.org/r/395847

After today's MediaWiki deploy we have no EnqueueJob executions left. Resolving.

Change 395847 merged by jenkins-bot:
[mediawiki/core@master] Change EnqueueJob docs to discourage obsolete use-cases

https://gerrit.wikimedia.org/r/395847