
Categorization via the job queue is unacceptable
Closed, ResolvedPublic

Description

User problem:

Even if you run runJobs.php non-stop, there is a delay.

Users are used to action=purge, but that doesn't help here either.

And there are more problems: https://www.mediawiki.org/w/index.php?title=Topic:Tb8v3cjly5vffl05

To sum it up: after adding a page to a category, I click on the category link and the page is not there.
This is unacceptable.

https://en.wikipedia.org/wiki/Usability

Hook problem:

With onPageContentSaveComplete you can grab the categories changed by an edit. But if you then delete the related category entries from memcached, you end up with wrong caches whenever a user visits a category page with a freshly flushed cache before runJobs has completed the categorization.

T125366 would fix it.
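
The hook-based cache flushing described above could look roughly like this (a minimal sketch assuming MediaWiki 1.27; the cache key scheme "category-page-data" is an invented example, not anything from this task):

```php
// Hypothetical snippet for LocalSettings.php or an extension.
// Hook name and signature are as in MediaWiki 1.27; the cache key is invented.
$wgHooks['PageContentSaveComplete'][] = function (
	$wikiPage, $user, $content, $summary, $isMinor,
	$isWatch, $section, $flags, $revision, $status, $baseRevId
) {
	$cache = ObjectCache::getMainWANInstance();
	// Flush a per-category cache entry for each category the page belongs to.
	// Caveat (the point of this task): categorylinks may not yet reflect this
	// edit here, because LinksUpdate runs later via the job queue, so a reader
	// can repopulate the cache with stale data in the meantime.
	foreach ( $wikiPage->getTitle()->getParentCategories() as $catName => $_ ) {
		$cache->delete( wfMemcKey( 'category-page-data', md5( $catName ) ) );
	}
	return true;
};
```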

Performance win?

What is the performance win anyway? Categorization happens after the page has been saved, so there is no performance win for the user.

Event Timeline

Subfader created this task. · Sep 9 2016, 4:03 PM
Restricted Application added a subscriber: Aklapper. · View Herald Transcript · Sep 9 2016, 4:03 PM
Subfader renamed this task from "Categorization via jobs is unacceptable" to "Categorization via the job queue is unacceptable". · Sep 9 2016, 4:04 PM
Subfader added a comment. · Edited · Sep 9 2016, 8:32 PM

I use the hook PageContentSaveComplete to flush some memcached stuff based on category changes.

When a user visits a category page before the job has processed the edit, a new cache entry is written with the old categorization.

  1. How can I disable this bad categorization via jobs?
  2. What is the performance win anyway? Categorization is done after page saving. No performance win for the user.
  3. Can you at least add a hook CategorizationJobComplete so I can grab changed categorization this way?

Hi @Subfader, thanks for taking the time to report this!
Unfortunately this report lacks some information. If you have time and can still reproduce the problem, please add a more complete description to this report. Ideally, exact and clear steps to reproduce should allow any other person to follow these steps (without having to interpret those steps) and see the same results. Problems that others can reliably reproduce can get fixed faster. Thanks!

Subfader added a comment. · Edited · Sep 10 2016, 10:17 AM

I edited the top post.

Subfader updated the task description. (Show Details) · Sep 10 2016, 10:21 AM
Subfader updated the task description. (Show Details) · Sep 10 2016, 10:23 AM
Subfader updated the task description. (Show Details) · Sep 10 2016, 11:23 AM

I edited the top post.

I don't see a list of steps to see and reproduce the problem. :(

Subfader added a comment. · Edited · Sep 10 2016, 12:12 PM

I edited the top post.

I don't see a list of steps to see and reproduce the problem. :(

  1. Add a page to a category
  2. Click on the category link
  3. The page is not there

If that's true, that surely explains the boom of similar reports that brought up T142751: Turn $wgRunJobsAsync off by default. The description of that task has several examples that apparently were solved by running jobs synchronously, though.

Subfader added a comment. · Edited · Sep 10 2016, 12:46 PM

With $wgRunJobsAsync = true; new pages don't appear in a category after saving. On mediawiki.org the delay is ~20 seconds.

But I don't want to use sync jobs. I want a high-performance site, and I don't see any performance win in delayed categorization.

In fact, the single job for delayed categorization forces me to turn off ALL async jobs?
Super performance win then!!!

Not my MediaWiki :(

Subfader updated the task description. (Show Details) · Sep 10 2016, 1:01 PM
Subfader added a comment. · Edited · Sep 10 2016, 1:13 PM

We have $wgJobTypesExcludedFromDefaultQueue but that only excludes jobs from runJobs.php

What about a global to exclude certain tasks completely from the job queue, so users can keep $wgRunJobsAsync = true?

$wgTasksExcludedFromJobQueue = [ 'categorization' ];

This affects not only categorylinks; it is also about templatelinks, imagelinks and externallinks in the corresponding tables. This happens because the LinksUpdate after an edit is part of the job queue (I would say that https://gerrit.wikimedia.org/r/#/c/242649/ is the change).

Subfader added a comment. · Edited · Sep 11 2016, 11:09 AM

Thanks Umherirrender!

In includes/MediaWiki.php, in function restInPeace(), I changed
DeferredUpdates::doUpdates( 'enqueue' ); back to
DeferredUpdates::doUpdates( 'commit' ); (as it was in MW 1.25.1)

Now with $wgJobRunRate = 0; and $wgRunJobsAsync = true; categorization is done immediately (job_cmd "refreshLinksPrioritized" does not appear) and other jobs still pop up in the jobs table.

aaron added a comment. · Edited · Sep 11 2016, 11:39 AM

Thanks Umherirrender!
In includes/MediaWiki.php, in function restInPeace(), I changed
DeferredUpdates::doUpdates( 'enqueue' ); back to
DeferredUpdates::doUpdates( 'commit' ); (as it was in MW 1.25.1)
Now with $wgJobRunRate = 0; and $wgRunJobsAsync = true; categorization is done immediately (job_cmd "refreshLinksPrioritized" does not appear) and other jobs still pop up in the jobs table.

Note that it is the setting $wgJobRunRate = 0 that disables async jobs at the end of web requests. I'd suggest using 3 and keeping the 'commit'/'run' (it's called 'run' now) change.

MediaWiki could possibly make links updates POSTSEND when $wgJobRunRate > 0. Or it could keep enqueueing them but pick them up at the end of the request if JobQueueGroup::queuesHaveJobs() had its cache cleared on job insertion.

Change 309826 had a related patch set uploaded (by Aaron Schulz):
Make JobQueueGroup::push() update the queuesHaveJobs() cache

https://gerrit.wikimedia.org/r/309826

@aaron: Thanks. I have this setup

  • $wgJobRunRate = 0;
  • $wgRunJobsAsync = true;
  • restInPeace() > DeferredUpdates::doUpdates( 'run' );
  • maintenance/runJobs.php > via cronjob (minutely)

Can I use your patch in MW 1.27.1?
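
The setup listed above can be sketched as follows (the install path and the --maxtime value are illustrative assumptions, not taken from this task):

```
# LocalSettings.php: don't run jobs during web requests
$wgJobRunRate = 0;

# crontab (minutely), assuming the wiki lives in /var/www/wiki:
* * * * * php /var/www/wiki/maintenance/runJobs.php --maxtime=55 >/dev/null 2>&1
```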

I use vagrant, which uses the redis queue and redisJobRunner that runs continuously.

Having $wgJobRunRate = 0 *and* $wgRunJobsAsync = true doesn't make sense, since the HTTP request for async jobs will never happen. The patch I have above makes it so that the code in MediaWiki.php that runs if $wgJobRunRate > 0 will actually see the jobs just enqueued by DeferredUpdates::doUpdates( 'enqueue' );.

Subfader added a comment. · Edited · Sep 11 2016, 1:12 PM

https://www.mediawiki.org/wiki/Manual:$wgRunJobsAsync

When the execution of jobs during normal page requests is enabled (by setting $wgJobRunRate to a number greater than 0; it defaults to 1), then this variable controls whether to execute them asynchronously or not.

https://www.mediawiki.org/wiki/Manual:$wgJobRunRate

If this is zero, jobs will not be done during ordinary apache requests. In this case, maintenance/runJobs.php should be run periodically.

Ok, so $wgJobRunRate = 0; is enough already.

Now I also understand why they say $wgRunJobsAsync = false; (kind of) fixes the categorization problem. Well, an explanation "but only if you use $wgJobRunRate > 0" would be helpful. I understood $wgRunJobsAsync to mean that you need to set it to true when you use runJobs.php in a cronjob (which is async to HTTP requests).

So my hack with DeferredUpdates::doUpdates( 'run' ); doesn't break it?

vagrant seems complicated. I don't even understand $wgRunJobsAsync ;)

Subfader added a comment. · Edited · Sep 11 2016, 1:38 PM

And back to the topic: users complain that they need to see category changes immediately. Therefore it cannot be delayed via jobs (of whatever kind).

Especially not when users want to run jobs via a cronjob for performance reasons and are now forced to run jobs on HTTP requests ($wgJobRunRate > 2 + $wgRunJobsAsync = false).

What about my suggestion to use a global to exclude such tasks from jobs completely?

$wgTasksExcludedFromJobQueue = [ 'categorization' ]; or
$wgTasksExcludedFromJobQueue = [ 'LinksUpdate' ];

I understood $wgRunJobsAsync to mean that you need to set it to true when you use runJobs.php in a cronjob (which is async to HTTP requests).

No. $wgJobRunRate controls whether jobs are executed during normal page requests at all, and if so, how many. If you use a cronjob you normally want to set $wgJobRunRate to 0. And $wgRunJobsAsync only applies if $wgJobRunRate > 0.
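
So the two coherent configurations can be summarized like this (values are illustrative; per the manual quoted above, $wgJobRunRate defaults to 1):

```
# Option A: run jobs during web requests
$wgJobRunRate = 1;       # attempt one job per page request (the default)
$wgRunJobsAsync = true;  # run it via a non-blocking internal HTTP request

# Option B: run jobs only from cron; $wgRunJobsAsync is then irrelevant
$wgJobRunRate = 0;
# crontab: * * * * * php maintenance/runJobs.php
```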

Subfader added a comment. · Edited · Sep 11 2016, 1:48 PM

Yep. I always had $wgJobRunRate = 0, so misunderstanding $wgRunJobsAsync didn't break anything :)

Change 309826 merged by jenkins-bot:
Make JobQueueGroup::push() update the queuesHaveJobs() cache

https://gerrit.wikimedia.org/r/309826

Change 310465 had a related patch set uploaded (by Aaron Schulz):
Make JobQueueGroup::push() update the queuesHaveJobs() cache

https://gerrit.wikimedia.org/r/310465

Change 310465 merged by jenkins-bot:
Make JobQueueGroup::push() update the queuesHaveJobs() cache

https://gerrit.wikimedia.org/r/310465

All patches merged.

What exactly is left to do in this task?

Aklapper closed this task as Resolved. · Dec 21 2016, 6:22 AM
Aklapper claimed this task.

All patches merged.
What exactly is left to do in this task?

No reply, hence assuming this is resolved.