
LinksUpdate totally broken when JobQueueDB is in use
Closed, Resolved, Public

Description

Setup

  • MediaWiki 1.29.0-rc.0 (37c22fc) 23:03, 22 June 2017
  • PHP 5.6.30-0+deb8u1 (apache2handler)
  • MariaDB 10.0.31-MariaDB-1~jessie

Pages are no longer propagated into the categories added to them. This used to be an issue with 1.27.0 and 1.27.1 as well as 1.28.0. Now the issue is back again. The job queue is run via cron and is empty.

Event Timeline

I have been advised by @Ciencia_Al_Poder on this spot that one should get the following jobs when creating a page with a category: refreshLinksPrioritized, recentChangesUpdate and htmlCacheUpdate. So I tested: when I do this I get 2 x htmlCacheUpdate and 1 x recentChangesUpdate. No refreshLinksPrioritized job emerges and here we are. :(

I have no hopes that this issue will be dealt with prior to the release, however:

When I create a new page without trying to add it to a category I get these jobs:

882	recentChangesUpdate	-1	Modifications_récentes	20170627124811	a:2:{s:4:"type";s:5:"purge";s:9:"requestId";s:24:"1c52d0df34e980a625c7d564";}	1060882060	0		NULL	imxdcfwg5lq3l5r8tp28m3jh3puiu8l
883	recentChangesUpdate	-1	Modifications_récentes	20170627124811	a:2:{s:4:"type";s:11:"cacheUpdate";s:9:"requestId";s:24:"1c52d0df34e980a625c7d564";}	1170635974	0		NULL	edjbtcpc5b4ojx1fpc5asy2751erab6
884	htmlCacheUpdate	0	Tmlh1cna07n5uffj_/_take_three	20170627124811	a:6:{s:5:"table";s:9:"pagelinks";s:9:"recursive";b:1;s:13:"rootJobIsSelf";b:1;s:16:"rootJobSignature";s:40:"775111d78ae1d358f98d9c23ad0ea50c1486c789";s:16:"rootJobTimestamp";s:14:"20170627124811";s:9:"requestId";s:24:"1c52d0df34e980a625c7d564";}	627939601	0		NULL	fbklrte5e5lcmmbitejykxkgf7gfhnw
885	htmlCacheUpdate	0	Tmlh1cna07n5uffj_/_take_three	20170627124811	a:6:{s:5:"table";s:13:"templatelinks";s:9:"recursive";b:1;s:13:"rootJobIsSelf";b:1;s:16:"rootJobSignature";s:40:"5f2c20070f7c69722e98247943dcc6acca7bb923";s:16:"rootJobTimestamp";s:14:"20170627124811";s:9:"requestId";s:24:"1c52d0df34e980a625c7d564";}	2076655181	0		NULL	28p7k535hxl9d884mbedznm9c3eir82

When I now add a category to this page I get these jobs:

886	recentChangesUpdate	-1	Modifications_récentes	20170627125222	a:2:{s:4:"type";s:11:"cacheUpdate";s:9:"requestId";s:24:"b54e46325d479c7d6d5250a8";}	1464002474	0		NULL	edjbtcpc5b4ojx1fpc5asy2751erab6
887	htmlCacheUpdate	0	Tmlh1cna07n5uffj_/_take_three	20170627125222	a:6:{s:5:"table";s:13:"templatelinks";s:9:"recursive";b:1;s:13:"rootJobIsSelf";b:1;s:16:"rootJobSignature";s:40:"5f2c20070f7c69722e98247943dcc6acca7bb923";s:16:"rootJobTimestamp";s:14:"20170627125222";s:9:"requestId";s:24:"b54e46325d479c7d6d5250a8";}	492389822	0		NULL	28p7k535hxl9d884mbedznm9c3eir82
888	htmlCacheUpdate	0	Tmlh1cna07n5uffj_/_take_three	20170627125222	a:6:{s:5:"table";s:8:"redirect";s:9:"recursive";b:1;s:13:"rootJobIsSelf";b:1;s:16:"rootJobSignature";s:40:"a7e3e8aeee3f711b2f90cd9e6e642a4fedf3e456";s:16:"rootJobTimestamp";s:14:"20170627125222";s:9:"requestId";s:24:"b54e46325d479c7d6d5250a8";}	1461360479	0		NULL	rl8ckxba7r9d9q2hrgombo29b3b4kc2

When I create a new page including a category I get these jobs:

890	recentChangesUpdate	-1	Modifications_récentes	20170627125806	a:2:{s:4:"type";s:11:"cacheUpdate";s:9:"requestId";s:24:"a334fef2bdcbaff72760c2b9";}	2041893187	0		NULL	edjbtcpc5b4ojx1fpc5asy2751erab6
891	htmlCacheUpdate	0	Tmlh1cna07n5uffj_/_take_four	20170627125806	a:6:{s:5:"table";s:9:"pagelinks";s:9:"recursive";b:1;s:13:"rootJobIsSelf";b:1;s:16:"rootJobSignature";s:40:"663c424034daae07daaf7d98ad4c84e2b05a0b1f";s:16:"rootJobTimestamp";s:14:"20170627125806";s:9:"requestId";s:24:"a334fef2bdcbaff72760c2b9";}	417779943	0		NULL	slgqkngsqdhe45qtpjadsk9dy48ou0m
892	htmlCacheUpdate	0	Tmlh1cna07n5uffj_/_take_four	20170627125806	a:6:{s:5:"table";s:13:"templatelinks";s:9:"recursive";b:1;s:13:"rootJobIsSelf";b:1;s:16:"rootJobSignature";s:40:"08b39ccf08014b9d646d50d1b4dfb6e0d92854cd";s:16:"rootJobTimestamp";s:14:"20170627125806";s:9:"requestId";s:24:"a334fef2bdcbaff72760c2b9";}	1157962403	0		NULL	5rpg0kb75hiurh3vshbosbtg1zrg60j

In both cases the pages are a no-show in the category after the job queue has been dealt with. To have them show up I need to run "refreshLinks.php". The error logs remain empty, probably because nothing is happening in the first place.
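The check described above (which job types actually landed in the queue) can be sketched in a few lines of Python. This is only an illustration: it assumes the dump is tab-separated with the job id in the first column and the job type in the second, as the rows pasted above appear to be, and the names `EXPECTED`, `job_types` and `missing_job_types` are made up for this example.

```python
from collections import Counter

# Job types that, per the advice quoted in this thread, a page edit with a
# category should enqueue.
EXPECTED = {"refreshLinksPrioritized", "recentChangesUpdate", "htmlCacheUpdate"}


def job_types(dump):
    """Count job types in a tab-separated dump of job rows.

    Assumption: column 1 is the numeric job id, column 2 is the job type.
    """
    counts = Counter()
    for line in dump.splitlines():
        fields = line.split("\t")
        if len(fields) >= 2 and fields[0].strip().isdigit():
            counts[fields[1]] += 1
    return counts


def missing_job_types(dump):
    """Return the expected job types that do not occur in the dump."""
    return EXPECTED - set(job_types(dump))


# Abbreviated stand-ins for rows 890 to 892 above (id, type, namespace only).
sample = "890\trecentChangesUpdate\t-1\n891\thtmlCacheUpdate\t0\n892\thtmlCacheUpdate\t0\n"

if __name__ == "__main__":
    print(dict(job_types(sample)))
    print(missing_job_types(sample))
```

Run against the abbreviated rows, the only expected type that does not show up is refreshLinksPrioritized, which matches what is reported here.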

Something about refreshLinksPrioritized is fishy to me. I added debug logging to see which jobs are getting inserted, and ended up with:

[test] pushed 1 jobs of refreshLinksPrioritized
Title::getRestrictionTypes: applicable restrictions to [[Main Page]] are {edit,move}
[test] pushed 2 jobs of htmlCacheUpdate
Job with hash 'p38x7xp14q86ij8u7kkvq57uypvco5m' is a duplicate.
Job with hash 'rd535w09gxzvveamxif7agp4ruf15lr' is a duplicate.

But when I looked in my job queue right afterwards (I have $wgJobRunRate = 0; set), it wasn't there.

@Legoktm Thanks for having a peep at this. Forgot to mention that I set $wgJobRunRate = 0; too.

I studied this: the situation can be improved, but the bug is not solved.

With $wgJobRunRate = 0, I added extensive logging around the DeferredUpdates stages (PRESEND and POSTSEND) and around EnqueueableDataUpdate (LinksUpdate is an EnqueueableDataUpdate). I found that the EnqueueableDataUpdates are converted to jobs and the jobs are pushed; then (with recent changes, and in the JobQueueDB case) an AutoCommitUpdate (a type of DeferrableUpdate) is added to the PRESEND subqueue. Since we are already running the POSTSEND queue at that point and the update is added to the PRESEND queue, it is never executed, so the job is never inserted.
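To make the ordering problem concrete, here is a toy Python model of a two-stage deferred-update queue. It is not MediaWiki's code: the class `ToyDeferredUpdates` and the function `simulate` are hypothetical, and the only things taken from the analysis above are the stage names and the fact that the job insert is itself deferred while the POSTSEND stage is already running.

```python
PRESEND, POSTSEND = "presend", "postsend"


class ToyDeferredUpdates:
    """Toy two-stage deferred-update queue (illustrative, not MediaWiki's)."""

    def __init__(self):
        self.queues = {PRESEND: [], POSTSEND: []}

    def add(self, callback, stage):
        self.queues[stage].append(callback)

    def run(self, stage):
        queue = self.queues[stage]
        # Updates added to *this* stage while it runs are still picked up.
        while queue:
            queue.pop(0)(self)

    def handle_request(self):
        self.run(PRESEND)   # before the response is sent
        self.run(POSTSEND)  # after the response; PRESEND is not revisited


def simulate(insert_stage):
    """Return the toy job table after one request.

    insert_stage is the stage in which the deferred job insert is queued.
    """
    updates = ToyDeferredUpdates()
    job_table = []

    def insert_job(_updates):
        # Stands in for the AutoCommitUpdate that writes the job row.
        job_table.append("refreshLinksPrioritized")

    def links_update(u):
        # Stands in for LinksUpdate being turned into a job and pushed:
        # the actual insert is deferred once more.
        u.add(insert_job, insert_stage)

    updates.add(links_update, POSTSEND)
    updates.handle_request()
    return job_table


if __name__ == "__main__":
    print(simulate(PRESEND))   # []
    print(simulate(POSTSEND))  # ['refreshLinksPrioritized']
```

In this model, queueing the insert in PRESEND leaves it stranded because that stage has already finished, whereas queueing it in POSTSEND lets it run within the same request.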

If I replace PRESEND with POSTSEND in JobQueueDB::doBatchPush, the good news is that the refreshLinksPrioritized job appears in the job queue, but the bad news is that it does not add the page to the category when executed (and similarly with $wgJobRunRate = 0). I still don’t know why. I tried adding the patch from T168347, but nothing (no exception) is reported.

tstarling renamed this task from Category propagation is not working to LinksUpdate totally broken when JobQueueDB is in use. Jul 3 2017, 6:13 AM
tstarling added a subscriber: aaron.

I ran into this on Sunday night (my hobbies and work seem to be colliding at the moment), with the current git master, and came up with a similar analysis to Seb35's comment above. I confirmed that no link updates of any kind are performed. Updated the task description accordingly.

If @aaron could have a look, that would be very helpful.

Change 363109 had a related patch set uploaded (by Aaron Schulz; owner: Aaron Schulz):
[mediawiki/core@master] Push all DeferredUpdates to POSTSEND queue when running that queue

https://gerrit.wikimedia.org/r/363109
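The title of this change suggests a rule along the lines of "while the POSTSEND queue is running, anything newly deferred goes to POSTSEND". The Python sketch below illustrates that rule in a toy queue. It is inferred from the change title only and is not the actual patch; `RedirectingUpdates` and `run_request` are hypothetical names.

```python
PRESEND, POSTSEND = "presend", "postsend"


class RedirectingUpdates:
    """Toy queue that redirects late additions (illustrative only)."""

    def __init__(self):
        self.queues = {PRESEND: [], POSTSEND: []}
        self.running_stage = None

    def add(self, callback, stage):
        # Assumed rule, inferred from the change title: once POSTSEND is
        # running, every new update is pushed to POSTSEND regardless of
        # the stage the caller asked for.
        if self.running_stage == POSTSEND:
            stage = POSTSEND
        self.queues[stage].append(callback)

    def run(self, stage):
        self.running_stage = stage
        try:
            queue = self.queues[stage]
            while queue:
                queue.pop(0)(self)
        finally:
            self.running_stage = None


def run_request():
    """Run one toy request; return (execution order, leftover PRESEND queue)."""
    updates = RedirectingUpdates()
    executed = []

    def nested(_updates):
        executed.append("nested")

    def outer(u):
        executed.append("outer")
        u.add(nested, PRESEND)  # asks for PRESEND, lands in POSTSEND

    updates.add(outer, POSTSEND)
    updates.run(PRESEND)
    updates.run(POSTSEND)
    return executed, updates.queues[PRESEND]


if __name__ == "__main__":
    print(run_request())  # (['outer', 'nested'], [])
```

With this rule the nested update runs in the same request and nothing is left behind in the PRESEND queue.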

@Envlh and I tried various configurations. With the current master (884a54e) and REL1_29 (eb815ad), a fresh installation without any extensions (only skins) works: categories are correctly added, sometimes with some delay. However, I experience the bug (pages are not added to a category and the job queue is empty) in some cases with only the Translate extension activated. "Some cases" means with specific configuration parameters; we have not yet been able to determine exactly which ones. The bug possibly also occurs with other extensions, other sets of extensions, or other specific configurations.

Given the A/B test “Translate activated or not”, we noticed that there are *more* warnings as described in T154424 when Translate is *not* activated. One hypothesis is that some hook in the DeferrableUpdate LinksUpdate (there are 5 hooks there) causes a bad interaction (e.g. a call to onTransactionIdle) and makes the whole LinksUpdate fail without warning (hypothesis to be checked). Another thing to check would be to remove the EnqueueableDataUpdate implementation from LinksUpdate and see whether it works better in various configurations.

Given these elements, imho, this bug should no longer be a 1.29 blocker, or at least an RC1 should be issued to get more feedback (is it a vanilla MW bug or related to interactions with specific extensions?) so that the release itself is not blocked too much.

Change 363109 merged by jenkins-bot:
[mediawiki/core@master] Push all DeferredUpdates to POSTSEND queue when running that queue

https://gerrit.wikimedia.org/r/363109

This should be cherry-picked to REL1_29. (I'd do it myself but I'm having issues with logging into Gerrit.)

Change 363883 had a related patch set uploaded (by MacFan4000; owner: Aaron Schulz):
[mediawiki/core@REL1_29] Push all DeferredUpdates to POSTSEND queue when running that queue

https://gerrit.wikimedia.org/r/363883

Change 363883 merged by jenkins-bot:
[mediawiki/core@REL1_29] Push all DeferredUpdates to POSTSEND queue when running that queue

https://gerrit.wikimedia.org/r/363883