Wikidata doesn't update recentchanges
Closed, Resolved, Public

Description

On hewiki/cawiki (and probably many others), Wikidata doesn't update recentchanges, as the DB query below shows:

use hewiki_p;
select rc_timestamp, now() from recentchanges where rc_source='wb' order by rc_timestamp desc limit 5;
+----------------+---------------------+
| rc_timestamp   | now()               |
+----------------+---------------------+
| 20180412143955 | 2018-04-14 05:01:39 |
| 20180412143842 | 2018-04-14 05:01:39 |
| 20180412143811 | 2018-04-14 05:01:39 |
| 20180412143802 | 2018-04-14 05:01:39 |
| 20180412143756 | 2018-04-14 05:01:39 |
+----------------+---------------------+
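The gap between the newest `rc_timestamp` and `now()` in the output above can be computed directly. This is a minimal sketch, assuming both values are in the same timezone; the helper name `rc_lag` is hypothetical and not part of any tool mentioned here.

```python
from datetime import datetime, timedelta


def rc_lag(rc_timestamp: str, now: str) -> timedelta:
    """Return how far the newest rc_timestamp trails the server clock."""
    # rc_timestamp is stored in the 14-digit YYYYMMDDHHMMSS form shown above;
    # now() is printed by the SQL client as 'YYYY-MM-DD HH:MM:SS'.
    newest = datetime.strptime(rc_timestamp, "%Y%m%d%H%M%S")
    current = datetime.strptime(now, "%Y-%m-%d %H:%M:%S")
    return current - newest


# Values copied from the first row of the query output above.
lag = rc_lag("20180412143955", "2018-04-14 05:01:39")
print(lag)  # 1 day, 14:21:44
```

So at query time the most recent Wikidata-sourced row on hewiki was more than 38 hours old.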

Likely related to: T192085

Event Timeline

Ladsgroup triaged this task as Unbreak Now! priority.

Per https://grafana.wikimedia.org/dashboard/db/jobqueue-eventbus?orgId=1&var-site=eqiad&var-type=wikibase-InjectRCRecords&from=now-7d&to=now, injecting RC records has been completely broken for two days now. This guarantees an unscheduled deploy, an incident report, and several actionables ASAP.

Restricted Application added subscribers: Liuxinyu970226, TerraCodes.
Ladsgroup added subscribers: Pchelolo, mobrovac.

Per Grafana, it started on 2018-04-12 around 14:30-15:00 UTC, which aligns with these SAL entries:

14:46 mobrovac@tin: Synchronized wmf-config/InitialiseSettings.php: No-op: Clean up an unused global var for the EventBus-based JobQueue (duration: 01m 17s)
14:44 mobrovac@tin: Synchronized wmf-config/jobqueue.php: Switch the second bulk of low-traffic jobs for all wikis - T190327 (duration: 01m 16s)
14:43 ppchelko@tin: Finished deploy [cpjobqueue/deploy@85fbd47]: Enable second bulk of low traffic jobs for all wikis T190327 (duration: 00m 35s)
14:42 ppchelko@tin: Started deploy [cpjobqueue/deploy@85fbd47]: Enable second bulk of low traffic jobs for all wikis T190327

At first I thought it was a bug in the code, but this part of the code hasn't been touched for over a month.
Pinging @mobrovac and @Pchelolo

It's definitely not related to the codebase; it's an issue in the job runner infra...

Change 426907 had a related patch set uploaded (by Ppchelko; owner: Ppchelko):
[mediawiki/services/change-propagation/jobqueue-deploy@master] Revert switching ChangeNotification job.

https://gerrit.wikimedia.org/r/426907

Change 426969 had a related patch set uploaded (by Ppchelko; owner: Ppchelko):
[mediawiki/extensions/EventBus@master] Use the wiki that the queue belongs to for setting database and domain.

https://gerrit.wikimedia.org/r/426969

Change 426969 merged by Mobrovac:
[mediawiki/extensions/EventBus@master] Use the wiki that the queue belongs to for setting database and domain.

https://gerrit.wikimedia.org/r/426969

Change 426975 had a related patch set uploaded (by Ppchelko; owner: Ppchelko):
[mediawiki/extensions/EventBus@wmf/1.31.0-wmf.29] Use the wiki that the queue belongs to for setting database and domain.

https://gerrit.wikimedia.org/r/426975

Change 426975 merged by Mobrovac:
[mediawiki/extensions/EventBus@wmf/1.31.0-wmf.29] Use the wiki that the queue belongs to for setting database and domain.

https://gerrit.wikimedia.org/r/426975

Mentioned in SAL (#wikimedia-operations) [2018-04-16T19:46:03Z] <mobrovac@tin> Synchronized php-1.31.0-wmf.29/extensions/EventBus/includes/EventBus.php: Use the wiki set in the JobQueue when creating the event, file 1/2 - T192198 (duration: 01m 00s)

Mentioned in SAL (#wikimedia-operations) [2018-04-16T19:47:28Z] <mobrovac@tin> Synchronized php-1.31.0-wmf.29/extensions/EventBus/includes/JobQueueEventBus.php: Use the wiki set in the JobQueue when creating the event, file 2/2 - T192198 (duration: 00m 59s)

Change 427014 had a related patch set uploaded (by Ppchelko; owner: Ppchelko):
[mediawiki/extensions/EventBus@master] Correctly calculate a domain based on the provided wiki id.

https://gerrit.wikimedia.org/r/427014

Change 427014 merged by Mobrovac:
[mediawiki/extensions/EventBus@master] Correctly calculate a domain based on the provided wiki id.

https://gerrit.wikimedia.org/r/427014

Change 427018 had a related patch set uploaded (by Ppchelko; owner: Ppchelko):
[mediawiki/extensions/EventBus@wmf/1.31.0-wmf.29] Correctly calculate a domain based on the provided wiki id.

https://gerrit.wikimedia.org/r/427018

Change 427018 merged by Mobrovac:
[mediawiki/extensions/EventBus@wmf/1.31.0-wmf.29] Correctly calculate a domain based on the provided wiki id.

https://gerrit.wikimedia.org/r/427018

Mentioned in SAL (#wikimedia-operations) [2018-04-16T21:02:29Z] <mobrovac@tin> Synchronized php-1.31.0-wmf.29/extensions/EventBus/includes/EventBus.php: Use the correct way of calculating the domain from the wiki, file 1/2 - T192198 (duration: 00m 59s)

Mentioned in SAL (#wikimedia-operations) [2018-04-16T21:03:46Z] <mobrovac@tin> Synchronized php-1.31.0-wmf.29/extensions/EventBus/includes/JobQueueEventBus.php: Use the correct way of calculating the domain from the wiki, file 2/2 - T192198 (duration: 00m 58s)

mobrovac assigned this task to Pchelolo.

It took us a while to find the root cause. Essentially, the ChangeNotification jobs were being created/dispatched via a cron script. In that case, the EventBus-based JobQueue was assigning the originating wiki (Wikidata) to the job instead of the intended recipient wiki. At the same time, whether this job injects RC records is configurable per wiki, and that setting was false for Wikidata, so no RC records were injected for the client wikis.

We have fixed the bug in the EventBus-based JobQueue push service, so the correct wiki name is now set in the event data. Consequently, InjectRCRecords jobs are being created and executed as intended.
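The failure mode described above can be illustrated with a toy model. This is only a sketch under stated assumptions, not the EventBus extension's code: `INJECT_RC_RECORDS`, `build_event`, `run_change_notification`, and the `use_queue_wiki` flag are all hypothetical names, and the per-wiki values for hewiki and cawiki are assumed for illustration.

```python
# Hypothetical per-wiki setting: does ChangeNotification inject RC records?
# The task states it was false for Wikidata; the client values are assumed.
INJECT_RC_RECORDS = {"wikidatawiki": False, "hewiki": True, "cawiki": True}


def build_event(job_type, queue_wiki, current_wiki, use_queue_wiki):
    """Build a simplified job event; use_queue_wiki=True models the fix."""
    # Buggy behaviour: stamp the event with the wiki the script runs on.
    # Fixed behaviour: stamp it with the wiki the queue belongs to.
    wiki = queue_wiki if use_queue_wiki else current_wiki
    return {"type": job_type, "database": wiki}


def run_change_notification(event):
    """Return the follow-up jobs created when the event is executed."""
    if INJECT_RC_RECORDS.get(event["database"], False):
        return [{"type": "InjectRCRecords", "database": event["database"]}]
    return []


# A cron script running on Wikidata pushes a job onto hewiki's queue.
buggy = build_event("ChangeNotification", "hewiki", "wikidatawiki",
                    use_queue_wiki=False)
fixed = build_event("ChangeNotification", "hewiki", "wikidatawiki",
                    use_queue_wiki=True)

print(run_change_notification(buggy))  # []
print(run_change_notification(fixed))
# [{'type': 'InjectRCRecords', 'database': 'hewiki'}]
```

In this model the buggy event is evaluated against Wikidata's setting and produces no follow-up job, which matches the symptom of recentchanges going stale on the client wikis.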

Change 426907 abandoned by Mobrovac:
Revert switching ChangeNotification job.

Reason:
Not needed

https://gerrit.wikimedia.org/r/426907