Global mass message delivered on meta but not on other wikis?
Closed, Resolved · Public

Description

I sent not one, but two global mass messages out; neither has been delivered. The interface doesn't give me any indication that there was an error, and there are zero messages in the global queue.

Restricted Application added a subscriber: Aklapper. · Thu, May 24, 4:39 PM
Evad37 added a subscriber: Evad37. · Fri, May 25, 1:30 AM

Actually, it seems that the messages were delivered on Meta, but not globally (e.g. not to Commons:Signpost).

Aklapper renamed this task from "Global mass message broken?" to "Global mass message delivered on meta but not on other wikis?". · Sat, May 26, 2:23 PM
Legoktm triaged this task as Unbreak Now! priority. · Mon, May 28, 6:20 PM
Legoktm added a subscriber: Legoktm.

Not sure what's happening, but this is UBN for MassMessage...still looking through logs.

Restricted Application added subscribers: Liuxinyu970226, TerraCodes. · Mon, May 28, 6:20 PM
Johan added a subscriber: Johan. · Mon, May 28, 9:21 PM

Yeah, Tech News isn't being delivered either.

jrbs added a subscriber: jrbs. · Mon, May 28, 11:27 PM

Based on the timing of this report, I suspect the recent changes in T190327 (FY17/18 Q4 Program 8 Services Goal: Complete the JobQueue transition to EventBus), specifically https://gerrit.wikimedia.org/r/#/c/429980/ and https://gerrit.wikimedia.org/r/#/c/429983/

I guess things are no longer logged to the runJobs.log file; I'm trying to figure out where they went instead.

I'm fairly sure that's the issue. Sending a message from metawiki -> testwikidatawiki worked, and both of those are on the same job queue system. So this is T193471 probably. Unfortunately I have no idea where the jobs have gone. mwscript showJobs.php doesn't seem to work.

I sent an email to @mobrovac and @Pchelolo:

Hey,

Users started reporting problems with the cross-wiki functionality of MassMessage on Thursday morning[1]. After digging around in logs and noticing nothing, I found [2], which appears to be the issue. I can successfully send from metawiki -> testwikidatawiki[3], which are both on the new job queue AIUI. 

Can this be reverted, or alternatively can all wikis be moved over to the new system? The same probably applies to GlobalUserPage's jobs, though that one is less noticeable when it doesn't work.

And in the future, where should I be looking for logs? runJobs.log was empty for these jobs after the switchover.

-- Kunal

[1] https://phabricator.wikimedia.org/T195500
[2] https://gerrit.wikimedia.org/r/#/c/429980/
[3] https://test.wikidata.org/w/index.php?title=User_talk%3ALegoktm&type=revision&diff=387383&oldid=507

Same for the Wikidata Weekly Summary. I didn't get any error message and there was nothing in the logs, yet the messages weren't delivered.

Change 435979 had a related patch set uploaded (by Mobrovac; owner: Mobrovac):
[mediawiki/services/change-propagation/jobqueue-deploy@master] Switch MassMessageJob for all wikis

https://gerrit.wikimedia.org/r/435979

Change 435979 abandoned by Mobrovac:
Switch MassMessageJob for all wikis

Reason:
Covered by Ib46fc7678dd9f565cd508fb527b8ee72cc42b0c1

https://gerrit.wikimedia.org/r/435979

"So this is T193471 probably."

Indeed, this is the case. We switched MassMessageSubmitJob for all wikis, but not MassMessageJob :/. We will be switching it today. You will probably need to re-send the messages though. Sorry for the inconvenience.
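To illustrate the kind of mismatch described above, here is a minimal sketch of a per-job-type routing check. It is a toy model only: the `routing` table, the `"eventbus"`/`"legacy"` labels, and the helper names are hypothetical and do not reflect the actual contents of wmf-config. The wiki lists are illustrative, loosely based on what was reported in this task.

```python
# Toy model only: the routing table, backend labels, and wiki lists are
# illustrative assumptions, not the real wmf-config.
ALL_WIKIS = "*"

# Which wikis have each job type switched to the new queue (hypothetical data).
routing = {
    "MassMessageSubmitJob": ALL_WIKIS,
    "MassMessageJob": {"metawiki", "testwikidatawiki"},
}


def backend_for(job_type, wiki, table=routing):
    """Return 'eventbus' if the job type is switched for this wiki, else 'legacy'."""
    switched = table.get(job_type, set())
    if switched == ALL_WIKIS or wiki in switched:
        return "eventbus"
    return "legacy"


def mismatched_wikis(job_types, wikis, table=routing):
    """List wikis where the given job types are not all on the same backend."""
    return [
        wiki
        for wiki in wikis
        if len({backend_for(job, wiki, table) for job in job_types}) > 1
    ]


wikis = ["metawiki", "testwikidatawiki", "commonswiki"]
print(mismatched_wikis(["MassMessageSubmitJob", "MassMessageJob"], wikis))
```

With this sample data the check flags only `commonswiki`, which matches the pattern reported here: metawiki -> testwikidatawiki worked, while delivery to Commons did not. A check along these lines could, in principle, be run before switching related job types one at a time.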

Restricted Application added a project: Analytics. · Tue, May 29, 9:38 AM
Stashbot added a subscriber: Stashbot.

Mentioned in SAL (#wikimedia-operations) [2018-05-29T10:06:26Z] <mobrovac@tin> Synchronized wmf-config/jobqueue.php: Switch all jobs to EventBus file 1/2 - T190327 T195500 (duration: 01m 39s)

Mentioned in SAL (#wikimedia-operations) [2018-05-29T10:09:33Z] <mobrovac@tin> Synchronized wmf-config/InitialiseSettings.php: Switch all jobs to EventBus file 2/2 - T190327 T195500 (duration: 01m 47s)

mobrovac closed this task as Resolved.
mobrovac claimed this task.

Both jobs are now on the EventBus JobQueue, so this should be fixed now. Please reopen if that's not the case.