I previously filed this as T422842 (Kafka-topics broken in beta: "zookeeper is not a recognized option"), but that turned out not to be the cause, so I'm filing a separate task.
It looks like beta is once again not processing jobs. I specifically observed this with the CampaignEventsComputeEventContribution job (adding an edit to this event), but I imagine it affects more jobs.
Some things I looked into:
- Checked recent jobs:
daimona@deployment-mwlog02:~$ egrep -o '"job":"[^ ]+' /srv/mw-log/JobExecutor.log | sort | uniq -c | sort -r
     82 "job":"newcomerTasksCacheRefreshJob
     30 "job":"notificationReEngageJob
     30 "job":"notificationKeepGoingJob
     30 "job":"notificationGetStartedJob
Pretty suspicious: only a handful of job types show up, same as in the task description of T401002.
- Enqueued a null job as in T387631#10647693; that works.
- Enqueueing also works for the one job I'm interested in, as the event shows up in its Kafka topic:
daimona@deployment-kafka-main-5:~$ kafkacat -b localhost:9092 -C -t eqiad.mediawiki.job.CampaignEventsComputeEventContribution
{"$schema":"/mediawiki/job/1.0.0","meta":{"uri":"https://placeholder.invalid/wiki/Special:Badtitle","request_id":"99cbfd86-4fb1-4cae-a0a3-351df8bc7c8d","id":"72a4c3c7-789a-4066-be13-3c6e1ac7b4c9","domain":"meta.wikimedia.beta.wmcloud.org","stream":"mediawiki.job.CampaignEventsComputeEventContribution","dt":"2026-04-16T15:10:51.609Z"},"database":"metawiki","type":"CampaignEventsComputeEventContribution","params":{"revisionId":60146,"wiki":"metawiki","eventId":1578,"userId":27265,"requestId":"99cbfd86-4fb1-4cae-a0a3-351df8bc7c8d"},"mediawiki_signature":"0e9f2f8d6b7055a2de1a2b728879c5eb8c1b2c30"}
% Reached end of topic eqiad.mediawiki.job.CampaignEventsComputeEventContribution [0] at offset 107
- Then checked changeprop as in T387631#10647782, and this time we seem to have something more interesting:
daimona@deployment-changeprop-1:~$ sudo systemctl status changeprop --no-pager -l
● changeprop.service - Systemd runner for changeprop
     Loaded: loaded (/lib/systemd/system/changeprop.service; enabled; preset: enabled)
     Active: active (running) since Thu 2025-06-19 09:18:36 UTC; 9 months 27 days ago
   Main PID: 1120544 (docker)
      Tasks: 9 (limit: 4685)
     Memory: 21.8M
        CPU: 49min 34.832s
     CGroup: /system.slice/changeprop.service
             └─1120544 /usr/bin/docker run --rm=true --env-file /etc/changeprop/env -p 7272:7272 -v changeprop:/etc/changeprop --name changeprop.service docker-registry.wikimedia.org/wikimedia/mediawiki-services-change-propagation:2025-03-06-075118-production /srv/service/server.js -c /etc/changeprop/config.yaml

Apr 16 12:56:42 deployment-changeprop-1 docker-changeprop[1120544]: {"name":"changeprop","hostname":"de0ccb1e28cb","pid":1,"level":"ERROR","message":"Exec error in changeprop","status":504,"event_str":"{\"$schema\":\"/resource_change/1.0.0\",\"meta\":{\"uri\":\"https://en.wikipedia.beta.wmcloud.org/wiki/Template:Existing_Template\",\"request_id\":\"e2132317-7a24-47e7-a62d-aa797af942b1\",\"id\":\"b5aa60cd-a609-48b8-975b-a3d6bbc82d43\",\"domain\":\"en.wikipedia.beta.wmcloud.org\",\"stream\":\"resource_change\",\"dt\":\"2026-04-16T12:56:42.641Z\"},\"tags\":[\"null_edit\"]}","stream":"resource_change","error_body_str":"{\"type\":\"https://mediawiki.org/wiki/HyperSwitch/errors/internal_http_error\",\"method\":\"get\",\"detail\":\"Hostname/IP does not match certificate's altnames: Host: en.wikipedia.beta.wmcloud.org. is not in the cert's altnames: DNS:*.m.mediawiki.org, DNS:*.m.wikibooks.org, DNS:*.m.wikidata.org, DNS:*.m.wikimedia.org, DNS:*.m.wikinews.org, DNS:*.m.wikipedia.org, DNS:*.m.wikiquote.org, DNS:*.m.wikisource.org, DNS:*.m.wikiversity.org, DNS:*.m.wikivoyage.org, DNS:*.m.wiktionary.org, DNS:*.mediawiki.org, DNS:*.planet.wikimedia.org, DNS:*.wikibooks.org, DNS:*.wikidata.org, DNS:*.wikifunctions.org, DNS:*.wikimedia.org, DNS:*.wikimediafoundation.org, DNS:*.wikinews.org, DNS:*.wikipedia.org, DNS:*.wikiquote.org, DNS:*.wikisource.org, DNS:*.wikiversity.org, DNS:*.wikivoyage.org, DNS:*.wiktionary.org, DNS:*.wmfusercontent.org, DNS:mediawiki.org, DNS:w.wiki, DNS:wikibooks.org, DNS:wikidata.org, DNS:wikifunctions.org, DNS:wikimedia.org, DNS:wikimediafoundation.org, DNS:wikinews.org, DNS:wikipedia.org, DNS:wikiquote.org, DNS:wikisource.org, DNS:wikiversity.org, DNS:wikivoyage.org, DNS:wiktionary.org, DNS:wmfusercontent.org\",\"uri\":\"/en.wikipedia.beta.wmcloud.org/v1/page/html/Template%3AExisting_Template\",\"internalURI\":\"http://deployment-restbase05.deployment-prep.eqiad1.wikimedia.cloud:7231/en.wikipedia.beta.wmcloud.org/v1/page/html/Template%3AExisting_Template\",\"internalMethod\":\"get\"}","rule_name":"null_edit","executor":"RuleExecutor","levelPath":"error/exec_error","msg":"Exec error in changeprop","time":"2026-04-16T12:56:42.954Z","v":0}
So the cause might be this certificate mismatch?
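To illustrate why the changeprop error above looks like a plain hostname-verification failure, here is a minimal sketch of wildcard altname matching. It is a simplified version of the usual rule (a `*` is only accepted as the whole leftmost label and covers exactly one label), not the code Node.js or changeprop actually runs. The function names are made up for this example, and `ALTNAMES` is only a subset of the altnames from the log line.

```python
def san_matches(hostname: str, san: str) -> bool:
    """Simplified altname check: '*' must be the entire leftmost label
    and stands for exactly one label; everything else must be equal."""
    host_labels = hostname.lower().rstrip(".").split(".")
    san_labels = san.lower().rstrip(".").split(".")
    if len(host_labels) != len(san_labels):
        return False
    if san_labels[0] == "*":
        return host_labels[1:] == san_labels[1:]
    return host_labels == san_labels


def matching_sans(hostname, sans):
    """Return the altnames (hypothetical helper) that cover the hostname."""
    return [san for san in sans if san_matches(hostname, san)]


# A few of the altnames copied from the error message (not the full list).
ALTNAMES = [
    "*.m.wikipedia.org",
    "*.wikipedia.org",
    "*.wikimedia.org",
    "wikipedia.org",
    "w.wiki",
]

print(matching_sans("en.wikipedia.beta.wmcloud.org", ALTNAMES))  # []
print(matching_sans("en.wikipedia.org", ALTNAMES))  # ['*.wikipedia.org']
```

Under these assumptions the beta hostname matches none of the listed altnames, which all belong to production domains, so the request appears to be reaching something that presents the production certificate.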
(Besides, I'll copy from T422842: it would be nice if these things were caught earlier and by some tool that isn't me, as this is at least the third time I'm left puzzled by jobs not working in beta, after T387631 and T401002).
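As a possible starting point for such a tool, here is a rough sketch that does the same tally as the `egrep | sort | uniq -c` pipeline above and reports expected job types that never appear. It assumes log lines contain `"job":"<type> ..."` as the egrep output suggests. The function names, the sample lines and the list of expected job types are invented for illustration and are not taken from any existing monitoring setup.

```python
import re
from collections import Counter

# Same idea as the egrep pattern: capture the job type after "job":"
JOB_RE = re.compile(r'"job":"([^ "]+)')


def count_job_types(lines):
    """Count how often each job type appears in JobExecutor-style log lines."""
    counts = Counter()
    for line in lines:
        counts.update(JOB_RE.findall(line))
    return counts


def missing_job_types(counts, expected):
    """Return the expected job types that were never seen, sorted by name."""
    return sorted(t for t in expected if counts.get(t, 0) == 0)


# Hypothetical sample lines; real input would be /srv/mw-log/JobExecutor.log.
sample = [
    '{"job":"newcomerTasksCacheRefreshJob Special:Badtitle/x","status":"ok"}',
    '{"job":"notificationKeepGoingJob Special:Badtitle/y","status":"ok"}',
    '{"job":"newcomerTasksCacheRefreshJob Special:Badtitle/z","status":"ok"}',
]

# Hypothetical list of job types one would expect to see on a healthy beta.
expected = ["newcomerTasksCacheRefreshJob", "CampaignEventsComputeEventContribution"]

counts = count_job_types(sample)
print(counts.most_common())
print(missing_job_types(counts, expected))
```

A periodic check along these lines could alert when a job type that is known to be enqueued regularly stops showing up in the log.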