
cirrussearch dumps have failed - 2025-07-21
Closed, Resolved; Public

Assigned To
Authored By
BTullis
Jul 22 2025, 1:23 PM

Description

Our cirrussearch dumps have all failed with the same error at the start of the weekly dump.
https://airflow-test-k8s.wikimedia.org/dags/mediawiki_cirrussearch_dump/grid

image.png (810×754 px, 94 KB)

Note that the dumps are also still running on snapshot1016, although these are legacy dumps and are about to be switched off.

The logs of the failed jobs all say this:

[2025-07-21, 17:30:34 UTC] {pod_manager.py:520} INFO - [base] Named cluster (dnsdisc) is not configured for maintenance operations. Allowed clusters: eqiad, codfw, cloudelastic
[2025-07-21, 17:30:34 UTC] {pod_manager.py:520} INFO - [base] extensions/CirrusSearch/maintenance/DumpIndex.php failed for /mnt/dumpsdata/otherdumps/cirrussearch/20250721/enwiki-20250721-cirrussearch-content.json.gz

Here is an example.
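When many task instances fail at once, the two log lines quoted above can be picked apart mechanically to confirm that every failure names the same cluster. The following is a minimal sketch, not part of the dump tooling: the function name `summarize_failure` and both regular expressions are assumptions modelled only on the two messages shown in this task.

```python
import re

# Patterns modelled on the two log lines quoted in this task. The wording of
# any other CirrusSearch or Airflow message is not assumed.
CLUSTER_RE = re.compile(
    r"Named cluster \((?P<cluster>[^)]+)\) is not configured for "
    r"maintenance operations\. Allowed clusters: (?P<allowed>.+)$"
)
FAILED_RE = re.compile(r"(?P<script>\S+\.php) failed for (?P<path>\S+)$")


def summarize_failure(log_lines):
    """Collect the rejected cluster, the allowed clusters and failed outputs."""
    summary = {"cluster": None, "allowed": [], "failed_outputs": []}
    for raw in log_lines:
        line = raw.strip()
        match = CLUSTER_RE.search(line)
        if match:
            summary["cluster"] = match.group("cluster")
            summary["allowed"] = [
                name.strip() for name in match.group("allowed").split(",")
            ]
            continue
        match = FAILED_RE.search(line)
        if match:
            summary["failed_outputs"].append(match.group("path"))
    return summary
```

Fed the two lines above, this reports `dnsdisc` as the rejected cluster, which is not in the allowed list of `eqiad`, `codfw` and `cloudelastic`.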

Event Timeline

BTullis triaged this task as High priority. Jul 22 2025, 1:23 PM

Change #1171581 had a related patch set uploaded (by Ebernhardson; author: Ebernhardson):

[mediawiki/extensions/CirrusSearch@master] Allow index dump from non-managed cluster

https://gerrit.wikimedia.org/r/1171581

Change #1171581 merged by jenkins-bot:

[mediawiki/extensions/CirrusSearch@master] Allow index dump from non-managed cluster

https://gerrit.wikimedia.org/r/1171581

OK, so the patch is merged. Once a new version of MediaWiki has been deployed, we need to clear this task to trigger a re-run of the dumps.

image.png (696×1 px, 161 KB)
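The screenshot shows the task being cleared from the Airflow UI. For reference, the same action could in principle be scripted against the Airflow 2 stable REST API endpoint `clearTaskInstances`. The sketch below only builds the request and does not send it. It assumes that this instance exposes the `/api/v1` REST API and that the caller is authorized, neither of which is confirmed in this task; the helper name `build_clear_request` is hypothetical, and the task ID must be supplied by the caller because it is not stated here.

```python
import json


def build_clear_request(base_url, dag_id, task_ids, start_date, end_date,
                        dry_run=True):
    """Build the URL and JSON body for an Airflow 2 clearTaskInstances call.

    Nothing is sent over the network. dry_run defaults to True so that a real
    call would first list the affected task instances without clearing them.
    """
    url = f"{base_url.rstrip('/')}/api/v1/dags/{dag_id}/clearTaskInstances"
    payload = {
        "dry_run": dry_run,
        "task_ids": list(task_ids),   # task IDs are not given in this task
        "start_date": start_date,     # ISO 8601 timestamps
        "end_date": end_date,
        "only_failed": True,          # leave successful instances untouched
        "reset_dag_runs": True,       # put the DAG run back into a running state
    }
    return url, json.dumps(payload)
```

A caller would pass the DAG ID `mediawiki_cirrussearch_dump` from the URL in the description, inspect the dry-run response, and only then repeat the request with `dry_run=False`.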

This is still not working. Even though we are running on a MediaWiki image generated today, the error messages remain the same.

image.png (738×1 px, 183 KB)

I believe that @EBernhardson is out this week, so I'll reassign it to myself and try to make progress.

Change #1173363 had a related patch set uploaded (by Reedy; author: Ebernhardson):

[mediawiki/extensions/CirrusSearch@wmf/1.45.0-wmf.11] Allow index dump from non-managed cluster

https://gerrit.wikimedia.org/r/1173363

Change #1173363 merged by jenkins-bot:

[mediawiki/extensions/CirrusSearch@wmf/1.45.0-wmf.11] Allow index dump from non-managed cluster

https://gerrit.wikimedia.org/r/1173363

Mentioned in SAL (#wikimedia-operations) [2025-07-28T12:51:11Z] <reedy@deploy1003> Started scap sync-world: Backport for [[gerrit:1173363|Allow index dump from non-managed cluster (T400158)]]

Mentioned in SAL (#wikimedia-operations) [2025-07-28T12:55:15Z] <reedy@deploy1003> reedy: Backport for [[gerrit:1173363|Allow index dump from non-managed cluster (T400158)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.

Mentioned in SAL (#wikimedia-operations) [2025-07-28T13:04:50Z] <reedy@deploy1003> Finished scap sync-world: Backport for [[gerrit:1173363|Allow index dump from non-managed cluster (T400158)]] (duration: 13m 39s)

I'll tentatively resolve this ticket for now, as the dumps seem to be proceeding well.
If we see further issues, we can reopen it.

image.png (657×732 px, 73 KB)

Hello, @BTullis
The CirrusSearch dumps are still not available at https://dumps.wikimedia.org/other/cirrussearch/ and it looks like they are not being published after they are created, so I think this ticket should be reopened.
If they can now be accessed somewhere else, could you please clarify how to access them?