
Hundreds of dump failures due to etcd issue
Open, High, Public

Description

Original alert to ops-dumps: https://groups.google.com/a/wikimedia.org/g/ops-dumps/c/efjbIJHS--Q

Example log output:

*** Wiki: aawikibooks
=====================
[20240501085807]: Notice: Undefined index: es7 in /srv/mediawiki/wmf-config/etcd.php on line 116
Warning: Invalid argument supplied for foreach() in /srv/mediawiki/wmf-config/etcd.php on line 116
Notice: Undefined index: es7 in /srv/mediawiki/wmf-config/etcd.php on line 116
Warning: Invalid argument supplied for foreach() in /srv/mediawiki/wmf-config/etcd.php on line 116
Notice: Undefined index: es7 in /srv/mediawiki/wmf-config/etcd.php on line 116
Warning: Invalid argument supplied for foreach() in /srv/mediawiki/wmf-config/etcd.php on line 116
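To gauge how many wikis were hit, output like the excerpt above can be tallied per wiki. The following is a minimal sketch, assuming the alert log always uses a `*** Wiki: <name>` header followed by PHP notices in the format shown; the `summarize` helper and both regexes are hypothetical and not part of the dumps tooling.

```python
import re
from collections import Counter

# Assumed log layout, taken from the excerpt above: a "*** Wiki: <name>"
# header line, followed by PHP notices and warnings for that wiki.
WIKI_HEADER = re.compile(r"^\*\*\* Wiki: (\S+)")
UNDEFINED_INDEX = re.compile(r"Notice: Undefined index: (\S+) in \S+ on line \d+")


def summarize(log_text):
    """Count 'Undefined index' notices per (wiki, missing key) pair."""
    counts = Counter()
    wiki = None
    for line in log_text.splitlines():
        header = WIKI_HEADER.match(line)
        if header:
            # Every notice that follows belongs to this wiki until the next header.
            wiki = header.group(1)
            continue
        notice = UNDEFINED_INDEX.search(line)
        if notice and wiki is not None:
            counts[(wiki, notice.group(1))] += 1
    return counts
```

Calling `summarize` on the saved alert text returns a `Counter` keyed by `(wiki, key)`. The number of distinct wikis among its keys gives the count of affected dump runs, and the key names show whether only `es7` was missing.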

Event Timeline

This may just be a cascading effect of the etcd work done in T358636.

Will investigate if the jobs are recovering.

A cursory look at snapshot1012, which runs enwiki, and at snapshot1008, which runs all the small wikis, shows no issues, so it looks like the dumps have recovered.

The status page at https://dumps.wikimedia.org/backup-index.html shows progress for all dumps.

Will continue monitoring.

Dumps continue to run and make progress, so this was a transient issue. Closing.