Paste P30138

decom cookbook failure after netbox/ganeti cluster change
Authored by MoritzMuehlenhoff on Jun 24 2022, 8:58 AM.
No Kerberos credentials found.
Scheduling downtime on Icinga server alert1001.wikimedia.org for hosts: webperf2002
Created silence ID 2d00ac0a-e31c-439b-8b7a-9618a8f00e62
Downtimed host on Icinga/Alertmanager
Host steps raised exception
Traceback (most recent call last):
  File "/srv/deployment/spicerack/cookbooks/sre/hosts/decommission.py", line 405, in run
    switches.update(self._decommission_host(fqdn))
  File "/srv/deployment/spicerack/cookbooks/sre/hosts/decommission.py", line 276, in _decommission_host
    virtual_machine = ganeti.instance(fqdn, cluster=netbox_data['cluster']['name'])
  File "/usr/lib/python3/dist-packages/spicerack/ganeti.py", line 354, in instance
    master = self.rapi(cluster).master
  File "/usr/lib/python3/dist-packages/spicerack/ganeti.py", line 312, in rapi
    raise GanetiError(f"Cannot find cluster {cluster} (expected {keys}).")
spicerack.ganeti.GanetiError: Cannot find cluster row_A (expected ('ganeti01.svc.eqiad.wmnet', 'ganeti01.svc.codfw.wmnet', 'ganeti01.svc.esams.wmnet', 'ganeti01.svc.ulsfo.wmnet', 'ganeti01.svc.eqsin.wmnet', 'ganeti-test01.svc.codfw.wmnet', 'ganeti01.svc.drmrs.wmnet', 'ganeti02.svc.drmrs.wmnet')).
**Host steps raised exception**: Cannot find cluster row_A (expected ('ganeti01.svc.eqiad.wmnet', 'ganeti01.svc.codfw.wmnet', 'ganeti01.svc.esams.wmnet', 'ganeti01.svc.ulsfo.wmnet', 'ganeti01.svc.eqsin.wmnet', 'ganeti-test01.svc.codfw.wmnet', 'ganeti01.svc.drmrs.wmnet', 'ganeti02.svc.drmrs.wmnet')).
Sleeping for 3 minutes to get netbox caches in sync
START - Cookbook sre.dns.netbox
Generating the DNS records from Netbox data. It will take a couple of minutes.
----- OUTPUT of 'cd /tmp && runus...e asset tag one"' -----