Paste P13333

(An Untitled Masterwork)
Active, Public

Authored by Marostegui on Thu, Nov 19, 6:23 AM.
[06:18:18] marostegui@cumin1001:~$ sudo cookbook sre.hosts.decommission es1014.eqiad.wmnet -t T268102
START - Cookbook sre.hosts.decommission
ATTENTION: destructive action for 1 hosts: es1014.eqiad.wmnet
Are you sure to proceed?
Type "done" to proceed
> done
Looking for matches in puppetmaster1001.eqiad.wmnet:/var/lib/git/operations/puppet
conftool-data/node/eqiad.yaml: kubernetes1014.eqiad.wmnet: ["recommendation-api"]
conftool-data/node/eqiad.yaml: kubernetes1014.eqiad.wmnet: [kubesvc]
hieradata/role/eqiad/kubernetes/worker.yaml:- kubernetes1014.eqiad.wmnet
modules/install_server/files/dhcpd/linux-host-entries.ttyS1-115200: fixed-address es1014.eqiad.wmnet;
modules/install_server/files/dhcpd/linux-host-entries.ttyS1-115200: fixed-address kubernetes1014.eqiad.wmnet;
Looking for matches in puppetmaster1001.eqiad.wmnet:/srv/private
Looking for matches in deploy1001.eqiad.wmnet:/srv/mediawiki-staging
Found match(es) in the Puppet or mediawiki-config repositories (see above), proceed anyway?
Type "done" to proceed
> done
Scheduling downtime on Icinga server alert1001.wikimedia.org for hosts: ['es1014.eqiad.wmnet']
Downtimed host on Icinga
Management Password:
Found physical host
Scheduling downtime on Icinga server alert1001.wikimedia.org for hosts: ['es1014.mgmt.eqiad.wmnet']
Downtimed management interface on Icinga
Wiped bootloaders
Running IPMI command: ipmitool -I lanplus -H es1014.mgmt.eqiad.wmnet -U root -E chassis power off
Powered off
Disable and reset potential vlans on asw2-b1-eqiad:ge-1/0/21 for local eno1
Delete IP 10.64.16.187/22 on eno1
Delete IP 2620:0:861:102:10:64:16:187/64 on eno1
Host steps raised exception
Traceback (most recent call last):
  File "/srv/deployment/spicerack/cookbooks/sre/hosts/decommission.py", line 312, in run
    dcs.add(_decommission_host(fqdn, spicerack, reason))
  File "/srv/deployment/spicerack/cookbooks/sre/hosts/decommission.py", line 142, in _decommission_host
    update_netbox(netbox, netbox_data, spicerack.dry_run)
  File "/srv/deployment/spicerack/cookbooks/sre/hosts/decommission.py", line 218, in update_netbox
    device = netbox.api.dcim.devices.get(id=netbox_data['id'])
  File "/usr/lib/python3/dist-packages/pynetbox/core/endpoint.py", line 138, in get
    filter_lookup = self.filter(**kwargs)
  File "/usr/lib/python3/dist-packages/pynetbox/core/endpoint.py", line 213, in filter
    "try again.".format(RESERVED_KWARGS)
ValueError: A reserved ('id', 'pk', 'limit', 'offset') kwarg was passed. Please remove it try again.
Host steps raised exception: A reserved ('id', 'pk', 'limit', 'offset') kwarg was passed. Please remove it try again.
Generating the DNS records from Netbox data. It will take a couple of minutes.
2020-11-19 06:19:26,241 [INFO] Gathering devices, interfaces, addresses and prefixes from Netbox
2020-11-19 06:21:44,239 [INFO] Gathered 2184 devices from Netbox
2020-11-19 06:21:44,239 [INFO] Generating DNS records
2020-11-19 06:21:51,441 [INFO] Generated 12048 direct and reverse records (6024 each) in 27 direct zones and 168 reverse zones
2020-11-19 06:21:51,441 [INFO] Cloning /srv/netbox-exports/dns.git/ to /tmp/dns-c25pcHBldHM-00e3a7zk ...
2020-11-19 06:21:51,607 [INFO] Generating zonefile snippets to directory /tmp/dns-c25pcHBldHM-00e3a7zk
2020-11-19 06:21:52,340 [INFO] Committed changes: 38bbce30b6c3a06117074b5a76a5dce6da447e8e
2020-11-19 06:21:52,365 [INFO] Validating generated data
2020-11-19 06:21:52,366 [INFO] Commit details: {'insertions': 4, 'deletions': 2, 'lines': 6, 'files': 4}
commit 38bbce30b6c3a06117074b5a76a5dce6da447e8e
Author: generate-dns-snippets <noc@wikimedia.org>
Date: Thu Nov 19 06:21:52 2020 +0000
marostegui@cumin1001: es1014.eqiad.wmnet decommissioned, removing all IPs except the asset tag one
diff --git a/16.64.10.in-addr.arpa b/16.64.10.in-addr.arpa
index e119d49..fd715d4 100644
--- a/16.64.10.in-addr.arpa
+++ b/16.64.10.in-addr.arpa
@@ -171,7 +171,6 @@
181 1H IN PTR restbase1029-b.eqiad.wmnet.
182 1H IN PTR restbase1029-c.eqiad.wmnet.
186 1H IN PTR es1013.eqiad.wmnet.
-187 1H IN PTR es1014.eqiad.wmnet.
188 1H IN PTR kubernetes1009.eqiad.wmnet.
189 1H IN PTR kubernetes1010.eqiad.wmnet.
190 1H IN PTR db1076.eqiad.wmnet.
diff --git a/20.192.10.in-addr.arpa b/20.192.10.in-addr.arpa
index 0390212..cd33cdd 100644
--- a/20.192.10.in-addr.arpa
+++ b/20.192.10.in-addr.arpa
@@ -2,6 +2,8 @@
2 1H IN PTR ae2-2118.cr1-codfw.wikimedia.org.
3 1H IN PTR ae2-2118.cr2-codfw.wikimedia.org.
5 1H IN PTR cloudvirt2001-dev.codfw.wmnet.
+6 1H IN PTR cloudcephmon2001-dev.codfw.wmnet.
+7 1H IN PTR cloudcephmon2002-dev.codfw.wmnet.
8 1H IN PTR labtestvirt2003.codfw.wmnet.
10 1H IN PTR cloudnet2002-dev.codfw.wmnet.
11 1H IN PTR clouddb2001-dev.codfw.wmnet.
diff --git a/codfw.wmnet b/codfw.wmnet
index eabc9fc..93c34f7 100644
--- a/codfw.wmnet
+++ b/codfw.wmnet
@@ -15,6 +15,8 @@ chartmuseum2001 1H IN A 10.192.48.159
chartmuseum2001 1H IN AAAA 2620:0:860:104:10:192:48:159
cloudbackup2001 1H IN A 10.192.0.130
cloudbackup2002 1H IN A 10.192.32.186
+cloudcephmon2001-dev 1H IN A 10.192.20.6
+cloudcephmon2002-dev 1H IN A 10.192.20.7
clouddb2001-dev 1H IN A 10.192.20.11
clouddb2001-dev 1H IN AAAA 2620:0:860:118:10:192:20:11
cloudnet2002-dev 1H IN A 10.192.20.10
diff --git a/eqiad.wmnet b/eqiad.wmnet
index 5e62dce..499b355 100644
--- a/eqiad.wmnet
+++ b/eqiad.wmnet
@@ -506,7 +506,6 @@ elastic1067 1H IN A 10.64.48.137
es1011 1H IN A 10.64.0.6
es1012 1H IN A 10.64.0.7
es1013 1H IN A 10.64.16.186
-es1014 1H IN A 10.64.16.187
es1015 1H IN A 10.64.32.184
es1016 1H IN A 10.64.32.185
es1017 1H IN A 10.64.32.65
METADATA: {"path": "/tmp/dns-c25pcHBldHM-00e3a7zk", "sha1": "38bbce30b6c3a06117074b5a76a5dce6da447e8e", "insertions": 4, "deletions": 2, "lines": 6, "files": 4}
Have you checked that the diff is OK?
Type "done" to proceed
> done
2020-11-19 06:22:17,435 [INFO] Pushed with bitflags 256: 7ae94c6..38bbce3
2020-11-19 06:22:17,480 [INFO] Temporary directory /tmp/dns-c25pcHBldHM-00e3a7zk removed.
Updating the Netbox passive copies of the repository on netbox2001.wikimedia.org
Updating the authdns copies of the repository on authdns[1001,2001].wikimedia.org,dns[1001-1002,2001-2002,3001-3002,4001-4002,5001-5002].wikimedia.org
Deploying the updated zonefiles on authdns[1001,2001].wikimedia.org,dns[1001-1002,2001-2002,3001-3002,4001-4002,5001-5002].wikimedia.org
ERROR: some step failed, check the task updates.
Updated Phabricator task T268102
END (FAIL) - Cookbook sre.hosts.decommission (exit_code=1)
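
The first traceback above comes from calling `devices.get(id=...)`: the keyword is forwarded to `filter()`, which rejects `id` as a reserved name. The sketch below is a simplified stand-in that mimics that behaviour and is not pynetbox's actual code; `FakeEndpoint` and the sample device record (id 7) are hypothetical. It assumes that a positional argument is treated as the primary key and never reaches the filter path, which is how the lookup could likely be written instead.

```python
RESERVED_KWARGS = ("id", "pk", "limit", "offset")


class FakeEndpoint:
    """Hypothetical stand-in for an API endpoint; NOT the real library class."""

    def __init__(self, records):
        # Index the sample records by primary key.
        self._records = {record["id"]: record for record in records}

    def filter(self, **kwargs):
        # Reject reserved keywords, mirroring the check seen in the traceback.
        if any(key in RESERVED_KWARGS for key in kwargs):
            raise ValueError(
                "A reserved {} kwarg was passed. Please remove it "
                "try again.".format(RESERVED_KWARGS)
            )
        return [
            record
            for record in self._records.values()
            if all(record.get(key) == value for key, value in kwargs.items())
        ]

    def get(self, *args, **kwargs):
        # A positional argument is used directly as the primary key.
        if args:
            return self._records.get(args[0])
        # Keyword arguments go through filter(), where reserved names fail.
        matches = self.filter(**kwargs)
        return matches[0] if matches else None


# Hypothetical sample data; the id value is made up for illustration.
endpoint = FakeEndpoint([{"id": 7, "name": "es1014"}])

print(endpoint.get(7))  # positional lookup by primary key succeeds

try:
    endpoint.get(id=7)  # keyword lookup trips the reserved-kwarg check
except ValueError as exc:
    print("ValueError:", exc)
```

In this sketch the only difference between the failing and the working call is whether the key is passed positionally or as `id=`.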

Event Timeline

Marostegui created this paste. Thu, Nov 19, 6:23 AM
Host steps raised exception
Traceback (most recent call last):
  File "/srv/deployment/spicerack/cookbooks/sre/hosts/decommission.py", line 312, in run
    dcs.add(_decommission_host(fqdn, spicerack, reason))
  File "/srv/deployment/spicerack/cookbooks/sre/hosts/decommission.py", line 142, in _decommission_host
    update_netbox(netbox, netbox_data, spicerack.dry_run)
  File "/srv/deployment/spicerack/cookbooks/sre/hosts/decommission.py", line 223, in update_netbox
    device.save()
  File "/usr/lib/python3/dist-packages/pynetbox/core/response.py", line 391, in save
    if req.patch({i: serialized[i] for i in diff}):
  File "/usr/lib/python3/dist-packages/pynetbox/core/query.py", line 409, in patch
    return self._make_call(verb="patch", data=data)
  File "/usr/lib/python3/dist-packages/pynetbox/core/query.py", line 274, in _make_call
    raise RequestError(req)
pynetbox.core.query.RequestError: The request failed with code 500 Internal Server Error but more specific details were not returned in json. Check the NetBox Logs or investigate this exception's error attribute.
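
The `RequestError` message above points at the exception's `error` attribute for more detail. A minimal sketch of surfacing that attribute when reporting a failed save follows. `describe_failure` and `FakeRequestError` are hypothetical names used for illustration, and the sample error body is invented so the snippet runs without a NetBox server.

```python
def describe_failure(exc):
    """Return the exception message plus its `error` attribute, if present."""
    summary = "{}: {}".format(type(exc).__name__, exc)
    # Only the attribute name `error` comes from the message in the log;
    # exceptions without it are reported with the message alone.
    detail = getattr(exc, "error", None)
    if detail:
        summary += "\nerror attribute: {}".format(detail)
    return summary


class FakeRequestError(Exception):
    """Hypothetical stand-in that carries an `error` attribute."""

    def __init__(self, message, error):
        super().__init__(message)
        self.error = error


try:
    # Invented payload standing in for whatever the server returned.
    raise FakeRequestError(
        "The request failed with code 500", "<html>Server Error</html>"
    )
except FakeRequestError as exc:
    print(describe_failure(exc))
```

Wrapping the save call in a handler like this would put the server's response body next to the traceback, rather than only the generic 500 message.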