
Replace bast3001
Closed, Resolved, Public


bast3001 is having disk issues (T154603) that are unlikely to be fixed soon by a site visit (the hardware is super old). esams does have other unused servers though, and we could use one of those to replace the bastion.

A good start would be hooft. This was our previous bastion, but we switched to slauerhoff when we were unable to reformat hooft to jessie, if memory serves. I think what happened back then with hooft was that the PXE firmware was getting confused by all of the Ganglia UDP packets hooft was receiving, an issue that we figured out later in a different part of the infrastructure.

Event Timeline

@Dzahn, any chance you could take this?

Any news about this? I see @Dzahn you claimed that already :)

< mutante> paravoid: i'm back and will try the bast3001 reinstall today. first thing was "should it be install3002 or actually re-use the existing name". i guess keep it 3001 because the inconsistency is ugly..
< mutante> but if it wasn't that i would use new names for new servers

Change 339681 merged by Dzahn:
(re-)add hooft as bast3002

Change 339684 had a related patch set uploaded (by Dzahn):
add bast3002 to network constants

Change 339687 had a related patch set uploaded (by Dzahn):
dhcp/site: add bast3002

Change 339687 merged by Dzahn:
dhcp/site: add bast3002

Change 339698 had a related patch set uploaded (by Dzahn):
install: don't use http install method for bast3002

Change 339698 merged by Dzahn:
install: don't use http install method for bast3002

I was able to install jessie on the-server-formerly-known-as-hooft as "bast3002". It did not work over http. Over tftp it was still very slow and needed patience but did eventually finish.

Debian GNU/Linux 8 bast3002 ttyS1

bast3002 login:

Mentioned in SAL (#wikimedia-operations) [2017-02-25T01:43:26Z] <mutante> bast3002 - sign puppet cert, initial run with basic "bastion" role, to replace broken bast3001, but WIP, ganglia/prometheus roles not moved yet (T156506)

[bast3002:~] $ gen_fingerprints
| Cipher  | Algo    | Fingerprint                                     |
| RSA     | MD5     | 5e:2b:0b:da:fa:16:c2:9d:0e:f2:a0:ab:42:4d:b7:17 |
| RSA     | SHA-256 | 3IMu0Zs5cTA6V4k81wUpsEihM3ZP5WMj7gn8V7Nwy/0=    |
| DSA     | MD5     | b7:65:77:e7:a6:2e:af:e1:ed:3f:74:a8:14:57:83:9d |
| DSA     | SHA-256 | Xh9tdp6FrtaFwblK5s2fixW1a0AKqXxcbu7uksuqPhM=    |
| ECDSA   | MD5     | fc:8a:2b:af:ea:e1:27:72:1d:d5:25:0f:e3:0f:ab:d9 |
| ECDSA   | SHA-256 | 4jFetkjXoXVKbwm5mhwzdDVWTd+ejLIBdeujmz7cvLo=    |
| ED25519 | MD5     | 07:f4:7b:af:18:f9:74:8c:bc:b1:2a:94:db:6d:b2:c8 |
| ED25519 | SHA-256 | Y0mvj3+P7/yP2C9n681H5goh4wwvkkGKXyl7KHOx0AA=    |
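For anyone who wants to cross-check a table like the one above, here is a minimal sketch of how MD5 and SHA-256 fingerprints are conventionally derived from an OpenSSH public key line. This is not the `gen_fingerprints` helper itself (its implementation is not shown in this task); the function name `ssh_fingerprints` and the returned dictionary keys are assumptions made for illustration.

```python
import base64
import hashlib


def ssh_fingerprints(public_key_line: str) -> dict:
    """Return MD5 and SHA-256 fingerprints for one OpenSSH public key line.

    Hypothetical helper for illustration; not the gen_fingerprints script.
    """
    # An OpenSSH public key line looks like "<key-type> <base64 blob> [comment]".
    fields = public_key_line.split()
    if len(fields) < 2:
        raise ValueError("expected '<key-type> <base64-blob> [comment]'")

    # Both fingerprints are hashes of the decoded key blob, not of the text line.
    blob = base64.b64decode(fields[1], validate=True)

    # MD5 fingerprint: hex digest written as colon-separated byte pairs.
    md5_hex = hashlib.md5(blob).hexdigest()
    md5_fp = ":".join(md5_hex[i:i + 2] for i in range(0, len(md5_hex), 2))

    # SHA-256 fingerprint: base64 of the raw digest. The trailing "=" padding
    # is kept here to match the table; some tools print it without padding.
    sha256_fp = base64.b64encode(hashlib.sha256(blob).digest()).decode("ascii")

    return {"key_type": fields[0], "MD5": md5_fp, "SHA-256": sha256_fp}
```

Passing the contents of a host public key file (on a stock Debian OpenSSH install these usually live under `/etc/ssh/ssh_host_*_key.pub`) should give values that can be compared against the corresponding rows of the table.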

Next is moving the roles:

ganglia::monitor::aggregator from 3001 to 3002, then shutting down 3001.

Any preference if the final step should be renaming 3002 back to 3001 or just leave it as 3002 or CNAME 3001 to 3002?

Change 340163 had a related patch set uploaded (by Dzahn; owner: Dzahn):
ganglia: move esams aggregator from bast3001 to bast3002

Change 340165 had a related patch set uploaded (by Dzahn; owner: Dzahn):
install/bast: move tftp server from bast3001 to bast3002

Change 340166 had a related patch set uploaded (by Dzahn; owner: Dzahn):
install/prometheus: move prometheus::ops from bast3001 to 3002

Change 340169 had a related patch set uploaded (by Dzahn; owner: Dzahn):
install: remove bast3001 from puppet and smokeping

Change 339684 merged by Dzahn:
add bast3002 to network constants

Change 340173 had a related patch set uploaded (by Dzahn; owner: Dzahn):
prometheus: add bast3002 as second esams host

Change 340173 merged by Dzahn:
prometheus: add bast3002 as second esams host

Change 340163 merged by Dzahn:
ganglia: move esams aggregator from bast3001 to bast3002

Change 340165 merged by Dzahn:
install/bast: move tftp server from bast3001 to bast3002

Change 340166 merged by Dzahn:
install/prometheus: add prometheus::ops to bast3002

Mentioned in SAL (#wikimedia-operations) [2017-02-28T02:18:40Z] <mutante> rsyncing prometheus metrics data from bast3001 to bast3002 (T156506)

Change 340272 had a related patch set uploaded (by Dzahn; owner: Dzahn):
switch prometheus.eqiad to bast3002

Change 340272 merged by Dzahn:
[operations/dns] switch prometheus.esams to bast3002

Change 340169 merged by Dzahn:
[operations/puppet] smokeping: replace bast3001 with bast3002

Change 340811 had a related patch set uploaded (by Dzahn):
[operations/puppet] prometheus: remove bast3001 as esams server, keep bast3002

Change 340811 merged by Dzahn:
[operations/puppet] prometheus: remove bast3001 as esams node, keep bast3002

Change 340812 had a related patch set uploaded (by Dzahn):
[operations/puppet] bast3001: remove puppet roles, add role::spare for decom

Change 340813 had a related patch set uploaded (by Dzahn):
[operations/puppet] bast3001: remove from network/constants.pp

Change 340812 merged by Dzahn:
[operations/puppet] bast3001: remove puppet roles, add role::spare for decom

Change 340833 had a related patch set uploaded (by Dzahn):
[operations/puppet] bastion: rsync home dir data bast3001->bast3002

Change 340833 merged by Dzahn:
[operations/puppet] bastion: rsync home dir data bast3001->bast3002

Change 340842 had a related patch set uploaded (by Dzahn):
[operations/puppet] bast3002: remove bastionhost::migration role

bast3001 has been replaced by bast3002 for all practical purposes (prometheus and ganglia roles moved too)

copied home dir data, mailed ops list about it, edited wikitech pages, pasted new fingerprints as above

created follow-up decom ticket at T159480

last step here will be removing it from firewall rules

then i will hand over the decom steps to dc-ops

Change 340842 merged by Dzahn:
[operations/puppet] bast3002: remove bastionhost::migration role

Mentioned in SAL (#wikimedia-operations) [2017-03-02T22:09:37Z] <mutante> bast3002 - stop rsyncd, remove rsyncd config snippets (T156506)

Change 340813 merged by Dzahn:
[operations/puppet] bast3001: remove from network/constants.pp

Change 341451 had a related patch set uploaded (by dzahn):
[operations/puppet] delete unused bastionhost::migration class

Change 341451 merged by Dzahn:
[operations/puppet] delete unused bastionhost::migration class