We would like to update the SSD firmware on the following hosts, please:
- an-coord1003.eqiad.wmnet
- an-coord1004.eqiad.wmnet
We can start with an-coord1004, as this is currently operating only in a standby capacity.
Once an-coord1004 has been upgraded, we can migrate the Hive metastore and Presto coordinator services to it, which reduces the impact of the downtime on an-coord1003.
an-coord1003:
- - schedule downtime for host with service owners and icinga
- - note old firmware version
    Disk 0 on Embedded AHCI Controller 1 DL70
    Disk 1 on Embedded AHCI Controller 1 DL70
- - run the firmware update cookbook from cumin: cookbook sre.hardware.upgrade-firmware -c ssd "hostname*" and select option 0 for DL7C. THIS WILL REQUIRE THE HOST TO REBOOT.
- - confirm firmware updated to correct version on all affected SSDs
- - pass system back to service owners/service use
an-coord1004:
- - schedule downtime for host with service owners and icinga
- - note old firmware version
- - run the firmware update cookbook from cumin: cookbook sre.hardware.upgrade-firmware -c ssd "hostname*" and select option 0 for DL7C.
- - confirm firmware updated to correct version on all affected SSDs
- - pass system back to service owners/service use
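
For the "confirm firmware updated" step, the comparison could be scripted rather than checked by eye. The following is only a minimal sketch: it assumes the disk inventory is available as text in the "Disk N on <controller> <version>" form recorded in this task, and the names parse_disks, outdated_disks and TARGET_VERSION are hypothetical helpers, not part of the cookbook. How the inventory text is collected from the host is left open.

```python
import re

# Assumed line format, copied from the inventory lines noted in this task:
#   "Disk <n> on <controller name> <firmware version>"
DISK_LINE = re.compile(r"Disk (\d+) on (.+?) (\S+)$")

# Target firmware version (the one offered as option 0 by the cookbook prompt).
TARGET_VERSION = "DL7C"


def parse_disks(report: str) -> dict[int, str]:
    """Map each disk number to its firmware version; unrecognised lines are skipped."""
    disks = {}
    for line in report.splitlines():
        match = DISK_LINE.match(line.strip())
        if match:
            disks[int(match.group(1))] = match.group(3)
    return disks


def outdated_disks(report: str, target: str = TARGET_VERSION) -> list[int]:
    """Return the disk numbers whose firmware version differs from the target."""
    return sorted(n for n, version in parse_disks(report).items() if version != target)


# Inventory as recorded before the upgrade on an-coord1003.
before = """Disk 0 on Embedded AHCI Controller 1 DL70
Disk 1 on Embedded AHCI Controller 1 DL70"""

print(outdated_disks(before))  # disks still needing the update
```

An empty list after the reboot would indicate that every listed SSD reports the target version. A non-empty list names the disks to look at before handing the host back.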