
Clarify 'wipe bootloader' step in sre.hosts.decommission
Open, Medium, Public

Description

Today the decommission cookbook will try to render the host being targeted unbootable by executing /sbin/wipefs --all --force against the block devices found on the host during a step called "wipe bootloaders".

This is effective in making the host unbootable, but I think we should clarify the output and expectations for the step. Wipefs is actually removing filesystem, RAID and partition-table signatures from the disks, and afaict leaves the grub bootloader intact on disk.

I think we should either a) update this to execute a command which zeroes the bootloader (and leaves partition/fs signatures in place) or b) update the output to specifically list everything that is being wiped (RAID, partition table, filesystem signatures).
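To make option a) concrete, here is a minimal sketch, assuming a classic BIOS/MBR layout where the first 512-byte sector holds 446 bytes of boot code, a 64-byte partition table and a 2-byte signature. The helper names are hypothetical and are not taken from the cookbook or from any of the related patches; GPT/UEFI hosts and hardware RAID are not covered.

```python
# Assumption: classic MBR layout (446 B boot code + 64 B partition table + 2 B signature).
MBR_SIZE = 512
BOOT_CODE_SIZE = 446


def zero_boot_code(sector: bytes) -> bytes:
    """Return a copy of an MBR sector with only the boot code area zeroed.

    The partition table and the trailing 0x55AA signature are kept as-is.
    """
    if len(sector) != MBR_SIZE:
        raise ValueError(f"expected {MBR_SIZE} bytes, got {len(sector)}")
    return bytes(BOOT_CODE_SIZE) + sector[BOOT_CODE_SIZE:]


def wipefs_command(device: str) -> list[str]:
    """Current behaviour per the task description: wipe all signatures."""
    return ["/sbin/wipefs", "--all", "--force", device]


def dd_zero_boot_code_command(device: str) -> list[str]:
    """Hypothetical option a): overwrite only the first 446 bytes with zeroes."""
    return ["dd", "if=/dev/zero", f"of={device}", f"bs={BOOT_CODE_SIZE}", "count=1"]


if __name__ == "__main__":
    # Fake sector: non-zero boot code, non-zero partition table, valid signature.
    fake = b"\xeb" * 446 + b"\x11" * 64 + b"\x55\xaa"
    cleaned = zero_boot_code(fake)
    print("boot code zeroed:", cleaned[:446] == bytes(446))
    print("partition table kept:", cleaned[446:] == fake[446:])
    print(" ".join(dd_zero_boot_code_command("/dev/sda")))
```

The in-memory `zero_boot_code` only illustrates which bytes would change; the command builders return argument lists and do not run anything against a real device.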

Event Timeline

herron triaged this task as Medium priority. May 19 2021, 9:29 PM
herron created this task.

Change 692991 had a related patch set uploaded (by Herron; author: Herron):

[operations/cookbooks@master] sre.hosts.decommssion: use dd to zero the bootloader

https://gerrit.wikimedia.org/r/692991

Change 692992 had a related patch set uploaded (by Herron; author: Herron):

[operations/cookbooks@master] sre.hosts.decommssion: use dd to zero the bootloader

https://gerrit.wikimedia.org/r/692992

Change 692991 abandoned by Herron:

[operations/cookbooks@master] sre.hosts.decommssion: use dd to zero the bootloader

Reason:

stale

https://gerrit.wikimedia.org/r/692991

Change 692993 had a related patch set uploaded (by Herron; author: Herron):

[operations/cookbooks@master] sre.hosts.decommission: clarify "wipe bootloader" step

https://gerrit.wikimedia.org/r/692993

I think we should either a) update this to execute a command which zeroes the bootloader (and leaves partition/fs signatures in place) or b) update the output to specifically list everything that is being wiped (RAID, partition table, filesystem signatures).

AFAIK the initial idea was to wipe the disks directly, but that's pending the implementation of a PXE menu that also has a wipe capability covering all use cases (including HW RAID). Adding @faidon to correct me if I'm recalling incorrectly.
So I think it would be better to just adjust the message, since it can be confusing, rather than switch to wiping only the bootloader.
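
As a rough illustration of what "adjust the message" could look like, here is a small sketch that builds a step description naming what wipefs actually removes. The function name and the wording are hypothetical and are not the text of the merged change.

```python
# Signature types that wipefs removes, per the task description.
WIPED_SIGNATURES = ("filesystem", "RAID", "partition-table")


def describe_wipe_step(devices: list[str]) -> str:
    """Build a step message that lists what is wiped and on which devices."""
    kinds = ", ".join(WIPED_SIGNATURES)
    targets = ", ".join(devices)
    return f"Wiping {kinds} signatures with wipefs on: {targets}"


if __name__ == "__main__":
    print(describe_wipe_step(["/dev/sda", "/dev/sdb"]))
```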

Change 692993 merged by jenkins-bot:

[operations/cookbooks@master] sre.hosts.decommission: clarify "wipe bootloader" step

https://gerrit.wikimedia.org/r/692993

Change 692992 abandoned by Herron:

[operations/cookbooks@master] sre.hosts.decommssion: use dd to zero the bootloader

Reason:

in favor of I9a79970bd8346d8ea2c16afc6202c5c5a08374d6

https://gerrit.wikimedia.org/r/692992