After fixing T410400, we can now reclone db1169 and get it back to production.
Description
Details
| Subject | Repo | Branch | Lines +/- |
|---|---|---|---|
| db1169: Enable notifications | operations/puppet | production | +0 -1 |
| installserver: Add db1169 to preseed | operations/puppet | production | +5 -0 |
| Status | Assigned | Task |
|---|---|---|
| Resolved | MoritzMuehlenhoff | T410400 Support UEFI on databases instead of Legacy BIOS |
| Resolved | Marostegui | T411498 Reclone db1169 (s1) |
Event Timeline
Completed depool of db1251 - Depool db1251.eqiad.wmnet to then clone it to db1169.eqiad.wmnet - marostegui@cumin1003
Start pool of db1251 gradually with 4 steps - Pool db1251.eqiad.wmnet in after cloning - marostegui@cumin1003
Completed pool of db1251 gradually with 4 steps - Pool db1251.eqiad.wmnet in after cloning - marostegui@cumin1003
Change #1214220 had a related patch set uploaded (by Marostegui; author: Marostegui):
[operations/puppet@production] installserver: Add db1169 to preseed
Change #1214220 merged by Marostegui:
[operations/puppet@production] installserver: Add db1169 to preseed
Cookbook cookbooks.sre.hosts.reimage was started by marostegui@cumin1003 for host db1169.eqiad.wmnet with OS trixie
I am reimaging this host with Trixie. That was the original point of the task, which ended up turning into all the nokia/uefi things.
Cookbook cookbooks.sre.hosts.reimage started by marostegui@cumin1003 for host db1169.eqiad.wmnet with OS trixie completed:
- db1169 (WARN)
- Downtimed on Icinga/Alertmanager
- Disabled Puppet
- Removed from Puppet and PuppetDB if present and deleted any certificates
- Removed from Debmonitor if present
- Forced UEFI HTTP Boot for next reboot
- Host rebooted via Redfish
- Host up (Debian installer)
- Add puppet_version metadata (7) to Debian installer
- Host up (new fresh trixie OS)
- Generated Puppet certificate
- Signed new Puppet certificate
- Run Puppet in NOOP mode to populate exported resources in PuppetDB
- Found Nagios_host resource for this host in PuppetDB
- Downtimed the new host on Icinga/Alertmanager
- Removed previous downtime on Alertmanager (old OS)
- First Puppet run completed and logged in /var/log/spicerack/sre/hosts/reimage/202512030605_marostegui_501408_db1169.out
- configmaster.wikimedia.org updated with the host new SSH public key for wmf-update-known-hosts-production
- Rebooted
- Automatic Puppet run was successful
- Forced a re-check of all Icinga services for the host
- Icinga status is not optimal, downtime not removed
- Updated Netbox data from PuppetDB
Change #1214243 had a related patch set uploaded (by Marostegui; author: Marostegui):
[operations/puppet@production] db1169: Enable notifications
Change #1214243 merged by Marostegui:
[operations/puppet@production] db1169: Enable notifications
Start pool of db1169 gradually with 4 steps - Repooling db1169 - marostegui@cumin1003
Completed pool of db1169 gradually with 4 steps - Repooling db1169 - marostegui@cumin1003
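The "gradually with 4 steps" pooling above can be illustrated with a minimal sketch. The log does not say which percentages each step uses, so this assumes evenly spaced steps. The function name `gradual_pool_schedule` is hypothetical and is not taken from the cookbook or any Wikimedia tooling.

```python
def gradual_pool_schedule(steps: int) -> list[int]:
    """Return cumulative pooling percentages for evenly spaced steps.

    Assumption: steps are evenly spaced. The real tooling may use a
    different progression; the log only states the number of steps.
    """
    if steps < 1:
        raise ValueError("steps must be at least 1")
    # Integer arithmetic guarantees the final step is exactly 100.
    return [100 * i // steps for i in range(1, steps + 1)]


if __name__ == "__main__":
    schedule = gradual_pool_schedule(4)
    for step, pct in enumerate(schedule, start=1):
        print(f"step {step}/{len(schedule)}: pool at {pct}% of target weight")
```

Ramping up in stages lets a freshly cloned replica warm its caches before it takes its full share of traffic, which is presumably why both db1251 and db1169 were pooled this way.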