This is only a proposal at the moment, pending completion and review of the DPE compute and storage strategy.
Description
Details
| Subject | Repo | Branch | Lines +/- |
|---|---|---|---|
| Prepare for renaming kafka-stretch200[1-2] to dse-k8s-worker200[1-2] | operations/puppet | production | +8 -14 |
| Status | Assigned | Task |
|---|---|---|
| Open | None | T362105 EPIC: OpenSearch on K8s (formerly Mutualized opensearch cluster) - FY25/26 WE4.2.6 |
| Resolved | Stevemunene | T396478 EPIC: Build dse-k8s-codfw Kubernetes cluster |
| Resolved | BTullis | T396479 Decide initial hardware footprint of dse-k8s-codfw cluster |
| Resolved | BTullis | T353789 Re-purpose kafka-stretch200[1-2] as DSE workers in codfw |
Event Timeline
Change #1160888 had a related patch set uploaded (by Btullis; author: Btullis):
[operations/puppet@production] Prepare for renaming kafka-stretch200[1-2] to dse-k8s-worker200[1-2]
Change #1160888 merged by Btullis:
[operations/puppet@production] Prepare for renaming kafka-stretch200[1-2] to dse-k8s-worker200[1-2]
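The bracket notation in the patch subject is shorthand for two hosts on each side of the rename. As a minimal sketch of how that shorthand maps old names to new ones, the snippet below expands a single numeric range and pairs the results. The helper `expand_range` is hypothetical and written only for illustration; it is not part of the cookbooks or of any tool used in this task, and it handles just one `[a-b]` range per name.

```python
import re


def expand_range(pattern: str) -> list[str]:
    """Expand one 'prefix[a-b]suffix' numeric range into individual names.

    Hypothetical helper for illustration only. A name without a
    bracketed range is returned unchanged as a single-item list.
    """
    match = re.fullmatch(r"(.*)\[(\d+)-(\d+)\](.*)", pattern)
    if match is None:
        return [pattern]
    prefix, start, end, suffix = match.groups()
    width = len(start)  # keep any zero padding used in the range bounds
    return [
        f"{prefix}{number:0{width}d}{suffix}"
        for number in range(int(start), int(end) + 1)
    ]


# Pair each old hostname with its replacement, in order.
old_hosts = expand_range("kafka-stretch200[1-2]")
new_hosts = expand_range("dse-k8s-worker200[1-2]")
rename_pairs = list(zip(old_hosts, new_hosts))

for old_name, new_name in rename_pairs:
    print(f"{old_name} -> {new_name}")
```

Each resulting pair corresponds to one run of the rename cookbook, followed by a re-image of the new name.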
Cookbook cookbooks.sre.hosts.rename started by btullis@cumin1003 from kafka-stretch2001 to dse-k8s-worker2001 completed:
- kafka-stretch2001 (PASS)
- ✔️ Downtimed host on Icinga/Alertmanager
- ✔️ Disabled puppet and its timer
- ✔️ Disabled debmonitor-client timer
- ✔️ Netbox updated
- ✔️ BMC Hostname updated
- ✔️ DNS updated
- ✔️ Switch description updated
- ✔️ Removed from DebMonitor
- ✔️ Removed from Puppet master and PuppetDB
- Rename completed 👍 - now please run the re-image cookbook on the new name with --new
Cookbook cookbooks.sre.hosts.rename started by btullis@cumin1003 from kafka-stretch2002 to dse-k8s-worker2002 completed:
- kafka-stretch2002 (PASS)
- ✔️ Downtimed host on Icinga/Alertmanager
- ✔️ Disabled puppet and its timer
- ✔️ Disabled debmonitor-client timer
- ✔️ Netbox updated
- ✔️ BMC Hostname updated
- ✔️ DNS updated
- ✔️ Switch description updated
- ✔️ Removed from DebMonitor
- ✔️ Removed from Puppet master and PuppetDB
- Rename completed 👍 - now please run the re-image cookbook on the new name with --new
Cookbook cookbooks.sre.hosts.reimage was started by btullis@cumin1003 for host dse-k8s-worker2001.codfw.wmnet with OS bookworm
Cookbook cookbooks.sre.hosts.reimage started by btullis@cumin1003 for host dse-k8s-worker2001.codfw.wmnet with OS bookworm completed:
- dse-k8s-worker2001 (PASS)
- Host successfully migrated to the new VLAN
- Removed from Puppet and PuppetDB if present and deleted any certificates
- Removed from Debmonitor if present
- Forced PXE for next reboot
- Host rebooted via IPMI
- Host up (Debian installer)
- Add puppet_version metadata (7) to Debian installer
- Checked BIOS boot parameters are back to normal
- Host up (new fresh bookworm OS)
- Generated Puppet certificate
- Signed new Puppet certificate
- Run Puppet in NOOP mode to populate exported resources in PuppetDB
- Found Nagios_host resource for this host in PuppetDB
- Downtimed the new host on Icinga/Alertmanager
- First Puppet run completed and logged in /var/log/spicerack/sre/hosts/reimage/202506191002_btullis_2242387_dse-k8s-worker2001.out
- configmaster.wikimedia.org updated with the host new SSH public key for wmf-update-known-hosts-production
- Rebooted
- Automatic Puppet run was successful
- Forced a re-check of all Icinga services for the host
- Icinga status is optimal
- Icinga downtime removed
- Updated Netbox data from PuppetDB
Cookbook cookbooks.sre.hosts.reimage was started by btullis@cumin1003 for host dse-k8s-worker2002.codfw.wmnet with OS bookworm
Cookbook cookbooks.sre.hosts.reimage started by btullis@cumin1003 for host dse-k8s-worker2002.codfw.wmnet with OS bookworm completed:
- dse-k8s-worker2002 (PASS)
- Host successfully migrated to the new VLAN
- Removed from Puppet and PuppetDB if present and deleted any certificates
- Removed from Debmonitor if present
- Forced PXE for next reboot
- Host rebooted via IPMI
- Host up (Debian installer)
- Add puppet_version metadata (7) to Debian installer
- Checked BIOS boot parameters are back to normal
- Host up (new fresh bookworm OS)
- Generated Puppet certificate
- Signed new Puppet certificate
- Run Puppet in NOOP mode to populate exported resources in PuppetDB
- Found Nagios_host resource for this host in PuppetDB
- Downtimed the new host on Icinga/Alertmanager
- First Puppet run completed and logged in /var/log/spicerack/sre/hosts/reimage/202506191046_btullis_2248496_dse-k8s-worker2002.out
- configmaster.wikimedia.org updated with the host new SSH public key for wmf-update-known-hosts-production
- Rebooted
- Automatic Puppet run was successful
- Forced a re-check of all Icinga services for the host
- Icinga status is optimal
- Icinga downtime removed
- Updated Netbox data from PuppetDB