Site/Location: esams
Number of systems: 1
Service: prometheus3002
Networking Requirements: internal
Processor Requirements: 2
Memory: 8 GB
Disks: 128 GB
Description
Details
Subject | Repo | Branch | Lines +/-
---|---|---|---
prometheus: Add the prometheus3002 node definition | operations/puppet | production | +4 -0
Status | Subtype | Assigned | Task
---|---|---|---
Open | | andrea.denisse | T324725 Observability Bookworm upgrades
Resolved | | andrea.denisse | T309979 Upgrade Prometheus VMs in PoPs to Bullseye
Resolved | | andrea.denisse | T333627 Site: esams 1 VM request for prometheus3002
Event Timeline
Change 904651 had a related patch set uploaded (by Andrea Denisse; author: Andrea Denisse):
[operations/puppet@production] prometheus: Add the prometheus3002 node definition
Change 904651 merged by Andrea Denisse:
[operations/puppet@production] prometheus: Add the prometheus3002 node definition
cookbooks.sre.hosts.decommission executed by denisse@cumin1001 for hosts: prometheus3002.esams.wmnet
- prometheus3002.esams.wmnet (WARN)
- Host not found on Icinga, unable to downtime it
- Found Ganeti VM
- VM shutdown
- Started forced sync of VMs in Ganeti cluster esams to Netbox
- Removed from DebMonitor
- Removed from Puppet master and PuppetDB
- VM removed
- Started forced sync of VMs in Ganeti cluster esams to Netbox
Cookbook cookbooks.sre.ganeti.reimage was started by denisse@cumin1001 for host prometheus3002.esams.wmnet with OS bullseye
Cookbook cookbooks.sre.ganeti.reimage started by denisse@cumin1001 for host prometheus3002.esams.wmnet with OS bullseye completed:
- prometheus3002 (PASS)
- Removed from Puppet and PuppetDB if present
- Deleted any existing Puppet certificate
- Removed from Debmonitor if present
- Forced PXE for next reboot
- Host rebooted via gnt-instance
- Host up (Debian installer)
- Set boot to disk
- Host up (new fresh bullseye OS)
- Generated Puppet certificate
- Signed new Puppet certificate
- Run Puppet in NOOP mode to populate exported resources in PuppetDB
- Found Nagios_host resource for this host in PuppetDB
- Downtimed the new host on Icinga/Alertmanager
- First Puppet run completed and logged in /var/log/spicerack/sre/ganeti/reimage/202303311923_denisse_1052231_prometheus3002.out
- configmaster.wikimedia.org updated with the host new SSH public key for wmf-update-known-hosts-production
- Rebooted
- Automatic Puppet run was successful
- Forced a re-check of all Icinga services for the host
- Icinga status is optimal
- Icinga downtime removed