Cloud VPS Project Tested: n/a
Site/Location: eqiad
Number of systems: 1
Service: Matomo
Networking Requirements: internal
Processor Requirements: 4
Memory: 8 GB
Disks: 40 GB root, 80 GB /var/lib/mysql
Other Requirements: This is a direct replacement for the existing virtual machine, matomo1002. The older machine will be decommissioned once the new server has been put into service.
Description
Details
Subject | Repo | Branch | Lines +/-
---|---|---|---
Add puppet7 data for new host matomo1003. | operations/puppet | production | +2 -0
Status | Subtype | Assigned | Task
---|---|---|---
Open | | None | T291916 Tracking task for Bullseye migrations in production
Open | | None | T288804 Upgrade the Data Engineering infrastructure to Debian Bullseye
Resolved | | BTullis | T349397 Migrate the matomo host to bookworm
Resolved | | BTullis | T362146 Site: eqiad 1 VM for Matomo
Event Timeline
I'll add the second disk after the initial creation by the cookbook. This will be useful to allow us to retain MariaDB data during an in-place reimage.
The Ganeti cluster report looks like it's fairly evenly balanced at the moment.
```
DRY-RUN: START - Cookbook sre.ganeti.resource-report
+-------+-------+-----------+----------+-----------+---------+-----------+
| Group | Nodes | Instances | MFree    | MFree avg | DFree   | DFree avg |
+-------+-------+-----------+----------+-----------+---------+-----------+
| A     | 8     | 35        | 291.7GiB | 36.5GiB   | 16.6TiB | 2.1TiB    |
| B     | 7     | 36        | 232.2GiB | 33.2GiB   | 11.9TiB | 1.7TiB    |
| C     | 8     | 37        | 289.2GiB | 36.1GiB   | 15.6TiB | 1.9TiB    |
| D     | 7     | 32        | 276.7GiB | 39.5GiB   | 13.1TiB | 1.9TiB    |
+-------+-------+-----------+----------+-----------+---------+-----------+
```
matomo1002 is currently in cluster group C, so if I create the new VM in the same group, the cluster should remain balanced after I decommission the old host.
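As an aside, the group averages in the resource report above can be ranked mechanically. This is just a minimal sketch using the MFree avg figures copied from the report; it ranks on free memory alone and ignores the decommission consideration that actually drives the choice of group C here:

```shell
# Rank Ganeti groups by average free memory per node (MFree avg, in GiB),
# using the figures from the sre.ganeti.resource-report output above.
report='A 36.5
B 33.2
C 36.1
D 39.5'
best=$(printf '%s\n' "$report" | sort -k2 -rn | head -n1 | cut -d' ' -f1)
echo "Most free memory per node: group $best"
```

By this metric alone group D would win, but staying in group C keeps that group balanced once matomo1002 is removed.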
Change #1018270 had a related patch set uploaded (by Btullis; author: Btullis):
[operations/puppet@production] Add puppet7 data for new host matomo1003.
Cookbook cookbooks.sre.hosts.reimage was started by btullis@cumin1002 for host matomo1003.eqiad.wmnet with OS bookworm
Change #1018270 merged by Btullis:
[operations/puppet@production] Add puppet7 data for new host matomo1003.
Cookbook cookbooks.sre.hosts.reimage started by btullis@cumin1002 for host matomo1003.eqiad.wmnet with OS bookworm executed with errors:
- matomo1003 (FAIL)
- Removed from Puppet and PuppetDB if present and deleted any certificates
- Removed from Debmonitor if present
- Forced PXE for next reboot
- Host rebooted via gnt-instance
- Host up (Debian installer)
- Add puppet_version metadata to Debian installer
- Set boot media to disk
- Host up (new fresh bookworm OS)
- Generated Puppet certificate
- Signed new Puppet certificate
- Run Puppet in NOOP mode to populate exported resources in PuppetDB
- The reimage failed, see the cookbook logs for the details. You can also try typing "install-console matomo1003.eqiad.wmnet" to get a root shell, but depending on the failure this may not work.
Cookbook cookbooks.sre.hosts.reimage was started by btullis@cumin1002 for host matomo1003.eqiad.wmnet with OS bookworm
Cookbook cookbooks.sre.hosts.reimage started by btullis@cumin1002 for host matomo1003.eqiad.wmnet with OS bookworm completed:
- matomo1003 (WARN)
- Downtimed on Icinga/Alertmanager
- Unable to disable Puppet, the host may have been unreachable
- Removed from Puppet and PuppetDB if present and deleted any certificates
- Removed from Debmonitor if present
- Forced PXE for next reboot
- Host rebooted via gnt-instance
- Host up (Debian installer)
- Add puppet_version metadata to Debian installer
- Set boot media to disk
- Host up (new fresh bookworm OS)
- Generated Puppet certificate
- Signed new Puppet certificate
- Run Puppet in NOOP mode to populate exported resources in PuppetDB
- Found Nagios_host resource for this host in PuppetDB
- Downtimed the new host on Icinga/Alertmanager
- First Puppet run completed and logged in /var/log/spicerack/sre/hosts/reimage/202404091545_btullis_1610211_matomo1003.out
- configmaster.wikimedia.org updated with the host new SSH public key for wmf-update-known-hosts-production
- Rebooted
- Automatic Puppet run was successful
- Forced a re-check of all Icinga services for the host
- Icinga status is optimal
- Icinga downtime removed
- Updated Netbox data from PuppetDB
I'm adding the second disk now.
```
btullis@ganeti1027:~$ sudo gnt-instance modify --disk add:size=80g matomo1003.eqiad.wmnet
Thu Apr 11 09:07:30 2024 - INFO: Waiting for instance matomo1003.eqiad.wmnet to sync disks
Thu Apr 11 09:07:30 2024 - INFO: - device disk/1: 0.10% done, 1h 7m 12s remaining (estimated)
Thu Apr 11 09:08:31 2024 - INFO: - device disk/1: 2.80% done, 35m 15s remaining (estimated)
```
I think that I will mount this as /srv and try to make the MariaDB configuration more like one of our standard server setups.
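Once the disk has synced, the follow-up steps would look roughly like the sketch below. This assumes the new 80 GB volume shows up inside the guest as /dev/sdb (verify with lsblk first); the destructive commands are shown commented out:

```shell
# Sketch: format the new volume and mount it as /srv.
# Assumption: the added Ganeti disk appears as /dev/sdb in the guest.
dev=/dev/sdb
mnt=/srv

# sudo mkfs.ext4 "$dev"     # create the filesystem (destructive)
# sudo mkdir -p "$mnt"
fstab_line="$dev $mnt ext4 defaults 0 2"
# echo "$fstab_line" | sudo tee -a /etc/fstab
# sudo mount "$mnt"

echo "$fstab_line"
```

With /srv mounted, the MariaDB datadir can then live under it, matching the standard server layout mentioned above.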