The current aphlict host (aphlict1001.eqiad.wmnet) runs buster and relies largely on hand-provisioned configs. A new host, aphlict1002, has been set up on bullseye with all configs and services managed by puppet. This ticket tracks the work to move aphlict1002 into production and decommission aphlict1001.
The plan is to test the new host during a maintenance window. If the new host works correctly, we will keep it as the production host and shut down the existing one.
- Set up aphlict1002, running bullseye
- Schedule maintenance window for phabricator: https://wikitech.wikimedia.org/wiki/Deployments#deploycal-item-20230424T1900
- During the maintenance window:
  - Remove phabricator::aphlict::ensure: absent from puppet/hierdata/hosts/aphlict1002.yaml
  - Update aphlict.discovery.wmnet to point to the new host in dns/templates/wmnet
  - sudo run-puppet-agent on aphlict1002.eqiad.wmnet
  - Test that notifications in phabricator work correctly (move tickets in a workboard, add comments to see popups, etc.), and check the logs to confirm traffic is hitting the new host
  - If needed, revert the changes above to return to the previous setup
- If keeping the new host, keep aphlict1001.eqiad.wmnet powered off for ~2 weeks so it can be brought back if needed, then decommission it
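
As a rough aid for the DNS step above, the following is a minimal sketch of how one might confirm that the discovery name resolves to the new host after the change. It is not part of the actual procedure; the helper names `resolve` and `points_to` are hypothetical, and it assumes it is run from a machine that can resolve wmnet names.

```python
import socket
from typing import Callable, Set


def resolve(name: str) -> Set[str]:
    """Return the set of IP addresses that `name` currently resolves to."""
    infos = socket.getaddrinfo(name, None)
    # Each entry is (family, type, proto, canonname, sockaddr);
    # sockaddr[0] is the address string for both IPv4 and IPv6.
    return {info[4][0] for info in infos}


def points_to(
    discovery: str,
    host: str,
    resolver: Callable[[str], Set[str]] = resolve,
) -> bool:
    """True if every address behind `discovery` belongs to `host`.

    The resolver is injectable so the check can be exercised
    without real DNS (hypothetical design choice for this sketch).
    """
    discovery_addrs = resolver(discovery)
    host_addrs = resolver(host)
    # An empty answer for the discovery name counts as a failure.
    return bool(discovery_addrs) and discovery_addrs <= host_addrs
```

Calling `points_to("aphlict.discovery.wmnet", "aphlict1002.eqiad.wmnet")` should return True once the DNS change has propagated, and the same call with aphlict1001.eqiad.wmnet should return True again after a revert. DNS caching may delay either result.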
Once this is done and we've verified that the puppet-managed configs are sound, we can reimage aphlict2001.codfw.wmnet to ensure its remaining hand-rolled configs are correctly managed by puppet, and then close the parent task.