As per title, all hosts running Graphite should be on Bullseye
root@cumin1001:~# cumin 'P{C:graphite} and not P{F:lsbdistcodename = buster}'
2 hosts will be targeted:
graphite2003.codfw.wmnet,graphite1004.eqiad.wmnet
DRY-RUN mode enabled, aborting
- Get the graphite::production role to work in Pontoon on Bullseye
Action plan for codfw:
- Make sure carbonate is available on Bullseye and working as expected, as per https://wikitech.wikimedia.org/wiki/Graphite#Merge_and_sync_metrics
- Reimage graphite2003 with Bullseye. As expected, graphite1004 will start dropping metrics directed to graphite2003
- Ensure Puppet is running as expected, all services are up, and metrics are being received from graphite1004, then wait 24h for some data to accumulate
- Transfer and merge data from graphite1004 to graphite2003 with carbonate, following https://wikitech.wikimedia.org/wiki/Graphite#Merge_and_sync_metrics
- Validate that historical metric data is present and new data is flowing
The plan for eqiad is similar, with the addition of a failover to codfw (as per https://wikitech.wikimedia.org/wiki/Graphite#Failover) and a failback once things are working in eqiad:
- Fail over from graphite1004 to graphite2003
- Reimage graphite1004
- Transfer and merge data from graphite2003 to graphite1004 with carbonate, following https://wikitech.wikimedia.org/wiki/Graphite#Merge_and_sync_metrics
- Validate that historical metric data is present and new data is flowing
- Fail back to graphite1004
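
The validation step in both plans ("historical metric data is present and new data is flowing") could be scripted roughly as below. This is only a sketch, not an agreed procedure. It assumes the check is done against the JSON output of Graphite's render API (`format=json`, where each series has a `target` and a list of `[value, timestamp]` datapoints, with `null` for empty intervals). The function names, the `reimaged_at` cutoff, the 10-minute "recent" window, and the metric names in the sample are all hypothetical.

```python
import json
import time


def check_series(datapoints, reimaged_at, now, recent_window=600):
    """Return (has_history, has_recent) for one series.

    datapoints: list of [value, timestamp] pairs (render API, format=json).
    reimaged_at: epoch seconds of the reimage; non-null points older than
                 this can only exist if the carbonate merge brought them over.
    recent_window: how many seconds back from `now` count as "new" data
                   (assumed value, tune to the metric's resolution).
    """
    has_history = any(
        value is not None and ts < reimaged_at for value, ts in datapoints
    )
    has_recent = any(
        value is not None and ts >= now - recent_window for value, ts in datapoints
    )
    return has_history, has_recent


def validate_render_response(body, reimaged_at, now=None, recent_window=600):
    """Map each target in a render-API JSON body to (has_history, has_recent)."""
    now = time.time() if now is None else now
    return {
        series["target"]: check_series(
            series["datapoints"], reimaged_at, now, recent_window
        )
        for series in json.loads(body)
    }


if __name__ == "__main__":
    # Hand-written sample body; in practice this would be fetched from
    # <graphite host>/render?target=<metric>&from=-48h&format=json
    sample = json.dumps([
        {"target": "servers.example.load",
         "datapoints": [[1.0, 1000], [None, 1060], [2.0, 5000]]},
        {"target": "servers.example.fresh",
         "datapoints": [[None, 1000], [3.0, 5000]]},
    ])
    for target, (history, recent) in validate_render_response(
        sample, reimaged_at=2000, now=5100
    ).items():
        print(f"{target}: history={history} recent={recent}")
```

A series reporting `history=False` after the merge would suggest its whisper file was not synced, while `recent=False` would point at the relay or carbon side rather than at the merge.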