To track the overall progress of the Icinga -> Alertmanager (AM) migration, I've come up with the following "progress indicator": an aggregate list of all service checks we have in Icinga. Whenever we make progress (either by removing checks altogether or by moving them to Alertmanager/Prometheus), the list will shrink.
In terms of targets, ideally the list would reach zero (i.e. no more checks in Icinga, so we can shut it down), though that largely depends on a cost/benefit analysis, since the tail of checks is quite long (a subject for another task).
The following command reports all Puppet-generated checks present in Icinga at the time it is run, with some attempt at aggregating related checks: hostnames are replaced with HOST, site names with SITE, and runs of three or more digits with PORT. The total number of services is also tracked in the icinga_service_count metric.
grep _NAME /etc/icinga/objects/puppet_services.cfg | awk '{print $4}' | sort | sed -E -e 's@[a-z][-a-z0-9]+\.[a-z]+(\.[a-z]+)?@HOST@g' -e 's@(eqiad|codfw|esams|ulsfo|eqsin|drmrs)@SITE@g' -e 's@[0-9]{3,}@PORT@g' | sort | uniq -c | sort -nr
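To see what the aggregation step does without access to the Icinga host, here is a minimal sketch that wraps the same sed expressions in a shell function and feeds it a few made-up check names. The function name normalize_checks and the sample check names are hypothetical and only for illustration; they are not taken from puppet_services.cfg.

```shell
#!/bin/sh
# Sketch only: same sed expressions as the command above, wrapped in a
# function (hypothetical name) that reads one check name per line on stdin.
normalize_checks() {
  sed -E \
    -e 's@[a-z][-a-z0-9]+\.[a-z]+(\.[a-z]+)?@HOST@g' \
    -e 's@(eqiad|codfw|esams|ulsfo|eqsin|drmrs)@SITE@g' \
    -e 's@[0-9]{3,}@PORT@g' |
    sort | uniq -c | sort -nr
}

# Made-up check names, standing in for the fourth field extracted by awk.
printf '%s\n' \
  check_tcp_db1001.eqiad.wmnet_3306 \
  check_tcp_db1002.eqiad.wmnet_3306 \
  check_tcp_db2001.codfw.wmnet_3306 \
  lvs_eqiad_check \
  lvs_codfw_check \
  puppet_last_run |
  normalize_checks
```

With this input, the three check_tcp names collapse into a single check_tcp_HOST_PORT entry with a count of 3, and the two lvs names collapse into lvs_SITE_check with a count of 2. Note that the PORT substitution matches any run of three or more digits, so numbers that are not ports are rewritten as well.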
The results as of the writing of this task (20221017) can be found at P35498, which lists 978 de-duplicated checks. For some of these, work is already in progress to either decommission the check or port it to Prometheus (cf. T288622).
See also this spreadsheet: https://docs.google.com/spreadsheets/d/19nxCXldb804TJCXGy4Z2BHG_1wRksRnKcPC6sXfjQuM/edit#gid=1831147731
Icinga checks are tallied and deduplicated there, and annotated with their check "kind" (i.e. what we can do in a world without Icinga, the engine/program).