Machine-level metrics are covered in prometheus by `node_exporter` (tracked in T140646), but we also have application-specific metrics deployed in ganglia.
For prometheus to be a viable replacement for ganglia we'd have to have at least the same metrics (if not better) in prometheus too.
See also https://wikitech.wikimedia.org/wiki/Prometheus#Replacing_Ganglia for a list of ganglia plugins we are currently deploying. I'm listing below the ones I think are more important/urgent to have:
[x] varnish
[x] gdnsd
[x] apache
[x] vhtcpd
[x] hhvm
[x] memcache
[] redis
[] postgresql
The list of rrds updated in the last 30d is in P4571; their current status is:
[] fundraising-related stats for misc queues and donations T152562
[x] cirrussearch slow log rate, in graphite via logstash
[x] apache mod_socache_shmcb stats, we don't seem to use `mod_socache` anyway
[x] elasticsearch stats, afaict those are in graphite already
[] exim, can be done with diamond/graphite or in prometheus via node_exporter
[x] jenkins TODO? some stats might be already in graphite
[x] kafka, in graphite
[x] varnishkafka, in graphite
[] osm sync lag from `/srv/osmosis/state.txt`
[x] powerdns, in graphite via diamond
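For the osm sync lag item, a minimal sketch of how the lag could be derived from the state file and written in the Prometheus text format, e.g. for pickup by the `node_exporter` textfile collector. It assumes `state.txt` is a Java-properties style file containing a line like `timestamp=2017-01-01T00\:00\:00Z` (colons escaped); the metric name `osm_sync_lag_seconds` and the function names are made up for illustration and not an agreed convention.

```python
from datetime import datetime, timezone


def parse_state_timestamp(text):
    """Return the `timestamp` entry of a state file as a UTC datetime.

    Assumption: the file has a line `timestamp=YYYY-MM-DDTHH\\:MM\\:SSZ`,
    with ':' escaped as '\\:' as in Java properties files.
    """
    for line in text.splitlines():
        line = line.strip()
        if line.startswith("timestamp="):
            value = line.split("=", 1)[1].replace("\\:", ":")
            parsed = datetime.strptime(value, "%Y-%m-%dT%H:%M:%SZ")
            return parsed.replace(tzinfo=timezone.utc)
    raise ValueError("no timestamp= line found in state file")


def sync_lag_seconds(text, now=None):
    """Seconds elapsed between the state file timestamp and `now` (UTC)."""
    if now is None:
        now = datetime.now(timezone.utc)
    return (now - parse_state_timestamp(text)).total_seconds()


def render_metric(lag):
    """Render the lag as a gauge in the Prometheus text exposition format."""
    return (
        "# HELP osm_sync_lag_seconds Seconds since the last applied OSM change.\n"
        "# TYPE osm_sync_lag_seconds gauge\n"
        f"osm_sync_lag_seconds {lag:.0f}\n"
    )
```

A cron job could read `/srv/osmosis/state.txt`, pass its contents through `sync_lag_seconds` and `render_metric`, and write the result to a `.prom` file in whatever directory the textfile collector is configured to scan.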