Overview
The Kartotherian Helm chart on k8s now runs an extra container to "translate" statsd metrics into Prometheus ones: nodejs is configured to push metrics to localhost:9125, where the Prometheus statsd exporter collects them and exposes them as Prometheus metrics.
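As a rough illustration of the push side, the sketch below builds a statsd timing line and sends it to the exporter's port. It assumes UDP transport and the `ms` (timer) metric type. Neither is confirmed in this task, and the helper names `format_timing` and `send_timing` are hypothetical.

```python
import socket


def format_timing(name, millis):
    """Build a statsd line in the 'name:value|type' form (timer type assumed)."""
    return f"{name}:{millis}|ms"


def send_timing(name, millis, host="localhost", port=9125):
    """Send one timing sample over UDP (assumed transport) to the exporter."""
    payload = format_timing(name, millis).encode("ascii")
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.sendto(payload, (host, port))
```

Calling `send_timing("kartotherian.req.osm-intl.18.png", 42)` would emit one sample of this kind for the exporter to pick up.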
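By default the statsd exporter doesn't do a great job at the translation: every component of the statsd name is folded into the Prometheus metric name instead of becoming a label, so we see metric names like: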
[.cut.]
kartotherian_req_osm_intl_6_png_bucket{le="+Inf"} 1055
kartotherian_req_osm_intl_6_png_sum 47.51299999999971
kartotherian_req_osm_intl_6_png_count 1055
The proposal is to add a config that covers the following use cases:
kartotherian.req.osm-intl.18.png
kartotherian.req.osm-intl.8.png.static.2
kartotherian.req.osm-intl.9.png.1-5:159
Translated as:
kartotherian_request_ms{kind="osm-intl", int="18", format="png"}
kartotherian_request_ms{kind="osm-intl", int="8", format="png", static="2"}
kartotherian_request_ms{kind="osm-intl", int="9", format="png", zoom="1"}
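To make the intended translation concrete, here is a minimal Python sketch of the rule implied by the examples above. It is not the statsd exporter config itself, only an illustration of the matching logic such a config would need to express. The regular expression, the `translate` helper, and the choice to take the first number of a trailing range like `1-5` as `zoom` are assumptions read off the examples. The `kind` label carries the full `osm-intl` segment.

```python
import re

# Hypothetical pattern for:
#   kartotherian.req.<kind>.<int>.<format>[.static.<n> | .<from>-<to>]
REQ_PATTERN = re.compile(
    r"^kartotherian\.req\."
    r"(?P<kind>[^.]+)\."
    r"(?P<int>\d+)\."
    r"(?P<format>[^.]+)"
    r"(?:\.static\.(?P<static>\d+)|\.(?P<zoom>\d+)-\d+)?$"
)


def translate(statsd_line):
    """Return (prometheus_name, labels) for a kartotherian.req metric, else None."""
    # Drop the ':value' part if the line carries one.
    name = statsd_line.split(":", 1)[0]
    match = REQ_PATTERN.match(name)
    if match is None:
        return None
    # Keep only the optional groups that actually matched.
    labels = {k: v for k, v in match.groupdict().items() if v is not None}
    return "kartotherian_request_ms", labels
```

For instance, `translate("kartotherian.req.osm-intl.8.png.static.2")` yields the name `kartotherian_request_ms` with labels `kind`, `int`, `format` and `static`, while names outside the `kartotherian.req` family return `None`.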
To get a broader idea, you can browse the metrics currently being collected in Graphite (left panel -> Metrics -> kartotherian -> ...).
There are other metrics, like the kartotherian.err ones, but those should be easier to translate: they seem to have a flat structure rather than a nested/dynamic one like the kartotherian.req ones.
More info in https://grafana-rw.wikimedia.org/d/000000030/service-kartotherian
Proposal
Writing the Prometheus statsd exporter config seems to be the easiest way to put Kartotherian on k8s, move traffic to it and start using it. As a follow-up, we could also make service-runner publish Prometheus metrics itself, but that probably requires a bit of time and could be done later on.