[summary of a chat with @Krinkle]
Currently, and for historical reasons, navtiming follows the MediaWiki main/master site via etcd to establish where to consume from (eqiad/codfw), and sends to statsd after processing.
The parent task's goal is to establish a Prometheus processor for navtiming, and there's significant progress towards that. Since Prometheus polls both navtiming sites at all times, and there's no danger of double-counting, in a Prometheus-only future we can simplify how navtiming operates, specifically:
- navtiming consumes from kafka-jumbo, which is eqiad-only at all times (this is true today, and will stay true)
- Use a single consumer group for eqiad and codfw navtiming processes, so that load is effectively spread amongst all navtiming processes
- When one of the webperf hosts needs to go down (for maintenance and such), there's no action required: the other host will automatically pick up the slack via the single consumer group
- No more etcd following is required: eqiad and codfw webperf hosts operate exactly the same (even though both are consuming from eqiad)
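
To illustrate the "pick up the slack" behaviour of a single consumer group, here is a toy model in Python. It is only a sketch: the host names, the partition count, and the round-robin rule are assumptions for illustration, not navtiming's code or Kafka's actual assignor. It shows how partitions are split while both hosts are in the group, and how they all move to the survivor when one host leaves.

```python
def assign_partitions(partitions, members):
    """Toy rebalance: spread partitions round-robin over the live group members."""
    members = sorted(members)
    assignment = {member: [] for member in members}
    if not members:
        return assignment
    for i, partition in enumerate(sorted(partitions)):
        assignment[members[i % len(members)]].append(partition)
    return assignment


# Hypothetical values: six topic partitions, one navtiming process per site.
partitions = range(6)

# Both hosts are up: each gets half of the partitions.
both_up = assign_partitions(partitions, ["webperf-eqiad", "webperf-codfw"])

# The eqiad host is down for maintenance: codfw takes every partition.
one_up = assign_partitions(partitions, ["webperf-codfw"])

print(both_up)
print(one_up)
```

With a real Kafka client the rebalance is handled by the broker and client library; for example, with kafka-python it should be enough for both hosts to pass the same `group_id` to `KafkaConsumer`, with no site-specific logic.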