
unwind the Puppetized /etc/hosts override of statsd.eqiad.wmnet
Open, Low, Public

Description

It turns out that internal recdns is underprovisioned in eqiad for the current load. 80% of the load on eqiad recdns is lookups for statsd.eqiad.wmnet, which seem to be made multiple times per MW appserver query and are never cached by those clients (presumably for the usual PHP reasons).

https://gerrit.wikimedia.org/r/c/operations/puppet/+/554618 dropped us from ~70k packets per second on each recdns host to about 12k pps. But this is a kludge, and it should be rolled back once we have the capacity (10G NICs are coming soon, which will likely help) or once we work around it in other ways, such as a local stub resolver on every host (with a max-ttl of only a minute or two, so we don't create more of a mess around purging bad records).
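As a quick sanity check on the figures above, here is a small Python sketch that computes the relative drop in packet rate. The inputs are the approximate numbers quoted in this task (~70k and ~12k pps), and the helper name `reduction_fraction` is made up for illustration.

```python
def reduction_fraction(before_pps: float, after_pps: float) -> float:
    """Return the fraction of traffic removed, relative to the starting rate."""
    return (before_pps - after_pps) / before_pps


# Approximate per-host figures quoted in the task description.
before_pps = 70_000
after_pps = 12_000

frac = reduction_fraction(before_pps, after_pps)
print(f"{frac:.1%}")  # prints "82.9%"
```

A drop of roughly 83% is in line with the estimate that about 80% of the recdns load was statsd lookups, given that both input figures are rounded.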

Event Timeline

If I recall correctly, HHVM had a dns cache. This is among the reasons that, over the years, we gradually adopted more use of hostnames in wmf-config for services instead of hardcoding IP addresses. I guess we lost that in the PHP7 transition. Does the OS not cache this at all? Does PHP7 do something to bypass it?

Change 554631 had a related patch set uploaded (by CDanis; owner: CDanis):
[operations/dns@master] statsd: document Puppet /etc/hosts-ification

https://gerrit.wikimedia.org/r/554631

Change 554632 had a related patch set uploaded (by CDanis; owner: CDanis):
[operations/puppet@production] base: document statsd DNS kludge

https://gerrit.wikimedia.org/r/554632

In general, app-layer DNS caching is a Bad Idea unless it's done very carefully (e.g. cap it at something like 5s max, or actually use a full-featured resolver library and honor the real TTLs from upstream, or both).
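To make the "cap it at something like 5s" option concrete, here is a minimal Python sketch of an app-layer cache with a hard upper bound on entry lifetime. It is an illustration only, not what MediaWiki or HHVM does: the class name `CappedDnsCache` and its parameters are hypothetical, and because `socket.gethostbyname` does not expose the upstream TTL, this covers only the cap, not the "get the real TTLs" variant.

```python
import socket
import time

MAX_TTL_SECONDS = 5.0  # assumed cap, taken from the "5s max" suggestion above


class CappedDnsCache:
    """Tiny hostname -> address cache whose entries live at most max_ttl seconds."""

    def __init__(self, resolve=socket.gethostbyname,
                 max_ttl=MAX_TTL_SECONDS, clock=time.monotonic):
        self._resolve = resolve   # injectable so tests need no network
        self._max_ttl = max_ttl
        self._clock = clock
        self._entries = {}        # hostname -> (address, expiry time)

    def lookup(self, hostname):
        now = self._clock()
        entry = self._entries.get(hostname)
        if entry is not None and now < entry[1]:
            return entry[0]       # still fresh: no DNS query is made
        address = self._resolve(hostname)
        self._entries[hostname] = (address, now + self._max_ttl)
        return address

    def purge(self, hostname):
        """Drop a single entry, e.g. after a known-bad record was fixed upstream."""
        self._entries.pop(hostname, None)
```

With a 5s cap, a bad record can linger in any one process for at most five seconds, which is why a short cap keeps the purging problem small while still removing most repeated lookups.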

There are several ways we can make the OS layer do the same, but they all involve some form of stub cache, and no such thing is presently configured. There's a systemd-resolved stub cache that's fairly easy to inject at the OS layer, but it lacks any kind of config to cap TTLs down to ~5s and has no way to wipe individual records, so we'd lose our current ability to actively wipe individual cache entries from recdns in various operational problem scenarios. Plugging together a per-host real cache like powerdns recursor is also very tricky...

Change 554632 merged by CDanis:
[operations/puppet@production] base: document statsd DNS kludge

https://gerrit.wikimedia.org/r/554632

Change 554631 merged by CDanis:
[operations/dns@master] statsd: document Puppet /etc/hosts-ification

https://gerrit.wikimedia.org/r/554631

Given that most nodejs applications don't use statsd anymore (in kubernetes we just use the prometheus-statsd exporter), and that I have submitted https://gerrit.wikimedia.org/r/c/operations/mediawiki-config/+/661732 to switch MediaWiki to using the IP address directly, I think we can remove the puppetized hosts file entry once my patch is merged.
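For illustration, here is a rough Python sketch of why addressing statsd by IP sidesteps the resolver: a UDP send to a literal address involves no DNS lookup at all. This is not the MediaWiki change itself (that is PHP configuration). The function name `send_counter` is hypothetical, and the address and port in the usage comment are placeholders, not the real statsd endpoint.

```python
import socket


def send_counter(sock, address, metric, value=1):
    """Send one statsd counter datagram ("name:value|c") to an (ip, port) pair.

    Because `address` carries a literal IP, sendto() performs no DNS lookup.
    """
    payload = f"{metric}:{value}|c".encode("ascii")
    sock.sendto(payload, address)
    return payload


# Usage sketch (placeholder address; 8125 is the conventional statsd port):
#   sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
#   send_counter(sock, ("192.0.2.10", 8125), "example.requests")
```

The trade-off is the one discussed earlier in this task: a hardcoded IP has to be updated by hand if the service moves.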