If the statsd metrics collector in a service-runner based service (such as RESTBase) is configured to send metrics to a non-resolvable domain, or the domain becomes non-resolvable while the service is up, the service starts crashing with something like:
[2019-12-20T18:25:11.751Z] FATAL: restbase/13462 on deployment-restbase01: Error sending hot-shots message: Error: getaddrinfo ENOTFOUND labmon1001.eqiad.wmnet (err.code=ENOTFOUND, err.levelPath=fatal/service-runner/unhandled)
    Error: Error sending hot-shots message: Error: getaddrinfo ENOTFOUND labmon1001.eqiad.wmnet
        at handleCallback (/srv/deployment/restbase/deploy-cache/revs/92acf1e5ae89accc351b9e7b08d3dc1d9590551b/node_modules/hot-shots/lib/statsd.js:357:32)
        at doSend (dgram.js:372:7)
        at afterDns (dgram.js:362:5)
        at /srv/deployment/restbase/deploy-cache/revs/92acf1e5ae89accc351b9e7b08d3dc1d9590551b/node_modules/dnscache/lib/index.js:136:58
        at Array.forEach (native)
        at /srv/deployment/restbase/deploy-cache/revs/92acf1e5ae89accc351b9e7b08d3dc1d9590551b/node_modules/dnscache/lib/index.js:136:34
        at GetAddrInfoReqWrap.asyncCallback [as callback] (dns.js:62:16)
        at GetAddrInfoReqWrap.onlookup [as oncomplete] (dns.js:76:17)
The service should recover from a condition like this instead of exiting with a fatal error.
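
One possible shape for the desired behaviour, as a minimal sketch rather than the actual service-runner or hot-shots code: route send failures to a handler that logs a warning and counts the dropped metric instead of throwing. The sketch uses only Node's built-in dgram module; the helper names makeTolerantErrorHandler and createMetricSender, the log callback signature and the dropped counter are all hypothetical. hot-shots also documents an errorHandler constructor option that could serve the same purpose, but whether service-runner sets it is an assumption that would need checking.

```javascript
'use strict';

const dgram = require('dgram');

// Hypothetical helper: returns an error handler that records the failure
// and logs a warning, so a failed send never becomes an unhandled exception.
function makeTolerantErrorHandler(log) {
    const handler = (err) => {
        handler.dropped += 1;
        log('warn', `metrics send failed (${err.code || 'unknown'}): ${err.message}`);
    };
    handler.dropped = 0;
    return handler;
}

// Hypothetical helper: a fire-and-forget UDP sender for statsd-style lines.
function createMetricSender(host, port, onError) {
    const socket = dgram.createSocket('udp4');
    // Without an 'error' listener, a socket-level error would be thrown.
    socket.on('error', onError);
    // Do not keep the process alive just for the metrics socket.
    socket.unref();
    return (line) => {
        // With a callback supplied, dgram reports DNS and send failures
        // to the callback instead of emitting them on the socket.
        socket.send(Buffer.from(line), port, host, (err) => {
            if (err) {
                onError(err);
            }
        });
    };
}
```

With these helpers, calling createMetricSender('metrics.invalid', 8125, handler) and then sending a line such as 'restbase.requests:1|c' should result in a warning being logged for the ENOTFOUND failure while the process keeps running. The host name metrics.invalid is a placeholder chosen because the .invalid TLD is reserved and never resolves.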