The API gateway runs a very old version of the ratelimit service (the 1.5.1 branch). We have started to see the service throw 5xx errors, and we have very little visibility into why. The errors also show up when other services are impaired, which is confusing and a bad signal, especially when it pages.
We should update the service to a recent version and, as part of the migration, switch to the new Prometheus metrics it offers. We can also remove the statsd gateway as part of this work.