wikifeeds-production-tls-proxy regularly exceeding its k8s CPU reservation
Closed, Resolved, Public

Description

Screenshot_20201021_180948.png (387×1 px, 86 KB)

https://grafana.wikimedia.org/d/lxZAdAdMk/wikifeeds?orgId=1&from=1603296568693&to=1603318168693

It's not clear to me that this causes a latency problem, but it's not clear that it doesn't, either.

In general, I have a feeling that our services' Envoys are a bit underprovisioned on CPU.

Event Timeline

JMeybohm added a subscriber: JMeybohm.

I think you are right, thanks for the heads up!

While this is probably also a case of "too much throttling" (T262527), we should bump the requests and limits for wikifeeds. Its Envoy also runs continuously above its memory request, which should be fixed as well.

Change 635753 had a related patch set uploaded (by JMeybohm; owner: JMeybohm):
[operations/deployment-charts@master] wikifeeds: Increase envoy CPU and memory ressources

https://gerrit.wikimedia.org/r/635753

That's pretty interesting: there shouldn't be this much throttling at such low CPU usage. User and system time combined barely reach 1/5 of the limit.

+1 to bumping the limit to see if it solves the latency issues, but it might indeed be related to T262527.
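
One way throttling can show up at a low average usage is burstiness inside a single CFS period (100 ms by default). The following is only a minimal sketch of that effect, not a model of what T262527 describes. The limit and the demand pattern are made-up values, not the real wikifeeds numbers, and the sketch ignores that unmet demand carries over into the next period.

```python
def throttled_periods(usage_ms, limit_cores, period_ms=100.0):
    """Count periods in which the demanded CPU time exceeds the quota.

    usage_ms: CPU time (in ms) the container wants in each period.
    limit_cores: hypothetical CPU limit, in cores.
    period_ms: length of one CFS period (100 ms is the kernel default).
    """
    quota_ms = limit_cores * period_ms
    return sum(1 for used in usage_ms if used > quota_ms)


# Hypothetical workload: mostly idle, with one short burst every 10 periods.
limit_cores = 0.5  # assumed limit, not the real wikifeeds value
demand = [60.0 if i % 10 == 0 else 5.0 for i in range(100)]

# Average usage over the whole window, expressed in cores.
average_cores = sum(demand) / (len(demand) * 100.0)
throttled = throttled_periods(demand, limit_cores)

print(f"average usage: {average_cores / limit_cores:.0%} of the limit")
print(f"throttled periods: {throttled} of {len(demand)}")
```

With these made-up numbers, average usage is about a fifth of the limit, yet every burst period exceeds its quota and would be throttled.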

Indeed, but it approaches the configured request very often, which already leads to throttling. I have seen that in the eventgates as well. We should probably revisit this after kernel 4.19 is rolled out, to see if we can lower the resource limits again.
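
When revisiting this, one way to compare before and after is the share of throttled CFS periods, taken from the `nr_periods` and `nr_throttled` counters in the container's cgroup `cpu.stat` file. This is a minimal sketch; the helper name and the sample numbers are hypothetical and not taken from the wikifeeds pods.

```python
def throttle_ratio(cpu_stat_text):
    """Return nr_throttled / nr_periods from the text of a cgroup cpu.stat file."""
    stats = {}
    for line in cpu_stat_text.splitlines():
        parts = line.split()
        if len(parts) == 2:  # each line is "<key> <integer value>"
            stats[parts[0]] = int(parts[1])
    periods = stats.get("nr_periods", 0)
    if periods == 0:
        return 0.0  # no enforcement periods recorded yet
    return stats.get("nr_throttled", 0) / periods


# Made-up sample in the cgroup v1 layout (throttled_time is in nanoseconds).
sample = """nr_periods 1200
nr_throttled 300
throttled_time 4500000000
"""
ratio = throttle_ratio(sample)
print(f"{ratio:.1%} of periods throttled")
```

In practice the text would be read from the `cpu.stat` file of the container's cgroup. Because the counters are cumulative, comparing two readings taken some time apart gives the ratio for that interval.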

Change 635753 merged by jenkins-bot:
[operations/deployment-charts@master] wikifeeds: Increase envoy CPU and memory ressources

https://gerrit.wikimedia.org/r/635753

Looks way better now, even under higher load.