
mw-on-k8s app container CPU throttling at low average load
Closed, ResolvedPublic

Description

Problem first raised in T342252: Migrate rdf-streaming-updater to connect to mw-on-k8s

All our mw-on-k8s deployments are experiencing significant throttling at low CPU load. For instance mw-web, the main deployment, gets throttled in eqiad for up to 5 ms while using less than 1/10th of its CPU quota.

This is probably due to how the CPU quota is allocated in fixed timeslots (CFS periods), see:
https://medium.com/@betz.mark/understanding-resource-limits-in-kubernetes-cpu-time-9eff74d3161b
https://medium.com/indeed-engineering/unthrottled-how-a-valid-fix-becomes-a-regression-f61eabb2fbd9
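
A rough sketch of the mechanism in Python, with hypothetical numbers rather than our actual pod sizing (the function and its figures are illustrative only):

# Minimal sketch (hypothetical numbers) of why a bursty workload gets throttled
# even at low average usage: CFS hands out the quota in fixed periods, and a
# parallel burst of workers can exhaust it partway through a single period.

PERIOD_MS = 100                      # default CFS period used by Kubernetes
LIMIT_CPUS = 8                       # hypothetical container CPU limit
QUOTA_MS = LIMIT_CPUS * PERIOD_MS    # 800 ms of CPU time granted per period

def remaining_work_ms(concurrent_workers: int, burst_ms: int) -> float:
    """CPU time each worker still needs after the quota runs out; that work
    has to wait for the next period's refill, i.e. it shows up as throttling."""
    demanded_ms = concurrent_workers * burst_ms
    return max(0.0, (demanded_ms - QUOTA_MS) / concurrent_workers)

# 32 FPM workers all runnable for 30 ms demand 960 ms of CPU in one period,
# but only 800 ms is granted: each worker is left with ~5 ms of throttled work,
# even though the pod can be nearly idle over the rest of a 5-minute window.
print(remaining_work_ms(concurrent_workers=32, burst_ms=30))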

I propose we remove the CPU limit for the app container in our mw-on-k8s deployments.

Related Objects

Event Timeline

Clement_Goubert created this task.
Clement_Goubert moved this task from Incoming 🐫 to this.quarter 🍕 on the serviceops board.

In my opinion, we need to fix this before moving forward with migrating more traffic to mw-on-k8s.

Clement_Goubert renamed this task from "mw-on-k8s php-fpm container CPU throttling at low average load" to "mw-on-k8s app container CPU throttling at low average load". Jul 26 2023, 11:46 AM
Clement_Goubert updated the task description. (Show Details)

We were consistently throttled until we set limits == FPM worker count. Per the description (and Dan Luu's insightful foray[1] into the topic), I don't think there is much that can be done besides adjusting or removing the limits, or tweaking the CFS period that k8s uses. Removing the limits is probably fine given that the size of the worker pool is a natural upper bound on concurrency with pm = static.


[1] https://danluu.com/cgroup-throttling/

Thanks @TK-999. Indeed, for a PHP application like MediaWiki that doesn't shell out much, the number of workers is a hard limit on the number of CPUs it can use, roughly 1 CPU per worker, with typical usage around 0.1 CPU-seconds per second per worker in the mediawiki appserver cluster.

So in practice we might want to slightly raise the CPU requests for a mediawiki pod, and possibly remove the limits.

Reducing the CFS quota period from 100 ms to something like 10 ms also probably makes sense.
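
A quick illustration of why a shorter period would help, under the simplifying assumption that a throttled worker just waits until the current period ends (the CPU limit here is again hypothetical); as far as I know the kubelet exposes this via its --cpu-cfs-quota-period setting:

# Sketch: once the quota is spent, throttled threads wait for the next period,
# so the worst-case single stall is roughly bounded by the period length.
LIMIT_CPUS = 8                       # hypothetical container CPU limit
for period_ms in (100, 10):
    quota_ms = LIMIT_CPUS * period_ms
    print(f"period={period_ms:>3}ms: quota refilled in {quota_ms}ms slices, "
          f"worst-case stall per throttle event ~{period_ms}ms")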

How I evaluated the current seconds per worker:

I used the following formula in promQL:

sum(rate(container_cpu_usage_seconds_total{cluster="$cluster", id=~"/system.slice/php7\\.4-fpm.service"}[5m]))
/
sum(phpfpm_statustext_processes{site="eqiad", service="php7.4-fpm.service", cluster="$cluster"})

the values I found are:

  • ~ 0.05-0.15 for appservers
  • ~ 0.2-0.25 for apis and jobrunners
  • ~ 0.4-0.5 for parsoid

Thanks for the insight @TK-999
When you say "limits == FPM worker count", do you mean one whole CPU per worker? Did you use pinning as well?
As I understand it, even using whole CPU counts matching process count, we would still see some (but probably less) throttling due to the CFS timeslot mechanism.

@Joe So we would set the requests for the app container to something like:

  • mw-web 100m*nb_workers
  • mw-api-* 200m*nb_workers

Then set the pod request a bit higher than that (to account for the sidecars), and remove limits for the main container, the sidecars, and the whole pod?
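
For illustration, with made-up worker counts and sidecar headroom (the real values live in the helmfile values for each release), that would work out to roughly:

# Hypothetical worker counts and sidecar allowance, for illustration only.
DEPLOYMENTS = {
    "mw-web":     {"workers": 60, "cpu_per_worker_m": 100},
    "mw-api-ext": {"workers": 60, "cpu_per_worker_m": 200},
}
SIDECAR_HEADROOM_M = 500  # rough allowance for the sidecars, also made up

for name, cfg in DEPLOYMENTS.items():
    app_m = cfg["workers"] * cfg["cpu_per_worker_m"]
    print(f"{name}: app container request {app_m}m, "
          f"pod request ~{app_m + SIDECAR_HEADROOM_M}m, no CPU limit")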

@Clement_Goubert Yeah, we currently set a limit of 1 CPU per worker. We have not experimented with pinning.

In practice, this keeps throttling at < 0.25% - likely because even if a pod sees 100% process utilization, those processes might be waiting on I/O or otherwise not utilizing the CPU time budget.

We had a long but productive discussion with @JMeybohm this morning, resulting in a tentative plan of action:

  1. Graph the global latency of wikikube-hosted services. This is not useful as a raw number, but if it's not too spiky, a variation should help us spot whether mediawiki is being too noisy a neighbor.
  2. T277876: Reserve resources for system daemons on kubernetes nodes should be completed before removing limits on all mw-on-k8s deployments, so we avoid system resource starvation under spikes once the baseload has increased. It has been updated with an implementation proposal. This is not a blocker per se.
  3. Remove limits on mw-api-int. We would not raise the worker count, and would keep requests where they are for the php-fpm container (0.5 CPUs/worker).
  4. Let that run for a week or so to get actionable intel on behaviour.
  5. If it's conclusive, remove limits for the php-fpm container on all mw-on-k8s deployments.

A question we were left with is whether we can load-test against a single pod IP to check the behavior more quickly. I know a load test of mw-on-k8s has been discussed with Performance-Team; maybe we could collaborate on that?

Change 943560 had a related patch set uploaded (by Clément Goubert; author: Clément Goubert):

[operations/deployment-charts@master] mediawiki: set requests based on php.workers

https://gerrit.wikimedia.org/r/943560

Clement_Goubert changed the task status from Stalled to In Progress. Aug 8 2023, 1:37 PM

Change 943560 merged by jenkins-bot:

[operations/deployment-charts@master] mediawiki: set requests based on php.workers

https://gerrit.wikimedia.org/r/943560

Change 947792 had a related patch set uploaded (by JMeybohm; author: JMeybohm):

[operations/deployment-charts@master] Revert "mediawiki: set requests based on php.workers"

https://gerrit.wikimedia.org/r/947792

Change 947792 merged by jenkins-bot:

[operations/deployment-charts@master] Revert "mediawiki: set requests based on php.workers"

https://gerrit.wikimedia.org/r/947792

Change 949957 had a related patch set uploaded (by Clément Goubert; author: Clément Goubert):

[operations/deployment-charts@master] mediawiki: Set requests based on php.workers

https://gerrit.wikimedia.org/r/949957

Change 950138 had a related patch set uploaded (by Clément Goubert; author: Clément Goubert):

[operations/deployment-charts@master] mediawiki: Add exporter limits and requests

https://gerrit.wikimedia.org/r/950138

Change 950138 merged by jenkins-bot:

[operations/deployment-charts@master] mediawiki: Add exporter limits and requests

https://gerrit.wikimedia.org/r/950138

Change 949957 merged by jenkins-bot:

[operations/deployment-charts@master] mw-api-int: Set requests based on php.workers

https://gerrit.wikimedia.org/r/949957

php containers without CPU limits were deployed today on mw-api-int.
Next week we will reintroduce memory limits, and extend the removal of CPU limits to all mw-on-k8s deployments except mw-debug and mw-misc.

Change 950177 had a related patch set uploaded (by Clément Goubert; author: Clément Goubert):

[operations/deployment-charts@master] mediawiki: Reduce memory request

https://gerrit.wikimedia.org/r/950177

Change 950177 merged by jenkins-bot:

[operations/deployment-charts@master] mediawiki: Reduce memory request

https://gerrit.wikimedia.org/r/950177

Change 951045 had a related patch set uploaded (by Clément Goubert; author: Clément Goubert):

[operations/deployment-charts@master] mediawiki: Allow autocomputing the memory limit

https://gerrit.wikimedia.org/r/951045

Change 951051 had a related patch set uploaded (by Clément Goubert; author: Clément Goubert):

[operations/deployment-charts@master] mediawiki: Autocompute requests and limits for all

https://gerrit.wikimedia.org/r/951051

Change 951052 had a related patch set uploaded (by Clément Goubert; author: Clément Goubert):

[operations/deployment-charts@master] mw-api-int: autocompute memory limit

https://gerrit.wikimedia.org/r/951052

Change 951045 merged by jenkins-bot:

[operations/deployment-charts@master] mediawiki: Allow autocomputing the memory limit

https://gerrit.wikimedia.org/r/951045

Mentioned in SAL (#wikimedia-operations) [2023-08-21T10:59:49Z] <claime> Deploying memory limit autocompute for mw-on-k8s - T342748

Change 951052 merged by jenkins-bot:

[operations/deployment-charts@master] mw-api-int: autocompute memory limit

https://gerrit.wikimedia.org/r/951052

Mentioned in SAL (#wikimedia-operations) [2023-08-21T11:02:07Z] <claime> Enabling memory limit autocompute for mw-api-int - T342748

Change 951051 merged by jenkins-bot:

[operations/deployment-charts@master] mediawiki: Autocompute requests and limits for all

https://gerrit.wikimedia.org/r/951051

Mentioned in SAL (#wikimedia-operations) [2023-08-21T13:42:48Z] <claime> Enabling memory limit autocompute for all mw-on-k8s deployments - T342748

Change 951125 had a related patch set uploaded (by Clément Goubert; author: Clément Goubert):

[operations/deployment-charts@master] mw-misc: Enforce fixed requests and limits

https://gerrit.wikimedia.org/r/951125

Change 951125 merged by jenkins-bot:

[operations/deployment-charts@master] mw-misc: Enforce fixed requests and limits

https://gerrit.wikimedia.org/r/951125

Mentioned in SAL (#wikimedia-operations) [2023-08-21T13:55:16Z] <claime> Re-enforcing limits and requests for mw-misc - T342748

All deployments of mw-on-k8s are now using:

  • Autocomputed CPU requests, no limits
  • Autocomputed Memory requests and limits

The only exception is mw-misc, which is on fixed requests and limits.
For future reference, without having to go check the chart, the resource computation is as follows (a code sketch follows the list):

  • Requests:
      • CPU: cpu_per_worker (float, unit: CPU, e.g. 0.5 is half a CPU per worker) multiplied by the number of configured workers + 1 (to take the main php-fpm process into account), with a minimum of 1 whole CPU
      • RAM: 50% of memory_per_worker multiplied by the number of workers (ignoring the main php-fpm process), plus 50% of the opcache size and the apc size (close to the average real consumption)
  • Limits:
      • CPU: none
      • RAM: memory_per_worker multiplied by the number of workers (ignoring the main php-fpm process), plus 50% of the opcache size and the apc size (close to the average real consumption)
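
A Python sketch of that computation, assuming the 50% applies to both the opcache and apc sizes; the parameter names mirror the chart values but are otherwise illustrative:

import math

def mediawiki_resources(workers: int,
                        cpu_per_worker: float,      # CPUs per FPM worker, e.g. 0.5
                        memory_per_worker_mi: int,  # MiB per FPM worker
                        opcache_mi: int,
                        apc_mi: int) -> dict:
    """Sketch of the autocomputed requests/limits described above."""
    # CPU request: cpu_per_worker times (workers + 1) to account for the
    # php-fpm master process, with a floor of 1 whole CPU. No CPU limit.
    cpu_request = max(1.0, cpu_per_worker * (workers + 1))
    # Memory request: 50% of per-worker memory times workers, plus 50% of
    # the opcache and apc allocations (close to average real consumption).
    mem_request_mi = 0.5 * memory_per_worker_mi * workers + 0.5 * (opcache_mi + apc_mi)
    # Memory limit: full per-worker memory times workers, plus the same
    # 50% of opcache + apc.
    mem_limit_mi = memory_per_worker_mi * workers + 0.5 * (opcache_mi + apc_mi)
    return {
        "requests": {"cpu": cpu_request, "memory_mi": math.ceil(mem_request_mi)},
        "limits": {"memory_mi": math.ceil(mem_limit_mi)},  # no CPU limit set
    }

# Hypothetical example: 60 workers at 0.5 CPU and 300 MiB each,
# 500 MiB opcache, 400 MiB apc.
print(mediawiki_resources(60, 0.5, 300, 500, 400))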

Everything is looking OK. We will see how it copes with doubling the incoming traffic as part of T341780: Direct 5% of all traffic to mw-on-k8s (only going to 2% for now), and resolve this task afterwards if everything stays OK.