Benchmark performance of MediaWiki on k8s
Closed, Resolved (Public)

Assigned To

Authored By

	Joe
	Apr 19 2021, 9:44 AM

Description

We need to test, given different pod sizes (see T278220), what performance we get from MediaWiki on Kubernetes.

We can use our own mediawiki performance testing framework for this work.

There are several dimensions to this problem so it will need proper testing; specifically we need to test which combination of:

N. of php workers per pod
CPU and Memory limits of the pods
- Round 1: 4000-5000m CPU, 1000-2000Mi mem
Opcache/apcu size
Socket or TCP proxying to fcgi
- Will be tested later on as we expect it to scrape a few ms from a single request.

Notes:

Opcache/apcu sizes will be determined when we have a wider list of URLs to test, e.g. replayed reads from production
Regardless of other factors, in production we will probably always observe some added latency on k8s compared to baremetal. On baremetal, a single request can warm up e.g. 96 workers on a server for a codepath, while on k8s we will need multiple requests hitting different pods to achieve the same result. Pods are also more ephemeral, and we will have pods being terminated and spawned constantly.
On average, concurrency is ~12-20 per server https://grafana.wikimedia.org/goto/7k4VvbS7z
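
To give a rough sense of the warm-up cost described in the note above, here is a minimal sketch. It assumes, purely for illustration, that each request lands on a pod chosen uniformly at random, which is not necessarily how the real load balancing behaves. The function name is hypothetical.

```python
from fractions import Fraction


def expected_requests_to_touch_all_pods(pods: int) -> float:
    """Expected number of requests until every pod has served at least one.

    Assumption (not from the task): requests are spread uniformly at random
    across pods, so this is the coupon-collector estimate n * H(n).
    """
    if pods < 1:
        raise ValueError("pods must be >= 1")
    harmonic = sum(Fraction(1, k) for k in range(1, pods + 1))
    return float(pods * harmonic)


# A single server (one shared cache) vs. 12 pods as used in Round 1.
print(expected_requests_to_touch_all_pods(1))
print(round(expected_requests_to_touch_all_pods(12), 1))
```

Under that assumption, warming a codepath on all 12 pods takes a few dozen requests on average, compared to a single request on one baremetal server.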

Our goal is to match the performance of the appservers (within reasonable limits) at all concurrencies. I would also suggest we get in touch with Performance to ask for other URLs they think we should benchmark.

Single request profiling

Another thing we should check is what is faster and what is slower on kubernetes; one way to test this is the following:

Disable the timer doing the automatic deployments to mediawiki on deploy1002 for the duration of the test
Deploy mediawiki to k8s using the latest mediawiki-multiversion-debug image https://docker-registry.wikimedia.org/restricted/mediawiki-multiversion-debug/tags/, which includes tideways and is therefore able to send profiling data to xhgui
Run a profiling request on k8s and on one mwdebug server at the same time, after some warmup of the cache on both (basically request the same page twice without profiling, then grab a profile)
Check for big differences in the results, i.e. functions that take much more time to run on k8s
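
The warm-up-then-profile sequence above can be sketched as a simple request plan. This is only an illustration: the function name, the target labels, and the dictionary layout are hypothetical. How a profiled request is actually triggered (headers, tooling) is deliberately left out because the task does not specify it.

```python
from typing import Dict, List


def profiling_plan(url: str, targets: List[str], warmups: int = 2) -> List[Dict[str, object]]:
    """Return the ordered requests for one URL.

    For every target: `warmups` unprofiled requests to warm the caches,
    followed by a single profiled request.
    """
    plan: List[Dict[str, object]] = []
    for target in targets:
        for _ in range(warmups):
            plan.append({"target": target, "url": url, "profile": False})
        plan.append({"target": target, "url": url, "profile": True})
    return plan


# "k8s" and "mwdebug" are placeholder labels for the two environments.
for step in profiling_plan("/wiki/Main_Page", ["k8s", "mwdebug"]):
    print(step)
```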

Again for this test we can use the set of URLs we use in mwbench, but we should also reach out to performance to ask them if they see other stuff that should be tested.

Methodology and Testing

Original plan:
In order to be able to compare results with production, we will need to reserve one kubernetes node for mediawiki only, run as many pods on it as we can given their size, then run our testing framework https://gerrit.wikimedia.org/g/operations/software/benchmw against mwdebug.discovery.wmnet on the HTTP port (8444), and against one appserver in the active datacenter, prepared as follows:

should have similar hardware (esp. cpu) to the kubernetes node
depooled from traffic
php-fpm has been restarted before starting the tests

Round 1: mw2254 vs k8s, TLS enabled, 12 pods x 8 workers

Tideways-xhprof: it generally increases request times, and we already know that mediawiki's performance on VMs is poorer. So, in order to get a more informative profile, we selected a random production appserver and temporarily installed tideways-xhprof on it. Additionally, we ran ab tests with tideways both enabled and disabled.

mw2254 re-parsing profile: https://performance.wikimedia.org/xhgui/run/view?id=613b33ac1d2b2a8ba4f624bf
k8s re-parsing profile: https://performance.wikimedia.org/xhgui/run/view?id=613b33ea1e630124211e665e

Hardware: Given that mw2254 is a 2016 server, whereas almost all kubernetes nodes in codfw were purchased after 2017, we didn't put any limitations on where a pod can be spawned (except for requiring a server with an SSD disk, T288345). We didn't observe any kubernetes node being under pressure, so we saw no need to assign mediawiki to specific nodes.

TLS: The caching layer talks to our app layer via TLS; for that reason, we chose to use TLS at least for this round of tests.

Pods specs:

96 workers (12 pods x 8 workers, mw2254 has 96 workers configured as well)
PHP: opcache.size=500M, opcache.interned_strings_buffer=50M, apc.size=400M
CPU: 4000-5000m, MEM: 1000-2000Mi

Fixes:

There was added latency due to mediawiki failing to reach etcd via its ipv6 addresses 719278
Kubernetes' p99s were unreasonably high, due to instantaneously hitting max CPU 720324
The interned strings buffer needed to be at least 12MB, while we were using the default of 10MB 720188
Increased opcache max_accelerated_files (nofiles)
Increased APCu size to 768M
Increased app container memory limit

Extra tests:

~~Run a benchmark with 6 pods (=48 workers), and observe if higher concurrency per pod (thus, per master process), affects the results for concurrencies c=10, c=15, c=20.~~
Get a profile for each URL we are testing both for k8s and baremetal if needed
Parsoid URLs testing:
- http://en.wikipedia.org/w/rest.php/en.wikipedia.org/v3/page/html/Hospet/1043074958 takes between 1 - 1.2s
- http://it.wikipedia.org/w/rest.php/it.wikipedia.org/v3/page/html/Luna/122769677 takes between 4.7 - 4.9s
- http://en.wikipedia.org/w/rest.php/en.wikipedia.org/v3/page/html/Minneapolis/1044317827 takes between 9 - 9.5s
- http://en.wikipedia.org/w/rest.php/en.wikipedia.org/v3/page/html/Joe_Biden/1045585041 takes 14 - 14.5s

Conclusions:

For most workloads, at low concurrencies (c < 20), baremetal performs marginally better than k8s, to the point where we can consider the difference negligible T280497#7351513
At higher concurrencies (c > 20), kubernetes outperforms baremetal, sometimes by far
Having ~10 workers per pod will be sufficient

Round 2: replay read requests

We will run a large number of URLs (~700k or less) at the same concurrencies as in Round 1, and compare p20, p50, p75, p95, p99 for successful requests. After running the tests multiple times, we made the following adjustments:

Improvements

Envoy: more CPU and memory added
PHP: Increased max_accelerated_files and APCu size
main_app: more memory
mcrouter: reduced CPU
bumped namespace's limit as our pod size overall increased

Notes

During kubernetes deployments (i.e. every time there was a scap run), we were getting connection errors and timeout errors, which was expected
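
The percentile comparison described for Round 2 can be sketched as follows. This is a minimal illustration, not the code used for the benchmarks: it assumes "successful" means an HTTP 2xx status and uses the nearest-rank percentile definition. The function names and the input layout are hypothetical.

```python
import math
from typing import Dict, Iterable, Sequence, Tuple

PERCENTILES = (20, 50, 75, 95, 99)


def nearest_rank(sorted_values: Sequence[float], pct: int) -> float:
    """Nearest-rank percentile of an already sorted, non-empty sequence."""
    rank = max(1, math.ceil(pct * len(sorted_values) / 100))
    return sorted_values[rank - 1]


def latency_percentiles(samples: Iterable[Tuple[int, float]]) -> Dict[str, float]:
    """samples: (http_status, latency_ms) pairs; only 2xx responses are kept."""
    ok = sorted(latency for status, latency in samples if 200 <= status < 300)
    if not ok:
        raise ValueError("no successful requests")
    return {f"p{p}": nearest_rank(ok, p) for p in PERCENTILES}


# Synthetic data: 100 successful requests plus one failure that is ignored.
samples = [(200, float(ms)) for ms in range(1, 101)] + [(503, 10000.0)]
print(latency_percentiles(samples))
```

Running the same function on the baremetal and the k8s samples gives two dictionaries that can be compared percentile by percentile.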

Details

Subject	Repo	Branch	Lines +/-
mwdebug: Bump opcache max accelerated files	operations/deployment-charts	master	+1 -1
mwdebug: tune memory, cpu and apcu size	operations/deployment-charts	master	+14 -23
Bump namespace limits	operations/deployment-charts	master	+3 -3
mwdebug: bump envoy memory and cpu	operations/deployment-charts	master	+18 -0
mwdebug: bump max_accelerated_files	operations/deployment-charts	master	+1 -1
mwdebug: round 1 experiment, use 6 pods instead of 12	operations/deployment-charts	master	+16 -0
Fix 'load' title, add 'rl_startup', add 'parse_light'	operations/software/benchmw	master	+16 -5
mwdebug: bump opcache and interned string buffer	operations/deployment-charts	master	+2 -1
mediawiki: add interned_strings_buffer	operations/deployment-charts	master	+7 -4
mwdebug: increase number of replicas for benchmarking	operations/deployment-charts	master	+1 -1
mwdebug: tune up a bit the codfw deployment	operations/deployment-charts	master	+11 -0
mwdebug: bump opcache size and n. of files	operations/deployment-charts	master	+53 -55

Related Objects

Status	Assigned	Task
Stalled	None	T255792 Quibble runs core:unit tests twice!
Open	None	T328919 Upgrade to PHPUnit 10
Open	None	T338103 Micro-optimize ApiResult::isMetadataKey with str_starts_with once we support PHP8+
Open	None	T328921 Drop PHP 7.4 support from MediaWiki
Stalled	None	T334726 Use return type `never` in Wikibase
Open	None	T328922 Drop PHP 8.0 support from MediaWiki
Stalled	None	T319055 Upgrade to psr/container 2.x
Stalled	Krinkle	T319432 Migrate WMF production from PHP 7.4 to PHP 8.1
Open	None	T291916 Tracking task for Bullseye migrations in production
Stalled	None	T356293 Migrate MW appservers' base images to bullseye
Open	None	T290536 Serve production traffic via Kubernetes
Resolved	jijiki	T280497 Benchmark performance of MediaWiki on k8s
Resolved	jijiki	T290485 mediawiki-debug image does not produce profiling info
Resolved	aaron	T293630 Investigate performance degradation at high concurrencies in php-fpm

Event Timeline


MoritzMuehlenhoff subscribed.Apr 19 2021, 9:47 AM

• wkandek subscribed.Apr 23 2021, 2:08 AM

JMeybohm subscribed.Apr 23 2021, 1:40 PM

First tests with the staging version of the mwdebug deployment, and I get the following non-encouraging timings (in ms, approximated from multiple runs):

page	k8s staging	mwdebug	appserver
enwiki:Barack_Obama	1000	350	330

This is a 3x slowdown, probably due to some very basic error.

@Joe appservers are running onhost memcached, which can be a factor for this specific test: https://phabricator.wikimedia.org/T263958#6510350, but it can't be the only factor ofc.

@jijiki it would not justify such a huge performance shift, by any measure. I am even veering towards disabling onhost memcached, given the latest discoveries of bad interactions with wancache. But we didn't halve our latency just by adding the onhost layer - actually I don't think it significantly reduced latency, which wasn't our goal anyways.

I created a first basic dashboard for the mwdebug deployment and I noticed what the major issue was immediately: I dedicated just 2k maximum opcache scripts, which bottomed out even just serving the enwiki main page. Boosting that up to 4k already got our mwdebug deployment almost up to par with mwdebug in terms of performance: it now takes ~ 400 ms to render the page on mw on kubernetes, and about 380 on mwdebug1001 (we've switched over the datacenter, which might explain why eqiad mediawikis might perform slightly worse than before - virtually *any* cache is cold).

Is this the dashboard? https://grafana.wikimedia.org/d/U7JT--knk/joe-k8s-mwdebug?viewPanel=70&orgId=1&from=1625227688488&to=1625246654342

In T280497#7194633, @wkandek wrote:

Is this the dashboard? https://grafana.wikimedia.org/d/U7JT--knk/joe-k8s-mwdebug?viewPanel=70&orgId=1&from=1625227688488&to=1625246654342

yes :)

Change 703435 had a related patch set uploaded (by Giuseppe Lavagetto; author: Giuseppe Lavagetto):

[operations/deployment-charts@master] mwdebug: bump opcache size and n. of files

https://gerrit.wikimedia.org/r/703435

gerritbot added a project: Patch-For-Review.Jul 6 2021, 1:32 PM

Change 703435 merged by Giuseppe Lavagetto:

[operations/deployment-charts@master] mwdebug: bump opcache size and n. of files

https://gerrit.wikimedia.org/r/703435

Maintenance_bot removed a project: Patch-For-Review.Jul 6 2021, 2:10 PM

In order to run some tests on the mwdebug deployment in codfw, I first exposed http on port 8444 and then ran:

ab -n 1000 -c 2  -H 'X-Forwarded-Proto: https' -H 'Host: en.wikipedia.org' http://kubernetes2017.codfw.wmnet:8444/wiki/Main_Page

varying the concurrency of requests to understand what is the limit for avoiding throttling.

I found that at our current pod size, we can respond to three concurrent requests without throttling. Bumping up the limits to 3 CPUs we can get over 5 or 6 concurrent requests with no throttling.
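
The concurrency sweep described above can be scripted. Below is a minimal sketch that only builds the ab command lines, reusing the flags and the URL from the command quoted earlier. The helper name and the chosen range of concurrencies are assumptions made for illustration.

```python
import shlex
from typing import List

DEFAULT_URL = "http://kubernetes2017.codfw.wmnet:8444/wiki/Main_Page"


def ab_command(concurrency: int, requests: int = 1000, url: str = DEFAULT_URL) -> List[str]:
    """Build the ab invocation used above for a given concurrency."""
    return [
        "ab",
        "-n", str(requests),
        "-c", str(concurrency),
        "-H", "X-Forwarded-Proto: https",
        "-H", "Host: en.wikipedia.org",
        url,
    ]


# Illustrative sweep; the exact concurrencies tested are not listed in the task.
for c in (1, 2, 3, 4, 5, 6):
    print(shlex.join(ab_command(c)))
```

Each printed line can be run as is, or the list can be passed to `subprocess.run` while watching the throttling metrics of the pod.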

Please note: for now I'm running a single replica of the pod. It might make sense to reserve a whole k8s node just to mwdebug for such tests, and test combinations of resources for the single pod vs number of pods, and compare how we get to a sweet spot where the service on k8s can sustain similar concurrencies to what we can get on a physical server, or better.

Change 703834 had a related patch set uploaded (by Giuseppe Lavagetto; author: Giuseppe Lavagetto):

[operations/deployment-charts@master] mwdebug: tune up a bit the codfw deployment

https://gerrit.wikimedia.org/r/703834

gerritbot added a project: Patch-For-Review.Jul 9 2021, 10:40 AM

Change 703834 merged by jenkins-bot:

[operations/deployment-charts@master] mwdebug: tune up a bit the codfw deployment

https://gerrit.wikimedia.org/r/703834

Maintenance_bot removed a project: Patch-For-Review.Jul 9 2021, 2:10 PM

Joe updated the task description. (Show Details)Aug 6 2021, 9:37 AM

jijiki claimed this task.Aug 11 2021, 10:05 AM

Mentioned in SAL (#wikimedia-operations) [2021-09-01T05:25:03Z] <effie> depool mw2251 mw2255 parse2001 for tests - T280497

Change 715970 had a related patch set uploaded (by Effie Mouzeli; author: Effie Mouzeli):

[operations/deployment-charts@master] mwdebug: increase number of replicas for benchmarking

https://gerrit.wikimedia.org/r/715970

gerritbot added a project: Patch-For-Review.Sep 1 2021, 2:33 PM

Change 715970 merged by jenkins-bot:

[operations/deployment-charts@master] mwdebug: increase number of replicas for benchmarking

https://gerrit.wikimedia.org/r/715970

Maintenance_bot removed a project: Patch-For-Review.Sep 1 2021, 3:10 PM

With @jijiki we went ahead and created some percentile comparisons between mw2254 and pinkunicorn. We chose to have the exact same number of php-fpm workers (96) as an invariant for this first test. Results are at https://people.wikimedia.org/~akosiaris/baremetal-k8s/

The TL;DR for now is that k8s is slower in most cases right now, which means we got work to do.

jijiki added a subtask: T290485: mediawiki-debug image does not produce profiling info .Sep 7 2021, 4:14 PM

Our initial benchmarks, which @akosiaris shared above, showed that k8s was slower than baremetal, while at higher concurrencies the difference between the two was smaller. We have observed our baremetal servers underperforming at higher concurrencies (probably due to locking) in the past, so this is expected. After fixing profiling in k8s (T290485), we found that the function MultiHttpClient::runMultiCurl was taking over 200ms, and traced it down to EtcdConfig::fetchAllFromEtcdServer.

Luckily, our initial guess was right: when mw resolved _etcd._tcp.eqiad.wmnet, it would first try to connect to the IPv6 addresses of our etcd servers, and that would fail, since we had only allowed their IPv4 addresses in egress. After merging https://gerrit.wikimedia.org/r/c/operations/deployment-charts/+/719278, things looked brighter:

We are running another round of benchmarks.

jijiki closed subtask T290485: mediawiki-debug image does not produce profiling info as Resolved.Sep 7 2021, 6:57 PM

Krinkle added a project: Performance-Team (Radar).Sep 8 2021, 7:54 PM

Change 719608 had a related patch set uploaded (by Krinkle; author: Krinkle):

[operations/software/benchmw@master] Fix label of rl_css url, improve other labels, add rl_startup url

https://gerrit.wikimedia.org/r/719608

gerritbot added a project: Patch-For-Review.Sep 8 2021, 7:58 PM

Graphs with latency percentiles comparing baremetal against both the version with the IPv6 etcd egress rule fixed and the non-fixed version are at: https://people.wikimedia.org/~akosiaris/baremetal-k8s/ The fix definitely shaved off a considerable amount of ms from the higher percentiles (which was the expected outcome). It does look, though, like we have some more work to do to get on par with baremetal (hopefully low hanging fruit as well).

CDanis subscribed.Sep 9 2021, 1:59 PM

Change 720055 had a related patch set uploaded (by Krinkle; author: Krinkle):

[operations/software/benchmw@master] Add parse_light bench

https://gerrit.wikimedia.org/r/720055

Change 720188 had a related patch set uploaded (by Effie Mouzeli; author: Effie Mouzeli):

[operations/deployment-charts@master] mwdebug: bump opcache and interned string buffer

https://gerrit.wikimedia.org/r/720188

Change 720163 had a related patch set uploaded (by Effie Mouzeli; author: Effie Mouzeli):

[operations/deployment-charts@master] mediawiki: add interned_strings_buffer

https://gerrit.wikimedia.org/r/720163

Change 720163 merged by jenkins-bot:

[operations/deployment-charts@master] mediawiki: add interned_strings_buffer

https://gerrit.wikimedia.org/r/720163

Change 720188 merged by jenkins-bot:

[operations/deployment-charts@master] mwdebug: bump opcache and interned string buffer

https://gerrit.wikimedia.org/r/720188

jijiki updated the task description. (Show Details)Sep 12 2021, 8:00 AM

Change 719608 merged by Alexandros Kosiaris:

[operations/software/benchmw@master] Fix 'load' title, add 'rl_startup', add 'parse_light'

https://gerrit.wikimedia.org/r/719608

akosiaris mentioned this in rOSBEcf389424f5ea: Fix 'load' title, add 'rl_startup', add 'parse_light'.Sep 13 2021, 8:56 AM

jijiki updated the task description. (Show Details)Sep 13 2021, 12:12 PM

After the round 1 fixes, we ran another set of 10k requests with and without xhprof. Results can be found here: https://people.wikimedia.org/~jiji/benchmarks-round1-all/. We have got mixed results. As a general pattern, I will go out on a limb and say that at low concurrencies (c < 20) baremetal performs marginally better than or similar to kubernetes, while at higher concurrencies (c > 20), kubernetes performs better.

Workloads tested: heavy_page (barack obama), light_page (@Joe's favourite italian film page), main_page, re-parse (api re-parse of Australia), load (load of kk.wikipedia.org)

Comments:

At c=10, c=15, and c=20 baremetal outperforms kubernetes in all workloads, mostly marginally
In load.php tests, baremetal is clearly better. This will be further investigated using the profiler
In article-reparsing, kubernetes performs better for c > 15
The Y-axis can be deceiving, so read the graphs carefully

jijiki updated the task description. (Show Details)Sep 13 2021, 5:55 PM

Last round of urls, same configuration, with the addition of a couple more requests: gerrit: 720061, where we set y=0. We get a better idea of how marginal differences are at low concurrencies in most workloads:

(Phabricator's thumbnails seem to fail sadly, the images are there though)

Without TLS:

With TLS:

jijiki mentioned this in T290959: Phabricator failed to generate thumbnails for some 800-900KB files.Sep 14 2021, 10:44 AM

jijiki updated the task description. (Show Details)Sep 14 2021, 10:55 AM

jijiki updated the task description. (Show Details)Sep 14 2021, 1:58 PM

Change 721247 had a related patch set uploaded (by Effie Mouzeli; author: Effie Mouzeli):

[operations/deployment-charts@master] mwdebug: round 1 experiment, use 6 pods instead of 12

https://gerrit.wikimedia.org/r/721247

Change 721247 merged by jenkins-bot:

[operations/deployment-charts@master] mwdebug: round 1 experiment, use 6 pods instead of 12

https://gerrit.wikimedia.org/r/721247

Last set of benchmarks of Round 1, we added a run with 6 pods x 8 workers, no tideways installed:

https://people.wikimedia.org/~jiji/benchmarks-baremetal-vs-12pods-vs-6pods/

Joe moved this task from Backlog to In Progress on the MW-on-K8s board.Sep 20 2021, 8:27 AM

jijiki updated the task description. (Show Details)Sep 20 2021, 12:08 PM

@ssastry we have done some benchmarks, but none of those were parsoid urls, it would be great if you could provide a couple of parsoid URLs you'd like us to test

In T280497#7365189, @jijiki wrote:

@ssastry we have done some benchmarks, but none of those were parsoid urls, it would be great if you could provide a couple of parsoid URLs you'd like us to test

If @ssastry has no specific suggestion, I would suggest we look at parsoid access logs and find an url that gets rendered in 1 second in production, one that takes 10 seconds, and one that takes over 30 seconds.

Also, parsoid *might* need more memory and we might need to adapt mediawiki-config so that we can raise php's memory limit in k8s as well as on the "normal" parsoid cluster.

In T280497#7367757, @Joe wrote:

In T280497#7365189, @jijiki wrote:

@ssastry we have done some benchmarks, but none of those were parsoid urls, it would be great if you could provide a couple of parsoid URLs you'd like us to test

If @ssastry has no specific suggestion, I would suggest we look at parsoid access logs and find an url that gets rendered in 1 second in production, one that takes 10 seconds, and one that takes over 30 seconds.

This is a long ticket and I was trying to digest it before I could respond ... but, looks like you are looking for pages that take different lengths of time to render with Parsoid. I'll paste some sample urls later here today.

Times from scandium.eqiad.wmnet:

http://en.wikipedia.org/w/rest.php/en.wikipedia.org/v3/page/html/Hospet/1043074958 takes between 1 - 1.2s
http://it.wikipedia.org/w/rest.php/it.wikipedia.org/v3/page/html/Luna/122769677 takes between 4.7 - 4.9s
http://en.wikipedia.org/w/rest.php/en.wikipedia.org/v3/page/html/Minneapolis/1044317827 takes between 9 - 9.5s
http://en.wikipedia.org/w/rest.php/en.wikipedia.org/v3/page/html/Joe_Biden/1045585041 takes 14 - 14.5s

I don't have any page handy for 20 / 25 / 30s yet.

time curl -x scandium.eqiad.wmnet:80 'http://it.wikipedia.org/w/rest.php/it.wikipedia.org/v3/page/html/Luna/122769677' > /dev/null is how I measured times while logged onto scandium. You can get numbers for other servers by changing the arg to -x, but all eqiad servers are busy serving requests .. but I think scandium is roughly representative of production servers.
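
For repeated measurements, the same idea as the `time curl -x` command above can be sketched in Python. This is a hedged illustration only: the helper names, the number of runs, and the 120 second timeout are assumptions, and the proxy address simply mirrors the `-x` argument.

```python
import time
import urllib.request
from typing import Callable, Tuple


def make_proxied_fetch(url: str, proxy: str) -> Callable[[], bytes]:
    """Return a callable that fetches `url` through an HTTP proxy (like curl -x)."""
    opener = urllib.request.build_opener(urllib.request.ProxyHandler({"http": proxy}))

    def fetch() -> bytes:
        # The timeout is an arbitrary illustrative value.
        with opener.open(url, timeout=120) as response:
            return response.read()

    return fetch


def time_range(fetch: Callable[[], object], runs: int = 3) -> Tuple[float, float]:
    """Call `fetch` `runs` times and return (fastest, slowest) wall time in seconds."""
    durations = []
    for _ in range(runs):
        start = time.perf_counter()
        fetch()
        durations.append(time.perf_counter() - start)
    return min(durations), max(durations)
```

Usage would look like `time_range(make_proxied_fetch(url, "http://scandium.eqiad.wmnet:80"))`, which yields a "takes between X - Y s" style range for one URL.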

Thank you @ssastry, I updated the task descr to include them

@jijiki I guess you want to rebuild the php base image to include the patches to optimize DOM performance before running the tests

@Joe did so, thanks.

I ran an initial test with some thousands of production URLs. It appears that we are about to hit max_accelerated_files (currently 7963 x 12 pods = 95556). Looking at the same value on our production servers, 16229 is a possible value to set before moving forward. We will see if we need to bump opcache too.
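
The values 7963 and 16229 match how the PHP manual describes opcache.max_accelerated_files: the configured number is rounded up to the next entry in a fixed list of primes. A minimal sketch of that rounding and of the aggregate figure quoted above follows; the function name is hypothetical and the prime list is copied from the PHP documentation.

```python
# Prime table as listed in the PHP manual for opcache.max_accelerated_files.
OPCACHE_PRIMES = (
    223, 463, 983, 1979, 3907, 7963, 16229, 32531,
    65407, 130987, 262237, 524521, 1048793,
)


def effective_max_accelerated_files(configured: int) -> int:
    """Return the value opcache would actually use for a configured setting."""
    for prime in OPCACHE_PRIMES:
        if prime >= configured:
            return prime
    return OPCACHE_PRIMES[-1]


per_pod = effective_max_accelerated_files(7963)
print(per_pod, per_pod * 12)                   # per-pod limit and the 12-pod total
print(effective_max_accelerated_files(10000))  # next step up in the table
```

In practice this means that any configured value between 7964 and 16229 results in the same effective limit of 16229 keys per pod.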

opcache keys

Change 725500 had a related patch set uploaded (by Effie Mouzeli; author: Effie Mouzeli):

[operations/deployment-charts@master] mwdebug: bump max_accelerated_files

https://gerrit.wikimedia.org/r/725500

Change 725500 merged by jenkins-bot:

[operations/deployment-charts@master] mwdebug: bump max_accelerated_files

https://gerrit.wikimedia.org/r/725500

Change 726000 had a related patch set uploaded (by Effie Mouzeli; author: Effie Mouzeli):

[operations/deployment-charts@master] mwdebug: bump envoy memory and cpu

https://gerrit.wikimedia.org/r/726000

Change 726000 merged by jenkins-bot:

[operations/deployment-charts@master] mwdebug: bump envoy memory and cpu

https://gerrit.wikimedia.org/r/726000

Change 726580 had a related patch set uploaded (by Effie Mouzeli; author: Effie Mouzeli):

[operations/deployment-charts@master] mwdebug: bump namespace limits

https://gerrit.wikimedia.org/r/726580

Krinkle moved this task from Limbo to Watching on the Performance-Team (Radar) board.Oct 5 2021, 6:30 PM

Running some tests (c=60, ~1.9m URLs) against mwdebug services, we found 2 issues:

Our client was returning the error messages below from time to time. While most of them were produced while a pod was being killed (as its liveness probe failed), the one I am not sure why we see is the "[Errno 111] Connection refused" one.

https://www.wikidata.org/w/api.php?action=wbcheckconstraints&format=json&formatversion=2&uselang=de&id=Q916&status=violation%7Cwarning%7Csuggestion%7Cbad-parameters
 generated an exception: HTTPSConnectionPool(host='mwdebug.discovery.wmnet', port=4444): Read timed out. (read timeout=60)

https://en.wikipedia.org/wiki/The_Mountain_(Heartless_Bastards_album)
 generated an exception: HTTPSConnectionPool(host='mwdebug.discovery.wmnet', port=4444): Max retries exceeded with url: /wiki/The_Mountain_(Heartless_Bastards_album) (Caused by NewConnectionError('<urllib3.connection.HTTPSConnection object at 0x7fa89fc30240>: Failed to establish a new connection: [Errno 111] Connection refused'))

https://fa.wikipedia.org/w/api.php?action=query&prop=info&titles=%D8%B3%D8%A7%D9%85%D9%88%D8%A6%D9%84+%D8%AF%DA%A9%D8%B3%D8%AA%D8%B1&inprop=protection%7Ctalkid%7Cwatched%7Cwatchers%7Cvisitingwatchers%7Cnotificationtimestamp%7Csubjectid%7Curl%7Creadable%7Cpreload%7Cdisplaytitle&format=json&redirects=1
 generated an exception: ('Connection aborted.', OSError(107, 'Transport endpoint is not connected'))

https://www.wikidata.org/wiki/Special:EntityData/Q619835.json
 generated an exception: ('Connection aborted.', OSError(0, 'Error'))

https://he.wikipedia.org/w/index.php?title=MediaWiki:SearchEngines.js&action=raw&ctype=text/javascript
 generated an exception: HTTPSConnectionPool(host='mwdebug.discovery.wmnet', port=4444): Max retries exceeded with url: /w/index.php?title=MediaWiki%3ASearchEngines.js&action=raw&ctype=text%2Fjavascript (Caused by NewConnectionError('<urllib3.connection.HTTPSConnection object at 0x7fa8b51532b0>: Failed to establish a new connection: [Errno 111] Connection refused'))

APCu was hitting 100% fragmentation during the benchmark. This was a little bit unexpected. After chatting a bit with @TK-999 on IRC, I will bump up the APCu size a bit, and see what happens.
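
To see how often each kind of client error occurs, the exception messages quoted above can be tallied by type. The sketch below is a hypothetical helper that matches on substrings taken from those messages; the labels are made up for illustration.

```python
from collections import Counter
from typing import Iterable

# (label, substring) pairs, checked in order; substrings come from the log above.
ERROR_PATTERNS = (
    ("read_timeout", "Read timed out"),
    ("connection_refused", "[Errno 111] Connection refused"),
    ("not_connected", "Transport endpoint is not connected"),
    ("connection_aborted", "Connection aborted"),
)


def classify(message: str) -> str:
    """Map one exception message to a coarse error label."""
    for label, needle in ERROR_PATTERNS:
        if needle in message:
            return label
    return "other"


def tally(messages: Iterable[str]) -> Counter:
    """Count messages per error label."""
    return Counter(classify(message) for message in messages)


print(tally([
    "HTTPSConnectionPool(host='mwdebug.discovery.wmnet', port=4444): Read timed out. (read timeout=60)",
    "('Connection aborted.', OSError(107, 'Transport endpoint is not connected'))",
    "('Connection aborted.', OSError(0, 'Error'))",
]))
```

Correlating these counts with pod restarts would help separate errors caused by pods being killed from the unexplained connection refusals.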

jijiki added a parent task: T290536: Serve production traffic via Kubernetes.Oct 7 2021, 10:20 AM

Change 727279 had a related patch set uploaded (by Effie Mouzeli; author: Effie Mouzeli):

[operations/deployment-charts@master] mwdebug: tune memory, cpu and apcu size

https://gerrit.wikimedia.org/r/727279

Change 726580 merged by jenkins-bot:

[operations/deployment-charts@master] Bump namespace limits

https://gerrit.wikimedia.org/r/726580

Change 727279 merged by jenkins-bot:

[operations/deployment-charts@master] mwdebug: tune memory, cpu and apcu size

https://gerrit.wikimedia.org/r/727279

jijiki updated the task description. (Show Details)Oct 8 2021, 11:21 AM

Change 728384 had a related patch set uploaded (by Effie Mouzeli; author: Effie Mouzeli):

[operations/deployment-charts@master] mwdebug: Bump opcache max accelerated files

https://gerrit.wikimedia.org/r/728384

After the last tuning (APCu + memory limits), the results were more promising:

On the other hand, we seem to be hitting max accelerated files again, so we should bump it a bit

Change 728384 merged by jenkins-bot:

[operations/deployment-charts@master] mwdebug: Bump opcache max accelerated files

https://gerrit.wikimedia.org/r/728384

jijiki added a subtask: T293630: Investigate performance degradation at high concurrencies in php-fpm .Oct 18 2021, 2:09 PM

• dpifke subscribed.Oct 25 2021, 6:07 PM

jijiki updated the task description. (Show Details)Oct 26 2021, 7:08 AM

Parsoid testing, original images can be found at https://people.wikimedia.org/~jiji/benchmarks-parsoid/, our findings are similar to our previous tests. Baremetal performs better at low concurrencies, while k8s performs better at c=15 and up, while its >p90 is not always great.

I have some scripts in my home dir on mwdebug1001.eqiad.wmnet (apcu_stats_test.php and apcu_rw_test.php).

Here are some stats from a custom script on mwdebug (per key "collection").

{
    "overall": {
        "num_slots": 4099,
        "ttl": 0,
        "num_hits": 14798,
        "num_misses": 4805,
        "num_inserts": 12387,
        "num_entries": 3801,
        "expunges": 0,
        "start_time": 1635246669,
        "mem_size": 7731144,
        "memory_type": "mmap"
    },
    "breakdown": {
        "resourceloader-filter": {
            "keys": 146,
            "keys_with_ttl": 146,
            "ttl_ave": 86400,
            "bytes_ave": 17962.027397260274,
            "atime_ago_ave": 1360.972602739726,
            "mtime_ago_ave": 1362.2328767123288,
            "hit_qps_total": 0.036618293603589494,
            "hit_bps_total": 777.7990275373559,
            "%bytes": 34.66157574629706
        },
        "registration-main": {
            "keys": 3,
            "keys_with_ttl": 3,
            "ttl_ave": 86400,
            "bytes_ave": 757424,
            "atime_ago_ave": 909.3333333333334,
            "mtime_ago_ave": 18046,
            "hit_qps_total": 0.03921416066913052,
            "hit_bps_total": 40445.79902742657,
            "%bytes": 30.033117064381603
        },
        "FileContentsHasher": {
            "keys": 3475,
            "keys_with_ttl": 3475,
            "ttl_ave": 86400,
            "bytes_ave": 329.88546762589925,
            "atime_ago_ave": 1365.2152517985612,
            "mtime_ago_ave": 1366.2376978417267,
            "hit_qps_total": 1.4391886348659448,
            "hit_bps_total": 461.0408732459703,
            "%bytes": 15.151585643350787
        },
        "sites\/SiteList#2014-03-17+Site%3A2013-01-23": {
            "keys": 1,
            "keys_with_ttl": 1,
            "ttl_ave": 3600,
            "bytes_ave": 470632,
            "atime_ago_ave": 1366,
            "mtime_ago_ave": 1366,
            "hit_qps_total": 0,
            "hit_bps_total": 0,
            "%bytes": 6.220446297909776
        },
        "MessageCache": {
            "keys": 3,
            "keys_with_ttl": 0,
            "ttl_ave": 0,
            "bytes_ave": 77954.66666666667,
            "atime_ago_ave": 909.6666666666666,
            "mtime_ago_ave": 2475,
            "hit_qps_total": 0.02959335746220992,
            "hit_bps_total": 5216.006812859272,
            "%bytes": 3.0910317467030968
        },
        "resourceloader-less": {
            "keys": 63,
            "keys_with_ttl": 63,
            "ttl_ave": 86400,
            "bytes_ave": 2833.3968253968255,
            "atime_ago_ave": 1360.1587301587301,
            "mtime_ago_ave": 1363.6507936507937,
            "hit_qps_total": 0.0402448233462375,
            "hit_bps_total": 111.82946521105119,
            "%bytes": 2.3593264928056032
        },
        "wikibase-sites-module": {
            "keys": 1,
            "keys_with_ttl": 1,
            "ttl_ave": 3600,
            "bytes_ave": 176832,
            "atime_ago_ave": 1365,
            "mtime_ago_ave": 1365,
            "hit_qps_total": 0,
            "hit_bps_total": 0,
            "%bytes": 2.3372273023338437
        },
        "registration-lazy-attrib": {
            "keys": 9,
            "keys_with_ttl": 9,
            "ttl_ave": 86400,
            "bytes_ave": 10509.333333333334,
            "atime_ago_ave": 12484.777777777777,
            "mtime_ago_ave": 18046,
            "hit_qps_total": 0.0022952732931336315,
            "hit_bps_total": 1.1935421124294885,
            "%bytes": 1.2501374590794896
        },
        "messages-big": {
            "keys": 54,
            "keys_with_ttl": 54,
            "ttl_ave": 3600,
            "bytes_ave": 1573.1851851851852,
            "atime_ago_ave": 1361.7777777777778,
            "mtime_ago_ave": 1366.0925925925926,
            "hit_qps_total": 0.03949087785148151,
            "hit_bps_total": 82.16267095091457,
            "%bytes": 1.122829203921602
        },
        "gadgets-definition": {
            "keys": 1,
            "keys_with_ttl": 1,
            "ttl_ave": 498,
            "bytes_ave": 78504,
            "atime_ago_ave": 3,
            "mtime_ago_ave": 116,
            "hit_qps_total": 0.034482758620689655,
            "hit_bps_total": 2707.0344827586205,
            "%bytes": 1.0376045746381655
        },
        "tor-exit-nodes": {
            "keys": 1,
            "keys_with_ttl": 1,
            "ttl_ave": 3600,
            "bytes_ave": 58832,
            "atime_ago_ave": 1360,
            "mtime_ago_ave": 1360,
            "hit_qps_total": 0,
            "hit_bps_total": 0,
            "%bytes": 0.777595438896267
        },
        "bad-image-list": {
            "keys": 1,
            "keys_with_ttl": 1,
            "ttl_ave": 86400,
            "bytes_ave": 54256,
            "atime_ago_ave": 1360,
            "mtime_ago_ave": 1366,
            "hit_qps_total": 0.0014641288433382138,
            "hit_bps_total": 79.43777452415813,
            "%bytes": 0.717113443920925
        },
        "lightncandy-compiled": {
            "keys": 1,
            "keys_with_ttl": 1,
            "ttl_ave": 604800,
            "bytes_ave": 40392,
            "atime_ago_ave": 1359,
            "mtime_ago_ave": 1368,
            "hit_qps_total": 0.0014619883040935672,
            "hit_bps_total": 59.05263157894736,
            "%bytes": 0.5338699171861915
        },
        "Wikimedia\\Minify\\CSSMin": {
            "keys": 4,
            "keys_with_ttl": 4,
            "ttl_ave": 3600,
            "bytes_ave": 7568,
            "atime_ago_ave": 1364.75,
            "mtime_ago_ave": 1364.75,
            "hit_qps_total": 0,
            "hit_bps_total": 0,
            "%bytes": 0.4001116590676468
        },
        "EtcdConfig": {
            "keys": 1,
            "keys_with_ttl": 0,
            "ttl_ave": 0,
            "bytes_ave": 20120,
            "atime_ago_ave": 0,
            "mtime_ago_ave": 3,
            "hit_qps_total": 0.6666666666666666,
            "hit_bps_total": 13413.333333333332,
            "%bytes": 0.26593044993528847
        },
        "mysql-server-version": {
            "keys": 6,
            "keys_with_ttl": 6,
            "ttl_ave": 3600,
            "bytes_ave": 208,
            "atime_ago_ave": 290,
            "mtime_ago_ave": 1041.5,
            "hit_qps_total": 0.013862215944294405,
            "hit_bps_total": 2.8833409164132364,
            "%bytes": 0.016495089538729624
        },
        "rdbms-server-states": {
            "keys": 2,
            "keys_with_ttl": 2,
            "ttl_ave": 60,
            "bytes_ave": 524,
            "atime_ago_ave": 1,
            "mtime_ago_ave": 1,
            "hit_qps_total": 2,
            "hit_bps_total": 1168,
            "%bytes": 0.013851645702394748
        },
        "wikibase-client": {
            "keys": 1,
            "keys_with_ttl": 1,
            "ttl_ave": 86400,
            "bytes_ave": 384,
            "atime_ago_ave": 1366,
            "mtime_ago_ave": 1366,
            "hit_qps_total": 0,
            "hit_bps_total": 0,
            "%bytes": 0.005075412165762961
        },
        "cirrussearch-morelikethis-settings": {
            "keys": 1,
            "keys_with_ttl": 1,
            "ttl_ave": 600,
            "bytes_ave": 200,
            "atime_ago_ave": 1,
            "mtime_ago_ave": 272,
            "hit_qps_total": 0.01838235294117647,
            "hit_bps_total": 3.6764705882352944,
            "%bytes": 0.002643443836334876
        },
        "pygmentize-version": {
            "keys": 1,
            "keys_with_ttl": 1,
            "ttl_ave": 3600,
            "bytes_ave": 184,
            "atime_ago_ave": 1366,
            "mtime_ago_ave": 1366,
            "hit_qps_total": 0,
            "hit_bps_total": 0,
            "%bytes": 0.0024319683294280856
        }
    }
}
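To make a dump like the one above easier to read, the per-group stats can be ranked by their estimated memory footprint. The following is a minimal sketch, not the tool that produced the dump. It assumes the groups are available as a plain mapping of group name to stats; the function names and the `sample` subset are illustrative. In the numbers above, `keys * bytes_ave` moves in proportion to `%bytes`, so the sketch uses that product as the footprint estimate.

```python
from typing import Dict, List, Tuple

Stats = Dict[str, float]


def rank_groups_by_bytes(groups: Dict[str, Stats], top: int = 5) -> List[Tuple[str, float]]:
    """Rank key groups by estimated total bytes (keys * bytes_ave), largest first."""
    rows = [(name, s["keys"] * s["bytes_ave"]) for name, s in groups.items()]
    rows.sort(key=lambda row: row[1], reverse=True)
    return rows[:top]


def cold_groups(groups: Dict[str, Stats]) -> List[str]:
    """Return groups that hold keys but recorded no hits (hit_qps_total == 0)."""
    return sorted(name for name, s in groups.items() if s["hit_qps_total"] == 0)


# Small subset copied from the dump above, for illustration only.
sample = {
    "messages-big": {"keys": 54, "bytes_ave": 1573.1851851851852, "hit_qps_total": 0.03949087785148151},
    "gadgets-definition": {"keys": 1, "bytes_ave": 78504, "hit_qps_total": 0.034482758620689655},
    "tor-exit-nodes": {"keys": 1, "bytes_ave": 58832, "hit_qps_total": 0},
    "rdbms-server-states": {"keys": 2, "bytes_ave": 524, "hit_qps_total": 2},
}

ranked = rank_groups_by_bytes(sample)
cold = cold_groups(sample)

for name, total_bytes in ranked:
    print(f"{name}: ~{total_bytes:.0f} bytes")
print("no hits recorded:", cold)
```

Groups that are large but show no hits (such as `tor-exit-nodes` in this subset) are the obvious ones to check first when sizing APCu.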

Krinkle mentioned this in T293630: Investigate performance degradation at high concurrencies in php-fpm. Oct 27 2021, 5:07 PM

Results of the production URL testing (1,929,416 URLs) are at https://people.wikimedia.org/~akosiaris/prod_urls/. Findings for c=20, c=30, and c=40 are consistent with what we have seen so far.

percentiles_urls_piehit_c20.png (1×1 px, 18 KB)

percentiles_urls_piehit_c30.png (1×1 px, 18 KB)

percentiles_urls_piehit_c40.png (1×1 px, 20 KB)

During testing, there were a few 503s on both baremetal and kubernetes; our assumption is that some requests timed out. The share of requests that returned a 503 was:

concurrency	baremetal	kubernetes
20	0.0047%	0.0063%
30	0.0022%	0.0027%
40	0.0012%	0.0036%
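To put these percentages in perspective, they can be converted into approximate request counts. This is a rough sketch under one loudly stated assumption: that each concurrency run requested each of the 1,929,416 production URLs exactly once on each platform. The task does not confirm this, so the counts are order-of-magnitude estimates only. The names below are illustrative.

```python
# ASSUMPTION: every run issued one request per URL; not confirmed by the task.
TOTAL_REQUESTS = 1_929_416

# 503 rates in percent, copied from the table: concurrency -> (baremetal, kubernetes)
RATES_PERCENT = {
    20: (0.0047, 0.0063),
    30: (0.0022, 0.0027),
    40: (0.0012, 0.0036),
}


def estimated_count(rate_percent: float, total: int = TOTAL_REQUESTS) -> int:
    """Convert a percentage of requests into an approximate absolute count."""
    return round(total * rate_percent / 100)


def summarize(rates: dict) -> dict:
    """Return estimated 503 counts and the kubernetes/baremetal ratio per concurrency."""
    summary = {}
    for concurrency, (baremetal, kubernetes) in rates.items():
        summary[concurrency] = {
            "baremetal": estimated_count(baremetal),
            "kubernetes": estimated_count(kubernetes),
            "ratio": round(kubernetes / baremetal, 2),
        }
    return summary


summary = summarize(RATES_PERCENT)
for concurrency, row in summary.items():
    print(concurrency, row)
```

Under that assumption, each cell corresponds to somewhere between a couple of dozen and a bit over a hundred responses, so the differences between the platforms rest on small absolute numbers.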

Lastly, we observed read and connection errors on the kubernetes installation, which coincided with scap deployments.

ssastry mentioned this in T297259: Compare Parsoid perf on current production servers vs a newer test server. Dec 8 2021, 8:11 PM

aaron closed subtask T293630: Investigate performance degradation at high concurrencies in php-fpm as Resolved. Aug 16 2022, 10:03 PM

Krinkle mentioned this in T333269: Benchmark baremetal vs k8s mediawiki perf (2023). Mar 28 2023, 2:36 AM

