We need to test what performance we get from mediawiki on kubernetes for different pod sizes (see T278220).
We can use our own mediawiki performance testing framework for this work.
There are several dimensions to this problem, so it will need proper testing; specifically, we need to find which combination of:
- Number of php workers per pod
- CPU and Memory limits of the pods
- Opcache/apcu size
- Socket or TCP proxying to fcgi
gives us the best latency and throughput.
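To get a feel for how many runs the dimensions above imply, the test matrix can be enumerated up front. The following is only a sketch: every value in it (worker counts, limits, cache sizes) is a placeholder chosen for illustration, not a proposed setting, and `build_matrix` is a hypothetical helper, not part of benchmw.

```python
from itertools import product

# All values below are illustrative placeholders, not proposed settings.
WORKERS_PER_POD = [4, 8, 16]
POD_LIMITS = [("2", "2Gi"), ("4", "4Gi")]  # (CPU limit, memory limit)
OPCACHE_MB = [256, 512]                    # apcu size could be added as a further dimension
FCGI_TRANSPORT = ["socket", "tcp"]


def build_matrix():
    """Return every combination of the four dimensions as a list of dicts."""
    return [
        {"workers": w, "cpu": cpu, "memory": mem, "opcache_mb": oc, "fcgi": t}
        for w, (cpu, mem), oc, t in product(
            WORKERS_PER_POD, POD_LIMITS, OPCACHE_MB, FCGI_TRANSPORT
        )
    ]


if __name__ == "__main__":
    matrix = build_matrix()
    print(f"{len(matrix)} configurations to benchmark")
    for cfg in matrix[:3]:
        print(cfg)
```

Even with only two or three values per dimension the number of configurations grows quickly, so it is probably worth pruning the lists before starting.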
In order to be able to compare results with production, we will need to reserve one kubernetes node for mediawiki only and run as many pods on it as we can given their size. We then run our testing framework https://gerrit.wikimedia.org/g/operations/software/benchmw against `mwdebug.discovery.wmnet` on the HTTP port (8444), and against one appserver in the active datacenter, prepared as follows:
- it has similar hardware (esp. cpu) to the kubernetes node
- it is depooled from traffic
- php-fpm has been restarted on it before starting the tests
Our goal is to get the same performance as the appserver (within reasonable limits) at all concurrencies. I would also suggest we get in touch with Performance to ask for other URLs they think we should benchmark.
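One way to make "within reasonable limits" concrete is to compare latencies per concurrency level against a relative tolerance. This is a minimal sketch under assumptions: the 10% default tolerance, the `regressions` helper and the input shape (a mapping of concurrency to mean latency in ms) are all hypothetical, and the numbers in the example are made up, not measurements.

```python
def regressions(appserver, k8s, tolerance=0.10):
    """Return {concurrency: relative_slowdown} for every concurrency level
    where the k8s latency exceeds the appserver latency by more than
    `tolerance` (0.10 means 10%). Levels missing from `k8s` are skipped."""
    out = {}
    for concurrency, base_ms in appserver.items():
        if concurrency not in k8s:
            continue
        slowdown = (k8s[concurrency] - base_ms) / base_ms
        if slowdown > tolerance:
            out[concurrency] = slowdown
    return out


if __name__ == "__main__":
    # Made-up latencies in ms, keyed by concurrency; not real results.
    appserver = {1: 100.0, 10: 120.0, 30: 200.0}
    k8s = {1: 105.0, 10: 150.0, 30: 210.0}
    for concurrency, slowdown in regressions(appserver, k8s).items():
        print(f"concurrency {concurrency}: {slowdown:.0%} slower on k8s")
```

The same check could be applied to throughput with the sign of the comparison reversed.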
**Single request profiling**
Another thing we should check is what is faster and what is slower on kubernetes; one way to test this is the following:
- Disable the timer doing the automatic deployments to mediawiki on deploy1002 for the duration of the test
- Deploy mediawiki to k8s using the latest mediawiki-multiversion-debug image https://docker-registry.wikimedia.org/restricted/mediawiki-multiversion-debug/tags/ which includes tideways and is therefore able to send profiling data to xhgui
- Run a profiling request on k8s and on one mwdebug server at the same time, after warming up the cache on both (basically request the same page twice without profiling, then grab a profile)
- Check for big differences in the results, i.e. functions that take much longer to run on k8s
Again, for this test we can use the set of URLs we use in benchmw, but we should also reach out to Performance to ask whether they see other things that should be tested.
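The last step of the profiling procedure (spotting functions that are much slower on k8s) could be partly automated once the two profiles are available as data. The sketch below assumes each profile has already been exported as a mapping of function name to inclusive wall time in ms; how to get that out of xhgui is not covered here. The `slower_on_k8s` helper, its thresholds and the function names in the example are hypothetical.

```python
def slower_on_k8s(mwdebug, k8s, min_ratio=1.5, min_ms=1.0):
    """Return (function, mwdebug_ms, k8s_ms, ratio) tuples for functions whose
    wall time on k8s is at least `min_ratio` times the mwdebug one, sorted by
    ratio, largest first. Functions taking less than `min_ms` on k8s, or
    missing from the mwdebug profile, are ignored to reduce noise."""
    rows = []
    for fn, k8s_ms in k8s.items():
        base_ms = mwdebug.get(fn)
        if base_ms is None or base_ms <= 0 or k8s_ms < min_ms:
            continue
        ratio = k8s_ms / base_ms
        if ratio >= min_ratio:
            rows.append((fn, base_ms, k8s_ms, ratio))
    return sorted(rows, key=lambda row: row[3], reverse=True)


if __name__ == "__main__":
    # Placeholder profiles (function name -> wall time in ms); not real data.
    mwdebug_profile = {"func_a": 10.0, "func_b": 4.0, "func_c": 20.0}
    k8s_profile = {"func_a": 30.0, "func_b": 4.4, "func_c": 40.0}
    for fn, base_ms, k8s_ms, ratio in slower_on_k8s(mwdebug_profile, k8s_profile):
        print(f"{fn}: {base_ms:.1f} ms -> {k8s_ms:.1f} ms ({ratio:.1f}x)")
```

A single pair of profiles is noisy, so any function flagged this way is a candidate for a closer look rather than a confirmed regression.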