Anecdata: sometimes I notice a long wait time for a catalyst/patchdemo wiki to load.
We should investigate and maybe gather some metrics here.
In the case of Catalyst wikis, from what I've seen so far the first load will be significantly slower until the content gets cached. Subsequent loads seem much faster.
This config is relevant: https://gitlab.wikimedia.org/repos/test-platform/catalyst/ci-charts/-/blob/aa337a9db7f6f7ecf85c8927c16f0e2de7945a16/mediawiki/templates/mw.yaml#L243
echo 'pm.max_children = 1' >>$POOL_CONFIG
echo 'pm.start_servers = 1' >>$POOL_CONFIG
echo 'pm.min_spare_servers = 1' >>$POOL_CONFIG
echo 'pm.max_spare_servers = 1' >>$POOL_CONFIG
echo 'php_admin_value[memory_limit] = 256M' >>$POOL_CONFIG
The envs are pretty responsive locally on my machine with these settings, but the situation in production isn't as good. We should play with these values to see if we can strike a good balance between responsiveness and memory use. Memory is our largest bottleneck here, so we should be careful not to increase its use significantly.
A good first attempt could be to increase pm.max_children so the fpm pool can spin up extra workers on demand and go down to a single worker while idle. Increasing pm.start_servers could make sense too.
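As a rough sketch of that first attempt, the pool could be switched to a `dynamic` configuration that bursts up to a few workers under load but idles back down to one. The numbers below are illustrative guesses, not measured values, and `$POOL_CONFIG` is pointed at a placeholder path for demonstration:

```shell
# Hypothetical dynamic-pool tuning; values are illustrative, not measured.
POOL_CONFIG=./www-dynamic.conf   # placeholder path for this sketch
echo 'pm = dynamic' >>$POOL_CONFIG
echo 'pm.max_children = 4' >>$POOL_CONFIG       # allow a small burst of extra workers
echo 'pm.start_servers = 1' >>$POOL_CONFIG      # still start with a single worker
echo 'pm.min_spare_servers = 1' >>$POOL_CONFIG  # scale back down to one when idle
echo 'pm.max_spare_servers = 2' >>$POOL_CONFIG  # keep at most one warm spare around
echo 'php_admin_value[memory_limit] = 256M' >>$POOL_CONFIG
cat $POOL_CONFIG
```

With `pm = dynamic`, fpm pre-forks `pm.start_servers` workers and grows the pool toward `pm.max_children` as requests queue up, so the trade-off is one idle worker's memory versus fork latency on the first concurrent requests.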
jnuche opened https://gitlab.wikimedia.org/repos/test-platform/catalyst/ci-charts/-/merge_requests/77
mw-web: log user agents in access.log again
jnuche merged https://gitlab.wikimedia.org/repos/test-platform/catalyst/ci-charts/-/merge_requests/77
mw-web: log user agents in access.log again
jnuche opened https://gitlab.wikimedia.org/repos/test-platform/catalyst/ci-charts/-/merge_requests/78
mw-web: try to block common web crawlers
jnuche merged https://gitlab.wikimedia.org/repos/test-platform/catalyst/ci-charts/-/merge_requests/78
mw-web: try to block common web crawlers
jnuche opened https://gitlab.wikimedia.org/repos/test-platform/catalyst/ci-charts/-/merge_requests/79
mw-fpm: use an "ondemand" worker pool configuration
jnuche merged https://gitlab.wikimedia.org/repos/test-platform/catalyst/ci-charts/-/merge_requests/79
mw-fpm: use an "ondemand" worker pool configuration
jnuche opened https://gitlab.wikimedia.org/repos/test-platform/catalyst/ci-charts/-/merge_requests/80
mw-fpm: set pool worker idle timeout to 10m
jnuche merged https://gitlab.wikimedia.org/repos/test-platform/catalyst/ci-charts/-/merge_requests/80
mw-fpm: set pool worker idle timeout to 10m
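For reference, the "ondemand" pool from the two MRs above might look roughly like the sketch below; the actual values landed in ci-charts may differ, `pm.max_children = 4` is an illustrative cap, and the config path is a placeholder. With `pm = ondemand`, no workers are pre-forked at all: they are spawned per request and exit after the idle timeout, trading a fork on the first request for near-zero idle memory.

```shell
# Sketch of an ondemand pool with a 10m worker idle timeout; illustrative only.
POOL_CONFIG=./www-ondemand.conf   # placeholder path for this sketch
echo 'pm = ondemand' >>$POOL_CONFIG
echo 'pm.max_children = 4' >>$POOL_CONFIG           # illustrative worker cap
echo 'pm.process_idle_timeout = 10m' >>$POOL_CONFIG # workers exit after 10 idle minutes
echo 'php_admin_value[memory_limit] = 256M' >>$POOL_CONFIG
cat $POOL_CONFIG
```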
Putting this in the backlog while we're noodling. We need to look into request profiling, i.e. isolating the network proxy, the k8s routing, and the pod spin-up, so we can focus on what is actually causing the slow loads reported anecdotally.
It seems like with @jnuche's work on this, CPU load is down and wikis are loading fast again <3
Calling this one closed.
jhuneidi opened https://gitlab.wikimedia.org/repos/test-platform/catalyst/patchdemo/-/merge_requests/235
Block bots and crawlers
jhuneidi merged https://gitlab.wikimedia.org/repos/test-platform/catalyst/patchdemo/-/merge_requests/235
Block bots and crawlers
dancy opened https://gitlab.wikimedia.org/repos/test-platform/catalyst/patchdemo/-/merge_requests/249
README.md: Use https URL to access local patchdemo server
dancy merged https://gitlab.wikimedia.org/repos/test-platform/catalyst/patchdemo/-/merge_requests/249
README.md: Use https URL to access local patchdemo server