For example, browser tests can be much faster if we change their caching from CACHE_DB to CACHE_ACCEL or 'hash'.
Mentioned In
- T225730: Reduce runtime of MW shared gate Jenkins jobs to 5 min
- T225248: Consider moving browser based tests (Selenium and QUnit) to a non-voting pipeline
Mentioned Here
- T160519: Jenkins Browser tests for Wikibase/Popups etc are failing: Invalid CSRF token in Selenium browser
- rMW35c725e157e5: Properly detect if CACHE_ACCEL is available in the installer
- rMW1081356412d4: Move l10n_cache table to a separate DB for sqlite via the installer
- rMWdb784ada9f9d: Set synchronous = NORMAL for cache tables in Sqlite installer
- T196347: Quibble may need to rebuild localization cache before running tests
- T225218: Consider httpd for quibble instead of php built-in server
The number of DBQuery entries in the debug log drops by about 1,000:
amsa@amsa-Latitude-7480:~/Downloads$ grep DBQuery mw-debug-www.log | wc -l
24814
amsa@amsa-Latitude-7480:~/Downloads$ grep DBQuery mw-debug-www.log.1 | wc -l
23775
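The comparison above can be scripted so the difference does not have to be worked out by hand. This is a minimal sketch: `count_dbqueries` and `dbquery_delta` are hypothetical helper names, not part of MediaWiki or Quibble, and the sketch assumes the first log is the "before" run and the second the "after" run.

```shell
#!/bin/sh
# Count the lines that mention DBQuery in one debug log.
count_dbqueries() {
    # grep -c prints the number of matching lines. It exits non-zero
    # when there are no matches, hence the "|| true".
    grep -c 'DBQuery' "$1" || true
}

# Print how many fewer DBQuery lines the second log has than the first.
dbquery_delta() {
    before=$(count_dbqueries "$1")
    after=$(count_dbqueries "$2")
    echo $((before - after))
}
```

Called as `dbquery_delta mw-debug-www.log mw-debug-www.log.1`, and given the two counts quoted above (24814 and 23775), it would print 1039.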
I am not quite sure how MediaWiki selects the caches it is going to use. A few notes:
- For the PHPUnit test suites, we might be fine just using a per-process hash.
- Browser tests run their queries in parallel, and when the cache is backed by SQLite (at least), there are lock contention issues. I previously hit this with the localization cache, which I fixed by having Quibble build the cache before proceeding with tests (T196347).
It would be nice to check which caches are currently being detected for each PHPUnit test suite and for the browser tests. The selected backends should be findable in the MediaWiki debug log files attached to each build.
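One way to do that check is to pull the relevant lines out of a downloaded debug log. The sketch below assumes the log reports the selected backends on lines tagged `[caches]`. That marker is an assumption about the log format, so it is a parameter that can be overridden, and `show_cache_backends` is a hypothetical helper name.

```shell
#!/bin/sh
# Print the distinct debug-log lines that report the selected cache backends.
#   $1: path to a MediaWiki debug log
#   $2: marker to search for (assumed default: "[caches]")
show_cache_backends() {
    marker=${2:-"[caches]"}
    # -F treats the marker as a fixed string, so the brackets are literal.
    grep -F -- "$marker" "$1" | sort -u
}
```

Running it against the debug log of a PHPUnit job and of a browser-test job would show whether the two jobs ended up with the same backends.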
To speed up browser tests, Kosta found out that the PHP built-in server (php -S) is dramatically slower than Apache, I guess because it is single-threaded. See T225218: Consider httpd for quibble instead of php built-in server.
This seems to have given a ~20% speed improvement. Picking two patches merged into MW-core either side of this change (with no change in the number or nature of tests), the before durations are 162 and 187 seconds, and the after durations are 135 and 156 seconds. Of course, the durations bump around based on CI server load, but 30 seconds saved is 30 seconds saved.
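The size of the saving can be worked out from the four durations quoted above, taking the slower pair (162 and 187 seconds) against the faster pair (135 and 156 seconds). This is a minimal sketch and `percent_saved` is a hypothetical helper name.

```shell
#!/bin/sh
# Print the percentage of time saved when a run goes from $1 seconds
# (slower) down to $2 seconds (faster), to one decimal place.
percent_saved() {
    # LC_ALL=C keeps the decimal separator a dot regardless of locale.
    LC_ALL=C awk -v slow="$1" -v fast="$2" \
        'BEGIN { printf "%.1f\n", (slow - fast) / slow * 100 }'
}
```

`percent_saved 162 135` prints 16.7 and `percent_saved 187 156` prints 16.6, so the two sample pairs put the saving a little under the rough 20% figure, with the caveat about CI load noise still applying.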
MediaWiki does this for PHPUnit tests already.
- for browser tests, they do their queries in parallel and when the cache is backed up by sqlite, there are lock contention issues. [..]
What would be nice is to check which caches are being detected now [..]
For lightweight values that are not worth a Memc or DB round trip to fetch, we always use APC in MediaWiki. This is not configurable and there is no opt-in or opt-out. This is referred to as "LocalServerCache" and is separate from the (configurable) WANObjectCache (which wraps "MainCacheType").
The MainCacheType/WANObjectCache is "none" by default, not "db". So we were not using SqlBagOStuff in CI for general caching. That would likely have been slower than no caching at all, so it is good that we didn't do that. For the most part, then, lock contention shouldn't have been an issue during browser tests.
For other caches (like MessageCache, ParserCache and Session) the default is indeed "db", but these should see relatively little contention. Aaron has done a lot of work over the past year to improve sqlite performance, e.g. db784ada9f9d1 and 1081356412d, and these cache types have less contention in general. But, with or without contention, using the DB is still slow, and APC would be much faster.
MediaWiki doesn't randomly pick caches at run-time. This is mainly to avoid corrupted or unexpected changes in different cache tiers, hash rings, and to avoid missed purges if the environment shifts back and forth or differs between app servers for some reason. If the environment has changed, we currently require sysadmins to update LocalSettings to reflect these changes (and to ensure hard failure if such things are missing).
But during installation we auto-detect APC and use it as the default MainCacheType, with the recommendation to install Memc or Redis and configure that for even better performance. We added this in 2017 (35c725e157e53c, T160519), but it only applied to the web installer, not the CLI installer. This is now fixed with:
The first commit from Ladsgroup changed ParserCache, MessageCache, and Session from DB to APC.
The second commit (from me) enabled WANCache/MainCacheType by setting it to APC as well. Previously it was None: the first commit did set the variable, but the installer's generated LocalSettings overrode it back to None, which my second commit fixes.
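For a hand-rolled test environment, the combined effect of the two commits could be approximated by appending the equivalent settings to LocalSettings.php. This is a sketch of what that configuration might look like, not the content of either commit. `append_apc_cache_settings` is a hypothetical helper name, and the sketch assumes APC/APCu is available to PHP on the machine.

```shell
#!/bin/sh
# Append APC-backed cache settings to the LocalSettings.php given as $1.
# The quoted heredoc delimiter ('PHP') stops the shell from expanding
# the PHP variable names.
append_apc_cache_settings() {
    cat >> "$1" <<'PHP'

# Use the local-server cache (APC/APCu) instead of the database or no cache.
$wgMainCacheType = CACHE_ACCEL;
$wgParserCacheType = CACHE_ACCEL;
$wgMessageCacheType = CACHE_ACCEL;
$wgSessionCacheType = CACHE_ACCEL;
PHP
}
```

Because the lines are appended at the end of the file, they take effect after anything the installer generated earlier in the same file, which matters given the override problem described above.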
Looking at the graphs, I don't see a conclusive drop or change in either direction, but I think the anecdotal numbers in this task suffice to close it. Even if there had been no improvement, there aren't really any other major cache controls to enable, short of introducing HTMLFileCache/Varnish, and that's not likely to have much impact given we're not doing many repeat/anon page views, if at all.