Use disk-based LCStore by default in MediaWiki 1.34
Open, NormalPublic

Description

The MediaWiki default is to store localisation cache in the database.

At WMF production, these have for many years been stored as files on disk instead, in a private cache directory. MediaWiki offers two well-tested formats for this: PHP array files and CDB files. CDB files are what WMF currently uses in production. PHP array files are what we're moving toward (T99740); they are faster to generate and faster to read than CDB files.

For third parties, this means:

  • More scalable by default. We'd no longer require db-master connections on web requests when repopulating the localisation cache.
  • Better recache performance.

This seems like a fairly easy thing to enable by default. Should we do that? Are there reasons we haven't already?

Later

For core, if we later remove the db option (because why would we keep it?), then after the next release it would mean:

  • Lower complexity in database management. MediaWiki currently has dedicated db-related logic that carefully establishes multiple connections to the same database, so that localisation-cache queries are not blocked behind regular database queries.
  • Fewer LCStore implementations to maintain and test.
  • One less database table in our schema (yay).

Event Timeline

Krinkle created this task. Mar 13 2019, 1:52 PM

Core Platform Team: From a product perspective, do you see this as a desirable change? Do we know of third parties that have raised concerns with the defaults and/or specifically prefer it?

Language Team: Do we know of known issues with file-based LCStore, e.g. something that is fine for WMF but could be a problem for third parties?

This would essentially mean making $wgCacheDirectory mandatory. Currently it is optional.

From personal experiences, setting up a cache directory with write access from admin users (for command line scripts) and web server simultaneously (and in a safe manner) is not very easy. For simple web hotels or shared hosting, is it even possible?

daniel added a subscriber: daniel. Edited · Mar 19 2019, 1:47 PM

From personal experiences, setting up a cache directory with write access from admin users (for command line scripts) and web server simultaneously (and in a safe manner) is not very easy.

I keep running into this annoying issue with my local development environment. Whatever we use by default should work without the need to fiddle with file permissions.

It seems like this would be a change that would break at least some upgrades, and would require some documentation for third-party admins. Do we have any data that says that the performance advantage between PHP array files and CDB justifies this breakage?

daniel added a comment. Edited · Mar 19 2019, 3:03 PM

These are for generating - the more relevant question is how this impacts read performance. Though in practice, it probably doesn't, since it just sits in the opcode cache either way.

Krinkle added a comment. Edited · Mar 23 2019, 11:20 PM

From personal experiences, setting up a cache directory with write access from [..] command line scripts [..] and web server simultaneously [..] is not very easy.

We do this already for $wgUploadDirectory (for media uploads by users), and $wgFileCacheDirectory (for the "FileCache" feature which stores HTTP responses), and $wgTmpDirectory. Should be simple enough to do the same for $wgCacheDirectory.

T99740#4628985
Data here, pretty impressive

The impact of switching the default will be even bigger. The above comparison was for WMF, between the CDB file format it currently uses, and the static PHP array file format it will be using next.

The MediaWiki default currently, however, is not any local file format (not JSON, CDB, or PHP array). The MediaWiki default is to store localisation cache in a relational database table. Writing to and reading from the database is significantly more expensive than pretty much any kind of local file.

It seems like this would be a change that would break at least some upgrades [..]

The various LCStore backends MediaWiki supports (SQL, JSON, PHP, CDB) already have their readers and writers ship with MediaWiki core, and are compatible with all run-times and platforms MediaWiki itself supports.

In fact, for many third-party wikis, localisation cache already defaults to being stored on disk instead of a relational database. This is because the LCStore system automatically detects if you have $wgCacheDirectory configured in LocalSettings.php. If you do, then LCStore (like various other systems in MediaWiki) automatically starts using it.

This task is about enabling $wgCacheDirectory by default so that it isn't just for third-party admins who read the Performance tuning manual, or Karonen's guide, or Aaron's guide.

Switching between backends, either by site administrator choice in configuration, or by automatic discovery, or by changing the default, has no impact on anything outside LCStore. It just works. Other parts of core and extensions interact with the localisation cache through a generic interface – neutral to the currently configured backend.

aaron added a subscriber: aaron. Apr 8 2019, 6:09 PM

Have you compared LCStore with sqlite (as defined by the current installer) vs cdb?

I think making $wgCacheDirectory mandatory and writable/readable in the installer would be a start, in any case.

Krinkle claimed this task. Apr 18 2019, 11:23 PM

Assigning to self, to do a perf comparison for page view reads with a local install, comparing the stock default of a db table on sqlite vs cdb vs static-array. This is an interesting comparison because, unlike for MySQL, sqlite would technically be a local disk read just as a file-based l10n cache would be. I would imagine the static array read (the one we proposed) to be far more efficient. I can still compare cdb file open-read performance against sqlite3 file open-read performance, though.

Krinkle triaged this task as Normal priority. Apr 23 2019, 5:40 PM

Krinkle added a comment. Edited · Apr 25 2019, 5:26 PM

Test cases

Three performance costs around writing to / reading from the localisation cache:

  1. Populate LCStore for 1 language. – This is what happens on most third-party wikis, whenever a user first visits the wiki in a given language after a site admin has upgraded MediaWiki and purged the server caches. This represents the amount of time a user page view can be delayed by.

I measure this by running update.php to purge all object caches, truncating the l10n_cache table, and clearing the tmp directory. Then, run php maintenance/rebuildLocalisationCache.php --lang de to populate the cache for one language.


  2. Populate LCStore for all languages. – This is what happens on large-scale wiki farms like WMF, which populate the cache for all languages at once as part of the deployment process.

Measured similar to 1, except running php maintenance/rebuildLocalisationCache.php for all languages (no lang param), and with threads=4.


  3. Measure TTFB from loading the main page. – This measures the time it takes to read localisation messages from LCStore. This is the most important one and represents the cost applied to all page views after the server caches have been populated.

Measured by generating all languages first, then running update.php to purge object caches, and then viewing one page. In between page views, running update.php again to purge object caches.

This is because LCStore has another layer of caching on top from MessageCache, which uses db-objectcache as "clusterCache" by default. The MessageCache also has an optional WANObjectCache layer (disabled by default) and a localServerCache (APC) layer, also disabled by default (wgUseLocalMessageCache).
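
The measurement procedure for test case 1 can be scripted. What follows is a minimal sketch in Python, not the harness that produced the numbers in this task. The helper name time_command, the before_each hook and the choice of three runs are assumptions made for illustration. The rebuild command is the one quoted above, with --force added (the flag used in the SQLite comparison later in this thread) so that every run rebuilds.

```python
import subprocess
import time

# Command quoted in the procedure above, plus --force so that each run
# rebuilds even if the cache is already fresh.
REBUILD_ONE_LANGUAGE = [
    "php", "maintenance/rebuildLocalisationCache.php", "--lang", "de", "--force",
]


def time_command(cmd, runs=3, before_each=None):
    """Run cmd `runs` times; return the wall-clock seconds of each run.

    before_each is an optional callable for the purge steps (update.php,
    truncating l10n_cache, clearing the tmp directory) between runs.
    """
    timings = []
    for _ in range(runs):
        if before_each is not None:
            before_each()
        start = time.perf_counter()
        # check=True raises if the command fails, so a broken rebuild is
        # not recorded as a fast one.
        subprocess.run(cmd, check=True, stdout=subprocess.DEVNULL)
        timings.append(time.perf_counter() - start)
    return timings
```

Calling time_command(REBUILD_ONE_LANGUAGE) from the MediaWiki installation directory would give three wall-clock timings comparable to the "real" values in the raw results. Unlike the shell's time, it does not split out user and sys time.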


Environment

From addshore/mediawiki-docker-dev (master at f12a8f9). Using PHP setting RUNTIMEVERSION=7.2, on Docker for Mac, with 4 CPU cores given to Docker – verified on the guest via Python: import multiprocessing; multiprocessing.cpu_count().

Test A - LCStore as database table (current MW stock default)
# This lets MediaWiki core decide.
# The stock default it picks is to use a "l10n_cache" database table.
# -------

# Database (from mdd/LocalSettings)
// [mysql database settings]

# Database (from core/DevelopmentSettings)
$wgSQLMode = 'TRADITIONAL'; // strict mode

# Disk (from mdd/LocalSettings)
$wgTmpDirectory = "{$wgUploadDirectory}/tmp";

# Language (from mdd/LocalSettings)
$wgLanguageCode = "en";

# Extensions (from my own LocalSettings)
wfLoadSkin('Vector');
wfLoadExtension('Cite');
wfLoadExtension('Gadgets');
wfLoadExtension('Interwiki');
wfLoadExtension('ParserFunctions');
wfLoadExtension('WikiEditor');
wfLoadExtension('CategoryTree');

Test B - LCStore as CDB on disk (current WMF use)
# Change $wgCacheDirectory from the default false to a writable tmp directory.
# With this in place, MediaWiki automatically starts using a disk-based LCStore
# instead of in the database. The file format it currently picks is CDB.

$wgCacheDirectory = $wgTmpDirectory;

# … rest same as Test A

Test C - LCStore as static array on disk (proposed)
$wgCacheDirectory = $wgTmpDirectory;
$wgLocalisationCacheConf['store'] = 'array';

# … rest same as Test A

Results

Raw results
## Results A1 (database table / one language)

Rebuilding de... 1 languages rebuilt out of 1
real	0m3.782s
user	0m0.560s
sys	0m0.310s

Rebuilding de... 1 languages rebuilt out of 1
real	0m4.000s
user	0m0.630s
sys	0m0.330s

Rebuilding de... 1 languages rebuilt out of 1
real	0m3.981s
user	0m0.510s
sys	0m0.450s

## Results A2 (database table / all languages)

Rebuilding...
real	1m51.781s
user	1m34.650s
sys	1m0.540s

Rebuilding...
real	1m47.267s
user	1m32.090s
sys	0m57.790s

Rebuilding...
real	1m33.337s
user	0m44.470s
sys	0m26.440s

## Results A3 (database table / page load)

responseStart: 2013.925 ms
responseStart: 2049.880 ms
responseStart: 2049.881 ms

## Results B1 (CDB files / one language)

Rebuilding de... 1 languages rebuilt out of 1
real	0m4.630s
user	0m1.310s
sys	0m0.720s

Rebuilding de... 1 languages rebuilt out of 1
real	0m4.635s
user	0m1.280s
sys	0m0.800s

Rebuilding de... 1 languages rebuilt out of 1
real	0m4.391s
user	0m1.140s
sys	0m0.800s

## Results B2 (CDB files / all languages)

Rebuilding..
real	4m10.834s
user	8m35.780s
sys	5m52.070s

Rebuilding..
real	4m20.847s
user	9m4.440s
sys	6m3.750s

Rebuilding...
real	4m16.865s
user	8m50.060s
sys	5m56.830s

## Results B3 (CDB files / page load)

responseStart: 1853.005 ms
responseStart: 1907.499 ms
responseStart: 1869.214 ms

## Results C1 (static array files / one language)

Rebuilding de... 1 languages rebuilt out of 1
real	0m3.118s
user	0m0.400s
sys	0m0.220s

Rebuilding de... 1 languages rebuilt out of 1
real	0m3.161s
user	0m0.320s
sys	0m0.290s

Rebuilding de... 1 languages rebuilt out of 1
real	0m3.041s
user	0m0.350s
sys	0m0.280s

## Results C2 (static array files / all languages)

Rebuilding...
real	1m13.084s
user	1m35.640s
sys	1m17.810s

Rebuilding...
real	1m13.043s
user	1m37.700s
sys	1m14.950s

Rebuilding...
real	1m10.789s
user	1m35.650s
sys	1m11.540s

## Results C3 (static array files / page load)

responseStart: 1771.395 ms
responseStart: 1628.985 ms
responseStart: 1809.974 ms

Test case / LCStore | A (database table, current MW default) | B (CDB files, current WMF) | C (static array files, proposed)
1 (Populate 1 language) | 3.782s, 4.000s, 3.981s | 4.630s, 4.635s, 4.391s | 3.118s, 3.161s, 3.041s
2 (Populate all languages) | 1m51s, 1m47s, 1m33s | 4m10s, 4m20s, 4m16s | 1m13s, 1m13s, 1m10s
3 (Page load time) | 2013 ms, 2049 ms, 2049 ms | 1853 ms, 1907 ms, 1869 ms | 1771 ms, 1628 ms, 1809 ms

The most important one for end-users is the page load time, as that cost may be paid by users on any page view, as opposed to only when upgrading the site. Both of the disk-based caches came out better than the database table, thus confirming our hypothesis.

Among the two file formats for on-disk caching, the static array file format came out better than the CDB file format. This matches our expectation and tests we ran in 2015 in WMF production when we first started the migration from CDB files to PHP files (T212460, T99740). We've completed most of that migration in WMF production. The only remaining disk-based caching we have that isn't yet using static arrays instead of CDB is the localisation cache.

The comparison between CDB and static arrays for populating the caches (for site admins during upgrades and deployments) also came out as expected and matching our 2015 tests, favouring the static array format.

The surprising bit to me, although not particularly important for this task, is that the CDB files were much slower to fill than the database tables (over 4 minutes vs under 2 minutes). This is a good reason to stick with our proposal to switch from database tables toward static array files, and never use CDB files by default for third parties.

At WMF the cost of slowly generating CDB files (over db tables) was justified many years ago because our focus is on the end user and the page load performance. We're happy to pay a few minutes during deployment so as to not require database load for localisation messages. Having said that, once we switch from CDB to static array files at WMF as well, it'll be nice to know our deployment process will likely be much faster. The ticket for that is T99740.
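
To put rough numbers on the comparison, the sketch below takes the median of the three runs per cell from the results table and expresses it as a percentage change relative to the database-table baseline (A). It is only an illustration: the dictionary layout, labels and the summarise helper are assumptions, and three runs on a single machine are indicative at best.

```python
from statistics import median

# Wall-clock measurements copied from the results table above.
# The two populate tests are in seconds, page load is in milliseconds.
RESULTS = {
    "populate one language (s)": {
        "A db table": [3.782, 4.000, 3.981],
        "B CDB": [4.630, 4.635, 4.391],
        "C static array": [3.118, 3.161, 3.041],
    },
    "populate all languages (s)": {
        "A db table": [111, 107, 93],
        "B CDB": [250, 260, 256],
        "C static array": [73, 73, 70],
    },
    "page load (ms)": {
        "A db table": [2013, 2049, 2049],
        "B CDB": [1853, 1907, 1869],
        "C static array": [1771, 1628, 1809],
    },
}


def summarise(results, baseline="A db table"):
    """Return {test: {store: (median, percent change vs. baseline median)}}."""
    summary = {}
    for test, stores in results.items():
        base = median(stores[baseline])
        summary[test] = {
            store: (median(runs), 100.0 * (median(runs) - base) / base)
            for store, runs in stores.items()
        }
    return summary


for test, stores in summarise(RESULTS).items():
    print(test)
    for store, (mid, change) in stores.items():
        print(f"  {store:<15} median {mid:>9.3f}  ({change:+.1f}% vs A)")
```

On these figures the median page load with static array files is roughly 14% lower than with the database table, while CDB takes more than twice as long as the database table to populate all languages.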

Krinkle added a comment. Edited · Apr 25 2019, 5:42 PM

Next step here is deciding what $wgCacheDirectory should default to. Note that in order to uphold our current guarantees to third parties, this directory must vary by wiki or be within the installation path.

It must vary by wiki, because there can be multiple wikis on the same server that each have similar but incompatible extensions installed. At WMF, we explicitly don't support this and thus share our cache directory between wikis as an optimisation (thus sharing computed cache files much better), but we can't do that for the MW default. This isn't a big deal, because we're going from no caching to some caching, so it's an improvement either way.

There are a few candidates we can pick from:

  • $wgTmpDirectory – we can rely on this to be a writable path, used for lots of ad-hoc purposes currently. We should carve out a sub directory under this, because it is set to a system-wide directory by default. E.g. $tmpDir/mw-cache.
  • $wgUploadDirectory - we can rely on this to be a writable path, already used for HTML-FileCache ($wgFileCacheDirectory) and for $wgReadOnlyFile. Adding wgCacheDirectory to that seems straightforward and should naturally deal with the by-version/by-wiki variance. On wiki farms, site admins already have this set to vary accordingly.

Concerns, preferences, other ideas?

aaron added a comment. Apr 26 2019, 7:44 PM

Did you try sqlite LCStore (with journal_mode = WAL and synchronous = NORMAL, like the installer uses)?

Did the CDB test use the PHP CDB implementation or the implementation using dba_* functions?

Krinkle added a comment. Edited · May 4 2019, 5:50 PM

Did the CDB test use the PHP CDB implementation or the implementation using dba_* functions?

Aye, this was with php72 as provided by the php:7.2 image on DockerHub, used via webdevops' php-nginx image, via MDD: mediawiki-docker-dev. And that installs php without --with-cdb or --with-dba=shared. I've confirmed via mwscript eval.php that Cdb\Writer::open( '/tmp/foo.cdb' ); on this installation results in a Cdb\Writer\PHP instance.

I'm curious whether it's generally common for third-parties to lack this as well.

  • Debian 9 Stretch: php 7.0, without dba functions.
  • Debian 10 Buster: php 7.3, without dba functions.
  • Ubuntu 16.04 Xenial, php 7.0, without dba functions.
  • Ubuntu 18.04 Bionic, php 7.2, without dba functions.
Log
$ docker run --rm -i -t debian:stretch sh -c "bash"
# apt-get update …
# apt-get install php … php7.0 (7.0.33-0+deb9u3)
php -a
> var_dump(dba_handlers());
Uncaught Error: Call to undefined function dba_handlers()

$ docker run --rm -i -t debian:buster sh -c "bash"
# apt-get update …
# apt-get install php … php7.3 (7.3.4-2)
# php -a
> var_dump(dba_handlers());
Uncaught Error: Call to undefined function dba_handlers()

$ docker run --rm -i -t ubuntu:16.04 sh -c "bash"
# apt-get update …
# apt-get install php … php7.0 (7.0.33-0ubuntu0.16.04.3)
# php -a
php > var_dump(dba_handlers());
Uncaught Error: Call to undefined function dba_handlers()

$ docker run --rm -i -t ubuntu:18.04 sh -c "bash"
# apt-get update …
# apt-get install php … php7.2 (7.2.17-0ubuntu0.18.04.1)
# php -a
php > var_dump(dba_handlers());
Uncaught Error: Call to undefined function dba_handlers()

Did the CDB test use the PHP CDB implementation or the implementation using dba_* functions?

These are not installed by default on popular distros per T218207#5157534. However, the native implementation is indeed known to be faster.

In the unlikely event that Cdb-Dba (as opposed to Cdb-Php) is also faster than StaticArray, I still do not think we should default to "Cdb" over "StaticArray", as that would mean that for most wikis it would be slow by default. If an administrator is customising their environment by installing php7.2-dba and wants to use Cdb for MW, they can still set $wgLocalisationCacheConf['store'] = 'files'; (the value that selects the CDB store). Which is actually what those administrators would already be doing today. We're only changing the default. We are not removing support for anything.

Anyhow, here's a comparison, same methods and environment as per T218207#5138303. I'm re-testing Cdb-Php and StaticArray here to give it a fresh baseline, because absolute numbers may no longer compare.

Cdb-Php
Test: Page load time
responseStart: 2762 ms, 3298 ms, 3004 ms
Test: Deploy one language
real	0m4.832s, 0m4.939s, 0m4.881s

The Dba functions were installed by using docker-php-ext-install dba. Confirmed by dba_handlers() from eval.

Cdb-Dba
Test: Page load time
responseStart: 2504 ms, 3078 ms, 3149 ms
Test: Deploy one language
real	0m3.298s, 0m3.225s, 0m3.015s

StaticArray
Test: Page load time
responseStart: 2546 ms, 2728 ms, 2753 ms
Test: Deploy one language
real	0m3.322s, 0m3.224s, 0m3.055s

Cdb-Dba is faster than Cdb-Php both for writing and for reading cache values, as expected.

Compared to StaticArray, Cdb-Dba is on par on both accounts. Cdb-Dba has more outliers for user-perceived metrics, and StaticArray has one outlier for the deploy runtime. Both had overlapping ranges of values. And in any case, we cannot assume Dba to be installed, so StaticArray remains the better choice as our default. It also helps generally, I think, whenever we can use the WMF production setting as MW default, as third parties then benefit from all optimisations and resources dedicated to its maintenance and performance.
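
The "on par" reading can be checked by comparing the spread of the three runs per backend. A small sketch, using only the numbers listed above; the helper names spread and ranges_overlap are made up for this illustration.

```python
from statistics import median

# Measurements copied from the three runs per backend above.
PAGE_LOAD_MS = {
    "Cdb-Php": [2762, 3298, 3004],
    "Cdb-Dba": [2504, 3078, 3149],
    "StaticArray": [2546, 2728, 2753],
}
DEPLOY_ONE_LANGUAGE_S = {
    "Cdb-Php": [4.832, 4.939, 4.881],
    "Cdb-Dba": [3.298, 3.225, 3.015],
    "StaticArray": [3.322, 3.224, 3.055],
}


def spread(runs):
    """Return (min, median, max) for one backend's runs."""
    return min(runs), median(runs), max(runs)


def ranges_overlap(a, b):
    """True if the [min, max] intervals of two run lists intersect."""
    return min(a) <= max(b) and min(b) <= max(a)


for name, data in (("page load", PAGE_LOAD_MS), ("deploy", DEPLOY_ONE_LANGUAGE_S)):
    for backend, runs in data.items():
        print(name, backend, spread(runs))
    print(
        name,
        "Cdb-Dba overlaps StaticArray:",
        ranges_overlap(data["Cdb-Dba"], data["StaticArray"]),
    )
```

With these samples the Cdb-Dba and StaticArray ranges intersect for both tests, whereas the Cdb-Php deploy times do not intersect with either of the other two.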

Did you try sqlite LCStore (with journal_mode = WAL and synchronous = NORMAL, like the installer uses)?

Interesting. In the specific case of SQLite, "cache in database" and "cache on disk" effectively both use the disk. Some quick comparisons using Quick MediaWiki to install MediaWiki with SQLite (macOS, on-disk /private/tmp/quickmw, PHP 7.1.26 from Homebrew).

SQLite (default installation)
LocalSettings.php (generated)
$wgLocalisationCacheConf['storeServer'] = [
	'type' => 'sqlite',
	'dbname' => "{$wgDBname}_l10n_cache",
	'tablePrefix' => '',
	'variables' => [ 'synchronous' => 'NORMAL' ],
	'dbDirectory' => $wgSQLiteDataDir,
	'trxMode' => 'IMMEDIATE',
	'flags' => 0
];
Test: Deploy one language
time php maintenance/rebuildLocalisationCache.php --lang de --force
real	0m0.404s, 0m0.416s, 0m0.407s

StaticArray
LocalSettings.php (appendix)
$wgCacheDirectory = $wgSQLiteDataDir;
$wgLocalisationCacheConf['store'] = 'array';
Test: Deploy one language
real	0m0.140s, 0m0.156s, 0m0.148s

Looks like Static Array beats SQLite as well. We've shown in all previous benchmarks that the "All languages" and "Page load time" use cases always align with the "One language" use case, so I won't bother re-running those. Besides, I don't think this would inform our decision here, as I don't think we should optimise the stock MW default for SQLite against MySQL and other RDBMSes.

In the unlikely event someone finds that sqlite3-based writing or reading outperforms opcache-backed arrays, it will still work by default, and can be optimised by setting wgLocalisationCacheConf directly.

Next step here is deciding what $wgCacheDirectory should default to. [..] There's a few candidates we can pick from:

  • $wgTmpDirectory – we can rely on this to be a writable path, used for [..]
  • $wgUploadDirectory - we can rely on this to be a writable path, already used for [..]

Concerns, preferences, other ideas?

Storing executable files in $wgUploadDirectory which is usually web-accessible sounds scary to me. A sub-directory of $wgTmpDirectory sounds like a better option to me. One small thing might be that $wgUploadDirectory might play better with selinux if it is inside the mediawiki installation directory, unlike $wgTmpDirectory. More generally, the concept of "executable code generated during runtime" is now being introduced (or did I miss a precedent?) and could probably benefit from documentation to help people who try to secure MediaWiki deployments, or people building MediaWiki deployment systems in general.

Krinkle added a comment. Edited · May 6 2019, 4:48 PM

Storing executable files in $wgUploadDirectory which is usually web-accessible sounds scary to me.

Re-use wgUploadDirectory

Aye. It would require several compromises for such an attack to be effective, though. One would need to have wgHashedUploadDirectory disabled (on by default), to have whitelisted .php, to have broken MediaWiki so that it allows uploads containing PHP code, to allow slashes in user image names, and to allow image names to start with a lowercase letter. Only then could a file upload be written to uploads/mw-cache/l10n-en.php instead of uploads/a/a0/Filename.png.

If we go this route, we might want to do T199590 first so that even if all these things are broken/compromised, it would still not be possible. The only way to compromise it then is if someone has access to the server and can write anywhere that PHP could write to, in which case no place is safe, regardless of being web-accessible.

Re-use wgTmpDirectory

The downside of wgTmpDirectory is resource management. We would need to (by default) make sure we have a 1-to-1 mapping between MW installs and cache directories.

This is because caches can vary by which extensions are installed, and configuration etc. So we can't just use /tmp or even /tmp-cache-{mw_version}. It would need to be something like /tmp/mw-cache-{wiki_id}-{mw_version}-{hash of $InstallPath}. The downside then, is that they could stay behind and consume disk space for longer than they should after upgrades etc.
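
As an illustration of the naming scheme above, here is a minimal sketch of how such a per-install directory name could be derived. The function default_cache_dir is hypothetical and not part of MediaWiki; the use of SHA-1 and the 12-character truncation are likewise assumptions.

```python
import hashlib
import os
import tempfile


def default_cache_dir(wiki_id, mw_version, install_path, tmp_dir=None):
    """Build a per-wiki, per-version, per-install cache directory path.

    Hypothetical helper: MediaWiki does not ship this function. It only
    illustrates the naming scheme suggested above.
    """
    if tmp_dir is None:
        tmp_dir = tempfile.gettempdir()
    # A short hash of the installation path keeps two checkouts of the same
    # version on one server from sharing each other's cache.
    path_hash = hashlib.sha1(install_path.encode("utf-8")).hexdigest()[:12]
    return os.path.join(tmp_dir, f"mw-cache-{wiki_id}-{mw_version}-{path_hash}")
```

Two installs of the same version at different paths end up with different directories. Each upgrade also produces a new directory name, and nothing in this scheme removes the old one.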

It also means that while nothing is broken by default, it would be quite inefficient for multi-wiki setups that share the same code base for multiple wikis. Each wiki would have its own cache directory by default. Whereas today our default for when multiple wikis share the same code is to assume cache should be shared, and site admins have to specifically change this if they want to (by setting the configuration accordingly).

Our other use of /tmp so far has always been self-cleaning.

I think it would be better if we keep the cache paired with the installation directory. That would make the cache easier to manage for third parties and for upgrades, and easier to keep correct.

Re-use $IP/cache

Another option is to use :mw/cache/. So no re-use of /tmp or :mw/uploads/. The benefit would be that there is no conflict with uploads, and that site admins can easily disable access for all of /cache (which we can do by default with .htaccess for Apache, like we do elsewhere already). This directory is already being used in MW by default for storing SQLite databases. The downside is that it would require more documentation and awareness for other server software, to ensure people make this directory inaccessible. (Right now it's only used for SQLite, so most installs can afford to ignore the docs, as the directory would remain empty.)

This isn't unprecedented, however. We do this already for HTML-FileCache, and for private/deleted uploads - which are also stored in such a directory and are also web-accessible by default if the site admin doesn't configure them properly.

On the other hand, if it requires the site administrator to be aware of this and manually do something, it might also make sense to have it be an advertised and well-documented opt-in that is technically disabled by default. E.g. the technical default would be /tmp, but the installer would check whether /cache is web-accessible; if it is, warn about that, and if access has already been disabled (as it should be), generate LocalSettings to use it. This is a compromise of both. Thoughts?
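
The compromise could look roughly like the decision sketched below. This is a hypothetical illustration, not installer code: the function name, its parameters and the warning text are assumptions, and how the installer would actually probe for web access is left out.

```python
def choose_cache_directory(install_cache_dir, web_accessible, tmp_dir="/tmp"):
    """Return (directory, warning) for the generated LocalSettings.

    Hypothetical installer helper, not MediaWiki code. web_accessible is
    the result of whatever probe the installer would use; the probe
    itself is out of scope here.
    """
    if web_accessible:
        # Generated PHP files should not live in a web-accessible place:
        # keep the technical default and tell the site admin how to opt in.
        warning = (
            f"{install_cache_dir} is reachable over the web; "
            "deny access to it to enable the on-disk cache there."
        )
        return tmp_dir, warning
    # Access is already denied (as it should be), so opt in automatically.
    return install_cache_dir, None
```

For example, choose_cache_directory("/var/www/w/cache", False) returns the install's own cache directory with no warning, while passing True falls back to the tmp directory and returns a message for the site admin.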

Paladox added a subscriber: Paladox. May 6 2019, 4:52 PM

Change 508422 had a related patch set uploaded (by Krinkle; owner: Krinkle):
[mediawiki/core@master] localisation: Improve documentation around wgLocalisationCacheConf

https://gerrit.wikimedia.org/r/508422

Change 508423 had a related patch set uploaded (by Krinkle; owner: Krinkle):
[mediawiki/core@master] localisation: Inject 'directory' option to LCStore classes

https://gerrit.wikimedia.org/r/508423

Change 508422 merged by jenkins-bot:
[mediawiki/core@master] localisation: Improve documentation around wgLocalisationCacheConf

https://gerrit.wikimedia.org/r/508422

Change 508423 merged by jenkins-bot:
[mediawiki/core@master] localisation: Inject 'directory' option to LCStore classes

https://gerrit.wikimedia.org/r/508423

Change 528907 had a related patch set uploaded (by Krinkle; owner: Krinkle):
[mediawiki/core@master] DefaultSettings: Document wgTmpDirectory guarantees and expectations

https://gerrit.wikimedia.org/r/528907

I think as a first step we can add it to DevelopmentSettings.php. One reason is that we put experimental configs there first and then slowly migrate them to DefaultSettings.php. (I admit the reason I care is that our tests are extremely slow because of more than 50k db queries to l10n_cache in every quibble run.)

Change 529057 had a related patch set uploaded (by Ladsgroup; owner: Ladsgroup):
[mediawiki/core@master] Set l10n cache to array in DevelopmentSettings.php

https://gerrit.wikimedia.org/r/529057

Change 528907 merged by jenkins-bot:
[mediawiki/core@master] DefaultSettings: Document wgTmpDirectory guarantees and expectations

https://gerrit.wikimedia.org/r/528907