Page MenuHomePhabricator

Use disk-based LCStore by default in MediaWiki 1.36
Open, MediumPublic

Description

The MediaWiki default is to store localisation cache in the database.

At WMF production, it's been the case for many years that these are stored as a file on disk instead, in a private cache directory. MediaWiki offers two well-tested formats for this: php array files, and CDB files. CDB files is what WMF currently uses in production. PHP-array files is what we're moving toward (T99740), which are even faster to generate (but difficult to manage opcache memory for, T99740).

For third parties, this means:

  • More performant by default. We'd no longer require db-master connections on web requests when repopulating the localisation cache.
  • Better recache performance.
Benchmark
Test case / LCStoreA (database table, current MW default)B (CDB files, current WMF)
(Page load time)2013 ms, 2049 ms, 2049 ms1853 ms, 1907 ms, 1869 ms

This seems like a fairly easy thing to enable by default. Should we do that? Are there reasons we haven't already?

Sub tasks

Event Timeline

There are a very large number of changes, so older changes are hidden. Show Older Changes

Did the CDB test use the PHP CDB implementation or the implementation using dba_* functions?

Aye, this was with php72 as provided by the php:7.2 image on DockerHub, used via webdevops' php-nginx image, via MDD: mediawiki-docker-dev. And that installs php without --with-cdb or --with-dba=shared. I've confirmed via mwscript eval.php that Cdb\Writer::open( '/tmp/foo.cdb' ); on this installation results in a Cdb\Writer\PHP instance.

I'm curious whether it's generally common for third-parties to lack this as well.

  • Debian 9 Stretch: php 7.0, without dba functions.
  • Debian 10 Buster: php 7.3, without dba functions.
  • Ubuntu 16.04 Xenial, php 7.0, without dba functions.
  • Ubuntu 18.04 Bionic, php 7.2, without dba functions.
Log
$ docker run --rm -i -t debian:stretch sh -c "bash"
# apt-get update …
# apt-get install php … php7.0 (7.0.33-0+deb9u3)
php -a
> var_dump(dba_handlers());
Uncaught Error: Call to undefined function dba_handlers()

$ docker run --rm -i -t debian:buster sh -c "bash"
# apt-get update …
# apt-get install php … php7.3 (7.3.4-2)
# php -a
> var_dump(dba_handlers());
Uncaught Error: Call to undefined function dba_handlers()

$ docker run --rm -i -t ubuntu:16.04 sh -c "bash"
# apt-get update …
# apt-get install php … php7.0 (7.0.33-0ubuntu0.16.04.3)
# php -a
php > var_dump(dba_handlers());
Uncaught Error: Call to undefined function dba_handlers()

$ docker run --rm -i -t ubuntu:18.04 sh -c "bash"
# apt-get update …
# apt-get install php … php7.2 (7.2.17-0ubuntu0.18.04.1)
# php -a
php > var_dump(dba_handlers());
Uncaught Error: Call to undefined function dba_handlers()

Did the CDB test use the PHP CDB implementation or the implementation using dba_* functions?

These are not installed by default on popular distros per T218207#5157534. However, the native is indeed known to be faster.

In the unlikely event that Cdb-Dba (as opposed to Cdb-Php) is also faster than StaticArray, I still do not think we default to "Cdb" over "StaticArray" as that would mean for most wikis it would be slow by default. If an administrator is customising their environment by installing php7.2-dba and wants to use Cdb for MW, they can still set $wgLocalisationCacheConf['store'] = 'cdb';. Which is actually what those administrators would already be doing today. We're only changing the default. We are not removing support for anything.

Anyhow, here's a comparison, same methods and environment as per T218207#5138303. I'm re-testing Cdb-Php and StaticArray here to give it a fresh baseline, because absolute numbers may no longer compare.

Cdb-Php
Test: Page load time
responseStart: 2762 ms, 3298 ms, 3004 ms
Test: Deploy one language
real	0m4.832s, 0m4.939s, 0m4.881s

The Dba functions were installed by using docker-php-ext-install dba. Confirmed by dba_handlers() from eval.

Cdb-Dba
Test: Page load time
responseStart: 2504 ms, 3078 ms, 3149 ms
Test: Deploy one language
real	0m3.298s, 0m3.225s, 0m3.015s
StaticArray
Test: Page load time
responseStart: 2546 ms, 2728 ms, 2753 ms
Test: Deploy one language
real	0m3.322s, 0m3.224s, 0m3.055s

Cdb-Dba is faster than Cdb-Php both for writing and for reading cache values, as expected.

Compared to StaticArray, Cdb-Dba is on-par on both accounts. Cdb-Dba has more outliers for user-perceived metrics, and StaticArray has one outlier for the deploy runtime. Both both had the same range of values. And in any case, we cannot assume Dba to be installed, so StaticArray remains the better choice as our default. It also helps generally, I think, whenever we can use the WMF production setting as MW default, as third-parties then benefit all optimisations and resources dedicated to its maintenance and performance.

Did you try sqlite LCStore (with journal_mode = WAL and synchronous = NORMAL? like the installer uses)?

Interesting. In the specific case of SQLite, "cache in database" and "cache on disk" are effectively both use the disk. Some quick comparisons using Quick MediaWiki to install MediaWiki with SQLite (macOS, on-disk /private/tmp/quickmw, PHP 7.1.26 from Homebrew).

SQLite (default installation)
LocalSettings.php (generated)
$wgLocalisationCacheConf['storeServer'] = [
	'type' => 'sqlite',
	'dbname' => "{$wgDBname}_l10n_cache",
	'tablePrefix' => '',
	'variables' => [ 'synchronous' => 'NORMAL' ],
	'dbDirectory' => $wgSQLiteDataDir,
	'trxMode' => 'IMMEDIATE',
	'flags' => 0
];
Test: Deploy one language
time php maintenance/rebuildLocalisationCache.php --lang de --force
real	0m0.404s, 0m0.416s, 0m0.407s
StaticArray
LocalSettings.php (appendix)
$wgCacheDirectory = $wgSQLiteDataDir;
$wgLocalisationCacheConf['store'] = 'array';
Test: Deploy one language
real	0m0.140s, 0m0.156s, 0m0.148s

Looks like Static Array beats SQLite as well. We've shown in all previous benchmarks that the "All languages" and "Page load time" use cases always align with the "One language" use case, so I won't bother re-running those. Besides, I don't think this would inform our decision here, as I don't think we should optimise the stock MW default for SQLite against MySQL and other RDBMS'es.

In the unlikely event someone finds that sqlite3-based writing or reading outperforms opcache-backed arrays, it will still work by default, and can be optimised by setting wgLocalisationCacheConf directly.

Next step here is deciding what $wgCacheDirectory should default to. [..] There's a few candidates we can pick from:

  • $wgTmpDirectory – we can rely on this to be a writable path, used for [..]
  • $wgUploadDirectory - we can rely on this to be a writable path, already used for [..]

Concerns, preferences, other ideas?

Storing executable files in $wgUploadDirectory which is usually web-accessible sounds scary to me. Sub-directory of $wgTmpDirectory sounds a better option to me. One small thing might be that perhaps $wgUploadDirectory might play better with selinux if it is inside mediawiki installation directory, unlike $wgTmpDirectory. More generally, the concept of "executable code generated during runtime" is now being introduced (or did I miss a precedence?) and probably could benefit from documentation to help people who try to secure MediaWiki deployments or for people doing MediaWiki deployment systems in general.

Storing executable files in $wgUploadDirectory which is usually web-accessible sounds scary to me.

1: Re-use wgUploadDirectory

Aye. There would have to been been several security compromises for that to be a risk, though. One would need to have explicitly disabled wgHashedUploadDirectory (it is enabled by default), and to have whitelisted .php, and to have broken MediaWiki so that allows uploads containing PHP code - and – allow slashes in user images - and - allow image names to start with a lowercase letter. So that a file upload can then be written to uploads/mw-cache/l10n-en.php instead of uploads/a/a0/Filename.png.

As such, it seems wgUploadDirectory is safe to use for this by default.

And if we do, we could consider doing T199590 as well. The only way to compromise it then, is if someone has access to the server to write anywhere that PHP could write to, in which nothing we pick is safe.

2: Re-use wgTmpDirectory

The downside of wgTmpDirectory is resource management. We would need to (by default) make sure we have a 1-to-1 mapping between MW installs and tmp sub directories.

This is because caches can vary by which extensions are installed, and configuration etc. So we can't just use /tmp or even /tmp-cache-{mw_version}. It would need to be something like /tmp/mw-cache-{wiki_id}-{mw_version}-{hash of $InstallPath}. The downside then, is that they would always stay behind and consume disk space after upgrades.

It also means that while nothing is broken by default, it would be quite inefficient for multi-wiki set ups that share the same code base for multiple wikis. Each wiki would have its own cache directory by default. Whereas today our default for when multiple wikis share the same code, is to assume cache should be shared and site admins have to specifically opt-out if they don't want that (by setting the configuration accordingly, like they would already do for various other configs).

All our use of /tmp so far has always been self-cleaning. This would somewhat depart from that.

3: Re-use $IP/cache

Another option is to use :mw/cache/. So no re-use of /tmp or :mw/uploads/. The benefit would be that there is no conflict with uploads, and that site admins can easily disable access for all of /cache (which we can do by default with .htaccess for Apache, like we do elsewhere already). This directory is already being used in MW by default for storing SQLite databases. The downside is that it would require more documentation and awareness for site admins using MySQL etc. (Because so far they've been able to ignore our advice to disable web-access to IP/cache.)

This isn't unprecedented, however. We do this already for HTML-FileCache, and for private/deleted uploads - which are also stored in such a directory and are also web-accessible by default if the site admin doesn't configure them properly.

On the other hand, if it requires awareness of the site administrator manually do something, it might also make sense to have it be an advertised and well-documented opt-in that is technically disabled by default. E.g. the technical default would be /tmp but we would in the installer check if /cache is web-accessible and if not, warn against that, and if it has been disabled already (as it should be) generate LocalSettings to use it. This is a compromise of both. Thoughts?

Change 508422 had a related patch set uploaded (by Krinkle; owner: Krinkle):
[mediawiki/core@master] localisation: Improve documentation around wgLocalisationCacheConf

https://gerrit.wikimedia.org/r/508422

Change 508423 had a related patch set uploaded (by Krinkle; owner: Krinkle):
[mediawiki/core@master] localisation: Inject 'directory' option to LCStore classes

https://gerrit.wikimedia.org/r/508423

Change 508422 merged by jenkins-bot:
[mediawiki/core@master] localisation: Improve documentation around wgLocalisationCacheConf

https://gerrit.wikimedia.org/r/508422

Change 508423 merged by jenkins-bot:
[mediawiki/core@master] localisation: Inject 'directory' option to LCStore classes

https://gerrit.wikimedia.org/r/508423

Change 528907 had a related patch set uploaded (by Krinkle; owner: Krinkle):
[mediawiki/core@master] DefaultSettings: Document wgTmpDirectory guarantees and expectations

https://gerrit.wikimedia.org/r/528907

I think as the first step we can add it to DevelopmentSettings.php. For one reason is that we put experimental configs there first and then slowly we migrate to to DefaultSettings.php (I admit the reason I care is that our tests are extremely slow because of more than 50k db queries to l10n_cache in every quibble run)

Change 529057 had a related patch set uploaded (by Ladsgroup; owner: Ladsgroup):
[mediawiki/core@master] Set l10n cache to array in DevelopmentSettings.php

https://gerrit.wikimedia.org/r/529057

Change 528907 merged by jenkins-bot:
[mediawiki/core@master] DefaultSettings: Document wgTmpDirectory guarantees and expectations

https://gerrit.wikimedia.org/r/528907

Change 532730 had a related patch set uploaded (by Krinkle; owner: Krinkle):
[mediawiki/core@master] MessageCache: Minor wgMsgCacheExpiry doc fix, and cleare constant access

https://gerrit.wikimedia.org/r/532730

Change 532732 had a related patch set uploaded (by Krinkle; owner: Krinkle):
[mediawiki/core@master] MessageCache: Increase APC 'messages-big' expiry from 1min to 1h

https://gerrit.wikimedia.org/r/532732

Change 532730 merged by jenkins-bot:
[mediawiki/core@master] MessageCache: Minor wgMsgCacheExpiry doc fix, and clear constant access

https://gerrit.wikimedia.org/r/532730

Change 532732 merged by jenkins-bot:
[mediawiki/core@master] MessageCache: Increase APC 'messages-big' expiry from 1min to 1h

https://gerrit.wikimedia.org/r/532732

I was testing this and noticed one significant issue when running rebuildLocalisationCache.php:

  • With files (aka CDB) memory use stays <100M for each thread when running with 4 threads.
  • With array memory use grows >500M for each thread when running with 4 threads.

PHP 7.1.31. in case it is relevant.

I was testing this and noticed one significant issue when running rebuildLocalisationCache.php:

  • With files (aka CDB) memory use stays <100M for each thread when running with 4 threads.
  • With array memory use grows >500M for each thread when running with 4 threads.

PHP 7.1.31. in case it is relevant.

What are $wgMemoryLimit and PHP memory_limit set to?

Did you see it grow the whole time and 500M is where it ended, or does it get to 500M early on and then remain stable at that level? I'll be sure to test for this as well before making any changes.

Command line scripts do not have any memory limit unless specified separately.

It grew during the duration of the script, reaching 500M at the end, while the files <100M stayed relatively stable.

Change 534527 had a related patch set uploaded (by Krinkle; owner: Krinkle):
[mediawiki/core@master] localisation: Release data from memory in LCStoreStaticArray::finishWrite

https://gerrit.wikimedia.org/r/534527

Thanks @Nikerabbit I'm able to reproduce the leak, as well as confirm that CDB wasn't affected. I found the source of the leak and fixed it in the above patch.

Change 534527 merged by jenkins-bot:
[mediawiki/core@master] localisation: Release data from memory in LCStoreStaticArray::finishWrite

https://gerrit.wikimedia.org/r/534527

Hey there, should this be moved to 1.35? The cut is a couple of weeks away. If it needs to go out in 1.34, is there anything I can do to help get it out of the door?

The next step here is deciding on what wgCacheDirectory would be by default. Some options and thoughts from me about that are at T218207#5160934. I'm basically just waiting for others to say whether some or all of those are a bad idea.

WDoranWMF raised the priority of this task from Medium to High.EditedSep 6 2019, 5:54 PM
WDoranWMF moved this task from Inbox to Feature Requests to Review on the Platform Engineering board.
WDoranWMF added a subscriber: WDoranWMF.

I'm moving the priority up a little for the PMs to make a decision, it can be moved back down once they've confirmed.

Interesting. In the specific case of SQLite, "cache in database" and "cache on disk" are effectively both use the disk. Some quick comparisons using Quick MediaWiki to install MediaWiki with SQLite (macOS, on-disk /private/tmp/quickmw, PHP 7.1.26 from Homebrew).

SQLite (default installation)
LocalSettings.php (generated)
$wgLocalisationCacheConf['storeServer'] = [
	'type' => 'sqlite',
	'dbname' => "{$wgDBname}_l10n_cache",
	'tablePrefix' => '',
	'variables' => [ 'synchronous' => 'NORMAL' ],
	'dbDirectory' => $wgSQLiteDataDir,
	'trxMode' => 'IMMEDIATE',
	'flags' => 0
];
Test: Deploy one language
time php maintenance/rebuildLocalisationCache.php --lang de --force
real	0m0.404s, 0m0.416s, 0m0.407s
StaticArray
LocalSettings.php (appendix)
$wgCacheDirectory = $wgSQLiteDataDir;
$wgLocalisationCacheConf['store'] = 'array';
Test: Deploy one language
real	0m0.140s, 0m0.156s, 0m0.148s

Looks like Static Array beats SQLite as well. We've shown in all previous benchmarks that the "All languages" and "Page load time" use cases always align with the "One language" use case, so I won't bother re-running those. Besides, I don't think this would inform our decision here, as I don't think we should optimise the stock MW default for SQLite against MySQL and other RDBMS'es.

In the unlikely event someone finds that sqlite3-based writing or reading outperforms opcache-backed arrays, it will still work by default, and can be optimised by setting wgLocalisationCacheConf directly.

FYI, with https://gerrit.wikimedia.org/r/#/c/mediawiki/core/+/535338 and SQLite 3.30 (compiled), I get:

# SQLite (cold table)
aaron@DESKTOP-78G5GQL:~/OSS/core$ sudo -u www-data echo "delete from l10n_cache" | sqlite3 /home/data/my_wiki_l10n_cache.sqlite
aaron@DESKTOP-78G5GQL:~/OSS/core$ sudo -u www-data php maintenance/rebuildLocalisationCache.php --lang de --force
Rebuilding de...done (205ms)

# SQLite (table already set)
aaron@DESKTOP-78G5GQL:~/OSS/core$ sudo -u www-data php maintenance/rebuildLocalisationCache.php --lang de --force
Rebuilding de...done (122ms)
1 languages rebuilt out of 1

# LCStoreStaticArray
aaron@DESKTOP-78G5GQL:~/OSS/core$ sudo -u www-data php maintenance/rebuildLocalisationCache.php --lang de --force
Rebuilding de...done (106ms)
1 languages rebuilt out of 1

It's hard to beat file_put_contents().

We're discussing this in CPT feature request meeting, and we think it should be decided before 1.35 but shouldn't gate the 1.34 release per @Jdforrester-WMF

Change 529057 merged by jenkins-bot:
[mediawiki/core@master] Set l18n cache to array in DevelopmentSettings.php

https://gerrit.wikimedia.org/r/529057

Krinkle renamed this task from Use disk-based LCStore by default in MediaWiki 1.34 to Use disk-based LCStore by default in MediaWiki 1.35.Oct 5 2019, 2:35 AM

Updating title to reflect priority from CPT (thirty-party wikis will remain on the slower SQL-based LCStore for MW 1.34).

Moving back as we're now half-way the 1.35 cycle. This is blocked on deciding if and what to use as the default cache directory in MediaWiki core. See previous comments for arguments in favour and against various of the options I was able to gather.

eprodromou added a subscriber: CCicalese_WMF.

CPT needs to provide input on a technical decision. We (@CCicalese_WMF and I) think this would be good for a technical planning discussion, so we're adding to Clinic Duty and tagging for @WDoranWMF .

Speaking as a disinterested person, $IP/cache seems the most sensible option to me.

daniel lowered the priority of this task from High to Medium.Apr 7 2020, 12:53 PM

Change 594218 had a related patch set uploaded (by Krinkle; owner: Krinkle):
[mediawiki/core@master] language: Add test coverage for LCStoreStaticArray

https://gerrit.wikimedia.org/r/594218

Change 594234 had a related patch set uploaded (by Krinkle; owner: Krinkle):
[mediawiki/core@master] language: Avoid LCStoreStaticArray::decode() recursion for arrays

https://gerrit.wikimedia.org/r/594234

Change 594218 merged by jenkins-bot:
[mediawiki/core@master] language: Add test coverage for LCStoreStaticArray

https://gerrit.wikimedia.org/r/594218

Change 594234 merged by jenkins-bot:
[mediawiki/core@master] language: Avoid LCStoreStaticArray::decode() recursion for arrays

https://gerrit.wikimedia.org/r/594234

Moving on workboard from Backlog to Next, since the Backlog column was hidden some time ago. Was T218207#5866709 ever resolved?

Not yet. This is still pending advice/direction from CPT to decide where we want to store this in MW by default, based on the options and trade-offs provided above.

Legoktm added a subscriber: Legoktm.

Bumping again, doesn't seem like enough progress has been made yet.

Jdforrester-WMF renamed this task from Use disk-based LCStore by default in MediaWiki 1.35 to Use disk-based LCStore by default in MediaWiki 1.36.Jul 29 2020, 9:03 AM

Vanilla installs on shared hosting may not be able to write to disk, as far as I know. That means we can't use the disk based LC store per default, right?

BPirkle added a subscriber: BPirkle.

At the risk of oversimplifying, it sounds like the tradeoff is that we can make MW faster by default, at the cost of being more difficult to install for some people. Further, it sounds like the people that would experience difficulties installing are exactly the people who are not well equipped to overcome those difficulties. In other words, the suggested change sounds like a showstopper for certain people.

I'd recommend that:

  1. we don't make the change
  2. we consider whether there are documentation improvements to make it more obvious how to do performance tuning

Vanilla installs on shared hosting may not be able to write to disk, as far as I know.

I don't think we support not being able to write to disk anywhere. We have uploads, file cache, git info cache, and various tmp file writers. Some of these can be disabled, and are auto-disabled by the installer if they are unwritable, but they can't all be unwritable. One of the options proposed in this task was to use the "tmp" directory instead of $IP/cache. That seems like a reasonable fallback to use. There are pros and cons as laid out in previous comments.

That means we can't use the disk based LC store per default, right?

That depends on how you interpret "default". Most file-related features are enabled in MW by default, but gracefully fail at run-time and are also warned about by the installer, and statically disabled by the installer if one proceeds (for better UX instead of runtime discovery).

However, we do recommend, allow, and at least try by default. I think we should do the same for LCStore - try by default if the cache directory is writable, like with uploads, gitinfo cache, etc. If we really think it's valid to not even have a writable tmp dir under normal circumstances, then I suppose we could fallback to the installer assinging LCStoreDB like today. (Note I'm referring to configuration, not runtime behaviour. At runtime we will of course need to tolerate things like full disk or permission problems causing temporary problems at which point we already fallback to not having a cache and that works fine.)

[…] we can make MW faster by default, at the cost of being more difficult to install for some people.

None of this should affect site admin experience. This all happens silently and automatically. It will only make it faster and significantly easier for most, and only the same or easier for others. What would become more difficult and for whom?

Vanilla installs on shared hosting may not be able to write to disk, as far as I know.

I don't think we support not being able to write to disk anywhere. We have uploads, file cache, git info cache, and various tmp file writers. Some of these can be disabled, and are auto-disabled by the installer if they are unwritable, but they can't all be unwritable. One of the options proposed in this task was to use the "tmp" directory instead of $IP/cache. That seems like a reasonable fallback to use. There are pros and cons as laid out in previous comments.

It's been a few MW releases since I last tried it, but "not being able to write to the disk anywhere" works fine, since none of the things you mentioned are actually enabled by default. I also think it's something we should continue to support as immutable containers become more popular.

Hey there, should this be moved to 1.37? The cut for 1.36 has happened, and 1.36.0-rc.0 will be cut in a fortnight or so, after which feature changes shouldn't be landed and back-ported.

12:32:27 <Krinkle> legoktm: if I understand correctly, your reply at https://phabricator.wikimedia.org/T218207 is not in opposition or incompatible with my intent there, right? Or did I miss something? My intent being that we detect and use by default where possible.

Indeed, I'm overall supportive of this change as it will likely provide a better/faster MW experience out of the box.