
Use static php array files for l10n cache at WMF (instead of CDB)
Open, High, Public

Assigned To
None
Authored By
ori
May 19 2015, 11:21 PM
Referenced Files
F31804580: mw1407-latency.png (May 6 2020, 8:10 AM)
F31804583: mw1407-memcached.png (May 6 2020, 8:10 AM)
F31784384: nemico_pubblico_c25.png (Apr 28 2020, 2:19 PM)
F31784369: obama_c25.png (Apr 28 2020, 2:19 PM)
F31784390: load_c40.png (Apr 28 2020, 2:19 PM)
F31784335: application-servers-red-dashboard-latency.png (Apr 28 2020, 1:57 PM)
F31662255: opcache.php (Mar 4 2020, 5:02 PM)
F175299: Screen Shot 2015-06-05 at 2.22.18 PM.png (Jun 5 2015, 9:23 PM)

Description

Facebook's Fred Emmott works on benchmarking HHVM's performance when running various open-source PHP frameworks. This puts him in contact with MediaWiki's codebase. He wrote in to suggest that we experiment with using plain PHP files instead of CDB for the l10n cache. We should try that and see whether it improves performance.
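For illustration only (MediaWiki's actual LCStoreStaticArray file format and key layout differ), the core idea is that the per-language cache becomes a plain PHP file returning an array, which opcache then keeps in compiled form in shared memory, whereas the CDB store reads values out of a binary file at runtime:

<?php
// --- cache/l10n/en.l10n.php (illustrative file) ---
// return [ 'messages' => [ 'search' => 'Search', 'mainpage' => 'Main Page' ] ];

// --- Reading the array store: a require plus array lookups, served from opcache ---
$data = require 'cache/l10n/en.l10n.php';
$msg = $data['messages']['search'];

// --- Reading the CDB store: roughly what LCStoreCDB does via the wikimedia/cdb library ---
$reader = \Cdb\Reader::open( 'cache/l10n/en.l10n.cdb' );
$value = $reader->get( 'messages:search' ); // key naming here is illustrative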


Deployment plan (see also T99740#5165753 by @Krinkle):

  1. Have Scap (also) generate l10n cache in the array format whenever it calls rebuildLocalisationCache.
  2. Enable array format for Beta Cluster wikis.
  3. Package new scap and have it deployed to production (T245530); then run a full scap.
  4. Enable array format for mwdebug1001/mwdebug2001 for performance testing (establish baseline on x002, detect difference if any, confirm difference in other DC). This is temporary. Undo afterward to reduce differences between debug and prod.
  5. Enable array format for group0 (i.e. testwikis, closed, mw.org, office).
  6. Enable array format for wikidata.org in production.
  7. Enable array format for commons.wikimedia.org in production.
  8. Enable array format for group1 in production.
  9. Enable array format for group2 in production. (all wikis)
  10. Update Scap config to no longer generate the old cdb format via rebuildLocalisationCache.
  11. Remove config switch.

Details

Repo                         Branch              Lines +/-
operations/mediawiki-config  master              +15 -0
operations/mediawiki-config  master              +0 -15
mediawiki/tools/scap         master              +21 -3
operations/mediawiki-config  master              +2 -3
operations/mediawiki-config  master              +3 -2
operations/mediawiki-config  master              +3 -1
operations/mediawiki-config  master              +9 -1
operations/mediawiki-config  master              +1 -0
operations/mediawiki-config  master              +2 -0
operations/mediawiki-config  master              +5 -21
operations/mediawiki-config  master              +15 -0
mediawiki/core               wmf/1.35.0-wmf.11   +10 -0
mediawiki/core               master              +10 -0
mediawiki/core               master              +13 -1
operations/mediawiki-config  master              +1 -1
mediawiki/core               master              +21 -14
operations/mediawiki-config  master              +8 -0
operations/mediawiki-config  master              +5 -0
mediawiki/core               wmf/1.26wmf9        +147 -1
mediawiki/core               master              +147 -1


Event Timeline

(There are a very large number of changes, so older changes are hidden.)

@thcipriani mentioned this in T246577: Repeated deployment-mediawiki-07 socket timeouts

I'm blocking prod roll out pending investigation of this elevated memory use issue in Beta Cluster.

I'll enable it on a canary server for a while first, and work with SRE over a few hours to try to find any operational impact and see if we can find any notable differences and/or problems.

I haven't dug deep into it, but I think the reason for the memory usage in Beta Cluster is that it has more than 1000 extensions enabled (and thus way more i18n messages than in production); I doubt it would cause a big issue for production.

So it improves performance but it can't scale? Also, having the beta cluster permanently crippled is not acceptable IMHO.

I haven't dug deep into it, but I think the reason for the memory usage in Beta Cluster is that it has more than 1000 extensions enabled (and thus way more i18n messages than in production); I doubt it would cause a big issue for production.

I don't think that's the case? Beta only has a handful of extensions more than prod. It does have the meta-repo of 1000+ extensions present on disk, but afaik all setting-loading and l10n-loading stuff is just as explicit for Beta as for prod.

So it improves performance but it can't scale? Also, having the beta cluster permanently crippled is not acceptable IMHO.

The array-based approach scales quite well, but it requires a server configuration change. This is why it is not enabled for third parties by default. The server change is what we're doing in production right now, naturally ahead of it actually being enabled there.

For beta, the only impact of this change should be that it might have fewer opcache hits and recompile source code more often - until we set the same configuration settings there.

I have yet to see the aforementioned Beta issues be strongly correlated with this change. In any event, please continue that conversation on those tasks instead.

reedy@deploy1001:/srv/mediawiki-staging/php-1.35.0-wmf.22/cache/l10n$ du -h en.l10n.php 
3.7M	en.l10n.php
reedy@deploy1001:/srv/mediawiki-staging/php-1.35.0-wmf.22/cache/l10n$ du -h de.l10n.php 
4.1M	de.l10n.php
reedy@deployment-deploy01:/srv/mediawiki/php-master/cache/l10n$ du -h en.l10n.php 
3.9M	en.l10n.php
reedy@deployment-deploy01:/srv/mediawiki/php-master/cache/l10n$ du -h de.l10n.php 
4.3M	de.l10n.php

~0.2M larger on beta (per file, so it obviously adds up a bit)

reedy@deploy1001:/srv/mediawiki-staging/php-1.35.0-wmf.22/cache/l10n$ cat *.php | wc -c
1763198989
reedy@deployment-deploy01:/srv/mediawiki/php-master/cache/l10n$ cat *.php | wc -c
1838711618

That's less than a 5% increase in total, at least in disk space, for the PHP files.
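(For the record, the totals above work out to 1,838,711,618 bytes on beta versus 1,763,198,989 bytes on deploy1001, i.e. about 75 MB or roughly 4.3% more.)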

Mentioned in SAL (#wikimedia-operations) [2020-03-17T21:54:49Z] <Krinkle> krinkle@mw2170$ disable-puppet (Testing for T99740)

In T99740#5941838, @ori wrote:

This might help:

<?php
/*
  Measure opcache memory cost of l10n cache
*/
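The body of ori's script isn't included above. A rough sketch of the approach (compile the localisation cache files in chunks of 10 and print opcache usage after each chunk) could look something like the following; the directory path is an assumption and the code is illustrative, not the attached script:

<?php
// Sketch: measure how much opcache memory compiling the l10n PHP files costs.
// Assumes the files live in cache/l10n/*.l10n.php on this server.
$files = glob( 'cache/l10n/*.l10n.php' );
$chunks = array_chunk( $files, 10 );
printf( "Found %d files. Created %d chunks of 10 files each.\n\n", count( $files ), count( $chunks ) );

foreach ( $chunks as $i => $chunk ) {
	foreach ( $chunk as $file ) {
		opcache_compile_file( $file ); // compile into opcache without executing
	}
	$status = opcache_get_status( false );
	printf(
		"### After chunk #%d\nOpcache memory used: %dMB / free: %dMB\nOpcache strings mem used: %dMB / free: %dMB\n\n",
		$i,
		$status['memory_usage']['used_memory'] / 1024 / 1024,
		$status['memory_usage']['free_memory'] / 1024 / 1024,
		$status['interned_strings_usage']['used_memory'] / 1024 / 1024,
		$status['interned_strings_usage']['free_memory'] / 1024 / 1024
	);
}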

Thanks Ori. I combined this with code from the php7adm metrics module that @Joe pointed me at (source), and ran it against an mwdebug server in Eqiad, and a (depooled) production server in Codfw.

The script includes some general MW setup to give it a more realistic baseline (instead of a completely empty opcache). I then executed the web request right after running php7adm /opcache-clear from the command line. Source code at P10713.

mwdebug1001
### Initial
Opcache is enabled:     1
Opcache is full:        0
Opcache memory used:      132MB
Opcache memory free:      168MB
Opcache strings mem used: 3MB
Opcache strings mem free: 47MB
APCu mem used:          128MB
APCu mem free:          126MB

Found 836 files. Created 84 chunks of 10 files each.

### After chunk #0
Opcache memory used:     151MB / Opcache memory free:      149MB
Opcache strings mem used: 15MB / Opcache strings mem free:  35MB

### After chunk #1
Opcache memory used:     171MB / Opcache memory free:      129MB
Opcache strings mem used: 20MB / Opcache strings mem free:  30MB

### After chunk #2
Opcache memory used:     191MB / Opcache memory free:      109MB
Opcache strings mem used: 26MB / Opcache strings mem free:  24MB

### After chunk #3
Opcache memory used:     211MB / Opcache memory free:      89MB
Opcache strings mem used: 34MB / Opcache strings mem free:  16MB

### After chunk #4
Opcache memory used:     231MB / Opcache memory free:      69MB
Opcache strings mem used: 39MB / Opcache strings mem free:  11MB

### After chunk #5
Opcache memory used:     251MB / Opcache memory free:      49MB
Opcache strings mem used: 43MB / Opcache strings mem free:  7MB

### After chunk #6
Opcache memory used:     270MB / Opcache memory free:      30MB
Opcache strings mem used: 48MB / Opcache strings mem free: 2MB

### After chunk #7
Opcache memory used:     300MB / Opcache memory free:      45KB
Opcache strings mem used: 50MB / Opcache strings mem free:  32B

### After chunk #8
Opcache is full:         1
Opcache memory used:     300MB / Opcache memory free:      45KB
Opcache strings mem used: 50MB / Opcache strings mem free:  32B

### After chunk #9
Opcache is full:         1
Opcache memory used:     300MB / Opcache memory free:      45KB
Opcache strings mem used: 50MB / Opcache strings mem free:  32B

### After chunk #10
Opcache is full:         1
Opcache memory used:     300MB / Opcache memory free:      45KB
Opcache strings mem used: 50MB / Opcache strings mem free:  32B

### After chunk #11
Opcache is full:         1
Opcache memory used:     300MB / Opcache memory free:      45KB
Opcache strings mem used: 50MB / Opcache strings mem free:  32B

### After chunks #12 - 84 …

This reached its limit after chunk 7/84. Up to that point, compiling 70 localisation files increased opcache mem by 168M (~2.4M per file), and string mem by 47M (~0.5M per file).

mw2170 (original)
$ php7adm /opcache-free
{"*":true}
$ curl mw2170.codfw.wmnet/w/krinkle.php -H 'Host: nl.wiktionary.org'

### Initial
Opcache is full: 0
Opcache memory used:     277MB / Opcache memory free:     747MB
Opcache strings mem used: 9MB / Opcache strings mem free: 87MB
APCu mem used:          6GB
APCu mem free:          6GB

Found 836 files. Created 84 chunks of 10 files each.

### After chunk #0
Opcache memory used:     297MB / Opcache memory free:     727MB
Opcache strings mem used: 21MB / Opcache strings mem free: 75MB

### After chunk #1
Opcache memory used:     317MB / Opcache memory free:     707MB
Opcache strings mem used: 26MB / Opcache strings mem free: 70MB

### After chunk #2
Opcache memory used:     337MB / Opcache memory free:     687MB
Opcache strings mem used: 32MB / Opcache strings mem free: 64MB

### After chunk #3
Opcache memory used:     357MB / Opcache memory free:     667MB
Opcache strings mem used: 40MB / Opcache strings mem free: 56MB

### After chunk #4
Opcache memory used:     376MB / Opcache memory free:     648MB
Opcache strings mem used: 45MB / Opcache strings mem free: 51MB

### After chunk #5
Opcache memory used:     396MB / Opcache memory free:     628MB
Opcache strings mem used: 49MB / Opcache strings mem free: 47MB

### After chunk #6
Opcache memory used:     416MB / Opcache memory free:     608MB
Opcache strings mem used: 54MB / Opcache strings mem free: 42MB

### After chunk #7
Opcache memory used:     436MB / Opcache memory free:     588MB
Opcache strings mem used: 59MB / Opcache strings mem free: 37MB

### After chunk #8
Opcache memory used:     455MB / Opcache memory free:     569MB
Opcache strings mem used: 63MB / Opcache strings mem free: 33MB

### After chunk #9
Opcache memory used:     475MB / Opcache memory free:     549MB
Opcache strings mem used: 66MB / Opcache strings mem free: 30MB

### After chunk #10
Opcache memory used:     495MB / Opcache memory free:     529MB
Opcache strings mem used: 70MB / Opcache strings mem free: 26MB

### After chunk #11
Opcache memory used:     515MB / Opcache memory free:     509MB
Opcache strings mem used: 76MB / Opcache strings mem free: 20MB

### After chunk #12
Opcache memory used:     535MB / Opcache memory free:     489MB
Opcache strings mem used: 79MB / Opcache strings mem free: 17MB

### After chunk #13
Opcache memory used:     554MB / Opcache memory free:     470MB
Opcache strings mem used: 84MB / Opcache strings mem free: 12MB

### After chunk #14
Opcache memory used:     574MB / Opcache memory free:     450MB
Opcache strings mem used: 88MB / Opcache strings mem free: 8MB

### After chunk #15
Opcache memory used:     594MB / Opcache memory free:     430MB
Opcache strings mem used: 91MB / Opcache strings mem free: 5MB

### After chunk #16
Opcache memory used:     613MB / Opcache memory free:     411MB
Opcache strings mem used: 96MB / Opcache strings mem free: 454KB

### After chunk #17
Opcache memory used:     637MB / Opcache memory free:     387MB
Opcache strings mem used: 96MB / Opcache strings mem free: 8B

### After chunk #18
Opcache memory used:     663MB / Opcache memory free:     361MB
Opcache strings mem used: 96MB / Opcache strings mem free: 8B

### After chunk #19
Opcache memory used:     687MB / Opcache memory free:     337MB
Opcache strings mem used: 96MB / Opcache strings mem free: 8B

### After chunk #20
Opcache memory used:     710MB / Opcache memory free:     314MB
Opcache strings mem used: 96MB / Opcache strings mem free: 8B

### After chunk #21
Opcache memory used:     736MB / Opcache memory free:     288MB
Opcache strings mem used: 96MB / Opcache strings mem free: 8B

### After chunk #22
Opcache memory used:     759MB / Opcache memory free:     265MB
Opcache strings mem used: 96MB / Opcache strings mem free: 8B

### After chunk #23
Opcache memory used:     786MB / Opcache memory free:     238MB
Opcache strings mem used: 96MB / Opcache strings mem free: 8B

### After chunk #24
Opcache memory used:     814MB / Opcache memory free:     210MB
Opcache strings mem used: 96MB / Opcache strings mem free: 8B

### After chunk #25
Opcache memory used:     839MB / Opcache memory free:     185MB
Opcache strings mem used: 96MB / Opcache strings mem free: 8B

### After chunk #26
Opcache memory used:     865MB / Opcache memory free:     159MB
Opcache strings mem used: 96MB / Opcache strings mem free: 8B

### After chunk #27
Opcache memory used:     887MB / Opcache memory free:     137MB
Opcache strings mem used: 96MB / Opcache strings mem free: 8B

### After chunk #28
Opcache memory used:     909MB / Opcache memory free:     115MB
Opcache strings mem used: 96MB / Opcache strings mem free: 8B

### After chunk #29
Opcache memory used:     935MB / Opcache memory free:     89MB
Opcache strings mem used: 96MB / Opcache strings mem free: 8B

### After chunk #30
Opcache memory used:     967MB / Opcache memory free:     57MB
Opcache strings mem used: 96MB / Opcache strings mem free: 8B

### After chunk #31
Opcache memory used:     991MB / Opcache memory free:     33MB
Opcache strings mem used: 96MB / Opcache strings mem free: 8B

### After chunk #32
Opcache memory used:     1016MB / Opcache memory free:     8MB
Opcache strings mem used: 96MB / Opcache strings mem free: 8B

### After chunk #33
Opcache memory used:     1023MB / Opcache memory free:     649KB
Opcache strings mem used: 96MB / Opcache strings mem free: 8B

### After chunk #34
Opcache memory used:     1023MB / Opcache memory free:     649KB
Opcache strings mem used: 96MB / Opcache strings mem free: 8B

This reached its limit after chunk 33/84. Up to that point, compiling 330 localisation files increased opcache mem by 746M (~2.2M per file), and string mem by 87M (~0.3M per file).

mw2170:/etc/php/7.2/fpm/php.ini (original)
opcache.enable = 1
opcache.interned_strings_buffer = 96
opcache.max_accelerated_files = 24000
opcache.max_wasted_percentage = 10
opcache.memory_consumption = 1024
opcache.revalidate_freq = 10
opcache.validate_timestamps = 1

I've locally bumped these to see how much it would ideally consume:

mw2170:/etc/php/7.2/fpm/php.ini (modified)
opcache.enable = 1
opcache.interned_strings_buffer = 960 # +800M (10x)
opcache.max_accelerated_files = 240000 # (10x)
opcache.max_wasted_percentage = 10
opcache.memory_consumption = 4096 # +3G (4x)
opcache.revalidate_freq = 10
opcache.validate_timestamps = 1
mw2170 (modified)
$ php7adm /opcache-free
{"*":true}
$ curl mw2170.codfw.wmnet/w/krinkle.php -H 'Host: nl.wiktionary.org'

### Initial
Opcache is enabled:     1
Opcache is full:        0
Opcache memory used:    2162MB / Opcache memory free:    1934MB
Opcache strings mem used:  9MB / Opcache strings mem free:  951MB

Found 836 files. Created 84 chunks of 10 files each.

### After chunk #0
Opcache memory used:     2182MB / Opcache memory free:     1914MB
Opcache strings mem used: 21MB / Opcache strings mem free: 939MB

### After chunk #1
Opcache memory used:     2201MB / Opcache memory free:     1895MB
Opcache strings mem used: 26MB / Opcache strings mem free: 934MB

### After chunk #2
Opcache memory used:     2221MB / Opcache memory free:     1875MB
Opcache strings mem used: 32MB / Opcache strings mem free: 928MB

### After chunk #3
Opcache memory used:     2241MB / Opcache memory free:     1855MB
Opcache strings mem used: 40MB / Opcache strings mem free: 920MB



### After chunk #80
Opcache memory used:     3766MB / Opcache memory free:     330MB
Opcache strings mem used: 202MB / Opcache strings mem free: 758MB

### After chunk #81
Opcache memory used:     3786MB / Opcache memory free:     310MB
Opcache strings mem used: 203MB / Opcache strings mem free: 757MB

### After chunk #82
Opcache memory used:     3806MB / Opcache memory free:     290MB
Opcache strings mem used: 203MB / Opcache strings mem free: 757MB

### After chunk #83
Opcache memory used:     3818MB / Opcache memory free:     278MB
Opcache strings mem used: 203MB / Opcache strings mem free: 757MB

This did not reach the limits. After compiling all 836 localisation files opcache mem increased by 1,656M (~2M per file), and string mem by 194M (~0.2M per file).

This seems really excessive, especially if we ever want to run in a containerized environment (where ideally we run multiple, smaller instances of php-fpm) and / or if we want to run a memcached instance on the same server where we're running php.

I'll run some numbers later today but this doesn't look acceptable at first sight.

In T99740#5977889, @Joe wrote:

This seems really excessive, especially if we ever want to run in a containerized environment (where ideally we run multiple […]

Note that I ran it based on two versions of MediaWiki. For the container, we'd only need half, or ~0.8G.
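(Back-of-the-envelope: the 836 files found on the test servers cover two deployed MediaWiki versions, so one version is ~418 per-language files; at ~2 MB of opcache per compiled file that is 418 × 2 MB ≈ 0.8 GB, matching the figure above.)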

In T99740#5977889, @Joe wrote:

This seems really excessive, especially if we ever want to run in a containerized environment (where ideally we run multiple, smaller instances of php-fpm) and / or if we want to run a memcached instance on the same server where we're running php.

I'll run some numbers later today but this doesn't look acceptable at first sight.

The increase in opcache size doesn't tell us everything about overall impact on memory usage. Increasing the string buffer size can actually reduce overall memory usage, because the string buffer is shared. Most of the memory metrics exported by the kernel won't give you the full picture either. If you're worried about memory constraints, try gradually shrinking the total amount of memory available to PHP and see how low you can go before it starts paging heavily.

If this does reduce page serve time, it's going to be a bargain, even at the cost of some additional RAM.

Could this be causing T249018 on beta?

No, there is definitely no relation between the two. The opcache memory usage does not contribute to the memory calculated by php when checking the memory limit.

Is there a way to turn this function on for just a single application server? I would like to run some thorough performance tests with the two options, to understand a bit better what the real performance impact of this is, and to be able to give an opinion on the costs/benefits.

Specifically, I want to run tests on wiki pages and possibly shadowing real traffic.

In T99740#6031681, @Joe wrote:

Is there a way to turn this function on for just a single application server?

Yes, no problem. The cache acts standalone on each server, so there are no cross-server or split-brain concerns here. It's totally fine to switch conditionally by hostname in wmf-config.
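For reference, a sketch of what that host-conditional switch in wmf-config could look like (the hostname check and placement are illustrative; the actual change is the Gerrit patch linked below):

<?php
// wmf-config/CommonSettings.php (sketch): opt a single depooled host into the
// array-based store for benchmarking, while everything else keeps using CDB.
if ( gethostname() === 'mw1407' ) {
	$wgLocalisationCacheConf['storeClass'] = LCStoreStaticArray::class;
} else {
	$wgLocalisationCacheConf['storeClass'] = LCStoreCDB::class;
}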

Change 587299 had a related patch set uploaded (by Krinkle; owner: Krinkle):
[operations/puppet@production] mediawiki: increase php7 opcache capacity on mw1407

https://gerrit.wikimedia.org/r/587299

Change 589674 had a related patch set uploaded (by Krinkle; owner: Krinkle):
[operations/mediawiki-config@master] Enable LCStoreStaticArray on depooled mw1407 for benchmarking

https://gerrit.wikimedia.org/r/589674

Mentioned in SAL (#wikimedia-operations) [2020-04-17T19:32:41Z] <Krinkle> Depool mw1407.eqiad.wmnet for opcache and LCStoreStaticArray testing. – T99740

Change 589674 merged by Krinkle:
[operations/mediawiki-config@master] Enable LCStoreStaticArray on depooled mw1407 for benchmarking

https://gerrit.wikimedia.org/r/589674

In T99740#6031681, @Joe wrote:

Specifically, I want to run tests on wiki pages and possibly shadowing real traffic.

I've depooled mw1407 and enabled LCStoreStaticArray in MediaWiki config there with a host-specific condition (https://gerrit.wikimedia.org/r/589674). I don't want to disable puppet there for several days, so whenever you're ready, apply https://gerrit.wikimedia.org/r/587299 via puppet, or disable puppet and apply it by hand in /etc/php/7.2/fpm/php.ini.

Confirmed via a local request and Logstash:

krinkle@mw1407:~$ curl 'http://mw1407.eqiad.wmnet/w/load.php' -H 'Host: aa.wikipedia.org' -H 'X-Wikimedia-Debug: 1; log'
Logstash
type:mediawiki host:mw1407 message:"LocalisationCache"

* DEBUG | LocalisationCache using store LCStoreStaticArray

.. which for all other hosts logs LocalisationCache using store LCStoreCDB.

Mentioned in SAL (#wikimedia-operations) [2020-04-27T11:20:59Z] <_joe_> restarted php-fpm on mw1407 to pick up enlarged opcache values, T99740

Mentioned in SAL (#wikimedia-operations) [2020-04-27T13:30:11Z] <_joe_> depooled mw1409 as well as mw1407 for further benchmarking, T99740

Mentioned in SAL (#wikimedia-operations) [2020-04-27T13:38:14Z] <_joe_> repooling both mw1407 and mw1409 for tesing T99740

After running a few benchmarks on mw1407 (where LCStoreStaticArray is used) vs mw1409 (which uses CDB files), it seemed the change made little to no difference for the following URLs:

And I didn't find any significant difference in performance, so I decided to take a look at the performance of the server inside the serving pool for ~ 15 minutes. In this case, we see an improvement in performance between 1 and 5%.

Quantile | LCStoreArray (ms) | LCStoreCDB (ms)
p50      | 153               | 160
p75      | 219               | 224
p95      | 495               | 503
p99      | 1680              | 1711

Personally, unless there is an external factor (such as easier maintenance, the ability to run static analysis on the code, etc.), I don't consider this small performance gain enough to justify the changes we'd need to make.

Namely, we would need to ensure that every full sync of the code causes a full fleet restart of php-fpm, as the 3 GB of memory we reserve for php-fpm are not enough to contain the code twice over.

Interestingly, other effects noticed:

  • CPU usage is slightly higher when using LCStoreArray
  • overall Memory usage is higher with LCStoreArray, but not significantly enough to be a worry in our current setup. Every php worker uses ~ 1 GB of memory at startup vs ~ 500k in the normal setup.

I think 5% is huge! As part of T233886 and T189966, I took many work-days to achieve similar gains, and even that is becoming harder and harder without it turning into weeks of multi-person/cross-team dependencies. These kinds of gain will decide how much work it is going to take to achieve certain consistent latencies on the new REST api for example, and also make latencies generally more consistent.

The l10n store win isn't a one-time cost difference; it scales with call frequency (T99740#5929577). This is among the reasons why cold-cache performance varies so wildly: we have many layers of caching on top of the l10n store. LocalisationCache is global per language, then MessageCache is per-wiki per-language (based on hooks and on-wiki overrides), and then there is MessageBlobStore for ResourceLoader on a per-module basis. Until recently MessageBlobStore had a dedicated DB table and a nightly clean-up cron. That might not have been needed as much if base l10n performed better. Yet even with MessageBlobStore, the cache-miss experience is still pretty bad. The first cache miss for any module, in any language, on any wiki (1001 * 419 * 944), can currently take upwards of 5 seconds to compute on-demand. And that's on an unstressed server. For a single JS resource. Not to mention end-user latency, HTML/wikitext processing, CSS... page load performance is lost before it even begins.

Aside from performance, this change also has benefits for the deployment process:

  • Generating the array files is about 4X faster than building the CDB binaries (roughly 1 min instead of 4 min). – T218207#5138303
  • The arrays are about 10% smaller than CDB in terms of uncompressed file size (e.g. 3.7M per file instead of 4.2M). – T99740#4609279
  • When compressed as part of an image, they can presumably be much smaller still.

What we do today:

  • Scap invokes MediaWiki to gather the hundreds of i18n JSON files from core and extensions, and builds a set of large CDB binaries, one per language.
  • Scap converts CDB binaries to JSON files, with MD5 checksum files alongside them.
  • Scap rsyncs the JSON files (diff-only, compressible).
  • Scap instructs each app server to convert the JSON files back to CDB binaries, and verifies their MD5 checksum.

What we'll do instead:

  • Scap invokes MediaWiki to build a set of PHP array files.
  • Scap rsyncs the array files (diff-only, compressible).

I would expect deployments to be faster, with simpler tools. And for container images in the future to therefore be smaller and take less complexity/time to build.

In T99740#6085180, @Joe wrote:
  • overall Memory usage is higher with LCStoreArray, but not significantly enough to be a worry in our current setup. Every php worker uses ~ 1 GB of memory at startup vs ~ 500k in the normal setup.

We might be able to bring this down a bit. The opcache config I staged was optimised for benchmarking latency, not memory. I rounded the number up significantly to make sure it would definitely use opcache and not fallback to re-parsing disk reads. But, I don't know if all of that allocated space is actually needed. See https://gerrit.wikimedia.org/r/587299 and T99740#5977799.

I concur with Timo that this change does seem worthwhile from our perspective.

I think 5% is huge! As part of T233886 and T189966, I took many work-days to achieve similar gains, and even that is becoming harder and harder without it turning into weeks of multi-person/cross-team dependencies. These kinds of gain will decide how much work it is going to take to achieve certain consistent latencies on the new REST api for example, and also make latencies generally more consistent.

Let me note that the difference is well below 2% in most cases, and that it's much smaller than the variations in backend response times we see daily due to a variety of other effects, which can be in the range 5-10% or higher.

Moreover: a simple badly optimized database query easily costs us 20% of performance for hours, in terms of backend response times. And it doesn't require us to radically change our production environment.

I will run more extensive tests so I have more precise results, but in terms of performance evaluation, this gain is barely noticeable over a full day, and well below what we would consider significant.
It also makes the memory usage of the whole process about 3x what it is in normal operating conditions, and raises the CPU usage.

From the point of view of overall backend performance (which is what I'm talking about), this is a third-order optimization, notwithstanding whatever your perception of it is. Also, those figures above are quite unscientific - I thought the result was lackluster enough not to justify further analysis. I'll post more precise numbers, testing over a full day on two servers restarted in the same second.

But this is not even my main reason for worry (coming below).

The l10n store win isn't a one-time cost difference; it scales with call frequency (T99740#5929577). This is among the reasons why cold-cache performance varies so wildly: we have many layers of caching on top of the l10n store. LocalisationCache is global per language, then MessageCache is per-wiki per-language (based on hooks and on-wiki overrides), and then there is MessageBlobStore for ResourceLoader on a per-module basis. Until recently MessageBlobStore had a dedicated DB table and a nightly clean-up cron. That might not have been needed as much if base l10n performed better. Yet even with MessageBlobStore, the cache-miss experience is still pretty bad. The first cache miss for any module, in any language, on any wiki (1001 * 419 * 944), can currently take upwards of 5 seconds to compute on-demand. And that's on an unstressed server. For a single JS resource. Not to mention end-user latency, HTML/wikitext processing, CSS... page load performance is lost before it even begins.

We can't completely revisit how we deploy software for such a tail gain. I find it hard to believe that those 5 seconds are completely due to CDB files. Is that the case?
If so, I'm sure there are ways to keep that time down without polluting a single php-fpm cache with 2.5 GB of additional php data.

Aside from performance, this change also has benefits for the deployment process:

[CUT]

I would expect deployments to be faster, with simpler tools. And for container images in the future to therefore be smaller and take less complexity/time to build.

I don't think the two statements above are correct. And these are my main worries.

For scap deploys, we'll need to perform a full rolling restart of all appservers for every non-sync-file change, as a single train deploy can easily fill up the opcache on a server using 3 GB of opcache.

When I proposed doing a full rolling restart at every release, it was deemed impractical and dangerous, and basically refused by the Release Engineering and Performance teams. Let me note that this might allow us to go back to not validating opcache, which would make deployments much more atomic :)

A single deploy (for the train, but probably for most SWATs too) will require a restart, making it significantly slower than it is now. A full safe rolling restart of our application servers might take up to 5 minutes or more.

As for container images:

  • Why should they be smaller using PHP arrays instead of CDB files? I would expect the opposite to be true (we're talking about compressed file sizes).
  • Having a 3 GB overhead of RAM usage would kill our ability to run much smaller installations of php-fpm in parallel, and force us to run "fat pods", which is decidedly suboptimal - kubernetes doesn't like to have to allocate very large chunks of one server's memory. Also, we'd get fewer available workers per server, because of the 2.5 GB memory overhead.

Basically for every pod we'd need a 3 GB opcache space + 3 GB apcu space *even before* we try to allocate workers. It's a 50% increase (from 4 GB to 6 GB) of the baseline occupied memory.

More generally, php-fpm performs better (by much, much more than 1%) when you can keep its concurrency low - so much so that we've discussed running multiple php-fpm instances with a smaller footprint on a physical appserver even before we move to kubernetes. So increasing the memory footprint of a single daemon seems dangerous.

In T99740#6085180, @Joe wrote:
  • overall Memory usage is higher with LCStoreArray, but not significantly enough to be a worry in our current setup. Every php worker uses ~ 1 GB of memory at startup vs ~ 500k in the normal setup.

We might be able to bring this down a bit. The opcache config I staged was optimised for benchmarking latency, not memory. I rounded the number up significantly to make sure it would definitely use opcache and not fallback to re-parsing disk reads. But, I don't know if all of that allocated space is actually needed. See https://gerrit.wikimedia.org/r/587299 and T99740#5977799.

Also don't forget the 3 GB of opcache memory usage. I'll post more precise numbers in a followup.

If our problem is having a local, highly available cache of this data, we can explore other avenues, like storing it in a local memcached on all servers, which we're thinking of installing for other reasons anyway. On one hand, that would (possibly) make the cache slower, but it would also allow the cache to be shared between php-fpm instances.

Anyways, given the gain seems significant, I'll run more precise tests today.

Mentioned in SAL (#wikimedia-operations) [2020-04-28T07:52:38Z] <_joe_> running benchmarks on mw1407 (LCStoreStaticArray) and mw1409 (LCStoreCDB) for T99740: restart php-fpm, pool for 5 minutes to warmup caches, then depool both servers.

Assuming we'll be ok with restarting php-fpm at every release, I reduced the amount of interned strings memory and opcache allocated on mw1407 from the values in the puppet patch. I am now using 300 MB of interned strings cache and 3.3 GB of opcache space. These figures can probably be reduced further.

I am now running the following tests:

  1. restart the appserver, run traffic through it for 30 minutes, evaluate if there is any significant performance gain over the whole period.
  2. run each test benchmark I listed before, in parallel on both servers (so that we can hope no external factors affect our results), with growing concurrency
  3. Repeat the above tests when reducing further the opcache usage on mw1407 *and* turning off opcache revalidation completely

First, the results of the real traffic test. These are averages over 10 minutes, starting after 20 minutes of having both servers pooled. This is an attempt at smoothing out the effects of very slow queries at higher percentiles, that can be traffic dependent.

Metric            | LCStoreStaticArray | LCStoreCDB | diff
p50 (ms)          | 142                | 149        | -4.7%
p75 (ms)          | 210                | 213        | -1.4%
p95 (ms)          | 462                | 467        | -1.0%
rps               | 97                 | 102        | -4.9%
CPU user (%)      | 22                 | 22         | -
CPU system (%)    | 3                  | 2          | +1%
RSS (MB)*         | 60.7               | 99         | +39%
Shared Mem (GB)** | 9.9                | 7.5        | +25%

* The memory used by php-fpm is measured by running:

$ ps -eo rss,command | awk 'BEGIN {mem=0} {if (/php-fpm/) { mem+=$1 }} END {print mem}'

** The shared memory is calculated by running:

master_pid=$(ps -eo pid,ppid,command | awk '{if (/php-fpm\:/ && $2 == 1) {print $1}}'); sudo pmap $master_pid | awk '{if ($4 == "zero") {c+=$2}} END {print c * 1024}'

My conclusion is that, while clearly slightly faster than LCStoreCDB, LCStoreStaticArray requires way more resources in terms of RAM usage, in particular for the shared memory that reaches almost 10 GB per php-fpm pool.

If we can tailor those numbers down a bit (for instance, by reducing the size of the APC pool, and by programmatically restarting php-fpm at each release, thus removing the need for revalidating the opcache), I think the difference could be reduced to a smaller, more manageable number. I'm still unconvinced the perf gain would make it worthwhile.

Joe removed Joe as the assignee of this task. Apr 28 2020, 1:57 PM

Just to be clearer: we achieved a much larger improvement in the average latency of requests by switching to persistent connections to our session storage:

application-servers-red-dashboard-latency.png (500×1 px, 24 KB)

(this is a picture from a blog post I should write about it).

With this I just mean there are easier, cheaper wins we can obtain if the goal is to improve performance. Frankly, none of the other benefits listed earlier can justify this switch IMO.

I'll report the results for the rest of the benchmarks below but I'm unassigning myself as owner of this task, and I oppose its deployment to production.

Some more data:

Rendering the enwiki Barack Obama page, with concurrency of 25, gives this response time distribution (over 10k requests - so the p99 can be found at 9900 requests):

obama_c25.png (997×1 px, 57 KB)

In this case, we don't see significant differences.

Same thing, for a lighter page (https://it.wikipedia.org/wiki/Nemico_pubblico_(film_1998))

nemico_pubblico_c25.png (997×1 px, 49 KB)

In this case, we see a bit more of a difference, still well below 1% at the p99.

Finally, what happens when we try to load a resource via load.php, with concurrency of 40:

load_c40.png (997×1 px, 29 KB)

as you can see, differences are negligible here too.

One rather random note, and feel free to correct me if I'm wrong: articles themselves are not heavy users of l10n (except on multilingual projects like Commons/Wikidata). Checking special pages or action=history might yield a different result.

This is a valid point! I'll repeat the test on a special page. I focused on things that get requested the most (articles and load.php), but it makes sense we also try with a special page.

@Joe, I appreciate the effort you put into evaluating this change! If you have the patience to put up with some more annoying kibitzing from me, I have a few questions :)

Hammering one or two pages may not be representative. The performance test should force MediaWiki to look up entries in the l10n cache at the same rate as production, and that may not happen if every request is a parser cache hit. (Timo, please correct me if I'm wrong.) It's the traffic test we ought to pay attention to.

In T99740#6088175, @Joe wrote:
Metric    | LCStoreStaticArray | LCStoreCDB | diff
RSS (MB)* | 60.7               | 99         | +39%

Shouldn't this be -39%?

My conclusion is that, while clearly slightly faster than LCStoreCDB, LCStoreStaticArray requires way more resources in terms of RAM usage, in particular for the shared memory that reaches almost 10 GB per php-fpm pool.

A 5% improvement in p50 page load time looks pretty significant to me.

If we can tailor those numbers down a bit (for instance, by reducing the size of the APC pool, and by programmatically restarting php-fpm at each release, thus removing the need for revalidating the opcache), I think the difference could be reduced to a smaller, more manageable number. I'm still unconvinced the perf gain would make it worthwhile.

How do you weigh RAM vs. performance? How much RAM headroom do app servers have currently?

In T99740#6089095, @Joe wrote:

Just to be clearer: we achieved a much larger improvement in the average latency of requests by switching to persistent connections to our session storage:

application-servers-red-dashboard-latency.png (500×1 px, 24 KB)

I can't tell the effect size from this graph, and I'm not sure what point you're making. It's generally the case with performance tuning that over time you pay more for diminishing improvements, no? The opportunity cost for LCStoreStaticArray should be evaluated relative to other unrealized opportunities the Foundation could be pursuing with the same resources. If there are lower-hanging fruit, what are they?

This is a valid point! I'll repeat the test on a special page. I focused on things that get requested the most (articles and load.php), but it makes sense we also try with a special page.

Special pages with many messages, apart from Special:AllMessages of course, might be Special:Tags, Special:Gadgets and Special:Version. Pages with few messages repeated many times could be Special:SiteMatrix or any query page with high limit.

On Special:Log and the like, there can be higher variance due to things like querying for user preferences (grammatical gender especially), if I remember well from some work Tim did on it years ago.

In T99740#6089198, @Joe wrote:
  • Rendering the enwiki Barack Obama page, with concurrency of 25
  • load a resource via load.php, with concurrency of 40

I haven't confirmed it, but I suspect most interface messages used during page views are behind other layers of caching. The skin sidebar and wikitext parser both involve a good number of messages, but both have dedicated caches. The higher percentiles don't resemble the cache-miss scenarios per se either, as these may've been warmed up in Memcached prior to the benchmark; but even if not, there is enough variance in these URLs from other factors that the outliers are likely extremes from other code paths, not LCStore.

ResourceLoader is possibly the largest consumer of interface messages (in terms of how many it fetches per http request, given it has to bundle them upfront). It has a dedicated cache and uses it for all interface messages fetched from its code paths (MessageBlobStore).

An isolated bench on the LCStore calls was already done at T99740#5929577 (see the far end of that comment), and in other comments and on other tasks. But if we want to do this with higher concurrency and in production, I'd recommend patching MW locally to disable part of Memcached.

For example, the most frequently used URL for ResourceLoader is https://en.wikipedia.org/w/load.php?modules=startup&lang=en&skin=vector&only=scripts. This is requested from every pageview in production, and requires thousands of messages (every message of every module, to determine each module URLs's version key). Below is from mwdebug1001 with an opcache patch applied, for the load.php?startup url, with 1 warmup, and 3 runs:

Scenario                              | backend-timing
MBS cache miss, l10n-cdb [status quo] | D=1,233,018 µs, D=1,024,391 µs, D=1,022,317 µs (Grafana)
MBS cache miss, l10n-array            | D=1,005,511 µs, D=880,391 µs, D=795,582 µs

This roughly 20% reduction is representative of the traffic we get after a new branch or other major deployment, where we get these requests for each wiki/language/skin/platform combination to backfill caches (mobile/desktop * wikis * languages * skins = 2*940*310*5 = 2.9M).

@ php-1.35.0-wmf.28/includes/resourceloader/MessageBlobStore.php
- $result = $cache->getMulti( array_values( $cacheKeys ), $curTTLs, $checkKeys );
+ $result = []; # $cache->getMulti( array_values( $cacheKeys ), $curTTLs, $checkKeys );

@ wmf-config/CommonSettings.php
- $wgLocalisationCacheConf['storeClass'] = LCStoreCDB::class;
+ $wgLocalisationCacheConf['storeClass'] = LCStoreStaticArray::class;

@Joe I admit I don't have a good understanding of the RAM cost and overhead we have. It would help to better quantify the budget/buffer/cost here.

At the risk of bringing more bad news: the memory limit for MW web requests is currently 600M. This is set to 1.4G on Parsoid servers, and I believe the current expectation is that we'd need to apply this to other app servers before Parsoid can be used within MW. Would that be similarly concerning? Or less concerning (given it's per-request)?

Mentioned in SAL (#wikimedia-operations) [2020-05-06T08:02:46Z] <_joe_> restarted php-fpm with tweaked parameters on mw1407, now briefly pooling for traffic (T99740)

In T99740#6100595, @ori wrote:

@Joe, I appreciate the effort you put into evaluating this change! If you have the patience to put up with some more annoying kibitzing from me, I have a few questions :)

Hammering one or two pages may not be representative. The performance test should force MediaWiki to look up entries in the l10n cache at the same rate as production, and that may not happen if every request is a parser cache hit. (Timo, please correct me if I'm wrong.) It's the traffic test we ought to pay attention to.

That's why I also ran a test of re-parsing, where I got no big difference either. But I do agree, which is why I also ran some tests with actual production traffic.

In T99740#6088175, @Joe wrote:
Metric    | LCStoreStaticArray | LCStoreCDB | diff
RSS (MB)* | 60.7               | 99         | +39%

Shouldn't this be -39%?

No, I twisted my fingers; it's using more memory, but it's not really relevant given the actual numbers.

My conclusion is that, while clearly slightly faster than LCStoreCDB, LCStoreStaticArray requires way more resources in terms of RAM usage, in particular for the shared memory that reaches almost 10 GB per php-fpm pool.

A 5% improvement in p50 page load time looks pretty significant to me.

I don't think it is, given we have quite a few ongoing issues that cause the latency to spike up by more than 20% - for example, still unexplained surges in memcache request rate - for which we have no instrumentation nor - clearly - time for investigation.

See this, for instance, on mw1407 during a period when we had to re-pool it because of a network outage that forced us to depool a whole rack of appservers:

mw1407-latency.png (795×1 px, 89 KB)

The mean latency had several spikes of 10-25%, and a sustained plateau, corresponding to spikes in memcached request rates:

mw1407-memcached.png (779×1 px, 87 KB)

So: I think there are other areas we should focus on first. I would agree that in a vacuum, where we've repaired all of our larger culprits, this would be a good perf gain, even if I would still be doubtful about its costs.

If we can tailor those numbers down a bit (for instance, by reducing the size of the APC pool, and by programmatically restarting php-fpm at each release, thus removing the need for revalidating the opcache), I think the difference could be reduced to a smaller, more manageable number. I'm still unconvinced the perf gain would make it worthwhile.

How do you weigh RAM vs. performance? How much RAM headroom do app servers have currently?

One thing all of my current and previous benchmarking has shown consistently is that php-fpm's performance degrades under higher concurrency, no matter the amount of CPU / RAM / disk we throw at the problem. There are fundamental bottlenecks in php-fpm that are less severe when it's serving 50 req/s instead of 150. Being able to run 3 separate php-fpm instances on a single physical server would improve both our scalability and our latencies.

So, it's not just about where they are currently, it's where we want to get. Having such a huge memory requirement (basically, 8-10 GB for apc/lcstaticarray, plus the N GB to serve requests) would practically kill our ability to run mediawiki within kubernetes, or even to just run multiple instances per machine if we choose not to get there for some reason.

I have some ideas on how to overcome this - basically using a shared cache instead of a per-instance one, see also T244340 and T248005 - but that raises the question: is that faster than using CDB when accessing the data itself?

Anyways, this can be worked on further. Sadly, I have other priorities at the moment - but I'm happy to come back to the discussion once I have time for it again.

Basically, my precondition for seeing this in production right now would be:

  • Stop revalidating opcache (which seems a good idea given the occasional corruption we see anyway)
  • Rolling restart php-fpm with every scap run (this is currently supported in scap, but needs to be tested)
  • Set opcache to be what can contain one train release, not multiple ones like we do today
  • Make checks *pre-deploy* to ensure we don't get over said limit.

For instance, I was able to pool mw1407 with the following configuration:

opcache.validate_timestamps = 0
opcache.interned_strings_buffer = 300
opcache.max_accelerated_files = 15000
opcache.max_wasted_percentage = 10
opcache.memory_consumption = 2000

This would mean a 1 GB increase in our current opcache memory usage, and would not impair any of our plans for containerization either, as we can probably compensate for that by reducing the size of APCu anyway.

Last time I proposed not revalidating the opcache and just roll-restarting php-fpm everywhere, there was some resistance from Release Engineering folks - but I see that as the best way forward if you want to see this go to production.
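A minimal sketch of the kind of pre-deploy headroom check mentioned in the preconditions above (purely illustrative; the 1 GB per-release budget, the 10% wasted threshold, and the exit-code convention are assumptions, not an existing scap feature):

<?php
// Sketch: abort a sync if this server's opcache no longer has room for another
// compiled copy of the code, signalling that php-fpm should be restarted first.
$status = opcache_get_status( false );
$mem = $status['memory_usage'];

$budget = 1024 * 1024 * 1024; // assumed size of one compiled train release (~1 GB)

if ( $mem['free_memory'] < $budget || $mem['current_wasted_percentage'] > 10 ) {
	fwrite( STDERR, "opcache headroom too low; restart php-fpm before syncing\n" );
	exit( 1 );
}
echo "opcache headroom OK\n";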

Change 592867 had a related patch set uploaded (by Giuseppe Lavagetto; owner: Giuseppe Lavagetto):
[operations/mediawiki-config@master] Revert "Enable LCStoreStaticArray on depooled mw1407 for benchmarking"

https://gerrit.wikimedia.org/r/592867

Change 592867 merged by jenkins-bot:
[operations/mediawiki-config@master] Revert "Enable LCStoreStaticArray on depooled mw1407 for benchmarking"

https://gerrit.wikimedia.org/r/592867

Mentioned in SAL (#wikimedia-operations) [2020-05-06T08:43:54Z] <oblivian@deploy1001> Synchronized wmf-config/CommonSettings.php: Reverting change on mw1407 T99740 (duration: 01m 16s)

Change 630592 had a related patch set uploaded (by Thcipriani; owner: Thcipriani):
[mediawiki/tools/scap@master] Feature flag PHP L10n generation

https://gerrit.wikimedia.org/r/630592

Change 630592 merged by jenkins-bot:
[mediawiki/tools/scap@master] Feature flag PHP L10n generation

https://gerrit.wikimedia.org/r/630592

Change 651228 had a related patch set uploaded (by Ahmon Dancy; owner: Ahmon Dancy):
[operations/mediawiki-config@master] Disable PHP L10n in beta cluster

https://gerrit.wikimedia.org/r/651228

Change 651228 merged by jenkins-bot:
[operations/mediawiki-config@master] Disable PHP L10n in beta cluster

https://gerrit.wikimedia.org/r/651228

Krinkle changed the task status from Stalled to Open. Jul 28 2022, 4:05 AM
Krinkle removed Krinkle as the assignee of this task.

No longer stalled as T266055 is now resolved for prod. Unassigning for now until it comes around as a scheduled goal.

Change 883707 had a related patch set uploaded (by Ladsgroup; author: Ladsgroup):

[operations/mediawiki-config@master] Revert "Disable PHP L10n in beta cluster"

https://gerrit.wikimedia.org/r/883707

This is somewhat important for mw-on-k8s, so even if the perf gains are small, it's still worth getting done.

This is somewhat important for mw-on-k8s, so even if the perf gains are small, it's still worth getting done.

Can you expand on what makes this important for mw-on-k8s?

I can't find it now, but I remember serviceops mentioning this in one of their slides as one of the mw-on-k8s challenges, because CDB files are too big for images and harder to maintain and build. Can't find the slide though :/