Define variant Wikimedia production config in compiled, static files
Open, Stalled, Medium, Public

Description

Arising from James's previous musing, and discussions at the 2019 Hackathon.

What

  • InitialiseSettings.php (and much of CommonSettings.php) is replaced with per-wiki inheritable YAML files (to allow comments).
  • Actually-variant config goes into a much slimmer CommonSettings.php (or re-worked to not vary).
  • On merge, the YAML files are converted into one JSON file per wiki, for the currently-deployed version(s) of MW, which are stored in git.
  • This replaces the opportunistic cache in /tmp that we currently have.

Inheritance tree:

allwikis.yaml
| Default values for all wikis (e.g. wgNamespacesWithSubpages which is over-ridden, or wgEnableCanonicalServerLink which isn't)
|
+- wikipedias.yaml
   | Standard values for Wikipedias, where they differ from defaults (e.g. wgSitename or the fallback logo) and special inheritances
   |
   +- dewiki.yaml 
        Bespoke values for the German Wikipedia (e.g. the logo, or FlaggedRevisions configuration) and other special inheritances
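
A minimal sketch of how such an inheritance chain might be resolved, assuming each file names a single parent. The `inherits` key, the dict layout and the logo path are hypothetical stand-ins for the parsed YAML; the real syntax is not specified here. Values are replaced wholesale rather than deep-merged, which is also an assumption.

```python
# Hypothetical in-memory stand-ins for the parsed YAML files. The
# "inherits" key and all values are illustrative, not the real syntax.
FILES = {
    "allwikis": {
        "inherits": None,
        "settings": {
            "wgNamespacesWithSubpages": {"2": True},
            "wgEnableCanonicalServerLink": True,
        },
    },
    "wikipedias": {
        "inherits": "allwikis",
        "settings": {"wgSitename": "Wikipedia"},
    },
    "dewiki": {
        "inherits": "wikipedias",
        "settings": {"wgLogo": "/example/dewiki-logo.png"},
    },
}


def resolve(name, files=FILES):
    """Return the effective settings for `name`, applying parents first."""
    chain = []
    seen = set()
    while name is not None:
        if name in seen:
            raise ValueError(f"inheritance cycle at {name!r}")
        seen.add(name)
        chain.append(name)
        name = files[name]["inherits"]

    merged = {}
    for layer in reversed(chain):  # root first, most specific file last
        merged.update(files[layer]["settings"])  # shallow: later layers win
    return merged


dewiki = resolve("dewiki")
```

Here `dewiki` ends up with the default from allwikis, the sitename from wikipedias and its own logo, without dewiki.yaml having to repeat inherited values.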

Comparison:

Task               | Current situation           | Future state
Config authored in | InitialiseSettings.php      | wikipedias.yaml etc.
Config build step  | Runtime cache, in /tmp/     | Build time static file, in /srv/mediawiki/
mw-config merge    | Trivial rebase              | Full production build of one JSON static file per wiki
Config read step   | From cache or computed live | Always read from built static file
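
The "future state" column can be sketched roughly as below: a build step that writes one JSON file per wiki, and a read step that only ever loads the built file. The function names and the `<dbname>.json` file naming are assumptions for illustration, and a temporary directory stands in for /srv/mediawiki/.

```python
import json
import tempfile
from pathlib import Path


def build_static_files(per_wiki_config, out_dir):
    """Build step: write one JSON file per wiki (at merge time, not per request)."""
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    for dbname, settings in per_wiki_config.items():
        text = json.dumps(settings, indent=2, sort_keys=True) + "\n"
        (out_dir / f"{dbname}.json").write_text(text, encoding="utf-8")


def read_static_config(dbname, out_dir):
    """Read step: load the built file; there is no live-computation fallback."""
    path = Path(out_dir) / f"{dbname}.json"
    if not path.is_file():
        raise FileNotFoundError(f"no built config for {dbname!r}; was the build run?")
    return json.loads(path.read_text(encoding="utf-8"))


# Usage: a temporary directory stands in for the real deployment path.
with tempfile.TemporaryDirectory() as tmp:
    build_static_files({"dewiki": {"wgSitename": "Wikipedia"}}, tmp)
    loaded = read_static_config("dewiki", tmp)
```

Failing loudly when the file is missing, rather than recomputing, is what distinguishes this from the current opportunistic cache.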

Pros

  • Variant configuration will be static, making it more plausible to inject into docker images.
  • It will be much clearer exactly which wikis' config is changing, so deployers have more confidence.
  • YAML configuration files explicitly set the inheritance pattern.
  • Easier to compare one wiki's config with another's (e.g. "how different is dewiki from frwiki?").
  • Clear when the rump of CommonSettings refers to undefined variables; variant config forced to be merged first.
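
As an illustration of the comparison use case ("how different is dewiki from frwiki?"), a small sketch that diffs two already-compiled per-wiki configs. The function name and the sample values are assumptions, not real production config.

```python
def diff_config(a, b):
    """Return {setting: (value_in_a, value_in_b)} for every setting that differs."""
    missing = object()  # sentinel for "not set", distinct from None or False
    changed = {}
    for key in sorted(set(a) | set(b)):
        left, right = a.get(key, missing), b.get(key, missing)
        if left != right:
            changed[key] = (
                "<unset>" if left is missing else left,
                "<unset>" if right is missing else right,
            )
    return changed


# Illustrative values only.
dewiki = {"wgSitename": "Wikipedia", "wgLanguageCode": "de", "wgFlaggedRevsOverride": True}
frwiki = {"wgSitename": "Wikipédia", "wgLanguageCode": "fr"}
delta = diff_config(dewiki, frwiki)
```

The same function could be pointed at the before and after builds of a single wiki, which is the question a deployer needs answered when reviewing a change.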

Cons

  • Merging is harder (and slower?).
  • Harder to audit all wikis' config for settings that "shouldn't" be over-ridden, or see how values vary.
  • Production branch pruning, currently just a disc operation and a sync, now needs a commit to mw-config as well as a deploy to delete.
  • First time we'd be reading YAML files in PHP in production. (Not an issue after all: we're not reading them in prod, only in CI.)

Former questions

  • Deterministic sort of output files to avoid noise.
    • Assuming that an alphabetical sort of the array by keys (ksort) is sufficient.
  • How do we do splicing in private settings at run time?
    • Private settings are already spliced in in CommonSettings; no change.
  • Syntax for specifying config, and that a document inherits from another.
    • Roughly worked out; to be documented.
  • Syntax for specifying that descendant config can't over-ride (e.g. wgMiserMode)
    • For now, this is just a simple all.yaml file that is re-applied at the end and so can't be over-ridden.
  • Do we need to vary on the PHP run-time still? (once HHVM is un-deployed can this go away, or are there reasons beyond PHP serialisation format that we think this might vary?)
    • No. Nothing has been variant between HHVM and Zend for a while. No reason to continue.
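
Two of the answers above can be sketched together: re-applying a "locked" layer last so descendants cannot over-ride it, and serialising with sorted keys so that rebuilding unchanged config produces identical output. The function names are hypothetical. Note that Python's `sort_keys` also sorts nested objects, which goes further than a top-level ksort.

```python
import json


def apply_locked(settings, locked):
    """Re-apply non-overridable values last so no descendant file can change them."""
    final = dict(settings)
    final.update(locked)
    return final


def serialise(settings):
    """Serialise with sorted keys so key order in the input never causes diff noise."""
    return json.dumps(settings, sort_keys=True, indent=2, ensure_ascii=False) + "\n"


locked = {"wgMiserMode": True}

# The same wiki config authored in two different key orders, both of which
# try (and fail) to switch off a locked setting.
a = apply_locked({"wgSitename": "Wikipedia", "wgMiserMode": False}, locked)
b = apply_locked({"wgMiserMode": False, "wgSitename": "Wikipedia"}, locked)
```

Both inputs serialise to the same text, so committing the built files would only show a diff when a value really changes.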

Open questions

  • How does the CI work for this?
  • Do we need to check on build time that a vanilla MediaWiki install (i.e., DefaultSettings) doesn't set any config that isn't represented in all.yaml?
  • What do we do about variant non-static config?
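
If the build-time check in the second question were wanted, it might amount to a set difference between the names a vanilla install defines and the names in the base YAML layer. This is only a sketch: the function name and sample data are assumptions, and extracting the real names from DefaultSettings is out of scope here.

```python
def unrepresented_settings(default_settings, base_layer_settings):
    """Return names set by a vanilla install but absent from the base YAML layer."""
    return sorted(set(default_settings) - set(base_layer_settings))


# Illustrative data only.
defaults = {"wgSitename": "MediaWiki", "wgMiserMode": False, "wgEnableUploads": False}
base_layer = {"wgSitename": "Wikipedia", "wgMiserMode": True}
missing = unrepresented_settings(defaults, base_layer)
```

A CI job could fail, or merely warn, when the returned list is non-empty.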

Planned steps

Details

Subject | Repo                        | Branch     | Lines +/-
        | operations/mediawiki-config | master     | +29 -2
        | operations/mediawiki-config | master     | +2 -6
        | operations/mediawiki-config | master     | +9 K -64
        | operations/mediawiki-config | master     | +24 -6
        | operations/mediawiki-config | master     | +0 -2
        | operations/mediawiki-config | master     | +4 -8
        | integration/config          | master     | +6 -0
        | integration/config          | master     | +1 -0
        | operations/mediawiki-config | master     | +123 -71
        | operations/mediawiki-config | master     | +3 -22
        | operations/mediawiki-config | master     | +406 -1 K
        | operations/puppet           | production | +20 -2
        | operations/mediawiki-config | master     | +28 -28
        | operations/mediawiki-config | master     | +0 -1
        | operations/mediawiki-config | master     | +3 -46
        | operations/mediawiki-config | master     | +3 -9
        | operations/mediawiki-config | master     | +14 -14
        | operations/mediawiki-config | master     | +39 -3
        | operations/mediawiki-config | master     | +12 -12
        | operations/mediawiki-config | master     | +4 -7
        | operations/mediawiki-config | master     | +40 -1
        | operations/mediawiki-config | master     | +92 -77

Event Timeline

Review of the overall approach to using YAML, including its performance restrictions and flexibilities (e.g. what should definitely be cached, and what would be fine to do at run-time), is now pencilled in for Q3, maybe Q2. We're not expecting to involve TechCom or CPT right now. Depending on how ambitious we want to be, it might make sense to involve one or both at some point, but for now we're hoping to keep this isolated enough not to be cross-cutting.

Change 538129 had a related patch set uploaded (by Jforrester; owner: Jforrester):
[operations/mediawiki-config@master] Variant configuration: Allow for YAML-based inheritance of configuration

https://gerrit.wikimedia.org/r/538129

Change 545411 had a related patch set uploaded (by Jforrester; owner: Jforrester):
[operations/mediawiki-config@master] Variant configuration: Generate dblists from YAML

https://gerrit.wikimedia.org/r/545411

Hey @Jdforrester-WMF I'm looking around at the last patchset and not really understanding where the db lists will be and how the yaml will be used to generate the *dblist files. When the dust has all settled, where would a script go to look for the list of, say, closed dbs? (Assuming not a MediaWiki script and not even in php.) I ask because I'll need to update the dump scripts and other related tools if this is changing. Thanks!

Change 547283 had a related patch set uploaded (by Jcrespo; owner: Jcrespo):
[operations/puppet@production] check_private_data: ignore comments on private.dblist

https://gerrit.wikimedia.org/r/547283

Change 547283 merged by Jcrespo:
[operations/puppet@production] check_private_data: ignore comments on private.dblist

https://gerrit.wikimedia.org/r/547283

Change 538129 merged by jenkins-bot:
[operations/mediawiki-config@master] Variant configuration: Allow for YAML-based inheritance of configuration

https://gerrit.wikimedia.org/r/538129

Change 545411 merged by jenkins-bot:
[operations/mediawiki-config@master] Variant configuration: Generate dblists from YAML

https://gerrit.wikimedia.org/r/545411

Mentioned in SAL (#wikimedia-operations) [2019-11-26T20:33:13Z] <jforrester@deploy1001> Synchronized dblists/: Update dblists, now autogenerated (no-op, just comment changes) T223602 (duration: 01m 01s)

Change 553220 had a related patch set uploaded (by Krinkle; owner: Krinkle):
[operations/mediawiki-config@master] tests: Remove obsolete logic for "-computed" dblists

https://gerrit.wikimedia.org/r/553220

Change 553220 merged by jenkins-bot:
[operations/mediawiki-config@master] tests: Remove obsolete logic for "-computed" dblists

https://gerrit.wikimedia.org/r/553220

Change 554941 had a related patch set uploaded (by Jforrester; owner: Jforrester):
[integration/config@master] jjb: Provide operations-mw-config-php72-composer-diffConfig-docker

https://gerrit.wikimedia.org/r/554941

Change 507729 merged by jenkins-bot:
[operations/mediawiki-config@master] Variant configuration: Pre-calculate config for each wiki on demand

https://gerrit.wikimedia.org/r/507729

Change 554951 had a related patch set uploaded (by Jforrester; owner: Jforrester):
[integration/config@master] layout: [mediawiki-operations] Provide a non-voting config diff job

https://gerrit.wikimedia.org/r/554951

Change 554941 merged by jenkins-bot:
[integration/config@master] jjb: Provide operations-mw-config-php72-composer-diffConfig-docker

https://gerrit.wikimedia.org/r/554941

Change 554951 merged by jenkins-bot:
[integration/config@master] layout: [mediawiki-operations] Provide a non-voting config diff job

https://gerrit.wikimedia.org/r/554951

Jdforrester-WMF changed the task status from Open to Stalled.Jan 10 2020, 6:34 PM

Stalled for the next month or two.

Change 576060 had a related patch set uploaded (by Krinkle; owner: Krinkle):
[operations/mediawiki-config@master] MWConfigCacheGenerator: Remove unused 'docRoot' wgConf placeholder variable

https://gerrit.wikimedia.org/r/576060

Change 576060 merged by jenkins-bot:
[operations/mediawiki-config@master] MWConfigCacheGenerator: Remove unused 'docRoot' wgConf placeholder variable

https://gerrit.wikimedia.org/r/576060

Change 577037 had a related patch set uploaded (by Krinkle; owner: Krinkle):
[operations/mediawiki-config@master] multiversion: Make buildDBLists.php both create and delete dblist files

https://gerrit.wikimedia.org/r/577037

Change 577040 had a related patch set uploaded (by Krinkle; owner: Krinkle):
[operations/mediawiki-config@master] dblists: Remove 'labtestwiki' dblist containing 'labtestwiki'

https://gerrit.wikimedia.org/r/577040

Change 577040 merged by jenkins-bot:
[operations/mediawiki-config@master] dblists: Remove 'labtestwiki' dblist containing 'labtestwiki'

https://gerrit.wikimedia.org/r/577040

Change 577037 merged by jenkins-bot:
[operations/mediawiki-config@master] multiversion: Make buildDBLists.php both create and delete dblist files

https://gerrit.wikimedia.org/r/577037

What is this blocked on? Just needs to be done?

This would indeed be quite important for mediawiki on kubernetes. If we moved the yaml files to a separate repository that we import into mediawiki-config, it would be very easy to skip the build step for small configuration changes, allowing us to reduce the time it would take to make a release.

Is anyone going to work on this in the foreseeable future?

Where has this proposal been all my life? I'd like to help, but it's unclear what the next steps are. I think @Ladsgroup's T223602#5235186 is right: the migration to mostly declarative configuration needs to be incremental. The brilliant buildConfigCache.php by @Jdforrester-WMF is already producing the deterministic dumps we need to safely refactor PHP logic, but it looks like the data is mostly unused for the moment. I'm unclear about how "diffChange" can be used in CI: we can't raise an alarm every time the configuration is changed, so it would only be applicable to pure refactors. It's great to look at, of course; starting today I plan to run this diff job for myself locally when deploying.

Is the config-cache committed anywhere? Can I check it out as a submodule, as mentioned earlier? I wouldn't mind something like the extension-Popups arrangement, where the compiled files must be committed along with any config changes. The only obstacle is that the contents are 116 MB, which would at least double the size of this repo if we committed the data directly. Or, if we don't commit it, maybe we should create more tools and documentation to leverage this resource?

Change 737190 had a related patch set uploaded (by Awight; author: Awight):

[operations/mediawiki-config@master] [WIP] Write static portion of the config to files

https://gerrit.wikimedia.org/r/737190

^ I've added a patch for discussion. Once these files are read during startup, we can slowly remove the redundant statements from InitialiseSettings.php.

Change 737197 had a related patch set uploaded (by Awight; author: Awight):

[operations/mediawiki-config@master] [DNM] Migrate $wgDisableQueryPages to static config

https://gerrit.wikimedia.org/r/737197