Page MenuHomePhabricator

Audit and simplify MediaWiki initialisation code (Spring 2018)
Open, NormalPublic

Description

This is a tracking task for various miscellaneous tasks relating to MediaWiki PHP set-up code. It will be resolved when the sub tasks added during Spring 2018, are resolved.

Goals:

  • Reduce maintenance overhead by addressing Technical-Debt.
  • Make the code easier to contribute to by simplifying it and improving its documentation.
  • Improve startup performance, especially for stateless requests (api.php, load.php) and read-only routes.

Sub tasks:

  • Understand current initialisation logic.
  • Understand current initialisation requirements.
  • Understand current initialisation performance.
  • Improve and simplify initialisation performance in a way that continues to satisfy current requirements.

Current code (Feb 2018)

From: /w/{entrypoint}.php:

  • include PHPVersionCheck
    • PHPVersionCheck::checkRequiredPHPVersion()
    • PHPVersionCheck::checkVendorExistence()
    • PHPVersionCheck::checkExtensionExistence()
  • include WebStart
    1. WebStart: assert php-env mbstring.func_overload
    2. WebStart: emit header X-Content-Type-Options: nosniff
    3. WebStart: global set $wgRequestTime
    4. WebStart: global unset $IP
    5. WebStart: define MEDIAWIKI
    6. WebStart: global set $IP
    7. WebStart: include Setup
      1. Setup: global set $wgProfiler
      2. Setup: include StartProfiler (conditional)
      3. Setup: include AutoLoader
      4. Setup: include Defines
      5. Setup: include DefaultSettings
      6. Setup: include GlobalFunctions
      7. Setup: include vendor/autoload (conditional)
      8. Setup: assert vendor
      9. Setup: php-env add HeaderCallback
      10. Setup: include LocalSettings
    8. WebStart: include OutputHandler (conditional)
    9. WebStart: register ob_start wfOutputHandler (conditional)
      1. OutputHandler: handle ValidateAllHtml (conditional)
      2. OutputHandler: handle compression (conditional)
      3. OutputHandler: emit header Content-Length (conditional)
    10. WebStart: include Setup [continued]
      1. Setup: process Profiler (implicit)
      2. Setup: process ExtensionRegistry
      3. Setup: php-env ensure mb_internal_encoding( .. )
      4. Setup: php-env ensure LC_ALL=..
      5. Setup: process mw-config phase1/2 (DefaultSettings, LocalSettings, shortcuts, derivatives, deprecations, normalising)
      6. Setup: process MWDebug
      7. Setup: process MediaWikiServices
      8. Setup: define MW_SERVICE_BOOTSTRAP_COMPLETE
      9. Setup: php-env add MWExceptionHandler
      10. Setup: include UtfNormalUtil
      11. Setup: Setup: assert validity of $wgArticlePath
      12. Setup: process mw-config phase2/2 (more defaults, shortcuts, normalising)
      13. Setup: php-env ensure wfMemoryLimit
      14. Setup: process singletons phase1/3 (wgMemc, messageMemc, parserMemc)
      15. Setup: process hooks for SetupAfterCache
      16. Setup: process singletons phase2/3 (wgContLang, wgAuth)
      17. Setup: php-env add MediaWiki\Session
      18. Setup: process singletons phase3/3 (wgUser, wgLang, wgOut, wgParser, wgTitle)
      19. Setup: process hooks for wgExtensionFunctions
      20. Setup: process Pingback
Issues

1: WebStart: assert php-env mbstring.func_overload

Unsure why this is in WebStart instead of Setup. Is it fine to apply this to CLI?

3: WebStart: global set $wgRequestTime

Deprecated since MediaWiki 1.25.

4: WebStart: global unset $IP

This sanity unset() seems redundant. It's unconditionally re-assigned two lines down. Can it be removed?

7A: Setup: global set $wgProfiler
7B: Setup: include StartProfiler (conditional)

Seems like this could be folded into DefaultSettings.php, with the StartProfiler include moved down a few lines. Should be safe given we don't reference it after DefaultSettings and LocalSettings load.

8: WebStart: include OutputHandler (conditional)

This (conditionally) includes a file that creates additional global functions. Should be safe to merge with GlobalFunctions and just load always. Alternatively, given this is after the autoloader, perhaps better as a class. That way also has the benefit of 1) keeping the private functions really private and 2) still only loading the file when needed.

9A: OutputHandler: handle ValidateAllHtml (conditional)

See also T191670.

10J: Setup: include UtfNormalUtil

Removed in https://gerrit.wikimedia.org/r/435207 / 88ea69f2f7.

10K: Setup: assert validity of $wgArticlePath

Seems like this belongs to the "mw-config process" section that ended a few lines earlier.

  • Patch: TODO

10L: Setup: process mw-config phase2 (more defaults, shortcuts, normalising)

Unsure why this is separate from the earlier "mw-config process" section.

  • Patch: TODO

Misc ideas and themes

(Early draft with ideas for different directions we could take.)

  • Reduce number of manually included files.
  • Reduce amount of commonly executed code:
    • Re-evaluate the way we normalise global configuration variables in Setup.php. We currently do it unconditionally because we want convenient access to them without needing to re-normalise at each call site. However, we are actively reducing use of global variables, typically using them (or the Config proxy) only in ServiceWiring.php and/or the constructor of a service class. Such class seems like a better place to perform normalisation than Setup.php. Benefits of doing so:
      1. Still written and run from one place (just in the relevant class instead of Setup.php)
      2. No longer unconditional on all requests.
      3. Testable - the code for supporting older configuration formats can be covered with unit tests.
      4. Maintained in context of the associated feature.
    • Continue deprecating (some) global functions in favour auto-loaded class methods.
    • Maybe start deprecating some constants in favour of auto-loaded class constants.

Related Objects

Event Timeline

There are a very large number of changes, so older changes are hidden. Show Older Changes

Change 427272 had a related patch set uploaded (by Krinkle; owner: Krinkle):
[mediawiki/core@master] Setup: Move mbstring.func_overload from WebStart to Setup

https://gerrit.wikimedia.org/r/427272

Change 427270 merged by jenkins-bot:
[mediawiki/core@REL1_31] WebStart: Remove redundant unset() for $IP

https://gerrit.wikimedia.org/r/427270

Krinkle updated the task description. (Show Details)Apr 17 2018, 10:11 PM

Change 427274 had a related patch set uploaded (by Krinkle; owner: Krinkle):
[mediawiki/core@master] WebStart: Remove deprecated $wgRequestTime

https://gerrit.wikimedia.org/r/427274

Krinkle updated the task description. (Show Details)Apr 17 2018, 10:20 PM
Krinkle updated the task description. (Show Details)

Change 427274 merged by jenkins-bot:
[mediawiki/core@master] WebStart: Remove deprecated $wgRequestTime

https://gerrit.wikimedia.org/r/427274

Change 427272 merged by jenkins-bot:
[mediawiki/core@master] Setup: Move mbstring.func_overload from WebStart to Setup

https://gerrit.wikimedia.org/r/427272

Krinkle updated the task description. (Show Details)Apr 18 2018, 5:18 PM
Krinkle updated the task description. (Show Details)
Krinkle updated the task description. (Show Details)
Krinkle removed a project: MW-1.31-release-notes.

Change 434496 had a related patch set uploaded (by Krinkle; owner: Krinkle):
[mediawiki/core@master] Setup: Remove various Profile sections

https://gerrit.wikimedia.org/r/434496

Change 434496 merged by jenkins-bot:
[mediawiki/core@master] Setup: Remove various Profile sections

https://gerrit.wikimedia.org/r/434496

Krinkle updated the task description. (Show Details)Jun 5 2018, 11:20 PM

Note to self: the "Setup: assert vendor" (which is the spot check for vendor dependencies existing, using PSR-3 classes as example) seems somewhat redundant with PHPVersionCheck::checkVendorExistence. Perhaps one of these could be improved in a way that makes the other removable.

Afaik the checkVendorExistence was introduced later (hadn't seen it before today).

9B. OutputHandler: handle compression (conditional)

@tstarling @Anomie Curious if you have thoughts/opinions on this whether to keep or remove this.

I'd like us to consider removing the explicit logic for output compression from MediaWiki in order to simplify the stack. In favour of letting site admins configure this through PHP's own settings (zlib.output_compression) and/or through the web server (Apache, Nginx, Varnish). It seems WordPress follows this already.

For the use case of running MediaWiki without an external server (e.g. with PHP's built-in web server), PHP's own
zlib.output_compression feature should suffice, but maybe there are issues with it? If there are, perhaps we could move the logic to the built-in web server handling in maintenance/dev/start.sh (via PHP flags) or maintenance/dev/router.php (with run-time logic).

Krinkle updated the task description. (Show Details)Jun 23 2018, 9:56 PM

The nice things about DIY output compression are:

  • It's on by default, so it helps site admins who would never think to tweak their web server settings
  • MediaWiki disables it for img_auth.php and thumb.php, since it is just a waste of CPU time for typical images
  • Special CDN integration with the Key header (formerly X-Vary-Options). X-Vary-Options was a feature I introduced to avoid splitting the CDN cache between IE and everything else, which were sending slightly different Accept-Encoding request headers (differing by a space)
  • The Safari hack. Not sure if it is still needed.

Seems like it is worth the ~50 lines of code.

Change 444946 had a related patch set uploaded (by Krinkle; owner: Krinkle):
[operations/mediawiki-config@master] Remove unused $tmarray variables

https://gerrit.wikimedia.org/r/444946

Change 444946 merged by jenkins-bot:
[operations/mediawiki-config@master] Remove unused $tmarray variables

https://gerrit.wikimedia.org/r/444946

Krinkle added a comment.EditedAug 16 2018, 7:08 PM

The nice things about DIY output compression are:

  • It's on by default, so it helps site admins who would never think to tweak their web server settings

I agree it's a nice default. But, aside from a php -S, I'd expect web servers to also be capable of good defaults (Apache, Nginx), but alas that doesn't seem to be the case..

  • MediaWiki disables it for img_auth.php and thumb.php, since it is just a waste of CPU time for typical images

Do we proactively communicate a desire to not compress, or does this apply to our PHP-based compression only? I ask because for WMF we (mostly?) don't compress on app servers any more. The reasoning behind that is somewhat scattered and I couldn't find it all, but T181315#4081674 and T125938 provide a starting point.

[..]
Seems like it is worth the ~50 lines of code.

Agreed. For the purpose of this task anyway, it seems reasonable to consider compression as one of the layers to reason about in MediaWiki init/shutdown.

Change 453893 had a related patch set uploaded (by Krinkle; owner: Krinkle):
[mediawiki/core@master] Remove support for StartProfiler.php

https://gerrit.wikimedia.org/r/453893

Change 453893 merged by jenkins-bot:
[mediawiki/core@master] Remove support for StartProfiler.php

https://gerrit.wikimedia.org/r/453893

7H: Setup: assert vendor

This seems is similar to the step PHPVersionCheck::checkVendorExistence() that already happens earlier in the process. Let's see if one of them can be merged into the other, or perhaps one is already redundant as-is.

Change 459013 had a related patch set uploaded (by Krinkle; owner: Krinkle):
[operations/mediawiki-config@master] profiler: Move vendor/autoload from MWMultiVersion to profiler.php

https://gerrit.wikimedia.org/r/459013

Change 459013 merged by jenkins-bot:
[operations/mediawiki-config@master] profiler: Move vendor/autoload from MWMultiVersion to profiler.php

https://gerrit.wikimedia.org/r/459013

Change 462192 had a related patch set uploaded (by Krinkle; owner: Krinkle):
[mediawiki/extensions/Wikibase@master] Use native require_once instead of manual checks and throws

https://gerrit.wikimedia.org/r/462192

Change 462192 merged by jenkins-bot:
[mediawiki/extensions/Wikibase@master] Use native require_once instead of manual checks and throws

https://gerrit.wikimedia.org/r/462192

Change 463591 had a related patch set uploaded (by Krinkle; owner: Krinkle):
[mediawiki/core@master] user: Remove use of Message:text() from User::isUsableName()

https://gerrit.wikimedia.org/r/463591

Change 463591 merged by jenkins-bot:
[mediawiki/core@master] user: Remove use of Message:text() from User::isUsableName()

https://gerrit.wikimedia.org/r/463591

Change 467233 had a related patch set uploaded (by Krinkle; owner: Krinkle):
[operations/mediawiki-config@master] multiversion: Remove reference to deprecated getRealmSpecificFilename()

https://gerrit.wikimedia.org/r/467233

Change 467233 merged by jenkins-bot:
[operations/mediawiki-config@master] multiversion: Remove reference to deprecated getRealmSpecificFilename()

https://gerrit.wikimedia.org/r/467233

cscott added a subscriber: cscott.Oct 16 2018, 2:23 PM
  • Special CDN integration with the Key header (formerly X-Vary-Options). X-Vary-Options was a feature I introduced to avoid splitting the CDN cache between IE and everything else, which were sending slightly different Accept-Encoding request headers (differing by a space)

FWIW, the Key header is now expired and no longer on the IETF standards track. It's not used in WMF production, and it appears to be semi-broken, in that the standard only allows case-sensitive header matching in a quite restricted form, and we instead do a somewhat sophisticated case-*in*sensitive match on the Accept-Language header.

Is there an appropriate cache-guidance header we should be investigating? Otherwise, I'm inclined to remove the Key header code entirely (since it's currently-broken, somewhat complex, inadequately tested, and no one is using it).

Change 470259 had a related patch set uploaded (by Krinkle; owner: Krinkle):
[mediawiki/core@master] PHPVersionCheck: Remove obsolete load.php code and simplify

https://gerrit.wikimedia.org/r/470259

One of the more odd parts of the initialisation logic is ResourceLoader support for Suhosin. In light of T205684, we may want to remove support for it from MediaWiki, which would reduce the requirements of init and setup logic.

Change 470259 merged by jenkins-bot:
[mediawiki/core@master] PHPVersionCheck: Remove obsolete load.php code and simplify

https://gerrit.wikimedia.org/r/470259

Krinkle removed Krinkle as the assignee of this task.Dec 9 2018, 10:29 PM

Un-assigning for now as this isn't current quarter's goal. Current quarter has T202154 as my optimisation theme.

Change 486713 had a related patch set uploaded (by Krinkle; owner: Krinkle):
[mediawiki/core@master] Remove deprecated wgEnableParserCache

https://gerrit.wikimedia.org/r/486713

Change 486713 merged by jenkins-bot:
[mediawiki/core@master] Remove deprecated wgEnableParserCache

https://gerrit.wikimedia.org/r/486713

Krinkle updated the task description. (Show Details)Sat, Jul 13, 12:35 AM

Mark removal of "10J: Setup: include UtfNormalUtil", as was done in https://gerrit.wikimedia.org/r/435207 / 88ea69f2f7.

Krinkle updated the task description. (Show Details)Sat, Jul 13, 12:36 AM

Change 522578 had a related patch set uploaded (by Krinkle; owner: Krinkle):
[mediawiki/core@master] Setup: Move mb_internal_encoding() call earlier

https://gerrit.wikimedia.org/r/522578

Change 522579 had a related patch set uploaded (by Krinkle; owner: Krinkle):
[mediawiki/core@master] Setup: Make wfMemoryLimit() internal and simplify

https://gerrit.wikimedia.org/r/522579

Change 522580 had a related patch set uploaded (by Krinkle; owner: Krinkle):
[mediawiki/core@master] Setup: Remove $wgDebugPrintHttpHeaders option

https://gerrit.wikimedia.org/r/522580

Krinkle updated the task description. (Show Details)Sat, Jul 13, 1:13 AM

Change 522578 merged by jenkins-bot:
[mediawiki/core@master] Setup: Move mb_internal_encoding() call earlier

https://gerrit.wikimedia.org/r/522578

Change 522579 merged by jenkins-bot:
[mediawiki/core@master] Setup: Make wfMemoryLimit() internal and simplify

https://gerrit.wikimedia.org/r/522579

Change 522580 merged by jenkins-bot:
[mediawiki/core@master] Setup: Remove $wgDebugPrintHttpHeaders option

https://gerrit.wikimedia.org/r/522580

Change 524537 had a related patch set uploaded (by Krinkle; owner: Krinkle):
[mediawiki/extensions/Flow@master] FlowHooks: Simplify registerExtension() code a bit

https://gerrit.wikimedia.org/r/524537