Page MenuHomePhabricator

Replace StubUserLang with a better lazy loading mechanism
Closed, DuplicatePublic

Description

StubUserLang has two problems:

  1. it's incompatible with type hints against the Language type, see e.g. T244300, T278429
  2. and it binds to global variables, which we want to remove, see T159283: Deprecate non-configuration globals.

When thinking about solutions for this, we have to keep in mind that the information for instantiating the user language object comes from the current web request (or alternative invocation context, when running maintenance scripts or background jobs). The necessary context information needs to be passed from the top level code in the entry point, probably the MediaWiki class. It's not known to service wiring code, and may not be obtained via a service. Also, we don't want to fully initialize the user or user language on all requests, or right away. This initialization should be deferred as long as possible.

Here are two possible approaches:

  1. Make a LazyLanguage class that is an actual Language implementation, but lazy-loads its state from a WebRequest or other context when needed. Initially, wouldn't even know its own language code. Once loaded, it would delegate all calls to the "real" Language instance internally.
  1. Instead of a Language object, we pass a context object somewhat similar to RequestContext, except that it doesn't use global state, and returns classes that are more narrow than what is currently exposed. It would have methods like getUser() and getRequest() and getUserLanguage().

Event Timeline

From what I understand, this problem is already solved.

The solution is RequestContext for higher-level code that is session-aware and user-facing, such as Action and SpecialPage entry points, and classes very closely tied those such as Skin and OutputPage.

For all lower-level code, the Language object must be injected explicitly because it cannot reasonably determine in a safe and predictable way which language to use. For example a random internal code path that produces a bit of HTML. Depending on when and how that is called, it may be for any user on the wiki, or perhaps not even for a real user at all (e.g. in the installer, or from CLI).

This is currently dynamically detected and enforced by throwing a fatal error when the lazy-initialisation happens too early or in context where it is not possible to do so (e.g. before wgUser is safe to load, or when entry point has "NO SESSION" set).

What's left is "merely" to migrate away from code using wgLang in favour calling RequestContext::getLanguage(). This should be a simple find-and-replace without much perf impact because importing wgLang from as global variable generally goes hand-in-hand with unstubbing it a few lines later.

Either at the same time, or later, we should also make sure to avoid static imports like this and either find a local context object if there is one, or (if the code path is widely shared and considered low-level/internal, to make it an injected parameter and emit deprecation warnings for not passing it).

As first pass though, to get rid of StubUserLang, replacing global $wgLang with RequestContext::getLanguage() should be fine I think?

From what I understand, this problem is already solved.

The solution is RequestContext for higher-level code that is session-aware and user-facing, such as Action and SpecialPage entry points, and classes very closely tied those such as Skin and OutputPage.

RequestContext is problematic because it relies on global state. Though not for $wgLang, it seems. We'd want to have an implementation that doesn't, and gets passed as an argument. That's pretty much solution (2) in the description.

For all lower-level code, the Language object must be injected explicitly because it cannot reasonably determine in a safe and predictable way which language to use.

And at that point I'd assume that we can expect/force an actual non-stub Language object.

What's left is "merely" to migrate away from code using wgLang in favour calling RequestContext::getLanguage().

What should new code do that is designed to not use global state? I suppose it could require a RequestContext to be passed in. Or a nicer replacement that binds to more narrow interfaces.

As first pass though, to get rid of StubUserLang, replacing global $wgLang with RequestContext::getLanguage() should be fine I think?

We'd still have to keep StubUserLang until all usages of $wgLang are removed. Only proposal (1) would remove that need. But yes, it's probably a good step forward.

For all lower-level code, the Language object must be injected explicitly because it cannot reasonably determine in a safe and predictable way which language to use.

And at that point I'd assume that we can expect/force an actual non-stub Language object.

Indeed.

What's left is "merely" to migrate away from code using wgLang in favour calling RequestContext::getLanguage().

What should new code do that is designed to not use global state? I suppose it could require a RequestContext to be passed in. Or a nicer replacement that binds to more narrow interfaces.

Existence of the global RequestContext::getMain() singleton is imho out of scope for this task. Migrating away from wgLang by itself is straight-forward. I think we should handle those one at a time.

The problem with variable like wgParser was that they were either expensive unconditional singletons or fragile/complicated stub objects that are sometimes illegal/explode. Changing them from global object imports to inline getters means that they are naturally lazy without needing any stubs. The assumption is made that the object is only imported when it is needed.

As first pass though, to get rid of StubUserLang, replacing global $wgLang with RequestContext::getLanguage() should be fine I think?

We'd still have to keep StubUserLang until all usages of $wgLang are removed. Only proposal (1) would remove that need. But yes, it's probably a good step forward.

Yes, I'm saying StubUserLang is easy to remove as all it takes is replacing wgLang with RequestContext::getMain()->getLanguage() (or $context->getLanguage() when there is one already).

As a separate task for later, after $wgLang and StubUserLang are gone, we can think about where exactly RequestContext would be constructed and given to Action and SpecialPage etc, instead of lazy-initialised by the first getter from a RequestContext-inheriting class (like today).

Also separate is figuring out whether there are actually many places that would require passing it down. My suspicion is that there wouldn't. The context must only be passed down to down to code that is directly or very closely related to html-serving session-aware frontend-facing. To a first approximation, all such classses implement RequestContext or have it as a member already. Everything else can't be handling a context anyway, not even injected. It needs to remain agnostic and be given a specific Language object instead. This is largely the case today already, and has to be, unless the code in question is already broken and/or happens to never be used outside such session-aware code.

Krinkle added a subscriber: Ammarpad.

Declining as-written, because we don't need a better mechanism. We can trivially get rid of what we have by replacing callers, and the work for that can continue at T160814.

Once wgLang is no longer used anywhere we care about, we can either remove it, or maybe keep it around for a short while. We could also spread a few unstub() calls around the codebase as @Ammarpad drafted in an earlier patch set version of https://gerrit.wikimedia.org/r/c/mediawiki/core/+/674969. There is no longer a need for this mechanism as far as I'm concerned, other than to not construct it eagerly in Setup.php (and we can't/shouldn't given NO_SESSION and such, it sometimes will throw).