Page MenuHomePhabricator

Consider removing global $userLang from onPageContentLanguage hook
Open, Needs TriagePublic

Description

The value of Content->getPageLanguage() is currently not deterministic and thus means that its exposure in external APIs and page metadata exposed to client-side JS may be different based on user session and query parameters. I think it would be a significant improvement if ContentHandler did not rely on global state.

That leaves two options:

  1. Make this ContentHandler method explicitly depend on, and dependency-inject and pass down everywhere, a RequestContext when accessing the page language. This would require a lot of complexity and separation for some of the consumers where I think the logical model of the code would be best to not separate out a user-specific version of all those services.
  2. Deprecate and remove this parameter.

I'm proposing the latter.

\cc PE, @daniel As maintainer of ContentHandler; What are you thoughts on this?
\cc WikimediaIncubator, @Ebe123 As maintainer of the affected WikimediaIncubator extension. Could you describe what this is used for currently? This will help figuring out how we can best support that in a different way going forward.

The entry point for this problem seems to be ContentHandler::getPageLanguage(), […]. This method does not logically depend on the context/user language, except for where it passes $wgLang as third parameter to HookRunner::onPageContentLanguage.

This seems like a bad idea and not something that is imho reasonable to support in MediaWiki, and incompatible with the general model of how this feature works.

For one, it would break any assumption that this is persistable to the database.

I feared the Translate extension might depend on thit, but from a Codesearch query we find it is actually not recongising this parameter at all. Instead, it is LiquidThreads and WikimediaIncubator using these to make some of their wiki pages act like a special page, in that they automagically render in the UI language. This is strange since we already expose the use language to the parser and allow it to be used. Rather, the hook is additionally forcing the internal page language to be perceived as if it were the current user language. Perhaps in the short-mid term we can find a different way to support the underlying need there.

Event Timeline

Krinkle renamed this task from onPageContentLanguage to Consider removing global $userLang from onPageContentLanguage hook.Jan 18 2022, 12:26 AM
Krinkle updated the task description. (Show Details)
Krinkle moved this task from Limbo to Watching on the Performance-Team (Radar) board.
Krinkle added subscribers: Ebe123, daniel.

I am hoping that a resolution here can lead to a resolution of T161976 for which the fix was reverted due to T298659 and ultimately resulted in this issue. I believe if page content is available during the rendering of another page that the purported content language of the available page content should also be available during the page render (much like the content model already is).

[…] from a Codesearch query […]

Please note that in the old (static method-based) hook system you can’t count on event handlers having any particular name. Instead, you should grep for the hook name itself as well, i.e. without the on prefix. In this case, this brings no additional problematic extensions, but the CirrusSearch extension (rECIR tests/jenkins/Jenkins.php) – which recognizes this parameter, but ignores it – wasn’t caught by your query.