Page MenuHomePhabricator

Parser should have a msg() helper function so people don't localize messages improperly
Open, Needs TriagePublic

Description

Using a standard wfMessage(...)->escaped() in the context of a parser function or tag hook is wrong, because it'll use the current user's language, get cached. In reviewing MediaWiki extensions, I've seen a lot of them do this wrong.

The proper invocation should be: wfMessage(...)->inLanguage( $parser->getFunctionLang() )->title( $parser->getTitle() )->escaped();

I don't think people are going to remember to properly do that every time, so we should provide a helper for them (also it took me a while to figure it out as well).

GCI mentors: @Legoktm

For this task you'll need to add a new msg function to the Parser class (file includes/parser/Parser.php in MediaWiki core), and then update any incorrect usages of wfMessage in the CoreTagHooks and CoreParserFunctions classes to use your new method.

Event Timeline

Change 470239 had a related patch set uploaded (by Arcayn; owner: Arcayn):
[mediawiki/core@master] Add msg() helper function to Parser.php and fix CoreTagHooks.php/CoreParserFunctions.php

https://gerrit.wikimedia.org/r/470239

I'm not convinced the helper should be in Parser.php. Parser functions and tag hooks are a bit of a special case, since they actually are in the content language. In 95% of the cases where system messages are used, they should be in the user interface language -- compare https://codesearch.wmflabs.org/search/?q=addWikiTextAsContent with https://codesearch.wmflabs.org/search/?q=addWikiTextAsInterface for example. Even when you do want the content language, it's probably because the thing you're adding would be better off as an interface message, for example see bacd87e4942baa34808a1b77d3b29bfdb566cc17.

So I'm concerned about making it too easy for folks to do the wrong thing, especially if the name of the method doesn't make it obvious that you're making a very specific and uncommon choice. I'd be happier if the method were in CoreParserFunctions, because at least then the context would make it more likely that someone who stumbled across it in the API was actually trying to do the uncommon thing, not the common thing (for which it's not suited).

See also T114640: make Parser::getTargetLanguage aware of multilingual wikis, which goes in to more detail about the problems one has in general when trying to make "user interface elements" from "content" markup.

I'm not convinced the helper should be in Parser.php. Parser functions and tag hooks are a bit of a special case, since they actually are in the content language. In 95% of the cases where system messages are used, they should be in the user interface language -- compare https://codesearch.wmflabs.org/search/?q=addWikiTextAsContent with https://codesearch.wmflabs.org/search/?q=addWikiTextAsInterface for example. Even when you do want the content language, it's probably because the thing you're adding would be better off as an interface message, for example see bacd87e4942baa34808a1b77d3b29bfdb566cc17.

Er, but 95% of the time system messages are used in the context of a page parse (the Parser), they should be in the content language. We try very hard to not split the parser cache on user language, which is why any embedded message in the page parse should use the content language.

So I'm concerned about making it too easy for folks to do the wrong thing, especially if the name of the method doesn't make it obvious that you're making a very specific and uncommon choice. I'd be happier if the method were in CoreParserFunctions, because at least then the context would make it more likely that someone who stumbled across it in the API was actually trying to do the uncommon thing, not the common thing (for which it's not suited).

The correct thing in the context of the Parser is to use the content language. In fact, using the user language is harder now, and generally the wrong thing to do.

In fact, the current code (using wfMessage()) is falling back on $wgTitle, which is always the wrong thing to do. Setting an explicit Title is always the correct thing, which was one of the main reasons I proposed this.

Regarding naming, the reason it's called msg is so that it can implement the MessageLocalizer interface. I don't think adding it to CoreParserFunctions is appropriate since it will also be useful to extension parser functions and tag hooks as well.

I think I could be satisfied with a good documentation comment for this method elaborating these points:
(a) state explicitly that this uses the content language, not the user interface language
(b) is intended for parser functions and tag hooks which appear in article content, not UX (special pages, warnings, etc)
(c) the result will be subject to language conversion (which is unusual for system messages, which are more usually pre-converted to a specific variant)

Could you also look at bacd87e4942baa34808a1b77d3b29bfdb566cc17 in particular, because that is very similar to the change being proposed in https://gerrit.wikimedia.org/r/#/c/mediawiki/core/+/470239/8/includes/parser/CoreParserFunctions.php -- we should clarify what language ParserOutput::addWarning() expects. In fact, it would be better still to add a ParserOutput::addWarningMsg() function which takes a system message name directly and handles the appropriate language and title setting. (And in fact, the ApiBase::addWarning() method takes a message object, we could add consistency w/ that.) This is probably complicated by the fact that we expose ParserOutput::addWarning(string) directly to Lua via Scribunto, sigh. There seems to be a similar issue in the JS API.

Anyway, codesearch for addWarning is very interesting. Maybe we could clean all that up.

WRT GCI we could do some of this cleanup in follow-up patches, but we should get good docs in the first patch and figure out whether ParserOutput::addWarning() should actually be using this newly-added Parser::msg() API or not. (I'm thinking it should not, because warnings are interface not content and cached via MessageCache not ParserCache... but maybe I'm wrong.)

Change 836296 had a related patch set uploaded (by C. Scott Ananian; author: C. Scott Ananian):

[mediawiki/core@master] Add Parser::msg() helper for messages from extensions or parser functions

https://gerrit.wikimedia.org/r/836296

Change 836296 merged by jenkins-bot:

[mediawiki/core@master] Add Parser::msg() helper for messages from extensions or parser functions

https://gerrit.wikimedia.org/r/836296

Change 836296 merged by jenkins-bot:

[mediawiki/core@master] Add Parser::msg() helper for messages from extensions or parser functions

https://gerrit.wikimedia.org/r/836296

@cscott I notice that this uses inLanguage(), which turns off the Message->interface flag, which has various side-effects you might not want here.

See also how RequestContext and ResourceLoader\Context::msg() handle this.

Ref T291601: Re-evaluate Message 'interface' and 'title' flags for messages bundled via load.php, remove "Dwimmerlaik", T291590: DiscussionTools markup appears in dropdowns on Special:MediaSearch.

Change 942709 had a related patch set uploaded (by C. Scott Ananian; author: C. Scott Ananian):

[mediawiki/core@master] Parser::msg() helper: ensure that 'interface' flag is set correctly

https://gerrit.wikimedia.org/r/942709

Change 989898 had a related patch set uploaded (by C. Scott Ananian; author: C. Scott Ananian):

[mediawiki/extensions/Translate@master] Ensure a page context is set for wfMessage

https://gerrit.wikimedia.org/r/989898