
Add a {{USERLANGUAGE}} magic word
Closed, DeclinedPublic


mw.getContentLanguage().code -- Returns the content language.
mw.getCurrentFrame():preprocess("{{PAGELANGUAGE}}") -- Returns the page language.
mw.getCurrentFrame():preprocess( "{{int:Lang}}" ) -- Returns the user language.

Author: gangleri


The benefits of a variable returning the language code of the selected user interface can not be foreseen and all "applications" can not be described here.

Hopefully it should be easy to implement and be available soon.

Regards Reinhardt

Version: unspecified
Severity: enhancement
See Also:

Workarounds available:

  • For system administrators, installing the UILangCode extension which provides the {{UILANGCODE}} variable
  • For system administrators, installing the MyVariables extension which provides the {{USERLANGUAGECODE}} variable
  • For system administrators, installing the LanguageCode extension which provides the {{USERLANGUAGE}} variable
  • For wiki administrators, using {{int:lang}} after creating all MediaWiki:Lang subpages.

I'm looking for how to use this task, but I cannot find in the task how to use it. I tried:

langs.page_lang = modes.frame:preprocess( "PAGELANGUAGE")
langs.page_lang = modes.frame:preprocess( title = "PAGELANGUAGE")
langs.user_lang = modes.frame:preprocess( title = "USERLANGUAGECODE")

But none worked. The description of the task itself should state the right way to use it. Thanks in advance to all coders.


Event Timeline


(In reply to PiRSquared17 from comment #29)

Does it support caching correctly? Is there any reason not to use this?

It would if it used $parser->getOptions()->getUserLang().

He7d3r set Security to None.

No one has commented on this bug in a couple years. It was closed twice as WONTFIX and it doesn't seem like there is developer support for implementing it. Is there any reason it should be kept open?

This task is needed as part of central modules T121470#1880317
The goal is to show argument names, categories, and error messages in the helper's language when they help a small wiki that does not have enough module specialists.
See also T122086

"No one has commented on this bug in a couple years. It was closed twice as WONTFIX and it doesn't seem like there is developer support for implementing it. Is there any reason it should be kept open?"

All the reasons still apply. This is still needed. The "Uselang" template (implemented in a "hacked" way) is still widely used.

The problem with this task is that it asks for a specific solution (which is unlikely to be implemented) instead of focussing the discussion on what are the use cases and how those cases could be best solved.

@Nikerabbit: That's a good point. The only use cases I know of currently are:

More use cases would be helpful.

In at least three cases, translations need to adapt to more than just the wiki language or the user language:

  • The gender of the user, as set in their preferences
  • The date-time format used in some regions or sub-languages
  • In mw.language.convertPlural(), "if your language has several plural forms"

Can we, and would it be appropriate to, group all these cases in a single translation adapter with parameters linked to the user: gender, sub-language, rights (sysop, bureaucrat...), plural forms?

That would make it possible to really try out the user language for central modules:
user_lang = modes.frame:preprocess( "{{int:Lang}}" )
and also in other wikis if needed.

Creating a page to enable each language seems very strange to me.
Another way could be to define the base language of the wiki on the first line.
Then, on the second line, a list of all other permitted languages.
This would let administrators easily find or change this list without lengthy searching.

Whatever mechanism {{int:Lang}} relies on, implemented internally by the {{int:}} parser function (which will always need to get the correct user language), should still be made available even if we don't manually create the MediaWiki:Lang/* resources. The "int:" extension can already recognize the resource name internally and does not need to query the database to retrieve these fake pages; all it has to do is get the correct user language code, as it does for any kind of {{int:*}} message needed by MediaWiki core itself. I think it also requires some work to specially mark the expanded pages so that they are cached separately according to the user language.
The same should then be done for Lua, by adding to Scribunto (which supports the {{#invoke:}} extension) a basic module that does the same thing as the {{int:}} extension, using the same logic for cache management and returning the same language code. It should then be exposed in Lua as "mw:userLanguage()", which won't require any use of "frame:preprocess()" with its unbelievable and undocumented side effects (notably the fact that it prematurely expands all text in parent frames instead of just the wikitext given in its single string parameter!).

So Scribunto now really needs to be fixed. It has been demonstrated that this is really needed:

  • add "mw:getUserLanguage()" or "frame:getUserLanguage()": I don't know which class is the best place to add this API.
  • immediately provide a workaround in "frame:preprocess(text)" so that it does not preprocess the given text when it just matches /^\{\{ *[Ii][Nn][Tt]: *[Ll]ang *\}\}$/: it will use the previous function instead
  • fix "frame:preprocess(text)" to avoid full expansion of everything in parent frames as much as possible (wikicode parsing and expansion should be done lazily, only for what was demanded)
  • investigate the other bugs (related to the Wikidata client retrieving only full objects instead of just the selected properties)
  • other bugs related to exceptions for memory exhaustion in PHP->Lua->PHP are independent but still need a better fix than what is currently being worked on (which is a real hack if it just consists of increasing or relaxing the memory constraints, and will not solve any problem if the exception is handled correctly but still causes unexpected HTTP 500 failures on the server in other modules not expecting the new exception, or if it causes them to return a dummy expanded error text instead)
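
The pattern proposed in the second bullet can be tried outside MediaWiki. The following is only a minimal sketch in Python, not Scribunto code: the function name `is_int_lang_call` is hypothetical, and the character class for the letter T is assumed to be `[Tt]`.

```python
import re

# Matches wikitext that consists solely of {{int:Lang}}, accepting any case
# for "int" and for the first letter of "Lang", with optional spaces.
INT_LANG_RE = re.compile(r"^\{\{ *[Ii][Nn][Tt]: *[Ll]ang *\}\}$")


def is_int_lang_call(wikitext: str) -> bool:
    """Return True when the whole string is a bare {{int:Lang}} call."""
    return INT_LANG_RE.match(wikitext) is not None
```

A caller could use such a check to short-circuit the expensive preprocessing path and return the user language directly.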

For now, the page language comes from frame:preprocess("{{PAGELANGUAGE}}").
For the same reasons as the user language, the page language could be exposed to Scribunto modules in the same new way.

PAGELANGUAGE is NOT the correct one. It was the user language that was expected: we won't have per-language categories, category pages won't be translated, and the navbox was intended to provide coherent international navigation independently of the English names of the categories, without having to create and edit many translations within the page itself (this is about tens of thousands of categories). Using PAGELANGUAGE would be a severe step backward, where only English would be displayed on Commons.

frame:preprocess( "{{int:Lang}}" ) is far better than frame:preprocess( "{{PAGELANGUAGE}}" ), which is really different. We still need a simpler way in Lua to get the user language directly in a variable from the PHP environment (i.e. available as a string property in mw or frame, without needing any additional preprocessing for its expansion).

In summary do not confuse the three distinct languages:

  • default language of the wiki (English only in Commons or Meta or "old Wikisource", or the language of the project wiki when it has separate linguistic editions, such as French for French Wikipedia)
  • page language (only works for pages using the Translate extension, or pages that would use some magic keyword designed to indicate in which language a monolingual page is effectively written, or some wiki-specific hook parsing some page name or namespace convention, notably for beta wikis like Incubator, or like the OpenStreetMap wiki). Useful for multilingual wikis that still have language-specific pages (possibly translated, but this is not always the case, and their language is NOT necessarily the same as the default wiki language; translated pages can come later using some wiki-specific naming convention that "page language" would detect with some hooks, such as the hook used by the "Translate" extension, which expects language codes to be found as the last subpage name). If a page is not detected by the special hook as being marked/tagged/named to be in a specific language, the page language will be assumed to be the default language of the wiki (not necessarily English, as said above). For example, an "English Embassy" community page on the French Wikipedia could be marked/tagged as having "page language = en" even if the default language of the wiki is French. The page language also affects how the layout is generated: notably, there will be CSS classes or styles for RTL languages, cells of tables will render in a different order, paragraphs will be aligned on a different margin, and thumb images will float by default on the other side. The "English Embassy" page on the Arabic Wikipedia would have a radically different layout from most other pages of the Arabic wiki; the "side bars", by contrast, will have a layout based on the user language, and will not be affected by the page language or the wiki default language.
  • user preferred language (as set by the Universal Language Selector, in user preferences, or when using "uselang=code" in query strings (part of the URL)), to be used typically in navboxes transcluded in pages (such as categories) that will never be translated directly. It is the same language used by the MediaWiki UI (translations come from the "int:" parser function, which uses translations imported into the special namespace or added locally by local wiki admins). This is always independent of the two previous languages: any user can set their own language preference or view any page with the "uselang=code" parameter in the query string. This means that the same page will render differently depending on language, and MediaWiki already has separate caches for pages rendered with different user languages (each version is cached independently, and the lifetimes of these server-side caches are also independent: you can "purge" the version in English and this does not purge the French version; caches will also expire or be cleaned independently of any explicit purge action, if storage space gets exhausted, by using an LRU algorithm where the pages least recently retrieved from the cache will be purged first to make space).
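
The cache behaviour described in the last bullet can be illustrated with a toy model. This is a sketch under stated assumptions, not MediaWiki's parser cache: the class name `RenderCache`, its capacity, and the `(title, user_lang)` key are all hypothetical.

```python
from collections import OrderedDict


class RenderCache:
    """Toy LRU cache keyed by (page title, user language)."""

    def __init__(self, capacity: int = 2):
        self.capacity = capacity
        self._entries = OrderedDict()

    def get(self, title, user_lang):
        key = (title, user_lang)
        if key not in self._entries:
            return None
        self._entries.move_to_end(key)  # mark as most recently used
        return self._entries[key]

    def put(self, title, user_lang, html):
        key = (title, user_lang)
        self._entries[key] = html
        self._entries.move_to_end(key)
        while len(self._entries) > self.capacity:
            self._entries.popitem(last=False)  # evict least recently used

    def purge(self, title, user_lang):
        # Purging one language version leaves the other versions untouched.
        self._entries.pop((title, user_lang), None)
```

Because the user language is part of the key, purging the English rendering of a page does not affect its French rendering, and eviction under space pressure removes whichever entry was used least recently.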

Many thanks for this overview of languages.
Over the last two days, I wrote the function form2en(), dedicated to testing many functions, always in English, to increase the stability of central modules, using ClassNameTests.

There is the MyVariables extension around which provides the {{USERLANGUAGECODE}} variable. This is only available for third party instances though.

Kghbln updated the task description. (Show Details)

Why did you cross nikext? It's still available, the vcs link just needs updating.


Sorry, I did not find it so I assumed it is gone. You are welcome to do the honors.

Change 508295 had a related patch set uploaded (by Dvorapa; owner: Dvorapa):
[mediawiki/core@master] Add a new {{USERLANGUAGE}} variable for use in wikitext

Change 507328 had a related patch set uploaded (by Dvorapa; owner: Want):
[mediawiki/core@master] Add system message MediaWiki:Langcode

And we still lack the possibility of marking a specific page (with a magic syntax generating metadata, not content, like "[[Category:...]]") as being primarily in a specific language. This would be independent of the user language, but should NOT be inserted in pages marked for translation with the translation tools, which are marked automatically by the Translate tool and use a specific page naming convention with "/langcode" suffixes/subpages, or some "langcode:" prefix or namespace, like on the OpenStreetMap wiki.

This magic metadata keyword to insert in the wikicode of pages (or in transcluded templates), such as:

  • {{PAGELANGUAGE:langcode}} or
  • [[PAGELANGUAGE:langcode]] (like categories), or
  • <meta name="lang" value="langcode"/> (using a generic metadata system for pages, not limited to languages)

should be natively implemented. It should also properly set the HTML metadata in the page header and/or in the MIME headers of HTTP query replies.

And the page language metadata would be overridden by a wiki-specific hook if there's a naming convention enforced by the translation tool. This hook would detect the presence of the keyword in the wiki page content; if the language code set by the magic keyword differs from the language code set by the translation tool or enforced naming convention, it would generate a warning (at least on the preview) and mark the parsed page with a tracking maintenance category, so that we can either:

  • remove the magic keyword from the page content (or its transcluded templates) so that the wiki-specific hook or translation extension will set the expected language itself; or
  • keep this magic keyword, but move the page to use a different full pagename (with the correct language code or none) so that the wiki-specific hook or translation extension will not imply a different language.
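
The precedence rule and the conflict tracking proposed above could look roughly like this. It is only an illustrative sketch: the function name `resolve_page_language` and the tracking category name are hypothetical, and nothing like this exists in MediaWiki as far as this task shows.

```python
def resolve_page_language(keyword_lang, convention_lang, wiki_default):
    """Return (effective_language, tracking_categories).

    keyword_lang: code set by the proposed magic keyword, or None.
    convention_lang: code implied by a naming convention such as a
        "/fr" subpage suffix, or None.
    """
    tracking = []
    if convention_lang is not None:
        # The naming convention takes precedence over the keyword.
        if keyword_lang is not None and keyword_lang != convention_lang:
            tracking.append("Pages with conflicting language markers")
        return convention_lang, tracking
    if keyword_lang is not None:
        return keyword_lang, tracking
    return wiki_default, tracking
```

A page that lands in the tracking category can then be fixed by either of the two options listed above.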

This keyword is needed for all other cases where a translation tool is not applicable: for example talk pages like "embassies", pages that are not intended to be translated themselves, or pages whose content is generated by templates using only the user's preferred language and is then completely "autotranslated". Such a page could embed a custom "View this page in..." language chooser to render it in one of the supported languages, independently of the Universal Language Selector, which autoselects the user's preferred language throughout their navigation on the wiki. This custom language chooser would also set the correct HTML or HTTP headers in the current session, but would not change the user's preferred language for their navigation on the wiki site.

There is Special:PageLanguage. Anything on the page itself doesn't work, as the language needs to be known to parse the page, so it would be a chicken-and-egg situation.

I'm not convinced this is needed to parse the page, only to generate its content. But is this related to conditional code like #if and #switch, and to transclusion of Lua-generated content (which would then need to generate all linguistic versions until a language filter is applied at the end to purge the excluded sections)?

Anyway, there's still no evident way in the UI to mark a page with this special page (which is also too restricted). Marking an individual page as using a specific language should be as open to anyone as being able to create or rename/move a page with a correct title in that language.

So the menu at top of pages that allows renaming a page should also include the option to change its language (this change should be also visible in the history and can be reverted by reviewers if needed)

An alternative is to integrate the language setting of a page in the same dialog for renaming/moving a page. And in the page editing dialog notably when creating one (insert that option just above the comment line before the submit button). In the same place we could also have other selectors for setting other page metadata (that may be needed for some wiki extensions, e.g. pages that contain input forms or that run some scripts, or for setting specific CSS stylesheets applicable to the page).

Change 507328 abandoned by Krinkle:
Add system message MediaWiki:Langcode

This is not a stable implementation for the software and will be hard to maintain and verify over time. Also something about appropriate fallback for new languages and insurance that this exists for every recognised language (including those without translations), which would require a structure test at the very least.

I don't think we can improve this approach to a version that is acceptable, so I am closing this in favour of change 508295, which, if this is a feature we can support, offers a more maintainable implementation.

Krinkle renamed this task from Add a {{USERLANGUAGE}}/{{USERLANG}} magic word to Add a {{USERLANGUAGE}} magic word. May 9 2019, 4:07 PM
Krinkle changed the task status from Open to Stalled. May 9 2019, 4:09 PM
Krinkle edited projects, added Performance-Team (Radar); removed Performance Issue.
Krinkle removed a subscriber: wikibugs-l-list.
Krinkle added a subscriber: Krinkle.

Marking as stalled as this feature request should first be reviewed and approved by the Parsing Team and evaluated in terms of cost, complexity, maintenance, compatibility, roadmap etc.

Once approved, please ping Performance Team so we can take a look at the implementation before it goes live.


Just an afterthought: maybe the combination of both possibilities could be beneficial, as Urbanecm suggested in the abandoned change. Generating the message Want suggested for all languages through an extension (Translate? a separate one?) could be a second approach, as an alternative to the magic word. Anyway, this is for the Performance and Parsoid teams to decide.

Please note: I've seen an implementation of two magic words:

  • {{USERLANGUAGE}} - to display the user language
  • {{PAGELANGUAGE}} - to display the page language

I have not reviewed the source code, but maybe this is a solution for this issue.

By the way, {{PAGELANGUAGE}} is also a much-needed feature! See T59603 and T161976 about this issue.

It's listed in the description :) The issue is not having code that implements this, the issue is whether this feature is suitable (now and in the future) for MediaWiki core and/or deployment to Wikimedia sites.

Of course ;) This is only a note. It is also a marker that this issue is still important and very much needed.

There should not be any need to call the wiki parser from Lua just to get a piece of information that is built into the MediaWiki API (in PHP) itself. The best way would be to expose this value directly in the environment offered by Scribunto.

Now for access from wiki pages, it should also not be necessary to call a Lua module. But using the template expansion of "{{int:Lang}}" should be deprecated. [It works if there are resources loaded in pages of a special namespace intended for storing translated texts, but that is nonsense here, as this is NOT translatable: it just returns a rewriting of the page name, which does not need to be stored as a page, and its content should not be modifiable at all.]

However, there's not necessarily a need for a new magic keyword, if the "int:" parser function hook implements it internally instead of performing a template expansion from loaded pages. Here the parser function hook implementing the "int:" special namespace is written in PHP and can instantly use the MediaWiki API (in PHP), which exposes a PHP variable. That variable should have been "cleaned", i.e. it should already return a canonicalized and validated code that maps to a supported code, even if the user is querying a page with an invalid value for the "?uselang=*" parameter. It should also already canonicalize some known aliased codes, such as "zh-min-nan" to "nan", or "fra" and "fre" to "fr", using BCP47 rules, or "EN" to "en". Invalid codes specified in query parameters may be replaced by another "guessed" language, from the browser's "preferred language" environment, or at least by the default language code of the wiki.
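
The kind of "cleaning" described here can be sketched in a few lines. This is a toy illustration, not MediaWiki's actual validation code: the names `ALIASES`, `SUPPORTED` and `canonicalize` are hypothetical, and the tables contain only the examples mentioned above plus a couple of placeholder codes.

```python
# Hypothetical alias table; a real one would be far larger.
ALIASES = {"zh-min-nan": "nan", "fra": "fr", "fre": "fr"}

# Hypothetical set of codes the wiki supports.
SUPPORTED = {"en", "fr", "nan", "ar"}


def canonicalize(code, wiki_default="en"):
    """Map a raw language code to a supported canonical code.

    Unknown or invalid codes fall back to the wiki default language.
    """
    normalized = code.strip().lower()
    normalized = ALIASES.get(normalized, normalized)
    if normalized in SUPPORTED:
        return normalized
    return wiki_default
```

With a value cleaned this way, every consumer (the "int:" hook, Scribunto) could rely on the code being valid without repeating the checks.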

The whole code for getting and validating the user language is part of MediaWiki core. It takes the language from HTTP query parameters, otherwise from the user's preferences, otherwise from the browser's environment in the "Accept-Language" HTTP header (which also specifies a list of user-preferred fallbacks that could be exposed as well, as a separate variable containing an ordered array of validated codes). It runs at the root, even before parsing any page; there's not always a wiki page to parse, for example when the user requests a page in the "Special:" namespace, which is implemented by MediaWiki hooks that won't parse any wiki code from any actually stored page, or when rendering pages from the "api.php" endpoint. Only then is the wiki code of wiki pages parsed and expanded (and later cached by this validated user language). The core will recompose the final page containing the cached parsed page and the rest of the UI. In all cases, the MediaWiki parser and the Lua parsers are only plugins loaded from the MediaWiki internal environment, where everything is ready.

It's up to the MediaWiki API to expose this variable (or variables, if it exposes not just the preferred user language but also the user's own preferred fallbacks). These fallbacks could also be stored in user preferences, as an ordered array of language codes that will, however, need to be revalidated by the core when it loads the user's preferences from the database if the user is logged in. Users may not even need to set these preferences, which should by default use the browser's settings exposed in the standard "Accept-Language" HTTP header; there the order is specified by "q=*" numeric quality values rather than by a plain ordered list, but we don't need these values, just an ordered list of validated codes usable for the MediaWiki UI. So expose the main preferred language in the PHP implementation of the "int:" special namespace, and in Scribunto, and all is done! No need to load extra pages in wikis, and no longer any need for users to configure the preferred language for each wiki, as the MediaWiki UI by default will honor the browser's settings. In fact users may also want to REMOVE this language setting from their stored preferences, with the option "user language preferences of the current browser", or "application" if they are using a detected MediaWiki mobile app.
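
Turning the header's "q=" values into a plain ordered list, as suggested above, can be sketched as follows. This is a simplified illustration and not MediaWiki's parser: the function name `parse_accept_language` is hypothetical, tags are lowercased for simplicity, and the wildcard `*` is ignored.

```python
def parse_accept_language(header):
    """Return language tags from an Accept-Language header, best first."""
    weighted = []
    for position, item in enumerate(header.split(",")):
        parts = [p.strip() for p in item.split(";")]
        tag = parts[0]
        if not tag or tag == "*":
            continue
        quality = 1.0  # default weight when no q parameter is given
        for param in parts[1:]:
            name, _, value = param.partition("=")
            if name.strip().lower() == "q":
                try:
                    quality = float(value)
                except ValueError:
                    quality = 0.0
        if quality > 0:
            # Sort by descending quality, then by original position.
            weighted.append((-quality, position, tag.lower()))
    return [tag for _, _, tag in sorted(weighted)]
```

The resulting list still needs validation against the codes the wiki supports before it can be used for the UI.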

And there will be no need for admins to manage special pages on each wiki. Users will still be able to use another language using the Universal Language Selector (ULS), temporarily, or persistently on a specific wiki if they wish to do so and are logged in. Users without accounts are not required to create an account to view and navigate anywhere on a wiki in their preferred language, using simply the language settings of their browser or app, and the server will never need to store anything (except in server logs, whose access is highly restricted for privacy reasons) for non-logged-in users.
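
The order of precedence described in this comment (query parameter, then stored preference, then browser settings, then wiki default) can be written down as a small sketch. The function name `resolve_user_language` and its parameters are hypothetical; this is not how MediaWiki core is actually structured.

```python
def resolve_user_language(uselang, preference, accepted, supported, wiki_default):
    """Pick the first supported code among the candidate sources.

    uselang: value of the "uselang" query parameter, or None.
    preference: language stored in the user's preferences, or None.
    accepted: ordered list of codes derived from the browser settings.
    supported: set of codes the wiki can serve.
    """
    candidates = [uselang, preference, *accepted]
    for code in candidates:
        if code is not None and code.lower() in supported:
            return code.lower()
    return wiki_default
```

An anonymous reader with no query parameter and no stored preference simply gets the first supported language from their browser settings, and nothing needs to be stored on the server.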

For this reason, I think that exposing wgUserLanguage of PHP (or a subvariable in the MediaWiki environment) is the way to go: simplest, fastest, most efficient. There is no need to cross multiple language parser barriers: PHP is already at the root of everything in MediaWiki, both for the Scribunto extension and for the "int:" namespace hook, if the latter implements "int:Lang" internally without loading any page from the database with an unnecessarily complex mechanism of fallbacks, which should not be needed at all if the exposed variable is already canonicalized.

In general we would like to separate "page content" (in the "page content language") from user interface (in the "user interface language"). Currently much of that user interface is expressed in wikitext on sites like commons, but we'd like to reduce that over time. As mentioned before, wikitext in the user interface language complicates caching.

As a separate note, I'd like to encourage any new parser functions/magic words to use the {{#....}} syntax, as described in T204370: Behavior switch/magic word uniformity.

"As mentioned before, wikitext in the user interface language complicates caching."

But this is not new and this task won't change it! Caching is already performed per user language (if there's any use of it in the content, including when using the "{{int:*}}" syntax or any parser function that requires it, or in most special pages when their content is cacheable).

Caching is also controlled by an expiration date that already depends on the use of current-date functions and magic keywords, according to their precision: "{{CURRENTYEAR}}" will limit the expiration date to the end of the year, "{{CURRENTDAY}}" will limit it to the end of the day, and "{{CURRENTMINUTE}}" would limit it to the end of the current minute. Caches also define a minimum caching period, probably set to a few minutes, to avoid overly frequent refreshes, which could be costly for rendering servers; its exact duration will depend on the administrative settings of each wiki.
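
The expiry rule described here can be modelled as follows. This is a sketch of the behaviour as the comment describes it, not a reading of MediaWiki's code: the function name `cache_expiry` is hypothetical, and the five-minute minimum is an assumed value standing in for "a few minutes".

```python
from datetime import datetime, timedelta


def cache_expiry(now, magic_words, minimum=timedelta(minutes=5)):
    """Return when a rendered page should expire, or None if no time word is used."""
    candidates = []
    if "CURRENTYEAR" in magic_words:
        candidates.append(datetime(now.year + 1, 1, 1))  # end of the year
    if "CURRENTDAY" in magic_words:
        midnight = datetime(now.year, now.month, now.day)
        candidates.append(midnight + timedelta(days=1))  # end of the day
    if "CURRENTMINUTE" in magic_words:
        start_of_minute = now.replace(second=0, microsecond=0)
        candidates.append(start_of_minute + timedelta(minutes=1))
    if not candidates:
        return None
    # The most precise word wins, but never below the minimum caching period.
    return max(min(candidates), now + minimum)
```

For a page using "{{CURRENTMINUTE}}", the minimum period dominates, so the page is refreshed at most once per minimum interval rather than every minute.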

Yes, the legacy parser splits the user cache by user. Parsoid, however, does not: instead any per-user adjustments must happen as a post-processing pass. So we'd rather *not* see new features introduced which vary by user; these will all have to be reimplemented as post-processing passes later.
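
To make the idea of a post-processing pass concrete, here is a deliberately simplified sketch. It is not Parsoid's design: the placeholder string and both function names are hypothetical, and real output would be structured HTML rather than a plain string.

```python
# Hypothetical marker left in the shared, language-neutral output.
PLACEHOLDER = "<!--USERLANG-->"


def render_once(wikitext):
    """Language-neutral render: done once and cached for all users."""
    return wikitext.replace("{{USERLANGUAGE}}", PLACEHOLDER)


def post_process(cached_html, user_lang):
    """Cheap per-user pass applied to the shared cached output."""
    return cached_html.replace(PLACEHOLDER, user_lang)
```

The expensive render happens once, and only the cheap substitution varies by user. This works for a plain substitution, but not when the user language drives conditional logic such as #switch, which is part of why such features are costly to support.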

Just a note. This is a cool feature needed and used via extensions on many third party wikis. It will be sad to not have this feature at all via extensions in the future due to shortcomings or improvements, probably depends on how you see it, of the new parser. I very much suspect that every wiki using the Translate extension is likely to need this feature.

How does (or doesn't) parsoid currently handle {{int:}} syntax?

I don't think Translate users need this feature. Most cases should be covered by {{PAGELANGUAGE}} and the soon to be introduced {{TRANSLATIONLANGUAGE}}.


I was indeed a bit too general here I guess. Sorry for putting Translate into the same pot as multilingual wiki. ;)

The use case is to serve content depending on the language the user chose in their settings, without making them choose the language again on every page they look at. So after taking a second look, a bit late I guess, at the original description, I see that {{int:Lang}} could do the same thing. Perhaps a bit more effort to set up initially, but I do not mind here. Anyhow, that's the use case I have in mind.

I also agree that {{int:Lang}} is already widely used. However, the "int:" namespace by default performs a lookup of pages that must be created in the "MediaWiki:" namespace, named "Lang/*", with one page per language. This is the old method, and in my opinion MediaWiki does not necessarily need to load hundreds of pages for hundreds of languages (plus their variant subcodes): the parser can implement a hook that directly returns the user language from MediaWiki's API, without loading any page from the "MediaWiki:" namespace. All pages "MediaWiki:Lang/*" could then simply be deleted if they still exist, and protected from being recreated, or the "MediaWiki:" namespace could also have a hook that blocks creating/editing any page named "Lang" or its subpages.

The hook is very minimalist, as it won't make any MediaWiki API call, but will just look into variables that are already exposed by the internal API.

Why is it so complicated to implement and deploy this hook (which would also speed up rendering, as there would no longer be any need to load an external page from the database that just contains a plain-text code matching the subpage name)?

Why then do we need another special magic keyword? The "int:" namespace is already reserved and can be used for this. All that is needed is to implement a hook for page name resolvers in the two existing namespaces: they can perfectly well detect that the "base page name" is "Lang" and just return the language code seen in the requested subpage name (for the "MediaWiki:" namespace) or in the requested page name (in the "Int:" pseudo-namespace). No need to pollute the main namespace with more magic keywords.
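
Such a resolver hook could be as small as the following sketch. The function name `resolve_lang_message` is hypothetical, and returning the current user language for a bare "Lang" title is an assumption about the intended behaviour of "int:Lang".

```python
def resolve_lang_message(title, user_lang):
    """Resolve "Lang" or "Lang/<code>" without any page lookup.

    Returns None for every other title, so that normal message
    resolution can proceed.
    """
    base, _, subpage = title.partition("/")
    if base not in ("Lang", "lang"):
        return None
    # "Lang/fr" names its own code; a bare "Lang" means the user language.
    return subpage.lower() if subpage else user_lang
```

Nothing is read from the database: the answer comes either from the requested title itself or from the already validated user language.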

And for compatibility with wikis that have already deployed various hooks for extra magic keywords, inform them that they can deprecate these keywords and convert them to a very basic template just containing {{int:Lang}} (only to make the transition; bots will be able to replace invocations of these templates).

Then all wikis will be unified with the same syntax, {{int:Lang}}, which is the most common in many widely used wikis.

In general I don't like magic keywords that use the template-like syntax with an unqualified name at all, even if this name is capitalized: we have too many "magic keywords" using unqualified names. Using a namespace prefix for them is highly preferable, and it is much more flexible: new additions of qualified keywords will be safer and will avoid clashing with existing templates already used for incompatible goals.

In general, many internationalization features could be implemented with the same hook in the "int:" namespace (which you can think as meaning "international" or "internal", and already reserved for the internal maintenance and management of Mediawiki itself).

As well {{TRANSLATIONLANGUAGE}} could just be {{translate:Lang}} using the "translate:" existing namespace.

Please clean up with namespace separation. Pollution of the main/global namespace by magic keywords should be deprecated. Each MediaWiki extension should be subject to a policy requiring it to use its own namespace for all its needs.

And please adopt a policy for the addition of new namespaces: some wikis have a LOT of namespaces for normal pages. Namespaces for wiki extensions should preferably use a prefix starting with "#". In fact I would even prefer that the "int:" pseudo-namespace be changed to "#int:", so the final preferred syntax would be {{#int:Lang}}. Likewise, the internal special namespace "MediaWiki:" would better become #mediawiki:, or shorter, #mw:.

This would also help avoid clashes with interwiki codes, which are hard to detect because the list is long and extensible, and because a non-existing namespace is currently resolved as part of the main namespace, where the ":" character is allowed anywhere in the base page name. MediaWiki suffers from these old legacies from a time when there was no policy at all about naming conventions. Many new magic keywords added in successive versions of MediaWiki have caused name clashes with existing and unrelated pages or templates.

It's time to clean up this nightmare (which would also simplify the parsers, which constantly need to be updated and require long and costly maintenance on the wikis).

If you're trying to do a transparent redirect, then there are Parser hooks (see ParserOptions::getCurrentRevisionRecordCallback()) which might be more appropriate than setting up transclusion templates.

And yes, int: is also an issue: T85581: Parsoid page views: need to do something about {{int:}}. But we're trying to draw a line in the sand and avoid adding *new* things we can't support.

There's also T114640: make Parser::getTargetLanguage aware of multilingual wikis.

Wikitext parser functions and magic words are tools or building blocks to build actual things for end users. If this is considered essential to something bigger, I recommend filing a separate task specifically about that (one) bigger thing so that developers and product managers can help find or build something appropriate to that. Most likely that will not involve a magic word since that is not a sustainable long-term path.

Declining this for now, but as I said above, this is not declining the larger use case(s) that users are willing to build with it. Explain those separately and we can figure out how and if that is in scope for WMF and how to build it in a responsible way that doesn't lead to unstable software.

Change 508295 abandoned by TTO:

[mediawiki/core@master] Add a new {{USERLANGUAGE}} variable for use in wikitext


Task was declined.