
Feature request: add detection for page language to Scribunto
Closed, Resolved (Public)

Description

Recently, {{PAGELANGUAGE}} has been introduced.

I want to query this property from Lua as mw.language.getPageLanguage(), analogous to mw.language.getContentLanguage(). The two might be almost identical on regular wikis but could differ on wikis with translation features.

I am not happy about being forced to rely on frame:preprocess() hacks.

Actually, I want this for the current page only. A broader approach would be to expose a set of page properties for any page through mw.title, including redirect, disambiguation, content model, and defaultsort attributes.
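For illustration, the frame:preprocess() workaround referred to above might look like the following minimal sketch. The module name PageLanguageDemo and the function name are hypothetical; the only calls used are frame:preprocess() and the {{PAGELANGUAGE}} magic word from this description.

```lua
-- Module:PageLanguageDemo (hypothetical module name, for illustration only)
local p = {}

-- Returns the language code of the page currently being rendered by
-- expanding the {{PAGELANGUAGE}} magic word through the parser.
function p.code(frame)
    return frame:preprocess('{{PAGELANGUAGE}}')
end

return p
```

It would be called from wikitext as {{#invoke:PageLanguageDemo|code}}. Note that this only works for the current page, since the magic word takes no arguments.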

Event Timeline

mw.title objects already have isRedirect and contentModel, pageLanguage would fit well there.

Change 346176 had a related patch set uploaded (by Anomie):
[mediawiki/extensions/Scribunto@master] mw.title: Add pageLanguage property

https://gerrit.wikimedia.org/r/346176

We could have mw.language.getUserLanguage() alongside getContentLanguage() and getPageLanguage() to cover the language needs of any module. See T68051.

User language is off topic for this task.

Please note: I have seen that https://www.mediawiki.org/wiki/Extension:LanguageCode implements two magic words:

{{USERLANGUAGE}} - displays the user language (see T4085 for this issue)
{{PAGELANGUAGE}} - displays the page language
I have not reviewed the source code, but maybe this is a solution for this issue.

Uzume triaged this task as Medium priority.
Uzume subscribed.

@Anomie Can we get your change from over three years ago merged? This is an easy and straightforward fix but Gerrit is reporting some sort of merge conflict even though Jenkins had no issues with it.

https://gerrit.wikimedia.org/r/346176

Aklapper raised the priority of this task from Medium to Needs Triage.
Aklapper edited projects, added Patch-Needs-Improvement; removed Patch-For-Review.

@Uzume: Please don't prioritize tasks and assign tasks without assignee consent - thanks!

Jenkins had no issues with the patch because it ran in 2017, when there wasn't a merge conflict. I've told Jenkins to run again, and it's now properly failing.

Uzume added a subscriber: Pppery.

@Aklapper Does it really need triage? There was already a patch for it (though it seems to need updating). I can see how the patch itself needs triage, but the issue seems well understood. Anomie already clarified that mw.language.getPageLanguage was not the right thing and demonstrated that a pageLanguage field on mw.title objects was the way to go. What further triage does this issue really need? I only assigned it to Anomie so that he would respond based on the patch he created. I understand if he wants to remove himself at this point, but the point was to get him to make such a statement.

@Pppery Thanks for forcing the rebuild. I never assumed the conflict was not there. I just noticed it was not originally there. I really wonder why it was never merged.

In any event, I believe what needs to be done is clear. Short of attempting to make a new patch myself (I do not really have a proper MediaWiki development and test environment set up), how do I facilitate forward action on this? Thank you.

Please do not remove valid project tags and do not replace them by wrong tags - thanks.

I actually did not do that. I think I must somehow have edited/submitted an older version (though I am not sure how, as that was not my intention).

Change 747996 had a related patch set uploaded (by Tim Starling; author: Tim Starling):

[mediawiki/extensions/Scribunto@master] Add test for pageLanguage property

https://gerrit.wikimedia.org/r/747996

Change 346176 merged by jenkins-bot:

[mediawiki/extensions/Scribunto@master] mw.title: Add pageLanguage property

https://gerrit.wikimedia.org/r/346176

This patch is apparently causing quite a bit of noise in logs and will likely have a bigger impact with the rollout of 1.38.0-wmf.16 to all wikis today. I may revert the above patch (https://gerrit.wikimedia.org/r/346176) if no better solution can be found. See T298659: BadMethodCallException: Sessions are disabled for load entry point

For reference, {{PAGELANGUAGE}}, mentioned in the description, was added in T59603. However, it only returns the language of the page being rendered, since unlike {{PAGENAME}} and friends it takes no arguments. It cannot return the content language of arbitrary pages, even though arbitrary page content is available via getContent on mw.title objects.

I believe if arbitrary page content is made available, the purported language of that content should also be made available (along with the content model which is already also available).

I look forward to T299369 getting resolved in hopes that this might also be able to be resolved by that (since that seems to be blocking this one).

Change 747996 merged by jenkins-bot:

[mediawiki/extensions/Scribunto@master] Add test for pageLanguage property

https://gerrit.wikimedia.org/r/747996

Change 844519 had a related patch set uploaded (by C. Scott Ananian; author: Reedy):

[mediawiki/extensions/Scribunto@master] mw.title: add `pageLang` property

https://gerrit.wikimedia.org/r/844519

The problems that caused the revert in 2022 have been resolved (see T299369), so it should be possible to re-attempt this change now.

Change 844519 merged by jenkins-bot:

[mediawiki/extensions/Scribunto@master] mw.title: add `pageLang` property

https://gerrit.wikimedia.org/r/844519

I am glad to see this long-overdue implementation finally arrive. However, it was made "expensive" regardless of whether other related title information was already fetched about the same target page during the parser rendering of the current page. This is substantially different from the PAGELANGUAGE variable, which does not appear to be "expensive" (I have only looked at the documentation so far, not the code).

So sadly if a Scribunto Lua programmer wants the language code for the current page (as this issue originally requested), there is little reason to move from mw.getCurrentFrame():preprocess('{{PAGELANGUAGE}}') to mw.title.getCurrentTitle().pageLang:getCode(). Now it is true that mw.title objects cache this but there is no caching across such objects or across #invoke calls so depending on how things work, using this new interface for the current page could get "expensive" fast. And the same is true for other pages although now it is possible to get that information and before it was entirely inaccessible.
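As a rough sketch of the two routes compared above, the module below reads the current page's language both ways and also looks up the language of an arbitrary page, which only the new property allows. The module name PageLangCompare and the function names are hypothetical. The pageLang access is the "expensive" one discussed in this comment.

```lua
-- Module:PageLangCompare (hypothetical module name, for illustration only)
local p = {}

-- Current page: the older preprocess() route next to the new pageLang property.
function p.current(frame)
    local viaParser = frame:preprocess('{{PAGELANGUAGE}}')
    -- Accessing pageLang counts as "expensive", as discussed above.
    local viaTitle = mw.title.getCurrentTitle().pageLang:getCode()
    return viaParser .. ' / ' .. viaTitle
end

-- Arbitrary page: only possible through the mw.title property.
-- The page name is taken from the first #invoke argument.
function p.langOf(frame)
    local title = mw.title.new(frame.args[1] or '')
    if not title then
        -- mw.title.new() returns nil for an invalid title
        return ''
    end
    return title.pageLang:getCode()
end

return p
```

Usage from wikitext would be {{#invoke:PageLangCompare|current}} or {{#invoke:PageLangCompare|langOf|Some page}}.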

@Uzume – If you are really expecting to violate page limitations, there is a hint, or hack.

Create a CurrentPage.lua module.

  • This might consist of one line only, at least in effect:
  • return mw.title.getCurrentTitle()

In applications, use

local page = mw.loadData( "Module:CurrentPage" )
local slang = page.pageLang:getCode()

Not tested, but should work in principle.

  • mw.loadData() is evaluated once per page.
  • The result is then attached to the current page as static content.
  • The content is a table in this case, with title structure.
  • Subsequent queries by further #invoke will retrieve that static content.

I am afraid that it will be necessary to create a local table as CurrentPage.lua result.

  • mw.loadData() does not permit functions.
  • The title is an object with methods rather than a simple table.
  • The returned table must contain only plain values such as strings, numbers and booleans (and tables of those).

@PerfektesChaos: I am not sure I am really looking to violate any explicit limitations, but avoiding tripping them is good. It is not a bad idea to use mw.loadData() to wrap the pageLang as a workaround to prevent increased "expensiveness", but mw.loadData() validates the return value and does not allow functions or metatables, both of which mw.title objects such as the one returned by mw.title.getCurrentTitle() have.

A simplistic possibility would be to have a Module:PageLang that contained return {langcode=mw.title.getCurrentTitle().pageLang:getCode()} (or previously return {langcode=mw.getCurrentFrame():preprocess('{{PAGELANGUAGE}}')}) and then use local pagelang = mw.language.new(mw.loadData("Module:PageLang").langcode). The prior preprocess() might not be "expensive" but it is resource expensive to repeatedly evaluate so encapsulating the expense (either way) can be useful.
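Spelled out, the construct described above might look like the sketch below. Module:PageLang and its langcode field come from the comment above; the consumer module name PageLangDemo and its function name are hypothetical, and this has not been tested on a live wiki.

```lua
-- Module:PageLang (data module, as described above) would contain only:
--   return { langcode = mw.title.getCurrentTitle().pageLang:getCode() }
-- It returns a plain table with one string, which mw.loadData() accepts.

-- Module:PageLangDemo (hypothetical consumer module, for illustration only)
local p = {}

function p.code(frame)
    -- mw.loadData() evaluates Module:PageLang once per page render and
    -- hands the same read-only table to every later #invoke.
    local data = mw.loadData('Module:PageLang')
    -- Rebuild a language object from the cached code string.
    local pagelang = mw.language.new(data.langcode)
    return pagelang:getCode()
end

return p
```

The design choice here is to cache only the language code string and rebuild the language object in each consumer, since the object itself carries methods and cannot pass through mw.loadData().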

The point of my comment is that although such a construct is possible, why is it necessary? This just underscores laziness in making the "expensive" count work correctly: punting to always increase it ensures it is not missed, but it is not really right.