DBPageLanguage isn't returned by Title::getPageLanguage
Closed, Resolved, Public

Description

Consider the following scenario: I have a German wiki ($wgContLang is de) and want to add an English text that can be translated with the Translate extension. To make this work, I need to change the page language of the page I want to translate to English (instead of the default, German). For that, I set $wgPageLanguageUseDB to true, which allows changing the page language of a specific page.

Now, if I call Title::getPageLanguage(), I would expect it to return a Language object that represents the DB language (en) instead of $wgContLang (de). Unfortunately, the Translate extension still shows that the page will be translated from de, not from en (the DB language).

I created a hook in another extension to recheck this:

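public static function onPageContentLanguage( Title $title, Language &$pageLang, $userLang ) {
    // Language code as resolved through the Title object
    var_dump( $title->getPageLanguage()->getCode() );
    // Language code of the object passed into the hook
    var_dump( $pageLang->getCode() );
}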

The output is "en" and "de". That doesn't make much sense, by the way, because Title::getPageLanguage() is usually the function that calls the hook (through ContentHandler::getPageLanguage()). I'm not sure yet whether this is caused by my setup; I will verify that first.

By the way, if I change the language in the hook:

public static function onPageContentLanguage( Title $title, Language &$pageLang, $userLang ) {
    $pageLang = wfGetLangObj( 'en' );
}

everything works fine.

Some further information, with screenshots of the specific problem, that may help to understand it. The source page is "TestSections", whose page language was set to German (using Special:PageLanguage). The content language is, by default, English:

The translatable page looks correct when requested in the browser:

But clicking on "translate" gives this result in the header (which is wrong: the source language isn't en, it's de, as saved in the DB):

Requesting the English version of the page (TestSections/en) gives this confusing header:

Clicking on German (Deutsch) should redirect to TestSections (the base page, which has the page language German), but instead redirects to TestSections/de:


and says that this is a translated version of the base page, which is (again) false :)

Outputting the source language of the base title (in PageTranslationHooks::languages()) shows the following:


while it should instead say that the source language is de (saved in the database), not en (the default content language).

However, the problem does not occur if you call getPageLanguage() on the title of the page you're currently viewing. WikiPage already requests the page_lang attribute and loads it correctly into the Title object.

Event Timeline

Florian raised the priority of this task to Needs Triage.
Florian updated the task description.
Florian added a subscriber: Florian.

Hmm, after some investigation, it seems that the Title object returned by Title::newFromText() (probably only under some conditions) doesn't have the Title::$mDbPageLanguage attribute set. Therefore, Title::getPageLanguage() returns the content language ($wgContLang) instead of the language set in the database. The easiest solution would be to lazy-load the value from the database, but in the worst case, where the value isn't cached, this results in an extra database query. I think there has to be a better solution :)
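
The lazy-loading idea described above can be sketched roughly as follows. This is only an illustration, not MediaWiki's Title class and not the merged patch: the class LazyPageLanguageTitle, its constructor arguments, and the $loader callback (standing in for a database lookup of page_lang) are all hypothetical.

```php
<?php
// Hypothetical stand-in for Title; it only illustrates lazy loading with caching.
final class LazyPageLanguageTitle {
    /** @var string|null|false false = not loaded yet, null = nothing stored in the DB */
    private $dbPageLanguage = false;
    /** @var string */
    private $text;
    /** @var string */
    private $contentLanguage;
    /** @var callable Takes a page title, returns a language code or null */
    private $loader;

    public function __construct( string $text, string $contentLanguage, callable $loader ) {
        $this->text = $text;
        $this->contentLanguage = $contentLanguage;
        $this->loader = $loader;
    }

    public function getPageLanguageCode(): string {
        if ( $this->dbPageLanguage === false ) {
            // First access: ask the (assumed) DB loader once and remember the result,
            // including a null result, so later calls cause no further lookups.
            $this->dbPageLanguage = ( $this->loader )( $this->text );
        }
        // Fall back to the wiki content language when the page has no DB language.
        return $this->dbPageLanguage ?? $this->contentLanguage;
    }
}

// Simulated page_lang storage plus a query counter (assumptions for the demo).
$queries = 0;
$pageLangTable = [ 'TestSections' => 'de' ];
$loader = function ( string $text ) use ( &$queries, $pageLangTable ): ?string {
    $queries++;
    return $pageLangTable[$text] ?? null;
};

$title = new LazyPageLanguageTitle( 'TestSections', 'en', $loader );
$first = $title->getPageLanguageCode();  // triggers the lookup
$second = $title->getPageLanguageCode(); // served from the cached value
$other = ( new LazyPageLanguageTitle( 'Other', 'en', $loader ) )->getPageLanguageCode();
```

Using false as the "not loaded yet" marker keeps it distinct from null ("loaded, but no language stored"), so a page without a DB language does not cause repeated lookups. The extra query on first access mentioned in the comment is still there; the sketch only avoids repeating it for the same object.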

Florian set Security to None.
Florian updated the task description.

Change 260203 had a related patch set uploaded (by Florianschmidtwelzow):
Fix not-loaded DbPageLanguage when Title::getPageLanguage() get's called

https://gerrit.wikimedia.org/r/260203

Hi, thanks for the patch. Robin was working on a patch that restructures some features so that everything falls into place, and a lot of work, including work similar to this patch, depends on it.
All of these patches are outdated, but I can surely work on getting them back on track.
Robin's patch:
https://gerrit.wikimedia.org/r/#/c/137033/11

Further work on top of this restructuring (this can be done without the restructuring too, but the restructuring makes the further features simpler):
https://gerrit.wikimedia.org/r/#/c/143049/
https://gerrit.wikimedia.org/r/#/c/143015/
https://gerrit.wikimedia.org/r/#/c/143025/
https://gerrit.wikimedia.org/r/#/c/137915/
https://gerrit.wikimedia.org/r/#/c/151920/
https://gerrit.wikimedia.org/r/#/c/151310/

Do you think we should go ahead with keeping the current class or follow Robin's approach? In either case, I can help to create new patches that achieve the above functionality.

I haven't looked at your changes, but it seems unnecessary to rebuild the page language construct just to fix this bug (in my opinion, a bug fix should be as small as possible, changing only what is needed to fix the specific bug). However, if you think that rebuilding the page language handling in MediaWiki into a better one is needed, that sounds like a great idea (if it is really better :P)! This task (from my side) should fix "only" this specific bug, that is, the broken behaviour :) I hope you understand what I mean :P

Actually, this was one of the bugs fixed in the change set, and in an efficient way.

However, according to the comments listed there, it will definitely take quite some time to pass. I mentioned all the related work to avoid work getting repeated. :)

Change 260203 merged by jenkins-bot:
Fix not-loaded DbPageLanguage when Title::getPageLanguage() get's called

https://gerrit.wikimedia.org/r/260203