Page MenuHomePhabricator

WikiBase assumes English doesn't have a variant
Closed, ResolvedPublic

Description

See gerrit failures for https://gerrit.wikimedia.org/r/72053

Wikibase makes the assumption that English won't have a variant assigned. That assumption isn't always correct.

Related Objects

StatusAssignedTask
OpenNone
OpenNone
OpenNone
StalledNone
OpenNone
StalledNone
Resolvedovasileva
OpenNone
DuplicateNone
OpenABorbaWMF
OpenNone
ResolvedPchelolo
Resolvedmobrovac
ResolvedPchelolo
ResolvedJdforrester-WMF
ResolvedMarkTraceur
ResolvedJdforrester-WMF
Resolvedcscott
ResolvedJdforrester-WMF
OpenNone
OpenNone
OpenNone
OpenNone
Opencscott
Resolvedliangent
Resolvedthiemowmde

Event Timeline

cscott created this task.Jan 25 2017, 5:12 PM
Restricted Application added a project: Wikidata. · View Herald TranscriptJan 25 2017, 5:12 PM
Restricted Application added a subscriber: Aklapper. · View Herald Transcript

It looks like what is happening is that "en" language is resolved to be it's own variant even if pig latin is not enabled. It's not exactly right, and I'm not sure it's Wikibase test's fault - I think actually Wikibase is not at fault here. It's the code that says "we have variants, but we don't actually have variants". So we should either not declare variants, or at least not declare "en" to be its own variant.

Looked more into it, and looks like fallback chain is supposed to return language itself if it's specified as "language with variants"... So looks like maybe making it somehow conditional may work, but I couldn't figure out code fix for it.

The logic for "should a language have languageconverter enabled" is pretty baroque; see the discussion in T153341. It's more-or-less $language->getConverter() instanceof FakeConverter. If you use that logic, Wikibase will work for pig latin. Unfortunately, "having only a single variant" doesn't necessarily mean that language converter is not enabled. Some wikis might have existing text in a different variant, but then decide to shift to being a 'single variant wiki' --- but they'll leave language converter active to properly render their old content, and just disable all but one variant.

thiemowmde triaged this task as Normal priority.Apr 12 2017, 6:08 PM
thiemowmde moved this task from incoming to blocked on others on the Wikidata board.

Unfortunately the failing test result was not copied to this ticket, and now that the last patch set is about two months old all test results are already deleted. I tried to retrigger the tests on https://gerrit.wikimedia.org/r/72053 and even rebased the patch, but Gerrit does not like what I did. I'm afraid I have to wait for somebody to get this patch in a working state, so we can look at a fresh log.

The correct logic to determine if a language has variants doesn't involve looking at the fallback chain. See discussion in T153341.

@cscott The Wikibase code looks at LanguageConverter::$languagesWithVariants. The problem is that before the patch that list did not contain en but now it does. And depending on whether the language is there or not, getParentLanguage() for en returns different results - it returned null before, but with the patch it returns object for en so English is parent of itself now. The description of FALLBACK_VARIANTS suggests that's what languages with variants should do, but not what languages without variants do. The problem seems to be Wikibase assumes en is the latter, but the patch changes it to the former.

I'm not sure it's a good idea to have certain languages behave completely different from others on the same calls, but that's how the code seems to go now.

Change 348048 had a related patch set uploaded (by Thiemo Mättig (WMDE)):
[mediawiki/extensions/Wikibase@master] Consider languages being it's own parent in LanguageFallbackChainFactory

https://gerrit.wikimedia.org/r/348048

thiemowmde moved this task from blocked on others to in progress on the Wikidata board.

For reference, here is a copy of the failures:

21:28:37 1) Wikibase\Lib\Tests\LanguageFallbackChainFactoryTest::testNewFromLanguage with data set #1 ('en', 2, array())
21:28:37 Failed asserting that 1 is identical to 0.
21:28:37 
21:28:37 /home/jenkins/workspace/mediawiki-extensions-hhvm-jessie/src/extensions/Wikidata/extensions/Wikibase/lib/tests/phpunit/LanguageFallbackChainFactoryTest.php:28
21:28:37 /home/jenkins/workspace/mediawiki-extensions-hhvm-jessie/src/extensions/Wikidata/extensions/Wikibase/lib/tests/phpunit/LanguageFallbackChainFactoryTest.php:109
21:28:37 /home/jenkins/workspace/mediawiki-extensions-hhvm-jessie/src/tests/phpunit/MediaWikiTestCase.php:400
21:28:37 /home/jenkins/workspace/mediawiki-extensions-hhvm-jessie/src/maintenance/doMaintenance.php:111
21:28:37 
21:28:37 2) Wikibase\Lib\Tests\LanguageFallbackChainFactoryTest::testNewFromLanguageCode with data set #1 ('en', 2, array())
21:28:37 Failed asserting that 1 is identical to 0.
21:28:37 
21:28:37 /home/jenkins/workspace/mediawiki-extensions-hhvm-jessie/src/extensions/Wikidata/extensions/Wikibase/lib/tests/phpunit/LanguageFallbackChainFactoryTest.php:28
21:28:37 /home/jenkins/workspace/mediawiki-extensions-hhvm-jessie/src/extensions/Wikidata/extensions/Wikibase/lib/tests/phpunit/LanguageFallbackChainFactoryTest.php:124
21:28:37 /home/jenkins/workspace/mediawiki-extensions-hhvm-jessie/src/tests/phpunit/MediaWikiTestCase.php:400
21:28:37 /home/jenkins/workspace/mediawiki-extensions-hhvm-jessie/src/maintenance/doMaintenance.php:111
21:28:37 
21:28:37 FAILURES!
21:28:37 Tests: 14541, Assertions: 116581, Failures: 2, Skipped: 116, Risky: 39.

I created https://gerrit.wikimedia.org/r/348048 and tried to make https://gerrit.wikimedia.org/r/72053 depend on it. Let's see if this fixes the issue.

Change 348229 had a related patch set uploaded (by C. Scott Ananian):
[mediawiki/extensions/Wikibase@master] Don't trust LanguageConverter::$languagesWithVariants

https://gerrit.wikimedia.org/r/348229

Change 348229 abandoned by C. Scott Ananian:
Don't trust LanguageConverter::$languagesWithVariants

Reason:
It turns out that $language->getParentLanguage() already has the desired call to $parentLanguage->hasVariant($language) and so this patch is not needed.

https://gerrit.wikimedia.org/r/348229

Change 348048 merged by jenkins-bot:
[mediawiki/extensions/Wikibase@master] Consider languages being it's own parent in LanguageFallbackChainFactory

https://gerrit.wikimedia.org/r/348048

thiemowmde closed this task as Resolved.Apr 17 2017, 8:41 PM
thiemowmde claimed this task.
thiemowmde removed a project: Patch-For-Review.
thiemowmde moved this task from Review to Done on the Wikidata-Former-Sprint-Board board.

I tried https://gerrit.wikimedia.org/r/72053 locally and found that my patch https://gerrit.wikimedia.org/r/348048 solves the issue. Why https://gerrit.wikimedia.org/r/72053 still fails, I don't know. It might just need an other rebase.