Page MenuHomePhabricator

Wikibase receiving ⧼Lang⧽ from uselang parameter and using it everwhere
Open, Needs TriagePublic

Description

Calls to pages such as https://commons.wikimedia.org/wiki/Category:Plates?uselang=%E2%A7%BCLang%E2%A7%BD / https://commons.wikimedia.org/wiki/Category:Plates?uselang=⧼Lang⧽
should not make it as far down the stack of wikibase code as demonstrated in T246207#6078825

This is not a valid language and it should be handeled much further up rather than making it into a cache key.
In a situation like this I imagine a uselang=safjhaskfas should result in the default language being used.

I believe this has come up before and this can likely be linked to another ticket.
This might need fixing in mediawiki or in wikibase, but the root is to be determined.

Event Timeline

Restricted Application added a project: Wikidata. · View Herald TranscriptApr 22 2020, 10:13 PM
Restricted Application added a subscriber: Aklapper. · View Herald Transcript
Restricted Application added a project: User-Michael. · View Herald TranscriptJun 22 2020, 4:41 PM

We can't fix it where I would like to fix it:

RequestContext::getLanguage()
				// There are certain characters we don't allow in language code strings,
				// but by and large almost any valid UTF-8 string will makes it past
				// this check and the LanguageNameUtils::isValidCode method it uses.
				// This is to support on-wiki interface message overrides for
				// non-existent language codes. Also known as "Uselang hacks".
				// See <https://www.mediawiki.org/wiki/Manual:Uselang_hack>
				// For something like "en-whatever" or "de-whatever" it will end up
				// with a mostly "en" or "de" interface, but with an extra layer of
				// possible MessageCache overrides from `MediaWiki:*/<code>` titles.
				// While non-ASCII works here, it is required that they are in
				// NFC form given this will not convert to normalised form.
				$code = self::sanitizeLangCode( $code );

The reasoning more better explained in T39564#407711:

See implementation of Language::isValidCode, which explicitly doesn't check whether a language exists, in order to allow {{int:}} hacks such as historically used on Wikimedia Commons to provide an extra dimension of interface customisation:

https://commons.wikimedia.org/w/index.php?title=Special:Upload&uselang=ownwork
https://commons.wikimedia.org/w/index.php?title=Special:Upload&uselang=fromflickr
https://commons.wikimedia.org/w/index.php?title=Special:Upload&uselang=fromgov

So until we decide do no longer support that, it makes sense to have this consistently reflected throughout and not make a weird exception just for <html lang> but not anywhere else.

I'll keep looking. There must be some place that lends itself to throwing out everything except a valid language code.

Tarrow added a subscriber: Tarrow.Jun 23 2020, 4:06 PM

I think the main place this is used and causes problems is: T247057

You can see that what happens is that the CachingFallbackLabelDescriptionLookup is always looking up whatever is at the top of the LanguageFallbackChain (see: for example \Wikibase\Lib\Store\CachingFallbackLabelDescriptionLookup::getDescription)

This is a problem because in \Wikibase\Client\DataAccess\Scribunto\Scribunto_LuaWikibaseLibrary::getLanguageFallbackChain the
chain is built from a Language that is not guaranteed to be a Term language (it can either be the contentLanguage or the user language). ( See: \Wikibase\Client\DataAccess\Scribunto\Scribunto_LuaWikibaseLibrary::getLanguage )

The same problem may well be true in the other place it is used in Wikibase: \Wikibase\Lib\Formatters\WikibaseValueFormatterBuilders::newItemPropertyIdHtmlLinkFormatter where it is built from a language fallback chain from FormatterOptions (which is obviously custom to the caller in the datatypes callbacks)

This brings to mind a handful of solutions:

  • Augment the CachingFallbackLabelDescriptionLookup so that it only tries to work with valid terms. The challenges related to this are:
    • Having to awkwardly inject the valid Terms ContentLanguages object
    • Having to inject some 'default' Language in the event that none of the languages in the provided FallbackChain are valid Term Languages
  • Ensure that only valid Term Languages are used to create the CachingFallbackLabelDescriptionLookup. This has the challenge that:
    • The ItemPropertyIdHtmlLinkFormatter is built using a FallbackChain provided by FormatterOptions which other people can hook into
  • Remove CachingFallbackLabelDescriptionLookup and:
    • Create a new LabelDescriptionLookup e.g. using something like CachingPrefetchingTermLookup inside and swap OR
    • Remove ItemPropertyIdHtmlLinkFormatter (replace with LabelsProviderEntityIdHtmlLinkFormatter?) AND refactor \Wikibase\Client\DataAccess\Scribunto\WikibaseLanguageDependentLuaBindings to use an alternative method for LabelDescription Lookups

Change 613622 had a related patch set uploaded (by Michael Große; owner: Michael Große):
[mediawiki/extensions/Wikibase@master] Rename LanguageFallbackChain because it is about Terms

https://gerrit.wikimedia.org/r/613622

Change 613623 had a related patch set uploaded (by Michael Große; owner: Michael Große):
[mediawiki/extensions/Wikibase@master] Remove temporary class name alias again

https://gerrit.wikimedia.org/r/613623

Change 613625 had a related patch set uploaded (by Michael Große; owner: Michael Große):
[mediawiki/extensions/WikibaseLexeme@master] Rename LanguageFallbackChain to TermLanguageFallbackChain

https://gerrit.wikimedia.org/r/613625

Change 613628 had a related patch set uploaded (by Michael Große; owner: Michael Große):
[mediawiki/extensions/WikibaseCirrusSearch@master] Rename LanguageFallbackChain to TermLanguageFallbackChain

https://gerrit.wikimedia.org/r/613628

Change 613631 had a related patch set uploaded (by Michael Große; owner: Michael Große):
[mediawiki/extensions/WikibaseMediaInfo@master] Rename LanguageFallbackChain to TermLanguageFallbackChain

https://gerrit.wikimedia.org/r/613631

Change 613622 merged by jenkins-bot:
[mediawiki/extensions/Wikibase@master] Rename LanguageFallbackChain because it is about Terms

https://gerrit.wikimedia.org/r/613622

Change 613625 merged by jenkins-bot:
[mediawiki/extensions/WikibaseLexeme@master] Rename LanguageFallbackChain to TermLanguageFallbackChain

https://gerrit.wikimedia.org/r/613625

Change 613628 merged by jenkins-bot:
[mediawiki/extensions/WikibaseCirrusSearch@master] Rename LanguageFallbackChain to TermLanguageFallbackChain

https://gerrit.wikimedia.org/r/613628

Change 613631 merged by jenkins-bot:
[mediawiki/extensions/WikibaseMediaInfo@master] Rename LanguageFallbackChain to TermLanguageFallbackChain

https://gerrit.wikimedia.org/r/613631

Change 613623 merged by jenkins-bot:
[mediawiki/extensions/Wikibase@master] Remove temporary class name alias again

https://gerrit.wikimedia.org/r/613623

Change 615258 had a related patch set uploaded (by Lucas Werkmeister (WMDE); owner: Lucas Werkmeister (WMDE)):
[mediawiki/extensions/Wikibase@master] More (Term)LanguageFallbackChain renames

https://gerrit.wikimedia.org/r/615258

Change 615305 had a related patch set uploaded (by Eric Gardner; owner: Eric Gardner):
[mediawiki/extensions/WikibaseMediaInfo@master] Update variable definition that used deprecated LanguageFallbackChain class

https://gerrit.wikimedia.org/r/615305

Change 615305 merged by jenkins-bot:
[mediawiki/extensions/WikibaseMediaInfo@master] Update variable definition that used deprecated LanguageFallbackChain

https://gerrit.wikimedia.org/r/615305

Change 615258 merged by jenkins-bot:
[mediawiki/extensions/Wikibase@master] More (Term)LanguageFallbackChain renames

https://gerrit.wikimedia.org/r/615258

Change 615451 had a related patch set uploaded (by Michael Große; owner: Michael Große):
[mediawiki/extensions/WikibaseLexeme@master] Adjust test to account for qqx not being a valid term language

https://gerrit.wikimedia.org/r/615451

Change 615467 had a related patch set uploaded (by Michael Große; owner: Michael Große):
[mediawiki/extensions/WikibaseLexeme@master] Adjust for signature change in TermFallbackLanguageChain constructor

https://gerrit.wikimedia.org/r/615467

Change 615483 had a related patch set uploaded (by Michael Große; owner: Michael Große):
[mediawiki/extensions/WikibaseMediaInfo@master] Adjust for signature change in TermFallbackLanguageChain constructor

https://gerrit.wikimedia.org/r/615483

Change 608628 had a related patch set uploaded (by Michael Große; owner: Michael Große):
[mediawiki/extensions/Wikibase@master] Add validation to LanguageFallbackChain

https://gerrit.wikimedia.org/r/608628

Change 615471 had a related patch set uploaded (by Michael Große; owner: Michael Große):
[mediawiki/extensions/Wikibase@master] Undo constructor parameter being transitionally optional

https://gerrit.wikimedia.org/r/615471

Change 615451 merged by jenkins-bot:
[mediawiki/extensions/WikibaseLexeme@master] Adjust test to account for qqx not being a valid term language

https://gerrit.wikimedia.org/r/615451

Change 615697 had a related patch set uploaded (by Tobias Andersson; owner: Tobias Andersson):
[mediawiki/extensions/WikibaseCirrusSearch@master] TermLanguageFallbackChain constructor changed.

https://gerrit.wikimedia.org/r/615697

Change 615787 had a related patch set uploaded (by Tobias Andersson; owner: Tobias Andersson):
[mediawiki/extensions/WikibaseMediaInfo@master] LanguageFallbackChain has been renamed.

https://gerrit.wikimedia.org/r/615787

Change 615787 merged by jenkins-bot:
[mediawiki/extensions/WikibaseMediaInfo@master] LanguageFallbackChain has been renamed.

https://gerrit.wikimedia.org/r/615787

Change 608628 merged by jenkins-bot:
[mediawiki/extensions/Wikibase@master] Add validation to LanguageFallbackChain

https://gerrit.wikimedia.org/r/608628

Change 615697 merged by jenkins-bot:
[mediawiki/extensions/WikibaseCirrusSearch@master] TermLanguageFallbackChain constructor changed.

https://gerrit.wikimedia.org/r/615697

Change 615483 merged by jenkins-bot:
[mediawiki/extensions/WikibaseMediaInfo@master] Adjust for signature change in TermFallbackLanguageChain constructor

https://gerrit.wikimedia.org/r/615483

Change 615467 merged by jenkins-bot:
[mediawiki/extensions/WikibaseLexeme@master] Adjust for signature change in TermFallbackLanguageChain constructor

https://gerrit.wikimedia.org/r/615467

Change 615471 merged by jenkins-bot:
[mediawiki/extensions/Wikibase@master] Undo constructor parameter being transitionally optional

https://gerrit.wikimedia.org/r/615471

Michael closed this task as Resolved.Aug 3 2020, 1:58 PM
Michael removed a project: Patch-For-Review.

As far as I can tell, all patches have been merged for this ticket.

Michael reopened this task as Open.Aug 3 2020, 1:58 PM

Closing it was unintentional