Wikibase critical error "Failed to format entity ID. Cache key contains characters that are not allowed"
Closed, Resolved | Public | 5 Estimated Story Points | PRODUCTION ERROR

Description

Error

Request ID: W6GY@QrAIEAAAIfoQdwAAAAO

message
Failed to format entity ID. Using fallback formatter.

Error: Cache key contains characters that are not allowed
stacktrace
#0 /srv/mediawiki/php-1.32.0-wmf.20/extensions/Wikibase/lib/includes/SimpleCacheWithBagOStuff.php(262): Wikibase\Lib\SimpleCacheWithBagOStuff->invalidArgument(string)
#1 /srv/mediawiki/php-1.32.0-wmf.20/extensions/Wikibase/lib/includes/SimpleCacheWithBagOStuff.php(67): Wikibase\Lib\SimpleCacheWithBagOStuff->assertKeyIsValid(string)
#2 /srv/mediawiki/php-1.32.0-wmf.20/extensions/Wikibase/lib/includes/Store/CachingFallbackLabelDescriptionLookup.php(121): Wikibase\Lib\SimpleCacheWithBagOStuff->get(string, string)
#3 /srv/mediawiki/php-1.32.0-wmf.20/extensions/Wikibase/lib/includes/Store/CachingFallbackLabelDescriptionLookup.php(107): Wikibase\Lib\Store\CachingFallbackLabelDescriptionLookup->getTerm(Wikibase\DataModel\Entity\ItemId, string, string)
#4 /srv/mediawiki/php-1.32.0-wmf.20/vendor/wikibase/data-model-services/src/EntityId/EntityIdLabelFormatter.php(53): Wikibase\Lib\Store\CachingFallbackLabelDescriptionLookup->getLabel(Wikibase\DataModel\Entity\ItemId)
#5 /srv/mediawiki/php-1.32.0-wmf.20/extensions/Wikibase/lib/includes/Formatters/ItemIdHtmlLinkFormatter.php(65): Wikibase\DataModel\Services\EntityId\EntityIdLabelFormatter->lookupEntityLabel(Wikibase\DataModel\Entity\ItemId)
#6 /srv/mediawiki/php-1.32.0-wmf.20/extensions/Wikibase/lib/includes/Formatters/ControlledFallbackEntityIdFormatter.php(76): Wikibase\Lib\Formatters\ItemIdHtmlLinkFormatter->formatEntityId(Wikibase\DataModel\Entity\ItemId)
#7 /srv/mediawiki/php-1.32.0-wmf.20/extensions/Wikibase/lib/includes/Formatters/EntityIdValueFormatter.php(44): Wikibase\Lib\Formatters\ControlledFallbackEntityIdFormatter->formatEntityId(Wikibase\DataModel\Entity\ItemId)
#8 /srv/mediawiki/php-1.32.0-wmf.20/extensions/Wikibase/lib/includes/Formatters/DispatchingValueFormatter.php(75): Wikibase\Lib\EntityIdValueFormatter->format(Wikibase\DataModel\Entity\EntityIdValue)
#9 /srv/mediawiki/php-1.32.0-wmf.20/extensions/Wikibase/lib/includes/Formatters/PropertyValueSnakFormatter.php(148): Wikibase\Lib\Formatters\DispatchingValueFormatter->formatValue(Wikibase\DataModel\Entity\EntityIdValue, string)
#10 /srv/mediawiki/php-1.32.0-wmf.20/extensions/Wikibase/lib/includes/Formatters/PropertyValueSnakFormatter.php(117): Wikibase\Lib\PropertyValueSnakFormatter->formatValue(Wikibase\DataModel\Entity\EntityIdValue, string)
#11 /srv/mediawiki/php-1.32.0-wmf.20/extensions/Wikibase/lib/includes/Formatters/DispatchingSnakFormatter.php(151): Wikibase\Lib\PropertyValueSnakFormatter->formatSnak(Wikibase\DataModel\Snak\PropertyValueSnak)
#12 /srv/mediawiki/php-1.32.0-wmf.20/extensions/Wikibase/lib/includes/Formatters/ErrorHandlingSnakFormatter.php(68): Wikibase\Lib\DispatchingSnakFormatter->formatSnak(Wikibase\DataModel\Snak\PropertyValueSnak)
#13 /srv/mediawiki/php-1.32.0-wmf.20/extensions/Wikibase/view/src/SnakHtmlGenerator.php(129): Wikibase\Lib\Formatters\ErrorHandlingSnakFormatter->formatSnak(Wikibase\DataModel\Snak\PropertyValueSnak)
#14 /srv/mediawiki/php-1.32.0-wmf.20/extensions/Wikibase/view/src/SnakHtmlGenerator.php(79): Wikibase\View\SnakHtmlGenerator->getFormattedSnakValue(Wikibase\DataModel\Snak\PropertyValueSnak)
#15 /srv/mediawiki/php-1.32.0-wmf.20/extensions/Wikibase/view/src/StatementHtmlGenerator.php(181): Wikibase\View\SnakHtmlGenerator->getSnakHtml(Wikibase\DataModel\Snak\PropertyValueSnak, boolean)
#16 /srv/mediawiki/php-1.32.0-wmf.20/extensions/Wikibase/view/src/StatementHtmlGenerator.php(161): Wikibase\View\StatementHtmlGenerator->getSnaklistviewHtml(array)
#17 /srv/mediawiki/php-1.32.0-wmf.20/extensions/Wikibase/view/src/StatementHtmlGenerator.php(132): Wikibase\View\StatementHtmlGenerator->getHtmlForReference(Wikibase\DataModel\Reference)
#18 /srv/mediawiki/php-1.32.0-wmf.20/extensions/Wikibase/view/src/StatementHtmlGenerator.php(88): Wikibase\View\StatementHtmlGenerator->getHtmlForReferences(Wikibase\DataModel\ReferenceList)
#19 /srv/mediawiki/php-1.32.0-wmf.20/extensions/Wikibase/view/src/StatementGroupListView.php(154): Wikibase\View\StatementHtmlGenerator->getHtmlForStatement(Wikibase\DataModel\Statement\Statement, string)
#20 /srv/mediawiki/php-1.32.0-wmf.20/extensions/Wikibase/view/src/StatementGroupListView.php(138): Wikibase\View\StatementGroupListView->getHtmlForStatementListView(array, string)
#21 /srv/mediawiki/php-1.32.0-wmf.20/extensions/Wikibase/view/src/StatementGroupListView.php(71): Wikibase\View\StatementGroupListView->getHtmlForStatementGroupView(array)
#22 /srv/mediawiki/php-1.32.0-wmf.20/extensions/Wikibase/view/src/StatementSectionsView.php(71): Wikibase\View\StatementGroupListView->getHtml(array)
#23 /srv/mediawiki/php-1.32.0-wmf.20/extensions/Wikibase/view/src/ItemView.php(86): Wikibase\View\StatementSectionsView->getHtml(Wikibase\DataModel\Statement\StatementList)
#24 /srv/mediawiki/php-1.32.0-wmf.20/extensions/Wikibase/view/src/EntityView.php(76): Wikibase\View\ItemView->getMainHtml(Wikibase\DataModel\Entity\Item)
#25 /srv/mediawiki/php-1.32.0-wmf.20/extensions/Wikibase/repo/includes/ParserOutput/EntityParserOutputGenerator.php(282): Wikibase\View\EntityView->getHtml(Wikibase\DataModel\Entity\Item)
#26 /srv/mediawiki/php-1.32.0-wmf.20/extensions/Wikibase/repo/includes/ParserOutput/EntityParserOutputGenerator.php(167): Wikibase\Repo\ParserOutput\EntityParserOutputGenerator->addHtmlToParserOutput(ParserOutput, Wikibase\DataModel\Entity\Item, Wikibase\Lib\Store\EntityInfo)
#27 /srv/mediawiki/php-1.32.0-wmf.20/extensions/Wikibase/repo/includes/Content/EntityContent.php(278): Wikibase\Repo\ParserOutput\EntityParserOutputGenerator->getParserOutput(Wikibase\DataModel\Entity\Item, boolean)
#28 /srv/mediawiki/php-1.32.0-wmf.20/extensions/Wikibase/repo/includes/Content/EntityContent.php(215): Wikibase\EntityContent->getParserOutputFromEntityView(integer, ParserOptions, boolean)
#29 /srv/mediawiki/php-1.32.0-wmf.20/includes/poolcounter/PoolWorkArticleView.php(145): Wikibase\EntityContent->getParserOutput(Title, integer, ParserOptions)
#30 /srv/mediawiki/php-1.32.0-wmf.20/includes/poolcounter/PoolCounterWork.php(123): PoolWorkArticleView->doWork()
#31 /srv/mediawiki/php-1.32.0-wmf.20/includes/page/Article.php(617): PoolCounterWork->execute()
#32 /srv/mediawiki/php-1.32.0-wmf.20/extensions/Wikibase/repo/includes/Actions/ViewEntityAction.php(79): Article->view()
#33 /srv/mediawiki/php-1.32.0-wmf.20/extensions/Wikibase/repo/includes/Actions/ViewEntityAction.php(54): Wikibase\ViewEntityAction->showEntityPage()
#34 /srv/mediawiki/php-1.32.0-wmf.20/includes/MediaWiki.php(501): Wikibase\ViewEntityAction->show()
#35 /srv/mediawiki/php-1.32.0-wmf.20/includes/MediaWiki.php(294): MediaWiki->performAction(Article, Title)
#36 /srv/mediawiki/php-1.32.0-wmf.20/includes/MediaWiki.php(868): MediaWiki->performRequest()
#37 /srv/mediawiki/php-1.32.0-wmf.20/includes/MediaWiki.php(525): MediaWiki->main()
#38 /srv/mediawiki/php-1.32.0-wmf.20/index.php(42): MediaWiki->run()

Notes

This error was recorded 2,479 times in the last 30 days, with bursts on 4 September, 6 September and 19 September (today). It occurs only on wikidata.org and test.wikidata.org, not on other wikis.

Reproduction

  1. Navigate to and purge https://www.wikidata.org/wiki/Q101971?action=purge
  2. Navigate to https://www.wikidata.org/wiki/Q101971?uselang=%E2%A7%BClang%E2%A7%BD and look at the logs

This should be reproducible locally when the tmpMaxItemIdForNewItemIdHtmlFormatter config variable is used within Wikibase repo.

Acceptance Criteria

Event Timeline

This is the first time I've seen a log error in MediaWiki with severity CRITICAL. Given that this issue has been around for over 30 days and that the urls it applies to all seem to render fine, this is probably incorrect use of this severity.

Looking at the request URLs, this seems to be users requesting invalid language codes. The code makes it all the way into the cache key, which is then invalid because it contains characters that are not allowed.

Example:
https://www.wikidata.org/wiki/Q101971?uselang=%E2%A7%BClang%E2%A7%BD

Each single hit on a page like that results in multiple log messages.
I wonder why we bother formatting entity IDs for invalid languages.

Wikibase assumes that the language code in the ParserOptions is valid, but the ParserOptions will happily have whatever language the user provided.

/var/www/mediawiki/extensions/Wikibase/repo/includes/Content/EntityContent.php:272:
object(Language)[652]
  public 'mConverter' => 
    object(FakeConverter)[653]
      public 'mLang' => 
        &object(Language)[652]
  public 'mVariants' => null
  public 'mCode' => string '⧼lang⧽' (length=10)
  public 'mLoaded' => boolean false
  public 'mMagicExtensions' => 
    array (size=0)
      empty
  public 'mMagicHookDone' => boolean false
  private 'mHtmlCode' => null
  private 'mParentLanguage' => boolean false
  public 'dateFormatStrings' => 
    array (size=0)
      empty
  public 'mExtendedSpecialPageAliases' => null
  protected 'namespaceNames' => null
  protected 'mNamespaceIds' => null
  protected 'namespaceAliases' => null
  public 'transformData' => 
    array (size=0)
      empty

The language is used in EntityContent::getParserOutputFromEntityView to construct an EntityParserOutputGenerator
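The failure path described above can be sketched as follows. This is a small JavaScript model, not the actual SimpleCacheWithBagOStuff code: the helper name `buildTermCacheKey`, the key layout and the allowed-character pattern are all assumptions made for illustration. It only shows why a user-supplied language code that is never validated ends up failing at the cache layer.

```javascript
// Assumed whitelist of cache key characters (ASCII letters, digits, "_", "." and "-").
// The real pattern used by Wikibase may differ.
const ALLOWED_KEY = /^[A-Za-z0-9_.-]+$/;

// Hypothetical key layout: the language code is embedded verbatim.
function buildTermCacheKey(entityId, languageCode, termType) {
    return `${entityId}_${languageCode}_${termType}`;
}

// Reject keys with characters outside the whitelist and include the key
// in the error message so the log shows what was requested.
function assertKeyIsValid(key) {
    if (!ALLOWED_KEY.test(key)) {
        throw new Error('Cache key contains characters that are not allowed: ' + key);
    }
    return key;
}

// A regular language code produces an acceptable key.
assertKeyIsValid(buildTermCacheKey('Q101971', 'de', 'label'));

// The code from uselang=%E2%A7%BClang%E2%A7%BD does not.
try {
    assertKeyIsValid(buildTermCacheKey('Q101971', '⧼lang⧽', 'label'));
} catch (e) {
    console.log(e.message);
}
```

Under these assumptions, validating or normalising the language code before it reaches the key builder would avoid the exception entirely.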

Addshore triaged this task as Medium priority.Sep 20 2018, 10:44 AM
Addshore moved this task from incoming to needs discussion or investigation on the Wikidata board.
Addshore moved this task from Inbox to Research on the [DEPRECATED] wdwb-tech board.

Change 461624 had a related patch set uploaded (by Addshore; owner: Addshore):
[mediawiki/extensions/Wikibase@master] SimpleCacheWithBagOStuff, Throw exception with key when invalid

https://gerrit.wikimedia.org/r/461624

Addshore moved this task from Research to Goals on the [DEPRECATED] wdwb-tech board.
Addshore added a project: Wikidata-Campsite.
Addshore moved this task from Incoming to Ready to estimate on the Wikidata-Campsite board.
Addshore added a subscriber: Lydia_Pintscher.

Change 461624 merged by jenkins-bot:
[mediawiki/extensions/Wikibase@master] SimpleCacheWithBagOStuff, Throw exception with key when invalid

https://gerrit.wikimedia.org/r/461624

When an invalid language code is requested some other default formatting will be used. << @Lydia_Pintscher to decide this.

Decision: treat it as an unknown language that falls back to English, just like e.g. uselang=1234. (Note that language fallback indicators should appear, so this isn’t the same as just treating it like English.)

This might be coming from a misbehaving gadget, Lua module, etc. on a client wiki, which might try to do something like this:

// ...
url += '&uselang=' + mw.message('lang').escaped();

MediaWiki:Lang is a message translating to the current user language on some, but not all, wikis (for example, Wikidata and dewiki have it, enwiki doesn’t). On wikis that don’t have the message, the result will be ⧼lang⧽.
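For gadget authors, a defensive variant of the snippet above might look like the sketch below. The helper names `pickUselang` and `appendUselang` are hypothetical. The `message` argument is assumed to behave like the object returned by `mw.message()`, offering `exists()` and `plain()`, and the fallback is assumed to come from `mw.config.get('wgUserLanguage')`.

```javascript
// Hypothetical helper: choose a language code that is safe to send as uselang.
// `message` is anything shaped like the result of mw.message( 'lang' ).
function pickUselang(message, userLanguage) {
    // A missing message renders as ⧼lang⧽, so only use it when it exists.
    if (message.exists()) {
        return message.plain();
    }
    return userLanguage;
}

// Hypothetical helper: append the uselang parameter to a URL.
function appendUselang(url, message, userLanguage) {
    const separator = url.includes('?') ? '&' : '?';
    const code = pickUselang(message, userLanguage);
    return url + separator + 'uselang=' + encodeURIComponent(code);
}

// In a gadget this might be called as follows (not executed here):
// url = appendUselang( url, mw.message( 'lang' ), mw.config.get( 'wgUserLanguage' ) );
```

Reading the user language from configuration directly, rather than from a message, would make the existence check unnecessary. The sketch keeps the message lookup only to stay close to the snippet in this comment.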

One option would be to treat all invalid language codes as und (“undetermined”). (Where “invalid language code” is determined via Language::isValidBuiltInCode.)
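A minimal sketch of that option, written in JavaScript for illustration only. The pattern below is an assumption about what Language::isValidBuiltInCode accepts (lowercase ASCII letters, digits and hyphens, at least two characters), and `normalizeLanguageCode` is a hypothetical name, not a Wikibase function.

```javascript
// Assumed shape of a "valid built-in" language code.
const BUILT_IN_CODE = /^[a-z0-9-]{2,}$/;

// Hypothetical helper: map anything that is not a plausible built-in code to "und".
function normalizeLanguageCode(code) {
    return BUILT_IN_CODE.test(code) ? code : 'und';
}

console.log(normalizeLanguageCode('de'));     // kept as-is
console.log(normalizeLanguageCode('1234'));   // unknown, but well-formed, so kept
console.log(normalizeLanguageCode('⧼lang⧽')); // replaced by "und"
```

Under this assumption, uselang=1234 still goes through the normal fallback chain as an unknown language, while codes containing characters such as ⧼ never reach the formatter or the cache key.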

Krinkle raised the priority of this task from Medium to Unbreak Now!.Sep 26 2018, 4:36 PM

Please remove use of the CRITICAL severity for this error as soon as possible. Everything else about this bug is secondary.

Change 463264 had a related patch set uploaded (by Jonas Kress (WMDE); owner: Jonas Kress (WMDE)):
[mediawiki/extensions/Wikibase@master] When an invalid language code is requested use 'und' instead

https://gerrit.wikimedia.org/r/463264

Interestingly, <lang> falls back to English but ⧼lang⧽ does not.

Also, the Language class has some interesting special features:

	protected static function newFromCode( $code, $fallback = false ) {
		if ( !self::isValidCode( $code ) ) {
			throw new MWException( "Invalid language code \"$code\"" );
		}

		if ( !self::isValidBuiltInCode( $code ) ) {
			// It's not possible to customise this code with class files, so
			// just return a Language object. This is to support uselang= hacks.
			$lang = new Language;
			$lang->setCode( $code );
			return $lang;
		}

Whatever "uselang= hacks" are ...

Change 463307 had a related patch set uploaded (by Krinkle; owner: Krinkle):
[operations/mediawiki-config@master] logging: Disable 'Wikibase.NewItemIdFormatter' channel

https://gerrit.wikimedia.org/r/463307

Change 463307 merged by jenkins-bot:
[operations/mediawiki-config@master] logging: Disable 'Wikibase.NewItemIdFormatter' channel

https://gerrit.wikimedia.org/r/463307

Mentioned in SAL (#wikimedia-operations) [2018-09-27T16:34:52Z] <krinkle@deploy1001> Synchronized wmf-config/InitialiseSettings.php: T204791 (duration: 00m 57s)

@Jonas The uselang-hacks represent a legacy feature for the community, originally for Commons but probably other wikis as well. Examples:

[mediawiki/extensions/Wikibase@master] When an invalid language code is requested use 'und' instead
https://gerrit.wikimedia.org/r/463264

I'm familiar with und but have not seen it before in a MediaWiki context. I may have missed it, but if that is true, we should probably avoid introducing it in just one area like this, under the "principle of least surprise" and, more generally, to reduce inconsistencies and things that can go wrong or differ without significant benefit (given that consistency is a big benefit, one we would lose). If it is otherwise unused and Wikibase does use it, I would recommend documenting the reasoning behind this trade-off and filing a task with a plan for how and when to resolve the technical debt. You'll want to talk to Wikimedia's Language Engineering team about what the recommended alternatives would be.

In further reviewing the patch, I noticed that it did not address the problem with SimpleCacheWithBagOStuff or NewItemIdFormatter. So I've disabled the log channel in production.

Change 463323 had a related patch set uploaded (by Addshore; owner: Addshore):
[mediawiki/extensions/Wikibase@master] Switch Wikibase.NewItemIdFormatter log to error from critical

https://gerrit.wikimedia.org/r/463323

Addshore lowered the priority of this task from Unbreak Now! to Medium.Oct 1 2018, 9:07 PM

Changing from UBN to normal, as this is no longer logging and a patch is up to change the severity from critical to error.

Change 463323 merged by jenkins-bot:
[mediawiki/extensions/Wikibase@master] Switch Wikibase.NewItemIdFormatter log to error from critical

https://gerrit.wikimedia.org/r/463323

@Jonas The uselang-hacks represent a legacy feature for the community, originally for Commons but probably other wikis as well. Examples:

Wow, those hacks are evil

I'm familiar with und but have not seen it before in a MediaWiki context.

I'm not sure it is used anywhere, but it just stands for "language undetermined", which is the case when a user passes random crap in there?

I may have missed it, but if that is true, we should probably avoid introducing it in just one area like this, under the "principle of least surprise" and, more generally, to reduce inconsistencies and things that can go wrong or differ without significant benefit (given that consistency is a big benefit, one we would lose). If it is otherwise unused and Wikibase does use it, I would recommend documenting the reasoning behind this trade-off and filing a task with a plan for how and when to resolve the technical debt. You'll want to talk to Wikimedia's Language Engineering team about what the recommended alternatives would be.

Right now Wikibase will try to parse the page in the language code provided by the parser, which for this ticket is uselang=%E2%A7%BClang%E2%A7%BD.
This makes it all the way through the MediaWiki and Wikibase code, even though it isn't a valid language.
This is because we assume that ParserOptions::getUserLangObj returns a Language object with a valid language code. That isn't the case: it will give you whatever the user passes in.

Using und vs whatever invalid string the user passes around will have the same behaviour within wikibase, and will currently parse the page with no language defined.
The ParserCache will still be split in the same way as it was before, using whatever string for language the user passed into uselang (as this is handled by ParserOptions::getOption)

In further reviewing the patch, I noticed that it did not address the problem with SimpleCacheWithBagOStuff or NewItemIdFormatter. So I've disabled the log channel in production.

Fixed in another commit that is now merged.

This comment was removed by Krinkle.

Change 463264 merged by jenkins-bot:
[mediawiki/extensions/Wikibase@master] When an invalid language code is requested use 'und' instead

https://gerrit.wikimedia.org/r/463264

mmodell changed the subtype of this task from "Task" to "Production Error".Aug 28 2019, 11:08 PM