
mw.wikibase.getLabelByLang('Q1','en') returning nil today
Closed, Resolved · Public

Description

The function mw.wikibase.getLabelByLang stopped working today: mw.wikibase.getLabelByLang('Q1','en') now returns nil. This might be related to T157868.

Event Timeline

Jarekt created this task. May 6 2020, 10:06 PM
Restricted Application added a project: Wikidata. May 6 2020, 10:06 PM
Restricted Application added a subscriber: Aklapper.
Jarekt triaged this task as Unbreak Now! priority. May 6 2020, 10:59 PM
Restricted Application added a subscriber: Liuxinyu970226. May 6 2020, 10:59 PM

Timing suggests that this might be due to the train (first report on Telegram at 21:15 UTC, two hours after group1 wikis moved to 1.35.0-wmf.31), so I'm linking to the wmf.31 task. (That said, personally I’m not convinced this needs to be a train blocker, or that the UBN! priority is called for. @Lydia_Pintscher?)

We've noticed something similar on the Beta Cluster since last week: T251550: Description property missing in beta cluster WP

UBN is appropriate, in my opinion.

Worth noting that Commons template Wikidata Infobox is extensively used on Commons categories. For categories on humans, it also adds given name categories, surname categories, and DEFAULTSORT. An example is https://commons.wikimedia.org/wiki/Category:Edward_Everett_Horton

As @WilliamGraham notes, this is actually breaking some categorization on a large scale, and the affected categories may take a long time to repopulate once they have been depopulated. It's not just about infoboxes on categories, though. If I understand correctly, this is presumably affecting millions of pages on Wikimedia Commons, considering how widespread the use of templates like Template:Artwork, Template:Institution, and Template:Creator is in image metadata. Such a lookup might be used a dozen times on a single file page alone, for example for an artist name, genre, medium, or institution, and even the labels for basic terms like "height", "width", "collection", "genre", etc. displayed by the template are pulled in from Wikidata.

Example:

brennen added a subscriber: brennen. May 7 2020, 2:38 AM

Been offline for a while and just catching up here. Does this warrant a rollback to group0?

...welp, that sure sounds like it does.

Mentioned in SAL (#wikimedia-operations) [2020-05-07T02:55:35Z] <brennen> reverting group1 to 1.35.0-wmf.30 for T252079

Mentioned in SAL (#wikimedia-operations) [2020-05-07T02:56:41Z] <brennen@deploy1001> rebuilt and synchronized wikiversions files: Revert group1 wikis to 1.35.0-wmf.30 for T252079

Change 594822 had a related patch set uploaded (by Brennen Bearnes; owner: Brennen Bearnes):
[operations/mediawiki-config@master] Revert "group1 wikis to 1.35.0-wmf.31"

https://gerrit.wikimedia.org/r/594822

Change 594822 merged by jenkins-bot:
[operations/mediawiki-config@master] Revert "group1 wikis to 1.35.0-wmf.31"

https://gerrit.wikimedia.org/r/594822

Jarekt added a comment. May 7 2020, 3:00 AM

I purged my test page and now mw.wikibase.getLabelByLang('Q1','en') returned "universe". Thank you.

Ladsgroup added a subscriber: Ladsgroup.

Subscribing as this week's incident manager for the Wikidata team.

Maintenance_bot moved this task from incoming to in progress on the Wikidata board. May 7 2020, 4:15 AM
Husky added a subscriber: Husky. May 7 2020, 9:40 AM

I don't see the same things that @Dominicbm sees, so this might have been fixed by the rollback?

Change 594917 had a related patch set uploaded (by Addshore; owner: Addshore):
[mediawiki/extensions/Wikibase@master] Revert "Move prefetching-term-lookup-callback service wiring"

https://gerrit.wikimedia.org/r/594917

Addshore claimed this task. May 7 2020, 10:17 AM
Restricted Application added a project: User-Addshore. May 7 2020, 10:17 AM

Change 594920 had a related patch set uploaded (by Addshore; owner: Hoo man):
[mediawiki/extensions/Wikibase@wmf/1.35.0-wmf.31] Revert "Move prefetching-term-lookup-callback service wiring"

https://gerrit.wikimedia.org/r/594920

Change 594929 had a related patch set uploaded (by Hoo man; owner: Hoo man):
[mediawiki/extensions/Wikibase@master] Add LuaWikibaseIntegrationTest

https://gerrit.wikimedia.org/r/594929

Change 594920 merged by jenkins-bot:
[mediawiki/extensions/Wikibase@wmf/1.35.0-wmf.31] Revert "Move prefetching-term-lookup-callback service wiring"

https://gerrit.wikimedia.org/r/594920

Mentioned in SAL (#wikimedia-operations) [2020-05-07T12:13:19Z] <addshore@deploy1001> Synchronized php-1.35.0-wmf.31/extensions/Wikibase: [[gerrit:594920]] T252079 Revert "Move prefetching-term-lookup-callback service wiring" (duration: 01m 12s)

Moving this to block next week's train now, as we haven't fixed this on master yet.

So there is one more aspect that plays into this issue, which means the effects will not entirely disappear for another ~5 hours for all lookups, even with page purges.
There is a layer of memcached data that now sits between Wikidata and the renderings of client pages for Lua calls.
This cache has a TTL of 24 hours and holds the data displayed by Lua that has been affected.
As a result of the issue, this cache now holds nil values for all terms that were used in page renderings during the hours that group1 had buggy code (and that were not already in the cache with correct values).
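To illustrate why page purges alone do not help here, the following is a minimal sketch of a TTL cache that stores whatever the lookup returned, including nil. It is not Wikibase code: `TermCache`, `get_label`, and the `"Q1:en"` key format are hypothetical names chosen for the example, and only the 24-hour TTL comes from the description above.

```python
import time
from typing import Callable, Dict, Optional, Tuple

TTL_SECONDS = 24 * 60 * 60  # 24-hour TTL, as described above


class TermCache:
    """Toy stand-in for the shared term cache (hypothetical, not Wikibase code)."""

    def __init__(self, clock: Callable[[], float] = time.time) -> None:
        self._clock = clock
        # key -> (expiry timestamp, cached value); the value may be None
        self._entries: Dict[str, Tuple[float, Optional[str]]] = {}

    def get_label(self, key: str, lookup: Callable[[], Optional[str]]) -> Optional[str]:
        now = self._clock()
        entry = self._entries.get(key)
        if entry is not None and entry[0] > now:
            return entry[1]  # served from cache, even if the value is None
        value = lookup()  # cache miss or expired entry: ask the backend
        self._entries[key] = (now + TTL_SECONDS, value)
        return value


# A fake clock so the example does not need to wait 24 hours.
now = [0.0]
cache = TermCache(clock=lambda: now[0])

# During the incident the buggy lookup returns None, and that gets cached.
during_bug = cache.get_label("Q1:en", lambda: None)

# After the rollback the lookup is correct, but the cached None still wins.
now[0] += 60 * 60
after_rollback = cache.get_label("Q1:en", lambda: "universe")

# Once the TTL has passed, the entry is refreshed with the correct value.
now[0] += TTL_SECONDS
after_expiry = cache.get_label("Q1:en", lambda: "universe")

print(during_bug, after_rollback, after_expiry)
```

In this sketch a page purge corresponds to calling `get_label` again: it re-renders the page but still reads the cached value until the entry expires.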

Rolled back to group0 at 02:56 UTC for T252079; writing a "blocked" status update mail.

So from around 02:56 UTC on 8th May this cache will no longer have bad data in it, and page purges will correctly fix the page renderings.
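The arithmetic behind that estimate, as a small sketch: it assumes the last bad entry was written no later than the rollback time logged in this task (02:56 UTC on 7 May 2020) and that every entry lives for exactly the 24-hour TTL. The variable names are illustrative only.

```python
from datetime import datetime, timedelta, timezone

# Rollback time taken from this task; assumed to be the latest moment
# at which a bad (nil) value could have been written to the cache.
rollback = datetime(2020, 5, 7, 2, 56, tzinfo=timezone.utc)

# Cache TTL as described in this task.
ttl = timedelta(hours=24)

# The last bad entry should have expired by this time.
last_bad_expiry = rollback + ttl
print(last_bad_expiry.isoformat())  # 2020-05-08T02:56:00+00:00
```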

We have no way to actively purge the cache keys that hold bad data currently.
We will look into better ways to recover from an incident like this in the future.

Just off the top of my head, this could be:

  • a way to force this cache to be updated when performing a page purge (similar to force links update?)
  • a way to ditch all of the cache keys in this cache (probably just incrementing some value in the cache key)
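The second idea could look roughly like the sketch below, which assumes a version number embedded in every cache key. `VersionedCache`, its methods, and the `terms:v1:Q1:en` key layout are hypothetical and do not reflect how Wikibase actually builds its keys. Bumping the version makes all old keys unreachable, so they simply age out under their TTL.

```python
from typing import Dict, Optional, Tuple


class VersionedCache:
    """Toy cache whose keys embed a version number (hypothetical design)."""

    def __init__(self) -> None:
        self._store: Dict[str, Optional[str]] = {}
        self.version = 1

    def _key(self, entity_id: str, lang: str) -> str:
        return f"terms:v{self.version}:{entity_id}:{lang}"

    def set(self, entity_id: str, lang: str, value: Optional[str]) -> None:
        self._store[self._key(entity_id, lang)] = value

    def get(self, entity_id: str, lang: str) -> Tuple[bool, Optional[str]]:
        """Return (hit, value); a cached None is still reported as a hit."""
        key = self._key(entity_id, lang)
        if key in self._store:
            return True, self._store[key]
        return False, None

    def invalidate_all(self) -> None:
        # Old entries are not deleted; they just become unreachable
        # because every future key uses the new version number.
        self.version += 1


cache = VersionedCache()
cache.set("Q1", "en", None)          # bad value cached during the incident
hit_before = cache.get("Q1", "en")   # the bad value is still served

cache.invalidate_all()               # "increment some value in the cache key"
hit_after = cache.get("Q1", "en")    # now a miss, so the caller would look it up again

print(hit_before, hit_after)
```

The trade-off in this design is that every entry, good or bad, is dropped at once, so the backend sees a burst of lookups right after the bump.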

Of course we have also identified an area that somehow misses all existing testing and we will dig into this too.

Change 595474 had a related patch set uploaded (by Addshore; owner: Addshore):
[mediawiki/extensions/Wikibase@master] WIP DNM, Move PREFETCHING_TERM_LOOKUP definition to Lib

https://gerrit.wikimedia.org/r/595474

Change 595474 merged by jenkins-bot:
[mediawiki/extensions/Wikibase@master] Move PREFETCHING_TERM_LOOKUP definition to Lib

https://gerrit.wikimedia.org/r/595474

Addshore lowered the priority of this task from Unbreak Now! to High. May 12 2020, 7:35 AM

Change 594917 abandoned by Addshore:
Revert "Move prefetching-term-lookup-callback service wiring"

https://gerrit.wikimedia.org/r/594917

Change 594929 merged by jenkins-bot:
[mediawiki/extensions/Wikibase@master] Add LuaWikibaseIntegrationTest

https://gerrit.wikimedia.org/r/594929

Husky removed a subscriber: Husky. May 12 2020, 3:52 PM

Can someone confirm that this is fixed now please so we can close it?

Lucas_Werkmeister_WMDE closed this task as Resolved. May 26 2020, 10:23 AM

The test cases linked in the task description work again, and this bug is/was severe enough that if it still happened, I’m sure we would’ve heard of it. That’s good enough for me, at least.