wikidatawiki.wb_terms missing label and descriptions for some languages
Closed, Invalid · Public

Description

Starting maybe a week or two ago, the wikidatawiki.wb_terms table (on replicas and production) does not match what I see on wikidata.org. Example:

MariaDB [enwiki_p]> SELECT term_type AS term, term_text, term_language FROM wikidatawiki_p.wb_terms WHERE term_entity_id = 4855000 AND term_type IN ('label', 'description');
+-------+---------------------------------+---------------+
| term  | term_text                       | term_language |
+-------+---------------------------------+---------------+
| label | Bangabandhu-1                   | en            |
| label | Bangabandhu-1                   | pt            |
| label | বঙ্গবন্ধু-                          | bn            |
| label | Bangabandhu                     | fr            |
+-------+---------------------------------+---------------+
4 rows in set (0.03 sec)

but https://www.wikidata.org/wiki/Q4855000 shows a label and description for cz, zh, etc.

Or:

MariaDB [enwiki_p]> SELECT term_type AS term, term_text, term_language FROM wikidatawiki_p.wb_terms WHERE term_entity_id = 95 AND term_type IN ('label', 'description') AND term_language = 'en';
Empty set (0.01 sec)

but https://www.wikidata.org/wiki/Q95 clearly shows a label and description for en.

For Q95 at least, this data was in the database before, but is now missing.

This is what is causing T195000.

Event Timeline

Restricted Application added a subscriber: Aklapper. · May 25 2018, 7:16 PM
MusikAnimal updated the task description. · May 25 2018, 7:17 PM

I have my doubts that T195520 is related, but I did notice the incident report mentions changes to wb_terms.

MusikAnimal updated the task description. · May 25 2018, 7:18 PM

If this is new, then the events of yesterday might have something to do with this.
Inserts are still happening on the table, but it is possible that some inserts were missed during the DB overload (if they were not part of transactions).

MusikAnimal added a subscriber: Bawolff. · Edited · May 25 2018, 7:50 PM

> If this is new, then the events of yesterday might have something to do with this.
> Inserts are still happening on the table, but it is possible that some inserts were missed during the DB overload (if they were not part of transactions).

T195000 was reported to me on IRC about a week ago, and that person made it sound like it had been going on for several days. I wasn't sure if the changes that led to T195520 were related, but the outage certainly isn't the culprit.

I talked to @Bawolff about this at the Hackathon and he found a commit that seemed to be related (sorry, I don't have a link), but either way there is a data inconsistency in wb_terms. My thinking is that if labels and descriptions aren't meant to be in this table anymore, they should all be removed.

e78328eab0e7 was the commit I found, which made it look like it was disabled. (I'm guessing from the commit message; I'm not actually familiar with the feature or what's going on.)

e78328eab0e7 only relates to one part of the system that reads from this table; many other parts still read from it.
All data should still be written to the table :)

Try:

SELECT term_type AS term, term_text, term_language FROM wikidatawiki_p.wb_terms WHERE term_full_entity_id = 'Q4855000' AND term_type IN ('label', 'description');

term_entity_id is no longer written.

This relates to T188995 and linked tickets
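
For tools that still hold numeric item IDs, the switch described above could be sketched roughly as follows. This is only an illustration, not code from Wikibase: the helper names `full_entity_id` and `build_terms_query` are hypothetical, the `Q`/`P` prefix check is an assumption about which entity types a tool cares about, and the `%s` placeholders assume a MySQL driver that uses the "format" parameter style (such as pymysql).

```python
import re

# Assumption: a full entity ID is a letter prefix followed by a positive
# integer, e.g. numeric item ID 4855000 corresponds to "Q4855000".
_ENTITY_ID_RE = re.compile(r"^[QP][1-9][0-9]*$")


def full_entity_id(numeric_id, prefix="Q"):
    """Build the value to match against term_full_entity_id (hypothetical helper)."""
    entity_id = f"{prefix}{numeric_id}"
    if not _ENTITY_ID_RE.match(entity_id):
        raise ValueError(f"not a valid entity ID: {entity_id!r}")
    return entity_id


def build_terms_query(entity_id, term_types=("label", "description"), language=None):
    """Return (sql, params) filtering on term_full_entity_id, not term_entity_id."""
    placeholders = ", ".join(["%s"] * len(term_types))
    sql = (
        "SELECT term_type AS term, term_text, term_language "
        "FROM wikidatawiki_p.wb_terms "
        "WHERE term_full_entity_id = %s "
        f"AND term_type IN ({placeholders})"
    )
    params = [entity_id, *term_types]
    if language is not None:
        # Optional language filter, as in the Q95 example in the description.
        sql += " AND term_language = %s"
        params.append(language)
    return sql, tuple(params)


if __name__ == "__main__":
    sql, params = build_terms_query(full_entity_id(95), language="en")
    print(sql)
    print(params)
```

The returned pair would be passed to the driver's `cursor.execute(sql, params)`, so the entity ID and term types are bound as parameters rather than concatenated into the SQL string.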

MusikAnimal closed this task as Invalid. · May 29 2018, 1:03 AM

Aha! That explains it. Thank you :)

I'm assuming there was an announcement that I missed? I thought I was subscribed to wikidata-l, but apparently not. I am now.

As this is a known part of the term_entity_id to term_full_entity_id migration, I'll close this task as invalid. Thank you again.

Vvjjkkii renamed this task from wikidatawiki.wb_terms missing label and descriptions for some languages to h9baaaaaaa. · Jul 1 2018, 1:07 AM
Vvjjkkii reopened this task as Open.
Vvjjkkii triaged this task as High priority.
Vvjjkkii updated the task description.
Vvjjkkii removed a subscriber: Aklapper.
CommunityTechBot renamed this task from h9baaaaaaa to wikidatawiki.wb_terms missing label and descriptions for some languages. · Jul 2 2018, 3:44 PM
CommunityTechBot closed this task as Invalid.
CommunityTechBot raised the priority of this task from High to Needs Triage.
CommunityTechBot updated the task description.
CommunityTechBot added a subscriber: Aklapper.