Add a column for full entity ID to wb_terms table
Closed, Resolved (Public)

Description

As suggested in T159718, a column storing the full entity ID (i.e. its string form, not just the numeric part) should be added to the wb_terms table.

https://gerrit.wikimedia.org/r/341322 is a draft patch that adds a column like the one considered here. It was introduced only to illustrate the intent. The actual work could either go into a separate patch (or patches), or that patch could be taken over and changed into the needed form. https://gerrit.wikimedia.org/r/341322 should not be treated as something to follow.

Note that in T159718 DBAs suggested introducing the indexes that include the new column at the same time the column is added. This differs from what @WMDE-leszek initially considered in T159718, and from what was proposed in the plan in T114903. This suggestion should be taken into account here.

Side note: It might be outside the scope of this task, but when actually requesting the schema change, make sure to conform to https://wikitech.wikimedia.org/wiki/Schema_changes#Workflow_of_a_schema_change.

Event Timeline

Change 341322 had a related patch set uploaded (by Thiemo Mättig (WMDE); owner: WMDE-leszek):
[mediawiki/extensions/Wikibase] [DNM] Add column for full entity ID to Terms DB table

https://gerrit.wikimedia.org/r/341322

The schema update is not sufficient, we also need code to populate the new column. That code should be triggered by a schema update, but also needs a maintenance script. The script needs to operate in batches of configurable size, and should wait for slaves to catch up between batches.
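As a rough illustration of the batching described above, here is a minimal Python sketch using an in-memory SQLite table. It is not the Wikibase maintenance script. The reduced table layout, the column names, the `PREFIXES` map, and the `wait_for_replicas` callback are all assumptions made for this example; the real script would use MediaWiki's own database and replication-wait facilities.

```python
import sqlite3
from typing import Callable

# Assumed mapping from entity type to ID prefix (illustration only).
PREFIXES = {"item": "Q", "property": "P"}


def populate_full_ids(
    conn: sqlite3.Connection,
    batch_size: int = 2,
    wait_for_replicas: Callable[[], None] = lambda: None,
) -> int:
    """Backfill term_full_entity_id in batches; return the number of rows updated."""
    updated = 0
    last_row_id = 0
    while True:
        # Fetch the next batch of rows that still lack a full ID.
        rows = conn.execute(
            "SELECT term_row_id, term_entity_type, term_entity_id FROM wb_terms "
            "WHERE term_row_id > ? AND term_full_entity_id IS NULL "
            "ORDER BY term_row_id LIMIT ?",
            (last_row_id, batch_size),
        ).fetchall()
        if not rows:
            break
        for row_id, entity_type, numeric_id in rows:
            full_id = f"{PREFIXES[entity_type]}{numeric_id}"
            conn.execute(
                "UPDATE wb_terms SET term_full_entity_id = ? WHERE term_row_id = ?",
                (full_id, row_id),
            )
        conn.commit()
        updated += len(rows)
        last_row_id = rows[-1][0]
        # Hypothetical hook: pause here until replication has caught up.
        wait_for_replicas()
    return updated


def make_demo_db() -> sqlite3.Connection:
    """Create a tiny stand-in for wb_terms with the new column left empty."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE wb_terms ("
        "term_row_id INTEGER PRIMARY KEY, "
        "term_entity_id INTEGER NOT NULL, "
        "term_entity_type TEXT NOT NULL, "
        "term_full_entity_id TEXT)"
    )
    conn.executemany(
        "INSERT INTO wb_terms (term_entity_id, term_entity_type) VALUES (?, ?)",
        [(42, "item"), (42, "item"), (31, "property"), (5, "item"), (7, "item")],
    )
    conn.commit()
    return conn
```

Paging by `term_row_id` rather than by offset keeps each batch query cheap, and because only rows with an empty full ID are selected, the script can be re-run safely after an interruption.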

Besides this, the code that updates the terms table needs to be able to write to the old and the new column at the same time. Similarly, the code that reads from the table needs to fall back to using the old columns in case the new column doesn't exist, or is empty. So we will need a feature switch with three states: numeric ids (legacy), full ids (new), and both (compat).
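The three-state switch could look roughly like the following Python sketch. The names `TermIdMode`, `columns_to_write`, and `read_entity_id`, the column names, and the prefix map are hypothetical and chosen only to illustrate the idea; they are not Wikibase's actual settings or API.

```python
from enum import Enum
from typing import Any, Dict

# Assumed mapping from entity type to ID prefix (illustration only).
PREFIXES = {"item": "Q", "property": "P"}


class TermIdMode(Enum):
    LEGACY = "numeric"  # read and write numeric IDs only
    FULL = "full"       # read and write full IDs only
    COMPAT = "both"     # write both, prefer full IDs when reading


def columns_to_write(mode: TermIdMode, entity_type: str, numeric_id: int) -> Dict[str, Any]:
    """Return the ID-related column values to store for one term row."""
    row: Dict[str, Any] = {"term_entity_type": entity_type}
    if mode in (TermIdMode.LEGACY, TermIdMode.COMPAT):
        row["term_entity_id"] = numeric_id
    if mode in (TermIdMode.FULL, TermIdMode.COMPAT):
        row["term_full_entity_id"] = f"{PREFIXES[entity_type]}{numeric_id}"
    return row


def read_entity_id(mode: TermIdMode, row: Dict[str, Any]) -> str:
    """Return the full entity ID, falling back to the old columns if needed."""
    full_id = row.get("term_full_entity_id")
    if mode is not TermIdMode.LEGACY and full_id:
        return full_id
    numeric_id = row.get("term_entity_id")
    if numeric_id is None:
        raise ValueError("row has neither a full nor a numeric entity ID")
    # Fallback: rebuild the full ID from the entity type and numeric part.
    return f"{PREFIXES[row['term_entity_type']]}{numeric_id}"
```

In this sketch, the compat mode lets the schema change and the backfill be deployed before readers depend on the new column, since a missing or empty full ID is rebuilt from the old columns.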

We should probably file separate tickets for these.

With respect to doing the schema update during the data center switchover: we can do that with only the schema change itself in place. It would however be nice to have the rest of the code in place before, so we can actually test the new schema before deploying it.

Change 346169 had a related patch set uploaded (by Aleksey Bekh-Ivanov (WMDE)):
[mediawiki/extensions/Wikibase@master] Writing full IDs to wb_terms on insert

https://gerrit.wikimedia.org/r/346169

Change 341322 merged by jenkins-bot:
[mediawiki/extensions/Wikibase@master] Add column for full entity ID to Terms DB table

https://gerrit.wikimedia.org/r/341322

See T162533 for creating the script that populates the column.

Change 346169 merged by jenkins-bot:
[mediawiki/extensions/Wikibase@master] Writing full IDs to wb_terms on insert

https://gerrit.wikimedia.org/r/346169

gerritbot doesn't add a comment when I make a patch:

Change 348413 merged by jenkins-bot:
[operations/mediawiki-config@master] Don't let Wikibase instances read/write terms_full_entity_id

https://gerrit.wikimedia.org/r/348413

WMDE-leszek claimed this task.

The column has been added and populated on Wikidata production. It is not open for viewing yet, though; this is tracked in https://phabricator.wikimedia.org/T167114.