
Support for custom Item IDs
Open, Needs Triage, Public

Description

Wikibase currently only supports item IDs that consist of a number prefixed with Q. Example: Q123. IDs in other formats, like 123-4X, are not supported.

This task is about adding support for custom item IDs on a code level. In other words, making it possible for third party Wikibase instances to work with their own ID format if they so desire. The current behavior (Q-IDs) could remain the default, requiring third parties to add configuration and code to switch their Wikibase to their own ID format.
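
To make the request more concrete, here is a minimal sketch of what "Q-IDs as the default, custom format by configuration" could look like. It is written in Python purely for illustration; Wikibase itself is PHP, and none of the names below (`ItemIdFormat`, `parse_item_id`, `CUSTOM_FORMAT`) exist in Wikibase. Both regular expressions are simplifying assumptions, and the custom one is only shaped to accept the 123-4X example, not any real ID scheme.

```python
import re
from dataclasses import dataclass


@dataclass(frozen=True)
class ItemIdFormat:
    """Hypothetical description of which strings count as item IDs."""

    name: str
    pattern: re.Pattern

    def is_valid(self, serialization: str) -> bool:
        # fullmatch ensures the whole string conforms, not just a prefix.
        return self.pattern.fullmatch(serialization) is not None


# Default behavior described above: "Q" followed by a number (simplified).
DEFAULT_FORMAT = ItemIdFormat("q-id", re.compile(r"Q[1-9][0-9]*"))

# Illustrative custom format accepting IDs shaped like 123-4X.
# Assumption for this example only; not an actual third-party rule.
CUSTOM_FORMAT = ItemIdFormat("custom", re.compile(r"[0-9]+-[0-9X]+"))


def parse_item_id(serialization: str,
                  id_format: ItemIdFormat = DEFAULT_FORMAT) -> str:
    """Return the ID unchanged if it is valid for the configured format."""
    if not id_format.is_valid(serialization):
        raise ValueError(
            f"{serialization!r} is not a valid {id_format.name} item ID"
        )
    return serialization
```

With the default format, `parse_item_id("Q123")` is accepted and `parse_item_id("123-4X")` raises `ValueError`; passing `CUSTOM_FORMAT` as the second argument makes the latter valid. The point of the sketch is only that the format is injected rather than hard-coded.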

I'm making this feature request on behalf of the German National Library (DNB). The DNB already has IDs for its data set (GND IDs). Using those existing IDs would simplify Wikibase adoption. This seems to be a common use case among institutions like the DNB.

Event Timeline

Hey @JeroenDeDauw, we're continuing to look into the complexity of this request. Just so I'm sure I understand the use case for GND and why this would be important to them, can you help by providing some more detail?

  • My first assumption: one reason this change would be helpful for GND is being able to search the Wikibase by GND ID rather than item label or Q identifier. Is that right? Are there additional reasons this would support their use case?
  • Since GND has existing IDs they want to use, I assume then you'll want the IDs for these entities to be assigned manually during the import process. What about for entities that don't yet exist/will need a new GND ID in the future? Would there ever be an expectation that Wikibase automatically generates/assigns new IDs once the initial import is done?

Thanks for your response!

Use cases for DNB:

  • Always end up with the same IDs when importing the dataset in a fresh wiki
  • Access data in Wikibase with existing IDs (both for internal DNB tools and the GND ecosystem users)
  • Do not allow changing GND IDs
  • Ability to create an item with references to other items (statements with type item) without first having imported the other items
  • Avoid ending up with multiple IDs per item that have to be supported throughout the ecosystem
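
A rough sketch of how several of these use cases fit together, assuming an in-memory stand-in for a wiki. This is illustrative Python, not Wikibase code; `ImportedWiki`, `import_item`, and `dangling_references` are hypothetical names, and the property ID `P1` and the item IDs are made up for the example.

```python
from typing import Dict, Iterable, List, Set, Tuple

# A statement is modeled as (property ID, target item ID).
Statement = Tuple[str, str]


class ImportedWiki:
    """Toy model of a wiki whose items are keyed by caller-supplied IDs."""

    def __init__(self) -> None:
        self.items: Dict[str, List[Statement]] = {}

    def import_item(self, item_id: str,
                    statements: Iterable[Statement]) -> None:
        # The ID comes from the source data set, so a fresh import
        # always yields the same IDs.
        if item_id in self.items:
            # IDs are never reassigned or changed once present.
            raise ValueError(f"item {item_id!r} already exists")
        # Targets are stored as plain IDs and may not be imported yet.
        self.items[item_id] = list(statements)

    def dangling_references(self) -> Set[str]:
        """IDs that are referenced by statements but not yet imported."""
        return {
            target
            for statements in self.items.values()
            for _, target in statements
            if target not in self.items
        }


wiki = ImportedWiki()
wiki.import_item("123-4X", [("P1", "567-8")])  # forward reference
wiki.import_item("567-8", [])                  # target arrives later
```

After the first call, `wiki.dangling_references()` contains `"567-8"`; after the second it is empty. Running the same import against a fresh `ImportedWiki` produces the same set of IDs, which is what the first bullet asks for.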

We are aware that a "lookup layer" can be created on top of Q-IDs. This is an extra cost and does not solve most of the above.

We considered several approaches. Support for custom IDs looks most promising.

image.png (428×991 px, 61 KB)

We also investigated which parts of Wikibase would need changing and estimated the required effort. We estimate about 5 days of work for us to make things work sufficiently for the first stage of our project. Full custom ID support that WMDE would be happy with would take significantly longer.

image.png (626×950 px, 78 KB)

Did we miss anything?

> Since GND has existing IDs they want to use, I assume then you'll want the IDs for these entities to be assigned manually during the import process. What about for entities that don't yet exist/will need a new GND ID in the future? Would there ever be an expectation that Wikibase automatically generates/assigns new IDs once the initial import is done?

In the first stage of the project we do not need to create new entities on Wikibase. Wikibase will just be a secondary editing interface for the primary copy of the GND in its current system. A full switch to Wikibase will not happen any time soon, and will only happen if the stakeholders wish to switch. Support for GND IDs would help :)

From our side there is no expectation that Wikibase would somehow know how to generate new GND IDs. We would only need a way to hook into the new ID generation of Wikibase, so we can specify our own ID generation function. I expect that to be easy to facilitate.
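
To illustrate the kind of hook meant here, a minimal sketch follows, assuming new-item creation asks an injectable function for the next ID. It is illustrative Python rather than Wikibase's actual (PHP) ID generation code, and `ItemStore`, `IdGenerator`, and `make_default_generator` are hypothetical names.

```python
import itertools
from typing import Callable, Dict, Optional

# The hook: any zero-argument function that returns the next item ID.
IdGenerator = Callable[[], str]


def make_default_generator(start: int = 1) -> IdGenerator:
    """Default behavior: sequential Q-IDs (Q1, Q2, ...)."""
    counter = itertools.count(start)
    return lambda: f"Q{next(counter)}"


class ItemStore:
    """Toy item store that delegates ID creation to the configured hook."""

    def __init__(self, generate_id: Optional[IdGenerator] = None) -> None:
        self._generate_id = generate_id or make_default_generator()
        self.items: Dict[str, dict] = {}

    def create_item(self, data: dict) -> str:
        item_id = self._generate_id()
        if item_id in self.items:
            # Guard against a custom generator handing out a used ID.
            raise ValueError(f"generator returned duplicate ID {item_id!r}")
        self.items[item_id] = data
        return item_id


# Default store keeps the current Q-ID behavior.
default_store = ItemStore()

# Custom store: IDs come from elsewhere, here a pre-allocated list
# standing in for whatever system owns ID assignment (made-up values).
preallocated = iter(["123-4X", "124-2"])
custom_store = ItemStore(generate_id=lambda: next(preallocated))
```

Here `default_store.create_item({})` returns `"Q1"`, while `custom_store.create_item({})` returns `"123-4X"`. Wikibase would not need to know how the custom IDs are produced; it would only call the hook.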