Add wikibase/wikidata export format to citoid service
Open, Medium, Public

Description

Unlike MediaWiki, where all citation data are stored as strings, Wikidata has a strict typing system; see https://www.wikidata.org/wiki/Special:ListDatatypes

Done:

  • New wikibase format added and deployed
  • ISBNs with dashes - deployed for all formats, see T230057
  • Separate properties for ISBN10 and ISBN13, deployed for wikibase format
  • Identifiers key: added, see T245142
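
A minimal sketch of how an ISBN might be sorted into separate ISBN10 and ISBN13 properties while keeping its dashes. This is an illustration only, not citoid's code: the function names and the "isbn10"/"isbn13" labels are hypothetical, and the check-digit rules are the standard ISBN ones.

```python
import re
from typing import Optional, Tuple


def _valid_isbn10(compact: str) -> bool:
    # ISBN-10: weights 10..1, sum divisible by 11; 'X' (value 10) only as last character.
    if not re.fullmatch(r"[0-9]{9}[0-9X]", compact):
        return False
    total = sum((10 - i) * (10 if ch == "X" else int(ch))
                for i, ch in enumerate(compact))
    return total % 11 == 0


def _valid_isbn13(compact: str) -> bool:
    # ISBN-13: alternating weights 1 and 3, sum divisible by 10.
    if not re.fullmatch(r"[0-9]{13}", compact):
        return False
    total = sum(int(ch) * (1 if i % 2 == 0 else 3)
                for i, ch in enumerate(compact))
    return total % 10 == 0


def classify_isbn(raw: str) -> Optional[Tuple[str, str]]:
    """Return (hypothetical property label, ISBN with dashes kept), or None if invalid."""
    compact = raw.replace("-", "").replace(" ", "").upper()
    if _valid_isbn10(compact):
        return ("isbn10", raw.strip())
    if _valid_isbn13(compact):
        return ("isbn13", raw.strip())
    return None
```

The dashes are removed only for validation; the returned value keeps them as supplied.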

In progress:

  • PMCID without the PMC prefix - partly done for all formats, see T157152
  • Identifiers key: added and deployed for some identifiers, T245142
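
As a rough illustration of the PMCID item, stripping the prefix could look like the sketch below. The function name is hypothetical and this is not taken from the citoid patches; it only assumes that a PMCID is an optional "PMC" prefix followed by digits.

```python
import re
from typing import Optional


def strip_pmc_prefix(pmcid: str) -> Optional[str]:
    """Return the numeric part of a PMCID, or None if the input does not look like one."""
    # Accept "PMC3531190", "pmc3531190" or a bare "3531190"; ignore surrounding whitespace.
    match = re.fullmatch(r"\s*(?:PMC)?([0-9]+)\s*", pmcid, flags=re.IGNORECASE)
    return match.group(1) if match else None
```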

To do:

  • Hierarchical structure for containers: T245142
  • Possibly a structured date format compatible with wikibase dates. However, this isn't a priority because the wikibase api is pretty good at handling this on its own already ^-^.
  • Validating language codes against wikibase's allowed values. See also T217258
  • Possibly including a language code with every monolingual type. See also T217258
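
The two language-code items could be sketched roughly as follows. Wikibase monolingual text values are objects with "text" and "language" keys; everything else here is an assumption made for illustration: the function name, the small ALLOWED_LANGUAGES sample (the real list would have to come from the target wikibase), and the choice of "und" as a fallback.

```python
from typing import Dict, Optional, Set

# Hypothetical sample only; the real allowed values depend on the wikibase instance.
ALLOWED_LANGUAGES: Set[str] = {"en", "de", "fr", "und"}


def to_monolingual(text: str,
                   language: Optional[str],
                   allowed: Set[str] = ALLOWED_LANGUAGES,
                   fallback: str = "und") -> Dict[str, str]:
    """Pair a string with a language code that is in the allowed set."""
    code = (language or "").strip().lower()
    if code not in allowed:
        # Try the primary subtag, e.g. "en-us" -> "en", before giving up.
        primary = code.split("-")[0]
        code = primary if primary in allowed else fallback
    return {"text": text, "language": code}
```

Falling back to an "undetermined" code rather than dropping the value is a design choice of this sketch; rejecting the value outright would be equally defensible.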

Alternatively we may even consider a wikidata format, which will actually try to obtain the correct QIDs in the background for certain params.

Event Timeline

Mvolz triaged this task as Medium priority. Oct 29 2018, 11:15 AM
Mvolz created this task.
Mvolz added a project: Services.
mobrovac subscribed.

+1 to the idea. We could then have ref import/export to/from WB and WP and other utilities.

Dates in particular seem like something that could be a big source of pain because, on the one hand, we don't always have the full date available and, on the other, Wikibase has a rather intricate way of ingesting time-based data. But we can iterate on that.
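
To make the partial-date concern concrete, here is a hedged sketch of mapping a year, year-month, or full date onto a Wikibase time value. The value layout (time string, precision 9/10/11 for year/month/day, calendar model URI) follows the Wikibase time datatype; the function name and the assumption that input arrives as YYYY, YYYY-MM or YYYY-MM-DD are hypothetical, and this is not something citoid is known to do.

```python
import re
from typing import Optional

# Proleptic Gregorian calendar, the usual calendar model for modern dates.
GREGORIAN = "http://www.wikidata.org/entity/Q1985727"


def to_wikibase_time(date: str) -> Optional[dict]:
    """Build a Wikibase time value from YYYY, YYYY-MM or YYYY-MM-DD; None otherwise."""
    m = re.fullmatch(r"([0-9]{4})(?:-([0-9]{2}))?(?:-([0-9]{2}))?", date.strip())
    if not m:
        return None
    year, month, day = m.groups()
    # Only a coarse range check; it does not reject dates such as 02-31.
    if month and not 1 <= int(month) <= 12:
        return None
    if day and not 1 <= int(day) <= 31:
        return None
    # Precision: 9 = year, 10 = month, 11 = day. Unknown parts are written as 00.
    precision = 11 if day else 10 if month else 9
    return {
        "time": f"+{year}-{month or '00'}-{day or '00'}T00:00:00Z",
        "timezone": 0,
        "before": 0,
        "after": 0,
        "precision": precision,
        "calendarmodel": GREGORIAN,
    }
```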

Mvolz updated the task description.
Mvolz renamed this task from Add wikibase export format to citoid service to Add wikibase/wikidata export format to citoid service. Nov 15 2018, 11:29 AM
Mvolz updated the task description.

Change 513975 had a related patch set uploaded (by Mvolz; owner: Mvolz):
[mediawiki/services/citoid@master] Add wikibase format

https://gerrit.wikimedia.org/r/513975

Change 513975 merged by jenkins-bot:
[mediawiki/services/citoid@master] Add wikibase format; rm legacy basefields param

https://gerrit.wikimedia.org/r/513975

Change 559068 had a related patch set uploaded (by Mvolz; owner: Mvolz):
[mediawiki/services/citoid@master] Modifiy wikibase format for ISBNs

https://gerrit.wikimedia.org/r/559068

Change 562839 had a related patch set uploaded (by Mvolz; owner: Mvolz):
[mediawiki/services/citoid@master] Convert most remaining ids to new format

https://gerrit.wikimedia.org/r/562839

Change 559068 merged by jenkins-bot:
[mediawiki/services/citoid@master] Modify wikibase format for ISBNs

https://gerrit.wikimedia.org/r/559068

Change 562839 merged by jenkins-bot:
[mediawiki/services/citoid@master] Convert most remaining ids to new format

https://gerrit.wikimedia.org/r/562839

Are all aspects of this task now resolved? If not, what's outstanding?

For identifiers:

Done: ISBN, DOI, URL, PMID. Awaiting deployment: DOI, URL, PMID, PMCID. ISBN is already deployed.
PMC partially done, see: T157152
TODO: ISSN

All other aspects (language codes, dates): not done :(

Mvolz updated the task description.
Mvolz updated the task description.