
Enable and use or merge results from zotero ISBN search to improve ISBN results
Open, Stalled, Normal, Public

Description

Steps to reproduce

  1. Open WorldCat or Zotero
  2. Search for ISBN 9780198029359
  3. Open https://cs.wikipedia.org/api/rest_v1/#!/Citation/getCitation (or use Citoid in VE)
  4. Search for ISBN 9780198029359
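Steps 3 and 4 can also be scripted. The following is a minimal sketch, not Citoid's own tooling: it assumes the REST endpoint behind the documentation page above has the path `/api/rest_v1/data/citation/{format}/{query}` and returns a JSON list of citation objects. The helper names `citation_url` and `fetch_citation` are made up for illustration.

```python
import json
import urllib.request
from urllib.parse import quote


def citation_url(domain, query, fmt="mediawiki"):
    """Build the citation URL (path layout is an assumption, see above)."""
    return f"https://{domain}/api/rest_v1/data/citation/{fmt}/{quote(query, safe='')}"


def fetch_citation(domain, query, fmt="mediawiki"):
    """Fetch and decode the citation JSON for an ISBN, DOI, or URL query."""
    request = urllib.request.Request(
        citation_url(domain, query, fmt),
        headers={"User-Agent": "citoid-isbn-repro-sketch/0.1"},  # hypothetical UA
    )
    with urllib.request.urlopen(request, timeout=10) as response:
        return json.loads(response.read().decode("utf-8"))


if __name__ == "__main__":
    # Needs network access; prints whatever the service returns.
    print(json.dumps(fetch_citation("cs.wikipedia.org", "9780198029359"), indent=2))
```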

Expected behavior
The records from WorldCat or Zotero should match what REST or Citoid outputs, right? The records in Zotero seem to be correct; the result from WorldCat has some issues.

Current behavior
The REST output (Citoid output) is a mess: there is a year in the author last-name field, a WorldCat URL in the e-book URL field, and some English text in the numPages field (bad on a non-English Wikipedia). What is wrong with this?
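The three symptoms above can be checked mechanically. This is a rough sketch only: it assumes the citation is a dict in which `author` is a list of `[first, last]` pairs and `url` and `numPages` are strings. The function name `find_suspicious_fields` and the checks themselves are illustrative heuristics, not part of Citoid.

```python
import re


def find_suspicious_fields(citation):
    """Return human-readable notes for the problems described in this task."""
    problems = []

    # A four-digit number in a last name usually means a MARC date leaked in.
    for _first, last in citation.get("author", []):
        if re.search(r"\b\d{4}\b", last):
            problems.append(f"author last name contains a year: {last!r}")

    # The URL field should point at the work, not at the catalogue record.
    url = citation.get("url", "")
    if "worldcat.org" in url:
        problems.append(f"url points at WorldCat: {url!r}")

    # numPages should be a bare number, not free (English) text.
    num_pages = citation.get("numPages")
    if num_pages is not None and not str(num_pages).strip().isdigit():
        problems.append(f"numPages is not a plain number: {num_pages!r}")

    return problems
```

Running this over the JSON returned for ISBN 9780198029359 should list each of the issues reported here, if the field layout matches the assumptions.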

Configuration
cswiki

Issue

We currently use WorldCat's open search API for this metadata. The results are only available in MARCXML, which is probably where that junk is coming from, and in Dublin Core, which is also pretty messy and not very structured. The results on their website come from a database we don't have access to. I could probably improve it a bit on our end, but it might be worth working towards fixing Zotero instead, because they query more databases, or merging the results from both somehow.

Event Timeline

Dvorapa created this task. Jan 27 2019, 5:45 PM
Restricted Application added subscribers: Urbanecm, Aklapper. Jan 27 2019, 5:45 PM
Mvolz added a subscriber: Mvolz. Jan 28 2019, 1:43 PM

I'm not sure what version of Zotero you're using, but unfortunately in translation-server, which is what we use, this feature seems to be broken. Filed here: https://github.com/zotero/translation-server/issues/79

We currently use WorldCat's open search API for this metadata. The results are only available in MARCXML, which is probably where that junk is coming from, and in Dublin Core, which is also pretty messy and not very structured. The results on their website come from a database we don't have access to. I could probably improve it a bit on our end, but it might be worth working towards fixing Zotero instead, because they query more databases, or merging the results from both somehow.

Mvolz renamed this task from "Both Zotero and WorldCat records seems correct, but Citoid generates mess from ISBN" to "Enable and use or merge results from zotero ISBN search to improve ISBN results". Jan 28 2019, 1:44 PM
Mvolz changed the task status from Open to Stalled.
Mvolz triaged this task as Normal priority.

Maybe a dupe of T160845

Change 486851 had a related patch set uploaded (by Mvolz; owner: Mvolz):
[mediawiki/services/citoid@master] [WIP] Remove xISBN and replace with zotero

https://gerrit.wikimedia.org/r/486851

Mvolz updated the task description. Jan 28 2019, 2:00 PM
Dvorapa added a comment. Edited Jan 28 2019, 5:39 PM

...or merging the results from both somehow.

Combining multiple sources would be the best option, but it may be too network-intensive. It might also be a good idea to try some local authorities (per T212585 and subtasks) first.
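As a rough illustration of what "merging the results from both" could mean, here is a sketch that takes citation records in priority order and lets later sources only fill fields that earlier ones left empty. It assumes each source has already been converted to a flat dict with the same field names. `merge_citations` is a hypothetical helper, not code from the patch below.

```python
def merge_citations(*records):
    """Merge citation dicts; earlier records win, later ones only fill gaps."""
    merged = {}
    for record in records:
        for field, value in record.items():
            # Skip empty values so a blank field never blocks a later source.
            if value in (None, "", [], {}):
                continue
            # setdefault keeps the value from the highest-priority source.
            merged.setdefault(field, value)
    return merged
```

With this ordering, a call such as `merge_citations(local_authority, zotero, worldcat)` would prefer a local authority record where one exists. Later sources would only need to be queried when fields are still missing, which limits the extra network traffic.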

Change 486851 merged by jenkins-bot:
[mediawiki/services/citoid@master] Remove xISBN and replace with zotero

https://gerrit.wikimedia.org/r/486851

Mvolz moved this task from Backlog to Service on the Citoid board. Feb 26 2019, 9:43 AM
Mvolz updated the task description.