
Incorrect handling of references with an ISBN-13 starting with 979 (ISBN is truncated to 10 digits)
Closed, Resolved (Public)

Description

It seems that VE incorrectly handles references with an ISBN-13 starting with 979.

For example, on frwiki, in the article Petrea volubilis: if I edit the article with VE, add a reference with the "Sources" (References) button, enter the first ISBN (9782877631815) in the automatic-reference field, and click Générer (Generate), the suggested reference contains 3 ISBNs: 9782877631815, 9791029801 and 2877631818. The first and the third are correct, but the second is missing its last 3 digits. The dialog says that the values come from WorldCat, with OCLC 496442002, but WorldCat correctly displays the second ISBN as 9791029801297 (the final "297" is missing from the generated reference).

I've noticed a lot of articles with an ISBN-13 starting with 979 truncated after the first 10 digits, and I've always wondered where this came from. I even asked some editors whether they used external tools to generate the references, but it seems that it's the internal Reference gadget that is truncating the ISBN.

Event Timeline

matmarex subscribed.

VisualEditor only displays the values it gets from Citoid. Citoid in fact responds with 5 different ISBN numbers:

https://fr.wikipedia.org/api/rest_v1/data/citation/mediawiki/9782877631815

[
  {
    ...
    "title": "Mon jardin tropical : [guide de jardinage : Antilles [et] Réunion]",
    "oclc": "496442002",
    "url": "https://www.worldcat.org/oclc/496442002",
    "ISBN": [
      "9782877631815",
      "9791029801",
      "2877631818",
      "2908490307",
      "9782908490305"
    ],
    ...
  }
]
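One way to spot the truncated entry in a response like this is to verify each value's check digit. The following is a minimal sketch in Python, not code from Citoid or VisualEditor; `is_valid_isbn` is a name made up for illustration. It applies the standard ISBN-10 (mod 11) and ISBN-13 (mod 10) check-digit rules to the five values listed above.

```python
import re


def is_valid_isbn(isbn: str) -> bool:
    """Return True if `isbn` is a 10- or 13-character ISBN with a correct check digit."""
    if re.fullmatch(r"[0-9]{9}[0-9X]", isbn):
        # ISBN-10: weights 10..1, 'X' stands for 10, weighted sum divisible by 11.
        values = [10 if c == "X" else int(c) for c in isbn]
        return sum(w * v for w, v in zip(range(10, 0, -1), values)) % 11 == 0
    if re.fullmatch(r"[0-9]{13}", isbn):
        # ISBN-13: weights alternate 1, 3, 1, 3, ...; weighted sum divisible by 10.
        return sum(int(c) * (3 if i % 2 else 1) for i, c in enumerate(isbn)) % 10 == 0
    return False


# The five values from the Citoid response quoted above.
isbns = ["9782877631815", "9791029801", "2877631818", "2908490307", "9782908490305"]
suspicious = [i for i in isbns if not is_valid_isbn(i)]
print(suspicious)  # ['9791029801']
```

Only the truncated value fails the check here. A checksum test is a heuristic, though: roughly one truncated value in eleven would still pass the ISBN-10 rule by chance.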

These ISBNs are mapped to the "isbn", "isbn2", "isbn3" fields of the template using TemplateData defined at https://fr.wikipedia.org/w/index.php?title=Modèle:Ouvrage/Documentation&action=edit (search for "citoid"). As a workaround, if the second ISBN is consistently incorrect, you could also change that map to only use the first one.
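To illustrate why only three of the five values appear in the generated reference, here is a rough Python sketch of a positional mapping from the Citoid array to template fields. It is illustrative only: `apply_map` is a hypothetical helper, not VisualEditor's actual code, and the assumption that surplus values are simply dropped is inferred from the behaviour reported in this task.

```python
def apply_map(values, fields):
    """Assign values to template fields by position.

    Assumption: values beyond the last mapped field are discarded.
    """
    return dict(zip(fields, values))


# The five values from the Citoid response quoted above.
citoid_isbns = ["9782877631815", "9791029801", "2877631818", "2908490307", "9782908490305"]

# Mapping as described for Modèle:Ouvrage: three target fields.
three_fields = apply_map(citoid_isbns, ["isbn", "isbn2", "isbn3"])
print(three_fields)

# The suggested workaround: map only the first value.
first_only = apply_map(citoid_isbns, ["isbn"])
print(first_only)
```

With three fields the truncated value lands in "isbn2". With a single field it is skipped, but only because it happens to be second in this particular response.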

I'm not very familiar with Citoid, but in a quick search, I found a few regular expressions in https://github.com/wikimedia/citoid/blob/95e6318faf3ee3eda4358e5d5e591e90f767beaa/lib/CitoidService.js and https://github.com/wikimedia/citoid/blob/95e6318faf3ee3eda4358e5d5e591e90f767beaa/lib/Exporter.js that only match '978' as the prefix (and not '979'). That's probably wrong.
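As an illustration of how a '978'-only prefix could lead to exactly this truncation, here is a small Python sketch. The two patterns are assumptions written for this example and are not copied from CitoidService.js or Exporter.js: the first accepts an ISBN-13 only when it starts with 978 and otherwise falls back to a 10-character ISBN-10, the second also accepts 979.

```python
import re

# Hypothetical patterns, for illustration only (not Citoid's actual regexes).
PATTERN_978_ONLY = re.compile(r"978[0-9]{10}|[0-9]{9}[0-9X]")
PATTERN_978_OR_979 = re.compile(r"97[89][0-9]{10}|[0-9]{9}[0-9X]")


def first_isbn(pattern, text):
    """Return the first substring of `text` matched by `pattern`, or None."""
    match = pattern.search(text)
    return match.group(0) if match else None


print(first_isbn(PATTERN_978_ONLY, "9791029801297"))    # 9791029801
print(first_isbn(PATTERN_978_OR_979, "9791029801297"))  # 9791029801297
print(first_isbn(PATTERN_978_ONLY, "9782877631815"))    # 9782877631815
```

With the 978-only pattern, the 13-digit alternative fails on the third digit, so the 10-character fallback matches the first ten digits and the remaining "297" is dropped. That reproduces the value seen in the generated reference, although the real cause in Citoid may differ in detail.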

@matmarex
Yes, that's wrong: an ISBN-13 can start with either 978 or 979 (currently 979-10 to 979-12 according to ISBN Range, but more may be added in the future).
I think Citoid should be fixed to work properly with all ISBNs.

Ignoring the second ISBN is not really an option, since the ISBN starting with 979 could also be the first one.
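A related point: only 978-prefixed ISBN-13s have an ISBN-10 form, so a 10-digit value can never be a legitimate short form of a 979 ISBN. The Python sketch below shows the standard conversion; `isbn13_to_isbn10` is an illustrative name, not a function from Citoid.

```python
def isbn13_to_isbn10(isbn13: str):
    """Return the ISBN-10 form of a 978-prefixed ISBN-13, or None if there is none.

    The ISBN-13 check digit itself is not verified here.
    """
    if len(isbn13) != 13 or not (isbn13.isascii() and isbn13.isdigit()):
        return None
    if not isbn13.startswith("978"):
        # 979-prefixed ISBNs have no ISBN-10 equivalent.
        return None
    core = isbn13[3:12]  # drop the prefix and the ISBN-13 check digit
    total = sum(w * int(d) for w, d in zip(range(10, 1, -1), core))
    check = (11 - total % 11) % 11
    return core + ("X" if check == 10 else str(check))


print(isbn13_to_isbn10("9782877631815"))  # 2877631818
print(isbn13_to_isbn10("9782908490305"))  # 2908490307
print(isbn13_to_isbn10("9791029801297"))  # None
```

This accounts for the five values in the Citoid response: two 978 ISBNs, each with its 10-digit form, plus the 979 ISBN, which should appear only in its full 13-digit form.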

Mvolz triaged this task as Medium priority.

Change 494788 had a related patch set uploaded (by Mvolz; owner: Mvolz):
[mediawiki/services/citoid@master] Allow ISBNs with 979 prefix

https://gerrit.wikimedia.org/r/494788

Change 494788 merged by jenkins-bot:
[mediawiki/services/citoid@master] Allow ISBNs with 979 prefix

https://gerrit.wikimedia.org/r/494788