Citoid service should validate ISSN in mediawiki format
Closed, Resolved | Public | 0 Estimated Story Points

Description

Insert citation with VisualEditor for "http://chroniclingamerica.loc.gov/lccn/sn85040224/"

Result:

"The Daily Palo Alto times.". ISSN None Check |issn= value (help). Retrieved 2015-06-23. 

Meta data of the target url:

<meta name="DC.title" content="The Daily Palo Alto times." />
<meta name="DC.publisher" content="Times Pub. Co." />
<meta name="DC.issued" content="1905/1943" />
<meta name="DC.identifier" content="info:lccn/sn85040224" />
<meta name="DC.identifier" content="info:oclcnum/11682912" />
<meta name="DC.identifier" content="urn:issn:None" />
<meta name="DC.type" content="text" />
<meta name="DC.subject" content="California--Palo Alto.--fast--(OCoLC)fst01212098" />
<meta name="DC.subject" content="Palo Alto (Calif.)--Newspapers." />
 
<meta name="DC.language" content="eng" />

<link title="MODS Metadata Schema" rel="schema.mods" href="http://www.loc.gov/standards/mods/mods.xsd" />
<meta name="mods.title" content="The Daily Palo Alto times." />
<meta name="mods.place" content="Palo Alto, Calif." />
<meta name="mods.place" content="California--Santa Clara--Palo Alto" />
<meta name="mods.url" content="http://chroniclingamerica.loc.gov/lccn/sn85040224/" />
<meta name="mods.issn" content="None" />
<meta name="mods.lccn" content="sn85040224" />
<meta name="mods.languageTerm" content="eng" />


<meta name="citation_title" content="The Daily Palo Alto times." />
<meta name="citation_issn" content="None" />

While it is unfortunate, it seems not uncommon for loc.gov to output these kinds of values. The result is a user-visible error (surfaced through the template parsing preview) but not a native VE error, so the invalid value makes its way into page content if the citation is saved as-is without a careful look at the preview.
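To illustrate the requested behaviour, here is a minimal Python sketch of a format check that omits the ISSN when the scraped value is not ISSN-shaped. It is not Citoid's code: the function names `normalize_issn` and `clean_citation`, the dictionary layout, and the choice to accept a missing hyphen are all assumptions made for this example.

```python
import re

# Assumed shape: four digits, an optional hyphen, three digits, then a digit or X.
ISSN_PATTERN = re.compile(r"([0-9]{4})-?([0-9]{3}[0-9Xx])")


def normalize_issn(raw):
    """Return the ISSN as NNNN-NNNC, or None if the value is not ISSN-shaped."""
    if not isinstance(raw, str):
        return None
    match = ISSN_PATTERN.fullmatch(raw.strip())
    if match is None:
        return None
    return f"{match.group(1)}-{match.group(2).upper()}"


def clean_citation(citation):
    """Copy the citation, dropping the ISSN field when it fails the format check."""
    cleaned = dict(citation)
    issn = normalize_issn(cleaned.pop("ISSN", None))
    if issn is not None:
        cleaned["ISSN"] = issn
    return cleaned


# The loc.gov page above reports the literal string "None" as its ISSN.
print(clean_citation({"title": "The Daily Palo Alto times.", "ISSN": "None"}))
```

With a check like this, the placeholder value never reaches the citation template, so the `|issn=` warning would not be triggered for this page.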

Event Timeline

Krinkle raised the priority of this task from to Needs Triage.
Krinkle updated the task description. (Show Details)
Krinkle added projects: Citoid, VisualEditor.
Krinkle subscribed.
Krinkle renamed this task from Citoid should omit IISN if value from loc.gov is invalid (e.g. "None") to Citoid should omit ISSN if value from loc.gov is invalid (e.g. "None"). Jun 23 2015, 6:26 AM
Krinkle set Security to None.
Mvolz renamed this task from Citoid should omit ISSN if value from loc.gov is invalid (e.g. "None") to Citoid service should validate ISSN. Oct 28 2016, 3:26 PM
Mvolz raised the priority of this task from Low to Medium.
Mvolz renamed this task from Citoid service should validate ISSN to Citoid service should validate ISSN in mediawiki format. Edited Jan 6 2017, 4:37 PM
Mvolz closed this task as Resolved.
Mvolz subscribed.

From T138481, citoid.wikimedia.org/api?format=zotero still yields "ISSN":"undefined1463-9084".

Yeah, we validate in mediawiki format (have for a little while, forgot to resolve this task, apparently), in Zotero format we don't. There is probably an underlying bug somewhere though, I will investigate this in the other task.

Yeah, we validate in mediawiki format (have for a little while, forgot to resolve this task, apparently), in Zotero format we don't.

I don't know what "mediawiki format" is, but using these simple steps in VisualEditor, the problem still persists:

  • Add a citation
  • Paste url http://chroniclingamerica.loc.gov/lccn/sn85040224/
  • Results in an error:
    Screen Shot 2017-01-26 at 00.03.07.png (414×824 px, 49 KB)

Nooooo. :) VE uses mediawiki format, so it means our validation is flawed. Alas.

Mvolz claimed this task.

Change 334265 had a related patch set uploaded (by Mvolz):
Use stricter validation for ISSNs

https://gerrit.wikimedia.org/r/334265

Change 334265 merged by jenkins-bot:
Use stricter validation for ISSNs

https://gerrit.wikimedia.org/r/334265
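
The patch itself is not reproduced in this thread. As a rough illustration of what stricter ISSN validation can involve, the Python sketch below goes beyond a shape check and also verifies the ISSN check digit (weights 8 down to 2 over the first seven digits, modulo 11, with 10 written as X). The function names are hypothetical, and whether change 334265 verifies the check digit is not stated here.

```python
import re

# Strict shape: NNNN-NNNC with a mandatory hyphen, C being a digit or X.
STRICT_ISSN = re.compile(r"([0-9]{4})-([0-9]{3})([0-9Xx])")


def issn_check_digit(first_seven):
    """Compute the ISSN check character for a string of seven digits."""
    total = sum(int(d) * w for d, w in zip(first_seven, range(8, 1, -1)))
    value = (11 - total % 11) % 11
    return "X" if value == 10 else str(value)


def is_valid_issn(value):
    """True only for a well-formed ISSN whose check digit matches."""
    match = STRICT_ISSN.fullmatch(value)
    if match is None:
        return False
    first_seven = match.group(1) + match.group(2)
    return match.group(3).upper() == issn_check_digit(first_seven)


for candidate in ["1463-9084", "None", "undefined1463-9084", "1463-9085"]:
    print(candidate, is_valid_issn(candidate))
```

Of these four candidates, only `1463-9084` passes: `None` and `undefined1463-9084` fail the shape check, and `1463-9085` has the right shape but the wrong check digit.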

Mentioned in SAL (#wikimedia-operations) [2017-03-07T22:11:24Z] <mobrovac@tin> Finished deploy [citoid/deploy@5a7e053]: Deploy for T158675 T103478 T159486 (duration: 02m 36s)

mobrovac subscribed.

Deployed, resolving.