
Zotero scraped dates from PubMed are irregular and handled badly by validation
Closed, Resolved · Public · 0 Estimated Story Points

Description

This article is returning an incorrect publication date:
https://www.ncbi.nlm.nih.gov/pubmed/1234

Zotero has the date as "1975 Nov-Dec", but the validation is returning "2016-12-01":
https://citoid.wikimedia.org/api?search=https://www.ncbi.nlm.nih.gov/pubmed/1234&format=zotero

Same issue with PMC2096233

There, Zotero gives us '1932-10', which chrono is interpreting as '2017-01-11'.

Event Timeline

Gstupp created this task. Dec 20 2016, 8:54 PM
Restricted Application added a subscriber: Aklapper. Dec 20 2016, 8:54 PM
Restricted Application added a project: VisualEditor. Dec 20 2016, 8:56 PM
Gstupp renamed this task from Incorrect publication date to Incorrect publication date PMID:1234. Dec 20 2016, 8:56 PM
Gstupp moved this task from Backlog to Site specific issues on the Citoid board.
Mvolz renamed this task from Incorrect publication date PMID:1234 to Date validation in citoid handles date "1975 Nov-Dec" from PMID:1234 badly. Jan 9 2017, 2:15 PM
Mvolz claimed this task.
Mvolz triaged this task as Low priority.
Mvolz moved this task from Site specific issues to Service: Scraper & Validation on the Citoid board.
Mvolz added a comment. Jan 9 2017, 2:46 PM

This is because chrono-node is correctly reading in December, but not finding the year. Reported upstream at https://github.com/wanasit/chrono/issues/165, but I don't have high hopes for a fix there; maybe we can get the Zotero translator to pick a better date instead.
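To illustrate the failure mode described in this comment, here is a toy model in Python. It is not chrono-node's code: `naive_parse` and the reference dates are hypothetical, chosen on the assumption that missing fields are filled in from the date on which the parse runs. Under that assumption it reproduces both results reported in the description.

```python
import re
from datetime import date

# Three-letter English month abbreviations mapped to month numbers.
MONTHS = {"jan": 1, "feb": 2, "mar": 3, "apr": 4, "may": 5, "jun": 6,
          "jul": 7, "aug": 8, "sep": 9, "oct": 10, "nov": 11, "dec": 12}


def naive_parse(text, reference):
    """Toy parser (hypothetical): recognises month names only.

    The four-digit year in the input is never examined, so every field
    that was not recognised is taken from the reference date.
    """
    found = [MONTHS[token.lower()]
             for token in re.findall(r"[A-Za-z]{3}", text)
             if token.lower() in MONTHS]
    if not found:
        # Nothing recognised at all: the reference date is returned as-is.
        return reference
    # Keep the last month seen; the year comes from the reference date.
    return date(reference.year, found[-1], 1)


# Assumed reference dates: the days on which the two results were observed.
print(naive_parse("1975 Nov-Dec", date(2016, 12, 20)))  # 2016-12-01
print(naive_parse("1932-10", date(2017, 1, 11)))        # 2017-01-11
```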

Mvolz renamed this task from Date validation in citoid handles date "1975 Nov-Dec" from PMID:1234 badly to Zotero returns date "1975 Nov-Dec" for pubmed article PMID:1234 and chrono-node handles it badly.. Jan 9 2017, 2:47 PM
Mvolz renamed this task from Zotero returns date "1975 Nov-Dec" for pubmed article PMID:1234 and chrono-node handles it badly. to Zotero scraped dates from PubMed are irregular and handled badly by validation. Jan 11 2017, 4:50 PM
Mvolz updated the task description.

Change 331849 had a related patch set uploaded (by Mvolz):
Add custom parsers for PubMed dates

https://gerrit.wikimedia.org/r/331849

Change 331849 merged by jenkins-bot:
Add custom parsers for PubMed dates

https://gerrit.wikimedia.org/r/331849
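For readers who want a feel for what a custom parser for these PubMed date strings could look like, here is a minimal Python sketch. It is not the code merged in change 331849. The function name `parse_pubmed_date`, the two patterns, and the choice to keep only the first month of a range such as "Nov-Dec" are all assumptions made for illustration.

```python
import re

# Three-letter English month abbreviations mapped to month numbers.
MONTHS = {"jan": 1, "feb": 2, "mar": 3, "apr": 4, "may": 5, "jun": 6,
          "jul": 7, "aug": 8, "sep": 9, "oct": 10, "nov": 11, "dec": 12}

# "1975 Nov" or "1975 Nov-Dec": a year, a month name, and an optional second month.
YEAR_MONTH_NAME = re.compile(r"^(\d{4}) ([A-Za-z]{3})(?:-[A-Za-z]{3})?$")
# "1932-10": a year and a numeric month.
YEAR_MONTH_NUMERIC = re.compile(r"^(\d{4})-(\d{1,2})$")


def parse_pubmed_date(text):
    """Return 'YYYY-MM' for the two formats above, or None if neither matches.

    Assumption: for a month range, the first month is the one that is kept.
    """
    text = text.strip()

    match = YEAR_MONTH_NAME.match(text)
    if match:
        month = MONTHS.get(match.group(2).lower())
        return f"{match.group(1)}-{month:02d}" if month else None

    match = YEAR_MONTH_NUMERIC.match(text)
    if match and 1 <= int(match.group(2)) <= 12:
        return f"{match.group(1)}-{int(match.group(2)):02d}"

    return None


print(parse_pubmed_date("1975 Nov-Dec"))  # 1975-11
print(parse_pubmed_date("1932-10"))       # 1932-10
```

Running patterns like these before a general-purpose date parser keeps the year taken from the input string. Anything they do not match returns `None` and can fall through to the existing validation.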

mobrovac removed a project: Patch-For-Review.
mobrovac removed a subscriber: gerritbot.
mobrovac closed this task as Resolved. Jan 18 2017, 1:24 AM
mobrovac added a subscriber: mobrovac.

Deployed, resolving

Jdforrester-WMF set the point value for this task to 0. Feb 3 2017, 9:22 PM