Nature.com articles gives citoid 401s.
Closed, Resolved · Public · 0 Story Points

Description

This affects every nature.com DOI that links to an intermediate DOI resolver: instead of giving us a redirect, these pages give us 401s. We can use CrossRef metadata, but in some cases this metadata is incomplete, e.g. http://www.nature.com/doifinder/10.1038/35093097, which has a missing title.

We are also unable to scrape the page, meaning we are unable to supply the title directly from the page either.
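The fallback described above amounts to filling gaps in one metadata record from a second source. A minimal sketch of that idea follows; `fillMissingFields` and the plain-object record shape are hypothetical and not taken from citoid's code.

```javascript
// Hypothetical helper: copy fields from `fallback` into a copy of `primary`,
// but only where `primary` has no usable value (undefined, null, or blank string).
function fillMissingFields(primary, fallback) {
  const merged = Object.assign({}, primary);
  for (const [key, value] of Object.entries(fallback)) {
    const current = merged[key];
    const isBlank =
      current === undefined ||
      current === null ||
      (typeof current === 'string' && current.trim() === '');
    if (isBlank) {
      merged[key] = value;
    }
  }
  return merged;
}
```

With this shape, a CrossRef record that lacks a title would keep all of its own fields and take only the title from a successful scrape. When neither source has the title, as in this task, the field simply stays missing.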

The actual response from the page includes the metadata we want, but we have a policy of not including metadata from non-200 status resources (since these are usually bad).

However, the headers do have the correct Content-Location. We can't be certain that the Content-Location is good in general, though, because it could be the location of an error page.
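One way to reconcile the two concerns above is to treat Content-Location on a non-200 response only as a candidate URL to request again, never as a source of metadata in itself. The sketch below illustrates that decision. It is not citoid's implementation: `planScrape`, the `{ status, headers }` response shape, and the action names are all assumptions.

```javascript
// Hypothetical sketch: decide what to do with a response from a DOI landing page.
// `response` is assumed to be { status: number, headers: object } with
// lower-cased header names, as Node's http module provides them.
function planScrape(requestUrl, response) {
  if (response.status === 200) {
    // Normal case: the page itself may be scraped.
    return { action: 'scrape', url: requestUrl };
  }
  const location = response.headers['content-location'];
  if (!location) {
    // Nothing to follow; the caller falls back to other metadata sources.
    return { action: 'fallback', url: null };
  }
  try {
    // Content-Location may be relative, so resolve it against the request URL.
    const candidate = new URL(location, requestUrl).href;
    if (candidate === new URL(requestUrl).href) {
      // Pointing back at the same URL would only repeat the failure.
      return { action: 'fallback', url: null };
    }
    return { action: 'retry', url: candidate };
  } catch (err) {
    // Malformed URL in the header or the request.
    return { action: 'fallback', url: null };
  }
}
```

A caller would follow a `retry` once and use the page's metadata only if that second request returns 200. This keeps the existing policy intact, on the assumption that an error page at the Content-Location would itself answer with a non-200 status.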

Zotero is able to handle these links when directly given the final url.

Event Timeline

Mvolz created this task. · Nov 20 2014, 6:22 PM
Mvolz raised the priority of this task from to Needs Triage.
Mvolz updated the task description. (Show Details)
Mvolz added a project: Citoid.
Mvolz changed Security from none to None.
Mvolz moved this task from Backlog to Extension on the Citoid board.
Mvolz added a subscriber: Mvolz.
Mvolz renamed this task from scraper.js gets 500 on nature.com articles (err connection reset by peer) to scraper.js gets Internal Server Error on nature.com articles. · Nov 26 2014, 12:36 PM
Mvolz renamed this task from scraper.js gets Internal Server Error on nature.com articles to scraper.js gets 404 not found on nature.com articles. · Dec 12 2014, 12:24 PM
Mvolz updated the task description. (Show Details)
Mvolz moved this task from Extension to Site specific issues on the Citoid board. · Feb 7 2015, 2:51 PM
Mvolz triaged this task as Lowest priority. · Mar 9 2015, 3:50 PM
Josve05a raised the priority of this task from Lowest to Low. · Jul 12 2015, 6:32 PM
Josve05a added a subscriber: Josve05a.

Citoid should say that it is unable to process the request, instead of returning {{cite journal|url=http://www.nature.com/ijo/journal/v38/n1/full/ijo201369a.html |title=Internal Server Error}} when searching for 10.1038/ijo.2013.69.
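The behaviour requested here could be approximated by refusing citations whose title is just an HTTP error phrase. The following is a rough sketch under that assumption. `isErrorPageTitle`, `checkCitation`, and the phrase list are hypothetical and do not come from citoid or Zotero.

```javascript
// Hypothetical list of titles that indicate an error page rather than an article.
// Matching is exact (after normalisation) to avoid rejecting real article titles
// that merely contain these words.
const ERROR_PAGE_TITLES = new Set([
  'internal server error',
  'not found',
  'unauthorized',
  'forbidden',
  'bad gateway',
  'service unavailable',
]);

function isErrorPageTitle(title) {
  if (typeof title !== 'string') {
    return false;
  }
  // Normalise case and drop a leading status code such as "404 ".
  const normalized = title.trim().toLowerCase().replace(/^\d{3}\s+/, '');
  return ERROR_PAGE_TITLES.has(normalized);
}

// Returns a failure object instead of a citation built from an error page.
function checkCitation(citation) {
  if (!citation.title || isErrorPageTitle(citation.title)) {
    return { ok: false, reason: 'unable to process the request' };
  }
  return { ok: true, citation: citation };
}
```

Exact matching is a deliberate choice in this sketch: a substring check would be simpler but could reject a genuine article whose title happens to include a phrase like "not found".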

Mvolz added a subscriber: mobrovac. · Edited · Jul 12 2015, 7:15 PM

@Josve05a this is actually a totally different issue; it looks like the Zotero translator for nature.com is broken, as that response is coming straight from Zotero.

Mvolz added a comment. · Jul 12 2015, 7:15 PM

(Thanks for commenting here, I'll file a separate bug.)

I'll file a separate bug.

Thanks, feel free to CC me.

Mvolz added a comment. · Jul 12 2015, 7:45 PM
This comment was removed by Mvolz.
Mvolz renamed this task from scraper.js gets 404 not found on nature.com articles to Nature.com articles gives citoid 404s.. · Jul 19 2015, 5:39 PM
Mvolz raised the priority of this task from Low to Medium.
Mvolz updated the task description. (Show Details)
Mvolz renamed this task from Nature.com articles gives citoid 404s. to Nature.com articles gives citoid 401s.. · Jul 19 2015, 5:44 PM
Mvolz updated the task description. (Show Details)
Mvolz updated the task description. (Show Details) · Jul 19 2015, 5:47 PM
Restricted Application added a project: VisualEditor. · Oct 29 2016, 7:37 AM
Jdforrester-WMF set the point value for this task to 0. · Feb 9 2017, 6:21 PM
czar added a subscriber: czar. · Mar 14 2017, 2:35 AM

Looks like this is handled now: https://citoid.wikimedia.org/api?format=mediawiki&search=http://www.nature.com/ijo/journal/v38/n1/full/ijo201369a.html yields

[{"itemType":"journalArticle","notes":[],"tags":[],"title":"Perceived ‘healthiness’ of foods can influence consumers’ estimations of energy density and appropriate portion size","publicationTitle":"International Journal of Obesity","rights":"© 2013 Nature Publishing Group","volume":"38","issue":"1","pages":"106–112","date":"2014-01-01","DOI":"10.1038/ijo.2013.69","language":"en","url":"http://www.nature.com/ijo/journal/v38/n1/full/ijo201369a.html","abstractNote":"OBJECTIVE:\nMETHODS:\nRESULTS:\nCONCLUSIONS:","libraryCatalog":"www.nature.com","accessDate":"2017-03-14","author":[["G. P.","Faulkner"],["L. K.","Pourshahidi"],["J. M. W.","Wallace"],["M. A.","Kerr"],["T. A.","McCaffrey"],["M. B. E.","Livingstone"]],"source":["Zotero"]}]
Mvolz added a comment. · Mar 14 2017, 2:21 PM

Nope, that result is coming from Zotero. Citoid alone is still unable to scrape the page.
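The `source` field in the response quoted above is what shows where the data came from. As a small illustration, a check like the following could separate Zotero-only results from others. `isZoteroOnly` is a hypothetical helper, and the only `source` value assumed is the `"Zotero"` string that appears in that response.

```javascript
// Hypothetical helper: true when every entry in a citation's `source` array
// is "Zotero", i.e. nothing in the result came from any other backend.
function isZoteroOnly(citation) {
  const sources = Array.isArray(citation.source) ? citation.source : [];
  return sources.length > 0 && sources.every((s) => s === 'Zotero');
}
```

Applied to the result above, whose `source` is `["Zotero"]`, this would report that the citation did not come from citoid's own scraper, which is why that result does not show the scraping problem to be fixed.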

czar added a comment. · Mar 15 2017, 5:32 AM

Does that mean a bypass was created to handle these? I can confirm that the sample citation also compiles properly in the VE's Citoid dialog.

Deskana closed this task as Resolved. · Aug 16 2018, 3:47 PM
Deskana claimed this task.
Deskana added a subscriber: Deskana.

I tried the example URL in the description (http://www.nature.com/doifinder/10.1038/35093097), and as far as I can tell it gave a well-formed and complete citation for the DOI in question. Based on this, I think this issue is now resolved. Please reopen if I'm wrong!

Restricted Application added a project: User-Ryasmeen. · Aug 16 2018, 3:47 PM