
Nature.com articles gives citoid 401s.
Closed, Resolved | Public | 0 Estimated Story Points

Description

This is a problem when following any nature.com DOI. These DOIs link to an intermediate DOI resolver, and instead of giving us a redirect, the resolver pages give us 401s. We can use CrossRef metadata, but in some cases this metadata is incomplete, e.g. http://www.nature.com/doifinder/10.1038/35093097, which is missing its title.
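To make "incomplete metadata" concrete, here is a minimal sketch of a completeness check. The helper name `missing_fields` and the choice of `title` as the only required field are assumptions for illustration, not part of citoid; the field names follow the citation JSON quoted later in this task.

```python
# Hypothetical helper, not citoid code. Field names ("title", "DOI",
# "itemType") follow the citation JSON quoted in this task.
REQUIRED_FIELDS = ("title",)  # assumption: a title is the minimum we need


def missing_fields(citation, required=REQUIRED_FIELDS):
    """Return the required fields that are absent or blank in a citation dict."""
    missing = []
    for field in required:
        value = citation.get(field)
        if value is None or not str(value).strip():
            missing.append(field)
    return missing


# A record shaped like the failing case: it has a DOI but no title.
crossref_like = {"itemType": "journalArticle", "DOI": "10.1038/35093097"}
print(missing_fields(crossref_like))
```

A non-empty result would be the signal that the CrossRef record alone is not enough and another source is needed.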

We are also unable to scrape the page, meaning we are unable to supply the title directly from the page either.

The actual response from the page includes the metadata we want, but we have a policy of not including metadata from non-200 status resources (since these are usually bad).

However, the headers do carry the correct Content-Location. The catch is that we can't be certain the Content-Location is good in general, because it could be the location of an error page.
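One way to act on the header without trusting it blindly is to treat it only as a candidate URL to re-request, and to accept metadata only if that second request returns 200. The sketch below covers the first half of that idea. The function name, the same-host restriction, and the example path are all assumptions for illustration; this is not how citoid is implemented.

```python
from urllib.parse import urljoin, urlsplit


def content_location_candidate(status, headers, request_url):
    """Return an absolute URL worth re-requesting, or None.

    Assumption: Content-Location is only considered for 401 responses and
    only when it stays on the same host as the original request.
    """
    if status != 401:
        return None
    # HTTP header names are case-insensitive, so normalise them first.
    lookup = {name.lower(): value for name, value in headers.items()}
    location = lookup.get("content-location")
    if not location:
        return None
    # Content-Location may be relative; resolve it against the request URL.
    candidate = urljoin(request_url, location.strip())
    if urlsplit(candidate).hostname != urlsplit(request_url).hostname:
        return None
    if candidate == request_url:
        return None  # nothing new to try
    return candidate


# The path below is made up for the example.
print(content_location_candidate(
    401,
    {"Content-Location": "/example/article.html"},
    "http://www.nature.com/doifinder/10.1038/35093097",
))
```

The caller would still have to fetch the returned URL and keep the existing policy of using metadata only from a 200 response, which guards against the header pointing at an error page.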

Zotero is able to handle these links when given the final URL directly.

Event Timeline

Mvolz raised the priority of this task from to Needs Triage.
Mvolz updated the task description. (Show Details)
Mvolz added a project: Citoid.
Mvolz changed Security from none to None.
Mvolz moved this task from Backlog to Extension on the Citoid board.
Mvolz subscribed.
Mvolz renamed this task from scraper.js gets 500 on nature.com articles (err connection reset by peer) to scraper.js gets Internal Server Error on nature.com articles. Nov 26 2014, 12:36 PM
Mvolz renamed this task from scraper.js gets Internal Server Error on nature.com articles to scraper.js gets 404 not found on nature.com articles. Dec 12 2014, 12:24 PM
Mvolz updated the task description. (Show Details)
Mvolz triaged this task as Lowest priority. Mar 9 2015, 3:50 PM
Josve05a raised the priority of this task from Lowest to Low. Jul 12 2015, 6:32 PM
Josve05a subscribed.

Citoid should say that it is unable to process the request, instead of returning {{cite journal|url=http://www.nature.com/ijo/journal/v38/n1/full/ijo201369a.html |title=Internal Server Error}} when searching for 10.1038/ijo.2013.69.

@Josve05a this is actually a totally different issue; it looks like the Zotero translator for nature.com is broken, as that response is coming straight from Zotero.

(Thanks for commenting here, I'll file a separate bug.)

I'll file a separate bug.

Thanks, feel free to CC me.

This comment was removed by Mvolz.
Mvolz renamed this task from scraper.js gets 404 not found on nature.com articles to Nature.com articles gives citoid 404s.. Jul 19 2015, 5:39 PM
Mvolz raised the priority of this task from Low to Medium.
Mvolz updated the task description. (Show Details)
Mvolz renamed this task from Nature.com articles gives citoid 404s. to Nature.com articles gives citoid 401s.. Jul 19 2015, 5:44 PM
Mvolz updated the task description. (Show Details)

Looks like this is handled now: https://citoid.wikimedia.org/api?format=mediawiki&search=http://www.nature.com/ijo/journal/v38/n1/full/ijo201369a.html yields

[{"itemType":"journalArticle","notes":[],"tags":[],"title":"Perceived ‘healthiness’ of foods can influence consumers’ estimations of energy density and appropriate portion size","publicationTitle":"International Journal of Obesity","rights":"© 2013 Nature Publishing Group","volume":"38","issue":"1","pages":"106–112","date":"2014-01-01","DOI":"10.1038/ijo.2013.69","language":"en","url":"http://www.nature.com/ijo/journal/v38/n1/full/ijo201369a.html","abstractNote":"OBJECTIVE:\nMETHODS:\nRESULTS:\nCONCLUSIONS:","libraryCatalog":"www.nature.com","accessDate":"2017-03-14","author":[["G. P.","Faulkner"],["L. K.","Pourshahidi"],["J. M. W.","Wallace"],["M. A.","Kerr"],["T. A.","McCaffrey"],["M. B. E.","Livingstone"]],"source":["Zotero"]}]

Nope, that result is coming from Zotero. Citoid on its own is still unable to scrape the page.
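The quoted response shows this in its `source` field. As a rough sketch, assuming `source` lists the backend(s) that produced each record, a response body can be checked like this (the helper name is hypothetical, and the sample is a trimmed copy of the JSON above):

```python
import json


def came_from_zotero(response_body):
    """Return True if any citation in the JSON array lists Zotero as a source."""
    citations = json.loads(response_body)
    # Assumption: "source" is a list of backend names, as in the sample above.
    return any("Zotero" in citation.get("source", []) for citation in citations)


# Trimmed copy of the response quoted above.
sample = '[{"itemType":"journalArticle","DOI":"10.1038/ijo.2013.69","source":["Zotero"]}]'
print(came_from_zotero(sample))
```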

Does that mean a bypass was created to handle these? I can confirm that the sample citation also compiles properly in VE's Citoid dialog.

Deskana claimed this task.
Deskana subscribed.

I tried the example URL in the description (http://www.nature.com/doifinder/10.1038/35093097), and as far as I can tell it gave a well-formed and complete citation for the DOI in question. Based on this, I think this issue is now resolved. Please reopen if I'm wrong!

Screen Shot 2018-08-16 at 16.44.35.png (552×830 px, 86 KB)