
Results of a test with 10 random DOIs from en.wiki on the Beta site
Closed, Resolved · Public · 8 Estimated Story Points

Description

I took 10 random DOIs from en.wiki.

You can see the results here.

10.1038/scientificamerican0200-90 is still not working: the DOI points to a log-in-only PDF and the website returns a 401. Not sure there's much we can do here.
10.2307/3677029 still says "JSTOR: An Error Occurred Setting Your User Cookie". Forked to T93877.
10.1542/peds.2007-2362 still gives "Check date values in: |date= (help)". Forked to T95016.
Hope this helps.

Event Timeline

Elitre created this task. Mar 27 2015, 6:23 PM
Elitre raised the priority of this task from to Needs Triage.
Elitre updated the task description.
Elitre added a project: Citoid.
Elitre added a subscriber: Elitre.
Restricted Application added a subscriber: Aklapper. Mar 27 2015, 6:23 PM
Mvolz added a subscriber: Mvolz. Mar 27 2015, 6:45 PM

https://gerrit.wikimedia.org/r/#/c/199921/ should help when it gets merged, but I will check all of these with that change and give an update.

Mvolz moved this task from Backlog to IO Tasks on the Citoid board. Mar 27 2015, 7:13 PM
Mvolz added a comment. Mar 28 2015, 2:46 PM

@Elitre, changes to the DOI converter are live, would you mind rechecking these?

Mvolz added a comment. Mar 28 2015, 2:48 PM

@Elitre, scratch that, they've been merged but aren't live yet :). I'll let you know when they are.

Mvolz added a comment. Apr 3 2015, 1:43 PM

Deployed, can retest now.

Elitre added a comment. Apr 3 2015, 1:51 PM

10.1038/scientificamerican0200-90 is still not working.
10.2307/3677029 still says "JSTOR: An Error Occurred Setting Your User Cookie".
10.1542/peds.2007-2362 still gives "Check date values in: |date= (help)".

Note that you get the JSTOR error when using a standard JSTOR URL as well, like http://www.jstor.org/stable/25177324 - should a separate bug be opened for that?

Elitre added a comment. Apr 3 2015, 4:19 PM

The date error comes from CS1 templates such as Cite journal, which do not accept dates with slashes; replacing the slashes with dashes fixes it quickly (this is explained in https://en.wikipedia.org/wiki/Help:CS1_errors#bad_date , and it's good that the error message on en.wiki links to the explanation page). So I don't know if this can be solved by changing the templates, or by having Citoid convert the date to an accepted format - I suspect the former. I can fork this task if Marielle says it's needed.
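
For illustration, the slash-to-dash conversion mentioned above could look roughly like this in Python. This is only a sketch: `normalize_date` is a hypothetical helper, not Citoid's code, and it assumes the incoming date is year-first (`YYYY/MM/DD`). Anything that does not match that shape, or is not a real calendar date, is returned unchanged.

```python
import datetime
import re

# Assumption: slash-separated dates arrive year-first, e.g. "2008/2/1".
_SLASH_DATE = re.compile(r"^(\d{4})/(\d{1,2})/(\d{1,2})$")


def normalize_date(raw: str) -> str:
    """Rewrite a YYYY/MM/DD date as YYYY-MM-DD; return other input unchanged."""
    match = _SLASH_DATE.match(raw.strip())
    if match is None:
        return raw
    year, month, day = (int(part) for part in match.groups())
    try:
        # datetime.date rejects impossible dates such as month 13.
        parsed = datetime.date(year, month, day)
    except ValueError:
        return raw
    return parsed.isoformat()
```

With this sketch, `normalize_date("2008/2/1")` gives `"2008-02-01"`, while a value such as `"March 2008"` passes through untouched.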

Mvolz added a comment. Apr 3 2015, 4:43 PM

It's possible we've been IP blocked by JSTOR, see: T88323.

@Elitre, please open a ticket for the dates; we should be validating date fields.

Mvolz added a comment. Apr 3 2015, 4:46 PM

Re: cookies issue, we have a ticket for that: T93877

Mvolz added a comment. Apr 7 2015, 12:59 PM

The problem with the Scientific American one (10.1038/scientificamerican0200-90) is that it resolves to a PDF that requires you to log in to access:

http://www.nature.com/scientificamerican/journal/v282/n2/pdf/scientificamerican0200-90.pdf

That website gives us a 401 response code, so we won't try to scrape it. If it had resolved to an actual PDF, we wouldn't be able to scrape that either, because we can't scrape PDFs.

In the future we may try to scrape data from certain non-200 responses and generate a citation anyway, but probably not now. You could try filing this one as a site-specific issue, since all of these DOIs are probably bad (they point to PDFs that we can't scrape), but I doubt much progress will be made here. Scientific American should make their DOIs point to abstract pages instead of log-in-only PDFs :).
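
As a rough sketch of the rule described in this comment (scrape only on a 200 response, and never a PDF), in Python. `should_scrape` and its parameters are hypothetical names for illustration and are not taken from Citoid's code; detecting a PDF through the `Content-Type` header is also an assumption.

```python
def should_scrape(status_code: int, content_type: str) -> bool:
    """Decide whether a fetched page is worth scraping for citation data.

    Hypothetical helper: mirrors the behaviour described in the comment,
    not the actual Citoid implementation.
    """
    # Any non-200 response (such as the 401 from the log-in wall) is skipped.
    if status_code != 200:
        return False
    # Drop parameters like "; charset=utf-8" before comparing media types.
    media_type = content_type.split(";", 1)[0].strip().lower()
    # PDFs cannot be scraped, even when they are served successfully.
    return media_type != "application/pdf"
```

Under these assumptions, the Scientific American DOI fails the first check because of the 401, and it would fail the second check even if the PDF were served without a log-in.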

Mvolz updated the task description. Apr 7 2015, 1:02 PM
Mvolz set Security to None.
Jdforrester-WMF closed this task as Resolved. Jul 16 2015, 1:39 PM
Jdforrester-WMF claimed this task.
Jdforrester-WMF added a subscriber: Jdforrester-WMF.

Checked all three DOIs; they now all work as expected.

Jdforrester-WMF edited a custom field.