Citoid ignores <302::aid-ajmg13>
Open, Normal, Public, 0 Story Points

Description

Paste:

10.1002/1096-8628(20000612)96:3<302::aid-ajmg13>3.0.co;2-i

in Citoid and it will fill in the citation; however, in the DOI field it will add:

10.1002/1096-8628(20000612)96:33.0.CO;2-I

Which is a nonexistent DOI.

Checked on sv.wiki production.

Josve05a updated the task description.
Josve05a raised the priority of this task from to Needs Triage.
Josve05a added a project: Citoid.
Josve05a added a subscriber: Josve05a.
Restricted Application added subscribers: StudiesWorld, Aklapper. Nov 24 2015, 11:20 AM
Mvolz added a subscriber: Mvolz. Dec 7 2015, 8:34 PM

Hmm. It's getting caught by a fix that strips html tags out of any fields using a node library called stripTags. This is a validation measure to make sure we aren't accidentally sending html to wikis... not sure how to get around this except not to use it on the doi fields. I'll have to consult with security :).
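To illustrate the failure mode described above, here is a minimal sketch. It assumes a generic regex-based tag stripper; `naiveStripTags` is a hypothetical stand-in and not the actual code of the library Citoid uses, which may behave differently in other cases.

```javascript
// Hypothetical stand-in for a generic tag stripper; NOT the real library's code.
function naiveStripTags(text) {
  // Removes anything that looks like <...>, whether or not it is real HTML.
  return text.replace(/<[^>]*>/g, '');
}

const doi = '10.1002/1096-8628(20000612)96:3<302::aid-ajmg13>3.0.co;2-i';

// The "<302::aid-ajmg13>" segment is treated as a tag and dropped.
console.log(naiveStripTags(doi));
// prints "10.1002/1096-8628(20000612)96:33.0.co;2-i"
```

Apart from letter case, this matches the broken DOI reported in the task description.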

Mvolz set Security to None.
Mvolz edited projects, added Security; removed Security-Reviews.
Bawolff added a subscriber: Bawolff. Dec 7 2015, 8:40 PM

Hmm. It's getting caught by a fix that strips html tags out of any fields using a node library called stripTags

Use a better library?

Stripping HTML tags does not sound like the correct solution here. Unless you're worried about dirty data that has extra HTML tags you don't want in it, I would expect that you would simply want to escape angle brackets (i.e., turn > and < into &gt; and &lt; respectively).
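A minimal sketch of the escaping approach suggested above. The function name `escapeForHtml` is hypothetical; it also escapes `&`, which is an assumption added here so that the result can be decoded unambiguously.

```javascript
// Hypothetical helper: escape instead of strip, so no characters are lost.
function escapeForHtml(text) {
  return text
    .replace(/&/g, '&amp;') // must run first, or the entities below get double-escaped
    .replace(/</g, '&lt;')
    .replace(/>/g, '&gt;');
}

const doi = '10.1002/1096-8628(20000612)96:3<302::aid-ajmg13>3.0.co;2-i';
console.log(escapeForHtml(doi));
// prints "10.1002/1096-8628(20000612)96:3&lt;302::aid-ajmg13&gt;3.0.co;2-i"
```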

Mvolz added a comment. Dec 7 2015, 8:42 PM

@Bawolff, That's exactly what we're worried about. It was originally a fix for some Zotero data that was coming in with div and i tags.

Mvolz added a comment. Edited Dec 7 2015, 8:42 PM

Fittingly, it was their Wikipedia translator, hah :).

Mvolz added a comment. Dec 7 2015, 8:44 PM

Also, changing < into &lt; in a doi changes its meaning.

Also, changing < into &lt; in a doi changes its meaning

These should be different layers. Turning < into &lt; when it is fed to the wiki means it will eventually get turned back into a '<' when read by the user.
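The layering argument can be sketched as a round trip. This is an illustration only: `escapeForHtml` and `unescapeFromHtml` are hypothetical helpers, and the decode step stands in for what an HTML renderer does when it displays the text.

```javascript
// Hypothetical helpers illustrating escape-at-output, decode-at-display.
function escapeForHtml(text) {
  return text
    .replace(/&/g, '&amp;')
    .replace(/</g, '&lt;')
    .replace(/>/g, '&gt;');
}

function unescapeFromHtml(text) {
  // &amp; is decoded last so that "&amp;lt;" becomes "&lt;" and not "<".
  return text
    .replace(/&lt;/g, '<')
    .replace(/&gt;/g, '>')
    .replace(/&amp;/g, '&');
}

const storedDoi = '10.1002/1096-8628(20000612)96:3<302::aid-ajmg13>3.0.co;2-i';
const sentToHtml = escapeForHtml(storedDoi);     // what an HTML consumer receives
const seenByUser = unescapeFromHtml(sentToHtml); // what the reader ends up seeing

console.log(seenByUser === storedDoi); // prints true
```

The stored value itself is never altered; only the representation handed to an HTML consumer is.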

That's exactly what we're worried about. It was originally a fix for some Zotero data that was coming in with div and i tags.

I'm not exactly sure about the syntax of DOIs, but that means you'd probably have to do something like strip HTML-type tags, but only those that are div, span, i, etc., and then escape angle brackets for the other uses of angle brackets.
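A minimal sketch of the allow-list idea above. Only `div`, `span` and `i` come from the discussion; the remaining tag names and the names `FORMATTING_TAGS` and `stripFormattingTags` are assumptions made for illustration.

```javascript
// Tags treated as formatting noise. Everything past 'i' is an assumed extension.
const FORMATTING_TAGS = ['div', 'span', 'i', 'b', 'em', 'sup', 'sub'];

// Matches opening or closing tags whose name is on the list, with optional attributes.
const FORMATTING_TAG_PATTERN = new RegExp(
  '</?(?:' + FORMATTING_TAGS.join('|') + ')\\b[^>]*>',
  'gi'
);

function stripFormattingTags(text) {
  // Angle brackets that do not form a listed tag are left untouched.
  return text.replace(FORMATTING_TAG_PATTERN, '');
}

console.log(stripFormattingTags('<div><i>Homo sapiens</i></div>'));
// prints "Homo sapiens"

const doi = '10.1002/1096-8628(20000612)96:3<302::aid-ajmg13>3.0.co;2-i';
console.log(stripFormattingTags(doi) === doi); // prints true
```

Any angle brackets that survive this step would still need to be escaped when the value is written out as HTML.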

Bawolff claimed this task. Dec 8 2015, 10:08 PM
Bawolff triaged this task as Normal priority.
Bawolff edited projects, added Security-Team; removed Security.
Mvolz added a subscriber: mobrovac. Dec 9 2015, 6:56 AM

That's true, but we can't guarantee that every consumer of the doi field is going to be html, so that makes me concerned. I think I'd be happier not using striptags just on the doi field. The issue we had with polluting html tags was not in the doi field and I think is unlikely to be. @mobrovac what do you think?
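A minimal sketch of exempting the DOI field from tag stripping. The field name `DOI`, the flat citation object, and the helpers `naiveStripTags` and `sanitizeCitation` are all assumptions for illustration, not Citoid's actual data model or code.

```javascript
// Fields that are passed through without tag stripping (assumed field name).
const UNSANITIZED_FIELDS = new Set(['DOI']);

// Hypothetical stand-in for the tag stripper currently applied to every field.
function naiveStripTags(text) {
  return text.replace(/<[^>]*>/g, '');
}

function sanitizeCitation(citation) {
  const clean = {};
  for (const [field, value] of Object.entries(citation)) {
    if (typeof value === 'string' && !UNSANITIZED_FIELDS.has(field)) {
      clean[field] = naiveStripTags(value);
    } else {
      // Exempt fields and non-string values are copied as-is.
      clean[field] = value;
    }
  }
  return clean;
}

const result = sanitizeCitation({
  title: '<i>Homo sapiens</i> genetics',
  DOI: '10.1002/1096-8628(20000612)96:3<302::aid-ajmg13>3.0.co;2-i'
});
console.log(result.title); // prints "Homo sapiens genetics"
console.log(result.DOI);   // prints the DOI unchanged
```

With this approach the DOI reaches consumers unmodified, so any HTML consumer remains responsible for escaping it.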

I'm with @Mvolz that the best solution might be not to enforce tag stripping on all of the fields. But, before we do that, I'd like to see some research conducted into what happens if malicious tags (primarily <script>) are inserted in the DOI field by the client. How would Zotero process this? What about DOI indexers?

Personally, I'm sad to see such abuses of the DOI spec.

Mvolz moved this task from Backlog to IO Tasks on the Citoid board. Jan 12 2016, 10:14 AM
Restricted Application added a project: VisualEditor. Oct 28 2016, 3:13 PM
Jdforrester-WMF set the point value for this task to 0. Feb 9 2017, 6:18 PM
Mvolz claimed this task. Sep 13 2017, 9:21 AM
Josve05a moved this task from Backlog to Tasks to follow on the User-Josve05a board.