
+ is replaced by %2B in external identifiers
Open, Needs Triage, Public

Description

The link here:
https://www.wikidata.org/w/index.php?title=Q7016547&oldid=1401635783#P9173

should point to

https://rateyourmusic.com/genre/Newa+Folk+Music

but instead it points to

https://rateyourmusic.com/genre/Newa%2BFolk%2BMusic

similar issues: T160281, T136346

Event Timeline

Such a fix should probably also leave % symbols in identifiers untouched. If I try to set the identifier of Trap [EDM] (Q47781397) to Trap%20%5BEDM%5D, it is converted to Trap%2520%255BEDM%255D, with each % changed to %25. I suppose it should not work like this.
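The double encoding described above can be reproduced with a minimal sketch. It assumes the link is built by percent-encoding the stored identifier as-is; Python's urllib.parse.quote stands in for whatever Wikidata actually uses, which is not confirmed here.

```python
from urllib.parse import quote

# Identifier stored in already-encoded form (taken from a URL).
already_encoded = "Trap%20%5BEDM%5D"

# Encoding it again turns every "%" into "%25".
double_encoded = quote(already_encoded, safe="")
print(double_encoded)  # Trap%2520%255BEDM%255D

# Storing the raw identifier instead yields a single, correct encoding.
raw = "Trap [EDM]"
print(quote(raw, safe=""))  # Trap%20%5BEDM%5D
```

With the raw form stored, the encoder only has to run once, so no % ever needs special treatment.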

As far as I can see, external IDs in Wikidata are essentially plain strings. All characters are allowed, and nothing is encoded. This is probably how we should continue to treat external IDs. The example seems to struggle with encoding and decoding:

"https://rateyourmusic.com/genre/Newa+Folk+Music" is a URL. In URLs, a "+" commonly encodes a space (" "), so the decoded ID in this URL would actually be "Newa Folk Music". If you enter "Newa Folk Music" as the ID in Wikidata, everything should be fine. However, if you enter "Newa+Folk+Music" as the ID, each "+" is treated as a literal plus sign, so when Wikidata creates a URL for it, the "+" symbols are encoded as "%2B".
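A rough sketch of that behaviour, under stated assumptions: FORMATTER is an assumed formatter URL for P9173, build_url is a hypothetical helper, and urllib.parse.quote approximates the encoding step. Wikidata's real implementation may differ in detail.

```python
from urllib.parse import quote

# Assumed formatter URL; "$1" is the placeholder for the external ID.
FORMATTER = "https://rateyourmusic.com/genre/$1"

def build_url(external_id: str) -> str:
    """Percent-encode the stored ID and substitute it into the formatter URL."""
    return FORMATTER.replace("$1", quote(external_id, safe=""))

# A literal "+" in the stored ID is encoded as "%2B".
print(build_url("Newa+Folk+Music"))
# https://rateyourmusic.com/genre/Newa%2BFolk%2BMusic

# The decoded ID contains spaces, which are encoded as "%20".
print(build_url("Newa Folk Music"))
# https://rateyourmusic.com/genre/Newa%20Folk%20Music
```

Whether the target site resolves the "%20" form the same way as the "+" form is not verified here.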

My take would be that external IDs should only be collected in Wikidata in their natural (unencoded) form. That would mean that you could not e.g. just use parts of URLs as external IDs as they are URL encoded already. You would always have to use the decoded form of the ID.
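For anyone copying IDs out of URLs, the decoding step could look like the sketch below. id_from_url is a hypothetical helper, and treating "+" as a space is an assumption about this particular site: in a generic URL path, "+" may be a literal plus sign.

```python
from urllib.parse import urlsplit, unquote_plus

def id_from_url(url: str) -> str:
    """Return the last path segment of a URL in decoded form."""
    path = urlsplit(url).path
    segment = path.rsplit("/", 1)[-1]
    # unquote_plus decodes %XX escapes and also maps "+" to a space.
    return unquote_plus(segment)

print(id_from_url("https://rateyourmusic.com/genre/Newa+Folk+Music"))
# Newa Folk Music
```

The decoded result is what would be stored as the external ID.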

@Shisma: Thank you for reporting this! In my opinion, this is not a bug; I have tried to explain why in my previous comment. The consequence is that you would enter the external ID in the example as "Newa Folk Music" (the genre as written on the site rateyourmusic.com, not as encoded in the URL of the page). The "property examples" on the property page (https://www.wikidata.org/wiki/Property:P9173 in this case) would have to be changed accordingly.

Does my explanation make sense to you (and anyone reading)? If so, we will try to give a better explanation of this in the documentation.