
Geoshapes and tabular data with spaces in title are exported to RDF with + instead of _
Open, High · Public

Description

RdfVocabulary::getGeoShapeUri currently turns the title Data:Avignon Roman Wall.map into the URI <http://commons.wikimedia.org/data/main/Data:Avignon+Roman+Wall.map>, which results in 404 Not Found. It should probably be <http://commons.wikimedia.org/data/main/Data:Avignon_Roman_Wall.map>, or possibly <http://commons.wikimedia.org/data/main/Data:Avignon%20Roman%20Wall.map>.
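For illustration, the three candidate URIs differ only in how the space is encoded. The following is a minimal Python sketch, not the Wikibase code: it assumes the `+` comes from form-style encoding (which is what `quote_plus` does), and the variable names are made up for this example. Only the title and the URI prefix are taken from the description above.

```python
from urllib.parse import quote, quote_plus

# Prefix and title as given in the task description.
BASE = "http://commons.wikimedia.org/data/main/"
title = "Data:Avignon Roman Wall.map"

# Form-style encoding turns each space into "+" (the reported, broken form).
plus_form = BASE + quote_plus(title, safe=":")

# DB-key style: replace spaces with underscores, then percent-encode the rest.
underscore_form = BASE + quote(title.replace(" ", "_"), safe=":")

# Plain percent-encoding turns each space into "%20".
percent_form = BASE + quote(title, safe=":")

for uri in (plus_form, underscore_form, percent_form):
    print(uri)
```

In the path component of a URI, `+` is a literal plus sign rather than an encoded space, which is consistent with the 404 reported above.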

If @Smalyshev is going to reload WDQS soon, it would be swell if we could fix this bug before then. (Otherwise, we could probably purge all items that have geoshape statements in some other way; there are only a few hundred of them.)

The exact same thing happens with tabular data, too (e.g., Data:Dolmens of the Preseli Hills.tab).

Event Timeline

Restricted Application added a subscriber: Aklapper. · Oct 13 2017, 5:10 PM

If I understand correctly, RdfVocabulary::getGeoShapeUri gets the text form of the title, and we need to export the DB key form.

Or is the bug somewhere else, and getGeoShapeUri’s argument is supposed to be in DB key form already?

Lucas_Werkmeister_WMDE renamed this task from Geoshapes with spaces in title are exported to RDF with + instead of _ to Geoshapes and tabular data with spaces in title are exported to RDF with + instead of _. · Oct 13 2017, 5:16 PM
Lucas_Werkmeister_WMDE updated the task description.

Workaround: Wikibase also lets you enter the title with underscores instead of spaces (requiring a normalized form is left to the “Commons link” constraint), and that gets exported correctly. Based on that, I assume that getGeoShapeUri’s argument is just directly the value stored in the statement, and the function should fix it up appropriately. (Do we need a full TitleParser for this?)
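A rough sketch of what "fix it up appropriately" could look like, in Python rather than the PHP of the actual code. Everything here is an assumption for illustration: `to_db_key` and `data_page_uri` are hypothetical names, and the normalization is a deliberately simplified stand-in for MediaWiki's title handling (it only unifies spaces and underscores; it does not do namespace resolution, first-letter capitalization, or validation, which a full TitleParser would).

```python
from urllib.parse import quote

# URI prefix as given in the task description.
DATA_BASE_URI = "http://commons.wikimedia.org/data/main/"


def to_db_key(title: str) -> str:
    """Simplified stand-in for DB key normalization (hypothetical helper).

    Treats underscores and whitespace as equivalent, collapses runs of
    them, trims both ends, and joins the parts with single underscores.
    """
    return "_".join(title.replace("_", " ").split())


def data_page_uri(title: str) -> str:
    """Build the URI from the DB key form, whichever form was stored."""
    return DATA_BASE_URI + quote(to_db_key(title), safe=":")


# Text form and underscore form of the same title yield the same URI.
print(data_page_uri("Data:Dolmens of the Preseli Hills.tab"))
print(data_page_uri("Data:Dolmens_of_the_Preseli_Hills.tab"))
```

Normalizing inside the URI builder means the statement value can stay as entered, and both input forms export identically.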

Since these URIs are unlikely to be IDs, this does not look like high priority, but I think it should be fixed. The reload ticket is T176593.

“which results in 404 Not Found”

They should still be resolvable though… otherwise we needn’t have bothered setting up Special:PageData and all that :)

Change 384501 had a related patch set uploaded (by Lucas Werkmeister (WMDE); owner: Lucas Werkmeister (WMDE)):
[wikidata/query/gui@master] Add workaround for T178184

https://gerrit.wikimedia.org/r/384501

Change 384501 merged by jenkins-bot:
[wikidata/query/gui@master] Add workaround for T178184

https://gerrit.wikimedia.org/r/384501