
Bad formatting for quotes in .nt export
Closed, Resolved, Public

Description

When exporting data as N-Triples, quotes in strings are not escaped correctly. Example: https://www.wikidata.org/wiki/Special:EntityData/Q33742.nt?flavor=dump produces this string:

<http://www.wikidata.org/entity/Q33742> <http://schema.org/description> "language naturally spoken by humans, as opposed to \\"formal\\" or \\"built\\" languages"@en .

Note how the backslash before each quote is doubled. This is incorrect: it should be a single backslash. The TTL format works fine.
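For illustration, here is a minimal Python sketch of literal escaping for N-Triples. It is not the purtle code, and both function names are hypothetical. It assumes one possible cause of output like the above: escaping the quote first and the backslash afterwards, so that the backslash just added gets escaped again. Whether that is what actually happened in purtle is not stated in this task.

```python
def escape_ntriples_literal(value: str) -> str:
    """Escape a string for use inside a quoted N-Triples literal (hypothetical helper)."""
    # Backslashes go first, so the backslashes added by later steps are left alone.
    value = value.replace("\\", "\\\\")
    value = value.replace('"', '\\"')
    value = value.replace("\n", "\\n")
    value = value.replace("\r", "\\r")
    return value


def escape_wrong_order(value: str) -> str:
    """Same replacements in the wrong order (assumed cause, for illustration only)."""
    # The quote is escaped first, then the backslash that was just added is doubled.
    value = value.replace('"', '\\"')
    value = value.replace("\\", "\\\\")
    return value


description = 'as opposed to "formal" or "built" languages'
print('"' + escape_ntriples_literal(description) + '"@en')
print('"' + escape_wrong_order(description) + '"@en')
```

The first line printed has a single backslash before each inner quote; the second has the doubled backslash seen in the export above.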

Event Timeline

Change 332816 had a related patch set uploaded (by Smalyshev):
Do not double-quote quotes in NTriples format

https://gerrit.wikimedia.org/r/332816

Change 332816 merged by jenkins-bot:
Do not double-quote quotes in NTriples format

https://gerrit.wikimedia.org/r/332816

Smalyshev triaged this task as Medium priority. Jan 20 2017, 9:48 PM

The purtle part is done but I'm not sure how to update it in production - is there anything that needs to be done for it?

@Smalyshev composer.json for wikibase has "wikimedia/purtle": "~1.0", so any bugfix release should be pulled in automatically. That means the next update of the wikidata-build repository will have the new purtle version, and it will be deployed with the next regular wikidata deployment. Not sure when that is, though; ask @hoo. Updating the wikidata build is still a manual process.
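As a rough illustration of why a bugfix release is picked up without touching composer.json: Composer's tilde operator lets the last specified segment increase, so "~1.0" behaves like ">=1.0 <2.0.0". The Python sketch below only mimics that rule for plain numeric versions. The function names and the version numbers in the usage lines are hypothetical, and stability flags and pre-release suffixes are ignored.

```python
def tilde_range(constraint: str):
    """Return (lower, upper) bounds for a Composer-style tilde constraint such as '~1.0'."""
    parts = [int(p) for p in constraint.lstrip("~").split(".")]
    if len(parts) not in (2, 3):
        raise ValueError("this sketch handles only two- or three-segment constraints")
    lower = tuple(parts + [0] * (3 - len(parts)))
    # The last given segment may grow, so the upper bound bumps the segment before it.
    upper_parts = parts[:-1]
    upper_parts[-1] += 1
    upper = tuple(upper_parts + [0] * (3 - len(upper_parts)))
    return lower, upper


def satisfies(version: str, constraint: str) -> bool:
    """Check a plain 'x.y.z' version against a tilde constraint (lower <= v < upper)."""
    parsed = tuple(int(p) for p in version.split("."))
    parsed = parsed + (0,) * (3 - len(parsed))
    lower, upper = tilde_range(constraint)
    return lower <= parsed < upper


# Hypothetical version numbers, for illustration only.
print(satisfies("1.0.1", "~1.0"))
print(satisfies("2.0.0", "~1.0"))
```

Under this rule any 1.x release is accepted and a 2.0.0 release is not.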

We deployed this week, so we're probably not going to deploy next week (unless there's something important).

This should definitely be deployed before we move forward with T144103, but otherwise I think it can wait for a regular deployment if that happens within a couple of weeks or so.