
Include redirects in RDF dumps
Closed, Resolved · Public

Description

Wikidata redirects define one Q-id to be an alias for another Q-id. This should be reflected by an owl:sameAs relationship between the respective entity URIs (but not the document URIs). The document URI of the redirect could have rdf:type wikibase:Redirect or something similar.

This is needed so that RDF consumers such as triple stores can correctly handle references to entities that were turned into redirects/aliases.
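As an illustration of the proposed output, here is a minimal sketch that emits the two statements as N-Triples. The entity and document URI prefixes, the function name `redirect_triples`, and the `wikibase:Redirect` class URI are assumptions made for this example; the description above only asks for "wikibase:Redirect or something similar".

```python
# Standard vocabulary URIs.
OWL_SAME_AS = "http://www.w3.org/2002/07/owl#sameAs"
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"

# Assumed prefixes and class URI, for illustration only.
ENTITY_BASE = "http://www.wikidata.org/entity/"
DOCUMENT_BASE = "http://www.wikidata.org/wiki/Special:EntityData/"
WIKIBASE_REDIRECT = "http://wikiba.se/ontology#Redirect"  # hypothetical class


def redirect_triples(source_id, target_id):
    """Return N-Triples lines describing source_id as a redirect to target_id."""
    return [
        # The entity URIs are linked with owl:sameAs.
        f"<{ENTITY_BASE}{source_id}> <{OWL_SAME_AS}> <{ENTITY_BASE}{target_id}> .",
        # Only the document URI of the redirect is typed as a redirect.
        f"<{DOCUMENT_BASE}{source_id}> <{RDF_TYPE}> <{WIKIBASE_REDIRECT}> .",
    ]


for line in redirect_triples("Q2", "Q1"):
    print(line)
```

Note that the owl:sameAs statement connects entity URIs only; the document URI of the target does not appear in either statement.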

NOTE: T69033 calls for redirects to item X to be included in the output for item X. That would imply all redirects would be present in an RDF dump. The straightforward implementation would be very inefficient, though (it would mean looking up redirects for each entity). Instead, we should omit that lookup in dump mode, and process redirects as they appear in the ID stream that drives the dump output.
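The stream-driven approach in the note can be sketched roughly as follows. This is not the merged patch: `dump_lines`, `resolve_redirect`, and `describe_entity` are hypothetical names, and the entity URI prefix is an assumption. The point is that each ID in the stream is handled once, with no per-entity lookup of incoming redirects.

```python
from typing import Callable, Iterable, Iterator, Optional

OWL_SAME_AS = "http://www.w3.org/2002/07/owl#sameAs"
ENTITY_BASE = "http://www.wikidata.org/entity/"  # assumed prefix


def dump_lines(
    id_stream: Iterable[str],
    resolve_redirect: Callable[[str], Optional[str]],
    describe_entity: Callable[[str], Iterable[str]],
) -> Iterator[str]:
    """Yield dump lines, emitting a redirect statement when an ID is a redirect.

    resolve_redirect returns the target ID if the given ID is a redirect,
    otherwise None. describe_entity returns the lines for a regular entity.
    Both are hypothetical callbacks supplied by the caller.
    """
    for entity_id in id_stream:
        target = resolve_redirect(entity_id)
        if target is not None:
            # The ID itself is a redirect: emit owl:sameAs and move on.
            yield f"<{ENTITY_BASE}{entity_id}> <{OWL_SAME_AS}> <{ENTITY_BASE}{target}> ."
        else:
            # Regular entity: emit its data without searching for redirects to it.
            yield from describe_entity(entity_id)


# Example with an in-memory redirect table standing in for the real store.
redirects = {"Q2": "Q1"}
for line in dump_lines(["Q1", "Q2"], redirects.get, lambda e: [f"# data for {e}"]):
    print(line)
```

Because redirect IDs appear in the same ID stream as regular entities, every redirect ends up in the dump exactly once, at the cost of one forward check per ID.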

Event Timeline

daniel created this task. Mar 24 2015, 4:21 PM
daniel raised the priority of this task from to Needs Triage.
daniel updated the task description.
daniel added a project: Wikidata.
daniel added subscribers: daniel, Smalyshev, Denny, mkroetzsch.
Restricted Application added a subscriber: Aklapper. Mar 24 2015, 4:21 PM
Lydia_Pintscher triaged this task as Normal priority. Mar 30 2015, 9:47 AM
Lydia_Pintscher set Security to None.

Isn't this one the same as T69033?

Smalyshev updated the task description. Apr 8 2015, 6:41 AM
daniel added a comment. May 4 2015, 6:21 PM

@Smalyshev: not exactly. T69033 calls for redirects to item X to be included in the output for item X. This ticket asks for all redirects to be present in RDF dumps. Logically, T69033 would imply that, but the straightforward implementation would be very inefficient (it would mean looking up redirects for each entity). Instead, we should omit that lookup in dump mode, and process redirects as they appear in the ID stream that drives the dump output.

I'll edit the description to make the distinction clear.

Change 208716 had a related patch set uploaded (by Daniel Kinzler):
Include redirects in RDF dumps

https://gerrit.wikimedia.org/r/208716

picking this up, since it comes naturally with T96364

daniel claimed this task. May 4 2015, 7:30 PM

patch already done

Change 208716 merged by jenkins-bot:
Include redirects in RDF dumps

https://gerrit.wikimedia.org/r/208716

daniel closed this task as Resolved. May 4 2015, 7:40 PM
daniel moved this task from Backlog to Done on the Wikidata-Sprint-2015-04-21 board.