Fix the munger to support commons RDF dump
Closed, Resolved, Public

Description

When munging the Commons RDF dumps, the process filters out many triples and logs messages such as:

15:03:28.962 [org.wikidata.query.rdf.tool.rdf.AsyncRDFHandler$RDFActionsReplayer] INFO  o.wikidata.query.rdf.tool.rdf.Munger - Unrecognized statement: s:http://commons.wikimedia.org/entity/statement/M51372-16FD5B4C-7B40-4FCC-984C-4DAA9A8D00CA p:http://wikiba.se/ontology#rank o:http://wikiba.se/ontology#NormalRank
15:03:28.962 [org.wikidata.query.rdf.tool.rdf.AsyncRDFHandler$RDFActionsReplayer] INFO  o.wikidata.query.rdf.tool.rdf.Munger - Unrecognized statement: s:http://commons.wikimedia.org/entity/statement/M51372-16FD5B4C-7B40-4FCC-984C-4DAA9A8D00CA p:http://www.wikidata.org/prop/statement/P7482 o:http://www.wikidata.org/entity/Q66458942
15:03:28.962 [org.wikidata.query.rdf.tool.rdf.AsyncRDFHandler$RDFActionsReplayer] INFO  o.wikidata.query.rdf.tool.rdf.Munger - Unrecognized subjects: [http://commons.wikimedia.org/entity/statement/M51376-4B8D8CD4-0783-433F-B0A2-1DD667F8FBAB] while processing http://commons.wikimedia.org/entity/M51376.  Expected only sitelinks and subjects starting with http://commons.wikimedia.org/wiki/Special:EntityData/ and [http://www.wikidata.org/entity/, http://commons.wikimedia.org/entity/]
15:03:28.962 [org.wikidata.query.rdf.tool.rdf.AsyncRDFHandler$RDFActionsReplayer] INFO  o.wikidata.query.rdf.tool.rdf.Munger - Unrecognized statement: s:http://commons.wikimedia.org/entity/statement/M51376-4B8D8CD4-0783-433F-B0A2-1DD667F8FBAB p:http://www.w3.org/1999/02/22-rdf-syntax-ns#type o:http://wikiba.se/ontology#BestRank
15:03:28.962 [org.wikidata.query.rdf.tool.rdf.AsyncRDFHandler$RDFActionsReplayer] INFO  o.wikidata.query.rdf.tool.rdf.Munger - Unrecognized statement: s:http://commons.wikimedia.org/entity/statement/M51376-4B8D8CD4-0783-433F-B0A2-1DD667F8FBAB p:http://wikiba.se/ontology#rank o:http://wikiba.se/ontology#NormalRank
15:03:28.962 [org.wikidata.query.rdf.tool.rdf.AsyncRDFHandler$RDFActionsReplayer] INFO  o.wikidata.query.rdf.tool.rdf.Munger - Unrecognized statement: s:http://commons.wikimedia.org/entity/statement/M51376-4B8D8CD4-0783-433F-B0A2-1DD667F8FBAB p:http://www.wikidata.org/prop/statement/P7482 o:http://www.wikidata.org/entity/Q66458942
15:03:28.962 [org.wikidata.query.rdf.tool.rdf.AsyncRDFHandler$RDFActionsReplayer] INFO  o.wikidata.query.rdf.tool.rdf.Munger - Unrecognized subjects: [http://commons.wikimedia.org/entity/statement/M51389-FE3B5391-E9F2-45E2-B353-84FD0ED8FDC8] while processing http://commons.wikimedia.org/entity/M51389.  Expected only sitelinks and subjects starting with http://commons.wikimedia.org/wiki/Special:EntityData/ and [http://www.wikidata.org/entity/, http://commons.wikimedia.org/entity/]

The munger is run with the following options: -w commons.wikimedia.org -U http://www.wikidata.org --commonsUri http://commons.wikimedia.org
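To get a sense of how many triples are being dropped and of what kind, the log output above can be tallied by subject namespace and predicate. The following is only a rough sketch: the function name summarize_unrecognized and the regular expression are illustrative assumptions based on the log lines quoted in this task, not part of the munger itself.

```python
import re
from collections import Counter

# Matches the "Unrecognized statement" lines as they appear in the log
# excerpt above. The pattern is an assumption derived from those lines.
STATEMENT_RE = re.compile(
    r"Unrecognized statement: s:(?P<s>\S+) p:(?P<p>\S+) o:(?P<o>\S+)"
)


def summarize_unrecognized(lines):
    """Count dropped triples per (subject namespace, predicate) pair."""
    counts = Counter()
    for line in lines:
        match = STATEMENT_RE.search(line)
        if match is None:
            # Other messages (e.g. "Unrecognized subjects") are skipped.
            continue
        subject = match.group("s")
        # Keep everything up to and including the last "/" as the namespace.
        namespace = subject[: subject.rfind("/") + 1]
        counts[(namespace, match.group("p"))] += 1
    return counts
```

Passing an open log file (or any iterable of lines) to summarize_unrecognized returns a Counter; its most_common() method lists the namespace and predicate pairs that are filtered most often.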

Event Timeline

Change 596458 had a related patch set uploaded (by ZPapierski; owner: ZPapierski):
[wikidata/query/rdf@master] Fix for Commons Structured Data uri scheme

https://gerrit.wikimedia.org/r/596458

Change 596458 merged by jenkins-bot:
[wikidata/query/rdf@master] Fix for Commons Structured Data uri scheme

https://gerrit.wikimedia.org/r/596458

The munger now processes the Commons dump correctly.