I am following this excellent tutorial by @Addshore to reset my WDQS after migrating data to a new Wikibase.
Relevant images running: wikibase/wdqs:0.3.10 and wikibase/wikibase:1.30-bundle
The munging throws an error:
#logback.classic pattern: %d{HH:mm:ss.SSS} [%thread] %-5level %logger{36} - %msg%n
23:57:52.108 [main] INFO org.wikidata.query.rdf.tool.Munge - Switching to /tmp/db-dumps/mungedOut/wikidump-000000001.ttl.gz
23:58:02.401 [main] INFO o.wikidata.query.rdf.tool.rdf.Munger - Unrecognized subjects: [https://artbase.rhizome.org/entity/statement/Q4198-2BB92CD9-DB2D-4482-87F1-115C209FE3A9, https://artbase.rhizome.org/prop/statement/value/P109, https://artbase.rhizome.org/entity/statement/Q1623-28C1990A-7E66-442C-8617-37D890C63B30, https://artbase.rhizome.org/prop/statement/value/P107, https://artbase.rhizome.org/prop/statement/value/P108, https://artbase.rhizome.org/prop/statement/value/P101, https://artbase.rhizome.org/prop/statement/value/P102, https://artbase.rhizome.org/entity/statement/ [...]
...then follows a list of each and every triple in my Wikibase as "Unrecognized subject". Finally, the output concludes with...
at org.wikidata.query.rdf.tool.rdf.Munger$MungeOperation.finishCommon(Munger.java:965)
at org.wikidata.query.rdf.tool.rdf.Munger$MungeOperation.munge(Munger.java:493)
at org.wikidata.query.rdf.tool.rdf.Munger.munge(Munger.java:148)
at org.wikidata.query.rdf.tool.rdf.Munger.munge(Munger.java:192)
at org.wikidata.query.rdf.tool.Munge$EntityMungingRdfHandler.munge(Munge.java:255)
at org.wikidata.query.rdf.tool.Munge$EntityMungingRdfHandler.endRDF(Munge.java:243)
at org.wikidata.query.rdf.tool.rdf.DelegatingRdfHandler.endRDF(DelegatingRdfHandler.java:28)
at org.openrdf.rio.turtle.TurtleParser.parse(TurtleParser.java:223)
at org.wikidata.query.rdf.tool.Munge.run(Munge.java:115)
at org.wikidata.query.rdf.tool.Munge.main(Munge.java:76)
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd"><html><head><meta http-equiv="Content-Type" content="text/html;charset=UTF-8"><title>blazegraph™ by SYSTAP</title ></head ><body<p>totalElapsed=188ms, elapsed=65ms, connFlush=0ms, batchResolve=0, whereClause=0ms, deleteClause=0ms, insertClause=0ms</p ><hr><p>COMMIT: totalElapsed=242ms, commitTime=1581813802754, mutationCount=6</p ></html >
Processing wikidump-000000001.ttl.gz
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd"><html><head><meta http-equiv="Content-Type" content="text/html;charset=UTF-8"><title>blazegraph™ by SYSTAP</title ></head ><body<p>totalElapsed=92ms, elapsed=92ms, connFlush=0ms, batchResolve=0, whereClause=0ms, deleteClause=0ms, insertClause=0ms</p ><hr><p>COMMIT: totalElapsed=131ms, commitTime=1581813803605, mutationCount=6</p ></html >
File wikidump-000000002.ttl.gz not found, terminating
The munger creates a wikidump-000000001.ttl.gz that is 535 bytes long.
I suspect something is wrong with my exported TTL file, but it doesn't look problematic to me: all namespaces are defined at the top of the file with the correct base URI, https://artbase.rhizome.org, and the docker-compose file sets the environment variables that make this known to WDQS:
environment:
  - WIKIBASE_SCHEME=https
  - WIKIBASE_HOST=artbase.rhizome.org
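For anyone who wants to reproduce the prefix check I did by eye, here is a minimal Python sketch. It only lists the `@prefix` declarations at the top of a gzipped Turtle dump and splits them by whether they start with a given base URI. The function names are my own, the regular expression only handles simple one-line `@prefix` declarations rather than the full Turtle grammar, and it says nothing about how the munger itself decides what counts as a recognized subject.

```python
import gzip
import re
from itertools import islice

# Matches simple one-line Turtle prefix declarations, e.g.
#   @prefix wd: <https://example.org/entity/> .
# This is an assumption-laden shortcut, not a full Turtle parser.
PREFIX_RE = re.compile(r'^\s*@prefix\s+([A-Za-z0-9_-]*):\s*<([^>]*)>\s*\.')


def read_prefixes(lines):
    """Collect prefix declarations from an iterable of Turtle lines."""
    prefixes = {}
    for line in lines:
        match = PREFIX_RE.match(line)
        if match:
            prefixes[match.group(1)] = match.group(2)
    return prefixes


def split_by_base(prefixes, base_uri):
    """Separate prefixes whose URI starts with base_uri from all others."""
    local = {k: v for k, v in prefixes.items() if v.startswith(base_uri)}
    other = {k: v for k, v in prefixes.items() if not v.startswith(base_uri)}
    return local, other


def read_dump_prefixes(path, max_lines=200):
    """Read prefixes from the first max_lines lines of a .ttl.gz dump."""
    with gzip.open(path, "rt", encoding="utf-8") as handle:
        return read_prefixes(islice(handle, max_lines))
```

Calling `read_dump_prefixes()` on the exported dump and passing the result to `split_by_base()` with `https://artbase.rhizome.org/` shows at a glance which prefixes use the expected scheme and host and which do not (standard vocabularies such as `rdfs` will naturally land in the second group).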
I tried to examine the munger's code to see when it reports "Unrecognized subjects", but the Dockerfile just loads compiled JVM binaries and I don't know where to find the source.