Symptoms:
- The munge.sh script uses all available CPU for hours without producing any output.
Steps to reproduce:
- Install wdqs using wikibase-docker (version 0.3.10)
- docker-compose exec wdqs mkdir -p data/split
- time docker-compose exec wdqs curl -L https://nimiarkisto.fi/dumps/nimiarkisto.fi-CC-BY-4.0_2020-09-09.rdf.bz2 -o data/dump.rdf.bz2
- time docker-compose exec wdqs ./munge.sh -c 50000 -f data/dump.rdf.bz2 -d data/split -l en,fi,sv -s
I have checked that this is not just slowness. With the Wikidata Lexemes dump, the script does write to the log and to the split files. With the Nimiarkisto dump I only get:
root@nimiarkisto-qs:~/nimiarkisto-qs# time docker-compose exec wdqs ./munge.sh -c 5000 -f data/dump.rdf -d data/split -l en,fi,sv -s
#logback.classic pattern: %d{HH:mm:ss.SSS} [%thread] %-5level %logger{36} - %msg%n
10:03:13.441 [main] INFO org.wikidata.query.rdf.tool.Munge - Switching to data/split/wikidump-000000001.ttl.gz
^C
And the file data/split/wikidump-000000001.ttl.gz contains no output.
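For anyone who wants to confirm the empty split file on their own setup, a check along these lines should work. This is only a sketch: gz_payload_bytes is a hypothetical helper name, and it assumes gzip, wc and tr are available on the host and that the file can be streamed out of the container with docker-compose exec -T.

```shell
# Hypothetical helper: print the number of decompressed bytes in a gzip
# stream read from stdin. Prints 0 for an empty or unreadable stream.
gz_payload_bytes() {
  # Decompression errors (e.g. a truncated file) are silenced; wc still
  # reports however many bytes could be decompressed.
  gzip -dc 2>/dev/null | wc -c | tr -d ' '
}

# Assumed usage against the container from this report (-T disables the
# pseudo-TTY so binary data passes through the pipe unchanged):
#   docker-compose exec -T wdqs cat data/split/wikidump-000000001.ttl.gz | gz_payload_bytes
```

A result of 0 means that no statements were written to the split file before the run was interrupted.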
Is it possible to enable more verbose logging to debug this further?