I'm creating this task to record the open questions and issues we have about the Blazegraph implementation during the data-level research stage. I'll update it periodically. If anybody thinks a different form or forum would be better, please say so.
- RDR syntax. This produces an error:
<<entity:Q16 v:P1451 "A mari usque ad mare"@la>> wikibase:Rank wikibase:PreferredRank ; q:P805 entity:Q41423 .
The error message is:
Caused by: java.lang.RuntimeException: Could not load: url=file:///home/smalyshev/dump-3m.fixed.ttlx.gz, cause=org.openrdf.rio.RDFParseException: Illegal language tag char: '>' [line 8287]
What is the correct form?
- Query performance. This query:
SELECT ?h ?date WHERE { ?h wdt:P31 entity:Q5 . ?h wdt:P569 ?date . FILTER NOT EXISTS {?h wdt:P570 ?d } } LIMIT 100
is very slow (minutes) on the 3m-entity dump. The slowness seems to be caused exclusively by FILTER NOT EXISTS; other filters, like FILTER (?date < "00000001880-01-01T00:00:00Z"^^xsd:dateTime), work fine.
Is there some known issue, configuration, or rewrite that would make it fast? I didn't have this problem previously on einsteinium with the full dump, but now the truthy dump causes trouble.
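To compare candidate rewrites, something like the following Python sketch could be used. It holds the original query next to two standard SPARQL 1.1 formulations of the same negation (OPTIONAL with !bound, and MINUS) and times each one. The endpoint URL, the helper names and the timeout are assumptions for illustration, not taken from the setup above. Prefix declarations are left out exactly as in the query above. Whether either variant is actually faster on Blazegraph is untested.

```python
import time
from urllib.parse import urlencode
from urllib.request import Request, urlopen

# ASSUMPTION: SPARQL endpoint of a locally started server; adjust as needed.
ENDPOINT = "http://localhost:9999/bigdata/sparql"

# The original query plus two rewrites that should return the same rows,
# because ?h is shared between the outer pattern and the negated one.
VARIANTS = {
    "not_exists": """SELECT ?h ?date WHERE {
  ?h wdt:P31 entity:Q5 .
  ?h wdt:P569 ?date .
  FILTER NOT EXISTS { ?h wdt:P570 ?d }
} LIMIT 100""",
    "optional_unbound": """SELECT ?h ?date WHERE {
  ?h wdt:P31 entity:Q5 .
  ?h wdt:P569 ?date .
  OPTIONAL { ?h wdt:P570 ?d }
  FILTER (!bound(?d))
} LIMIT 100""",
    "minus": """SELECT ?h ?date WHERE {
  ?h wdt:P31 entity:Q5 .
  ?h wdt:P569 ?date .
  MINUS { ?h wdt:P570 ?d }
} LIMIT 100""",
}


def request_url(query, endpoint=ENDPOINT):
    """Build a GET URL that passes the query in the 'query' parameter."""
    return endpoint + "?" + urlencode({"query": query})


def time_query(query, endpoint=ENDPOINT, timeout=600):
    """Run one query and return the wall-clock seconds until the full reply is read."""
    req = Request(
        request_url(query, endpoint),
        headers={"Accept": "application/sparql-results+json"},
    )
    start = time.monotonic()
    with urlopen(req, timeout=timeout) as resp:
        resp.read()
    return time.monotonic() - start
```

Calling time_query(VARIANTS[name]) for each name gives a rough per-variant timing. Repeating each call a few times would help separate cache warm-up from the real cost.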
- Date types. The DB seems to accept values marked ^^xsd:dateTime fine, but it looks like it treats them as text, not dates, since these filters: FILTER (?date < "1880-01-01"^^xsd:date) or FILTER (?date < "1880"^^xsd:gYear) don't seem to work. We'll probably implement our own date handling eventually, but it'd be nice to understand how it works for now. Also, note the two different forms in which dates are displayed:
<http://wikidata-wdq.testme.wmflabs.org/entity/Q2587183> -3059-01-01T00:00:00.000Z
<http://wikidata-wdq.testme.wmflabs.org/entity/Q3142361> 00000001979-00-00T00:00:00Z
Which is a bit strange.
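As an illustration of why text comparison would matter here, the Python sketch below compares the zero-padded form shown above against an unpadded 1880 cutoff, first as plain strings and then after parsing the year numerically. The date_key helper and the cutoff literal are hypothetical; this says nothing about how Blazegraph actually compares values, only what would happen if they were compared as text.

```python
import re

# Optional sign, a year of any width, then two-digit month and day.
_DATE_RE = re.compile(r"^([+-]?)(\d+)-(\d{2})-(\d{2})T")


def date_key(value):
    """Return (year, month, day) as integers for a dateTime-like string."""
    m = _DATE_RE.match(value)
    if m is None:
        raise ValueError("unrecognized date form: %r" % value)
    sign, year, month, day = m.groups()
    y = int(year)
    if sign == "-":
        y = -y
    return (y, int(month), int(day))


padded = "00000001979-00-00T00:00:00Z"   # form seen for Q3142361
negative = "-3059-01-01T00:00:00.000Z"   # form seen for Q2587183
cutoff = "1880-01-01T00:00:00Z"          # hypothetical unpadded filter value

# As text, the padded 1979 value sorts before 1880 because '0' < '1'.
text_says_earlier = padded < cutoff

# Numerically, 1979 is of course later than 1880.
numeric_says_earlier = date_key(padded) < date_key(cutoff)
```

Under a text comparison the 1979 value would wrongly pass a "before 1880" filter, which is one possible reason the filter value with the padded year behaves differently from the xsd:date and xsd:gYear ones.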
- Extensibility. Right now I'm running the server with:
java -server -Xmx4g -jar bigdata-1.5.0.jar
but the configuration mentions a lot of properties that can be set. How is that done? Also, how would one add functions, extensions, etc. to the jar?
- Backup. If I just copy the .jnl file at an arbitrary moment and restore it later, is that an OK scenario? What about the HA setup?