
Correctly configure WIKIBASE_HOST etc. in order to have the WDQS work with the correct URIs
Closed, Resolved (Public)

Description

Dear community,

We are running each of the Docker containers as a separate "pod" on a Kubernetes cluster. The initial setup behaved just like the local one produced via docker-compose, which, for the Wikidata Query Service (WDQS) and its frontend, means:

  • all URIs are shown as <http://wikibase-svc/entity/Q...> or similar
  • typing wd: and triggering the search via Ctrl+Space works, but the query does not match anything because wd: still stands for the original Wikidata prefix

The goal of the endeavor is to reach the following:

  • all URIs are entered into Blazegraph with our platform's URL, e.g. https://muwi.epfl.ch/entity/Q1;
  • the prefix wd: should designate https://muwi.epfl.ch/entity/;
  • nevertheless, the search functionality needs to work;
  • as in the original WDQS, URIs beginning with the base URL should be displayed with the respective prefixes

Despite having read all the setup tutorials I could find, it is not clear to me how this can be achieved. I've tried plenty of different configurations but got nowhere near the desired result. Here is the configuration that looks most promising to me.

Since wikibase-svc (the name of the Wikibase service) is the part of the URIs that we want to replace, it presumably has to be changed in at least one environment variable. The EnvVars where such a change might apply are (shown with the current configuration):

  • wdqs:
    • WIKIBASE_HOST:wikibase-svc
  • wdqs-updater:
    • WIKIBASE_HOST:muwi.epfl.ch
  • wdqs-frontend:
    • WIKIBASE_HOST:wikibase-svc
  • quickstatements:
    • WIKIBASE_SCHEME_AND_HOST:muwi.epfl.ch
    • WB_PUBLIC_SCHEME_HOST_AND_PORT:muwi.epfl.ch

I should add that wikibase-svc is the service name by which the containers reach the Wikibase inside the cluster. The services don't know which URL is used to access them from outside. One problem seems to be that our URLs are automatically prefixed with https: while http: appears to be hard-coded somewhere. When I create the first item, the log of wdqs-updater shows:

19:22:57.484 [main] INFO  o.w.q.r.t.change.RecentChangesPoller - Got 1 changes, from Q1@2@20200720192253|2 to Q1@2@20200720192253|2
19:22:57.789 [update 0] WARN  org.wikidata.query.rdf.tool.Updater - Contained error syncing.  Giving up on Q1
org.wikidata.query.rdf.tool.rdf.Munger$BadSubjectException: Unrecognized subjects:  [https://muwi.epfl.ch/entity/Q1].  Expected only sitelinks and subjects starting with http://muwi.epfl.ch/wiki/Special:EntityData/ and [http://muwi.epfl.ch/entity/]
        at org.wikidata.query.rdf.tool.rdf.Munger$MungeOperation.finishCommon(Munger.java:941)
        at org.wikidata.query.rdf.tool.rdf.Munger$MungeOperation.munge(Munger.java:493)
        at org.wikidata.query.rdf.tool.rdf.Munger.munge(Munger.java:148)
        at org.wikidata.query.rdf.tool.rdf.Munger.mungeWithValues(Munger.java:182)
        at org.wikidata.query.rdf.tool.Updater.handleChange(Updater.java:421)
        at org.wikidata.query.rdf.tool.Updater.lambda$fetchDataFromWikibaseAndMunge$7(Updater.java:305)
        at java.util.concurrent.FutureTask.run(FutureTask.java:266)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
        at java.lang.Thread.run(Thread.java:748)

When I then create the first property, the log appears to show the same problem:

19:34:12.614 [main] INFO  o.w.q.r.t.change.RecentChangesPoller - Got 1 changes, from P1@3@20200720193412|3 to P1@3@20200720193412|3
19:34:12.804 [update 1] INFO  o.wikidata.query.rdf.tool.rdf.Munger - Unrecognized subjects: [https://muwi.epfl.ch/entity/P1, https://muwi.epfl.ch/prop/statement/P1, https://muwi.epfl.ch/prop/reference/P1, https://muwi.epfl.ch/prop/qualifier/value/P1, https://muwi.epfl.ch/prop/P1, https://muwi.epfl.ch/prop/statement/value/P1, https://muwi.epfl.ch/prop/qualifier/P1, https://muwi.epfl.ch/prop/novalue/P1, https://muwi.epfl.ch/prop/reference/value/P1, https://muwi.epfl.ch/prop/direct/P1] while processing http://muwi.epfl.ch/entity/P1.  Expected only sitelinks and subjects starting with http://muwi.epfl.ch/wiki/Special:EntityData/ and [http://muwi.epfl.ch/entity/]
19:34:12.808 [update 1] INFO  o.wikidata.query.rdf.tool.rdf.Munger - Unrecognized statement: s:https://muwi.epfl.ch/entity/P1 p:http://www.w3.org/1999/02/22-rdf-syntax-ns#type o:http://wikiba.se/ontology#Property
19:34:12.809 [update 1] INFO  o.wikidata.query.rdf.tool.rdf.Munger - Unrecognized statement: s:https://muwi.epfl.ch/entity/P1 p:http://www.w3.org/1999/02/22-rdf-syntax-ns#type o:http://wikiba.se/ontology#Property
19:34:12.809 [update 1] INFO  o.wikidata.query.rdf.tool.rdf.Munger - Unrecognized statement: s:https://muwi.epfl.ch/entity/P1 p:http://wikiba.se/ontology#propertyType o:http://wikiba.se/ontology#WikibaseItem
19:34:12.809 [update 1] INFO  o.wikidata.query.rdf.tool.rdf.Munger - Unrecognized statement: s:https://muwi.epfl.ch/entity/P1 p:http://wikiba.se/ontology#directClaim o:https://muwi.epfl.ch/prop/direct/P1
19:34:12.809 [update 1] INFO  o.wikidata.query.rdf.tool.rdf.Munger - Unrecognized statement: s:https://muwi.epfl.ch/entity/P1 p:http://wikiba.se/ontology#claim o:https://muwi.epfl.ch/prop/P1
19:34:12.809 [update 1] INFO  o.wikidata.query.rdf.tool.rdf.Munger - Unrecognized statement: s:https://muwi.epfl.ch/entity/P1 p:http://wikiba.se/ontology#statementProperty o:https://muwi.epfl.ch/prop/statement/P1
19:34:12.809 [update 1] INFO  o.wikidata.query.rdf.tool.rdf.Munger - Unrecognized statement: s:https://muwi.epfl.ch/entity/P1 p:http://wikiba.se/ontology#statementValue o:https://muwi.epfl.ch/prop/statement/value/P1
19:34:12.809 [update 1] INFO  o.wikidata.query.rdf.tool.rdf.Munger - Unrecognized statement: s:https://muwi.epfl.ch/entity/P1 p:http://wikiba.se/ontology#qualifier o:https://muwi.epfl.ch/prop/qualifier/P1
19:34:12.809 [update 1] INFO  o.wikidata.query.rdf.tool.rdf.Munger - Unrecognized statement: s:https://muwi.epfl.ch/entity/P1 p:http://wikiba.se/ontology#qualifierValue o:https://muwi.epfl.ch/prop/qualifier/value/P1
19:34:12.809 [update 1] INFO  o.wikidata.query.rdf.tool.rdf.Munger - Unrecognized statement: s:https://muwi.epfl.ch/entity/P1 p:http://wikiba.se/ontology#reference o:https://muwi.epfl.ch/prop/reference/P1
19:34:12.809 [update 1] INFO  o.wikidata.query.rdf.tool.rdf.Munger - Unrecognized statement: s:https://muwi.epfl.ch/entity/P1 p:http://wikiba.se/ontology#referenceValue o:https://muwi.epfl.ch/prop/reference/value/P1
19:34:12.810 [update 1] INFO  o.wikidata.query.rdf.tool.rdf.Munger - Unrecognized statement: s:https://muwi.epfl.ch/entity/P1 p:http://wikiba.se/ontology#novalue o:https://muwi.epfl.ch/prop/novalue/P1
19:34:12.810 [update 1] INFO  o.wikidata.query.rdf.tool.rdf.Munger - Unrecognized statement: s:https://muwi.epfl.ch/entity/P1 p:http://www.w3.org/2000/01/rdf-schema#label o:"instance of"@en
19:34:12.810 [update 1] INFO  o.wikidata.query.rdf.tool.rdf.Munger - Unrecognized statement: s:https://muwi.epfl.ch/entity/P1 p:http://www.w3.org/2004/02/skos/core#prefLabel o:"instance of"@en
19:34:12.811 [update 1] INFO  o.wikidata.query.rdf.tool.rdf.Munger - Unrecognized statement: s:https://muwi.epfl.ch/entity/P1 p:http://schema.org/name o:"instance of"@en
19:34:12.811 [update 1] INFO  o.wikidata.query.rdf.tool.rdf.Munger - Unrecognized statement: s:https://muwi.epfl.ch/prop/statement/P1 p:http://www.w3.org/1999/02/22-rdf-syntax-ns#type o:http://www.w3.org/2002/07/owl#ObjectProperty
19:34:12.811 [update 1] INFO  o.wikidata.query.rdf.tool.rdf.Munger - Unrecognized statement: s:https://muwi.epfl.ch/prop/reference/P1 p:http://www.w3.org/1999/02/22-rdf-syntax-ns#type o:http://www.w3.org/2002/07/owl#ObjectProperty
19:34:12.811 [update 1] INFO  o.wikidata.query.rdf.tool.rdf.Munger - Unrecognized statement: s:https://muwi.epfl.ch/prop/qualifier/value/P1 p:http://www.w3.org/1999/02/22-rdf-syntax-ns#type o:http://www.w3.org/2002/07/owl#ObjectProperty
19:34:12.811 [update 1] INFO  o.wikidata.query.rdf.tool.rdf.Munger - Unrecognized statement: s:https://muwi.epfl.ch/prop/P1 p:http://www.w3.org/1999/02/22-rdf-syntax-ns#type o:http://www.w3.org/2002/07/owl#ObjectProperty
19:34:12.811 [update 1] INFO  o.wikidata.query.rdf.tool.rdf.Munger - Unrecognized statement: s:https://muwi.epfl.ch/prop/statement/value/P1 p:http://www.w3.org/1999/02/22-rdf-syntax-ns#type o:http://www.w3.org/2002/07/owl#ObjectProperty
19:34:12.811 [update 1] INFO  o.wikidata.query.rdf.tool.rdf.Munger - More than 20 unrecognized statements, further statements not logged.
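Both log excerpts are consistent with a plain prefix comparison: the subjects arrive with https: while the expected prefixes start with http:. The following minimal sketch only illustrates that mismatch. It is not the actual Munger code, and the function name unrecognized_subjects is hypothetical.

```python
# Illustration only: NOT the real Munger implementation from the WDQS updater.
# The function name and signature are hypothetical.
def unrecognized_subjects(subjects, scheme, host):
    """Return the subjects that start with none of the expected prefixes."""
    expected = (
        f"{scheme}://{host}/wiki/Special:EntityData/",
        f"{scheme}://{host}/entity/",
    )
    # str.startswith accepts a tuple and matches if any prefix fits.
    return [s for s in subjects if not s.startswith(expected)]


subjects = ["https://muwi.epfl.ch/entity/Q1"]

# Expected prefixes built with http: the https subject is rejected.
print(unrecognized_subjects(subjects, "http", "muwi.epfl.ch"))
# Expected prefixes built with https: nothing is rejected.
print(unrecognized_subjects(subjects, "https", "muwi.epfl.ch"))
```

Under this reading, the host part already matches and only the scheme used to build the expected prefixes differs.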

The only triples that can be found in the triplestore afterwards are:

Subject                        Predicate             Object
http://muwi.epfl.ch            schema:dateModified   20 July 2020
http://muwi.epfl.ch/entity/P1  wikibase:timestamp    20 July 2020
http://muwi.epfl.ch/entity/P1  schema:version        3
http://muwi.epfl.ch/entity/P1  schema:dateModified   20 July 2020
http://muwi.epfl.ch/entity/P1  wikibase:statements   0
t5                             rdf:type              owl:Restriction
t5                             owl:onProperty        https://muwi.epfl.ch/prop/direct/P1
t5                             owl:someValuesFrom    owl:Thing

The problem persists when either or both of the two remaining values are changed to muwi.epfl.ch as well:

  • wdqs:
    • WIKIBASE_HOST:muwi.epfl.ch
  • wdqs-updater:
    • WIKIBASE_HOST:muwi.epfl.ch
  • wdqs-frontend:
    • WIKIBASE_HOST:muwi.epfl.ch

Changing the value to one that includes the scheme (the log below shows https://muwi.epfl.ch) makes the wdqs-updater fail with:

wait-for-it.sh: waiting 300 seconds for https://muwi.epfl.ch
nc: bad address 'https'
nc: bad address 'https'

Could someone please offer clarification as to which EnvVars actually need to be changed to the "outside URL" and how we can have WDQS incorporate the correct URLs (be it with http: or https:) at all stages?

Event Timeline

By the way, after creating the snak P1 instance of Q1 class for the item Q2 person, the turtle file Q2.ttl actually looks as it should:

@prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
@prefix xsd: <http://www.w3.org/2001/XMLSchema#> .
@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .
@prefix owl: <http://www.w3.org/2002/07/owl#> .
@prefix wikibase: <http://wikiba.se/ontology#> .
@prefix skos: <http://www.w3.org/2004/02/skos/core#> .
@prefix schema: <http://schema.org/> .
@prefix cc: <http://creativecommons.org/ns#> .
@prefix geo: <http://www.opengis.net/ont/geosparql#> .
@prefix prov: <http://www.w3.org/ns/prov#> .
@prefix v: <https://muwi.epfl.ch/value/> .
@prefix wd: <https://muwi.epfl.ch/entity/> .
@prefix data: <https://muwi.epfl.ch/wiki/Special:EntityData/> .
@prefix s: <https://muwi.epfl.ch/entity/statement/> .
@prefix ref: <https://muwi.epfl.ch/reference/> .
@prefix wdt: <https://muwi.epfl.ch/prop/direct/> .
@prefix wdtn: <https://muwi.epfl.ch/prop/direct-normalized/> .
@prefix p: <https://muwi.epfl.ch/prop/> .
@prefix ps: <https://muwi.epfl.ch/prop/statement/> .
@prefix psv: <https://muwi.epfl.ch/prop/statement/value/> .
@prefix psn: <https://muwi.epfl.ch/prop/statement/value-normalized/> .
@prefix pq: <https://muwi.epfl.ch/prop/qualifier/> .
@prefix pqv: <https://muwi.epfl.ch/prop/qualifier/value/> .
@prefix pqn: <https://muwi.epfl.ch/prop/qualifier/value-normalized/> .
@prefix pr: <https://muwi.epfl.ch/prop/reference/> .
@prefix prv: <https://muwi.epfl.ch/prop/reference/value/> .
@prefix prn: <https://muwi.epfl.ch/prop/reference/value-normalized/> .
@prefix wdno: <https://muwi.epfl.ch/prop/novalue/> .

data:Q2 a schema:Dataset ;
	schema:about wd:Q2 ;
	cc:license <http://creativecommons.org/publicdomain/zero/1.0/> ;
	schema:softwareVersion "1.0.0" ;
	schema:version "5"^^xsd:integer ;
	schema:dateModified "2020-07-20T20:18:17Z"^^xsd:dateTime ;
	wikibase:statements "1"^^xsd:integer ;
	wikibase:identifiers "0"^^xsd:integer ;
	wikibase:sitelinks "0"^^xsd:integer .

wd:Q2 a wikibase:Item ;
	rdfs:label "Person"@en ;
	skos:prefLabel "Person"@en ;
	schema:name "Person"@en ;
	wdt:P1 wd:Q1 ;
	p:P1 s:Q2-63483472-49a6-a906-c236-adf05e5d7d61 .

s:Q2-63483472-49a6-a906-c236-adf05e5d7d61 a wikibase:Statement,
		wikibase:BestRank ;
	wikibase:rank wikibase:NormalRank ;
	ps:P1 wd:Q1 .

wd:Q1 a wikibase:Item ;
	rdfs:label "Class"@en ;
	skos:prefLabel "Class"@en ;
	schema:name "Class"@en .

wd:P1 a wikibase:Property,
		wikibase:Property ;
	wikibase:propertyType <http://wikiba.se/ontology#WikibaseItem> ;
	wikibase:directClaim wdt:P1 ;
	wikibase:claim p:P1 ;
	wikibase:statementProperty ps:P1 ;
	wikibase:statementValue psv:P1 ;
	wikibase:qualifier pq:P1 ;
	wikibase:qualifierValue pqv:P1 ;
	wikibase:reference pr:P1 ;
	wikibase:referenceValue prv:P1 ;
	wikibase:novalue wdno:P1 .

p:P1 a owl:ObjectProperty .

psv:P1 a owl:ObjectProperty .

pqv:P1 a owl:ObjectProperty .

prv:P1 a owl:ObjectProperty .

wdt:P1 a owl:ObjectProperty .

ps:P1 a owl:ObjectProperty .

pq:P1 a owl:ObjectProperty .

pr:P1 a owl:ObjectProperty .

wdno:P1 a owl:Class ;
	owl:complementOf _:genid1 .

_:genid1 a owl:Restriction ;
	owl:onProperty wdt:P1 ;
	owl:someValuesFrom owl:Thing .

wd:P1 rdfs:label "instance of"@en ;
	skos:prefLabel "instance of"@en ;
	schema:name "instance of"@en .

Johentsch claimed this task.

It was very easy to fix in the end:

The envVar WIKIBASE_SCHEME needs to be set to 'https' not only for wdqs but also for the wdqs-updater.

That was a bit too obvious...
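For anyone hitting the same problem, here is a small sketch of a sanity check for such a setup. It assumes that a service falls back to http when WIKIBASE_SCHEME is unset (which would match the logs above) and that WIKIBASE_HOST is muwi.epfl.ch for both services. The helper names and the "before" configuration are hypothetical.

```python
# Sketch only. Assumption: a service without WIKIBASE_SCHEME falls back to
# "http". Helper names and the example configurations are hypothetical.
def entity_prefix(env):
    """Build the entity URI prefix a service would expect from its EnvVars."""
    scheme = env.get("WIKIBASE_SCHEME", "http")
    return f"{scheme}://{env['WIKIBASE_HOST']}/entity/"


def services_with_wrong_prefix(services, wikibase_prefix):
    """Return the services whose expected prefix differs from the Wikibase's."""
    return [
        name
        for name, env in services.items()
        if entity_prefix(env) != wikibase_prefix
    ]


# The wd: prefix as it appears in the Q2.ttl dump.
WIKIBASE_PREFIX = "https://muwi.epfl.ch/entity/"

before = {
    "wdqs": {"WIKIBASE_HOST": "muwi.epfl.ch", "WIKIBASE_SCHEME": "https"},
    "wdqs-updater": {"WIKIBASE_HOST": "muwi.epfl.ch"},
}
# Apply the fix: set WIKIBASE_SCHEME to https for every service.
after = {name: {**env, "WIKIBASE_SCHEME": "https"} for name, env in before.items()}

print(services_with_wrong_prefix(before, WIKIBASE_PREFIX))
print(services_with_wrong_prefix(after, WIKIBASE_PREFIX))
```

The first call lists wdqs-updater as mismatched. After the scheme is set for both services, the list is empty.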