Page MenuHomePhabricator

BUG REPORT: WDQS_Updater issue when Docker containers restart in Example (and likely otherwise)
Open, In Progress, LowPublic2 Estimated Story Points

Description

Currently: When the Example Docker containers are stopped and restarted without adding or resetting its data volumes, the following error appears and the container stops running:

12:55:33.443 [main] INFO  org.wikidata.query.rdf.tool.Update - Starting Updater 0.3.135 (993993ebef2c4c06d98aaa24767edef029bee287)
12:55:33.803 [main] INFO  o.w.q.r.t.change.ChangeSourceContext - Checking where we left off
12:55:33.803 [main] INFO  o.w.query.rdf.tool.rdf.RdfRepository - Checking for left off time from the updater
12:55:33.897 [main] INFO  o.w.query.rdf.tool.rdf.RdfRepository - Found left off time from the updater
12:55:33.897 [main] ERROR org.wikidata.query.rdf.tool.Update - Error during initialization.
java.lang.IllegalStateException: RDF store reports the last update time is before the minimum safe poll time.  You will have to reload from scratch or you might have missing data.
	at org.wikidata.query.rdf.tool.change.ChangeSourceContext.getStartTime(ChangeSourceContext.java:100)
	at org.wikidata.query.rdf.tool.Update.initialize(Update.java:144)
	at org.wikidata.query.rdf.tool.Update.main(Update.java:97)`

Goal: Find the root cause using the Example configuration, determine whether it is a general issue, and fix it. There have been reports of this same issue in the past in this ticket: https://phabricator.wikimedia.org/T182394. It was evidently resolved at that time but this looks to be possibly a regression.

Acceptance Criteria:

  • WDQS service remains stable and data accessible / uncorrupted upon container restarts in Example
  • WDQS service remains stable and data accessible / uncorrupted upon container restarts in general use

To Reproduce Issue:

Start and stop Example and start again without any data.

You should see this error appear, and the service will stop.

Event Timeline

lojo_wmde renamed this task from WDQS issue when Docker containers restart in Example (and likely otherwise) to WDQS_Updater issue when Docker containers restart in Example (and likely otherwise).Jan 3 2024, 1:00 PM
lojo_wmde updated the task description. (Show Details)
lojo_wmde updated the task description. (Show Details)

This part of our old documentation might be helpful as well.

https://github.com/wmde/wikibase-release-pipeline/blob/135f5d5d684d3a72c9f3b58e9d72c38a7469c086/docs/topics/test-system.md#updating
If the query service updater is restarting, it is likely due to updates not having happened in the past month.

Hi @roti_WMDE,

We are trying to set up local wiki instance for our institute but facing the updater restart issue.

Followed the instructions on https://github.com/wmde/wikibase-release-pipeline/blob/135f5d5d684d3a72c9f3b58e9d72c38a7469c086/docs/topics/test-system.md#updating to fix the restart issue: stopping updater, running with the current date, exit when "Sleeping for 10 secs" and restart updater
However, updater keeps restarting every minute. Can you please help? We installed https://www.mediawiki.org/wiki/Wikibase/Docker wmde 20. Thanks!

Dear @Padmalc,

Let me point you to the discussion in T352975, if you have not seen it already. I think the suggestions in there might be helpful for your case as well.

Especially:

[...] it could be interesting to experiment with the --start parameter for wdqs' runUpdate.sh script, if you have not tried that yet.
From the directory of your docker compose files (what is the example directory in the wikibase release pipeline):

# stop the currently running wdqs-updater service
docker compose -f docker-compose.yml -f docker-compose.extra.yml --env-file YOUR_ENV_FILE.env stop wdqs-updater

# start a manual update taking in all the changes since Jan 1, 2000
docker compose -f docker-compose.yml -f docker-compose.extra.yml --env-file YOUR_ENV_FILE.env run --rm wdqs-updater bash -c 
  '/wdqs/runUpdate.sh -h http://"$WDQS_HOST":"$WDQS_PORT" -- --wikibaseUrl "$WIKIBASE_SCHEME"://"$WIKIBASE_HOST" --conceptUri "$WIKIBASE_SCHEME"://"$WIKIBASE_HOST" --entityNamespaces "120,122" --init --start 20000101000000'

# start the wdqs-updater service again
docker compose -f docker-compose.yml -f docker-compose.extra.yml --env-file YOUR_ENV_FILE.env start wdqs-updater

[...]

Please let me know if that helped in your situation. If not, I would be very interested in some of the logs you get, especially from WDQS and MediaWiki.

Best,
Robert

roti_WMDE changed the task status from Open to In Progress.May 30 2024, 2:05 PM
roti_WMDE claimed this task.
darthmon_wmde renamed this task from WDQS_Updater issue when Docker containers restart in Example (and likely otherwise) to BUG REPORT: WDQS_Updater issue when Docker containers restart in Example (and likely otherwise).Jun 25 2024, 1:51 PM

Hi Robert,

Following up on your suggestions:

docker compose -f docker-compose.yml -f docker-compose.extra.yml --env-file .env stop wdqs-updater

docker compose -f docker-compose.yml -f docker-compose.extra.yml --env-file .env run --rm wdqs-updater bash -c '/wdqs/runUpdate.sh -h http://"testing.vtdataservices.org":"9999" -- --
wikibaseUrl "http"://"testing.vtdataservices.org" --conceptUri "http"://"testing.vtdataservices.org" --entityNamespaces "120,122" --init --start 20000101000000'

Gave us the following exception:

  • Creating 3/0 ✔ Container wbdocker-wdqs-1 Running 0.0s ✔ Container wbdocker-mysql-1 Running 0.0s ✔ Container wbdocker-wikibase-1 Running 0.0s

Updating via http://localhost:9999/bigdata/namespace/wdq/sparql
#logback.classic pattern: %d{HH:mm:ss.SSS} [%thread] %-5level %logger{36} - %msg%n
16:10:11.618 [main] INFO org.wikidata.query.rdf.tool.Update - Starting Updater 0.3.137 (1ac054ca5bfa49a6f743c581decf5c9be649f455)
16:10:12.356 [main] INFO o.w.q.r.t.change.ChangeSourceContext - Checking where we left off
16:10:12.357 [main] INFO o.w.query.rdf.tool.rdf.RdfRepository - Checking for left off time from the updater
16:10:12.436 [main] INFO o.w.query.rdf.tool.HttpClientUtils - HTTP request failed: java.util.concurrent.ExecutionException: java.net.ConnectException: Connection refused, attempt 1, will retry
16:10:14.439 [main] INFO o.w.query.rdf.tool.HttpClientUtils - HTTP request failed: java.util.concurrent.ExecutionException: java.net.ConnectException: Connection refused, attempt 2, will retry
16:10:18.442 [main] INFO o.w.query.rdf.tool.HttpClientUtils - HTTP request failed: java.util.concurrent.ExecutionException: java.net.ConnectException: Connection refused, attempt 3, will retry
16:10:26.445 [main] INFO o.w.query.rdf.tool.HttpClientUtils - HTTP request failed: java.util.concurrent.ExecutionException: java.net.ConnectException: Connection refused, attempt 4, will retry
16:10:36.448 [main] INFO o.w.query.rdf.tool.HttpClientUtils - HTTP request failed: java.util.concurrent.ExecutionException: java.net.ConnectException: Connection refused, attempt 5, will retry
16:10:46.452 [main] INFO o.w.query.rdf.tool.HttpClientUtils - HTTP request failed: java.util.concurrent.ExecutionException: java.net.ConnectException: Connection refused, attempt 6, will fail
16:10:46.452 [main] ERROR org.wikidata.query.rdf.tool.Update - Error during initialization.
org.wikidata.query.rdf.tool.exception.FatalException: Error accessing triple store

at org.wikidata.query.rdf.tool.rdf.client.RdfClient.execute(RdfClient.java:136)
at org.wikidata.query.rdf.tool.rdf.client.RdfClient.query(RdfClient.java:78)
at org.wikidata.query.rdf.tool.rdf.RdfRepository.dateFromQuery(RdfRepository.java:417)
at org.wikidata.query.rdf.tool.rdf.RdfRepository.fetchLeftOffTime(RdfRepository.java:386)
at org.wikidata.query.rdf.tool.change.ChangeSourceContext.getStartTime(ChangeSourceContext.java:92)
at org.wikidata.query.rdf.tool.Update.initialize(Update.java:144)
at org.wikidata.query.rdf.tool.Update.main(Update.java:97)

Caused by: com.github.rholder.retry.RetryException: Retrying failed to complete successfully after 6 attempts.

at com.github.rholder.retry.Retryer.call(Retryer.java:174)
at org.wikidata.query.rdf.tool.rdf.client.RdfClient.execute(RdfClient.java:125)
... 6 common frames omitted

Caused by: java.util.concurrent.ExecutionException: java.net.ConnectException: Connection refused

at org.eclipse.jetty.client.util.FutureResponseListener.getResult(FutureResponseListener.java:118)
at org.eclipse.jetty.client.util.FutureResponseListener.get(FutureResponseListener.java:101)
at org.eclipse.jetty.client.HttpRequest.send(HttpRequest.java:684)
at org.wikidata.query.rdf.tool.rdf.client.RdfClient.lambda$execute$0(RdfClient.java:126)
at com.github.rholder.retry.AttemptTimeLimiters$NoAttemptTimeLimit.call(AttemptTimeLimiters.java:78)
at com.github.rholder.retry.Retryer.call(Retryer.java:160)
... 7 common frames omitted

Caused by: java.net.ConnectException: Connection refused

at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:716)
at org.eclipse.jetty.io.SelectorManager.doFinishConnect(SelectorManager.java:355)
at org.eclipse.jetty.io.ManagedSelector.processConnect(ManagedSelector.java:207)
at org.eclipse.jetty.io.ManagedSelector.access$1400(ManagedSelector.java:61)
at org.eclipse.jetty.io.ManagedSelector$SelectorProducer.processSelected(ManagedSelector.java:485)
at org.eclipse.jetty.io.ManagedSelector$SelectorProducer.produce(ManagedSelector.java:351)
at org.eclipse.jetty.util.thread.strategy.EatWhatYouKill.produceTask(EatWhatYouKill.java:357)
at org.eclipse.jetty.util.thread.strategy.EatWhatYouKill.doProduce(EatWhatYouKill.java:181)
at org.eclipse.jetty.util.thread.strategy.EatWhatYouKill.tryProduce(EatWhatYouKill.java:168)
at org.eclipse.jetty.util.thread.strategy.EatWhatYouKill.produce(EatWhatYouKill.java:132)
at org.eclipse.jetty.util.thread.QueuedThreadPool.runJob(QueuedThreadPool.java:765)
at org.eclipse.jetty.util.thread.QueuedThreadPool$2.run(QueuedThreadPool.java:683)
at java.lang.Thread.run(Thread.java:750)

Exception in thread "main" org.wikidata.query.rdf.tool.exception.FatalException: Error accessing triple store

at org.wikidata.query.rdf.tool.rdf.client.RdfClient.execute(RdfClient.java:136)
at org.wikidata.query.rdf.tool.rdf.client.RdfClient.query(RdfClient.java:78)
at org.wikidata.query.rdf.tool.rdf.RdfRepository.dateFromQuery(RdfRepository.java:417)
at org.wikidata.query.rdf.tool.rdf.RdfRepository.fetchLeftOffTime(RdfRepository.java:386)
at org.wikidata.query.rdf.tool.change.ChangeSourceContext.getStartTime(ChangeSourceContext.java:92)
at org.wikidata.query.rdf.tool.Update.initialize(Update.java:144)
at org.wikidata.query.rdf.tool.Update.main(Update.java:97)

Caused by: com.github.rholder.retry.RetryException: Retrying failed to complete successfully after 6 attempts.

at com.github.rholder.retry.Retryer.call(Retryer.java:174)
at org.wikidata.query.rdf.tool.rdf.client.RdfClient.execute(RdfClient.java:125)
... 6 more

Caused by: java.util.concurrent.ExecutionException: java.net.ConnectException: Connection refused

at org.eclipse.jetty.client.util.FutureResponseListener.getResult(FutureResponseListener.java:118)
at org.eclipse.jetty.client.util.FutureResponseListener.get(FutureResponseListener.java:101)
at org.eclipse.jetty.client.HttpRequest.send(HttpRequest.java:684)
at org.wikidata.query.rdf.tool.rdf.client.RdfClient.lambda$execute$0(RdfClient.java:126)
at com.github.rholder.retry.AttemptTimeLimiters$NoAttemptTimeLimit.call(AttemptTimeLimiters.java:78)
at com.github.rholder.retry.Retryer.call(Retryer.java:160)
... 7 more

Caused by: java.net.ConnectException: Connection refused

at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:716)
at org.eclipse.jetty.io.SelectorManager.doFinishConnect(SelectorManager.java:355)
at org.eclipse.jetty.io.ManagedSelector.processConnect(ManagedSelector.java:207)
at org.eclipse.jetty.io.ManagedSelector.access$1400(ManagedSelector.java:61)
at org.eclipse.jetty.io.ManagedSelector$SelectorProducer.processSelected(ManagedSelector.java:485)
at org.eclipse.jetty.io.ManagedSelector$SelectorProducer.produce(ManagedSelector.java:351)
at org.eclipse.jetty.util.thread.strategy.EatWhatYouKill.produceTask(EatWhatYouKill.java:357)
at org.eclipse.jetty.util.thread.strategy.EatWhatYouKill.doProduce(EatWhatYouKill.java:181)
at org.eclipse.jetty.util.thread.strategy.EatWhatYouKill.tryProduce(EatWhatYouKill.java:168)
at org.eclipse.jetty.util.thread.strategy.EatWhatYouKill.produce(EatWhatYouKill.java:132)
at org.eclipse.jetty.util.thread.QueuedThreadPool.runJob(QueuedThreadPool.java:765)
at org.eclipse.jetty.util.thread.QueuedThreadPool$2.run(QueuedThreadPool.java:683)
at java.lang.Thread.run(Thread.java:750)

Do you happen to know what this means? Are we going wrong somewhere? We did create a test item on this instance.
We also tried to enter the container following the suggestion of playing with --start parameter, but we could not enter this container to play with it since it keeps restarting

Thanks for all your suggestions and support.

Padma

I am just updating progress on this. The following seems to have worked for us:

Following Addshore suggestions on https://phabricator.wikimedia.org/T186161, and combining it with your suggestions: https://phabricator.wikimedia.org/T354266 above:

Ran an empty blazegraph container: wdqs-updater-custom, with the same image as the existing wdqs container 'wdqs:0.3.137-wmde.20':

docker run -d -p 9999:9999 --env WIKIBASE_SCHEME=https --env WIKIBASE_HOST=testing.vtdataservices.org --env WDQS_HOST=testing.vtdataservices.org --env WDQS_PORT=9999 --name demo-wdqs wikibase/wdqs:0.3.137-wmde.20 ./runBlazegraph.sh

Waited for the service to come up and made sure its empty. Check:

curl "localhost:9999/bigdata/sparql?query=SELECT%20%2A%20WHERE%20%7B%3Fa%20%3Fb%20%3Fc%7D"

Ran updater once pointing to some wikibase, and the query service:

docker exec demo-wdqs ./runUpdate.sh

Check the updater status:

curl "localhost:9999/bigdata/sparql?query=SELECT%20%2A%20WHERE%20%7B%3Fa%20%3Fb%20%3Fc%7D"

Reset the updater date:

docker exec demo-wdqs bash -c './runUpdate.sh -h http://testing.vtdataservices.org:9999 -- --wikibaseUrl http://testing.vtdataservices.org --conceptUri http://testing.vtdataservices.org --entityNamespaces 120,122 --init --start 20000101000000'

Check if the date is updated:

curl "localhost:9999/bigdata/sparql?query=SELECT%20%2A%20WHERE%20%7B%3Fa%20%3Fb%20%3Fc%7D"

We are thinking wbdocker-updater-custom is now replacing wbdocker-updater-1 with no restart issues. The new container stays even after restarting docker images: 'docker compose -f docker-compose.yml -f docker-compose.extra.yml up -d'

Thanks for all your help!!

Hi @Padmalc,

Thanks a lot for reporting back. Glad the problem was solved for you!

To my understanding the issue still exists when WDQS starts with an empty database. I just updated the docs a bit in order to hopefully provide some more clarity. https://github.com/wmde/wikibase-release-pipeline/pull/759

The actual fix is TBD.

Thanks for looking into this and updating the docs for more clarity, this is helpful!

Padma

Leif_WMDE set the point value for this task to 2.Jul 11 2025, 12:17 PM
Leif_WMDE raised the priority of this task from Low to Medium.Sep 15 2025, 6:30 AM
Leif_WMDE lowered the priority of this task from Medium to Low.Oct 23 2025, 8:36 AM