
Prepare wdqs1009 to run the streaming updater
Closed, ResolvedPublic

Description

RDF data is flowing in the wdqs_streaming_updater_test topic on the jumbo cluster.
We should prepare wdqs1009 to read and apply this RDF stream:

  1. Increase retention period to 31 days on wdqs_streaming_updater_test@kafka-jumbo
  2. Update vocabulary version in RWStore.properties to support the skolem prefix URI (https://gerrit.wikimedia.org/r/c/operations/puppet/+/605536)
  3. Import the RDF dumps with the --skolemize option (lexemes from June 20 and entities from June 22, dumps already downloaded to wdqs1009:/srv/wdqs)
  4. (while importing or just before) Merge https://gerrit.wikimedia.org/r/c/operations/puppet/+/597790 and make sure the wdqs-updater service is switched to the streaming updater so that the data-reload cookbook does not restart the old one

Event Timeline

dcausse created this task. Jun 15 2020, 8:38 AM
Restricted Application added a project: Wikidata. Jun 15 2020, 8:38 AM
Restricted Application added a subscriber: Aklapper.

Change 605536 had a related patch set uploaded (by DCausse; owner: DCausse):
[operations/puppet@production] [wdqs] bump vocabulary and inline URI handler version

https://gerrit.wikimedia.org/r/605536

@Ottomata could we increase the retention period on the topic wdqs_streaming_updater_test on the jumbo cluster?
We need extra time because this is the first time we are assembling all the components together, and we may not manage to get everything started within the default 7 days.

Addshore moved this task from incoming to monitoring on the Wikidata board. Jun 15 2020, 1:25 PM

Done:

kafka configs --alter --entity-type topics --entity-name wdqs_streaming_updater_test --add-config retention.ms=2678400000
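For reference, the retention.ms value in the command above is 31 days expressed in milliseconds. A minimal Python sketch of the arithmetic follows; the helper name retention_ms is purely illustrative and is not part of any Kafka tooling.

```python
def retention_ms(days: int) -> int:
    """Convert a retention period in days to milliseconds,
    the unit used by the Kafka topic setting retention.ms."""
    return days * 24 * 60 * 60 * 1000


# 31 days, matching the value passed to --add-config
print(retention_ms(31))  # 2678400000

# 7 days, the default retention mentioned in the request
print(retention_ms(7))   # 604800000
```

The same helper can be used to double-check any other retention period before altering a topic.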
dcausse updated the task description. Jun 29 2020, 9:08 PM
dcausse updated the task description.
dcausse assigned this task to RKemper. Jun 30 2020, 7:29 AM
dcausse triaged this task as Medium priority.
dcausse updated the task description. Jul 22 2020, 8:41 AM

Change 605536 merged by Gehel:
[operations/puppet@production] [wdqs] bump vocabulary and inline URI handler version

https://gerrit.wikimedia.org/r/605536

Gehel updated the task description. Jul 22 2020, 9:37 AM
Gehel added a subscriber: Gehel. Oct 8 2020, 8:09 AM

Now that the SSD has been replaced in wdqs1009 (T263125), we can restart this data import.

Change 633182 had a related patch set uploaded (by Gehel; owner: Gehel):
[operations/cookbooks@master] wdqs: don't fail if journal does not exist before data reload

https://gerrit.wikimedia.org/r/633182

Change 633182 merged by Gehel:
[operations/cookbooks@master] wdqs: don't fail if journal does not exist before data reload

https://gerrit.wikimedia.org/r/633182

Mentioned in SAL (#wikimedia-operations) [2020-10-20T09:59:35Z] <dcausse> T255399: resuming wdqs-data-reload manually from chunk no 776 on wdqs1009

Change 636033 had a related patch set uploaded (by DCausse; owner: DCausse):
[wikidata/query/rdf@master] Fix StreamingUpdate package name after refactoring

https://gerrit.wikimedia.org/r/636033

Change 636034 had a related patch set uploaded (by DCausse; owner: DCausse):
[operations/puppet@production] [wdqs] fix StreamingUpdate package name after refactoring

https://gerrit.wikimedia.org/r/636034

Change 636033 merged by jenkins-bot:
[wikidata/query/rdf@master] Fix StreamingUpdate package name after refactoring

https://gerrit.wikimedia.org/r/636033

Change 636034 merged by Gehel:
[operations/puppet@production] [wdqs] fix StreamingUpdate package name after refactoring

https://gerrit.wikimedia.org/r/636034

Change 636432 had a related patch set uploaded (by DCausse; owner: DCausse):
[operations/puppet@production] [wdqs] add support for streaming updater lag metric

https://gerrit.wikimedia.org/r/636432

Change 636432 merged by Ryan Kemper:
[operations/puppet@production] [wdqs] add support for streaming updater lag metric

https://gerrit.wikimedia.org/r/636432

We need to do another reload, but not under this ticket.

Gehel closed this task as Resolved. Mon, Nov 23, 1:15 PM