Provide real-time updates for WCQS
Closed, Resolved · Public · 8 Estimated Story Points

Description

As a user of WCQS, I want real-time updates to WCQS so that I can see changes soon after they are made, without the need for weekly downtime.

The proposed solution is to use the Streaming Updater with WCQS. This would require:

  • some changes to the Streaming Updater consumer (mostly around stream filtering)
  • probably only a configuration change for the producer

Out of scope:

  • publicly available patch stream for SDC content (beta lives on Horizon, with no access to the Jumbo Kafka cluster)

My suggestion would be to parameterize the first two points and run a separate pipeline instance for SDC changes.
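A minimal sketch of what a parameterized stream filter with one pipeline instance per data source could look like. This is illustrative only: the event shape, the `PipelineConfig` fields, and the use of a domain as the filter key are all assumptions, not the Streaming Updater's actual schema or configuration.

```python
from dataclasses import dataclass
from typing import Iterable, Iterator


@dataclass(frozen=True)
class ChangeEvent:
    # Hypothetical event shape; the real stream schema is not described here.
    domain: str
    entity_id: str
    revision: int


@dataclass(frozen=True)
class PipelineConfig:
    # Hypothetical per-instance parameters for one pipeline.
    name: str
    accepted_domain: str


def filter_events(
    events: Iterable[ChangeEvent], config: PipelineConfig
) -> Iterator[ChangeEvent]:
    """Yield only the events that belong to this pipeline instance."""
    for event in events:
        if event.domain == config.accepted_domain:
            yield event


# Two instances of the same pipeline code, differing only in configuration.
# The filter values are assumptions chosen for illustration.
WIKIDATA = PipelineConfig(name="wikidata", accepted_domain="www.wikidata.org")
COMMONS = PipelineConfig(name="commons", accepted_domain="commons.wikimedia.org")
```

With this shape, the SDC instance would be the same code started with the `COMMONS` configuration, so the filtering logic itself is not duplicated.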

Note that this would still be part of the beta service - Streaming Updater is not yet ready for production deployment (and neither is WCQS).

AC:

  • WCQS is updated with SDC changes as they happen (with minimal lag)
  • Data updates generate no downtime

Event Timeline

Restricted Application added a subscriber: Aklapper.

If we go with the solution of having an additional pipeline for SDC, https://phabricator.wikimedia.org/T262020 should be done first.

Just to be my own devil's advocate, or to provide alternatives: we can solve both downtime and real-time updates with the old updater. Additionally, we can eliminate the downtime by having two Blazegraph instances in an active/standby setup.

Additionally, we can eliminate the downtime by having two blazegraph instances in an active/standby setup.

Or 2 namespaces and an alias map, as we do with categories.
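A rough sketch of the two-namespaces-plus-alias idea, assuming an in-memory alias map. The class, the function, and the namespace names used with them are hypothetical and do not reflect how the categories setup is actually implemented.

```python
from typing import Callable, Dict, Sequence


class NamespaceAliases:
    """Maps a stable alias to whichever concrete namespace is currently live."""

    def __init__(self) -> None:
        self._aliases: Dict[str, str] = {}

    def resolve(self, alias: str) -> str:
        return self._aliases[alias]

    def point(self, alias: str, namespace: str) -> None:
        self._aliases[alias] = namespace


def reload_without_downtime(
    aliases: NamespaceAliases,
    alias: str,
    namespaces: Sequence[str],
    load: Callable[[str], None],
) -> str:
    """Load fresh data into the idle namespace, then switch the alias to it."""
    live = aliases.resolve(alias)
    # Pick the namespace that is not currently serving queries.
    idle = next(ns for ns in namespaces if ns != live)
    load(idle)  # queries keep resolving to `live` while the load runs
    aliases.point(alias, idle)  # the switch happens only after the load finishes
    return idle
```

Queries always go through the alias, so the reload happens on the namespace nobody is reading from and the only user-visible step is the final switch.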

CBogen triaged this task as Medium priority. · Sep 14 2020, 3:31 PM
CBogen moved this task from Incoming to Scaling on the Wikidata-Query-Service board.
Gehel raised the priority of this task from Medium to High. · Sep 15 2020, 7:43 AM
MPhamWMF set the point value for this task to 8. · May 3 2021, 3:26 PM

Change 751171 had a related patch set uploaded (by DCausse; author: DCausse):

[operations/deployment-charts@master] rdf-streaming-updater: increase capacity for commons

https://gerrit.wikimedia.org/r/751171

Change 751171 merged by jenkins-bot:

[operations/deployment-charts@master] rdf-streaming-updater: increase capacity for commons

https://gerrit.wikimedia.org/r/751171

Change 753972 had a related patch set uploaded (by ZPapierski; author: ZPapierski):

[wikidata/query/rdf@master] Allow providing updater config path

https://gerrit.wikimedia.org/r/753972

Change 753972 merged by jenkins-bot:

[wikidata/query/rdf@master] Allow providing updater config path

https://gerrit.wikimedia.org/r/753972

AFAICT updates are being applied: consumer positions are incrementing for all WCQS hosts as expected, and a query for a random, recently added triple returned it from wcqs1001.