
🟡 Devise mechanism to play back all entities on a wiki into Queryservice batches in parallel to default operation
Closed, ResolvedPublic

Description

The Queryservice updater has seen a lot of changes and outages over the last year. This means we cannot be sure all updates on every wiki made it into the respective query service. To rule out the possibility of "this probably was some glitch a few months ago", we'd like to play back all entities that ever existed into the queryservice.

This should happen in parallel to normal operation so that it does not block recent updates being made.

Event Timeline

When thinking about how to implement this, the "trouble" for me is that we'd need an additional table for storing the batches that are being processed in parallel.

Theoretically, we'd need a second API deployment, talking to a different database than the "proper" one, which is in turn connected to a second queryservice deployment.

Fring renamed this task from 🟡 Devise mechanism to play back Queryservice batches from any point in time in parallel to default operation to 🟡 Devise mechanism to play back all entities on a wiki into Queryservice batches in parallel to default operation. Edited Nov 29 2023, 10:18 AM
Fring updated the task description.

I would assume the building blocks needed for this are:

  • a second deployment for the queryservice-updater
  • an application that can (this could either be added to api or as a standalone app):
    • poll all known wikis for all entities and batch them accordingly
    • mimic the /getBatches endpoint from api, serving these batches until they're done

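As a rough sketch of the second building block, the batch-serving app could keep an in-memory queue of entity-id batches and hand them out until playback is done. Only the /getBatches endpoint name comes from the task; the class, field names, and batch shape below are assumptions for illustration.

```python
# Hypothetical sketch of the batch-serving side of a /getBatches mimic.
# The batch shape (lists of entity ids) and all names are assumptions.
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class BatchStore:
    """In-memory queue of entity-id batches waiting to be served to updaters."""
    pending: List[List[str]] = field(default_factory=list)

    def get_batch(self) -> Optional[List[str]]:
        # Mimic /getBatches: hand out the next unprocessed batch,
        # or None once the playback is complete.
        if not self.pending:
            return None
        return self.pending.pop(0)


store = BatchStore(pending=[["Q1", "Q2"], ["Q3", "Q4"]])
print(store.get_batch())  # -> ['Q1', 'Q2']
```

A real implementation would of course need persistence and retry handling so a crashed updater does not lose its batch, which is part of why a separate table (or database) is discussed above.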
Ideas from the refinement on how to build this:

  • do not update from a RDF dump as this creates possible races
  • instead, load a JSON dump from a wiki and transform this into a list of entity ids
  • this list is then chunked into sensible batches
  • for each batch, a Kubernetes job processing the passed batch is created
  • the Kubernetes jobs run in a dedicated namespace so we are in control of how many resources they consume

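The dump-to-batches steps above could be sketched as follows. This assumes the Wikibase JSON dump layout (one entity object per line inside a JSON array) and a batch size chosen arbitrarily; the function names are made up for illustration.

```python
# Sketch: turn a JSON dump into a list of entity ids, then chunk into batches.
# Assumes a dump with one JSON entity object per line (array brackets and
# trailing commas are skipped); real dumps may need streaming instead.
import json
from typing import Iterable, Iterator, List


def entity_ids_from_dump(lines: Iterable[str]) -> Iterator[str]:
    """Yield the "id" of each entity object in the dump."""
    for line in lines:
        line = line.strip().rstrip(",")
        if line in ("", "[", "]"):
            continue
        yield json.loads(line)["id"]


def chunk(ids: Iterable[str], size: int) -> Iterator[List[str]]:
    """Group entity ids into batches of at most `size`."""
    batch: List[str] = []
    for entity_id in ids:
        batch.append(entity_id)
        if len(batch) == size:
            yield batch
            batch = []
    if batch:
        yield batch


dump = ['{"id": "Q1"}', '{"id": "Q2"}', '{"id": "Q3"}']
print(list(chunk(entity_ids_from_dump(dump), 2)))  # -> [['Q1', 'Q2'], ['Q3']]
```

Each resulting batch would then be handed to a Kubernetes job (e.g. via `kubectl create job` from the wrapper shell script) in the dedicated namespace.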
This should be run from a shell script that developers can run on their local machines, i.e. in wbaas-deploy/k8s/jobs.


Upstream docs:

Fring removed Fring as the assignee of this task. Dec 4 2023, 11:53 AM
Fring claimed this task.
Fring moved this task from Doing to In Review on the Wikibase Cloud (Kanban board Q4 2023) board.
Fring moved this task from In Review to Doing on the Wikibase Cloud (Kanban board Q4 2023) board.
Fring removed Fring as the assignee of this task. Dec 5 2023, 1:59 PM
Fring moved this task from Doing to To do on the Wikibase Cloud (Kanban board Q4 2023) board.