
🟡 Do not mark queryservice batches as `done` until they are actually completed
Closed, Resolved, Public

Description

As per https://github.com/wbstack/api/blob/cc8323a6e88c11919aa35dd55f3cff9bffa16176/app/Http/Controllers/Backend/QsController.php#L92, batches of entities to be consumed by the queryservice updater are marked as done as soon as they are sent to the updater. In the event the updater fails, times out, falls off the face of the earth, etc., they will not be retried.

We suggest that, after replying to a request from the updater for a batch, the batch is not marked done but is instead marked "pending".

We may then reset pending batches older than 300s to not done, so that they are ready to be picked up again.

Finally, we add logic to the updater that marks batches as "really done" once it has received a positive HTTP response code from the queryservice.

We must also track a "failed count", mark batches that fail 3 times as permanently failed, and stop shipping them so that they don't block future batches. At that point we would like to see an error reported using the error reporting system.
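The proposed lifecycle can be sketched as a small state machine. This is a minimal illustration in Python, not the actual API or updater code: the names `Batch`, `State`, `hand_out`, `mark_done`, `record_failure` and `requeue_stale` are hypothetical, and treating a pending timeout as one failure is an assumption. Only the 300s timeout and the limit of 3 failures come from the description above.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Callable, Iterable, Optional

PENDING_TIMEOUT_SECONDS = 300  # "pending batches older than 300s"
MAX_FAILURES = 3               # "batches that fail 3 times"


class State(Enum):
    OPEN = "open"        # not done, ready to be handed to the updater
    PENDING = "pending"  # handed out, not yet confirmed
    DONE = "done"        # updater confirmed a positive HTTP response
    FAILED = "failed"    # permanently failed, no longer shipped


@dataclass
class Batch:
    id: int
    state: State = State.OPEN
    failures: int = 0
    pending_since: Optional[float] = None


def hand_out(batch: Batch, now: float) -> None:
    """Updater requested the batch: mark it pending instead of done."""
    if batch.state is not State.OPEN:
        raise ValueError(f"batch {batch.id} is not open")
    batch.state = State.PENDING
    batch.pending_since = now


def mark_done(batch: Batch) -> None:
    """Updater reports success after the queryservice accepted the update."""
    if batch.state is not State.PENDING:
        raise ValueError(f"batch {batch.id} is not pending")
    batch.state = State.DONE
    batch.pending_since = None


def record_failure(batch: Batch, report_error: Callable[[str], None]) -> None:
    """Count a failure; give up and report once MAX_FAILURES is reached."""
    batch.failures += 1
    batch.pending_since = None
    if batch.failures >= MAX_FAILURES:
        batch.state = State.FAILED
        report_error(f"batch {batch.id} permanently failed")
    else:
        batch.state = State.OPEN


def requeue_stale(batches: Iterable[Batch], now: float,
                  report_error: Callable[[str], None]) -> None:
    """Assumption: a pending batch that times out counts as one failure."""
    for batch in batches:
        if (batch.state is State.PENDING
                and batch.pending_since is not None
                and now - batch.pending_since > PENDING_TIMEOUT_SECONDS):
            record_failure(batch, report_error)
```

In this sketch `requeue_stale` would run periodically (for example from a scheduled job), and `report_error` stands in for whatever error reporting system is in use.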

Event Timeline

Evelien_WMDE renamed this task from Do not mark queryservice batches as `done` until they are actually completed to 🟡 Do not mark queryservice batches as `done` until they are actually completed. Oct 9 2023, 9:52 AM

Pull Requests needed to get this running locally / review:

What I did to set this up locally:

  1. Check out the respective branch in wbaas-deploy
  2. Check out the respective branch in charts
  3. Manually link the api and queryservice-updater charts to be used in helmfile.yaml
  4. Apply local changes
  5. Run skaffold for api
  6. Apply pending migrations: `k exec deployments/api-app-backend -- php artisan migrate`
  7. Run skaffold for queryservice-updater
Fring removed Fring as the assignee of this task. Oct 31 2023, 11:40 AM
Fring moved this task from Doing to In Review on the Wikibase Cloud (Kanban board Q4 2023) board.
Fring subscribed.
Tarrow removed dang as the assignee of this task. Nov 3 2023, 9:07 AM
Tarrow added a subscriber: dang.

Hi, I'm presuming that this work is meant to address problems like the one described in this ticket: T345199. I think you're already aware, but I'm just cross-posting it here. Thanks.

Evelien_WMDE claimed this task.