Investigate [SQLBagOStuff] warnings and errors in production
Closed, Resolved, Public

Description

Investigate these warnings and errors on production:
Warnings:

"[SQLBagOStuff] SqlBagOStuff::handleDBError: ignoring query error"
"[SQLBagOStuff] SqlBagOStuff::handleDBError: ignoring connection error"

Errors:

[SQLBagOStuff] DBError: Database is read-only: This wiki is currently read-only.

from https://cloudlogging.app.goo.gl/idcbUJfBv2RdyU3w7

Related ticket: T412080: ⬆️ Investigate SQLBagOStuff Duplicate get error

Event Timeline

[SQLBagOStuff] DBError: Database is read-only: This wiki is currently read-only.

I imagine this is to be expected when updating wikis, as we set the wiki to read-only before doing the update.

I wonder if the other errors are also due to updating wikis, given that they happened at the same time. @Tarrow had the idea that in the future we could prevent any traffic going to wikis while we are updating.

From what I can see:

"[SQLBagOStuff] SqlBagOStuff::handleDBError: ignoring connection error"

I only see occurrences of this around the node rollover on Monday morning.

"[SQLBagOStuff] SqlBagOStuff::handleDBError: ignoring query error"

and

[SQLBagOStuff] DBError: Database is read-only: This wiki is currently read-only.

I see these two very closely paired together in time. As I understand it, what we see is:

  • We can't write to the SQLBagOStuff, so we get DBError: Database is read-only: This wiki is currently read-only.
  • The SQLBagOStuff then tells us that it is ignoring this error by logging "ignoring query error".
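
To make the pairing concrete, here is a minimal sketch of the pattern as I understand it, written in Python rather than MediaWiki's PHP. It is not the real SqlBagOStuff code: `SketchBag`, `ReadOnlyStore` and `ReadOnlyError` are hypothetical names made up for illustration, and the only things taken from our logs are the message strings.

```python
import logging

logger = logging.getLogger("SQLBagOStuff")


class ReadOnlyError(Exception):
    """Hypothetical stand-in for the read-only DBError seen in the logs."""


class ReadOnlyStore:
    """Hypothetical backing store that rejects every write, like a read-only wiki DB."""

    def write(self, key, value):
        raise ReadOnlyError(
            "Database is read-only: This wiki is currently read-only."
        )


class SketchBag:
    """Illustrative cache wrapper; an assumption, not the real implementation."""

    def __init__(self, store):
        self.store = store

    def set(self, key, value):
        try:
            self.store.write(key, value)
            return True
        except ReadOnlyError as err:
            # First log line: the underlying database error.
            logger.error("DBError: %s", err)
            # Second log line: the cache layer says it is swallowing the error.
            logger.warning("handleDBError: ignoring query error")
            # The caller only sees a failed cache write, not an exception.
            return False
```

If this matches what the real code does, every failed cache write during read-only mode produces one error and one warning at almost the same timestamp, which is the pairing we see in the logs.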

So I think we can "ignore this". We could prevent this from happening by showing a waiting page for partially updated wikis and not sending traffic to them.
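
A rough sketch of what such a gate could look like, assuming we had a place in front of the wikis to decide per request. Everything here (`UpdateGate`, the method names, the 503 response text) is hypothetical and only illustrates the idea; it is not how our routing works today.

```python
from dataclasses import dataclass, field


@dataclass
class UpdateGate:
    """Hypothetical gate that tracks which wikis are currently being updated."""

    updating: set = field(default_factory=set)

    def begin_update(self, wiki):
        # Call before the wiki is switched to read-only.
        self.updating.add(wiki)

    def end_update(self, wiki):
        # Call once the update has finished and writes are allowed again.
        self.updating.discard(wiki)

    def route(self, wiki):
        # Wikis that are mid-update get a waiting page instead of traffic.
        if wiki in self.updating:
            return 503, "This wiki is being updated. Please try again shortly."
        return 200, "forward to wiki"
```

With something like this in place, requests would never reach a read-only wiki, so the cache would not attempt the writes that currently fail.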

Did we do an update skirmish on 3/12?

Screenshot from 2025-12-10 10-47-56.png (116×1 px, 48 KB)

In conclusion:
These warnings and errors appear when we put a wiki in read-only mode for an update. The first wave of errors was on 3/12, the same day we updated Adam's wiki (see https://phabricator.wikimedia.org/T411634#11428737).
We can ignore these.

dang removed dang as the assignee of this task.
dang moved this task from In Peer Review to Done on the Wikibase Cloud (Kanban Board) board.
Tarrow claimed this task.