Fix db-switchover update zarcillo part
Closed, Resolved, Public

Description

While running the switchover for s4, we noticed that zarcillo wasn't updated due to:

ERROR 1665 (HY000): Cannot execute statement: impossible to write to binary log since BINLOG_FORMAT = STATEMENT and at least one table uses a storage engine limited to row-based logging. InnoDB is limited to row-logging when transaction isolation level is READ COMMITTED or READ UNCOMMITTED.

Possible short-term fix for the UPDATE query at https://phabricator.wikimedia.org/diffusion/OSMD/browse/master/wmfmariadbpy/cli_admin/switchover.py;38660e943a9167fe7d174f79806acf3a6e4d4f23$722?as=source&blame=off

Run one of these two statements before that UPDATE query:

set session binlog_format="ROW";
set session tx_isolation = 'REPEATABLE-READ';

Example:

root@db1115.eqiad.wmnet[zarcillo]> set session binlog_format="ROW"
    -> ;
Query OK, 0 rows affected (0.000 sec)

root@db1115.eqiad.wmnet[zarcillo]> update masters set instance = (select name from instances where server = 'db1138.eqiad.wmnet' and port=3306) where section = 's4' and dc = 'eqiad' limit 1;
Query OK, 1 row affected (0.001 sec)
Rows matched: 1 Changed: 1 Warnings: 0
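
For illustration, the manual steps above could be scripted roughly as follows. This is only a sketch and not the code in switchover.py or in the eventual patch: the function name `update_zarcillo_master` is hypothetical, and it assumes a DB-API style cursor (for example from pymysql) that accepts `%s` placeholders and is already connected to the zarcillo database.

```python
def update_zarcillo_master(cursor, new_master_host, port, section, dc):
    """Point zarcillo's masters row for (section, dc) at the new master.

    Hypothetical helper: `cursor` is assumed to be a DB-API cursor using
    the '%s' parameter style, connected to the zarcillo database.
    """
    # Work around ERROR 1665 by switching this session to row-based
    # logging before issuing the write.
    cursor.execute("SET SESSION binlog_format = 'ROW'")

    # Same UPDATE as in the manual example, with parameters instead of
    # inlined literals.
    cursor.execute(
        "UPDATE masters SET instance = ("
        "SELECT name FROM instances WHERE server = %s AND port = %s"
        ") WHERE section = %s AND dc = %s LIMIT 1",
        (new_master_host, port, section, dc),
    )
    # Number of rows changed, as reported by the driver.
    return cursor.rowcount
```

For the s4 case above the call would be `update_zarcillo_master(cursor, "db1138.eqiad.wmnet", 3306, "s4", "eqiad")`, followed by a commit on the connection if autocommit is off. Only the session-level setting is changed, so other connections keep their configured binlog format.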

We have a few more switchovers to come soonish so we might want to get this fixed and released "soon".
It doesn't block the switchover though, it just requires manual intervention.

Event Timeline

Marostegui triaged this task as Medium priority. Jan 26 2021, 7:49 AM
Marostegui moved this task from Triage to Ready on the DBA board.

I will add for context that the core reason for this happening is the low consistency level required by TokuDB; if we got rid of TokuDB (e.g. by moving the db), the source of the problem would disappear. But I agree with @Marostegui that this is a more long-term fix (and it would require editing the file anyway, as the conf is hardcoded).

Change 658580 had a related patch set uploaded (by Kormat; owner: Kormat):
[operations/software/wmfmariadbpy@master] switchover: Work-around isolation level issue

https://gerrit.wikimedia.org/r/658580

Change 658580 merged by jenkins-bot:
[operations/software/wmfmariadbpy@master] switchover: Work-around isolation level issue

https://gerrit.wikimedia.org/r/658580

The fix is merged, but not yet released.

@Kormat was this ever released? I don't recall if I had to manually update the master or not on the last s1 switchover.
Maybe this can be released and tested during the upcoming s6 switch?

No, not yet. Various roadblocks have held up the next release of wmfmariadbpy, though hopefully they're mostly resolved by now. I'll be working on it this week.

@Kormat with yesterday's release, is this good to be closed?