
Events set to SLAVESIDE_DISABLED when upgrading from 10.1 to 10.4
Closed, Resolved, Public

Description

On hosts migrated from 10.1 to 10.4, the events are set to SLAVESIDE_DISABLED:

mysql:root@localhost [ops]> show events;
+-----+--------------------------+----------------+-----------+-----------+------------+----------------+----------------+---------------------+------+--------------------+------------+----------------------+----------------------+--------------------+
| Db  | Name                     | Definer        | Time zone | Type      | Execute at | Interval value | Interval field | Starts              | Ends | Status             | Originator | character_set_client | collation_connection | Database Collation |
+-----+--------------------------+----------------+-----------+-----------+------------+----------------+----------------+---------------------+------+--------------------+------------+----------------------+----------------------+--------------------+
| ops | wmf_slave_overload       | root@localhost | SYSTEM    | RECURRING | NULL       | 10             | SECOND         | 2019-05-21 00:00:01 | NULL | SLAVESIDE_DISABLED |  180367395 | utf8                 | utf8_general_ci      | binary             |
| ops | wmf_slave_purge          | root@localhost | SYSTEM    | RECURRING | NULL       | 15             | MINUTE         | 2019-05-21 00:00:00 | NULL | SLAVESIDE_DISABLED |  180367395 | utf8                 | utf8_general_ci      | binary             |
| ops | wmf_slave_wikiuser_sleep | root@localhost | SYSTEM    | RECURRING | NULL       | 30             | SECOND         | 2019-05-21 00:00:05 | NULL | SLAVESIDE_DISABLED |  180367395 | utf8                 | utf8_general_ci      | binary             |
| ops | wmf_slave_wikiuser_slow  | root@localhost | SYSTEM    | RECURRING | NULL       | 30             | SECOND         | 2019-05-21 00:00:03 | NULL | SLAVESIDE_DISABLED |  180367395 | utf8                 | utf8_general_ci      | binary             |
+-----+--------------------------+----------------+-----------+-----------+------------+----------------+----------------+---------------------+------+--------------------+------------+----------------------+----------------------+--------------------+
4 rows in set (0.001 sec)

This looks like a bug that will be fixed in the next release (10.4.13). We are currently running the latest release, 10.4.12.
https://jira.mariadb.org/browse/MDEV-21758
https://jira.mariadb.org/browse/MDEV-21896

Workaround for now:
set session sql_log_bin=0; use ops; alter event wmf_slave_wikiuser_sleep enable; alter event wmf_slave_wikiuser_slow enable; alter event wmf_slave_purge enable; alter event wmf_slave_overload enable;
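
The one-line workaround above can be generated for any list of events. What follows is a minimal Python sketch, not an existing tool: the function name `build_enable_events_sql` is hypothetical, and it only builds the SQL text without connecting to a server. It assumes the schema and event names are trusted plain identifiers, since they are interpolated unquoted.

```python
def build_enable_events_sql(schema, event_names):
    """Return the workaround statements as a single SQL string.

    sql_log_bin=0 keeps the ALTER EVENT statements out of the binary log,
    mirroring the manual workaround. Names are interpolated unquoted, so
    only pass trusted identifiers (an assumption of this sketch).
    """
    statements = ["SET SESSION sql_log_bin=0", f"USE {schema}"]
    statements += [f"ALTER EVENT {name} ENABLE" for name in event_names]
    return "; ".join(statements) + ";"


if __name__ == "__main__":
    events = [
        "wmf_slave_wikiuser_sleep",
        "wmf_slave_wikiuser_slow",
        "wmf_slave_purge",
        "wmf_slave_overload",
    ]
    print(build_enable_events_sql("ops", events))
```

The printed string can be pasted into a mysql session on an affected host.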

Event Timeline

Marostegui triaged this task as Medium priority. Mar 16 2020, 8:14 AM
Marostegui moved this task from Triage to In progress on the DBA board.

Mentioned in SAL (#wikimedia-operations) [2020-03-16T08:15:58Z] <marostegui> Review and enable events on recently migrated 10.4 hosts - T247728

Marostegui moved this task from In progress to Blocked external/Not db team on the DBA board.

I have checked and enabled them on:

db1107 db2085:3311 db1103:3312 db1103:3314 db2125 db1078 db2109 db2084:3314 db1096:3315 db2084:3315 db1096:3316 db1098:3316 db2114 db1098:3317 db2121 db1111 db2085:3318

Not closing this till 10.4.13 is released and we can check that the fix is actually shipped.

It looks like the fix is finally included in the 10.4.13 release: https://jira.mariadb.org/projects/MDEV/versions/24223. Not closing until I can confirm it is indeed included.

I have installed 10.4.13 (upgrading from 10.4.12) on db2102 and this is confirmed fixed: the events were not disabled.
It will be interesting to also check an upgrade from 10.1 to 10.4. I will do that next week and then confirm this as fixed.

Change 596964 had a related patch set uploaded (by Marostegui; owner: Marostegui):
[operations/puppet@production] install_server: Reimage db2088

https://gerrit.wikimedia.org/r/596964

Change 596964 merged by Marostegui:
[operations/puppet@production] install_server: Reimage db2088

https://gerrit.wikimedia.org/r/596964

Change 596996 had a related patch set uploaded (by Marostegui; owner: Marostegui):
[operations/puppet@production] install_server: Reimage db2088 as Buster

https://gerrit.wikimedia.org/r/596996

Change 596996 merged by Marostegui:
[operations/puppet@production] install_server: Reimage db2088 as Buster

https://gerrit.wikimedia.org/r/596996

Script wmf-auto-reimage was launched by marostegui on cumin1001.eqiad.wmnet for hosts:

['db2088.codfw.wmnet']

The log can be found in /var/log/wmf-auto-reimage/202005180644_marostegui_135356.log.

Completed auto-reimage of hosts:

['db2088.codfw.wmnet']

and were ALL successful.

Marostegui claimed this task.

This is confirmed fixed on 10.4.13.

I have reimaged db2088 and installed 10.4.13. After starting mysql, the events are enabled:

mysql:root@localhost [ops]> show events;
+-----+--------------------------+----------------+-----------+-----------+------------+----------------+----------------+---------------------+------+---------+------------+----------------------+----------------------+--------------------+
| Db  | Name                     | Definer        | Time zone | Type      | Execute at | Interval value | Interval field | Starts              | Ends | Status  | Originator | character_set_client | collation_connection | Database Collation |
+-----+--------------------------+----------------+-----------+-----------+------------+----------------+----------------+---------------------+------+---------+------------+----------------------+----------------------+--------------------+
| ops | wmf_slave_overload       | root@localhost | SYSTEM    | RECURRING | NULL       | 10             | SECOND         | 2018-09-04 00:00:01 | NULL | ENABLED |  180367447 | utf8                 | utf8_general_ci      | binary             |
| ops | wmf_slave_purge          | root@localhost | SYSTEM    | RECURRING | NULL       | 15             | MINUTE         | 2018-09-04 00:00:00 | NULL | ENABLED |  180367447 | utf8                 | utf8_general_ci      | binary             |
| ops | wmf_slave_wikiuser_sleep | root@localhost | SYSTEM    | RECURRING | NULL       | 30             | SECOND         | 2018-09-04 00:00:05 | NULL | ENABLED |  180367447 | utf8                 | utf8_general_ci      | binary             |
| ops | wmf_slave_wikiuser_slow  | root@localhost | SYSTEM    | RECURRING | NULL       | 30             | SECOND         | 2018-09-04 00:00:03 | NULL | ENABLED |  180367447 | utf8                 | utf8_general_ci      | binary             |
+-----+--------------------------+----------------+-----------+-----------+------------+----------------+----------------+---------------------+------+---------+------------+----------------------+----------------------+--------------------+
4 rows in set (0.001 sec)

After T252952 I have reviewed all the hosts in production to make sure no events are disabled.

Hosts with disabled events:
x1: db2115
es5: es2025, es1024
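
A review like this can be partly automated by parsing the `show events` output and flagging anything that is not ENABLED. The sketch below is one way to do that in Python, under stated assumptions: the function names are hypothetical, the input is the ASCII table as printed by the mysql client, and no cell contains a `|` character. The sample table is a trimmed, made-up illustration, not real output from any host.

```python
def parse_show_events(table_text):
    """Parse the ASCII table printed by `show events` into a list of dicts."""
    rows = []
    header = None
    for line in table_text.splitlines():
        line = line.strip()
        if not line.startswith("|"):
            continue  # skip +---+ borders and the "N rows in set" footer
        cells = [cell.strip() for cell in line.strip("|").split("|")]
        if header is None:
            header = cells  # first data line holds the column names
        else:
            rows.append(dict(zip(header, cells)))
    return rows


def not_enabled(rows):
    """Return (Db, Name, Status) for every event whose Status is not ENABLED."""
    return [
        (row["Db"], row["Name"], row["Status"])
        for row in rows
        if row["Status"] != "ENABLED"
    ]


if __name__ == "__main__":
    # Illustrative sample only: columns trimmed, statuses invented.
    sample = """\
+-----+--------------------+--------------------+
| Db  | Name               | Status             |
+-----+--------------------+--------------------+
| ops | wmf_slave_overload | SLAVESIDE_DISABLED |
| ops | wmf_slave_purge    | ENABLED            |
+-----+--------------------+--------------------+
2 rows in set (0.001 sec)
"""
    for db, name, status in not_enabled(parse_show_events(sample)):
        print(f"{db}.{name}: {status}")
```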