Let's migrate the replicas from backup1-{eqiad,codfw} to MariaDB 10.6
- db1205
- db2184
Status | Assigned | Task
---|---|---
Resolved | Marostegui | T329499 Migrate backup1-* replicas to MariaDB 10.6
Resolved | Marostegui | T330861 Migrate backup1-* masters to MariaDB 10.6
Change 888673 had a related patch set uploaded (by Marostegui; author: Marostegui):
[operations/puppet@production] db1205,db2184: Migrate them to MariaDB 10.6.12
Change 888673 merged by Marostegui:
[operations/puppet@production] db1205,db2184: Migrate them to MariaDB 10.6.12
Mentioned in SAL (#wikimedia-operations) [2023-02-13T12:15:16Z] <marostegui> Upgrade db1205 and db2184 to mariadb 10.6.12 T329499
@jcrespo db1205 and db2184 have been migrated to 10.6.12 - could you check if all looks good from your side?
Yes, although sadly I didn't have time to test the 10.6 recovery process; I intend to do it this week.
Recovery took ~1h30 with the script:
# mini_loader.sh dump.backup1-codfw.2023-02-28--03-45-08
Starting recovery at 2023-03-01 10:31:50+00:00
[...]
Finishing recovery at 2023-03-01 11:58:33+00:00
The total size recovered is:
root@db2184:/srv/sqldata$ du -hs
152G	.
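For reference, the elapsed time and a rough throughput can be derived from the figures above. This is only a sketch: the timestamps are copied from the script output, and the size is the rounded `du -hs` value (assumed here to be in binary units), so the throughput is approximate.

```python
from datetime import datetime

# Timestamps copied from the mini_loader.sh output above.
start = datetime.fromisoformat("2023-03-01 10:31:50+00:00")
end = datetime.fromisoformat("2023-03-01 11:58:33+00:00")
elapsed = end - start

# du -hs prints a rounded, human-readable size; treating "152G" as 152 GiB
# is an assumption, so the throughput below is only a rough estimate.
size_gib = 152
throughput_mib_s = size_gib * 1024 / elapsed.total_seconds()

print(f"elapsed: {elapsed}")
print(f"approx. throughput: {throughput_mib_s:.1f} MiB/s")
```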
I will now run a complete data check against the primary host and restart replication.
I intend to have this recovery script fully set up, along with the updated documentation recommending it for recoveries, before this Friday.
Data check looks good:
2023-03-01T12:19:56.993934: row id 104990001/105492898, ETA: 00m02s, 0 chunk(s) found different
Execution ended, no differences found.
root@db2183.codfw.wmnet[mediabackups]> SELECT count(*) FROM backups;
+-----------+
| count(*)  |
+-----------+
| 102602541 |
+-----------+
1 row in set (24.277 sec)

root@db2184.codfw.wmnet[mediabackups]> SELECT count(*) FROM backups;
+-----------+
| count(*)  |
+-----------+
| 102602541 |
+-----------+
1 row in set (34.382 sec)
(these 2 are the only ones with real data that cannot be lost)
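The manual count comparison above could be scripted along these lines. This is a minimal sketch, not the tooling actually used here: `row_counts` and `counts_match` are hypothetical helper names, the connections are assumed to be ordinary Python DB-API connections, and in-memory SQLite databases stand in for the two hosts so the example runs on its own. A matching row count is only a quick sanity check and is weaker than the chunked data comparison.

```python
import sqlite3


def row_counts(connections, table):
    """Return {label: row count} for `table` on each DB-API connection."""
    # Table names cannot be bound as query parameters, so validate the
    # identifier before interpolating it into the statement.
    if not table.isidentifier():
        raise ValueError(f"unexpected table name: {table!r}")
    counts = {}
    for label, conn in connections.items():
        cur = conn.cursor()
        cur.execute(f"SELECT count(*) FROM {table}")
        counts[label] = cur.fetchone()[0]
        cur.close()
    return counts


def counts_match(counts):
    """True when every connection reported the same number of rows."""
    return len(set(counts.values())) == 1


def _demo_connection(n_rows):
    # Stand-in for a real host: an in-memory SQLite database with a
    # `backups` table holding n_rows rows.
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE backups (id INTEGER PRIMARY KEY)")
    conn.executemany(
        "INSERT INTO backups (id) VALUES (?)", [(i,) for i in range(n_rows)]
    )
    return conn


demo = {"primary": _demo_connection(5), "replica": _demo_connection(5)}
counts = row_counts(demo, "backups")
print(counts, "match" if counts_match(counts) else "MISMATCH")
```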
Removing original copy and returning db2184 to production.
We can resolve this, and either upgrade their respective primaries or proceed with the misc replicas.
Change 893803 had a related patch set uploaded (by Jcrespo; author: Jcrespo):
[operations/puppet@production] Moving the working prototype/hack into production
Change 893803 merged by Jcrespo:
[operations/puppet@production] dbbackups: Moving the recovery working prototype/hack into production