Upgrade s3 to Debian Buster and MariaDB 10.4
Closed, Resolved, Public

Description

Steps to upgrade:

Please read the procedure documentation for more details.

Original master: db1123
Original candidate master: db1157

Event Timeline

Marostegui changed the task status from Open to Stalled. May 19 2021, 6:57 AM
Marostegui assigned this task to Kormat.
Marostegui triaged this task as Medium priority.
Marostegui moved this task from Triage to Blocked on the DBA board.
Marostegui added a subscriber: Kormat.

Assigning this to @Kormat as she's done s6 already. This should stay stalled for now and only be done once we are happy with s6's performance/stability, in around 3-4 weeks or so.
codfw is entirely done, and so are the sanitarium masters, so effectively what is pending is:

  • Confirm backup sources are good to go (cc @jcrespo)
  • Upgrade eqiad candidate master to buster and 10.4
  • Perform eqiad master switchover
  • Upgrade old master to buster and 10.4

Change 692845 had a related patch set uploaded (by Jcrespo; author: Jcrespo):

[operations/puppet@production] dbbackups: Switchover s3 backup source from db1171 to db1102 (buster)

https://gerrit.wikimedia.org/r/692845

s6 hasn't given any issues, so maybe we can start working on this next week (3 weeks after we switched s6) and attempt to do the switchover on the 17th?
@Kormat thoughts?

Sounds good :)

Change 698469 had a related patch set uploaded (by Kormat; author: Kormat):

[operations/puppet@production] db1157: Disable notifications.

https://gerrit.wikimedia.org/r/698469

Mentioned in SAL (#wikimedia-operations) [2021-06-07T10:08:23Z] <kormat@cumin1001> dbctl commit (dc=all): 'db1157 depooling: reimage to buster T283131', diff saved to https://phabricator.wikimedia.org/P16311 and previous config saved to /var/cache/conftool/dbconfig/20210607-100822-kormat.json

Change 698469 merged by Kormat:

[operations/puppet@production] db1157: Disable notifications.

https://gerrit.wikimedia.org/r/698469

Change 698471 had a related patch set uploaded (by Kormat; author: Kormat):

[operations/puppet@production] install_server: switch db1157 to buster

https://gerrit.wikimedia.org/r/698471

Change 698471 merged by Kormat:

[operations/puppet@production] install_server: switch db1157 to buster

https://gerrit.wikimedia.org/r/698471

Script wmf-auto-reimage was launched by kormat on cumin1001.eqiad.wmnet for hosts:

['db1157.eqiad.wmnet']

The log can be found in /var/log/wmf-auto-reimage/202106071021_kormat_12649.log.

Completed auto-reimage of hosts:

['db1157.eqiad.wmnet']

and were ALL successful.

db1157 upgraded to buster. Running mysqlcheck now.

Mentioned in SAL (#wikimedia-operations) [2021-06-08T10:53:47Z] <kormat@cumin1001> dbctl commit (dc=all): 'db1157 (re)pooling @ 25%: reimaged to buster T283131', diff saved to https://phabricator.wikimedia.org/P16326 and previous config saved to /var/cache/conftool/dbconfig/20210608-105346-kormat.json

db1157 had a clean mysqlcheck run, repooling it now.

Mentioned in SAL (#wikimedia-operations) [2021-06-08T11:08:51Z] <kormat@cumin1001> dbctl commit (dc=all): 'db1157 (re)pooling @ 50%: reimaged to buster T283131', diff saved to https://phabricator.wikimedia.org/P16327 and previous config saved to /var/cache/conftool/dbconfig/20210608-110850-kormat.json

Mentioned in SAL (#wikimedia-operations) [2021-06-08T11:23:54Z] <kormat@cumin1001> dbctl commit (dc=all): 'db1157 (re)pooling @ 75%: reimaged to buster T283131', diff saved to https://phabricator.wikimedia.org/P16328 and previous config saved to /var/cache/conftool/dbconfig/20210608-112354-kormat.json

Mentioned in SAL (#wikimedia-operations) [2021-06-08T11:38:58Z] <kormat@cumin1001> dbctl commit (dc=all): 'db1157 (re)pooling @ 100%: reimaged to buster T283131', diff saved to https://phabricator.wikimedia.org/P16329 and previous config saved to /var/cache/conftool/dbconfig/20210608-113857-kormat.json
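
The four repool commits above step db1157 through 25%, 50%, 75% and 100%, roughly 15 minutes apart. A minimal sketch, assuming only the SAL entry format quoted in this task, that pulls the timestamp and percentage out of each entry and reports the gap between steps. The names `parse_repool_steps` and `step_gaps_seconds` and the regular expression are illustrative and not part of any Wikimedia tooling; the entries are shortened to the part the pattern needs.

```python
import re
from datetime import datetime

# Pattern inferred from the SAL entries quoted in this task.
# This is an assumption, not an official log format definition.
SAL_RE = re.compile(
    r"\[(?P<ts>\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2})Z\].*?"
    r"'(?P<host>db\d+) \(re\)pooling @ (?P<pct>\d+)%"
)


def parse_repool_steps(lines):
    """Return (timestamp, host, percentage) tuples for repool entries."""
    steps = []
    for line in lines:
        match = SAL_RE.search(line)
        if match:
            ts = datetime.strptime(match.group("ts"), "%Y-%m-%dT%H:%M:%S")
            steps.append((ts, match.group("host"), int(match.group("pct"))))
    return steps


def step_gaps_seconds(steps):
    """Seconds elapsed between each pair of consecutive repool steps."""
    return [int((b[0] - a[0]).total_seconds()) for a, b in zip(steps, steps[1:])]


# Entries copied from this task, truncated after the commit message.
SAL_LINES = [
    "[2021-06-08T10:53:47Z] <kormat@cumin1001> dbctl commit (dc=all): 'db1157 (re)pooling @ 25%: reimaged to buster T283131'",
    "[2021-06-08T11:08:51Z] <kormat@cumin1001> dbctl commit (dc=all): 'db1157 (re)pooling @ 50%: reimaged to buster T283131'",
    "[2021-06-08T11:23:54Z] <kormat@cumin1001> dbctl commit (dc=all): 'db1157 (re)pooling @ 75%: reimaged to buster T283131'",
    "[2021-06-08T11:38:58Z] <kormat@cumin1001> dbctl commit (dc=all): 'db1157 (re)pooling @ 100%: reimaged to buster T283131'",
]

steps = parse_repool_steps(SAL_LINES)
gaps = step_gaps_seconds(steps)
print([pct for _, _, pct in steps])
print(gaps)
```

Lines that do not match the pattern (for example depool or replication entries) are skipped rather than treated as errors, which keeps the sketch usable on a full timeline dump.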

Change 700463 had a related patch set uploaded (by Kormat; author: Kormat):

[operations/puppet@production] db1123: Disable notifications.

https://gerrit.wikimedia.org/r/700463

Change 700463 merged by Kormat:

[operations/puppet@production] db1123: Disable notifications.

https://gerrit.wikimedia.org/r/700463

Change 692845 merged by Jcrespo:

[operations/puppet@production] dbbackups: Switchover s3 backup source from db1171 to db1102 (buster)

https://gerrit.wikimedia.org/r/692845

Mentioned in SAL (#wikimedia-operations) [2021-06-21T05:31:19Z] <kormat> stopping replication on db1123 T283131

Change 700613 had a related patch set uploaded (by Kormat; author: Kormat):

[operations/puppet@production] install_server: switch db1123 to buster.

https://gerrit.wikimedia.org/r/700613

Change 700613 merged by Kormat:

[operations/puppet@production] install_server: switch db1123 to buster.

https://gerrit.wikimedia.org/r/700613

Script wmf-auto-reimage was launched by kormat on cumin1001.eqiad.wmnet for hosts:

['db1123.eqiad.wmnet']

The log can be found in /var/log/wmf-auto-reimage/202106211401_kormat_3114.log.

Completed auto-reimage of hosts:

['db1123.eqiad.wmnet']

and were ALL successful.

db1123 is reimaged to buster, mysqlcheck --all-databases running now. As this is s3, this is going to take A While.

Mentioned in SAL (#wikimedia-operations) [2021-06-22T10:21:09Z] <kormat@cumin1001> dbctl commit (dc=all): 'db1123 (re)pooling @ 25%: reimaged to buster T283131', diff saved to https://phabricator.wikimedia.org/P16688 and previous config saved to /var/cache/conftool/dbconfig/20210622-102108-kormat.json

db1123 had a clean mysqlcheck run, repooling it now.

Mentioned in SAL (#wikimedia-operations) [2021-06-22T10:36:12Z] <kormat@cumin1001> dbctl commit (dc=all): 'db1123 (re)pooling @ 50%: reimaged to buster T283131', diff saved to https://phabricator.wikimedia.org/P16689 and previous config saved to /var/cache/conftool/dbconfig/20210622-103612-kormat.json

Mentioned in SAL (#wikimedia-operations) [2021-06-22T10:51:16Z] <kormat@cumin1001> dbctl commit (dc=all): 'db1123 (re)pooling @ 75%: reimaged to buster T283131', diff saved to https://phabricator.wikimedia.org/P16690 and previous config saved to /var/cache/conftool/dbconfig/20210622-105115-kormat.json

Mentioned in SAL (#wikimedia-operations) [2021-06-22T11:06:20Z] <kormat@cumin1001> dbctl commit (dc=all): 'db1123 (re)pooling @ 100%: reimaged to buster T283131', diff saved to https://phabricator.wikimedia.org/P16691 and previous config saved to /var/cache/conftool/dbconfig/20210622-110619-kormat.json
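
The db1123 repool followed the same 25/50/75/100 pattern as db1157. A sketch of how the commit messages for such a staged rollout could be generated, assuming the message wording seen in the SAL entries in this task. `repool_messages` and its parameters are hypothetical names, and the code only builds strings; it does not invoke dbctl.

```python
def repool_messages(host, task, reason, percentages=(25, 50, 75, 100)):
    """Build one commit message per repool step, in rollout order.

    The wording mirrors the SAL entries in this task; the function itself
    is a hypothetical helper, not part of dbctl.
    """
    previous = 0
    for pct in percentages:
        # Each step must raise the pooled percentage and stay within 1..100.
        if not previous < pct <= 100:
            raise ValueError(f"invalid repool step: {pct}")
        previous = pct
    return [f"{host} (re)pooling @ {pct}%: {reason} {task}" for pct in percentages]


messages = repool_messages("db1123", "T283131", "reimaged to buster")
for message in messages:
    print(message)
```

Validating that the steps strictly increase is a design choice in this sketch, so that a mistyped schedule such as 25, 75, 50 fails before any message is produced.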

@jcrespo: assigning to you for the final backups step. Please resolve the task when that's done. Thanks!

jcrespo changed the task status from Stalled to Open. Jun 28 2021, 9:27 AM
jcrespo closed this task as Resolved.
jcrespo reassigned this task from jcrespo to Kormat.
jcrespo updated the task description.

As far as I am aware, there should be no s3 servers left running 10.1, but feel free to double check.