Upgrade m2 to Buster and MariaDB 10.4
Closed, Resolved (Public)

Description

After m1, let's now upgrade m2 to Buster and MariaDB 10.4

  • db2133
  • db2078
  • db1132 (to be replaced with db1107)
  • db1107
  • db1117

Involved active databases:

xhgui
recommendationsapi
otrs
debmonitor

Event Timeline

Marostegui triaged this task as Medium priority. Jul 9 2020, 5:21 AM
Marostegui moved this task from Triage to In progress on the DBA board.
Marostegui updated the task description.

Change 610597 had a related patch set uploaded (by Marostegui; owner: Marostegui):
[operations/puppet@production] db1084: Remove it from dbctl

https://gerrit.wikimedia.org/r/610597

Change 610597 merged by Marostegui:
[operations/puppet@production] db1084: Remove it from dbctl

https://gerrit.wikimedia.org/r/610597

Change 610598 had a related patch set uploaded (by Marostegui; owner: Marostegui):
[operations/puppet@production] mariadb: Move db1084 from s4 to m2

https://gerrit.wikimedia.org/r/610598

Change 610598 merged by Marostegui:
[operations/puppet@production] mariadb: Move db1084 from s4 to m2

https://gerrit.wikimedia.org/r/610598

Script wmf-auto-reimage was launched by marostegui on cumin1001.eqiad.wmnet for hosts:

['db1084.eqiad.wmnet']

The log can be found in /var/log/wmf-auto-reimage/202007090637_marostegui_10724.log.

Completed auto-reimage of hosts:

['db1084.eqiad.wmnet']

and were ALL successful.

Change 610675 had a related patch set uploaded (by Marostegui; owner: Marostegui):
[operations/puppet@production] install_server: Format db1084

https://gerrit.wikimedia.org/r/610675

Change 610675 merged by Marostegui:
[operations/puppet@production] install_server: Format db1084

https://gerrit.wikimedia.org/r/610675

Script wmf-auto-reimage was launched by marostegui on cumin1001.eqiad.wmnet for hosts:

['db1084.eqiad.wmnet']

The log can be found in /var/log/wmf-auto-reimage/202007090703_marostegui_13834.log.

Completed auto-reimage of hosts:

['db1084.eqiad.wmnet']

and were ALL successful.

Mentioned in SAL (#wikimedia-operations) [2020-07-09T07:59:41Z] <marostegui> Stop db1117:3322 to clone db1084, this will trigger haproxy alerts - T257540

Given that 90% of m2 is the OTRS database, I will set up db1077 with Buster/MariaDB 10.4 (tracked at T257928), which will allow testing OTRS against it ahead of the eventual upgrade of the primary instance. If that works correctly, I think the upgrade should cause no issues and could happen shortly after.

Change 613466 had a related patch set uploaded (by Marostegui; owner: Marostegui):
[operations/puppet@production] db1084: Enable notifications

https://gerrit.wikimedia.org/r/613466

Change 613466 merged by Marostegui:
[operations/puppet@production] db1084: Enable notifications

https://gerrit.wikimedia.org/r/613466

Mentioned in SAL (#wikimedia-operations) [2020-07-23T08:16:50Z] <marostegui@cumin1001> dbctl commit (dc=all): 'Depool db1107 to move it to m2 T257540', diff saved to https://phabricator.wikimedia.org/P12024 and previous config saved to /var/cache/conftool/dbconfig/20200723-081650-marostegui.json

Change 615669 had a related patch set uploaded (by Marostegui; owner: Marostegui):
[operations/puppet@production] mariadb: Move db1107 from s1 to m2

https://gerrit.wikimedia.org/r/615669

Change 615669 merged by Marostegui:
[operations/puppet@production] mariadb: Move db1107 from s1 to m2

https://gerrit.wikimedia.org/r/615669

Mentioned in SAL (#wikimedia-operations) [2020-07-23T08:26:48Z] <marostegui@cumin1001> dbctl commit (dc=all): 'Remove db1107 from s1 T257540', diff saved to https://phabricator.wikimedia.org/P12025 and previous config saved to /var/cache/conftool/dbconfig/20200723-082647-marostegui.json

Script wmf-auto-reimage was launched by marostegui on cumin1001.eqiad.wmnet for hosts:

['db1107.eqiad.wmnet']

The log can be found in /var/log/wmf-auto-reimage/202007230827_marostegui_25483.log.

Completed auto-reimage of hosts:

['db1107.eqiad.wmnet']

and were ALL successful.

Mentioned in SAL (#wikimedia-operations) [2020-07-23T08:59:00Z] <marostegui> transfer --type=xtrabackup from db1117:3322 to db1107 T257540

The only thing pending is the master switchover.
I am going to wait a few days to make sure db1107 works OK, and then I will go ahead and schedule a day for it.

Change 615926 had a related patch set uploaded (by Marostegui; owner: Marostegui):
[operations/puppet@production] db1107: Enable notifications

https://gerrit.wikimedia.org/r/615926

Change 615926 merged by Marostegui:
[operations/puppet@production] db1107: Enable notifications

https://gerrit.wikimedia.org/r/615926

@Marostegui hi there! I want to analyse some categorylinks data (namely, collect all categories that are reachable from a given category). The data is hierarchical, and traversing it usually means using recursive common table expressions (CTEs), which are only supported in MariaDB 10.2+. That is not the case for the replica server (which is on 10.1.44 AFAIK), so I wanted to ask when such an upgrade will be available. Thanks.
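For context, the kind of traversal described above can be sketched as follows. This is only an illustration under stated assumptions: the `category_edges` table, its `parent`/`child` columns, and the helper names are hypothetical and much simpler than the real MediaWiki `categorylinks` schema, and SQLite is used purely so the snippet runs standalone. The `WITH RECURSIVE` query itself is standard SQL of the sort MariaDB 10.2+ accepts.

```python
import sqlite3


def build_demo_db():
    """Create an in-memory toy graph (hypothetical schema, not categorylinks)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE category_edges (parent TEXT, child TEXT)")
    conn.executemany(
        "INSERT INTO category_edges VALUES (?, ?)",
        [
            ("Science", "Physics"),
            ("Science", "Biology"),
            ("Physics", "Optics"),
            ("Optics", "Physics"),  # deliberate cycle, as real category graphs can have
            ("Art", "Painting"),
        ],
    )
    return conn


def reachable_categories(conn, root):
    """Return all categories reachable from `root`, excluding `root` itself."""
    rows = conn.execute(
        """
        WITH RECURSIVE reachable(name) AS (
            SELECT ?              -- anchor row: the starting category
            UNION                 -- UNION (not UNION ALL) drops duplicates, so cycles terminate
            SELECT e.child
            FROM category_edges AS e
            JOIN reachable AS r ON e.parent = r.name
        )
        SELECT name FROM reachable WHERE name <> ? ORDER BY name
        """,
        (root, root),
    ).fetchall()
    return [name for (name,) in rows]


if __name__ == "__main__":
    print(reachable_categories(build_demo_db(), "Science"))
```

Using `UNION` rather than `UNION ALL` matters here: without deduplication, a cycle in the category graph would make the recursion run until the server's recursion limit is hit.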

@Adamant.pwn I assume you need the wikireplicas service for that, and not m2?
If that is the case, we are not yet sure when wikireplicas will be upgraded, but it will still take a few months for them to be running MariaDB 10.4.

Thanks

@jcrespo @akosiaris @Volans @dpifke (Recommendation-API, Research) I would like to fail over the master on Tuesday 4th Aug at 08:00 UTC.
This means around 1 minute of read-only time.

@Marostegui: green light from me for Debmonitor, usually no action is needed as it reconnects automatically but I can be around just in case.
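As an illustration of why a roughly one-minute read-only window usually needs no manual action from clients, here is a minimal retry-with-backoff sketch. It is not how Debmonitor or any other m2 client is actually implemented: `ReadOnlyError` and `write_with_retry` are hypothetical names, and a real client would catch its own database driver's exception instead.

```python
import time


class ReadOnlyError(Exception):
    """Hypothetical stand-in for the driver error raised while the server is read-only."""


def write_with_retry(do_write, attempts=7, base_delay=1.0, sleep=time.sleep):
    """Call `do_write`, retrying with exponential backoff while writes are rejected.

    With the defaults, the waits are 1, 2, 4, 8, 16 and 32 seconds (63 seconds
    in total), which is enough to ride out a window of about one minute.
    """
    for attempt in range(attempts):
        try:
            return do_write()
        except ReadOnlyError:
            if attempt == attempts - 1:
                raise  # give up: the window lasted longer than expected
            sleep(base_delay * 2 ** attempt)
```

The `sleep` parameter is injectable only so the behaviour can be exercised without actually waiting.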

Fine by me.

Change 617402 had a related patch set uploaded (by Marostegui; owner: Marostegui):
[operations/puppet@production] dbproxy1013,1015: Test db1107 into haproxy

https://gerrit.wikimedia.org/r/617402

Change 617402 merged by Marostegui:
[operations/puppet@production] dbproxy1013,1015: Test db1107 in haproxy

https://gerrit.wikimedia.org/r/617402

Change 617997 had a related patch set uploaded (by Marostegui; owner: Marostegui):
[operations/puppet@production] mariadb: Promote db1107 to m2 master

https://gerrit.wikimedia.org/r/617997

Mentioned in SAL (#wikimedia-operations) [2020-08-04T07:27:47Z] <marostegui> Start topology changes on m2 - T257540

Change 617997 merged by Marostegui:
[operations/puppet@production] mariadb: Promote db1107 to m2 master

https://gerrit.wikimedia.org/r/617997

Mentioned in SAL (#wikimedia-operations) [2020-08-04T08:00:23Z] <marostegui> Failover m2 from db1132 to db1107 -T257540

Change 618236 had a related patch set uploaded (by Marostegui; owner: Marostegui):
[operations/puppet@production] db1132: Disable notifications

https://gerrit.wikimedia.org/r/618236

Change 618236 merged by Marostegui:
[operations/puppet@production] db1132: Disable notifications

https://gerrit.wikimedia.org/r/618236

Resolving this. db1132 will be moved to m3, which will be tracked at T253217.