Address Database infrastructure blockers on datacenter switchover & multi-dc deployment
Open, Normal, Public

Description

Tracking task for the Q4 goal for 2018

  • Rack and setup 13 eqiad hosts T211613
  • Productionize 13 eqiad hosts T222682
  • Rack and setup 18 codfw hosts T221532
  • Productionize 18 codfw hosts T222772
    • Make db-codfw mirror db-eqiad as closely as possible (while the remaining 10 hosts are still pending)

Event Timeline

Restricted Application added a subscriber: Aklapper. Apr 5 2019, 6:10 AM
Marostegui triaged this task as Normal priority.
Marostegui moved this task from Triage to Meta/Epic on the DBA board.
Marostegui added a subtask: Unknown Object (Task). Apr 9 2019, 5:08 AM

Change 505697 had a related patch set uploaded (by Marostegui; owner: Marostegui):
[operations/puppet@production] mariadb: Promote db2079 to s8 codfw master

https://gerrit.wikimedia.org/r/505697

Change 505699 had a related patch set uploaded (by Marostegui; owner: Marostegui):
[operations/mediawiki-config@master] db-codfw.php: Promote db2079 to s8 codfw master.

https://gerrit.wikimedia.org/r/505699

Mentioned in SAL (#wikimedia-operations) [2019-04-25T05:47:52Z] <marostegui> Start changing topology to make db2079 s8 codfw master - T220170

Change 505697 merged by Marostegui:
[operations/puppet@production] mariadb: Promote db2079 to s8 codfw master

https://gerrit.wikimedia.org/r/505697

Change 505699 merged by jenkins-bot:
[operations/mediawiki-config@master] db-codfw.php: Promote db2079 to s8 codfw master.

https://gerrit.wikimedia.org/r/505699

Mentioned in SAL (#wikimedia-operations) [2019-04-25T06:04:40Z] <marostegui@deploy1001> Synchronized wmf-config/db-codfw.php: Promote db2079 to s8 codfw master T220170 (duration: 00m 52s)

Change 506350 had a related patch set uploaded (by Marostegui; owner: Marostegui):
[operations/mediawiki-config@master] db-codfw.php: Reorganize s8

https://gerrit.wikimedia.org/r/506350

Change 506350 merged by jenkins-bot:
[operations/mediawiki-config@master] db-codfw.php: Reorganize s8

https://gerrit.wikimedia.org/r/506350

Change 506351 had a related patch set uploaded (by Marostegui; owner: Marostegui):
[operations/puppet@production] mariadb: Make db2080 candidate master for s8 codfw

https://gerrit.wikimedia.org/r/506351

Mentioned in SAL (#wikimedia-operations) [2019-04-25T06:38:43Z] <marostegui@deploy1001> Synchronized wmf-config/db-codfw.php: Reorganize s8 codfw - T220170 (duration: 00m 54s)

Change 506351 merged by Marostegui:
[operations/puppet@production] mariadb: Make db2080 candidate master for s8 codfw

https://gerrit.wikimedia.org/r/506351

Mentioned in SAL (#wikimedia-operations) [2019-04-25T07:01:49Z] <marostegui> Run compare.py for main tables between db2045 and db2080 T220170

Papaul closed subtask Unknown Object (Task) as Resolved. May 2 2019, 9:39 PM

Change 511652 had a related patch set uploaded (by Marostegui; owner: Marostegui):
[operations/mediawiki-config@master] db-codfw.php: Repool db2104

https://gerrit.wikimedia.org/r/511652

Change 511652 merged by jenkins-bot:
[operations/mediawiki-config@master] db-codfw.php: Repool db2104

https://gerrit.wikimedia.org/r/511652

Marostegui updated the task description. May 21 2019, 9:02 AM

Change 511814 had a related patch set uploaded (by Marostegui; owner: Marostegui):
[operations/mediawiki-config@master] db-codfw.php: Tackle s4 weights

https://gerrit.wikimedia.org/r/511814

Change 511816 had a related patch set uploaded (by Marostegui; owner: Marostegui):
[operations/puppet@production] db2090: Make it candidate master for s4

https://gerrit.wikimedia.org/r/511816

Change 511816 merged by Marostegui:
[operations/puppet@production] db2090: Make it candidate master for s4

https://gerrit.wikimedia.org/r/511816

Change 511814 merged by jenkins-bot:
[operations/mediawiki-config@master] db-codfw.php: Tackle s4 weights

https://gerrit.wikimedia.org/r/511814

Mentioned in SAL (#wikimedia-operations) [2019-05-22T07:23:38Z] <marostegui> Restart MySQL on db2090 to change binlog format T220170

Mentioned in SAL (#wikimedia-operations) [2019-05-22T07:24:42Z] <marostegui@deploy1001> Synchronized wmf-config/db-codfw.php: Tackle s4 codfw weights T220170 (duration: 01m 06s)

Change 511818 had a related patch set uploaded (by Marostegui; owner: Marostegui):
[operations/mediawiki-config@master] db-codfw.php: Tackle s8 weights

https://gerrit.wikimedia.org/r/511818

Change 511818 merged by jenkins-bot:
[operations/mediawiki-config@master] db-codfw.php: Tackle s8 weights

https://gerrit.wikimedia.org/r/511818

Mentioned in SAL (#wikimedia-operations) [2019-05-22T07:41:18Z] <marostegui@deploy1001> Synchronized wmf-config/db-codfw.php: Tackle s8 codfw weights T220170 (duration: 00m 55s)

Change 512091 had a related patch set uploaded (by Marostegui; owner: Marostegui):
[operations/mediawiki-config@master] db-codfw.php: Depool db2107

https://gerrit.wikimedia.org/r/512091

Change 512092 had a related patch set uploaded (by Marostegui; owner: Marostegui):
[operations/puppet@production] db2107: Change binlog format

https://gerrit.wikimedia.org/r/512092

Change 512092 merged by Marostegui:
[operations/puppet@production] db2107: Change binlog format

https://gerrit.wikimedia.org/r/512092

Change 512091 merged by jenkins-bot:
[operations/mediawiki-config@master] db-codfw.php: Clarify db2017 status

https://gerrit.wikimedia.org/r/512091

Change 512311 had a related patch set uploaded (by Marostegui; owner: Marostegui):
[operations/mediawiki-config@master] db-eqiad,db-codfw.php: Remove db2062

https://gerrit.wikimedia.org/r/512311

Change 512311 merged by jenkins-bot:
[operations/mediawiki-config@master] db-eqiad,db-codfw.php: Remove db2062

https://gerrit.wikimedia.org/r/512311

Mentioned in SAL (#wikimedia-operations) [2019-05-24T05:34:25Z] <marostegui@deploy1001> Synchronized wmf-config/db-eqiad.php: Remove db2062 from config T220170 (duration: 00m 49s)

Mentioned in SAL (#wikimedia-operations) [2019-05-24T05:35:38Z] <marostegui@deploy1001> Synchronized wmf-config/db-codfw.php: Remove db2062 from config T220170 (duration: 00m 48s)

Change 512312 had a related patch set uploaded (by Marostegui; owner: Marostegui):
[operations/puppet@production] mariadb: Move db2062 from s1 to m1

https://gerrit.wikimedia.org/r/512312

Change 512312 merged by Marostegui:
[operations/puppet@production] mariadb: Move db2062 from s1 to m1

https://gerrit.wikimedia.org/r/512312

Script wmf-auto-reimage was launched by marostegui on cumin1001.eqiad.wmnet for hosts:

['db2062.codfw.wmnet']

The log can be found in /var/log/wmf-auto-reimage/201905240550_marostegui_4423.log.

Mentioned in SAL (#wikimedia-operations) [2019-05-24T06:08:18Z] <marostegui@deploy1001> Synchronized wmf-config/db-eqiad.php: More traffic to new hosts T220170 (duration: 00m 48s)

Completed auto-reimage of hosts:

['db2062.codfw.wmnet']

and were ALL successful.

Mentioned in SAL (#wikimedia-operations) [2019-05-24T06:17:14Z] <marostegui> Stop MySQL on db2078:m1 to clone db2062 - T220170

Mentioned in SAL (#wikimedia-operations) [2019-06-03T13:19:03Z] <marostegui> Move db2078:3321 under db2062 T220170

Change 514418 had a related patch set uploaded (by Marostegui; owner: Marostegui):
[operations/puppet@production] db2058,db2065: Enable notifications

https://gerrit.wikimedia.org/r/514418

Change 514418 merged by Marostegui:
[operations/puppet@production] db2058,db2065: Enable notifications

https://gerrit.wikimedia.org/r/514418

Mentioned in SAL (#wikimedia-operations) [2019-06-05T06:17:58Z] <marostegui> Start topology changes on s4 codfw to replace current master db2051 with db2090 - T220170

Change 514423 had a related patch set uploaded (by Marostegui; owner: Marostegui):
[operations/mediawiki-config@master] db-codfw.php: Mimic codfw weights to eqiad

https://gerrit.wikimedia.org/r/514423

Change 514423 merged by jenkins-bot:
[operations/mediawiki-config@master] db-codfw.php: Mimic codfw weights to eqiad

https://gerrit.wikimedia.org/r/514423

Mentioned in SAL (#wikimedia-operations) [2019-06-05T06:25:29Z] <marostegui@deploy1001> Synchronized wmf-config/db-codfw.php: Mimic s4 codfw weights to eqiad T220170 (duration: 00m 55s)

Change 514424 had a related patch set uploaded (by Marostegui; owner: Marostegui):
[operations/puppet@production] mariadb: Promote db2090 to codfw s4 master

https://gerrit.wikimedia.org/r/514424

Change 514424 merged by Marostegui:
[operations/puppet@production] mariadb: Promote db2090 to codfw s4 master

https://gerrit.wikimedia.org/r/514424

Change 514425 had a related patch set uploaded (by Marostegui; owner: Marostegui):
[operations/mediawiki-config@master] db-codfw.php: Promote db2090 to s4 codfw master

https://gerrit.wikimedia.org/r/514425

Change 514426 had a related patch set uploaded (by Marostegui; owner: Marostegui):
[operations/puppet@production] mariadb: db2051 disable notifications

https://gerrit.wikimedia.org/r/514426

Change 514425 merged by jenkins-bot:
[operations/mediawiki-config@master] db-codfw.php: Promote db2090 to s4 codfw master

https://gerrit.wikimedia.org/r/514425

Change 514426 merged by Marostegui:
[operations/puppet@production] mariadb: db2051 disable notifications

https://gerrit.wikimedia.org/r/514426

Mentioned in SAL (#wikimedia-operations) [2019-06-05T06:45:23Z] <marostegui@deploy1001> Synchronized wmf-config/db-codfw.php: Promote db2090 to s4 codfw master T220170 (duration: 00m 54s)

Mentioned in SAL (#wikimedia-operations) [2019-06-05T06:45:29Z] <marostegui> Restart MySQL on db2110 to get the binlog format changed to STATEMENT - T220170

Mentioned in SAL (#wikimedia-operations) [2019-06-07T09:00:40Z] <marostegui> Upgrade x1 codfw hosts in preparation for its failover T220170

I had a chat with @mark and we are considering this Q4 goal done:

I will not close this task, as it will be used for the remaining goals related to this work, which will happen in Q1.

Marostegui updated the task description. Thu, Jun 27, 8:32 AM
Marostegui mentioned this in Unknown Object (Task). Wed, Jul 17, 7:11 AM
Marostegui added a subtask: Unknown Object (Task). Mon, Jul 22, 12:46 PM
Papaul closed subtask Unknown Object (Task) as Resolved. Mon, Jul 22, 4:06 PM