Productionize 11 new eqiad database servers
Open, Normal, Public

Description

Like T170662, this is less a task than a tracking list, so that we do not "lose" servers that are not yet in production and can coordinate how to set them up.

These servers (T162233) have to be used to:

  • Decommission servers numbered below db1051
  • Set up misc servers appropriately
  • Set up the eventual s8

State:

  • db1096: provisioned on s5 rc service (and will later serve s8)
  • db1097: provisioned and serving s4
  • db1098: provisioned and serving s6
  • db1099: provisioned on s5 rc service (and will later serve s8)
  • db1100: provisioned on s5 (cloned from old master db1049 - and will later serve s8)
  • db1101: provisioned on s2
  • db1102: temporarily used as sanitarium3 - T169510
  • db1103:
  • db1104:
  • db1105:
  • db1106:
Restricted Application added a subscriber: Aklapper. Aug 7 2017, 9:36 AM
jcrespo triaged this task as Normal priority. Aug 7 2017, 9:36 AM

Change 370447 had a related patch set uploaded (by Jcrespo; owner: Jcrespo):
[operations/puppet@production] mariadb: Pool db1098 as new s6 recentchanges/watchlist host

https://gerrit.wikimedia.org/r/370447

Marostegui moved this task from Triage to Meta/Epic on the DBA board. Aug 7 2017, 9:44 AM

Change 370447 merged by Jcrespo:
[operations/puppet@production] mariadb: Pool db1098 as new s6 recentchanges/watchlist host

https://gerrit.wikimedia.org/r/370447

Change 370465 had a related patch set uploaded (by Jcrespo; owner: Jcrespo):
[operations/software@master] mariadb: Add db1098 to the list of available s6 hosts

https://gerrit.wikimedia.org/r/370465

Change 370465 merged by Jcrespo:
[operations/software@master] mariadb: Add db1098 to the list of available s6 hosts

https://gerrit.wikimedia.org/r/370465

Change 370480 had a related patch set uploaded (by Jcrespo; owner: Jcrespo):
[operations/mediawiki-config@master] mariadb: Add db1098 as new s6 recentchanges/watchlist/... replica

https://gerrit.wikimedia.org/r/370480

Setup the eventual s8

for which wikis would that be, and is there a task for that / for the decision?

There is not yet a specific task for that; it will probably be created as a subtask of this one. Work on it has not started and probably will not start until October.

Based on dewiki user complaints, the rate of edits, the predicted growth, the time needed for dumps and the performance characteristics of a non-sharded MySQL setup (reads are easy to scale, writes are not), the s8 hardware expansion was requested mainly for wikidatawiki, which should have a dedicated shard, the same way commons has s4. You can see on this graph that, aside from s4 (which already has a dedicated setup), s5 is where the most bots and edits are, and where problems most typically occur: https://grafana.wikimedia.org/dashboard/db/mysql-aggregated?panelId=7&fullscreen&orgId=1&from=now-7d&to=now&var-dc=eqiad%20prometheus%2Fops&var-group=core&var-shard=All&var-role=master Of course, dewiki will also benefit from not sharing resources with such a large wiki, and will be more reliable. That may in turn cause a chain reaction in which the largest wikis in s3 are given more resources and more wikis are moved around, so as not to leave shards unbalanced.

Change 370480 merged by Jcrespo:
[operations/mediawiki-config@master] mariadb: Add db1098 as new s6 recentchanges/watchlist/... replica

https://gerrit.wikimedia.org/r/370480

@jcrespo Thank you very much for the detailed answer :)

doctaxon added a subscriber: doctaxon.
jcrespo updated the task description. Aug 24 2017, 11:41 AM
Marostegui updated the task description. Aug 24 2017, 11:43 AM

Change 373525 had a related patch set uploaded (by Marostegui; owner: Marostegui):
[operations/puppet@production] mariadb: Add db1096 as s5 slave

https://gerrit.wikimedia.org/r/373525

Change 373525 merged by Marostegui:
[operations/puppet@production] mariadb: Add db1096 as s5 slave

https://gerrit.wikimedia.org/r/373525

Script wmf_auto_reimage was launched by marostegui on neodymium.eqiad.wmnet for hosts:

['db1096.eqiad.wmnet']

The log can be found in /var/log/wmf-auto-reimage/201708241159_marostegui_24955.log.

Completed auto-reimage of hosts:

['db1096.eqiad.wmnet']

and were ALL successful.

Change 373528 had a related patch set uploaded (by Marostegui; owner: Marostegui):
[operations/mediawiki-config@master] db-eqiad.php: Depool db1026

https://gerrit.wikimedia.org/r/373528

Change 373529 had a related patch set uploaded (by Marostegui; owner: Marostegui):
[operations/software@master] s5.hosts: Add db1096

https://gerrit.wikimedia.org/r/373529

Change 373528 merged by jenkins-bot:
[operations/mediawiki-config@master] db-eqiad.php: Depool db1026

https://gerrit.wikimedia.org/r/373528

Mentioned in SAL (#wikimedia-operations) [2017-08-24T12:29:28Z] <marostegui@tin> Synchronized wmf-config/db-eqiad.php: Depool db1026 to clone db1096 from it T172679 (duration: 00m 47s)

Change 373529 merged by jenkins-bot:
[operations/software@master] s5.hosts: Add db1096

https://gerrit.wikimedia.org/r/373529

Mentioned in SAL (#wikimedia-operations) [2017-08-24T12:30:26Z] <marostegui> Stop MySQL on db1026 to clone db1096 from it - T172679

Change 373543 had a related patch set uploaded (by Marostegui; owner: Marostegui):
[operations/mediawiki-config@master] db-eqiad.php: Add db1096 to s5

https://gerrit.wikimedia.org/r/373543

Change 373543 merged by jenkins-bot:
[operations/mediawiki-config@master] db-eqiad.php: Add db1096 to s5

https://gerrit.wikimedia.org/r/373543

Mentioned in SAL (#wikimedia-operations) [2017-08-24T15:17:38Z] <marostegui@tin> Synchronized wmf-config/db-eqiad.php: Add db1096 depooled to s5 - T172679 (duration: 00m 47s)

Change 373569 had a related patch set uploaded (by Marostegui; owner: Marostegui):
[operations/mediawiki-config@master] db-eqiad.php: Give some weight to db1096

https://gerrit.wikimedia.org/r/373569

Change 373569 merged by jenkins-bot:
[operations/mediawiki-config@master] db-eqiad.php: Give some weight to db1096

https://gerrit.wikimedia.org/r/373569

Mentioned in SAL (#wikimedia-operations) [2017-08-24T15:58:22Z] <marostegui@tin> Synchronized wmf-config/db-eqiad.php: Give some weight to db1096 - T172679 (duration: 00m 47s)

Change 373594 had a related patch set uploaded (by Marostegui; owner: Marostegui):
[operations/mediawiki-config@master] db-eqiad.php: Add weight to db1096

https://gerrit.wikimedia.org/r/373594

Change 373594 merged by jenkins-bot:
[operations/mediawiki-config@master] db-eqiad.php: Add weight to db1096

https://gerrit.wikimedia.org/r/373594

Mentioned in SAL (#wikimedia-operations) [2017-08-24T16:35:01Z] <marostegui@tin> Synchronized wmf-config/db-eqiad.php: Give some more weight to db1096 - T172679 (duration: 00m 47s)

Mentioned in SAL (#wikimedia-operations) [2017-08-24T16:43:03Z] <marostegui@tin> Synchronized wmf-config/db-eqiad.php: Give some more weight to db1096 - T172679 (duration: 00m 47s)

Marostegui updated the task description. Aug 24 2017, 4:52 PM

Mentioned in SAL (#wikimedia-operations) [2017-08-24T16:52:56Z] <marostegui@tin> Synchronized wmf-config/db-eqiad.php: Give normal weight to db1096 - T172679 (duration: 00m 47s)
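
The SAL entries above record db1096 being added to s5 depooled, then given some weight, more weight, and finally its normal weight, with one config sync per step. A minimal sketch of that kind of stepwise ramp follows. It is only an illustration: the function names, the fractions, the starting value of 0 and the normal weight of 300 are assumptions, not values taken from wmf-config/db-eqiad.php or from this task.

```python
# Hypothetical illustration of a stepwise weight ramp for a new replica.
# The real weights live in wmf-config/db-eqiad.php and are not recorded here.
from typing import Dict, List


def ramp_schedule(normal_weight: int, fractions: List[float]) -> List[int]:
    """Return the weights to deploy one at a time, ending at normal_weight."""
    steps = [max(1, round(normal_weight * f)) for f in fractions]
    if not steps or steps[-1] != normal_weight:
        steps.append(normal_weight)
    return steps


def apply_step(section_loads: Dict[str, int], host: str, weight: int) -> Dict[str, int]:
    """Return a copy of a section's host -> weight map with one host updated."""
    updated = dict(section_loads)
    updated[host] = weight
    return updated


# Assumed example values, not taken from production.
s5 = {"db1096": 0}
for w in ramp_schedule(normal_weight=300, fractions=[0.1, 0.5]):
    # In practice each step would be a reviewed config change plus a sync,
    # with time in between to watch the host under load.
    s5 = apply_step(s5, "db1096", w)
    print(s5)
```

Each intermediate weight gives the freshly cloned host a chance to warm up under partial traffic before it takes its full share.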

Marostegui updated the task description. Mon, Aug 28, 7:28 AM

Change 374136 had a related patch set uploaded (by Marostegui; owner: Marostegui):
[operations/mediawiki-config@master] db-eqiad.php: Depool db1045

https://gerrit.wikimedia.org/r/374136

Script wmf_auto_reimage was launched by marostegui on neodymium.eqiad.wmnet for hosts:

['db1099.eqiad.wmnet']

The log can be found in /var/log/wmf-auto-reimage/201708280734_marostegui_1789.log.

Change 374138 had a related patch set uploaded (by Marostegui; owner: Marostegui):
[operations/puppet@production] mariadb: Add db1099 to s5

https://gerrit.wikimedia.org/r/374138

Completed auto-reimage of hosts:

['db1099.eqiad.wmnet']

and were ALL successful.

Change 374138 merged by Marostegui:
[operations/puppet@production] mariadb: Add db1099 to s5

https://gerrit.wikimedia.org/r/374138

Change 374136 merged by jenkins-bot:
[operations/mediawiki-config@master] db-eqiad.php: Depool db1045

https://gerrit.wikimedia.org/r/374136

Mentioned in SAL (#wikimedia-operations) [2017-08-28T08:04:53Z] <marostegui@tin> Synchronized wmf-config/db-eqiad.php: Depool db1045 to clone db1099 from it - T172679 (duration: 00m 46s)

Mentioned in SAL (#wikimedia-operations) [2017-08-28T08:10:46Z] <marostegui> Stop MySQL on db1045 - T172679

Change 374311 had a related patch set uploaded (by Marostegui; owner: Marostegui):
[operations/software@master] s5.hosts: Add db1099 to s5

https://gerrit.wikimedia.org/r/374311

Change 374311 merged by jenkins-bot:
[operations/software@master] s5.hosts: Add db1099 to s5

https://gerrit.wikimedia.org/r/374311

Change 374312 had a related patch set uploaded (by Marostegui; owner: Marostegui):
[operations/mediawiki-config@master] db-eqiad.php: Add db1099 to s5

https://gerrit.wikimedia.org/r/374312

Change 374312 merged by jenkins-bot:
[operations/mediawiki-config@master] db-eqiad.php: Add db1099 to s5

https://gerrit.wikimedia.org/r/374312

Mentioned in SAL (#wikimedia-operations) [2017-08-28T12:03:41Z] <marostegui@tin> Synchronized wmf-config/db-eqiad.php: Add db1099 pooled with low weight to s5 - T172679 (duration: 00m 45s)

Marostegui updated the task description. Mon, Aug 28, 12:10 PM

Mentioned in SAL (#wikimedia-operations) [2017-08-28T12:27:07Z] <marostegui@tin> Synchronized wmf-config/db-eqiad.php: Add more weight to db1099 on s5 - T172679 (duration: 00m 44s)

Mentioned in SAL (#wikimedia-operations) [2017-08-28T12:53:33Z] <marostegui@tin> Synchronized wmf-config/db-eqiad.php: Add more weight to db1099 on s5 - T172679 (duration: 00m 44s)

Change 374318 had a related patch set uploaded (by Marostegui; owner: Marostegui):
[operations/mediawiki-config@master] db-eqiad.php: Fully pool db1099

https://gerrit.wikimedia.org/r/374318

Change 374318 merged by jenkins-bot:
[operations/mediawiki-config@master] db-eqiad.php: Fully pool db1099

https://gerrit.wikimedia.org/r/374318

Mentioned in SAL (#wikimedia-operations) [2017-08-28T14:34:48Z] <marostegui@tin> Synchronized wmf-config/db-eqiad.php: Pool db1099 with normal weight on s5 - T172679 (duration: 00m 44s)

Script wmf_auto_reimage was launched by marostegui on neodymium.eqiad.wmnet for hosts:

['db1101.eqiad.wmnet']

The log can be found in /var/log/wmf-auto-reimage/201709011503_marostegui_24235.log.

Completed auto-reimage of hosts:

['db1101.eqiad.wmnet']

and were ALL successful.

Marostegui updated the task description. Fri, Sep 1, 3:29 PM
Marostegui updated the task description. Fri, Sep 1, 3:39 PM
Marostegui updated the task description. Tue, Sep 5, 5:39 PM

Change 376178 had a related patch set uploaded (by Marostegui; owner: Marostegui):
[operations/puppet@production] mariadb: Add db1100 to s5

https://gerrit.wikimedia.org/r/376178

Change 376179 had a related patch set uploaded (by Marostegui; owner: Marostegui):
[operations/software@master] s5.hosts: Add db1100

https://gerrit.wikimedia.org/r/376179

Change 376179 merged by jenkins-bot:
[operations/software@master] s5.hosts: Add db1100

https://gerrit.wikimedia.org/r/376179

Change 376178 merged by Marostegui:
[operations/puppet@production] mariadb: Add db1100 to s5

https://gerrit.wikimedia.org/r/376178

Mentioned in SAL (#wikimedia-operations) [2017-09-06T07:15:04Z] <marostegui> Stop MySQL on db1049 to copy its content to db1100 - https://phabricator.wikimedia.org/T172679

Change 376481 had a related patch set uploaded (by Marostegui; owner: Marostegui):
[operations/mediawiki-config@master] db-eqiad,db-codfw.php: Add db1100

https://gerrit.wikimedia.org/r/376481

Change 376481 merged by jenkins-bot:
[operations/mediawiki-config@master] db-eqiad,db-codfw.php: Add db1100

https://gerrit.wikimedia.org/r/376481

Mentioned in SAL (#wikimedia-operations) [2017-09-07T07:53:58Z] <marostegui@tin> Synchronized wmf-config/db-codfw.php: Add db1100 - T172679 (duration: 00m 49s)

Mentioned in SAL (#wikimedia-operations) [2017-09-07T07:54:51Z] <marostegui@tin> Synchronized wmf-config/db-eqiad.php: Add db1100 - T172679 (duration: 00m 48s)

Change 376484 had a related patch set uploaded (by Marostegui; owner: Marostegui):
[operations/mediawiki-config@master] db-eqiad.php: Add db1100 to s5

https://gerrit.wikimedia.org/r/376484

Change 376484 merged by jenkins-bot:
[operations/mediawiki-config@master] db-eqiad.php: Add db1100 to s5 depooled

https://gerrit.wikimedia.org/r/376484

Mentioned in SAL (#wikimedia-operations) [2017-09-07T09:15:57Z] <marostegui@tin> Synchronized wmf-config/db-eqiad.php: Add db1100 depooled to s5 array - T172679 (duration: 00m 49s)

Marostegui updated the task description. Thu, Sep 7, 9:16 AM

Change 376499 had a related patch set uploaded (by Marostegui; owner: Marostegui):
[operations/mediawiki-config@master] db-eqiad.php: Pool db1100 with weight 0

https://gerrit.wikimedia.org/r/376499

Change 376499 merged by jenkins-bot:
[operations/mediawiki-config@master] db-eqiad.php: Pool db1100 with weight 0

https://gerrit.wikimedia.org/r/376499

Mentioned in SAL (#wikimedia-operations) [2017-09-07T11:24:38Z] <marostegui@tin> Synchronized wmf-config/db-eqiad.php: Pool db1100 with 0 weight - T172679 (duration: 00m 49s)

Change 378859 had a related patch set uploaded (by Jcrespo; owner: Jcrespo):
[operations/puppet@production] mariadb: Setup db1101 on s2 to replace db1018 and db1036

https://gerrit.wikimedia.org/r/378859

jcrespo updated the task description. Tue, Sep 19, 9:51 AM

Change 378859 merged by Jcrespo:
[operations/puppet@production] mariadb: Setup db1101 on s2 to replace db1018 and db1036

https://gerrit.wikimedia.org/r/378859

Change 378914 had a related patch set uploaded (by Jcrespo; owner: Jcrespo):
[operations/mediawiki-config@master] mariadb: Pool db1101 as new s2 host

https://gerrit.wikimedia.org/r/378914

Change 378916 had a related patch set uploaded (by Jcrespo; owner: Jcrespo):
[operations/software@master] dbtools: Add db1101 to s2

https://gerrit.wikimedia.org/r/378916

Change 378916 merged by Jcrespo:
[operations/software@master] dbtools: Add db1101 to s2

https://gerrit.wikimedia.org/r/378916

Change 378914 merged by Jcrespo:
[operations/mediawiki-config@master] mariadb: Pool db1101 as new s2 host

https://gerrit.wikimedia.org/r/378914

Change 378962 had a related patch set uploaded (by Jcrespo; owner: Jcrespo):
[operations/puppet@production] icinga: Disable notifications on db2078, enable them on db1101

https://gerrit.wikimedia.org/r/378962

Change 378962 merged by Jcrespo:
[operations/puppet@production] icinga: Disable notifications on db2078, enable them on db1101

https://gerrit.wikimedia.org/r/378962