
Productionize es10[35-40]
Closed, Resolved (Public)

Description

These hosts will form the next two RW external store sections, es6 and es7.
es4 and es5 will go RO once this ticket is done.

es6

  • es1038 master
  • es1036
  • es1037
  • Pending dbctl configuration
  • etcd.php changes
  • Set candidate masters

es7

  • es1035 master
  • es1039
  • es1040
  • Pending dbctl configuration
  • etcd.php changes
  • Set candidate masters

Event Timeline

Marostegui changed the task status from Open to Stalled. Jan 18 2024, 6:39 AM
Marostegui moved this task from Triage to Blocked on the DBA board.

Stalling until the hosts are installed (T355269).

Marostegui changed the task status from Stalled to Open. Feb 27 2024, 4:29 PM
Marostegui triaged this task as Medium priority.
Marostegui moved this task from Blocked to Ready on the DBA board.

Just ran:

sudo cumin es10[35-40].eqiad.wmnet 'lvextend -L+1T /dev/mapper/tank-data'

And then

$ sudo cumin es10[35-40].eqiad.wmnet 'xfs_growfs /srv ; df -hT /srv'
6 hosts will be targeted:
es[1035-1040].eqiad.wmnet
OK to proceed on 6 hosts? Enter the number of affected hosts to confirm or "q" to quit: 6
===== NODE GROUP =====
(6) es[1035-1040].eqiad.wmnet
----- OUTPUT of 'xfs_growfs /srv ; df -hT /srv' -----
meta-data=/dev/mapper/tank-data  isize=512    agcount=32, agsize=76294016 blks
         =                       sectsz=512   attr=2, projid32bit=1
         =                       crc=1        finobt=1, sparse=1, rmapbt=0
         =                       reflink=1    bigtime=1 inobtcount=1 nrext64=0
data     =                       bsize=4096   blocks=2441408512, imaxpct=5
         =                       sunit=64     swidth=256 blks
naming   =version 2              bsize=4096   ascii-ci=0, ftype=1
log      =internal log           bsize=4096   blocks=521728, version=2
         =                       sectsz=512   sunit=64 blks, lazy-count=1
realtime =none                   extsz=4096   blocks=0, rtextents=0
data blocks changed from 2441408512 to 2709843968
Filesystem            Type  Size  Used Avail Use% Mounted on
/dev/mapper/tank-data xfs    11T   73G   11T   1% /srv
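As a sanity check on the output above, the block counts reported by xfs_growfs can be converted to sizes. This is a minimal sketch that uses only the numbers printed above (bsize=4096 and the two data block counts); the constant and function names are illustrative, not part of any tool.

```python
# Values copied from the xfs_growfs output above.
BLOCK_SIZE = 4096            # bsize, in bytes
BLOCKS_BEFORE = 2441408512   # data blocks before the grow
BLOCKS_AFTER = 2709843968    # data blocks after the grow

TIB = 1024 ** 4              # bytes in one TiB


def blocks_to_tib(blocks: int, block_size: int = BLOCK_SIZE) -> float:
    """Convert a filesystem block count to TiB."""
    return blocks * block_size / TIB


grown_tib = blocks_to_tib(BLOCKS_AFTER - BLOCKS_BEFORE)
total_tib = blocks_to_tib(BLOCKS_AFTER)

print(f"grown by {grown_tib:.2f} TiB, new size {total_tib:.2f} TiB")
# prints "grown by 1.00 TiB, new size 10.09 TiB"
```

The growth is exactly 1 TiB, matching the -L+1T passed to lvextend, and the new size of roughly 10.09 TiB is in line with the 11T that df reports after rounding.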
Marostegui updated the task description.

Change #1025289 had a related patch set uploaded (by Marostegui; author: Marostegui):

[operations/puppet@production] mariadb: Productionize es6

https://gerrit.wikimedia.org/r/1025289

Change #1025289 merged by Marostegui:

[operations/puppet@production] mariadb: Productionize es6

https://gerrit.wikimedia.org/r/1025289

Change #1025291 had a related patch set uploaded (by Marostegui; author: Marostegui):

[operations/puppet@production] valid_section.pp: Add es6 and es7

https://gerrit.wikimedia.org/r/1025291

Change #1025291 merged by Marostegui:

[operations/puppet@production] valid_section.pp: Add es6 and es7

https://gerrit.wikimedia.org/r/1025291

es6 hosts added to zarcillo.
Also added es6 and es7 as valid sections

Change #1025356 had a related patch set uploaded (by Marostegui; author: Marostegui):

[operations/puppet@production] es1038: Make it es6 master

https://gerrit.wikimedia.org/r/1025356

Change #1025356 merged by Marostegui:

[operations/puppet@production] es1038: Make it es6 master

https://gerrit.wikimedia.org/r/1025356

Replication topology for es6 in eqiad is set up, with heartbeat running.

Change #1025363 had a related patch set uploaded (by Marostegui; author: Marostegui):

[operations/puppet@production] mariadb: Remove comments from es6

https://gerrit.wikimedia.org/r/1025363

Change #1025363 merged by Marostegui:

[operations/puppet@production] mariadb: Remove comments from es6

https://gerrit.wikimedia.org/r/1025363

es6 eqiad is now showing up in orchestrator.

Change #1025603 had a related patch set uploaded (by Marostegui; author: Marostegui):

[operations/puppet@production] sections.yaml: Add es6 as valid dbctl section

https://gerrit.wikimedia.org/r/1025603

Change #1025603 merged by Marostegui:

[operations/puppet@production] sections.yaml: Add es6 as valid dbctl section

https://gerrit.wikimedia.org/r/1025603

Change #1025670 had a related patch set uploaded (by Marostegui; author: Marostegui):

[operations/mediawiki-config@master] etcd.php: Add es6

https://gerrit.wikimedia.org/r/1025670

Change #1025684 had a related patch set uploaded (by Marostegui; author: Marostegui):

[operations/puppet@production] mariadb: Add es7 eqiad servers

https://gerrit.wikimedia.org/r/1025684

Change #1025684 merged by Marostegui:

[operations/puppet@production] mariadb: Add es7 eqiad servers

https://gerrit.wikimedia.org/r/1025684

Change #1025685 had a related patch set uploaded (by Marostegui; author: Marostegui):

[operations/puppet@production] mariadb: Set up eqiad es7 hosts

https://gerrit.wikimedia.org/r/1025685

Change #1025685 merged by Marostegui:

[operations/puppet@production] mariadb: Set up eqiad es7 hosts

https://gerrit.wikimedia.org/r/1025685

Change #1025689 had a related patch set uploaded (by Marostegui; author: Marostegui):

[operations/puppet@production] es1035: Make it es7 master

https://gerrit.wikimedia.org/r/1025689

Change #1025689 merged by Marostegui:

[operations/puppet@production] es1035: Make it es7 master

https://gerrit.wikimedia.org/r/1025689

Change #1025670 merged by jenkins-bot:

[operations/mediawiki-config@master] etcd.php: Add es6

https://gerrit.wikimedia.org/r/1025670

Mentioned in SAL (#wikimedia-operations) [2024-04-30T09:14:59Z] <marostegui@deploy1002> Started scap: Backport for [[gerrit:1025670|etcd.php: Add es6 (T355285 T355424)]]

Mentioned in SAL (#wikimedia-operations) [2024-04-30T09:17:48Z] <marostegui@deploy1002> marostegui: Backport for [[gerrit:1025670|etcd.php: Add es6 (T355285 T355424)]] synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)

Mentioned in SAL (#wikimedia-operations) [2024-04-30T09:30:01Z] <marostegui@deploy1002> Finished scap: Backport for [[gerrit:1025670|etcd.php: Add es6 (T355285 T355424)]] (duration: 15m 01s)

Change #1025699 had a related patch set uploaded (by Marostegui; author: Marostegui):

[operations/puppet@production] profile: Add es6 to the regex of valid sections

https://gerrit.wikimedia.org/r/1025699

Change #1025699 merged by Marostegui:

[operations/puppet@production] profile: Add es6 to the regex of valid sections

https://gerrit.wikimedia.org/r/1025699

Mentioned in SAL (#wikimedia-operations) [2024-04-30T09:46:37Z] <marostegui@cumin1002> dbctl commit (dc=all): 'Push es6 eqiad section T355285', diff saved to https://phabricator.wikimedia.org/P61486 and previous config saved to /var/cache/conftool/dbconfig/20240430-094635-marostegui.json

Change #1025701 had a related patch set uploaded (by Marostegui; author: Marostegui):

[operations/puppet@production] es6 eqiad: Enable notifications

https://gerrit.wikimedia.org/r/1025701

Change #1025701 merged by Marostegui:

[operations/puppet@production] es6 eqiad: Enable notifications

https://gerrit.wikimedia.org/r/1025701

Mentioned in SAL (#wikimedia-operations) [2024-05-01T05:33:34Z] <marostegui@cumin1002> START - Cookbook sre.hosts.downtime for 7 days, 0:00:00 on 6 hosts with reason: Setting up T355285 T355424

Mentioned in SAL (#wikimedia-operations) [2024-05-01T05:33:51Z] <marostegui@cumin1002> END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 7 days, 0:00:00 on 6 hosts with reason: Setting up T355285 T355424

Mentioned in SAL (#wikimedia-operations) [2024-05-01T05:33:56Z] <marostegui@cumin1002> START - Cookbook sre.hosts.downtime for 7 days, 0:00:00 on es[1035,1039-1040].eqiad.wmnet with reason: Setting up T355285 T355424

Mentioned in SAL (#wikimedia-operations) [2024-05-01T05:34:11Z] <marostegui@cumin1002> END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 7 days, 0:00:00 on es[1035,1039-1040].eqiad.wmnet with reason: Setting up T355285 T355424
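The host expressions used in this task (es10[35-40], es[1035-1040] and es[1035,1039-1040]) all use bracket range notation. The following is a minimal sketch of how such an expression expands into hostnames. It is a standalone helper, not cumin's actual parser; expand_hosts is a hypothetical name, and it only handles a single bracket group containing numbers and ranges.

```python
import re


def expand_hosts(expr: str) -> list[str]:
    """Expand one bracket group, e.g. 'es10[35-40].eqiad.wmnet'.

    Hypothetical helper for illustration only: supports a single [...] group
    holding comma-separated numbers or start-end ranges.
    """
    m = re.fullmatch(r"([^\[\]]*)\[([0-9,\-]+)\]([^\[\]]*)", expr)
    if m is None:
        # No bracket group: treat the expression as a single hostname.
        return [expr]
    prefix, body, suffix = m.groups()
    hosts = []
    for part in body.split(","):
        start, _, end = part.partition("-")
        end = end or start          # a lone number is a range of one
        width = len(start)          # keep zero padding, if any
        for n in range(int(start), int(end) + 1):
            hosts.append(f"{prefix}{n:0{width}d}{suffix}")
    return hosts


# The two spellings used for all six hosts expand to the same list.
all_hosts = expand_hosts("es10[35-40].eqiad.wmnet")
same_hosts = expand_hosts("es[1035-1040].eqiad.wmnet")

# The es7 subset targeted by the second downtime entry.
es7_hosts = expand_hosts("es[1035,1039-1040].eqiad.wmnet")
print(es7_hosts)
```

The last expression expands to es1035, es1039 and es1040, which are the three es7 hosts listed in the task description.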

es7 eqiad hosts now in orchestrator.

Change #1025902 had a related patch set uploaded (by Marostegui; author: Marostegui):

[operations/puppet@production] conftool: Add es7 as valid section

https://gerrit.wikimedia.org/r/1025902

Change #1025902 merged by Marostegui:

[operations/puppet@production] conftool: Add es7 as valid section

https://gerrit.wikimedia.org/r/1025902

dbctl section (not config per host) created for es7

Change #1026089 had a related patch set uploaded (by Marostegui; author: Marostegui):

[operations/puppet@production] instances.yaml: Add es7 eqiad hosts

https://gerrit.wikimedia.org/r/1026089

Change #1026089 merged by Marostegui:

[operations/puppet@production] instances.yaml: Add es7 eqiad hosts

https://gerrit.wikimedia.org/r/1026089

Mentioned in SAL (#wikimedia-operations) [2024-05-01T08:31:20Z] <marostegui@cumin1002> dbctl commit (dc=all): 'Push es7 eqiad config T355285', diff saved to https://phabricator.wikimedia.org/P61551 and previous config saved to /var/cache/conftool/dbconfig/20240501-083120-marostegui.json

I have pushed the es7 eqiad dbctl config to production:

--- eqiad/externalLoads/es7 live
+++ eqiad/externalLoads/es7 generated
@@ -1 +1,9 @@
-{}
+[
+  {
+    "es1035": 10
+  },
+  {
+    "es1039": 100,
+    "es1040": 100
+  }
+]
--- eqiad/hostsByName live
+++ eqiad/hostsByName generated
@@ -96,7 +96,10 @@
   "es1032": "10.64.32.24",
   "es1033": "10.64.48.7",
   "es1034": "10.64.48.8",
+  "es1035": "10.64.152.4",
   "es1036": "10.64.154.3",
   "es1037": "10.64.156.3",
-  "es1038": "10.64.160.4"
+  "es1038": "10.64.160.4",
+  "es1039": "10.64.162.3",
+  "es1040": "10.64.164.3"
 }
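The externalLoads value in the diff above is a two-element list of host-to-weight maps. The sketch below reads it under the assumption that the first map holds the master and the second holds the replicas, which is consistent with the task description listing es1035 as the es7 master. describe_section is a hypothetical helper name, not part of dbctl.

```python
import json

# externalLoads value for es7, copied from the "generated" side of the diff above.
ES7_LOADS = json.loads("""
[
  {"es1035": 10},
  {"es1039": 100, "es1040": 100}
]
""")


def describe_section(loads):
    """Split a [master_map, replica_map] pair into (master, replica weights).

    Assumption (not confirmed by the task itself): the first map contains
    exactly one host, the master; the second contains the replicas.
    """
    master_map, replica_map = loads
    if len(master_map) != 1:
        raise ValueError("expected exactly one master")
    (master,) = master_map      # unpack the single key
    return master, dict(replica_map)


master, replicas = describe_section(ES7_LOADS)
print(master, sorted(replicas))
# prints "es1035 ['es1039', 'es1040']"
```

Under that reading, es1035 carries a weight of 10 while es1039 and es1040 each carry 100.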

Change #1026096 had a related patch set uploaded (by Marostegui; author: Marostegui):

[operations/mediawiki-config@master] etcd.php: Add es7

https://gerrit.wikimedia.org/r/1026096

Change #1026096 merged by jenkins-bot:

[operations/mediawiki-config@master] etcd.php: Add es7

https://gerrit.wikimedia.org/r/1026096

Mentioned in SAL (#wikimedia-operations) [2024-05-01T09:27:32Z] <marostegui@deploy1002> Started scap: Backport for [[gerrit:1026096|etcd.php: Add es7 (T355285 T355424)]]

Mentioned in SAL (#wikimedia-operations) [2024-05-01T09:30:18Z] <marostegui@deploy1002> marostegui: Backport for [[gerrit:1026096|etcd.php: Add es7 (T355285 T355424)]] synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)

Mentioned in SAL (#wikimedia-operations) [2024-05-01T09:42:26Z] <marostegui@deploy1002> Finished scap: Backport for [[gerrit:1026096|etcd.php: Add es7 (T355285 T355424)]] (duration: 14m 53s)

@Ladsgroup can you create the tables on es7 as well? Thanks!

es1037 and es1039 set as candidate masters on dbctl and orchestrator

This is done. Notifications still need to be enabled before the hosts are ready to start receiving traffic; that is tracked at T364446: Enable writes on es6 and es7.