Change rack for servers in s1 in codfw
Closed, Resolved · Public

Description

Hello,

Before the DC switchover, we should take advantage of the fact that it is currently easier to move servers around, and move 3 servers within s1.

The following servers should be moved out of C6 and D6. I have selected destination racks, but @Papaul, let me know whether this is doable:

db2034 C6 -> A5
db2062 D6 -> B5
db2070 D6 -> C5

I have talked to Faidon to see if there was any network restriction in codfw for DBs, and he mentioned that apart from racks X2 and X7 we should be good to go.

The idea behind this is to avoid a repeat of T155875, where we had lots of databases from the same shard within the same rack.
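
As a rough illustration of the concern, here is a minimal Python sketch that flags racks holding more than one server of the same shard. It is not part of any existing tooling: the function name and the data layout are hypothetical, and the data covers only the three hosts named in this task, with the racks listed above.

```python
from collections import defaultdict


def racks_with_shared_shard(placements, limit=1):
    """Return {(shard, rack): [hosts]} for racks holding more than `limit` hosts of a shard.

    `placements` maps a host name to a (shard, rack) tuple (hypothetical layout).
    """
    by_shard_rack = defaultdict(list)
    for host, (shard, rack) in placements.items():
        by_shard_rack[(shard, rack)].append(host)
    return {key: sorted(hosts) for key, hosts in by_shard_rack.items() if len(hosts) > limit}


# Racks before and after the proposed move, as listed in this task.
before = {"db2034": ("s1", "C6"), "db2062": ("s1", "D6"), "db2070": ("s1", "D6")}
after = {"db2034": ("s1", "A5"), "db2062": ("s1", "B5"), "db2070": ("s1", "C5")}

print(racks_with_shared_shard(before))  # {('s1', 'D6'): ['db2062', 'db2070']}
print(racks_with_shared_shard(after))   # {}
```

With the proposed destinations, no two of these three s1 hosts share a rack.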

Restricted Application added a project: Operations. Jan 27 2017, 12:41 PM
Restricted Application added a subscriber: Aklapper.
Restricted Application added a project: Operations. Jan 27 2017, 12:42 PM

@Marostegui yes it is doable.

Thanks, let's do it next week!

Marostegui renamed this task from Change rack for servers in s1 to Change rack for servers in s1 in codfw. Jan 30 2017, 3:01 PM

Change 335048 had a related patch set uploaded (by Marostegui):
db-codfw.php: Depool db2034

https://gerrit.wikimedia.org/r/335048

Change 335048 merged by jenkins-bot:
db-codfw.php: Depool db2034

https://gerrit.wikimedia.org/r/335048

Mentioned in SAL (#wikimedia-operations) [2017-01-30T16:40:59Z] <marostegui@tin> Synchronized wmf-config/db-codfw.php: Depool db2034 for maintenance - T156478 (duration: 00m 40s)

Mentioned in SAL (#wikimedia-operations) [2017-01-30T17:05:19Z] <marostegui> Shutdown mysql and poweroff db2034 for maintenance - T156478

Change 335053 had a related patch set uploaded (by Marostegui):
db-codfw,db-eqiad.php: Update db2034 IP

https://gerrit.wikimedia.org/r/335053

Papaul added a subscriber: RobH. Jan 30 2017, 5:58 PM

@RobH we are about to move db2034 from row C, rack C6 to row A, rack A5. If you have time, could you please make some changes on both switches?

old port configuration: ge-6/0/1
new port configuration: ge-5/0/32

Thanks.

Change 335053 merged by jenkins-bot:
db-codfw,db-eqiad.php: Update db2034 IP

https://gerrit.wikimedia.org/r/335053

Mentioned in SAL (#wikimedia-operations) [2017-01-30T18:04:14Z] <marostegui@tin> Synchronized wmf-config/db-codfw.php: Change db2034 IP - T156478 (duration: 00m 40s)

Mentioned in SAL (#wikimedia-operations) [2017-01-30T18:05:04Z] <marostegui@tin> Synchronized wmf-config/db-eqiad.php: Change db2034 IP - T156478 (duration: 00m 40s)

RobH added a comment. Jan 30 2017, 6:08 PM

I've left the old port asw-c-codfw:ge-6/0/1 alone until AFTER the server is moved.

The new port asw-a-codfw:ge-5/0/32 is now ready for use (description set, enabled, internal vlan set).

@Marostegui server is now in A5. Just waiting for https://gerrit.wikimedia.org/r/#/c/335054/ to be merged.

Merged. Virtual console is busy (I assume by yourself), so I do not have visibility of the state of the server right now.

Marostegui added a comment. Edited Jan 31 2017, 6:57 AM

Thanks @RobH, @jcrespo and @Papaul. The server looked good when I checked it last night :-)

Change 335190 had a related patch set uploaded (by Marostegui):
db-codfw.php: Repool db2034

https://gerrit.wikimedia.org/r/335190

Change 335190 merged by jenkins-bot:
db-codfw.php: Repool db2034

https://gerrit.wikimedia.org/r/335190

Mentioned in SAL (#wikimedia-operations) [2017-01-31T07:10:11Z] <marostegui@tin> Synchronized wmf-config/db-codfw.php: Repool db2034 - T156478 (duration: 00m 57s)

jcrespo moved this task from Triage to Backlog on the DBA board. Feb 3 2017, 4:51 PM

db2034 is done; I think db2062 and db2070 are still pending.

@Papaul do you think we can do db2062 sometime this week? Thanks!

We can do it today.

Awesome, I will depool it and get it ready to be moved.
Thanks!

Change 337592 had a related patch set uploaded (by Marostegui):
db-codfw.php: Depool db2062

https://gerrit.wikimedia.org/r/337592

Change 337592 merged by jenkins-bot:
db-codfw.php: Depool db2062

https://gerrit.wikimedia.org/r/337592

Mentioned in SAL (#wikimedia-operations) [2017-02-14T15:04:29Z] <marostegui@tin> Synchronized wmf-config/db-codfw.php: Depool db2062 to change its rack - T156478 (duration: 00m 41s)

Mentioned in SAL (#wikimedia-operations) [2017-02-14T15:05:29Z] <marostegui> Shutdown mysql (and later the whole host) on db2062 for maintenance - T156478

Change 337597 had a related patch set uploaded (by Marostegui):
db-codfw,db-eqiad.php: Change db2062 IP

https://gerrit.wikimedia.org/r/337597

@RobH we are about to move db2062 from row D, rack D6 to row B, rack B5. If you have time, could you please make some changes on both switches?

old port configuration: ge-6/0/10
new port configuration: ge-5/0/40

Thanks.

Change 337597 merged by jenkins-bot:
db-codfw,db-eqiad.php: Change db2062 IP

https://gerrit.wikimedia.org/r/337597

Mentioned in SAL (#wikimedia-operations) [2017-02-14T16:01:33Z] <marostegui@tin> Synchronized wmf-config/db-codfw.php: Change db2062 IP after its move to another rack - T156478 (duration: 00m 40s)

Mentioned in SAL (#wikimedia-operations) [2017-02-14T16:02:26Z] <marostegui@tin> Synchronized wmf-config/db-eqiad.php: Change db2062 IP after its move to another rack - T156478 (duration: 00m 39s)

RobH added a comment. Feb 14 2017, 6:00 PM

OK, as discussed on IRC, I've set up the port asw-b-codfw:ge-5/0/40 with the description of db2062, and placed it in the internal vlan for that row.

Once this is moved, please create a network sub-task to clear the configuration off of asw-d-codfw:ge-6/0/10. This ensures it doesn't get neglected in the cleanup. You can feel free to assign that followup cleanup task to me.

db2062 has been moved to B5
DNS updated
db-eqiad,codfw files updated
mysql started
replication started and server catching up
tendril updated
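
The sequence followed for each host in this task can be summarised as an ordered checklist. The sketch below is only an illustration assembled from the comments and SAL entries here, roughly in the order they occurred. The step names and the `next_step` helper are hypothetical, not an existing script.

```python
# Steps per host, roughly in the order seen in this task (names are illustrative).
MOVE_STEPS = [
    "depool in db-codfw.php",
    "stop mysql and power off host",
    "update IP in db-codfw.php and db-eqiad.php",
    "configure new switch port",
    "physically move server",
    "update DNS",
    "start mysql and replication",
    "update tendril",
    "repool in db-codfw.php",
]


def next_step(done):
    """Return the first step not yet in `done`, or None when all steps are complete."""
    for step in MOVE_STEPS:
        if step not in done:
            return step
    return None


# Example: a host that has been depooled and powered off.
print(next_step(MOVE_STEPS[:2]))  # update IP in db-codfw.php and db-eqiad.php
```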

Thanks @Papaul and @RobH!

Change 337777 had a related patch set uploaded (by Marostegui):
db-codfw.php: Repool db2062

https://gerrit.wikimedia.org/r/337777

Change 337777 merged by jenkins-bot:
db-codfw.php: Repool db2062

https://gerrit.wikimedia.org/r/337777

Mentioned in SAL (#wikimedia-operations) [2017-02-15T07:27:47Z] <marostegui@tin> Synchronized wmf-config/db-codfw.php: Repool db2062 - T156478 (duration: 00m 42s)

Mentioned in SAL (#wikimedia-operations) [2017-02-15T18:30:37Z] <marostegui> Stop MySQL and shutdown db2062 for maintenance - T156478

@Marostegui here is the new ip address we will use for db2070: 10.192.32.5

Thanks @Papaul! I will get that ready!

Change 338083 had a related patch set uploaded (by Marostegui):
db-codfw.php: Depool db2070

https://gerrit.wikimedia.org/r/338083

Change 338083 merged by jenkins-bot:
db-codfw.php: Depool db2070

https://gerrit.wikimedia.org/r/338083

Mentioned in SAL (#wikimedia-operations) [2017-02-16T09:33:56Z] <marostegui@tin> Synchronized wmf-config/db-codfw.php: Depool db2070 - T156478 (duration: 00m 41s)

Change 338087 had a related patch set uploaded (by Marostegui):
dns: Change db2070 IP

https://gerrit.wikimedia.org/r/338087

Change 338088 had a related patch set uploaded (by Marostegui):
db-codfw,db-eqiad.php: Change db2070 IP

https://gerrit.wikimedia.org/r/338088

Mentioned in SAL (#wikimedia-operations) [2017-02-16T13:19:52Z] <marostegui> Shutdown db2070 for maintenance - T156478

Change 338088 merged by jenkins-bot:
db-codfw,db-eqiad.php: Change db2070 IP

https://gerrit.wikimedia.org/r/338088

Mentioned in SAL (#wikimedia-operations) [2017-02-16T13:27:31Z] <marostegui@tin> Synchronized wmf-config/db-eqiad.php: Change db2070 IP as it goes to another rack - T156478 (duration: 00m 56s)

Mentioned in SAL (#wikimedia-operations) [2017-02-16T13:28:24Z] <marostegui@tin> Synchronized wmf-config/db-codfw.php: Change db2070 IP as it goes to another rack - T156478 (duration: 00m 41s)

@Papaul db2070 is off, and the mediawiki files have been changed with its new IP.
If you review the DNS patch, I will push it too.

Change 338087 merged by Marostegui:
dns: Change db2070 IP

https://gerrit.wikimedia.org/r/338087

db2070:

  • DNS updated
  • network/interfaces changed
  • mediawiki files changed
  • MySQL up and replication up

Pending: port configuration

Once the switch port is changed, replication will resume automatically.

Oh, I saw that @RobH already changed the port and the server is replicating fine! :)

Claiming this task to do the last checks, repool the server, etc. before closing it.

Change 338319 had a related patch set uploaded (by Marostegui):
db-codfw.php: Repool db2070

https://gerrit.wikimedia.org/r/338319

Change 338319 merged by jenkins-bot:
db-codfw.php: Repool db2070

https://gerrit.wikimedia.org/r/338319

Mentioned in SAL (#wikimedia-operations) [2017-02-17T06:59:39Z] <marostegui@tin> Synchronized wmf-config/db-codfw.php: Repool db2070 - T156478 (duration: 00m 48s)

db2070 has been repooled.
Thanks, everyone, for helping to move all three servers!

Marostegui closed this task as Resolved. Feb 17 2017, 7:02 AM

Change 338341 had a related patch set uploaded (by Marostegui):
db-codfw.php: Depool db2070

https://gerrit.wikimedia.org/r/338341

Change 338341 merged by jenkins-bot:
db-codfw.php: Depool db2070

https://gerrit.wikimedia.org/r/338341

Mentioned in SAL (#wikimedia-operations) [2017-02-17T12:18:27Z] <marostegui@tin> Synchronized wmf-config/db-codfw.php: Depool db2070 - T156478 (duration: 00m 41s)

Mentioned in SAL (#wikimedia-operations) [2017-02-17T12:21:50Z] <marostegui@tin> Synchronized wmf-config/db-codfw.php: Depool db2070 - T156478 (duration: 00m 41s)

Mentioned in SAL (#wikimedia-operations) [2017-02-17T13:47:40Z] <marostegui@tin> Synchronized wmf-config/db-codfw.php: Repool db2070 - T156478 (duration: 00m 45s)