
Decommission db1015, db1035, db1044 and db1038
Closed, Resolved · Public

Description

They are running out of space, and they are part of the easy-to-decom servers.

db1015 is ready: T173570
db1035 is ready: T176931
db1038 is ready: T177911
db1044 can be done at any time now, pending puppet patches and https://gerrit.wikimedia.org/r/393747 (T181696)

Event Timeline

jcrespo created this task. Oct 13 2016, 7:49 PM
Restricted Application added a subscriber: Aklapper. Oct 13 2016, 7:49 PM
jcrespo renamed this task from Decommission db1035 to Decommission db1015, db1035 and db1044. Oct 13 2016, 7:52 PM
jcrespo updated the task description.
jcrespo removed jcrespo as the assignee of this task. Feb 14 2017, 3:39 PM
jcrespo raised the priority of this task from Medium to High.

High because they will complain of lack of space soon.

jcrespo renamed this task from Decommission db1015, db1035 and db1044 to Decommission db1015, db1035, db1044 and db1038. Feb 14 2017, 3:40 PM

db1044 is db1095's master, so we need to find another candidate within the shard, change it to ROW, and make sure it has the same content as db1044; otherwise we will break db1095 :(

Have a look at my shard planning, I think I had some options there.

Yeah, you placed db1064 there as its master for s4.
We need to make sure they have the same data, otherwise ROW will not like that.

s4 is... special in that regard.

Worst case, we can move db1044's data to db1064 :-)
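
For reference, a minimal sketch of the "same data" check discussed above. It assumes per-table checksums have already been collected from both hosts by some other means (for example pt-table-checksum or CHECKSUM TABLE); the function name, the dictionary layout and the sample values are hypothetical and not part of any existing tooling.

```python
def diff_checksums(source, candidate):
    """Compare two {table_name: checksum} mappings.

    Returns a dict of tables that differ, mapping each table name to a
    (source_checksum, candidate_checksum) tuple. A table missing on one
    side is reported with None for that side.
    """
    mismatched = {}
    for table in sorted(set(source) | set(candidate)):
        a = source.get(table)
        b = candidate.get(table)
        if a != b:
            mismatched[table] = (a, b)
    return mismatched


# Hypothetical checksums, for illustration only.
current_master = {"revision": "a1", "pagelinks": "b2", "text": "c3"}
new_master = {"revision": "a1", "pagelinks": "ff", "text": "c3"}

for table, (old, new) in diff_checksums(current_master, new_master).items():
    print(f"{table}: {old} != {new}")
```

An empty result would suggest the candidate is safe to use as the new master under ROW; any reported table would need to be fixed or re-imported first.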

Marostegui moved this task from Triage to Backlog on the DBA board. Apr 12 2017, 9:54 AM
Marostegui added a comment. Edited Aug 13 2017, 11:51 AM

I will try to run pt-table-checksum on the most important tables (revision, pagelinks, templatelinks, text) on s3 next week, as db1015 is already quite low on disk space :-(

root@db1015:~# df -hT /srv
Filesystem            Type  Size  Used Avail Use% Mounted on
/dev/mapper/tank-data xfs   1.6T  1.5T   91G  95% /srv
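
A small sketch of how output like the above could be checked automatically. It parses `df -hT` text and lists mount points at or above a usage threshold. The function names and the 90% default are assumptions made for illustration, not an existing alerting rule, and the parser is simplified (it expects a single-word filesystem name).

```python
def parse_df(output):
    """Parse `df -hT` output into a list of dicts, skipping the header line."""
    rows = []
    for line in output.strip().splitlines():
        fields = line.split()
        # With -T the columns are:
        # filesystem, type, size, used, avail, use%, mount point
        if len(fields) < 7 or fields[0] == "Filesystem":
            continue
        rows.append({
            "filesystem": fields[0],
            "type": fields[1],
            "use_percent": int(fields[5].rstrip("%")),
            "mount": " ".join(fields[6:]),
        })
    return rows


def over_threshold(output, threshold=90):
    """Return mount points whose usage is at or above `threshold` percent."""
    return [r["mount"] for r in parse_df(output) if r["use_percent"] >= threshold]


# The df output quoted in this task.
SAMPLE = """\
Filesystem            Type  Size  Used Avail Use% Mounted on
/dev/mapper/tank-data xfs   1.6T  1.5T   91G  95% /srv
"""

print(over_threshold(SAMPLE))
```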

db1015 was scheduled for decommission: T173570

Marostegui updated the task description. Sep 25 2017, 12:25 PM
Marostegui updated the task description. Sep 28 2017, 6:18 AM

db1035 scheduled for decommission: T176931

Change 393740 had a related patch set uploaded (by Jcrespo; owner: Jcrespo):
[operations/mediawiki-config@master] mariadb: Depool db1044

https://gerrit.wikimedia.org/r/393740

Change 393740 merged by jenkins-bot:
[operations/mediawiki-config@master] mariadb: Depool db1044

https://gerrit.wikimedia.org/r/393740

Change 393745 had a related patch set uploaded (by Jcrespo; owner: Jcrespo):
[operations/puppet@production] mariadb: Disable all notifications on db1044, preparing to decom

https://gerrit.wikimedia.org/r/393745

Change 393745 merged by Jcrespo:
[operations/puppet@production] mariadb: Disable all notifications on db1044, preparing to decom

https://gerrit.wikimedia.org/r/393745

Change 393747 had a related patch set uploaded (by Jcrespo; owner: Jcrespo):
[operations/mediawiki-config@master] mariadb: Decommission db1044

https://gerrit.wikimedia.org/r/393747

jcrespo updated the task description.
Marostegui closed this task as Resolved. Nov 30 2017, 7:45 AM
Marostegui updated the task description.

All hosts now have an individual decommission task, so let's close this and follow up on each individual task.

Change 393747 merged by jenkins-bot:
[operations/mediawiki-config@master] mariadb: Decommission db1044

https://gerrit.wikimedia.org/r/393747