Finish eqiad metadata database backup setup (s1-s8, x1)
Closed, Resolved · Public

Description

After T192358, backups were set up on dbstore1001 for s1 and misc. Now also set up s2, s3, s4 and s5 using db1095 and db1102. Set up s7 and s8 on db1116. Finally, set up s1, s6 and x1 on dbstore1001 until new hardware is in place, which is to be purchased soon (dbstore1001 will be decommissioned).

Event Timeline

jcrespo created this task. Aug 7 2018, 9:31 AM
Restricted Application added a subscriber: Aklapper. Aug 7 2018, 9:31 AM
jcrespo claimed this task. Aug 7 2018, 9:32 AM
jcrespo triaged this task as Medium priority.
jcrespo moved this task from Triage to In progress on the DBA board.

Change 450928 had a related patch set uploaded (by Jcrespo; owner: Jcrespo):
[operations/puppet@production] mariadb: Setup db1095 and db1102 as db backup sources for eqiad

https://gerrit.wikimedia.org/r/450928

Change 450929 had a related patch set uploaded (by Jcrespo; owner: Jcrespo):
[operations/puppet@production] mariadb-backups: Start backing up s2-5 from the new eqiad backup hosts

https://gerrit.wikimedia.org/r/450929

Change 450930 had a related patch set uploaded (by Jcrespo; owner: Jcrespo):
[operations/puppet@production] install-server: Allow reimage of db110X hosts

https://gerrit.wikimedia.org/r/450930

Change 450930 merged by Jcrespo:
[operations/puppet@production] install-server: Allow reimage of db1102 and db1095 database hosts

https://gerrit.wikimedia.org/r/450930

Change 450928 merged by Jcrespo:
[operations/puppet@production] mariadb: Setup db1095 and db1102 as db backup sources for eqiad

https://gerrit.wikimedia.org/r/450928

Change 450939 had a related patch set uploaded (by Jcrespo; owner: Jcrespo):
[operations/mediawiki-config@master] mariadb: Depool db1122 and db1081

https://gerrit.wikimedia.org/r/450939

Change 450939 merged by jenkins-bot:
[operations/mediawiki-config@master] mariadb: Depool db1122 and db1081

https://gerrit.wikimedia.org/r/450939

Change 450990 had a related patch set uploaded (by Jcrespo; owner: Jcrespo):
[operations/mediawiki-config@master] mariadb: Repool es1019 and db1102 with low load after maintenance

https://gerrit.wikimedia.org/r/450990

Change 450990 merged by jenkins-bot:
[operations/mediawiki-config@master] mariadb: Repool es1019 and db1102 with low load after maintenance

https://gerrit.wikimedia.org/r/450990

Change 450994 had a related patch set uploaded (by Jcrespo; owner: Jcrespo):
[operations/mediawiki-config@master] mariadb: Repool db1081 with low load after maintenance

https://gerrit.wikimedia.org/r/450994

Change 450994 merged by jenkins-bot:
[operations/mediawiki-config@master] mariadb: Repool db1081 with low load after maintenance

https://gerrit.wikimedia.org/r/450994

db1095:s2 and db1102:s4 are currently compressing (with replication stopped). When they finish tomorrow, I will import s3 and s5 too.

Change 451235 had a related patch set uploaded (by Jcrespo; owner: Jcrespo):
[operations/mediawiki-config@master] mariadb: Depool db1100 and db1123

https://gerrit.wikimedia.org/r/451235

Change 451235 merged by jenkins-bot:
[operations/mediawiki-config@master] mariadb: Depool db1100 and db1123

https://gerrit.wikimedia.org/r/451235

Change 451269 had a related patch set uploaded (by Jcrespo; owner: Jcrespo):
[operations/mediawiki-config@master] mariadb: Repool db1100, db1123 with low load after maintenance

https://gerrit.wikimedia.org/r/451269

Change 451269 merged by jenkins-bot:
[operations/mediawiki-config@master] mariadb: Repool db1100, db1123 with low load after maintenance

https://gerrit.wikimedia.org/r/451269

Change 450929 merged by Jcrespo:
[operations/puppet@production] mariadb-backups: Start backing up s2-5 from the new eqiad backup hosts

https://gerrit.wikimedia.org/r/450929

jcrespo changed the task status from Open to Stalled. Sep 18 2018, 11:06 AM
jcrespo removed jcrespo as the assignee of this task.

Blocked on new backup hardware setup.

jcrespo changed the task status from Stalled to Open. Sep 28 2018, 8:34 AM

Change 463434 had a related patch set uploaded (by Jcrespo; owner: Jcrespo):
[operations/mediawiki-config@master] mariadb: Depool db1089, db1104 to setup backup source for s7,s8

https://gerrit.wikimedia.org/r/463434

Change 463434 merged by Jcrespo:
[operations/mediawiki-config@master] mariadb: Depool db1086, db1104 to setup backup source for s7,s8

https://gerrit.wikimedia.org/r/463434

root@neodymium:~/tendril/bin$ host=db1116.eqiad.wmnet; port=3317; bash tendril-host-add.sh $host $port ~/.my.cnf.tendril tendril | mysql -h db1115.eqiad.wmnet tendril && bash tendril-host-enable.sh $host $port | mysql -h db1115.eqiad.wmnet tendril
@server_id := id
1519
root@neodymium:~/tendril/bin$ host=db1116.eqiad.wmnet; port=3318; bash tendril-host-add.sh $host $port ~/.my.cnf.tendril tendril | mysql -h db1115.eqiad.wmnet tendril && bash tendril-host-enable.sh $host $port | mysql -h db1115.eqiad.wmnet tendril
@server_id := id

$ mysql.py -A -h db1115 zarcillo
insert into servers values ('db1116.eqiad.wmnet', 'db1116', 'eqiad', 171966477, NULL, '2018-09-05 09:18');
insert into instances values ('db1116:3317', 'db1116.eqiad.wmnet', 3317, '10.0.36-MariaDB', '2018-09-28 11:54:43');
insert into instances values ('db1116:3318', 'db1116.eqiad.wmnet', 3318, '10.0.36-MariaDB', '2018-09-28 13:43:01');
insert into section_instances values ('db1116:3317', 's7'), ('db1116:3318', 's8');

Probably not very intuitive...
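
Since the per-instance zarcillo statements above differ only in port, section and timestamp, they could be generated rather than typed by hand. The following is a minimal sketch, not an existing tool: the function name `zarcillo_instance_sql` and its argument order are hypothetical, and the value order inside each statement is simply copied from the inserts pasted above. It only prints SQL and does no escaping, so it assumes trusted input.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print (not execute) the zarcillo statements for one
# instance. The value order mirrors the inserts pasted in this task; the
# function name and its arguments are assumptions made for illustration.
zarcillo_instance_sql() {
    local fqdn=$1 port=$2 section=$3 version=$4 ts=$5
    local short=${fqdn%%.*}          # db1116.eqiad.wmnet -> db1116
    local name="${short}:${port}"    # db1116:3317
    printf "insert into instances values ('%s', '%s', %s, '%s', '%s');\n" \
        "$name" "$fqdn" "$port" "$version" "$ts"
    printf "insert into section_instances values ('%s', '%s');\n" \
        "$name" "$section"
}

# Dry run for the two instances registered above.
zarcillo_instance_sql db1116.eqiad.wmnet 3317 s7 10.0.36-MariaDB '2018-09-28 11:54:43'
zarcillo_instance_sql db1116.eqiad.wmnet 3318 s8 10.0.36-MariaDB '2018-09-28 13:43:01'
```

The output can be reviewed first and then piped into the same `mysql.py -A -h db1115 zarcillo` session used above.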

Change 463484 had a related patch set uploaded (by Jcrespo; owner: Jcrespo):
[operations/puppet@production] mariadb: Setup db1116 for backup generation on eqiad of s7 and s8

https://gerrit.wikimedia.org/r/463484

jcrespo updated the task description. Sep 28 2018, 2:39 PM

These tables were mistakenly compressed; revert them (or, in some cases, delete them):

Compressing test.t6...
Compressing sys.sys_config...
Compressing ops.event_log...
Compressing heartbeat.heartbeat...
Compressing test.idbt...
Compressing mysql.gtid_slave_pos...
Compressing mysql.innodb_table_stats...
Compressing mysql.innodb_index_stats...
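
All of the tables listed above live in non-wiki schemas (test, sys, ops, heartbeat, mysql). One possible guard against repeating this is to check the schema name before issuing any ALTER. The sketch below assumes that "wiki schema" means `centralauth` or any name containing `wik`, which is the same predicate used by the information_schema query later in this task; the function name `is_wiki_schema` is hypothetical.

```shell
#!/usr/bin/env bash
# Return 0 when a schema name passes the same test as the SQL predicate
#   table_schema = 'centralauth' OR table_schema LIKE '%wik%'
# The shell glob is case-sensitive, while LIKE usually is not; that
# difference is an accepted simplification of this sketch.
is_wiki_schema() {
    case $1 in
        centralauth|*wik*) return 0 ;;
        *) return 1 ;;
    esac
}

# Classify some of the tables reported above plus one legitimate target.
for t in test.t6 sys.sys_config ops.event_log heartbeat.heartbeat \
         mysql.gtid_slave_pos wikidatawiki.text; do
    if is_wiki_schema "${t%%.*}"; then   # ${t%%.*} keeps the schema part
        echo "compress $t"
    else
        echo "skip     $t"
    fi
done
```

With this check in place, only `wikidatawiki.text` from the sample list would be passed on to compression.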

Change 463715 had a related patch set uploaded (by Jcrespo; owner: Jcrespo):
[operations/mediawiki-config@master] mariadb: Depool db1092, db1064 to create backup instances

https://gerrit.wikimedia.org/r/463715

Tables reverted on s6 and compression finished. Now compressing s8; only the following (largest) tables are still pending:

$ mysql -BN -S /run/mysqld/mysqld.s8.sock -e "SELECT table_schema, table_name FROM information_Schema.tables WHERE engine='INNODB' and row_format <> 'COMPRESSED' and (table_schema = 'centralauth' or table_schema like '%wik%') ORDER BY DATA_LENGTH ASC" | while read db table; do echo "Compressing $db.$table..."; mysql --skip-ssl --socket /run/mysqld/mysqld.s8.sock -e "SET SESSION sql_log_bin=0; -- ALTER TABLE $db.$table ROW_FORMAT=COMPRESSED, FORCE"; done
Compressing wikidatawiki.pagelinks...
Compressing wikidatawiki.text...
Compressing wikidatawiki.content...
Compressing wikidatawiki.revision...
Compressing wikidatawiki.wb_terms...
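
Note that in the command as pasted above the ALTER is preceded by `--`, so it appears to act as a dry run that only lists the pending tables. A variant that makes the dry run explicit is to print the statements instead of sending them to the server. This is a sketch only: the function name `compress_statements` is hypothetical, and the backtick quoting of identifiers is an addition not present in the original one-liner.

```shell
#!/usr/bin/env bash
# Read "schema table" pairs on stdin (space or tab separated, as produced
# by `mysql -BN`) and print the statement that would be run for each one.
# Nothing is sent to a server; piping the output into the mysql client is
# left to the operator.
compress_statements() {
    local db table
    while read -r db table; do
        # Skip blank or incomplete lines.
        [ -n "$db" ] && [ -n "$table" ] || continue
        printf 'SET SESSION sql_log_bin=0; ALTER TABLE `%s`.`%s` ROW_FORMAT=COMPRESSED, FORCE;\n' \
            "$db" "$table"
    done
}

# Dry run for two of the pending s8 tables listed above.
printf '%s\n' 'wikidatawiki pagelinks' 'wikidatawiki text' | compress_statements
```

Keeping `SET SESSION sql_log_bin=0` in every generated statement preserves the original intent that the ALTER is not written to the binary log.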

Change 463715 merged by Jcrespo:
[operations/mediawiki-config@master] mariadb: Depool db1093, db1064 to create backup instances

https://gerrit.wikimedia.org/r/463715

Change 463751 had a related patch set uploaded (by Jcrespo; owner: Jcrespo):
[operations/puppet@production] mariadb: Add s6 instance to dbstore1001 for backup generation

https://gerrit.wikimedia.org/r/463751

Change 463751 merged by Jcrespo:
[operations/puppet@production] mariadb: Add s6 instance to dbstore1001 for backup generation

https://gerrit.wikimedia.org/r/463751

Change 463785 had a related patch set uploaded (by Jcrespo; owner: Jcrespo):
[operations/puppet@production] mariadb: Add x1 to dbstore1001 as a backup source

https://gerrit.wikimedia.org/r/463785

Change 463788 had a related patch set uploaded (by Jcrespo; owner: Jcrespo):
[operations/puppet@production] mariadb: Setup dbstore1001 as the backup source of s6, x1

https://gerrit.wikimedia.org/r/463788

Change 463785 merged by Jcrespo:
[operations/puppet@production] mariadb: Add x1 to dbstore1001 as a backup source

https://gerrit.wikimedia.org/r/463785

All sections should be available now, but s8 is still compressing its last 2 tables, and s6 and x1 are uncompressed.

Mentioned in SAL (#wikimedia-operations) [2018-10-01T17:02:40Z] <jynus> stopping some mariadb instances on dbstore1001 and starting compression T201392

Change 463484 merged by Jcrespo:
[operations/puppet@production] mariadb: Setup db1116 for backup generation on eqiad of s7 and s8

https://gerrit.wikimedia.org/r/463484

Change 463788 merged by Jcrespo:
[operations/puppet@production] mariadb: Setup dbstore1001 as the backup source of s6, x1

https://gerrit.wikimedia.org/r/463788

jcrespo closed this task as Resolved. Oct 4 2018, 9:51 AM
jcrespo claimed this task.

All green! https://icinga.wikimedia.org/cgi-bin/icinga/status.cgi?search_string=Backup+of

Still pending are "only" snapshots, binlogs and incremental es backups, which are out of scope for this task.