Description

After T192358, backups were set up on dbstore1001 for s1 and misc. Now also set up s2, s3, s4 and s5 using db1095 and db1102; set up s7 and s8 on db1116; and finally set up s1, s6 and x1 on dbstore1001 until new hardware is in place, which will be purchased soon (dbstore1001 will be decommissioned).

Details
| Status | Subtype | Assigned | Task |
|---|---|---|---|
| Open | None | | T138562 Improve regular production database backups handling |
| Resolved | | jcrespo | T201392 Finish eqiad metadata database backup setup (s1-s8, x1) |
Event Timeline
Change 450928 had a related patch set uploaded (by Jcrespo; owner: Jcrespo):
[operations/puppet@production] mariadb: Setup db1095 and db1102 as db backup sources for eqiad
Change 450929 had a related patch set uploaded (by Jcrespo; owner: Jcrespo):
[operations/puppet@production] mariadb-backups: Start backing up s2-5 from the new eqiad backup hosts
Change 450930 had a related patch set uploaded (by Jcrespo; owner: Jcrespo):
[operations/puppet@production] install-server: Allow reimage of db110X hosts
Change 450930 merged by Jcrespo:
[operations/puppet@production] install-server: Allow reimage of db1102 and db1095 database hosts
Change 450928 merged by Jcrespo:
[operations/puppet@production] mariadb: Setup db1095 and db1102 as db backup sources for eqiad
Change 450939 had a related patch set uploaded (by Jcrespo; owner: Jcrespo):
[operations/mediawiki-config@master] mariadb: Depool db1122 and db1081
Change 450939 merged by jenkins-bot:
[operations/mediawiki-config@master] mariadb: Depool db1122 and db1081
Change 450990 had a related patch set uploaded (by Jcrespo; owner: Jcrespo):
[operations/mediawiki-config@master] mariadb: Repool es1019 and db1102 with low load after maintenance
Change 450990 merged by jenkins-bot:
[operations/mediawiki-config@master] mariadb: Repool es1019 and db1102 with low load after maintenance
Change 450994 had a related patch set uploaded (by Jcrespo; owner: Jcrespo):
[operations/mediawiki-config@master] mariadb: Repool db1081 with low load after maintenance
Change 450994 merged by jenkins-bot:
[operations/mediawiki-config@master] mariadb: Repool db1081 with low load after maintenance
db1095:s2 and db1102:s4 are currently compressing (with replication stopped); when they finish tomorrow, I will import s3 and s5 too.
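For context, a minimal sketch of what the compression amounts to; the socket path follows the per-section naming used elsewhere on these multi-instance hosts and the loop is adapted from the one used later for s8, so treat it as illustrative rather than the exact commands run here:

# Hedged sketch: compress every uncompressed InnoDB wiki table on the local s2
# instance, smallest first, without writing the ALTERs to the binlog.
mysql -BN -S /run/mysqld/mysqld.s2.sock -e "
  SELECT table_schema, table_name FROM information_schema.tables
  WHERE engine='InnoDB' AND row_format <> 'COMPRESSED'
    AND table_schema LIKE '%wik%'
  ORDER BY data_length ASC" |
while read db table; do
  echo "Compressing $db.$table..."
  mysql --skip-ssl -S /run/mysqld/mysqld.s2.sock -e \
    "SET SESSION sql_log_bin=0; ALTER TABLE $db.$table ROW_FORMAT=COMPRESSED, FORCE"
done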
Change 451235 had a related patch set uploaded (by Jcrespo; owner: Jcrespo):
[operations/mediawiki-config@master] mariadb: Depool db1100 and db1123
Change 451235 merged by jenkins-bot:
[operations/mediawiki-config@master] mariadb: Depool db1100 and db1123
Change 451269 had a related patch set uploaded (by Jcrespo; owner: Jcrespo):
[operations/mediawiki-config@master] mariadb: Repool db1100, db1123 with low load after maintenance
Change 451269 merged by jenkins-bot:
[operations/mediawiki-config@master] mariadb: Repool db1100, db1123 with low load after maintenance
Change 450929 merged by Jcrespo:
[operations/puppet@production] mariadb-backups: Start backing up s2-5 from the new eqiad backup hosts
Change 463434 had a related patch set uploaded (by Jcrespo; owner: Jcrespo):
[operations/mediawiki-config@master] mariadb: Depool db1089, db1104 to setup backup source for s7,s8
Change 463434 merged by Jcrespo:
[operations/mediawiki-config@master] mariadb: Depool db1086, db1104 to setup backup source for s7,s8
root@neodymium:~/tendril/bin$ host=db1116.eqiad.wmnet; port=3317; bash tendril-host-add.sh $host $port ~/.my.cnf.tendril tendril | mysql -h db1115.eqiad.wmnet tendril && bash tendril-host-enable.sh $host $port | mysql -h db1115.eqiad.wmnet tendril
@server_id := id
1519
root@neodymium:~/tendril/bin$ host=db1116.eqiad.wmnet; port=3318; bash tendril-host-add.sh $host $port ~/.my.cnf.tendril tendril | mysql -h db1115.eqiad.wmnet tendril && bash tendril-host-enable.sh $host $port | mysql -h db1115.eqiad.wmnet tendril
@server_id := id

$ mysql.py -A -h db1115 zarcillo
insert into servers values ('db1116.eqiad.wmnet', 'db1116', 'eqiad', 171966477, NULL, '2018-09-05 09:18');
insert into instances values ('db1116:3317', 'db1116.eqiad.wmnet', 3317, '10.0.36-MariaDB', '2018-09-28 11:54:43');
insert into instances values ('db1116:3318', 'db1116.eqiad.wmnet', 3318, '10.0.36-MariaDB', '2018-09-28 13:43:01');
insert into section_instances values ('db1116:3317', 's7'), ('db1116:3318', 's8');
Probably not very intuitive...
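If it helps, a quick sanity check after the above; a hedged sketch only, since the zarcillo column names are assumed from the INSERT statements and a plain mysql client is used here instead of the mysql.py wrapper:

# Hedged: confirm both db1116 instances ended up mapped to s7/s8 in zarcillo
mysql -h db1115.eqiad.wmnet zarcillo -e "SELECT * FROM section_instances WHERE instance LIKE 'db1116%';"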
Change 463484 had a related patch set uploaded (by Jcrespo; owner: Jcrespo):
[operations/puppet@production] mariadb: Setup db1116 for backup generation on eqiad of s7 and s8
These tables were mistakenly compressed; revert them (or, in some cases, delete them):
Compressing test.t6...
Compressing sys.sys_config...
Compressing ops.event_log...
Compressing heartbeat.heartbeat...
Compressing test.idbt...
Compressing mysql.gtid_slave_pos...
Compressing mysql.innodb_table_stats...
Compressing mysql.innodb_index_stats...
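Reverting one of these is just another ALTER; a minimal sketch, assuming the tables go back to the server default row format and that the s6 socket naming matches the pattern used above (illustrative only, not the exact commands run):

# Hedged: undo compression on a small internal table, without writing to the binlog
mysql --skip-ssl -S /run/mysqld/mysqld.s6.sock -e \
  "SET SESSION sql_log_bin=0; ALTER TABLE heartbeat.heartbeat ROW_FORMAT=DEFAULT, FORCE"
# Hedged: throwaway test tables can simply be dropped instead
mysql --skip-ssl -S /run/mysqld/mysqld.s6.sock -e \
  "SET SESSION sql_log_bin=0; DROP TABLE test.t6"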
Change 463715 had a related patch set uploaded (by Jcrespo; owner: Jcrespo):
[operations/mediawiki-config@master] mariadb: Depool db1092, db1064 to create backup instances
Tables reverted on s6 and compression finished; now compressing s8, where only the following (largest) tables remain:
$ mysql -BN -S /run/mysqld/mysqld.s8.sock -e "SELECT table_schema, table_name FROM information_Schema.tables WHERE engine='INNODB' and row_format <> 'COMPRESSED' and (table_schema = 'centralauth' or table_schema like '%wik%') ORDER BY DATA_LENGTH ASC" | while read db table; do echo "Compressing $db.$table..."; mysql --skip-ssl --socket /run/mysqld/mysqld.s8.sock -e "SET SESSION sql_log_bin=0; -- ALTER TABLE $db.$table ROW_FORMAT=COMPRESSED, FORCE"; done
Compressing wikidatawiki.pagelinks...
Compressing wikidatawiki.text...
Compressing wikidatawiki.content...
Compressing wikidatawiki.revision...
Compressing wikidatawiki.wb_terms...
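To watch progress, a hedged one-liner that counts how many tables matching that same filter are still uncompressed on the s8 instance:

$ mysql -BN -S /run/mysqld/mysqld.s8.sock -e "SELECT COUNT(*) FROM information_schema.tables WHERE engine='INNODB' AND row_format <> 'COMPRESSED' AND (table_schema = 'centralauth' OR table_schema LIKE '%wik%')"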
Change 463715 merged by Jcrespo:
[operations/mediawiki-config@master] mariadb: Depool db1093, db1064 to create backup instances
Change 463751 had a related patch set uploaded (by Jcrespo; owner: Jcrespo):
[operations/puppet@production] mariadb: Add s6 instance to dbstore1001 for backup generation
Change 463751 merged by Jcrespo:
[operations/puppet@production] mariadb: Add s6 instance to dbstore1001 for backup generation
Change 463785 had a related patch set uploaded (by Jcrespo; owner: Jcrespo):
[operations/puppet@production] mariadb: Add x1 to dbstore1001 as a backup source
Change 463788 had a related patch set uploaded (by Jcrespo; owner: Jcrespo):
[operations/puppet@production] mariadb: Setup dbstore1001 as the backup source of s6, x1
Change 463785 merged by Jcrespo:
[operations/puppet@production] mariadb: Add x1 to dbstore1001 as a backup source
All sections should be available now, but s8 is still compressing the last 2 tables, and s6 and x1 are uncompressed.
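As a quick sanity check that every local instance is replicating (a hedged sketch: it only assumes the per-section socket naming shown above and is run separately on each backup source host):

# Hedged: print replication state for every local multi-instance socket
for sock in /run/mysqld/mysqld.*.sock; do
  echo "== $sock =="
  mysql --skip-ssl -S "$sock" -e "SHOW SLAVE STATUS\G" \
    | grep -E 'Slave_IO_Running|Slave_SQL_Running|Seconds_Behind_Master'
done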
Mentioned in SAL (#wikimedia-operations) [2018-10-01T17:02:40Z] <jynus> stopping some mariadb instances on dbstore1001 and starting compression T201392
Change 463484 merged by Jcrespo:
[operations/puppet@production] mariadb: Setup db1116 for backup generation on eqiad of s7 and s8
Change 463788 merged by Jcrespo:
[operations/puppet@production] mariadb: Setup dbstore1001 as the backup source of s6, x1
All green! https://icinga.wikimedia.org/cgi-bin/icinga/status.cgi?search_string=Backup+of
Still pending are "only" snapshots, binlogs and incremental es backups, which are out of scope for this task.