
Setup dbprov1004 and dbprov2004 as an expansion of the dbprov (database provisioning) cluster, in preparation of binlog backup implementation
Closed, Resolved, Public

Assigned To
Authored By
jcrespo
Jan 17 2023, 10:13 AM
Referenced Files
F36563448: Screenshot_20230131_110804.png
Jan 31 2023, 10:08 AM
F36563439: Screenshot_20230131_104501.png
Jan 31 2023, 9:45 AM
Restricted File
Jan 25 2023, 4:15 PM
F36494349: Screenshot_20230125_154309.png
Jan 25 2023, 2:49 PM

Description

The following hosts:

  • dbprov1004
  • dbprov2004

are an expansion of the existing dbprov[12]00[1-3] cluster, intended to implement streaming backups (binlog backups) for point-in-time recovery (PITR). See more at https://wikitech.wikimedia.org/wiki/MariaDB/Backups

These new hosts will not be dedicated to binlogs. Instead, existing backups will be reshuffled evenly across the existing hosts, and binlogs will be downloaded locally, next to the existing backups, so that automated PITR can be implemented later.
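As an illustration of what reshuffling backups evenly can mean, here is a minimal sketch in Python. It is not the actual Puppet change from this task: the section names, the sizes, and the `distribute_evenly` helper are all hypothetical, and the greedy "largest first, least-loaded host" strategy is just one simple way to balance load.

```python
import heapq


def distribute_evenly(section_sizes, hosts):
    """Assign each backup section to the least-loaded host, largest sections first.

    section_sizes: dict mapping a section name to its (hypothetical) backup size.
    hosts: list of host names that can hold backups.
    Returns a dict mapping each host to the list of sections assigned to it.
    """
    # Min-heap of (current load, host) so the least-loaded host is popped first.
    heap = [(0, host) for host in hosts]
    heapq.heapify(heap)
    assignment = {host: [] for host in hosts}

    # Place the largest sections first; ties are broken by name for determinism.
    ordered = sorted(section_sizes.items(), key=lambda kv: (-kv[1], kv[0]))
    for section, size in ordered:
        load, host = heapq.heappop(heap)
        assignment[host].append(section)
        heapq.heappush(heap, (load + size, host))
    return assignment


# Hypothetical sections and sizes (arbitrary units), not real production data.
sections = {
    "sectionA": 400,
    "sectionB": 300,
    "sectionC": 300,
    "sectionD": 200,
    "sectionE": 100,
    "sectionF": 100,
}
hosts = ["dbprov1001", "dbprov1002", "dbprov1003", "dbprov1004"]

plan = distribute_evenly(sections, hosts)
loads = {h: sum(sections[s] for s in plan[h]) for h in hosts}
```

With these made-up numbers, the new host receives a share comparable to the older ones rather than sitting empty or being reserved for binlogs only.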

Event Timeline

jcrespo renamed this task from "Setup dbprov1004 and dbprov2004 as an expansion of the dbprov cluster, in preparation of binlog backups" to "Setup dbprov1004 and dbprov2004 as an expansion of the dbprov (database provisioning) cluster, in preparation of binlog backup implementation". Jan 17 2023, 10:13 AM
jcrespo triaged this task as High priority.

Change 880896 had a related patch set uploaded (by Jcrespo; author: Jcrespo):

[operations/puppet@production] dbbackups: Setup dbprov1004, dbprov2004 as empty dbprov

https://gerrit.wikimedia.org/r/880896

Change 880896 merged by Jcrespo:

[operations/puppet@production] dbbackups: Setup dbprov1004, dbprov2004 as empty dbprov

https://gerrit.wikimedia.org/r/880896

Change 881360 had a related patch set uploaded (by Jcrespo; author: Jcrespo):

[operations/puppet@production] dbbackups: Reorganize backups with the new dbprov[12]04 host

https://gerrit.wikimedia.org/r/881360

Change 881868 had a related patch set uploaded (by Jcrespo; author: Jcrespo):

[operations/puppet@production] dbbackups: Setting up grants for new dbprov hosts

https://gerrit.wikimedia.org/r/881868

Change 881868 merged by Jcrespo:

[operations/puppet@production] dbbackups: Setting up grants for new dbprov hosts

https://gerrit.wikimedia.org/r/881868

Mentioned in SAL (#wikimedia-operations) [2023-01-20T18:22:32Z] <jynus> deploying new grants for backups on m1 T327155

Mentioned in SAL (#wikimedia-operations) [2023-01-24T18:55:24Z] <jynus> deploy new dump grants for analytics dbs at db1108 T327155

Change 881360 merged by Jcrespo:

[operations/puppet@production] dbbackups: Reorganize backups with the new dbprov[12]04 host

https://gerrit.wikimedia.org/r/881360

Good news so far:

Screenshot_20230125_154309.png (839×1 px, 77 KB)

{F36503811}

Change 883834 had a related patch set uploaded (by Jcrespo; author: Jcrespo):

[operations/puppet@production] dbbackups: Optimize execution time and delay backups

https://gerrit.wikimedia.org/r/883834

Change 883834 merged by Jcrespo:

[operations/puppet@production] dbbackups: Optimize execution time and delay backups

https://gerrit.wikimedia.org/r/883834

Change 883857 had a related patch set uploaded (by Jcrespo; author: Jcrespo):

[operations/puppet@production] dbbackups: Reorganize backups to avoid overload

https://gerrit.wikimedia.org/r/883857

Change 883857 merged by Jcrespo:

[operations/puppet@production] dbbackups: Reorganize backups to avoid overload

https://gerrit.wikimedia.org/r/883857

Mentioned in SAL (#wikimedia-operations) [2023-01-30T09:29:01Z] <jynus> disabling puppet on dbprov2004 to reorganize partitions T327155

Grant issue fixed; looking good for now:

Screenshot_20230131_110804.png (872×2 px, 176 KB)

This has now run multiple times without issue. More optimizations could be done, but for now, let's consider the scope of this task done.