Document clearly the mariadb backup and recovery setup
Closed, Duplicate (Public)

Description

Anyone should be able to recover a database just by reading clear documentation. That documentation still has to be written; it should cover the recovery process and, more generally, the internals of how database backups work. Because the database backups refactoring is still a work in progress, we need to wait until it is almost feature-complete. Once it is, we should have documentation both for end users performing general operations and for developers of the tooling.

The documentation should most likely live on the Wikitech wiki.
So far this is what we have: https://wikitech.wikimedia.org/wiki/MariaDB/Backups#Recovering_a_logical_backup

The code should also be reviewed so that it is better documented and self-explanatory for other potential hackers.

Event Timeline

jcrespo triaged this task as Medium priority. Sep 27 2018, 2:29 PM
jcrespo created this task.

Change 496475 had a related patch set uploaded (by Jcrespo; owner: Jcrespo):
[operations/puppet@production] mariadb-backups-monitoring: Link to more specific subpage

https://gerrit.wikimedia.org/r/496475

Change 496475 merged by Jcrespo:
[operations/puppet@production] mariadb-backups-monitoring: Link to more specific subpage on icinga

https://gerrit.wikimedia.org/r/496475

https://wikitech.wikimedia.org/wiki/MariaDB/Backups is close to being a complete description of the architecture. It is only missing some review, the documentation of the individual applications, and maybe some extra details on the recovery process.

I have done a first full reading and just fixed minor things: https://wikitech.wikimedia.org/w/index.php?title=MariaDB/Backups&action=history
Thanks for putting all this together

Note to self:
TODO: expand the following point on https://wikitech.wikimedia.org/wiki/MariaDB/Backups#Recovering_a_Snapshot "Setup replication based on the GTID coordinates (remember GTID tracks already executed transactions, while binlogs tracks offsets or "gaps between transactions", do not confuse both methods)"
That way, someone with less knowledge of how a binlog works can find out which method to set up and which full command to run ("change master to master...").
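For illustration only, here is a rough sketch of how the two methods differ in the statements one would end up running. It is not the documented procedure from the wiki page: the helper names, host name, and coordinates are hypothetical, and credentials (MASTER_USER, MASTER_PASSWORD, SSL options) are left out on purpose. It only builds the MariaDB statements as strings so they can be compared side by side.

```python
from typing import List


def gtid_replication_sql(host: str, gtid_pos: str, port: int = 3306) -> List[str]:
    """Statements for GTID-based replication (MariaDB syntax).

    gtid_pos is the set of transactions already applied on the restored
    replica, e.g. '0-1-100' (domain-server-sequence). Hypothetical helper.
    """
    return [
        # Tell the replica which transactions it has already executed.
        f"SET GLOBAL gtid_slave_pos = '{gtid_pos}'",
        # The primary works out where to resume from the GTID position.
        f"CHANGE MASTER TO MASTER_HOST='{host}', MASTER_PORT={port}, "
        "MASTER_USE_GTID=slave_pos",
        "START SLAVE",
    ]


def binlog_replication_sql(
    host: str, log_file: str, log_pos: int, port: int = 3306
) -> List[str]:
    """Statements for classic binlog file/offset replication.

    log_file and log_pos point at a byte offset in the primary's binlog,
    i.e. a position between two transactions. Hypothetical helper.
    """
    return [
        f"CHANGE MASTER TO MASTER_HOST='{host}', MASTER_PORT={port}, "
        f"MASTER_LOG_FILE='{log_file}', MASTER_LOG_POS={log_pos}",
        "START SLAVE",
    ]


if __name__ == "__main__":
    # All values below are made up for the example.
    for stmt in gtid_replication_sql("db-primary.example", "0-1-100"):
        print(stmt + ";")
    print()
    for stmt in binlog_replication_sql("db-primary.example", "binlog.000042", 4567):
        print(stmt + ";")
```

The point of the comparison: the GTID variant needs the executed-transactions position set first and then `MASTER_USE_GTID=slave_pos`, while the binlog variant needs a file name plus offset and nothing else. Mixing the two kinds of coordinates is the confusion the note above warns about.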