Implement production zookeeper backups
Open, Low, Public

Description

  • Original ticket text:

I noticed that since confcluster nodes have both etcd AND zookeeper, only backups of etcd were generated (only on some hosts, see T271573 for more details).

I wonder if we should, as with etcd, generate regular backups of the configuration they contain and send them to bacula. While I understand that these are not "traditional databases", the literature suggests that backing them up is a good idea:

and both suggest using zk-txnlog-tools for that. However, because I am not familiar with the stack and its usage, I am unsure what the best way to do this is.

This task is to ask whether this is a worthwhile pursuit and, if it is, whether there are already plans for it. Otherwise, I may be able to help as part of the "backup everything" SRE long-term goal.

  • Otto's answer:

Hello! Why not eh? I don't think this is high priority for us, but it is certainly a good idea. Feel free to reach out to any Data/Analytics Eng SREs, me, Luca, or @razzi to coordinate as needed.

Most likely this will mean puppetizing a zk-txnlog-tools dump and its storage in bacula, testing it, and documenting how backups are generated and later restored.
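
For illustration only, here is a minimal Python sketch of what the "dump" step could look like if done as a plain file-level copy. It does not use zk-txnlog-tools. It assumes ZooKeeper's usual on-disk layout (a `version-2` directory holding `snapshot.<zxid>` and `log.<zxid>` files, with the zxid in hexadecimal) and assumes that bacula would pick up the resulting archive from a directory in its fileset. The function names and the archive path are hypothetical and are not taken from any existing puppet code.

```python
import tarfile
from pathlib import Path


def zxid_of(path: Path) -> int:
    """Parse the hexadecimal zxid suffix of a file such as snapshot.1a2b or log.1a2c."""
    return int(path.name.split(".", 1)[1], 16)


def select_backup_files(version_dir: Path) -> list:
    """Return the newest snapshot plus the transaction logs that may extend it.

    Assumption: a log file is named after the first zxid it contains, so the
    logs needed are the last one starting at or before the snapshot's zxid
    and every log that starts after it.
    """
    snapshots = sorted(version_dir.glob("snapshot.*"), key=zxid_of)
    if not snapshots:
        raise FileNotFoundError(f"no snapshot.* files in {version_dir}")
    newest = snapshots[-1]
    snap_zxid = zxid_of(newest)

    logs = sorted(version_dir.glob("log.*"), key=zxid_of)
    older = [log for log in logs if zxid_of(log) <= snap_zxid]
    start = zxid_of(older[-1]) if older else 0
    return [newest] + [log for log in logs if zxid_of(log) >= start]


def write_archive(version_dir: Path, archive_path: Path) -> list:
    """Write the selected files into a gzip-compressed tarball and return their names."""
    files = select_backup_files(version_dir)
    with tarfile.open(archive_path, "w:gz") as tar:
        for f in files:
            tar.add(f, arcname=f.name)
    return [f.name for f in files]
```

A scheduled job could call `write_archive(Path("<dataDir>/version-2"), Path("<backup dir>/zk-backup.tar.gz"))`, with both paths filled in for the host in question. Whether a copy taken from a running server is good enough, and how a restore from it would be tested, is exactly what the testing and documentation steps above would need to establish.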

Event Timeline

This is not something we are asking analytics to take care of, but for the initial questions, I believe @elukey or @Ottomata may be the people who know more about this, based on git blame?

Hello! Why not eh? I don't think this is high priority for us, but it is certainly a good idea. Feel free to reach out to any Data/Analytics Eng SREs, me, Luca, or @razzi to coordinate as needed.

I will reuse this ticket as the implementation one, but with low priority for now.

jcrespo renamed this task from Evaluate the need to generate and maintain zookeeper backups to Implement production zookeeper backups. Mar 10 2021, 9:10 AM
jcrespo updated the task description.