Backups should be made continuously from eqiad to codfw for all the volumes (Tools, Maps, Others).
|Resolved||yuvipanda||T105720 Labs team reliability goal for Q1 2015/16|
|Resolved||coren||T106474 Make continuous backups of NFS data to codfw|
|Invalid||None||T106871 paramiko (python SSH implementation) needs older hashes for host authentication|
@yuvipanda: We now have working on-demand backups; pending a script to manage cleanup of snapshots, we could now automate this entirely. Do you have a preference for the retention policy? I was considering:
- clean any snapshot getting too full (as they will become worthless anyways)
- clean the oldest snapshots remaining until there is enough space for a full set.
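
A minimal sketch of the two-step policy above, in Python. Everything here is an assumption for illustration: the `Snapshot` fields, the `full_threshold` of 80%, and the idea that deleting a snapshot frees exactly its allocated size. It only decides what to delete; actually removing the snapshots is left to whatever tool manages them.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass(frozen=True)
class Snapshot:
    # Hypothetical record; field names are not taken from any real tool.
    name: str
    created: datetime
    size_bytes: int        # space allocated to the snapshot
    used_fraction: float   # how full the snapshot is, 0.0 to 1.0


def snapshots_to_remove(snapshots, free_bytes, needed_bytes, full_threshold=0.8):
    """Return the snapshots to delete, in deletion order.

    Step 1: drop every snapshot that is at or above full_threshold.
    Step 2: drop the oldest remaining snapshots until needed_bytes
            (the space for a full set) would be free.
    """
    doomed = [s for s in snapshots if s.used_fraction >= full_threshold]
    free = free_bytes + sum(s.size_bytes for s in doomed)

    survivors = sorted(
        (s for s in snapshots if s.used_fraction < full_threshold),
        key=lambda s: s.created,
    )
    for snap in survivors:
        if free >= needed_bytes:
            break
        doomed.append(snap)
        free += snap.size_bytes
    return doomed
```

With daily backups this would run once before the next set starts; `needed_bytes` is then the combined size of one snapshot per volume.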
If we do daily backups (the original plan) then the process is trivial; this simply needs to be done once before the next set of backups is started.
If we go with your idea of doing backups in a loop, then we'll need to be a little fancier about space management, as the smaller filesystems will generate several snapshots per day. That possibly includes variably-sized snapshots and resizing, since we can't do terabyte-sized snapshots dozens of times per day.
So remaining steps are:
- Find a way to monitor script failure
- Find a way to detect that the script hasn't run in X hours
- Make sure that the previous two work (by having them fail)
- Add systemd timers to run the scripts on a schedule.
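
One possible shape for the first two monitoring steps, sketched in Python. It assumes the backup script writes its exit status ("0" on success) to a status file at the end of every run; that file, its path, and its format are hypothetical. The return codes follow the usual Nagios plugin convention (0 for OK, 2 for CRITICAL).

```python
import os
import time

OK, CRITICAL = 0, 2  # exit codes in the Nagios plugin convention


def check_backup(status_path, max_age_hours, now=None):
    """Return (exit_code, message) for the backup's status file.

    Assumption: the backup script rewrites status_path with its exit
    status at the end of each run, so the file's mtime is the time of
    the last completed run.
    """
    now = time.time() if now is None else now
    try:
        mtime = os.path.getmtime(status_path)
        with open(status_path) as f:
            status = f.read().strip()
    except OSError as exc:
        # A missing or unreadable file is treated as a failure.
        return CRITICAL, f"cannot read {status_path}: {exc}"

    age_hours = (now - mtime) / 3600
    if age_hours > max_age_hours:
        return CRITICAL, f"last run was {age_hours:.1f}h ago (limit {max_age_hours}h)"
    if status != "0":
        return CRITICAL, f"last run failed with status {status!r}"
    return OK, f"last run succeeded {age_hours:.1f}h ago"
```

The third step (making the checks fail on purpose) then amounts to writing a non-zero status, passing a `now` far in the future, or removing the file, and confirming that each case returns CRITICAL.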