T290602 has inspired some frantic conversation about the future of our NFS servers. The current plan is:
Decisions taken:
- we will use regular NFS VMs, one per share
- all in the cloudinfra-nfs VPS project (to be created)
- volume backups will happen using the cinder-backup service, on cloudbackup2001 (codfw datacenter)
- we will automate the provisioning of the NFS VMs using cookbooks
- we will do a first run of the migration process and iterate on that
Done:
- Tested manually creating NFS VMs using cinder volumes with puppet config, and tested mounting them on toolsbeta
Doing:
- Set up the cinder-backup service on cloudbackup2001 and link it to the eqiad cluster
- Automate with cookbooks the creation of the NFS VMs and volumes
- Do a test run of the migration procedure with one of the less busy shares (scratch/misc)
To define:
- How/what to monitor/alert on for this system
- Iterate on the migration procedure to work out how to migrate the rest of the shares
- Add a script to trigger the volume backups on clouddb on a weekly basis
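A minimal sketch of what the weekly backup trigger could look like, assuming the backups are started through the standard `openstack volume backup create` CLI. The volume names in `SHARE_VOLUMES` and the function names are hypothetical, and the script only prints the commands (a dry run); the weekly scheduling itself (for example a systemd timer or cron entry) is not shown and is also an assumption.

```python
import datetime
import shlex

# Hypothetical share -> cinder volume mapping; the real volume names
# are not defined in these notes.
SHARE_VOLUMES = {
    "scratch": "scratch-nfs-volume",
    "misc": "misc-nfs-volume",
}


def backup_command(volume, day):
    """Build the argv that requests a backup of one cinder volume."""
    backup_name = f"{volume}-{day.isoformat()}"
    # --force lets cinder back up a volume that is attached (in-use),
    # which is the expected state for a live NFS share.
    return [
        "openstack", "volume", "backup", "create",
        "--name", backup_name,
        "--force",
        volume,
    ]


def weekly_backup_commands(volumes, day):
    """Return one backup command per volume, in a stable order."""
    return [backup_command(volume, day) for volume in sorted(volumes)]


if __name__ == "__main__":
    # Dry run: print the commands instead of executing them.
    today = datetime.date.today()
    for argv in weekly_backup_commands(SHARE_VOLUMES.values(), today):
        print(shlex.join(argv))
```

Keeping command construction separate from execution makes it easy to review the output before wiring the script to the real cluster.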
Notes:
CephFS is not in our immediate plans because it raises complicated networking/DC questions that we're not ready to tackle yet.