[NFS] Reduce or eliminate bare-metal NFS servers
Open, High · Public

Description

T290602 has inspired some frantic conversation about the future of our NFS servers. The current plan is outlined below.

Decisions taken:

  • we will use regular NFS VMs, one per share
  • all in the cloudinfra-nfs VPS project (to be created)
  • volume backups will happen using the cinder-backup service on cloudbackup2001 (codfw datacenter)
  • we will automate the provisioning of the NFS VMs using cookbooks (see the sketch after this list)
  • we will do a first run of the migration process and iterate on that
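
As a reference for the cookbook work, here is a minimal sketch of what such a provisioning cookbook might do with openstacksdk. The flavor, image, and network names and the volume size are illustrative assumptions, not the real cookbook's values:

```
# Minimal sketch of an NFS-VM provisioning step using openstacksdk.
# Flavor/image/network names and the volume size are placeholders.
import openstack

SHARE_NAME = "scratch"   # one VM per share, per the decisions above

conn = openstack.connect(cloud="cloudinfra-nfs")

# Create the cinder volume that will hold the share's data.
volume = conn.create_volume(size=4096, name=f"{SHARE_NAME}-data", wait=True)

# Create the NFS VM itself.
server = conn.create_server(
    name=f"{SHARE_NAME}-nfs-01",
    image="debian-11.0-bullseye",          # placeholder image name
    flavor="g3.cores4.ram8.disk20",        # placeholder flavor name
    network="lan-flat-cloudinstances2b",   # placeholder network name
    wait=True,
)

# Attach the volume; puppet then formats, mounts, and exports it.
conn.attach_volume(server, volume, wait=True)
```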

Done:

  • Manually tested creating NFS VMs backed by cinder volumes with puppet config, and tested mounting the share on toolsbeta (see the sketch below)
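
For reference, a rough sketch of the manual steps the puppet config encodes: format the attached volume, mount it, and export it over NFS. The device path, mountpoint, and client range are assumptions:

```
# Format, mount, and export the attached cinder volume over NFS.
# Device path, mountpoint, and client range are assumptions.
import subprocess

DEVICE = "/dev/sdb"        # where the cinder volume shows up (assumption)
MOUNTPOINT = "/srv/scratch"
CLIENTS = "172.16.0.0/21"  # placeholder client range

subprocess.run(["mkfs.ext4", DEVICE], check=True)
subprocess.run(["mkdir", "-p", MOUNTPOINT], check=True)
subprocess.run(["mount", DEVICE, MOUNTPOINT], check=True)
# sync + no_subtree_check are common NFS export options; adjust as needed.
subprocess.run(
    ["exportfs", "-o", "rw,sync,no_subtree_check", f"{CLIENTS}:{MOUNTPOINT}"],
    check=True,
)
```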

Doing:

  • Set up the cinder-backup service on cloudbackup2001 and link it to the eqiad cluster
  • Automate the creation of the NFS VMs and volumes with cookbooks
  • Do a test run of the migration procedure with one of the less busy shares (scratch/misc) (see the sketch after this list)
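
One plausible shape for that scratch test run, sketched below: mirror the share's data from the old bare-metal server to the new VM-backed share with rsync, then re-run it near cutover to pick up the delta. The hostname and paths are hypothetical placeholders:

```
# Mirror a share from the old bare-metal server to the new VM-backed one.
# Hostname and paths are hypothetical placeholders.
import subprocess

SRC = "rsync://old-nfs-server.example/scratch/"  # old server (placeholder)
DST = "/srv/scratch/"                            # new VM mountpoint

# --archive keeps perms/owners/timestamps; --delete makes DST mirror SRC.
subprocess.run(
    ["rsync", "--archive", "--delete", "--hard-links", "--numeric-ids",
     SRC, DST],
    check=True,
)
```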

To define:

  • How/what to monitor and alert on for this system
  • Iterate on the migration procedure to cover the rest of the shares
  • Add a script to trigger the volume backups on cloudbackup2001 on a weekly basis (see the sketch after this list)
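
A minimal sketch of what that weekly trigger could look like with openstacksdk, run from a cron job or systemd timer; the `-data` volume naming convention and the cloud name are assumptions:

```
# Create a cinder backup for each NFS data volume; meant to run weekly
# from cron/systemd-timer. Naming convention and cloud name are assumptions.
import datetime

import openstack

conn = openstack.connect(cloud="cloudinfra-nfs")
stamp = datetime.date.today().isoformat()

for volume in conn.block_storage.volumes():
    if not volume.name.endswith("-data"):  # NFS data volumes (assumption)
        continue
    conn.block_storage.create_backup(
        volume_id=volume.id,
        name=f"{volume.name}-{stamp}",
        force=True,  # allow backing up in-use volumes
    )
```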

Notes:
CephFS use is not in our immediate plans because it opens complicated networking/DC questions that we're not ready to think about yet

Event Timeline

Notes from our just-completed meeting (for future reference):

Our NFS server pools are:

  • Tools: using 6 of 8 TB
  • Maps: using 5 of 8 TB
  • Other projects (not Tools or Maps): using 2 of 5 TB
  • Scratch: using 2 of 4 TB

Total: using 15 of the allocated 25 TB
(Ceph currently has ~35 TB available)

  • Dumps (read-only and SO BIG that we aren't talking about it today; dumps is worked on by Ariel and some analytics/Data Engineering folks)
  • Status Quo
    • Pros: It's the status quo, using DRBD
    • Cons: clunky, requires domain-specific knowledge, violates network separation rules
  • Status Quo but with rsync backup instead of DRBD
    • Pros: no need to understand DRBD
    • Cons: potential data loss between backups; violates network separation; clunky
  • Existing server model but on VMs (no OpenStack Manila)
    • Pros: fewer kinds of hardware, fewer kinds of networks; we could start doing this today!
    • Cons: possible network congestion, heavy Ceph usage, possibly difficult migration
    • The Backup Plan *****
  • Some different server model on VMs (e.g. more servers but no automatic provisioning)
    • Pros: roughly the same as above, but possibly with better load/risk distribution; we could start doing this today!
    • Cons: roughly the same as above
    • The WINNER ******
  • Proper OpenStack-native share management via Manila
    • Pros: builds VMs with nova, cinder volumes, etc. More or less automates the VM model? Also supports quotas. Could flip to CephFS easily later (a CLI sketch follows this list)
    • Cons: Tools would still be its own project (WMCS would have to manage it); less flexibility to configure NFS, since Manila will want us to treat it as a black box
  • CephFS
    • Pros: quotas (maybe); supported by Manila
    • Cons: new/unknown, requires a network proxy. How can you authenticate? (Ceph is in the production realm)
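
For comparison, a sketch of what the share lifecycle looks like under Manila, driving the python-manilaclient CLI from Python; the share name, size, and client range are placeholders:

```
# Create an NFS share and grant a client access via the manila CLI.
# Share name, size (GiB), and client range are placeholders.
import subprocess

def manila(*args: str) -> None:
    subprocess.run(["manila", *args], check=True)

manila("create", "NFS", "1024", "--name", "tools-home")
manila("access-allow", "tools-home", "ip", "172.16.0.0/21")
# Per-project quotas (one of the pros above) come via `manila quota-update`.
```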

Open questions:

  • Do we want to put NFS data into Ceph?
    • Ceph is the only scalable, performant solution.
    • What about backups? Could use backy2, reusing existing backup servers and jobs. Not everything can/will be 100% backed up.
  • What about HA?
    • The DRBD'd NFS servers are in the same rack because of the direct cable, which limits the physical setup
    • They don't auto-failover as-is
  • What about network traffic?
    • Think carefully about network setup and flows between racks
    • One reason NFS in VMs won't work is bandwidth constraints/concerns, at least as we build VMs now
    • i.e., create a dedicated cloudvirt to host the NFS VMs, etc.
  • Which of those scenarios requires us to re-learn all of the performance throttling that we've learned with our existing setup?
  • DON'T DO NFS soft mounts. Once they time out, they won't recover and you need to reboot the VM. (A check for this is sketched after this list.)
  • How can we further separate the tools share? Making separate shares for quota/performance reasons
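
Tying the monitoring question to the soft-mount warning above, a small sketch of a check that flags soft-mounted NFS shares by parsing /proc/mounts (simplified for illustration):

```
# Flag NFS mounts using the "soft" option; soft mounts hang unrecoverably
# on timeout (see the warning above). Parsing is simplified for illustration.
def find_soft_nfs_mounts(mounts_path="/proc/mounts"):
    bad = []
    with open(mounts_path) as mounts:
        for line in mounts:
            device, mountpoint, fstype, options = line.split()[:4]
            if fstype.startswith("nfs") and "soft" in options.split(","):
                bad.append(mountpoint)
    return bad

if __name__ == "__main__":
    for mountpoint in find_soft_nfs_mounts():
        print(f"WARNING: soft-mounted NFS share at {mountpoint}")
```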

Mentioned in SAL (#wikimedia-cloud) [2021-09-20T21:57:03Z] <andrewbogott> moving cloudvirt1043 into the 'nfs' aggregate for T291405

dcaro renamed this task from "Reduce or eliminate bare-metal NFS servers" to "[NFS] Reduce or eliminate bare-metal NFS servers". Tue, Oct 19, 3:43 PM
dcaro triaged this task as High priority. Tue, Oct 19, 3:50 PM
dcaro updated the task description.
dcaro added subscribers: dcaro, aborrero.