Research CephFS as a replacement for NFS
Closed, Resolved, Public

Description

Research CephFS as a replacement for NFS:

  • Feature comparison
  • Investigate necessary hardware
  • Compatibility with Jessie/Stretch clients
  • Integration with OpenStack storage management
  • Integration with Kubernetes storage management
  • Performance comparison (small vs. large I/O requests, maximum throughput, latency, etc.)
  • Backup strategy

Task from WMCS 2018 offsite meetings.

Event Timeline

GTirloni triaged this task as Medium priority. Oct 21 2018, 12:34 PM

Some notes that will also apply to T90364:

  • Without a lot of careful tuning, Ceph is a very high-latency system when used as anything other than a plain object store like Swift. With proper tuning it can be somewhat faster than NFSv4 in sync mode (which we use).
  • A 10G network on all nodes and clients should be treated as a requirement, except where we don't care about speed at all.
  • If OSD nodes are large with lots of disks, a failure and rebuild could collapse the system by overstraining the OSDs and creating instability. Smaller, more numerous nodes allow for more resiliency and higher availability and performance.
  • The faster the disks, the faster the processor needs to be. For Ceph, single-socket motherboards can outperform dual-socket ones with the same processors; multiple cores are good.
  • The Ceph project doesn't do comprehensive testing and development on Debian, though it does package for Debian with a basic install-and-check test. It also recommends upgrading stock Debian kernels. CentOS and Ubuntu get full, comprehensive support. No Mimic packages are available for Debian until Buster because they need GCC 8.
  • I/O throughput requirement testing needs to be done so that we can tune things. I'm digging around in Prometheus to find good metrics to watch and compare.
  • Reviewing the network architecture around Ceph is recommended, to avoid collapsing the system through network changes or unrecommended configuration.
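
For the sync-mode latency and I/O throughput points above, a rough first comparison could look something like the sketch below. It times fsync'd writes at a small and a large block size against a directory. Everything here is an assumption for illustration: MOUNT_POINT is a hypothetical placeholder (it defaults to the local temp directory and would need to point at an NFS or CephFS mount), the block sizes and counts are arbitrary, and a dedicated tool such as fio would be the more usual choice for real numbers.

```python
import os
import statistics
import tempfile
import time

# Hypothetical placeholder: replace with the NFS or CephFS mount under test.
MOUNT_POINT = tempfile.gettempdir()


def measure_sync_writes(directory, block_size, count=10):
    """Append `count` blocks of `block_size` bytes, calling fsync after each
    one, and return the per-write latencies in seconds."""
    payload = os.urandom(block_size)
    latencies = []
    fd, path = tempfile.mkstemp(dir=directory)
    try:
        with os.fdopen(fd, "wb") as f:
            for _ in range(count):
                start = time.perf_counter()
                f.write(payload)
                f.flush()             # push Python's buffer to the OS
                os.fsync(f.fileno())  # force the data to stable storage
                latencies.append(time.perf_counter() - start)
    finally:
        os.remove(path)
    return latencies


def summarize(latencies, block_size):
    """Reduce a latency list to median, max, and overall throughput."""
    total = sum(latencies)
    written = block_size * len(latencies)
    return {
        "block_size": block_size,
        "median_ms": statistics.median(latencies) * 1000,
        "max_ms": max(latencies) * 1000,
        "throughput_mib_s": written / total / (1024 * 1024) if total > 0 else float("inf"),
    }


if __name__ == "__main__":
    # Small vs. large requests: 4 KiB and 1 MiB (arbitrary choices).
    for size in (4 * 1024, 1024 * 1024):
        print(summarize(measure_sync_writes(MOUNT_POINT, size), size))
```

Running the same script against each mount gives a like-for-like view of sync write latency, though it covers only sequential writes from a single client and says nothing about reads, metadata operations, or concurrency.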

Also: because of weaknesses in the old system, we will probably want to start with Luminous or Mimic, especially since no other releases are actually supported. BlueStore as the default is one of the biggest benefits.

bd808 assigned this task to JHedden.