
Replace all disk-usage flavor variants with Cinder use (was: Cinder storage vs. ephemeral storage vs. flavor)
Closed, Resolved, Public

Description

Cinder/Horizon support a (currently disabled) workflow that creates a Cinder volume in concert with VM creation. That suggests a possible new process for managing storage:

  • We rearrange flavors such that all flavors have the same ephemeral disk size (e.g. 20 GB)
  • Any use cases that require more storage are handled via attachable volumes.
  • We provide some modest puppet automation for common workflows (e.g. 'format attached ceph volume with one xfs partition and mount on /srv')

A few advantages:

  • Cinder storage has quotas, so we'd be able to manage tenant storage use without having to use flavor as a proxy
  • Users would be nudged in the direction of creating more cattle-like VMs since important storage would persist outside of VM crashes/rebuilds

Event Timeline

@Bstorm, iirc you had some reasons why we should continue to allow variable amounts of ephemeral storage. I created this task mostly so you'd have a place to write those down.

This all sounds quite sensible to me so far.

I've learned a few bad things:

  • The +volume workflow in Horizon doesn't actually create a separate volume; it just creates a user-sized primary volume for launching the image.
  • That workflow is EXTREMELY slow; in all my tests it exceeds the built-in 5-minute timeout. Forum posts suggest that this is slow for everyone. I could increase the timeout to 20 minutes, but it's hard to imagine users not assuming this is broken if it takes that long to start up.

So... I'm sticking with the original proposal but the user experience won't be as slick:

  • User creates a VM (which always has a pre-set size of 20 GB; still no variation in flavor size)
  • If the user wants to add more storage, they can create and attach it in a subsequent step via the volumes tab

That leaves the question of formatting. I could add default puppet behavior to every VM that detects attached volumes and automatically mounts and formats them as needed, or we could have puppet install a command-line tool that does the formatting and mounting. I'm leaning towards the latter.
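
The command-line option could look roughly like the sketch below. This is not the helper that was later merged; the function name `plan_prepare_volume` and the device path `/dev/sdb` are hypothetical, and the sketch formats the whole device rather than creating a partition first. It only builds the commands, so nothing touches a disk until the caller decides to run them.

```python
from typing import List


def plan_prepare_volume(device: str, mountpoint: str,
                        fstype: str = "xfs") -> List[List[str]]:
    """Return the commands that would format `device` and mount it.

    Hypothetical helper for illustration: nothing is executed here, so
    the plan can be reviewed or handed to subprocess.run by the caller.
    """
    if not device.startswith("/dev/"):
        raise ValueError(f"not a block device path: {device}")
    if not mountpoint.startswith("/"):
        raise ValueError(f"mountpoint must be absolute: {mountpoint}")
    return [
        ["mkfs", "-t", fstype, device],   # create the filesystem
        ["mkdir", "-p", mountpoint],      # make sure the mountpoint exists
        ["mount", device, mountpoint],    # mount the new filesystem
    ]


if __name__ == "__main__":
    # /dev/sdb is an assumed device name for a freshly attached volume.
    for cmd in plan_prepare_volume("/dev/sdb", "/srv"):
        print(" ".join(cmd))
```

Keeping this as an explicit tool, rather than default puppet behavior, means a volume is only ever formatted when a user asks for it.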

Could that be worked around with something like heat?

Change 658452 had a related patch set uploaded (by Andrew Bogott; owner: Andrew Bogott):
[operations/puppet@production] cloud-vps instances: add a helper script to format & mount a cinder volume

https://gerrit.wikimedia.org/r/658452

Change 658452 merged by Andrew Bogott:
[operations/puppet@production] cloud-vps instances: add a helper script to format & mount a cinder volume

https://gerrit.wikimedia.org/r/658452

For the near term we're leaving flavors as they are and only implementing by-hand workflows. Next steps are:

  • Give users some time to use the current processes
  • Assess existing puppet processes and update them to support cinder volumes
  • Simplify flavors, remove LVM from startup scripts

Andrew renamed this task from Cinder storage vs. ephemeral storage vs. flavor to Replace all disk-usage flavor variants with Cinder use (was: Cinder storage vs. ephemeral storage vs. flavor). Feb 12 2021, 4:17 PM

Change 668567 had a related patch set uploaded (by Andrew Bogott; owner: Andrew Bogott):
[operations/puppet@production] labs_lvm: check for available space before partitioning

https://gerrit.wikimedia.org/r/668567

Change 668567 merged by Andrew Bogott:
[operations/puppet@production] labs_lvm: check for available space before partitioning

https://gerrit.wikimedia.org/r/668567

Change 670961 had a related patch set uploaded (by Andrew Bogott; owner: Andrew Bogott):
[operations/puppet@production] cinderutils::ensure: Gracefully handle lvm legacy cases

https://gerrit.wikimedia.org/r/670961

Change 670962 had a related patch set uploaded (by Andrew Bogott; owner: Andrew Bogott):
[operations/puppet@production] rake default_facts: add defaults for the new 'cinder_volumes' fact

https://gerrit.wikimedia.org/r/670962

Change 670962 merged by Andrew Bogott:
[operations/puppet@production] rake default_facts: add defaults for the new 'cinder_volumes' fact

https://gerrit.wikimedia.org/r/670962

Change 670961 merged by Andrew Bogott:
[operations/puppet@production] cinderutils::ensure: Gracefully handle lvm legacy cases

https://gerrit.wikimedia.org/r/670961

Change 671208 had a related patch set uploaded (by Andrew Bogott; owner: Andrew Bogott):
[operations/puppet@production] prepare_cinder_volume.py: Add optional arg for mount options

https://gerrit.wikimedia.org/r/671208

Change 671210 had a related patch set uploaded (by Andrew Bogott; owner: Andrew Bogott):
[operations/puppet@production] cinderutils::ensure: support specifying mount options and file mode

https://gerrit.wikimedia.org/r/671210

Change 671209 had a related patch set uploaded (by Andrew Bogott; owner: Andrew Bogott):
[operations/puppet@production] prepare_cinder_volume.py: Add optional mount mode

https://gerrit.wikimedia.org/r/671209

In a few cases we have puppet code that creates tmp or swap volumes with LVM. As a rule we probably don't want to use cinder for that; instead, in those rare cases I'll create special flavors with ephemeral volumes (with --swap or --ephemeral in the flavor creation). That avoids the weirdness of persistent swap volumes, lets us allocate them with the same LVM-free code, and gets us away from our LVM hacks on the root volume.
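
As a rough illustration of such a flavor, the sketch below assembles an `openstack flavor create` command line with the `--ephemeral` (GB) and `--swap` (MB) options. The function name, the flavor name, and all sizes are hypothetical; only the 20 GB root disk default comes from this task.

```python
from typing import List


def flavor_create_command(name: str, vcpus: int, ram_mb: int,
                          root_gb: int = 20, ephemeral_gb: int = 0,
                          swap_mb: int = 0) -> List[str]:
    """Build (but do not run) an `openstack flavor create` command.

    Hypothetical helper: sizes follow the CLI's units, i.e. GB for
    --disk and --ephemeral, MB for --ram and --swap.
    """
    cmd = ["openstack", "flavor", "create",
           "--vcpus", str(vcpus),
           "--ram", str(ram_mb),
           "--disk", str(root_gb)]
    if ephemeral_gb:
        cmd += ["--ephemeral", str(ephemeral_gb)]  # extra non-persistent disk
    if swap_mb:
        cmd += ["--swap", str(swap_mb)]            # non-persistent swap disk
    cmd.append(name)
    return cmd


if __name__ == "__main__":
    # Example values only; the flavor name is made up.
    print(" ".join(flavor_create_command(
        "example-exec-node", vcpus=4, ram_mb=8192,
        ephemeral_gb=60, swap_mb=8192)))
```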

Change 672437 had a related patch set uploaded (by Andrew Bogott; owner: Andrew Bogott):
[operations/puppet@production] Refactor cindervolumes fact again

https://gerrit.wikimedia.org/r/672437

Change 672438 had a related patch set uploaded (by Andrew Bogott; owner: Andrew Bogott):
[operations/puppet@production] Add cinderutils::swap

https://gerrit.wikimedia.org/r/672438

Change 672437 merged by Andrew Bogott:
[operations/puppet@production] Refactor cindervolumes fact again

https://gerrit.wikimedia.org/r/672437

Change 671208 merged by Andrew Bogott:
[operations/puppet@production] prepare_cinder_volume.py: Add optional arg for mount options

https://gerrit.wikimedia.org/r/671208

Change 671209 merged by Andrew Bogott:
[operations/puppet@production] prepare_cinder_volume.py: Add optional mount mode

https://gerrit.wikimedia.org/r/671209

Change 671210 merged by Andrew Bogott:
[operations/puppet@production] cinderutils::ensure: support specifying mount options and file mode

https://gerrit.wikimedia.org/r/671210

Change 672456 had a related patch set uploaded (by Andrew Bogott; owner: Andrew Bogott):
[operations/puppet@production] Support building a grid-exec node with cinder volumes or flavor-defined ephemeral storage

https://gerrit.wikimedia.org/r/672456

Change 672438 merged by Andrew Bogott:
[operations/puppet@production] Add cinderutils::swap

https://gerrit.wikimedia.org/r/672438

Change 672538 had a related patch set uploaded (by Andrew Bogott; owner: Andrew Bogott):
[operations/puppet@production] Nova vendordata: rework initial partitioning

https://gerrit.wikimedia.org/r/672538

Change 672786 had a related patch set uploaded (by Andrew Bogott; owner: Andrew Bogott):
[openstack/horizon/horizon@master] instance view flavors: Load all readable flavors for instance context

https://gerrit.wikimedia.org/r/672786

Change 672786 merged by Andrew Bogott:
[openstack/horizon/horizon@master] WMF Hack: Load all readable flavors for instance context

https://gerrit.wikimedia.org/r/672786

Change 672790 had a related patch set uploaded (by Andrew Bogott; owner: Andrew Bogott):
[openstack/horizon/horizon@train-buster] WMF Hack: Load all readable flavors for instance context

https://gerrit.wikimedia.org/r/672790

Change 672790 merged by Andrew Bogott:
[openstack/horizon/horizon@train-buster] WMF Hack: Load all readable flavors for instance context

https://gerrit.wikimedia.org/r/672790

Change 673059 had a related patch set uploaded (by Andrew Bogott; owner: Andrew Bogott):
[openstack/horizon/deploy@train-buster] Bump Horizon submodule; should improve display of disabled flavors.

https://gerrit.wikimedia.org/r/673059

Change 673059 merged by Andrew Bogott:
[openstack/horizon/deploy@train-buster] Bump Horizon submodule; should improve display of disabled flavors.

https://gerrit.wikimedia.org/r/673059

Change 672538 merged by Andrew Bogott:
[operations/puppet@production] Nova vendordata: rework initial partitioning

https://gerrit.wikimedia.org/r/672538

This is now in place for the general cases:

  • New g3. flavors all have 20 GB default storage
  • LVM is no longer used for new VMs; rather, cloud-init resizes / to fill available space
  • All projects have a default cinder quota of 80 GB
  • More cinder features (attach/detach and dashboard summary) now working in Horizon
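
To make the quota arithmetic concrete, here is a minimal sketch of checking whether a new volume fits within a project's default quota. Cinder enforces quotas itself; the function name and the example volume sizes are hypothetical and only illustrate the bookkeeping.

```python
from typing import Iterable

DEFAULT_QUOTA_GB = 80  # default per-project cinder quota listed above


def fits_quota(existing_gb: Iterable[int], requested_gb: int,
               quota_gb: int = DEFAULT_QUOTA_GB) -> bool:
    """Return True if a new volume of `requested_gb` stays within quota."""
    if requested_gb <= 0:
        raise ValueError("requested size must be positive")
    return sum(existing_gb) + requested_gb <= quota_gb


if __name__ == "__main__":
    # A project that already holds 30 GB and 20 GB volumes (made-up sizes).
    print(fits_quota([30, 20], 30))
    print(fits_quota([30, 20], 40))
```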

I have a few pending patches for special cases in CI and Toolforge. I'll write a few more drafts but most of those will need to be dealt with when it comes time to build new nodes.

Change 673267 had a related patch set uploaded (by Arturo Borrero Gonzalez; owner: Arturo Borrero Gonzalez):
[operations/puppet@production] toolforge: grid: base: stop using LVM

https://gerrit.wikimedia.org/r/673267

Change 673267 abandoned by Arturo Borrero Gonzalez:
[operations/puppet@production] toolforge: grid: base: stop using LVM

Reason:
https://gerrit.wikimedia.org/r/c/operations/puppet/ /672456

https://gerrit.wikimedia.org/r/673267

Change 672456 merged by Andrew Bogott:
[operations/puppet@production] Support building a grid-exec node with cinder or flavor-defined storage

https://gerrit.wikimedia.org/r/672456

A variation of this is now done. All flavors are now limited to a 20 GB root (/) volume. Most use cases will use Cinder for additional storage; some special cases (e.g. Toolforge exec nodes) will have additional ephemeral storage instead of cinder volumes, handled via project-local custom flavors.