
Large images cloned to /var/lib/nova/instances/_base filling up disk on hypervisors
Closed, ResolvedPublic

Description

In at least one case this is blocking the launch of a ceph buster VM (on cloudvirt1028). It appears that when an instance launches, the image is cloned to /var/lib/nova/instances/_base, which lives on the hypervisor's root disk. Since the images can be quite large (and possibly are never cleaned up?), this fills up the disk.

eg:

[bstorm@cloudvirt1028]:instances $ du -sh _base/*
20G	_base/4dfdfe374763360d37aa1d930157602452d8c8dd
20G	_base/612031ed4f080b36c1841be7b0a227577fcf7935
20G	_base/da3e2add64b5dd579131564236e799ff768fc7a7
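
If I understand nova's libvirt image cache correctly, those filenames are the SHA-1 of the glance image UUID, so they can be matched back to specific images roughly like this (the UUID below is made up):

$ openstack image list
$ echo -n "0a153bbd-1c2d-4e3f-9a8b-7c6d5e4f3a2b" | sha1sum
# if the hash matches one of the _base filenames, that is the image it was cloned from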

Event Timeline

Bstorm triaged this task as High priority. Feb 22 2021, 8:06 PM
Bstorm created this task.

This was noticed while trying to start canary1028-01. It is worth noting that the image for buster was also refreshed.

This appears to be a consequence of our switching from qcow to raw images when we moved to ceph. I'm not 100% sure exactly how to resolve this just yet.

The upstream guide at https://docs.ceph.com/en/latest/rbd/rbd-openstack/ assumes you always use cinder for disks.

Therefore, not very helpful.

I can confirm we use the default of show_image_direct_url=False. The config files warn clearly about the implications of changing that, so it would need some review before we do.

More pertinent info is at https://bugzilla.redhat.com/show_bug.cgi?id=1062022. That is very clearly what we are doing: we download the image to the local disk of the cloudvirt and then upload it back to RBD to create the disk. This suggests that we would only be able to stop doing this if the image direct URL is shared, but again, I'm quite iffy about the security of that.

Confirmed that you do need that setting, per Red Hat: https://access.redhat.com/documentation/en-us/red_hat_ceph_storage/4/html/block_device_to_openstack_guide/configuring-openstack-to-use-ceph-block-devices#configuring-glance-to-use-ceph-block-devices_rbd-osp. It seems that it just exposes the direct URL of the ceph volume. The doc is not 100% clear on whether it is the glance API or the ceph endpoint that must not be publicly accessible. Either way, ceph is not public, and the glance API can be locked down via ACLs in haproxy if there's no option within ferm. That should make this doable.

For that matter, I don't see why we wouldn't use the recommended properties there for images as well.

hw_scsi_model=virtio-scsi
hw_disk_bus=scsi
hw_qemu_guest_agent=yes
os_require_quiesce=yes
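
Since these are just glance image properties, setting them should be a single CLI call. A sketch, assuming the usual openstack client and a placeholder image name:

$ openstack image set \
    --property hw_scsi_model=virtio-scsi \
    --property hw_disk_bus=scsi \
    --property hw_qemu_guest_agent=yes \
    --property os_require_quiesce=yes \
    debian-10.0-buster   # placeholder image name

Only new instances built from the image should pick these up; existing VMs keep whatever bus they booted with unless rebuilt.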

The one in that list I'd most like to test on an image in codfw is hw_scsi_model=virtio-scsi. The docs for rocky suggest that the default is virtio-blk, which has lower performance (and I can confirm that is what is in use). Perhaps this is an optimization that might help with T273649: Improve ceph performance, since nova apparently accepts image properties as overrides of its own defaults.
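
To confirm which bus a given VM is actually using, one option (a sketch, assuming shell access on the hypervisor; the domain name below is a placeholder) is to check the libvirt domain XML:

$ sudo virsh dumpxml instance-0000abcd | grep -E "disk type|target dev|virtio-scsi"
# virtio-blk shows <target dev='vda' bus='virtio'/>;
# virtio-scsi shows bus='scsi' plus a <controller model='virtio-scsi'> element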

I don't have any real concerns about sharing the download URL; we can always firewall it off from the public net if necessary.

Regarding the image settings: Jason also left notes about image creation rules:

https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Ceph#Glance

I can't remember why I didn't switch things over; I think I was worried about breaking compatibility with older image types? In any case, we should definitely experiment with it.

Yeah, it looks like he left out two of them, likely because they don't show up in the list for rocky, so they might be deprecated anyway :)

So I found that the value must be set in glance-api.conf (not in glance-cache.conf, where it appears commented out). In my latest test, though, after deleting the locally cached images, nova is re-downloading the image for some reason.
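
For reference, the change itself is tiny; in glance-api.conf it should amount to something like this (a sketch; the exact section and surrounding options should be checked against our puppetized template):

[DEFAULT]
# Expose the RBD location of images so that nova/cinder can make
# copy-on-write clones instead of downloading and re-uploading the image.
show_image_direct_url = True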

I can confirm that instance launch waits for nova-compute to download the image locally for basically no reason.

Change 666435 had a related patch set uploaded (by Bstorm; owner: Bstorm):
[operations/puppet@production] openstack-glance: enable copy-on-write inside ceph

https://gerrit.wikimedia.org/r/666435

Mentioned in SAL (#wikimedia-cloud) [2021-02-23T20:36:00Z] <andrewbogott> adding r/o access to the eqiad1-glance-images ceph pool for the client.eqiad1-compute for T275430
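
For the record, a grant like that is normally done with ceph auth caps. A rough sketch only: the compute pool name below is a guess, and ceph auth caps replaces the whole cap list, so the existing caps have to be repeated alongside the new read-only grant:

$ sudo ceph auth caps client.eqiad1-compute \
    mon 'profile rbd' \
    osd 'profile rbd pool=eqiad1-compute, profile rbd-read-only pool=eqiad1-glance-images'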

Change 666435 merged by Bstorm:
[operations/puppet@production] openstack-glance: enable copy-on-write inside ceph

https://gerrit.wikimedia.org/r/666435

This works now in codfw and should be working in eqiad. Messing with image properties now in codfw.

So far, I've set the following on the buster image in codfw:

hw_scsi_model=virtio-scsi
hw_disk_bus=scsi

That works on new VMs and doesn't seem to have adversely affected existing VMs.
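
A quick sanity check from inside a newly built guest (an assumption on my part about the simplest way to spot the difference): with virtio-blk the root disk appears as vda, while with virtio-scsi it appears as sda.

$ lsblk
# root disk listed as vda -> still virtio-blk
# root disk listed as sda -> virtio-scsi in use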

OK, the other two properties I'm not sure we want to use. They are very good for taking snapshots, specifically:

hw_qemu_guest_agent
If true, the QEMU guest agent will be exposed to the instance.

os_require_quiesce
If true, require quiesce on snapshot via the QEMU guest agent.

The guest agent can also do some things I don't entirely like. I am proposing that we not enable those two for now, but I'd like to enable the scsi properties on all our images.

Mentioned in SAL (#wikimedia-cloud) [2021-02-23T22:40:26Z] <bstorm> rebuild the canary for 1028 after image changes and all is well T275430

Mentioned in SAL (#wikimedia-cloud) [2021-02-23T22:43:55Z] <bstorm> set --property hw_scsi_model=virtio-scsi and --property hw_disk_bus=scsi on the main buster image in glance on eqiad1 T275430

Mentioned in SAL (#wikimedia-cloud) [2021-02-24T00:17:09Z] <bstorm> set --property hw_scsi_model=virtio-scsi and --property hw_disk_bus=scsi on the main stretch image in glance on eqiad1 T275430

Bstorm claimed this task.

Resolving this because new images won't keep getting added to _base; however, we *might* want to try cleaning up the ones that are already there. I'm not absolutely sure that's safe yet.
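
If we do decide to clean them up automatically, nova's image cache manager has knobs for it. A sketch against a rocky-era nova.conf; the option names are real, but the sections and defaults are worth double-checking for our version, and we'd want to confirm the periodic task treats the now-unused _base files as removable:

[DEFAULT]
# Remove _base files that no instance on this host references
remove_unused_base_images = True
# Minimum age (seconds) an unreferenced base file must reach before removal
remove_unused_original_minimum_age_seconds = 86400
# How often the image cache manager periodic task runs (seconds)
image_cache_manager_interval = 2400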