
tools-docker-registry almost out of disk space
Closed, Resolved · Public

Description

tools-docker-registry is almost out of disk space:

/dev/sdb         79G   72G  3.1G  96% /srv/registry

Marking as high priority: we'll need to rebuild most images (new webservice version) and add some new ones when Debian Bullseye comes out, likely next week (August 14th). It's a Cinder volume, so I think we should be able to just grow it.

Event Timeline

taavi triaged this task as High priority. Aug 5 2021, 11:21 AM
taavi created this task.
Restricted Application added a subscriber: Aklapper.

Mentioned in SAL (#wikimedia-cloud) [2021-08-05T23:04:05Z] <bstorm> extended docker registry volume to 120GB T288229

That appears to be the right disk, but the new size doesn't appear to have reached the VM.

From fdisk -l:

Disk /dev/sdb: 80 GiB, 85899345920 bytes, 167772160 sectors
Disk model: QEMU HARDDISK
Units: sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes

So, I'm not sure what's up there. resize2fs won't work if the VM doesn't see the space.
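To make the mismatch concrete, here is a minimal sketch of the arithmetic behind the fdisk output above. It assumes the "120GB" requested from Cinder means 120 GiB; the function names are illustrative only.

```python
SECTOR_BYTES = 512  # logical sector size reported by fdisk above
GIB = 2**30


def sectors_to_gib(sectors: int, sector_bytes: int = SECTOR_BYTES) -> float:
    """Convert a sector count, as printed by fdisk, to GiB."""
    return sectors * sector_bytes / GIB


def unseen_gib(requested_gib: int, sectors: int) -> float:
    """GiB that the volume API reports but the VM cannot see yet."""
    return requested_gib - sectors_to_gib(sectors)


if __name__ == "__main__":
    sectors = 167_772_160  # from the fdisk output above
    print(f"VM sees {sectors_to_gib(sectors):.0f} GiB")
    # Assumption: the 120GB extension is 120 GiB.
    print(f"Not visible to the VM: {unseen_gib(120, sectors):.0f} GiB")
```

As long as the second number is above zero, the block device itself has not grown inside the VM, so there is nothing for resize2fs to expand into.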

This makes me think I should have used the CLI https://docs.openstack.org/cinder/latest/cli/cli-manage-volumes.html#extend-attached-volume
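For reference, a rough sketch of what the CLI route on that page might look like, wrapped in Python so the command is only assembled, not run. The microversion 3.42 and the `volume set --size` form are taken from my reading of the linked Cinder documentation and should be treated as assumptions; the helper name and `VOLUME_ID` placeholder are hypothetical. Whether a given deployment accepts the request is a separate question.

```python
from typing import List


def extend_in_use_volume_cmd(volume_id: str, new_size_gb: int) -> List[str]:
    """Build (but do not run) the CLI call to extend an attached volume.

    Assumption: block storage API microversion 3.42 is what permits
    extending a volume whose status is in-use.
    """
    if new_size_gb <= 0:
        raise ValueError("new_size_gb must be positive")
    return [
        "openstack",
        "--os-volume-api-version", "3.42",
        "volume", "set",
        "--size", str(new_size_gb),
        volume_id,
    ]


if __name__ == "__main__":
    # VOLUME_ID is a placeholder, not the real registry volume.
    print(" ".join(extend_in_use_volume_cmd("VOLUME_ID", 120)))
```

The list could be handed to `subprocess.run(..., check=True)` once the volume ID and credentials are in place.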

I thought this worked in testing even with the webUI though...

Mentioned in SAL (#wikimedia-cloud) [2021-08-05T23:50:56Z] <bstorm> rebooting the docker registry T288229

Well, can't blame me for trying. Actually, the problem is clear in the logs on Horizon:

Screen Shot 2021-08-05 at 4.55.22 PM.png (324×2 px, 233 KB)

The API that allows live resizing is not available in OpenStack Victoria, it seems; it was only added in Wallaby. So I'm not actually sure how we came to believe we could expand these live, though I could swear we tested that.

Anyway. I'm inclined to make a copy of the volume because now it is not actually 120GB, but the API thinks it is. That seems bad to me.

The simple way, unless the puppetization has changed, would be to build a replica for it.

Mentioned in SAL (#wikimedia-cloud) [2021-08-06T00:21:57Z] <bstorm> provisioning second docker registry server to rsync to (120GB disk and fairly large server) T288229

Mentioned in SAL (#wikimedia-cloud) [2021-08-06T00:42:58Z] <bstorm> set up sync between the new registry host and the existing one T288229

Mentioned in SAL (#wikimedia-cloud) [2021-08-06T16:16:59Z] <bstorm> failed over to tools-docker-registry-06 (which has more space) T288229

Despite what the OpenStack documentation says, a hard reboot "fixed" the disparity between the API and the disk. Now both servers have 120GB of space.

Both servers now look like:

Filesystem      Size  Used Avail Use% Mounted on
/dev/sdb        118G   72G   41G  64% /srv/registry
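As a side note on reading these numbers: the Use% column is not simply Used/Size. GNU df computes it as used / (used + available), rounded up, so space reserved for root (probably the ext4 default reserve here, which is an assumption) does not count. A small sketch using the rounded figures from the two df outputs in this task:

```python
import math


def df_use_percent(used: float, avail: float) -> int:
    """Approximate df's Use% column: used / (used + avail), rounded up.

    Inputs are the already-rounded human-readable figures, so this is
    only an approximation of what df computes from exact block counts.
    """
    return math.ceil(used / (used + avail) * 100)


if __name__ == "__main__":
    before = df_use_percent(72, 3.1)  # 79G volume, before the resize
    after = df_use_percent(72, 41)    # 118G volume, after the resize
    print(f"before: {before}%  after: {after}%")
```

This reproduces the 96% and 64% shown above, and explains why the original disk was effectively full even though Size minus Used suggested about 7G left.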