deployment-jobrunner04.deployment-prep.eqiad1.wikimedia.cloud inadequate storage resources
Closed, Resolved (Public)

Description

root@deployment-jobrunner04:~# df -h -t ext4
Filesystem      Size  Used Avail Use% Mounted on
/dev/sda1        20G   11G  8.6G  55% /
/dev/sdb        9.8G  9.3G     0 100% /srv

/srv keeps filling up during beta scap sync-world.

As a temporary workaround, I copied /srv/mediawiki/ to /mediawiki and then bind-mounted /mediawiki onto /srv/mediawiki, since the root filesystem had plenty of free space to accommodate the tree. /etc/fstab has been altered so that this arrangement should survive a reboot, but that has not been tested yet.
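For reference, the workaround described above can be sketched roughly as follows. The paths are the ones from this task; the use of rsync, the fstab options, and the helper names fstab_bind_entry and bind_mount_plan are assumptions for illustration, not a record of the commands that were actually run. The script only prints the commands so they can be reviewed before running them as root.

```shell
#!/bin/sh
# Sketch of the bind-mount workaround. Nothing here modifies the system:
# the functions only print what would be run.

# Print the /etc/fstab line for a persistent bind mount of $1 onto $2.
# (Field layout "source target none bind 0 0" is an assumed, commonly used form.)
fstab_bind_entry() {
    printf '%s %s none bind 0 0\n' "$1" "$2"
}

# Print the commands for copying the tree at $2 to $1 and bind-mounting
# $1 back onto $2.
bind_mount_plan() {
    src=$1
    dst=$2
    echo "rsync -a $dst/ $src/"
    echo "mount --bind $src $dst"
    echo "echo '$(fstab_bind_entry "$src" "$dst")' >> /etc/fstab"
}

bind_mount_plan /mediawiki /srv/mediawiki
```

After a reboot, findmnt /srv/mediawiki would be one way to confirm that the fstab entry took effect.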

I assume that the proper solution is to reimage the system (or create a jobrunner05) with a more appropriate storage layout: a smaller root filesystem and a much larger /srv (at least 20GB).

Event Timeline

The instance flavor is g3.cores4.ram8.disk20, so the instance has only a 20G disk, all of which is allocated to /.

The 10G /dev/sdb comes from the jobrunner04 volume. It can be extended (there is quota left on the project); I guess the partition then has to be unmounted and resized manually. The volume was created by @taavi.

Side question: /srv/mediawiki/php-master/cache/l10n is almost 5G:

  • 2.4G for the l10n_cache-XXX.cdb files
  • 2.5G for upstream/l10n_cache-*.cdb.json files

I have long lost track of how l10n caching works. My guess is that the json files are synced to the app servers and the cdb files are regenerated there, after which the json files are no longer used.
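The two figures above can be reproduced with something like the following sketch. The directory and file patterns are the ones quoted in this comment; the helper name sum_bytes is hypothetical, and the script relies on GNU find's -printf, which is an assumption about the host.

```shell
#!/bin/sh
# Total size in bytes of the regular files directly inside directory $1
# whose names match the glob $2. Prints 0 when nothing matches.
sum_bytes() {
    find "$1" -maxdepth 1 -type f -name "$2" -printf '%s\n' |
        awk '{ total += $1 } END { printf "%.0f\n", total }'
}

cache=/srv/mediawiki/php-master/cache/l10n
if [ -d "$cache" ]; then
    echo "cdb:  $(sum_bytes "$cache" 'l10n_cache-*.cdb') bytes"
    echo "json: $(sum_bytes "$cache/upstream" 'l10n_cache-*.cdb.json') bytes"
fi
```

This only measures the files; it does not tell whether the json files are still read after the cdb files have been rebuilt.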

I extended the size of the volume to 25GB, but the change did not register in the VM. The Horizon UI says "extend volume: Compute service failed to extend volume." I will reboot the VM.

Mentioned in SAL (#wikimedia-releng) [2023-01-18T21:33:48Z] <dancy> Rebooting deployment-jobrunner04 for T327329

dancy claimed this task.

Rebooting didn't work. Ultimately I had to shut down the VM, detach the volume, reattach the volume, then start the VM back up.
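For anyone hitting the same problem, the sequence above might look roughly like this with the OpenStack command-line client. This is a sketch under assumptions: it is not stated here which tool was actually used, the server and volume names are taken from this task, and the helper name reattach_plan is hypothetical. The script only prints the commands.

```shell
#!/bin/sh
# Print the stop / detach / reattach / start sequence for server $1
# and volume $2. Nothing is executed against the cloud.
reattach_plan() {
    server=$1
    volume=$2
    echo "openstack server stop $server"
    echo "openstack server remove volume $server $volume"
    echo "openstack server add volume $server $volume"
    echo "openstack server start $server"
}

reattach_plan deployment-jobrunner04 jobrunner04
```

Once the VM is back up, lsblk on the instance should show the device at its new size before any filesystem resize is attempted.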

I resized the /srv/ filesystem and removed the prior hacks. beta-scap-sync-world is still happy. Closing this ticket.
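The exact resize commands are not recorded here. A minimal sketch, assuming the ext4 filesystem sits directly on /dev/sdb (as the df output in the description suggests) and that resize2fs was the tool, could look like this. The run and grow_ext4 helpers and the DRY_RUN variable are hypothetical; by default the commands are only printed.

```shell
#!/bin/sh
# Set DRY_RUN=0 to actually execute the commands (requires root).
DRY_RUN=${DRY_RUN:-1}

# Print the command when DRY_RUN=1, otherwise execute it.
run() {
    if [ "$DRY_RUN" = 1 ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# Grow the ext4 filesystem on device $1 and report usage of mount point $2.
grow_ext4() {
    dev=$1
    mnt=$2
    run lsblk "$dev"        # check that the kernel sees the new device size
    run resize2fs "$dev"    # without a size argument, grows to fill the device
    run df -h "$mnt"        # confirm the new size and free space
}

grow_ext4 /dev/sdb /srv
```

If the filesystem lived on a partition rather than on the whole device, the partition would have to be grown first; that case is not covered by this sketch.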

Well done, thanks for the fix!