gerrit1002 running out of space
Open, Medium · Public

Description

There's an icinga alert for gerrit1002, which is running out of space:

root@gerrit1002:/srv/gerrit# df -hT /
Filesystem     Type  Size  Used Avail Use% Mounted on
/dev/vda1      ext4   63G   58G  2.3G  97% /
root@gerrit1002:/srv/gerrit# du -sh *
30G	git
4.0K	jvmlogs
15G	plugins
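
For reference, a rough sketch of how the breakdown above could be taken one level deeper to see which subdirectories hold the space. The helper name largest_dirs and its defaults are made up for illustration, and it assumes GNU du (for --max-depth).

```shell
# Hypothetical helper: show the biggest directories (sizes in KiB) under a path.
# Assumes GNU du; -x keeps du on one filesystem so other mounts are not counted.
largest_dirs() {
    dir="${1:-/srv/gerrit}"   # directory to inspect (default is an assumption)
    count="${2:-10}"          # how many lines to show
    du -x -k --max-depth=2 "$dir" 2>/dev/null | sort -rn | head -n "$count"
}

# Example (not run here): largest_dirs /srv/gerrit 15
```

The first line of the output is the total for the directory itself; the lines after it are the largest subdirectories.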

Event Timeline

Restricted Application added a subscriber: Aklapper. · Tue, Jan 28, 1:53 AM
Marostegui triaged this task as High priority. · Tue, Jan 28, 1:53 AM
Dzahn added a comment. · Tue, Jan 28, 1:55 AM

This is not the production service. This is a test setup for the 2.16 upgrade.

Marostegui lowered the priority of this task from High to Medium. · Tue, Jan 28, 1:57 AM

Thanks, I have added a comment to the alert to avoid confusion.

Dzahn added a comment. · Tue, Jan 28, 2:00 AM

I tried to avoid alerts for this host by turning off monitoring with https://gerrit.wikimedia.org/r/c/operations/puppet/+/562619, but that isn't enough: the base checks still pop up in the web UI, even though they don't send notifications.

I will schedule a long downtime.

Thank you :-)

Mentioned in SAL (#wikimedia-operations) [2020-01-28T02:05:33Z] <mutante> gerrit1002 - gzipping a bunch of /var/log/gerrit/ log files (T243808)
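
A minimal sketch of what "gzipping a bunch of log files" could look like. The SAL entry does not say which files were compressed or how, so the file pattern, the age cutoff and the helper name compress_old_logs are all assumptions.

```shell
# Hypothetical helper: gzip log files that have not been modified recently.
# The '*.log*' pattern and the age cutoff are assumptions, not taken from the task.
compress_old_logs() {
    log_dir="${1:-/var/log/gerrit}"
    min_age_days="${2:-1}"
    # -mtime +N skips files modified within the last N days, which should leave
    # the logs that are still being written to alone.
    find "$log_dir" -type f -name '*.log*' ! -name '*.gz' \
        -mtime +"$min_age_days" -exec gzip -- {} +
}

# Example (not run here): compress_old_logs /var/log/gerrit 1
```

As noted further down in this task, compressing logs only buys a little time; it does not free the kind of space this host needs.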

Dzahn added a comment. · Tue, Jan 28, 2:19 AM

@thcipriani ^ After the gzipping above, this is back to 94% as of right now, and the host has been downtimed for a month. Is the test instance usable with the current size?

Also, FWIW, when I looked at the largest files in /srv, one stood out: it is identified as gzip compressed data, original name "GoogleNews-vectors-negative300.bin", in /srv/gerrit/plugins/plugins/lfs/21/c0. It's a 1.6G compressed file.
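
A search like the one described above could be sketched as follows. The helper name big_files and the size threshold are illustrative assumptions, not the command that was actually run.

```shell
# Hypothetical helper: list files above a size threshold, largest first.
# -xdev keeps find on a single filesystem.
big_files() {
    root="${1:-/srv}"
    min_size="${2:-+1G}"   # find -size syntax: "+1G" means larger than 1 GiB
    find "$root" -xdev -type f -size "$min_size" -exec ls -lhS {} +
}

# Example (not run here): big_files /srv +1G
# A suspicious hit can then be identified with file(1): file <path>
```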

Ugh. I did some digging on this machine today: it seems like most of the data there is legitimately in use by Gerrit; i.e., there is nothing obvious to trash (aside from rotating some logs early, but that won't get us the kind of space we evidently need).

Is there an easy way to expand the disk space here or move /srv to a separate partition? It seems like even though the git repos only total about 30GB, we have enough other data to fill up the disk :(

See T243983. I added a second disk to this VM: an additional 10GB, mounted on /srv/dbdump. Hope that does it.

Marostegui added a subscriber: MoritzMuehlenhoff.

Per the duplicate task filed by @MoritzMuehlenhoff, which I merged here:

root@gerrit1002:~# df -hT /
Filesystem     Type  Size  Used Avail Use% Mounted on
/dev/vda1      ext4   63G   60G     0 100% /

Per a discussion @thcipriani and I just had: we found LFS objects using 15G, so if we remove /srv/dbdump and recreate it as a 20G partition, we can move the objects there.

Actually, we will use that partition for db readonly, so I think a new 18G /srv/lfs partition would do.

Mentioned in SAL (#wikimedia-operations) [2020-02-20T23:25:52Z] <mutante> ganeti1003 - adding another virtual 20G disk to gerrit1002 (T243808)

Had to fix /etc/network/interfaces again (the interface name changed again: ens5 -> ens6, and now ens7) and restart to fix networking.

Then formatted the additional 20G disk with ext4 and mounted it on /srv/lfs. Added it to /etc/fstab to survive reboots.

/dev/vdc         20G   45M   19G   1% /srv/lfs
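
The format/mount/fstab step above could be sketched roughly like this. It is not the exact sequence that was run: the helper names are made up, and referencing the disk by UUID in /etc/fstab (rather than /dev/vdc) is an assumption, chosen here because device and interface names have already shifted on this VM.

```shell
# Hypothetical helper: build an fstab line that mounts an ext4 filesystem by UUID.
fstab_line() {
    printf 'UUID=%s %s ext4 defaults 0 2\n' "$1" "$2"
}

# Hypothetical helper: format a new disk, register it in fstab and mount it.
# WARNING: mkfs.ext4 destroys any data on the device. Needs root.
add_data_disk() {
    device="$1"                 # e.g. /dev/vdc
    mount_point="$2"            # e.g. /srv/lfs
    fstab="${3:-/etc/fstab}"
    mkfs.ext4 "$device" || return 1
    mkdir -p "$mount_point" || return 1
    uuid=$(blkid -s UUID -o value "$device") || return 1
    fstab_line "$uuid" "$mount_point" >> "$fstab" || return 1
    mount "$mount_point"        # mount looks the entry up in fstab
}

# Example (not run here): add_data_disk /dev/vdc /srv/lfs
```

Mounting via the fstab entry rather than by hand doubles as a check that the entry works before the next reboot.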