
tools-bastion-02 /dev/vda1 (root mount) 100% full
Closed, Resolved · Public

Description

While using tab completion in bash, I got "-bash: cannot create temp file for here-document: No space left on device".
Checking disk usage shows that /dev/vda1 is 100% full:

04:21:41 0 ✓ zhuyifei1999@tools-bastion-02: ~$ df -h
Filesystem                                                          Size  Used Avail Use% Mounted on
udev                                                                3.9G   12K  3.9G   1% /dev
tmpfs                                                               799M  520K  799M   1% /run
/dev/vda1                                                            18G   18G     0 100% /
none                                                                4.0K     0  4.0K   0% /sys/fs/cgroup
none                                                                5.0M     0  5.0M   0% /run/lock
none                                                                3.9G     0  3.9G   0% /run/shm
none                                                                100M     0  100M   0% /run/user
labstore.svc.eqiad.wmnet:/project/tools/project                     8.0T  5.3T  2.7T  67% /data/project
labstore.svc.eqiad.wmnet:/project/tools/project/.system/gridengine  8.0T  5.3T  2.7T  67% /var/lib/gridengine
labstore.svc.eqiad.wmnet:/scratch                                   984G  613G  322G  66% /data/scratch
labstore.svc.eqiad.wmnet:/project/tools/home                        8.0T  5.3T  2.7T  67% /home
labstore1003.eqiad.wmnet:/dumps                                      44T   14T   30T  32% /public/dumps

Event Timeline

zhuyifei1999 raised the priority of this task to Needs Triage.
zhuyifei1999 updated the task description.
zhuyifei1999 added a project: Toolforge.
Restricted Application added subscribers: StudiesWorld, Aklapper.

I removed the files in /tmp belonging to my tool with:

tools.yifeibot@tools-bastion-02:~$ for FILE in $(ls -l /tmp | grep yifeibot | awk '{print $(NF)}'); do rm -rv /tmp/$FILE; done

That didn't really help.
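
The loop above parses ls output, so it breaks on file names containing whitespace and matches any line that merely mentions yifeibot. A minimal sketch of a safer alternative, assuming GNU find (for -mindepth/-maxdepth) and using a hypothetical helper name clean_owned, could look like this:

```shell
# Hypothetical helper (not part of the original thread): list, or delete,
# the top-level entries of a directory that are owned by a given user.
# Usage: clean_owned DIR OWNER [delete]
clean_owned() {
    dir=$1
    owner=$2
    mode=${3:-list}
    if [ "$mode" = "delete" ]; then
        # -exec ... {} + passes names to rm directly, so whitespace is safe.
        find "$dir" -mindepth 1 -maxdepth 1 -user "$owner" -exec rm -rv -- {} +
    else
        # Default: only print what would be removed.
        find "$dir" -mindepth 1 -maxdepth 1 -user "$owner" -print
    fi
}
```

Running `clean_owned /tmp tools.yifeibot` would only list the matching entries; appending `delete` removes them.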

Seems to have been a transient issue:

/dev/vda1                                                            18G   14G  3.7G  79% /

but Graphite suggests something is (still) steadily eating space, with some sort of cleanup every now and then:

[Image: pasted_file, Graphite graph of disk usage]

After some digging with du, a lot of the space seems to be in:

valhallasw@tools-bastion-02:/var/log/account$ ls -lh
total 5.2G
-rw-r----- 1 root adm 852M Jan  1 11:18 pacct
-rw-r----- 1 root adm 4.4G Jan  1 06:49 pacct.0

but, rather than the NFS issue in T107052 ("tools bastion accounting logs super noisy, filling /var"), this seems to be caused by tools.icebot running lots of commands.
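
The exact du invocations are not recorded in this task. As a rough sketch of how such a search might be done, assuming GNU du (for --max-depth) and a hypothetical helper name largest_dirs:

```shell
# Hypothetical helper: show the largest entries (in KiB) directly below a
# directory. -x keeps du on one filesystem, so NFS mounts are not walked
# when starting from /.
# Usage: largest_dirs [DIR] [COUNT]
largest_dirs() {
    dir=${1:-/}
    count=${2:-10}
    du -xk --max-depth=1 "$dir" 2>/dev/null | sort -rn | head -n "$count"
}
```

The first output line is the total for the directory itself. Repeating the call on the largest subdirectory (for example `largest_dirs /var 5`, then `largest_dirs /var/log 5`) narrows the search step by step.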

Sorry, yes, tools.icelab. The tool is basically running lots of small SQL queries from bash, in parallel from different screens. In general, I would say that's reasonable, but it creates almost 50k entries in pacct per minute. The load is also absurdly high (~55), but nothing is really sluggish. I haven't looked at it further (or thought about solutions), as it didn't really seem to be an 'unbreak now' situation and it's the first of January ;-)

I have reminded the user of the rules and asked him to optimize his script on his talk page.
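
The tool's script is not shown here, so the following is only an illustration of one possible optimization, not what the user actually did. Each process that exits adds a record to pacct, so starting one SQL client per query generates far more accounting entries than piping all queries into a single client. The helper names run_per_query and run_batched are hypothetical:

```shell
# Pattern that floods pacct: one client process is started per query.
# Usage: run_per_query CLIENT [ARGS...] < queries
run_per_query() {
    while IFS= read -r query; do
        printf '%s\n' "$query" | "$@"
    done
}

# Batched alternative: every query on stdin goes to a single client process.
# Usage: run_batched CLIENT [ARGS...] < queries
run_batched() {
    "$@"
}
```

With a file of statements, something like `run_batched mysql --batch some_db < queries.sql` (some_db and queries.sql are placeholders) would start one client instead of one per line.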

I have looked at my mail archive and I do not see a Shinken alert for this event. Did I miss it? I'm quite sure that in the past I have received warnings for (nearly) full disks.
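
How the Shinken check is configured is not stated in this task. Purely as an illustration of the kind of threshold test such an alert performs, here is a minimal sketch; the helper name check_disk and the default limit of 90% are assumptions:

```shell
# Hypothetical check: warn when a mount point is at or above a usage limit.
# Usage: check_disk [MOUNT] [LIMIT_PERCENT]
check_disk() {
    mount=${1:-/}
    limit=${2:-90}
    # df -P prints one line per filesystem; column 5 is the capacity ("79%").
    used=$(df -P "$mount" | awk 'NR == 2 { sub(/%/, "", $5); print $5 }')
    if [ "$used" -ge "$limit" ]; then
        echo "WARNING: $mount is ${used}% full (limit ${limit}%)"
        return 1
    fi
    echo "OK: $mount is ${used}% full"
}
```

With the numbers from the description, `check_disk / 90` would have printed a warning and returned a non-zero status.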