
deployment-kafka01 out of disk space
Closed, Resolved · Public

Description

Found while looking into T170521: deployment-logstash2 out of disk space

$ ssh deployment-kafka01.eqiad.wmflabs df -h
Filesystem      Size  Used Avail Use% Mounted on
udev             10M     0   10M   0% /dev
tmpfs           401M   41M  361M  11% /run
/dev/vda3        19G   19G     0 100% /
tmpfs          1003M     0 1003M   0% /dev/shm
tmpfs           5.0M     0  5.0M   0% /run/lock
tmpfs          1003M     0 1003M   0% /sys/fs/cgroup
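A full / can also mean inode exhaustion rather than byte usage, so alongside df -h it can be worth a quick inode check (a generic diagnostic, not part of the original report):

$ ssh deployment-kafka01.eqiad.wmflabs df -i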

Event Timeline

greg added a subscriber: greg.

kafka -> Analytics

Analytics: please help diagnose/fix the beta cluster kafka host.

Nuria added subscribers: Ottomata, Nuria.

Any of us during ops week should be able to do it. Ping @Ottomata: what is the best way to free space?

Look for what is causing the used space. In this case /var/log/daemon.log* was really huge. Not really sure why; I see lots of errors about Puppet failing, but those are due to the lack of space.
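A straightforward way to find that kind of culprit is to walk the filesystem with du, largest entries first (a generic sketch; the paths are illustrative):

$ sudo du -xh --max-depth=1 / 2>/dev/null | sort -rh | head    # -x stays on the root filesystem
$ sudo du -sh /var/log/* | sort -rh | head                     # then drill into the biggest directory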

I removed /var/log/daemon.log.1 to recover 3.5G.
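Worth noting: rm only frees the space once no process still holds the file open. A rotated file like daemon.log.1 is safe to delete, but for the live daemon.log truncating in place is the safer move (standard coreutils, nothing specific to this host):

$ sudo truncate -s 0 /var/log/daemon.log    # empty the active log without invalidating rsyslog's open file descriptor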

Probably the right thing to do would be to audit the Kafka clusters and nodes in beta, and use larger instances. We should replace some of them as part of T152015.

The cause is likely logging of MediaWiki revision-create events last week.
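If that kind of burst is expected again, tighter size-based rotation on the syslog files would cap the damage. A hedged logrotate sketch; the size/rotate values are illustrative and the postrotate hook varies by distro:

/var/log/daemon.log {
    rotate 2
    size 500M
    compress
    delaycompress
    missingok
    notifempty
    postrotate
        invoke-rc.d rsyslog rotate > /dev/null
    endscript
}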