
deployment-kafka01 out of disk space
Closed, Resolved · Public

Description

Found while looking into T170521: deployment-logstash2 out of disk space

$ ssh deployment-kafka01.eqiad.wmflabs df -h
Filesystem      Size  Used Avail Use% Mounted on
udev             10M     0   10M   0% /dev
tmpfs           401M   41M  361M  11% /run
/dev/vda3        19G   19G     0 100% /
tmpfs          1003M     0 1003M   0% /dev/shm
tmpfs           5.0M     0  5.0M   0% /run/lock
tmpfs          1003M     0 1003M   0% /sys/fs/cgroup
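A full / can also mean inode exhaustion rather than byte usage, so alongside df -h it can be worth a quick inode check (a generic diagnostic, not part of the original report):

$ ssh deployment-kafka01.eqiad.wmflabs df -i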

Event Timeline

greg added a subscriber: greg.

kafka -> Analytics

Analytics: please help diagnose/fix the beta cluster kafka host.

Nuria added subscribers: Ottomata, Nuria.

Any of us during ops week should be able to do it. Ping @Ottomata: what is the best way to free space?

Look for what is causing the used space. In this case /var/log/daemon.log* was really huge. Not really sure why; I see lots of errors about Puppet failing, but those are due to the lack of space.
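A straightforward way to find that kind of culprit is to walk the filesystem with du, largest entries first (a generic sketch; the paths are illustrative):

$ sudo du -xh --max-depth=1 / 2>/dev/null | sort -rh | head    # -x stays on the root filesystem
$ sudo du -sh /var/log/* | sort -rh | head                     # then drill into the biggest directory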

I removed /var/log/daemon.log.1 to recover 3.5G.
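Worth noting: rm only frees the space once no process still holds the file open. A rotated file like daemon.log.1 is safe to delete, but for the live daemon.log truncating in place is the safer move (standard coreutils, nothing specific to this host):

$ sudo truncate -s 0 /var/log/daemon.log    # empty the active log without invalidating rsyslog's open file descriptor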

Probably the right thing to do would be to audit the Kafka clusters and nodes in beta, and use larger instances. We should replace some of them as part of T152015.

The cause is likely logging of MediaWiki revision-create events last week.
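If that kind of burst is expected again, tighter size-based rotation on the syslog files would cap the damage. A hedged logrotate sketch; the size/rotate values are illustrative and the postrotate hook varies by distro:

/var/log/daemon.log {
    rotate 2
    size 500M
    compress
    delaycompress
    missingok
    notifempty
    postrotate
        invoke-rc.d rsyslog rotate > /dev/null
    endscript
}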