Today after rebooting kafka1012 I had to run fsck on /dev/sdf1 to correct some filesystem issues. I then ran smartctl -a on every spool disk and checked the grown defect list; /dev/sdf1 is the only disk reporting defects:
elukey@kafka1012:~$ for el in `df -h | grep spool | cut -d " " -f 1`; do echo $el; sudo smartctl -a $el | grep defect; done
/dev/sdg1
Elements in grown defect list: 0
/dev/sdd1
Elements in grown defect list: 0
/dev/sdl1
Elements in grown defect list: 0
/dev/sde1
Elements in grown defect list: 0
/dev/sdk1
Elements in grown defect list: 0
/dev/sdh1
Elements in grown defect list: 0
/dev/sdc1
Elements in grown defect list: 0
/dev/sdj1
Elements in grown defect list: 0
/dev/sdi1
Elements in grown defect list: 0
/dev/sdf1
Elements in grown defect list: 1425
/dev/sda3
Elements in grown defect list: 0
/dev/sdb3
Elements in grown defect list: 0
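For repeat checks, the same output could be filtered so that only disks with a non-zero grown defect list are printed. This is a minimal sketch, not something that was run on the host: the helper name flag_grown_defects is hypothetical, and it assumes the input alternates between a device line and an "Elements in grown defect list: N" line, exactly as in the output above.

```shell
# Hypothetical helper (not part of smartctl or any existing tooling).
# Reads alternating "/dev/..." and "Elements in grown defect list: N"
# lines on stdin and prints "<device> <count>" for counts above zero.
flag_grown_defects() {
  awk '
    /^\/dev\//                       { dev = $1; next }   # remember current device
    /Elements in grown defect list:/ { if ($NF + 0 > 0) print dev, $NF }
  '
}

# Intended usage on the host, reusing the device selection from the loop above:
# for el in $(df -h | grep spool | cut -d " " -f 1); do
#   echo "$el"; sudo smartctl -a "$el" | grep defect
# done | flag_grown_defects
```

Fed the output above, this would print a single line for /dev/sdf1 with its count of 1425.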
The disks seem to be configured as JBOD, since the controller reports no virtual drives:
elukey@kafka1012:~$ sudo megacli -AdpAllInfo -aALL
                Device Present
                ================
Virtual Drives    : 0
  Degraded        : 0
  Offline         : 0
Physical Devices  : 14
  Disks           : 12
  Critical Disks  : 0
  Failed Disks    : 0
The disk is back in service for now, but I think it would be wise to replace it before it fails completely. We need to schedule downtime for this server, since it is part of the Analytics Kafka cluster.