Hello!
We received a SMART notification about kafka1012.eqiad.wmfnet, and dmesg reports the following:
[19958.049571] sd 0:0:5:0: [sdf] FAILED Result: hostbyte=DID_OK driverbyte=DRIVER_SENSE
[19958.049583] sd 0:0:5:0: [sdf] Sense Key : Aborted Command [current]
[19958.049591] sd 0:0:5:0: [sdf] Add. Sense: Ack/nak timeout
[19958.049593] sd 0:0:5:0: [sdf] CDB:
[19958.049596] Read(10): 28 00 96 80 0a 18 00 00 08 00
[19958.049604] blk_update_request: I/O error, dev sdf, sector 2524973592
[38257.716894] sd 0:0:5:0: [sdf] FAILED Result: hostbyte=DID_OK driverbyte=DRIVER_SENSE
[38257.716899] sd 0:0:5:0: [sdf] Sense Key : Aborted Command [current]
[38257.716903] sd 0:0:5:0: [sdf] Add. Sense: Ack/nak timeout
[38257.716905] sd 0:0:5:0: [sdf] CDB:
[38257.716907] Read(10): 28 00 54 00 0b d0 00 00 38 00
[38257.716914] blk_update_request: I/O error, dev sdf, sector 1409289168
elukey@kafka1012:~$ cat /proc/mounts | grep sdf
/dev/sdf1 /var/spool/kafka/f ext4 rw,noatime,data=writeback 0 0
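As a sanity check on the two failed reads in the dmesg output above, here is a minimal sketch that decodes the Read(10) CDB bytes, assuming the standard SCSI READ(10) layout (opcode 0x28, 4-byte big-endian LBA in bytes 2-5, 2-byte transfer length in bytes 7-8). The helper name `decode_read10` is hypothetical and this was not run on the host; it only shows that the LBA in each CDB matches the sector reported by the following blk_update_request line.

```python
def decode_read10(cdb_hex: str) -> dict:
    """Decode a SCSI READ(10) CDB given as space-separated hex bytes."""
    cdb = bytes.fromhex(cdb_hex)
    # READ(10) is exactly 10 bytes long and starts with opcode 0x28.
    if len(cdb) != 10 or cdb[0] != 0x28:
        raise ValueError("not a 10-byte READ(10) CDB")
    return {
        # Bytes 2..5: logical block address, big-endian.
        "lba": int.from_bytes(cdb[2:6], "big"),
        # Bytes 7..8: number of blocks requested, big-endian.
        "blocks": int.from_bytes(cdb[7:9], "big"),
    }


# CDBs copied from the two dmesg errors on sdf.
first = decode_read10("28 00 96 80 0a 18 00 00 08 00")
second = decode_read10("28 00 54 00 0b d0 00 00 38 00")

print(first)
print(second)
```

Both decoded LBAs (2524973592 and 1409289168) line up with the sectors in the I/O error lines, so the two failures hit different, widely separated areas of the disk.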
We use the sdf1 partition for the Kafka broker's log, and we should pay attention since this host has already caused https://phabricator.wikimedia.org/T125084 :D
Thanks!
Luca