Disk usage across brokers looks imbalanced. Here's analytics1022:
Filesystem  Size  Used  Avail  Use%  Mounted on
/dev/sdi1   1.8T  770G  1.1T   42%   /var/spool/kafka/i
/dev/sdd1   1.8T  1.2T  617G   67%   /var/spool/kafka/d
/dev/sde1   1.8T  768G  1.1T   42%   /var/spool/kafka/e
/dev/sdl1   1.8T  769G  1.1T   42%   /var/spool/kafka/l
/dev/sdc1   1.8T  769G  1.1T   42%   /var/spool/kafka/c
/dev/sdh1   1.8T  1.3T  566G   70%   /var/spool/kafka/h
/dev/sdg1   1.8T  604G  1.3T   33%   /var/spool/kafka/g
/dev/sdk1   1.8T  614G  1.2T   34%   /var/spool/kafka/k
/dev/sdf1   1.8T  1.3T  578G   69%   /var/spool/kafka/f
/dev/sdj1   1.8T  600G  1.3T   33%   /var/spool/kafka/j
/dev/sdb3   1.8T  1.7T  102G   95%   /var/spool/kafka/b
/dev/sda3   1.8T  1.2T  587G   68%   /var/spool/kafka/a
Why? Webrequest data should be spread across random partitions. A week ago I started sending EventLogging data to Kafka, keyed by schema name, using the Python Kafka producer. Perhaps it isn't random; perhaps it is hashing keys to partitions, and since some schemas produce far more events than others, the partitions (and thus the disks) holding the high-volume schemas would fill up much faster. Need to look into this.
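If that's what's happening, it would look like the sketch below. This is a minimal illustration assuming the kafka-python client; the broker address, topic name, and schema key are hypothetical. With a key set, the default partitioner hashes the key to choose a partition, so every message with the same key lands on the same partition; with no key, messages are distributed across partitions.

# Minimal sketch, assuming the kafka-python client.
# Broker address, topic, and key below are illustrative, not the real config.
from kafka import KafkaProducer

producer = KafkaProducer(bootstrap_servers='analytics1022:9092')

# Keyed send: the key is hashed to pick the partition, so ALL events for
# this schema go to one partition. A high-volume schema = one hot disk.
producer.send('eventlogging', key=b'NavigationTiming', value=b'{...}')

# Unkeyed send: the partition is chosen across available partitions instead,
# so volume spreads out evenly.
producer.send('eventlogging', value=b'{...}')

producer.flush()

If the EventLogging producer is doing the keyed variant, the skewed df output above is exactly what we'd expect.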