
Data integrity on Analytics Kafka nodes
Closed, Resolved, Public

Description

Context: https://wikitech.wikimedia.org/wiki/Analytics/Cluster/Kafka/Administration#Data_integrity

A potential fix is to use RAID for the brokers' partition logs as well. We should investigate whether that is feasible and, if so, configure it during the next cluster installation (for example, the Kafka 0.9 migration).

Short-term fixes might include SMART Icinga alerts reported to our IRC channel.
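
As a rough illustration of what such a SMART check could look like, here is a minimal sketch of the classification step of a Nagios/Icinga-style plugin. It assumes the check reads the text printed by `smartctl -H <device>` and maps it to the standard plugin exit codes; the function name `classify_smart_health` and the message wording are hypothetical, and this is not the check actually deployed on the cluster.

```python
import re

# Standard Nagios/Icinga plugin exit codes.
OK, WARNING, CRITICAL, UNKNOWN = 0, 1, 2, 3

# Health lines printed by `smartctl -H` for ATA and SCSI devices.
_ATA = re.compile(r"SMART overall-health self-assessment test result:\s*(\S+)")
_SCSI = re.compile(r"SMART Health Status:\s*(\S+)")


def classify_smart_health(smartctl_output):
    """Map `smartctl -H` output text to an (exit_code, message) pair.

    Hypothetical helper: the name and messages are illustrative only.
    """
    match = _ATA.search(smartctl_output) or _SCSI.search(smartctl_output)
    if match is None:
        # No recognizable health line: report UNKNOWN rather than guess.
        return UNKNOWN, "SMART UNKNOWN - no health line found"
    verdict = match.group(1)
    if verdict in ("PASSED", "OK"):
        return OK, "SMART OK - drive reports " + verdict
    # Anything else (for example FAILED!) is treated as critical.
    return CRITICAL, "SMART CRITICAL - drive reports " + verdict
```

A wrapper script would run `smartctl -H` on each broker disk, print the message, and exit with the returned code. Icinga would then notify on non-OK states; routing those notifications to IRC depends on the local notification setup, which is not described here.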

Event Timeline

elukey claimed this task.
elukey raised the priority of this task to Needs Triage.
elukey updated the task description.
elukey added a project: Analytics.
elukey subscribed.
Milimetric subscribed.

Kind of the same as T99105; if we move to RAID, that'll probably solve this.