As a follow-up task to
https://phabricator.wikimedia.org/T297231:
Implement logging of selected Spark metrics and visualize them on a dashboard.
I'd love it if we could go this way: https://richardstartin.github.io/posts/publishing-dropwizard-metrics-to-kafka
After all the discussion on this subject, I also think that publishing Spark metrics to Kafka (and then exporting them to HDFS) is the most obvious first step.
An example of a KafkaSink is here: https://github.com/erikerlandson/spark-kafka-sink/blob/master/src/main/scala/org/apache/spark/metrics/sink/KafkaSink.scala
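
For illustration, wiring such a sink into Spark would probably go through metrics.properties, roughly as sketched below. The `[instance].sink.[name].[option]` layout and the `class`, `period` and `unit` options follow Spark's standard metrics configuration; the `broker` and `topic` option names, the broker address and the topic name are assumptions on my part and would need to be checked against whichever sink implementation we actually use.

```
# Sketch only: register a Kafka sink for all Spark instances (driver, executors, ...).
# Class name taken from the path of the example linked above.
*.sink.kafka.class=org.apache.spark.metrics.sink.KafkaSink

# Hypothetical option names and placeholder values; verify against the sink's code.
*.sink.kafka.broker=kafka-broker.example.org:9092
*.sink.kafka.topic=spark_metrics

# Reporting interval, using the period/unit convention of Spark's built-in sinks.
*.sink.kafka.period=10
*.sink.kafka.unit=seconds
```

The file could then be passed to a job with `--conf spark.metrics.conf=/path/to/metrics.properties`, with the sink jar available on the driver and executor classpath.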