0 data points for wmf_netflow in this interval: https://w.wiki/SVo
I checked a few nfacctd exporters and they were all sending data to Kafka in that interval.
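For reference, one way to double-check the consumer side of that claim is to seek a Kafka consumer to the start of the suspected gap and see whether any messages come back. A minimal Scala sketch follows; the broker address, topic name, and group id here are assumptions for illustration, not necessarily the real ones:

import java.time.{Duration, Instant}
import java.util.Properties
import scala.collection.JavaConverters._
import org.apache.kafka.clients.consumer.KafkaConsumer
import org.apache.kafka.common.TopicPartition

object NetflowKafkaSpotCheck {
  def main(args: Array[String]): Unit = {
    val props = new Properties()
    // Assumed broker/topic/group id; substitute the real values.
    props.put("bootstrap.servers", "kafka-jumbo1001.eqiad.wmnet:9092")
    props.put("key.deserializer", "org.apache.kafka.common.serialization.StringDeserializer")
    props.put("value.deserializer", "org.apache.kafka.common.serialization.StringDeserializer")
    props.put("group.id", "netflow-spot-check")

    val consumer = new KafkaConsumer[String, String](props)
    val topic = "netflow"
    val partitions = consumer.partitionsFor(topic).asScala
      .map(pi => new TopicPartition(pi.topic, pi.partition))

    // Find the earliest offset at or after the start of the suspected gap.
    val gapStart = java.lang.Long.valueOf(Instant.parse("2020-05-31T18:30:00Z").toEpochMilli)
    val offsets = consumer.offsetsForTimes(partitions.map(_ -> gapStart).toMap.asJava)

    consumer.assign(partitions.asJava)
    offsets.asScala.foreach {
      case (tp, ot) if ot != null => consumer.seek(tp, ot.offset)
      case _ => // no messages at/after that timestamp on this partition
    }

    // Records with timestamps inside 18:30-19:00 mean messages were
    // flowing on the topic during the gap.
    consumer.poll(Duration.ofSeconds(10)).asScala.take(5).foreach { r =>
      println(s"p${r.partition}@${r.offset} ts=${Instant.ofEpochMilli(r.timestamp)}")
    }
    consumer.close()
  }
}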
The missing data seems to span May 31st 18:30 to 19:00. I did a quick check via Spark, and the data seems to be present on HDFS:
scala> spark.sql("select stamp_inserted from wmf.netflow where year=2020 and month=05 and day=31 and hour=18 and stamp_inserted like '2020-05-31 18:4%' limit 20").show(20); +-------------------+ | stamp_inserted| +-------------------+ |2020-05-31 18:43:00| |2020-05-31 18:40:00| |2020-05-31 18:46:00| |2020-05-31 18:43:00| |2020-05-31 18:48:00| |2020-05-31 18:40:00| |2020-05-31 18:43:00| |2020-05-31 18:43:00| |2020-05-31 18:41:00| |2020-05-31 18:42:00| |2020-05-31 18:49:00| |2020-05-31 18:40:00| |2020-05-31 18:43:00| |2020-05-31 18:49:00| |2020-05-31 18:41:00| |2020-05-31 18:43:00| |2020-05-31 18:44:00| |2020-05-31 18:48:00| |2020-05-31 18:47:00| |2020-05-31 18:44:00| +-------------------+
scala> spark.sql("select count(*) from wmf.netflow where year=2020 and month=05 and day=31 and hour=18").show(); 20/06/04 07:09:37 WARN Utils: Truncated the string representation of a plan since it was too large. This behavior can be adjusted by setting 'spark.debug.maxToStringFields' in SparkEnv.conf. +--------+ |count(1)| +--------+ |12405683| +--------+ scala> spark.sql("select count(*) from wmf.netflow where year=2020 and month=05 and day=31 and hour=19").show(); +--------+ |count(1)| +--------+ |12335817| +--------+ scala> spark.sql("select count(*) from wmf.netflow where year=2020 and month=05 and day=31 and hour=17").show(); +--------+ |count(1)| +--------+ |12569783| +--------+