When all varnishkafka instances are restarted during the same calendar hour, the webrequest_sequence_stats_hourly table is empty for that hour, because we filter out rows whose sequence_min equals 0.
This makes the dataloss checks fail (both ERROR and WARNING) because no file is generated by Hive when the query has no input data.
I suggest mitigating this by checking that the file exists before checking its size here and here.
When we move the job to Airflow and Spark, we want to avoid this corner case.
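
As a rough illustration of the existence-before-size check suggested above, here is a minimal Python sketch. It assumes the check output is a file on a local filesystem; the function name `check_output_state` and the three returned labels are hypothetical and not part of the existing job, and the real checks may live on HDFS and use different tooling.

```python
import os


def check_output_state(path):
    """Classify the (hypothetical) check output file at `path`.

    Returns "missing" if no file was generated, "empty" if the file
    exists but has size 0, and "non_empty" otherwise.
    """
    # Test existence first, so that a query with no input data
    # (and therefore no output file) does not break the size check.
    if not os.path.exists(path):
        return "missing"
    if os.path.getsize(path) == 0:
        return "empty"
    return "non_empty"
```

The caller can then decide how to treat the "missing" case separately from the size-based result, instead of failing on the size check.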