The update lag reported in https://grafana-rw.wikimedia.org/d/8xDerelVz/search-update-lag-slo is inaccurate when running a backfill using Spark with the EventStream Kafka writer.
Even if the event time is set via current_timestamp() in the DataFrame, that timestamp might be assigned long before the event is actually produced, which makes the lag calculation meaningless.
The pipeline should ideally keep track of whether a tag comes from a real-time update or from a batch import. Perhaps the rev_based flag could be propagated somehow so that the update lag reporter can simply ignore batch imports?
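To illustrate the idea, here is a minimal sketch of a lag calculation that skips events marked as batch imports. It is not the actual reporter code: `UpdateEvent`, `processed_time_ms`, `is_backfill` and `update_lags_ms` are hypothetical names, and how such a marker would be derived from rev_based is left open.

```python
from dataclasses import dataclass
from typing import Iterable, List


@dataclass
class UpdateEvent:
    event_time_ms: int          # event time carried by the update
    processed_time_ms: int      # hypothetical: when the pipeline applied the update
    is_backfill: bool = False   # hypothetical marker for batch imports


def update_lags_ms(events: Iterable[UpdateEvent]) -> List[int]:
    """Return one lag sample per event, ignoring events marked as backfill."""
    return [
        e.processed_time_ms - e.event_time_ms
        for e in events
        if not e.is_backfill
    ]


events = [
    UpdateEvent(event_time_ms=1_000, processed_time_ms=1_500),
    # Backfill event whose event time was assigned far too early.
    UpdateEvent(event_time_ms=0, processed_time_ms=3_600_000, is_backfill=True),
]
print(update_lags_ms(events))  # [500]
```

With a marker like this, the backfill event above would not contribute its misleading one-hour lag to the dashboard.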
AC:
- The Search Update lag dashboard no longer reports inaccurate lag information when a weighted tag is backfilled.