Beginning at 1100 UTC on 2025-06-23, the RDF streaming updater for WDQS and WCQS failed. Lag continued to grow until we intervened around 1745 UTC on 2025-06-24.
Creating this ticket to:
- Identify root cause (it was the k8s upgrade in T397148)
- Create runbook for this failure scenario
- Identify and implement any changes that would make this less likely to happen and/or easier to fix in the future (such as updating the Flink App dashboard)