As a product manager, I want to know the baseline and current lag time for WDQS/WCQS, so that I can report on how well the products are performing against our annual OKR of getting update lag under 10 minutes.
This Grafana view lets us see how the different WDQS servers perform with respect to lag: https://grafana.wikimedia.org/d/000000489/wikidata-query-service?viewPanel=8&orgId=1&refresh=1m&from=now-90d&to=now. However, it does not tell us what the average/effective lag is, because it reports each server independently. It also doesn't account for depooled servers that are allowed to catch up, or for data reloads, which may not affect users directly.
Until we have an effective way of capturing this accurately, the KR will be reported indirectly by counting the number of 10min+ lag events in a given time period. This will overestimate actual update lag, for the reasons above relating to depooled servers and data reloads.
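As a rough illustration of the indirect metric, here is a minimal Python sketch of counting 10min+ lag events for one server. It rests on assumptions not settled in this task: the input is a hypothetical time-sorted list of `(timestamp, lag)` samples, and an "event" is taken to be one contiguous run of samples at or above the threshold. The function name `count_lag_events` is made up for the example and is not part of any existing tooling.

```python
from datetime import datetime, timedelta

# Threshold from the OKR: update lag should stay under 10 minutes.
LAG_THRESHOLD = timedelta(minutes=10)


def count_lag_events(samples, threshold=LAG_THRESHOLD):
    """Count lag events in a time-sorted list of (timestamp, lag) samples.

    Assumption: one event is one contiguous run of samples whose lag is
    at or above the threshold. Depooled servers and data reloads are not
    filtered out, so the count overestimates user-visible lag.
    """
    events = 0
    in_event = False
    for _timestamp, lag in samples:
        over = lag >= threshold
        if over and not in_event:
            # A new run of high-lag samples starts here.
            events += 1
        in_event = over
    return events


# Hypothetical samples for a single server, one per minute.
start = datetime(2021, 7, 1)
lags_in_minutes = [1, 12, 15, 3, 11, 2]
samples = [
    (start + timedelta(minutes=i), timedelta(minutes=m))
    for i, m in enumerate(lags_in_minutes)
]
print(count_lag_events(samples))
```

Running this per server and summing the results would give the count for a reporting period. Whether a long run of high lag should count as one event or several is still an open question for the report.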
AC:
- There is a dashboard for WDQS/WCQS update lag time
- Establish an update lag baseline going into the 2021-2022 fiscal year (so we can track the impact of our work)
- Preferably, the report can be produced from historical data