Between 3:00 and 4:00 today, Wikidata had a maxlag of 26490422 seconds (306 days). This may break many bots that sleep according to the maxlag. I propose that in any case Wikidata should not report a maxlag of more than 5 minutes (300 seconds). For comparison, the usual maxlag limit used by bots is 5 seconds.
https://grafana.wikimedia.org/d/000000170/wikidata-edits?orgId=1&from=1589422190411&to=1589431678958
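For context, this is roughly how a maxlag-respecting bot behaves (a minimal Python sketch using requests; the endpoint handling, the SLEEP_CAP constant, and the retry loop are illustrative, not the code of any particular bot). With a reported lag of 306 days and no cap, the sleep at the bottom of the loop is what stalls edits for months:

```python
import time
import requests

API = "https://www.wikidata.org/w/api.php"
MAXLAG = 5       # the usual bot setting: back off when replication lag > 5 s
SLEEP_CAP = 300  # proposed upper bound: never sleep longer than 5 minutes

session = requests.Session()

def api_get(params):
    """Call the API with maxlag set; on a maxlag error, sleep and retry."""
    params = dict(params, format="json", maxlag=MAXLAG)
    while True:
        resp = session.get(API, params=params)
        data = resp.json()
        if data.get("error", {}).get("code") != "maxlag":
            return data
        # On maxlag errors the API sets a Retry-After header with the lag.
        lag = float(resp.headers.get("Retry-After", 5))
        # Without the min() cap, a bogus lag of 26490422 s means a ~306-day sleep.
        time.sleep(min(lag, SLEEP_CAP))

# Example: a harmless read, backing off politely if the servers report lag.
print(api_get({"action": "query", "meta": "siteinfo"}))
```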
Event Timeline
Maxlag has now climbed to 300+ days again: https://grafana.wikimedia.org/d/000000170/wikidata-edits?orgId=1&from=1589485983686&to=1589486256637
This breaks Widar too, as Widar sleeps 3*maxlag+1 seconds before edits, so right now Widar sleeps for 919 days.
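Spelling out the arithmetic behind that figure (using the back-off formula above):

```python
maxlag = 26490422        # seconds, the bogus reported value (~306 days)
sleep = 3 * maxlag + 1   # Widar-style back-off: 3*maxlag + 1 seconds
print(sleep / 86400)     # ≈ 919.8 days
```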
Looking at WDQS lag and [[ URL | Wikidata lag ]] in parallel, I don't see a clear correlation between the two. There have been a few data reloads last week (you can see the spikes in lag on the WDQS side), but the servers should be depooled during that operation, and the spike in lag only lasts about 2 hours.
It looks like the Wikidata lag is always at 43.80 weeks. This looks like a NaN converted to something wrong, or an overflow of some kind.
@Addshore: you might have a better idea of what could be happening on the Wikidata side.
Side note: exposing both the MySQL replication lag and the WDQS replication lag through the same value seems like a bit of an abuse of the system. Would it be better to expose both separately and let the clients have a more specific interpretation of those numbers?
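For illustration, if the two lags were exposed as separate fields, a client could apply different limits to each. This is purely hypothetical; the field names and thresholds below are invented, and the current API only exposes the single combined maxlag value:

```python
# Hypothetical split response -- field names invented for illustration only.
lag_info = {"mysql_replication_lag": 1.2, "wdqs_update_lag": 620.0}

MYSQL_LIMIT = 5    # seconds; edits should always respect DB replication lag
WDQS_LIMIT = 600   # seconds; only matters for bots that read from the query service

def should_back_off(lag, reads_from_wdqs=False):
    if lag["mysql_replication_lag"] > MYSQL_LIMIT:
        return True
    return reads_from_wdqs and lag["wdqs_update_lag"] > WDQS_LIMIT

print(should_back_off(lag_info))                        # False: DB is healthy
print(should_back_off(lag_info, reads_from_wdqs=True))  # True: WDQS is behind
```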
The reported number of seconds was the current Unix time divided by 60, which means an unavailable server is treated as having a last-update timestamp of zero (the Unix epoch).
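A sketch of the suspected failure mode (this is not the actual Wikibase code; the max-over-servers aggregation and the factor of 60 are assumptions inferred from the reported number):

```python
import time

LAG_FACTOR = 60  # assumed scaling between WDQS update lag and the reported maxlag

def reported_maxlag(last_update_timestamps):
    """Lag of the most-behind server, scaled down by LAG_FACTOR."""
    lags = [time.time() - ts for ts in last_update_timestamps]
    return max(lags) / LAG_FACTOR

now = time.time()
print(reported_maxlag([now - 120, now - 30]))  # healthy servers: ~2 s
print(reported_maxlag([now - 120, 0]))         # "unavailable" server with a
                                               # last-update of 0 (the epoch):
                                               # ~ now/60 ≈ 26.5 million s ≈ 306 days
```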