Intermittent replica lag alarms are showing up, this time on db1106:
https://phab.wmfusercontent.org/file/data/blhuxhn5dovevdtctxew/PHID-FILE-35lgnomig7rvzfmdw2qj/Screenshot_from_2021-02-11_12-37-48.png
11:45 AM <jynus> I've seen in the past show slave status returning something like max_int before
11:45 AM <jynus> so we could patch to have a ceiling (e.g. 50 years of lag) and in the future move to pt-heartbeat
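
A minimal sketch of the ceiling idea, assuming the lag value is in seconds and that a reading above the ceiling should be treated as bogus rather than reported. The names `sanitize_lag` and `LAG_CEILING_SECONDS` are hypothetical and not taken from any existing exporter or check; the other reading of "ceiling" would be to cap the value with `min(raw_lag, ceiling)` instead of discarding it.

```python
from typing import Optional

SECONDS_PER_YEAR = 365 * 24 * 3600
# Hypothetical ceiling: 50 years of lag, as suggested in the chat above.
LAG_CEILING_SECONDS = 50 * SECONDS_PER_YEAR


def sanitize_lag(raw_lag: Optional[float],
                 ceiling: float = LAG_CEILING_SECONDS) -> Optional[float]:
    """Return the lag in seconds, or None if the reading is implausible.

    A missing value (None) is passed through unchanged. Negative values and
    values above the ceiling (for example something close to max_int) are
    assumed to be bogus and are dropped.
    """
    if raw_lag is None:
        return None
    if raw_lag < 0 or raw_lag > ceiling:
        return None
    return raw_lag
```

Note that a 50-year ceiling would only discard max_int-like readings; values of the size seen in the alerts listed here would still pass through.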
The last 2 occurrences that were noticed (these are soft alerts, so many others may have been missed):
- 2021-02-10 18:40 db2121 MariaDB sustained replica lag CRITICAL 6.266e+05 ge 2
- 2021-02-11 11:36 db1106 MariaDB sustained replica lag CRITICAL 4.169e+05 ge 2