fran1001 has started alerting on total process count during evening (PDT) runs. The count starts growing around 0200 UTC, climbs from its steady state of ~430 to nearly 2000, and returns to normal around 1000 UTC. This has happened on each of the last 3 evenings. The major recent change was the upgrade of frdb1003 to buster and MariaDB 10.4 (T255066) on 20200622.
What we are seeing is growth in total procs on fran1001 and an increase in connected db threads on frdb1003. The current hypothesis is that a query or set of queries is backing up database responses, causing the other connections to stack up.
Todo:
- investigate the db performance, connection status, and queries running during the trouble period
- check the analytics processes and scripts to see how they are interacting with the db during that time period
- adjust analytics scripts as needed to eliminate concurrently running processes where possible. Opened T256924 to handle updating the scripts as needed.
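
For the first todo item, one possible way to look at connection status during the trouble period is to snapshot the process list on frdb1003 every few minutes and summarize it. The sketch below is only an illustration, not an existing tool: it assumes the rows have already been fetched by some client as dicts using the column names of `information_schema.PROCESSLIST` (`USER`, `COMMAND`, `TIME`, `STATE`, `INFO`). The function name `summarize_processlist` and the 60 second threshold are hypothetical choices.

```python
from collections import Counter


def summarize_processlist(rows, slow_seconds=60):
    """Summarize one snapshot of the database process list.

    `rows` is assumed to be an iterable of dicts keyed by the
    information_schema.PROCESSLIST column names. `slow_seconds` is an
    arbitrary illustrative threshold, not a tuned value.
    """
    rows = list(rows)

    # Connections that are doing something other than idling.
    active = [r for r in rows if r["COMMAND"] != "Sleep"]

    # Connection count per db user, to see which client is stacking up.
    per_user = Counter(r["USER"] for r in rows)

    # Active statements running at least `slow_seconds`, longest first.
    # These are the candidates for the query that is backing things up.
    slow = sorted(
        (r for r in active if (r["TIME"] or 0) >= slow_seconds),
        key=lambda r: r["TIME"],
        reverse=True,
    )

    return {
        "total": len(rows),
        "active": len(active),
        "per_user": dict(per_user),
        "slow": slow,
    }
```

Comparing snapshots taken before 0200 UTC with ones taken during the growth period should show whether a small number of long-running statements sit at the top of `slow` while `per_user` counts climb, which is what the hypothesis above would predict.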