Running systemctl stop mariadb should complete in a few minutes at most, but the last 3 times it was run on the primary it took:
- 4 minutes (on 2025-02-06)
- 8 minutes (on 2025-02-24)
- 50 minutes (on 2025-04-28, see T392596 for more details)
In the last occurrence, nothing was logged for 50 minutes between these two lines:
Apr 28 13:07:32 tools-db-4 mysqld[992820]: 2025-04-28 13:07:32 0 [Note] InnoDB: FTS optimize thread exiting.
Apr 28 13:57:33 tools-db-4 mysqld[992820]: 2025-04-28 13:57:33 0 [Note] InnoDB: Starting shutdown...
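For reference, the length of the silent gap can be computed directly from the two timestamps. This is only a minimal sketch: the helper name silent_gap is made up for illustration, and it assumes the timestamp format that mysqld prints in the log lines above.

```python
from datetime import datetime, timedelta

# Timestamp format printed by mysqld inside each journal line,
# e.g. "2025-04-28 13:07:32" (assumed from the log excerpt above).
MARIADB_TS_FORMAT = "%Y-%m-%d %H:%M:%S"


def silent_gap(first_ts: str, second_ts: str) -> timedelta:
    """Return the time elapsed between two mysqld log timestamps.

    Hypothetical helper, not part of any existing tooling.
    """
    start = datetime.strptime(first_ts, MARIADB_TS_FORMAT)
    end = datetime.strptime(second_ts, MARIADB_TS_FORMAT)
    return end - start


# Timestamps copied from the two log lines above.
gap = silent_gap("2025-04-28 13:07:32", "2025-04-28 13:57:33")
print(gap)  # 0:50:01
```

So almost all of the 50 minutes was spent between "FTS optimize thread exiting" and "Starting shutdown...", with nothing logged in between.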
Maybe increasing the logging verbosity would give more information, but it's tricky to reproduce because it seems to happen only on the primary host, so we cannot test it on a replica.
Perhaps we could:
I'm worried that we have no guarantee that the problem will reoccur, so we might have to do this multiple times before getting anything useful out of it.
I'll mark this as "Low" for the moment, since failing over to the replica is an acceptable workaround when we have to shut down the primary.