[toolsdb] MariaDB sometimes takes very long to shut down
Open, Low, Public

Description

Running systemctl stop mariadb should complete in a few minutes at most, but the last three times it was run on the primary, it took:

  • 4 minutes (on 2025-02-06)
  • 8 minutes (on 2025-02-24)
  • 50 minutes (on 2025-04-28, see T392596 for more details)

Event Timeline

Restricted Application added a subscriber: Aklapper.

In the last occurrence, nothing was logged for 50 minutes between these two lines:

Apr 28 13:07:32 tools-db-4 mysqld[992820]: 2025-04-28 13:07:32 0 [Note] InnoDB: FTS optimize thread exiting.
Apr 28 13:57:33 tools-db-4 mysqld[992820]: 2025-04-28 13:57:33 0 [Note] InnoDB: Starting shutdown...
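
To look for similar silent periods in earlier shutdowns, the log could be scanned for large jumps between consecutive timestamps. This is only a minimal sketch: the `find_gaps` helper and the one-minute threshold are hypothetical, and it assumes every line of interest carries the `YYYY-MM-DD HH:MM:SS` timestamp that mysqld writes in the two lines above.

```python
import re
from datetime import datetime, timedelta

# Matches the timestamp that mysqld itself writes, e.g. "2025-04-28 13:07:32".
# The syslog prefix ("Apr 28 13:07:32") has no year, so it is not matched.
TIMESTAMP_RE = re.compile(r"\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}")


def find_gaps(lines, threshold=timedelta(minutes=1)):
    """Return (gap, previous_line, next_line) for each silence longer than threshold."""
    gaps = []
    previous = None  # (timestamp, line) of the last line that had a timestamp
    for line in lines:
        match = TIMESTAMP_RE.search(line)
        if match is None:
            continue  # skip lines without a mysqld timestamp
        stamp = datetime.strptime(match.group(0), "%Y-%m-%d %H:%M:%S")
        if previous is not None and stamp - previous[0] > threshold:
            gaps.append((stamp - previous[0], previous[1], line))
        previous = (stamp, line)
    return gaps


# The two lines quoted in this task.
LOG_LINES = [
    "Apr 28 13:07:32 tools-db-4 mysqld[992820]: 2025-04-28 13:07:32 0 [Note] InnoDB: FTS optimize thread exiting.",
    "Apr 28 13:57:33 tools-db-4 mysqld[992820]: 2025-04-28 13:57:33 0 [Note] InnoDB: Starting shutdown...",
]

for gap, before, after in find_gaps(LOG_LINES):
    print(f"{gap} of silence after: {before}")
```

For the two lines above, this reports a single gap of 50 minutes and 1 second.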

Increasing the logging verbosity might give more information, but the problem is tricky to reproduce: it seems to happen only on the primary host, so we cannot test it on a replica.

Perhaps we could:

  • fail over from the primary to the replica
  • increase the logging level and shut down the old primary
  • see if we get something useful from the logs
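
For the second step above, it may help to record exactly how long each shutdown takes, so that repeated attempts can be compared. The following is a rough sketch only: `timed_run` is a hypothetical helper, and the demo call uses a harmless stand-in command rather than the real shutdown.

```python
import subprocess
import sys
import time


def timed_run(cmd):
    """Run cmd (a list of arguments) and return (return_code, elapsed_seconds)."""
    start = time.monotonic()  # monotonic clock is unaffected by wall-clock adjustments
    result = subprocess.run(cmd, check=False)
    return result.returncode, time.monotonic() - start


# Harmless stand-in so the sketch can be tried anywhere.
# On the old primary, the command would be ["systemctl", "stop", "mariadb"].
returncode, elapsed = timed_run([sys.executable, "-c", "pass"])
print(f"exit code {returncode}, took {elapsed:.1f} s")
```

Comparing the measured duration with the timestamps in the log should show whether a slow shutdown again coincides with a period where nothing is logged.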

I'm worried that we have no guarantee that the problem will reoccur, so we might have to do this multiple times before getting anything useful out of it.

I'll mark this as "Low" for the moment, since failing over to the replica is an acceptable workaround when we have to shut down the primary.