Replication stopped on Saturday 29 Oct:
Oct 29 16:20:18 clouddb1002 mysqld[27318]: 2022-10-29 16:20:18 139958015805184 [ERROR] Slave SQL: Could not execute Update_rows_v1 event on table s54518__mw.online; Can't find record in 'online', Error_code: 1032; handler error HA_ERR_KEY_NOT_FOUND; the event's master log log.301158, end_log_pos 20678372, Gtid 2886731673-2886731673-2226312553, Internal MariaDB error code: 1032 Oct 29 16:20:18 clouddb1002 mysqld[27318]: 2022-10-29 16:20:18 139958015805184 [Warning] Slave: Can't find record in 'online' Error_code: 1032 Oct 29 16:20:18 clouddb1002 mysqld[27318]: 2022-10-29 16:20:18 139958015805184 [ERROR] Error running query, slave SQL thread aborted. Fix the problem, and restart the slave SQL thread with "SLAVE START". We stopped at log 'log.301158' position 20678171; GTID position '0-2886731673-33522724637,2886731673-2886731673-2226312552' Oct 29 16:20:18 clouddb1002 mysqld[27318]: 2022-10-29 16:20:18 139958015805184 [Note] Slave SQL thread exiting, replication stopped in log 'log.301158' at position 20678171; GTID position '0-2886731673-33522724637,2886731673-2886731673-2226312552'
A similar error happened two years ago, see T266587.
I did manually restart mariadb on that host a couple days before, but it seemed to resume correctly:
Oct 27 13:41:41 clouddb1002 systemd[1]: Stopping mariadb database server... Oct 27 13:41:41 clouddb1002 mysqld[23602]: 2022-10-27 13:41:41 139804210677504 [Note] /opt/wmf-mariadb101/bin/mysqld: Normal shutdown Oct 27 13:41:41 clouddb1002 mysqld[23602]: 2022-10-27 13:41:41 139804210677504 [Note] Event Scheduler: Killing the scheduler thread, thread id 2 Oct 27 13:41:41 clouddb1002 mysqld[23602]: 2022-10-27 13:41:41 139804210677504 [Note] Event Scheduler: Waiting for the scheduler thread to reply Oct 27 13:41:41 clouddb1002 mysqld[23602]: 2022-10-27 13:41:41 139804210677504 [Note] Event Scheduler: Stopped Oct 27 13:41:41 clouddb1002 mysqld[23602]: 2022-10-27 13:41:41 139804210677504 [Note] Event Scheduler: Purging the queue. 0 events Oct 27 13:41:41 clouddb1002 mysqld[23602]: 2022-10-27 13:41:41 139804415751936 [Note] Slave SQL thread exiting, replication stopped in log 'log.300972' at position 42188486; GTID position '0-2886731673-33522724637,2886731673-2886731673-2193663302' Oct 27 13:41:41 clouddb1002 mysqld[23602]: 2022-10-27 13:41:41 139804415956736 [Note] Slave I/O thread exiting, read up to log 'log.300972', position 42188486; GTID position 0-2886731673-33522724637,2886731673-2886731673-2193663302 Oct 27 13:41:42 clouddb1002 mysqld[23602]: 2022-10-27 13:41:42 139767312262912 [Note] InnoDB: FTS optimize thread exiting. Oct 27 13:41:42 clouddb1002 mysqld[23602]: 2022-10-27 13:41:42 139804210677504 [Note] InnoDB: Starting shutdown... Oct 27 13:41:42 clouddb1002 mysqld[23602]: 2022-10-27 13:41:42 139767287084800 [Warning] InnoDB: Dumping buffer pool(s) to .//ib_buffer_pool Oct 27 13:41:43 clouddb1002 mysqld[23602]: 2022-10-27 13:41:43 139767287084800 [Warning] InnoDB: Buffer pool(s) dump completed at 221027 13:41:43 Oct 27 13:41:43 clouddb1002 mysqld[23602]: 2022-10-27 13:41:43 139804210677504 [Note] InnoDB: Waiting for page_cleaner to finish flushing of buffer pool Oct 27 13:41:50 clouddb1002 mysqld[23602]: 2022-10-27 13:41:50 139804210677504 [Note] InnoDB: Shutdown completed; log sequence number 70416120485936 Oct 27 13:41:50 clouddb1002 mysqld[23602]: 2022-10-27 13:41:50 139804210677504 [Note] /opt/wmf-mariadb101/bin/mysqld: Shutdown complete Oct 27 13:41:50 clouddb1002 systemd[1]: Stopped mariadb database server. Oct 27 13:42:53 clouddb1002 systemd[1]: Starting mariadb database server... Oct 27 13:42:53 clouddb1002 mysqld[27318]: 2022-10-27 13:42:53 139958272529088 [Note] /opt/wmf-mariadb101/bin/mysqld (mysqld 10.1.39-MariaDB) starting as process 27318 ... Oct 27 13:42:53 clouddb1002 mysqld[27318]: 2022-10-27 13:42:53 139958272529088 [ERROR] Plugin 'unix_socket' already installed Oct 27 13:42:53 clouddb1002 mysqld[27318]: 2022-10-27 13:42:53 139958272529088 [Note] InnoDB: Using mutexes to ref count buffer pool pages Oct 27 13:42:53 clouddb1002 mysqld[27318]: 2022-10-27 13:42:53 139958272529088 [Note] InnoDB: The InnoDB memory heap is disabled Oct 27 13:42:53 clouddb1002 mysqld[27318]: 2022-10-27 13:42:53 139958272529088 [Note] InnoDB: Mutexes and rw_locks use GCC atomic builtins Oct 27 13:42:53 clouddb1002 mysqld[27318]: 2022-10-27 13:42:53 139958272529088 [Note] InnoDB: GCC builtin __atomic_thread_fence() is used for memory barrier Oct 27 13:42:53 clouddb1002 mysqld[27318]: 2022-10-27 13:42:53 139958272529088 [Note] InnoDB: Compressed tables use zlib 1.2.11 Oct 27 13:42:53 clouddb1002 mysqld[27318]: 2022-10-27 13:42:53 139958272529088 [Note] InnoDB: Using Linux native AIO Oct 27 13:42:53 clouddb1002 mysqld[27318]: 2022-10-27 13:42:53 139958272529088 [Note] InnoDB: Using SSE crc32 instructions Oct 27 13:42:53 clouddb1002 mysqld[27318]: 2022-10-27 13:42:53 139958272529088 [Note] InnoDB: Initializing buffer pool, size = 31.0G Oct 27 13:42:55 clouddb1002 mysqld[27318]: 2022-10-27 13:42:55 139958272529088 [Note] InnoDB: Completed initialization of buffer pool Oct 27 13:42:55 clouddb1002 mysqld[27318]: 2022-10-27 13:42:55 139958272529088 [Note] InnoDB: Highest supported file format is Barracuda. Oct 27 13:42:57 clouddb1002 mysqld[27318]: 2022-10-27 13:42:57 139958272529088 [Note] InnoDB: 128 rollback segment(s) are active. Oct 27 13:42:57 clouddb1002 mysqld[27318]: 2022-10-27 13:42:57 139958272529088 [Note] InnoDB: Waiting for purge to start Oct 27 13:42:57 clouddb1002 mysqld[27318]: 2022-10-27 13:42:57 139958272529088 [Note] InnoDB: Percona XtraDB (http://www.percona.com) 5.6.43-84.3 started; log sequence number 70416120485936 Oct 27 13:42:57 clouddb1002 mysqld[27318]: 2022-10-27 13:42:57 139921096406784 [Note] InnoDB: Dumping buffer pool(s) not yet started Oct 27 13:42:57 clouddb1002 mysqld[27318]: 2022-10-27 13:42:57 7f41eb3f7700 InnoDB: Loading buffer pool(s) from .//ib_buffer_pool Oct 27 13:42:57 clouddb1002 mysqld[27318]: 2022-10-27 13:42:57 139958272529088 [Note] Server socket created on IP: '::'. Oct 27 13:42:57 clouddb1002 mysqld[27318]: 2022-10-27 13:42:57 139958272529088 [Warning] 'proxies_priv' entry '@% root@labsdb1005' ignored in --skip-name-resolve mode. Oct 27 13:42:57 clouddb1002 mysqld[27318]: 2022-10-27 13:42:57 139958220961536 [Note] Event Scheduler: scheduler thread started with id 2 Oct 27 13:42:57 clouddb1002 mysqld[27318]: 2022-10-27 13:42:57 139958272529088 [Warning] Neither --relay-log nor --relay-log-index were used; so replication may break when this MySQL server acts as a slave and has his hostname changed!! Please use '--log -basename=#' or '--relay-log=clouddb1002-relay-bin' to avoid this problem. Oct 27 13:42:58 clouddb1002 mysqld[27318]: 2022-10-27 13:42:58 139958272529088 [Note] /opt/wmf-mariadb101/bin/mysqld: ready for connections. Oct 27 13:42:58 clouddb1002 mysqld[27318]: Version: '10.1.39-MariaDB' socket: '/var/run/mysqld/mysqld.sock' port: 3306 MariaDB Server Oct 27 13:42:58 clouddb1002 systemd[1]: Started mariadb database server. Oct 27 13:42:58 clouddb1002 mysqld[27318]: 2022-10-27 13:42:58 139958220551936 [Note] Slave I/O thread: connected to master 'repl@clouddb1001.clouddb-services.eqiad1.wikimedia.cloud:3306',replication starts at GTID position '0-2886731673-33522724637,2886731673-2886731673-2193663302'