Reported on IRC. Some bots and tools are failing as a result.
Description
Related Objects
- Mentioned In
- T136789: High replag on db1069
Event Timeline
It was swapping; we rebooted it and it is back up. This seems to be an XFS-related memory leak that @Dzahn remembers as having struck elsewhere too. Lots of repeated:
Jun 1 22:27:59 labsdb1001 kernel: [47163569.167879] XFS: possible memory allocation deadlock in kmem_alloc (mode:0x250)
in kern.log.
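To gauge how often the warning repeats, a small script along these lines could count matching lines. This is only a sketch: the pattern is derived from the single kern.log line quoted above, and the function name `count_xfs_deadlocks` is made up for illustration.

```python
import re

# Pattern modelled on the kern.log line quoted above (assumption: other
# kernel versions may word the message differently).
XFS_DEADLOCK = re.compile(
    r"XFS: possible memory allocation deadlock in (\w+) \(mode:(0x[0-9a-fA-F]+)\)"
)


def count_xfs_deadlocks(lines):
    """Return how many of the given log lines contain the XFS deadlock warning."""
    return sum(1 for line in lines if XFS_DEADLOCK.search(line))
```

It accepts any iterable of lines, so an open file handle for kern.log (typically under /var/log on Ubuntu) can be passed directly.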
After it was rebooted, mysql was not running yet:
18:37 < mutante> root@labsdb1001:~# /etc/init.d/mysql status
18:37 < mutante> /opt/wmf-mariadb10 * MySQL is not running
18:37 < mutante> root@labsdb1001:~# /etc/init.d/mysql start
18:37 < mutante> /opt/wmf-mariadb10
18:37 < mutante> Starting MySQL
18:37 < mutante> ............
took a while and then
18:39 < mutante> ok, it is done
18:39 < YuviPanda> yay
18:39 < mutante> * Manager of pid-file quit without updating file.
18:39 < YuviPanda> it seems back
18:39 < mutante> running now
This XFS bug was reported here: https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1382333
The fix landed in 3.13.0-40.69, but before the crash occurred labsdb1001 was still running the previous
trusty kernel release. With the reboot it's now running the 3.13.0-83 kernel, so this should not happen again.
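For checking other hosts, the version comparison can be sketched roughly as follows. The helper names `kernel_abi` and `has_xfs_fix` are hypothetical, and the comparison is only meaningful within the trusty 3.13.0 series; it assumes release strings shaped like `3.13.0-83-generic` (as printed by `uname -r`) or `3.13.0-40.69`.

```python
import re


def kernel_abi(release):
    """Parse '3.13.0-83-generic' or '3.13.0-40.69' into ((3, 13, 0), 83)."""
    m = re.match(r"(\d+)\.(\d+)\.(\d+)-(\d+)", release)
    if m is None:
        raise ValueError(f"unrecognized kernel release: {release!r}")
    major, minor, patch, abi = (int(g) for g in m.groups())
    return (major, minor, patch), abi


def has_xfs_fix(release, fixed_in="3.13.0-40.69"):
    """True if `release` is at or past the version where the fix landed.

    Assumption: both strings belong to the same Ubuntu kernel series, so
    comparing the upstream version and ABI number is sufficient.
    """
    return kernel_abi(release) >= kernel_abi(fixed_in)
```

On a given host, `has_xfs_fix(platform.release())` (after `import platform`) would apply the check to the running kernel.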