replica labsdb1001 down
Closed, Resolved · Public

Assigned To

Authored By

	MusikAnimal
	Jun 2 2016, 1:23 AM

Description

Reported on IRC. Some bots and tools are failing as a result.

Related Objects

Mentioned In: T136789: High replag on db1069

Event Timeline

MusikAnimal created this task. Jun 2 2016, 1:23 AM

Restricted Application added a project: Cloud-Services. Jun 2 2016, 1:23 AM

Restricted Application added subscribers: Zppix, Aklapper.

The host was swapping; we rebooted it and it is back up. This seems to be an XFS-related memory leak that @Dzahn remembers as having struck elsewhere too. Lots of repeated:

Jun  1 22:27:59 labsdb1001 kernel: [47163569.167879] XFS: possible memory allocation deadlock in kmem_alloc (mode:0x250)

in kern.log.
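To gauge how often a host logs this warning before it starts swapping, the lines could be counted with a small script. This is only a sketch: the regular expression is derived from the single kern.log line quoted above, and the function name `count_xfs_deadlocks` is made up for illustration.

```python
import re
from typing import Iterable

# Pattern derived from the one kern.log line quoted in this task;
# other kernel versions may word the warning differently (assumption).
XFS_DEADLOCK = re.compile(r"XFS: possible memory allocation deadlock in \w+")


def count_xfs_deadlocks(lines: Iterable[str]) -> int:
    """Return how many of the given log lines contain the XFS deadlock warning."""
    return sum(1 for line in lines if XFS_DEADLOCK.search(line))
```

It accepts any iterable of strings, so an open file handle on the log (for example `/var/log/kern.log`, assuming that is where the host keeps it) can be passed directly.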

After it was rebooted mysql was not running yet:

18:37 < mutante> root@labsdb1001:~# /etc/init.d/mysql status
18:37 < mutante> /opt/wmf-mariadb10 * MySQL is not running
18:37 < mutante> root@labsdb1001:~# /etc/init.d/mysql start
18:37 < mutante> /opt/wmf-mariadb10
18:37 < mutante> Starting MySQL
18:37 < mutante> ............

It took a while, and then:

18:39 < mutante> ok, it is done
18:39 < YuviPanda> yay
18:39 < mutante> * Manager of pid-file quit without updating file.
18:39 < YuviPanda> it seems back
18:39 < mutante> running now

yuvipanda mentioned this in T136789: High replag on db1069. Jun 2 2016, 1:55 AM

This XFS bug was reported here: https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1382333
The fix landed in 3.13.0-40.69, but before the crash occurred labsdb1001 was still running an earlier
trusty kernel release. With the reboot it's now running the 3.13.0-83 kernel, so this should not happen again.
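To check other trusty hosts for the same exposure, the running kernel release (as printed by `uname -r`) could be compared against the fixed version. A minimal sketch, assuming the host is on the 3.13.0 trusty series and that the number after the dash is the only part that matters; `trusty_kernel_has_fix` is a hypothetical helper, not an existing tool.

```python
import re

# ABI number of the trusty kernel that carries the fix (3.13.0-40.69, per the comment above).
FIXED_ABI = 40


def trusty_kernel_has_fix(release: str) -> bool:
    """Return True if a 3.13.0-series release string is at or past the fixed ABI.

    `release` is expected to look like the output of `uname -r`,
    e.g. "3.13.0-83-generic". Other kernel series are rejected rather
    than guessed at, since this comparison only makes sense within 3.13.0.
    """
    match = re.match(r"3\.13\.0-(\d+)", release)
    if match is None:
        raise ValueError(f"not a 3.13.0-series kernel release: {release!r}")
    return int(match.group(1)) >= FIXED_ABI
```

For example, `trusty_kernel_has_fix("3.13.0-83-generic")` returns `True`, while an illustrative older release such as `"3.13.0-39-generic"` returns `False`.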

\o/ awesome!

yuvipanda closed this task as Resolved. Jun 2 2016, 8:16 AM

yuvipanda claimed this task.

Phabricator_maintenance removed a subscriber: yuvipanda. Jun 7 2017, 6:42 PM
