cloudvirt1014 crash
Closed, Duplicate
Public
Assigned To

None

Authored By

	Andrew
	Dec 27 2019, 5:10 PM

Description

Fri 27 Dec 2019 05:10:13 -- I just got a page about cloudvirt1014 being down, and restarted it from the mgmt console.

Status	Assigned	Task
		Unknown Object (Task)
Resolved	• Cmjohnson	T138509 rack/setup/install/deploy labvirt1012 labvirt1013 labvirt1014 nodes (cloudvirt1012 cloudvirt1013 cloudvirt1014)
Duplicate	None	T241492 cloudvirt1014 crash
Resolved	Jclark-ctr	T241494 Degraded RAID on cloudvirt1014

Surely this is related to T241313, although cloudvirt1013 and cloudvirt1014 are in different racks.

cloudvirt1013, cloudvirt1014, and cloudvirt1023 are the only cloudvirts running:

Linux 4.9.0-11-amd64 #1 SMP Debian 4.9.189-3+deb9u2 (2019-11-11)

cloudvirt1023 is held back as a spare, so not under load.

The kernel is probably unrelated: they're running that new kernel because of the post-crash reboot, and were running the standard kernel before that.

Note for DCOps: This still has VMs live. Please coordinate with WMCS before shutting down for troubleshooting.

I believe this is the same as T241494: Degraded RAID on cloudvirt1014 (BBU replacement). Closing as a duplicate.