
cp3051 crashed
Closed, Duplicate, Public

Description

This one also crashed today, together with cp3055. Nothing on the console.

Event Timeline

Nothing in racadm, checked both getsel and lclog view. Nothing in syslog & co.

FYI in dmesg during the end of the boot process it logged a bunch of kvm: disabled by bios.

Thanks @Volans for taking care of this.

> Nothing in racadm, checked both getsel and lclog view. Nothing in syslog & co.

Just like all other crashes tracked in T238305 :-/
Now, I know it sounds crazy, but this is the 6th host to crash out of the 8 cache_upload nodes in esams, while so far none of the 8 cache_text nodes in esams has crashed. I don't think there's much to look at on the software configuration side, considering that a text node has crashed in eqiad (cp1077). Perhaps it's worth checking what differentiates upload@esams from text@esams at the hardware level: parts batches, or anything special about the racking? You can tell the upload@esams hosts from the text ones because their host numbers are odd: cp30(5[13579]|6[135]) for upload vs cp30(5[02468]|6[024]) for text.
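
For anyone who wants to split a host list by role when comparing hardware or racking data, here is a minimal sketch in Python that applies the two patterns above. The function name `esams_cache_role` and the generated host list are only illustrative assumptions, not part of any existing tooling; the authoritative role assignment lives in the cluster configuration.

```python
import re

# Patterns copied from the comment above: odd-numbered hosts are upload,
# even-numbered hosts are text (esams cache nodes only).
UPLOAD_RE = re.compile(r"cp30(5[13579]|6[135])")
TEXT_RE = re.compile(r"cp30(5[02468]|6[024])")


def esams_cache_role(hostname):
    """Return 'upload', 'text', or None for hosts outside both patterns.

    Hypothetical helper: it only looks at the short hostname, so any
    domain suffix after the first dot is ignored.
    """
    short = hostname.split(".", 1)[0]
    if UPLOAD_RE.fullmatch(short):
        return "upload"
    if TEXT_RE.fullmatch(short):
        return "text"
    return None


# Illustrative host list generated from the numeric range the patterns cover.
hosts = [f"cp{n}" for n in range(3050, 3066)]
upload_hosts = [h for h in hosts if esams_cache_role(h) == "upload"]
text_hosts = [h for h in hosts if esams_cache_role(h) == "text"]

print("upload:", " ".join(upload_hosts))
print("text:  ", " ".join(text_hosts))
```

Each pattern matches exactly 8 hostnames, which agrees with the 8 upload and 8 text nodes mentioned above; hosts from other sites such as cp1077 match neither.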

> FYI in dmesg during the end of the boot process it logged a bunch of kvm: disabled by bios.

Disabled on purpose, we don't use kvm on cache nodes.