Was woken up by alerts and found frban2002 and frmon2002 pingable, but was unable to log in via ssh or the console. Restarted both hosts from the console using serveraction powercycle. On investigating the logs, both hosts appear to have hit a kernel panic in conjunction with an fstrim operation:
frmon2002
May 5 00:27:07 frmon2002 systemd[1]: Starting fstrim.service - Discard unused blocks on filesystems from /etc/fstab...
May 5 00:27:08 frmon2002 kernel: [537444.719622] BUG: kernel NULL pointer dereference, address: 0000000000000000
May 5 00:27:08 frmon2002 kernel: [537444.726674] #PF: supervisor instruction fetch in kernel mode
May 5 00:27:08 frmon2002 kernel: [537444.732420] #PF: error_code(0x0010) - not-present page
May 5 00:27:08 frmon2002 kernel: [537444.737647] PGD c2a784067 P4D 0-
May 5 00:27:08 frmon2002 kernel: [537444.740965] Oops: 0010 [#1] PREEMPT SMP NOPTI
May 5 00:27:08 frmon2002 kernel: [537444.745412] CPU: 19 PID: 442801 Comm: fstrim Not tainted 6.1.0-34-amd64 #1 Debian 6.1.135-1
May 5 00:27:08 frmon2002 kernel: [537444.753933] Hardware name: Dell Inc. PowerEdge R450/073H50, BIOS 1.9.2 11/17/2022
May 5 00:27:08 frmon2002 kernel: [537444.761498] RIP: 0010:0x0
May 5 00:27:08 frmon2002 kernel: [537444.764212] Code: Unable to access opcode bytes at 0xffffffffffffffd6.
May 5 00:27:08 frmon2002 kernel: [537444.770822] RSP: 0018:ff244d8da6fa7718 EFLAGS: 00010206
May 5 00:27:08 frmon2002 kernel: [537444.776136] RAX: 0000000000000000 RBX: 0000000000092800 RCX: 0000000000000c00
May 5 00:27:08 frmon2002 kernel: [537444.783356] RDX: 0000000000000803 RSI: 0000000000000000 RDI: 0000000000092800
May 5 00:27:08 frmon2002 kernel: [537444.790575] RBP: ff1a917ae0996718 R08: ff1a917ae0996700 R09: ff1a9179cd8fbf50
May 5 00:27:08 frmon2002 kernel: [537444.797792] R10: 0000000000000001 R11: 000000000005554a R12: 0000000000092c00
May 5 00:27:08 frmon2002 kernel: [537444.805014] R13: 0000000000000400 R14: 0000000000000803 R15: 0000000000000000
May 5 00:27:08 frmon2002 kernel: [537444.812233] FS: 00007f662b746840(0000) GS:ff1a918a1f840000(0000) knlGS:0000000000000000
May 5 00:27:08 frmon2002 kernel: [537444.820404] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
May 5 00:27:08 frmon2002 kernel: [537444.826238] CR2: ffffffffffffffd6 CR3: 00000009e7428005 CR4: 0000000000771ee0
May 5 00:27:08 frmon2002 kernel: [537444.833458] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
May 5 00:27:08 frmon2002 kernel: [537444.840677] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
May 5 00:27:08 frmon2002 kernel: [537444.847895] PKRU: 55555554
May 5 00:27:08 frmon2002 kernel: [537444.850695] Call Trace:
frban2002
May 5 00:42:52 frban2002 systemd[1]: Starting fstrim.service - Discard unused blocks on filesystems from /etc/fstab...
May 5 00:42:53 frban2002 kernel: [372148.019323] BUG: kernel NULL pointer dereference, address: 0000000000000000
May 5 00:42:53 frban2002 kernel: [372148.026380] #PF: supervisor instruction fetch in kernel mode
May 5 00:42:53 frban2002 kernel: [372148.032126] #PF: error_code(0x0010) - not-present page
May 5 00:42:53 frban2002 kernel: [372148.037354] PGD 8ab89d067 P4D 0-
May 5 00:42:53 frban2002 kernel: [372148.040683] Oops: 0010 [#1] PREEMPT SMP NOPTI
May 5 00:42:53 frban2002 kernel: [372148.045134] CPU: 13 PID: 447162 Comm: fstrim Not tainted 6.1.0-34-amd64 #1 Debian 6.1.135-1
May 5 00:42:53 frban2002 kernel: [372148.053655] Hardware name: Dell Inc. PowerEdge R450/0VT18Y, BIOS 1.14.1 03/11/2024
May 5 00:42:53 frban2002 kernel: [372148.061306] RIP: 0010:0x0
May 5 00:42:53 frban2002 kernel: [372148.064024] Code: Unable to access opcode bytes at 0xffffffffffffffd6.
May 5 00:42:53 frban2002 kernel: [372148.070633] RSP: 0018:ff218493645b78d8 EFLAGS: 00010206
May 5 00:42:53 frban2002 kernel: [372148.075945] RAX: 0000000000000000 RBX: 0000000000092800 RCX: 0000000000000c00
May 5 00:42:53 frban2002 kernel: [372148.083164] RDX: 0000000000000803 RSI: 0000000000000000 RDI: 0000000000092800
May 5 00:42:53 frban2002 kernel: [372148.090385] RBP: ff11c1add95c4718 R08: ff11c1add95c4700 R09: ff11c1b54961f850
May 5 00:42:53 frban2002 kernel: [372148.097604] R10: 0000000000000001 R11: 000000000005554e R12: 0000000000092c00
May 5 00:42:53 frban2002 kernel: [372148.104822] R13: 0000000000000400 R14: 0000000000000803 R15: 0000000000000000
May 5 00:42:53 frban2002 kernel: [372148.112043] FS: 00007f3ba72f3840(0000) GS:ff11c1bd1f780000(0000) knlGS:0000000000000000
May 5 00:42:53 frban2002 kernel: [372148.120215] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
May 5 00:42:53 frban2002 kernel: [372148.126049] CR2: ffffffffffffffd6 CR3: 0000000893646002 CR4: 0000000000771ee0
May 5 00:42:53 frban2002 kernel: [372148.133267] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
May 5 00:42:53 frban2002 kernel: [372148.140487] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
May 5 00:42:53 frban2002 kernel: [372148.147708] PKRU: 55555554
May 5 00:42:53 frban2002 kernel: [372148.150505] Call Trace:
Full logs are available on each host and on frlog2002.
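
To check whether other hosts hit the same pattern, something like the following could be run over a syslog file. This is only a rough sketch: the function name `find_fstrim_panics`, the `window` parameter, and the assumption that the file uses the same "Mon D HH:MM:SS host message" line layout as the excerpts above are all mine, not part of any existing tooling. It flags a host when a "BUG: kernel NULL pointer dereference" line follows a "Starting fstrim.service" line within a few log lines from that host.

```python
import re
import sys

# Syslog-style prefix as seen in the excerpts above, e.g.
# "May 5 00:27:08 frmon2002 kernel: ..."  (assumed layout)
LINE_RE = re.compile(r"^(\w{3}\s+\d+\s+\d{2}:\d{2}:\d{2})\s+(\S+)\s+(.*)$")

FSTRIM_START = "Starting fstrim.service"
KERNEL_BUG = "BUG: kernel NULL pointer dereference"


def find_fstrim_panics(lines, window=5):
    """Return (host, fstrim_start_ts, bug_ts) tuples.

    A hit is recorded when a kernel BUG line appears within `window`
    log lines (per host) after fstrim.service started on that host.
    """
    hits = []
    pending = {}  # host -> (fstrim start timestamp, lines left to inspect)
    for line in lines:
        m = LINE_RE.match(line)
        if not m:
            continue  # skip lines that do not match the assumed layout
        ts, host, msg = m.groups()
        if FSTRIM_START in msg:
            pending[host] = (ts, window)
        elif host in pending:
            start_ts, remaining = pending[host]
            if KERNEL_BUG in msg:
                hits.append((host, start_ts, ts))
                del pending[host]
            elif remaining <= 1:
                del pending[host]  # window exhausted without a BUG line
            else:
                pending[host] = (start_ts, remaining - 1)
    return hits


if __name__ == "__main__" and len(sys.argv) > 1:
    # The log path is supplied by the caller; no default location is assumed.
    with open(sys.argv[1], errors="replace") as fh:
        for host, start_ts, bug_ts in find_fstrim_panics(fh):
            print(f"{host}: fstrim started {start_ts}, kernel BUG at {bug_ts}")
```

Matching on the two message substrings rather than on the full trace keeps the check independent of per-host details such as register contents and BIOS versions, which differ between frmon2002 and frban2002.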