Earlier today the prometheus (eqiad, codfw) and centrallog hosts locked up at more or less the same time. The hosts were for the most part unresponsive over ssh, and sometimes on the console as well.
The one correlation I have found so far is that `fstrim.service` started shortly before the kernel began reporting problems (under a second earlier in the excerpt below). The full paste from e.g. centrallog1002 is at https://phabricator.wikimedia.org/P75746
```
2025-05-05T01:39:02.626566+00:00 centrallog1002 systemd[1]: Starting fstrim.service - Discard unused blocks on filesystems from /etc/fstab...
2025-05-05T01:39:03.328414+00:00 centrallog1002 kernel: [472531.143541] BUG: kernel NULL pointer dereference, address: 0000000000000000
2025-05-05T01:39:03.328452+00:00 centrallog1002 kernel: [472531.150586] #PF: supervisor instruction fetch in kernel mode
2025-05-05T01:39:03.328455+00:00 centrallog1002 kernel: [472531.156331] #PF: error_code(0x0010) - not-present page
2025-05-05T01:39:03.328456+00:00 centrallog1002 kernel: [472531.161558] PGD 0 P4D 0
2025-05-05T01:39:03.328458+00:00 centrallog1002 kernel: [472531.164185] Oops: 0010 [#1] PREEMPT SMP NOPTI
```
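
To check whether the same pattern holds on the other affected hosts, something like the following could be run against their syslog. This is only a rough sketch: the function names, the 60-second window, and the assumption that every line has the `timestamp host message` layout shown above are mine, not part of any existing tooling.

```python
import re
from datetime import datetime, timedelta

# Patterns taken from the log excerpt above.
FSTRIM_START = re.compile(r"Starting fstrim\.service")
KERNEL_BUG = re.compile(r"kernel: \[[\s\d.]+\] BUG:")


def parse_line(line):
    """Split a syslog line into (timestamp, host, message); None if malformed."""
    parts = line.split(" ", 2)
    if len(parts) < 3:
        return None
    try:
        ts = datetime.fromisoformat(parts[0])
    except ValueError:
        return None
    return ts, parts[1], parts[2]


def bugs_after_fstrim(lines, window=timedelta(seconds=60)):
    """Return (host, seconds_after_fstrim_start, message) for each kernel BUG
    line logged within `window` after fstrim.service started on the same host.

    The 60-second default window is an arbitrary assumption.
    """
    last_start = {}  # host -> timestamp of the most recent fstrim start
    hits = []
    for line in lines:
        parsed = parse_line(line.rstrip("\n"))
        if parsed is None:
            continue
        ts, host, msg = parsed
        if FSTRIM_START.search(msg):
            last_start[host] = ts
        elif KERNEL_BUG.search(msg):
            started = last_start.get(host)
            if started is not None and timedelta(0) <= ts - started <= window:
                hits.append((host, (ts - started).total_seconds(), msg))
    return hits
```

Passing an open syslog file (or several concatenated) to `bugs_after_fstrim` yields one entry per matching `BUG:` line. A hit only shows the two events were close in time; it does not by itself establish that fstrim is the cause.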