db2097@s1 got killed due to hardware memory corruption
Closed, Resolved (Public)

Description

[Mon Jul 19 02:49:14 2021] Disabling lock debugging due to kernel taint
[Mon Jul 19 02:49:14 2021] mce: [Hardware Error]: Machine check events logged
[Mon Jul 19 02:49:14 2021] mce: Uncorrected hardware memory error in user-access at 466483c700
[Mon Jul 19 02:49:14 2021] Memory failure: 0x466483c: Killing puppet:980 due to hardware memory corruption
[Mon Jul 19 02:49:14 2021] Memory failure: 0x466483c: recovery action for dirty LRU page: Recovered
[Mon Jul 19 02:57:00 2021] mce: [Hardware Error]: Machine check events logged
[Mon Jul 19 02:57:00 2021] mce: [Hardware Error]: Machine check events logged
[Mon Jul 19 03:12:31 2021] mce: [Hardware Error]: Machine check events logged
[Mon Jul 19 03:12:31 2021] mce: [Hardware Error]: Machine check events logged
[Mon Jul 19 03:30:47 2021] mce: [Hardware Error]: Machine check events logged
[Mon Jul 19 03:30:47 2021] mce: [Hardware Error]: Machine check events logged
[Mon Jul 19 03:33:14 2021] mce: [Hardware Error]: Machine check events logged
[Mon Jul 19 03:33:14 2021] mce: [Hardware Error]: Machine check events logged
[Mon Jul 19 04:19:49 2021] mce_notify_irq: 1 callbacks suppressed
[Mon Jul 19 04:19:49 2021] mce: [Hardware Error]: Machine check events logged
[Mon Jul 19 04:38:04 2021] mce: [Hardware Error]: Machine check events logged
[Mon Jul 19 04:38:04 2021] mce: [Hardware Error]: Machine check events logged
[Mon Jul 19 04:40:31 2021] mce_notify_irq: 1 callbacks suppressed
[Mon Jul 19 04:40:31 2021] mce: [Hardware Error]: Machine check events logged
[Mon Jul 19 04:40:31 2021] mce: [Hardware Error]: Machine check events logged
[Mon Jul 19 05:01:13 2021] mce: [Hardware Error]: Machine check events logged
[Mon Jul 19 05:19:28 2021] mce: [Hardware Error]: Machine check events logged
[Mon Jul 19 05:19:28 2021] mce: [Hardware Error]: Machine check events logged
[Mon Jul 19 05:21:56 2021] mce_notify_irq: 1 callbacks suppressed
[Mon Jul 19 05:21:56 2021] mce: [Hardware Error]: Machine check events logged
[Mon Jul 19 05:58:09 2021] mce: [Hardware Error]: Machine check events logged
[Mon Jul 19 05:58:09 2021] mce: [Hardware Error]: Machine check events logged
[Mon Jul 19 06:03:20 2021] mce_notify_irq: 1 callbacks suppressed
[Mon Jul 19 06:03:20 2021] mce: [Hardware Error]: Machine check events logged
[Mon Jul 19 06:16:25 2021] mce: [Hardware Error]: Machine check events logged
[Mon Jul 19 06:16:41 2021] mce: [Hardware Error]: Machine check events logged
[Mon Jul 19 06:19:08 2021] mce: [Hardware Error]: Machine check events logged
[Mon Jul 19 06:19:08 2021] mce: [Hardware Error]: Machine check events logged
[Mon Jul 19 06:37:07 2021] mce_notify_irq: 1 callbacks suppressed
[Mon Jul 19 06:37:07 2021] mce: [Hardware Error]: Machine check events logged
[Mon Jul 19 06:37:07 2021] mce: [Hardware Error]: Machine check events logged
[Mon Jul 19 06:39:34 2021] mce: [Hardware Error]: Machine check events logged
[Mon Jul 19 06:39:34 2021] mce: [Hardware Error]: Machine check events logged
[Mon Jul 19 06:57:49 2021] mce: [Hardware Error]: Machine check events logged
[Mon Jul 19 07:00:16 2021] mce: [Hardware Error]: Machine check events logged
[Mon Jul 19 07:15:48 2021] mce: [Hardware Error]: Machine check events logged
[Mon Jul 19 07:16:04 2021] mce: [Hardware Error]: Machine check events logged
[Mon Jul 19 07:34:03 2021] mce: [Hardware Error]: Machine check events logged
[Mon Jul 19 07:34:03 2021] mce: [Hardware Error]: Machine check events logged
[Mon Jul 19 07:36:30 2021] mce: [Hardware Error]: Machine check events logged
[Mon Jul 19 07:36:46 2021] mce: [Hardware Error]: Machine check events logged
[Mon Jul 19 07:54:45 2021] mce: [Hardware Error]: Machine check events logged
[Mon Jul 19 07:56:07 2021] mce: [Hardware Error]: Machine check events logged
[Mon Jul 19 07:56:23 2021] mce: [Hardware Error]: Machine check events logged
[Mon Jul 19 08:13:00 2021] mce: [Hardware Error]: Machine check events logged
[Mon Jul 19 08:13:00 2021] mce: [Hardware Error]: Machine check events logged
[Mon Jul 19 08:30:59 2021] mce: [Hardware Error]: Machine check events logged
[Mon Jul 19 08:31:16 2021] mce: [Hardware Error]: Machine check events logged
[Mon Jul 19 08:33:26 2021] mce: [Hardware Error]: Machine check events logged
[Mon Jul 19 08:50:36 2021] mce: [Hardware Error]: Machine check events logged
[Mon Jul 19 08:53:20 2021] mce: [Hardware Error]: Machine check events logged
[Mon Jul 19 09:09:57 2021] mce: [Hardware Error]: Machine check events logged
[Mon Jul 19 09:11:35 2021] mce: [Hardware Error]: Machine check events logged
[Mon Jul 19 09:12:24 2021] mce: [Hardware Error]: Machine check events logged
[Mon Jul 19 09:12:40 2021] mce: [Hardware Error]: Machine check events logged
[Mon Jul 19 09:30:39 2021] mce: [Hardware Error]: Machine check events logged
[Mon Jul 19 09:30:39 2021] mce: [Hardware Error]: Machine check events logged
[Mon Jul 19 09:33:06 2021] mce_notify_irq: 2 callbacks suppressed
[Mon Jul 19 09:33:06 2021] mce: [Hardware Error]: Machine check events logged
[Mon Jul 19 09:33:22 2021] mce: [Hardware Error]: Machine check events logged
[Mon Jul 19 09:35:00 2021] mce: [Hardware Error]: Machine check events logged
[Mon Jul 19 09:48:54 2021] mce: [Hardware Error]: Machine check events logged
[Mon Jul 19 09:51:21 2021] mce: [Hardware Error]: Machine check events logged
[Mon Jul 19 09:51:21 2021] mce: [Hardware Error]: Machine check events logged
[Mon Jul 19 10:06:53 2021] mce_notify_irq: 3 callbacks suppressed
[Mon Jul 19 10:06:53 2021] mce: [Hardware Error]: Machine check events logged
[Mon Jul 19 10:09:20 2021] mce: [Hardware Error]: Machine check events logged
[Mon Jul 19 10:09:36 2021] mce: [Hardware Error]: Machine check events logged
[Mon Jul 19 10:25:24 2021] mce: [Hardware Error]: Machine check events logged
[Mon Jul 19 10:27:35 2021] mce: [Hardware Error]: Machine check events logged
[Mon Jul 19 10:27:51 2021] mce: [Hardware Error]: Machine check events logged
[Mon Jul 19 10:30:18 2021] mce: [Hardware Error]: Machine check events logged
[Mon Jul 19 10:30:18 2021] mce: [Hardware Error]: Machine check events logged
[Mon Jul 19 10:45:34 2021] mce: [Hardware Error]: Machine check events logged
[Mon Jul 19 10:47:12 2021] mce: [Hardware Error]: Machine check events logged
[Mon Jul 19 10:48:34 2021] mce: [Hardware Error]: Machine check events logged
[Mon Jul 19 11:03:49 2021] mce: [Hardware Error]: Machine check events logged
[Mon Jul 19 11:20:59 2021] mce: [Hardware Error]: Machine check events logged
[Mon Jul 19 11:22:04 2021] mce: [Hardware Error]: Machine check events logged
[Mon Jul 19 11:22:04 2021] mce: [Hardware Error]: Machine check events logged
[Mon Jul 19 11:39:14 2021] mce_notify_irq: 1 callbacks suppressed
[Mon Jul 19 11:39:14 2021] mce: [Hardware Error]: Machine check events logged
[Mon Jul 19 11:40:03 2021] mce: [Hardware Error]: Machine check events logged
[Mon Jul 19 11:40:36 2021] mce: [Hardware Error]: Machine check events logged
[Mon Jul 19 12:00:45 2021] mce: [Hardware Error]: Machine check events logged
[Mon Jul 19 12:00:45 2021] mce: [Hardware Error]: Machine check events logged
[Mon Jul 19 12:03:29 2021] mce: [Hardware Error]: Machine check events logged
[Mon Jul 19 12:03:29 2021] mce: [Hardware Error]: Machine check events logged
[Mon Jul 19 12:18:11 2021] mce_notify_irq: 1 callbacks suppressed
[Mon Jul 19 12:18:11 2021] mce: [Hardware Error]: Machine check events logged
[Mon Jul 19 12:19:00 2021] mce: [Hardware Error]: Machine check events logged
[Mon Jul 19 12:19:17 2021] mce: [Hardware Error]: Machine check events logged
[Mon Jul 19 12:20:55 2021] mce: [Hardware Error]: Machine check events logged
[Mon Jul 19 12:37:32 2021] mce: [Hardware Error]: Machine check events logged
[Mon Jul 19 12:37:32 2021] mce: [Hardware Error]: Machine check events logged
[Mon Jul 19 12:38:54 2021] mce: [Hardware Error]: Machine check events logged
[Mon Jul 19 12:39:43 2021] mce: [Hardware Error]: Machine check events logged
[Mon Jul 19 12:53:03 2021] mce: [Hardware Error]: Machine check events logged
[Mon Jul 19 12:57:09 2021] mce: [Hardware Error]: Machine check events logged
[Mon Jul 19 13:10:46 2021] mce: [Hardware Error]: Machine check events logged
[Mon Jul 19 13:11:19 2021] mce: [Hardware Error]: Machine check events logged
[Mon Jul 19 13:13:29 2021] mce_notify_irq: 1 callbacks suppressed
[Mon Jul 19 13:13:29 2021] mce: [Hardware Error]: Machine check events logged
[Mon Jul 19 13:13:29 2021] mce: [Hardware Error]: Machine check events logged
[Mon Jul 19 13:26:50 2021] mce_notify_irq: 3 callbacks suppressed
[Mon Jul 19 13:26:50 2021] mce: [Hardware Error]: Machine check events logged
[Mon Jul 19 13:29:01 2021] mce: [Hardware Error]: Machine check events logged
[Mon Jul 19 13:44:16 2021] mce: [Hardware Error]: Machine check events logged
[Mon Jul 19 13:44:33 2021] mce: [Hardware Error]: Machine check events logged
[Mon Jul 19 13:47:16 2021] mce_notify_irq: 2 callbacks suppressed
[Mon Jul 19 13:47:16 2021] mce: [Hardware Error]: Machine check events logged
[Mon Jul 19 13:47:16 2021] mce: [Hardware Error]: Machine check events logged
[Mon Jul 19 14:01:59 2021] mce: [Hardware Error]: Machine check events logged
[Mon Jul 19 14:02:32 2021] mce: [Hardware Error]: Machine check events logged
[Mon Jul 19 14:18:03 2021] mce_notify_irq: 1 callbacks suppressed
[Mon Jul 19 14:18:03 2021] mce: [Hardware Error]: Machine check events logged
[Mon Jul 19 14:18:03 2021] mce: [Hardware Error]: Machine check events logged
[Mon Jul 19 14:21:03 2021] mce_notify_irq: 3 callbacks suppressed
[Mon Jul 19 14:21:03 2021] mce: [Hardware Error]: Machine check events logged
[Mon Jul 19 14:21:03 2021] mce: [Hardware Error]: Machine check events logged
[Mon Jul 19 14:33:51 2021] mce: [Hardware Error]: Machine check events logged
[Mon Jul 19 14:34:08 2021] mce: [Hardware Error]: Machine check events logged
[Mon Jul 19 14:35:46 2021] mce_notify_irq: 1 callbacks suppressed
[Mon Jul 19 14:35:46 2021] mce: [Hardware Error]: Machine check events logged
[Mon Jul 19 14:36:18 2021] mce: [Hardware Error]: Machine check events logged
[Mon Jul 19 14:51:50 2021] mce_notify_irq: 2 callbacks suppressed
[Mon Jul 19 14:51:50 2021] mce: [Hardware Error]: Machine check events logged
[Mon Jul 19 14:51:50 2021] mce: [Hardware Error]: Machine check events logged
[Mon Jul 19 15:08:11 2021] mce_notify_irq: 2 callbacks suppressed
[Mon Jul 19 15:08:11 2021] mce: [Hardware Error]: Machine check events logged
[Mon Jul 19 15:10:05 2021] mce: [Hardware Error]: Machine check events logged
[Mon Jul 19 15:23:10 2021] mce: [Hardware Error]: Machine check events logged
[Mon Jul 19 15:23:10 2021] mce: [Hardware Error]: Machine check events logged
[Mon Jul 19 15:25:04 2021] mce: [Hardware Error]: Machine check events logged
[Mon Jul 19 15:25:37 2021] mce: [Hardware Error]: Machine check events logged
[Mon Jul 19 15:38:09 2021] mce: [Hardware Error]: Machine check events logged
[Mon Jul 19 15:38:41 2021] mce: [Hardware Error]: Machine check events logged
[Mon Jul 19 15:41:25 2021] mce: [Hardware Error]: Machine check events logged
[Mon Jul 19 15:55:35 2021] mce: [Hardware Error]: Machine check events logged
[Mon Jul 19 15:56:24 2021] mce: [Hardware Error]: Machine check events logged
[Mon Jul 19 15:56:57 2021] mce: [Hardware Error]: Machine check events logged
[Mon Jul 19 16:12:28 2021] mce: [Hardware Error]: Machine check events logged
[Mon Jul 19 16:13:50 2021] mce: [Hardware Error]: Machine check events logged
[Mon Jul 19 16:14:39 2021] mce: [Hardware Error]: Machine check events logged
[Mon Jul 19 16:15:12 2021] mce: [Hardware Error]: Machine check events logged
[Mon Jul 19 16:30:11 2021] mce: [Hardware Error]: Machine check events logged
[Mon Jul 19 16:30:43 2021] mce: [Hardware Error]: Machine check events logged
[Mon Jul 19 16:32:54 2021] mce: [Hardware Error]: Machine check events logged
[Mon Jul 19 16:43:15 2021] mce: [Hardware Error]: Machine check events logged
[Mon Jul 19 16:43:48 2021] mce: [Hardware Error]: Machine check events logged
[Mon Jul 19 16:45:59 2021] mce: [Hardware Error]: Machine check events logged
[Mon Jul 19 16:46:15 2021] mce: [Hardware Error]: Machine check events logged
[Mon Jul 19 16:58:47 2021] mce: [Hardware Error]: Machine check events logged
[Mon Jul 19 16:59:03 2021] mce: [Hardware Error]: Machine check events logged
[Mon Jul 19 17:15:08 2021] mce_notify_irq: 2 callbacks suppressed
[Mon Jul 19 17:15:08 2021] mce: [Hardware Error]: Machine check events logged
[Mon Jul 19 17:17:02 2021] mce: [Hardware Error]: Machine check events logged
[Mon Jul 19 17:17:02 2021] mce: [Hardware Error]: Machine check events logged
[Mon Jul 19 19:08:27 2021] mce_notify_irq: 2 callbacks suppressed
[Mon Jul 19 19:08:27 2021] mce: [Hardware Error]: Machine check events logged
[Mon Jul 19 19:26:43 2021] mce: [Hardware Error]: Machine check events logged
[Mon Jul 19 20:00:13 2021] mce: [Hardware Error]: Machine check events logged
[Mon Jul 19 20:00:29 2021] mce: [Hardware Error]: Machine check events logged
[Mon Jul 19 20:52:31 2021] mce: [Hardware Error]: Machine check events logged
[Mon Jul 19 20:53:53 2021] mce: [Hardware Error]: Machine check events logged
[Mon Jul 19 20:54:42 2021] mce: [Hardware Error]: Machine check events logged
[Mon Jul 19 21:10:14 2021] mce: [Hardware Error]: Machine check events logged
[Mon Jul 19 21:25:13 2021] mce: [Hardware Error]: Machine check events logged
[Mon Jul 19 21:25:29 2021] mce: [Hardware Error]: Machine check events logged
[Mon Jul 19 21:26:18 2021] mce_notify_irq: 2 callbacks suppressed
[Mon Jul 19 21:26:18 2021] mce: [Hardware Error]: Machine check events logged
[Mon Jul 19 21:26:34 2021] mce: [Hardware Error]: Machine check events logged
[Mon Jul 19 21:28:29 2021] mce: [Hardware Error]: Machine check events logged
[Mon Jul 19 21:28:29 2021] mce: [Hardware Error]: Machine check events logged
[Mon Jul 19 21:43:28 2021] mce: [Hardware Error]: Machine check events logged
[Mon Jul 19 21:43:44 2021] mce: [Hardware Error]: Machine check events logged
[Mon Jul 19 21:59:49 2021] mce_notify_irq: 1 callbacks suppressed
[Mon Jul 19 21:59:49 2021] mce: [Hardware Error]: Machine check events logged
[Mon Jul 19 22:00:05 2021] mce: [Hardware Error]: Machine check events logged
[Mon Jul 19 22:14:15 2021] mce: [Hardware Error]: Machine check events logged
[Mon Jul 19 22:15:04 2021] mce: [Hardware Error]: Machine check events logged
[Mon Jul 19 22:17:47 2021] mce: [Hardware Error]: Machine check events logged
[Mon Jul 19 22:30:19 2021] mce: [Hardware Error]: Machine check events logged
[Mon Jul 19 22:30:19 2021] mce: [Hardware Error]: Machine check events logged
[Mon Jul 19 22:46:07 2021] mce: [Hardware Error]: Machine check events logged
[Mon Jul 19 22:46:24 2021] mce: [Hardware Error]: Machine check events logged
[Mon Jul 19 23:04:06 2021] mce: [Hardware Error]: Machine check events logged
[Mon Jul 19 23:04:22 2021] mce: [Hardware Error]: Machine check events logged
[Mon Jul 19 23:19:54 2021] mce_notify_irq: 3 callbacks suppressed
[Mon Jul 19 23:19:54 2021] mce: [Hardware Error]: Machine check events logged
[Mon Jul 19 23:20:27 2021] mce: [Hardware Error]: Machine check events logged
[Mon Jul 19 23:21:49 2021] mce: [Hardware Error]: Machine check events logged
[Mon Jul 19 23:22:38 2021] mce: [Hardware Error]: Machine check events logged
[Mon Jul 19 23:22:54 2021] mce_notify_irq: 1 callbacks suppressed
[Mon Jul 19 23:22:54 2021] mce: [Hardware Error]: Machine check events logged
[Mon Jul 19 23:23:10 2021] mce: [Hardware Error]: Machine check events logged
[Mon Jul 19 23:38:09 2021] mce: [Hardware Error]: Machine check events logged
[Mon Jul 19 23:40:36 2021] mce: [Hardware Error]: Machine check events logged
[Mon Jul 19 23:40:53 2021] mce: [Hardware Error]: Machine check events logged
[Mon Jul 19 23:45:04 2021] Process accounting resumed
[Mon Jul 19 23:55:35 2021] mce: [Hardware Error]: Machine check events logged
[Mon Jul 19 23:55:52 2021] mce: [Hardware Error]: Machine check events logged
[Tue Jul 20 00:13:51 2021] mce_notify_irq: 1 callbacks suppressed
[Tue Jul 20 00:13:51 2021] mce: [Hardware Error]: Machine check events logged
[Tue Jul 20 00:14:23 2021] mce: [Hardware Error]: Machine check events logged
[Tue Jul 20 00:32:55 2021] mce_notify_irq: 1 callbacks suppressed
[Tue Jul 20 00:32:55 2021] mce: [Hardware Error]: Machine check events logged
[Tue Jul 20 00:32:55 2021] mce: [Hardware Error]: Machine check events logged
[Tue Jul 20 00:43:16 2021] mce: [Hardware Error]: Machine check events logged
[Tue Jul 20 00:43:49 2021] mce: [Hardware Error]: Machine check events logged
[Tue Jul 20 00:45:10 2021] mce: [Hardware Error]: Machine check events logged
[Tue Jul 20 01:00:09 2021] mce: [Hardware Error]: Machine check events logged
[Tue Jul 20 01:00:42 2021] mce: [Hardware Error]: Machine check events logged
[Tue Jul 20 01:01:31 2021] mce: [Hardware Error]: Machine check events logged
[Tue Jul 20 01:01:31 2021] mce: [Hardware Error]: Machine check events logged
[Tue Jul 20 01:18:08 2021] mce_notify_irq: 2 callbacks suppressed
[Tue Jul 20 01:18:08 2021] mce: [Hardware Error]: Machine check events logged
[Tue Jul 20 01:18:24 2021] mce: [Hardware Error]: Machine check events logged
[Tue Jul 20 01:34:12 2021] mce: [Hardware Error]: Machine check events logged
[Tue Jul 20 01:34:29 2021] mce: [Hardware Error]: Machine check events logged
[Tue Jul 20 01:35:18 2021] mce_notify_irq: 1 callbacks suppressed
[Tue Jul 20 01:35:18 2021] mce: [Hardware Error]: Machine check events logged
[Tue Jul 20 01:35:50 2021] mce: [Hardware Error]: Machine check events logged
[Tue Jul 20 01:54:06 2021] mce: [Hardware Error]: Machine check events logged
[Tue Jul 20 01:54:55 2021] mce: [Hardware Error]: Machine check events logged
[Tue Jul 20 01:58:27 2021] mce: [Hardware Error]: Machine check events logged
[Tue Jul 20 02:13:26 2021] mce: [Hardware Error]: Machine check events logged
[Tue Jul 20 02:14:15 2021] mce: [Hardware Error]: Machine check events logged
[Tue Jul 20 02:16:10 2021] mce: [Hardware Error]: Machine check events logged
[Tue Jul 20 02:16:42 2021] mce: [Hardware Error]: Machine check events logged
[Tue Jul 20 02:31:41 2021] mce: [Hardware Error]: Machine check events logged
[Tue Jul 20 02:32:30 2021] mce: [Hardware Error]: Machine check events logged
[Tue Jul 20 02:33:03 2021] mce: [Hardware Error]: Machine check events logged
[Tue Jul 20 02:33:52 2021] mce: [Hardware Error]: Machine check events logged
[Tue Jul 20 02:54:18 2021] mce: [Hardware Error]: Machine check events logged
[Tue Jul 20 03:07:39 2021] mce: [Hardware Error]: Machine check events logged
[Tue Jul 20 03:25:54 2021] mce: [Hardware Error]: Machine check events logged
[Tue Jul 20 03:26:43 2021] mce: [Hardware Error]: Machine check events logged
[Tue Jul 20 03:41:42 2021] mce: [Hardware Error]: Machine check events logged
[Tue Jul 20 03:42:31 2021] mce: [Hardware Error]: Machine check events logged
[Tue Jul 20 04:00:30 2021] mce: [Hardware Error]: Machine check events logged
[Tue Jul 20 04:00:46 2021] mce: [Hardware Error]: Machine check events logged
[Tue Jul 20 04:01:35 2021] mce_notify_irq: 1 callbacks suppressed
[Tue Jul 20 04:01:35 2021] mce: [Hardware Error]: Machine check events logged
[Tue Jul 20 04:01:52 2021] mce: [Hardware Error]: Machine check events logged
[Tue Jul 20 04:17:56 2021] mce_notify_irq: 2 callbacks suppressed
[Tue Jul 20 04:17:56 2021] mce: [Hardware Error]: Machine check events logged
[Tue Jul 20 04:18:45 2021] mce: [Hardware Error]: Machine check events logged
[Tue Jul 20 04:20:23 2021] mce: [Hardware Error]: Machine check events logged
[Tue Jul 20 04:34:33 2021] mce: [Hardware Error]: Machine check events logged
[Tue Jul 20 04:34:49 2021] mce: [Hardware Error]: Machine check events logged
[Tue Jul 20 04:36:11 2021] mce: [Hardware Error]: Machine check events logged
[Tue Jul 20 04:54:10 2021] mce: [Hardware Error]: Machine check events logged
[Tue Jul 20 04:54:26 2021] mce: [Hardware Error]: Machine check events logged
[Tue Jul 20 04:55:15 2021] mce: [Hardware Error]: Machine check events logged
[Tue Jul 20 04:56:04 2021] mce: [Hardware Error]: Machine check events logged
[Tue Jul 20 04:56:53 2021] mce: [Hardware Error]: Machine check events logged
[Tue Jul 20 05:11:20 2021] mce: [Hardware Error]: Machine check events logged
[Tue Jul 20 05:11:36 2021] mce: [Hardware Error]: Machine check events logged
[Tue Jul 20 05:28:13 2021] mce_notify_irq: 1 callbacks suppressed
[Tue Jul 20 05:28:13 2021] mce: [Hardware Error]: Machine check events logged
[Tue Jul 20 05:30:07 2021] mce: [Hardware Error]: Machine check events logged
[Tue Jul 20 05:32:02 2021] mce: [Hardware Error]: Machine check events logged
[Tue Jul 20 05:49:28 2021] mce: [Hardware Error]: Machine check events logged
[Tue Jul 20 06:09:05 2021] mce: [Hardware Error]: Machine check events logged
[Tue Jul 20 06:10:27 2021] mce: [Hardware Error]: Machine check events logged
[Tue Jul 20 06:45:35 2021] mce: [Hardware Error]: Machine check events logged
[Tue Jul 20 06:45:35 2021] mce: [Hardware Error]: Machine check events logged
[Tue Jul 20 06:47:30 2021] mce: [Hardware Error]: Machine check events logged
[Tue Jul 20 06:49:08 2021] mce: [Hardware Error]: Machine check events logged
[Tue Jul 20 07:03:01 2021] mce: [Hardware Error]: Machine check events logged
[Tue Jul 20 07:04:07 2021] mce: [Hardware Error]: Machine check events logged
[Tue Jul 20 07:35:59 2021] mce: [Hardware Error]: Machine check events logged
[Tue Jul 20 07:51:31 2021] mce: [Hardware Error]: Machine check events logged
[Tue Jul 20 07:52:20 2021] mce: [Hardware Error]: Machine check events logged
[Tue Jul 20 07:53:09 2021] mce: [Hardware Error]: Machine check events logged
[Tue Jul 20 07:53:58 2021] mce: [Hardware Error]: Machine check events logged
[Tue Jul 20 07:54:14 2021] mce: [Hardware Error]: Machine check events logged
[Tue Jul 20 08:08:57 2021] mce: [Hardware Error]: Machine check events logged
[Tue Jul 20 08:09:46 2021] mce: [Hardware Error]: Machine check events logged
[Tue Jul 20 08:10:35 2021] mce_notify_irq: 1 callbacks suppressed
[Tue Jul 20 08:10:35 2021] mce: [Hardware Error]: Machine check events logged
[Tue Jul 20 08:23:23 2021] mce: [Hardware Error]: Machine check events logged
[Tue Jul 20 08:23:39 2021] mce: [Hardware Error]: Machine check events logged
[Tue Jul 20 08:25:01 2021] mce: [Hardware Error]: Machine check events logged
[Tue Jul 20 08:25:34 2021] mce: [Hardware Error]: Machine check events logged
[Tue Jul 20 08:27:12 2021] mce: [Hardware Error]: Machine check events logged
[Tue Jul 20 08:27:45 2021] mce: [Hardware Error]: Machine check events logged
[Tue Jul 20 08:40:17 2021] mce: [Hardware Error]: Machine check events logged
[Tue Jul 20 08:40:49 2021] mce: [Hardware Error]: Machine check events logged
[Tue Jul 20 08:41:38 2021] mce: [Hardware Error]: Machine check events logged
[Tue Jul 20 08:41:55 2021] mce: [Hardware Error]: Machine check events logged
[Tue Jul 20 08:57:43 2021] mce_notify_irq: 2 callbacks suppressed
[Tue Jul 20 08:57:43 2021] mce: [Hardware Error]: Machine check events logged
[Tue Jul 20 08:58:48 2021] mce: [Hardware Error]: Machine check events logged
[Tue Jul 20 08:59:04 2021] mce: [Hardware Error]: Machine check events logged
[Tue Jul 20 09:12:09 2021] mce: [Hardware Error]: Machine check events logged
[Tue Jul 20 09:12:09 2021] mce: [Hardware Error]: Machine check events logged
[Tue Jul 20 09:14:36 2021] mce_notify_irq: 1 callbacks suppressed
[Tue Jul 20 09:14:36 2021] mce: [Hardware Error]: Machine check events logged
[Tue Jul 20 09:14:36 2021] mce: [Hardware Error]: Machine check events logged
[Tue Jul 20 09:25:14 2021] mce: [Hardware Error]: Machine check events logged
[Tue Jul 20 09:26:03 2021] mce: [Hardware Error]: Machine check events logged
[Tue Jul 20 09:26:19 2021] mce_notify_irq: 1 callbacks suppressed
[Tue Jul 20 09:26:19 2021] mce: [Hardware Error]: Machine check events logged
[Tue Jul 20 09:27:41 2021] mce: [Hardware Error]: Machine check events logged
[Tue Jul 20 09:40:29 2021] mce: [Hardware Error]: Machine check events logged
[Tue Jul 20 09:41:18 2021] mce: [Hardware Error]: Machine check events logged
[Tue Jul 20 09:53:17 2021] mce_notify_irq: 1 callbacks suppressed
[Tue Jul 20 09:53:17 2021] mce: [Hardware Error]: Machine check events logged
[Tue Jul 20 09:54:55 2021] mce: [Hardware Error]: Machine check events logged
[Tue Jul 20 10:04:11 2021] mce: [Hardware Error]: Machine check events logged
[Tue Jul 20 10:06:22 2021] mce: [Hardware Error]: Machine check events logged
[Tue Jul 20 10:17:16 2021] mce: [Hardware Error]: Machine check events logged
[Tue Jul 20 10:18:21 2021] mce: [Hardware Error]: Machine check events logged
[Tue Jul 20 10:19:26 2021] mce: [Hardware Error]: Machine check events logged
[Tue Jul 20 10:38:47 2021] mce: [Hardware Error]: Machine check events logged
[Tue Jul 20 11:16:22 2021] mce: [Hardware Error]: Machine check events logged
[Tue Jul 20 11:16:55 2021] mce: [Hardware Error]: Machine check events logged
[Tue Jul 20 11:17:44 2021] mce_notify_irq: 1 callbacks suppressed
[Tue Jul 20 11:17:44 2021] mce: [Hardware Error]: Machine check events logged
[Tue Jul 20 11:18:50 2021] mce: [Hardware Error]: Machine check events logged
[Tue Jul 20 11:31:54 2021] mce: [Hardware Error]: Machine check events logged
[Tue Jul 20 11:32:43 2021] mce: [Hardware Error]: Machine check events logged
[Tue Jul 20 11:33:00 2021] mce: [Hardware Error]: Machine check events logged
[Tue Jul 20 11:33:32 2021] mce: [Hardware Error]: Machine check events logged
[Tue Jul 20 11:35:10 2021] mce: [Hardware Error]: Machine check events logged
[Tue Jul 20 11:50:09 2021] mce: [Hardware Error]: Machine check events logged
[Tue Jul 20 11:50:09 2021] mce: [Hardware Error]: Machine check events logged
[Tue Jul 20 12:05:41 2021] mce: [Hardware Error]: Machine check events logged
[Tue Jul 20 12:06:30 2021] mce: [Hardware Error]: Machine check events logged
[Tue Jul 20 14:03:06 2021] mce: [Hardware Error]: Machine check events logged
[Tue Jul 20 14:04:44 2021] mce: [Hardware Error]: Machine check events logged
[Tue Jul 20 14:07:27 2021] mce: [Hardware Error]: Machine check events logged
[Tue Jul 20 14:08:33 2021] mce: [Hardware Error]: Machine check events logged
[Tue Jul 20 14:19:27 2021] mce: [Hardware Error]: Machine check events logged
[Tue Jul 20 14:20:16 2021] mce: [Hardware Error]: Machine check events logged
[Tue Jul 20 14:30:37 2021] mce: [Hardware Error]: Machine check events logged
[Tue Jul 20 14:30:37 2021] mce: [Hardware Error]: Machine check events logged
[Tue Jul 20 14:31:42 2021] mce_notify_irq: 1 callbacks suppressed
[Tue Jul 20 14:31:42 2021] mce: [Hardware Error]: Machine check events logged
[Tue Jul 20 14:31:58 2021] mce: [Hardware Error]: Machine check events logged
[Tue Jul 20 14:46:25 2021] mce: [Hardware Error]: Machine check events logged
[Tue Jul 20 14:46:57 2021] mce: [Hardware Error]: Machine check events logged
[Tue Jul 20 14:47:30 2021] mce: [Hardware Error]: Machine check events logged
[Tue Jul 20 14:48:52 2021] mce: [Hardware Error]: Machine check events logged
[Tue Jul 20 15:00:35 2021] mce: [Hardware Error]: Machine check events logged
[Tue Jul 20 15:01:56 2021] mce: [Hardware Error]: Machine check events logged
[Tue Jul 20 15:03:18 2021] mce: [Hardware Error]: Machine check events logged
[Tue Jul 20 15:04:56 2021] mce: [Hardware Error]: Machine check events logged
[Tue Jul 20 15:45:42 2021] mce: [Hardware Error]: Machine check events logged
[Tue Jul 20 15:45:42 2021] mce: Uncorrected hardware memory error in user-access at 7ec6952b40
[Tue Jul 20 15:45:42 2021] Memory failure: 0x7ec6952: Killing mysqld:2472 due to hardware memory corruption
[Tue Jul 20 15:45:42 2021] Memory failure: 0x7ec6952: recovery action for dirty LRU page: Recovered
[Tue Jul 20 15:45:42 2021] MCE: Killing mysqld:2741 due to hardware memory corruption fault at 7f96b07fcad8
[Tue Jul 20 23:41:41 2021] Process accounting resumed
[Wed Jul 21 01:02:48 2021] systemd: 27 output lines suppressed due to ratelimiting
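
In the kernel messages above, each "Uncorrected hardware memory error" line gives a physical address, and the matching "Memory failure" line gives what appears to be the page frame number of that address (the address shifted right by the page size). The following is a minimal sketch, not part of the original report, that checks this relation and counts "Machine check events logged" notices per hour. The 4 KiB page size, the regular expressions, and the helper names `page_frame` and `summarize` are assumptions made for illustration; the sample lines are copied from the log.

```python
import re
from collections import Counter

PAGE_SHIFT = 12  # assumption: 4 KiB base pages (the x86-64 default)

ADDR_RE = re.compile(r"memory error in user-access at ([0-9a-f]+)")
KILL_RE = re.compile(r"Memory failure: (0x[0-9a-f]+): Killing ([^:\s]+):(\d+)")
LOGGED_RE = re.compile(
    r"^\[(\w{3} \w{3} +\d+ \d{2}):\d{2}:\d{2} \d{4}\] mce: \[Hardware Error\]"
)


def page_frame(phys_addr_hex):
    """Return the page frame number for a hex physical address string."""
    return int(phys_addr_hex, 16) >> PAGE_SHIFT


def summarize(lines):
    """Collect faulting addresses, killed processes, and MCE notices per hour."""
    addrs, kills, per_hour = [], [], Counter()
    for line in lines:
        m = ADDR_RE.search(line)
        if m:
            addrs.append(m.group(1))
        m = KILL_RE.search(line)
        if m:
            kills.append((int(m.group(1), 16), m.group(2), int(m.group(3))))
        m = LOGGED_RE.match(line)
        if m:
            per_hour[m.group(1)] += 1  # key is "Day Mon DD HH"
    return addrs, kills, per_hour


sample = [
    "[Mon Jul 19 02:49:14 2021] mce: [Hardware Error]: Machine check events logged",
    "[Mon Jul 19 02:49:14 2021] mce: Uncorrected hardware memory error in user-access at 466483c700",
    "[Mon Jul 19 02:49:14 2021] Memory failure: 0x466483c: Killing puppet:980 due to hardware memory corruption",
    "[Tue Jul 20 15:45:42 2021] mce: Uncorrected hardware memory error in user-access at 7ec6952b40",
    "[Tue Jul 20 15:45:42 2021] Memory failure: 0x7ec6952: Killing mysqld:2472 due to hardware memory corruption",
]

addrs, kills, per_hour = summarize(sample)
for addr, (pfn, name, pid) in zip(addrs, kills):
    # The computed frame should equal the frame the kernel reported.
    print(addr, hex(page_frame(addr)), hex(pfn), name, pid)
```

Feeding the full kernel log to `summarize` instead of `sample` would give the per-hour notice counts, which may help show whether the error rate changed before mysqld was killed.
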
Jul 20 22:27:56 db2097 mysqld[2472]: 210720 22:27:56 [ERROR] mysqld got signal 7 ;
Jul 20 22:27:56 db2097 mysqld[2472]: This could be because you hit a bug. It is also possible that this binary
Jul 20 22:27:56 db2097 mysqld[2472]: or one of the libraries it was linked against is corrupt, improperly built,
Jul 20 22:27:56 db2097 mysqld[2472]: or misconfigured. This error can also be caused by malfunctioning hardware.
Jul 20 22:27:56 db2097 mysqld[2472]: To report this bug, see https://mariadb.com/kb/en/reporting-bugs
Jul 20 22:27:56 db2097 mysqld[2472]: We will try our best to scrape up some info that will hopefully help
Jul 20 22:27:56 db2097 mysqld[2472]: diagnose the problem, but since we have already crashed,
Jul 20 22:27:56 db2097 mysqld[2472]: something is definitely wrong and this may fail.
Jul 20 22:27:56 db2097 mysqld[2472]: Server version: 10.1.44-MariaDB
Jul 20 22:27:56 db2097 mysqld[2472]: key_buffer_size=1048576
Jul 20 22:27:56 db2097 mysqld[2472]: read_buffer_size=131072
Jul 20 22:27:56 db2097 mysqld[2472]: max_used_connections=26
Jul 20 22:27:56 db2097 mysqld[2472]: max_threads=252
Jul 20 22:27:56 db2097 mysqld[2472]: thread_count=13
Jul 20 22:27:56 db2097 mysqld[2472]: It is possible that mysqld could use up to
Jul 20 22:27:56 db2097 mysqld[2472]: key_buffer_size + (read_buffer_size + sort_buffer_size)*max_threads = 554588 K  bytes of memory
Jul 20 22:27:56 db2097 mysqld[2472]: Hope that's ok; if not, decrease some variables in the equation.
Jul 20 22:27:56 db2097 mysqld[2472]: Thread pointer: 0x0
Jul 20 22:27:56 db2097 mysqld[2472]: Attempting backtrace. You can use the following information to find out
Jul 20 22:27:56 db2097 mysqld[2472]: where mysqld died. If you see no messages after this, something went
Jul 20 22:27:56 db2097 mysqld[2472]: terribly wrong...
Jul 20 22:27:56 db2097 mysqld[2472]: stack_bottom = 0x0 thread_stack 0x49000
Jul 20 22:27:57 db2097 mysqld[2472]: /opt/wmf-mariadb101/bin/mysqld(my_print_stacktrace+0x2e)[0x564aeecfa74e]
Jul 20 22:28:10 db2097 systemd[1]: mariadb@s1.service: Main process exited, code=killed, status=7/BUS
Jul 20 22:28:10 db2097 systemd[1]: mariadb@s1.service: Unit entered failed state.
Jul 20 22:28:10 db2097 systemd[1]: mariadb@s1.service: Failed with result 'signal'.
Jul 20 22:28:15 db2097 systemd[1]: mariadb@s1.service: Service hold-off time over, scheduling restart.
Jul 20 22:28:15 db2097 systemd[1]: Stopped mariadb database server.
Jul 20 22:28:15 db2097 systemd[1]: Starting mariadb database server...
Jul 20 22:28:15 db2097 mysqld[20447]: 2021-07-20 22:28:15 140525455599872 [Note] /opt/wmf-mariadb101/bin/mysqld (mysqld 10.1.44-MariaDB) starti
Jul 20 22:28:15 db2097 mysqld[20447]: 2021-07-20 22:28:15 140525455599872 [ERROR] Plugin 'unix_socket' already installed
Jul 20 22:28:15 db2097 mysqld[20447]: 2021-07-20 22:28:15 7fcea1dce900 InnoDB: Warning: Using innodb_locks_unsafe_for_binlog is DEPRECATED. Thi
Jul 20 22:28:15 db2097 mysqld[20447]: 2021-07-20 22:28:15 140525455599872 [Note] InnoDB: Using mutexes to ref count buffer pool pages
Jul 20 22:28:15 db2097 mysqld[20447]: 2021-07-20 22:28:15 140525455599872 [Note] InnoDB: The InnoDB memory heap is disabled
Jul 20 22:28:15 db2097 mysqld[20447]: 2021-07-20 22:28:15 140525455599872 [Note] InnoDB: Mutexes and rw_locks use GCC atomic builtins
Jul 20 22:28:15 db2097 mysqld[20447]: 2021-07-20 22:28:15 140525455599872 [Note] InnoDB: GCC builtin __atomic_thread_fence() is used for memory
Jul 20 22:28:15 db2097 mysqld[20447]: 2021-07-20 22:28:15 140525455599872 [Note] InnoDB: Compressed tables use zlib 1.2.11
Jul 20 22:28:15 db2097 mysqld[20447]: 2021-07-20 22:28:15 140525455599872 [Note] InnoDB: Using Linux native AIO
Jul 20 22:28:15 db2097 mysqld[20447]: 2021-07-20 22:28:15 140525455599872 [Note] InnoDB: Using SSE crc32 instructions
Jul 20 22:28:15 db2097 mysqld[20447]: 2021-07-20 22:28:15 140525455599872 [Note] InnoDB: Initializing buffer pool, size = 192.0G
Jul 20 22:28:24 db2097 mysqld[20447]: 2021-07-20 22:28:24 140525455599872 [Note] InnoDB: Completed initialization of buffer pool
Jul 20 22:28:26 db2097 mysqld[20447]: 2021-07-20 22:28:26 140525455599872 [Note] InnoDB: Highest supported file format is Barracuda.
Jul 20 22:28:26 db2097 mysqld[20447]: 2021-07-20 22:28:26 140525455599872 [Note] InnoDB: Starting crash recovery from checkpoint LSN=8075385886
Jul 20 22:28:26 db2097 mysqld[20447]: 2021-07-20 22:28:26 140525455599872 [Note] InnoDB: Restoring possible half-written data pages from the do
Jul 20 22:28:41 db2097 mysqld[20447]: 2021-07-20 22:28:41 140525455599872 [Note] InnoDB: Read redo log up to LSN=80755130922496
Jul 20 22:28:52 db2097 mysqld[20447]: 2021-07-20 22:28:52 140525455599872 [Note] InnoDB: Starting final batch to recover 684596 pages from redo
Jul 20 22:28:56 db2097 mysqld[20447]: 2021-07-20 22:28:56 140304816527104 [Note] InnoDB: To recover: 586848 pages from log
Jul 20 22:29:11 db2097 mysqld[20447]: 2021-07-20 22:29:11 140304816527104 [Note] InnoDB: To recover: 311849 pages from log
Jul 20 22:29:26 db2097 mysqld[20447]: InnoDB: In a MySQL replication slave the last master binlog file
Jul 20 22:29:26 db2097 mysqld[20447]: InnoDB: position 0 935399742, file name db1052-bin.001577
Jul 20 22:29:26 db2097 mysqld[20447]: InnoDB: Last MySQL binlog file position 0 475389594, file name ./db2072-bin.000032
Jul 20 22:29:26 db2097 mysqld[20447]: 2021-07-20 22:29:26 140525455599872 [Note] InnoDB: 128 rollback segment(s) are active.
Jul 20 22:29:26 db2097 mysqld[20447]: 2021-07-20 22:29:26 140525455599872 [Note] InnoDB: Waiting for purge to start
Jul 20 22:29:26 db2097 mysqld[20447]: 2021-07-20 22:29:26 140525455599872 [Note] InnoDB:  Percona XtraDB (http://www.percona.com) 5.6.46-86.2 s
Jul 20 22:30:00 db2097 mysqld[20447]: 2021-07-20 22:30:00 140300525733632 [Note] InnoDB: Dumping buffer pool(s) not yet started
Jul 20 22:30:00 db2097 mysqld[20447]: 2021-07-20 22:30:00 7f9a42ff87002021-07-20 22:30:00 140525455599872 [Note] Plugin 'FEEDBACK' is disabled.
Jul 20 22:30:00 db2097 mysqld[20447]:  InnoDB: Loading buffer pool(s) from .//ib_buffer_pool
Jul 20 22:30:00 db2097 mysqld[20447]: 2021-07-20 22:30:00 140525455599872 [Note] Recovering after a crash using tc.log
Jul 20 22:30:00 db2097 mysqld[20447]: 2021-07-20 22:30:00 140525455599872 [Note] Starting crash recovery...
Jul 20 22:30:00 db2097 mysqld[20447]: 2021-07-20 22:30:00 140525455599872 [Note] Crash recovery finished.
Jul 20 22:30:00 db2097 mysqld[20447]: 2021-07-20 22:30:00 140525455599872 [Note] Server socket created on IP: '::'.
Jul 20 22:30:00 db2097 mysqld[20447]: 2021-07-20 22:30:00 140525455599872 [Note] Server socket created on IP: '::'.
Jul 20 22:30:00 db2097 mysqld[20447]: 2021-07-20 22:30:00 140525455599872 [ERROR] mysqld: Table './mysql/event' is marked as crashed and should
Jul 20 22:30:00 db2097 mysqld[20447]: 2021-07-20 22:30:00 140525455599872 [Warning] Checking table:   './mysql/event'
Jul 20 22:30:00 db2097 mysqld[20447]: 2021-07-20 22:30:00 140525455599872 [ERROR] mysql.event: 1 client is using or hasn't closed the table pro
Jul 20 22:30:00 db2097 mysqld[20447]: 2021-07-20 22:30:00 140525455070976 [Note] Event Scheduler: scheduler thread started with id 1
Jul 20 22:30:00 db2097 mysqld[20447]: 2021-07-20 22:30:00 140525455599872 [Warning] Neither --relay-log nor --relay-log-index were used; so rep
Jul 20 22:30:00 db2097 mysqld[20447]: 2021-07-20 22:30:00 140525455599872 [Note] /opt/wmf-mariadb101/bin/mysqld: ready for connections.
Jul 20 22:30:00 db2097 mysqld[20447]: Version: '10.1.44-MariaDB'  socket: '/run/mysqld/mysqld.s1.sock'  port: 3311  MariaDB Server
Jul 20 22:30:00 db2097 systemd[1]: Started mariadb database server.
Jul 20 23:16:37 db2097 mysqld[20447]: 2021-07-20 23:16:37 7f9a42ff8700 InnoDB: Buffer pool(s) load completed at 210720 23:16:37
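
For reference, the restart timeline can be read off the journal excerpt above. The following is only a rough sketch: the three timestamps are copied by hand from the log lines (process killed, "ready for connections", buffer pool load completed), and the names `events`, `downtime` and `warmup` are made up for this illustration.

```python
from datetime import datetime

# Timestamps copied by hand from the journal excerpt above.
events = {
    "killed": "2021-07-20 22:28:10",              # Main process exited, status=7/BUS
    "ready": "2021-07-20 22:30:00",               # mysqld: ready for connections
    "buffer_pool_loaded": "2021-07-20 23:16:37",  # Buffer pool(s) load completed
}

FMT = "%Y-%m-%d %H:%M:%S"
times = {name: datetime.strptime(stamp, FMT) for name, stamp in events.items()}

# Time during which the s1 instance was not accepting connections.
downtime = times["ready"] - times["killed"]
# Time after startup until the buffer pool dump was fully reloaded.
warmup = times["buffer_pool_loaded"] - times["ready"]

print(f"downtime: {downtime.total_seconds():.0f} s")
print(f"buffer pool warm-up: {warmup}")
```

By this reading the instance was unavailable for a little under two minutes (crash recovery included), and needed roughly another three quarters of an hour to reload the buffer pool.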

Event Timeline

jcrespo triaged this task as High priority.

The s2 mariadb log on the same host seems clean.

These are some weird hw logs:

/map1/log1/record352
  Targets
  Properties
    number=352
    severity=Critical
    date=07/19/2021
    time=09:23:59
    description=The server could not be powered on or a server critical error occurred
  Verbs
    cd version exit show

/map1/log1/record353
  Targets
  Properties
    number=353
    severity=Critical
    date=07/19/2021
    time=09:24:06
    description=The server could not be powered on or a server critical error occurred
  Verbs
    cd version exit show

/map1/log1/record354
  Targets
  Properties
    number=354
    severity=Critical
    date=07/19/2021
    time=09:24:10
    description=The server could not be powered on or a server critical error occurred
  Verbs
    cd version exit show

From the web interface:

"ID","Severity","Class","Description","Last Update","Count","Category",
"85","Critical","CPU","Uncorrectable Machine Check Exception (Processor 1, APIC ID 0x00000004, Bank 0x00000001, Status 0xBD800000'00100134, Address 0x0000007E'C6952B40, Misc 0x00000000'00000086). ","07/20/2021 22:25:37","1","Hardware",
"84","Critical","UEFI","DIMM Failure - Uncorrectable Memory Error (Processor 2, DIMM 4)","07/19/2021 09:24:08","1","Hardware",
"83","Critical","CPU","Uncorrectable Machine Check Exception (Processor 2, APIC ID 0x00000022, Bank 0x00000008, Status 0xBC000000'01010091, Address 0x0000007E'C6952B40, Misc 0x200406C3'0FC02086). ","07/19/2021 09:24:07","1","Hardware",
"82","Critical","CPU","Uncorrectable Machine Check Exception (Processor 2, APIC ID 0x00000022, Bank 0x00000008, Status 0xBC000000'01010091, Address 0x00000046'6483C700, Misc 0x200406C3'08002086). ","07/19/2021 09:24:05","1","Hardware",
"81","Critical","CPU","Uncorrectable Machine Check Exception (Processor 1, APIC ID 0x00000002, Bank 0x00000001, Status 0xBD800000'00100134, Address 0x00000046'6483C700, Misc 0x00000000'00000086). ","07/19/2021 09:24:04","1","Hardware",
"80","Critical","CPU","Uncorrectable Machine Check Exception (Processor 2, APIC ID 0x00000022, Bank 0x00000008, Status 0xBC000000'01010091, Address 0x00000063'34924F00, Misc 0x200402C1'28002086). ","07/19/2021 09:24:02","1","Hardware",
"79","Critical","UEFI","DIMM Failure - Uncorrectable Memory Error (Processor 2, DIMM 3)","07/19/2021 09:24:05","3","Hardware",
"78","Critical","CPU","Uncorrectable Machine Check Exception (Processor 2, APIC ID 0x00000022, Bank 0x00000008, Status 0xBC000000'01010091, Address 0x0000006C'316F9300, Misc 0x200401C0'8B602086). ","07/19/2021 09:23:59","1","Hardware",

I am going to reboot, see what happens and then send to dcops for advice.
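
As a side note, the Status values in the machine check records above can be decoded by hand. The sketch below assumes the flag bit positions of IA32_MCi_STATUS as documented in the Intel SDM (VAL=63, OVER=62, UC=61, EN=60, MISCV=59, ADDRV=58, PCC=57, S=56, AR=55), assumes the platform supports software error recovery (otherwise the S and AR bits are not meaningful), and assumes 4 KiB pages for the page frame calculation. The helper names are hypothetical and not part of any tool used on this host.

```python
# Flag bit positions in IA32_MCi_STATUS (assumed from the Intel SDM).
FLAGS = {
    "VAL": 63, "OVER": 62, "UC": 61, "EN": 60, "MISCV": 59,
    "ADDRV": 58, "PCC": 57, "S": 56, "AR": 55,
}

def parse_hex(text):
    """Parse iLO-style hex such as 0xBD800000'00100134 into an int."""
    return int(text.replace("'", ""), 16)

def decode_flags(status):
    """Return a dict mapping each flag name to True/False."""
    return {name: bool((status >> bit) & 1) for name, bit in FLAGS.items()}

def classify(flags):
    """Rough error class; S/AR are only valid with software error recovery."""
    if not flags["UC"]:
        return "corrected"
    if flags["PCC"]:
        return "fatal (processor context corrupt)"
    if flags["S"] and flags["AR"]:
        return "SRAR"   # software recoverable, action required
    if flags["S"]:
        return "SRAO"   # software recoverable, action optional
    return "UCNA"       # uncorrected, no action required

def page_frame(address, page_shift=12):
    """Page frame number, assuming 4 KiB pages."""
    return address >> page_shift

# Values copied from records 81 and 82 above.
for status_text in ("0xBD800000'00100134", "0xBC000000'01010091"):
    flags = decode_flags(parse_hex(status_text))
    print(status_text, classify(flags))

print(hex(page_frame(parse_hex("0x00000046'6483C700"))))
```

Under these assumptions the Processor 1 / Bank 1 records decode as "action required" errors, which would be consistent with the kernel killing the process that touched the page, while the Processor 2 / Bank 8 records decode as uncorrected errors with no action required. The address from records 81 and 82 maps to page frame 0x466483c, the same page named in the "Memory failure" lines in the description.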

jcrespo added a project: ops-codfw.
jcrespo added a subscriber: Papaul.

As expected, the faulty memory module is only properly detected on reboot.

free -g
              total        used        free      shared  buff/cache   available
Mem:            472           1         470           0           0         468
Swap:             7           0           7

@Papaul let us know what we can do about the bad DIMM:

Memory 	 Degraded

PROC 2 DIMM 3 	 Map Out Configuration 	32.00 GB 	2666 MHz 	RDIMM
87	UEFI	One or more DIMMs have been mapped out due to a memory error, resulting in an unbalanced memory configuration across memory controllers. This may result in non-optimal memory performance.	07/21/2021 09:39:47	1	Configuration
86	UEFI	Uncorrectable Memory Error Threshold Exceeded (Processor 2, DIMM 3). The DIMM is mapped out and is currently not available.	07/21/2021 09:36:41	2	Hardware

Here is the Active Health log: https://drive.google.com/file/d/1LyZJiDk_u1jKhHdskDULCgUOnuUsJgFB/view?usp=sharing . Please ping me in advance if/when you intend to shut down the server, as I will now try to put it back into service with reduced memory.

@jcrespo I will ask HP to send us a new DIMM.

Case Reference ID: 5357298848
Status: Case is generated and in Progress
Subject: HPE ProLiant DL360 Gen10 - DIMM Failed
Product: HPE ProLiant DL360 Gen10 8SFF Configure-to-order Server
Product Number: 867959-B21
Serial number:

In reference to your Hewlett Packard Enterprise Support Case Number 5357298848, the following Customer Self Repair Part has been shipped:

Part/s shipped: 850881-001
Part description: SPS-DIMM 32GB PC4-2666V-R 2Gx4
Carrier Name: UPSN
Tracking Number: 1Z4295AR0115473287

Mentioned in SAL (#wikimedia-operations) [2021-07-22T15:14:35Z] <jynus> shutdown db2097 for hw servicing T287072

Looking good:

$ free -g
              total        used        free      shared  buff/cache   available
Mem:            503           1         502           0           0         500
Swap:             7           0           7

No errors on dmesg.
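
A quick sanity check on the numbers, as a sketch only: the totals are copied from the two `free -g` outputs on this task, the module count and size come from the iLO DIMM inventory (PROC 1 and PROC 2, DIMM 3 to 10, 32 GB each), and the variable names are made up for this illustration. `free -g` rounds down and the kernel reserves some memory, so an exact match with the nominal sizes is not expected.

```python
DIMM_GB = 32        # per-module size reported by iLO
DIMM_COUNT = 16     # PROC 1 DIMM 3-10 plus PROC 2 DIMM 3-10

installed_gib = DIMM_GB * DIMM_COUNT   # nominal installed memory

total_degraded_gib = 472   # "free -g" total with PROC 2 DIMM 3 mapped out
total_repaired_gib = 503   # "free -g" total after the DIMM replacement

# Memory regained by the replacement, as seen by the OS.
regained_gib = total_repaired_gib - total_degraded_gib

print(f"nominal installed: {installed_gib} GB")
print(f"regained after replacement: {regained_gib} GiB (one {DIMM_GB} GB module expected)")
```

The regained amount is within rounding of a single 32 GB module, which matches exactly one DIMM having been mapped out.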

Return DIMM information

I double-checked that all DIMMs are "Good, In Use". Thank you, @Papaul, for the quick response!

PROC 1 DIMM 3 	Good, In Use 	32.00 GB 	2666 MHz 	RDIMM
PROC 1 DIMM 4 	Good, In Use 	32.00 GB 	2666 MHz 	RDIMM
PROC 1 DIMM 5 	Good, In Use 	32.00 GB 	2666 MHz 	RDIMM
PROC 1 DIMM 6 	Good, In Use 	32.00 GB 	2666 MHz 	RDIMM
PROC 1 DIMM 7 	Good, In Use 	32.00 GB 	2666 MHz 	RDIMM
PROC 1 DIMM 8 	Good, In Use 	32.00 GB 	2666 MHz 	RDIMM
PROC 1 DIMM 9 	Good, In Use 	32.00 GB 	2666 MHz 	RDIMM
PROC 1 DIMM 10 	Good, In Use 	32.00 GB 	2666 MHz 	RDIMM
PROC 2 DIMM 3 	Good, In Use 	32.00 GB 	2666 MHz 	RDIMM
PROC 2 DIMM 4 	Good, In Use 	32.00 GB 	2666 MHz 	RDIMM
PROC 2 DIMM 5 	Good, In Use 	32.00 GB 	2666 MHz 	RDIMM
PROC 2 DIMM 6 	Good, In Use 	32.00 GB 	2666 MHz 	RDIMM
PROC 2 DIMM 7 	Good, In Use 	32.00 GB 	2666 MHz 	RDIMM
PROC 2 DIMM 8 	Good, In Use 	32.00 GB 	2666 MHz 	RDIMM
PROC 2 DIMM 9 	Good, In Use 	32.00 GB 	2666 MHz 	RDIMM
PROC 2 DIMM 10 	Good, In Use 	32.00 GB 	2666 MHz 	RDIMM