We have had several crashes recently that are suspected to be caused by an unknown kernel bug.
Dec 16 04:46:05 labstore1001 kernel: [4198858.858163] INFO: rcu_sched self-detected stall on CPU { 6} (t=5251 jiffies g=572267666 c=572267665 q=1115175)
Dec 16 04:46:05 labstore1001 kernel: [4198858.866170] INFO: rcu_sched detected stalls on CPUs/tasks: { 6} (detected by 0, t=5252 jiffies, g=572267666, c=572267665, q=1115175)
Dec 16 04:46:05 labstore1001 kernel: [4198858.866170] Task dump for CPU 6:
Dec 16 04:46:05 labstore1001 kernel: [4198858.866173] kworker/6:2 R running task 0 9169 2 0x00000008
Dec 16 04:46:05 labstore1001 kernel: [4198858.866194] Workqueue: kcopyd do_work [dm_mod]
Dec 16 04:46:05 labstore1001 kernel: [4198858.866196] ffff88081a5ae800 ffff88081aba5de0 ffff88081a5ae800 ffff88081aba5de0
Dec 16 04:46:05 labstore1001 kernel: [4198858.866197] ffff8807082fb900 0000000000000286 ffffffffa00bbff1 ffff88041c1aa000
Dec 16 04:46:05 labstore1001 kernel: [4198858.866198] ffff880400001000 0000000000000001 ffff88081a5ae820 0000000000000001
Dec 16 04:46:05 labstore1001 kernel: [4198858.866199] Call Trace:
Dec 16 04:46:05 labstore1001 kernel: [4198858.866207] [<ffffffffa00bbff1>] ? __make_request+0xc11/0xe20 [raid10]
Dec 16 04:46:05 labstore1001 kernel: [4198858.866213] [<ffffffff8109b847>] ? select_idle_sibling+0x27/0x120
Dec 16 04:46:05 labstore1001 kernel: [4198858.866215] [<ffffffff8109eaf9>] ? enqueue_task_fair+0x309/0xf40
Dec 16 04:46:05 labstore1001 kernel: [4198858.866220] [<ffffffff81155d88>] ? free_one_page+0x78/0x480
Dec 16 04:46:05 labstore1001 kernel: [4198858.866224] [<ffffffff8101d1d6>] ? native_sched_clock+0x26/0x90
Dec 16 04:46:05 labstore1001 kernel: [4198858.866226] [<ffffffff8101d245>] ? sched_clock+0x5/0x10
Dec 16 04:46:05 labstore1001 kernel: [4198858.866229] [<ffffffffa0563c50>] ? copy_callback+0x40/0x100 [dm_snapshot]
Dec 16 04:46:05 labstore1001 kernel: [4198858.866231] [<ffffffffa0563c10>] ? complete_exception+0x50/0x50 [dm_snapshot]
Dec 16 04:46:05 labstore1001 kernel: [4198858.866235] [<ffffffffa00cdd71>] ? run_complete_job+0x61/0xb0 [dm_mod]
Dec 16 04:53:09 labstore1001 kernel: [4199283.625421] NMI watchdog: BUG: soft lockup - CPU#6 stuck for 22s! [kworker/6:2:9169]
Dec 16 04:53:37 labstore1001 kernel: [4199311.647538] NMI watchdog: BUG: soft lockup - CPU#6 stuck for 22s! [kworker/6:2:9169]
Dec 16 04:54:13 labstore1001 kernel: [4199347.675975] NMI watchdog: BUG: soft lockup - CPU#6 stuck for 23s! [kworker/6:2:9169]
Dec 16 04:55:57 labstore1001 kernel: [4199451.758125] NMI watchdog: BUG: soft lockup - CPU#6 stuck for 23s! [kworker/6:2:9169]
Dec 16 04:57:41 labstore1001 kernel: [4199555.840273] NMI watchdog: BUG: soft lockup - CPU#6 stuck for 23s! [kworker/6:2:9169]
Dec 16 04:58:09 labstore1001 kernel: [4199583.862390] NMI watchdog: BUG: soft lockup - CPU#6 stuck for 23s! [kworker/6:2:9169]
Dec 16 04:59:33 labstore1001 kernel: [4199667.928740] NMI watchdog: BUG: soft lockup - CPU#6 stuck for 22s! [kworker/6:2:9169]
Dec 16 05:01:13 labstore1001 kernel: [4199768.007731] NMI watchdog: BUG: soft lockup - CPU#6 stuck for 22s! [kworker/6:2:9169]
Dec 16 05:01:41 labstore1001 kernel: [4199796.029849] NMI watchdog: BUG: soft lockup - CPU#6 stuck for 22s! [kworker/6:2:9169]
Dec 16 05:04:09 labstore1001 kernel: [4199944.146753] NMI watchdog: BUG: soft lockup - CPU#6 stuck for 22s! [kworker/6:2:9169]
Dec 16 05:06:05 labstore1001 kernel: [4200060.238384] NMI watchdog: BUG: soft lockup - CPU#6 stuck for 22s! [kworker/6:2:9169]
Dec 16 05:08:53 labstore1001 kernel: [4200228.371088] NMI watchdog: BUG: soft lockup - CPU#6 stuck for 22s! [kworker/6:2:9169
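When comparing this incident with the earlier crashes, it may help to condense the repeated soft lockup messages into one line per CPU and task. The following is only a rough sketch, not an existing tool: the function name `summarize_lockups` and the regular expression are assumptions written against the message format shown in the log above, and other kernel versions may word the message differently.

```python
import re

# Assumed pattern, derived from the "NMI watchdog: BUG: soft lockup" lines
# in the log above. It captures the kernel uptime stamp, the CPU number,
# the reported stuck duration, and the task name (closing bracket optional,
# so a truncated final line still matches).
LOCKUP_RE = re.compile(
    r"\[\s*(?P<uptime>\d+\.\d+)\] NMI watchdog: BUG: soft lockup - "
    r"CPU#(?P<cpu>\d+) stuck for (?P<secs>\d+)s! \[(?P<task>[^\]\s]+)"
)


def summarize_lockups(log_text):
    """Return {(cpu, task): (count, first_uptime, last_uptime)}.

    Works on the whole log text at once, so it does not matter whether
    the entries are on separate lines or run together.
    """
    summary = {}
    for m in LOCKUP_RE.finditer(log_text):
        key = (int(m.group("cpu")), m.group("task"))
        uptime = float(m.group("uptime"))
        count, first, _ = summary.get(key, (0, uptime, uptime))
        summary[key] = (count + 1, first, uptime)
    return summary


if __name__ == "__main__":
    import sys

    # Hypothetical usage: pipe a saved copy of the kernel log into the script.
    for (cpu, task), (count, first, last) in summarize_lockups(sys.stdin.read()).items():
        print(f"CPU#{cpu} {task}: {count} reports over {last - first:.0f}s of uptime")
```

The difference between the first and last uptime stamps gives a lower bound on how long the worker stayed stuck, since the watchdog only reports periodically.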