mw1271 went down at 2018-01-10 21:37. It looks like a hardware error of some kind: there is nothing in the OS logs (no kernel error etc.). However, earlier that day a hardware error was logged:
Jan 10 10:57:12 mw1271 kernel: [ 8926.921882] {1}[Hardware Error]: Hardware error from APEI Generic Hardware Error Source: 65534
Jan 10 10:57:12 mw1271 kernel: [ 8926.921885] {1}[Hardware Error]: It has been corrected by h/w and requires no further action
Jan 10 10:57:12 mw1271 kernel: [ 8926.921886] {1}[Hardware Error]: event severity: corrected
Jan 10 10:57:12 mw1271 kernel: [ 8926.921888] {1}[Hardware Error]: Error 0, type: corrected
Jan 10 10:57:12 mw1271 kernel: [ 8926.921891] {1}[Hardware Error]: section type: unknown, 330f1140-72a5-11df-9690-0002a5d5c51b
Jan 10 10:57:22 mw1271 kernel: [ 8936.318277] {2}[Hardware Error]: Hardware error from APEI Generic Hardware Error Source: 65534
Jan 10 10:57:22 mw1271 kernel: [ 8936.318281] {2}[Hardware Error]: It has been corrected by h/w and requires no further action
Jan 10 10:57:22 mw1271 kernel: [ 8936.318282] {2}[Hardware Error]: event severity: corrected
Jan 10 10:57:22 mw1271 kernel: [ 8936.318284] {2}[Hardware Error]: Error 0, type: corrected
Jan 10 10:57:22 mw1271 kernel: [ 8936.318287] {2}[Hardware Error]: section type: unknown, 330f1140-72a5-11df-9690-0002a5d5c51b
Can you please run hardware diagnostics? The server is still under warranty until 2019.
I have depooled the server, so you can power it down at any time.