wtp1032 bootlooping on CPU error
Closed, ResolvedPublic

Description

wtp1032's racadm lclog view shows the same three events (CPU0000 internal error, SYS1003 CPU reset, RAC0703 hard reset) repeating about every 25 seconds:

/admin1-> racadm lclog view
SeqNumber = 368
Message ID = RAC0703
Category = Audit
AgentID = RACLOG
Severity = Information
Timestamp = 2020-06-02 15:09:14
Message = Requested system hardreset.
FQDD = iDRAC.Embedded.1
--------------------------------------------------------------------------------
SeqNumber = 367
Message ID = SYS1003
Category = Audit
AgentID = DE
Severity = Information
Timestamp = 2020-06-02 15:09:14
Message = System CPU Resetting.
FQDD = iDRAC.Embedded.1#HostPowerCtrl
--------------------------------------------------------------------------------
SeqNumber = 366
Message ID = CPU0000
Category = System
AgentID = iDRAC
Severity = Information
Timestamp = 2020-06-02 15:09:14
Message = Internal error has occurred check for additional logs.
--------------------------------------------------------------------------------
SeqNumber = 365
Message ID = RAC0703
Category = Audit
AgentID = RACLOG
Severity = Information
Timestamp = 2020-06-02 15:08:49
Message = Requested system hardreset.
FQDD = iDRAC.Embedded.1
--------------------------------------------------------------------------------
SeqNumber = 364
Message ID = SYS1003
Category = Audit
AgentID = DE
Severity = Information
Timestamp = 2020-06-02 15:08:49
Message = System CPU Resetting.
FQDD = iDRAC.Embedded.1#HostPowerCtrl
--------------------------------------------------------------------------------
SeqNumber = 363
Message ID = CPU0000
Category = System
AgentID = iDRAC
Severity = Information
Timestamp = 2020-06-02 15:08:48
Message = Internal error has occurred check for additional logs.
--------------------------------------------------------------------------------
SeqNumber = 362
Message ID = RAC0703
Category = Audit
AgentID = RACLOG
Severity = Information
Timestamp = 2020-06-02 15:08:24
Message = Requested system hardreset.
FQDD = iDRAC.Embedded.1
--------------------------------------------------------------------------------
SeqNumber = 361
Message ID = SYS1003
Category = Audit
AgentID = DE
Severity = Information
Timestamp = 2020-06-02 15:08:24
Message = System CPU Resetting.
FQDD = iDRAC.Embedded.1#HostPowerCtrl
--------------------------------------------------------------------------------
SeqNumber = 360
Message ID = CPU0000
Category = System
AgentID = iDRAC
Severity = Information
Timestamp = 2020-06-02 15:08:23
Message = Internal error has occurred check for additional logs.
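
The reset cadence in the log above can be checked with a small script rather than by eye. This is only a rough sketch: it assumes the `key = value` layout with dashed separator lines shown in this paste, the function names `parse_lclog` and `reset_intervals` are made up for illustration, and the sample is trimmed from the CPU0000 entries above.

```python
import re
from datetime import datetime

# Trimmed sample: only the CPU0000 entries from the log in this task.
SAMPLE = """\
SeqNumber = 366
Message ID = CPU0000
Timestamp = 2020-06-02 15:09:14
Message = Internal error has occurred check for additional logs.
--------------------------------------------------------------------------------
SeqNumber = 363
Message ID = CPU0000
Timestamp = 2020-06-02 15:08:48
Message = Internal error has occurred check for additional logs.
--------------------------------------------------------------------------------
SeqNumber = 360
Message ID = CPU0000
Timestamp = 2020-06-02 15:08:23
Message = Internal error has occurred check for additional logs.
"""


def parse_lclog(text):
    """Split lclog-style text into a list of {field: value} dicts (hypothetical helper)."""
    records = []
    # Records are separated by lines made up of dashes.
    for chunk in re.split(r"^-{10,}[ \t]*$", text, flags=re.MULTILINE):
        record = {}
        for line in chunk.splitlines():
            key, sep, value = line.partition(" = ")
            if sep:  # skip lines that are not "key = value" (e.g. the prompt line)
                record[key.strip()] = value.strip()
        if record:
            records.append(record)
    return records


def reset_intervals(records, message_id="CPU0000"):
    """Seconds between consecutive entries with the given Message ID, oldest first."""
    stamps = sorted(
        datetime.strptime(r["Timestamp"], "%Y-%m-%d %H:%M:%S")
        for r in records
        if r.get("Message ID") == message_id
    )
    return [int((b - a).total_seconds()) for a, b in zip(stamps, stamps[1:])]


records = parse_lclog(SAMPLE)
print(reset_intervals(records))  # [25, 26]
```

Intervals this short and this regular are consistent with the host resetting almost immediately after each power-on, which matches the bootloop described here.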

I tried a manual power cycle but that didn't fix it.

For now it is depooled, pending dcops taking a look. Changed netbox status to FAILED.

Event Timeline

wiki_willy added subscribers: Cmjohnson, wiki_willy.

@Cmjohnson - looks like the warranty on this one just ended a few months ago, so just let me know whatever you find during troubleshooting, and we can order the part. Thanks, Willy

Machine seems to still be in the dsh group; can this be fixed?

The server is out of warranty. I reseated both CPUs and cleared the system event log, and the server booted okay. I will resolve this for now; please reopen if the issue comes back.