
elastic1060 reported errors in getsel
Closed, Resolved · Public · 2 Estimated Story Points


elastic1060 reports the following errors in racadm getsel, after some hours of downtime:

Record:      17                                                                                                                          
Date/Time:   03/27/2021 08:03:50                                                                                                         
Source:      system                                                                                                                      
Severity:    Ok                                                                                                                          
Description: An OEM diagnostic event occurred.                                                                                           
Record:      18                                                                                                                          
Date/Time:   03/27/2021 08:03:50                                                                                                         
Source:      system                                                                                                                      
Severity:    Ok                                                                                                                          
Description: An OEM diagnostic event occurred.                                                                                           
Record:      19                                                                                                                          
Date/Time:   03/27/2021 08:03:50                                                                                                         
Source:      system                                                                                                                      
Severity:    Ok                                                                                                                          
Description: An OEM diagnostic event occurred.                                                                                           
Record:      20                                                                                                                          
Date/Time:   03/27/2021 07:08:30                                                                                                         
Source:      system                                                                                                                      
Severity:    Ok                                                                                                                          
Description: A problem was detected related to the previous server boot.                                                                 
Record:      21                                                                                                                          
Date/Time:   03/27/2021 07:08:30                                                                                                         
Source:      system                                                                                                                      
Severity:    Critical                                                                                                                    
Description: Multi-bit memory errors detected on a memory device at location(s) DIMM_A7.                                                 
Record:      22                                                                                                                          
Date/Time:   03/27/2021 07:08:30                                                                                                         
Source:      system                                                                                                                      
Severity:    Critical                                                                                                                    
Description: CPU 1 machine check error detected.                                                                                         
Record:      23                                                                                                                          
Date/Time:   03/27/2021 07:08:30                                                                                                         
Source:      system                                                                                                                      
Severity:    Ok                                                                                                                          
Description: An OEM diagnostic event occurred.                                                                                           
Record:      24                                                                                                                          
Date/Time:   03/27/2021 07:08:30                                                                                                         
Source:      system                                                                                                                      
Severity:    Ok                                                                                                                          
Description: An OEM diagnostic event occurred.                                                                                           
Record:      25                                                                                                                          
Date/Time:   03/27/2021 07:08:31                                                                                                         
Source:      system                                                                                                                      
Severity:    Ok                                                                                                                          
Description: An OEM diagnostic event occurred.                                                                                           
Record:      26                                                                                                                          
Date/Time:   03/27/2021 07:08:31                                                                                                         
Source:      system                                                                                                                      
Severity:    Ok                                                                                                                          
Description: An OEM diagnostic event occurred.                                                                                           
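For triage, the Critical entries can be pulled out of a log like the one above with a small script. This is a minimal sketch, assuming the output keeps the `Key: value` layout shown here; the function names `parse_sel` and `critical_records` are hypothetical and not part of racadm or any existing tooling.

```python
import re

# Hypothetical helpers: not part of racadm, just a sketch for the
# "Key: value" layout seen in the pasted getsel output.
LINE_RE = re.compile(r"^\s*([A-Za-z/]+):\s*(.*?)\s*$")


def parse_sel(text):
    """Parse getsel-style text into a list of dicts, one per record."""
    records = []
    current = None
    for line in text.splitlines():
        match = LINE_RE.match(line)
        if not match:
            continue  # skip blank or unrecognised lines
        key, value = match.groups()
        if key == "Record":
            # A "Record:" line starts a new entry.
            current = {}
            records.append(current)
        if current is not None:
            current[key] = value
    return records


def critical_records(text):
    """Return only the records whose Severity is Critical."""
    return [r for r in parse_sel(text) if r.get("Severity") == "Critical"]


# Three records copied from the log in this task.
SAMPLE = """\
Record:      20
Date/Time:   03/27/2021 07:08:30
Source:      system
Severity:    Ok
Description: A problem was detected related to the previous server boot.
Record:      21
Date/Time:   03/27/2021 07:08:30
Source:      system
Severity:    Critical
Description: Multi-bit memory errors detected on a memory device at location(s) DIMM_A7.
Record:      22
Date/Time:   03/27/2021 07:08:30
Source:      system
Severity:    Critical
Description: CPU 1 machine check error detected.
"""

if __name__ == "__main__":
    for rec in critical_records(SAMPLE):
        print(rec["Record"], rec["Date/Time"], rec["Description"])
```

Run against the full log, this would surface only records 21 and 22, the DIMM_A7 multi-bit memory error and the CPU 1 machine check.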

Event Timeline

The host has been up since the power cycle without logging further errors, but I'll defer to @RKemper and @Gehel to close or investigate further :)

Gehel set the point value for this task to 2. (Mar 29 2021, 3:35 PM)
jijiki triaged this task as High priority. (Mar 29 2021, 9:14 PM)
Cmjohnson claimed this task.
Cmjohnson subscribed.

The DIMM only reported the error on that one day, and the error has not returned. I am clearing the system log and resolving this for now; if the issue persists, please re-open.