- Provide FQDN of system.
- If other than a hard drive issue, please depool the machine (and confirm that it’s been depooled) for us to work on it. If not, please provide a time frame for us to take the machine down.
- Put system into a failed state in Netbox. - The host was staged, so it's not online for use yet, and it is offline in Icinga.
- Provide urgency of request, along with justification (redundancy, dependencies, etc.) - 1 of 14 newly staged hosts, so it is likely not highly urgent, but it requires sub-team feedback.
- Describe issue and/or attach hardware failure log. (Refer to https://wikitech.wikimedia.org/wiki/Dc-operations/Hardware_Troubleshooting_Runbook if you need help)
- Assign correct project tag and appropriate owner (based on above). Also, please ensure the service owners of the host(s) are added as subscribers to provide any additional input.
This is a newly racked host, so it may just need the memory reseated to clear this up, since DIMMs can come unseated during shipment. If reseating doesn't fix it, then we'll need to put in a self-dispatch for a new DIMM.
Since the memory error shows on POST, we'll know right away if it clears up.
UEFI0339: The Dual Inline Memory Module (DIMM) in the memory slot A2 is
disabled because of initialization errors caused by uncorrectable memory
errors, invalid configuration, and others.
Check the System Event Log (SEL) or the Lifecycle Controller Log and replace
the identified DIMM.
UEFI0058: Uncorrectable Memory Error has occurred because a Dual Inline Memory
Module (DIMM) is not functioning.
Check the System Event Log (SEL) to identify the non-functioning DIMM, and then
replace it.
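For anyone triaging several of the newly staged hosts at once, the POST messages above can be summarized with a short script. This is only a minimal sketch: the helper name `summarize_post_errors` is hypothetical, and the patterns assume the message wording matches the two entries quoted in this task (a `UEFInnnn:` code prefix and the phrase "memory slot" followed by the slot label).

```python
import re

# Assumed patterns, based only on the two messages quoted in this task.
CODE_RE = re.compile(r"\b(UEFI\d{4}):")
SLOT_RE = re.compile(r"memory slot\s+([A-Z]\d+)")


def summarize_post_errors(text):
    """Return (message codes, DIMM slots) found in pasted POST output."""
    # Collapse hard line wraps so a phrase split across lines still matches.
    flat = " ".join(text.split())
    codes = CODE_RE.findall(flat)
    slots = sorted(set(SLOT_RE.findall(flat)))
    return codes, slots


sample = """UEFI0339: The Dual Inline Memory Module (DIMM) in the memory slot A2 is
disabled because of initialization errors caused by uncorrectable memory
errors, invalid configuration, and others.
UEFI0058: Uncorrectable Memory Error has occurred because a Dual Inline Memory
Module (DIMM) is not functioning."""

codes, slots = summarize_post_errors(sample)
print("codes:", codes)
print("slots:", slots)
```

The slot label is the piece needed for the reseat or the replacement request; the SEL or Lifecycle Controller Log remains the authoritative source if the POST text is ambiguous.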
Dell request for a new DIMM placed: "You have successfully submitted request SR1091181415."
Mentioned in SAL (#wikimedia-operations) [2022-06-02T20:16:22Z] <ryankemper> T306449 Marked elastic1097 as Staged in Netbox (was previously failed, but fixed in https://phabricator.wikimedia.org/T306449#7888260)