There is a potentially bad memory module on cp2038. I would like the system to be depooled, if possible, so that I can swap DIMM A3 with DIMM B3.
Thanks.
Correctable memory error logging disabled for a memory device at location DIMM_A3.
Mentioned in SAL (#wikimedia-operations) [2022-05-20T13:24:53Z] <sukhe@cumin2002> START - Cookbook sre.hosts.downtime for 7 days, 0:00:00 on cp2038.codfw.wmnet with reason: downtimed because of DIMM replacement: T308459
Mentioned in SAL (#wikimedia-operations) [2022-05-20T13:24:58Z] <sukhe@cumin2002> END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 7 days, 0:00:00 on cp2038.codfw.wmnet with reason: downtimed because of DIMM replacement: T308459
Hi @Papaul: Thanks for letting us know! The host is depooled and downtimed, so please proceed whenever you want. Thanks!
I swapped DIMM A3 with DIMM B3. No error is showing on DIMM B3 for now. I also upgraded iDRAC from version 4.10 to 5.00. Resolving this task for now.
@Vgutierrez you can put the server back in service now. Thanks.
Mentioned in SAL (#wikimedia-operations) [2022-05-23T15:39:13Z] <vgutierrez> pool cp2038 - T308459