User Details
- User Since
- Jul 24 2019, 8:11 PM (358 w, 5 d)
- Availability
- Available
- LDAP User
- Jclark-ctr
- MediaWiki User
- Jclark-ctr
Today
@MoritzMuehlenhoff if the current location is ok, could it just be renamed and reimaged by the service owner?
@Marostegui. Dell did just respond.
Yeah, unfortunately we haven't had any luck obtaining replacement parts. Some of the earlier failures may not have had Dell support tickets submitted when this issue first occurred, so Dell has treated these cases as first-time incidents.
No faults for the last 2 days; resolving ticket.
Closing this ticket. Opened decom ticket T428582 for Data Platform.
@Marostegui, if you want, the ticket can stay open at least until Dell responds.
Dell SR227538194
Yesterday
Rebalanced the PDU. I will leave the ticket open to monitor for any additional alerts.
Dell advised performing the same steps that had already been completed: a flea-power drain, firmware updates, and hardware diagnostic testing.
Sun, Jun 7
#1: Sensor: Phase, BA:L1-L2, Active Power Value: 1.749 kW (power) Thresholds: High: 1650
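For reference, the alert above can be sanity-checked with a quick calculation. This is only a minimal sketch: it assumes the High threshold of 1650 is expressed in watts (the alert does not state a unit), and the variable names are illustrative.

```python
# Values copied from the PDU alert line.
active_power_kw = 1.749
high_threshold_w = 1650  # ASSUMPTION: watts; the alert gives no unit

# Convert kW to W, rounding away floating-point noise.
active_power_w = round(active_power_kw * 1000)
overage_w = active_power_w - high_threshold_w

print(f"{active_power_w} W measured, {overage_w} W over the assumed threshold")
```

Under that assumption, the phase was a little under 100 W above the threshold before the PDU was rebalanced.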
Fri, Jun 5
The fans currently installed are F2B (front to back). They should be B2F (back to front).
I ran the CPU stress test for approximately 30 minutes and did not encounter any issues. I think the server is good to be repooled.
Dell SR 227400671
Performed a flea power drain, and the server came back up. I am currently updating the BIOS, then I will pull a TSR report and open a Dell support ticket for documentation and tracking in case the issue continues.
Thu, Jun 4
Thank you for checking, @ayounsi. We will add them with the servers.
@cmooney @ayounsi Following the NetOps sync on Tuesday, I verified the serial numbers of the fabric cards in storage and documented them below. Can you advise whether these can be added to the next recycling event at Eqiad?
| PID | Revision | Serial | Assembly | PCB |
| SCB-MX960-S-G | AG72400361REVA | ABBH8547 | 750-021524R15 | 710-021523R09 |
| SCB-MX960-S-G | AG72400361REVA | ABBH2700 | 750-021524R15 | 710-021523R09 |
| SCB-MX960-S-G | AG72400361REVA | ABBH2635 | 750-021524R15 | 710-021523R09 |
| SCB-MX960-S-G | AG72400361REVA | ABBH8423 | 750-021524R15 | 710-021523R09 |
Wed, Jun 3
Tue, Jun 2
New drive has been attached. @colewhite, it is ready to be rebuilt.
Removed the failed drive. Verified sdb has been removed.
I was double-checking and had been looking at the model, not the serial. Verified again: it is actually slot 5.
@colewhite can this be swapped at any time? Would you be able to rebuild after swapping?
I did attempt the firmware updates, but after rebooting, the server became unresponsive and will not boot.
@RKemper @wiki_willy I have gone through all decommissioned servers and do not have a matching Intel(R) Xeon(R) Silver 4215 CPU @ 2.50GHz available to replace CPU1.
Dedicated dse-k8s workers for production WDQS in codfw - See #T425653
node /^dse-k8s-wdqs200[1-4]\.codfw\./ {
    role(insetup::data_platform_ferm)
}
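To illustrate which hosts the node regex above selects, here is a small Python sketch. The pattern is copied from the Puppet block; the sample hostnames and the `.codfw.wmnet` / `.eqiad.wmnet` suffixes are assumptions for illustration only. Puppet evaluates node regexes with Ruby's engine, but for single-line hostnames like these a simple pattern of this kind behaves the same under Python's `re`.

```python
import re

# Pattern copied from the Puppet node definition.
NODE_PATTERN = re.compile(r"^dse-k8s-wdqs200[1-4]\.codfw\.")

# Hypothetical hostnames; the domain suffixes are an ASSUMPTION.
candidates = [
    "dse-k8s-wdqs2001.codfw.wmnet",
    "dse-k8s-wdqs2004.codfw.wmnet",
    "dse-k8s-wdqs2005.codfw.wmnet",  # outside the [1-4] range
    "dse-k8s-wdqs2001.eqiad.wmnet",  # different site
]

# Keep only the hosts the node block would apply to.
matched = [host for host in candidates if NODE_PATTERN.search(host)]
for host in matched:
    print(host)
```

Only the four hosts numbered 2001 through 2004 in codfw would pick up the `insetup::data_platform_ferm` role; anything else falls through to other node definitions.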
Mon, Jun 1
This server is out of warranty, @RKemper, but I am looking at it right now.
Errors are on sdb, which has failed in the md1 array. Matching serials, according to iDRAC it is in slot 4.
This server is out of warranty. Will check to see what is available from decom servers.
Fri, May 29
Thu, May 28
A7 U27
@BCornwall I see the cookbook failed. Is it still good for us to proceed with onsite work?
Wed, May 27
I have updated the server names and switch ports and provisioned the servers. Pending puppet being updated, @BTullis.
D6 U36
Tue, May 26
pc1013 C5 U26
Wed, May 20
@MLechvien-WMF I believe so. It looks like @Papaul noticed the missing part in puppet and updated both.
Tue, May 19
wdqs1037 is failing to provision. Will check cabling next time on site.


