Thu, Nov 7
Thanks @Jgreen - much appreciated.
@Jgreen - Child task T237651 created to order the part. For any hardware repair requests going forward, can you follow the template here - https://phabricator.wikimedia.org/maniphest/task/edit/form/55/ . There are a few things in the description that we're trying to get in advance to help us with scheduling downtime, prioritization, etc. Much appreciated.
@Jgreen - looks like the warranty ended for the server a few months ago in May. Let me know if you're looking to decommission this server soon or if you would like us to purchase the replacement part.
Tue, Nov 5
@Dzahn - sounds good to me.
Mon, Nov 4
Fri, Nov 1
@Jclark-ctr - looks like this one is from last Thursday's PDU upgrade. Can you check if it's maybe a loose cord? If not, we'll have to RMA it (server under warranty thru March 2020). Thanks, Willy
Thu, Oct 31
Emailed Jim Buatti last Tuesday to provide overview of what we're trying to do with the contract (have OE14,15,16 renew in May 2021 and terminate OE10,11,12,13 in Nov 2020 with clause to term early if another customer can be found to lease the racks), and received confirmation that he'll help us put something together. Thanks, Willy
Wed, Oct 30
@jijiki - just following up to see if this is still an issue or if we can resolve this. Thanks, Willy
Reassigning to @Cmjohnson for Ariel's RAID question
Thanks @elukey - I'll close out this request if all the alerting is suppressed now.
Mon, Oct 28
Thanks @MoritzMuehlenhoff - no worries though, since this task looks like it was autogenerated. (I'll have to talk to Ricardo on how we can modify the autogenerated ones) @Gehel - child task T236725 created to order the replacement disk for the out of warranty system. Thanks, Willy
Per my conversation with Guillaume, this system will be decommissioned, so assigning it to @Gehel for now.
Thu, Oct 24
@RobH - can you check if the configuration on this one is complete? It was one of the PDUs you and Chris upgraded, when you went out to eqiad. Thanks, Willy
Pointed this task out to our Dell account rep today. @Jclark-ctr - let me know if the steps they provided don't work, and then I'll forward our case number over to them...to see if we can just get a new server.
Talked to our Dell rep on this one, who can reach out to the Dell tech support rep directly, after we re-open the ticket. He basically confirmed the same thing @Bstorm had found from the earlier comments...that the 1.9TB drive was sent from Dell previously as an RMA. @Jclark-ctr - can you coordinate with Brooke to update the firmware on this (which might fix things, with all the drive failures), and then call in a request with Dell again, if the drive continues to fail? Shoot me the support case number as well, so I can forward it over to our account rep. Thanks, Willy
Wed, Oct 23
Hey @Bstorm - thanks for tracking all these previous tasks down. It's definitely helpful...I'll bring it up to Dell tomorrow during my bi-weekly sync up call with them, and see if I can get more details. Worst case, we may just have to buy a replacement drive. Thanks, Willy
Tue, Oct 22
@RobH - ps2 was swapped last Tuesday on 10/15
Procurement task created for Rob to order replacement drive. Thanks, Willy
Mon, Oct 21
Fri, Oct 18
Thu, Oct 17
Updating the Need by Date in the subject line, based on the procurement task. @Cmjohnson - can you provide an ETA on when this can be completed? Thanks, Willy
Redundant power has been added to OE14,15,16 by Iron Mountain free of charge. Replacement Servertech PDUs have been ordered, and are scheduled to be arriving today.
I just made one slight change - changed the point person to @Jclark-ctr for assigning eqiad decom tasks
Wed, Oct 16
Awesome, thanks @Jclark-ctr
Tue, Oct 15
@Jclark-ctr - looks like this one is barely out of warranty. Before we order the part though, can you double-check that it's not something simple like a loose power cord? Thanks, Willy
Hi @Papaul - if there aren't any objections from anyone, I think we can just resolve this. You have your primary connection via MIFI and a backup option via CyrusOne. And since it seems to have been working ok for the past 4-5yrs without issue, I'm fine with not moving forward with a wifi setup. Thanks, Willy
Oct 14 2019
Oct 11 2019
Great job @Papaul in troubleshooting this and tracking it down to the root cause. Thanks! ~Willy
@Jclark-ctr - this arrived Thursday via https://www.fedex.com/en-us/home.html. Just a heads up, this will need to be replaced before the PDU upgrade next Tuesday, to retain redundant power on labsdb1009. Thanks, Willy
I'll dig around a bit and check with Dell to see if we can figure out why Com1 and Com2 have to be flipped to get it working. Talked to Luca and worst case, if we can't find any answers to why it's happening, then we'll just leave them as is. Thanks, Willy
Hi @ayounsi - I talked to a couple other people who had the same concern the other day, and I agree as well...so I started scheduling downtime for the PDU alerts in Icinga starting from today's B1 PDU upgrade, and will continue for the remaining PDU swaps. Thanks, Willy
Oct 9 2019
Hi @Papaul - this task is to relabel, update in Netbox, and update switchport descriptions to the newly renamed hostnames. Thanks, Willy
@Cmjohnson - this task is to relabel, update in Netbox, and update switchport descriptions to the newly renamed hostnames
@Jclark-ctr - can you wrap up the Netbox entries on this one, and then close out the task? Thanks, Willy
Thanks for confirming, @ayounsi. Resolving task.
Oct 8 2019
Re-assigning to @RobH to complete install/updating of new PDU. Thanks, Willy
@RobH - can you take care of DNS for this to get things completed from the dc-ops side for this install? This one's super urgent, so if you can complete in the AM, it would be much appreciated. Thanks, Willy
Oct 7 2019
Ok @Dzahn - just let us know when it's ready to go. Thanks, Willy
@Cmjohnson - let me know if we need to order a replacement drive (along with what type of disk), since it's out of warranty. Thanks, Willy
@Dzahn - just wanted to confirm that this has been depooled. Thanks, Willy
Thanks @elukey. Should we ignore/resolve this alert then? Thanks, Willy
Hi @elukey - looks like this host is out of warranty (ended in June 2018). Let me know if you want us to purchase a replacement part or if this system is close to being decommissioned. Thanks, Willy
@Marostegui - it was ordered last Friday morning. We haven't received the tracking number from the vendor yet, but will update that in T233277 once provided. There's still a chance it arrives before the 15th, but we should have an ETA soon. Thanks, Willy
Oct 1 2019
Hi @Vgutierrez - just following up on this to see if there was an ETA, since these are supposed to replace lvs2001-2006...which are all past their 5yr mark, and have the following hardware issues associated with them:
@Marostegui - sure, will do. This week is the approval & ordering phase of the procurement cycle, so it shouldn't be an issue getting the PO submitted for labsdb1009. Thanks, Willy
Sep 30 2019
New target date for upgrading the PDUs on this network rack is Thursday 10/17 @11am UTC. @ayounsi will be in Europe this week to oversee, in case any potential issues occur. Thanks, Willy
New target date for upgrading the remaining PDU on network rack A1 is Tuesday, 10/15 at 11am UTC. Thanks, Willy