Since the only thing remaining in this task is bringing up the Dell switches in racks E8 and F8 (which I believe the Network SRE team is working on), I'm going to go ahead and resolve the main tracking ticket. Thanks, Willy
Mon, Apr 15
Wed, Apr 3
Sure, no prob @LSobanski. Here's the list of the 24 active devices that still reference RT tasks in Netbox, along with their purchase dates (network equipment usually EOLs every 8yrs):
Tue, Apr 2
Thanks for checking @LSobanski. It's definitely rare that we need to refer back to RT. In the last 5 years, the 2-3 cases where we've had to reference RT were typically due to tracking down information about core routers that we had purchased back then. In Netbox, we only have 24 active devices left that still reference RT tasks. As long as we're able to access these in some way (ideally quickly and easily) on the rare occasions that it's needed, you should be able to move forward.
Mar 19 2024
Hi @elukey - do you want me to change the Lift Wing expansion requests for 16x servers in FY24-25 to 10g? Thanks, Willy
Mar 13 2024
++ @VRiley-WMF & @Jclark-ctr for troubleshooting the hardware. (host was installed a few quarters ago)
Mar 5 2024
Sounds good. @Jhancock.wm - I created a new sheet below, with the following fields. I entered in the hostnames and asset tag, but can you fill in the remaining items for old S/N, new S/N, and Phabricator Task?
Mar 4 2024
Thanks for confirming, @Volans. If everyone else is ok with making the correlation on the accounting spreadsheet, my vote is that we go with that route. Thanks, Willy
Mar 1 2024
Thanks @Volans, that makes sense. My preference would be to leave Netbox as is, and use the accounting spreadsheet to make the S/N connection to each other. Would we be adding a different tab on the accounting spreadsheet for that?
Feb 29 2024
If we change the serial number, I think it would create an error for S/N / Asset tag mismatch (related to Riccardo's points earlier). We also reference the original chassis S/N when dealing with vendors for recycling servers (estimates, official documentation, etc) and purchasing replacement parts, so I'm still a bit hesitant with editing the S/N in Netbox as the solution. Since it doesn't sound like we receive any Netbox alerts when we replace a server with a new motherboard, is there something that we could tweak to replicate the same thing? (ie: change the status or something of the donor server) Or worst case, just suppress these alerts somehow, until they eventually decommission?
Feb 28 2024
Hey @Volans - much appreciated for your feedback and for the suggestions. I was wondering since the physical serial number listed on the chassis doesn't change (it's only from a Puppet perspective that the serial number changes), is there anything on the Puppet side that could be modified to reflect the MB replacement? If there's something easy that could be done in Puppet to prevent the Netbox error from alerting, I kind of feel like it would be a more accurate representation.
++ @VRiley-WMF and @Jclark-ctr - can one of you pick up this request? We'll be repurposing one of the previously decommissioned cp servers to set up a temp server for Adam to use. Thanks, Willy
Sounds good @bking, thanks!
Hi @bking - thanks for coming up with the list. I have the following refreshes already on the CapEx doc, so you just have to fill in the missing columns for "Hardware Config", "Network Speed" and "Total Equipment Cost" (for custom configs)
Feb 27 2024
Thanks for picking this up @Jhancock.wm. @Marostegui - since this host looks like it's close to being refreshed in T355350, do you want to just wait for the refreshed server to be setup instead of fixing this one? Thanks, Willy
Feb 26 2024
In T358421#9574362, @Marostegui wrote: @wiki_willy can we contact the vendor about this issue which caused a reboot?
Record: 27 Date/Time: 02/24/2024 10:08:18 Source: system Severity: Critical Description: CPU 1 machine check error detected.
Feb 23 2024
Hi @ssingh - the hardware should still be around, and we should be able to reallocate one of them for testing purposes. Can you open a new Phabricator task for us with all the necessary details (hostname, racking info, network setup, raid/partitioning, OS, and main poc)? Also, do you know how long Adam would need it for?
Feb 21 2024
++ @Jhancock.wm for visibility and in case any onsite support is needed
Feb 8 2024
++ @Jhancock.wm
Jan 10 2024
Thanks @VRiley-WMF. I have T354684 assigned over to you, so you can work with @fgiunchedi on coordinating downtime for the upgrades. Thanks, Willy
Jan 9 2024
Awesome, thanks @Jhancock.wm. Here's the codfw upgrade ticket for you to coordinate with @fgiunchedi on the downtime - T354685. Thanks, Willy
++ @Jclark-ctr & @VRiley-WMF
@Papaul / @Jhancock.wm and @Jclark-ctr / @VRiley-WMF - can you see if you have any spare memory onsite for Filippo? I think it's for prometheus100[5,6] and prometheus200[5,6]. (cc @RobH in case we have to order them)
Dec 15 2023
@Jclark-ctr or @VRiley-WMF - can one of you take a look at this one?
Dec 7 2023
Definitely. @Jclark-ctr & @VRiley-WMF - can you check if we have any spare drives from a decommissioned host? If not, we'll purchase one via @RobH. Thanks, Willy
Dec 1 2023
Nov 29 2023
++ @Jclark-ctr & @VRiley-WMF - can one of you two work on getting the drive RMA'd for this one? Thanks, Willy
Nov 23 2023
Nov 22 2023
Nov 10 2023
Thanks for working on this @bking. I'm mainly looking to see how much future growth you're looking at (a rough estimate is fine), if you have any requests for the type of servers we provide (ie: ARM, GPU, etc), or just have any feedback for us in general. We're getting pretty full at codfw, so when we purchase additional data center space, we want to ensure we're adding enough capacity for everyone's future needs over the next 3-5yrs. Thanks, Willy
Oct 30 2023
Awesome, thanks for working on this @VRiley-WMF. @nskaggs & @cmooney - since we have some discrepancies with the number of ports being used on these cloudvirts, should we come up with a plan/process to help us free up the second switchport on them? This will help us reclaim some switchports for new installs and server migrations. Thanks, Willy
Oct 25 2023
Oct 17 2023
@Jclark-ctr or @VRiley-WMF - can one of you follow up on Ben's question above on an-tool1010, along with Alex's comment on deploy1102? Thanks, Willy
Oct 3 2023
++ @Papaul, who's going to dig around a bit and provide some feedback
Aug 30 2023
Aug 11 2023
Aug 2 2023
It's not on the refresh list for this fiscal year; looks like it'll be due for a refresh in FY24-25. If the firmware upgrade on the iDrac doesn't work, we can try sourcing the fan if you want. (cc @RobH)
Jul 31 2023
Jul 19 2023
Cool, thanks for confirming @Papaul. Hopefully Iron Mountain will come back with the same confirmation as well.
Jul 18 2023
Jul 13 2023
Jul 12 2023
Jul 11 2023
Hi @Jclark-ctr - can you work with @aborrero on the timeframe and migration plan for these servers? Thanks, Willy
Jul 10 2023
Jun 27 2023
Jun 23 2023
Jun 20 2023
Jun 16 2023
Jun 15 2023
Jun 8 2023
Jun 6 2023
Jun 1 2023
May 31 2023
May 26 2023
May 25 2023
Hi @Marostegui - Papaul is on paternity leave for another week, so I'm going to pass this over to @Jhancock.wm to check out. The server is about 4yrs old, so it's out of warranty, but there might be parts that could be pulled from a decommissioned server if we're able to isolate the issue. Thanks, Willy
May 18 2023
May 17 2023
Thanks @Jclark-ctr. Feel free to pull the drives from a server that's already been decommissioned.
May 16 2023
@RobH - this might be something we could add to the recycle pickup
Agreed, I don't think there's any need to continue using "platform" in Netbox, especially since more than half the devices don't have it currently filled out. @Papaul, @RobH, @Jclark-ctr, @Jhancock.wm - feel free to chime in if you have any other thoughts.
May 15 2023
May 11 2023
May 10 2023
May 3 2023
May 2 2023
@Jclark-ctr - can you take a peek at this one to see if it's pending on anything from our side? Thanks, Willy