Thu, Mar 4
Hi @fgiunchedi - let us know when you have the decom task for ms-be1034 submitted, per our conversation on IRC, and then we can pull one of the drives for this. Thanks, Willy
Tue, Mar 2
Mon, Mar 1
Fri, Feb 26
No worries @elukey, it looks like I missed the double count in rack A4 as well. If these hosts need to stay in row A though, the only other 10g options would be in racks A2 or A7. Both are pretty full, but I do see room to fit one server in each rack, near the very top. We typically don't use shelf 42, but it could be possible - @Jclark-ctr will probably need to confirm how tight the space on shelf 42 is in A2 and A7. Also, ms-be1019 in A2 is EOL, so hopefully the SREs will have a decom task submitted for that soon, which would free up another spot in the future. Would this work for you?
Thu, Feb 25
Wed, Feb 24
Hi @wkandek - from our previous conversation, you mentioned that refreshing the mw hosts in eqiad would have to be in a phased approach, since there can only be a certain number of mediawiki hosts down at one time. Would the same apply to codfw also? Since it's the secondary "non-active" site, we were hoping all the old mw's in rack A3 could be decom'd first (to make room) before @Papaul installs the new ones, then all of rack A4 decom'd afterwards, and so forth. Thanks, Willy
Mon, Feb 22
Sun, Feb 21
Ack @Marostegui, we'll take a look at it, with whoever heads onsite first this week. @Cmjohnson or @Jclark-ctr - since this machine is out of warranty, can you see if you can grab a spare drive from one of the decom'd servers? Thanks, Willy
Sat, Feb 20
Updating task with the new single row Chatsworth design. It's not already supported by LibreNMS, so it looks like we would have to add it in. A few other notes I took from our meeting with the Account Reps - 3yr warranty (can usually send RMA in 2 days, then we ship broken PDU back), 31 days to test the sample PDU (though we can keep it longer if needed), 3-phase is color coded, clips hold the power plugs in, switching capability available on other models, can swap controller module, field failure rate is less than 0.5%, MTBF of 1.7532 million hours
Fri, Feb 19
Thu, Feb 18
Thanks @Jgreen, do you have a decom task for payments1004 as well?
Nice work, thanks @Jclark-ctr
Wed, Feb 17
Tue, Feb 16
Hi @Jclark-ctr - let's just move the hard drives over to the chassis of one of the decom'd hosts (assuming the decom'd host doesn't have any hw issues). It'll probably save some time trying to figure out if it's the motherboard, CPU, etc. Thanks, Willy
Fri, Feb 12
Nice work @Jclark-ctr, much appreciated.
Thu, Feb 11
Hi @fgiunchedi - since this server is at the 4yr mark, are you ok with decommissioning it? Thanks, Willy
Wed, Feb 10
Mon, Feb 8
Feb 5 2021
Feb 4 2021
Hi @Jclark-ctr - can you confirm the firmware/bios/idrac is all updated? I have an email queued up to send to our technical Dell rep on this, but we should make sure it's all updated on the host first, to see if it fixes the issue or if it helps show anything new on the TSR report. Thanks, Willy
Hi @Papaul - thanks for bringing it up. Unless there are some other dependencies that need to be removed beforehand, I think it should be ok. We officially term'd out of the contract for these 3x racks right around late December. Thanks, Willy
Feb 2 2021
Thanks @Marostegui, this helps us a lot!
Thanks a lot @jcrespo, it's much appreciated!
Feb 1 2021
Thanks @Dzahn - I saw it listed under the "pending onsite steps (codfw)" column, so it threw me off for a sec.
Thanks @Marostegui, I appreciate it. We discussed this during my staff meeting a bit last week, and @Cmjohnson will work with you and the other server owners on the moves. It won't be during the next few days, as there's a big storm in Virginia, but I'll let @Cmjohnson chime in to propose which dates/times would work the best. Thanks, Willy
Jan 30 2021
adding the "ops-eqiad" project tag, so we can track this on the dc-ops workboard
Jan 29 2021
No one scheduled to be onsite today, but @Cmjohnson will go in to check it out later this afternoon. Thanks, Willy
Jan 28 2021
Thanks @fgiunchedi. Now that the holidays are over, I'm re-engaging the vendor on discussions. After all the paperwork and stuff, my guess is we'll probably have the sample PDUs onsite in about a month or so, but we'll be sure to update the Phab task as we make progress. Thanks, Willy
I think that should work, but let me defer to @Cmjohnson and @Jclark-ctr for any additional concerns. In summary, here's the game plan (slightly adjusted from my original proposal, so that the nodes don't go over 5x in a rack):
Jan 27 2021
Hi @elukey - thanks for the mapping. What makes it tough is that the remaining 6x hosts need to be on 10g switches, which really limits our options. Right now, it looks like you're maxed out in almost all our 10g racks (A2, A4, A7, B2, B4, B7, C2, C4, C7, D2, D4, D7). But based on your mapping, I think we can make this happen instead, if this works for you:
Jan 26 2021
Hi @elukey - in looking through Netbox and talking to Chris, this is what I'm thinking, but @Cmjohnson/@Jclark-ctr/@elukey - please call me out if I'm off with any part of this plan, and we can think of an alternative:
Jan 25 2021
Hi @Jgreen - it looks like we're running a bit tight on space in the Fundraising rack. In order for us to rack the servers for this install, do you have 1-2 existing servers that can be decommissioned in eqiad? Thanks, Willy
Jan 19 2021
Jan 16 2021
Dell provided some docs that show DYV8773 should be onsite, and John confirmed all 25 were received. @Cmjohnson - it probably got mixed in with one of the other install tasks. But it could also just be a Netbox discrepancy. In looking at the Netbox errors, I see db1156 (in Rack A1) seems to have the incorrect serial number in Netbox. The S/N listed in Netbox wasn't one of the servers listed in the invoice.
Jan 15 2021
Jan 14 2021
Thanks @Jclark-ctr, I just sent an email to Dell to figure out what's going on.
Jan 8 2021
Netbox error associated with this install:
Jan 4 2021
Duplicate of T270768
Servers arrived Dec 23. @Jclark-ctr - can you install the GPU into one of these hosts?
Dec 22 2020
Dec 21 2020
Dec 17 2020
Arrived on Dec 12
Arrived on Dec 12
Dec 15 2020
Dec 10 2020
Hardware arrived Dec 3
Dec 8 2020
Hi @herron - it looks like this server is due to be refreshed next year (around Nov 2021). Let me know if you want a replacement disk purchased for this in the meantime, or if you can go without it until the server replacement.
@Cmjohnson - looks like this one is still under warranty (installed in Aug 2019), so you should be good with submitting an RMA. Thanks, Willy
Dec 7 2020
Dec 4 2020
Hi @Cmjohnson - can you add the S/N for db1153: