
wiki_willy
User

User Details

User Since
Apr 16 2019, 9:00 PM (98 w, 5 d)
Availability
Available
LDAP User
Wpao
MediaWiki User
Unknown

Recent Activity

Thu, Mar 4

wiki_willy updated subscribers of T273714: Interface errors on asw2-b-eqiad:ge-8/0/6 (dumpsdata1001).

Looks like the errors have cleared up from the past week. (thanks for checking @Papaul) @ArielGlenn - you ok if we close this task out? Thanks, Willy

Thu, Mar 4, 11:07 PM · SRE, User-ArielGlenn, ops-eqiad
wiki_willy added a comment to T272209: Degraded RAID on ms-be1032.

Hi @fgiunchedi - let us know when you have the decom task for ms-be1034 submitted per our conversation on IRC....then we can pull one of the drives for this. Thanks, Willy

Thu, Mar 4, 9:07 PM · SRE, ops-eqiad

Tue, Mar 2

wiki_willy assigned T276239: Try to move some new analytics worker nodes to different racks to Cmjohnson.
Tue, Mar 2, 6:32 PM · Analytics-Radar, SRE, ops-eqiad

Mon, Mar 1

wiki_willy reassigned T275019: decommission db1092.eqiad.wmnet from wiki_willy to Cmjohnson.
Mon, Mar 1, 5:46 PM · SRE, DC-Ops, ops-eqiad, decommission-hardware

Fri, Feb 26

wiki_willy added a comment to T260445: (Need By: TBD) rack/setup/install an-worker11[18-41].

No worries @elukey, it looks like I missed the double count in rack A4 as well. If these hosts need to stay in row A though, the only other 10g options would be in racks A2 or A7. Both are pretty full, but I do see room to fit one server in each rack, near the very top. We typically don't use shelf 42, but it could be possible - @Jclark-ctr will probably need to confirm how tight the space on shelf 42 is in A2 and A7. Also, ms-be1019 in A2 is EOL, so hopefully the SREs will have a decom task submitted for that soon, which would also free up another spot in the future. Would this work for you?

Fri, Feb 26, 6:21 PM · Analytics-Clusters, SRE, ops-eqiad, DC-Ops

Thu, Feb 25

wiki_willy removed projects from T255681: Put rdb200[78] into service: ops-codfw, DC-Ops.
Thu, Feb 25, 9:21 PM · User-jijiki, SRE
wiki_willy removed a project from T141756: audit / test / upgrade hp smartarray P840 firmware: ops-codfw.
Thu, Feb 25, 9:20 PM · SRE-swift-storage, SRE

Wed, Feb 24

wiki_willy updated subscribers of T274171: (Need By: TBD) rack/setup/install (35) mw2377 and upwards.

Hi @wkandek - from our previous conversation, you mentioned that refreshing the mw hosts in eqiad would have to be in a phased approach, since there can only be a certain number of mediawiki hosts down at one time. Would the same apply to codfw also? Since it's the secondary "non-active" site, we were hoping all the old mw's in rack A3 could be decom'd first (to make room) before @Papaul installs the new ones, then all of rack A4 decom'd afterwards, and so forth. Thanks, Willy

Wed, Feb 24, 10:27 PM · SRE, ops-codfw, DC-Ops
wiki_willy reassigned T274333: decommission db1090.eqiad.wmnet from wiki_willy to Cmjohnson.
Wed, Feb 24, 5:23 PM · SRE, ops-eqiad, DC-Ops, Patch-For-Review, decommission-hardware

Mon, Feb 22

wiki_willy added a comment to T275266: Degraded RAID on db1103.

@Marostegui Swapped bad SSD. @wiki_willy we did have one new in box, same size, same model, etc. It originally came from HP.

Mon, Feb 22, 11:11 PM · DBA, SRE, ops-eqiad
wiki_willy reassigned T275309: db1162 crashed from wiki_willy to Cmjohnson.
Mon, Feb 22, 7:13 AM · SRE, ops-eqiad, DBA

Sun, Feb 21

wiki_willy assigned T275266: Degraded RAID on db1103 to Jclark-ctr.
Sun, Feb 21, 7:08 AM · DBA, SRE, ops-eqiad
wiki_willy updated subscribers of T275266: Degraded RAID on db1103.

Ack @Marostegui, we'll take a look at it, with whoever heads onsite first this week. @Cmjohnson or @Jclark-ctr - since this machine is out of warranty, can you see if you can grab a spare drive from one of the decom'd servers? Thanks, Willy

Sun, Feb 21, 7:08 AM · DBA, SRE, ops-eqiad

Sat, Feb 20

wiki_willy added a comment to T265435: codfw: Testing Out Sample PDUs.

Updating task with the new single row Chatsworth design. It's not currently supported by LibreNMS, so it looks like we would have to add it in. A few other notes I took from our meeting with the Account Reps - 3yr warranty (can usually send RMA in 2 days, then we ship broken PDU back), 31 days to test the sample PDU (though we can keep it longer if needed), 3-phase is color coded, clips hold the power plugs in, switching capability available on other models, can swap controller module, field failure rate is less than 0.5%, MTBF of 1.7532 million hours.

Sat, Feb 20, 12:49 AM · observability, ops-codfw, DC-Ops, SRE
wiki_willy updated the task description for T265435: codfw: Testing Out Sample PDUs.
Sat, Feb 20, 12:44 AM · observability, ops-codfw, DC-Ops, SRE

Fri, Feb 19

wiki_willy assigned T273922: update hostname labels on logstash103[345] & db11[51-76] to Cmjohnson.
Fri, Feb 19, 11:56 PM · ops-eqiad, DC-Ops, SRE
wiki_willy moved T274488: ms-be1034 not powering on from Backlog to Hardware Failure / Troubleshoot on the ops-eqiad board.
Fri, Feb 19, 8:26 PM · User-fgiunchedi, SRE, ops-eqiad
wiki_willy assigned T275241: Eqiad: Port with no description xe-4/0/24 xe-4/0/3 to Cmjohnson.
Fri, Feb 19, 8:24 PM · SRE, ops-eqiad
wiki_willy assigned T275215: Replace sata cables for cloudvirt1024 to Jclark-ctr.
Fri, Feb 19, 8:24 PM · DC-Ops, ops-eqiad, cloud-services-team (Kanban)

Thu, Feb 18

wiki_willy added a comment to T266481: (Need By: TBD) rack/setup/install payments100[5-8].

Thanks @Jgreen, do you have a decom task for payments1004 as well?

Thu, Feb 18, 5:41 PM · ops-eqiad, SRE, DC-Ops
wiki_willy moved T274671: decommission frqueue1002.frack.eqiad.wmnet from Backlog to Decommission on the ops-eqiad board.
Thu, Feb 18, 5:39 PM · SRE, ops-eqiad, DC-Ops, decommission-hardware
wiki_willy assigned T274671: decommission frqueue1002.frack.eqiad.wmnet to Jclark-ctr.
Thu, Feb 18, 5:39 PM · SRE, ops-eqiad, DC-Ops, decommission-hardware
wiki_willy reassigned T267414: (Need By: TBD) rack/setup/install aqs101[0-5] from Cmjohnson to Jclark-ctr.

@Jclark-ctr - can you check the connection on aqs1014, then shoot this back over to @RobH to finish off? Thanks, Willy

Thu, Feb 18, 5:24 PM · SRE, ops-eqiad, DC-Ops
wiki_willy moved T274235: decommission db1075.eqiad.wmnet from Backlog to Decommission on the ops-eqiad board.
Thu, Feb 18, 5:22 PM · SRE, DC-Ops, ops-eqiad, decommission-hardware
wiki_willy added a comment to T260445: (Need By: TBD) rack/setup/install an-worker11[18-41].

Nice work, thanks @Jclark-ctr

Thu, Feb 18, 1:37 AM · Analytics-Clusters, SRE, ops-eqiad, DC-Ops

Wed, Feb 17

wiki_willy moved T267932: decommission analytics10[42-57] from Backlog to Decommission on the ops-eqiad board.
Wed, Feb 17, 3:56 PM · SRE, DC-Ops, ops-eqiad, decommission-hardware
wiki_willy added projects to T267932: decommission analytics10[42-57]: ops-eqiad, DC-Ops.
Wed, Feb 17, 3:55 PM · SRE, DC-Ops, ops-eqiad, decommission-hardware

Tue, Feb 16

wiki_willy assigned T274751: Upgrade firmware on wdqs1009 to Jclark-ctr.
Tue, Feb 16, 8:09 PM · Discovery-Search (Current work), wdwb-tech-focus, SRE, Wikidata-Query-Service, ops-eqiad, DC-Ops, Wikidata
wiki_willy moved T273955: decommission db1093.eqiad.wmnet from Backlog to Decommission on the ops-eqiad board.
Tue, Feb 16, 8:07 PM · SRE, DC-Ops, ops-eqiad, decommission-hardware
wiki_willy reassigned T273955: decommission db1093.eqiad.wmnet from wiki_willy to Cmjohnson.
Tue, Feb 16, 5:03 PM · SRE, DC-Ops, ops-eqiad, decommission-hardware
wiki_willy added a comment to T274488: ms-be1034 not powering on.

Hi @Jclark-ctr - let's just move the hard drives over to the chassis of one of the decom'd hosts. (assuming the decom'd host doesn't have any hw issues) It'll probably save some time trying to figure out if it's the motherboard, CPU, etc. Thanks, Willy

Tue, Feb 16, 3:20 AM · User-fgiunchedi, SRE, ops-eqiad

Fri, Feb 12

wiki_willy renamed T273922: update hostname labels on logstash103[345] & db11[51-76] from update hostname labels on logstash103[345] to update hostname labels on logstash103[345] & db11[51-76].
Fri, Feb 12, 11:25 PM · ops-eqiad, DC-Ops, SRE
wiki_willy reassigned T274472: Investigate and repool db1134 from Cmjohnson to Jclark-ctr.

Moving over to @Jclark-ctr to receive and replace the memory, since @Cmjohnson is out on vacation next week. Thanks, Willy

Fri, Feb 12, 9:03 PM · Patch-For-Review, ops-eqiad, DBA, SRE
wiki_willy reassigned T266481: (Need By: TBD) rack/setup/install payments100[5-8] from Cmjohnson to Jclark-ctr.
Fri, Feb 12, 9:02 PM · ops-eqiad, SRE, DC-Ops
wiki_willy assigned T274622: ms-be1038 NIC link down to Jclark-ctr.
Fri, Feb 12, 7:44 PM · SRE, ops-eqiad
wiki_willy added a comment to T274488: ms-be1034 not powering on.

Nice work @Jclark-ctr, much appreciated.

Fri, Feb 12, 7:41 PM · User-fgiunchedi, SRE, ops-eqiad
wiki_willy added a comment to T274488: ms-be1034 not powering on.

Hi @fgiunchedi - @Jclark-ctr is going to use some parts from decommissioned servers to try and get the server back up. Thanks, Willy

Fri, Feb 12, 7:30 PM · User-fgiunchedi, SRE, ops-eqiad
wiki_willy reopened T274488: ms-be1034 not powering on as "Open".
Fri, Feb 12, 7:29 PM · User-fgiunchedi, SRE, ops-eqiad

Thu, Feb 11

wiki_willy reassigned T274472: Investigate and repool db1134 from wiki_willy to Cmjohnson.

@Cmjohnson /@Jclark-ctr - just a heads up, this is higher priority and the server is still under warranty, through November 2021. Thanks, Willy

Thu, Feb 11, 5:51 PM · Patch-For-Review, ops-eqiad, DBA, SRE
wiki_willy added a parent task for T274488: ms-be1034 not powering on: Unknown Object (Task).
Thu, Feb 11, 5:35 PM · User-fgiunchedi, SRE, ops-eqiad
wiki_willy added a comment to T274488: ms-be1034 not powering on.

Hi @fgiunchedi - since this server is at the 4yr mark, are you ok with decommissioning it? Thanks, Willy

Thu, Feb 11, 5:10 PM · User-fgiunchedi, SRE, ops-eqiad

Wed, Feb 10

wiki_willy reassigned T273040: decommission db1081.eqiad.wmnet from wiki_willy to Cmjohnson.

This is ready for DCOps!

Wed, Feb 10, 3:58 PM · SRE, DC-Ops, ops-eqiad, decommission-hardware

Mon, Feb 8

wiki_willy reassigned T273710: decommission db1094.eqiad.wmnet from wiki_willy to Cmjohnson.

Ready for DC-Ops

Mon, Feb 8, 4:55 PM · SRE, DC-Ops, ops-eqiad, decommission-hardware

Feb 5 2021

wiki_willy assigned T273803: mw2220 - broken IPMI / mgmt to Papaul.
Feb 5 2021, 10:27 PM · DC-Ops, SRE, ops-codfw, serviceops-radar
wiki_willy moved T273732: decommission db1095 from Backlog to Decommission on the ops-eqiad board.
Feb 5 2021, 5:26 PM · SRE, DC-Ops, ops-eqiad, decommission-hardware
wiki_willy reassigned T273732: decommission db1095 from wiki_willy to Cmjohnson.
Feb 5 2021, 5:26 PM · SRE, DC-Ops, ops-eqiad, decommission-hardware

Feb 4 2021

wiki_willy added a comment to T265113: Memory issue on elastic1063 caused elasticsearch to be killed.

Hi @Jclark-ctr - can you confirm the firmware/BIOS/iDRAC is all updated? I have an email queued up to send to our technical Dell rep on this, but we should make sure it's all updated on the host first, to see if it fixes the issue or if it helps show anything new on the TSR report. Thanks, Willy

Feb 4 2021, 11:09 PM · Discovery-Search (Current work), ops-eqiad, SRE
wiki_willy added a comment to T273841: Esams: Delete rack OE10, OE11, OE12 and OE13 from Netbox.

Hi @Papaul - thanks for bringing it up. Unless there's some other dependencies that need to be removed beforehand, I think it should be ok. We officially term'd out of the contract for these 3x racks right around late December. Thanks, Willy

Feb 4 2021, 5:23 PM · SRE, DC-Ops, ops-esams
wiki_willy reassigned T273597: decommission db1078.eqiad.wmnet from wiki_willy to Cmjohnson.

@wiki_willy this is ready for DC-Ops

Feb 4 2021, 5:20 PM · SRE, DC-Ops, ops-eqiad, decommission-hardware

Feb 2 2021

wiki_willy moved T271058: cloudnet1004/cloudnet1003: network hiccups because broadcom driver/firmware problem from Hardware Failure / Troubleshoot to Cloud Tasks on the ops-eqiad board.
Feb 2 2021, 11:37 PM · cloud-services-team (Hardware), SRE, ops-eqiad
wiki_willy assigned T271058: cloudnet1004/cloudnet1003: network hiccups because broadcom driver/firmware problem to Jclark-ctr.
Feb 2 2021, 11:36 PM · cloud-services-team (Hardware), SRE, ops-eqiad
wiki_willy moved T238957: decommission phab1003.eqiad.wmnet from Backlog to Decommission on the ops-eqiad board.
Feb 2 2021, 7:00 PM · SRE, ops-eqiad, DC-Ops, decommission-hardware
wiki_willy moved T273142: decommission francium.eqiad.wmnet from Backlog to Decommission on the ops-eqiad board.
Feb 2 2021, 6:32 PM · SRE, ops-eqiad, serviceops, decommission-hardware
wiki_willy moved T273417: decommission db1089.eqiad.wmnet from Backlog to Decommission on the ops-eqiad board.
Feb 2 2021, 6:31 PM · SRE, DC-Ops, ops-eqiad, decommission-hardware
wiki_willy assigned T273034: Degraded RAID on an-worker1099 to Cmjohnson.
Feb 2 2021, 6:30 PM · Analytics-Radar, SRE, ops-eqiad
wiki_willy reassigned T273417: decommission db1089.eqiad.wmnet from wiki_willy to Cmjohnson.

Thanks @Marostegui, this helps us a lot!

Feb 2 2021, 5:00 PM · SRE, DC-Ops, ops-eqiad, decommission-hardware
wiki_willy added a comment to T273049: decommission helium.eqiad.wmnet and helium-array.

Thanks a lot @jcrespo, it's much appreciated!

Feb 2 2021, 4:58 PM · SRE, ops-eqiad, Data-Persistence-Backup, decommission-hardware

Feb 1 2021

wiki_willy added a comment to T245279: decommission kraz.wikimedia.org.

Thanks @Dzahn - I saw it listed under the "pending onsite steps (codfw)" column, so it threw me off for a sec.

Feb 1 2021, 8:02 PM · Analytics-Radar, decommission-hardware, serviceops, SRE
wiki_willy placed T245279: decommission kraz.wikimedia.org up for grabs.
Feb 1 2021, 7:55 PM · Analytics-Radar, decommission-hardware, serviceops, SRE
wiki_willy assigned T245279: decommission kraz.wikimedia.org to Papaul.
Feb 1 2021, 7:44 PM · Analytics-Radar, decommission-hardware, serviceops, SRE
wiki_willy assigned T273142: decommission francium.eqiad.wmnet to Cmjohnson.
Feb 1 2021, 7:42 PM · SRE, ops-eqiad, serviceops, decommission-hardware
wiki_willy added a project to T245279: decommission kraz.wikimedia.org: ops-codfw.
Feb 1 2021, 6:36 PM · Analytics-Radar, decommission-hardware, serviceops, SRE
wiki_willy added a comment to T260445: (Need By: TBD) rack/setup/install an-worker11[18-41].

Thanks @Marostegui, I appreciate it. We discussed this during my staff meeting a bit last week, and @Cmjohnson will work with you and the other server owners on the moves. It won't be during the next few days, as there's a big storm in Virginia....but I'll let @Cmjohnson chime in to propose which dates/times would work the best. Thanks, Willy

Feb 1 2021, 5:32 PM · Analytics-Clusters, SRE, ops-eqiad, DC-Ops

Jan 30 2021

wiki_willy added a project to T238957: decommission phab1003.eqiad.wmnet: ops-eqiad.

adding the "ops-eqiad" project tag, so we can track this on the dc-ops workboard

Jan 30 2021, 1:22 AM · SRE, ops-eqiad, DC-Ops, decommission-hardware

Jan 29 2021

wiki_willy added a comment to T273301: cr1-eqiad<>asw2-d-eqiad link down.

The optics on asw2-d2 xe-2/0/40 were bad. I replaced both for good measure and the link is back up.

Jan 29 2021, 9:45 PM · SRE, netops, ops-eqiad
wiki_willy reassigned T273301: cr1-eqiad<>asw2-d-eqiad link down from wiki_willy to Cmjohnson.

No one scheduled to be onsite today, but @Cmjohnson will go in to check it out later this afternoon. Thanks, Willy

Jan 29 2021, 5:44 PM · SRE, netops, ops-eqiad

Jan 28 2021

wiki_willy added a comment to T265435: codfw: Testing Out Sample PDUs.

Thanks @fgiunchedi . Now that the holidays are over, I'm re-engaging the vendor on discussions. After all the paperwork and stuff, my guess is we'll probably have the sample PDUs onsite in about a month or so, but we'll be sure to update the Phab task as we make progress. Thanks, Willy

Jan 28 2021, 10:51 PM · observability, ops-codfw, DC-Ops, SRE
wiki_willy added a comment to T260445: (Need By: TBD) rack/setup/install an-worker11[18-41].

I think that should work, but let me defer to @Cmjohnson and @Jclark-ctr for any additional concerns though. In summary, here's the game plan (slightly adjusted from my original proposal, so that the nodes don't go over 5x in a rack):

Jan 28 2021, 6:00 PM · Analytics-Clusters, SRE, ops-eqiad, DC-Ops

Jan 27 2021

wiki_willy added a comment to T260445: (Need By: TBD) rack/setup/install an-worker11[18-41].

Hi @elukey - thanks for the mapping. What makes it tough is that the remaining 6x hosts need to be on 10g switches, which really limits our options. Right now, it looks like you're maxed out in almost all our 10g racks (A2, A4, A7, B2, B4, B7, C2, C4, C7, D2, D4, D7). But based on your mapping, I think we can make this happen instead, if this works for you:

Jan 27 2021, 6:26 PM · Analytics-Clusters, SRE, ops-eqiad, DC-Ops

Jan 26 2021

wiki_willy added a comment to T260445: (Need By: TBD) rack/setup/install an-worker11[18-41].

Hi @elukey - in looking through Netbox and talking to Chris, this is what I'm thinking, but @Cmjohnson/@Jclark-ctr/@elukey - please call me out if I'm off with any part of this plan, and we can think of an alternative:

Jan 26 2021, 5:39 PM · Analytics-Clusters, SRE, ops-eqiad, DC-Ops
wiki_willy updated subscribers of T266481: (Need By: TBD) rack/setup/install payments100[5-8].

Thanks @Jgreen (cc'ing @Jclark-ctr as a fyi)

Jan 26 2021, 4:28 PM · ops-eqiad, DC-Ops, SRE

Jan 25 2021

wiki_willy updated subscribers of T266481: (Need By: TBD) rack/setup/install payments100[5-8].

Hi @Jgreen - it looks like we're running a bit tight on space in the Fundraising rack. In order for us to rack the servers for this install, do you have 1-2 existing servers that can be decommissioned in eqiad? Thanks, Willy

Jan 25 2021, 8:50 PM · ops-eqiad, DC-Ops, SRE

Jan 19 2021

wiki_willy assigned T272209: Degraded RAID on ms-be1032 to Cmjohnson.
Jan 19 2021, 8:32 PM · SRE, ops-eqiad
wiki_willy assigned T272396: ms-be1046 stuck on reboot to Cmjohnson.
Jan 19 2021, 8:31 PM · ops-eqiad, SRE

Jan 16 2021

wiki_willy added a comment to T267043: (Need By: 2020-11-29) rack/setup/install db11[51-76].

Dell provided some docs that show DYV8773 should be onsite, and John confirmed all 25 were received. @Cmjohnson - it probably got mixed in with one of the other install tasks. But also, it could just be a Netbox discrepancy. In looking at the Netbox errors, I see db1156 (in Rack A1) seems to have the incorrect serial number in Netbox. The S/N listed in Netbox wasn't one of the servers listed in the invoice.

Jan 16 2021, 12:41 AM · DBA, ops-eqiad, DC-Ops, SRE

Jan 15 2021

wiki_willy assigned T272125: Memory errors on clouddb1019 to Cmjohnson.
Jan 15 2021, 4:03 PM · DBA, ops-eqiad, SRE

Jan 14 2021

wiki_willy added a comment to T267043: (Need By: 2020-11-29) rack/setup/install db11[51-76].

Thanks @Jclark-ctr, I just sent an email to Dell to figure out what's going on.

@RobH @Cmjohnson DYV8773 is the ST not in netbox right now

Jan 14 2021, 10:43 PM · DBA, ops-eqiad, DC-Ops, SRE

Jan 8 2021

wiki_willy added a comment to T267271: (Need By: TBD) rack/setup/install mwlog1002.eqiad.wmnet.

Netbox error associated with this install:

Jan 8 2021, 10:38 PM · SRE, ops-eqiad, DC-Ops
wiki_willy assigned T271512: Please remove sdb from ms-be1022 to Cmjohnson.

Hi @fgiunchedi, just wanted to confirm - since this server was recently refreshed last quarter via T265093, no need to replace the disk, right? Thanks, Willy

Jan 8 2021, 6:24 PM · ops-eqiad, SRE

Jan 4 2021

wiki_willy assigned T270806: Degraded RAID on ms-be1019 to Cmjohnson.
Jan 4 2021, 9:42 PM · SRE-swift-storage, ops-eqiad, SRE
wiki_willy closed T271098: Degraded RAID on an-coord1002 as Resolved.

Duplicate of T270768

Jan 4 2021, 9:41 PM · Analytics, ops-eqiad, SRE
wiki_willy assigned T267050: (Need By: TBD) rack/setup/install ml-serve100[1-4] to Jclark-ctr.

Servers arrived Dec 23. @Jclark-ctr - can you install the GPU into one of these hosts?

Jan 4 2021, 9:39 PM · ops-eqiad, DC-Ops, SRE

Dec 22 2020

wiki_willy assigned T267969: frdev1001 ILO inaccessible to Cmjohnson.
Dec 22 2020, 7:31 PM · SRE, ops-eqiad, DC-Ops
wiki_willy closed Unknown Object (Task), a subtask of T235805: ESAMS Refresh/Rebuild (October 2019), as Resolved.
Dec 22 2020, 7:11 PM · Patch-For-Review, DC-Ops, SRE, ops-esams

Dec 21 2020

wiki_willy reassigned T268436: decommission es1013.eqiad.wmnet from wiki_willy to Cmjohnson.
Dec 21 2020, 8:06 AM · ops-eqiad, SRE, DC-Ops, decommission-hardware
wiki_willy assigned T270571: Degraded RAID on db1101 to Jclark-ctr.

@Jclark-ctr or @Cmjohnson - do we have any decom'd servers onsite with this drive size? Thanks, Willy

Dec 21 2020, 8:02 AM · DBA, ops-eqiad, SRE

Dec 17 2020

wiki_willy assigned T267666: (Need By: TBD) rack/setup/install logstash103[345] to Cmjohnson.

Arrived on Dec 12

Dec 17 2020, 6:56 PM · ops-eqiad, SRE, DC-Ops
wiki_willy assigned T267271: (Need By: TBD) rack/setup/install mwlog1002.eqiad.wmnet to Cmjohnson.

Arrived on Dec 12

Dec 17 2020, 6:55 PM · SRE, ops-eqiad, DC-Ops
wiki_willy moved T270159: decommission es1019.eqiad.wmnet from Backlog to Decommission on the ops-eqiad board.
Dec 17 2020, 5:00 PM · ops-eqiad, SRE, DC-Ops, decommission-hardware
wiki_willy reassigned T270159: decommission es1019.eqiad.wmnet from wiki_willy to Cmjohnson.
Dec 17 2020, 4:59 PM · ops-eqiad, SRE, DC-Ops, decommission-hardware

Dec 15 2020

wiki_willy updated subscribers of T267043: (Need By: 2020-11-29) rack/setup/install db11[51-76].

@Cmjohnson and @RobH - per our conversation on IRC, just a heads up to avoid installing the remaining db hosts with IPV6. (reference T270101 for the reasoning) Thanks, Willy

Dec 15 2020, 5:12 PM · DBA, ops-eqiad, DC-Ops, SRE
wiki_willy added a comment to T270101: Grants not working with DB hosts with to ipv6.

@wiki_willy could you talk to your team to make sure the rest of hosts at T267043: (Need By: 2020-11-29) rack/setup/install db11[51-76] do not get installed with ipv6 once they start to get provisioned?

Dec 15 2020, 5:04 PM · netbox, DBA

Dec 10 2020

wiki_willy assigned T267414: (Need By: TBD) rack/setup/install aqs101[0-5] to Cmjohnson.

Hardware arrived Dec 3

Dec 10 2020, 8:56 PM · SRE, ops-eqiad, DC-Ops

Dec 8 2020

wiki_willy updated subscribers of T269552: Degraded RAID on logstash2022.

Hi @herron - it looks like this server is due to be refreshed next year (around Nov 2021). Let me know if you want a replacement disk purchased for this in the meantime, or if you can go without it until the server replacement.

Dec 8 2020, 6:35 PM · SRE, ops-codfw
wiki_willy assigned T269556: sdg1 failed on ms-be1054 to Cmjohnson.

@Cmjohnson - looks like this one is still under warranty (installed in Aug 2019), so you should be good with submitting an RMA. Thanks, Willy

Dec 8 2020, 1:09 AM · SRE, ops-eqiad

Dec 7 2020

wiki_willy created T269621: codfw: Netbox Errors.
Dec 7 2020, 7:40 PM · ops-codfw, SRE, DC-Ops
wiki_willy assigned T269552: Degraded RAID on logstash2022 to Papaul.
Dec 7 2020, 7:36 PM · SRE, ops-codfw

Dec 4 2020

wiki_willy assigned T267043: (Need By: 2020-11-29) rack/setup/install db11[51-76] to Cmjohnson.
Dec 4 2020, 9:36 PM · DBA, ops-eqiad, DC-Ops, SRE
wiki_willy added a comment to T267043: (Need By: 2020-11-29) rack/setup/install db11[51-76].

Hi @Cmjohnson - can you add the S/N for db1153:

Dec 4 2020, 9:35 PM · DBA, ops-eqiad, DC-Ops, SRE
wiki_willy assigned T266481: (Need By: TBD) rack/setup/install payments100[5-8] to Cmjohnson.

Arrived Dec 3

Dec 4 2020, 9:24 PM · ops-eqiad, DC-Ops, SRE