Cmjohnson (cmjohnson)
User

Projects (11)

User Details

User Since
Dec 16 2014, 10:22 PM (334 w, 3 d)
Availability
Available
IRC Nick
cmjohnson1
LDAP User
Cmjohnson
MediaWiki User
Unknown

Recent Activity

Thu, May 13

Cmjohnson added a comment to T276922: cloudvirt1038: PCIe error.

The last Dell tech that came in identified the problem as a riser card. Oddly enough, this was replaced already, but maybe the second time is the charm. Dell is sending the part directly to me and I will replace it; fingers crossed this works.

Thu, May 13, 6:52 PM · SRE, ops-eqiad, DC-Ops, cloud-services-team (Hardware)

Tue, May 11

Cmjohnson closed T278934: Add eqiad airport express to Netbox as Resolved.

Added Airport Express and connected to mr1 in netbox

Tue, May 11, 3:51 PM · SRE, ops-eqiad, DC-Ops
Cmjohnson closed T280668: Degraded RAID on cloudvirt1018 as Resolved.
Tue, May 11, 3:41 PM · SRE, ops-eqiad
Cmjohnson reassigned T276637: (Need By: TBD) rack/setup/install moss-be100[12] from Cmjohnson to Jclark-ctr.
Tue, May 11, 3:33 PM · SRE, ops-eqiad, DC-Ops
Cmjohnson added a comment to T276637: (Need By: TBD) rack/setup/install moss-be100[12].

@Jclark-ctr moss-be1001 cables are wrong, the ports you have them connected to are already labeled for cloudcephosd1016 but I see that the server is not connected to the switch and also listed as decommission in Netbox (unsure about that status as well). I am confused about what's going on with this switch and available ports. Can you let me know which ports are available?

Tue, May 11, 3:32 PM · SRE, ops-eqiad, DC-Ops
Cmjohnson added a comment to T276922: cloudvirt1038: PCIe error.

Dell is supposed to be here today to replace several more parts. We will see how it goes

Tue, May 11, 2:57 PM · SRE, ops-eqiad, DC-Ops, cloud-services-team (Hardware)
Cmjohnson added a comment to T278934: Add eqiad airport express to Netbox.

Thanks @ayounsi, could you add a device type to the list? That is a Netbox requirement for me to save.

Tue, May 11, 2:54 PM · SRE, ops-eqiad, DC-Ops
Cmjohnson closed T275759: decommission scb100[1-4].eqiad.wmnet as Resolved.

All four are removed from the rack and decom'd in Netbox.

Tue, May 11, 2:48 PM · SRE, ops-eqiad, serviceops
Cmjohnson closed T280110: decommission bast1002.wikimedia.org as Resolved.
Tue, May 11, 2:47 PM · SRE, ops-eqiad, decommission-hardware
Cmjohnson updated the task description for T280110: decommission bast1002.wikimedia.org.
Tue, May 11, 2:46 PM · SRE, ops-eqiad, decommission-hardware
Cmjohnson closed T274752: decommission db1076.eqiad.wmnet, a subtask of T258361: Productionize db1155-db1175 and refresh and decommission db1074-db1095 (22 servers), as Resolved.
Tue, May 11, 2:46 PM · Patch-For-Review, SRE, DBA
Cmjohnson closed T274752: decommission db1076.eqiad.wmnet as Resolved.
Tue, May 11, 2:46 PM · SRE, ops-eqiad, DC-Ops, Patch-For-Review, decommission-hardware
Cmjohnson updated the task description for T274752: decommission db1076.eqiad.wmnet.
Tue, May 11, 2:46 PM · SRE, ops-eqiad, DC-Ops, Patch-For-Review, decommission-hardware
Cmjohnson closed T278229: decommission db1086.eqiad.wmnet, a subtask of T258361: Productionize db1155-db1175 and refresh and decommission db1074-db1095 (22 servers), as Resolved.
Tue, May 11, 2:46 PM · Patch-For-Review, SRE, DBA
Cmjohnson closed T278229: decommission db1086.eqiad.wmnet as Resolved.
Tue, May 11, 2:46 PM · SRE, DC-Ops, ops-eqiad, decommission-hardware
Cmjohnson closed T278229: decommission db1086.eqiad.wmnet, a subtask of T275633: Productionize db21[45-52] and db11[76-84], as Resolved.
Tue, May 11, 2:46 PM · Data-Persistence-Backup, Patch-For-Review, DBA
Cmjohnson updated the task description for T278229: decommission db1086.eqiad.wmnet.
Tue, May 11, 2:46 PM · SRE, DC-Ops, ops-eqiad, decommission-hardware
Cmjohnson closed T281075: decommission db1077.eqiad.wmnet, a subtask of T258361: Productionize db1155-db1175 and refresh and decommission db1074-db1095 (22 servers), as Resolved.
Tue, May 11, 2:45 PM · Patch-For-Review, SRE, DBA
Cmjohnson closed T281075: decommission db1077.eqiad.wmnet as Resolved.
Tue, May 11, 2:45 PM · SRE, DC-Ops, ops-eqiad, decommission-hardware
Cmjohnson updated the task description for T281075: decommission db1077.eqiad.wmnet.
Tue, May 11, 2:45 PM · SRE, DC-Ops, ops-eqiad, decommission-hardware
Cmjohnson closed T280121: decommission db1080.eqiad.mnet, a subtask of T276448: Failover m1 master: db1080 -> db1159 Wed 14th April at 10 AM UTC, as Resolved.
Tue, May 11, 2:06 PM · DBA
Cmjohnson closed T280121: decommission db1080.eqiad.mnet as Resolved.
Tue, May 11, 2:06 PM · SRE, ops-eqiad, DC-Ops, decommission-hardware
Cmjohnson updated the task description for T280121: decommission db1080.eqiad.mnet.
Tue, May 11, 2:04 PM · SRE, ops-eqiad, DC-Ops, decommission-hardware
Cmjohnson closed T281794: decommission db1082.eqiad.wmnet, a subtask of T258361: Productionize db1155-db1175 and refresh and decommission db1074-db1095 (22 servers), as Resolved.
Tue, May 11, 2:03 PM · Patch-For-Review, SRE, DBA
Cmjohnson closed T281794: decommission db1082.eqiad.wmnet, a subtask of T280492: Upgrade all sanitarium masters to 10.4 and Buster, as Resolved.
Tue, May 11, 2:03 PM · Data-Persistence-Backup, Patch-For-Review, DBA
Cmjohnson closed T281794: decommission db1082.eqiad.wmnet as Resolved.
Tue, May 11, 2:03 PM · SRE, ops-eqiad, DC-Ops, decommission-hardware
Cmjohnson updated the task description for T281794: decommission db1082.eqiad.wmnet.
Tue, May 11, 2:03 PM · SRE, ops-eqiad, DC-Ops, decommission-hardware
Cmjohnson added a comment to T278934: Add eqiad airport express to Netbox.

@wiki_willy I can add to netbox but how do I classify it? I could add as an access switch but Apple is not a manufacturer listed in our devices. Please advise.

Tue, May 11, 1:59 PM · SRE, ops-eqiad, DC-Ops
Cmjohnson closed T278185: hw troubleshooting: IPMI sensor critical for elastic1042.eqiad.wmnet as Resolved.

It appears the power cable was loose and not properly seated; I pushed it back in and the LED lit up. These servers are over 5 years old.

Tue, May 11, 1:57 PM · SRE, ops-eqiad, Discovery-Search (Current work), DC-Ops

Thu, May 6

Cmjohnson moved T278185: hw troubleshooting: IPMI sensor critical for elastic1042.eqiad.wmnet from Backlog to Hardware Failure / Troubleshoot on the ops-eqiad board.
Thu, May 6, 5:22 PM · SRE, ops-eqiad, Discovery-Search (Current work), DC-Ops
Cmjohnson moved T281881: hw troubleshooting: server hardlocking for cloudmetrics1002.eqiad.wmnet from Backlog to Hardware Failure / Troubleshoot on the ops-eqiad board.
Thu, May 6, 5:22 PM · SRE, cloud-services-team (Hardware), ops-eqiad, DC-Ops
Cmjohnson moved T282054: (Need By: TBD) rack/setup/install frdev1002 from Backlog to Racking Tasks on the ops-eqiad board.
Thu, May 6, 5:21 PM · SRE, ops-eqiad, DC-Ops

Mon, Apr 26

Cmjohnson added a comment to T276922: cloudvirt1038: PCIe error.

Dell sent an email with a list of things they want done. Considering that they've had 2 technicians out to fix the issue with zero resolution, I replied that they will need to send one of their technicians out to perform these tasks. I do not feel it is wise to start poking around myself in case we need to RMA this server.

Mon, Apr 26, 3:20 PM · SRE, ops-eqiad, DC-Ops, cloud-services-team (Hardware)

Fri, Apr 23

Cmjohnson reassigned T272509: (Need By: 2021-03-31) rack/setup/install snapshot101[1-5] from Cmjohnson to RobH.

Assigning to @RobH for installs

Fri, Apr 23, 5:07 PM · SRE, Dumps-Generation, ops-eqiad, DC-Ops
Cmjohnson updated the task description for T272509: (Need By: 2021-03-31) rack/setup/install snapshot101[1-5].
Fri, Apr 23, 5:06 PM · SRE, Dumps-Generation, ops-eqiad, DC-Ops
Cmjohnson reassigned T276644: (Need By: 2021-04-30) rack/setup/install wcqs100[123] from Cmjohnson to RobH.

Assigning this to @RobH to complete install

Fri, Apr 23, 5:03 PM · SRE, Discovery-Search (Current work), ops-eqiad, DC-Ops
Cmjohnson updated the task description for T276644: (Need By: 2021-04-30) rack/setup/install wcqs100[123].
Fri, Apr 23, 5:02 PM · SRE, Discovery-Search (Current work), ops-eqiad, DC-Ops
Cmjohnson closed T280618: htmldumper1001 power suply failure as Resolved.

Loose power cable

Fri, Apr 23, 4:23 PM · SRE, ops-eqiad
Cmjohnson moved T280618: htmldumper1001 power suply failure from Backlog to Hardware Failure / Troubleshoot on the ops-eqiad board.
Fri, Apr 23, 4:22 PM · SRE, ops-eqiad
Cmjohnson moved T280668: Degraded RAID on cloudvirt1018 from Backlog to Hardware Failure / Troubleshoot on the ops-eqiad board.
Fri, Apr 23, 4:22 PM · SRE, ops-eqiad
Cmjohnson moved T280961: Degraded RAID on ms-be1019 from Backlog to Hardware Failure / Troubleshoot on the ops-eqiad board.
Fri, Apr 23, 4:22 PM · SRE, ops-eqiad
Cmjohnson closed T280623: Can't access thanos-fe1001.mgmt as Resolved.
Fri, Apr 23, 3:57 PM · SRE, ops-eqiad
Cmjohnson added a comment to T280623: Can't access thanos-fe1001.mgmt.

Password was incorrect, fixed

Fri, Apr 23, 3:57 PM · SRE, ops-eqiad
Cmjohnson closed T280132: Degraded RAID on an-worker1100 as Resolved.

The disk has been swapped, I am resolving this task because the on-site work has been completed.

Fri, Apr 23, 3:52 PM · SRE, ops-eqiad
Cmjohnson closed T278729: cp1087 powercycled as Resolved.

Replaced cpu1 and cleared the idrac log. Resolving; if the issue returns, please re-open.

Fri, Apr 23, 3:44 PM · ops-eqiad, SRE, Traffic

Thu, Apr 15

Cmjohnson reassigned T276644: (Need By: 2021-04-30) rack/setup/install wcqs100[123] from Cmjohnson to Jclark-ctr.

@Jclark-ctr netbox script ran for wcqs1001 and 1002. I'm not sure why 1003 is in C4, that's a 10G rack. If it is, can you please move it to a standard rack?

Thu, Apr 15, 7:40 PM · SRE, Discovery-Search (Current work), ops-eqiad, DC-Ops
Cmjohnson reassigned T272403: (Need By: 2021-03-31) rack/setup/install cloudgw100[12].eqiad.wmnet from Cmjohnson to RobH.

@RobH the 2nd interface was added to these, can you try the install again please.

Thu, Apr 15, 6:51 PM · SRE, cloud-services-team (Hardware), ops-eqiad, DC-Ops
Cmjohnson added a comment to T274925: (Need By: TBD) rack/setup/install mc10[37-54].eqiad.wmnet.

mc1041-50: netbox and network port updates have been completed; I need to go on-site and set up idrac.

Thu, Apr 15, 5:29 PM · User-jijiki, SRE, serviceops, ops-eqiad, DC-Ops
Cmjohnson updated the task description for T274925: (Need By: TBD) rack/setup/install mc10[37-54].eqiad.wmnet.
Thu, Apr 15, 5:28 PM · User-jijiki, SRE, serviceops, ops-eqiad, DC-Ops
Cmjohnson updated the task description for T274925: (Need By: TBD) rack/setup/install mc10[37-54].eqiad.wmnet.
Thu, Apr 15, 5:27 PM · User-jijiki, SRE, serviceops, ops-eqiad, DC-Ops
Cmjohnson added a comment to T279721: Dc-Ops Commands for Cumin.

I need to be able to login to servers and run megacli commands as well as cat /proc/

Thu, Apr 15, 1:14 PM · ops-codfw, ops-eqiad, SRE, DC-Ops
Cmjohnson moved T274752: decommission db1076.eqiad.wmnet from Backlog to Decommission on the ops-eqiad board.
Thu, Apr 15, 1:13 PM · SRE, ops-eqiad, DC-Ops, Patch-For-Review, decommission-hardware
Cmjohnson moved T280110: decommission bast1002.wikimedia.org from Backlog to Decommission on the ops-eqiad board.
Thu, Apr 15, 1:13 PM · SRE, ops-eqiad, decommission-hardware
Cmjohnson moved T280132: Degraded RAID on an-worker1100 from Backlog to Hardware Failure / Troubleshoot on the ops-eqiad board.
Thu, Apr 15, 1:13 PM · SRE, ops-eqiad
Cmjohnson added a comment to T280132: Degraded RAID on an-worker1100.

ticket opened with Dell! You have successfully submitted request SR1057103007.

Thu, Apr 15, 1:12 PM · SRE, ops-eqiad
Cmjohnson moved T278934: Add eqiad airport express to Netbox from Backlog to Hardware Failure / Troubleshoot on the ops-eqiad board.
Thu, Apr 15, 1:00 PM · SRE, ops-eqiad, DC-Ops
Cmjohnson added a comment to T276922: cloudvirt1038: PCIe error.

Dell has us on a wild goose chase. Responded to their questions with the following:

Thu, Apr 15, 1:00 PM · SRE, ops-eqiad, DC-Ops, cloud-services-team (Hardware)

Apr 13 2021

Cmjohnson reassigned T274945: (Need By: TBD) rack/setup/install cloudcephosd10[16-20].eqiad.wmnet from Cmjohnson to RobH.

@RobH all the secondary ports are updated and added to the private vlan per the instructions above. Feel free to do the installs whenever you have a moment.

Apr 13 2021, 3:52 PM · SRE, cloud-services-team (Hardware), ops-eqiad, DC-Ops

Apr 12 2021

Cmjohnson added a comment to T276922: cloudvirt1038: PCIe error.

Another Dell tech arrived today with what was believed to be the replacement part. The part was replaced and the error persisted. Several reboots and TSR reports later, we do not know what is going on. At one point, reseating the CMOS battery worked, but then the PCI error returned on the next reboot. The Dell technician is still on-site attempting to troubleshoot with Dell tech support now. Once something has been decided I will update the task.

Apr 12 2021, 6:13 PM · SRE, ops-eqiad, DC-Ops, cloud-services-team (Hardware)

Apr 9 2021

Cmjohnson added a comment to T276922: cloudvirt1038: PCIe error.

The Dell tech came out and replaced the motherboard, but that did not fix the issue. It turns out that there is a bad cable to the backplane. A new part has been ordered.

Apr 9 2021, 4:06 PM · SRE, ops-eqiad, DC-Ops, cloud-services-team (Hardware)

Apr 8 2021

Cmjohnson moved T278729: cp1087 powercycled from High Priority Task to Hardware Failure / Troubleshoot on the ops-eqiad board.
Apr 8 2021, 4:55 PM · ops-eqiad, SRE, Traffic
Cmjohnson added a comment to T279160: Netbox Duplicate Cable Lables.

Fixed; the report has zero errors.

Apr 8 2021, 4:51 PM · SRE, ops-eqiad, DC-Ops
Cmjohnson closed T279160: Netbox Duplicate Cable Lables as Resolved.
Apr 8 2021, 4:51 PM · SRE, ops-eqiad, DC-Ops
Cmjohnson added a comment to T278729: cp1087 powercycled.

Updated the BIOS and submitted a Dell ticket: "You have successfully submitted request SR1056516502."

Apr 8 2021, 4:34 PM · ops-eqiad, SRE, Traffic
Cmjohnson added a comment to T272403: (Need By: 2021-03-31) rack/setup/install cloudgw100[12].eqiad.wmnet.

@aborrero The 2nd interfaces are
cloudgw1001 cloudsw1-c8 xe-0/0/19 cable id 5321

Apr 8 2021, 4:28 PM · SRE, cloud-services-team (Hardware), ops-eqiad, DC-Ops
Cmjohnson updated subscribers of T274945: (Need By: TBD) rack/setup/install cloudcephosd10[16-20].eqiad.wmnet.

These are all connected, but the 2nd interfaces are not set up. It seems that we're all confused on how to do this, so I didn't do anything. Maybe @Papaul can let us know if the 2nd interface is automated or requires a manual setup.

Apr 8 2021, 4:22 PM · SRE, cloud-services-team (Hardware), ops-eqiad, DC-Ops
Cmjohnson updated the task description for T274945: (Need By: TBD) rack/setup/install cloudcephosd10[16-20].eqiad.wmnet.
Apr 8 2021, 2:10 PM · SRE, cloud-services-team (Hardware), ops-eqiad, DC-Ops

Apr 7 2021

Cmjohnson added a comment to T279475: Icinga/MegaRAID alert on an-worker1100.

@elukey that's a first! Maybe the raid bios settings are wrong?

Apr 7 2021, 2:11 PM · SRE, ops-eqiad, Analytics-Clusters

Apr 5 2021

Dzahn awarded T275759: decommission scb100[1-4].eqiad.wmnet a Love token.
Apr 5 2021, 5:28 PM · SRE, ops-eqiad, serviceops

Apr 1 2021

Cmjohnson reassigned T275081: (Need By: TBD) rack/setup/install cloudvirt104[0-6].eqiad.wmnet from Cmjohnson to RobH.

@RobH assigning this to you, 1040-1045 are ready for installs. I set up both ports in netbox. Since we're waiting on a new nic card for 1046 please reassign to John after (assuming no issues that need me to get involved). Thanks!

Apr 1 2021, 5:39 PM · Patch-For-Review, SRE, cloud-services-team (Hardware), ops-eqiad, DC-Ops
Cmjohnson updated the task description for T275081: (Need By: TBD) rack/setup/install cloudvirt104[0-6].eqiad.wmnet.
Apr 1 2021, 4:37 PM · Patch-For-Review, SRE, cloud-services-team (Hardware), ops-eqiad, DC-Ops
Cmjohnson added a comment to T276922: cloudvirt1038: PCIe error.

Dell Ticket Created

Apr 1 2021, 4:27 PM · SRE, ops-eqiad, DC-Ops, cloud-services-team (Hardware)
Cmjohnson added a comment to T276239: Try to move some new analytics worker nodes to different racks.

@wiki_willy That will work! Thanks

Apr 1 2021, 4:20 PM · Analytics-Radar, SRE, ops-eqiad
Cmjohnson added a comment to T276239: Try to move some new analytics worker nodes to different racks.

@elukey I have not forgotten about this, A7 is a rack for the possible move but we are already maxing out our power utilization in that rack and adding another R740XD is probably not a good idea.

Apr 1 2021, 4:08 PM · Analytics-Radar, SRE, ops-eqiad
Cmjohnson moved T276239: Try to move some new analytics worker nodes to different racks from Backlog to High Priority Task on the ops-eqiad board.
Apr 1 2021, 4:06 PM · Analytics-Radar, SRE, ops-eqiad
Cmjohnson moved T278729: cp1087 powercycled from Backlog to High Priority Task on the ops-eqiad board.
Apr 1 2021, 4:06 PM · ops-eqiad, SRE, Traffic
Cmjohnson added a comment to T278729: cp1087 powercycled.

Looks like a possible DIMM error, since the server is already depooled I will run a couple of tests to determine if it's a DIMM, CPU or motherboard issue.

Apr 1 2021, 4:06 PM · ops-eqiad, SRE, Traffic
Cmjohnson closed T278630: elastic1060 reported errors in getsel as Resolved.

The DIMM reported the error only that one day, and the error has not returned. I am clearing the system log and resolving this for now; if the issue persists, please re-open.

Apr 1 2021, 4:05 PM · Discovery-Search (Current work), SRE, ops-eqiad, Discovery
Cmjohnson closed T278726: Eqiad: Ports with no description on cloudsw1-d5-eqiad as Resolved.

There are new servers installed in this rack, most of these are the new cloudvirts. The servers will be racked and connected and then netbox gets updated so there is a lag between these things happening.

Apr 1 2021, 4:02 PM · cloud-services-team (Kanban), SRE, ops-eqiad
Cmjohnson closed T277171: decommission frqueue1001.frack.eqiad.wmnet as Resolved.
Apr 1 2021, 4:00 PM · ops-eqiad, SRE, decommission-hardware
Cmjohnson updated the task description for T277171: decommission frqueue1001.frack.eqiad.wmnet.
Apr 1 2021, 4:00 PM · ops-eqiad, SRE, decommission-hardware
Cmjohnson closed T276302: decommission db1084.eqiad.wmnet, a subtask of T258361: Productionize db1155-db1175 and refresh and decommission db1074-db1095 (22 servers), as Resolved.
Apr 1 2021, 3:59 PM · Patch-For-Review, SRE, DBA
Cmjohnson closed T276302: decommission db1084.eqiad.wmnet as Resolved.
Apr 1 2021, 3:59 PM · SRE, DC-Ops, ops-eqiad, decommission-hardware
Cmjohnson updated the task description for T276302: decommission db1084.eqiad.wmnet.
Apr 1 2021, 3:59 PM · SRE, DC-Ops, ops-eqiad, decommission-hardware
Cmjohnson added a comment to T272403: (Need By: 2021-03-31) rack/setup/install cloudgw100[12].eqiad.wmnet.

@elukey that was my fault, I left it in its BIOS settings when I left yesterday. I rebooted and it's back.

Apr 1 2021, 3:57 PM · SRE, cloud-services-team (Hardware), ops-eqiad, DC-Ops
Cmjohnson reassigned T275511: (Need By: TBD) rack/setup/install moss-fe100[12].eqiad.wmnet from Cmjohnson to RobH.

@RobH these are ready for installs when you have the time.

Apr 1 2021, 3:55 PM · SRE, ops-eqiad, DC-Ops
Cmjohnson updated the task description for T275511: (Need By: TBD) rack/setup/install moss-fe100[12].eqiad.wmnet.
Apr 1 2021, 3:55 PM · SRE, ops-eqiad, DC-Ops
Cmjohnson reassigned T276396: (Need By: TBD) rack/setup/install bast1003.wikimedia.org from Cmjohnson to RobH.

@RobH this is ready for install when you have time.

Apr 1 2021, 2:50 PM · SRE, ops-eqiad, DC-Ops
Cmjohnson updated the task description for T276396: (Need By: TBD) rack/setup/install bast1003.wikimedia.org.
Apr 1 2021, 2:49 PM · SRE, ops-eqiad, DC-Ops
Cmjohnson assigned T278250: (Need By: TBD) install second SSD into payments100[5-8] to Jclark-ctr.
Apr 1 2021, 2:48 PM · SRE, ops-eqiad, fundraising-tech-ops, DC-Ops
Cmjohnson added a comment to T278250: (Need By: TBD) install second SSD into payments100[5-8].

@Jgreen The disks have been installed, feel free to do the install. @Jclark-ctr I left the packing slip in the box, can you receive and do the coupa thing.

Apr 1 2021, 2:48 PM · SRE, ops-eqiad, fundraising-tech-ops, DC-Ops
Cmjohnson updated the task description for T278250: (Need By: TBD) install second SSD into payments100[5-8].
Apr 1 2021, 2:47 PM · SRE, ops-eqiad, fundraising-tech-ops, DC-Ops

Mar 23 2021

Cmjohnson added a comment to T272403: (Need By: 2021-03-31) rack/setup/install cloudgw100[12].eqiad.wmnet.

Fixed the primary port for cloudgw1001

Mar 23 2021, 4:43 PM · SRE, cloud-services-team (Hardware), ops-eqiad, DC-Ops
Cmjohnson closed T278226: Degraded RAID on db1086 as Resolved.

Disk replaced with a disk from decom'd db host

Mar 23 2021, 4:41 PM · DBA, SRE, ops-eqiad

Mar 19 2021

Cmjohnson added a comment to T276239: Try to move some new analytics worker nodes to different racks.

@elukey can I move the 2 servers anytime or does this need to be scheduled?

Mar 19 2021, 5:16 PM · Analytics-Radar, SRE, ops-eqiad

Mar 18 2021

Cmjohnson reassigned T218751: Audit down ports from Cmjohnson to ayounsi.

@ayounsi I verified all of the ports listed in https://librenms.wikimedia.org/ports/state=down/hostname=asw/format=list_basic/ are not in service at the moment. There were 2 in fundraising that I already took care of, please disable these ports. Thanks!

Mar 18 2021, 6:17 PM · DC-Ops, SRE, ops-eqiad
Cmjohnson moved T268750: eqiad: add VC-links IDs to Netbox from Hardware Failure / Troubleshoot to Lower Priority Items on the ops-eqiad board.
Mar 18 2021, 6:00 PM · SRE, ops-eqiad
Cmjohnson moved T259758: (Need By: Q2) eqiad: Upgrades of Management Switches from Backlog to Lower Priority Items on the ops-eqiad board.
Mar 18 2021, 5:58 PM · SRE, DC-Ops, ops-eqiad
Cmjohnson moved T218751: Audit down ports from Hardware Failure / Troubleshoot to Lower Priority Items on the ops-eqiad board.
Mar 18 2021, 5:58 PM · DC-Ops, SRE, ops-eqiad
Cmjohnson moved T175876: document all scs connections from Hardware Failure / Troubleshoot to Lower Priority Items on the ops-eqiad board.
Mar 18 2021, 5:58 PM · ops-eqiad, DC-Ops, SRE