Cmjohnson (cmjohnson)
User

Projects (11)

User Details

User Since
Dec 16 2014, 10:22 PM (213 w, 6 d)
Availability
Available
IRC Nick
cmjohnson1
LDAP User
Cmjohnson
MediaWiki User
Unknown

Recent Activity

Sun, Jan 20

Eevans awarded T212418: Memory error on restbase1016 a Cookie token.
Sun, Jan 20, 7:06 PM · Patch-For-Review, Core Platform Team Backlog (Watching / External), Services (watching), RESTBase-Cassandra, RESTBase, Operations, ops-eqiad

Fri, Jan 18

Cmjohnson added a comment to T213748: swap a2-eqiad PDU with on-site spare.

Correct, it was only the fuse

Fri, Jan 18, 4:12 PM · Patch-For-Review, DBA, Analytics, ops-eqiad, Operations

Thu, Jan 17

Cmjohnson closed Unknown Object (Task), a subtask of T213157: Increase utilization of application logging pipeline (FY2018-2019 Q3 TEC6), as Resolved.
Thu, Jan 17, 10:31 PM · User-fgiunchedi, User-herron, Operations, Wikimedia-Logstash

Wed, Jan 16

Cmjohnson added a comment to T209815: Upgrade firmware on db1078.

I ran the Service Pack on db1078; all firmware is up to date, including the BIOS and RAID controller. The server is currently powered off.

Wed, Jan 16, 3:12 PM · Patch-For-Review, DBA, ops-eqiad, Operations
Cmjohnson moved T213128: Replace eqiad mgmt switches with EX4200s from Backlog to Up next on the ops-eqiad board.
Wed, Jan 16, 2:53 PM · ops-eqiad, Operations, netops
Cmjohnson moved T209815: Upgrade firmware on db1078 from Backlog to Being worked on on the ops-eqiad board.
Wed, Jan 16, 2:53 PM · Patch-For-Review, DBA, ops-eqiad, Operations
Cmjohnson moved T211998: Decommission broken asw-c2-eqiad from Backlog to Decommission on the ops-eqiad board.
Wed, Jan 16, 2:53 PM · Operations, ops-eqiad
Cmjohnson moved T212348: Move servers off asw2-a5-eqiad from Backlog to Up next on the ops-eqiad board.
Wed, Jan 16, 2:53 PM · ops-eqiad, netops, Operations
Cmjohnson moved T204479: Heating alerts and broken RAM on kafka1014 from Being worked on to Up next on the ops-eqiad board.
Wed, Jan 16, 2:53 PM · User-Elukey, Operations, ops-eqiad
Cmjohnson moved T212418: Memory error on restbase1016 from Backlog to Being worked on on the ops-eqiad board.
Wed, Jan 16, 2:52 PM · Patch-For-Review, Core Platform Team Backlog (Watching / External), Services (watching), RESTBase-Cassandra, RESTBase, Operations, ops-eqiad
Cmjohnson moved T212861: Rack A2's hosts alarm for PSU broken from Backlog to Being worked on on the ops-eqiad board.
Wed, Jan 16, 2:52 PM · Analytics, ops-eqiad, Operations
Cmjohnson moved T196726: db1115 (tendril DB) had OOM for some processes and some hw (memory) issues from Backlog to Being worked on on the ops-eqiad board.
Wed, Jan 16, 2:52 PM · Operations, ops-eqiad, DBA
Cmjohnson moved T213859: eqiad: rack a3 pdu swap / failure / replacement from Backlog to Being worked on on the ops-eqiad board.
Wed, Jan 16, 2:52 PM · Patch-For-Review, ops-eqiad, Operations
Cmjohnson moved T213748: swap a2-eqiad PDU with on-site spare from Backlog to Being worked on on the ops-eqiad board.
Wed, Jan 16, 2:52 PM · Patch-For-Review, DBA, Analytics, ops-eqiad, Operations
Cmjohnson added a comment to T213422: es1019 IPMI and its management interface are unresponsive (again).

While the server was down I updated the BIOS, RAID firmware, and hardware
firmware to the latest versions.

Wed, Jan 16, 12:00 PM · Patch-For-Review, Operations, ops-eqiad

Tue, Jan 15

Cmjohnson added a comment to T212418: Memory error on restbase1016.

@Eevans The error has not returned, I cannot say with 100% certainty that it will not return but for now please take the server back and do what you need. All the cables are plugged back in and the server is off. I will leave this open for a few days, lmk if the error returns.

Tue, Jan 15, 5:18 PM · Patch-For-Review, Core Platform Team Backlog (Watching / External), Services (watching), RESTBase-Cassandra, RESTBase, Operations, ops-eqiad
Cmjohnson added a comment to T196726: db1115 (tendril DB) had OOM for some processes and some hw (memory) issues.

I swapped the DIMM from A2 to B2 and cleared the log. Please put it back in the rotation and let's see if and where the error occurs.

Tue, Jan 15, 5:06 PM · Operations, ops-eqiad, DBA
Cmjohnson added a comment to T196726: db1115 (tendril DB) had OOM for some processes and some hw (memory) issues.

racadm SEL

Tue, Jan 15, 5:01 PM · Operations, ops-eqiad, DBA

Mon, Jan 14

Cmjohnson added a comment to T212418: Memory error on restbase1016.

The log remains clear and no errors have returned. I will give it another 24 hours and if there is no change then it can go back into service.

Mon, Jan 14, 5:01 PM · Patch-For-Review, Core Platform Team Backlog (Watching / External), Services (watching), RESTBase-Cassandra, RESTBase, Operations, ops-eqiad
Cmjohnson added a comment to T196726: db1115 (tendril DB) had OOM for some processes and some hw (memory) issues.

@Marostegui I have to move the DIMM to another slot and see whether the error corrects itself, moves with the DIMM, or remains the same. Can you take this server down and out of rotation Tuesday? I can do this around the same time as es1019.

Mon, Jan 14, 5:00 PM · Operations, ops-eqiad, DBA

Thu, Jan 10

Cmjohnson added a comment to T212418: Memory error on restbase1016.

I ended up leaving the production cables disconnected.

Thu, Jan 10, 9:54 PM · Patch-For-Review, Core Platform Team Backlog (Watching / External), Services (watching), RESTBase-Cassandra, RESTBase, Operations, ops-eqiad
Cmjohnson updated the task description for T207258: rack/setup/install pc1007-pc1010.
Thu, Jan 10, 6:08 PM · Patch-For-Review, Operations, ops-eqiad, DBA
Cmjohnson added a comment to T207258: rack/setup/install pc1007-pc1010.

return shipping info for parts

Thu, Jan 10, 6:08 PM · Patch-For-Review, Operations, ops-eqiad, DBA
Cmjohnson added a comment to T212418: Memory error on restbase1016.

While the server is offline I took this opportunity to update the f/w on the bios and idrac.

Thu, Jan 10, 6:03 PM · Patch-For-Review, Core Platform Team Backlog (Watching / External), Services (watching), RESTBase-Cassandra, RESTBase, Operations, ops-eqiad
Cmjohnson added a comment to T207258: rack/setup/install pc1007-pc1010.

*update not ready for install. I set the wrong raid. I am updating the driver now and will fix to raid 5 once the update is complete. @Marostegui odd...may have something to do with the f/w update in progress

Thu, Jan 10, 5:19 PM · Patch-For-Review, Operations, ops-eqiad, DBA
Cmjohnson updated the task description for T207258: rack/setup/install pc1007-pc1010.
Thu, Jan 10, 5:18 PM · Patch-For-Review, Operations, ops-eqiad, DBA
Cmjohnson reassigned T207258: rack/setup/install pc1007-pc1010 from Cmjohnson to RobH.

@RobH @Marostegui I went through the very long and painful Dell troubleshooting and it's one of those cases where it actually worked. The server is ready to install.

Thu, Jan 10, 5:13 PM · Patch-For-Review, Operations, ops-eqiad, DBA
Cmjohnson added a comment to T213422: es1019 IPMI and its management interface are unresponsive (again).

@jcrespo Sure...Tuesday works

Thu, Jan 10, 4:21 PM · Patch-For-Review, Operations, ops-eqiad
Cmjohnson added a comment to T212418: Memory error on restbase1016.

@Eevans I am going to have to power it back on and let it go for a few days to see if the error returns, will that present an issue for you?

Thu, Jan 10, 4:09 PM · Patch-For-Review, Core Platform Team Backlog (Watching / External), Services (watching), RESTBase-Cassandra, RESTBase, Operations, ops-eqiad
Cmjohnson added a comment to T212418: Memory error on restbase1016.

I need to move DIMM around and do standard troubleshooting. Is this server able to be powered off and down in icinga?

Thu, Jan 10, 3:31 PM · Patch-For-Review, Core Platform Team Backlog (Watching / External), Services (watching), RESTBase-Cassandra, RESTBase, Operations, ops-eqiad
Cmjohnson added a comment to T212418: Memory error on restbase1016.

Record: 4
Date/Time: 11/17/2017 19:18:35
Source: system
Severity: Non-Critical

Description: Correctable memory error rate exceeded for DIMM_A1.

Record: 5
Date/Time: 11/17/2017 19:22:08
Source: system
Severity: Critical

Description: Correctable memory error rate exceeded for DIMM_A1.

Record: 6
Date/Time: 02/13/2018 22:08:17
Source: system
Severity: Non-Critical

Description: Correctable memory error rate exceeded for DIMM_A2.

Record: 7
Date/Time: 02/14/2018 12:26:34
Source: system
Severity: Critical

Description: Correctable memory error rate exceeded for DIMM_A2.

Record: 8
Date/Time: 12/20/2018 12:12:05
Source: system
Severity: Ok

Description: A problem was detected in Memory Reference Code (MRC).

Record: 9
Date/Time: 12/20/2018 12:12:05
Source: system
Severity: Critical
Description: Multi-bit memory errors detected on a memory device at location(s) DIMM_A2.

Thu, Jan 10, 3:31 PM · Patch-For-Review, Core Platform Team Backlog (Watching / External), Services (watching), RESTBase-Cassandra, RESTBase, Operations, ops-eqiad
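
The SEL records quoted in the comment above follow a simple "Key: value" layout, so they can be tallied per DIMM with a short script. The following is only a minimal sketch, assuming the log text has already been saved as a string in exactly the format pasted in the task; the names `parse_sel` and `count_by_dimm` are hypothetical helpers and not part of racadm or any Dell tooling.

```python
import re
from collections import Counter

# Sample text copied from the records pasted in the task comment.
SAMPLE = """\
Record: 4
Date/Time: 11/17/2017 19:18:35
Source: system
Severity: Non-Critical

Description: Correctable memory error rate exceeded for DIMM_A1.

Record: 5
Date/Time: 11/17/2017 19:22:08
Source: system
Severity: Critical

Description: Correctable memory error rate exceeded for DIMM_A1.

Record: 6
Date/Time: 02/13/2018 22:08:17
Source: system
Severity: Non-Critical

Description: Correctable memory error rate exceeded for DIMM_A2.

Record: 7
Date/Time: 02/14/2018 12:26:34
Source: system
Severity: Critical

Description: Correctable memory error rate exceeded for DIMM_A2.

Record: 8
Date/Time: 12/20/2018 12:12:05
Source: system
Severity: Ok

Description: A problem was detected in Memory Reference Code (MRC).

Record: 9
Date/Time: 12/20/2018 12:12:05
Source: system
Severity: Critical
Description: Multi-bit memory errors detected on a memory device at location(s) DIMM_A2.
"""


def parse_sel(text):
    """Split the pasted log into one dict per record (hypothetical helper)."""
    records, current = [], {}
    for line in text.splitlines():
        # Split on the first colon only, so timestamps stay intact.
        key, sep, value = line.strip().partition(":")
        if not sep:
            continue  # blank line or a line without a "Key: value" pair
        if key == "Record" and current:
            records.append(current)
            current = {}
        current[key.strip()] = value.strip()
    if current:
        records.append(current)
    return records


def count_by_dimm(records):
    """Count how many records mention each DIMM slot in their description."""
    counts = Counter()
    for rec in records:
        for dimm in re.findall(r"DIMM_[A-Z]\d+", rec.get("Description", "")):
            counts[dimm] += 1
    return counts


records = parse_sel(SAMPLE)
for dimm, n in sorted(count_by_dimm(records).items()):
    print(dimm, n)
```

Grouping the entries this way makes it easier to see which slot the errors cluster on before deciding which DIMM to move.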
Cmjohnson added a comment to T213422: es1019 IPMI and its management interface are unresponsive (again).

I need to power this off and unplug it for 10-20 secs. LMK if I can do
that today

Thu, Jan 10, 3:30 PM · Patch-For-Review, Operations, ops-eqiad

Mon, Jan 7

Cmjohnson added a comment to T212791: Interface errors on cr1-eqiad:xe-3/3/1.

After the initial optics swap, the link was still not working.

Mon, Jan 7, 7:02 PM · Operations, ops-eqiad
Cmjohnson moved T212791: Interface errors on cr1-eqiad:xe-3/3/1 from Backlog to Being worked on on the ops-eqiad board.
Mon, Jan 7, 6:24 PM · Operations, ops-eqiad
Cmjohnson added a comment to T212791: Interface errors on cr1-eqiad:xe-3/3/1.

Replaced the optics @ayounsi please resolve once confirmed all is well.

Mon, Jan 7, 6:20 PM · Operations, ops-eqiad
Cmjohnson added a comment to T212348: Move servers off asw2-a5-eqiad.

I will need to create space in the 10G racks to make this work and some juggling will be required.

Mon, Jan 7, 6:15 PM · ops-eqiad, netops, Operations
Cmjohnson added a comment to T209029: cloudelastic1004: SMART/disk error.

a ticket has been opened with Dell

Mon, Jan 7, 5:52 PM · Operations, ops-eqiad, DC-Ops, cloud-services-team (Kanban)
Cmjohnson added a comment to T207258: rack/setup/install pc1007-pc1010.

@jcrespo An email was sent to Dell requesting a new board. I have not received a response

Mon, Jan 7, 5:38 PM · Patch-For-Review, Operations, ops-eqiad, DBA
Cmjohnson added a comment to T212010: Degraded RAID on sodium.

Sodium does not have any failed disks. One of the disks is listed as a hotspare.

Mon, Jan 7, 5:36 PM · ops-eqiad, Operations
Cmjohnson closed T210927: Update label and switch to rename labvirt1014 to cloudvirt1014 as Resolved.

Done

Mon, Jan 7, 5:26 PM · DC-Ops, Operations, ops-eqiad
Cmjohnson closed T210927: Update label and switch to rename labvirt1014 to cloudvirt1014, a subtask of T210904: Rename labvirt1014 to cloudvirt1014, move to eqiad1, as Resolved.
Mon, Jan 7, 5:26 PM · Patch-For-Review
Cmjohnson closed T212522: Update label and switch to rename labvirt1013 to cloudvirt1013, a subtask of T212513: Rename labvirt1013 to cloudvirt1013, move to eqiad1, as Resolved.
Mon, Jan 7, 5:22 PM · Patch-For-Review, cloud-services-team (Kanban)
Cmjohnson closed T212522: Update label and switch to rename labvirt1013 to cloudvirt1013 as Resolved.

updated

Mon, Jan 7, 5:22 PM · ops-eqiad, DC-Ops, Operations, cloud-services-team (Kanban)
Cmjohnson added a comment to T213038: Degraded RAID on analytics1054.

@elukey sorry, I replaced the disk and it is still showing failed. I don't know if the disk needs to be manually added back to the array?

Mon, Jan 7, 5:15 PM · Patch-For-Review, Analytics, User-Elukey, ops-eqiad, Operations
Cmjohnson closed T212556: frdb1001 RAID controller battery failure as Resolved.

This was resolved over the holiday break 12/27/2018

Mon, Jan 7, 4:53 PM · Operations, ops-eqiad
Cmjohnson moved T213038: Degraded RAID on analytics1054 from Backlog to Hardware Failure / Troubleshoot on the ops-eqiad board.
Mon, Jan 7, 4:51 PM · Patch-For-Review, Analytics, User-Elukey, ops-eqiad, Operations
Cmjohnson updated subscribers of T213038: Degraded RAID on analytics1054.

@elukey the disk still shows failed; do you have to manually add it back?

Mon, Jan 7, 4:51 PM · Patch-For-Review, Analytics, User-Elukey, ops-eqiad, Operations
Cmjohnson added a comment to T213038: Degraded RAID on analytics1054.

The disk at slot 1 is failed, the server is out of warranty but I do have a spare 4TB SATA.

Mon, Jan 7, 4:50 PM · Patch-For-Review, Analytics, User-Elukey, ops-eqiad, Operations
Cmjohnson moved T212990: Degraded RAID on helium from Backlog to Hardware Failure / Troubleshoot on the ops-eqiad board.
Mon, Jan 7, 4:30 PM · ops-eqiad, Operations
Cmjohnson added a comment to T212990: Degraded RAID on helium.

helium is out of warranty, I created a procurement task to purchase a replacement disk.

Mon, Jan 7, 4:30 PM · ops-eqiad, Operations
Cmjohnson added a subtask for T212990: Degraded RAID on helium: Unknown Object (Task).
Mon, Jan 7, 4:29 PM · ops-eqiad, Operations
Cmjohnson added a comment to T212861: Rack A2's hosts alarm for PSU broken.

I replaced the fuse on the wrong side initially and caused an outage. I then replaced the fuses on the correct phase and the power was not restored, I tried replacing them both a second time and still nothing. I am out of spare fuses, I do have a spare PDU but we are wanting to replace the PDU and there is an order request for a new set.

Mon, Jan 7, 4:15 PM · Analytics, ops-eqiad, Operations

Wed, Jan 2

Cmjohnson added a comment to T212624: wtp1028 unresponsive.

@herron @fgiunchedi I went to the data center on the 27th and powercycled the server. I thought I updated the task but I don't see my update.

Wed, Jan 2, 6:41 PM · Operations, ops-eqiad

Dec 20 2018

Cmjohnson added a comment to T207258: rack/setup/install pc1007-pc1010.

The tech came today to swap the pc1007 system board and the new (refurbed) board is bad again. This will require another call into Dell and will not be fixed until after the holiday break.

Dec 20 2018, 7:51 PM · Patch-For-Review, Operations, ops-eqiad, DBA

Dec 19 2018

Cmjohnson added a comment to T212305: restbase1011 fails to boot, ASSERT error lines.

@MoritzMuehlenhoff did a hard power cycle and the server came up clean, I've never seen the ASSERT messages. Typically if there is a h/w error on HP I will get a yellow notice and a prompt to review the log. I am not sure what you want to do with it at this time?

Dec 19 2018, 7:16 PM · Operations, ops-eqiad
Cmjohnson added a comment to T209616: rack/setup/install cloudvirt10[25-30].eqiad.wmnet.

@RobH also, the 2nd ethernet port was placed in cloud-virt-instance-trunk

Dec 19 2018, 7:11 PM · Patch-For-Review, Operations, cloud-services-team
Cmjohnson reassigned T209616: rack/setup/install cloudvirt10[25-30].eqiad.wmnet from Cmjohnson to RobH.
Dec 19 2018, 7:00 PM · Patch-For-Review, Operations, cloud-services-team
Cmjohnson added a comment to T209616: rack/setup/install cloudvirt10[25-30].eqiad.wmnet.

@RobH these are ready for installs. I added the MAC address and netboot.cfg. I did not merge the changes, please review.

Dec 19 2018, 7:00 PM · Patch-For-Review, Operations, cloud-services-team
Cmjohnson updated the task description for T209616: rack/setup/install cloudvirt10[25-30].eqiad.wmnet.
Dec 19 2018, 6:47 PM · Patch-For-Review, Operations, cloud-services-team
Cmjohnson closed T212185: Degraded RAID on db1072 as Resolved.

The disk is back

Dec 19 2018, 5:58 PM · DBA, ops-eqiad, Operations
Cmjohnson added a comment to T209616: rack/setup/install cloudvirt10[25-30].eqiad.wmnet.

I know that these say 10G but all 4 nics are standard rj45....granted 2 say 10G and 2 say 1G...kind of confusing. I plug the ethernet cable into the 1G ports which are the 3rd and 4th option in device settings.

Dec 19 2018, 4:49 PM · Patch-For-Review, Operations, cloud-services-team

Dec 17 2018

Cmjohnson added a comment to T209618: rack/setup/install ms-be10[44-50].eqiad.wmnet.

@RobH the server has been moved to d2/u31.

  • update netbox
  • remove from asw2-b-eqiad
  • update asw2-d-eqiad
Dec 17 2018, 5:39 PM · Patch-For-Review, User-fgiunchedi, media-storage, Operations

Dec 14 2018

Cmjohnson closed T206972: asw2-a-eqiad FPC7 faulty PEM0 as Resolved.

I received the new PEM from juniper ...resolving this task

Dec 14 2018, 5:41 PM · netops, Operations, ops-eqiad
Cmjohnson added a comment to T211668: mw1272 crashed: Bad page map in process hhvm.

Today I swapped the DIMM from B1 to A1 and cleared the log. We have to wait and see

Dec 14 2018, 5:32 PM · serviceops, ops-eqiad, Operations, HHVM
Cmjohnson added a comment to T209139: Broken memory on mw1239.

I am sure we have something that can be used from a decom server.

Dec 14 2018, 2:00 PM · ops-eqiad, Operations
Cmjohnson added a comment to T211668: mw1272 crashed: Bad page map in process hhvm.

The idrac logs are reporting a couple of things. The errors could just be the DIMM, but there is a CPU Machine Check error, which indicates that CPU2 may be bad now. A DIMM swap is needed first; then clear the log and see if the error follows the DIMM or stays with CPU2.

Dec 14 2018, 1:53 PM · serviceops, ops-eqiad, Operations, HHVM
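
The swap-and-watch test described in the comment above comes down to comparing where the error shows up after the DIMM has been moved. A minimal sketch of that reasoning follows; the function name `interpret_dimm_swap` and the returned wording are hypothetical, and the slot labels are just illustrative strings.

```python
def interpret_dimm_swap(original_slot, new_slot, error_slot_after_swap):
    """Interpret a DIMM swap test (hypothetical helper).

    original_slot: slot the suspect DIMM was in when the error was first logged
    new_slot: slot the suspect DIMM was moved to
    error_slot_after_swap: slot named in a new error after the log was cleared,
        or None if no error has been logged yet
    """
    if error_slot_after_swap is None:
        return "inconclusive: no new error yet, keep monitoring"
    if error_slot_after_swap == new_slot:
        return "error followed the DIMM: suspect the DIMM"
    if error_slot_after_swap == original_slot:
        return "error stayed with the slot: suspect the slot, CPU, or system board"
    return "error on a different slot: investigate separately"


# Example: DIMM moved from B1 to A1, new error logged against A1.
print(interpret_dimm_swap("B1", "A1", "A1"))
```

Clearing the log before the observation period matters here, since otherwise old entries against the original slot would be mistaken for new ones.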

Dec 13 2018

Cmjohnson closed T209074: Devices with wmf* names and status active as Resolved.

Moved all the servers that begin with wmf from active to planned

Dec 13 2018, 10:44 PM · ops-ulsfo, ops-codfw, Operations, ops-eqiad
Cmjohnson added a comment to T211796: Degraded RAID on ms-be1045.

I am not seeing anything wrong with the disks

Dec 13 2018, 10:36 PM · ops-eqiad, Operations
Cmjohnson added a comment to T207258: rack/setup/install pc1007-pc1010.

no ETA

Dec 13 2018, 4:18 PM · Patch-For-Review, Operations, ops-eqiad, DBA
Cmjohnson reassigned T201364: rack/setup/install sulfur.wikimedia.org from Cmjohnson to RobH.

@RobH I did a racreset....all is working

Dec 13 2018, 4:17 PM · ops-eqiad, Operations

Dec 12 2018

Cmjohnson reassigned T209393: rack/setup/install sessionstore100[123].eqiad.wmnet from Cmjohnson to RobH.

Hi @RobH these are ready for installs. Thanks!

Dec 12 2018, 3:36 PM · Core Platform Team Kanban (Doing), Operations, Core Platform Team (Session Management Service (CDP2)), User-Eevans
Cmjohnson updated the task description for T209393: rack/setup/install sessionstore100[123].eqiad.wmnet.
Dec 12 2018, 3:36 PM · Core Platform Team Kanban (Doing), Operations, Core Platform Team (Session Management Service (CDP2)), User-Eevans

Dec 11 2018

Cmjohnson moved T202705: Degraded RAID on sodium from Being worked on to Blocked on the ops-eqiad board.
Dec 11 2018, 6:38 PM · ops-eqiad, Operations
Cmjohnson moved T203194: cp1075-90 - bnxt_en transmit hangs from Being worked on to Not urgent on the ops-eqiad board.
Dec 11 2018, 6:38 PM · Patch-For-Review, ops-eqiad, Operations, Traffic
Cmjohnson moved T211537: Degraded RAID on db1063 from Backlog to Being worked on on the ops-eqiad board.
Dec 11 2018, 6:38 PM · DBA, ops-eqiad, Operations
Cmjohnson added a comment to T211537: Degraded RAID on db1063.

swapped in slot 0

Dec 11 2018, 6:37 PM · DBA, ops-eqiad, Operations
Cmjohnson moved T211613: rack/setup/install db11[26-38].eqiad.wmnet from Backlog to Racking Tasks on the ops-eqiad board.
Dec 11 2018, 6:37 PM · Patch-For-Review, DBA, ops-eqiad, User-Marostegui, Operations

Dec 10 2018

Cmjohnson added a comment to T207965: eqiad: Re-connect cage cameras.

I agree that cameras are a better fit for the mgmt network. The old switches are still in the racks for row A-C, I removed them from row D awhile ago. I know that we have been talking about using the ex4200's for mgmt but I don't think we've gotten past the planning stage. One other blocker besides re-racking them in row D is we still need to migrate servers off the old stack in row A.

Dec 10 2018, 2:58 PM · Operations, ops-eqiad
Cmjohnson added a comment to T207965: eqiad: Re-connect cage cameras.

@faidon the cameras are connected, I am wondering if these ports have PoE?

Dec 10 2018, 1:21 PM · Operations, ops-eqiad

Dec 7 2018

Cmjohnson closed Unknown Object (Task), a subtask of T206017: Hardware for session storage service, as Resolved.
Dec 7 2018, 7:08 PM · Core Platform Team Kanban (Done with CPT), hardware-requests, Operations, Core Platform Team (Session Management Service (CDP2)), User-Eevans
Cmjohnson reassigned T207965: eqiad: Re-connect cage cameras from Cmjohnson to Papaul.

Thanks @Papaul for helping with this...below are the correct port assignments for each camera. Please assign back to me after the switch port update. Thanks!

Dec 7 2018, 6:08 PM · Operations, ops-eqiad

Dec 6 2018

Cmjohnson added a comment to T207965: eqiad: Re-connect cage cameras.

Some cameras have been re-connected as I am in their racks, others will need me to run new cables to reach the new switches. Some progress as I get the chance.

Dec 6 2018, 9:34 PM · Operations, ops-eqiad
Cmjohnson added a comment to T196507: Degraded RAID on cloudvirt1019.

@Andrew yes, you are correct it is the same exact issue. My goal was to work with one, figure out the issue and then go to HPE with a solution but that obviously is not working out so great.

Dec 6 2018, 8:34 PM · Patch-For-Review, cloud-services-team (Kanban), ops-eqiad, Operations
Cmjohnson closed T207319: labvirt1018 -> cloudvirt1018: update physical label, network port description, netbox as Resolved.
Dec 6 2018, 8:28 PM · ops-eqiad, DC-Ops, Operations, Cloud-Services
Cmjohnson closed T207319: labvirt1018 -> cloudvirt1018: update physical label, network port description, netbox, a subtask of T207317: Rename labvirt1018 to cloudvirt1018, move to eqiad1, as Resolved.
Dec 6 2018, 8:28 PM · Patch-For-Review, Cloud-Services
Cmjohnson added a comment to T207258: rack/setup/install pc1007-pc1010.

@Marostegui and all,

Dec 6 2018, 8:23 PM · Patch-For-Review, Operations, ops-eqiad, DBA
Cmjohnson updated subscribers of T196507: Degraded RAID on cloudvirt1019.

@GTirloni This has been an ongoing thing since August, I have replaced the battery 3 maybe 4 times already. Replaced the raid controller once and replaced 4 SSDs.

Dec 6 2018, 8:21 PM · Patch-For-Review, cloud-services-team (Kanban), ops-eqiad, Operations
Cmjohnson added a comment to T209618: rack/setup/install ms-be10[44-50].eqiad.wmnet.

@RobH this is already assigned to you but these are ready for you to take over

Dec 6 2018, 8:16 PM · Patch-For-Review, User-fgiunchedi, media-storage, Operations
Cmjohnson updated the task description for T209618: rack/setup/install ms-be10[44-50].eqiad.wmnet.
Dec 6 2018, 8:15 PM · Patch-For-Review, User-fgiunchedi, media-storage, Operations

Dec 5 2018

Cmjohnson updated the task description for T209618: rack/setup/install ms-be10[44-50].eqiad.wmnet.
Dec 5 2018, 10:05 PM · Patch-For-Review, User-fgiunchedi, media-storage, Operations
Cmjohnson moved T209618: rack/setup/install ms-be10[44-50].eqiad.wmnet from Backlog to Racking Tasks on the ops-eqiad board.
Dec 5 2018, 4:25 PM · Patch-For-Review, User-fgiunchedi, media-storage, Operations
Cmjohnson added a comment to T207258: rack/setup/install pc1007-pc1010.

Waiting on a technician to swap out the motherboard. Our request was approved.

Dec 5 2018, 4:25 PM · Patch-For-Review, Operations, ops-eqiad, DBA
Cmjohnson reassigned T207194: rack/setup/install cloudvirtan100[1-5].eqiad.wmnet from Cmjohnson to RobH.

all have been cabled and switch port updated minus the vlan. Can you please update vlan to whatever they need.

Dec 5 2018, 3:46 PM · Patch-For-Review, Analytics-Kanban, Operations, User-Elukey, Analytics

Nov 29 2018

Cmjohnson added a comment to T209618: rack/setup/install ms-be10[44-50].eqiad.wmnet.

@fgiunchedi For racking this is the space I have

Nov 29 2018, 6:33 PM · Patch-For-Review, User-fgiunchedi, media-storage, Operations
Cmjohnson closed T210683: lvs1006 down as Resolved.
Nov 29 2018, 4:27 PM · netops, ops-eqiad, Traffic, Operations
Cmjohnson added a comment to T210683: lvs1006 down.

@BBlack @ayounsi sfp-t was bad, replaced and the link is up

Nov 29 2018, 4:24 PM · netops, ops-eqiad, Traffic, Operations
Cmjohnson added a comment to T196507: Degraded RAID on cloudvirt1019.

@aborrero Unfortunately it's not that simple. Once we take delivery of a
server we then have to work through technical support. We may be at the
point where the issue needs to be escalated but I'm waiting on HPE.

Nov 29 2018, 1:19 PM · Patch-For-Review, cloud-services-team (Kanban), ops-eqiad, Operations

Nov 28 2018

Cmjohnson added a comment to T207194: rack/setup/install cloudvirtan100[1-5].eqiad.wmnet.

@RobH these are ready for installs. I changed the primary NIC to boot from the 10G NIC. The RAID was set up exactly like an analytics box: RAID 1 for the SSDs and RAID 0 for the other 11 disks.

Nov 28 2018, 9:37 PM · Patch-For-Review, Analytics-Kanban, Operations, User-Elukey, Analytics
Cmjohnson updated the task description for T207194: rack/setup/install cloudvirtan100[1-5].eqiad.wmnet.
Nov 28 2018, 9:35 PM · Patch-For-Review, Analytics-Kanban, Operations, User-Elukey, Analytics

Nov 27 2018

Cmjohnson added a comment to T207258: rack/setup/install pc1007-pc1010.

Dell ticket information for pc1007: You have successfully submitted request SR983104667.

Nov 27 2018, 2:37 PM · Patch-For-Review, Operations, ops-eqiad, DBA