
Cmjohnson (cmjohnson)
User

Projects (11)

User Details

User Since
Dec 16 2014, 10:22 PM (222 w, 2 d)
Availability
Available
IRC Nick
cmjohnson1
LDAP User
Cmjohnson
MediaWiki User
Unknown

Recent Activity

Mon, Mar 18

Cmjohnson added a comment to T212010: Degraded RAID on sodium.

@RobH interesting response from Dell regarding the disk

Mon, Mar 18, 4:29 PM · ops-eqiad, Operations

Fri, Mar 15

Cmjohnson added a comment to T216528: confirm gpu form factor in stat1005.

@elukey please see the attached jpg.

Fri, Mar 15, 6:31 PM · ops-eqiad, Analytics-Kanban, Analytics, Operations
Cmjohnson closed T218411: No description on asw2-c-eqiad:xe-2/0/5 as Resolved.

Port description corrected

Fri, Mar 15, 6:24 PM · Operations, ops-eqiad

Thu, Mar 14

Cmjohnson added a comment to T202705: Degraded RAID on sodium.

a new ticket has been created with Dell

Thu, Mar 14, 3:42 PM · ops-eqiad, Operations
Cmjohnson removed a project from T217473: labstore1006 spontaneous reboot: ops-eqiad.

I updated all the F/W on this server. I am removing the dc ops tag. If this becomes a h/w issue, please add it back.

Thu, Mar 14, 2:50 PM · Patch-For-Review, Operations, Data-Services, cloud-services-team (Kanban)
Cmjohnson added a comment to T217274: mw1264 DIMM error.

The error hasn't appeared again (yet), but I created a task with Dell. Worst case they push back; best case they send a DIMM. We're less than 30 days from end of warranty.

Thu, Mar 14, 2:46 PM · Operations, ops-eqiad
Cmjohnson added a comment to T218006: mw1280 crashed.

The error hasn't appeared again (yet), but I created a task with Dell. Worst case they push back; best case they send a DIMM. We're less than 30 days from end of warranty.

Thu, Mar 14, 2:46 PM · ops-eqiad, Operations, serviceops
Cmjohnson added a comment to T215411: thumbor1004 memory errors.

Should've checked this but thumbor1004 is out of warranty.

Thu, Mar 14, 2:35 PM · User-jijiki, Thumbor, ops-eqiad, serviceops, Operations

Wed, Mar 13

Cmjohnson added a comment to T218006: mw1280 crashed.

Swapped DIMM B1 with A1 and cleared the idrac log.

Wed, Mar 13, 6:59 PM · ops-eqiad, Operations, serviceops
Cmjohnson added a comment to T218006: mw1280 crashed.

Record: 42
Date/Time: 03/10/2019 07:43:40
Source: system
Severity: Non-Critical

Description: Correctable memory error rate exceeded for DIMM_B1.

Record: 43
Date/Time: 03/10/2019 07:53:15
Source: system
Severity: Critical

Description: Correctable memory error rate exceeded for DIMM_B1.

Wed, Mar 13, 6:55 PM · ops-eqiad, Operations, serviceops
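
The SEL records quoted in the comment above follow a regular "key: value" layout, so they can be turned into structured data when checking whether an error follows a swapped DIMM. What follows is a minimal Python sketch, not part of the original task. The names parse_sel and dimm_slots are hypothetical, the field layout is assumed from the two quoted records only, and the timestamps are assumed to be month-first.

```python
import re
from datetime import datetime

# Field layout assumed from the two records quoted in the task comment;
# real SEL output may contain fields this pattern does not handle.
RECORD_RE = re.compile(
    r"Record:\s*(?P<record>\d+)\s*"
    r"Date/Time:\s*(?P<when>\d{2}/\d{2}/\d{4} \d{2}:\d{2}:\d{2})\s*"
    r"Source:\s*(?P<source>\S+)\s*"
    r"Severity:\s*(?P<severity>[^\n]+?)\s*"
    r"Description:\s*(?P<description>[^\n]+)"
)


def parse_sel(text):
    """Return one dict per SEL record found in `text`."""
    entries = []
    for m in RECORD_RE.finditer(text):
        entries.append({
            "record": int(m.group("record")),
            # Assumption: dates are month/day/year.
            "when": datetime.strptime(m.group("when"), "%m/%d/%Y %H:%M:%S"),
            "source": m.group("source"),
            "severity": m.group("severity"),
            "description": m.group("description").strip(),
        })
    return entries


def dimm_slots(entries):
    """Collect the DIMM slot names (such as 'B1') named in descriptions."""
    found = set()
    for entry in entries:
        found.update(re.findall(r"DIMM_([A-Z]\d+)", entry["description"]))
    return sorted(found)


# The two records from the mw1280 comment, copied as quoted.
SAMPLE = """\
Record: 42
Date/Time: 03/10/2019 07:43:40
Source: system
Severity: Non-Critical

Description: Correctable memory error rate exceeded for DIMM_B1.

Record: 43
Date/Time: 03/10/2019 07:53:15
Source: system
Severity: Critical

Description: Correctable memory error rate exceeded for DIMM_B1.
"""

if __name__ == "__main__":
    for entry in parse_sel(SAMPLE):
        print(entry["record"], entry["severity"], entry["description"])
    print("slots:", dimm_slots(parse_sel(SAMPLE)))
```

Running the same helper on the log captured after a DIMM swap and comparing the slot lists is one way to see whether the error stayed with the slot or moved with the module.
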
Cmjohnson added a comment to T215411: thumbor1004 memory errors.

DIMM A1 is now showing bad, so it looks like a DIMM replacement is needed.

Wed, Mar 13, 6:55 PM · User-jijiki, Thumbor, ops-eqiad, serviceops, Operations
Cmjohnson added a comment to T218006: mw1280 crashed.

@MoritzMuehlenhoff can you please depool the server

Wed, Mar 13, 6:27 PM · ops-eqiad, Operations, serviceops
Cmjohnson closed T217394: dbproxy1012 power supply without power as Resolved.
Wed, Mar 13, 6:25 PM · Operations, ops-eqiad, DBA
Cmjohnson added a comment to T215012: cloudvirt1015: apparent hardware errors in CPU/Memory.

@aborrero the CPU is here...let me know when it's safe for me to change.

Wed, Mar 13, 6:05 PM · Patch-For-Review, Operations, ops-eqiad, DC-Ops, cloud-services-team (Kanban)
Cmjohnson added a comment to T215411: thumbor1004 memory errors.

I moved DIMM from B side to A side and cleared the log...let's give it a day or so and see if the error follows.

Wed, Mar 13, 5:59 PM · User-jijiki, Thumbor, ops-eqiad, serviceops, Operations

Thu, Mar 7

Cmjohnson added a comment to T202966: Make cp1099 the new pinkunicorn.

@ayounsi server moved

Thu, Mar 7, 5:06 PM · Patch-For-Review, Operations, Traffic
Cmjohnson added a comment to T202966: Make cp1099 the new pinkunicorn.

@ayounsi asw2-c8 is a 1G switch....does this need to go to a 10G rack?

Thu, Mar 7, 4:45 PM · Patch-For-Review, Operations, Traffic

Thu, Feb 28

Cmjohnson added a comment to T214760: icinga1001 crashed.

The new CPU came in and I replaced CPU1

Thu, Feb 28, 7:39 PM · Patch-For-Review, ops-eqiad, monitoring, Operations
Cmjohnson added a comment to T211668: mw1272 crashed: Bad page map in process hhvm.

Received the parts, replaced CPU2 and DIMM B1 and cleared the log

Thu, Feb 28, 7:25 PM · serviceops, ops-eqiad, Operations, HHVM
Cmjohnson closed T214720: db1114 crashed (HW memory issues) as Resolved.

the motherboard has been replaced, the idrac and bios have been updated to latest version. resolving task, reopen if there are any problems.

Thu, Feb 28, 5:44 PM · Patch-For-Review, DBA, Operations, ops-eqiad

Wed, Feb 27

Cmjohnson added a comment to T215231: rack/setup/install labsdb1012.eqiad.wmnet.

@elukey I moved the host to A6 and updated netbox. Arzhel updated network switch cfg. DNS will need to be updated and then ready for installs.

Wed, Feb 27, 7:13 PM · DBA, Patch-For-Review, ops-eqiad, Analytics, User-Elukey, Operations
Cmjohnson added a comment to T217274: mw1264 DIMM error.

While powering the server off to move to rack A5 I swapped the B1 DIMM with A1 to see if error follows.

Wed, Feb 27, 6:43 PM · Operations, ops-eqiad
Cmjohnson created T217274: mw1264 DIMM error.
Wed, Feb 27, 6:43 PM · Operations, ops-eqiad
Cmjohnson added a comment to T214720: db1114 crashed (HW memory issues).

A new motherboard arrives tomorrow, 28/2/2019, for the replacement.

Wed, Feb 27, 5:38 PM · Patch-For-Review, DBA, Operations, ops-eqiad
Cmjohnson updated the task description for T212348: Move servers off asw2-a5-eqiad.
Wed, Feb 27, 5:32 PM · Patch-For-Review, ops-eqiad, Operations, netops
Cmjohnson updated the task description for T212348: Move servers off asw2-a5-eqiad.
Wed, Feb 27, 5:31 PM · Patch-For-Review, ops-eqiad, Operations, netops
Cmjohnson added a comment to T212348: Move servers off asw2-a5-eqiad.

We have an issue with these servers. I completely forgot about the lack of power availability in rack A2. Do we need to move these to a different row, or wait until we swap the PDUs?

Wed, Feb 27, 4:24 PM · Patch-For-Review, ops-eqiad, Operations, netops

Mon, Feb 25

Cmjohnson added a comment to T212348: Move servers off asw2-a5-eqiad.

@ayounsi I want to do all the server moves on Thursday this week. Can you ask the service owners to have everything depooled? I will get started at 1500 UTC. The server move will take a couple of hours and then we can do the network changes afterward. I think the physical move will not exceed 3 hours.

Mon, Feb 25, 5:52 PM · Patch-For-Review, ops-eqiad, Operations, netops
Cmjohnson added a comment to T214720: db1114 crashed (HW memory issues).

This will most likely need a new motherboard. I requested one through Dell.

Mon, Feb 25, 4:32 PM · Patch-For-Review, DBA, Operations, ops-eqiad
Cmjohnson added a comment to T215012: cloudvirt1015: apparent hardware errors in CPU/Memory.

Requested a new CPU from Dell

Mon, Feb 25, 4:22 PM · Patch-For-Review, Operations, ops-eqiad, DC-Ops, cloud-services-team (Kanban)
Cmjohnson added a comment to T211668: mw1272 crashed: Bad page map in process hhvm.

A self-dispatch ticket has been created for a new DIMM and CPU

Mon, Feb 25, 4:15 PM · serviceops, ops-eqiad, Operations, HHVM
Cmjohnson added a comment to T214760: icinga1001 crashed.

I forgot to update the task

Mon, Feb 25, 4:06 PM · Patch-For-Review, ops-eqiad, monitoring, Operations

Fri, Feb 22

Cmjohnson added a comment to T214778: Degraded RAID on ms-be1020.

@fgiunchedi Let's do this on Monday if you are available and now that ms-be1033 is working again.

Fri, Feb 22, 5:24 PM · ops-eqiad, Operations
Cmjohnson closed T215998: ms-be1033 down and not powering up as Resolved.

Resolving, feel free to reopen if the problem returns

Fri, Feb 22, 5:23 PM · Operations, ops-eqiad
Cmjohnson added a comment to T196478: rack/setup/install backup1001.

backup1001 is all connected now, but I notice that the raid card is not picking up any of the disk arrays.

Fri, Feb 22, 4:34 PM · Patch-For-Review, Operations, ops-eqiad
Cmjohnson added a comment to T214760: icinga1001 crashed.

That’s not really all that interesting, they send us refurbished parts all
the time that don’t work. I will submit another ticket for a new CPU.

Fri, Feb 22, 3:52 PM · Patch-For-Review, ops-eqiad, monitoring, Operations

Thu, Feb 21

Cmjohnson added a comment to T214760: icinga1001 crashed.

I swapped CPU1 w/ CPU2 and cleared the log. Please monitor to see where and if the error continues or moves.

Thu, Feb 21, 8:02 PM · Patch-For-Review, ops-eqiad, monitoring, Operations
Cmjohnson added a comment to T214720: db1114 crashed (HW memory issues).

@jynus @Marostegui I swapped DIMM B3 to A3 and B7 to A7 and cleared the idrac log. Please put some stress on the server and let's monitor.

Thu, Feb 21, 7:31 PM · Patch-For-Review, DBA, Operations, ops-eqiad
Cmjohnson added a comment to T214720: db1114 crashed (HW memory issues).

Before DIMM Swap racadm log

Thu, Feb 21, 7:28 PM · Patch-For-Review, DBA, Operations, ops-eqiad
Cmjohnson added a comment to T215998: ms-be1033 down and not powering up.

The server did eventually power up, so it looks like I am eating some crow on this one. Re-connected everything and put back to normal operating standard. Booting into the OS now.

Thu, Feb 21, 6:59 PM · Operations, ops-eqiad
Cmjohnson moved T216202: Disk failure on labsdb1005 from Hardware Failure / Troubleshoot to Cloud Tasks on the ops-eqiad board.
Thu, Feb 21, 6:42 PM · Operations, ops-eqiad
Cmjohnson moved T215998: ms-be1033 down and not powering up from Backlog to Hardware Failure / Troubleshoot on the ops-eqiad board.
Thu, Feb 21, 6:42 PM · Operations, ops-eqiad
Cmjohnson moved T216324: cloudvirt1009: upgrade to 10G from Racking Tasks to Cloud Tasks on the ops-eqiad board.
Thu, Feb 21, 6:42 PM · Operations, ops-eqiad, DC-Ops, cloud-services-team (Kanban)
Cmjohnson moved T215231: rack/setup/install labsdb1012.eqiad.wmnet from Backlog to Racking Tasks on the ops-eqiad board.
Thu, Feb 21, 6:42 PM · DBA, Patch-For-Review, ops-eqiad, Analytics, User-Elukey, Operations
Cmjohnson moved T216192: Update label and switch to rename labvirt1012 to cloudvirt1012 from Backlog to Cloud Tasks on the ops-eqiad board.
Thu, Feb 21, 6:42 PM · ops-eqiad, Operations
Cmjohnson moved T216281: Update label and switch to rename labvirt1009 to cloudvirt1009 from Backlog to Cloud Tasks on the ops-eqiad board.
Thu, Feb 21, 6:42 PM · ops-eqiad, cloud-services-team (Kanban), Operations
Cmjohnson moved T216491: Decommission dbstore1002 from Backlog to Decommission on the ops-eqiad board.
Thu, Feb 21, 6:42 PM · Patch-For-Review, decommission, ops-eqiad, Operations, Analytics
Cmjohnson moved T216724: relocate cloudvirt1024 from b8-eqiad:u24 to b2-eqiad:u17 from Backlog to Cloud Tasks on the ops-eqiad board.
Thu, Feb 21, 6:42 PM · DC-Ops, Operations, ops-eqiad, cloud-services-team (Kanban)
Cmjohnson added a comment to T214720: db1114 crashed (HW memory issues).

@Marostegui I will need to swap DIMM B3 and B7 to the A side. LMK when the server is down and ready

Thu, Feb 21, 6:41 PM · Patch-For-Review, DBA, Operations, ops-eqiad
Cmjohnson moved T215012: cloudvirt1015: apparent hardware errors in CPU/Memory from Hardware Failure / Troubleshoot to Cloud Tasks on the ops-eqiad board.
Thu, Feb 21, 6:39 PM · Patch-For-Review, Operations, ops-eqiad, DC-Ops, cloud-services-team (Kanban)
Cmjohnson added a comment to T216202: Disk failure on labsdb1005.

@Bstorm I swapped the disk with a used spare but this server really needs to be decommissioned...the warranty expired in 2015.

Thu, Feb 21, 6:39 PM · Operations, ops-eqiad
Cmjohnson reassigned T215231: rack/setup/install labsdb1012.eqiad.wmnet from Cmjohnson to ayounsi.

@arzhel This server needs to go into the cloud-support vlan but it's not available to me for row C. Can you update it, please.

Thu, Feb 21, 6:35 PM · DBA, Patch-For-Review, ops-eqiad, Analytics, User-Elukey, Operations
Cmjohnson added a comment to T215998: ms-be1033 down and not powering up.

@fgiunchedi HPE did not believe me that a motherboard swap is needed. They asked that I do a bunch of troubleshooting first. Below are the steps they asked me to do. I have replied that their wild goose chase did not work and to please send me a new board. fingers crossed.

Thu, Feb 21, 6:33 PM · Operations, ops-eqiad
Cmjohnson updated the task description for T215231: rack/setup/install labsdb1012.eqiad.wmnet.
Thu, Feb 21, 6:03 PM · DBA, Patch-For-Review, ops-eqiad, Analytics, User-Elukey, Operations

Feb 19 2019

Cmjohnson added a comment to T194855: Degraded RAID on cloudvirt1020.

I believe the supposed failed disk was a result of me working inside the server last week and putting it back together quickly. The cables that attach the raid card to the backplane for the ssds are very touchy, and if one is slightly off it could show a disk offline. I confirmed this after I checked the raid bios and noticed the card was only seeing 9 of the 10 disks. I then swapped in a disk from a different slot and the orange indicator light stayed with the slot. I opened the server up and reseated the cables, and all 10 disks now show, but the raid had to be rebuilt. The server will need a full re-install.

Feb 19 2019, 7:17 PM · Patch-For-Review, cloud-services-team (Kanban), ops-eqiad, Operations
Cmjohnson added a comment to T216528: confirm gpu form factor in stat1005.

There appears to be power already connected to the GPU

Feb 19 2019, 6:23 PM · ops-eqiad, Analytics-Kanban, Analytics, Operations
Cmjohnson added a comment to T216528: confirm gpu form factor in stat1005.

Feb 19 2019, 6:21 PM · ops-eqiad, Analytics-Kanban, Analytics, Operations
Cmjohnson closed T214760: icinga1001 crashed as Resolved.

CPU2 was replaced

Feb 19 2019, 5:34 PM · Patch-For-Review, ops-eqiad, monitoring, Operations
Cmjohnson closed T214760: icinga1001 crashed, a subtask of T210108: icinga1001 mysterious reboots, as Resolved.
Feb 19 2019, 5:34 PM · ops-eqiad, DC-Ops, Operations
Cmjohnson closed T215892: Degraded RAID on cloudvirt1024 as Resolved.

@GTirloni The disk has been replaced

Feb 19 2019, 5:24 PM · cloud-services-team (Kanban), ops-eqiad, Operations
Cmjohnson added a comment to T216004: Degraded RAID on cloudvirt1018.

@GTirloni The disks in slots 2 and 3 have been replaced.

Feb 19 2019, 5:22 PM · cloud-services-team (Kanban), ops-eqiad, Operations

Feb 18 2019

Cmjohnson added a comment to T215231: rack/setup/install labsdb1012.eqiad.wmnet.

@elukey, it will affect how it's racked...10G racks have different switches but we are also limited in space for those racks. If 1G works now and for the foreseeable future, stick with that...if a change is needed then we will make adjustments in the future.

Feb 18 2019, 4:34 PM · DBA, Patch-For-Review, ops-eqiad, Analytics, User-Elukey, Operations

Feb 13 2019

Cmjohnson added a comment to T214720: db1114 crashed (HW memory issues).

I updated the bios to the latest version as of February 11, 2019 v2.9.1
updated idrac to latest version 2.61.60.60

Feb 13 2019, 6:23 PM · Patch-For-Review, DBA, Operations, ops-eqiad
Cmjohnson updated the task description for T215231: rack/setup/install labsdb1012.eqiad.wmnet.
Feb 13 2019, 6:06 PM · DBA, Patch-For-Review, ops-eqiad, Analytics, User-Elukey, Operations
Cmjohnson added a comment to T215231: rack/setup/install labsdb1012.eqiad.wmnet.

@elukey is this a 1G or 10G rack?

Feb 13 2019, 6:05 PM · DBA, Patch-For-Review, ops-eqiad, Analytics, User-Elukey, Operations
Cmjohnson added a comment to T215998: ms-be1033 down and not powering up.

A ticket has been opened with HPE

Feb 13 2019, 6:03 PM · Operations, ops-eqiad
Cmjohnson added a comment to T214760: icinga1001 crashed.

I requested a new CPU but w/out Dell's idrac log stating it's a CPU there is a good chance they will kick it back.

Feb 13 2019, 5:53 PM · Patch-For-Review, ops-eqiad, monitoring, Operations
Cmjohnson added a comment to T215998: ms-be1033 down and not powering up.

I physically cannot turn the server on either. I tried pulling the power and waiting 10 minutes, but I just get a flashing green indicator at the power button. I am able to access the ilo's web interface, but that does not tell me anything. In the past, a motherboard replacement was needed. I will update the ticket with HPE's response.

Feb 13 2019, 5:44 PM · Operations, ops-eqiad
Cmjohnson closed T216011: kafka1012 power supply alerts as Declined.

This server came from a batch of R720s from 2011 that have faulty power modules on the main board. There are several like this. This server and all R720s from that batch need to be decommissioned.

Feb 13 2019, 5:25 PM · Operations, ops-eqiad
Cmjohnson reassigned T215569: mw1299 is down (jobrunner-canary, now up but depooled) from Cmjohnson to RobH.

I replaced CPU1 with new. Powered the server on. Assigning to @RobH to coordinate re-pooling and resolving

Feb 13 2019, 5:04 PM · ops-eqiad, Operations
Cmjohnson added a comment to T216004: Degraded RAID on cloudvirt1018.

a ticket with Dell has been submitted to replace both SSDs

Feb 13 2019, 1:02 PM · cloud-services-team (Kanban), ops-eqiad, Operations
Cmjohnson added a comment to T215892: Degraded RAID on cloudvirt1024.

A ticket with Dell has been created

Feb 13 2019, 12:53 PM · cloud-services-team (Kanban), ops-eqiad, Operations

Feb 12 2019

Cmjohnson added a comment to T196507: Degraded RAID on cloudvirt1019.

@faidon and all, it looks like we were missing a connection from the raid card to the riser card. This was not anywhere in the instructions that came with the raid card. Fortunately, I still had one, but I am missing one for cloudvirt1020. I have already started a ticket with HPE and expect to have one in the next day or 2.

Feb 12 2019, 8:02 PM · Patch-For-Review, cloud-services-team (Kanban), ops-eqiad, Operations
Cmjohnson added a comment to T196507: Degraded RAID on cloudvirt1019.

@faidon battery replaced on cloudvirt1020

Feb 12 2019, 6:25 PM · Patch-For-Review, cloud-services-team (Kanban), ops-eqiad, Operations
Cmjohnson added a comment to T215569: mw1299 is down (jobrunner-canary, now up but depooled).

The self-dispatch was approved and the part should hopefully be here by tomorrow.

Feb 12 2019, 3:05 PM · ops-eqiad, Operations

Feb 11 2019

Cmjohnson reassigned T214608: rack/setup/install logstash101[012].eqiad.wmnet from Cmjohnson to RobH.

assigning to @RobH to do the installations

Feb 11 2019, 6:22 PM · Patch-For-Review, Operations
Cmjohnson updated the task description for T214608: rack/setup/install logstash101[012].eqiad.wmnet.
Feb 11 2019, 6:21 PM · Patch-For-Review, Operations
Cmjohnson added a comment to T214608: rack/setup/install logstash101[012].eqiad.wmnet.

i updated the bios versions on all 3 hosts

Feb 11 2019, 6:21 PM · Patch-For-Review, Operations
Cmjohnson added a comment to T213121: Deploy cr2-eqsin.

I also put in an in-bound ticket

Feb 11 2019, 4:38 PM · Patch-For-Review, ops-eqiad, ops-eqsin, Operations, netops
Cmjohnson added a comment to T215569: mw1299 is down (jobrunner-canary, now up but depooled).

Ticket open for a new CPU

Feb 11 2019, 4:29 PM · ops-eqiad, Operations
Cmjohnson added a comment to T215569: mw1299 is down (jobrunner-canary, now up but depooled).

racadm sel
Record: 29
Date/Time: 02/02/2019 21:20:29
Source: system
Severity: Critical

Description: CPU 1 machine check error detected.

Feb 11 2019, 4:24 PM · ops-eqiad, Operations

Feb 7 2019

Cmjohnson closed T215542: Degraded RAID on cloudelastic1004 as Invalid.
Feb 7 2019, 6:41 PM · ops-eqiad, Operations
Cmjohnson assigned T214079: cloudstore100{8,9} - Upgrade to 10GbE to RobH.

@RobH Can you do a re-install and hand off to cloud, please.

Feb 7 2019, 6:40 PM · Patch-For-Review, ops-eqiad, Operations
Cmjohnson added a comment to T209029: cloudelastic1004: SMART/disk error.

The disk has been replaced. @aborrero the OS will need to be re-installed. Until then the raid is out of whack because I removed /dev/sda.

Feb 7 2019, 6:38 PM · Operations, ops-eqiad, DC-Ops, cloud-services-team (Kanban)
Cmjohnson closed T215075: cloudcontrol1004 mgmt HTTPS SSL error as Resolved.
Feb 7 2019, 6:27 PM · Operations, Cloud-Services, cloud-services-team, ops-eqiad
Cmjohnson added a comment to T215075: cloudcontrol1004 mgmt HTTPS SSL error.

I updated the f/w and bios with the SPP provided by HP.
The error did not resolve, so I reset the rbsu to manufacturer settings, after which the error did not reappear.
I re-configured the bios and rebooted. All should be in working order now.

Feb 7 2019, 5:36 PM · Operations, Cloud-Services, cloud-services-team, ops-eqiad

Feb 6 2019

RobH awarded T215338: WMF7426 fails to accept racadm powercycle commands a Like token.
Feb 6 2019, 9:24 PM · Operations, ops-eqiad
Cmjohnson added a comment to T214608: rack/setup/install logstash101[012].eqiad.wmnet.

hi @herron they are not going just yet. I will get to them next week.

Feb 6 2019, 9:06 PM · Patch-For-Review, Operations
Cmjohnson closed T215338: WMF7426 fails to accept racadm powercycle commands as Resolved.

@RobH updated f/w and bios....all is well. resolving

Feb 6 2019, 8:17 PM · Operations, ops-eqiad
Cmjohnson moved T215050: Degraded RAID on db1073 from Backlog to Hardware Failure / Troubleshoot on the ops-eqiad board.
Feb 6 2019, 7:55 PM · DBA, ops-eqiad, Operations
Cmjohnson added a comment to T215050: Degraded RAID on db1073.

The disk has been replaced, but I also see a bad disk in slot 6. Leaving this open until tomorrow, when I will replace it.

Feb 6 2019, 7:55 PM · DBA, ops-eqiad, Operations
Cmjohnson added a comment to T215338: WMF7426 fails to accept racadm powercycle commands.

pulled the power to do a hard reset but the function still does not work. Attempting to update idrac to latest version

Feb 6 2019, 7:47 PM · Operations, ops-eqiad

Feb 1 2019

Cmjohnson added a comment to T214079: cloudstore100{8,9} - Upgrade to 10GbE.

@GTirloni I do not have room in row A. These can go into Row D racks D2 and D7. Doing this will require a DNS (ip) change and I will have to fix the servers to use the 10G NIC. A re-install of the OS will be needed. I would like to do at least one on February 7th at 1600UTC (11am EST). If we can do both that would be great but can stagger so I move the 2nd server the following week. Please confirm that this will work for you.

Feb 1 2019, 7:38 PM · Patch-For-Review, ops-eqiad, Operations
Cmjohnson created T215075: cloudcontrol1004 mgmt HTTPS SSL error.
Feb 1 2019, 7:06 PM · Operations, Cloud-Services, cloud-services-team, ops-eqiad

Jan 30 2019

Cmjohnson added a comment to T184293: rack/setup/install lvs101[3-6].

lvs1013 and lvs1014 still need to be connected.

Jan 30 2019, 11:30 PM · Patch-For-Review, ops-eqiad, Operations, Traffic
Cmjohnson added a comment to T196487: upgrade row d to have 3 10G switches.

@RobH @ayounsi Let's get the procurement items we need to move this task along please.

Jan 30 2019, 11:01 PM · ops-eqiad, netops, Operations
Cmjohnson updated the task description for T211613: rack/setup/install db11[26-38].eqiad.wmnet.
Jan 30 2019, 10:58 PM · Patch-For-Review, DBA, ops-eqiad, User-Marostegui, Operations
Cmjohnson moved T196478: rack/setup/install backup1001 from Racking Tasks to Being worked on on the ops-eqiad board.

@akosiaris Sorry for the really late response to this....the task got buried. No, I don't know why mgmt would not be working now unless it's disconnected or the cable is bad. I will check it next week after all hands.

Jan 30 2019, 10:57 PM · Patch-For-Review, Operations, ops-eqiad
Cmjohnson moved T184293: rack/setup/install lvs101[3-6] from Racking Tasks to Being worked on on the ops-eqiad board.
Jan 30 2019, 10:51 PM · Patch-For-Review, ops-eqiad, Operations, Traffic
Cmjohnson added a comment to T184293: rack/setup/install lvs101[3-6].
Jan 30 2019, 10:50 PM · Patch-For-Review, ops-eqiad, Operations, Traffic