Cmjohnson (cmjohnson)
User

Projects (11)

User Details

User Since
Dec 16 2014, 10:22 PM (187 w, 2 d)
Availability
Available
IRC Nick
cmjohnson1
LDAP User
Cmjohnson
MediaWiki User
Unknown

Recent Activity

Wed, Jul 18

Cmjohnson moved T199524: Relabel labnet1003.eqiad.wmnet as cloudnet1003.eqiad.wmnet from Backlog to Cloud Tasks on the ops-eqiad board.
Wed, Jul 18, 3:53 PM · Operations, ops-eqiad
Cmjohnson moved T199636: Degraded RAID on db1072 from Backlog to Being worked on on the ops-eqiad board.
Wed, Jul 18, 3:53 PM · DBA, ops-eqiad, Operations
Cmjohnson moved T199782: Relabel labcontrol1004.wikimedia.org as cloudcontrol1004.wikimedia.org from Backlog to Cloud Tasks on the ops-eqiad board.
Wed, Jul 18, 3:53 PM · ops-eqiad, Cloud-Services, Epic, Operations
Cmjohnson moved T199921: Relabel labnet1004.eqiad.wmnet as cloudnet1004.eqiad.wmnet from Backlog to Cloud Tasks on the ops-eqiad board.
Wed, Jul 18, 3:53 PM · Operations, ops-eqiad, cloud-services-team

Tue, Jul 17

Cmjohnson closed T198792: snapshot1005 does not power back up as Resolved.

The system board has been replaced and the server is accessible now

Tue, Jul 17, 6:24 PM · Dumps-Generation, Patch-For-Review, DC-Ops, ops-eqiad, Operations

Thu, Jul 12

Cmjohnson moved T199125: rack/setup/install cloudvirt102[34] from Backlog to Racking Tasks on the ops-eqiad board.
Thu, Jul 12, 3:19 PM · ops-eqiad, Cloud-VPS, Operations

Tue, Jul 10

Cmjohnson added a comment to T184293: rack/setup/install lvs101[3-6].

The lvs1015 iDRAC is set up. I think it's cabled correctly, but I am not really sure: enp4s0f1 doesn't translate for me when looking at the hardware, though I am pretty sure it matches the port order. I am not sure what you need from here to make it all work. I am attaching the picture of the MAC addresses.

Tue, Jul 10, 3:38 PM · Patch-For-Review, ops-eqiad, Operations, Traffic
Cmjohnson closed T199132: Relabel labvirt1021.eqiad.wmnet as cloudvirt1021.eqiad.wmnet as Resolved.
Tue, Jul 10, 2:35 PM · ops-eqiad, Operations
Cmjohnson closed T199203: Relabel labvirt1022.eqiad.wmnet as cloudvirt1022.eqiad.wmnet as Resolved.
Tue, Jul 10, 2:34 PM · Operations, ops-eqiad
Cmjohnson moved T194012: labsdb1004 and labsdb1005 some hard disks not healthy from Backlog to Cloud Tasks on the ops-eqiad board.
Tue, Jul 10, 2:27 PM · DC-Ops, Operations, ops-eqiad, Cloud-Services
Cmjohnson moved T199203: Relabel labvirt1022.eqiad.wmnet as cloudvirt1022.eqiad.wmnet from Not urgent to Cloud Tasks on the ops-eqiad board.
Tue, Jul 10, 2:27 PM · Operations, ops-eqiad
Cmjohnson moved T199132: Relabel labvirt1021.eqiad.wmnet as cloudvirt1021.eqiad.wmnet from Not urgent to Cloud Tasks on the ops-eqiad board.
Tue, Jul 10, 2:27 PM · ops-eqiad, Operations
Cmjohnson moved T199056: db1069 bad disk from Backlog to Up next on the ops-eqiad board.
Tue, Jul 10, 2:27 PM · Operations, ops-eqiad, DBA
Cmjohnson moved T199132: Relabel labvirt1021.eqiad.wmnet as cloudvirt1021.eqiad.wmnet from Backlog to Not urgent on the ops-eqiad board.
Tue, Jul 10, 2:27 PM · ops-eqiad, Operations
Cmjohnson moved T199203: Relabel labvirt1022.eqiad.wmnet as cloudvirt1022.eqiad.wmnet from Backlog to Not urgent on the ops-eqiad board.
Tue, Jul 10, 2:27 PM · Operations, ops-eqiad
Cmjohnson moved T198792: snapshot1005 does not power back up from Backlog to Being worked on on the ops-eqiad board.
Tue, Jul 10, 2:26 PM · Dumps-Generation, Patch-For-Review, DC-Ops, ops-eqiad, Operations
Cmjohnson added a comment to T198792: snapshot1005 does not power back up.

I attempted to power off, unplug, and power the server back on. Unfortunately, it does not want to power on; I just get a flashing green LED on the front. I am able to access the mgmt web portal and pulled the AHS log, a requirement from HP tech support. Worth noting that this is a leased server and the lease expires in February 2019. We should still fix the server, but is this a good opportunity to move the service?

Tue, Jul 10, 2:22 PM · Dumps-Generation, Patch-For-Review, DC-Ops, ops-eqiad, Operations

Tue, Jul 3

Cmjohnson updated the task description for T196691: rack/setup/install dns100[12].wikimedia.org.
Tue, Jul 3, 1:24 PM · DNS, ops-eqiad, Operations, Traffic
Cmjohnson updated the task description for T196484: rack/setup/install graphite1004.
Tue, Jul 3, 12:48 PM · monitoring, ops-eqiad, Operations
Cmjohnson updated the task description for T196698: rack/setup/install auth1002.
Tue, Jul 3, 12:35 PM · ops-eqiad, Operations

Mon, Jul 2

Cmjohnson triaged T190086: Decommission old server wmf4077 as Lowest priority.
Mon, Jul 2, 4:54 PM · decommission, Operations
Cmjohnson moved T197063: Decommission db1054 from Backlog to Decommission on the ops-eqiad board.
Mon, Jul 2, 4:16 PM · ops-eqiad, decommission, Operations, DBA
Cmjohnson moved T197630: decommission samarium.frack.eqiad.wmnet from Backlog to Decommission on the ops-eqiad board.
Mon, Jul 2, 4:16 PM · ops-eqiad, Operations
Cmjohnson moved T195484: Decommission db1051 from Backlog to Decommission on the ops-eqiad board.
Mon, Jul 2, 4:15 PM · ops-eqiad, Operations, decommission, DBA
Cmjohnson moved T198398: mw1239 correctable memory errors from Backlog to Up next on the ops-eqiad board.
Mon, Jul 2, 4:15 PM · ops-eqiad, Operations
Cmjohnson added a comment to T198398: mw1239 correctable memory errors.

@herron a DIMM swap is the next step. Can you please remove it from the dsh group so I can power it off and swap the DIMM?

Mon, Jul 2, 4:15 PM · ops-eqiad, Operations
Cmjohnson closed T198407: Degraded RAID on labstore1007 as Resolved.

The issue resulted from a disk shelf being added incorrectly. This has been fixed.

Mon, Jul 2, 4:14 PM · ops-eqiad, Operations
Cmjohnson closed T198408: Degraded RAID on labstore1006 as Declined.

This is related to the addition of the disk shelf.

Mon, Jul 2, 4:13 PM · ops-eqiad, Operations
Cmjohnson moved T198479: labvirt1009 HP Raid alert from Backlog to Blocked on the ops-eqiad board.
Mon, Jul 2, 4:12 PM · Operations, ops-eqiad, DC-Ops, cloud-services-team
Cmjohnson updated subscribers of T198479: labvirt1009 HP Raid alert.

@Bstorm this is a hot swap disk but the server is now out of warranty. @RobH should we order spare disks?

Mon, Jul 2, 4:11 PM · Operations, ops-eqiad, DC-Ops, cloud-services-team
Cmjohnson closed T194851: Degraded RAID on labvirt1019 as Invalid.
Mon, Jul 2, 4:06 PM · ops-eqiad, Operations
Cmjohnson closed T194841: Degraded raid in labnet1002 as Invalid.
Mon, Jul 2, 4:05 PM · cloud-services-team, DC-Ops, Operations, ops-eqiad

Thu, Jun 28

Cmjohnson updated the task description for T196701: rack/setup/install torrelay1001.wikimedia.org.
Thu, Jun 28, 4:33 PM · ops-eqiad, Operations
Cmjohnson added a comment to T196873: ms-be1036 in power off status, not responsive to power on commands.

I pushed the schedule and the HP tech came today. The server is back online. @godog please resolve if satisfied.

Thu, Jun 28, 3:46 PM · User-fgiunchedi, ops-eqiad, Operations
Cmjohnson updated the task description for T196693: rack/setup/install authdns1001.wikimedia.org.
Thu, Jun 28, 3:12 PM · Traffic, DNS, ops-eqiad, Operations
Cmjohnson added a comment to T196478: rack/setup/install backup1001.

disk arrays are racked in D2.

Thu, Jun 28, 2:40 PM · Patch-For-Review, Operations, ops-eqiad
Cmjohnson updated the task description for T196478: rack/setup/install backup1001.
Thu, Jun 28, 2:39 PM · Patch-For-Review, Operations, ops-eqiad
Cmjohnson updated the task description for T194186: rack/setup/install cloudelastic100[1-4].eqiad.wmnet systems.
Thu, Jun 28, 2:35 PM · ops-eqiad, cloud-services-team, Cloud-VPS, Operations
Cmjohnson added a comment to T196651: rack upgraded storage capacity in labstore100[67].eqiad.wmnet.

I was able to relocate a few servers in D2 to make room for the new disk shelf (LS1007). For LS1006, I just removed 2 decom'd servers from U24 and U25. This is close enough to labstore1006 that I do not need to move any other servers. They are racked with asset tags.

Thu, Jun 28, 2:08 PM · Patch-For-Review, Datasets-General-or-Unknown, ops-eqiad, Cloud-VPS, Operations
Cmjohnson updated the task description for T196651: rack upgraded storage capacity in labstore100[67].eqiad.wmnet.
Thu, Jun 28, 2:07 PM · Patch-For-Review, Datasets-General-or-Unknown, ops-eqiad, Cloud-VPS, Operations
Cmjohnson updated the task description for T196685: rack/setup/install rdb10[09|10].eqiad.wmnet.
Thu, Jun 28, 1:36 PM · ops-eqiad, User-Joe, User-Elukey, Operations
Cmjohnson added a comment to T197707: Degraded RAID on dbstore1002.

The disk has been swapped with a 2TB disk.

Thu, Jun 28, 12:05 PM · User-Elukey, Analytics, ops-eqiad, Operations

Wed, Jun 27

Cmjohnson added a comment to T149287: Heating alerts for mw servers in eqiad.

Thanks Moritz. I have a procurement task for more thermal paste. Once it arrives, we can schedule a time to take care of these.
procurement task https://phabricator.wikimedia.org/T198326

Wed, Jun 27, 3:18 PM · Operations, ops-eqiad
Cmjohnson added a comment to T193196: labnet1003 and labnet1004 moving and enabling 10G NICs.

labnet1004: the cable in eth4 is connected to the correct port, and according to the BIOS the MAC address is E0:07:1B:EF:15:D0, which is the one attempting to hit the installer.

Wed, Jun 27, 3:15 PM · Patch-For-Review, Cloud-VPS, Operations, ops-eqiad
Cmjohnson moved T185337: rack spare switches in c1-eqiad from Not urgent to Racking Tasks on the ops-eqiad board.
Wed, Jun 27, 2:56 PM · Operations, netops, ops-eqiad
Cmjohnson added a comment to T120856: Remove all out of warranty unused cp10xx's from A2.
Wed, Jun 27, 2:55 PM · DC-Ops, Operations, ops-eqiad
Cmjohnson added a comment to T149287: Heating alerts for mw servers in eqiad.

Most of the servers are decommissioned. Are you still having problems with

Wed, Jun 27, 2:51 PM · Operations, ops-eqiad
Cmjohnson moved T183390: unrack/decom pfw1-eqiad and pfw2-eqiad from Not urgent to UnRacking Tasks on the ops-eqiad board.
Wed, Jun 27, 2:49 PM · decommission, netops, Operations, ops-eqiad
Cmjohnson moved T185004: Decommission mw1201-mw1220 from Being worked on to UnRacking Tasks on the ops-eqiad board.
Wed, Jun 27, 2:49 PM · decommission, ops-eqiad, User-Joe, Operations
Cmjohnson moved T189566: Decommission eventlog1001 from Up next to UnRacking Tasks on the ops-eqiad board.
Wed, Jun 27, 2:49 PM · decommission, ops-eqiad, Operations
Cmjohnson moved T171168: cp1050 apparently stuck while "Initializing firmware interfaces..." from Not urgent to Up next on the ops-eqiad board.
Wed, Jun 27, 2:48 PM · Operations, Traffic, ops-eqiad
Cmjohnson closed T174449: tin has a failing hdd as Resolved.

This server now has a decom task https://phabricator.wikimedia.org/T196175

Wed, Jun 27, 2:48 PM · Release-Engineering-Team (Watching / External), ops-eqiad, Operations
Cmjohnson added a comment to T193628: tungsten disk 1 and 8 SMART failure.

Is there a plan to decommission this server soon?

Wed, Jun 27, 2:46 PM · ops-eqiad, Operations
Cmjohnson added a comment to T194234: anaytics1032's BBU is not working correctly.

@elukey let's do this tomorrow morning. I will ping you when I get to the data center in the morning.

Wed, Jun 27, 1:46 PM · ops-eqiad, Operations
Cmjohnson moved T194234: anaytics1032's BBU is not working correctly from Not urgent to Being worked on on the ops-eqiad board.
Wed, Jun 27, 1:45 PM · ops-eqiad, Operations
Cmjohnson added a comment to T194234: anaytics1032's BBU is not working correctly.

@elukey is this still an issue? If so, I do have a spare BBU I can install. Please let me know when you would like to schedule this.

Wed, Jun 27, 12:59 PM · ops-eqiad, Operations
Cmjohnson added a comment to T194855: Degraded RAID on labvirt1020.

@Bstorm after reinstall please let me know if this is still an issue.

Wed, Jun 27, 12:58 PM · ops-eqiad, Operations
Cmjohnson added a comment to T196252: Labservices1001 crashed.

@Andrew We need thermal paste. I have created a procurement task https://phabricator.wikimedia.org/T198326. Once it arrives I will ping you regarding a good day/time to power off.

Wed, Jun 27, 12:56 PM · Patch-For-Review, ops-eqiad, cloud-services-team, Operations
Cmjohnson closed T196751: labvirt1019 IPMI alert as Resolved.

This may very well have been partially unplugged during the 10G issues. Resolving the task. If it returns we can open again.

Wed, Jun 27, 12:51 PM · cloud-services-team, ops-eqiad, Operations, DC-Ops
Cmjohnson added a comment to T196507: Degraded RAID on labvirt1019.

@Bstorm Is this server fully functional? I wanted to wait until it's working and the connectivity issues were resolved before tackling the next set of issues. Thanks!

Wed, Jun 27, 12:49 PM · ops-eqiad, Operations
Cmjohnson moved T184293: rack/setup/install lvs101[3-6] from Blocked to Being worked on on the ops-eqiad board.
Wed, Jun 27, 12:48 PM · Patch-For-Review, ops-eqiad, Operations, Traffic
Cmjohnson moved T193196: labnet1003 and labnet1004 moving and enabling 10G NICs from Blocked to Cloud Tasks on the ops-eqiad board.
Wed, Jun 27, 12:48 PM · Patch-For-Review, Cloud-VPS, Operations, ops-eqiad
Cmjohnson moved T197707: Degraded RAID on dbstore1002 from Backlog to Being worked on on the ops-eqiad board.
Wed, Jun 27, 12:47 PM · User-Elukey, Analytics, ops-eqiad, Operations

Tue, Jun 26

Cmjohnson closed T190225: Decommission unused host wmf3565 as Resolved.
Tue, Jun 26, 4:53 PM · Patch-For-Review, ops-eqiad, Operations
Cmjohnson closed T190225: Decommission unused host wmf3565, a subtask of T187473: Decommission old and unused/spare servers in eqiad, as Resolved.
Tue, Jun 26, 4:53 PM · decommission, Operations, DC-Ops, ops-eqiad
Cmjohnson updated the task description for T190225: Decommission unused host wmf3565.
Tue, Jun 26, 4:53 PM · Patch-For-Review, ops-eqiad, Operations
Cmjohnson closed T187190: Decommission graphite1002 as Resolved.
Tue, Jun 26, 4:45 PM · decommission, Patch-For-Review, ops-eqiad, Operations
Cmjohnson updated the task description for T187190: Decommission graphite1002.
Tue, Jun 26, 4:45 PM · decommission, Patch-For-Review, ops-eqiad, Operations
Cmjohnson closed T184054: Decommission db1029 and db1031 as Resolved.
Tue, Jun 26, 4:44 PM · decommission, ops-eqiad, Operations, DBA
Cmjohnson closed T184054: Decommission db1029 and db1031, a subtask of T183469: Setup newer machines and replace all old misc (m*) and x1 eqiad machines, as Resolved.
Tue, Jun 26, 4:44 PM · Patch-For-Review, Operations, DBA
Cmjohnson closed T184054: Decommission db1029 and db1031, a subtask of T134476: Decommission old coredb machines (<=db1050), as Resolved.
Tue, Jun 26, 4:44 PM · decommission, Goal, Operations, DBA
Cmjohnson updated the task description for T184054: Decommission db1029 and db1031.
Tue, Jun 26, 4:44 PM · decommission, ops-eqiad, Operations, DBA
Cmjohnson closed T181750: decommission mobile 1004 and mobile1005 as Resolved.
Tue, Jun 26, 4:21 PM · Patch-For-Review, DC-Ops, Operations, ops-eqiad
Cmjohnson updated the task description for T181750: decommission mobile 1004 and mobile1005.
Tue, Jun 26, 4:21 PM · Patch-For-Review, DC-Ops, Operations, ops-eqiad
Cmjohnson closed T182034: Decommission osm-cp100[1-4] as Resolved.
Tue, Jun 26, 4:14 PM · DC-Ops, Operations, ops-eqiad
Cmjohnson updated the task description for T182034: Decommission osm-cp100[1-4].
Tue, Jun 26, 4:14 PM · DC-Ops, Operations, ops-eqiad
Cmjohnson closed T182033: Decommission osm-web100[1-4] as Resolved.
Tue, Jun 26, 4:12 PM · Patch-For-Review, DC-Ops, Operations, ops-eqiad
Cmjohnson updated the task description for T182033: Decommission osm-web100[1-4].
Tue, Jun 26, 4:12 PM · Patch-For-Review, DC-Ops, Operations, ops-eqiad
Cmjohnson moved T196697: rack/setup/add to spares tracking 2 single cpu misc class systems from Up next to Racking Tasks on the ops-eqiad board.
Tue, Jun 26, 3:58 PM · ops-eqiad, Operations
Cmjohnson moved T196690: rack/setup/install dbproxy101[2-7].eqiad.wmnet from Up next to Racking Tasks on the ops-eqiad board.
Tue, Jun 26, 3:58 PM · ops-eqiad, DBA, Operations
Cmjohnson moved T196698: rack/setup/install auth1002 from Up next to Racking Tasks on the ops-eqiad board.
Tue, Jun 26, 3:58 PM · ops-eqiad, Operations
Cmjohnson moved T196691: rack/setup/install dns100[12].wikimedia.org from Up next to Racking Tasks on the ops-eqiad board.
Tue, Jun 26, 3:57 PM · DNS, ops-eqiad, Operations, Traffic
Cmjohnson moved T196685: rack/setup/install rdb10[09|10].eqiad.wmnet from Up next to Racking Tasks on the ops-eqiad board.
Tue, Jun 26, 3:57 PM · ops-eqiad, User-Joe, User-Elukey, Operations
Cmjohnson moved T196484: rack/setup/install graphite1004 from Up next to Racking Tasks on the ops-eqiad board.
Tue, Jun 26, 3:57 PM · monitoring, ops-eqiad, Operations
Cmjohnson moved T196701: rack/setup/install torrelay1001.wikimedia.org from Up next to Racking Tasks on the ops-eqiad board.
Tue, Jun 26, 3:57 PM · ops-eqiad, Operations
Cmjohnson moved T196651: rack upgraded storage capacity in labstore100[67].eqiad.wmnet from Up next to Racking Tasks on the ops-eqiad board.
Tue, Jun 26, 3:57 PM · Patch-For-Review, Datasets-General-or-Unknown, ops-eqiad, Cloud-VPS, Operations
Cmjohnson moved T196693: rack/setup/install authdns1001.wikimedia.org from Up next to Racking Tasks on the ops-eqiad board.
Tue, Jun 26, 3:57 PM · Traffic, DNS, ops-eqiad, Operations
Cmjohnson moved T195923: rack/setup/install cp1075-cp1090 from Up next to Racking Tasks on the ops-eqiad board.
Tue, Jun 26, 3:57 PM · Patch-For-Review, ops-eqiad, Traffic, Operations
Cmjohnson moved T194186: rack/setup/install cloudelastic100[1-4].eqiad.wmnet systems from Being worked on to Racking Tasks on the ops-eqiad board.
Tue, Jun 26, 3:57 PM · ops-eqiad, cloud-services-team, Cloud-VPS, Operations
Cmjohnson moved T196478: rack/setup/install backup1001 from Up next to Racking Tasks on the ops-eqiad board.
Tue, Jun 26, 3:57 PM · Patch-For-Review, Operations, ops-eqiad
Cmjohnson added a comment to T196901: Replace memory bank on scb1002.

@Joe can you stress the DIMM? A simple reseating of the DIMM may also work. Let me know if I can power it down and do that.

Tue, Jun 26, 3:55 PM · Operations, ops-eqiad, DC-Ops
Cmjohnson added a comment to T196901: Replace memory bank on scb1002.

The hardware log does not show any indication of a bad DIMM. I can probably pull one from a decommissioned spare.

Tue, Jun 26, 3:54 PM · Operations, ops-eqiad, DC-Ops
Cmjohnson added a comment to T196873: ms-be1036 in power off status, not responsive to power on commands.

This is scheduled for this coming Friday 29/6/2018 at 1000(EST)

Tue, Jun 26, 3:52 PM · User-fgiunchedi, ops-eqiad, Operations
Cmjohnson moved T194855: Degraded RAID on labvirt1020 from Being worked on to Cloud Tasks on the ops-eqiad board.
Tue, Jun 26, 3:51 PM · ops-eqiad, Operations
Cmjohnson moved T194964: Connect or troubleshoot eth1 on labvirt1019 and labvirt1020 from Blocked to Cloud Tasks on the ops-eqiad board.
Tue, Jun 26, 3:51 PM · Cloud-Services, Patch-For-Review, Operations, ops-eqiad
Cmjohnson moved T196252: Labservices1001 crashed from Being worked on to Cloud Tasks on the ops-eqiad board.
Tue, Jun 26, 3:51 PM · Patch-For-Review, ops-eqiad, cloud-services-team, Operations
Cmjohnson moved T193655: rack/setup/install labstore1008 & labstore1009 from Being worked on to Cloud Tasks on the ops-eqiad board.
Tue, Jun 26, 3:51 PM · cloud-services-team (Kanban), Patch-For-Review, ops-eqiad, Cloud-VPS, Operations
Cmjohnson moved T196751: labvirt1019 IPMI alert from Not urgent to Cloud Tasks on the ops-eqiad board.
Tue, Jun 26, 3:51 PM · cloud-services-team, ops-eqiad, Operations, DC-Ops
Cmjohnson moved T196507: Degraded RAID on labvirt1019 from Not urgent to Cloud Tasks on the ops-eqiad board.
Tue, Jun 26, 3:50 PM · ops-eqiad, Operations
Cmjohnson added a comment to T193196: labnet1003 and labnet1004 moving and enabling 10G NICs.

cabled
bios updated
switch cfg updated
xe-4/0/3 up up labnet1004 eth0
xe-4/0/46 up up labnet1004 eth1

Tue, Jun 26, 3:48 PM · Patch-For-Review, Cloud-VPS, Operations, ops-eqiad