Cmjohnson (cmjohnson)
User

Projects (13)

User Details

User Since
Dec 16 2014, 10:22 PM (152 w, 2 d)
Availability
Available
IRC Nick
cmjohnson1
LDAP User
Cmjohnson
MediaWiki User
Unknown

Recent Activity

Today

Cmjohnson moved T180700: Rack and setup db1109 and db1110 from Backlog to Blocked on the ops-eqiad board.
Fri, Nov 17, 2:14 PM · Patch-For-Review, ops-eqiad, Operations, DBA
Cmjohnson assigned T180700: Rack and setup db1109 and db1110 to Marostegui.

@Marostegui These are ready for you

Fri, Nov 17, 2:14 PM · Patch-For-Review, ops-eqiad, Operations, DBA
Cmjohnson updated the task description for T180700: Rack and setup db1109 and db1110.
Fri, Nov 17, 2:13 PM · Patch-For-Review, ops-eqiad, Operations, DBA
Cmjohnson added a comment to T150651: Information missing from racktables.

@faidon
db1018 and db1022 are confirmed in racktables but are both decommissioned and removed from the rack.

Fri, Nov 17, 1:44 PM · Operations, DC-Ops
Cmjohnson updated the task description for T180700: Rack and setup db1109 and db1110.
Fri, Nov 17, 1:10 PM · Patch-For-Review, ops-eqiad, Operations, DBA

Yesterday

Cmjohnson updated the task description for T180700: Rack and setup db1109 and db1110.
Thu, Nov 16, 6:58 PM · Patch-For-Review, ops-eqiad, Operations, DBA

Tue, Nov 14

Cmjohnson updated the task description for T177387: Decomission mw1161-69.
Tue, Nov 14, 4:55 PM · Patch-For-Review, User-Elukey, User-Joe, Operations, ops-eqiad
Cmjohnson added a comment to T177387: Decomission mw1161-69.

@elukey or @Joe I went to finish the decom and found that 2 hosts still show up in puppet. Please give me the okay to proceed.

Tue, Nov 14, 4:38 PM · Patch-For-Review, User-Elukey, User-Joe, Operations, ops-eqiad
Cmjohnson moved T177387: Decomission mw1161-69 from Up next to Being worked on on the ops-eqiad board.
Tue, Nov 14, 4:26 PM · Patch-For-Review, User-Elukey, User-Joe, Operations, ops-eqiad
Cmjohnson moved T178162: Decommission db1050 from Decommission to Being worked on on the ops-eqiad board.
Tue, Nov 14, 4:26 PM · hardware-requests, ops-eqiad, Operations, Patch-For-Review, DBA
Cmjohnson moved T175679: Decommission db1048 (was Move m3 slave to db1059) from Decommission to Being worked on on the ops-eqiad board.
Tue, Nov 14, 4:26 PM · Operations, ops-eqiad, Phabricator, DBA
Cmjohnson moved T175264: Decommission db1049 from Decommission to Being worked on on the ops-eqiad board.
Tue, Nov 14, 4:26 PM · hardware-requests, ops-eqiad, Operations, DBA
Cmjohnson moved T174806: Decommission db1045 from Decommission to Being worked on on the ops-eqiad board.
Tue, Nov 14, 4:26 PM · Patch-For-Review, hardware-requests, ops-eqiad, Operations, DBA
Cmjohnson moved T174763: Decommission db1026 from Decommission to Being worked on on the ops-eqiad board.
Tue, Nov 14, 4:26 PM · hardware-requests, Patch-For-Review, ops-eqiad, Operations, DBA
Cmjohnson moved T173570: Decommission db1015 from Decommission to Being worked on on the ops-eqiad board.
Tue, Nov 14, 4:26 PM · Patch-For-Review, hardware-requests, ops-eqiad, DBA, Operations
Cmjohnson closed T173915: Decommission db1041 as Resolved.

Wiped, racktables updated

Tue, Nov 14, 4:25 PM · hardware-requests, Patch-For-Review, ops-eqiad, Operations, DBA
Cmjohnson closed T173915: Decommission db1041, a subtask of T134476: Decommission old coredb machines (<=db1050), as Resolved.
Tue, Nov 14, 4:25 PM · Patch-For-Review, Operations, DBA
Cmjohnson closed T177911: Decommission db1038 as Resolved.

Wiped, racktables updated

Tue, Nov 14, 4:25 PM · Operations, hardware-requests, ops-eqiad, Patch-For-Review, DBA
Cmjohnson closed T177911: Decommission db1038, a subtask of T134476: Decommission old coredb machines (<=db1050), as Resolved.
Tue, Nov 14, 4:25 PM · Patch-For-Review, Operations, DBA
Cmjohnson closed T177911: Decommission db1038, a subtask of T164488: Run pt-table-checksum on s3, as Resolved.
Tue, Nov 14, 4:25 PM · Patch-For-Review, DBA
Cmjohnson closed T174902: Decommission db1037, a subtask of T134476: Decommission old coredb machines (<=db1050), as Resolved.
Tue, Nov 14, 4:24 PM · Patch-For-Review, Operations, DBA
Cmjohnson closed T174902: Decommission db1037 as Resolved.

wiped, racktables updated

Tue, Nov 14, 4:24 PM · hardware-requests, Patch-For-Review, Operations, ops-eqiad, DBA
Cmjohnson closed T176311: decommission db1036 as Resolved.

wiped, racktables updated

Tue, Nov 14, 4:24 PM · hardware-requests, ops-eqiad, Patch-For-Review, Operations, DBA
Cmjohnson closed T176311: decommission db1036, a subtask of T162699: Decomissions old s2 eqiad hosts (db1018, db1021, db1024, db1036), as Resolved.
Tue, Nov 14, 4:24 PM · Patch-For-Review, Operations, DBA
Cmjohnson closed T176931: Decommission db1035 as Resolved.

wiped, racktables updated.

Tue, Nov 14, 4:24 PM · hardware-requests, ops-eqiad, Operations, DBA
Cmjohnson closed T176931: Decommission db1035, a subtask of T134476: Decommission old coredb machines (<=db1050), as Resolved.
Tue, Nov 14, 4:24 PM · Patch-For-Review, Operations, DBA
Cmjohnson closed T174076: Decommission db1033 and db1028 as Resolved.

Wiped, removed from racktables

Tue, Nov 14, 4:21 PM · hardware-requests, ops-eqiad, Operations, DBA
Cmjohnson closed T174076: Decommission db1033 and db1028, a subtask of T134476: Decommission old coredb machines (<=db1050), as Resolved.
Tue, Nov 14, 4:21 PM · Patch-For-Review, Operations, DBA

Wed, Nov 8

Cmjohnson added a comment to T173570: Decommission db1015.

@Marostegui during my decom checks I found db1015 in this file. Should a replacement be identified?

Wed, Nov 8, 8:58 PM · Patch-For-Review, hardware-requests, ops-eqiad, DBA, Operations
Cmjohnson added a comment to T175679: Decommission db1048 (was Move m3 slave to db1059).

@jcrespo @Marostegui is it safe to finish off db1048?

Wed, Nov 8, 8:53 PM · Operations, ops-eqiad, Phabricator, DBA
Cmjohnson closed T179727: Degraded RAID on db1059 as Resolved.

Looks like the rebuild is complete and all disks are back online

Wed, Nov 8, 8:43 PM · DBA, ops-eqiad, Operations
Cmjohnson updated the task description for T173915: Decommission db1041.
Wed, Nov 8, 8:37 PM · hardware-requests, Patch-For-Review, ops-eqiad, Operations, DBA
Cmjohnson updated the task description for T177911: Decommission db1038.
Wed, Nov 8, 8:37 PM · Operations, hardware-requests, ops-eqiad, Patch-For-Review, DBA
Cmjohnson updated the task description for T174902: Decommission db1037.
Wed, Nov 8, 8:37 PM · hardware-requests, Patch-For-Review, Operations, ops-eqiad, DBA
Cmjohnson updated the task description for T176311: decommission db1036.
Wed, Nov 8, 8:36 PM · hardware-requests, ops-eqiad, Patch-For-Review, Operations, DBA
Cmjohnson updated the task description for T176931: Decommission db1035.
Wed, Nov 8, 8:36 PM · hardware-requests, ops-eqiad, Operations, DBA
Cmjohnson updated the task description for T174076: Decommission db1033 and db1028.
Wed, Nov 8, 8:36 PM · hardware-requests, ops-eqiad, Operations, DBA
Cmjohnson closed T172323: Decommission WMF3248 (old R510) as Resolved.
Wed, Nov 8, 7:01 PM · hardware-requests, ops-eqiad, Operations
Cmjohnson updated the task description for T172323: Decommission WMF3248 (old R510).
Wed, Nov 8, 7:01 PM · hardware-requests, ops-eqiad, Operations
Cmjohnson moved T173915: Decommission db1041 from Decommission to Being worked on on the ops-eqiad board.
Wed, Nov 8, 6:52 PM · hardware-requests, Patch-For-Review, ops-eqiad, Operations, DBA
Cmjohnson updated the task description for T177911: Decommission db1038.
Wed, Nov 8, 6:52 PM · Operations, hardware-requests, ops-eqiad, Patch-For-Review, DBA
Cmjohnson updated the task description for T174902: Decommission db1037.
Wed, Nov 8, 6:52 PM · hardware-requests, Patch-For-Review, Operations, ops-eqiad, DBA
Cmjohnson updated the task description for T176311: decommission db1036.
Wed, Nov 8, 6:50 PM · hardware-requests, ops-eqiad, Patch-For-Review, Operations, DBA
Cmjohnson updated the task description for T176931: Decommission db1035.
Wed, Nov 8, 6:49 PM · hardware-requests, ops-eqiad, Operations, DBA
Cmjohnson updated the task description for T174076: Decommission db1033 and db1028.
Wed, Nov 8, 6:49 PM · hardware-requests, ops-eqiad, Operations, DBA
Cmjohnson moved T179968: setup/install lawrencium for temp use by performance team from Backlog to Blocked on the ops-eqiad board.
Wed, Nov 8, 6:48 PM · Patch-For-Review, Performance-Team, Operations
Cmjohnson reassigned T179968: setup/install lawrencium for temp use by performance team from Cmjohnson to RobH.

All on-site work is complete; assigning to @RobH

Wed, Nov 8, 5:10 PM · Patch-For-Review, Performance-Team, Operations
Cmjohnson updated the task description for T179968: setup/install lawrencium for temp use by performance team.
Wed, Nov 8, 5:09 PM · Patch-For-Review, Performance-Team, Operations
Cmjohnson added a comment to T179727: Degraded RAID on db1059.

Disk has been swapped

Wed, Nov 8, 5:01 PM · DBA, ops-eqiad, Operations

Tue, Nov 7

Gilles awarded T179968: setup/install lawrencium for temp use by performance team a Love token.
Tue, Nov 7, 6:53 PM · Patch-For-Review, Performance-Team, Operations
Cmjohnson moved T174076: Decommission db1033 and db1028 from Decommission to Being worked on on the ops-eqiad board.
Tue, Nov 7, 3:33 PM · hardware-requests, ops-eqiad, Operations, DBA
Cmjohnson moved T177911: Decommission db1038 from Decommission to Being worked on on the ops-eqiad board.
Tue, Nov 7, 3:33 PM · Operations, hardware-requests, ops-eqiad, Patch-For-Review, DBA
Cmjohnson moved T174902: Decommission db1037 from Decommission to Being worked on on the ops-eqiad board.
Tue, Nov 7, 3:33 PM · hardware-requests, Patch-For-Review, Operations, ops-eqiad, DBA
Cmjohnson moved T171473: labvirt1015 crashes from Being worked on to Blocked on the ops-eqiad board.
Tue, Nov 7, 3:33 PM · cloud-services-team (Kanban), DC-Ops, ops-eqiad, Operations
Cmjohnson moved T177911: Decommission db1038 from Backlog to Decommission on the ops-eqiad board.
Tue, Nov 7, 3:33 PM · Operations, hardware-requests, ops-eqiad, Patch-For-Review, DBA
Cmjohnson moved T179727: Degraded RAID on db1059 from Backlog to Up next on the ops-eqiad board.
Tue, Nov 7, 3:32 PM · DBA, ops-eqiad, Operations

Mon, Nov 6

Cmjohnson added a comment to T179042: Setup eqsin RIPE Atlas anchor.

I have connected the RIPE Atlas anchor to iron if you want to load the image.

Mon, Nov 6, 2:23 PM · ops-eqiad, netops, Operations

Fri, Nov 3

Cmjohnson added a comment to T171473: labvirt1015 crashes.

@chasemp please try again, I replaced the broken CPU.

Fri, Nov 3, 4:13 PM · cloud-services-team (Kanban), DC-Ops, ops-eqiad, Operations

Thu, Nov 2

Cmjohnson added a comment to T171473: labvirt1015 crashes.

The CPU was replaced and idrac log cleared.

Thu, Nov 2, 5:59 PM · cloud-services-team (Kanban), DC-Ops, ops-eqiad, Operations
Cmjohnson added a comment to T176975: connect second interface for each frack to opposite switch for each eqiad host.

The 2nd interfaces are connected and the switch descriptions are updated. I did not enable the ports.

Thu, Nov 2, 5:58 PM · ops-eqiad, netops, fundraising-tech-ops, Operations
Cmjohnson added a comment to T175625: scs-c1-eqiad unresponsive.

Tested a standard ethernet cable and it works fine. It appears that the custom pinout for the cable is no longer required and each of the cables will need to be re-done.

Thu, Nov 2, 5:33 PM · ops-eqiad, DC-Ops, Operations

Mon, Oct 30

Cmjohnson closed T176215: decommission db1018, a subtask of T162699: Decomissions old s2 eqiad hosts (db1018, db1021, db1024, db1036), as Resolved.
Mon, Oct 30, 2:29 PM · Patch-For-Review, Operations, DBA
Cmjohnson closed T176215: decommission db1018 as Resolved.
Mon, Oct 30, 2:29 PM · ops-eqiad, Operations, DBA
Cmjohnson added a comment to T176215: decommission db1018.

Server has been wiped and removed from rack...racktables updated

Mon, Oct 30, 2:28 PM · ops-eqiad, Operations, DBA
Cmjohnson updated the task description for T176215: decommission db1018.
Mon, Oct 30, 2:28 PM · ops-eqiad, Operations, DBA
Cmjohnson moved T179042: Setup eqsin RIPE Atlas anchor from Backlog to Up next on the ops-eqiad board.
Mon, Oct 30, 2:24 PM · ops-eqiad, netops, Operations
Cmjohnson closed T166171: rack/setup/wire/deploy msw2-c1-eqiad as Resolved.

Completed.

Mon, Oct 30, 2:22 PM · fundraising-tech-ops, netops, ops-eqiad, Operations
Cmjohnson closed T169644: eqiad: rack frack refresh equipment as Resolved.

this has been completed.

Mon, Oct 30, 2:21 PM · Patch-For-Review, Operations, netops, ops-eqiad
Cmjohnson moved T175625: scs-c1-eqiad unresponsive from Blocked to Being worked on on the ops-eqiad board.
Mon, Oct 30, 2:21 PM · ops-eqiad, DC-Ops, Operations
Cmjohnson reassigned T175625: scs-c1-eqiad unresponsive from Cmjohnson to RobH.

I've replaced the console and set it up so it's accessible. I set up all the ports, but I am not able to access them via pmshell. @RobH, could you look into this and see if I missed something, please?

Mon, Oct 30, 2:21 PM · ops-eqiad, DC-Ops, Operations
Cmjohnson moved T179192: Check analytics1037 power supply status from Backlog to Blocked on the ops-eqiad board.
Mon, Oct 30, 2:19 PM · ops-eqiad, Operations, User-Elukey, Analytics
Cmjohnson moved T178742: Possibly faulty BBU on analytics1029 from Backlog to Blocked on the ops-eqiad board.
Mon, Oct 30, 2:19 PM · User-Elukey, Operations, Analytics, ops-eqiad
Cmjohnson assigned T179192: Check analytics1037 power supply status to RobH.

This server is out of warranty by 6 months. Assigning to @RobH to determine if we should order a new one... probably two.

Mon, Oct 30, 2:19 PM · ops-eqiad, Operations, User-Elukey, Analytics
Cmjohnson assigned T178742: Possibly faulty BBU on analytics1029 to RobH.

This server is out of warranty by 6 months. Assigning to @RobH to determine if we should order a new one.

Mon, Oct 30, 2:18 PM · User-Elukey, Operations, Analytics, ops-eqiad
Cmjohnson moved T176957: Decommission host copper.eqiad.wmnet from Backlog to Decommission on the ops-eqiad board.
Mon, Oct 30, 2:16 PM · hardware-requests, ops-eqiad, Packaging, Operations
Cmjohnson renamed T176957: Decommission host copper.eqiad.wmnet from Deprecate host copper.eqiad.wmnet to Decommission host copper.eqiad.wmnet.
Mon, Oct 30, 2:16 PM · hardware-requests, ops-eqiad, Packaging, Operations
Cmjohnson closed T179129: Degraded RAID on tungsten as Resolved.

Replaced the disk but this server should be replaced and decommissioned sooner rather than later.

Mon, Oct 30, 2:15 PM · ops-eqiad, Operations

Sat, Oct 28

Cmjohnson added a comment to T179192: Check analytics1037 power supply status.

I am not sure what else can be done here. I’ve replaced the PSUs twice,
upgraded f/w and they continue to burn through the fans. This group of
servers is almost a year out of warranty.

Sat, Oct 28, 12:22 PM · ops-eqiad, Operations, User-Elukey, Analytics

Fri, Oct 27

Cmjohnson added a comment to T175150: Decommission stat1003.eqiad.wmnet.

@Ottomata okay please update task when it's okay to wipe. Thanks

Fri, Oct 27, 5:42 PM · ops-eqiad, Operations

Wed, Oct 25

Cmjohnson closed T166489: Decommission ms-be1001 - ms-be1012 as Resolved.

resolved

Wed, Oct 25, 6:12 PM · ops-eqiad, hardware-requests, User-fgiunchedi, Operations
Cmjohnson updated the task description for T166489: Decommission ms-be1001 - ms-be1012 .
Wed, Oct 25, 6:11 PM · ops-eqiad, hardware-requests, User-fgiunchedi, Operations
Cmjohnson added a comment to T178383: db1101 crashed - memory errors.

Dell declined to send the new DIMM, stating that my supporting documentation was insufficient. I swapped the DIMM at A4 to B4 and will need to wait for that to fail before submitting again.

Wed, Oct 25, 4:49 PM · ops-eqiad, Operations, DBA
Cmjohnson added a comment to T171473: labvirt1015 crashes.

Dell declined the new system board. We are getting another CPU, since that is the part that seems to be broken.

Wed, Oct 25, 4:48 PM · cloud-services-team (Kanban), DC-Ops, ops-eqiad, Operations

Mon, Oct 23

Cmjohnson moved T178460: db1082 storage crashed from Being worked on to Blocked on the ops-eqiad board.
Mon, Oct 23, 5:03 PM · Patch-For-Review, ops-eqiad, Operations, DBA
Cmjohnson added a comment to T178460: db1082 storage crashed.

The controller has been replaced and the server has been powered on. @Marostegui please resolve task when you're comfortable with the new controller.

Mon, Oct 23, 5:03 PM · Patch-For-Review, ops-eqiad, Operations, DBA
Cmjohnson added a comment to T171473: labvirt1015 crashes.

@chasemp, no, unfortunately it does not work that way. A new CPU and motherboard have been requested through Dell. I believe that will fix the issue. The CPU they sent the first time is refurbished, so it's possible it may have been bad.

Mon, Oct 23, 4:15 PM · cloud-services-team (Kanban), DC-Ops, ops-eqiad, Operations
Cmjohnson added a comment to T178460: db1082 storage crashed.

@Marostegui the HP tech will be at the data center today to swap the controller. Is the server depooled?

Mon, Oct 23, 1:29 PM · Patch-For-Review, ops-eqiad, Operations, DBA
Cmjohnson moved T178460: db1082 storage crashed from Blocked to Being worked on on the ops-eqiad board.
Mon, Oct 23, 1:28 PM · Patch-For-Review, ops-eqiad, Operations, DBA

Fri, Oct 20

Cmjohnson added a comment to T177227: Multiple servers in eqiad D8 showing PSU failures.

Replaced both PSUs in analytics1037; the PSU in an1036 cleared the error and is functioning normally as far as I can tell.

Fri, Oct 20, 4:06 PM · ops-eqiad, DC-Ops, Operations
Cmjohnson added a comment to T171473: labvirt1015 crashes.

@andrew..wrote that in the wrong ticket

Fri, Oct 20, 3:59 PM · cloud-services-team (Kanban), DC-Ops, ops-eqiad, Operations
Cmjohnson added a comment to T171473: labvirt1015 crashes.
Fri, Oct 20, 3:30 PM · cloud-services-team (Kanban), DC-Ops, ops-eqiad, Operations
Cmjohnson added a comment to T171473: labvirt1015 crashes.

Still no sign of failure from the h/w log... it took a while last time

Fri, Oct 20, 3:05 PM · cloud-services-team (Kanban), DC-Ops, ops-eqiad, Operations
Cmjohnson closed T177227: Multiple servers in eqiad D8 showing PSU failures as Resolved.

@herron @faidon I updated the f/w on both servers and the issue has been resolved.

Fri, Oct 20, 2:44 PM · ops-eqiad, DC-Ops, Operations

Thu, Oct 19

Cmjohnson added a comment to T177227: Multiple servers in eqiad D8 showing PSU failures.

They all have the same problem. I swapped the PSUs for both an1036 and an1037 yesterday, but they still show the failure. The new PSUs are failing after they are installed. These servers are now out of warranty.

Thu, Oct 19, 6:28 PM · ops-eqiad, DC-Ops, Operations

Oct 18 2017

Cmjohnson closed T177633: check kafka1022 power supply status as Resolved.

This server's warranty expired 2 years ago. The R720XDs have a history of killing power supplies. I inserted a new power supply and it was fried immediately. This server should be decommissioned sooner rather than later. @Ottomata

Oct 18 2017, 5:07 PM · ops-eqiad, Operations
Cmjohnson closed T177631: check elastic1022 power supply redundancy as Resolved.

Verified that the settings are all correct; both racadm and the web UI show no problems with the power supplies.

Oct 18 2017, 5:04 PM · Elasticsearch, ops-eqiad, Discovery-Search, Operations, Discovery
Cmjohnson moved T177405: rack and setup db1107 and db1108 from Up next to Blocked on the ops-eqiad board.
Oct 18 2017, 5:01 PM · Patch-For-Review, Analytics-Kanban, User-Elukey, ops-eqiad, Operations
Cmjohnson closed T177634: check mc1016 power supply redundancy as Declined.

This server is already scheduled for decommission: T164341

Oct 18 2017, 5:01 PM · ops-eqiad, Operations
Cmjohnson closed T177637: check mw1203 power supply redundancy as Resolved.

This PSU on the surface appears to be fine, no LEDs, and racadm showed it as normal. The mgmt U/I showed PS1 failed. Replaced from a recent decom (out of warranty).

Oct 18 2017, 5:00 PM · ops-eqiad, Operations
Cmjohnson closed T177635: check mw1200 power supply redundancy as Resolved.

Strange: the PSU shows active with a green light and hot spare was disabled, but the mgmt U/I shows PSU2 failed. I replaced it with a PSU from a recent decom and it is back to normal.

Oct 18 2017, 4:56 PM · ops-eqiad, Operations