eqiad: Move logstash1020 to rack A8
Closed, Resolved | Public

Assigned To

Authored By

	• Cmjohnson
	Feb 5 2021, 3:55 PM

Description

@herron This is related to the an-worker task. I need to move this server and would like to do so on 8 Feb at 15:30 UTC. The server will stay in the same VLAN, just in a different rack location. I intend to move it to rack A8, ge-8/0/11.

Please confirm this will work for you.

Related Objects

Mentioned In: T260445: (Need By: TBD) rack/setup/install an-worker11[18-41]

Event Timeline

• Cmjohnson created this task. Feb 5 2021, 3:55 PM

• Cmjohnson mentioned this in T260445: (Need By: TBD) rack/setup/install an-worker11[18-41]. Feb 5 2021, 4:04 PM

Added Filippo and Cole too for awareness. The idea is to shut down the node, move it to a different rack within the same row (so no IP/VLAN change) and boot it up again. In my experience the downtime should be about 30 minutes at most, but it may also depend on how busy Chris is in the DC, whether there are emergencies, etc.

This will free rack space for Hadoop worker nodes so thanks a lot in advance for helping!

herron added a project: observability. Feb 5 2021, 4:36 PM

Hey @Cmjohnson, @elukey, sure, this should be no problem. I've set a reminder in my calendar to stop services on this host ahead of the window, and yup, as long as the host/network config stays the same, ES should do the right thing when services are brought back up. I'd like to monitor it as it comes up though, so just shoot me a ping when ready. Thanks!

Maintenance_bot added a project: SRE. Feb 5 2021, 4:45 PM

RhinosF1 subscribed. Feb 5 2021, 5:38 PM

Mentioned in SAL (#wikimedia-operations) [2021-02-08T14:50:30Z] <herron> stopped ES on logstash1020 in prep for re-rack T273984

@herron Thanks! All finished, and I was able to ssh to the server.

herron awarded a token. Feb 8 2021, 4:52 PM

logstash1020.mgmt is shown as down in Icinga; reopening.

logstash1020.mgmt 
DOWN	2021-02-09 10:08:30	0d 18h 33m 24s	1/2	PING CRITICAL - Packet loss = 100%
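
For reference, a rough way to estimate when the mgmt interface went down. This is a minimal sketch that assumes the columns in the Icinga row above are status, last check time, duration, attempt, and status text (the usual host-status layout, not confirmed in this task). The variable names are illustrative only.

```python
from datetime import datetime, timedelta

# Values copied from the Icinga row above.
# Assumption: the second column is the last check time and the third is
# how long the host has been in the DOWN state.
last_check = datetime(2021, 2, 9, 10, 8, 30)
duration = timedelta(days=0, hours=18, minutes=33, seconds=24)

# Subtracting the duration from the last check gives the approximate
# moment the host entered the DOWN state.
down_since = last_check - duration
print(down_since.isoformat())  # 2021-02-08T15:35:06
```

If the Icinga timestamps are in UTC (also an assumption), this lands just after the planned 15:30 UTC start of the move on 8 Feb, which would be consistent with the mgmt interface not coming back after the re-rack.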

• Cmjohnson moved this task from Backlog to Hardware Failure / Troubleshoot on the ops-eqiad board. Feb 9 2021, 3:57 PM

fixed

The last Puppet run was at Mon Feb 8 14:16:19 UTC 2021 (19799 minutes ago). Puppet is disabled. disabled for re-racking T273984 --herron
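
As a quick sanity check on the message above, the "minutes ago" figure can be turned back into an approximate timestamp. This is only a sketch: it assumes the minute count is measured from the last Puppet run shown in the same message, and the variable names are illustrative.

```python
from datetime import datetime, timedelta, timezone

# Values copied from the message above.
last_run = datetime(2021, 2, 8, 14, 16, 19, tzinfo=timezone.utc)
minutes_ago = 19799

# Adding the elapsed minutes to the last run gives roughly when the
# message was generated (the minute count is likely rounded or truncated).
reported_at = last_run + timedelta(minutes=minutes_ago)
print(reported_at.isoformat())  # 2021-02-22T08:15:19+00:00
print((reported_at - last_run).days)  # 13
```

Under that assumption, Puppet had been disabled for nearly two weeks after the re-rack by the time the message was produced.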

Because of that, it has been removed from PuppetDB and is alerting in the Netbox report (good safeguard!).

Please re-enable it if it's safe to do so, and check that the Netbox report isn't alerting anymore: https://netbox.wikimedia.org/extras/reports/puppetdb.PhysicalHosts/

Thanks @ayounsi, it's been re-enabled and Puppet has been run.
