Maniphest T338408

decommission analytics1059.eqiad.wmnet
Closed, Resolved | Public | Request

Assigned To

	Jclark-ctr

Authored By

	Stevemunene
	Jun 8 2023, 1:58 AM

Tags

Referenced Files

None

Subscribers

Details

	Subject	Repo	Branch	Lines +/-
	analytics: Remove analytics58_60 from the HDFS topology	operations/puppet	production	+0 -3
	analytics: Decommission analytics10[59-60] from hadoop cluster	operations/puppet	production	+4 -0


Related Objects

		Status	Subtype	Assigned	Task
		Resolved		Stevemunene	T317861 Decommission analytics10[58-69]
		Resolved	Request	Jclark-ctr	T338408 decommission analytics1059.eqiad.wmnet
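
The task and patch titles above use bracketed range notation (analytics10[58-69], analytics10[59-60]) to name several hosts at once. As a reading aid, here is a minimal sketch of how such a range expands into individual host names. The function name expand_host_range and the optional domain argument are hypothetical and not part of any Wikimedia tooling; the sketch handles only a single numeric range, and appending .eqiad.wmnet to every host is an assumption based on the one fully qualified name given in this task.

```python
import re


def expand_host_range(pattern: str, domain: str = "") -> list[str]:
    """Expand one 'prefix[start-end]' range into individual host names.

    Hypothetical helper for illustration only: it supports a single
    numeric range and returns the pattern unchanged if no range is found.
    """
    match = re.fullmatch(r"([^\[\]]+)\[(\d+)-(\d+)\]", pattern)
    if match is None:
        return [pattern + domain]

    prefix, start, end = match.groups()
    width = len(start)  # keep any zero padding used in the range bounds
    return [
        f"{prefix}{number:0{width}d}{domain}"
        for number in range(int(start), int(end) + 1)
    ]


if __name__ == "__main__":
    # Assumption: all hosts in the range share the eqiad.wmnet domain.
    for host in expand_host_range("analytics10[59-60]", ".eqiad.wmnet"):
        print(host)
```

Under these assumptions, the parent task's range analytics10[58-69] covers twelve hosts, of which analytics1059 is the one handled in this task.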

Event Timeline

Stevemunene created this task. Jun 8 2023, 1:58 AM

Stevemunene added a parent task: T317861: Decommission analytics10[58-69].

Stevemunene added a project: Data-Platform-SRE. Jun 8 2023, 2:02 AM

Change 928478 had a related patch set uploaded (by Stevemunene; author: Stevemunene):

[operations/puppet@production] analytics: Decommission analytics10[59-60] from hadoop cluster

https://gerrit.wikimedia.org/r/928478

Change 928479 had a related patch set uploaded (by Stevemunene; author: Stevemunene):

[operations/puppet@production] analytics: Remove analytics58_60 from the HDFS topology

https://gerrit.wikimedia.org/r/928479

Change 928478 merged by Stevemunene:

[operations/puppet@production] analytics: Decommission analytics10[59-60] from hadoop cluster

https://gerrit.wikimedia.org/r/928478

Stevemunene moved this task from Incoming to In Progress on the Data-Platform-SRE board. Jun 13 2023, 3:23 PM

Change 928479 merged by Stevemunene:

[operations/puppet@production] analytics: Remove analytics58_60 from the HDFS topology

https://gerrit.wikimedia.org/r/928479

Maintenance_bot removed a project: Patch-For-Review. Jun 14 2023, 1:30 PM

Stevemunene updated the task description. Jun 16 2023, 1:43 PM

Mentioned in SAL (#wikimedia-analytics) [2023-06-19T10:47:55Z] <stevemunene> decommission host analytics1059.eqiad.wmnet -t T338408

cookbooks.sre.hosts.decommission executed by stevemunene@cumin1001 for hosts: analytics1059.eqiad.wmnet

analytics1059.eqiad.wmnet (WARN)
- Downtimed host on Icinga/Alertmanager
- Found physical host
- Management interface not found on Icinga, unable to downtime it
- Wiped all swraid, partition-table and filesystem signatures
- Powered off
- [Netbox] Set status to Decommissioning, deleted all non-mgmt IPs, updated switch interfaces (disabled, removed vlans, etc)
- Configured the linked switch interface(s)
- Removed from DebMonitor
- Removed from Puppet master and PuppetDB

Stevemunene updated the task description. Jun 26 2023, 5:32 AM

Stevemunene moved this task from Backlog to pending onsite steps (eqiad) on the decommission-hardware board. Jul 10 2023, 1:25 PM

Stevemunene updated the task description.

Stevemunene reassigned this task from Stevemunene to Jclark-ctr. Jul 12 2023, 9:49 AM

Jclark-ctr added a project: ops-eqiad. Jul 12 2023, 1:52 PM

Jclark-ctr moved this task from Backlog to Decommission on the ops-eqiad board. Jul 12 2023, 1:56 PM

BTullis moved this task from In Progress to Needs Reporting on the Data-Platform-SRE board. Jul 17 2023, 8:56 AM

Maintenance_bot added a project: SRE. Jul 17 2023, 9:31 AM

Jclark-ctr closed this task as Resolved. Jul 17 2023, 12:46 PM

Jclark-ctr updated the task description.

Gehel moved this task from Needs Reporting to Done on the Data-Platform-SRE board. Jul 19 2023, 8:52 AM