
Hardware refresh: Decommission restbase10[19-27]
Closed, Resolved (Public)

Description

Nine hosts are end-of-life. Provision restbase10[34-42] as replacements, and decommission restbase10[19-27].

Provision new hosts

  • Depool restbase service(s)
  • Downtime
  • RESTBase scap targets (r1015537)
  • logstash-logback-encoder scap targets (r1015538)
  • Re-enable blocking read-repair (see: T360548 & P58877)
Cassandra decommissions
  • row/rack: a
    • restbase1019
    • restbase1020
    • restbase1021
  • row/rack: b
    • restbase1022
    • restbase1023
    • restbase1024
  • row/rack: d
    • restbase1025
    • restbase1026
    • restbase1027
Cassandra cleanups
  • row/rack: a
  • row/rack: b
  • row/rack: d
Hardware decommission
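For reference, the bracketed host ranges used in this task can be expanded into individual host names with a few lines of Python. This is only an illustrative sketch: the helper name `expand_hosts` is hypothetical and is not part of any cookbook or tooling mentioned here, and the rack grouping is copied from the checklist above.

```python
import re


def expand_hosts(pattern: str) -> list[str]:
    """Expand a 'prefix[lo-hi]' range such as 'restbase10[19-27]' into host names.

    Hypothetical helper for illustration; patterns without a bracketed
    range are returned unchanged as a one-element list.
    """
    match = re.fullmatch(r"(.*)\[(\d+)-(\d+)\](.*)", pattern)
    if match is None:
        return [pattern]
    prefix, lo, hi, suffix = match.groups()
    width = len(lo)  # keep any zero padding used in the range bounds
    return [f"{prefix}{n:0{width}d}{suffix}" for n in range(int(lo), int(hi) + 1)]


# Hosts to decommission and their replacements, as named in the description.
old_hosts = expand_hosts("restbase10[19-27]")
new_hosts = expand_hosts("restbase10[34-42]")

# Row/rack grouping as listed in the checklist (three hosts per rack).
racks = {
    "a": old_hosts[0:3],
    "b": old_hosts[3:6],
    "d": old_hosts[6:9],
}

for rack, hosts in racks.items():
    print(f"row/rack {rack}: {', '.join(hosts)}")
```

Both ranges expand to nine hosts, matching the host count given in the description.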

Event Timeline

Eevans triaged this task as Medium priority. Jan 8 2024, 8:14 PM
Eevans created this task.

Mentioned in SAL (#wikimedia-operations) [2024-03-10T14:18:58Z] <eevans@cumin1002> START - Cookbook sre.hosts.downtime for 30 days, 0:00:00 on restbase1019.eqiad.wmnet with reason: Decommissioning — T354561

Icinga downtime and Alertmanager silence (ID=9796a311-df1d-4f2e-bb25-b53d9c7867e8) set by eevans@cumin1002 for 30 days, 0:00:00 on 1 host(s) and their services with reason: Decommissioning — T354561

restbase1019.eqiad.wmnet

Mentioned in SAL (#wikimedia-operations) [2024-03-10T14:19:12Z] <eevans@cumin1002> END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 30 days, 0:00:00 on restbase1019.eqiad.wmnet with reason: Decommissioning — T354561

Mentioned in SAL (#wikimedia-operations) [2024-03-12T07:05:51Z] <eevans@cumin1002> START - Cookbook sre.hosts.downtime for 30 days, 0:00:00 on restbase1020.eqiad.wmnet with reason: Decommissioning — T354561

Icinga downtime and Alertmanager silence (ID=a60145fd-738c-4b55-9a7c-21eabfe44a70) set by eevans@cumin1002 for 30 days, 0:00:00 on 1 host(s) and their services with reason: Decommissioning — T354561

restbase1020.eqiad.wmnet

Mentioned in SAL (#wikimedia-operations) [2024-03-12T07:06:05Z] <eevans@cumin1002> END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 30 days, 0:00:00 on restbase1020.eqiad.wmnet with reason: Decommissioning — T354561

Mentioned in SAL (#wikimedia-operations) [2024-03-13T16:33:38Z] <eevans@cumin1002> START - Cookbook sre.hosts.downtime for 30 days, 0:00:00 on restbase1021.eqiad.wmnet with reason: Decommissioning — T354561

Icinga downtime and Alertmanager silence (ID=900e92f1-66a1-4f63-94ae-fed3ae46a79c) set by eevans@cumin1002 for 30 days, 0:00:00 on 1 host(s) and their services with reason: Decommissioning — T354561

restbase1021.eqiad.wmnet

Mentioned in SAL (#wikimedia-operations) [2024-03-13T16:33:51Z] <eevans@cumin1002> END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 30 days, 0:00:00 on restbase1021.eqiad.wmnet with reason: Decommissioning — T354561

Mentioned in SAL (#wikimedia-operations) [2024-03-15T09:45:05Z] <eevans@cumin1002> START - Cookbook sre.hosts.downtime for 30 days, 0:00:00 on restbase1022.eqiad.wmnet with reason: Decommissioning — T354561

Icinga downtime and Alertmanager silence (ID=be085aec-234a-4a77-9925-9a904e432a89) set by eevans@cumin1002 for 30 days, 0:00:00 on 1 host(s) and their services with reason: Decommissioning — T354561

restbase1022.eqiad.wmnet

Mentioned in SAL (#wikimedia-operations) [2024-03-15T09:45:19Z] <eevans@cumin1002> END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 30 days, 0:00:00 on restbase1022.eqiad.wmnet with reason: Decommissioning — T354561

Mentioned in SAL (#wikimedia-operations) [2024-03-17T19:04:31Z] <eevans@cumin1002> START - Cookbook sre.hosts.downtime for 30 days, 0:00:00 on restbase1023.eqiad.wmnet with reason: Decommissioning — T354561

Icinga downtime and Alertmanager silence (ID=d3a5c1f5-1151-488d-9865-6d4a78317d3e) set by eevans@cumin1002 for 30 days, 0:00:00 on 1 host(s) and their services with reason: Decommissioning — T354561

restbase1023.eqiad.wmnet

Mentioned in SAL (#wikimedia-operations) [2024-03-17T19:04:45Z] <eevans@cumin1002> END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 30 days, 0:00:00 on restbase1023.eqiad.wmnet with reason: Decommissioning — T354561

Mentioned in SAL (#wikimedia-operations) [2024-03-19T13:08:24Z] <eevans@cumin1002> START - Cookbook sre.hosts.downtime for 30 days, 0:00:00 on restbase1024.eqiad.wmnet with reason: Decommissioning — T354561

Icinga downtime and Alertmanager silence (ID=e8fa3e9b-beee-445b-9193-963692f043c1) set by eevans@cumin1002 for 30 days, 0:00:00 on 1 host(s) and their services with reason: Decommissioning — T354561

restbase1024.eqiad.wmnet

Mentioned in SAL (#wikimedia-operations) [2024-03-19T13:08:38Z] <eevans@cumin1002> END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 30 days, 0:00:00 on restbase1024.eqiad.wmnet with reason: Decommissioning — T354561

Mentioned in SAL (#wikimedia-operations) [2024-03-25T19:02:23Z] <eevans@cumin1002> START - Cookbook sre.hosts.downtime for 30 days, 0:00:00 on restbase1025.eqiad.wmnet with reason: Decommissioning — T354561

Icinga downtime and Alertmanager silence (ID=6930da48-61b0-488c-a992-27cdc3f2f91a) set by eevans@cumin1002 for 30 days, 0:00:00 on 1 host(s) and their services with reason: Decommissioning — T354561

restbase1025.eqiad.wmnet

Mentioned in SAL (#wikimedia-operations) [2024-03-25T19:02:37Z] <eevans@cumin1002> END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 30 days, 0:00:00 on restbase1025.eqiad.wmnet with reason: Decommissioning — T354561

Mentioned in SAL (#wikimedia-operations) [2024-03-27T23:15:18Z] <eevans@cumin1002> START - Cookbook sre.hosts.downtime for 30 days, 0:00:00 on restbase1026.eqiad.wmnet with reason: Decommissioning — T354561

Icinga downtime and Alertmanager silence (ID=c097da94-8458-4b68-b39f-f27f15de9525) set by eevans@cumin1002 for 30 days, 0:00:00 on 1 host(s) and their services with reason: Decommissioning — T354561

restbase1026.eqiad.wmnet

Mentioned in SAL (#wikimedia-operations) [2024-03-27T23:15:32Z] <eevans@cumin1002> END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 30 days, 0:00:00 on restbase1026.eqiad.wmnet with reason: Decommissioning — T354561

Mentioned in SAL (#wikimedia-operations) [2024-03-29T14:00:07Z] <eevans@cumin1002> START - Cookbook sre.hosts.downtime for 30 days, 0:00:00 on restbase1027.eqiad.wmnet with reason: Decommissioning — T354561

Icinga downtime and Alertmanager silence (ID=bca21dfb-8b16-4f3a-bca6-43b355ddf778) set by eevans@cumin1002 for 30 days, 0:00:00 on 1 host(s) and their services with reason: Decommissioning — T354561

restbase1027.eqiad.wmnet

Mentioned in SAL (#wikimedia-operations) [2024-03-29T14:00:21Z] <eevans@cumin1002> END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 30 days, 0:00:00 on restbase1027.eqiad.wmnet with reason: Decommissioning — T354561

Change #1015537 had a related patch set uploaded (by Eevans; author: Eevans):

[mediawiki/services/restbase/deploy@master] scap/targets: Remove decommissioned hosts

https://gerrit.wikimedia.org/r/1015537

Change #1015538 had a related patch set uploaded (by Eevans; author: Eevans):

[operations/software/logstash-logback-encoder@master] targets: Remove decommissioned hosts

https://gerrit.wikimedia.org/r/1015538

Eevans renamed this task from Decommission restbase10[19-27] to Decommission (hardware refresh) restbase10[19-27]. Fri, Mar 29, 3:19 PM
Eevans renamed this task from Decommission (hardware refresh) restbase10[19-27] to Hardware refresh: Decommission restbase10[19-27].
Eevans updated the task description.
Eevans updated the task description.

Mentioned in SAL (#wikimedia-operations) [2024-04-01T15:33:43Z] <urandom> cassandra (restbase): re-enable blocking read-repair — T354561

Change #1015538 merged by Eevans:

[operations/software/logstash-logback-encoder@master] targets: Remove decommissioned hosts

https://gerrit.wikimedia.org/r/1015538

Change #1016003 had a related patch set uploaded (by Eevans; author: Eevans):

[operations/puppet@production] restbase: remove decommissioned hosts restbase10[19-27]

https://gerrit.wikimedia.org/r/1016003

Change #1015537 merged by Jgiannelos:

[mediawiki/services/restbase/deploy@master] scap/targets: Remove decommissioned hosts

https://gerrit.wikimedia.org/r/1015537

Change #1016003 merged by Eevans:

[operations/puppet@production] restbase: remove decommissioned hosts restbase10[19-27]

https://gerrit.wikimedia.org/r/1016003

Change #1016829 had a related patch set uploaded (by Eevans; author: Eevans):

[operations/puppet@production] site.pp: cleanup restbase10[19-27]

https://gerrit.wikimedia.org/r/1016829

Change #1016829 merged by Eevans:

[operations/puppet@production] site.pp: cleanup restbase10[19-27]

https://gerrit.wikimedia.org/r/1016829

Eevans claimed this task.
Eevans updated the task description.

Done!