
Decommission db1030
Closed, Resolved · Public

Description

Before db1030 can be decommissioned, the following moves are required:

  • In s8, make db1087 the vslow server
  • Remove db1063 from s8 vslow and move it to s6 vslow
  • Clone db1063 from db1030
  • - all system services confirmed offline from production use: Removed from mediawiki-config: https://gerrit.wikimedia.org/r/#/c/406613/
  • - set all icinga checks to maint mode/disabled while reclaim/decommission takes place.
  • - remove system from all lvs/pybal active configuration
  • - any service group puppet/hiera/dsh config removed
  • - remove from site.pp (replace with role::spare if the system isn't shut down immediately during this process):
  • Host set to spare: https://gerrit.wikimedia.org/r/#/c/406981/

START NON-INTERRUPTIBLE STEPS

  • - disable puppet on host
  • - remove all remaining puppet references (including role::spare)
  • - power down host
  • - disable switch port, note on task for later unracking - asw-b-eqiad:ge-1/0/14
  • - remove production dns entries & remove hostname entries in mgmt dns
  • - puppet node clean, puppet node deactivate, salt key removed

END NON-INTERRUPTIBLE STEPS

  • - system disks wiped (by onsite)
  • - remove hostname label, remove hostname from visible label field in racktables (by onsite)
  • - system removed from rack
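
The command-line parts of the non-interruptible block can be sketched as a dry-run list. This is only an illustration, not the procedure that was actually run for this task: the function name, the FQDN db1030.eqiad.wmnet and the disable message are assumptions, and the steps that are not plain commands (powering down, disabling the switch port, the DNS patches) are left out. Nothing is executed; the function only returns the argument lists in checklist order.

```python
import shlex
from typing import List


def non_interruptible_commands(fqdn: str, reason: str) -> List[List[str]]:
    """Return argv lists for the CLI steps of the non-interruptible block.

    Hypothetical helper: it only builds the commands, it never runs them.
    Where each command is run (the host itself, a puppetmaster, a salt
    master) is an assumption and is not stated in the checklist.
    """
    return [
        # "disable puppet on host" (assumed to run on the host itself)
        ["puppet", "agent", "--disable", reason],
        # "puppet node clean" (assumed to run on the puppetmaster)
        ["puppet", "node", "clean", fqdn],
        # "puppet node deactivate" (assumed to run on the puppetmaster)
        ["puppet", "node", "deactivate", fqdn],
        # "salt key removed" (assumed to run on the salt master)
        ["salt-key", "-d", fqdn, "-y"],
    ]


if __name__ == "__main__":
    # The FQDN is assumed; the task only ever names the host as db1030.
    for argv in non_interruptible_commands("db1030.eqiad.wmnet", "decom - T184397"):
        print(shlex.join(argv))
```

Printing the commands instead of running them keeps the sketch safe to try: the whole point of the block is that these steps should not be started unless they can all be finished.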

Related Objects

Status    Assigned   Task
Resolved  None
Resolved  Cmjohnson

Event Timeline

Restricted Application added a subscriber: Aklapper. · Jan 8 2018, 8:04 AM
Marostegui triaged this task as Normal priority. · Jan 8 2018, 8:05 AM
Marostegui moved this task from Triage to Next on the DBA board.

Change 404258 had a related patch set uploaded (by Marostegui; owner: Marostegui):
[operations/mediawiki-config@master] db-eqiad.php: Replace db1063 with db1087 as vslow

https://gerrit.wikimedia.org/r/404258

Change 404258 merged by jenkins-bot:
[operations/mediawiki-config@master] db-eqiad.php: Replace db1063 with db1087 as vslow

https://gerrit.wikimedia.org/r/404258

I have set db1087 as vslow in s8 in place of db1063. This is the first step towards moving db1063 to s6 as vslow, so that we can get rid of db1030.

I will reimage db1063 as stretch and use 10.1 for it //cc @jcrespo

It is exactly that :)

Marostegui moved this task from Next to In progress on the DBA board.

Change 404692 had a related patch set uploaded (by Marostegui; owner: Marostegui):
[operations/puppet@production] db1063.yaml: Disable notifications

https://gerrit.wikimedia.org/r/404692

Change 404692 merged by Marostegui:
[operations/puppet@production] db1063.yaml: Disable notifications

https://gerrit.wikimedia.org/r/404692

Change 405672 had a related patch set uploaded (by Marostegui; owner: Marostegui):
[operations/mediawiki-config@master] db-eqiad.php: Move db1063 to s6

https://gerrit.wikimedia.org/r/405672

Change 405672 merged by jenkins-bot:
[operations/mediawiki-config@master] db-eqiad.php: Move db1063 to s6

https://gerrit.wikimedia.org/r/405672

Mentioned in SAL (#wikimedia-operations) [2018-01-22T06:41:00Z] <marostegui@tin> Synchronized wmf-config/db-eqiad.php: Move db1063 from s8 to s6 - T184397 (duration: 00m 58s)

Change 405673 had a related patch set uploaded (by Marostegui; owner: Marostegui):
[operations/puppet@production] mariadb: Move db1063 to s6

https://gerrit.wikimedia.org/r/405673

Change 405673 merged by Marostegui:
[operations/puppet@production] mariadb: Move db1063 to s6

https://gerrit.wikimedia.org/r/405673

Script wmf-auto-reimage was launched by marostegui on neodymium.eqiad.wmnet for hosts:

['db1063.eqiad.wmnet']

The log can be found in /var/log/wmf-auto-reimage/201801220657_marostegui_4121.log.

Completed auto-reimage of hosts:

['db1063.eqiad.wmnet']

Of which those FAILED:

['db1063.eqiad.wmnet']

Script wmf-auto-reimage was launched by marostegui on neodymium.eqiad.wmnet for hosts:

['db1063.eqiad.wmnet']

The log can be found in /var/log/wmf-auto-reimage/201801220843_marostegui_11772.log.

Completed auto-reimage of hosts:

['db1063.eqiad.wmnet']

Of which those FAILED:

['db1063.eqiad.wmnet']

Script wmf-auto-reimage was launched by marostegui on neodymium.eqiad.wmnet for hosts:

['db1063.eqiad.wmnet']

The log can be found in /var/log/wmf-auto-reimage/201801220844_marostegui_11978.log.

Completed auto-reimage of hosts:

['db1063.eqiad.wmnet']

and were ALL successful.
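
The three log paths above share one naming pattern, so the attempts can be put on a timeline by parsing the filenames. The sketch below is a hypothetical helper, not part of wmf-auto-reimage. It assumes the pattern is `YYYYMMDDHHMM_user_number.log`; what the trailing number stands for (possibly a process ID) and the timezone of the timestamp are not stated in the task, so the number is kept as an opaque `run_id` and the datetime is left naive.

```python
from datetime import datetime
from pathlib import PurePosixPath
from typing import NamedTuple


class ReimageLog(NamedTuple):
    started: datetime  # naive; the timezone is not stated in the task
    user: str
    run_id: int  # meaning assumed unknown; possibly a process ID


def parse_reimage_log_path(path: str) -> ReimageLog:
    """Split a wmf-auto-reimage log path into timestamp, user and number.

    Assumes the filename looks like 201801220657_marostegui_4121.log.
    """
    stem = PurePosixPath(path).stem  # drops the directory and ".log"
    stamp, rest = stem.split("_", 1)  # timestamp is the first field
    user, run_id = rest.rsplit("_", 1)  # number is the last field
    return ReimageLog(datetime.strptime(stamp, "%Y%m%d%H%M"), user, int(run_id))


if __name__ == "__main__":
    for p in (
        "/var/log/wmf-auto-reimage/201801220657_marostegui_4121.log",
        "/var/log/wmf-auto-reimage/201801220843_marostegui_11772.log",
        "/var/log/wmf-auto-reimage/201801220844_marostegui_11978.log",
    ):
        print(parse_reimage_log_path(p))
```

Splitting from both ends keeps the parse working even if a username contains an underscore.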

Change 405681 had a related patch set uploaded (by Marostegui; owner: Marostegui):
[operations/mediawiki-config@master] db-eqiad.php: Depool db1030

https://gerrit.wikimedia.org/r/405681

Change 405681 merged by jenkins-bot:
[operations/mediawiki-config@master] db-eqiad.php: Depool db1030

https://gerrit.wikimedia.org/r/405681

Mentioned in SAL (#wikimedia-operations) [2018-01-22T09:20:46Z] <marostegui@tin> Synchronized wmf-config/db-eqiad.php: Depool db1030 - T184397 (duration: 00m 56s)

Mentioned in SAL (#wikimedia-operations) [2018-01-22T09:21:08Z] <marostegui> Stop MySQL on db1030 to clone db1063 - T184397

Change 405688 had a related patch set uploaded (by Marostegui; owner: Marostegui):
[operations/software@master] s6,s8.hosts: Move db1063 to s6

https://gerrit.wikimedia.org/r/405688

Change 405688 merged by jenkins-bot:
[operations/software@master] s6,s8.hosts: Move db1063 to s6

https://gerrit.wikimedia.org/r/405688

Change 405695 had a related patch set uploaded (by Marostegui; owner: Marostegui):
[operations/puppet@production] mariadb: Disable notifications db1030

https://gerrit.wikimedia.org/r/405695

Change 405696 had a related patch set uploaded (by Marostegui; owner: Marostegui):
[operations/mediawiki-config@master] db-eqiad.php: Pool db1063 as vslow in s6

https://gerrit.wikimedia.org/r/405696

Change 405695 merged by Marostegui:
[operations/puppet@production] mariadb: Disable notifications db1030

https://gerrit.wikimedia.org/r/405695

Change 405696 merged by jenkins-bot:
[operations/mediawiki-config@master] db-eqiad.php: Pool db1063 as vslow in s6

https://gerrit.wikimedia.org/r/405696

Mentioned in SAL (#wikimedia-operations) [2018-01-22T12:30:49Z] <marostegui@tin> Synchronized wmf-config/db-eqiad.php: Pool db1063 as vslow - T184397 (duration: 00m 56s)

db1030 is no longer serving vslow in s6.
db1063 is now serving vslow there. Let's leave it running for a week before starting the db1030 decommissioning process.

Marostegui updated the task description. · Jan 26 2018, 11:38 PM

Change 406613 had a related patch set uploaded (by Marostegui; owner: Marostegui):
[operations/mediawiki-config@master] db-eqiad,db-codfw.php: Remove db1030

https://gerrit.wikimedia.org/r/406613

Change 406843 had a related patch set uploaded (by Marostegui; owner: Marostegui):
[operations/software@master] s6.hosts: Remove db1030

https://gerrit.wikimedia.org/r/406843

Change 406613 merged by jenkins-bot:
[operations/mediawiki-config@master] db-eqiad,db-codfw.php: Remove db1030

https://gerrit.wikimedia.org/r/406613

Mentioned in SAL (#wikimedia-operations) [2018-01-31T07:10:20Z] <marostegui@tin> Synchronized wmf-config/db-eqiad.php: Remove db1030, will be decommissioned - T184397 (duration: 00m 57s)

Mentioned in SAL (#wikimedia-operations) [2018-01-31T07:13:06Z] <marostegui@tin> Synchronized wmf-config/db-codfw.php: Remove db1030, will be decommissioned - T184397 (duration: 00m 56s)

Marostegui updated the task description. · Jan 31 2018, 7:21 AM

Change 406981 had a related patch set uploaded (by Marostegui; owner: Marostegui):
[operations/puppet@production] mariadb: Remove db1030

https://gerrit.wikimedia.org/r/406981

Change 406981 merged by Marostegui:
[operations/puppet@production] mariadb: Remove db1030

https://gerrit.wikimedia.org/r/406981

Change 406843 merged by jenkins-bot:
[operations/software@master] s6.hosts: Remove db1030

https://gerrit.wikimedia.org/r/406843

Marostegui updated the task description. · Jan 31 2018, 7:46 AM
Marostegui updated the task description.

Mentioned in SAL (#wikimedia-operations) [2018-01-31T07:47:34Z] <marostegui> Remove db1030 from tendril - T184397

Mentioned in SAL (#wikimedia-operations) [2018-01-31T07:48:38Z] <marostegui> Stop MySQL on db1030 - T184397

Marostegui added a project: ops-eqiad.
Marostegui moved this task from In progress to Done on the DBA board.
Marostegui added a subscriber: Cmjohnson.

db1030 is now ready to be fully decommissioned by @Cmjohnson

Restricted Application added a project: Operations. · Jan 31 2018, 8:06 AM
Cmjohnson moved this task from Backlog to Decommission on the ops-eqiad board. · Feb 2 2018, 4:40 PM
RobH claimed this task. · Feb 9 2018, 10:12 PM

Change 409451 had a related patch set uploaded (by RobH; owner: RobH):
[operations/dns@master] db1030 decom

https://gerrit.wikimedia.org/r/409451

Change 409451 merged by RobH:
[operations/dns@master] db1030 decom

https://gerrit.wikimedia.org/r/409451

Change 409454 had a related patch set uploaded (by RobH; owner: RobH):
[operations/puppet@production] decom db1030

https://gerrit.wikimedia.org/r/409454

Change 409454 merged by RobH:
[operations/puppet@production] decom db1030

https://gerrit.wikimedia.org/r/409454

RobH reassigned this task from RobH to Cmjohnson. · Feb 9 2018, 10:24 PM
RobH removed a project: Patch-For-Review.
RobH updated the task description.
RobH added a subscriber: RobH.

Change 422451 had a related patch set uploaded (by Cmjohnson; owner: Cmjohnson):
[operations/dns@master] Removing mgmt dns db1030

https://gerrit.wikimedia.org/r/422451

Change 422451 merged by Cmjohnson:
[operations/dns@master] Removing mgmt dns db1030

https://gerrit.wikimedia.org/r/422451

Cmjohnson updated the task description. · Mar 28 2018, 6:19 PM
Cmjohnson closed this task as Resolved. · Apr 3 2018, 6:41 PM
238482n375 set Security to Software security bug. · Jun 15 2018, 8:05 AM
238482n375 changed the visibility from "Public (No Login Required)" to "Custom Policy".
This comment was removed by Reedy.
Restricted Application added a project: Security. · Jun 15 2018, 1:52 PM
Reedy added a subscriber: Reedy.
Reedy removed a subscriber: Reedy.
Aklapper raised the priority of this task from Lowest to Normal. · Jun 15 2018, 2:05 PM
Aklapper added a subscriber: Aklapper.