Decommission db1030
Closed, Resolved (Public)

Description

Before db1030 can be decommissioned, the following moves are required:

  • In s8, make db1087 the vslow server
  • Remove db1063 from s8 vslow and move it to s6 vslow
  • Clone db1063 from db1030
  • - all system services confirmed offline from production use: Removed from mediawiki-config: https://gerrit.wikimedia.org/r/#/c/406613/
  • - set all icinga checks to maint mode/disabled while reclaim/decommission takes place.
  • - remove system from all lvs/pybal active configuration
  • - any service group puppet/hiera/dsh config removed
  • - remove site.pp (replace with role::spare if the system isn't shut down immediately during this process):
  • Host set to spare: https://gerrit.wikimedia.org/r/#/c/406981/
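
The three vslow moves at the top of this checklist can be pictured as a small reassignment of hosts to sections. The following is only an illustrative sketch: the `before` mapping and the `apply_moves` helper are hypothetical names, and the real configuration lives in wmf-config/db-eqiad.php, whose format is not reproduced here.

```python
# Hypothetical model of "section -> vslow host"; not the real db-eqiad.php format.

def apply_moves(vslow):
    """Return a new mapping with the planned vslow moves applied."""
    plan = dict(vslow)      # copy so the starting state stays untouched
    plan["s8"] = "db1087"   # db1087 becomes the vslow server in s8
    plan["s6"] = "db1063"   # db1063 leaves s8 and, once cloned from db1030, serves s6
    return plan


before = {"s6": "db1030", "s8": "db1063"}
after = apply_moves(before)

# Once the moves are done, db1030 serves no vslow traffic and can be decommissioned.
assert "db1030" not in after.values()
print(after)
```

The point of the ordering is that db1063 needs a replacement in s8 before it can take over from db1030 in s6.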

START NON-INTERRUPTIBLE STEPS

  • - disable puppet on host
  • - remove all remaining puppet references (including role::spare)
  • - power down host
  • - disable switch port, note on task for later unracking: asw-b-eqiad:ge-1/0/14
  • - remove production dns entries & remove hostname entries in mgmt dns
  • - puppet node clean, puppet node deactivate, salt key removed

END NON-INTERRUPTIBLE STEPS

  • - system disks wiped (by onsite)
  • - remove hostname label, remove hostname from visible label field in racktables (by onsite)
  • - system removed from rack
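
One way to read the START/END markers in the checklist above is that work may pause between steps everywhere except inside the marked block. The sketch below models that reading; the `Step` class, the shortened step names, and `safe_pause_points` are illustrative assumptions, not part of any Wikimedia tooling.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Step:
    name: str
    interruptible: bool = True


# Step names are shortened paraphrases of the checklist entries.
CHECKLIST = [
    Step("set icinga checks to maint mode"),
    Step("remove from lvs/pybal config"),
    Step("set host to role::spare"),
    Step("disable puppet on host", interruptible=False),
    Step("remove remaining puppet references", interruptible=False),
    Step("power down host", interruptible=False),
    Step("disable switch port", interruptible=False),
    Step("remove production and mgmt dns entries", interruptible=False),
    Step("puppet node clean/deactivate, remove salt key", interruptible=False),
    Step("wipe system disks"),
    Step("remove hostname labels"),
    Step("remove system from rack"),
]


def safe_pause_points(steps):
    """Indices i where pausing between step i and step i+1 is acceptable.

    A pause is unsafe only when both neighbouring steps belong to the
    non-interruptible block.
    """
    return [
        i
        for i in range(len(steps) - 1)
        if steps[i].interruptible or steps[i + 1].interruptible
    ]


for i in safe_pause_points(CHECKLIST):
    print(f"may pause after: {CHECKLIST[i].name}")
```

Under this model the block from disabling puppet through the puppet node cleanup has to be completed in one sitting, which matches the intent of the markers.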

Related Objects

Status | Subtype | Assigned | Task
Resolved | | None |
Resolved | | Cmjohnson |

Event Timeline

Marostegui triaged this task as Medium priority. (Jan 8 2018, 8:05 AM)
Marostegui moved this task from Triage to Pending comment on the DBA board.

Change 404258 had a related patch set uploaded (by Marostegui; owner: Marostegui):
[operations/mediawiki-config@master] db-eqiad.php: Replace db1063 with db1087 as vslow

https://gerrit.wikimedia.org/r/404258

Change 404258 merged by jenkins-bot:
[operations/mediawiki-config@master] db-eqiad.php: Replace db1063 with db1087 as vslow

https://gerrit.wikimedia.org/r/404258

I have set db1087 as vslow in s8 in place of db1063. This is the first step toward moving db1063 to s6 as vslow, so that we can get rid of db1030.

I will reimage db1063 with stretch and use MariaDB 10.1 on it. cc @jcrespo

Change 404692 had a related patch set uploaded (by Marostegui; owner: Marostegui):
[operations/puppet@production] db1063.yaml: Disable notifications

https://gerrit.wikimedia.org/r/404692

Change 404692 merged by Marostegui:
[operations/puppet@production] db1063.yaml: Disable notifications

https://gerrit.wikimedia.org/r/404692

Change 405672 had a related patch set uploaded (by Marostegui; owner: Marostegui):
[operations/mediawiki-config@master] db-eqiad.php: Move db1063 to s6

https://gerrit.wikimedia.org/r/405672

Change 405672 merged by jenkins-bot:
[operations/mediawiki-config@master] db-eqiad.php: Move db1063 to s6

https://gerrit.wikimedia.org/r/405672

Mentioned in SAL (#wikimedia-operations) [2018-01-22T06:41:00Z] <marostegui@tin> Synchronized wmf-config/db-eqiad.php: Move db1063 from s8 to s6 - T184397 (duration: 00m 58s)

Change 405673 had a related patch set uploaded (by Marostegui; owner: Marostegui):
[operations/puppet@production] mariadb: Move db1063 to s6

https://gerrit.wikimedia.org/r/405673

Change 405673 merged by Marostegui:
[operations/puppet@production] mariadb: Move db1063 to s6

https://gerrit.wikimedia.org/r/405673

Script wmf-auto-reimage was launched by marostegui on neodymium.eqiad.wmnet for hosts:

['db1063.eqiad.wmnet']

The log can be found in /var/log/wmf-auto-reimage/201801220657_marostegui_4121.log.

Completed auto-reimage of hosts:

['db1063.eqiad.wmnet']

Of which those FAILED:

['db1063.eqiad.wmnet']

Script wmf-auto-reimage was launched by marostegui on neodymium.eqiad.wmnet for hosts:

['db1063.eqiad.wmnet']

The log can be found in /var/log/wmf-auto-reimage/201801220843_marostegui_11772.log.

Completed auto-reimage of hosts:

['db1063.eqiad.wmnet']

Of which those FAILED:

['db1063.eqiad.wmnet']

Script wmf-auto-reimage was launched by marostegui on neodymium.eqiad.wmnet for hosts:

['db1063.eqiad.wmnet']

The log can be found in /var/log/wmf-auto-reimage/201801220844_marostegui_11978.log.

Completed auto-reimage of hosts:

['db1063.eqiad.wmnet']

and were ALL successful.

Change 405681 had a related patch set uploaded (by Marostegui; owner: Marostegui):
[operations/mediawiki-config@master] db-eqiad.php: Depool db1030

https://gerrit.wikimedia.org/r/405681

Change 405681 merged by jenkins-bot:
[operations/mediawiki-config@master] db-eqiad.php: Depool db1030

https://gerrit.wikimedia.org/r/405681

Mentioned in SAL (#wikimedia-operations) [2018-01-22T09:20:46Z] <marostegui@tin> Synchronized wmf-config/db-eqiad.php: Depool db1030 - T184397 (duration: 00m 56s)

Mentioned in SAL (#wikimedia-operations) [2018-01-22T09:21:08Z] <marostegui> Stop MySQL on db1030 to clone db1063 - T184397

Change 405688 had a related patch set uploaded (by Marostegui; owner: Marostegui):
[operations/software@master] s6,s8.hosts: Move db1063 to s6

https://gerrit.wikimedia.org/r/405688

Change 405688 merged by jenkins-bot:
[operations/software@master] s6,s8.hosts: Move db1063 to s6

https://gerrit.wikimedia.org/r/405688

Change 405695 had a related patch set uploaded (by Marostegui; owner: Marostegui):
[operations/puppet@production] mariadb: Disable notifications db1030

https://gerrit.wikimedia.org/r/405695

Change 405696 had a related patch set uploaded (by Marostegui; owner: Marostegui):
[operations/mediawiki-config@master] db-eqiad.php: Pool db1063 as vslow in s6

https://gerrit.wikimedia.org/r/405696

Change 405695 merged by Marostegui:
[operations/puppet@production] mariadb: Disable notifications db1030

https://gerrit.wikimedia.org/r/405695

Change 405696 merged by jenkins-bot:
[operations/mediawiki-config@master] db-eqiad.php: Pool db1063 as vslow in s6

https://gerrit.wikimedia.org/r/405696

Mentioned in SAL (#wikimedia-operations) [2018-01-22T12:30:49Z] <marostegui@tin> Synchronized wmf-config/db-eqiad.php: Pool db1063 as vslow - T184397 (duration: 00m 56s)

db1030 is no longer serving vslow in s6.
db1063 is now serving vslow there. Let's leave it running for a week before starting the db1030 decommissioning process.

Change 406613 had a related patch set uploaded (by Marostegui; owner: Marostegui):
[operations/mediawiki-config@master] db-eqiad,db-codfw.php: Remove db1030

https://gerrit.wikimedia.org/r/406613

Change 406843 had a related patch set uploaded (by Marostegui; owner: Marostegui):
[operations/software@master] s6.hosts: Remove db1030

https://gerrit.wikimedia.org/r/406843

Change 406613 merged by jenkins-bot:
[operations/mediawiki-config@master] db-eqiad,db-codfw.php: Remove db1030

https://gerrit.wikimedia.org/r/406613

Mentioned in SAL (#wikimedia-operations) [2018-01-31T07:10:20Z] <marostegui@tin> Synchronized wmf-config/db-eqiad.php: Remove db1030, will be decommissioned - T184397 (duration: 00m 57s)

Mentioned in SAL (#wikimedia-operations) [2018-01-31T07:13:06Z] <marostegui@tin> Synchronized wmf-config/db-codfw.php: Remove db1030, will be decommissioned - T184397 (duration: 00m 56s)

Change 406981 had a related patch set uploaded (by Marostegui; owner: Marostegui):
[operations/puppet@production] mariadb: Remove db1030

https://gerrit.wikimedia.org/r/406981

Change 406981 merged by Marostegui:
[operations/puppet@production] mariadb: Remove db1030

https://gerrit.wikimedia.org/r/406981

Change 406843 merged by jenkins-bot:
[operations/software@master] s6.hosts: Remove db1030

https://gerrit.wikimedia.org/r/406843

Mentioned in SAL (#wikimedia-operations) [2018-01-31T07:47:34Z] <marostegui> Remove db1030 from tendril - T184397

Marostegui added a project: ops-eqiad.
Marostegui moved this task from In progress to Done on the DBA board.
Marostegui added a subscriber: Cmjohnson.

db1030 is now ready to be fully decommissioned by @Cmjohnson

Change 409451 had a related patch set uploaded (by RobH; owner: RobH):
[operations/dns@master] db1030 decom

https://gerrit.wikimedia.org/r/409451

Change 409451 merged by RobH:
[operations/dns@master] db1030 decom

https://gerrit.wikimedia.org/r/409451

Change 409454 had a related patch set uploaded (by RobH; owner: RobH):
[operations/puppet@production] decom db1030

https://gerrit.wikimedia.org/r/409454

Change 409454 merged by RobH:
[operations/puppet@production] decom db1030

https://gerrit.wikimedia.org/r/409454

RobH removed a project: Patch-For-Review.
RobH updated the task description.
RobH subscribed.

Change 422451 had a related patch set uploaded (by Cmjohnson; owner: Cmjohnson):
[operations/dns@master] Removing mgmt dns db1030

https://gerrit.wikimedia.org/r/422451

Change 422451 merged by Cmjohnson:
[operations/dns@master] Removing mgmt dns db1030

https://gerrit.wikimedia.org/r/422451

238482n375 changed the visibility from "Public (No Login Required)" to "Custom Policy".
This comment was removed by Reedy.
Aklapper raised the priority of this task from Lowest to Medium. (Jun 15 2018, 2:05 PM)
Aklapper subscribed.