Decommission db1010
Closed, Resolved, Public

Description

All other pending tasks (puppet, dhcp, salt, icinga, etc.) have already been done. The server is currently down.

  • wipe ALL disks on db1010.eqiad.wmnet (it contains sensitive data)
  • unrack
  • remove mgmt dns entries
  • update racktables
  • add decommissioned servers to decom tracking tab
  • remove the switch configuration description/vlan assignments

Event Timeline

RobH subscribed.

This needs more info before it can be acted upon. Is this machine no longer in use and no longer wanted by the DB team? (I would assume so since the task is generated, but as it doesn't specifically state release, clarification is requested.)

I've added #hw-requests, as all decom tasks should have that. I've removed ops-eqiad for now, as there aren't any actual actions for onsite staff to do. (Tasks don't get flagged with that project until there is onsite action.)

RobH added a subscriber: Volans.

Assigned to @Volans for input since @jcrespo is out. If this waits for his return, that is likely ok.

I chatted with @RobH on IRC the same day and just forgot to update the task here: we decided to wait for @jcrespo.

MZMcBride renamed this task from Decomission db1010 to Decommission db1010. Apr 14 2016, 6:31 AM
MZMcBride subscribed.

So this is not (yet) a hardware request (that is why I didn't involve datacenter ops or hardware requests; someone else did). All databases <(=?)db1050 will eventually be replaced thanks to the >=db1074 purchases.

This one in particular has to:

  • Be checked to confirm that it is really not in use
  • Be checked for any data to back up (/home directories, mysql)
  • Be removed from the usual suspects (puppet, mediawiki, salt, etc.)
  • Have its disks checked to see if they can be reused
  • Have its fate decided (reuse for another service, unrack, etc.)

We are not yet on the last step.
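
For tracking purposes, the steps above could be modelled as an ordered checklist. This is only a rough sketch in Python: the step identifiers and the `DecomChecklist` class are hypothetical names made up for illustration, not part of any existing tooling, and the order simply mirrors the list above.

```python
from dataclasses import dataclass, field

# Step names paraphrase the checklist above; the identifiers are hypothetical.
DECOM_STEPS = (
    "confirm_not_in_use",
    "back_up_data",
    "remove_from_config",
    "check_disks_for_reuse",
    "decide_fate",
)


@dataclass
class DecomChecklist:
    host: str
    done: set = field(default_factory=set)

    def complete(self, step: str) -> None:
        """Mark a known step as completed; reject anything not in the list."""
        if step not in DECOM_STEPS:
            raise ValueError(f"unknown step: {step}")
        self.done.add(step)

    def next_step(self):
        """Return the first step not yet completed, or None if all are done."""
        for step in DECOM_STEPS:
            if step not in self.done:
                return step
        return None


checklist = DecomChecklist("db1010")
checklist.complete("confirm_not_in_use")
print(checklist.next_step())  # prints "back_up_data"
```

Keeping the steps in a fixed tuple means the "last step" (deciding the host's fate) is only reported once everything before it has been marked done.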

I noticed this task while preparing ferm patches for the switchover, when I was wondering why db1010 is not in site.pp. It turns out db1010 was already dropped from site.pp on Jan 7th 2015 by Sean.

Status reminder:

  • is not in site.pp
  • puppet is running
  • is on icinga with basic checks
  • is not on tendril or wmf-config
  • mysql is still running and replicating from db1023 (old S6 master)
jcrespo moved this task from Triage to Backlog on the DBA board.
jcrespo moved this task from Backlog to In progress on the DBA board.

Change 312486 had a related patch set uploaded (by Jcrespo):
db1010: retire entry from dhcp install

https://gerrit.wikimedia.org/r/312486

Change 312492 had a related patch set uploaded (by Marostegui):
wmnet: Deleted db1010 entry

https://gerrit.wikimedia.org/r/312492

Mentioned in SAL (#wikimedia-operations) [2016-09-23T09:48:03Z] <jynus> disabling alerts and shutting down db1010 in preparation for decommissioning T129395

Change 312486 merged by Jcrespo:
db1010: retire entry from dhcp install

https://gerrit.wikimedia.org/r/312486

Change 312492 merged by Jcrespo:
wmnet: Delete db1010 entry

https://gerrit.wikimedia.org/r/312492

Mentioned in SAL (#wikimedia-operations) [2016-09-23T14:55:38Z] <jynus> deployed dns update (removing db1010) T129395

Change 315690 had a related patch set uploaded (by Cmjohnson):
Moving db1053 to row A, updating dns entries(T147774), at same time removing dns entries for decom host db1010 (T129395)

https://gerrit.wikimedia.org/r/315690

Change 315690 merged by Cmjohnson:
Moving db1053 to row A, updating dns entries(T147774), at same time removing dns entries for decom host db1010 (T129395)

https://gerrit.wikimedia.org/r/315690

Cmjohnson updated the task description.

Sorry, I think you do the vlan yourself; I copied from a codfw ticket. Next tickets will strictly follow the documented procedure.