
setup naos/WMF6406 as new codfw deployment server
Closed, Resolved (Public)

Description

This task will track the setup of spare pool system WMF6406 as 'naos' in codfw. The current deployment host, mira, is out of warranty and experiencing network card issues (mainboard issues). WMF6406 was recently used as graphite2003 to copy data, so it will need to have some items updated.

Details

Related Gerrit Patches:
operations/puppet : production - adds naos to everywhere mira is listed
operations/puppet : production - deployment: sync home dirs from mira to naos
operations/dns : master - setting naos (new codfw deploy host) ipv6 dns
operations/puppet : production - setting naos install and site parameters
operations/dns : master - setting up dns for naos

Event Timeline

RobH created this task. Apr 13 2017, 3:36 PM
RobH mentioned this in T162859: Swap NIC on mira.
RobH raised the priority of this task from Medium to High. Apr 13 2017, 3:41 PM

So the switchover from the eqiad deployment host (tin) to the codfw deployment host (mira) was scheduled for approximately April 19th. Ideally naos is online to replace mira before then.

Shifting this to high priority.

RobH updated the task description. Apr 13 2017, 3:45 PM
Papaul reassigned this task from Papaul to RobH. Apr 13 2017, 4:31 PM
Papaul updated the task description.

Change 348103 had a related patch set uploaded (by RobH):
[operations/dns@master] setting up dns for naos

https://gerrit.wikimedia.org/r/348103

Change 348103 merged by RobH:
[operations/dns@master] setting up dns for naos

https://gerrit.wikimedia.org/r/348103

Change 348104 had a related patch set uploaded (by RobH):
[operations/puppet@production] setting naos install and site parameters

https://gerrit.wikimedia.org/r/348104

RobH updated the task description.

Change 348104 merged by RobH:
[operations/puppet@production] setting naos install and site parameters

https://gerrit.wikimedia.org/r/348104

RobH reassigned this task from RobH to Papaul. Apr 13 2017, 5:15 PM
RobH updated the task description.

I just noticed the network port wasn't labeled on the switch with the asset tag, so I need @Papaul to determine what it is:

  • @Papaul to update this task with the network port this system is plugged into and assign back to @RobH

RobH updated the task description. Apr 13 2017, 5:17 PM
Papaul reassigned this task from Papaul to RobH. Apr 13 2017, 6:04 PM

ge-5/0/15

RobH updated the task description. Apr 13 2017, 6:14 PM
RobH updated the task description. Apr 13 2017, 11:29 PM

Change 348474 had a related patch set uploaded (by RobH):
[operations/dns@master] setting naos (new codfw deploy host) ipv6 dns

https://gerrit.wikimedia.org/r/348474

Change 348474 merged by RobH:
[operations/dns@master] setting naos (new codfw deploy host) ipv6 dns

https://gerrit.wikimedia.org/r/348474

Change 348478 had a related patch set uploaded (by RobH):
[operations/puppet@production] adds naos to everywhere mira is listed

https://gerrit.wikimedia.org/r/348478

RobH updated the task description. Apr 17 2017, 3:59 PM
RobH added a comment. Apr 17 2017, 4:01 PM

So my latest patchset https://phabricator.wikimedia.org/T162900 appends naos everywhere mira is listed. I'm not 100% sure about some of the files, or whether a simple merge will require further updates elsewhere (like touching scap config).

I've added for review some folks on the patchset, and it will be listed on our async meeting notes as a blocker.
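
The "naos everywhere mira is listed" check can be sketched as a scan over a local checkout of the puppet repository. This is a minimal illustration under stated assumptions, not the method used for the patch: the function name `files_missing_peer` is hypothetical, and a plain whole-word match will still produce hits that need manual review.

```python
import os
import re


def files_missing_peer(root, existing="mira", new="naos"):
    """Return files under root that mention `existing` but never `new`.

    Hypothetical helper: matches whole words only, so "mira.codfw.wmnet"
    counts as a mention of "mira" but "miracle" does not.
    """
    existing_re = re.compile(r"\b%s\b" % re.escape(existing))
    new_re = re.compile(r"\b%s\b" % re.escape(new))
    missing = []
    for dirpath, dirnames, filenames in os.walk(root):
        # Do not descend into version-control metadata.
        dirnames[:] = [d for d in dirnames if d != ".git"]
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                with open(path, encoding="utf-8", errors="replace") as fh:
                    text = fh.read()
            except OSError:
                # Unreadable file: skip it rather than abort the scan.
                continue
            if existing_re.search(text) and not new_re.search(text):
                missing.append(os.path.relpath(path, root))
    return sorted(missing)
```

Running `files_missing_peer("/path/to/puppet")` against a checkout (the path is a placeholder) lists candidate files for review; whether each one really needs naos added is still a judgment call.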

Change 348478 merged by Filippo Giunchedi:
[operations/puppet@production] adds naos to everywhere mira is listed

https://gerrit.wikimedia.org/r/348478

I've merged @RobH's patch and run puppet on naos. Issues I've encountered so far:

The UID issue needs a more permanent fix, but I've manually fixed it for now.

Dzahn updated the task description. Apr 18 2017, 11:47 PM

Change 348886 had a related patch set uploaded (by Dzahn):
[operations/puppet@production] deployment: sync home dirs from mira to naos

https://gerrit.wikimedia.org/r/348886

Dzahn added a subscriber: Dzahn. Edited Apr 19 2017, 12:51 AM
  • backups: confirmed with bconsole that naos now exists in Bacula with the same backup sets (/home and /srv/deployment are backed up on deployment servers)
  • UID issue: see this https://gerrit.wikimedia.org/r/#/c/348884/

Change 348886 merged by Dzahn:
[operations/puppet@production] deployment: sync home dirs from mira to naos

https://gerrit.wikimedia.org/r/348886

Mentioned in SAL (#wikimedia-operations) [2017-04-19T01:47:27Z] <mutante> rsyncing /home from mira to naos (T162900)

Dzahn updated the task description. Apr 19 2017, 1:59 AM
Dzahn updated the task description.
Dzahn updated the task description. Apr 19 2017, 2:02 AM
Dzahn updated the task description. Apr 19 2017, 2:06 AM
Dzahn removed a project: Patch-For-Review.
Dzahn added a subscriber: demon.

Mentioned in SAL (#wikimedia-operations) [2017-04-19T11:23:40Z] <godog> add naos to git-deploy term on common-infrastructure4 - T162900

fgiunchedi updated the task description. Apr 19 2017, 11:24 AM
fgiunchedi updated the task description. Apr 19 2017, 1:04 PM

Followup for trebuchet/mwdeploy fixed uid/gid: https://phabricator.wikimedia.org/T163667
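
The thread does not spell out the UID issue. As an assumption, the concern is that an account whose numeric uid or gid differs between deployment hosts can end up with mismatched file ownership. A minimal sketch of how such differences could be spotted, given the passwd(5)-formatted contents of each host; the function names `parse_passwd` and `id_mismatches` are hypothetical:

```python
def parse_passwd(text):
    """Map user name -> (uid, gid) from passwd(5)-formatted text."""
    ids = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        fields = line.split(":")
        if len(fields) < 4:
            # Not a well-formed passwd entry; ignore it.
            continue
        ids[fields[0]] = (int(fields[2]), int(fields[3]))
    return ids


def id_mismatches(passwd_a, passwd_b):
    """Return {user: ((uid_a, gid_a), (uid_b, gid_b))} for users present on
    both hosts whose numeric ids differ."""
    a = parse_passwd(passwd_a)
    b = parse_passwd(passwd_b)
    return {
        user: (a[user], b[user])
        for user in sorted(a.keys() & b.keys())
        if a[user] != b[user]
    }
```

The input could be the output of `getent passwd` collected on each host; users that exist on only one host are deliberately ignored here.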

Anything left to do here? naos is effectively in service and mira should be decom'd (either with its NIC swapped or not in T162859)

Dzahn added a comment. May 5 2017, 2:22 PM

I don't think so. Except mira needs a proper decom task with the checkbox-template for decoms on it.

Dzahn closed this task as Resolved. May 5 2017, 2:44 PM
Dzahn mentioned this in T164588: decom mira.

follow-up task for mira created at T164588

closing as resolved