find a way to systematically update the deployment server name across all repos
Open, High, Public

Description

After the recent migration of our deployment server from tin to deploy1001 (T175288), several users reported being blocked from deploying because files in their local repos still referred to the old deployment server name.

The issue fell into different categories. One was .config files in the "deployment-cache" directory that contained the string "tin.eqiad.wmnet". As the name implies, these are cached files. One fix was to manually edit the file and replace the host name. Another was apparently to delete the file and have scap recreate it, and/or to run scap with --refresh-config.
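The manual fix described above (find the cached .config files and replace the host name) could be scripted roughly as follows. This is only a sketch, not something scap provides: the function name `replace_host_in_configs` is hypothetical, the files are assumed to be plain UTF-8 text, and the new fully qualified name "deploy1001.eqiad.wmnet" is an assumption, since the task only gives the short name deploy1001.

```python
import os

OLD_HOST = "tin.eqiad.wmnet"
# Assumption: the task only names "deploy1001"; the FQDN is a guess.
NEW_HOST = "deploy1001.eqiad.wmnet"


def replace_host_in_configs(root, old=OLD_HOST, new=NEW_HOST, dry_run=True):
    """Return the sorted paths of *.config files under root that mention `old`.

    With dry_run=True (the default) nothing is modified, so the result can be
    used as an audit. With dry_run=False every match is rewritten in place.
    """
    matches = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            # Matches both a file literally named ".config" and "foo.config".
            if not name.endswith(".config"):
                continue
            path = os.path.join(dirpath, name)
            try:
                with open(path, encoding="utf-8") as fh:
                    text = fh.read()
            except (OSError, UnicodeDecodeError):
                # Skip unreadable or non-text files rather than abort the scan.
                continue
            if old not in text:
                continue
            matches.append(path)
            if not dry_run:
                with open(path, "w", encoding="utf-8") as fh:
                    fh.write(text.replace(old, new))
    return sorted(matches)
```

Running it first with the default `dry_run=True` on a checkout directory lists the affected files without touching them; passing `dry_run=False` then performs the replacement. Deleting the cached file and letting scap recreate it remains the alternative mentioned above.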

Separate from this, there was another category in which .config files outside the deployment-cache directory still contained the old host name; this was reported to happen after a fresh OS install.

There were also comments that a fix inside scap is needed for this but still has to be deployed.

This ticket covers all of that, plus finding a clean way to handle this the next time we have to switch, say, from deploy1001 to deploy1002.

Event Timeline

Dzahn created this task. · Jun 15 2018, 2:46 PM
Restricted Application added a subscriber: Aklapper. · Jun 15 2018, 2:46 PM
Dzahn edited projects, added Scap; removed Deployments. · Jun 15 2018, 2:47 PM
Dzahn updated the task description.
Joe triaged this task as High priority. · Jun 18 2018, 8:41 AM
Joe added a subscriber: Joe. · Jun 18 2018, 10:47 AM
Vvjjkkii renamed this task from find a way to systematically update the deployment server name across all repos to 1taaaaaaaa. · Jul 1 2018, 1:03 AM
Vvjjkkii updated the task description.
Vvjjkkii removed a subscriber: Aklapper.
CommunityTechBot renamed this task from 1taaaaaaaa to find a way to systematically update the deployment server name across all repos. · Jul 2 2018, 12:07 PM
CommunityTechBot updated the task description.
CommunityTechBot added a subscriber: Aklapper.
thcipriani moved this task from Needs triage to Debt on the Scap board. · Jul 11 2018, 12:10 AM
thcipriani added a subscriber: thcipriani.

There are a couple of different issues here.

The tin and deployment-cache issues that came up after the initial move to deploy1001 are fixed by the 3.8.2-1 scap release. These issues were caused by repositories with submodules issuing a (recursive) fetch before remapping the submodules to look at the deployment server (see T196663#4265139 for details). That is fixed.

The second issue is that some repositories override deployment_server; that is tracked in T162814. The only outstanding patch I'm aware of is for 3d2png (https://gerrit.wikimedia.org/r/#/c/3d2png/deploy/+/441234/).

Mentioned in SAL (#wikimedia-operations) [2019-08-27T11:51:31Z] <mutante> miscweb1001 - manually remove tin.eqiad.wmnet (!) from /srv/iegreview/iegreview-cache/.config and replace with deploy1001 after first puppet run. still existing bug that tin is not fully removed (T224247, T175288, T197470)