
Reimage gerrit2001 as stretch
Closed, Resolved (Public)

Description

gerrit2001 should be reimaged as stretch since stretch is now stable :)

Using gerrit2001 first, as it doesn't have any traffic yet, so we can see if any problems arise before reimaging cobalt as stretch.

Afterwards we could reimage cobalt to stretch including renaming it to gerrit1001?

Related Objects

Status     Assigned
Resolved   Dzahn
Resolved   Dzahn
Resolved   Dzahn
Resolved   None
Resolved   RobH
Resolved   Marostegui
Resolved   jcrespo
Resolved   Papaul
Resolved   Marostegui
Resolved   RobH
Resolved   RobH
Declined   None
Resolved   Paladox
Resolved   Paladox
Declined   None
Resolved   Paladox
Resolved   hashar
Resolved   hashar
Resolved   hashar
Resolved   None
Resolved   Joe
Resolved   Joe
Resolved   Jdforrester-WMF
Resolved   bd808
Resolved   hashar
Resolved   hashar
Duplicate  hashar
Resolved   Paladox
Resolved   Dzahn
Resolved   thcipriani
Resolved   QChris
Resolved   QChris
Resolved   Dzahn
Resolved   QChris
Resolved   Dzahn
Declined   hashar

Event Timeline

demon triaged this task as Lowest priority. Jun 21 2017, 6:37 PM

I did talk to Paladox about it and said it's fine to create this task and subscribe me. But I also agree with a lower priority for now: first we should stabilize the systemd conversion, and maybe not do it all at once. Stretch _is_ stable now, though.

Paladox renamed this task from Reimage gerrit2001 as stretch to Reimage gerrit2001 and cobalt as stretch. Jul 4 2017, 2:39 PM
Paladox updated the task description.
Dzahn renamed this task from Reimage gerrit2001 and cobalt as stretch to Reimage gerrit2001 as stretch. Jul 4 2017, 4:21 PM

One at a time please; first gerrit2001 only, I suggest.

I've got about 6 other priorities before we do this. Yes, systemd first, also finishing logstash, scap deploy....

Change 380656 had a related patch set uploaded (by Dzahn; owner: Dzahn):
[operations/puppet@production] install: let gerrit2001 use stretch installer

https://gerrit.wikimedia.org/r/380656

Change 380656 merged by Dzahn:
[operations/puppet@production] install: let gerrit2001 use stretch installer

https://gerrit.wikimedia.org/r/380656

Mentioned in SAL (#wikimedia-operations) [2017-09-25T23:15:39Z] <mutante> gerrit2001 reinstalled with stretch, revoked old puppet cert, accepted new puppet cert, initial run that will do base and all the gerrit things at once.. (T168562)

gerrit2001 is back up with stretch, puppet did all the things, Apache is up and running, Letsencrypt worked and automatically got cert for gerrit-slave.wm, gerrit is installed, the latest wmf.7 version that wasn't on cobalt yet. No puppet errors at all. Pretty nice :) and almost unexpectedly smooth so far :)

ii gerrit 2.13.8+git1-wmf.7 all This is a code review system
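For anyone who wants to check the installed version in a script rather than by eye, here is a minimal sketch that splits a `dpkg -l` style row like the one above into its columns. The `DpkgEntry` and `parse_dpkg_line` names are made up for illustration and are not part of any existing tooling.

```python
from typing import NamedTuple


class DpkgEntry(NamedTuple):
    status: str        # "ii" means: desired state install, currently installed
    name: str
    version: str
    architecture: str
    description: str


def parse_dpkg_line(line: str) -> DpkgEntry:
    """Split one `dpkg -l` row into its five columns."""
    # The first four columns never contain spaces; the description may,
    # so split at most four times and keep the rest together.
    parts = line.split(None, 4)
    if len(parts) < 5:
        raise ValueError(f"not a dpkg -l row: {line!r}")
    return DpkgEntry(*parts)


entry = parse_dpkg_line("ii gerrit 2.13.8+git1-wmf.7 all This is a code review system")
print(entry.name, entry.version)
```

The same function could be fed the output from both hosts to confirm they end up on the same package version.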

Next step will be configuring gerrit. Puppet has not tried to start it yet, as expected and as it is told to.

I also did not try anything manual whatsoever, only puppet.. so it's definitely clean.

I guess we can close this as resolved now?

Depends on how you define it. If it's only about OS installation and applying the puppet roles without errors, yes. But Gerrit the service isn't running yet; it needs to be initialized and deployed with scap. Also, repo data needs to be rsynced. That part isn't done. I think it should stay open until that is done too and Gerrit is running with the --slave option. Though you can argue that it is all part of the other "gerrit-ssh fails to start" ticket, which we should link here.

Change 380824 had a related patch set uploaded (by Dzahn; owner: Chad):
[operations/puppet@production] Gerrit: Update known_hosts with newly reprovisioned gerrit2001

https://gerrit.wikimedia.org/r/380824

Change 380827 had a related patch set uploaded (by Dzahn; owner: Dzahn):
[operations/puppet@production] mariadb::misc: allow connections from gerrit servers

https://gerrit.wikimedia.org/r/380827

Change 380824 merged by Dzahn:
[operations/puppet@production] Gerrit: Update known_hosts with newly reprovisioned gerrit2001

https://gerrit.wikimedia.org/r/380824

gerrit2001 currently has a puppet error because the Letsencrypt cert request gets denied by LE due to hitting rate limits. This affects the gerrit-slave.wm.org hostname. It is not broken though, because we _do_ have a working cert which got created after the recent reinstall with stretch. For some unknown reason we hit the rate limit after that. The logs say we requested it 6 times, but that seems odd; 3 seems more realistic.
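To sanity-check a count like "6 requests" against the logs, one could tally request timestamps inside a trailing window. This is only a sketch: the timestamps below are invented, and the limit of 5 per 7 days is an assumed value for illustration, so check it against the current Letsencrypt documentation before relying on it.

```python
from datetime import datetime, timedelta


def requests_in_window(timestamps, now, window=timedelta(days=7)):
    """Return the requests that fall inside the trailing window ending at `now`."""
    cutoff = now - window
    return [t for t in timestamps if cutoff < t <= now]


def would_exceed(timestamps, now, limit, window=timedelta(days=7)):
    """True if one more request at `now` would go past `limit`."""
    return len(requests_in_window(timestamps, now, window)) >= limit


# Hypothetical request times, not taken from the real logs.
now = datetime(2017, 9, 27, 12, 0)
seen = [now - timedelta(hours=h) for h in (1, 5, 20, 30, 40)]

# limit=5 is an assumption for this example, not a confirmed LE value.
print(would_exceed(seen, now, limit=5))
```

Feeding in the real request times from the puppet or ACME client logs would show whether 6 or 3 is the right number.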

Change 380827 abandoned by Dzahn:
mariadb::misc: allow connections from gerrit servers

https://gerrit.wikimedia.org/r/380827

Dzahn changed the task status from Open to Stalled. Sep 28 2017, 8:13 PM

Stalled by firewall on DB.

This isn't really stalled, the host has indeed been reimaged as stretch and is completely working. The remaining issue is tracked in the subtask -- it's not really a subtask here.