
setup/deploy cobalt as gerrit warm standby/replacement
Closed, Resolved (Public)

Description

This task tracks the setup and deployment of server cobalt (WMF4725).

  • update mgmt dns for hostname
  • update public dns record
  • install_server update (raid10-lvm-ext4-srv)
  • network port setup
  • os install (jessie)
  • puppet/salt acceptance
  • restore/copy data from lead
  • make needed changes to gerrit role (provide a way to specify slave mode, others?)
  • add cobalt to contint ferm rules
  • apply gerrit::server role
  • remove backup::host include in site.pp, lead and cobalt identical in site.pp, remove lead from site.pp

Event Timeline

Change 314601 had a related patch set uploaded (by Dzahn):
add IPs for cobalt, using WMF4725

https://gerrit.wikimedia.org/r/314601

RobH updated the task description.

Change 314601 merged by Dzahn:
add IPs for cobalt, using WMF4725

https://gerrit.wikimedia.org/r/314601

Dzahn updated the task description.

[radon:~] $ host cobalt.mgmt.eqiad.wmnet
cobalt.mgmt.eqiad.wmnet has address 10.65.2.127

[radon:~] $ host cobalt.wikimedia.org
cobalt.wikimedia.org has address 208.80.154.81
cobalt.wikimedia.org has IPv6 address 2620:0:861:3:208:80:154:81
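In the lookup above, the last four groups of the IPv6 address spell out the IPv4 octets in decimal. A minimal sketch of that mapping follows, assuming the pattern is a deliberate convention and that `2620:0:861:3` is the /64 prefix in use; the function name `embed_ipv4` is hypothetical and not part of any tooling referenced in this task.

```python
import ipaddress


def embed_ipv4(prefix64: str, ipv4: str) -> str:
    """Append the decimal IPv4 octets as the last four IPv6 groups.

    prefix64 is the first four groups of the IPv6 address (assumed /64).
    Each decimal octet is reused verbatim as a hex group, so "208" becomes
    the group 0x208, not the hex encoding of the number 208.
    """
    ipaddress.IPv4Address(ipv4)  # raises ValueError on a malformed IPv4 address
    groups = prefix64.split(":") + ipv4.split(".")
    # Round-trip through IPv6Address to validate and normalize the result.
    return str(ipaddress.IPv6Address(":".join(groups)))


if __name__ == "__main__":
    print(embed_ipv4("2620:0:861:3", "208.80.154.81"))
```

With the values from the `host` output, this reproduces the IPv6 address shown for cobalt.wikimedia.org.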

Change 314609 had a related patch set uploaded (by Dzahn):
add cobalt to DHCP, set partman recipe

https://gerrit.wikimedia.org/r/314609

Change 314609 merged by Dzahn:
add cobalt to DHCP, set partman recipe

https://gerrit.wikimedia.org/r/314609

Change 314612 had a related patch set uploaded (by Dzahn):
add cobalt site.pp, comment gerrit role, access for admins

https://gerrit.wikimedia.org/r/314612

Change 314612 merged by Dzahn:
add cobalt site.pp, comment gerrit role, access for gerrit-roots

https://gerrit.wikimedia.org/r/314612

Change 314628 had a related patch set uploaded (by Chad):
Gerrit: Copy public IPs from lead to cobalt, we're reusing them

https://gerrit.wikimedia.org/r/314628

OS installed, added to puppet, salt key signed, access given to gerrit-roots. The gerrit server role stays commented out until tomorrow.

Change 314638 had a related patch set uploaded (by Dzahn):
make cobalt a backup::host

https://gerrit.wikimedia.org/r/314638

Change 314638 merged by Dzahn:
make cobalt a backup::host

https://gerrit.wikimedia.org/r/314638

started bacula restore of lead data to cobalt /srv

Run Restore job
JobName: RestoreFiles
Bootstrap: /var/lib/bacula/helium.eqiad.wmnet.restore.2.bsr
Where: /srv
Replace: always
FileSet: srv-gerrit-git
Backup Client: lead.wikimedia.org-fd
Restore Client: cobalt.wikimedia.org-fd
Storage: helium-FileStorage1
When: 2016-10-07 00:46:22
Catalog: production
Priority: 1
Plugin Options: *None*
OK to run? (yes/mod/no): yes
Job queued. JobId=38854

Oops: since Where: is a prefix, this restores the data to /srv/srv/gerrit. We can simply move it when the job is done, and then rsync the diff tomorrow.
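To illustrate why the restore landed one level too deep: the Where: value is prepended to each file's original absolute path. The sketch below only models that path arithmetic; `restored_path` is a hypothetical helper, not a Bacula API.

```python
import posixpath


def restored_path(where: str, original: str) -> str:
    """Model a restore that prefixes `where` to the original absolute path."""
    # Strip the leading slash so join() does not discard the prefix.
    return posixpath.join(where, original.lstrip("/"))


if __name__ == "__main__":
    # Where: /srv, as used in the job above
    print(restored_path("/srv", "/srv/gerrit"))
    # Where: /, which would have restored to the original location
    print(restored_path("/", "/srv/gerrit"))
```

Under this model, a Where: of `/` would have kept the original layout, while `/srv` yields the doubled `/srv/srv/gerrit` path seen here.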

Dzahn updated the task description.

Change 314641 had a related patch set uploaded (by Dzahn):
contint: allow ssh from cobalt, in addition to lead

https://gerrit.wikimedia.org/r/314641

Change 314641 merged by Dzahn:
contint: allow ssh from cobalt, in addition to lead

https://gerrit.wikimedia.org/r/314641

Change 314726 had a related patch set uploaded (by Dzahn):
gerrit: add rsyncd on cobalt for migrating data

https://gerrit.wikimedia.org/r/314726

Change 314726 merged by Dzahn:
gerrit: add rsyncd on cobalt for migrating data

https://gerrit.wikimedia.org/r/314726

Change 314734 had a related patch set uploaded (by Dzahn):
gerrit: activate gerrit::server role on cobalt

https://gerrit.wikimedia.org/r/314734

Change 314628 merged by Dzahn:
Gerrit: Specify public IPs for eqiad, we're not changing them

https://gerrit.wikimedia.org/r/314628

Change 314734 merged by Dzahn:
gerrit: activate gerrit::server role on cobalt

https://gerrit.wikimedia.org/r/314734

20:30 bblack: lead.wikimedia.org: replaced by cobalt functionally, please leave it untouched for now with puppet disabled!
19:46 mutante: deleted old /var/lib/gerrit2/ data on cobalt, syncing from lead
19:45 mutante: rsyncing /var/lib/gerrit2 from lead to cobalt
19:30 mutante: removed gerrit IPs from cobalt interfaces
19:29 mutante: disabled puppet on lead and cobalt
19:21 mutante: re-enabling puppet on cobalt
19:21 mutante: removed gerrit IP from lead's interface, v4 and v6
19:09 mutante: rsyncing gerrit data one more time from lead to cobalt
19:07 ostriches: stopped puppet on lead
19:07 mutante: stopping gerrit on lead
19:02 mutante: cobalt, disabled puppet, removed service IP from interface
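The log excerpt above is listed newest first. To read the switchover in the order it happened, the entries can be parsed and reversed. This is a small sketch assuming the `HH:MM nick: message` layout seen above; the names `Entry`, `parse_entry` and `chronological` are illustrative, and only a few of the lines are reproduced.

```python
import re
from typing import List, NamedTuple


class Entry(NamedTuple):
    time: str
    nick: str
    message: str


# "HH:MM nick: message"; the message itself may contain further colons.
ENTRY_RE = re.compile(r"^(\d{2}:\d{2}) (\w+): (.*)$")


def parse_entry(line: str) -> Entry:
    match = ENTRY_RE.match(line.strip())
    if match is None:
        raise ValueError(f"not a log entry: {line!r}")
    return Entry(*match.groups())


def chronological(lines: List[str]) -> List[Entry]:
    # The log is newest first, so reversing it restores the original order,
    # including for entries that share the same minute.
    return [parse_entry(line) for line in reversed(lines)]


LOG = [
    "20:30 bblack: lead.wikimedia.org: replaced by cobalt functionally, "
    "please leave it untouched for now with puppet disabled!",
    "19:09 mutante: rsyncing gerrit data one more time from lead to cobalt",
    "19:07 ostriches: stopped puppet on lead",
    "19:07 mutante: stopping gerrit on lead",
    "19:02 mutante: cobalt, disabled puppet, removed service IP from interface",
]

if __name__ == "__main__":
    for entry in chronological(LOG):
        print(entry.time, entry.nick, entry.message)
```

Reversing rather than sorting by timestamp avoids reordering the two 19:07 entries, which a sort on `HH:MM` alone could not distinguish.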

Change 314767 had a related patch set uploaded (by Dzahn):
gerrit: remove backup::host include from cobalt

https://gerrit.wikimedia.org/r/314767

Change 314767 merged by Dzahn:
gerrit: remove backup::host, rsyncd include from cobalt

https://gerrit.wikimedia.org/r/314767

Change 314768 had a related patch set uploaded (by Dzahn):
gerrit: mv standard incl to role, rm duplicate firewall

https://gerrit.wikimedia.org/r/314768

Change 314768 merged by Dzahn:
gerrit: mv standard incl to role, rm duplicate firewall

https://gerrit.wikimedia.org/r/314768

Change 315418 had a related patch set uploaded (by Dzahn):
gerrit: remove lead from site.pp, adjust comment

https://gerrit.wikimedia.org/r/315418

Change 315418 merged by Dzahn:
gerrit: remove lead from site.pp, adjust comment

https://gerrit.wikimedia.org/r/315418

Dzahn updated the task description.