- Apply the puppet role gerrit with gerrit::is_replica: true to gerrit2002
- Add a DNS record gerrit-replica-new.wikimedia.org pointing to gerrit2002
- Add gerrit2002 as a replica in the primary gerrit server's config (on gerrit1001)
- Ensure replication is complete on gerrit2002
Description
Details
| Status | Subtype | Assigned | Task |
|---|---|---|---|
| Resolved | | Dzahn | T313250 Bring up gerrit2002 |
| Resolved | | Dzahn | T243027 replacement for gerrit2001, decom gerrit2001 |
| Resolved | | Dzahn | T313972 Add gerrit2002 as a replica of gerrit1001 |
Event Timeline
For context on a similar discussion for gitlab, please see T310265.
In particular I have some questions/comments:
- Could gerrit-replica-new.wikimedia.org be a CNAME, or must it be an A/AAAA record? What is it used for? I'm assuming that if we had to fail over gerrit to gerrit2002, we would have to update the gerrit.wikimedia.org DNS record.
- The current one is set as a VIP in Netbox, but the IP itself is part of the public1-d-codfw VLAN and as such could "migrate" only within row D in codfw. Does it need to be a "VIP"?
The answers to the above would help guide how we should allocate the required IP(s).
Change 815395 had a related patch set uploaded (by Dzahn; author: Dzahn):
[operations/dns@master] add gerrit-replica-new.wikimedia.org, point to 208.80.153.109
Change 815396 had a related patch set uploaded (by Dzahn; author: Dzahn):
[operations/puppet@production] gerrit: add hiera settings for replica to gerrit2002
@Volans When I looked at Netbox I noticed the same thing. The reserved entry there for gerrit-replica.wikimedia.org has been entered as a VIP (I didn't check and am not aware of who added it originally).
Then when I compared that to how gitlab-replica.wikimedia.org has been added more recently, after you talked with @Jelto and @Arnoldokoth about it, it has been done in a different way in netbox.
But it should actually be the same. I am assuming the old existing gerrit-replica entry is the wrong one. That isn't a VIP as in a load balancer. It's just a second public IP on the same interface.
The point is that it should be exactly the same for both gerrit and gitlab. What you heard about gitlab-replica and gitlab-replica-new recently also applies to gerrit-replica-new.
Could you edit Netbox so that things are treated the same way for gerrit and gitlab, and assign us an IP for gerrit-replica-new just like you did the other day for gitlab-replica-new?
I also added changes above that would be merged once we have an assigned IP (treat the 208.80.153.109 in that patch as a placeholder).
We need to replicate to 2 different replicas at the same time, that's why "replica" and "replica-new".
That ticket you link to about discussing IP usage is still valid of course in the long-term. Would be nice to have it for the current sprint though.
Also I have heard voices that said using 2 IPs for service vs server is actually not a bad approach here.
@Dzahn I've assigned an additional IPv4 and an additional IPv6 address to gerrit2002 in Netbox, as for gitlab: not marked as VIP, and with manual DNS, as I guess you'll need to keep the TTL low. The IPs are:
[1] 208.80.153.104/27 => 104.153.80.208.in-addr.arpa
[2] 2620:0:860:4:208:80:153:104/64 => 4.0.1.0.3.5.1.0.0.8.0.0.8.0.2.0.4.0.0.0.0.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa
Let me know if you need anything else from my side.
[1] https://netbox.wikimedia.org/ipam/ip-addresses/11649/
[2] https://netbox.wikimedia.org/ipam/ip-addresses/11650/
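For anyone double-checking the manual reverse DNS entries, the PTR names above can be derived from the addresses with Python's standard `ipaddress` module. This is only a sketch for verification; the helper name `reverse_name` is made up here and is not part of any Netbox or DNS tooling.

```python
import ipaddress


def reverse_name(address_with_prefix: str) -> str:
    """Return the reverse-DNS (PTR) name for an address given as 'ip/prefix'."""
    # ip_interface accepts both IPv4 and IPv6 in address/prefix notation;
    # .ip drops the prefix length, .reverse_pointer builds the PTR name.
    return ipaddress.ip_interface(address_with_prefix).ip.reverse_pointer


if __name__ == "__main__":
    for addr in ("208.80.153.104/27", "2620:0:860:4:208:80:153:104/64"):
        print(addr, "=>", reverse_name(addr))
```

The printed names can then be compared against the records in the manual DNS change.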
@Volans Thank you! That looks good. I will do the manual DNS change for that (amending the existing patch now).
Change 815395 merged by Dzahn:
[operations/dns@master] add gerrit-replica-new.wikimedia.org, point to 208.80.153.104
new in DNS:
[authdns1001:~] $ host gerrit-replica-new.wikimedia.org
gerrit-replica-new.wikimedia.org has address 208.80.153.104
gerrit-replica-new.wikimedia.org has IPv6 address 2620:0:860:4:208:80:153:104
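The same check as the `host` output above could be scripted, for example to confirm from another machine that the name resolves to exactly the two assigned addresses. A minimal sketch using only the standard library; the function names and the injectable `resolver` parameter are illustrative assumptions, not existing tooling.

```python
import ipaddress
import socket

# The addresses assigned in Netbox for gerrit-replica-new (from this task).
EXPECTED = {
    ipaddress.ip_address("208.80.153.104"),
    ipaddress.ip_address("2620:0:860:4:208:80:153:104"),
}


def resolved_addresses(hostname, resolver=socket.getaddrinfo):
    """Return the set of IPv4/IPv6 addresses that hostname resolves to."""
    # Each getaddrinfo entry is (family, type, proto, canonname, sockaddr);
    # sockaddr[0] is the address string for both AF_INET and AF_INET6.
    return {ipaddress.ip_address(info[4][0]) for info in resolver(hostname, None)}


def matches_expected(hostname, expected=EXPECTED, resolver=socket.getaddrinfo):
    """True if hostname resolves to exactly the expected address set."""
    return resolved_addresses(hostname, resolver) == expected
```

Calling `matches_expected("gerrit-replica-new.wikimedia.org")` performs a real lookup through the local resolver, so the result depends on the network and on whether the host has IPv6 resolution available.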
Change 815396 merged by Dzahn:
[operations/puppet@production] gerrit: add hiera settings and IP for new replica gerrit2002
@thcipriani @demon and all:
We got the IP address, we tied it to gerrit-replica-new.wikimedia.org. (v4 + v6)
I added in Hiera via gerrit2002.yaml that it should use those, plus:
profile::gerrit::backups_enabled: false
profile::gerrit::is_replica: true
Next we need this: https://gerrit.wikimedia.org/r/c/operations/puppet/+/815401
What this will do is write the following on gerrit1001 in File[/var/lib/gerrit2/review_site/etc/replication.config]:
+[remote "replica_new_codfw"]
+ defaultForceUpdate = true
+ mirror = true
+ replicateHiddenProjects = true
+ replicateProjectDeletions = true
+ replicationDelay = 5
+ rescheduleDelay = 5
+ threads = 4
+ url = gerrit2@gerrit2002.wikimedia.org:/srv/gerrit/git/${name}.git
https://puppet-compiler.wmflabs.org/pcc-worker1001/36529/gerrit1001.wikimedia.org/index.html
Let's enable that together next week?
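To illustrate the `url` setting in the new remote: the replication plugin substitutes `${name}` with the project name, so each project gets its own target path on gerrit2002. A rough sketch of that expansion, using Python's `string.Template` as a stand-in (the plugin itself is Java and may handle special characters differently; `replica_url` is a hypothetical helper and `operations/puppet` is just an example project):

```python
from string import Template

# URL pattern from the replica_new_codfw remote in replication.config.
URL_TEMPLATE = "gerrit2@gerrit2002.wikimedia.org:/srv/gerrit/git/${name}.git"


def replica_url(project: str, template: str = URL_TEMPLATE) -> str:
    """Expand the ${name} placeholder with a project name (illustration only)."""
    return Template(template).substitute(name=project)


if __name__ == "__main__":
    print(replica_url("operations/puppet"))
```

For `operations/puppet` this yields a push target under `/srv/gerrit/git/operations/puppet.git` on gerrit2002.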
Mentioned in SAL (#wikimedia-operations) [2022-08-01T21:02:57Z] <mutante> gerrit2002 - mkdir /var/lib/gerrit2/review_site | gerrit1001 - rsyncing /var/lib/gerrit2/review_site/ to gerrit2002 T313250 T313972
Mentioned in SAL (#wikimedia-operations) [2022-08-02T22:15:31Z] <mutante> gerrit - syncing data (/srv/gerrit /var/lib/gerrit2/review_site /home) again after gerrit2002 was reimaged with buster T313250 T313972
gerrit2002.wikimedia.org is gerrit-replica.wikimedia.org
gerrit2001.wikimedia.org is down and fully removed.