Add gerrit2002 as a replica of gerrit1001
Closed, ResolvedPublic

Description

  • apply the puppet role gerrit with gerrit::is_replica: true to gerrit2002
  • Add a DNS record gerrit-replica-new.wikimedia.org pointing to gerrit2002
  • add gerrit2002 as a replica in the primary gerrit server's config (on gerrit1001)
  • Ensure replication is complete on gerrit2002

Event Timeline

For context on a similar discussion for gitlab, please see T310265.

In particular I have some questions/comments:

  • Could gerrit-replica-new.wikimedia.org be a CNAME, or must it be an A/AAAA record? What is it used for? I'm assuming that if we had to fail over gerrit to gerrit2002, we would have to update the gerrit.wikimedia.org DNS record.
  • The current one is set as VIP in Netbox, but the IP itself is part of the public1-d-codfw VLAN and as such could "migrate" only within row D in codfw. Does it need to be a "VIP"?

The answers to the above would help guide how we should allocate the required IP(s).

Change 815395 had a related patch set uploaded (by Dzahn; author: Dzahn):

[operations/dns@master] add gerrit-replica-new.wikimedia.org, point to 208.80.153.109

https://gerrit.wikimedia.org/r/815395

Change 815396 had a related patch set uploaded (by Dzahn; author: Dzahn):

[operations/puppet@production] gerrit: add hiera settings for replica to gerrit2002

https://gerrit.wikimedia.org/r/815396

The current one is set as VIP in Netbox,

@Volans When I looked at Netbox I noticed the same thing. The reserved entry there for gerrit-replica.wikimedia.org has been entered as a VIP (I didn't check, and am not aware, who added it originally).

Then when I compared that to how gitlab-replica.wikimedia.org has been added more recently, after you talked with @Jelto and @Arnoldokoth about it, it has been done in a different way in netbox.

But it should actually be the same. I am assuming the old existing gerrit-replica entry is the wrong one. It isn't a VIP in the load-balancer sense. It's just a second public IP on the same interface.

The point is that it should be exactly the same for both gerrit and gitlab. What you heard about gitlab-replica and gitlab-replica-new recently also applies to gerrit-replica-new.

Could you edit Netbox so that things are treated the same way for gerrit and gitlab, and assign us an IP for gerrit-replica-new just like you did the other day for gitlab-replica-new?

I also added the changes above, which would be merged once we have an assigned IP (treat the 208.80.153.109 in that patch as a placeholder).

We need to replicate to 2 different replicas at the same time; that's why there is both "replica" and "replica-new".

The ticket you linked about IP usage is of course still valid in the long term. It would be nice to have the IP for the current sprint, though.

Also, I have heard people say that using 2 IPs (one for the service, one for the server) is actually not a bad approach here.

Krinkle renamed this task from Add Gerrit2002 as a replica of Gerrit1001 to Add gerrit2002 as a replica of gerrit1001. Thu, Jul 28, 8:46 PM

@Dzahn I've assigned an additional IPv4 and an additional IPv6 address to gerrit2002 in Netbox, as for gitlab: not marked as VIP, and with manual DNS, as I guess you'll need to keep the TTL low. The IPs are:

[1] 208.80.153.104/27 => 104.153.80.208.in-addr.arpa
[2] 2620:0:860:4:208:80:153:104/64  => 4.0.1.0.3.5.1.0.0.8.0.0.8.0.2.0.4.0.0.0.0.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa

Let me know if you need anything else from my side.

[1] https://netbox.wikimedia.org/ipam/ip-addresses/11649/
[2] https://netbox.wikimedia.org/ipam/ip-addresses/11650/
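The reverse-DNS names listed for the two addresses above can be re-derived mechanically, which is a cheap way to catch a typo before editing the zone files by hand. A minimal sketch using Python's standard `ipaddress` module; the helper name `ptr_name` is made up for illustration and is not part of any Wikimedia tooling:

```python
import ipaddress


def ptr_name(address: str) -> str:
    """Return the reverse-DNS (PTR) owner name for an IPv4 or IPv6 address.

    Hypothetical helper for illustration only.
    """
    # Drop an optional prefix length such as "/27" or "/64" before parsing.
    host = address.split("/", 1)[0]
    return ipaddress.ip_address(host).reverse_pointer


if __name__ == "__main__":
    for addr in ("208.80.153.104/27", "2620:0:860:4:208:80:153:104/64"):
        print(addr, "=>", ptr_name(addr))
```

For the IPv6 address, `reverse_pointer` expands the address to its full 32 nibbles before reversing them, which is why the resulting `ip6.arpa` name is so much longer than the compressed address.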

@Volans Thank you! That looks good. I will do the manual DNS change for that (amending the existing change now).

Dzahn changed the task status from Open to In Progress. Fri, Jul 29, 7:09 PM
Dzahn claimed this task.
Dzahn triaged this task as High priority.

Change 815395 merged by Dzahn:

[operations/dns@master] add gerrit-replica-new.wikimedia.org, point to 208.80.153.104

https://gerrit.wikimedia.org/r/815395

new in DNS:

[authdns1001:~] $ host gerrit-replica-new.wikimedia.org
gerrit-replica-new.wikimedia.org has address 208.80.153.104
gerrit-replica-new.wikimedia.org has IPv6 address 2620:0:860:4:208:80:153:104

Change 815396 merged by Dzahn:

[operations/puppet@production] gerrit: add hiera settings and IP for new replica gerrit2002

https://gerrit.wikimedia.org/r/815396

@thcipriani @demon and all:

We got the IP address and tied it to gerrit-replica-new.wikimedia.org (v4 + v6).

I added in Hiera via gerrit2002.yaml that it should use those, plus:

profile::gerrit::backups_enabled: false
profile::gerrit::is_replica: true

Next we need this: https://gerrit.wikimedia.org/r/c/operations/puppet/+/815401

What this will do is write the following on gerrit1001 in `File[/var/lib/gerrit2/review_site/etc/replication.config]`:

+[remote "replica_new_codfw"]
+  defaultForceUpdate = true
+  mirror = true
+  replicateHiddenProjects = true
+  replicateProjectDeletions = true
+  replicationDelay = 5
+  rescheduleDelay = 5
+  threads = 4
+  url = gerrit2@gerrit2002.wikimedia.org:/srv/gerrit/git/${name}.git
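To make the shape of that stanza easier to see, here is a rough sketch that renders the same `[remote "..."]` block from a plain dictionary. The real file is generated by Puppet from the linked change; `render_remote` and `SETTINGS` are hypothetical names used only for illustration and do not reflect how the Puppet template is written. The keys and values are copied from the diff above.

```python
def render_remote(name: str, settings: dict) -> str:
    """Render one git-config style [remote "<name>"] stanza.

    Hypothetical helper: it only mirrors the layout shown in the diff.
    """
    lines = [f'[remote "{name}"]']
    for key, value in settings.items():
        # Git-config booleans are lowercase; check bool first since
        # bool is a subclass of int in Python.
        if isinstance(value, bool):
            value = "true" if value else "false"
        lines.append(f"  {key} = {value}")
    return "\n".join(lines) + "\n"


# Values copied from the diff; ${name} is left literal because the
# replication plugin, not Python, substitutes the project name.
SETTINGS = {
    "defaultForceUpdate": True,
    "mirror": True,
    "replicateHiddenProjects": True,
    "replicateProjectDeletions": True,
    "replicationDelay": 5,
    "rescheduleDelay": 5,
    "threads": 4,
    "url": "gerrit2@gerrit2002.wikimedia.org:/srv/gerrit/git/${name}.git",
}

if __name__ == "__main__":
    print(render_remote("replica_new_codfw", SETTINGS), end="")
```

Keeping the existing remote and adding a second one under a different name is what lets the primary push to both replicas at the same time.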

https://puppet-compiler.wmflabs.org/pcc-worker1001/36529/gerrit1001.wikimedia.org/index.html

Let's enable that together next week?

Mentioned in SAL (#wikimedia-operations) [2022-08-01T21:02:57Z] <mutante> gerrit2002 - mkdir /var/lib/gerrit2/review_site | gerrit1001 - rsyncing /var/lib/gerrit2/review_site/ to gerrit2002 T313250 T313972

Mentioned in SAL (#wikimedia-operations) [2022-08-02T22:15:31Z] <mutante> gerrit - syncing data (/srv/gerrit /var/lib/gerrit2/review_site /home) again after gerrit2002 was reimaged with buster T313250 T313972

Confirming that replication from gerrit1001 to gerrit2002 is working.
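The task does not record how replication was confirmed. One possible check, sketched here under stated assumptions, is to capture `git ls-remote` output for the same project from the primary and from the replica and compare the refs. The function names below are hypothetical, and the sample SHAs in the usage example are made up.

```python
def parse_ls_remote(output: str) -> dict[str, str]:
    """Parse `git ls-remote` output ("<sha>\t<ref>" per line) into {ref: sha}."""
    refs = {}
    for line in output.splitlines():
        if not line.strip():
            continue
        sha, ref = line.split(None, 1)
        refs[ref.strip()] = sha
    return refs


def ref_differences(primary_output: str, replica_output: str) -> dict[str, tuple]:
    """Return {ref: (primary_sha, replica_sha)} for every ref that differs.

    A missing ref is reported with None on the side that lacks it.
    An empty result means the two ref listings are identical.
    """
    primary = parse_ls_remote(primary_output)
    replica = parse_ls_remote(replica_output)
    return {
        ref: (primary.get(ref), replica.get(ref))
        for ref in sorted(primary.keys() | replica.keys())
        if primary.get(ref) != replica.get(ref)
    }


if __name__ == "__main__":
    # Made-up sample data: the replica is missing one ref.
    primary = "aaaa111\trefs/heads/master\nbbbb222\trefs/changes/95/815395/1\n"
    replica = "aaaa111\trefs/heads/master\n"
    print(ref_differences(primary, replica))
```

Because `replicationDelay` is non-zero, a ref pushed a few seconds ago may legitimately show up as a difference, so a check like this is best repeated before concluding that replication is broken.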

Dzahn updated the task description.

gerrit2002.wikimedia.org is gerrit-replica.wikimedia.org

gerrit2001.wikimedia.org is down and fully removed.