
Iron out issues in the proxy structure for multi-instance wikireplicas
Closed, ResolvedPublic

Description

At this time, various stakeholders agreed upon a general method of making this all work in T267376.
With this patch https://gerrit.wikimedia.org/r/627379, the existing haproxy servers are able to serve the multi-instance replicas, except for the firewall and other requirements detailed in T267376 (to be added here). Things still to work out:

  • Establish the correct route through firewalls to the clouddb-wikireplica-proxy VMs in the cloud, per the agreed on structure
  • Write detailed documentation for WMCS and Data Persistence for depooling and related operations on the proxy level
  • Determine if dbproxy1018 and dbproxy1019 should have auto failover set up for the multi-instance replicas. Right now they don't without an override, and that seems incorrect. Each port just sends to the main backend. A puppet change should fix that.
  • Create the DNS names needed and adjust the wmcs-wikireplicas-dns script accordingly

Generally agreed on routing structure:

wikireplicas-floating ips.png (806×3 px, 232 KB)

The main difference compared to the previous architecture diagram is that we introduce another proxy layer inside the CloudVPS virtual network. This proxy layer 1 abstracts away database sections for customers.
Proxy layer 2 then receives connections from the first layer, arriving from the general egress CloudVPS NAT public IPv4 address, and is responsible for proxying again to the actual database, using per-section TCP port selectors that are known only to proxy layers 1 and 2.
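As an illustrative sketch only (the real configuration lives in puppet; the section port and the standby host below are hypothetical placeholders, not taken from the actual config), one proxy-layer-2 section listener on a dbproxy host might look roughly like this in haproxy terms:

```
# Hypothetical sketch of one proxy-layer-2 section listener on a dbproxy host.
# Port 3311 and the standby host clouddb1017 are placeholders.
listen mariadb-s1
    mode tcp
    bind 0.0.0.0:3311                                              # per-section port known only to proxy layer 1
    server clouddb1013 clouddb1013.eqiad.wmnet:3306 check
    server clouddb1017 clouddb1017.eqiad.wmnet:3306 check backup   # standby
```

The `backup` keyword is what would give standby behavior: haproxy only sends traffic to a backup server once the primary's health check fails.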

The flow is something like this:

client 
->
 enwiki.web.svc.clouddbservices.eqiad1.wikimedia.cloud IN CNAME s1.web.svc.clouddbservices.eqiad1.wikimedia.cloud
 ->
   s1.web.svc.clouddbservices.eqiad1.wikimedia.cloud (proxy layer 1 -- 172.16.x.x -- TCP/3306)
   ->
     dbproxy1019 (proxy layer 2 -- web db address (208.80.x.x) -- s1 port TCP/XXXX)
     ->
        clouddb1013 (or any other database host)

client 
->
 enwiki.analytics.svc.clouddbservices.eqiad1.wikimedia.cloud IN CNAME s1.analytics.svc.clouddbservices.eqiad1.wikimedia.cloud
 ->
   s1.analytics.svc.clouddbservices.eqiad1.wikimedia.cloud (proxy layer 1 -- 172.16.y.y -- TCP/3306)
   ->
     dbproxy1018 (proxy layer 2 -- analytics db address (208.80.y.y) -- s1 port TCP/YYYY)
     ->
        clouddb1013 (or any other database host)
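To make the two flows above concrete, here is a small Python sketch of the name and port selection involved. The wiki-to-section mapping shown is a tiny illustrative subset, and the section ports are hypothetical placeholders (the task deliberately elides the real ones as XXXX/YYYY):

```python
# Sketch of the selection logic behind the flows above.
# WIKI_TO_SECTION is a tiny illustrative subset; SECTION_TO_PORT is hypothetical.
WIKI_TO_SECTION = {"enwiki": "s1", "commonswiki": "s4"}
SECTION_TO_PORT = {"s1": 10001, "s4": 10004}

def layer1_target(wiki, svc):
    """CNAME target the client's per-wiki name resolves to (proxy layer 1)."""
    section = WIKI_TO_SECTION[wiki]
    return f"{section}.{svc}.svc.clouddbservices.eqiad1.wikimedia.cloud"

def layer2_endpoint(wiki, svc):
    """Where proxy layer 1 forwards the connection: (dbproxy host, section port)."""
    host = "dbproxy1019" if svc == "web" else "dbproxy1018"
    return host, SECTION_TO_PORT[WIKI_TO_SECTION[wiki]]

print(layer1_target("enwiki", "web"))
# -> s1.web.svc.clouddbservices.eqiad1.wikimedia.cloud
print(layer2_endpoint("enwiki", "analytics"))
# -> ('dbproxy1018', 10001)
```

The important property is that clients only ever see TCP/3306 on the layer-1 names; the per-section ports exist only between the two proxy layers.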

Event Timeline

Bstorm triaged this task as Medium priority.Jan 7 2021, 10:18 PM
Bstorm created this task.
Bstorm raised the priority of this task from Medium to High.Jan 7 2021, 10:22 PM
Bstorm updated the task description. (Show Details)

At this point, each proxy has the capability to route to the new replicas as well as the old ones, but it only routes to each instance's primary. I presume we want the analytics replica to be the standby for "web" and vice versa, right @Marostegui? That seems better than requiring manual intervention if something happens.

The firewall is not open to cloud yet, intentionally pending sorting out the whole public ip issue.

@aborrero I need to sync up with you on the naming and IPVS stuff here when you have time. I'll suggest a scheduled time if I miss you tomorrow.

At this point, each proxy has the capability to route to the new replicas as well as the old ones, but it only routes to each instance's primary. I presume we want the analytics replica to be the standby for "web" and vice versa, right @Marostegui? That seems better than requiring manual intervention if something happens.

+1 yes, that would be the ideal scenario. There are different query killer thresholds on those two services, but that is still better than having to wait for a person to come and fix things.

@aborrero I need to sync up with you on the naming and IPVS stuff here when you have time. I'll suggest a scheduled time if I miss you tomorrow.

We won't overlap a lot today (Friday), so I just sent an invite for 2021-01-11.

Change 655174 had a related patch set uploaded (by Bstorm; owner: Bstorm):
[operations/puppet@production] wikireplicas: enable standby behavior on multiinstance proxies

https://gerrit.wikimedia.org/r/655174

Change 655174 merged by Bstorm:
[operations/puppet@production] wikireplicas: enable standby behavior on multiinstance proxies

https://gerrit.wikimedia.org/r/655174

Change 655498 had a related patch set uploaded (by Bstorm; owner: Bstorm):
[operations/puppet@production] wikireplicas: enable standby behavior on multiinstance proxies

https://gerrit.wikimedia.org/r/655498

Change 655498 merged by Bstorm:
[operations/puppet@production] wikireplicas: enable standby behavior on multiinstance proxies

https://gerrit.wikimedia.org/r/655498

Change 655509 had a related patch set uploaded (by Bstorm; owner: Bstorm):
[operations/puppet@production] wikireplicas: one last tweak to merge the hashes right

https://gerrit.wikimedia.org/r/655509

Change 655509 merged by Bstorm:
[operations/puppet@production] wikireplicas: one last tweak to merge the hashes right

https://gerrit.wikimedia.org/r/655509

Change 655533 had a related patch set uploaded (by Bstorm; owner: Bstorm):
[operations/puppet@production] wikireplicas: set up LVS for multiinstance wikireplicas

https://gerrit.wikimedia.org/r/655533

Change 657155 had a related patch set uploaded (by Bstorm; owner: Bstorm):
[operations/puppet@production] wikireplicas: Add new DNS names for multiinstance replicas

https://gerrit.wikimedia.org/r/657155

Change 657155 merged by Bstorm:
[operations/puppet@production] wikireplicas: Add new DNS names for multiinstance replicas

https://gerrit.wikimedia.org/r/657155

Change 658457 had a related patch set uploaded (by Bstorm; owner: Bstorm):
[operations/puppet@production] wikireplicas: fix error in VM proxy config

https://gerrit.wikimedia.org/r/658457

Change 658457 merged by Bstorm:
[operations/puppet@production] wikireplicas: fix error in VM proxy config

https://gerrit.wikimedia.org/r/658457

Change 659041 had a related patch set uploaded (by Bstorm; owner: Bstorm):
[operations/puppet@production] wikireplicas: open the proxies for the new ports

https://gerrit.wikimedia.org/r/659041

Change 659041 merged by Bstorm:
[operations/puppet@production] wikireplicas: open the proxies for the new ports

https://gerrit.wikimedia.org/r/659041

Change 659070 had a related patch set uploaded (by Bstorm; owner: Bstorm):
[operations/homer/public@master] wikireplicas: open the firewall for multiinstance databases

https://gerrit.wikimedia.org/r/659070

Change 659070 merged by jenkins-bot:
[operations/homer/public@master] wikireplicas: open the firewall for multiinstance databases

https://gerrit.wikimedia.org/r/659070

Mentioned in SAL (#wikimedia-operations) [2021-01-28T16:03:24Z] <arturo> running homer on cr*-eqiad* for T271476

Change 659312 had a related patch set uploaded (by Arturo Borrero Gonzalez; owner: Arturo Borrero Gonzalez):
[operations/homer/public@master] cr/firewall.conf: cloud-in4: allow new wiki replicas TCP ports

https://gerrit.wikimedia.org/r/659312

Mentioned in SAL (#wikimedia-operations) [2021-01-28T16:41:09Z] <arturo> running homer on cr*-eqiad* again for reverting latest changes (T271476)

Change 659409 had a related patch set uploaded (by Bstorm; owner: Bstorm):
[operations/puppet@production] wikireplicas proxy: flip the service to production

https://gerrit.wikimedia.org/r/659409

Change 655533 merged by Bstorm:
[operations/puppet@production] wikireplicas: set up LVS for multiinstance wikireplicas

https://gerrit.wikimedia.org/r/655533

Mentioned in SAL (#wikimedia-operations) [2021-01-28T20:13:55Z] <bblack> lvs1014,lvs1016 - puppet temporarily disabled for new service config deploy - T271476

Change 659409 merged by BBlack:
[operations/puppet@production] wikireplicas proxy: flip the service to production

https://gerrit.wikimedia.org/r/659409

Change 659414 had a related patch set uploaded (by Bstorm; owner: Bstorm):
[operations/puppet@production] wikireplicas proxy: add PRODUCTION_NETWORKS to the ferm rules

https://gerrit.wikimedia.org/r/659414

Change 659414 merged by Bstorm:
[operations/puppet@production] wikireplicas proxy: add PRODUCTION_NETWORKS to the ferm rules

https://gerrit.wikimedia.org/r/659414

Change 659439 had a related patch set uploaded (by BBlack; owner: BBlack):
[operations/puppet@production] lvs101[46] - add cloud-support1-c-eqiad vlan

https://gerrit.wikimedia.org/r/659439

Change 659439 merged by BBlack:
[operations/puppet@production] lvs101[46] - add cloud-support1-c-eqiad vlan

https://gerrit.wikimedia.org/r/659439

@Marostegui FYI, the work above has added a number of things to dbproxy1019 and dbproxy1018. None of it should make a big difference to operations related to the legacy wikireplicas. I very much owe a lot of documentation at this point for the new replica systems.
Things you might notice:

  • Each proxy has an LVS public IP on the loopback, with DNS names wikireplicas-a.wikimedia.org and wikireplicas-b.wikimedia.org. The names are generic so we don't have to change which IP is which when doing maintenance.
  • We do not anticipate any technical need to ever use conftool to depool the servers; however, if they are not depooled before maintenance, it will spam the -operations channel a bit, so there may be social reasons to do so. I will add notes on that to the new docs.
  • The new replicas can receive traffic, but I suspect that won't become much of a thing until next week. Even then, I expect slow adoption because reshuffling of the LVS config may be needed at some point due to how we connected the VLANs. My tests at using DNS to correctly select the right database instances went great.
  • The LVS-based service is limited to the cloud public IP range only via ferm. Cloud-internal clients can only access the ports via NAT, a floating IP, or the VM proxies (preferred). I expect that will become a limit in switch ACLs in the future as well.
  • All the conftool/pybal tooling also ended up on those two proxies because of puppet magic, but at least no new services did.

Thanks for the heads-up @Bstorm!
I have two questions at this point:

  1. Should I enable notifications on these hosts already or do you want me to wait a bit? Enabling them means that we will get IRC notifications and not doing so means that they will only show up on Icinga UI (I have it open at all times).
  2. What's the procedure to depool a given new replica (say: clouddb1015:XXXX) at the moment?

Thank you for all the hard work.

Change 659312 abandoned by Arturo Borrero Gonzalez:
[operations/homer/public@master] cr/firewall.conf: cloud-in4: allow new wiki replicas TCP ports

Reason:
no longer required

https://gerrit.wikimedia.org/r/659312

Thanks for the heads-up @Bstorm!
I have two questions at this point:

  1. Should I enable notifications on these hosts already or do you want me to wait a bit? Enabling them means that we will get IRC notifications and not doing so means that they will only show up on Icinga UI (I have it open at all times).

Well, they are kind of ready, aren't they? Obviously, for clouddb1019 it should depend on whether it's fully working. I think so.

  1. What's the procedure to depool a given new replica (say: clouddb1015:XXXX) at the moment?

I didn't manage to write a doc on that today. I will try to get that out tomorrow. The short version, though, to specifically depool clouddb1015, is:

If it was ok to leave it as the secondary for dbproxy1018, then you'd add the following to hieradata/hosts/dbproxy1019, and this should do it by simply overwriting the s4 and s6 keys:

profile::mariadb::proxy::multiinstance_replicas::section_overrides:
  s4:
    clouddb1019.eqiad.wmnet:
      ipaddress: 10.64.48.9
  s6:
    clouddb1019.eqiad.wmnet:
      ipaddress: 10.64.48.9

I have an example in the labs/private repo because I wanted to test the whole mess there where I don't have puppetdb (in hieradata/role/common/mariadb/proxy/replicas.yaml).
It should also be possible to set weights, add a standby, and fully depool the host you are depooling on the other proxy so it doesn't get treated as a standby. While writing the docs, I'll make sure to test all of that while these are in the early-adopter stage.

The only thing that could make things look a little different is if the merge in puppet keeps the old value. I don't think it will without a deep merge, but I'll make sure while testing. If it mysteriously uses a deep merge, I added a check for a depooled key that will drop the host from the config, just in case.

The LVS should not need fiddling, and there is another proxy layer on VMs where you'd depool one of the physical dbproxy hosts (using horizon hiera). If we wanted to depool a VM proxy, we'd use the wikireplicas DNS config...so yeah, docs!
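On the merge question above: the worry is the difference between a shallow hash merge (the override replaces the whole section key, so the depooled host disappears) and a deep merge (the old host would silently survive). A plain-Python illustration of the two behaviors, with illustrative host and IP values (this mimics the merge semantics in question, not puppet itself):

```python
import copy

# Default section config; the shape mirrors the hiera example above,
# but the hosts and IP addresses here are illustrative.
default = {
    "s4": {
        "clouddb1015.eqiad.wmnet": {"ipaddress": "10.64.37.27"},
        "clouddb1019.eqiad.wmnet": {"ipaddress": "10.64.48.9"},
    },
}
# Per-host override: only clouddb1019 is listed under s4.
override = {
    "s4": {
        "clouddb1019.eqiad.wmnet": {"ipaddress": "10.64.48.9"},
    },
}

def shallow_merge(base, over):
    """Top-level keys from `over` replace the base keys wholesale."""
    merged = dict(base)
    merged.update(over)
    return merged

def deep_merge(base, over):
    """Recursively merge nested dicts, keeping keys absent from `over`."""
    merged = copy.deepcopy(base)
    for key, value in over.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = deep_merge(merged[key], value)
        else:
            merged[key] = value
    return merged

# Shallow merge drops clouddb1015 from s4 -- the depool takes effect:
assert "clouddb1015.eqiad.wmnet" not in shallow_merge(default, override)["s4"]
# Deep merge would silently keep clouddb1015 -- the failure mode the
# depooled-key check guards against:
assert "clouddb1015.eqiad.wmnet" in deep_merge(default, override)["s4"]
```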

Thanks for the heads-up @Bstorm!
I have two questions at this point:

  1. Should I enable notifications on these hosts already or do you want me to wait a bit? Enabling them means that we will get IRC notifications and not doing so means that they will only show up on Icinga UI (I have it open at all times).

Well, they are kind of ready, aren't they? Obviously, for clouddb1019 it should depend on whether it's fully working. I think so.

Yes, they are, also clouddb1019, which was fixed last week. The question was more like: should we consider them fully production? :-)
Notifications have been enabled!

  1. What's the procedure to depool a given new replica (say: clouddb1015:XXXX) at the moment?

I didn't manage to write a doc on that today. I will try to get that out tomorrow. The short version, though, to specifically depool clouddb1015, is:

If it was ok to leave it as the secondary for dbproxy1018, then you'd add the following to hieradata/hosts/dbproxy1019, and this should do it by simply overwriting the s4 and s6 keys:

profile::mariadb::proxy::multiinstance_replicas::section_overrides:
  s4:
    clouddb1019.eqiad.wmnet:
      ipaddress: 10.64.48.9
  s6:
    clouddb1019.eqiad.wmnet:
      ipaddress: 10.64.48.9

I have an example in the labs/private repo because I wanted to test the whole mess there where I don't have puppetdb (in hieradata/role/common/mariadb/proxy/replicas.yaml).
It should also be possible to set weights, add a standby, and fully depool the host you are depooling on the other proxy so it doesn't get treated as a standby. While writing the docs, I'll make sure to test all of that while these are in the early-adopter stage.

The only thing that could make things look a little different is if the merge in puppet keeps the old value. I don't think it will without a deep merge, but I'll make sure while testing. If it mysteriously uses a deep merge, I added a check for a depooled key that will drop the host from the config, just in case.

Ah thank you. Probably in order to ease the operational side of things, maybe we can leave all those entries commented on both proxies, so if we need to depool stuff, we simply need to uncomment whatever we need to depool. Does that sound good to you?

Thank you
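A pre-staged, commented-out stanza of the kind suggested might look like this (reusing the hosts and IPs from the example above, which are themselves illustrative; only s4 is shown for brevity):

```
# In hieradata/hosts/dbproxy1019.yaml -- uncomment to depool clouddb1015
# from s4 on this proxy:
#profile::mariadb::proxy::multiinstance_replicas::section_overrides:
#  s4:
#    clouddb1019.eqiad.wmnet:
#      ipaddress: 10.64.48.9
```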

Change 661158 had a related patch set uploaded (by Bstorm; owner: Bstorm):
[operations/puppet@production] wikireplicas-proxy: tune the main haproxy config for databases

https://gerrit.wikimedia.org/r/661158

Change 661158 merged by Bstorm:
[operations/puppet@production] wikireplicas-proxy: tune the main haproxy config for databases

https://gerrit.wikimedia.org/r/661158

Ah thank you. Probably in order to ease the operational side of things, maybe we can leave all those entries commented on both proxies, so if we need to depool stuff, we simply need to uncomment whatever we need to depool. Does that sound good to you?

Thank you

I'll do that before getting the docs out, sure.

Change 661206 had a related patch set uploaded (by Bstorm; owner: Bstorm):
[operations/puppet@production] wikireplicas-proxy: add commented examples of depoolings for multiinstance

https://gerrit.wikimedia.org/r/661206

Once that is merged, it probably would make sense to test depooling one, honestly.

I've added some documentation here https://wikitech.wikimedia.org/wiki/Portal:Data_Services/Admin/Wiki_Replicas#Multi-instance_replicas

There is more to do on the documentation end, but I figure getting eyes on it will improve it. Also, is there any place you'd like me to update docs @Marostegui ?

I also should sync up with @aborrero and get a copy of the new network diagram in there. The only possible change is that I did add public floating IPs to the VM proxies in case they needed public egress. Those floating IPs are not used by wmcs-wikireplicas-dns; that uses the 16 internal ones. I am not sure if that changes the diagram.

Once that is merged, it probably would make sense to test depooling one, honestly.

+1. Let me know if you need help with it.

I've added some documentation here https://wikitech.wikimedia.org/wiki/Portal:Data_Services/Admin/Wiki_Replicas#Multi-instance_replicas

There is more to do on the documentation end, but I figure getting eyes on it will improve it. Also, is there any place you'd like me to update docs @Marostegui ?

Thanks (I made a small comment, just a reminder that haproxy needs to be reloaded for the depooling to take effect). I have linked our docs to that one, so with the doc you wrote and the comments in the dbproxyXXXX.yaml files it should be good enough, I think. We need quite a bit of cleanup on our internal documentation; I will take care of that.

I also should sync up with @aborrero and get a copy of the new network diagram in there. The only possible change is that I did add public floating IPs to the VM proxies in case they needed public egress. Those floating IPs are not used by wmcs-wikireplicas-dns; that uses the 16 internal ones. I am not sure if that changes the diagram.

I just made some edits to https://wikitech.wikimedia.org/wiki/Portal:Data_Services/Admin/Wiki_Replicas to include the diagram.

Given we aren't using floating IPs today (that's my understanding) I would leave them out of the diagram, to avoid potential confusion.

Given we aren't using floating IPs today (that's my understanding) I would leave them out of the diagram, to avoid potential confusion.

For the VM proxies we are using floating IPs as well for egress (not for ingress). I don't know that I need them. I created them when we were unsure how we were going to get the testing of the new replicas started. Since it gets proxied through LVS, I don't know if that changes the diagram at all.

Change 661206 merged by Bstorm:
[operations/puppet@production] wikireplicas-proxy: add commented examples of depoolings for multiinstance

https://gerrit.wikimedia.org/r/661206

Bstorm updated the task description. (Show Details)

The documentation looks pretty ok for a first pass. It will need updates. However, I think it is reasonably complete now.