
cloud-private: figure out, implement and test cross-DC traffic
Closed, Resolved (Public)

Description

In the new cloud-private network layout, we need traffic crossing the DC boundaries. This is primarily for data backup purposes, but it may have other usages.

I think @cmooney had some ideas on how to implement that: https://wikitech.wikimedia.org/wiki/Wikimedia_Cloud_Services_team/EnhancementProposals/Iteration_on_network_isolation#How_to_connect_cross-DC, which involves GRE tunnels and BGP.

Event Timeline

A couple questions:
Presumably this would also allow for cross-DC VMs (an eqiad control plane managing a VM in codfw, for example?) and even VM migrations across DCs?
Does this help with accessing wikireplicas from a cloud offering within codfw?

  • cross-DC VMs (an eqiad control plane managing a VM in codfw): yes. This is possible today on a technical level (but we don't implement it). In my opinion we would first need to assess the business need for it.
  • VM migrations across DCs: this is mostly a ceph replication topic. In my opinion it is a different set of challenges, and it likewise needs an assessment of the business need.
  • Wikireplicas from codfw: yes. This is also possible today on a technical level (but we don't implement it). It would be slow without local DBs in codfw, though.

In terms of IP connectivity between sites, the plan/design - with CloudGW acting as gateway between sites using GRE tunneling over the WMF production network - is fit for purpose. It can be implemented with some puppet work / changes to the Bird module, plus some Linux networking additions on the CloudGW. Unfortunately I will probably not have a huge amount of time available to look at this in the coming quarter, but I will do what I can. I've labbed it up previously and can demo the setup if that might help.
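
To illustrate the Linux networking piece, a minimal sketch of what the tunnel setup on a cloudgw could look like with plain iproute2. All interface names and addresses here are hypothetical placeholders (RFC 5737 documentation ranges), and in practice this would be driven by the puppet/Bird module work mentioned above, not done by hand:

    # Create a GRE tunnel to the remote site's cloudgw, carried over
    # the WMF production network (endpoint addresses are placeholders)
    ip tunnel add gre-codfw mode gre local 192.0.2.10 remote 198.51.100.10 ttl 64
    ip link set gre-codfw up

    # Address the inside of the tunnel so a BGP session can be run
    # across it (this /30 is an arbitrary example)
    ip addr add 203.0.113.1/30 dev gre-codfw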

The proposal would only allow for IP routing between sites. There is no scenario in which physical servers in codfw could be placed on Vlan1105 (cloud-instances2-b-eqiad), nor servers in eqiad placed on Vlan2105 (cloud-instances2-b-codfw). The same goes for cloud-private and any other vlans in use at each site: vlans will always be site-specific.

The connectivity will allow a host on a cloud-private vlan in eqiad to connect to a host on the cloud-private vlan in codfw, and vice versa. Once the CloudGW is updated this should be fairly seamless; no changes will be needed on other hosts, as the 172.20.0.0/16 aggregate route they already have covers all the cloud-private ranges. Once the CloudGW tunnels are established and announced in BGP, traffic for any remote ranges within this aggregate should flow automatically.
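
As a hypothetical sanity check once the tunnels and BGP sessions are up (the prefix and address below are made-up examples, not the real codfw ranges):

    # On the cloudgw: check the BGP sessions and confirm a remote
    # cloud-private prefix was learned over the tunnel
    birdc show protocols
    birdc show route for 172.20.16.0/24

    # On any cloud-private host in eqiad: the existing 172.20.0.0/16
    # aggregate route already covers a codfw cloud-private address
    ip route get 172.20.16.5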

In terms of what use might be made of this inter-site connectivity, or how any particular OpenStack function might operate across sites, I'll need to leave that to the WMCS team. Obviously happy to assist and answer any questions!

For the rest of this comment, I'm assuming that as of today the primary (and only) use case for cross-DC traffic is cinder-backups (which back up ceph storage data to the other DC).

I have some questions about the implementation details.

In particular, if we establish the GRE tunnels using cloudgw, all cross-DC traffic goes through these servers, which implies:

  • mixing 'undercloud' traffic with 'cloud customer' traffic over the same links
  • higher network load on cloudgw
  • introducing cloudgw in the cinder-backups dependency chain. If cloudgw is down for whatever reason (failure, maintenance), the cross-DC link would be down and we would be unable to do backups or restores

These concerns make me feel like we should re-evaluate the implementation, and perhaps look again at doing the GRE/BGP on the cloudsw devices instead?

Moreover, I'll talk to my team to see if we can follow a completely different approach for doing ceph backups, hopefully eliminating the need for this cross-DC traffic.

I had a conversation yesterday with @cmooney and he agreed on the points above.

At least two potential ways forward:

  • stop doing cinder-backups as we do today, and move to a more SRE-supported backup solution.
  • keep the same approach as today, but let this traffic use the wikiland production links (instead of cloud-private, the tunnels and so on).

We are exploring both options mentioned in the previous comment at the same time.

In any case, we will not be introducing cloud-private for the cross-DC traffic.

fnegri changed the task status from Declined to Resolved. Jul 26 2023, 4:11 PM
fnegri moved this task from Backlog to Done on the cloud-services-team (FY2022/2023-Q4) board.