
Three small ganeti VMs to host haproxy for OpenStack endpoints
Closed, ResolvedPublic

Description

They aren't going to do a whole lot, so 2/2/20 should do the trick.

Event Timeline

Andrew created this task.Jul 1 2019, 10:16 PM
Restricted Application added a project: Operations.Jul 1 2019, 10:16 PM

btw, I'm happy to actually set up the VMs; I'm only assigning this to Alex to approve the resource usage.

Andrew updated the task description.Jul 1 2019, 10:22 PM

I feel that 1 CPU might be too limiting; haproxy is multi-threaded and we'll have a number of backends defined.

2/2/20 would be a good place to start if that's an option.
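For a rough sense of what these VMs would run, here is a minimal haproxy.cfg sketch; the backend hostnames and the Keystone API port are illustrative placeholders, not the real endpoints, and `nbthread` is what lets haproxy make use of the second vCPU:

```
# Minimal sketch only; backend hosts and the frontend port are hypothetical.
global
    nbthread 2                 # one thread per vCPU of the proposed 2/2/20 size

defaults
    mode http
    timeout connect 5s
    timeout client  30s
    timeout server  30s

frontend keystone
    bind *:5000
    default_backend keystone_api

backend keystone_api
    balance roundrobin
    server ctrl1 controller1.example.org:5000 check
    server ctrl2 controller2.example.org:5000 check
```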

Sounds fine to me. Please use row_A in eqiad for this as it has more resources available. Also, I guess all three VMs will have to go on the same row anyway due to the requirement that all 3 nodes share the network.

Please use the cumin cookbook for the creation (see T203963); it was just introduced, so this will give it some more users.

Andrew added a comment.Jul 2 2019, 3:28 PM

> Sounds fine to me. Please use row_A in eqiad for this as it has more resources available. Also, I guess all three VMs will have to go on the same row anyway due to the requirement that all 3 nodes share the network.

I was imagining that we'd put one in each of the three rows, since HA is the whole point and I don't want to just move the spof from the existing API endpoint to a ganeti server. I don't think there are network concerns since these are all going to serve public IPs.

Andrew added a comment.Jul 2 2019, 5:01 PM

(Let's use Buster for this if it's available on ganeti)

> Sounds fine to me. Please use row_A in eqiad for this as it has more resources available. Also, I guess all three VMs will have to go on the same row anyway due to the requirement that all 3 nodes share the network.

> I was imagining that we'd put one in each of the three rows, since HA is the whole point and I don't want to just move the spof from the existing API endpoint to a ganeti server. I don't think there are network concerns since these are all going to serve public IPs.

How is corosync/pacemaker going to work then with a single VIP?

> (Let's use Buster for this if it's available on ganeti)

It is, same as the rest of the fleet.

> How is corosync/pacemaker going to work then with a single VIP?

I may be missing something, but we have a range of service IPs that we can map anywhere in eqiad, don't we?

> How is corosync/pacemaker going to work then with a single VIP?

> I may be missing something, but we have a range of service IPs that we can map anywhere in eqiad, don't we?

Sorry, I wasn't clear (note to self: Don't reply to phab tasks at 11pm after a long day). Let me try again.

Per T223907, the chosen approach for providing HA over the 3 haproxy nodes is pacemaker. Corosync is actually an implementation detail that provides cluster management services (communication, membership, quorum) and could easily be exchanged for heartbeat.

Pacemaker is the resource manager, and as such the software that will be making sure that the resource the clients should be using is failed over correctly. I am assuming here (please correct me if I am wrong) that the shared resource is the Virtual IP (aka VIP) I was referring to (I am not calling it a Service IP on purpose in this discussion, to differentiate it from the LVS Service IPs). For that VIP to be shared among the nodes and be successfully migrated from one to the other, they all need to share the same link-layer network (i.e. the same Ethernet segment; in this specific case, the VLAN).

Given that our VLANs span a single rack row, it is not possible to have the 3 nodes in different rack rows, because which of the 3 VLANs would the VIP be in then?

The range of service IPs you refer to is there for LVS, where the LVS nodes act as routers and are specifically cabled in the datacenter to connect to all rack rows, so it is not applicable here.
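To make the constraint concrete, a minimal sketch of how such a VIP could be defined in pacemaker (crm shell syntax; the resource name is made up and 192.0.2.10 is a documentation-range placeholder, not one of our addresses):

```
# Hypothetical VIP resource; the real address and netmask would come from
# whichever eqiad row VLAN the three VMs end up in.
crm configure primitive openstack_vip ocf:heartbeat:IPaddr2 \
    params ip=192.0.2.10 cidr_netmask=32 \
    op monitor interval=10s
```

IPaddr2 moves the address between nodes by assigning it to a local interface and sending gratuitous ARP, which only works within one broadcast domain, hence the same-VLAN requirement described above.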

> Per T223907, the chosen approach for providing HA over the 3 haproxy nodes is pacemaker. Corosync is actually an implementation detail that provides cluster management services (communication, membership, quorum) and could easily be exchanged for heartbeat.

We're still putting the HA architecture together. In that task, I used pacemaker/corosync as an example of what's popular in the OpenStack community to manage HA resources. Keepalived or pacemaker/heartbeat are still options that we could use.
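For comparison, a keepalived sketch of the same idea (the interface, router id and address are placeholders); note that VRRP advertisements are link-local, so this option carries the same same-VLAN constraint as pacemaker's IPaddr2:

```
# Hypothetical keepalived configuration; all values are placeholders.
vrrp_instance openstack_vip {
    state BACKUP              # all nodes start as BACKUP; priority elects the master
    interface eth0
    virtual_router_id 51      # must be unique per VLAN
    priority 100
    advert_int 1
    virtual_ipaddress {
        192.0.2.10/32         # same documentation-range placeholder as above
    }
}
```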

> Pacemaker is the resource manager, and as such the software that will be making sure that the resource the clients should be using is failed over correctly. I am assuming here (please correct me if I am wrong) that the shared resource is the Virtual IP (aka VIP) I was referring to (I am not calling it a Service IP on purpose in this discussion, to differentiate it from the LVS Service IPs). For that VIP to be shared among the nodes and be successfully migrated from one to the other, they all need to share the same link-layer network (i.e. the same Ethernet segment; in this specific case, the VLAN).
> Given that our VLANs span a single rack row, it is not possible to have the 3 nodes in different rack rows, because which of the 3 VLANs would the VIP be in then?

You are correct. We'll need the Virtual IP available on all 3 nodes, requiring all hosts to be in the same row.

Let's hold off on these VMs until we have a better understanding of the HA architecture and some automation in place.

akosiaris changed the task status from Open to Stalled.Jul 3 2019, 1:47 PM

> Let's hold off on these VMs until we have a better understanding of the HA architecture and some automation in place.

Fine by me. Setting to Stalled then.

Some questions I have: Do we have a single ganeti hypervisor in each row? Can we set affinity/pinning of VMs to hypervisors in ganeti? For what value of N could we deploy N virtual machines on N different ganeti hypervisors in the same DC row?

> Some questions I have: Do we have a single ganeti hypervisor in each row? Can we set affinity/pinning of VMs to hypervisors in ganeti? For what value of N could we deploy N virtual machines on N different ganeti hypervisors in the same DC row?

From the context I understand you are referring to hardware machines, not https://en.wikipedia.org/wiki/Hypervisor; disregard the answer below if my understanding is wrong.

We currently have 2 sets of 4 or 6 hardware machines, depending on DC and row. For now we are limited to 2 rows, but this is to be fixed soon, as we have already procured (and in codfw's case provisioned as well, although not yet put into service) the hardware required for expanding to a 3rd row. VMs cannot migrate across rows, of course. So N is currently in the set [4, 5, 6].

We can pin a VM to a specific hardware machine by giving it a non-migratable storage type. That does mean that when the node goes down (for whatever reason), the VM will also not be available. Depending on the service provided by that node, that may very well be fine. An example is etcd, a distributed data store that can survive a node not being available.

We can also inform the allocator that it should not place VMs powering the same service on the same hardware machines, by tagging them specifically. During rebalancing of the cluster the allocator does its best to honor that (if the cluster is overly packed it can't really help much, of course).
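A rough sketch of both mechanisms using gnt-* commands (the instance and node names, and the `service` exclusion-tag prefix, are hypothetical; the exact tag syntax should be checked against the htools/hbal documentation):

```
# Pin a VM to one hardware node by giving it the non-migratable "plain" disk template.
gnt-instance add -t plain -o debootstrap+default -s 20G \
    -n ganeti1001.example.org vm-haproxy1.example.org

# Declare "service" as an exclusion-tag prefix, then tag the VMs so the allocator
# avoids co-locating instances of the same service during rebalancing.
gnt-cluster add-tags htools:iextags:service
gnt-instance add-tags vm-haproxy1.example.org service:haproxy
```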

OK, thanks. So I think we could have 2 or 3 VMs in the same row, but running on different hardware, which more or less serves our goal for availability from the SPOF point of view.

JHedden closed this task as Resolved.Aug 16 2019, 3:03 PM

For this phase we're going to install haproxy directly on the OpenStack controllers, so we will not be needing these VMs. Thank you for all the information, it was very helpful.