
Re-IP Swift hosts to per-rack subnets in codfw row A and B.
Open, Medium, Public

Description

As part of the move from a per-row to a per-rack redundancy model, hosts in codfw rows A and B need to be reconfigured and moved to new per-rack VLANs/subnets. This work can be tackled once we have completed the physical move of all hosts in those rows from the old 'asw' switch devices to the new 'lsw' ones.

In discussion on IRC we touched on some of the challenges for these hosts, which, as I understand it, may use IP addresses as identifiers. We also need to consider how clusters function when hosts that were previously layer-2 adjacent end up on different subnets.

Creating this task so we can discuss the options, make plans and test the way forward.

Event Timeline

cmooney triaged this task as Medium priority. Jan 11 2024, 2:47 PM
cmooney created this task.

Swift uses the IP(v4) address (and then the device name) as the identifier for entries in its rings.

Additionally, when adding nodes to the ring, we use the IP address to tell where the node is located, and thus which "zone" it should be in (the zones are used to make sure each of the three replicas is in a different row) - see the find_ip_zone function.
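As a rough illustration of the idea (this is not the actual find_ip_zone code), deriving a zone from subnet membership could look like the sketch below. The subnets and zone numbers are made-up placeholders, and `zone_for_ip` is a hypothetical name.

```python
import ipaddress

# Hypothetical subnet-to-zone table; the real per-row or per-rack subnets differ.
ZONE_SUBNETS = {
    ipaddress.ip_network("10.0.0.0/24"): 1,
    ipaddress.ip_network("10.0.1.0/24"): 2,
    ipaddress.ip_network("10.0.2.0/24"): 3,
}


def zone_for_ip(ip):
    """Return the zone whose subnet contains ip, or raise ValueError."""
    addr = ipaddress.ip_address(ip)
    for network, zone in ZONE_SUBNETS.items():
        if addr in network:
            return zone
    raise ValueError(f"{ip} is not in any known subnet")
```

With per-rack subnets, a lookup of this kind would presumably need several subnets mapping to the same row-level zone before renumbered hosts could be added back.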

The safest approach would be to drain a node and remove it from the rings, then renumber it and add it again. But a drain takes 2-3 weeks (we do it gradually to avoid overload), and reloading the node afterwards takes about the same time again.

In theory swift-ring-builder has a set_info command with a --change-ip argument, so one could change every device on a node in the rings, renumber it and push out the new rings. We'd need to write some tooling to do this, and I've no idea how safe such an operation is.
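To make the tooling idea concrete, here is a minimal sketch that only builds the command lines and does not run anything. It assumes the long-option form of `set_info` (`--ip` and `--device` to select the entry, `--change-ip` to set the new address); those flags should be checked against the installed Swift version. The builder file name, addresses and device names are hypothetical, and `change_ip_commands` is an illustrative helper, not existing tooling.

```python
import shlex


def change_ip_commands(builder_file, old_ip, new_ip, devices):
    """Build one swift-ring-builder set_info command per device on a node."""
    commands = []
    for device in devices:
        argv = [
            "swift-ring-builder", builder_file, "set_info",
            "--ip", old_ip, "--device", device,  # select the existing entry
            "--change-ip", new_ip,               # the new address to record
        ]
        commands.append(shlex.join(argv))
    return commands


# Hypothetical node with two devices; print the commands for review only.
for cmd in change_ip_commands("object.builder", "10.0.0.5", "10.0.8.5", ["sda3", "sdb3"]):
    print(cmd)
```

A real tool would have to repeat this for every ring the node appears in, and would probably need to handle the replication IP as well; neither is covered here.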

In either approach there are extra constraints. We'd not want too many nodes "in flight" at once, because Swift will try to backfill to make up for missing/down devices, and we need to avoid overloading the rest of the cluster (in terms of load or capacity). You also have to wait 12 hours between changes to the rings.
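The two constraints can be sketched as a simple planner that batches nodes and spaces the ring changes out. This is only a planning aid under stated assumptions: `plan_ring_changes`, the `max_in_flight` default and the node names are hypothetical, and the only figure taken from above is the 12-hour minimum gap.

```python
from datetime import datetime, timedelta

MIN_RING_GAP = timedelta(hours=12)  # minimum wait between ring changes


def plan_ring_changes(nodes, start, max_in_flight=1, gap=MIN_RING_GAP):
    """Return a list of (time, batch) pairs, one ring change per batch."""
    if max_in_flight < 1:
        raise ValueError("max_in_flight must be at least 1")
    if gap < MIN_RING_GAP:
        raise ValueError("ring changes must be at least 12 hours apart")
    plan = []
    when = start
    for i in range(0, len(nodes), max_in_flight):
        plan.append((when, nodes[i:i + max_in_flight]))
        when += gap
    return plan


# Hypothetical node names, two nodes per ring change.
for when, batch in plan_ring_changes(
    ["node-a", "node-b", "node-c"], datetime(2024, 1, 15, 9, 0), max_in_flight=2
):
    print(when.isoformat(), batch)
```

This enforces only the spacing and the batch size. In practice the gap would likely be set by how long the cluster takes to settle after each change, which could be much longer than 12 hours.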

Sorry, I think object stores are often not really written with renumbering in mind...