
Avoid using codfw expansion cage for non-IPIP LVS-fronted services
Closed, Resolved (Public)

Description

The new cage in codfw cannot provide direct layer-2 adjacency to our LVS servers, so the only LVS-fronted services we can host there are those that use IPIP encapsulation.

I've edited this description based on feedback to list out the hosts we should avoid putting there for now.

Background

The new switches for {T380240} were ordered without the VXLAN license for Juniper, and the cage will be a hybrid of Juniper/Nokia (due to needing the racks online sooner than we would be able to support a full Nokia setup). This rules out using VXLAN for layer-2 extension in the new cage.

That decision was made after discussions at the SRE Summit, which suggested that by the time the racks went live no services would still depend on layer-2 connectivity from the LVS load-balancers. We are not quite there yet, so on day one we need to avoid placing some hosts in the new location.

Hostname List

The main hosts we need to avoid racking in the new cage for now are the Kubernetes hosts (work to move them to IPIP is tracked in T352956). Additionally some search hosts should be avoided (see T373020).

The full list of hostname prefixes we cannot rack in the new cage right away is:

aux-k8s-ctrl
aux-k8s-worker
cirrussearch
dse-k8s-ctrl
dse-k8s-worker
elastic
kubestage
kubestagemaster
ml-serve
ml-serve-ctrl
ml-staging
ml-staging-ctrl
wikikube-ctrl
wikikube-worker
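
For cross-referencing at racking time, the prefix list above could be checked mechanically. The following is a minimal sketch, not an existing tool: the function name `blocked_from_new_cage` is hypothetical, and it assumes a hostname is one of the prefixes followed directly by a numeric suffix, optionally with a domain part.

```python
import re

# Hostname prefixes copied from the list above.
BLOCKED_PREFIXES = [
    "aux-k8s-ctrl", "aux-k8s-worker", "cirrussearch",
    "dse-k8s-ctrl", "dse-k8s-worker", "elastic",
    "kubestage", "kubestagemaster",
    "ml-serve", "ml-serve-ctrl", "ml-staging", "ml-staging-ctrl",
    "wikikube-ctrl", "wikikube-worker",
]

# Assumption: a hostname is "<prefix><digits>", e.g. "elastic2001".
# Requiring digits straight after the prefix stops "elastic" from
# matching an unrelated name such as "elasticfoo2001".
_BLOCKED_RE = re.compile(
    r"^(?:%s)\d+$" % "|".join(re.escape(p) for p in BLOCKED_PREFIXES)
)


def blocked_from_new_cage(hostname: str) -> bool:
    """Return True if the host should not be racked in the new cage yet."""
    # Drop any domain part and normalise case before matching.
    short_name = hostname.split(".", 1)[0].lower()
    return _BLOCKED_RE.match(short_name) is not None
```

The hostnames passed in are up to the caller; this only covers the prefix list and does not capture other constraints on the new racks.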

Event Timeline

cmooney renamed this task from Migrate remaining LVS-backed services to IPIP to Avoid using codfw expansion cage for LVS-backed services that are not on IPIP. May 14 2025, 10:40 AM
cmooney updated the task description.

Services present in service.yaml with an lvs configuration but no ipip_configuration entry require L2 adjacency.
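
As a rough illustration of that rule, the check could look like the sketch below. It works on an already-parsed mapping of service name to settings; the exact layout of service.yaml is an assumption here (in particular whether `ipip_configuration` sits beside or inside the `lvs` block, so both are checked), and the function name and sample services are hypothetical.

```python
def services_needing_l2(catalog: dict) -> list[str]:
    """List services that have LVS config but no ipip_configuration entry."""
    needing = []
    for name, conf in catalog.items():
        lvs = conf.get("lvs")
        if not lvs:
            # Not behind LVS at all, so the rule does not apply.
            continue
        # Assumption: the entry may live at service level or under "lvs".
        in_service = "ipip_configuration" in conf
        in_lvs = isinstance(lvs, dict) and "ipip_configuration" in lvs
        if not (in_service or in_lvs):
            needing.append(name)
    return sorted(needing)


# Hypothetical sample data, not taken from the real service.yaml.
sample = {
    "svc-a": {"lvs": {"class": "low-traffic"}},
    "svc-b": {"lvs": {"class": "low-traffic"}, "ipip_configuration": {}},
    "svc-c": {},
}
print(services_needing_l2(sample))  # prints ['svc-a']
```

Mapping the resulting services back to hostnames is a separate step that this sketch does not attempt.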

cmooney renamed this task from Avoid using codfw expansion cage for LVS-backed services that are not on IPIP to Avoid using codfw expansion cage for non-IPIP LVS services. May 14 2025, 10:56 AM
cmooney added a parent task: Unknown Object (Task).
cmooney updated the task description.
cmooney renamed this task from Avoid using codfw expansion cage for non-IPIP LVS services to Avoid using codfw expansion cage for non-IPIP LVS-fronted services. May 14 2025, 10:58 AM

Thanks, that amounts to all these then: P76141

Still unsure how to work back from that to a list of hostnames for dc-ops but we'll find a way.

@ayounsi steered me the right way here. I believe these are the host types we want to avoid racking in the new cage for now, just the K8s ones and cirrussearch:

aux-k8s-ctrl
aux-k8s-worker
cirrussearch
dse-k8s-ctrl
dse-k8s-worker
elastic
kubestage
kubestagemaster
ml-serve
ml-serve-ctrl
ml-staging
ml-staging-ctrl
wikikube-ctrl
wikikube-worker

cmooney triaged this task as Medium priority. May 14 2025, 11:44 AM
cmooney updated the task description.
cmooney updated the task description.

@Papaul @Jhancock.wm FYI. Is there a good way to save this list somewhere so DC-ops can cross-reference it? Or are you happy to refer back to this task?

I think I'm gonna make a physical list and post it somewhere in the DH5 for my personal reference. I will otherwise forget this is a thing. Thanks!

Ok @Jhancock.wm thanks. The other thing to put on that list is that no hosts that need to be on public vlans can go in those racks. Thanks.

cmooney claimed this task.

Gonna close this one, I trust you guys have the heads up on what can't go there just yet.

cmooney mentioned this in Unknown Object (Task). Nov 6 2025, 5:16 PM