cloud: prepare codfw for expansion (racks, switches, ceph)
Open, Needs Triage, Public

Description

As part of the project in T342750 (cloud: introduce a kubernetes undercloud to run openstack (via openstack-helm)), we will most likely do a build-out of openstack-in-kubernetes in the codfw datacenter.
However, as of this writing, we only have one rack for WMCS in codfw.

We need to expand our WMCS rack, switch and ceph capacity in codfw before we can do any meaningful deployment there.

At the very least, we need 3 racks in total, each in a different row and each with one cloudsw device.
Given that we already have one (codfw B1, https://netbox.wikimedia.org/dcim/racks/51/), this means we would need 2 more racks and 2 more cloudsw devices.

The base reason is that we need redundancy at switch/rack/row level, meaning that we should not connect all the services to the same switch/rack/row.
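
The requirement above can be sketched as a small check. This is only an illustration, not an existing tool: the `Rack` type, the `distinct_rows_with_cloudsw` and `meets_minimum` helpers, and the racks "C1" and "D1" are all hypothetical. Only codfw B1 comes from this task.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Rack:
    name: str     # rack name, e.g. "B1"
    row: str      # datacenter row the rack sits in, e.g. "B"
    cloudsw: int  # number of cloudsw devices installed in the rack


def distinct_rows_with_cloudsw(racks):
    """Return the set of rows that have at least one rack with a cloudsw."""
    return {r.row for r in racks if r.cloudsw >= 1}


def meets_minimum(racks, min_rows=3):
    """True if cloudsw-equipped racks cover at least min_rows distinct rows."""
    return len(distinct_rows_with_cloudsw(racks)) >= min_rows


# Current state: a single WMCS rack in codfw.
current = [Rack("B1", "B", 1)]

# Hypothetical target state: two more racks in two other rows (names made up).
planned = current + [Rack("C1", "C", 1), Rack("D1", "D", 1)]

# Two racks in the same row do not add row-level redundancy.
same_row = current + [Rack("B2", "B", 1), Rack("B3", "B", 1)]

# Rows still to be covered before the minimum is reached.
missing_rows = 3 - len(distinct_rows_with_cloudsw(current))
```

Counting distinct rows rather than racks is what encodes the row-level redundancy: adding racks to a row that is already covered does not bring the layout any closer to the minimum.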

Regarding ceph storage, I'm guessing we would start small and expand as required later.

Event Timeline

aborrero renamed this task from "cloud: prepare codfw for expansion (racks, switches)" to "cloud: prepare codfw for expansion (racks, switches, ceph)". Tue, Sep 19, 8:53 AM
aborrero updated the task description.

@cmooney I raised a similar question about expanding WMCS racks in eqiad and, as I understood it, the answer was that the newer rows in eqiad have ToR switches compatible with cloud use cases. Can we expect something similar in codfw? In other words, are there 1 or 2 more racks we could set aside as WMCS racks in codfw without ordering cloudsw devices?

And, if we do move forward with pushing VXLAN responsibilities to the hosts, would that allow any existing switches in codfw to work as WMCS racks?

@cmooney You predicted this question and answered it already here :-) https://phabricator.wikimedia.org/T346724#9206025 <3 Thank you!