
Determine IP ranges for dse-k8s cluster
Closed, Resolved · Public

Assigned To
Authored By
JArguello-WMF
Jun 8 2022, 2:32 PM

Event Timeline

The ML team bootstrapped the ml-serve cluster with the standard /23 and /24 subnets that the tutorial suggests, but we ended up re-evaluating the choice later on in T302701.

There are two things to keep in mind and reason about:

  • overall number of k8s svc IPs
  • overall number of pods

The downside of not getting these right is a cluster reinit, something that we have already done in T304673 to apply the new IP ranges. The kube-api control plane is responsible for allocating svc IP addresses from a given pool, and it cannot be instructed to use another one.
This means that if we run out of either svc or pod IPs, we will not be able to create new resources.

For the ML-serve cluster we opted for a /20 for svc IPs (~4k max) and a /21 for pod IPs (~2k max) due to how knative-serving works, namely assigning a new svc IP to every new revision (see https://github.com/knative/specs/blob/main/specs/serving/knative-api-specification-1.0.md#revision for more info). Since we'll use Kubeflow on the dse cluster (kserve is only the serving slice of it), we'll have to account for knative's eagerness to grab svc IPs as well. The use case is different from ML-serve, though: there we already keep a lot of model pods alive, whereas on DSE we'll mostly train models and hopefully not keep them alive for long (so likely fewer svc IP addresses / revisions needed, but I am not 100% sure).
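As a quick sanity check on those sizes, the capacity math can be sketched with Python's stdlib `ipaddress` module (the base addresses below are placeholders for illustration, not the real allocations):

```python
import ipaddress

# Hypothetical base prefixes, only to illustrate the capacity math:
# a /20 yields 4096 addresses, a /21 yields 2048.
svc = ipaddress.ip_network("10.0.0.0/20")   # service IP range (placeholder)
pod = ipaddress.ip_network("10.0.16.0/21")  # pod IP range (placeholder)

print(svc.num_addresses)  # 4096
print(pod.num_addresses)  # 2048
```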

We should do some research on the Data Engineering use cases to figure out how many IPs we'll need (rough figures, just to get an order of magnitude, in my opinion). In theory, if a /20 and a /21 (svc and pods) are ok, we should be able to get space easily from the current IP space that we are using for the ml-serve clusters:

https://netbox.wikimedia.org/ipam/prefixes/535/prefixes/

Now I see that we have already allocated space for the Train Wing cluster (the previous name of dse-k8s), so in this case we should be good. Getting more space may require some follow-up with neteng and serviceops to understand if we can get bigger ranges.

BTullis added subscribers: klausman, BTullis.

Thanks @elukey - that's really useful.

My gut feeling is that a /20 and /21 (svc and pods) will be fine, so I'd be tempted to proceed with the current allocation.

It's a little difficult to estimate the data engineering requirements at the moment, but if we think about a few likely candidates such as:

  • spark
  • jupyterhub
  • airflow

...my guess is that these will use a greater number of pod IPs than service IPs and may make extensive use of headless services.

So whilst we could, say, ask for the pod IP range to be stretched from a /21 to a /20, I can't make a strong case for doing so, at least based on the DE team requirements.
What do you think @klausman and @elukey - would you say it's worth asking for a larger pod range?

BTullis triaged this task as Medium priority. Jun 9 2022, 2:42 PM

Copied in from netbox.

image.png (421×1 px, 65 KB)

  • 10.67.24.0/21 - pod IPs
  • 10.67.32.0/20 - service IPs
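These two prefixes can be double-checked for size and non-overlap with a few lines of Python (stdlib `ipaddress`); this is just a sanity check, Netbox remains the source of truth:

```python
import ipaddress

pod_range = ipaddress.ip_network("10.67.24.0/21")  # pod IPs (eqiad)
svc_range = ipaddress.ip_network("10.67.32.0/20")  # service IPs (eqiad)

# A /21 holds 2048 addresses, a /20 holds 4096.
print(pod_range.num_addresses, svc_range.num_addresses)  # 2048 4096

# The two ranges must not overlap, or pod and service IPs would clash.
print(pod_range.overlaps(svc_range))  # False
```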

Tentatively moving this ticket to Done, based on the existing allocation of a /21 for pods and a /20 for services.

I think it is fine to proceed with the current allocation; worst case, we can re-init the cluster from scratch with new IP ranges (not nice, but doable in a couple of days of work).

I notice that we don't have any IPv6 ranges allocated for this yet, nor a specific ASN.

I'm planning to create them in netbox.

  • 2620:0:861:302::/64 - DSE K8S pod IPs (eqiad)
  • 2620:0:861:303::/64 - DSE K8S service IPs (eqiad)

image.png (310×1 px, 52 KB)

We also need an asNumber, which I'm planning to create as:

  • 64609 - Kubernetes DSE eqiad

image.png (509×1 px, 55 KB)

I think that this is OK, but I'd like to check with serviceops before committing the change to Netbox.

We already have the IPv4 ranges defined, but we might want to update their descriptions for clarity, now that responsibility for this cluster is shared between the ML and DE teams.

  • 10.67.24.0/21 - ML Team Kubernetes IP spaces - Train/DSE pod IPs (eqiad)
  • 10.67.32.0/20 - ML Team Kubernetes IP spaces - Train/DSE service IPs (eqiad)