Cloud IPv6 subnets
Open, Medium, Public

Assigned To

None

Authored By

	ayounsi
	Feb 21 2018, 7:21 PM

Description

Follow up from T184209

Looking at codfw (but it's similar in eqiad)

We currently use the following IPv6 for the ~~labs~~ cloud ranges:
ae2.2122 - labs-support1-b-codfw - 2620:0:860:122::/64
ae2.2118 - labs-hosts1-b-codfw - 2620:0:860:118::/64
ae2.2120 - labs-instance-transport1-b-codfw - 2620:0:860:120::/64

The reason was probably including part of the vlan ID in the IP.
But this falls into the larger subnet 2620:0:860:100::/56 - codfw private

It's not an issue right now, especially as cloud doesn't use much IPv6, but might be an issue in the future.
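The overlap described above can be confirmed with Python's standard `ipaddress` module. This is only a minimal sketch: the prefixes are the ones quoted in this task, while the variable names are illustrative.

```python
import ipaddress

# codfw private aggregate and the three cloud (labs) /64s quoted in the task
codfw_private = ipaddress.ip_network("2620:0:860:100::/56")
cloud_subnets = {
    "labs-support1-b-codfw": ipaddress.ip_network("2620:0:860:122::/64"),
    "labs-hosts1-b-codfw": ipaddress.ip_network("2620:0:860:118::/64"),
    "labs-instance-transport1-b-codfw": ipaddress.ip_network("2620:0:860:120::/64"),
}

# True for an entry means that cloud range sits inside the production private /56
overlaps = {name: net.subnet_of(codfw_private) for name, net in cloud_subnets.items()}
for name, inside in overlaps.items():
    print(f"{name}: inside codfw private /56 = {inside}")
```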

I see 2 options:
1/ use a different /56
For example:

2620:0:860:200::/56  - labs-codfw
2620:0:861:200::/56  - labs-eqiad

2/ use dedicated /48s

2a02:ec80:0::/44 - labs (16 * /48) (can be shrunk to a /45)
    2a02:ec80:0::/48 - labs eqiad
        XXXX
    2a02:ec80:1::/48 - labs codfw
        2a02:ec80:1:2122::/64 - 2122 - labs-support1-b-codfw  (84A)
        2a02:ec80:1:2118::/64 - 2118 - labs-hosts1-b-codfw  (846)
        2a02:ec80:1:2120::/64 - 2120 - labs-instance-transport1-b-codfw  (848)

Having the vlanID in decimal in the IP makes it easier to understand, but we can also use the hex value (2122->84A) so it's more accurate.
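To make the decimal-versus-hex trade-off concrete, here is a small sketch of both encodings. The helper name `vlan_prefix` is hypothetical, and it assumes the VLAN ID lands in the fourth hextet of a /48, as in the example above.

```python
import ipaddress

def vlan_prefix(site_prefix: str, vlan_id: int, as_hex: bool) -> ipaddress.IPv6Network:
    """Build a /64 under a /48 with the VLAN ID placed in the fourth hextet."""
    site = ipaddress.IPv6Network(site_prefix)
    if site.prefixlen != 48:
        raise ValueError("expected a /48 site prefix")
    # as_hex=False: the decimal digits are written literally (2122 -> 0x2122)
    # as_hex=True:  the numeric value is encoded (2122 -> 0x84a)
    hextet = vlan_id if as_hex else int(str(vlan_id), 16)
    # The fourth hextet sits directly above the low 64 bits of the address
    base = int(site.network_address) + (hextet << 64)
    return ipaddress.IPv6Network((base, 64))

readable = vlan_prefix("2a02:ec80:1::/48", 2122, as_hex=False)
accurate = vlan_prefix("2a02:ec80:1::/48", 2122, as_hex=True)
print(readable, accurate)
```

The decimal form is easier to match against a VLAN list by eye, while the hex form keeps the hextet numerically equal to the VLAN ID.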

1/ is more of a short term solution while 2/ will require more work (advertise new /48s to the world) but is the most sustainable option.

Related Objects

Status	Assigned	Task
Open	None	T53494 Use Beta cluster as a true canary for code deployments (epic)
Open	None	T87220 Minimize infrastructure differences between Beta Cluster and production
Stalled	None	T211677 Support IPv6 in beta
Resolved	None	T209460 CloudVPS: network architecture
Resolved	None	T244727 CloudVPS: networking improvements
Stalled	None	T211575 Enable IPv6 on toolforge.org
Open	None	T220306 Add IPv6 monitoring
Open	None	T37947 Enable IPv6 on CloudVPS
Open	None	T245495 CloudVPS: IPv6 early PoC
Open	None	T187929 Cloud IPv6 subnets

Event Timeline

ayounsi triaged this task as Medium priority. Feb 21 2018, 7:21 PM

ayounsi created this task.

Restricted Application added a project: SRE. Feb 21 2018, 7:21 PM

Restricted Application added a subscriber: Aklapper.

chasemp mentioned this in T187933: Labs to Cloud renaming for networking equipment. Feb 21 2018, 7:41 PM

ayounsi mentioned this in T193496: Allocate public v4 IPs for Neutron setup in eqiad. May 24 2018, 11:27 AM

Krenair subscribed. Sep 19 2018, 8:52 PM

ayounsi mentioned this in T122406: Consider renumbering Labs to separate address spaces. Oct 22 2018, 2:47 PM

ayounsi mentioned this in T207663: Renumber cloud-instance-transport1-b-eqiad to public IPs. Oct 22 2018, 6:55 PM

Paladox subscribed. Oct 22 2018, 8:30 PM

Phabricator_maintenance moved this task from Backlog to Acknowledged on the SRE board. Jan 26 2019, 9:48 PM

aborrero added a parent task: T245495: CloudVPS: IPv6 early PoC. Mar 19 2020, 2:55 PM

I like option 2) the most. Are those ranges actual data?

Regarding coding the vlan id: I don't think we should do it. We might eventually move away from the prod VLAN thing, or have addresses where the VLAN part is meaningless (think of virtual networking inside the cloud itself, like VMs or virtual routers).

With all this in mind, I suggest we use this addressing plan:

2a02:ec80:0::/44 - cloud (16 * /48)
    2a02:ec80:0::/48 - cloud eqiad1 (16 * /64)
        2a02:ec80:0:0::/64 - cloud-physical-eqiad1 -- includes physical transport networks (may have more than one), physical servers, virtual IP addresses for physical servers, and whatever we may need that correspond to physical hardware
            2a02:ec80:0:0:0::/80 - cloud-upstream1-eqiad1 -- physical connectivity between our external physical router and the prod core routers (example of thing that might happen sooner than later)
            2a02:ec80:0:0:1::/80 - cloud-transport1-eqiad1 -- physical connectivity between neutron and our external physical routers
            2a02:ec80:0:0:2::/80 - cloud-hosts1-eqiad1 -- physical connectivity for servers and supporting services, a subnet connected to our external physical router
        2a02:ec80:0:1::/64 - cloud-virtual-eqiad1  -- everything from neutron virtual routers to VMs, including virtual addresses inside openstack and other virtual services.
    2a02:ec80:1::/48 - cloud codfw1dev
        2a02:ec80:1:0::/64 - cloud-physical-codfw1dev -- (see eqiad1 equivalent)
        2a02:ec80:1:1::/64 - cloud-virtual-codfw1dev  -- (see eqiad1 equivalent)

ayounsi mentioned this in T245495: CloudVPS: IPv6 early PoC. Mar 19 2020, 5:31 PM

I agree that option 2 is the way to go.

The complication is how to subnet them properly for both the short term (T245495 PoC) and the longer term, while keeping in mind the v6 subnetting convention (e.g. nothing smaller than a /64). I couldn't find much documentation on subnetting recommendations in my brief research.

For example we take eqiad's:

2a02:ec80:0::/48
    2a02:ec80:0::/49
        2a02:ec80::/56 - infrastructure and support networks (gives 256*/64)
        2a02:ec80:0:100::/56 - virtual networks (gives 256*/64)
            2a02:ec80:0:100::/64 - eg VMs flat network (similar to the 172.16.0.0/21 network)
    2a02:ec80:0:8000::/49 - reserved for future use

Which is very similar to your proposal, but with different mask lengths.

For now the first one would not be used (afaik) but will be if we move to a model where the whole cloud infra is behind its dedicated gear.
https://wikitech.wikimedia.org/wiki/Wikimedia_Cloud_Services_team/EnhancementProposals/Network_refresh#intermediate_router/firewall
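As a sanity check of the eqiad hierarchy in this comment, the nesting can be verified with Python's `ipaddress` module. This is only a sketch: the prefixes are the ones listed in the comment, and the variable names are illustrative.

```python
import ipaddress

site = ipaddress.ip_network("2a02:ec80:0::/48")
# Split the /48 into its two /49 halves (the upper half is "reserved for future use")
lower_half, upper_half = site.subnets(prefixlen_diff=1)

infra = ipaddress.ip_network("2a02:ec80::/56")          # infrastructure and support
virtual = ipaddress.ip_network("2a02:ec80:0:100::/56")  # virtual networks
vm_flat = ipaddress.ip_network("2a02:ec80:0:100::/64")  # e.g. VMs flat network

print(lower_half, upper_half)
print(infra.subnet_of(lower_half), virtual.subnet_of(lower_half))
print(vm_flat.subnet_of(virtual))

# Number of /64s available in each /56
slash64_per_56 = 2 ** (64 - infra.prefixlen)
print(slash64_per_56)
```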

Re-assigning to @faidon for approval as we're talking about long-term design and a lot of IPs (see also T245495 for context)

AfroThundr3007730 subscribed. Apr 16 2020, 7:17 PM

mdaniels5757 subscribed. Apr 20 2020, 4:23 PM

taavi subscribed. Mar 22 2021, 7:14 AM

cmooney subscribed. Jun 3 2021, 1:03 PM

I agree on option 2 above that it makes sense to assign a /48 for cloud services at each site. Some people these days are assigning a /64 per-VM so we should provide space to cater for potential future cases such as that.

Ideally, we'd be able to assign the new cloud /48s from aggregates already announced at each site. But that's not possible.

If we need to announce new space it occurs to me that, rather than adding more /48s to the v6 table, we could instead use some of the RIPE space and allocate (for example,) a new /40 for each of our sites? We have 2,048 x /40 available in our RIPE allocation, so assigning one per site leaves plenty for future POPs. We'd then have 256 x /48s for use internally at each site, the first going to cloud services. We may never need another /48 at any of them, but if we do it makes it simpler. And if we don't it doesn't matter, v6 space is designed to be wasted.

Also, the way we've divided our current v6 allocations is based on geography. I think it makes sense to stick with that, rather than have some subnets at the first level under our RIR allocation done on a geographical basis, and some on a category/service basis (i.e. /44 for cloud services).

@aborrero I agree with Arzhel that no networks smaller than a /64 should be used. Your maths is slightly off though: there are 65,536 /64s in a /48, so plenty of space. I'd also not divide the /48 directly into /64s; instead, add a layer of hierarchy under the /48. Maybe sparsely allocate (i.e. leaving gaps for future growth) some of the 16 x /52s, each earmarked for a different use case (e.g. vlan-attached networks, infrastructure, VMs/containers etc.). I'd avoid using /49s etc.; we've enough space to avoid it, so segment at nibble boundaries.
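The arithmetic in this comment is easy to reproduce. The sketch below uses one of the example /48s from this thread; the helper `is_nibble_aligned` is a hypothetical name for the "segment at nibble boundaries" rule.

```python
import ipaddress

def is_nibble_aligned(net: ipaddress.IPv6Network) -> bool:
    """True when the prefix length falls on a 4-bit (one hex digit) boundary."""
    return net.prefixlen % 4 == 0

site = ipaddress.ip_network("2a02:ec80:0::/48")  # example /48 from the thread
slash64_count = 2 ** (64 - site.prefixlen)       # /64s available in a /48
slash52s = list(site.subnets(new_prefix=52))     # first layer of hierarchy

print(slash64_count)
print(len(slash52s), slash52s[0], slash52s[1])
print(is_nibble_aligned(slash52s[0]))
print(is_nibble_aligned(ipaddress.ip_network("2a02:ec80:0::/49")))
```

Because a /52 ends on a hex-digit boundary, each block is identified by a single digit in the fourth hextet, which is not the case for a /49.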

FWIW a good guide on this is Tom Coffeen's "IPv6 Address Planning": https://www.oreilly.com/library/view/ipv6-address-planning/9781491908211/

sparse_slash_52.png (667×893 px, 133 KB)

random_example_plan.png (868×893 px, 410 KB)

Ok, so the plan would be to have:

2a02:ec80:0::/48 - cloud eqiad1
2a02:ec80:1::/48 - cloud codfw1dev

Please confirm and request approvals as required.

My own preference would be to allocate larger ranges to each site as mentioned above, and allocate the cloud prefixes from within those geographic aggregates. Doesn't have to be that way of course, I guess we can see what the consensus is.

Sticking with my previous example:

2a02:ec80:1000::/40    eqiad
2a02:ec80:2000::/40    codfw
2a02:ec80:3000::/40    esams
2a02:ec80:4000::/40    ulsfo
2a02:ec80:5000::/40    eqsin

(there are 64 /40s in the first half of our RIPE /29 if allocated like this, with a gap of 15 /40s between each for future expansion, and the upper half of the /29 untouched).

So then possibly:

2a02:ec80:1001::/48	cloud eqiad1
2a02:ec80:2001::/48     cloud codfw1dev
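A rough sketch of the sparse scheme above, to check the counts. It assumes the RIPE /29 is 2a02:ec80::/29 (inferred from the prefixes in this thread, not stated explicitly), and the function names are hypothetical.

```python
import ipaddress

# Assumption: the RIPE /29 is 2a02:ec80::/29, inferred from the prefixes used in this thread
ripe_block = ipaddress.ip_network("2a02:ec80::/29")
total_slash40s = 2 ** (40 - ripe_block.prefixlen)

# The first half of the /29 is a /30; each site slot is a /36 holding 16 x /40
first_half = next(ripe_block.subnets(prefixlen_diff=1))
slots = list(first_half.subnets(new_prefix=36))

def sparse_site_prefix(slot: int) -> ipaddress.IPv6Network:
    """Return the first /40 of a /36 slot; the other 15 /40s stay unallocated."""
    return next(slots[slot].subnets(new_prefix=40))

def cloud_prefix(slot: int) -> ipaddress.IPv6Network:
    """Return the second /48 of the site /40, as in 2a02:ec80:1001::/48."""
    site = sparse_site_prefix(slot)
    return list(site.subnets(new_prefix=48))[1]

print(total_slash40s, len(slots))
print(sparse_site_prefix(1), cloud_prefix(1))
```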

Leave it to us to discuss and we will get back to you.

cool, thanks!

Aklapper added a project: Infrastructure-Foundations. Jun 21 2021, 9:00 PM

Having discussed with @ayounsi we were thinking it may be better to assign aggregates less sparsely, as follows:

2a02:ec80:100::/40	eqiad
2a02:ec80:200::/40	codfw
2a02:ec80:300::/40	esams
2a02:ec80:400::/40	ulsfo
2a02:ec80:500::/40	eqsin

This would allow us to move beyond "site 9" and have 2a02:ec80:1000::/40, 2a02:ec80:1100::/40 and 2a02:ec80:1200::/40 for sites 10, 11 and 12 respectively (and up to '99'). It does remove the gaps between blocks / room to expand, but given the size of each /40 this is probably never going to be an issue.

In that circumstance we could allocate these for cloud:

2a02:ec80:101::/48	cloud eqiad1
2a02:ec80:201::/48      cloud codfw1dev

But again leave it with us to discuss more widely in the team, just documenting here for visibility.
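For illustration, the denser numbering can be expressed as a simple mapping from site number to prefix. This is a sketch under the assumption that the site number is written as decimal digits in the top byte of the third hextet (so site 10 becomes `1000`); the function names are hypothetical.

```python
import ipaddress

def dense_site_prefix(site_number: int) -> ipaddress.IPv6Network:
    """Map a site number (1-99) to its /40, writing the number as decimal digits."""
    if not 1 <= site_number <= 99:
        raise ValueError("site number must be between 1 and 99")
    # Read the decimal digits as hex so that site 10 becomes 0x10, 12 becomes 0x12
    site_byte = int(str(site_number), 16)
    # The site byte is the top 8 bits of the third hextet
    return ipaddress.IPv6Network(f"2a02:ec80:{site_byte << 8:x}::/40")

def dense_cloud_prefix(site_number: int) -> ipaddress.IPv6Network:
    """Second /48 of the site /40, matching the 2a02:ec80:101::/48 example."""
    site = dense_site_prefix(site_number)
    return list(site.subnets(new_prefix=48))[1]

for number in (1, 2, 10, 12):
    print(number, dense_site_prefix(number))
print(dense_cloud_prefix(1), dense_cloud_prefix(2))
```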

@faidon wondering if you are ok with the plan set out above? Any comments / feedback welcome.

Prioritization-wise, is there a reason why we're going for an IPv6 allocation while our IPv4 segmentation is still in flux or in progress? I fear that we're adding more features/problems to the mix without having set and implemented clear boundaries first, and making an already complex situation more complex (e.g. more filters to maintain) so I'd like to hear more about those trade offs and perhaps wait.

Sizing-wise it all makes sense to me - thanks for all the details @cmooney!

What I'm not entirely sure about (but not negative per se) is this space being within the region's geographic aggregates. We've had different routing rules for cloud VPS in the past (cold potato, announcing the space to e.g. our peers in Amsterdam and carrying traffic in our own transport), and we've also been very deliberate in trying to keep the IP space isolated compared to our production traffic (i.e. by moving VMs out of the 208.80.152.0/22 space). Directionally, I'd like for us to be treating our public cloud like a) a customer b) any other public cloud - so perhaps it's worth thinking about one large supernet for customers, under which cloud could get a large allocation, with specific assignments for their regions (which may be hosted in our data centers, or even outside of our production data centers in the future). Happy to discuss those trade offs further though :)

Thanks @faidon for the comments. In terms of why it is being discussed, I'm trying to advance tasks outstanding for WMCS (as discussed by myself and @joanna_borun), and the IPv6 stuff seemed like something that could be progressed. Providing new blocks for cloud inevitably involves using space from our RIPE /29 allocation, hence the discussion on how to subnet it.

Understood that there is a question regarding IPv4 also, more than happy to park this for now until we have decided how to segment the IPv4. That said any eventual decision in terms of v4 should not really impact the v6 decision. We shouldn't compromise how we segment the v6 space by trying to align it to the plan for v4, which will inevitably be constrained by scarcity. Filter rules should be as close as possible though, so maybe it is best to wait.

Keeping the "customer" ranges separate from production - and the ability to implement different policy / announce them to the internet separately - makes perfect sense. That's perfectly valid, and if there is precedent let's stick with it. Keeping the goal to announce large aggregates, minimizing our total number of prefixes in the dfz, we could stick with a similar scheme but announce a /40 for customers from each site (and a separate /40 for prod if ever needed). For example allocate the first half of our /29 like this:

2a02:ec80::/32          WMF Production [Reserved for future expansion]
2a02:ec81::/32          TBD
2a02:ec82::/32          WMF Hosted Customers
2a02:ec83::/32          TBD

We could sub-allocate /40s for "customers" on a geographic basis, same scheme as before:

2a02:ec82:100::/40      eqiad customers
2a02:ec82:200::/40	codfw customers
2a02:ec82:300::/40	esams customers
2a02:ec82:400::/40	ulsfo customers
2a02:ec82:500::/40	eqsin customers

And assign specific ranges to WMCS from those as needed, again similar to before:

2a02:ec82:100::/48	cloud eqiad1
2a02:ec82:200::/48      cloud codfw1dev

If cloud eventually run their own networks we could give them addresses from the remaining space, and not touch the existing customer aggregates we announce. There's loads of room for that. I'd probably advise we go to ARIN or RIPE and get them their own AS for that too, but not something to worry about today.

@cmooney That looks cleaner indeed.

@faidon We're moving away from 172.16/12 IPs being able to reach the Wikis, which means VM traffic needs to be NATed and loses useful troubleshooting information ("which VM did edit X"?).
There are several good options to solve that (see sub tasks of T209011), one of them being using IPv6, as each VM would have a distinct IP.
That's why we shouldn't wait to solve the IPv4 segmentation before doing work on IPv6.

There are some ongoing conversations with the WMCS team regarding the placement of their infrastructure in our network/infrastructure, and I think it would be good to resolve that first, before moving forward on implementing this. Setting this to Stalled - hope that makes sense!

@faidon are there any updates on this? We've been discussing tooling to help with steward workflow, but they are highly dependent on IPv6. If that is already a problem locally, globally it'd be even worse (especially considering India, where it's basically only IPv6). Thanks!

@Tks4Fish I don't think there is any reason to worry in terms of availability of IPv6 address space.

Is there a specific proposal on the table requiring additional IPv6 address space for WMCS currently? We would be happy to discuss if there is.

In terms of how we are dividing up our LIR IPv6 allocations, we are broadly in line with what is set out above. But we have not made any new allocations for cloud services as there doesn't seem to be an urgent need, and bearing in mind Faidon's comments about it being a transitionary period.

@cmooney Sorry, I think I ended up asking in the wrong place.

My question comes from T37947, and after looking at the comments there, I got to this task, saw it as stalled and concluded (possibly wrongly) that it is the reason why IPv6 support for tools hasn't gone forward, and that's why I asked it as I did.

I don't think the availability is a problem, just asking for an update so that we can go forward with the support for the tools.

I'm happy to clarify why I think this deserves more urgency than what is currently given on IRC or another medium, as I now realize that this task isn't the place for that. Sorry again!

@Tks4Fish no problem at all! And certainly no need to apologize.

This task more relates to allocating blocks of IPv6 for Toolforge/Cloud. As per the above discussion there are some small open questions, but I've no doubt we will be able to allocate the space in a reasonable time when needed.

The bigger part of the job, enabling IPv6 across the cloud infra, needs to be ready before those blocks can be used though. So a bit of a way to go.

But we are very keen on IPv6 in general, and supporting our colleagues on the cloud team as needed to help us get there. Thanks for the input!

nskaggs subscribed. Jul 21 2022, 7:57 PM

nskaggs mentioned this in T313256: Request increased quota for Canasta Cloud VPS project. Jul 21 2022, 8:07 PM

In T187929#7623061, @cmooney wrote:

This task more relates to allocating blocks of IPv6 for Toolforge/Cloud. As per the above discussion there are some small open questions, but I've no doubt we will be able to allocate the space in a reasonable time when needed.

The bigger part of the job, enabling IPv6 across the cloud infra, needs to be ready before those blocks can be used though. So a bit of a way to go.

But we are very keen on IPv6 in general, and supporting our colleagues on the cloud team as needed to help us get there. Thanks for the input!

Stalled and assigned to ex-WMF staff is not a great look for the future of this task. How do we get this back on track to unblock T245495: CloudVPS: IPv6 early PoC? Should the POC move forward using some other IPv6 allocation? What does that look like?

My understanding is that priorities shifted and other WMCS projects (joint with Netops) are being worked on.
Allocating space can be done in a relatively short time frame, but if done too early (eg. if we did it in 2018!) it might not match the updated needs or state of the art allocation model.

I see in T245495#8537519 that it might have been re-prioritized on the WMCS side. In that case it's worth a discussion to make sure we're all on the same page in terms of time frame and scope. From what I remember, parts of the Cloud infra didn't support IPv6. Hopefully that changed :)

ayounsi moved this task from Backlog to Watching on the netops board. Oct 3 2023, 11:44 AM

reopening -- we might want to take a look at this soon.

aborrero moved this task from Backlog to Radar/observer on the User-aborrero board. Apr 24 2024, 2:31 PM

There are a few elements here to consider:

Existing production-realm private IPv6 ranges

The existing cloud-hosts vlans, in the WMF production realm, have IPs from the wider WMF production private IPv6 block at that site, for instance:

cloud-hosts1-eqiad	10.64.20.0/24	2620:0:861:118::/64
cloud-hosts1-e4-eqiad	10.64.148.0/24	2620:0:861:11c::/64
cloud-hosts1-f4-eqiad	10.64.149.0/24	2620:0:861:11d::/64
cloud-hosts1-d5-eqiad	10.64.150.0/24	2620:0:861:11e::/64
cloud-hosts1-c8-eqiad	10.64.151.0/24	2620:0:861:11f::/64


cloud-hosts1-b1-codfw	10.192.20.0/24	2620:0:860:118::/64

Both the v4 and v6 ranges here come from the wider private ranges allocated for WMF production networks at each site. If we were doing this from scratch perhaps we'd do something different (to have a single, separate aggregate for them), but overall there is no major problem. Scenario is the same with v4 and v6 so I think we can leave this as is.

Cloud-private subnets

The cloud-private networks, introduced since we last visited this task, are currently only configured for IPv4. This was potentially an error when we first deployed them; the fact that existing cloud services did not run over IPv6 probably meant none were allocated.

No reason not to move ahead with this though. Similar to v4 we can allocate an aggregate for all the cloud-private ranges at a site, and take per-rack subnets from that. Following the v4 pattern hosts have no default route over the cloud-private interface, so IPv6 Router Advertisements don't need to be configured on the switches.

The aggregate that is used for the cloud-private allocations should come from IPv6 space not announced to the internet/DFZ, or space that is announced but is filtered inbound on our CRs (similar to the WMF production private IPv6 aggregates).

Once addressing is assigned the steps are to configure all the hosts with their newly assigned v6 addresses, and then begin the work to ensure all services currently provided on this network are also available via IPv6. In the first instance we probably should not assign DNS names to assigned addresses until we know services are operational on both protocols.

VM instances

Probably the main ask here is to provide IPv6 connectivity to VM instances running on the cloud OpenStack platform. As discussed between teams the current work to move to using the Neutron OpenVSwitch agent (T326373), and further introduce self-service networks based on this (with VXLAN wire-line encapsulation between hosts), provides a good opportunity to introduce IPv6 for instances.

At a high level - making assumptions - it should be possible to create 'self service' IPv6 networks in the same way as with IPv4. It should also be perfectly possible to tunnel these virtual IPv6 networks between hosts using the cloud-private network, the same as for IPv4. In other words the existing IPv4 addressing on the cloud-private ranges can serve as the VXLAN VTEP endpoints even when carrying IPv6 overlay traffic. So getting v6 addresses on the cloud-private vlans is not a blocker to enabling v6 for VMs.

Address block allocations

We should allocate an aggregate at each WMCS site from which the self-service networks can be assigned. This should be routed to the global internet so that instances can make outbound connections where needed. A separately routable prefix is desirable for this.

Since this task was first opened we have begun allocating /40 networks from our RIPE /29 for Wikimedia POPs. There are over 2,000 such networks available in that block, which gives us plenty to use for further POPs. As such I'd recommend we allocate /40s to WMCS at each site also, keeping a consistent allocation size.

What I'd propose for WMCS allocations is the following - unless anyone objects I can assign these in Netbox.

Aggregate	Site
2a02:ec80:a000::/40	WMCS eqiad
2a02:ec80:a100::/40	WMCS codfw

Given the amount of space here I would suggest WMCS break these further into /44 networks, beginning with two allocations from each. For example:

2a02:ec80:a000::/44 - private IPv6 networks
2a02:ec80:a080::/44 - public IPv6 networks

We only need to announce the public range to the internet from the WMF edge. In terms of further sub-division from there I'm not sure what the best way to go is, what size or how many networks may be needed for instances or other services. Each /44 provides over 100,000 /64 subnets, so it should be sufficient to support whatever scheme we want, including allocating a full /64 to each instance.
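To put numbers on the proposed split, here is a short sketch using the eqiad prefixes from the proposal above. Only the variable names are invented.

```python
import ipaddress

wmcs_eqiad = ipaddress.ip_network("2a02:ec80:a000::/40")
private = ipaddress.ip_network("2a02:ec80:a000::/44")
public = ipaddress.ip_network("2a02:ec80:a080::/44")

# All /44s available inside the site /40
slash44s = list(wmcs_eqiad.subnets(new_prefix=44))
# Number of /64 subnets inside a single /44
slash64_per_44 = 2 ** (64 - 44)

print(len(slash44s))
print(slash44s.index(private), slash44s.index(public))
print(slash64_per_44)
```

The two example /44s are the first block of each half of the /40, which leaves seven unallocated /44s after each of them for later growth.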

Routing

As mentioned the cloud hosts can transit the virtual IPv6 networks over cloud-private IPv4 addresses, so we don't need to change those. However the machines performing the Neutron router/gateway function will need IPv6 connectivity on the wire to forward traffic outside the cloud realm. Specifically in the current scenario that means the cloudnet and cloudgw need to have IPv6 connectivity between them, and the cloudgw needs to have external IPv6 connectivity.

It would be nice if we could take this opportunity to rework the cloudgw and have it speak BGP to the cloud switches (in a pinch we can do this with static routes again). On the switch side we need to add IPv6 IPs to the interfaces connecting our CRs to the cloud vrf, and also route the newly allocated ranges to the cloudgw. The cloudgw will need to route the ranges forward to cloudnet/neutron.

Happy to get working on this or discuss any time.

In T187929#9748100, @cmooney wrote:

The aggregate that is used for the cloud-private allocations should come from IPv6 space not announced to the internet/DFZ, or space that is announced but is filtered inbound on our CRs (similar to the WMF production private IPv6 aggregates).

This seems fine to me. There is, however, one case where we will need publicly routable cloud-realm addresses for non-VM instances, and that is the "WMCS public service VIPs" ranges (currently 185.15.56.160/28, 185.15.57.24/29). That's indeed not cloud-private, although I think it's closer to that use case than to the instances use case.

VM instances

Address block allocations

We should allocate an aggregate at each WMCS site from which the self-service networks can be assigned. This should be routed to the global internet so that instances can make outbound connections where needed. A separately routable prefix is desirable for this.

Since this task was first opened we have begun allocating /40 networks from our RIPE /29 for Wikimedia POPs. There are over 2,000 such networks available in that block, which gives us plenty to use for further POPs. As such I'd recommend we allocate /40s to WMCS at each site also, keeping a consistent allocation size.

What I'd propose for WMCS allocations is the following - unless anyone objects I can assign these in Netbox.

Aggregate Site

2a02:ec80:a000::/40 WMCS eqiad

2a02:ec80:a100::/40 WMCS codfw

Would these aggregates include both the infrastructure and instance addressing, or would the infrastructure use some other aggregates?

Given the amount of space here I would suggest WMCS break these further into /44 networks, beginning with two allocations from each. For example:
2a02:ec80:a000::/44 - private IPv6 networks
2a02:ec80:a080::/44 - public IPv6 networks
We only need to announce the public range to the internet from the WMF edge. In terms of further sub-division from there I'm not sure what the best way to go is, what size or how many networks may be needed for instances or other services. Each /44 provides over 100,000 /64 subnets, so it should be sufficient to support whatever scheme we want, including allocating a full /64 to each instance.

Apologies in advance if I'm overthinking this. One useful feature in the address allocation would be to clearly separate space used for "trusted" Wikimedia infrastructure (for example, esams, drmrs and magru currently using 2a02:ec80::/32) from "untrusted" WMCS instances so that ACLs don't have to, for example, list every site there individually or otherwise be overly complicated. Your proposal would allow that (the PoPs are all in 2a02:ec80::/36 while the WMCS space would be in 2a02:ec80:a000::/36, for example), but I think it's worth explicitly documenting that separation.
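As an illustration of why that separation helps, a filter only needs two aggregates to tell the realms apart. This is a sketch, not an actual ACL: the two /36s are the ones named in the comment above, while the function `realm` and its labels are hypothetical.

```python
import ipaddress

POPS = ipaddress.ip_network("2a02:ec80::/36")       # PoP space named in the comment
WMCS = ipaddress.ip_network("2a02:ec80:a000::/36")  # would hold the proposed WMCS /40s

def realm(address: str) -> str:
    """Classify an address with two prefix checks instead of a per-site list."""
    addr = ipaddress.ip_address(address)
    if addr in WMCS:
        return "wmcs"
    if addr in POPS:
        return "pop"
    return "other"

print(realm("2a02:ec80:a100::1"))
print(realm("2a02:ec80:300:1::1"))
print(realm("2620:0:861:118::1"))
```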

@cmooney what do you think of duplicating the other POPs allocation scheme?
Taking eqiad as an example, keep 2a02:ec80:a000::/40 as "reserved for future growth"
Then use 2a02:ec80:a000::/48 for the existing WMCS eqiad infra
Then 2a02:ec80:a000::/56 for public, another /56 for private, /55 for the infra, 2a02:ec80:a000:ed1a::/64 for VIPs, etc
Or is the risk that we would not be able to allocate a /64 for each VM? (a /48 is 65536 /64s)

I see an advantage of using our current standards, but of course it shouldn't block us later down the road if we expect needing many more prefixes.

This seems fine to me. There is, however, one case where we will need publicly routable cloud-realm addresses for non-VM instances, and that is the "WMCS public service VIPs" ranges (currently 185.15.56.160/28, 185.15.57.24/29). That's indeed not cloud-private, although I think it's closer to that use case than to the instances use case.

Yeah we can allocate a /64 out of the public /44s as VIP prefix.

Would these aggregates include both the infrastructure and instance addressing, or would the infrastructure use some other aggregates?

Yep, for the WMCS specific infrastructure bits

Apologies in advance if I'm overthinking this. One useful feature in the address allocation would be to clearly separate space used for "trusted" Wikimedia infrastructure (for example, esams, drmrs and magru currently using 2a02:ec80::/32) from "untrusted" WMCS instances so that ACLs don't have to, for example, list every site there individually or otherwise be overly complicated. Your proposal would allow that (the PoPs are all in 2a02:ec80::/36 while the WMCS space would be in 2a02:ec80:a000::/36, for example), but I think it's worth explicitly documenting that separation.

Yep, Cathal's allocation does account for that. As we don't have that many sites/realms, there is no need to over-aggregate and reserve huge prefixes, as ACLs are usually per site (eg. per /48). The difficulty is to find the right balance.

In T187929#9793592, @ayounsi wrote:

@cmooney what do you think of duplicating the other POPs allocation scheme?
Taking eqiad as an example, keep 2a02:ec80:a000::/40 as "reserved for future growth"
Then use 2a02:ec80:a000::/48 for the existing WMCS eqiad infra
Then 2a02:ec80:a000::/56 for public, another /56 for private, /55 for the infra, 2a02:ec80:a000:ed1a::/64 for VIPs, etc

I'm not strongly opposed to it. I somewhat prefer the use of separately routable space for private/public split, rather than having the private space from publicly announced ranges, blocked with a firewall rule at the edge. I know that's what we do for prod, but I don't think it's a major issue to do it differently here.

Keeping the same scheme as at POPs makes sense, but either way there will be differences. The sub-division within the WMCS public/private blocks is going to be different than what we do at POPs, whether those blocks are /56s or /44s. I worry that in time perhaps they will need more complex segmentation in the virtual realm, potentially multiple /48s? To that end the wider allocation is going to be more flexible, but I don't doubt a single /48 could be made to work. In total there are certainly enough end networks (/64s), it's more about flexible segmentation/sub-division.

So I'm kind of 50/50 on it. I wonder if WMCS folk have any take on this?

In T187929#9789580, @taavi wrote:

Apologies in advance if I'm overthinking this. One useful feature in the address allocation would be to clearly separate space used for "trusted" Wikimedia infrastructure (for example, esams, drmrs and magru currently using 2a02:ec80::/32) from "untrusted" WMCS instances so that ACLs don't have to, for example, list every site there individually or otherwise be overly complicated. Your proposal would allow that (the PoPs are all in 2a02:ec80::/36 while the WMCS space would be in 2a02:ec80:a000::/36, for example), but I think it's worth explicitly documenting that separation.

It definitely doesn't help if filters have too many entries, and having contiguous allocations helps avoid that. If we allocate the two /40 networks (regardless of whether we use a /48 from each or more) then they can be summarized into a single /39 so it should be ok.
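The summarization mentioned here can be checked with `ipaddress.collapse_addresses` from the Python standard library. This is only a sketch using the two proposed /40s.

```python
import ipaddress

wmcs_sites = [
    ipaddress.ip_network("2a02:ec80:a000::/40"),  # WMCS eqiad
    ipaddress.ip_network("2a02:ec80:a100::/40"),  # WMCS codfw
]

# Adjacent, equally sized siblings collapse into their common supernet
summary = list(ipaddress.collapse_addresses(wmcs_sites))
print(summary)
```

This only works because the two /40s are contiguous and aligned on the /39 boundary; a third site at 2a02:ec80:a200::/40 would need a second filter entry or a wider aggregate.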
