
Free up 185.15.59.0/24
Open, Low, Public

Description

Follow up from an IRC conversation.

https://netbox.wikimedia.org/ipam/aggregates/4/
Context here is that 185.15.56.0/22 barely has any "prod" IPs, and might be removable as a whole from our ACLs. That way we don't have to think about which part of that /22 is prod or not prod, and we don't risk typoing a /24 into a /23 or /22 in our ACLs.
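
To make the typo risk concrete, here is a minimal sketch using Python's standard `ipaddress` module. The prefixes come from this task; the `covers` helper and the variable names are illustrative only and not part of any existing tooling.

```python
import ipaddress

# Prefixes taken from this task.
wmcs = ipaddress.ip_network("185.15.56.0/23")   # the WMCS range discussed in this task
typo = ipaddress.ip_network("185.15.56.0/22")   # the same entry with the mask mistyped
infra = ipaddress.ip_network("185.15.59.0/24")  # interconnects and OOB

def covers(acl_entry, prefix):
    """Return True if every address in `prefix` falls inside `acl_entry`."""
    return prefix.subnet_of(acl_entry)

intended_match = covers(wmcs, infra)  # the /23 entry does not reach 185.15.59.0/24
typo_match = covers(typo, infra)      # the /22 entry silently does
extra_addresses = typo.num_addresses - wmcs.num_addresses

print(intended_match, typo_match, extra_addresses)
```

A single wrong digit in the mask doubles the matched space, which is the failure mode the task is trying to rule out.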

185.15.59.0/24 is currently used for two interconnects (cr1-esams <--> mr1-esams and cr2-knams <--> mr1-esams) and Tilaa OOB.
https://github.com/wikimedia/operations-dns/blob/master/templates/59.15.185.in-addr.arpa

Renumbering the interconnects is straightforward; renumbering the Tilaa OOB requires syncing up with them. But first we need to find new IPs, ideally in the 91.198.174.0/24 space.

From the following:
https://github.com/wikimedia/operations-dns/blob/master/templates/174.198.91.in-addr.arpa

We can use 91.198.174.240/31 for cr1-esams <--> mr1-esams

If we want the infrastructure IPs to be contiguous (eg. in the same 91.198.174.224/27) we would need to move ns2.wikimedia.org to a different (lower) IP and reclaim "91.198.174.224/28 (224-239) out-of-subnet LVS service IPs"
As this is a heavy/risky operation, I don't think it's worth it.

We can however shrink the reservation "91.198.174.224/28 (224-239) out-of-subnet LVS service IPs" to 91.198.174.232/29
And use 91.198.174.224/29 for infrastructure, eg, carve 91.198.174.224/31 for cr2-knams <--> mr1-esams
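
The split proposed above can be double-checked with a short sketch. It uses only the prefixes named in this description; the variable names are illustrative.

```python
import ipaddress

# Current reservation for out-of-subnet LVS service IPs.
lvs_reservation = ipaddress.ip_network("91.198.174.224/28")

# Splitting the /28 yields the two /29 halves discussed above.
lower_half, upper_half = lvs_reservation.subnets(new_prefix=29)

# First /31 of the lower half, proposed for cr2-knams <--> mr1-esams.
first_link = next(lower_half.subnets(new_prefix=31))

# What is left in the lower /29 for other infrastructure use.
remaining = sorted(lower_half.address_exclude(first_link))

# The other proposed link sits in the same /27 as everything above.
infra_27 = ipaddress.ip_network("91.198.174.224/27")
other_link = ipaddress.ip_network("91.198.174.240/31")
contiguous = other_link.subnet_of(infra_27) and lower_half.subnet_of(infra_27)

print(lower_half, upper_half, first_link, remaining, contiguous)
```

The lower /29 becomes infrastructure space and the upper /29 stays reserved for LVS, so both interconnects end up inside 91.198.174.224/27.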

Using lower subnets (eg. 91.198.174.144/28) for infrastructure (interco, etc.) adds fragmentation and might bite us later.

We can keep Tilaa OOB on 185.15.59.0/24, that way:
1/ Something is used on that subnet (less risk of theft)
2/ No need to bother them with a renumbering
3/ We can still remove 185.15.59.0/24 from any trusted lists

And revisit it when we need 185.15.59.0/24 or 91.198.174.232/29 for other purposes.

Event Timeline

ayounsi created this task. Dec 5 2018, 10:45 PM
ayounsi triaged this task as Low priority.
Restricted Application added a project: Operations. Dec 5 2018, 10:45 PM
Restricted Application added a subscriber: Aklapper.
ema moved this task from Triage to Network on the Traffic board. Dec 6 2018, 12:54 PM
ayounsi updated the task description. Dec 10 2018, 10:56 PM

Added some context in the task description.

In addition, 185.15.58.0/24 is currently reserved as a 2nd anycast range (since T98006 I think), which goes against the idea of segregating the whole /22 from prod.
Several options:

  • Un-reserve that /24 and figure out down the road if we need it again
  • Keep the anycast /24 reserved in Netbox (just in case) but still remove the /22 from prod ACLs
  • Find another /24 to be used as the 2nd anycast /24 (eg. potentially 208.80.152.0/24)

Note that I don't see this as a blocker to move the mr1-esams interco to a different range as it seems to be the cleanest thing to do at this point.

If we need to free up 208.80.152.0/24, it is currently only used for:

  • 208.80.152.224/28 - frack-codfw
  • 208.80.152.240/28 - sandbox1-a-codfw

Sandbox is only used for 2 hosts, so it could be shrunk to a /29 and moved to 208.80.153.176/29 (leaving only 1 free IP in the range)
If frack is on board, then 208.80.152.224/28 could be moved to 208.80.153.160/28.

Note that:
1/ this would not leave any free space in 208.80.153.0/24
2/ We don't have to move anything, but only reserve 208.80.153.176/29 and 208.80.153.160/28 in Netbox for a possible future need. (while referencing this task).
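
As a sanity check on the two candidate reservations, a small sketch follows. It only confirms that they fit inside 208.80.153.0/24 without overlapping. It does not know what else is already allocated there, so it cannot confirm the "no free space left" point.

```python
import ipaddress

parent = ipaddress.ip_network("208.80.153.0/24")
frack_new = ipaddress.ip_network("208.80.153.160/28")    # candidate home for frack-codfw
sandbox_new = ipaddress.ip_network("208.80.153.176/29")  # candidate home for the sandbox

# Both candidates must sit inside the parent /24 and must not collide.
inside_parent = all(n.subnet_of(parent) for n in (frack_new, sandbox_new))
overlap = frack_new.overlaps(sandbox_new)

# They are back to back: the /29 starts right after the /28 ends.
adjacent = int(frack_new.broadcast_address) + 1 == int(sandbox_new.network_address)

# Addresses in the /29 once network and broadcast are excluded.
sandbox_usable = len(list(sandbox_new.hosts()))

print(inside_parent, overlap, adjacent, sandbox_usable)
```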

faidon added a subscriber: faidon. Dec 11 2018, 7:32 PM

What is the rationale behind trying to empty this address space and/or find a new /24?

Talked a bit over IRC, tldr, the rationale has been added to the beginning of the task's description.
Triggering conversation was about removing WMCS 185.15.56.0/23 from prod ACLs.

ayounsi added a parent task: Restricted Task. Jan 16 2019, 11:53 PM

Change 485081 had a related patch set uploaded (by Ayounsi; owner: Ayounsi):
[operations/dns@master] Move mr1-esams interco links to 91.198.174.0/24

https://gerrit.wikimedia.org/r/485081

ayounsi claimed this task. Edited Jan 17 2019, 7:13 PM

In addition to the above DNS change, the following needs to change on the routers:

cr1/2-esams - shrink /28 to /29
[edit routing-options aggregate]
     route 185.15.56.0/22 { ... }
+    route 91.198.174.232/29;
     route 10.2.3.0/24 { ... }
[edit routing-options aggregate]
-    route 91.198.174.224/28;
[edit policy-options prefix-list LVS-service-ips]
-    91.198.174.224/28;
+    91.198.174.232/29;

One link at a time:

mr1-esams
[edit interfaces ge-0/0/1 unit 404 family inet]
+       address 91.198.174.241/31;
-       address 185.15.59.245/31;
[edit interfaces ge-0/0/1 unit 405 family inet]
+       address 91.198.174.225/31;
-       address 185.15.59.247/31;
cr1-esams
[edit interfaces ae1 unit 404 family inet]
+       address 91.198.174.240/31;
-       address 185.15.59.244/31;
cr2-knams
[edit interfaces ae1 unit 405 family inet]
+       address 91.198.174.224/31;
-       address 185.15.59.246/31;

Then update Netbox
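
Before applying the interface changes above, the plan can be cross-checked offline. The sketch below takes the new addresses from the diffs in this comment. The `check_link` helper is hypothetical and only tests three things: both ends share a /31, the ends differ, and the link stays clear of the shrunk LVS range.

```python
import ipaddress

# Shrunk LVS service range, from the aggregate/prefix-list change above.
lvs = ipaddress.ip_network("91.198.174.232/29")

# New interface addresses per link: (core router end, mr1-esams end).
links = {
    "cr1-esams <-> mr1-esams": ("91.198.174.240/31", "91.198.174.241/31"),
    "cr2-knams <-> mr1-esams": ("91.198.174.224/31", "91.198.174.225/31"),
}

def check_link(end_a, end_b):
    """Return True if the two ends form a valid /31 pair outside the LVS range."""
    a = ipaddress.ip_interface(end_a)
    b = ipaddress.ip_interface(end_b)
    same_subnet = a.network == b.network
    distinct = a.ip != b.ip
    clear_of_lvs = not a.network.overlaps(lvs)
    return same_subnet and distinct and clear_of_lvs

results = {name: check_link(*ends) for name, ends in links.items()}
print(results)
```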

mark added a subscriber: mark. Jan 23 2019, 1:12 PM

I really don't see the point of this. With the scarcity of IPv4 space we only need to get MORE flexible about how we use our IP space, and we will almost certainly not be able to maintain production vs others split between these address blocks in the future. Rather than spend time on renumbering I think it's much more valuable to spend that effort on better managing our ACLs and more automation.

It's the same basic rationale as moving WMCS out of 10.68.0.0/16. We could obviously leave them there and just manage our ACLs better with more automation, but it pays some pretty big dividends when address spaces are clearly split on such a big security and functional boundary as Prod-v-WMCS. Humans will always look at IPs as well in various debugging and configuration tasks. Having similar/shared/adjacent numbering for these two realms invites confusion and mistakes.

I would personally have preferred that with the renumbering of WMCS they simply acquired new public IPv4 space of their own, but the alternative here is we give them some of our existing allocations that are under-utilized, and vacate what little we have in-use or planned there to avoid confusion. We're not effectively using this space anyways at present*, and WMCS actually needs two routeable /24s for eqiad and codfw in the long term, which at least gives them half of this /22. I don't think it makes sense for us to use the other half for production all things above considered. I wouldn't be comfortable e.g. defining a public service IP for TLS or AuthDNS in space that might get mistakenly seen as labs somewhere in the various ACLs derived from our config (or in humans' minds when debugging) due to some confusion that boils down to a single-bit difference in a netmask.

* - Out of the whole /22, we have 5 IPs defined for infrastructure interface IP type stuff in one /24, and another /24 was long ago earmarked for Anycast AuthDNS, but that was under the original tentative plan to advertise 2x disparate /24 anycast networks (the other earmarked for this was 198.27.35.0/24) for authdns address space resiliency against bad route injections, which last I heard you weren't a fan of wasting a second /24 on that and thought we should just use one anyways

mark added a comment. Jan 23 2019, 1:55 PM

It's the same basic rationale as moving WMCS out of 10.68.0.0/16. We could obviously leave them there and just manage our ACLs better with more automation, but it pays some pretty big dividends when address spaces are clearly split on such a big security and functional boundary as Prod-v-WMCS. Humans will always look at IPs as well in various debugging and configuration tasks. Having similar/shared/adjacent numbering for these two realms invites confusion and mistakes.

In a world where there's ample address space (such as 10/8 in our context), yes. In today's world where IPv4 address space is scarce and we can likely not get any more, not so much.

I would personally have preferred that with the renumbering of WMCS they simply acquired new public IPv4 space of their own

That's simply not realistic, they can't "acquire" IPv4 address space of their own. They're part of this organisation, this ASN, and need to use our PI/PA space where we have it available before we collectively can get more.

but the alternative here is we give them some of our existing allocations that are under-utilized, and vacate what little we have in-use or planned there to avoid confusion. We're not effectively using this space anyways at present*, and WMCS actually needs two routeable /24s for eqiad and codfw in the long term, which at least gives them half of this /22. I don't think it makes sense for us to use the other half for production all things above considered. I wouldn't be comfortable e.g. defining a public service IP for TLS or AuthDNS in space that might get mistakenly seen as labs somewhere in the various ACLs derived from our config (or in humans' minds when debugging) due to some confusion that boils down to a single-bit difference in a netmask.

I don't think that's reasonable or realistic. We absolutely need to be able to use that address space for all uses we may need it for. If we start off now with the mindset of that /22 as "only WMCS" this will only get worse.

We need to be careful everywhere we're applying ACLs, for security or for other reasons. It would certainly be nice if we could do that with clear address space boundaries, and with IPv6 we have that option. Private IPv4, as well. But not public IPv4.

* - Out of the whole /22, we have 5 IPs defined for infrastructure interface IP type stuff in one /24, and another /24 was long ago earmarked for Anycast AuthDNS, but that was under the original tentative plan to advertise 2x disparate /24 anycast networks (the other earmarked for this was 198.27.35.0/24) for authdns address space resiliency against bad route injections, which last I heard you weren't a fan of wasting a second /24 on that and thought we should just use one anyways

I am indeed still very much not a fan of wasting two entire /24s on Anycast...

It's the same basic rationale as moving WMCS out of 10.68.0.0/16. We could obviously leave them there and just manage our ACLs better with more automation, but it pays some pretty big dividends when address spaces are clearly split on such a big security and functional boundary as Prod-v-WMCS. Humans will always look at IPs as well in various debugging and configuration tasks. Having similar/shared/adjacent numbering for these two realms invites confusion and mistakes.

In a world where there's ample address space (such as 10/8 in our context), yes. In today's world where IPv4 address space is scarce and we can likely not get any more, not so much.

I would personally have preferred that with the renumbering of WMCS they simply acquired new public IPv4 space of their own

That's simply not realistic, they can't "acquire" IPv4 address space of their own. They're part of this organisation, this ASN, and need to use our PI/PA space where we have it available before we collectively can get more.

I understand the basic concerns here about exhaustion and how the process works. I think it would've been possible to find a way to ask for new or acquire new space though, even in the US. It's just a process and a cost at the end of the day.

but the alternative here is we give them some of our existing allocations that are under-utilized, and vacate what little we have in-use or planned there to avoid confusion. We're not effectively using this space anyways at present*, and WMCS actually needs two routeable /24s for eqiad and codfw in the long term, which at least gives them half of this /22. I don't think it makes sense for us to use the other half for production all things above considered. I wouldn't be comfortable e.g. defining a public service IP for TLS or AuthDNS in space that might get mistakenly seen as labs somewhere in the various ACLs derived from our config (or in humans' minds when debugging) due to some confusion that boils down to a single-bit difference in a netmask.

I don't think that's reasonable or realistic. We absolutely need to be able to use that address space for all uses we may need it for. If we start off now with the mindset of that /22 as "only WMCS" this will only get worse.

How does it get worse? We do have other boundaries we draw with ACLs (e.g. analytics, frack), but those are a little different than the kind of boundary we have with WMCS, where we're trying as much as reasonably possible to have Production treat WMCS like it's the outside world.

We need to be careful everywhere we're applying ACLs, for security or for other reasons. It would certainly be nice if we could do that with clear address space boundaries, and with IPv6 we have that option. Private IPv4, as well. But not public IPv4.

IMHO, we do have the option, it's just a question of the cost/benefits tradeoffs of pursuing it.

* - Out of the whole /22, we have 5 IPs defined for infrastructure interface IP type stuff in one /24, and another /24 was long ago earmarked for Anycast AuthDNS, but that was under the original tentative plan to advertise 2x disparate /24 anycast networks (the other earmarked for this was 198.27.35.0/24) for authdns address space resiliency against bad route injections, which last I heard you weren't a fan of wasting a second /24 on that and thought we should just use one anyways

I am indeed still very much not a fan of wasting two entire /24s on Anycast...

The argument here is that if there's a bad route injection somewhere in the world, it likely won't take out all of our edge sites for direct user traffic and we can re-route users around the damage via DNS since they use disparate spaces (although the two core sites are adjacent numerically, but still we technically could live without both at the edge and run users in through ulsfo/eqsin/esams only). But if we're doing anycast authdns with all nameserver IPs in a single /24, a bad route injection which covers that /24 will kill us globally, regardless of the disparate TLS-terminating IPs at all our different edges.
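
A toy model of that argument, as a sketch only: it asks whether each nameserver address falls inside one accidentally leaked prefix, and ignores how BGP actually propagates or prefers routes. The two /24s are the ranges earmarked earlier in this task; the host addresses inside them are hypothetical.

```python
import ipaddress

# Hypothetical nameserver addresses inside the two earmarked /24s.
ns_disparate = [
    ipaddress.ip_address("185.15.58.1"),
    ipaddress.ip_address("198.27.35.1"),
]
# Hypothetical alternative: both nameservers in a single /24.
ns_single = [
    ipaddress.ip_address("185.15.58.1"),
    ipaddress.ip_address("185.15.58.2"),
]

# One accidentally injected route covering one of our /24s.
leak = ipaddress.ip_network("185.15.58.0/24")

def survivors(nameservers, leaked_prefix):
    """Return the nameserver addresses not covered by the leaked prefix."""
    return [ip for ip in nameservers if ip not in leaked_prefix]

print(survivors(ns_disparate, leak))  # one address still reachable
print(survivors(ns_single, leak))     # nothing left
```

With a single /24, one bad prefix takes out every authdns address at once. With two widely separated ranges, the same mistake would have to be repeated for the second range.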

mark added a comment. Jan 23 2019, 3:16 PM

It's the same basic rationale as moving WMCS out of 10.68.0.0/16. We could obviously leave them there and just manage our ACLs better with more automation, but it pays some pretty big dividends when address spaces are clearly split on such a big security and functional boundary as Prod-v-WMCS. Humans will always look at IPs as well in various debugging and configuration tasks. Having similar/shared/adjacent numbering for these two realms invites confusion and mistakes.

In a world where there's ample address space (such as 10/8 in our context), yes. In today's world where IPv4 address space is scarce and we can likely not get any more, not so much.

I would personally have preferred that with the renumbering of WMCS they simply acquired new public IPv4 space of their own

That's simply not realistic, they can't "acquire" IPv4 address space of their own. They're part of this organisation, this ASN, and need to use our PI/PA space where we have it available before we collectively can get more.

I understand the basic concerns here about exhaustion and how the process works. I think it would've been possible to find a way to ask for new or acquire new space though, even in the US. It's just a process and a cost at the end of the day.

It may be possible to get more space in various shady ways, but it's not possible by following RIR rules.

but the alternative here is we give them some of our existing allocations that are under-utilized, and vacate what little we have in-use or planned there to avoid confusion. We're not effectively using this space anyways at present*, and WMCS actually needs two routeable /24s for eqiad and codfw in the long term, which at least gives them half of this /22. I don't think it makes sense for us to use the other half for production all things above considered. I wouldn't be comfortable e.g. defining a public service IP for TLS or AuthDNS in space that might get mistakenly seen as labs somewhere in the various ACLs derived from our config (or in humans' minds when debugging) due to some confusion that boils down to a single-bit difference in a netmask.

I don't think that's reasonable or realistic. We absolutely need to be able to use that address space for all uses we may need it for. If we start off now with the mindset of that /22 as "only WMCS" this will only get worse.

How does it get worse? We do have other boundaries we draw with ACLs (e.g. analytics, frack), but those are a little different than the kind of boundary we have with WMCS, where we're trying as much as reasonably possible to have Production treat WMCS like it's the outside world.

Yes, but it's also perfectly possible to do that with a subnet of our PI/PA space, as we will need to do as we can't expect to be able to reasonably acquire separate public prefixes for different uses. We need to separate "IP space WMF/AS14907 is responsible for" from the different security realms within, and do that consistently. Where we're not doing that well enough today we should invest there, with likely greater security and automation benefits beyond this aspect alone.

IP addresses are not only a scarce resource, but their uses and therefore trust boundaries change over time, too.

Presumably if we were a larger organization and could not recall individual subnets from the top of our heads, we wouldn't even be having this discussion. :) Better automation/abstraction around subnets at all levels would be a requirement.
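
One possible shape for that kind of abstraction, as a rough sketch rather than a description of any existing tool: keep a single table mapping prefixes to security realms and resolve addresses by longest-prefix match, so ACLs are derived from labels instead of from remembered subnets. The realm names and the table contents below are assumptions made up for illustration; only the prefixes themselves appear in this task.

```python
import ipaddress

# ASSUMPTION: an illustrative realm table, not the real inventory.
REALMS = {
    "185.15.56.0/22": "untrusted",    # hypothetical default for the whole /22
    "185.15.56.0/23": "wmcs",         # WMCS range mentioned in this task
    "91.198.174.0/24": "production",  # hypothetical label
}

PARSED = [(ipaddress.ip_network(p), realm) for p, realm in REALMS.items()]

def realm_of(address):
    """Return the realm of the most specific prefix containing `address`."""
    ip = ipaddress.ip_address(address)
    matches = [(net, realm) for net, realm in PARSED if ip in net]
    if not matches:
        return "external"
    # Longest prefix wins, so a /23 label overrides the enclosing /22.
    return max(matches, key=lambda m: m[0].prefixlen)[1]

for addr in ("185.15.56.10", "185.15.59.245", "91.198.174.240", "192.0.2.1"):
    print(addr, realm_of(addr))
```

With a table like this, a trust boundary can move by editing one entry rather than by hunting for hand-written masks across ACLs.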

We need to be careful everywhere we're applying ACLs, for security or for other reasons. It would certainly be nice if we could do that with clear address space boundaries, and with IPv6 we have that option. Private IPv4, as well. But not public IPv4.

IMHO, we do have the option, it's just a question of the cost/benefits tradeoffs of pursuing it.

This is where we disagree, I think.

* - Out of the whole /22, we have 5 IPs defined for infrastructure interface IP type stuff in one /24, and another /24 was long ago earmarked for Anycast AuthDNS, but that was under the original tentative plan to advertise 2x disparate /24 anycast networks (the other earmarked for this was 198.27.35.0/24) for authdns address space resiliency against bad route injections, which last I heard you weren't a fan of wasting a second /24 on that and thought we should just use one anyways

I am indeed still very much not a fan of wasting two entire /24s on Anycast...

The argument here is that if there's a bad route injection somewhere in the world, it likely won't take out all of our edge sites for direct user traffic and we can re-route users around the damage via DNS since they use disparate spaces (although the two core sites are adjacent numerically, but still we technically could live without both at the edge and run users in through ulsfo/eqsin/esams only). But if we're doing anycast authdns with all nameserver IPs in a single /24, a bad route injection which covers that /24 will kill us globally, regardless of the disparate TLS-terminating IPs at all our different edges.

I fully understand the reasons and arguments for it. However I haven't made up my mind yet if the cost of 2x public /24 is worth it for me, or even if having two anycast /24s in separate spaces is likely to fix it. (If anyone manages to leak exactly one of our two anycast /24s, why wouldn't they also leak the other?)

It may be possible to get more space in various shady ways, but it's not possible by following RIR rules.

Well, we can obviously still ask ARIN (and others) directly for more if we can justify it and show utilization, etc. It may be very predictable that they'll say no, of course :) Situations like a future LACNIC-based space might be easier since we don't have an initial allocation from them yet. But that aside, if ARIN doesn't have anything for us directly, there's supposedly non-shady sales transfers available via e.g. https://www.arin.net/resources/transfer_listing/index.html , but I don't yet know how active or useful that market is and haven't tried it myself. But other random reports seem to indicate it's doable and sanctioned (e.g. https://www.reddit.com/r/networking/comments/7ac7ip/leasing_buying_ipv4_addresses/dpa9t6v/ ).

I haven't made up my mind yet if the cost of 2x public /24 is worth it for me, or even if having two anycast /24s in separate spaces is likely to fix it. (If anyone manages to leak exactly one of our two anycast /24s, why wouldn't they also leak the other?)

Well I assume in the case of malicious advertisements, it doesn't matter in today's world since this stuff is fairly insecure for now. I'm more worried about accidents/mistakes that have happened before where a single bad route is sent out tragically due to someone typing the wrong thing into a configuration somewhere in the world. In those kinds of cases, it seems like having the pair of anycast authdns IPs be in widely-separated spaces could help a lot. But yeah, there's a lot of subjective cost/benefit to look at there.

Mentioned in SAL (#wikimedia-operations) [2019-05-09T19:24:19Z] <XioNoX> renumber mr1-esams<->cr1-esams link to 91.198.174.240/31 - T211254

Mentioned in SAL (#wikimedia-operations) [2019-05-09T19:28:23Z] <XioNoX> renumber mr1-esams<->cr2-knams link to 91.198.174.224/31 - T211254

Change 485081 had a related patch set uploaded (by Ayounsi; owner: Ayounsi):
[operations/dns@master] Move mr1-esams interco links to 91.198.174.0/24

https://gerrit.wikimedia.org/r/485081

Change 485081 merged by Ayounsi:
[operations/dns@master] Move mr1-esams interco links to 91.198.174.0/24

https://gerrit.wikimedia.org/r/485081

The conversation went a bit outside the scope of the task description.
Re-focusing on it and with the new info of T222392, I renumbered the mr1-esams links (trivial change), so 185.15.59.0/24 now only has the one untrusted device.

I also opened CR https://gerrit.wikimedia.org/r/c/operations/puppet/+/509140 to remove unused and untrusted prefixes from our trusted Puppet lists. Scope could be narrowed to only removing 185.15.59.0/24 if needed.