
RPKI Validation
Open, Normal, Public, 0 Story Points

Description

This is a tracking task for RPKI Validation, description will be updated as the project evolves.

Short description
RPKI + Origin Validation uses cryptography to ensure a prefix received from our BGP peers is advertised from its legitimate owner.
So far about 13% of the Internet's {prefix/ASN} pairs are signed and valid, and 0.76% are invalid; see https://rpki-monitor.antd.nist.gov/
It reduces the risk of routing traffic to an AS that is accidentally or maliciously advertising a prefix it doesn't own.
Combined with short AS paths (eg. peering), it also helps prevent BGP MitM hijacks.
The system comes in 2 parts:

  • Validator, software that runs on a normal server, downloads the ROAs from the RIRs and verifies them
  • Router, uses the RPKI-to-Router protocol to get the validated data from the validator to the routers

It's also possible that a dedicated daemon implements RPKI-to-Router (eg. GoRTR)
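To make the short description concrete, here is a minimal sketch of the origin validation decision a router makes once it has the validated data, following the procedure in RFC 6811. The `VRP` class, the function name and the sample prefixes and ASNs (documentation ranges) are illustrative assumptions, not anything taken from Routinator or Junos.

```python
from dataclasses import dataclass
from ipaddress import ip_network


@dataclass(frozen=True)
class VRP:
    """Validated ROA Payload: one (prefix, max length, origin ASN) entry."""
    prefix: str
    max_length: int
    asn: int


def origin_validation(prefix: str, origin_asn: int, vrps: list) -> str:
    """Classify an announcement as 'valid', 'invalid' or 'unknown'."""
    route = ip_network(prefix)
    covered = False
    for vrp in vrps:
        vrp_net = ip_network(vrp.prefix)
        if route.version != vrp_net.version or not route.subnet_of(vrp_net):
            continue
        covered = True  # at least one VRP covers the announced prefix
        if route.prefixlen <= vrp.max_length and vrp.asn == origin_asn:
            return "valid"
    # Covered but never matched: invalid. Not covered at all: unknown.
    return "invalid" if covered else "unknown"


# Hypothetical data set: one ROA for 192.0.2.0/24 originated by AS64496.
vrps = [VRP("192.0.2.0/24", 24, 64496)]
print(origin_validation("192.0.2.0/24", 64496, vrps))     # matching origin
print(origin_validation("192.0.2.0/24", 64511, vrps))     # wrong origin
print(origin_validation("192.0.2.0/25", 64496, vrps))     # more specific than max length
print(origin_validation("198.51.100.0/24", 64496, vrps))  # no covering VRP
```

The important property is that "unknown" (no ROA at all) and "invalid" (a ROA exists but does not match) are distinct outcomes, which is what the enforcement discussion relies on.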

Validator software selection
Looking at the landscape, my shortlist comes down to:

  • OctoRPKI + gortr from Cloudflare
  • Routinator from NLnetLabs

So far (and after first dismissing it through a misunderstanding) my vote goes to Routinator for the following reasons:

  • RTR daemon embedded, no need to package/run two tools
  • More active development, explicit roadmap. There is a risk that Cloudflare's tool only gets updated for their own needs
  • Whitelist support

Location to run the validator from
Routers support multiple validators.

Routinator: "Running it on a system with 1GB of available RAM and 1GB of available disk space will give the global RPKI data set enough room to grow for the foreseeable future."

To get the ROAs, Routinator uses rsync, which should be able to use the Squid proxies (to be tested).
To be researched: I don't know yet if the dedicated protocol RRDP used by OctoRPKI, and ideally by Routinator down the road, supports proxies.

With that in mind and as of today, the best location is a private VM on the Ganeti clusters in eqiad and codfw.
If dedicating two VMs to this is too much, another option would be to run the validator on the netmon hosts. This also solves the proxy concerns, as they have direct Internet access.

To be researched: I'm not sure yet if bringing the validator closer to the routers (eg. in POPs) brings significant improvements. If so and once T96852 is solved, we could bring them closer to the POP routers.

Monitoring, still TBD

  • Both Routinator and OctoRPKI provide a Prometheus endpoint.

To be researched: it would be useful to get alerts if:

  • A router can't reach its configured Validators
  • The data provided to the routers gets stale

eg. Juniper doesn't seem to implement this MIB
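The two alert conditions listed above could be evaluated along these lines. This is only a sketch: the input dictionaries, the function names and the 30 minute threshold are hypothetical, and how the timestamps and session states would actually be collected (Prometheus, SNMP, CLI scraping) is still to be researched.

```python
def stale_validators(last_update, now, max_age=1800):
    """Return validators whose last successful update is older than max_age seconds.

    last_update maps a validator name to a Unix timestamp (hypothetical input).
    """
    return sorted(name for name, ts in last_update.items() if now - ts > max_age)


def routers_without_validator(sessions):
    """Return routers that have no validator session in the 'up' state.

    sessions maps a router name to {validator name: state} (hypothetical input).
    """
    return sorted(
        router
        for router, states in sessions.items()
        if not any(state == "up" for state in states.values())
    )


# Hypothetical sample data.
now = 1_000_000
print(stale_validators({"validator-a": now - 60, "validator-b": now - 7200}, now))
print(routers_without_validator({
    "router-1": {"validator-a": "up", "validator-b": "down"},
    "router-2": {"validator-a": "down", "validator-b": "down"},
}))
```

Alerting only when a router has lost all of its validators keeps a single validator outage from paging, since routers support multiple validators.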

Enforcement
The big question is: what to do once the routers know the validation status of a {ASN,prefix} pair?

First (easy) step is to change the BGP local pref, so for two identical prefixes, our router will prefer to use the one coming from a valid AS# (or avoid the invalid one).
This would not protect against a more specific prefix originating from an invalid ASN.
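The limitation above comes from the order of decisions: forwarding uses the longest matching prefix first, and local pref only breaks ties between routes for the same prefix. A small sketch, with made-up routes and next-hop labels, assuming this simplified two-step selection:

```python
from ipaddress import ip_address, ip_network

# Hypothetical routes: (prefix, local_pref, validation_state, next_hop)
routes = [
    ("203.0.113.0/24", 200, "valid", "legit-peer"),
    ("203.0.113.128/25", 50, "invalid", "hijacker"),
]


def forwarding_next_hop(dst, routes):
    """Pick the next hop for dst: longest prefix first, then highest local pref."""
    candidates = [r for r in routes if ip_address(dst) in ip_network(r[0])]
    if not candidates:
        return None
    best = max(candidates, key=lambda r: (ip_network(r[0]).prefixlen, r[1]))
    return best[3]


print(forwarding_next_hop("203.0.113.10", routes))   # only the /24 matches
print(forwarding_next_hop("203.0.113.200", routes))  # the /25 wins despite low local pref
```

Lowering the local pref of the invalid /25 changes nothing for addresses it covers, because no other route for that exact prefix competes with it. Only rejecting the invalid route removes it from the lookup.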

The current state of RPKI is that there are many "RPKI unreachable" subnets: subnets announced with an invalid prefix and not covered by any more or less specific prefix that is valid or unsigned. This means that rejecting invalid prefixes would make those subnets (with no overlap) unable to reach our network (more exactly, it would make them invisible to us).
To help decide whether the security benefit is worth discarding those prefixes, I started talking to Analytics and opened T220639 to be able to cross-reference a list of RPKI unreachable prefixes (how to get it is still TBD) with our webrequests.
Similarly, once we have infrastructure-wide Netflow, a recent version of pmacct makes it possible to do the same (at lower layers).
Enforcing RPKI on peering links is also good low-hanging fruit, as any unreachable prefix would still route over transit links.
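One possible way to derive the "RPKI unreachable" address space from a routing table dump is sketched below: take an invalid prefix and subtract everything covered by a non-invalid (valid or unsigned) route. The function name and the inputs are assumptions for illustration; the actual source of the list is still TBD.

```python
from ipaddress import ip_network


def unreachable_subnets(invalid_prefix, other_prefixes):
    """Return the parts of invalid_prefix not covered by any non-invalid route.

    other_prefixes are the valid or unsigned prefixes seen in the table.
    """
    invalid = ip_network(invalid_prefix)
    others = [ip_network(p) for p in other_prefixes]
    others = [o for o in others if o.version == invalid.version]
    # A less specific (or equal) non-invalid route covers everything.
    if any(invalid.subnet_of(o) for o in others):
        return []
    remaining = [invalid]
    for o in others:
        next_remaining = []
        for r in remaining:
            if o.subnet_of(r):
                # Carve the covered more specific out of this block.
                next_remaining.extend(r.address_exclude(o))
            elif not r.subnet_of(o):
                next_remaining.append(r)
        remaining = next_remaining
    return [str(n) for n in sorted(remaining)]


print(unreachable_subnets("192.0.2.0/24", ["192.0.0.0/16"]))  # fully covered
print(unreachable_subnets("192.0.2.0/24", ["192.0.2.0/25"]))  # half covered
print(unreachable_subnets("192.0.2.0/24", []))                # nothing covers it
```

The resulting subnets are what would need to be cross-referenced with the webrequest data.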

Event Timeline

ayounsi triaged this task as Normal priority. Apr 11 2019, 12:22 AM
ayounsi created this task.
Restricted Application added a project: Operations. Apr 11 2019, 12:22 AM
Restricted Application added a subscriber: Aklapper.
jbond added a subscriber: jbond. May 13 2019, 4:38 PM

Change 508928 had a related patch set uploaded (by Ayounsi; owner: Ayounsi):
[operations/puppet@production] Puppet, add RPKI validation daemon

https://gerrit.wikimedia.org/r/508928

Change 508956 had a related patch set uploaded (by Ayounsi; owner: Ayounsi):
[operations/puppet@production] Prometheus, add Routinator endpoint

https://gerrit.wikimedia.org/r/508956

jbond added a comment. May 23 2019, 3:42 PM

Just watching a RIPE presentation and thought this may be of interest: https://ripe78.ripe.net/archives/video/106

ayounsi updated the task description. May 24 2019, 1:00 AM

Mentioned in SAL (#wikimedia-operations) [2019-05-24T16:34:48Z] <XioNoX> add routinator package to reprepro/APT - T220669

Change 512405 had a related patch set uploaded (by Ayounsi; owner: Ayounsi):
[operations/dns@master] Add rpki1001/2001 to DNS

https://gerrit.wikimedia.org/r/512405

Change 512405 merged by Ayounsi:
[operations/dns@master] Add rpki1001/2001 to DNS

https://gerrit.wikimedia.org/r/512405

Change 512411 had a related patch set uploaded (by Ayounsi; owner: Ayounsi):
[operations/puppet@production] Add cumin alias for rpki hosts

https://gerrit.wikimedia.org/r/512411

Change 512419 had a related patch set uploaded (by Ayounsi; owner: Ayounsi):
[operations/puppet@production] Add rpki1001/2001 to DHCP

https://gerrit.wikimedia.org/r/512419

Change 512419 merged by Ayounsi:
[operations/puppet@production] Add rpki1001/2001 to DHCP

https://gerrit.wikimedia.org/r/512419

Change 512421 had a related patch set uploaded (by Ayounsi; owner: Ayounsi):
[operations/puppet@production] Add netboot.cfg config for rpki1001/2001

https://gerrit.wikimedia.org/r/512421

Change 512421 merged by Ayounsi:
[operations/puppet@production] Add netboot.cfg config for rpki1001/2001

https://gerrit.wikimedia.org/r/512421

Change 508928 merged by Ayounsi:
[operations/puppet@production] Puppet, add RPKI validation daemon

https://gerrit.wikimedia.org/r/508928

Change 508956 merged by Ayounsi:
[operations/puppet@production] Prometheus, add Routinator endpoint

https://gerrit.wikimedia.org/r/508956

Change 512411 merged by Ayounsi:
[operations/puppet@production] Add cumin alias for rpki hosts

https://gerrit.wikimedia.org/r/512411

ayounsi added a comment. Edited May 29 2019, 7:11 PM

Next step is to configure the RPKI validators on one router (eg. cr4-ulsfo):

[edit routing-options]
+   validation {
+       group rpki {
+           session 2620:0:861:103:10:64:32:19 {
+               port 3323;
+           }
+           session 2620:0:860:101:10:192:0:103 {
+               port 3323;
+           }
+       }
+   }

Note that this will not cause any routing changes.

Mentioned in SAL (#wikimedia-operations) [2019-05-30T21:07:21Z] <XioNoX> add RPKI sessions on cr4-ulsfo - T220669

Change 514088 had a related patch set uploaded (by Ayounsi; owner: Ayounsi):
[operations/puppet@production] Routinator, change command line args for 0.4.0

https://gerrit.wikimedia.org/r/514088

Mentioned in SAL (#wikimedia-operations) [2019-06-03T18:17:01Z] <XioNoX> add routinator 0.4.0 to APT repo - T220669

Change 514088 merged by Ayounsi:
[operations/puppet@production] Routinator, change command line args for 0.4.0

https://gerrit.wikimedia.org/r/514088

Mentioned in SAL (#wikimedia-operations) [2019-06-03T18:47:59Z] <XioNoX> Add RPKI validators to all routers - T220669

Below is my proposal to add validation to our config. There are many possible ways of doing it, so feedback from @faidon or @mark is welcome!

  • The RPKI BGP communities are non-transitive so community delete RPKI:ALL should not be needed, but adding it just in case.
  • from community is an OR, not an AND. Otherwise RPKI_INVALID AND PEERING_ROUTE would have made the config cleaner
  • The day we drop RPKI invalids on all links, we can move everything into BGP_sanitize_in and make the code cleaner
  • Still doesn't reject anything, but gives us visibility on what would happen if we did
  • Focus on peering, as dropping invalids there is non-impactful ("unreachable" prefixes will route through transits)
[edit policy-options policy-statement BGP_IXP_in]
     term avoid-paths { ... }
+    term rpki-invalid {
+        from validation-database invalid;
+        then {
+            validation-state invalid;
+            community add RPKI_INVALID;
+            next policy;
+        }
+    }
[edit policy-options policy-statement BGP_community_actions]
     term graceful-shutdown { ... 
     /* To be changed to a reject later on */
+    term rpki-invalid {
+        from community RPKI_INVALID;
+        then next policy;
+    }
[edit policy-options policy-statement BGP_sanitize_in then]
      community delete AS14907:ALL { ... }
+     community delete RPKI:ALL;
[edit policy-options]
+   /* RFC8097 */                       
+   community RPKI:ALL members "^0x4300:[0-9]+:[0-9]+$";
+   community RPKI_INVALID members 0x4300:0:2;

New proposal after talking to @faidon. The one above focuses on dropping invalids. The one below adds visibility on invalids (until we drop them) as well as on unknowns and valids.

First push the following to cr4-ulsfo:

[edit policy-options policy-statement BGP_sanitize_in then]
      community delete AS14907:ALL { ... }
+     community delete RPKI:ALL;
[edit policy-options policy-statement BGP_IXP_in]
     term avoid-paths { ... }
+    term rpki-classification {
+        from policy BGP_rpki;
+    }

[edit policy-options]
+   policy-statement BGP_rpki {
+   	term valid {
+   		from {
+   			protocol bgp;
+   			validation-database valid;
+   		}
+   		then {
+   			validation-state valid;
+   			community add RPKI:VALID;
+   		}
+   	}
+   	term invalid {
+   		from {
+   			protocol bgp;
+   			validation-database invalid;
+   		}
+   		then {
+   			validation-state invalid;
+   			community add RPKI:INVALID;
+   		}
+   	}
+   	term unknown {
+   		from {
+   			protocol bgp;
+   			validation-database unknown;
+   		}
+   		then {
+   			validation-state unknown;
+   			community add RPKI:UNKNOWN;
+   		}
+   	}
+   }
[edit policy-options]
+   /* RFC8097 */                       
+   community RPKI:ALL members "^0x4300:[0-9]+:[0-9]+$";
+   community RPKI:VALID members 0x4300:0:0;
+   community RPKI:UNKNOWN members 0x4300:0:1;
+   community RPKI:INVALID members 0x4300:0:2;

This is to see whether classification works. It is applied to IXP sessions only, so if for some reason it starts dropping prefixes it will not impact traffic.

Then push the same to cr3-ulsfo, plus an iBGP policy for the prefixes learned from cr4:

[edit policy-options]
+   policy-statement iBGP_rpki {
+       term valid {
+           from community RPKI:VALID;
+           then validation-state valid;
+       }
+       term invalid {
+           from community RPKI:INVALID;
+           then validation-state invalid;
+       }
+       term unknown {
+           from community RPKI:UNKNOW;
+           then validation-state unknown;
+       }
+   }
[edit protocols bgp group Confed_ulsfo]
+    import iBGP_rpki;

Then apply the above iBGP_rpki to cr4-ulsfo, as well as BGP_rpki to all ulsfo transit/peering. (Making cr3/cr4 similar).

[edit policy-options policy-statement BGP_transit_in]
     term avoid-paths { ... }
+    term rpki-classification {
+        from policy BGP_rpki;
+    }

[edit policy-options policy-statement BGP_Private_Peer_in]
     term avoid-paths { ... }
+    term rpki-classification {
+        from policy BGP_rpki;
+    }

If all is good, roll it out to all sites.

eqiad/codfw exchange their full views, so iBGP_rpki needs to be added to Confed_eqiad/Confed_codfw
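The interplay between BGP_rpki and iBGP_rpki can be modelled as a round trip: the receiving router classifies the route and tags it with a community, and the iBGP neighbour rebuilds the validation state from that community. The sketch below is a toy model in Python, under the assumption that the validation state itself is local to a router and only the community travels over iBGP; the dictionaries and function names are illustrative.

```python
STATE_TO_COMMUNITY = {
    "valid": "RPKI:VALID",
    "invalid": "RPKI:INVALID",
    "unknown": "RPKI:UNKNOWN",
}
COMMUNITY_TO_STATE = {c: s for s, c in STATE_TO_COMMUNITY.items()}


def bgp_rpki(route, database_state):
    """eBGP import: set the validation state and add the matching community."""
    route = dict(route)
    route["validation_state"] = database_state
    route["communities"] = set(route.get("communities", ())) | {
        STATE_TO_COMMUNITY[database_state]
    }
    return route


def send_over_ibgp(route):
    """Only BGP attributes (here: communities) cross the iBGP session."""
    return {"prefix": route["prefix"], "communities": set(route["communities"])}


def ibgp_rpki(route):
    """iBGP import: rebuild the validation state from the community."""
    route = dict(route)
    for community, state in COMMUNITY_TO_STATE.items():
        if community in route["communities"]:
            route["validation_state"] = state
            break
    return route


on_cr4 = bgp_rpki({"prefix": "192.0.2.0/24"}, "invalid")
on_cr3 = ibgp_rpki(send_over_ibgp(on_cr4))
print(on_cr3["validation_state"])
```

A typo in any of the community names on the iBGP side would silently leave the corresponding routes without a validation state, so the names must match exactly between the two policies.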

Once we're ready to drop invalids on peering links:

[edit policy-options policy-statement BGP_IXP_in]
     term rpki-classification { ... }
+    /* To be moved to BGP_community_actions */
+    term rpki-invalids {
+        from community RPKI:INVALID;
+        then reject;
+    }

Once we're ready to drop invalids everywhere, move the above term to the BGP_community_actions policy.
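The difference between the two stages can be illustrated with a toy model of the import step. The route dictionaries and the `import_routes` helper are hypothetical; this only shows why rejecting on peering alone keeps a prefix reachable through transit, while rejecting everywhere removes it entirely.

```python
def import_routes(received, reject_invalid_on):
    """Keep every route except invalids learned on the listed session types."""
    return [
        r
        for r in received
        if not (r["state"] == "invalid" and r["session"] in reject_invalid_on)
    ]


# Hypothetical: the same invalid prefix learned over peering and over transit.
received = [
    {"prefix": "192.0.2.0/24", "state": "invalid", "session": "peering"},
    {"prefix": "192.0.2.0/24", "state": "invalid", "session": "transit"},
    {"prefix": "198.51.100.0/24", "state": "unknown", "session": "peering"},
]

peering_only = import_routes(received, reject_invalid_on={"peering"})
everywhere = import_routes(received, reject_invalid_on={"peering", "transit"})
print([(r["prefix"], r["session"]) for r in peering_only])
print([(r["prefix"], r["session"]) for r in everywhere])
```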

One of the blockers to rejecting invalids on transit is visibility on short-lived events.
For example, a network that has an incorrect ROA for a short amount of time, which would not show up in our current analytics.

To be researched: I'm not sure yet if bringing the validator closer to the routers (eg. in POPs) brings significant improvements. If so and once T96852 is solved, we could bring them closer to the POP routers.

I do not believe there is much value in bringing the validators closer to a POP. The RTR sessions that run between validators and edge routers are not latency sensitive. The mechanism is a "push" rather than a "pull" concept: when the validator observes changes in the list of VRPs, it sends a delta to the routers. The routers don't query the validator for each BGP UPDATE they receive.
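A rough model of that behaviour: the router keeps a local copy of the VRP set, applies deltas as they arrive, and validates every BGP UPDATE against the local copy. The sketch below is a simplification with invented names and a made-up delta format; in the real protocol (RFC 8210) the validator sends a notification and the router then requests the changes since its current serial.

```python
def apply_delta(vrps, serial, delta):
    """Apply an incremental update to the local VRP set.

    vrps is a set of (prefix, max_length, asn) tuples; delta is a hypothetical
    dict with from_serial, to_serial and a list of (flag, vrp) changes.
    """
    if delta["from_serial"] != serial:
        raise ValueError("serial mismatch: a full reset would be needed")
    updated = set(vrps)
    for flag, vrp in delta["changes"]:
        if flag == "announce":
            updated.add(vrp)
        else:  # "withdraw"
            updated.discard(vrp)
    return updated, delta["to_serial"]


local_vrps = {("192.0.2.0/24", 24, 64496)}
local_serial = 41
delta = {
    "from_serial": 41,
    "to_serial": 42,
    "changes": [
        ("announce", ("198.51.100.0/24", 24, 64497)),
        ("withdraw", ("192.0.2.0/24", 24, 64496)),
    ],
}
local_vrps, local_serial = apply_delta(local_vrps, local_serial, delta)
print(local_serial, sorted(local_vrps))
```

Since lookups never leave the router, the round-trip time to the validator only affects how quickly a ROA change propagates, not per-route processing.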

JobSnijders added a comment. Edited Jun 25 2019, 3:43 PM

Once we're ready to drop invalids everywhere, move the above term to the BGP_community_actions policy.

I've reviewed the proposed configuration and this looks good to me. This is exactly the same as has been deployed in a number of Dutch networks that enabled RPKI OV.

One of the blockers to rejecting invalids on transit is visibility on short-lived events.
For example, a network that has an incorrect ROA for a short amount of time, which would not show up in our current analytics.

This is a matter of weighing risks... Is being graceful towards the operator who self-inflicted a short-lived event worth potentially sacrificing the reachability of other people who correctly published ROAs? Or perhaps I misunderstand your comment; please help me understand if so.

All in all, the goal should be to apply an invalid == reject policy on each and every EBGP session, including transit. Also note that I suspect it is largely irrelevant whether Wikimedia points default in some direction or not; the purpose of this exercise is to reject BGP announcements, not necessarily to also block the data-plane path.

Thanks @JobSnijders, appreciate the feedback very much :) Our goal is to reject all invalids everywhere indeed, just progressively so.

Separate validator instances per PoP would be ideal I think, but more so for redundancy, locality and making the PoPs more self-sufficient and not depending on a working backbone or DC network for site-local routing decisions than anything else.

@ayounsi, thanks for the detailed proposal! Looks good to me too overall! Only a few minor comments:

  • There is a typo (RPKI:UNKNOW instead of RPKI:UNKNOWN)
  • I'd say to deploy the two policies to all routers, even if unused (because e.g. they're not peering routers) - after initial testing that is.
  • I don't think we've tried including a policy from another policy before -- I don't remember what would actually happen in the absence of a "then" statement while doing so. I suspect nothing, but be careful while deploying this if you haven't tested it! Maybe deploy it on a few peers before expanding to all of the IXP peers @ ulsfo as a precautionary step?

Mentioned in SAL (#wikimedia-operations) [2019-06-26T13:48:53Z] <XioNoX> push RPKI classification test to cr4-ulsfo - T220669

I'd say to deploy the two policies to all routers, even if unused (because e.g. they're not peering routers) - after initial testing that is.

Yup, that's the plan, to have all routers similar.

Maybe deploy it on a few peers before expanding to all of IXP peers @ ulsfo as a precautionary step?

That's why I wanted to do the classification on peering links first, so even if we drop prefixes due to a policy misconfiguration, we would not impact traffic.
To be extra safe I temporarily duplicated BGP_IXP_in so we can apply classification on a few test peers.

This didn't go far.

[edit policy-options community RPKI:INVALID]
  'members 0x4300:0:2'
    invalid autonomous system value at '0' not in range 1 to 65535. Use '0L' Long format to specify 4 byte AS
error: configuration check-out failed

I then tried to replicate Juniper's example, with the same issue.

ayounsi@cr4-ulsfo# set policy-options community origin-validation-state-invalid members 0x4300:0:2 

[edit]
ayounsi@cr4-ulsfo# commit check 
[edit policy-options community origin-validation-state-invalid]
  'members 0x4300:0:2'
    invalid autonomous system value at '0' not in range 1 to 65535. Use '0L' Long format to specify 4 byte AS
error: configuration check-out failed

I'm assuming it's a Juniper bug, as both the Juniper documentation and the RFC say it should be possible. Opened SR2019-0626-0457.

In the meantime we can use any arbitrary value instead of the :0:, for example 0x4300:14907:2 as long as we're consistent.

JobSnijders added a comment. Edited Jun 26 2019, 3:41 PM

Try the following, instead of 0, use 0.0.0.0

set policy-options community origin-validation-state-invalid members 0x4300:0.0.0.0:2

This is a documentation bug on juniper's website. It has been reported to them already, they'll fix it soon.

Thanks @JobSnijders for bringing this to my attention.
I've raised this with the documentation team and also added a comment to the SR.
Should be fixed soon.

Indeed this should be configured as:
set policy-options community origin-validation-state-invalid members 0x4300:0.0.0.0:2
set policy-options community origin-validation-state-unknown members 0x4300:0.0.0.0:1
set policy-options community origin-validation-state-valid members 0x4300:0.0.0.0:0

Thanks for the quick replies, it passes a commit check, will push the following shortly.

[edit policy-options policy-statement BGP_sanitize_in then]
      community delete AS14907:ALL { ... }
+     community delete RPKI:ALL;
[edit policy-options]
+   policy-statement BGP_IXP_in_test {
+       term filter {
+           from as-path-group AS-PATH-FILTER;
+           then reject;
+       }
+       term avoid-paths {
+           from as-path-group AVOID-PATHS;
+           then {
+               community add AVOIDED_PATH;
+           }
+       }
+       term rpki-classification {
+           from policy BGP_rpki;
+       }
+       then {
+           community add PEERING_ROUTE;
+           next policy;
+       }
+   }
[edit policy-options]
+   policy-statement BGP_rpki {
+   	term valid {
+   		from {
+   			protocol bgp;
+   			validation-database valid;
+   		}
+   		then {
+   			validation-state valid;
+   			community add RPKI:VALID;
+   		}
+   	}
+   	term invalid {
+   		from {
+   			protocol bgp;
+   			validation-database invalid;
+   		}
+   		then {
+   			validation-state invalid;
+   			community add RPKI:INVALID;
+   		}
+   	}
+   	term unknown {
+   		from {
+   			protocol bgp;
+   			validation-database unknown;
+   		}
+   		then {
+   			validation-state unknown;
+   			community add RPKI:UNKNOWN;
+   		}
+   	}
+   }
[edit policy-options]
+   /* RFC8097 */                       
+   community RPKI:ALL members "^0x4300:0.0.0.0:[0-9]+$";
+   community RPKI:VALID members 0x4300:0.0.0.0:0;
+   community RPKI:UNKNOWN members 0x4300:0.0.0.0:1;
+   community RPKI:INVALID members 0x4300:0.0.0.0:2;

As a side note, I was not able to escape the dots in the RPKI:ALL regex:

ayounsi@cr4-ulsfo# commit check 
[edit policy-options community RPKI:ALL]
  'members "^0x4300:0\.0\.0\.0:[0-9]+$"'
    Unknown extended community type

The unescaped 0.0.0.0 is easier to read, even though it is not fully correct (an unescaped dot matches any character).
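To show what "not fully correct" means in practice, here is the same pattern tried with Python's `re` module. Junos uses its own regex engine, so this only illustrates the general meaning of an unescaped dot; the second test string is a made-up value that is not a real community.

```python
import re

loose = re.compile(r"^0x4300:0.0.0.0:[0-9]+$")       # dots match any character
strict = re.compile(r"^0x4300:0\.0\.0\.0:[0-9]+$")   # dots match only a literal "."

for candidate in ("0x4300:0.0.0.0:2", "0x4300:0a0b0c0:2"):
    print(candidate, bool(loose.match(candidate)), bool(strict.match(candidate)))
```

The loose pattern accepts both strings while the strict one accepts only the first. In this context the extra matches are harmless as long as nothing else uses the 0x4300 type with a different middle field.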

And then:

set protocols bgp group IX4 neighbor <???> import BGP_sanitize_in
set protocols bgp group IX4 neighbor <???> import BGP_IXP_in_test
set protocols bgp group IX4 neighbor <???> import BGP_community_actions

On some test neighbors.
If all is good, remove BGP_IXP_in_test and move term rpki-classification to BGP_IXP_in to continue with the plan outlined in T220669#5282795

These are non-transitive extended communities. They can not cross an EBGP boundary, the deletion in policy-statement BGP_sanitize_in is perhaps superfluous.
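For reference, a sketch of how the RFC 8097 origin validation state extended community is laid out on the wire: type 0x43, sub-type 0x00, five reserved octets and a final octet carrying the state. The non-transitive property comes from bit 0x40 of the type octet. The helper names are made up for illustration, and the state labels follow the policy names used in this task (RFC 8097 calls state 1 "not found").

```python
import struct

STATES = {"valid": 0, "unknown": 1, "invalid": 2}
NAMES = {value: name for name, value in STATES.items()}


def encode_ov_community(state):
    """Build the 8-octet extended community for a validation state."""
    # "5x" emits the five reserved octets as zeroes.
    return struct.pack("!BB5xB", 0x43, 0x00, STATES[state])


def decode_ov_community(raw):
    """Return the validation state name carried by an 8-octet community."""
    type_octet, subtype, value = struct.unpack("!BB5xB", raw)
    if (type_octet, subtype) != (0x43, 0x00):
        raise ValueError("not an origin validation state community")
    return NAMES[value]


def is_non_transitive(raw):
    """The T bit (0x40) of the type octet marks a non-transitive community."""
    return bool(raw[0] & 0x40)


raw = encode_ov_community("invalid")
print(raw.hex(), decode_ov_community(raw), is_non_transitive(raw))
```

In the Junos notation 0x4300:0.0.0.0:2, the first field is the type and sub-type, and the last field holds the state.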

Mentioned in SAL (#wikimedia-operations) [2019-06-27T13:26:44Z] <XioNoX> push RPKI classification test to cr4-ulsfo - T220669

Mentioned in SAL (#wikimedia-operations) [2019-06-27T13:43:05Z] <XioNoX> push RPKI classification test to cr3-ulsfo - T220669

Our whole San Francisco POP now has a validation-state on its received prefixes. Next step is to push it to all the sites.

Mentioned in SAL (#wikimedia-operations) [2019-06-27T14:13:52Z] <XioNoX> push RPKI classification test to eqord - T220669

Mentioned in SAL (#wikimedia-operations) [2019-06-27T14:28:17Z] <XioNoX> push RPKI classification to Dallas - T220669

Classification successfully deployed in ulsfo/codfw/eqdfw/eqord (half-ish of our POPs), will push to the other sites early next week.

Then start dropping invalids on IXPs to see the effect it has in terms of traffic shift/latency.

The latest analytics stats show that 0.00069% of requests to our text cluster (the main cluster behind Wikipedia) come from RPKI unreachable IPs (not covered by more/less specifics).

Mentioned in SAL (#wikimedia-operations) [2019-07-02T12:51:32Z] <XioNoX> push RPKI classification to AMS - T220669

Mentioned in SAL (#wikimedia-operations) [2019-07-02T13:09:45Z] <XioNoX> push RPKI classification to eqsin - T220669

Mentioned in SAL (#wikimedia-operations) [2019-07-02T13:13:22Z] <XioNoX> push RPKI classification to eqiad - T220669

Mentioned in SAL (#wikimedia-operations) [2019-07-09T14:42:55Z] <XioNoX> reject RPKI invalids on ulsfo peering link - T220669

Confirmed with a test peer that was sending us RPKI invalids and unknowns: we now only receive the unknowns. And a given invalid prefix that we used to receive via peering is now going through transit.

Mentioned in SAL (#wikimedia-operations) [2019-07-09T15:13:12Z] <XioNoX> reject RPKI invalids on Dallas peering link - T220669

Mentioned in SAL (#wikimedia-operations) [2019-07-09T15:20:45Z] <XioNoX> reject RPKI invalids on Singapore peering link - T220669

Mentioned in SAL (#wikimedia-operations) [2019-07-09T15:28:22Z] <XioNoX> reject RPKI invalids on Chicago peering link - T220669

Mentioned in SAL (#wikimedia-operations) [2019-07-09T15:38:02Z] <XioNoX> reject RPKI invalids on Amsterdam peering link - T220669

Mentioned in SAL (#wikimedia-operations) [2019-07-09T15:44:16Z] <XioNoX> reject RPKI invalids on Ashburn peering links - T220669

Thanks!

We now reject RPKI invalids on all our private/public peering sessions.

Next step is to review/merge https://gerrit.wikimedia.org/r/c/520337/ so we have almost real-time visibility on which % of our traffic comes from RPKI invalid/unreachable prefixes.

Thanks @JobSnijders for bringing this to my attention.
I've raised this with the documentation team and also added a comment to the SR.
Should be fixed soon.
Indeed this should be configured as:
set policy-options community origin-validation-state-invalid members 0x4300:0.0.0.0:2
set policy-options community origin-validation-state-unknown members 0x4300:0.0.0.0:1
set policy-options community origin-validation-state-valid members 0x4300:0.0.0.0:0

This has been fixed now in https://www.juniper.net/documentation/en_US/junos/topics/topic-map/bgp-origin-as-validation.html
Please let me know if you spot any other issues.

Thanks again!
Cheers,
Melchior

This has been fixed now in https://www.juniper.net/documentation/en_US/junos/topics/topic-map/bgp-origin-as-validation.html

Good news, thanks.

Next step is to review/merge https://gerrit.wikimedia.org/r/c/520337/ so we have almost real-time visibility on which % of our traffic comes from RPKI invalid/unreachable prefixes.

This is now done and displayed on https://grafana.wikimedia.org/d/UwUa77GZk/rpki

Next step is to decide on rejecting invalids on transit links.

Mentioned in SAL (#wikimedia-operations) [2019-07-18T16:06:33Z] <XioNoX> upgrade Routinator to 0.5.0 in codfw - T220669

Mentioned in SAL (#wikimedia-operations) [2019-07-18T16:29:34Z] <XioNoX> upgrade Routinator to 0.5.0 in eqiad - T220669

Change 524277 had a related patch set uploaded (by Ayounsi; owner: Ayounsi):
[operations/puppet@production] Make Icinga alert on Grafana RPKI alerts

https://gerrit.wikimedia.org/r/524277

Change 524277 merged by Ayounsi:
[operations/puppet@production] Make Icinga alert on Grafana RPKI alerts

https://gerrit.wikimedia.org/r/524277

Change 525204 had a related patch set uploaded (by Ayounsi; owner: Ayounsi):
[operations/puppet@production] Routinator set refresh to 10min (instead of 1h)

https://gerrit.wikimedia.org/r/525204

Change 525204 merged by Ayounsi:
[operations/puppet@production] Routinator set refresh to 10min (instead of 1h)

https://gerrit.wikimedia.org/r/525204