For peering planning purposes it'd be useful to include a few more dimensions in our netflow/pmacct/Druid pipeline. Specifically, in order of usefulness:
- BGP communities, so that we can build queries that answer the question "how much of the traffic for ASN X flows through transit". Communities are essentially a tag-based system (each route can have multiple tags applied to it) that we can control on the routers, so that will be quite powerful. This raises the question of how best to store them in Druid and query them with Turnilo. Druid's documentation mentions multi-value dimensions, which seem appropriate here, but I'm not sure whether or how this would work :)
- Region/site (eqiad, esams etc.): we currently have "exporter IP", which can be (ab)used for this purpose, but having the region/site is arguably more useful. If adding it to the pmacct pipeline is too much trouble, I wonder if we could use something like Druid's lookups? Perhaps too fragile and thus a terrible idea, though :)
- AS names, e.g. coming from the MaxMind GeoIP ASN database. I think we've used that database before e.g. in the webrequest Druid database. Could we perhaps use Druid lookups for this to avoid adding another (identical) dimension to the data set?
- Not sure if this is possible, but a dimension with the network prefix, in addition to the individual IP address, could be super useful as well.
- Address family (IPv4 or IPv6)
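
To make the request a bit more concrete, here is a rough sketch of what an enrichment step between pmacct and Druid might look like. Everything in it is an assumption for illustration: the field names (`ip_dst`, `mask_dst`, `comms`, `peer_ip_src`), the `_` separator for communities, the `enrich` helper, and the exporter-to-site table (which uses documentation-range addresses, not our real exporters). The idea is that communities are emitted as a list of strings, which, as far as I understand it, is the shape Druid treats as a multi-value dimension when ingesting JSON.

```python
import ipaddress

# Hypothetical exporter IP -> site mapping; placeholder addresses only.
EXPORTER_SITES = {"192.0.2.1": "eqiad", "192.0.2.2": "esams"}


def enrich(flow, comm_sep="_"):
    """Return a copy of a flow record with the extra dimensions added.

    Field names and the community separator are assumptions and would
    need to match whatever pmacct is actually configured to emit.
    """
    out = dict(flow)

    # Address family, derived from the destination address.
    addr = ipaddress.ip_address(flow["ip_dst"])
    out["ip_version"] = f"IPv{addr.version}"

    # Network prefix: strict=False masks off the host bits.
    net = ipaddress.ip_network(f"{flow['ip_dst']}/{flow['mask_dst']}", strict=False)
    out["net_dst"] = str(net)

    # Communities as a list of strings (candidate multi-value dimension).
    out["comms"] = [c for c in flow.get("comms", "").split(comm_sep) if c]

    # Region/site, looked up from the exporter IP.
    out["site"] = EXPORTER_SITES.get(flow.get("peer_ip_src"), "unknown")
    return out
```

For example, a flow to `198.51.100.77` with `mask_dst` 24 would get `net_dst` set to `198.51.100.0/24` and `ip_version` set to `IPv4`. The same site mapping could presumably live in a Druid lookup instead, if doing it at ingestion time turns out to be too much trouble.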