Detect IP address collisions
Closed, Resolved (Public)

Description

Over at T188045 we had another case of an IP address collision. We, unfortunately, have had those before (thankfully rarely), and we've talked about making sure this kind of thing doesn't happen, with network setup and server provisioning automation.

Separately from that though, it'd be great to also have something that would alert us on IP address collisions, either accidental ones, or potentially even malicious ones.

That wdqs1004 task above has been open since February 20th, and it's a pretty sad thing to have multiple engineers waste time on; let's work on at least alerting us when such a thing happens, and then later on making sure it won't happen again.

Event Timeline

faidon triaged this task as High priority. Mar 12 2018, 6:40 PM
faidon created this task.

Change 435797 had a related patch set uploaded (by Ayounsi; owner: Ayounsi):
[operations/puppet@production] Use arping to detect duplicated IPs

https://gerrit.wikimedia.org/r/435797

ayounsi subscribed.

Note that I also created a LibreNMS alert to explicitly monitor the mgmt network:
%syslog.msg ~ "KERN_ARP_ADDR_CHANGE" && %devices.hostname ~ "mr" && %devices.hardware ~ "SRX" && %syslog.timestamp >= %macros.past_30m
It covers mgmt only, as "KERN_ARP_ADDR_CHANGE" is only for SRX.

The Gerrit patch above is waiting for review, so I'm reassigning this to Faidon.
The approach has the advantage of being simple and easy to deploy.
Its limitations are:

  • increased traffic, both ARP and Icinga
  • a race condition in which the rogue server takes over the IP and prevents the Icinga check from running properly on the host

The counterargument is that the ping check would still succeed, but other checks, including the duplicated MAC one, would fail because the Icinga/NRPE TCP session would break, indicating a possible IP conflict.
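For illustration, the core of an arping-based check can be sketched as below: parse the replies to an ARP probe and flag the address when more than one MAC answered. This is a minimal sketch and not the content of the abandoned patch. The function names, the sample addresses and the interface name are hypothetical, and the reply format assumed is the one printed by iputils arping ("Unicast reply from <ip> [<mac>] <time>").

```python
import re

# Assumed reply line format (iputils arping):
#   Unicast reply from 10.0.0.5 [AA:BB:CC:DD:EE:01]  0.652ms
REPLY_RE = re.compile(
    r"reply from (?P<ip>\d{1,3}(?:\.\d{1,3}){3}) \[(?P<mac>[0-9A-Fa-f:]{17})\]"
)


def macs_answering(arping_output, ip):
    """Return the set of distinct (lowercased) MACs that replied for `ip`."""
    macs = set()
    for line in arping_output.splitlines():
        match = REPLY_RE.search(line)
        if match and match.group("ip") == ip:
            macs.add(match.group("mac").lower())
    return macs


def is_duplicated(arping_output, ip):
    """An IP is considered duplicated when more than one MAC answers for it."""
    return len(macs_answering(arping_output, ip)) > 1


# Hypothetical captured output, used only to exercise the parser.
SAMPLE = """\
ARPING 10.0.0.5 from 10.0.0.1 eth0
Unicast reply from 10.0.0.5 [AA:BB:CC:DD:EE:01]  0.652ms
Unicast reply from 10.0.0.5 [AA:BB:CC:DD:EE:02]  0.801ms
Sent 1 probes (1 broadcast(s))
"""

if __name__ == "__main__":
    print(is_duplicated(SAMPLE, "10.0.0.5"))
```

A check built this way still runs from the monitored host itself, so it inherits the race condition listed above.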

An ideal system, in my mind, would listen on all VLANs, keep track of IP/MAC bindings, and alert on collisions, but it would be heavy to build and to trunk to all VLANs.
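The bookkeeping part of such an observer is small; the heavy part is getting the traffic to it. A rough sketch of the binding tracker follows, assuming something else (not shown) feeds it the (IP, MAC) pairs seen in ARP traffic. The class and method names are hypothetical.

```python
from collections import defaultdict


class BindingTracker:
    """Remember which MACs have claimed each IP and report conflicts."""

    def __init__(self):
        # ip -> set of lowercased MACs seen claiming that ip
        self._macs_by_ip = defaultdict(set)

    def observe(self, ip, mac):
        """Record an (ip, mac) sighting.

        Returns the set of *other* MACs previously seen for this IP,
        which is empty when there is no conflict.
        """
        mac = mac.lower()
        known = self._macs_by_ip[ip]
        others = known - {mac}
        known.add(mac)
        return others


if __name__ == "__main__":
    tracker = BindingTracker()
    tracker.observe("10.0.0.5", "aa:bb:cc:dd:ee:01")
    conflict = tracker.observe("10.0.0.5", "aa:bb:cc:dd:ee:02")
    if conflict:
        print("possible collision on 10.0.0.5 with", sorted(conflict))
```

A real observer would also need to expire old bindings, since legitimate MAC changes (hardware replacement, failover addresses) would otherwise be reported as collisions.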

On the enforcing side, a few solutions are (or will be) possible:

  • enable the built-in DHCP snooping feature of the switches, which enforces that servers connected to access ports only use the IP address assigned by DHCP (preventing any manually assigned IPs).
  • manually configure the allowed IP/MAC addresses on the switch (heavy to do by hand, easier with automation).

Change 435797 abandoned by Ayounsi:
Use arping to detect duplicated IPs

Reason:
Sitting for 2 years and not the best way to achieve this goal (e.g. vs. an external observer).

https://gerrit.wikimedia.org/r/c/operations/puppet/+/435797

ayounsi raised the priority of this task from High to Needs Triage. Aug 24 2022, 6:24 AM
ayounsi claimed this task.

We have a working solution for the mgmt network (until it's time to split mgmt into smaller subnets).
And for production, automation and per-rack subnets are/will be enough.