We have observed a bug in Nokia SR-Linux that affects the DHCP relay function in some circumstances. Specifically, the switches do not seem to forward packets returned to them from our install/DHCP server. This behaviour has been observed with SR-Linux v24.10.x and v24.7.x, but inconsistently: the problem disappeared on lsw1-c3-eqiad and lsw1-d3-eqiad after they were downgraded to v24.7.2 and then upgraded to v24.10.4 again (whether the downgrade/upgrade cycle made a difference, or some other random factor "fixed" it, is unknown). The problem has only been observed on switches running EVPN/VXLAN, meaning the DHCP response packets arrive at the leaf switch VXLAN-encapsulated, having followed a type-5 EVPN route, and destined for the L3VNI/RMAC.
Detail
A network diagram we made for Nokia support can be seen here:
The basic symptom was that DHCP responses sent to the Nokia top-of-rack switch (acting as DHCP relay for hosts on attached VLANs) were received by the switch but not forwarded to the host that sent the initial DHCP request. We could verify they were received because the counters increased on the configured CPM ACL entry:
A:lsw1-d3-eqiad# info flat acl acl-filter cpm type ipv4 entry 215
set / acl acl-filter cpm type ipv4 entry 215 description allow_dhcp_reply4
set / acl acl-filter cpm type ipv4 entry 215 match ipv4 protocol 17
set / acl acl-filter cpm type ipv4 entry 215 match ipv4 source-ip prefix 208.80.152.0/22
set / acl acl-filter cpm type ipv4 entry 215 match transport destination-port range start 67
set / acl acl-filter cpm type ipv4 entry 215 match transport destination-port range end 68
set / acl acl-filter cpm type ipv4 entry 215 match transport source-port value 67
set / acl acl-filter cpm type ipv4 entry 215 action accept
A:lsw1-d3-eqiad# info from state acl acl-filter cpm type ipv4 entry 215 statistics
acl {
acl-filter cpm type ipv4 {
entry 215 {
statistics {
last-clear "2025-10-28T16:11:40.000Z (5 minutes ago)"
incomplete false
matched-packets 12
last-match "2025-10-28T16:16:48.000Z (17 seconds ago)"
}
}
}
}
However, when we looked at the statistics for the DHCP relay daemon on the device, they showed the same number of packets relayed to our install server, but zero received back from it:
A:lsw1-d3-eqiad# info from state interface irb0 subinterface 1079 ipv4 dhcp-relay statistics
interface irb0 {
subinterface 1079 {
ipv4 {
dhcp-relay {
statistics {
client-packets-received 12
client-packets-relayed 12
client-packets-discarded 0
server-packets-received 0
server-packets-relayed 0
server-packets-discarded 0
}
}
}
}
}
We were also able to mirror the spine uplink port and capture the packets arriving on the leaf.
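The counter comparison described above (CPM ACL matches increasing while the relay's server-side counters stay at zero) could be automated roughly as follows. This is only a sketch: it assumes the text of the two `info from state` commands has already been collected by some other means, and the function names `parse_counters` and `replies_lost_before_relay` are hypothetical helpers, not part of any Nokia tooling.

```python
import re

# Matches leaf lines such as "matched-packets 12" in `info from state` output.
# Lines like "entry 215 {" or "incomplete false" do not match.
COUNTER_RE = re.compile(r"^\s*([a-z][a-z0-9-]*)\s+(\d+)\s*$")


def parse_counters(cli_output: str) -> dict[str, int]:
    """Collect every 'name <integer>' leaf from a block of CLI output."""
    counters = {}
    for line in cli_output.splitlines():
        match = COUNTER_RE.match(line)
        if match:
            counters[match.group(1)] = int(match.group(2))
    return counters


def replies_lost_before_relay(acl_output: str, relay_output: str) -> bool:
    """True when replies matched the CPM ACL but the relay saw none of them."""
    acl = parse_counters(acl_output)
    relay = parse_counters(relay_output)
    return (
        relay.get("client-packets-relayed", 0) > 0
        and acl.get("matched-packets", 0) > 0
        and relay.get("server-packets-received", 0) == 0
    )


if __name__ == "__main__":
    # Counter values taken from the lsw1-d3-eqiad output shown earlier.
    acl_text = "statistics {\n    matched-packets 12\n}"
    relay_text = (
        "statistics {\n"
        "    client-packets-received 12\n"
        "    client-packets-relayed 12\n"
        "    server-packets-received 0\n"
        "}"
    )
    print(replies_lost_before_relay(acl_text, relay_text))
```

The check only makes sense if the ACL statistics were cleared shortly before the test, as they were in the output above. Otherwise older matches on the ACL entry would be counted against the current relay counters.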
Status
While working to determine which version of SR-Linux introduced the ARP bug we were also dealing with (see T409178), the two switches connecting our test hosts were downgraded through various versions of the OS. Following this, once back on v24.10.4, both lsw1-c3-eqiad and lsw1-d3-eqiad correctly relayed DHCP responses to hosts.
While the OS downgrades were focused on the ARP issue, one DHCP test was done on v24.7.2, and the problem still occurred on that version. The ARP bug has not been observed on v24.7.2, so the two problems are likely distinct.
The current status is that this issue remains with Nokia, but we are no longer observing it on the switches where we formerly had problems. Nokia assure us they are working on it. When we have a test Nokia switch connected, we can do more tests ourselves to try to reproduce it.