Our current LVS setup assumes that the NICs are configured correctly, to the point that the expected routing rules get created. If that fails for some reason (a down port, for example), traffic for that row reaches the realservers via the default route on the LVS box. This is undesirable because the PyBal health checks would still mark the affected servers as up and pool them, while IPVS traffic would not be able to reach those servers.
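A potential fix would be to inject static routes of type unreachable or blackhole for each row, with a lower priority (that is, a higher metric value) than the regular row-specific routes. When the regular route is missing, the fallback route would take over and prevent row-specific traffic from reaching the realservers through another row via the default route.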
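As a rough illustration of the idea, the following Python sketch builds the `ip route` commands for such fallback routes. The row names, prefixes, and metric value are hypothetical placeholders, not our real configuration, and the helper `fallback_route_cmd` is made up for this example. It only prints the commands and does not change any routing table.

```python
import ipaddress

# Assumption: any metric higher than the one used by the regular row routes
# works; 1024 is an arbitrary illustrative value.
FALLBACK_METRIC = 1024

# Hypothetical row prefixes, for illustration only.
ROW_PREFIXES = {
    "row-a": "10.0.0.0/22",
    "row-b": "10.0.4.0/22",
}


def fallback_route_cmd(prefix, route_type="unreachable", metric=FALLBACK_METRIC):
    """Return the argv for an `ip route add` fallback route covering prefix."""
    if route_type not in ("unreachable", "blackhole"):
        raise ValueError(f"unsupported route type: {route_type}")
    # ip_network() raises ValueError on malformed prefixes or set host bits.
    network = ipaddress.ip_network(prefix)
    return ["ip", "route", "add", route_type, str(network), "metric", str(metric)]


if __name__ == "__main__":
    for row, prefix in ROW_PREFIXES.items():
        # Print instead of executing, so the sketch is safe to run anywhere.
        print(f"# {row}")
        print(" ".join(fallback_route_cmd(prefix)))
```

Because the fallback route is more specific than the default route, it would match row traffic whenever the regular route is absent. With `unreachable` the sender gets an ICMP host unreachable error, whereas `blackhole` drops packets silently, so `unreachable` would likely make the health checks fail faster.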
Follow-up to https://wikitech.wikimedia.org/wiki/Incident_documentation/2021-07-16_asw-a2-codfw_network