Right now we run LVS services for istio-ingressgateway with:
  monitors:
    IdleConnection:
      max-delay: 300
      timeout-clean-reconnect: 3
This has the downside that PyBal shows every node as down in any cluster where no ingress route/backend is configured, because in that case the ingressgateway's Envoy does not accept connections on its traffic port (tcp/30443).
~~In addition, this might not catch errors reported by ingressgateway via its internal health check (tcp/30021), although it is currently unclear whether there are errors that result in failing health checks while connections are still possible.~~
~~Ingressgateway only serves health checks on a port different from the traffic port (tcp/30021). To allow checking those as well, PyBal's ProxyFetch monitor would need to be extended to allow checking a different port. A proposed CR exists at https://gerrit.wikimedia.org/r/c/operations/debs/pybal/+/759749 ~~
The above will not help in this particular case.
Kubernetes internally health-checks each ingressgateway instance on the dedicated health check port (tcp/30021). If that check fails, Kubernetes no longer sends traffic to that instance. In our setup (one ingresscontroller per node), this means connections to the ingressgateway traffic port (tcp/30443) on that node will be dropped. Because of that, plain TCP connection monitoring seems to be sufficient (for PyBal as well as for monitoring/probes).
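
To illustrate what "just TCP connection monitoring" amounts to, here is a minimal sketch of a connect-only probe in Python. It is not PyBal's IdleConnection implementation; the function name `tcp_port_open` and the default timeout are assumptions made for this example. It only shows that a failed connect to the traffic port is enough to mark a node as down.

```python
import socket


def tcp_port_open(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port can be established.

    Hypothetical helper for illustration only; it performs a plain
    connect and closes the socket again without sending any data.
    """
    try:
        # create_connection resolves the host and attempts a TCP handshake.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and resolution failures.
        return False
```

A probe against a node would then be something like `tcp_port_open("<node>", 30443)`, where `<node>` stands for the address of a Kubernetes node running ingressgateway.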