Right now we run LVS services for istio-ingressgateway with:
  monitors:
    IdleConnection:
      max-delay: 300
      timeout-clean-reconnect: 3
This has the downside that PyBal shows all nodes of a cluster as down when no ingress route/backend is configured, because the ingressgateway's Envoy will not accept connections on its traffic port (tcp/30443) in that case.
In addition, this might not catch errors reported by the ingressgateway via its internal health check (tcp/30021), although it is currently unclear whether there are errors that cause failing health checks while connections are still possible. The ingressgateway serves health checks only on a port other than the traffic port (tcp/30021). So to check those as well, PyBal's ProxyFetch monitor would need to be extended to allow checking a different port. A proposed change (CR) exists at https://gerrit.wikimedia.org/r/c/operations/debs/pybal/+/759749
The above will not help in this particular case.
Kubernetes internally performs health checking on the dedicated health check port (tcp/30021). If that check fails, it no longer sends traffic to that ingressgateway instance. In our setup (one ingressgateway per node) this means connections to the ingressgateway traffic port (tcp/30443) as well as to the health check port (tcp/30021) will be dropped by the node (as both are handled the same way).
Because of that, plain TCP connection monitoring seems to be sufficient (for PyBal as well as for monitoring/probes).
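
As a rough illustration of what such a TCP-only check amounts to, the following Python sketch tries to open a connection to each port and reports whether it succeeded. It is not PyBal's implementation. The function names, the default timeout, and the host name in the usage note are assumptions made for this example; only the port numbers come from the text above.

```python
import socket
from typing import Dict, Iterable


def tcp_check(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port can be established.

    The timeout value is an illustrative assumption, not a PyBal setting.
    """
    try:
        # create_connection resolves the host and performs the TCP handshake;
        # the socket is closed again as soon as the with-block exits.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers connection refused, timeouts, unreachable hosts and
        # name resolution failures.
        return False


def check_ports(host: str, ports: Iterable[int], timeout: float = 3.0) -> Dict[int, bool]:
    """Run tcp_check for every port and return a port -> reachable mapping."""
    return {port: tcp_check(host, port, timeout) for port in ports}
```

For example, `check_ports("k8s-node.example.org", [30443, 30021])` (with a hypothetical node name) would show whether the traffic port and the health check port accept connections on that node. Given the behaviour described above, both are expected to fail together when Kubernetes stops sending traffic to the ingressgateway instance.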