The ML serve clusters will need an LB endpoint to reach the istio ingress gateway pods, which in turn will route requests to the correct pod backends based on HTTP headers.
We should do the following:
- Add a TLS certificate for inference.discovery.wmnet to the istio ingress pods. This is configured via knative, which holds the L4/L7 configuration for istio in its own config, and should be similar to what we do for other services.
- Remove port 80 from istioctl's config, so that only the secure port is exposed.
- Add an LVS endpoint in front of the secure port.
At the end we should be able to get scores for revisions by calling https://inference.discovery.wmnet with the HTTP headers that select the model backend.
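
As a rough illustration of that final check, here is a minimal Python sketch of what such a request could look like. Only the https://inference.discovery.wmnet endpoint comes from this task. The Host header value, the KServe-style `/v1/models/<name>:predict` path, and the `rev_id` payload are hypothetical placeholders, assuming the ingress gateway picks the backend from the Host header. The function only builds the request and does not send it.

```python
import json
import urllib.request

# Endpoint named in this task; everything else below is a placeholder.
ENDPOINT = "https://inference.discovery.wmnet"


def build_score_request(model_host, model_path, rev_id):
    """Build (but do not send) a POST request for one revision score.

    model_host: hypothetical Host header value used by the ingress
                gateway to select the backend.
    model_path: hypothetical URL path of the model server.
    rev_id:     revision id to score (payload shape is an assumption).
    """
    body = json.dumps({"rev_id": rev_id}).encode("utf-8")
    return urllib.request.Request(
        ENDPOINT + model_path,
        data=body,
        headers={
            "Host": model_host,  # routing header, assumed
            "Content-Type": "application/json",
        },
        method="POST",
    )


# Hypothetical usage: names below are illustrative only.
example_request = build_score_request(
    "example-model.example.org",
    "/v1/models/example-model:predict",
    12345,
)
```

Passing `example_request` to `urllib.request.urlopen` from a host that can reach the LVS endpoint would send it. The TLS handshake still targets inference.discovery.wmnet, so the certificate added above is the one being validated, while the Host header carries the routing information.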