While we don't use Istio as a full service mesh, we make ample use of it as a gateway in the ML cluster for Knative. So, while Istio would be too complex as a mere ingress, given that we're already using it, we should evaluate it as an alternative ingress.
Specifically, for each ingress we're answering the following questions:
- What is the general architecture?
- How can we deploy it on bare metal?
- Do we need to build and maintain docker images ourselves?
- How can it be configured to proxy various services with easy parametrization?
  In three different ways, see Configuration below.
- How do we operate on it?
- Is it easy to collect metrics?
  Metrics can easily be collected with Prometheus. In fact, Istio ships with the correct annotations and thus should be picked up by our Prometheus without adding any new rule.
- How do we collect logs?
  Access logs and other logs are easy to collect as well. See:
  - https://istio.io/latest/docs/tasks/observability/logs/
    - spec.meshConfig.accessLogFile: /dev/stdout - This actually enables access logs on all istio-proxy instances. Did not find a way to enable them only for the ingressgateway.
  - https://istio.io/latest/docs/ops/diagnostic-tools/component-logging/
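As a minimal sketch of where the accessLogFile setting sits, assuming Istio is installed through an IstioOperator resource (the resource name and namespace below are hypothetical placeholders, not taken from our setup):

```yaml
apiVersion: install.istio.io/v1alpha1
kind: IstioOperator
metadata:
  name: example-istio        # hypothetical name
  namespace: istio-system    # assumed install namespace
spec:
  meshConfig:
    # Mesh-wide setting: every istio-proxy instance (not only the
    # ingressgateway) writes its access log to stdout.
    accessLogFile: /dev/stdout
```

Because the setting is part of the mesh config, it applies to all proxies, which matches the limitation noted above.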
Thankfully, given that we already use Istio for ML, we know the answer to at least the first few questions.
Configuration
There are three ways to configure istio-ingressgateway(s), with different levels of granularity and different feature sets:
Kubernetes Ingress (Simple)
https://istio.io/v1.9/docs/tasks/traffic-management/ingress/kubernetes-ingress/
Does not meet requirements, see: T287007#7431081
- Uses Kubernetes default Ingress objects for configuration: https://people.wikimedia.org/~jayme/k8s-docs/v1.16/docs/concepts/services-networking/ingress/
- But supports sharing of hostnames between namespaces (due to the central nature of istio-ingressgateway)
- L7 (HTTP(S)) only
- TLS certificates need to be placed in the namespace of istio-ingressgateway
- Just plain host and path prefix matching
- No advanced features like weight, header matching
- No "role model" or binding restrictions (e.g. allowing namespaces to use a specific hostname only, enforce default policies ...)
- Least number of API objects involved
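For illustration, a plain Ingress object routed through istio-ingressgateway could look roughly like the sketch below. It follows the shape used in the Istio 1.9 task linked above and the v1beta1 Ingress API available on Kubernetes 1.16; the hostname, namespace, service name and port are hypothetical.

```yaml
apiVersion: networking.k8s.io/v1beta1
kind: Ingress
metadata:
  name: example-ingress            # hypothetical
  namespace: example-ns            # hypothetical
  annotations:
    # Tells Istio to pick up this Ingress
    kubernetes.io/ingress.class: istio
spec:
  rules:
  - host: example.example.org      # hypothetical hostname
    http:
      paths:
      # Only host and path prefix matching is available here
      - path: /api/*
        backend:
          serviceName: example-svc # hypothetical service
          servicePort: 8080
```

There is no place in this object to express weights or header matches, which is the limitation listed above.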
Kubernetes Gateway API (Medium)
https://istio.io/v1.9/docs/tasks/traffic-management/ingress/gateway-api/
High risk of needing much maintenance/migration work in the near future, as well as adding a hard dependency on specific Istio (and therefore k8s) versions, see: T287007#7431081
- Uses the "new standard" Kubernetes API which is deemed the successor of Ingress:
- L7 and L4 routing
- TLS certificates need to be placed in the namespace of istio-ingressgateway
- Host and path (prefix and exact) matching
- Advanced features like weight, header matching and modification,...
- "Role model" or binding restrictions (e.g. allowing namespaces to use a specific hostname only, enforce default policies ...).
- API still in v1alpha1 (3rd release), soon to be v1alpha2
- Needs to be added (CRD) to the cluster
- May be subject to breaking changes in the future (not completely clear to me, but I think we should assume so)
- Already implemented by some other popular ingress controllers (like ambassador, contour, traefik and HAProxy)
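A rough sketch of a route in this API, modelled on the v1alpha1 shape used in the Istio 1.9 task linked above. This is an assumption-laden example: hostname, namespace, service names and weights are hypothetical, and the field names shown (gateways, forwardTo) are specific to v1alpha1 and were renamed in later releases, which illustrates the migration risk.

```yaml
apiVersion: networking.x-k8s.io/v1alpha1
kind: HTTPRoute
metadata:
  name: example-route              # hypothetical
  namespace: example-ns            # hypothetical
spec:
  gateways:
    allow: All                     # which Gateways may bind this route
  hostnames:
  - example.example.org            # hypothetical hostname
  rules:
  - matches:
    - path:
        type: Prefix               # "Exact" is the other matching mode
        value: /api
    forwardTo:
    # Weighted split between two hypothetical backends
    - serviceName: example-v1
      port: 8080
      weight: 90
    - serviceName: example-v2
      port: 8080
      weight: 10
```

A matching GatewayClass and Gateway object would be needed as well; they are omitted here to keep the sketch short.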
Ingress Gateways (Advanced)
https://istio.io/v1.9/docs/tasks/traffic-management/ingress/ingress-control/
Proposed method to continue with!
- Uses Istio specific API
- L7 and L4 routing
- TLS certificates can be placed anywhere
- Host and path (prefix and exact) matching
- No "role model" or binding restrictions (e.g. allowing namespaces to use a specific hostname only, enforce default policies ...).
- Advanced features like weight, header matching and modification,...
- Very advanced features like fault injection, circuit breaking, mirroring etc.
- Needs to be added (CRD) to the cluster
- Specific to istio
- High number of API objects involved
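To give an idea of the objects involved, here is a minimal sketch of a Gateway plus a VirtualService with path matching and weighted routing. It assumes the default `istio: ingressgateway` pod label on the gateway deployment; all names, namespaces, hostnames and ports are hypothetical, and TLS is left out to keep the example short.

```yaml
apiVersion: networking.istio.io/v1alpha3
kind: Gateway
metadata:
  name: example-gateway            # hypothetical
  namespace: istio-system
spec:
  selector:
    istio: ingressgateway          # assumed default gateway pod label
  servers:
  - port:
      number: 80
      name: http
      protocol: HTTP
    hosts:
    - example.example.org          # hypothetical hostname
---
apiVersion: networking.istio.io/v1alpha3
kind: VirtualService
metadata:
  name: example                    # hypothetical
  namespace: example-ns            # hypothetical
spec:
  hosts:
  - example.example.org
  gateways:
  - istio-system/example-gateway   # <namespace>/<gateway name>
  http:
  - match:
    - uri:
        prefix: /api
    route:
    # Weights across destinations must add up to 100
    - destination:
        host: example-v1.example-ns.svc.cluster.local
        port:
          number: 8080
      weight: 90
    - destination:
        host: example-v2.example-ns.svc.cluster.local
        port:
          number: 8080
      weight: 10
```

Features such as fault injection, mirroring or circuit breaking would add further fields to the VirtualService or a separate DestinationRule object, which is where the high number of API objects comes from.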