Background
The ML team has decided to restrict the experimental namespace to ml-staging so that model servers that are not production-ready cannot be deployed to Lift Wing (production). To deploy to the production/API gateway, all requirements in T332711 must be met.
Note that the multilingual model needs more resources:
resources:
  limits:
    cpu: "4"
    memory: 6Gi
  requests:
    cpu: "4"
    memory: 6Gi