We want to move the traffic for mobileapps from the on-prem api cluster to the `mw-api-int` cluster on kubernetes.
Given the number of requests we're talking about, around 3k rps, this is a *large* chunk of our traffic - about 40% of all of our API calls still going to the on-premises cluster!
Extrapolating from our current usage on `mw-api-ext` (which might be wrong), we need about 1 replica per 20 rps to keep usage low enough, although I would argue we can live with a higher usage for `mw-api-int`. At around 3k rps, that means about 150 replicas. Assuming ~6 cores allocated to a single replica, that is about 900 cores, or roughly 20 to 22 servers to fit it all.
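For anyone who wants to redo the estimate with different inputs, here is a minimal sketch of the arithmetic. The three constants are the rough figures from this task, not measurements, and the per-server core count is only what the 20 to 22 server estimate implies, not an actual hardware spec.

```python
import math

# Rough figures from this task; adjust as better data comes in.
TOTAL_RPS = 3000        # mobileapps traffic, approximately
RPS_PER_REPLICA = 20    # extrapolated from mw-api-ext, might be wrong
CORES_PER_REPLICA = 6   # assumed allocation per replica


def replicas_needed(rps: float, rps_per_replica: float) -> int:
    """Round up: a fractional replica still costs a whole one."""
    return math.ceil(rps / rps_per_replica)


replicas = replicas_needed(TOTAL_RPS, RPS_PER_REPLICA)
cores = replicas * CORES_PER_REPLICA
print(f"replicas: {replicas}, cores: {cores}")

# Allocatable cores per server implied by the 20 to 22 server estimate.
for servers in (20, 22):
    print(f"{servers} servers -> {cores / servers:.1f} cores per server")
```

If we accept a higher usage on `mw-api-int`, raising `RPS_PER_REPLICA` shrinks everything downstream proportionally.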
I don't think it's doable to move over that many servers in one go; we should rather look into moving a portion of the traffic and increasing it progressively.
Envoy has tools to [[ https://www.envoyproxy.io/docs/envoy/latest/configuration/http/http_conn_man/traffic_splitting | split traffic between different backends ]], and I think that should be the way to go.
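To make the progressive approach concrete, here is a rough sketch that, for a given percentage of traffic moved, prints how many `mw-api-int` replicas we would need and a pair of weights shaped like Envoy's `weighted_clusters` route setting. The step percentages and the `api-onprem` cluster name are hypothetical placeholders, the replica ratio is the same rough estimate as above, and how this maps onto our actual mesh configuration is not covered here.

```python
import math

TOTAL_RPS = 3000       # rough figure from this task
RPS_PER_REPLICA = 20   # rough figure from this task


def replicas_for(new_pct: int) -> int:
    """Replicas needed on mw-api-int when new_pct% of traffic is moved."""
    return math.ceil(TOTAL_RPS * new_pct / (100 * RPS_PER_REPLICA))


def weighted_clusters(new_pct: int,
                      new_cluster: str = "mw-api-int",
                      old_cluster: str = "api-onprem") -> dict:
    """Weights out of 100, in the shape of Envoy's weighted_clusters.

    "api-onprem" is a hypothetical name for the on-prem cluster.
    """
    if not 0 <= new_pct <= 100:
        raise ValueError("new_pct must be between 0 and 100")
    return {
        "weighted_clusters": {
            "clusters": [
                {"name": new_cluster, "weight": new_pct},
                {"name": old_cluster, "weight": 100 - new_pct},
            ]
        }
    }


# Hypothetical ramp steps; the real schedule is up for discussion.
for pct in (5, 25, 50, 100):
    print(pct, replicas_for(pct), weighted_clusters(pct))
```

The point of the exercise is that each step only needs the replicas for that share of traffic, so servers can be moved over in batches between steps instead of all at once.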