Page MenuHomePhabricator

Patch Location headers of HTTP redirects coming from the MW API in Lift Wing services
Open, Needs TriagePublic3 Estimated Story Points

Description

Problem statement

Sometimes the MW API returns HTTP 301 responses carrying a Location header with https links, like:

Location: https://en.wikipedia.org/w/some-path

In our isvcs we explicitly force HTTP when contacting the MW API to allow the Istio/Envoy sidecar to inspect the HTTP content and produce meaningful metrics (so for example, if something goes wrong we can clearly see in Grafana what response codes the MW API is returning).

We also use the aiohttp's default when dealing with redirects, namely follow them "transparently" behind the scenes. This means that if the MW API returns a redirect to a HTTPS domain, aiohttp automatically tries to contact the new endpoint/URL ending up in a almost sure timeout (since https is not allowed as egress traffic from the isvc).

Possible solution

We should try to force aiohttp to not follow redirects, and add a little wrapper in the code that calls the MW API to deal with redirects (if they happen), for example looping until a non HTTP 30X response is returned.

Things like mwapi's async_session.py may need to be patched as well, since the allow_redirect=False option is not a ClientSession one, but something that needs to be enabled/disabled for every request.

Event Timeline

As discussed in the team meeting this task will be restricted to providing a solution for the revertrisk-language-agnostic that currently handles these redirects by rewriting the hosts from the values available in a configuration file. We want to remove this "hack".
Also, a potential idea would be to allow only a specific number of redirects (e.g. up to 3).

Once we figure out a nice way to deal with redirects we will implement and roll them out in all the other model servers in another task(s).