Page MenuHomePhabricator

Inconsistent behavior when fetching redirected pages with Cache-Control header
Closed, ResolvedPublic

Description

When requesting page content (e.g. mobile-sections-lead) for a page that is a redirect, the app expects to receive an HTTP 302 response, with a resolved location header, which tells our network library to automatically re-request the page with the resolved title.

In the general case, this seems to work correctly. However, when the app sets the Cache-Control header to be no-cache, the endpoint actually returns the body of the unresolved redirect page, without giving a 302 status, which is messing up some of our logic in the app.

The strange thing is that this doesn't happen every time.
For example, if you run the following curl command:

curl -v https://en.wikipedia.org/api/rest_v1/page/mobile-sections-lead/Obama

...it consistently gives a HTTP 302 status, which is what we expect.
However, if you run this curl command:

curl -v --header "Cache-Control: no-cache" https://en.wikipedia.org/api/rest_v1/page/mobile-sections-lead/Obama

...about 50% of the time it returns HTTP 302, but the other times it returns HTTP 200, with the actual content of the redirected page, which is not what we expect.

NB: the app uses the Cache-Control header for managing whether content should be cached locally vs. saved for offline use vs. forced to refresh from the network.

Event Timeline

mobrovac added subscribers: ema, BBlack, mobrovac.

This is strange indeed. This part of the logic is handled by Varnish. @BBlack @ema would you be able to explain why a RESTBase-issued redirect doesn't make it through in the case no-cache is supplied by the client?

@Dbrant note that sending Cache-Control: no-cache is pointless, because Varnish strips it away.

In principle, this kind of variation wouldn't be problematic for us, but the real problem is that when the response comes back as HTTP 200 with the redirected content, there's no indication of the target title to which the page redirects. i.e. both displaytitle and normalizedtitle still specify the redirect name, whereas the content comes from the target article.

See also T134464 . I think part of the problem is the VCL worked out in that ticket ( https://gerrit.wikimedia.org/r/#/c/287104/4/templates/varnish/text-frontend.inc.vcl.erb followed up by https://gerrit.wikimedia.org/r/#/c/287659/2/templates/varnish/text-frontend.inc.vcl.erb ) isn't correct. It works right for the case we're looking at, but causes false sharing of the s/302/200/ behavior to requests that do not have an explicit ?redirect=false as well. There's probably some other bits I don't yet understand, though, because the varying 302+200 responses I observe also claim all cache layers were operating in pass mode, which shouldn't allow shared cache objects...

Ottomata triaged this task as Medium priority.Jan 16 2018, 7:39 PM

I've found another issue here: for mobile content, redirects are returned by RESTBase with Cache-Control: no-cache. The reason for this is misconfiguration in RESTBase, will be fixed in a moment.

Submitted a PR for RESTBase to fix inconsistencies on RB side: https://github.com/wikimedia/restbase/pull/945

I'm not sure this will completely resolve the issue, cause Varnish might be involved as well.

Stashbot subscribed.

Mentioned in SAL (#wikimedia-operations) [2018-01-18T10:07:52Z] <mobrovac@tin> Started deploy [restbase/deploy@04e7cdb]: Use stable packge names, normalise cache-control headers, update top definition - T184199 T184833 T184541

Mentioned in SAL (#wikimedia-operations) [2018-01-18T10:10:20Z] <mobrovac@tin> Finished deploy [restbase/deploy@04e7cdb]: Use stable packge names, normalise cache-control headers, update top definition - T184199 T184833 T184541 (duration: 02m 29s)

Mentioned in SAL (#wikimedia-operations) [2018-01-18T10:12:53Z] <mobrovac@tin> Started deploy [restbase/deploy@5c353f7]: Use stable packge names, normalise cache-control headers, update top definition, take #2 - T184199 T184833 T184541

Mentioned in SAL (#wikimedia-operations) [2018-01-18T10:25:11Z] <mobrovac@tin> Finished deploy [restbase/deploy@5c353f7]: Use stable packge names, normalise cache-control headers, update top definition, take #2 - T184199 T184833 T184541 (duration: 12m 18s)

mobrovac assigned this task to Pchelolo.
mobrovac edited projects, added Services (done); removed Services (watching).

It seems @Pchelolo's normalisation fix did the trick. I cannot reproduce this any more, regardless of whether there is a Varnish hit or miss, so I think it's safe to assume the problem has been fixed. @Dbrant please reopen if you encounter it again.