Page MenuHomePhabricator

Deal with RB responding with a 302 rather than a 404 for non-extant File: pages from foreign file repos
Closed, ResolvedPublic1 Estimated Story Points

Description

RB will now emit a 3XX rather than a 404 for non-existent file pages which have content on another wiki, per the request from Reading. This means we'll need to treat 3XXs as the same as 404s in this instance (as it's the only time they emit 3XXs, for now maybe we can treat it always as a 404?).

Event Timeline

Some context on redirects in RESTBase:

  1. We use (will use soon) 301 redirect for requests with non-canonical title. So soon https://en.wikipedia.org/api/rest_v1/page/html/Main%20Page will be redirected to https://en.wikipedia.org/api/rest_v1/page/html/Main_Page. Normally VE requests pages already in canonical form, but for the sake of correctness VE should follow 301 redirect in this case.
  2. For File pages right now we emit 301 redirect to commons, but we will change that to 302 redirect. The redirect will be cached in varnish and purged actively if the page is created. https://github.com/wikimedia/restbase/pull/569
Jdforrester-WMF renamed this task from Treat RB 3XXs as 404s for File: pages to Treat RB 302s (used for File: pages from foreign file repos) as 404s.Mar 23 2016, 7:01 PM

Forked the first bit to T130758: Follow RB 301s (used for requests in non-canonical form) as it's separable, important, but less urgent. :-)

@Jdforrester-WMF @Esanders @AlexMonk-WMF: Do you have a time estimate for the image description part? I'm asking as we are currently holding back our deploy to avoid breaking VE, which in turn blocks the Android app roll-out.

Just spoke with @Jdforrester-WMF in the office. It turned out more complex than expected, so won't happen before next week.

We'll implement a work-around to un-block our deploy in the meantime: https://github.com/wikimedia/restbase/pull/572

Should we just assume that all 302s are basically 404s, and ensure that VE always sends the normalised title, or do we really need to know whether a page actually comes from a foreign file repo before interpreting the resulting status code?

Should we just assume that all 302s are basically 404s, and ensure that VE always sends the normalised title, or do we really need to know whether a page actually comes from a foreign file repo before interpreting the resulting status code?

Just assuming for now should be fine. Maybe write a lowest-importance tech debt task to consider actually checking, and mention in a TODO comment?

Change 280095 had a related patch set uploaded (by Alex Monk):
Treat a 302 from RESTBase like a 404

https://gerrit.wikimedia.org/r/280095

Change 280095 abandoned by Alex Monk:
Treat a 302 from RESTBase like a 404

Reason:
This isn't going to work

https://gerrit.wikimedia.org/r/280095

AlexMonk-WMF subscribed.

Turns out the redirects are handled transparently to our client-side RESTBase client code and AFAIK we can't detect that we got redirected in all supported browsers

We're going to enable general redirect handling in RESTBase (PR). For browser clients like VisualEditor we're adding a boolean query parameter ?redirect=false. VE should add this parameter to it's requests to RESTBase before we roll out redirect handling to production. Adding this parameter will completely solve the 302 redirect issue in VE.

@AlexMonk-WMF We have an idea how to share the cache entries for both redirect=false and redirect=true requests. However, a Varnish optimisation should be implemented after this feature is enabled in RESTBase, so for some period of time (between VE adding a flag and us adding an optimisation to varnish) the answer is yes - it will disable caching. However, in practice it's not a big problem, as the hit rate for VE requests is still not very high.

Maybe not for restbase, but what effect will that have upon response times to VE?

@AlexMonk-WMF There shouldn't be any impact.

VE is requesting html via /html/{title} endpoint, which's not very popular for any other client except VE. As most of VE requests result in page modification and a purge, the hit rate for VE right now is very low. Also, we've enabled caching only about a month ago, so the latency will not be worse than it was one month ago. And this is only for a very limited time period.

Change 282448 had a related patch set uploaded (by Alex Monk):
Add ?redirect=false to VE's RB /page/html/ requests

https://gerrit.wikimedia.org/r/282448

Change 282448 merged by jenkins-bot:
Add ?redirect=false to VE's RB /page/html/ requests

https://gerrit.wikimedia.org/r/282448

I think this one can be closed now? What do you think @Jdforrester-WMF ?

I think this one can be closed now? What do you think @Jdforrester-WMF ?

Sure. I'll re-title it.

Jdforrester-WMF renamed this task from Treat RB 302s (used for File: pages from foreign file repos) as 404s to Deal with RB responding with a 302 rather than a 404 for non-extant File: pages from foreign file repos.Apr 25 2016, 6:43 PM
Jdforrester-WMF closed this task as Resolved.
Jdforrester-WMF assigned this task to Krenair.
Jdforrester-WMF reassigned this task from Krenair to AlexMonk-WMF.
Jdforrester-WMF added a subscriber: Krenair.