Page MenuHomePhabricator

Parsoid: Redirects should not be served as HTTP 200 with contents of target
Closed, ResolvedPublic

Description

When editing page with redirect, I can edit just the page to where it is redirected.


Version: unspecified
Severity: normal

Details

Reference
bz45808

Event Timeline

bzimport raised the priority of this task from to Medium.Nov 22 2014, 1:24 AM
bzimport added a project: Parsoid.
bzimport set Reference to bz45808.

Juan,

Thanks for your bug report; you are correct. This is because Parsoid gives the wrong response for pages with redirects. Pushing over to them.

Current behaviour:

  • 200 response with the contents of the target page

On what we'd expect, I'd suggest either type A (but it's messy architecturally) or type B (but not sure about user agent support for HTTP 300):

Expected behaviour - type A:

  • when in 'reading mode', a 302 to the page with the redirect;
  • when in 'editing mode', a 200 with the contents of the page (to edit the redirect / etc.)

Expected behaviour - type B:

  • a 300 response with the first link (for regular browsers) being off to the target of the page, and the second being the ?redirect=no or whatever version to be able to edit it.

If you ask Parsoid for (say) http://localhost:8000/en/OLPC you get the page for "One Laptop Per Child" (the <title> is correct), but the returned page contains {{Redirect|OLPC}} in addition to the text from the "One Laptop Per Child" article.

So in addition to the redirect being transparent, we're also getting bogus extra content in the parsed version.

From IRC discussion w/ GWicke:

gwicke: I think that we should not resolve redirects for top-level requests
gwicke: should return a meta instead that lets the VE edit the redirect
cscott-free: yeah, although editing "#REDIRECT [[foo]]" is a bit awkward, that's an <ol>
gwicke: obviously we'd expose it as <meta property="mw:Redirect" content="foo"> or something like that
cscott-free: i think i learned recently that there can be other 'stuff' in a redirect?
gwicke: arguably this should eventually end up in the head instead of the content
gwicke: but for now we should probably keep it in the content area in the interest of round-tripping
cscott-free: that makes sense.
gwicke: cscott: there can be page content following it that is normally ignored by MW
gwicke: but we should probably still round-trip it
gwicke: hence the idea to keep it inline for now rather than returning a blank body with meta info in the head
(01:58:07 PM) cscott-free: i suspect there should be a corresponding VE bug for editing the <meta property="mw:Redirect" content="foo">
(01:58:08 PM) gwicke: maybe we should use link / rel / href again
(01:58:21 PM) gwicke: for the same relative link resolution reasons

(see bug 45206 comment 5 for a brief discussion of the meta/link issues)

See bug 47328 for the corresponding Visual Editor bug.

Note: test "#REDIRECT [[www.google.com]]" which currently does unusual things.

"#REDIRECT [[www.google.com]]" behaves just the same way as "#REDIRECT [[Foo]]", so not that unexpected.

Related URL: https://gerrit.wikimedia.org/r/61799 (Gerrit Change I1bd36f32a5e46b90261895e5499a0308875e5e05)

  • Bug 48271 has been marked as a duplicate of this bug. ***

As the gerrit commit is merged, can this now be closed?

I have not tested the HTTP responses yet, but am happy to close. Please reopen if you find issues (or open a new bug).

Sorry: the wikitext parsing bit landed but I didn't yet make any changes to how the server handles redirects. One more patch is needed to change the http status and turn off the old redirect code. Reopening.

I removed our previous behavior of following redirects when encountering a redirect page, so we now return the regular redirect content as expected. Closing as fixed.

[Parsoid component reorg by merging JS/General and General. See bug 50685 for more information. Filter bugmail on this comment. parsoidreorg20130704]