
Parsoid processes redirects incorrectly when they come from a template
Closed, Resolved, Public

Description

See rendering differences on these two pages: Page 1, Page 2.

Legacy emits the redirect wikitext as a plain ordered list item, but Parsoid generates a redirect link tag -- the bug is more egregious for the second page where the template shows up further down the page.

This is also seen on this mediawiki page. In legacy rendering, the redirect shows up as an ordered list item.

Event Timeline

Our redirect tokenizer rule is actually pretty terrible from a performance perspective as well, since we grab a whole long substring of non-whitespace characters just to try to feed it to the magic word matcher. I believe we can actually rip this out from our tokenizer completely, since core does the redirect matching itself in the Content hierarchy before the parser is ever invoked.

With the legacy parser, #REDIRECT only works at the top level of the page; even a {{1x|#REDIRECT ... }} doesn't work. So, that should be simple enough to match in Parsoid.

Ya, as Scott says, we should probably process this first thing before we even invoke the parsing pipeline and do a fast exit and remove it from the tokenizer, and that will fix it.
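A minimal sketch of the "check first, exit fast" idea, written in Python for illustration rather than Parsoid's PHP. The names `find_top_level_redirect` and `render`, and the regular expression itself, are assumptions made for this example; the real matcher in core uses the localized magic word aliases and is not reproduced here.

```python
import re
from typing import Optional

# Hypothetical, simplified pattern: the keyword must be the first
# non-whitespace text on the page and be followed by a [[wikilink]].
# Real MediaWiki matching also honors localized aliases of #REDIRECT.
_REDIRECT_RE = re.compile(
    r"\s*#REDIRECT\s*:?\s*\[\[([^\]|]+)(?:\|[^\]]*)?\]\]",
    re.IGNORECASE,
)


def find_top_level_redirect(page_source: str) -> Optional[str]:
    """Return the redirect target if the page itself starts with a redirect."""
    match = _REDIRECT_RE.match(page_source)
    return match.group(1).strip() if match else None


def render(page_source: str) -> str:
    """Toy pipeline entry point with a fast exit for redirect pages."""
    target = find_top_level_redirect(page_source)
    if target is not None:
        # Fast exit: the tokenizer never needs a #REDIRECT rule.
        return f"redirect to {target}"
    return "full parse"
```

Because the check anchors at the start of the page source, a redirect that only appears after template expansion, such as `{{1x|#REDIRECT [[Foo]]}}`, is never treated as a redirect in this sketch.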

But, WikitextContentHandler::fillParserOutput says:

			// Parsoid renders the #REDIRECT magic word as an invisible
			// <link> tag and doesn't require it to be stripped.
			// T349087: ...and in fact, RESTBase relies on getting
			// redirect information from this <link> tag, so it needs
			// to be present.

So, removing redirect handling from Parsoid is not as straightforward as it sounds: it looks like we considered this previously and then documented why we cannot do it that way. For now, I'm going to fix the grammar to ignore redirects that are found in a template.

Also, stripping the redirect wikitext messes up source offsets and can lead to page corruption on VE edits. So, any approach outside the Parsoid grammar needs some thought.
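To illustrate the offset concern, here is a small Python sketch (not Parsoid code). It assumes a hypothetical pre-processing step that drops the redirect line before parsing, and shows that offsets computed on the stripped text select the wrong span of the original source unless every offset is shifted back.

```python
source = "#REDIRECT [[Target]]\nSome '''bold''' text\n"

# Hypothetical pre-processing step: drop the first line before parsing.
prefix_len = source.index("\n") + 1
stripped = source[prefix_len:]

# Offsets of the bold markup as a parser working on `stripped` would record them.
start = stripped.index("'''bold'''")
end = start + len("'''bold'''")

# Applying those offsets to the original source selects the wrong text.
wrong_span = source[start:end]

# They only line up again once shifted by the length of the stripped prefix.
right_span = source[start + prefix_len:end + prefix_len]
```

An editor that writes changes back by offset would overwrite `wrong_span` instead of the bold markup, which is the kind of corruption the comment warns about.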

Change #1185219 had a related patch set uploaded (by Subramanya Sastry; author: Subramanya Sastry):

[mediawiki/services/parsoid@master] Ignore redirect wikitext coming from templates

https://gerrit.wikimedia.org/r/1185219

We don't need to strip the redirect text (and mess up the offsets): both legacy and Parsoid render #REDIRECT as a literal <ol> list (for better or worse), and that's what we'd be doing. We'd just not generate the <link> tag, so:

> But, WikitextContentHandler::fillParserOutput says:
>
> 			// Parsoid renders the #REDIRECT magic word as an invisible
> 			// <link> tag and doesn't require it to be stripped.
> 			// T349087: ...and in fact, RESTBase relies on getting
> 			// redirect information from this <link> tag, so it needs
> 			// to be present.
>
> So, looks like removing redirect handling from Parsoid is not as straightforward because looks like we considered this previously and then documented why we cannot do it that way. So, for now, I'm going to fix the grammar to ignore redirects if found in a template.

Yes, this is the one part we'd need to fix up. But we don't have RESTBase any more, just a REST API in core. I'm not even 100% certain the comment is still true, since I don't know whether the core REST API parses the HTML looking for the <link>. Needs investigation.
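For reference while investigating, this is roughly what "parsing the HTML looking for the <link>" would amount to, as a Python sketch. The `rel` value `mw:PageProp/redirect` is assumed here to be what Parsoid emits for redirects, and the helper name `find_redirect_href` is hypothetical; none of this is taken from the core REST API code.

```python
from html.parser import HTMLParser
from typing import Optional


class _RedirectLinkFinder(HTMLParser):
    """Records the href of the first redirect <link> tag it sees."""

    def __init__(self) -> None:
        super().__init__()
        self.href: Optional[str] = None

    def handle_starttag(self, tag, attrs):
        attr_map = dict(attrs)
        if (
            self.href is None
            and tag == "link"
            # Assumed rel value for Parsoid's redirect marker.
            and attr_map.get("rel") == "mw:PageProp/redirect"
        ):
            self.href = attr_map.get("href")


def find_redirect_href(html: str) -> Optional[str]:
    """Return the redirect href from the HTML, or None if there is no such tag."""
    finder = _RedirectLinkFinder()
    finder.feed(html)
    finder.close()
    return finder.href
```

If no consumer does anything like this any more, the redirect information could come from the Content hierarchy instead and the tag would no longer be load-bearing.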

MSantos triaged this task as Medium priority. Nov 21 2025, 10:06 AM

I'd prefer not to /ever/ recognize the #REDIRECT token, so conditionally disabling it in the tokenizer when we're inTemplate is one step towards getting rid of it in all cases in the tokenizer -- so I'm in favor of that, if that fixes the problem.
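The conditional disabling described here can be sketched with a toy line tokenizer in Python. This is not Parsoid's PEG grammar: `tokenize_line`, the `in_template` parameter, and the token names are all assumptions for illustration, and the sketch ignores the requirement that a redirect sit at the very start of the page.

```python
import re
from typing import Tuple

# Simplified pattern; localized aliases of #REDIRECT are not handled.
_REDIRECT_RE = re.compile(r"#REDIRECT\s*\[\[([^\]|]+)\]\]", re.IGNORECASE)

Token = Tuple[str, str]


def tokenize_line(line: str, in_template: bool) -> Token:
    """Classify one line of wikitext in a deliberately tiny toy tokenizer."""
    if not in_template:
        # The redirect rule is only tried for top-level page content.
        match = _REDIRECT_RE.match(line)
        if match:
            return ("redirect", match.group(1).strip())
    if line.startswith("#"):
        # Inside a template, the same text falls through to list handling,
        # matching what the legacy parser renders.
        return ("ordered_list_item", line[1:].strip())
    return ("text", line)
```

Removing the `if not in_template` branch entirely would correspond to never recognizing the token in the tokenizer at all.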

Change #1185219 merged by jenkins-bot:

[mediawiki/services/parsoid@master] Ignore redirect wikitext coming from templates

https://gerrit.wikimedia.org/r/1185219

Change #1235865 had a related patch set uploaded (by C. Scott Ananian; author: C. Scott Ananian):

[mediawiki/vendor@master] Bump wikimedia/parsoid to 0.23.0-a14

https://gerrit.wikimedia.org/r/1235865

Change #1235865 merged by jenkins-bot:

[mediawiki/vendor@master] Bump wikimedia/parsoid to 0.23.0-a14

https://gerrit.wikimedia.org/r/1235865