
Parsoid incorrectly encodes href in redirects
Open, Medium, Public

Description

When Parsoid parses a redirect page, it emits a PageProp link whose href holds the redirect target in URI-encoded form. However, a literal % symbol in the redirect target is not encoded (it should become %25), which produces an invalid URI.

Example:
https://en.wikipedia.org/w/index.php?title=100%25_Pure_New_Zealand&redirect=no

Produces:

<link rel="mw:PageProp/redirect" href="./Tourism_New_Zealand#%22100%_Pure_New_Zealand%22" id="mwAg"/>

Note how the " symbol in the redirect target is encoded, while the % symbol is not.
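For comparison, here is a minimal sketch of what a consistently encoded href would look like for this example. It is not Parsoid's actual encoding routine: `buildRedirectHref` is a hypothetical helper, and using `encodeURIComponent` for both the title and the fragment is an assumption made only for illustration.

```typescript
// Hypothetical helper (not Parsoid code): builds a relative redirect href
// in which every reserved character, including '%', is percent-encoded.
function buildRedirectHref(title: string, fragment: string): string {
  // encodeURIComponent turns '"' into %22 and '%' into %25.
  return "./" + encodeURIComponent(title) + "#" + encodeURIComponent(fragment);
}

// The redirect target from the example above.
const expectedHref: string = buildRedirectHref(
  "Tourism_New_Zealand",
  '"100%_Pure_New_Zealand"'
);
// expectedHref === "./Tourism_New_Zealand#%22100%25_Pure_New_Zealand%22"
```

The only difference from the href Parsoid currently produces is `%25` in place of the bare `%`.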

Event Timeline

Consequently, all of these redirect pages are broken in RESTBase, because the malformed href crashes the decodeURIComponent call that we apply to it.
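The failure can be reproduced outside RESTBase with the href from the description. The sketch below is not RESTBase code: `tryDecode` is a hypothetical wrapper, shown only to illustrate that `decodeURIComponent` throws a `URIError` on a `%` that is not followed by two hex digits.

```typescript
// Hypothetical wrapper (not RESTBase code): returns the decoded string,
// or null when the input contains a malformed percent escape.
function tryDecode(href: string): string | null {
  try {
    return decodeURIComponent(href);
  } catch (e) {
    if (e instanceof URIError) {
      return null; // e.g. "%_P" is not a valid escape sequence
    }
    throw e;
  }
}

// The href emitted by Parsoid, with the bare '%'.
const brokenHref: string = "./Tourism_New_Zealand#%22100%_Pure_New_Zealand%22";
const decodedBroken: string | null = tryDecode(brokenHref); // null

// The same href with '%' encoded as %25 decodes cleanly.
const fixedHref: string = "./Tourism_New_Zealand#%22100%25_Pure_New_Zealand%22";
const decodedFixed: string | null = tryDecode(fixedHref);
// decodedFixed === './Tourism_New_Zealand#"100%_Pure_New_Zealand"'
```

Catching the error on the consumer side only hides the symptom; the href itself remains an invalid URI until the `%` is encoded at the source.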

ssastry lowered the priority of this task from High to Medium.
ssastry moved this task from Needs Triage to Link syntax (links & media) on the Parsoid board.