URL with % in fragment breaks many scripts
Open, Medium, Public

Description

  1. Go to https://en.wikipedia.org/wiki/Wikipedia#% (or any other page with something in the URL fragment that would be invalid as %-encoding).
  2. Try to use Echo or VE.

Instead of step 1 you can also go to a page that has a % sign in a headline, use the TOC to go to that section and reload the page.

Expected: Everything should work as normal.
Actual: Echo and VE will only work via fallback (i.e. Echo will open Special:Notifications, VE will reload the page for editing).
The console is full of errors like URIError: "malformed URI sequence" and TypeError: "defaultUri is undefined", so mw.Uri probably can't parse the URL, which breaks everything that depends on it.
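
The URIError above can be reproduced without MediaWiki. The sketch below assumes the failure comes from percent-decoding the fragment (this has not been confirmed against the mw.Uri source); `decodeThrows` and `safeDecodeFragment` are hypothetical helpers for illustration, not part of mw.Uri.

```javascript
// Returns true if decodeURIComponent rejects the given fragment.
function decodeThrows(fragment) {
    try {
        decodeURIComponent(fragment);
        return false;
    } catch (e) {
        return e instanceof URIError;
    }
}

// Hypothetical tolerant decoder: falls back to the raw fragment
// when it is not valid percent-encoding, instead of throwing.
function safeDecodeFragment(fragment) {
    try {
        return decodeURIComponent(fragment);
    } catch (e) {
        if (e instanceof URIError) {
            return fragment;
        }
        throw e;
    }
}

console.log(decodeThrows('%'));         // a bare % is rejected
console.log(decodeThrows('%25'));       // an escaped % is accepted
console.log(safeDecodeFragment('%'));   // falls back to the raw string
```

A parser that decoded fragments this way would degrade gracefully on input like `#%` rather than failing during initialisation.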

I haven't checked the specs to see whether a raw % sign is actually allowed in the fragment, but even if it isn't, it shouldn't break anything; and if it isn't, the parser should not produce such URLs in the TOC.

Event Timeline

Most things categorised under T106244 are invalid under the spec, and never produced by the Parser for the TOC or otherwise.

The bare % case is interesting and afaik not reported previously. The anchor for == % == and [[#%]] indeed uses a plain % currently, which suggests that it is valid. I'm not sure why mw.Uri currently rejects it. Either way, the fix is potentially the same, if resourced, which is to migrate to native URL.
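
As a rough illustration of the native URL option: the standard `URL` constructor accepts a bare % in the fragment and exposes it undecoded via `hash`. The `getRawFragment` helper name is hypothetical, and this is only a sketch of the behaviour, not a proposed patch.

```javascript
// Hypothetical helper: return the fragment without the leading '#',
// using the native URL parser. The fragment is not percent-decoded.
function getRawFragment(href) {
    const url = new URL(href);
    return url.hash.slice(1);
}

const fragment = getRawFragment('https://en.wikipedia.org/wiki/Wikipedia#%');
console.log(fragment); // the raw fragment, here a single % character
```

Because `hash` is returned as-is, any decoding stays the caller's decision, so a malformed sequence cannot break parsing of the rest of the URL.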

For now though, this should remain a separate task since it isn't a case of garbage-in/garbage-out. Either we need to generate different anchors like we do for other special chars, or we need to support them in mw.Uri.
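
For the first option, one possible shape is to escape % when generating the anchor so that the fragment is always valid percent-encoding. The sketch below uses the standard `encodeURIComponent` purely for illustration; `toEscapedAnchor` is a hypothetical name, and MediaWiki's actual anchor encoding rules are not reproduced here.

```javascript
// Hypothetical anchor generator: percent-encode the headline text so a
// literal % becomes %25 and the fragment can always be decoded again.
function toEscapedAnchor(headline) {
    return encodeURIComponent(headline);
}

const anchor = toEscapedAnchor('%');
console.log(anchor);                      // %25
console.log(decodeURIComponent(anchor));  // %
```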

ovasileva triaged this task as Medium priority. Mar 2 2021, 10:42 AM
ovasileva moved this task from Incoming to Triaged but Future on the Readers-Web-Backlog board.