IABot destroys URL containing #
Closed, Resolved | Public

Description

Here the bot did not correctly detect where the URL starts and ends.

Event Timeline

What it did was accidentally ignore the fragment of the URL. The URL will still work, but the part after the # is a pointer for the web browser to tell it to go to a certain part of the page.

The # characters here aren't fragment pointers for the web browser, though that is what the bot assumed.

&#91; is the [ bracket (and &#93; is ]), which needs to be encoded for the URL to work with MediaWiki.

&#91; is not valid URL encoding; it is an HTML character reference. Per the RFC, the very first instance of # is where the URL fragment begins. URL encoding takes the form %XX, where XX is the two-digit hexadecimal ASCII code of the character (for example, %5B for [).
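To illustrate the point about the first # starting the fragment, here is a minimal sketch using Python's standard `urllib.parse`. The URL is a made-up example, not the one from this task, and this is not IABot's own code (the bot is not written in Python); it only shows how a standards-following parser sees such a link.

```python
from urllib.parse import urlsplit, quote

# Hypothetical URL as it might appear in wikitext, with the brackets
# written as HTML character references rather than percent-encoded.
raw = "http://example.com/view?ids=&#91;1&#93;"

parts = urlsplit(raw)

# A standards-following parser cuts the URL at the first '#':
query_seen = parts.query        # what is left of the query string
fragment_seen = parts.fragment  # everything after the first '#'

# Percent-encoding of '[' for comparison.
percent_bracket = quote("[", safe="")

print(query_seen)       # ids=&
print(fragment_seen)    # 91;1&#93;
print(percent_bracket)  # %5B
```

The query is cut off at `ids=&` and the rest is treated as a fragment, which matches the behaviour described above.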

That may well be true. However, we have lots of these non-standard URLs, and there are also websites that use the # symbol for other purposes (Wikipedia, for example).

I believe that we need to find a way to deal with this problem, as it is clearly destroying links which are handled perfectly well by MediaWiki and standard browsers.
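One possible way to deal with it, sketched here in Python purely as an illustration: decode numeric HTML character references first, then percent-encode the brackets, and only then look for a fragment. The function names `decode_numeric_refs` and `normalize_wikitext_url` are hypothetical, the URL is made up, and this is an assumption about a workable approach rather than what IABot does.

```python
import re
from urllib.parse import urlsplit

# Matches numeric character references such as &#91; or &#x5B;
# (semicolon required, so ordinary query strings are left alone).
_NUMERIC_REF = re.compile(r"&#(?:[xX]([0-9a-fA-F]+)|([0-9]+));")


def decode_numeric_refs(text):
    """Replace numeric HTML character references with their characters."""
    def _sub(match):
        try:
            if match.group(1):
                return chr(int(match.group(1), 16))
            return chr(int(match.group(2)))
        except (ValueError, OverflowError):
            # Out-of-range code point: leave the text untouched.
            return match.group(0)
    return _NUMERIC_REF.sub(_sub, text)


def normalize_wikitext_url(raw):
    """Hypothetical helper: turn entity-encoded brackets into %5B / %5D.

    Caveat: this blindly encodes every bracket, so it would also mangle
    a bracketed IPv6 host; a real implementation would need to skip those.
    """
    decoded = decode_numeric_refs(raw)
    return decoded.replace("[", "%5B").replace("]", "%5D")


fixed = normalize_wikitext_url("http://example.com/view?ids=&#91;1&#93;")
plain = normalize_wikitext_url("http://example.com/page#section")
```

With this ordering, the entity-encoded link no longer appears to contain a fragment, while a URL with a genuine `#section` fragment passes through unchanged.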