{{#ifexist}} does not recognise URL encoded titles
Closed, InvalidPublic

Description

{{#ifexist}} does not recognise URL encoded filenames, see URL for testcase.

I fixed it with r37387 but Brion didn't like the fix at this point.

With r37420 he reverted:

Revert r37387 "Let the {{#ifexist}} works with encoded URLs too. See testcase"
Causes regression for titles containing "+".
Proper fix here is to move the existing % decoding into Title::newFromText or
Title::secureAndSplit so it's done automatically, rather than attempting to
replicate it, with the potential of getting it wrong like this, every place we
decide to accept titles

Sticking it to Bugzilla for the moment until I have time (or another dev ;-) )


Version: unspecified
Severity: normal
URL: http://test.wikipedia.org/wiki/User:Raymond/ifexist

bzimport added a subscriber: Unknown Object (MLST).
bzimport set Reference to bz14779.
Raymond created this task.Via LegacyJul 10 2008, 10:07 AM
bzimport added a comment.Via ConduitJan 3 2009, 2:49 AM

rene.kijewski wrote:

Maybe the underlaying Sanitizer::decodeCharReferences could be changed?
{{titleparts}} is even worse affected by the bug: It only returns the input.
I tried to provide a patch, but I got confused with the mathematics of converting URL encode to UTF-8. ;-)

I don't know if someone needs might it, but this it the regex for proper URL encoded characters:
/(%[0-7][0-9A-Za-z])

(%[CDcd][0-9A-Za-z]%[89ABab][0-9A-Za-z])
(%[Ee][0-9A-Za-z](?:%[89ABab][0-9A-Za-z]){2})
(%[Ff][0-7](?:%[89ABab][0-9A-Za-z]){3})/x
bzimport added a comment.Via ConduitJan 3 2009, 3:24 PM

rene.kijewski wrote:

Wouldn't rawurldecode do it well? But I think it should go directly to Title::newFromText.

http://de.php.net/manual/en/function.rawurldecode.php: "Note: rawurldecode() does not decode plus symbols ('+') into spaces. urldecode() does"

brion added a comment.Via ConduitJan 4 2009, 3:32 AM

Sanitizer::decodeCharReferences *must not* attempt to deal with URL percent-encoding, as that would cause corruption of totally unrelated HTML output.

Probably the Sanitizer::decodeCharReferences() and the %-check & urldecode() both belong in either Title:newFromText or directly into Title::secureAndSplit() to ensure that titles are being consistently handled at the low-level; this means the various checks at higher levels should be checked and mostly pulled out.

There are probably a number of related bugs still open on this issue; be good to make sure they're all tied together.

Umherirrender added a comment.Via ConduitJul 1 2009, 5:29 PM

fullurl has no problem with urlencoded pages:

{{fullurl:{{FULLPAGENAME}}}} and {{fullurl:{{FULLPAGENAMEE}}}} gives the same result.

CoreParserFunctions.php has the description - since r15276
$title = Title::newFromText( $s );

  1. Due to order of execution of a lot of bits, the values might be encoded
  2. before arriving here; if that's true, then the title can't be created
  3. and the variable will fail. If we can't get a decent title from the first
  4. attempt, url-decode and try for a second.

if( is_null( $title ) )

$title = Title::newFromUrl( urldecode( $s ) );

#ifexist has not this and only try to get the title from text and not from URL second.

I do not know this is a possible solution. I am not so familiar with MediaWiki or PHP to say, this is good or not. I can only give you a place, who this works. Maybe you can adapt from there.

Platonides added a comment.Via ConduitFeb 13 2010, 3:10 PM

You got it right. Fixed r62436

tstarling added a comment.Via ConduitFeb 19 2010, 7:15 AM

I don't see why this is a bug. It doesn't recognise URL-encoded file names because the parameter is the file name, not the URL. I don't see the need to take MediaWiki apart and put it back together again when you could just fix your broken #ifexist calls.

Umherirrender added a comment.Via ConduitMar 3 2010, 11:49 AM

This is not only a problem of file names. I have add a example (see url).

scfc added a comment.Via ConduitApr 14 2012, 6:52 PM

Rereading this, I have to concur with comment 6. This must have blocked some use case of mine in the past so I CC'd but I don't remember it and neither do the other comments explain why it is a bug from their points of view. I propose closing as INVALID.

scfc added a comment.Via ConduitMay 14 2012, 9:09 PM

Thus closing as INVALID.

Add Comment