Page MenuHomePhabricator

& in page titles is not appropriately escaped in various cases
Closed, ResolvedPublic

Description

Author: dbenbenn

Description:
Page titles are not HTML-sanitized in various cases. Since page titles can have
& in them, this leads to various minor bugs. I'd like to record all I can find
here.

  1. Neither [{{SERVER}}{{localurl:{{PAGENAME}}}}] nor

[{{SERVER}}{{localurl:{{PAGENAMEE}}] works for the page entitled <

  1. Special:Watchlist/edit appears to pass page titles literally. If you're

watching [[&lt;]], the list shows < with no link.

  1. action=unwatch for [[&lt;]] displays "The page "<" has been removed from your

watchlist." The variable $1 has not been escaped in [[MediaWiki:Removedwatchtext]].

  1. The same goes for [[MediaWiki:Addedwatchtext]].
  1. And [[MediaWiki:Deletedtext]], parameter $1.
  1. And [[MediaWiki:Undeletedtext]]
  1. [[Special:Movepage/%26lt%3B]] displays "Move page: <", not "Move page: &lt;"
  1. Although the "Confirm delete" page correctly shows (Deleting "&lt;), the

"Confirm protection" page shows (Protecting "<"). (The same goes for "Confirm
unprotection".)

  1. [[Special:Log]] doesn't accept &lt; in the Title field. See

http://en.wikipedia.org/w/index.php?title=Special%3ALog&page=%26lt%3B

9.1) The same bug shows up when undeleting [[&lt;]]. Instead of displaying the
deletion log for the page, it displays the entire deletion log.

  1. A new variable is needed, like {{PAGENAME}} and {{PAGENAMEE}}, that gives an

HTML-sanitized version of the page title. At the moment, there's no good way to
fix [[MediaWiki:Newarticletext]], as it displays for [[&lt;]].

  1. You can't move a page to [[&lt;]]. It claims that "The requested page title

was invalid, empty, or an incorrectly linked inter-language or inter-wiki title."

  1. I created [[User:Dbenbenn/&lt;]], then deleted it. When I then edited the

page, the "View or restore 2 deleted edits?" message didn't have the usual link
to Special:Undelete.


Version: 1.6.x
Severity: trivial

Details

Reference
bz3243

Event Timeline

bzimport raised the priority of this task from to Medium.Nov 21 2014, 8:46 PM
bzimport set Reference to bz3243.
bzimport added a subscriber: Unknown Object (MLST).

dbenbenn wrote:

  1. If [[&lt;]] redirects to another page, the redirect message is "(Redirected

from <)" with no link. The variable in [[MediaWiki:Redirectedfrom]] isn't
sanitized.

dbenbenn wrote:

  1. [[MediaWiki:Rclsub]] is another message where the parameter needs to be

sanitized, with & converted to &amp;

dbenbenn wrote:

(In reply to comment #0)

  1. A new variable is needed, like {{PAGENAME}} and {{PAGENAMEE}}, that

gives anHTML-sanitized version of the page title. At the moment, there's
no good way tofix [[MediaWiki:Newarticletext]], as it displays for
[[&lt;]].

Perhaps {{PAGENAME}} should simply be updated. After all, from the point
of view of HTML, "&amp;lt;" actually ''is'' the title of the page whose
plain-text title is &lt;. There's no reason not to HTML-sanitize &, <, >,
[, ], {, }, and | in the PAGENAME variable.

As an aside, ' can also cause trouble. See

http://en.wikipedia.org/w/index.php?title=Image:%27%27I_can_smell_%27em%21%
27%27.jpg&action=edit

for a real-life example. The '' is interpreted as "begin italics", and
should be escaped to &apos; instead.

dbenbenn wrote:

  1. You can't move [[&lt;]] anywhere else. The error message is "Fatal error:

Call to a member function on a non-object in
/usr/local/apache/common-local/php-1.5/includes/SpecialMovepage.php on line 198"

dbenbenn wrote:

Patch to fix 3, 4, 5, 6, 7, and update Language.php slightly

Here's a short patch to fix points 3, 4, 5, 6, and 7. (Incidentally, 8 and 14
got fixed independently.)

The patch also changes the defaults for [[MediaWiki:Addedwatchtext]],
[[MediaWiki:Removedwatchtext]], [[MediaWiki:Deletedtext]], and
[[MediaWiki:Undeletedtext]]. I add <nowiki> tags around the page title
parameter, so that the page title displays correctly in case it has '' in it.

attachment patch ignored as obsolete

dbenbenn wrote:

also fix 2 and 13

The patch also fixes 2 and 13, and adds <nowiki> to the default for
[[MediaWiki:prefslogintext]] ([[User:''dbenbenn'']] is a valid user name!)

attachment patch ignored as obsolete

dbenbenn wrote:

Fix 10 and 12. 1, 9, 11, and 15 remain

The patch now adds Sanitizer::wikiSanitize() to transform {{PAGENAME}}, fixing
point 10. Note that now {{localurl:{{PAGENAME}}}} works for [[&lt;]], but
still not for [[*]]. (The wikiSanitize() function could probably be improved
by a PHP expert.)

The patch also modifies Skin.php to fix point 12.

attachment patch ignored as obsolete

dbenbenn wrote:

better fix for point 10, {{PAGENAME}} sanitization

Here's a better implementation of Sanitizer:wikiSanitize(). It now makes the
display work right if {{PAGENAME}} appears at the beginning of a line in a page
called *, or '', or &, or ==foo==, or ;foo, or ----, or ~~~~
({{subst:PAGENAME}} in this last case).

attachment patch ignored as obsolete

dbenbenn wrote:

better fix for point 10, also fix 11 and 15

wikiSanitize in the previous patch was broken, due to ;. Fixed now.

Also, you can now move [[&lt;]]. Furthermore, to move a page to [[&lt;]], you
used to have to enter "&amp;lt;" in the "To new title" field---now "&lt;" works
as expected.

Attached:

ayg wrote:

*** Bug 6986 has been marked as a duplicate of this bug. ***

ayg wrote:

This bug seems pretty broad. Maybe someone should consider making it a tracker
and opening up separate bugs for the . . . separate bugs.

Category pages also suffer from this problem. Example:
http://de.wikipedia.org/wiki/Kategorie:Schriftzeichen (there's a lone & on the
page instead of &amp;)

Patch to fix it:

  • orig/includes/Linker.php

+++ mod/includes/Linker.php
@@ -316,8 +316,9 @@

$u .= '#' .

str_replace(array_keys($replacearray),array_values($replacearray),$anchor);

}
if ( $text == '' ) {
  • $text = htmlspecialchars( $nt->getPrefixedText() );

+ $text = $nt->getPrefixedText();

}

+ $text = htmlspecialchars( $text );

if ( $style == '' ) {
        $style = $this->getInternalLinkAttributesObj( $nt, $text );
}

@@ -385,8 +386,10 @@

$u = $nt->escapeLocalURL( $query );

if ( '' == $text ) {
  • $text = htmlspecialchars( $nt->getPrefixedText() );

+ $text = $nt->getPrefixedText();

}

+ $text = htmlspecialchars( $text );
+

$style = $this->getInternalLinkAttributesObj( $nt, $text, 'stub' );

list( $inside, $trail ) = Linker::splitTrail( $trail );

Patch against mediawiki 1.7.1. Please apply.

At least the last issue doesn't seem to appear in the newest MediaWiki.
Someone should probably go through the other issues reported here and maybe open separate bugs for them...

Doodle777 wrote:

  • is also not appropriately escaped, e.g.

http://en.wikipedia.org/wiki/*foo

(In reply to comment #14)

  • is also not appropriately escaped, e.g.

http://en.wikipedia.org/wiki/*foo

  • is not a metacharacter in HTML or CSS (please correct me if I'm wrong), so it doesn't need to be escaped.

Doodle777 wrote:

(In reply to comment #15)

(In reply to comment #14)

  • is also not appropriately escaped, e.g.

http://en.wikipedia.org/wiki/*foo

  • is not a metacharacter in HTML or CSS (please correct me if I'm wrong), so it

doesn't need to be escaped.

Yeah, I realized that a little after I posted that.
Should I post another bug for that?

(In reply to comment #16)

(In reply to comment #15)

(In reply to comment #14)

  • is also not appropriately escaped, e.g.

http://en.wikipedia.org/wiki/*foo

  • is not a metacharacter in HTML or CSS (please correct me if I'm wrong), so it

doesn't need to be escaped.

Yeah, I realized that a little after I posted that.
Should I post another bug for that?

I'm not sure what you mean. The page validates correctly, so I don't see any bug there. Can you elaborate?

These cases seemed to have been cleaned up as of now.