
Numeric entity references in alt text
Closed, Resolved (Public)

Description

Author: apb

Description:
Wikitext like [[Image:Whatever.gif|&#9792;]] generates HTML output like
<img src="..." alt="&amp;#9792;">, whereas it should generate HTML
output like <img src="..." alt="&#9792;">.


Version: unspecified
Severity: normal

Details

Reference
bz499

Event Timeline

bzimport raised the priority of this task to Medium (Nov 21 2014, 6:57 PM).
bzimport set Reference to bz499.
bzimport added a subscriber: Unknown Object (MLST).

wmahan_04 wrote:

The culprit is this line in Skin::makeImageLinkObj():

$alt = htmlspecialchars( $alt );

I think a good fix would be to instead just replace
" with &quot;. I won't commit this change yet, to avoid
interfering with other work on alt tags (like bug 368).
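
To illustrate the double escaping described above, here is a minimal sketch. It is not the actual Skin::makeImageLinkObj() code; it only assumes that the alt text reaches htmlspecialchars() with the entity reference still in place.

```php
<?php
// Alt text as it might arrive from the wikitext [[Image:Whatever.gif|&#9792;]]
// (assumption: the entity reference is still intact at this point).
$alt = '&#9792;';

// htmlspecialchars() escapes every "&", including the one that starts
// an existing entity reference, so the reference ends up double-escaped.
$escaped = htmlspecialchars( $alt );

echo $escaped, "\n"; // &amp;#9792;
```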

rowan.collins wrote:

Don't forget < and > as well; I suspect leaving those unescaped in attributes is
rather a bad idea. And what about a literal & - that *does* need to be escaped
to &amp;. Somehow, we need to say "escape things that aren't already escaped
entities" :/
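
A small sketch of the concern raised here, using a made-up alt string (not taken from any real page): replacing only the double quote leaves the bare ampersand and the angle brackets untouched.

```php
<?php
// Hypothetical alt text with a bare ampersand, angle brackets, and quotes.
$alt = 'Tom & Jerry <"cat">';

// Replace only the double quote, as in the first proposed fix.
$quoteOnly = str_replace( '"', '&quot;', $alt );

echo $quoteOnly, "\n"; // Tom & Jerry <&quot;cat&quot;>
```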

wmahan_04 wrote:

(In reply to comment #2)

> Don't forget < and > as well; I suspect leaving those unescaped in attributes is
> rather a bad idea. And what about a literal & - that *does* need to be escaped
> to &amp;. Somehow, we need to say "escape things that aren't already escaped
> entities" :/

Right you are. How about this (the regexp is taken from Parser):

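// Escape "&" only where it does not already start an entity reference.
$alt = preg_replace( '/&(?!amp;|#[Xx][0-9A-Fa-f]+;|#[0-9]+;|[a-zA-Z0-9]+;)/', '&amp;', $alt );
// Escape the other characters that are unsafe inside an attribute value.
$alt = str_replace( array('<', '>', '"'), array('&lt;', '&gt;', '&quot;'), $alt );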

Btw, makeThumbLinkObj() has the same code, and so should be changed as well.
Skin.php has a lot of duplicate code, it seems.
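
Since the same escaping is needed in both makeImageLinkObj() and makeThumbLinkObj(), one option would be to pull it into a shared helper. The sketch below is only an illustration: the function name escapeAltText() is hypothetical and not part of Skin.php, and the pattern assumes the intended hex class is [0-9A-Fa-f] and that the lookahead takes no leading colon.

```php
<?php
/**
 * Hypothetical helper (the name is not from MediaWiki): escape text for use
 * in an HTML attribute while leaving existing entity references intact.
 */
function escapeAltText( $alt ) {
    // Escape "&" only when it does not already begin an entity reference.
    $alt = preg_replace(
        '/&(?!amp;|#[Xx][0-9A-Fa-f]+;|#[0-9]+;|[a-zA-Z0-9]+;)/',
        '&amp;',
        $alt
    );
    // Escape the remaining characters that are unsafe in an attribute value.
    return str_replace(
        array( '<', '>', '"' ),
        array( '&lt;', '&gt;', '&quot;' ),
        $alt
    );
}

echo escapeAltText( '&#9792;' ), "\n";         // &#9792;
echo escapeAltText( 'Tom & Jerry' ), "\n";     // Tom &amp; Jerry
echo escapeAltText( 'a < b "quoted"' ), "\n";  // a &lt; b &quot;quoted&quot;
echo escapeAltText( '&amp; &#x2640;' ), "\n";  // &amp; &#x2640;
```

Note that any "&name;" sequence made of letters and digits is treated as an existing entity and left alone, whether or not the name is a defined HTML entity.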

wmahan_04 wrote:

I committed the change to HEAD (Skin.php revision 1.287), and
added a parser test case for this bug.

1.4 release imminent, resolving as fixed.