Copied from the description of T37002 :
I noticed that the Sanitizer function fails when the image tag is written XHTML-style as a self-closing tag: <img src='http://image-url' />.
The net is full of reports about such differences (XHTML vs. HTML 4.01 vs. HTML5);
see for example http://tiffanybbrown.com/2011/03/23/html5-does-not-allow-self-closing-tags/ .
This is still a valid bug, and a big problem for my RSS extension if the administrator expressly allows the rendering of <img> tags.
In this case the Sanitizer is called as usual but is given an additional allowed tag:
$extraInclude[] = "img";
but it still *fails* to allow the img tag and escapes it anyway, which is wrong in this case.
In other words, and to make it unambiguously clear:
The Sanitizer still does its work and sanitizes the img tags (as a fallback, this is still secure), even when you explicitly want it *not* to sanitize them.
Because I have no influence over how image tags are written in the incoming source (the RSS feed), and some of these sources do not follow the rules for image tags, I ask the MediaWiki Sanitizer specialist to fix this specific problem of "closed image tags".
How to reproduce:
(excerpt and constructed example; part of Extension:RSS)
$extraInclude = array();
$extraExclude = array( "iframe" );
// Append to the array of extra allowed tags; a plain assignment would overwrite it with a string
$extraInclude[] = "a";
$extraInclude[] = "img";
$text = '<img src="http://tctechcrunch2011.files.wordpress.com/2013/03/yodlee_logo_final_rgb_lrg.jpg">';
$ret = Sanitizer::removeHTMLtags( $text, null, array(), $extraInclude, $extraExclude );
wfDebug( "RSS: after Sanitizer::removeHTMLtags:text:" . print_r( $text, true ) . "\n" );
wfDebug( "RSS: after Sanitizer::removeHTMLtags:extraInclude: " . print_r( $extraInclude, true ) . "\n" );
wfDebug( "RSS: after Sanitizer::removeHTMLtags:extraExclude: " . print_r( $extraExclude, true ) . "\n" );
wfDebug( "RSS: after Sanitizer::removeHTMLtags:ret: " . print_r( $ret, true ) . "\n" );
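Until this is fixed in the Sanitizer, a possible stopgap on the extension side might be to rewrite self-closing img tags into the plain HTML form before the text is sanitized. The following is only a minimal sketch: normalizeSelfClosingImg() is a hypothetical helper, not part of MediaWiki or Extension:RSS, and it has not been tested against the Sanitizer itself, so whether it actually avoids the escaping is an assumption. It also assumes that attribute values contain no ">" character.

```php
<?php
/**
 * Hypothetical helper (not a MediaWiki API): rewrite XHTML-style
 * self-closing <img ... /> tags as plain <img ...> tags.
 * Assumes attribute values do not contain a ">" character.
 */
function normalizeSelfClosingImg( string $html ): string {
	// Match "<img", its attributes (lazily), optional whitespace, then "/>".
	$result = preg_replace( '/<img\b([^>]*?)\s*\/>/i', '<img$1>', $html );
	// preg_replace() returns null on a regex error; fall back to the input.
	return $result ?? $html;
}
```

It would be applied to $text just before the Sanitizer::removeHTMLtags() call in the example above. Tags other than img, and img tags that are not self-closing, pass through unchanged.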
See Also: T37002: Sanitizer:removeHTMLtags fails for <img src=> tag when enclosed in <a> link