Unclosed "Enlarge" <a> tag from thumbnail is leaking on Special:BookSources
Open, NormalPublic

Description

On https://en.wikipedia.org/wiki/Special:BookSources/0312898959, something is going wrong somewhere between SpecialBookSources, OutputPage, Parser, Tidy or Linker::thumbnail.

It contains the following HTML:

<div class="hatnote">For assistance, see <a href="/wiki/Help:ISBN" title="Help:ISBN">Help:ISBN</a>.</div>
<div class="thumb tleft"><div class="thumbinner" style="width:402px;">
  <div class="noresize">
    <map name="ImageMap_1_1744687168"><area href="#South_America" shape="poly" coords="90,79,70,91,96,160,115,162,135,100" alt="South America" title="South America"/><area href="#Africa" shape="poly" coords="176,49,159,53,141,68,156,95,187,143,222,143,230,83,200,56" alt="Africa" title="Africa"/><area href="#Europe" shape="poly" coords="179,4,153,36,158,49,201,55,208,42,203,14" alt="Europe" title="Europe"/><area href="#Europe" shape="poly" coords="203,14,239,3,333,15,307,36,285,32,263,34,251,34,225,33,208,35" alt="Russia" title="Russia"/><area href="#Asia" shape="poly" coords="207,34,198,55,222,77,254,88,284,104,306,108,314,94,333,18,301,34" alt="Asia" title="Asia"/><area href="#Australasia" shape="poly" coords="314,91,307,105,282,120,284,139,328,156,355,140,366,115" alt="Australasia" title="Australasia"/><area href="#United_States" shape="poly" coords="58,15,39,11,11,31,47,25" alt="United States" title="United States"/><area href="#United_States" shape="poly" coords="49,38,78,38,92,43,99,42,102,40,103,40,104,43,87,66,65,65,57,59,45,58,40,49" alt="United States" title="United States"/><area href="#Canada" shape="poly" coords="58,15,46,26,48,37,79,37,93,43,102,40,105,45,121,40,120,18,122,6,130,3,124,1" alt="Canada" title="Canada"/><area href="#Central_America" shape="poly" coords="45,57,57,58,65,66,75,71,74,74,67,79,47,70" alt="Central America" title="Central America"/><area href="#Greenland" shape="poly" coords="132,2,122,6,127,23,134,25,154,15,164,3" alt="Greenland" title="Greenland"/></map>
    <img alt="" src="//upload.wikimedia.org/wikipedia/commons/thumb/c/c3/BlankMap-World.png/400px-BlankMap-World.png" width="400" height="197" class="thumbimage" srcset="//upload.wikimedia.org/wikipedia/commons/thumb/c/c3/BlankMap-World.png/600px-BlankMap-World.png 1.5x, //upload.wikimedia.org/wikipedia/commons/thumb/c/c3/BlankMap-World.png/800px-BlankMap-World.png 2x" data-file-width="1500" data-file-height="740" usemap="#ImageMap_1_1744687168"/>
  </div>
  <div class="thumbcaption"><div class="magnify"><a href="/wiki/File:BlankMap-World.png" class="internal" title="Enlarge"/></div>Select your region from the map above</div>
</div></div>
<div class="toc" style="float:right">
<div class="toctitle"><b>Brief Table of Contents</b></div>
<ol><li> <a href="#Notes">Notes</a></li>
<li> <a href="#Online_text">Online text</a></li>
<li> <a href="#Online_databases">Online databases</a></

The <a> for the Enlarge link in the div.magnify element is serialized with self-closing syntax, as if it were a void element. Since <a> is not a void element, an HTML parser (Chrome, in this case) ignores the trailing slash and the tag stays open.

This is causing:

  • The entire caption to become part of the link.
  • The next heading to become part of the link.
  • The whitespace before the first item in the table of contents to become a link.
  • Because of the extra link, the first TOC item to be offset 1em to the right (see screenshot).
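
Markup with this problem can be flagged mechanically. The following is a minimal PHP sketch, not MediaWiki code: findSelfClosedNonVoid() is a hypothetical helper, the void-element list is written from memory of the HTML Standard, and the regex is deliberately naive (it does not handle ">" inside attribute values, nor SVG/MathML content where self-closing syntax is legal).

```php
<?php
// Elements that HTML treats as void, i.e. that never take an end tag.
const VOID_ELEMENTS = [
    'area', 'base', 'br', 'col', 'embed', 'hr', 'img', 'input',
    'link', 'meta', 'source', 'track', 'wbr',
];

/**
 * Hypothetical helper: return the names of non-void elements written with
 * XML-style self-closing syntax (e.g. <a href="..."/>). An HTML parser
 * ignores the slash on these and treats them as plain start tags.
 */
function findSelfClosedNonVoid(string $html): array {
    // Match any start tag that ends in "/>" and capture the tag name.
    preg_match_all('/<([a-zA-Z][a-zA-Z0-9]*)\b[^<>]*\/>/', $html, $matches);
    $bad = [];
    foreach ($matches[1] as $name) {
        $name = strtolower($name);
        if (!in_array($name, VOID_ELEMENTS, true)) {
            $bad[] = $name;
        }
    }
    return $bad;
}

// The magnify link from the broken output, plus a legitimately void <img>.
$snippet = '<div class="magnify"><a href="/wiki/File:BlankMap-World.png" class="internal" title="Enlarge"/></div>'
    . '<img alt="map" src="thumb.png"/>';

print_r(findSelfClosedNonVoid($snippet));
```

For the snippet above, only the <a> is reported; the self-closed <img> is skipped because it is a void element.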

The same wikitext is also rendered on Wikipedia:Book_sources. There, however, it renders correctly, with separate opening and closing tags:

<div class="magnify"><a href="/wiki/File:BlankMap-World.png" class="internal" title="Enlarge"></a></div>

Screenshot of the breakage on Special:BookSources:

Screenshot of Wikipedia:Book_sources by comparison, which is unaffected (possibly thanks to Tidy or some other post-processing).

Screenshot of the live DOM, showing that the unclosed <a> is re-created in every block-level element until the parser finds a way to close it. This makes whitespace that would otherwise be insignificant become significant and render, pushing other content (like the first list item) out of place.

The relevant code in Linker::makeThumbLink2:

				$zoomIcon = Html::rawElement( 'div', array( 'class' => 'magnify' ),
					Html::rawElement( 'a', array(
						'href' => $url,
						'class' => 'internal',
						'title' => wfMessage( 'thumbnail-more' )->text() ),
						"" ) );
			}
		}
		$s .= '  <div class="thumbcaption">' . $zoomIcon . $fp['caption'] . "</div></div></div>";

I tried many different ways, but Html::rawElement( 'a', array(), "" ); always produces <a></a>. Having said that, SpecialBookSources does do some weird magic by including the raw wikitext directly in the page to be rendered, thus bypassing some processing. What's causing this?

Event Timeline

Krinkle created this task. Jun 7 2015, 5:10 PM
Krinkle raised the priority of this task to Needs Triage.
Krinkle updated the task description.
Krinkle added a subscriber: Krinkle.
Restricted Application added a subscriber: Aklapper. Jun 7 2015, 5:10 PM
Krinkle triaged this task as Normal priority. Jun 7 2015, 5:11 PM
Krinkle set Security to None.
Restricted Application added a project: Multimedia. Jun 8 2015, 5:11 PM
Restricted Application added a subscriber: Matanya. Jul 5 2015, 7:26 PM
Jdforrester-WMF moved this task from Untriaged to Backlog on the Multimedia board. Sep 4 2015, 6:23 PM
Alexia added a subscriber: Alexia. Mar 15 2016, 8:26 PM

Here is a very simple way to reproduce it: toss the following into any page. $wgTidyConfig is at its default of null and $wgUseTidy is at its default of false. I have not figured out a fix yet.

<imagemap>
File:Test.png|150|thumb|alt=Alt text|This should not? be a link.

default [[Main Page]]
</imagemap>

This definitely should not be a link.
PPaul added a subscriber: PPaul. Apr 17 2016, 5:09 PM
Majr added a subscriber: Majr. Sep 5 2016, 6:04 AM

It's caused by running the HTML through DOMDocument and getting it back out with saveXML, rather than saveHTML.

It can be fixed either by not worrying about the supposed XHTML compliance and using saveHTML, or by specifying the LIBXML_NOEMPTYTAG option on saveXML. The latter, however, causes other issues with valid void tags, such as <img> becoming <img></img>. But at least that doesn't break the page...
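
The three serializations can be compared side by side. This is a minimal standalone sketch, not the actual MediaWiki or extension code: it assumes the markup is loaded with DOMDocument::loadXML (the thread does not show how the real code loads it), and the variable names and the thumb.png filename are made up for illustration.

```php
<?php
// Well-formed input: an empty <a> with an explicit end tag, plus a void <img>.
$markup = '<div class="magnify">'
    . '<a href="/wiki/File:BlankMap-World.png" class="internal" title="Enlarge"></a>'
    . '<img alt="map" src="thumb.png"/>'
    . '</div>';

$doc = new DOMDocument();
// Assumption: the markup is parsed as XML before being serialized again.
$doc->loadXML($markup);
$root = $doc->documentElement;

// XML serialization: a childless element is collapsed to <name/>,
// which is how the empty <a> loses its end tag.
$asXml = $doc->saveXML($root);

// XML serialization with LIBXML_NOEMPTYTAG: every childless element gets an
// explicit end tag, including void ones, so <img> gains </img>.
$asXmlNoEmpty = $doc->saveXML($root, LIBXML_NOEMPTYTAG);

// HTML serialization: end tags are emitted according to HTML rules,
// so <a> keeps </a> and <img> gets none.
$asHtml = $doc->saveHTML($root);

echo $asXml, "\n", $asXmlNoEmpty, "\n", $asHtml, "\n";
```

Of the three, only the plain saveXML output contains the self-closed <a> that browsers leave open.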

TheDJ added a subscriber: TheDJ. Feb 4 2019, 12:44 PM

This seems to be fixed, possibly since we switched to RemexHtml.