
Get rid of self-closing slash in void elements like `<br>`, `<img>`, `<input>` or `<link>` on HTML5 documents
Closed, Declined (Public)

Description

Get rid of self-closing slash in void elements like <br>, <img>, <input> or <link> – the slash "is just syntactic sugar". See also https://stackoverflow.com/a/3558200/1696030

Event Timeline

Volker_E set the priority of this task to Needs Triage.
Volker_E updated the task description.
Volker_E added a project: Performance Issue.
Volker_E added subscribers: Volker_E, Jdlrobson, Gilles and 3 others.

This already exists as $wgWellFormedXml, which can be set to false.
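For illustration, a minimal sketch of what that override might look like for a site administrator. It assumes the installed MediaWiki version still supports this setting and that site-specific overrides are placed in LocalSettings.php, as is usual for MediaWiki configuration variables:

```php
// Appended to LocalSettings.php (site-specific overrides).

// Assumption: this MediaWiki version still honours $wgWellFormedXml.
// With false, the HTML output may omit XML-style constructs where HTML5
// permits, such as the trailing slash on void elements like <br> or <img>.
$wgWellFormedXml = false;
```

This only changes one wiki's own configuration. It does not change the default shipped in MediaWiki core.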

OK, so the question is then: what does $wgWellFormedXml affect, and why don't we set it to false by default?

Citing from DefaultSettings.php:

/**
 * Should we try to make our HTML output well-formed XML?  If set to false,
 * output will be a few bytes shorter, and the HTML will arguably be more
 * readable.  If set to true, life will be much easier for the authors of
 * screen-scraping bots, and the HTML will arguably be more readable.
 *
 * Setting this to false may omit quotation marks on some attributes, omit
 * slashes from some self-closing tags, omit some ending tags, etc., where
 * permitted by HTML5.  Setting it to true will not guarantee that all pages
 * will be well-formed, although non-well-formed pages should be rare and it's
 * a bug if you find one.  Conversely, setting it to false doesn't mean that
 * all XML-y constructs will be omitted, just that they might be.
 *
 * Because of compatibility with screen-scraping bots, and because it's
 * controversial, this is currently left to true by default.
 */
$wgWellFormedXml = true;

The comment is IMHO misleading, because it states "HTML will arguably be more readable" twice, once for each value.
Are screen-scraping bots still negatively affected?

Just quoting http://dev.w3.org/html5/spec-author-view/syntax.html a bit:

Then, if the element is one of the void elements, or if the element is a foreign element, then there may be a single U+002F SOLIDUS character (/). This character has no effect on void elements, but on foreign elements it marks the start tag as self-closing.

Void elements
area, base, br, col, command, embed, hr, img, input, keygen, link, meta, param, source, track, wbr

Is this task about changing the default value of $wgWellFormedXml in either MediaWiki core or for Wikimedia wikis? If so, I think the task should be more narrowly focused.

We seem to be following the spec currently, so I'm not really sure what problems we'd be solving here by making a change. Or is there some benefit we'd gain by removing the optional slashes?

Duplicate of T52040 (2013). Declined per Tim Starling. Not worth the few bytes it saves compared to the cost of breaking ease of parse-ability.

Because the subject is semantically different, leaving this open for now.

Krinkle removed a project: Performance Issue.
Krinkle set Security to None.