
Allow use of semantic HTML5 elements in wikitext
Open, Low; Public

Assigned To
None
Authored By
bzimport
Jun 12 2010, 5:23 PM

Description

Many of these tags are a natural complement or enhancement to the structure of Wikipedias and Wiktionaries. Levels of support:

  • Whitelist, to allow use in wikitext.
  • Add HTML5 elements to wikitext rendering.
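
As a rough illustration of the first level (whitelisting), the idea can be sketched as an allowlist filter: tags on the list pass through as markup, and everything else is escaped so that it shows up as literal text. This is a minimal Python sketch under stated assumptions, not MediaWiki's actual sanitizer code; `ALLOWED_TAGS`, `AllowlistSanitizer`, and `sanitize` are hypothetical names, and the allowlist contents are made up for the example.

```python
from html import escape
from html.parser import HTMLParser

# Hypothetical allowlist for illustration; not MediaWiki's real configuration.
ALLOWED_TAGS = {"b", "i", "mark", "time", "blockquote", "p"}


class AllowlistSanitizer(HTMLParser):
    """Pass allowlisted tags through; render every other tag as literal text."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.out = []

    def handle_starttag(self, tag, attrs):
        if tag in ALLOWED_TAGS:
            # Attributes are dropped to keep the sketch short.
            self.out.append(f"<{tag}>")
        else:
            # Escape the original markup so it is displayed, not interpreted.
            self.out.append(escape(self.get_starttag_text()))

    def handle_endtag(self, tag):
        if tag in ALLOWED_TAGS:
            self.out.append(f"</{tag}>")
        else:
            self.out.append(escape(f"</{tag}>"))

    def handle_data(self, data):
        # Text content is always escaped.
        self.out.append(escape(data))


def sanitize(fragment: str) -> str:
    parser = AllowlistSanitizer()
    parser.feed(fragment)
    parser.close()
    return "".join(parser.out)


print(sanitize("<mark>hi</mark><nav>x</nav>"))
```

In this sketch, adding `"nav"` to `ALLOWED_TAGS` is all it would take to let `<nav>` through, which is why the discussion on this task is mostly about which tags are safe and meaningful to allow rather than about the mechanism.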

References

Support status

Icon legend:

  • handled or overridden by MediaWiki core
  • handled or overridden by MediaWiki extension
  • enabled
  • enablable - semantic markup
  • enablable - tables enhancement
  • enablable - needs some work though
  • enablable - form control without interaction, for semantic markup

tag | notes
<a> | via discussion with @tstarling doable (in favour of enabling various relevant attributes rather than expanding the current [[..|..]] syntax); T35886: to support for microdata and rdfa, allow <a> tags so external links can have ref/rel attributes
<address> | old HTML spec, not a new feature (T2671: Whitelist non-problematic HTML tags: address, especially later discussion)
<area>, <map> | handled by ImageMap (<imagemap>)
<article> |
<aside> | T104770: Add HTML5 <aside> to the parser whitelist
<col>, <colgroup> | old HTML spec, not a new feature; T2986: [tables] Please implement COL, COLGROUP
<details>, <summary> | T31118: Add HTML 5 semantic elements 'details' and 'summary' to Sanitizer whitelist
<fieldset> | old HTML spec, not a new feature; with <legend>
<figcaption> |
<figure> |
<footer> |
<header> |
<legend> | old HTML spec, not a new feature; with <fieldset>
<link> |
<main> |
<meta> |
<meter> | T211259: Allow use of <meter> element in wikitext
<nav> |
<progress> | fallbackable [ 1, 2 ] to its content: <p>Progress: <progress id="p" max=100><span>0</span>%</progress></p>
<section> | handled by MediaWiki-extensions-LabeledSectionTransclusion; T32597: <section> tag name collides with HTML5 <section> tag
<source> | T39042: Remove <source> syntax from SyntaxHighlight (GeSHi)
<style> | TemplateStyles. See also T52644: Support <style scoped> as HTML element in wiki source, T37704: Drop support in wikitext for inline styles
<tbody>, <tfoot>, <thead> | old HTML spec, not a new feature; T6740: thead, tbody, tfoot for wikitable syntax; T5156: Request not to filter <tbody> and </tbody> codes

Icon legend:

  • invalid (aka not a part of <body>)
  • disabled for security reasons
  • disabled for security reasons (scripting)
  • disabled for security reasons (interactive form control)

Security implications

tag | alternatives / other notes
<audio> | [[File:]] syntax
<base> |
<body> |
<button> | MediaWiki-extensions-InputBox
<canvas> |
<datalist> |
<embed> | T18316: Tags like <embed> are needed
<form> |
<head> |
<html> |
<iframe> | T18316: Tags like <embed> are needed
<img> | [[File:]] syntax, <gallery>
<input> | MediaWiki-extensions-InputBox
<keygen> | deprecated now (see https://developer.mozilla.org/en-US/docs/Web/HTML/Element/keygen)
<label> | MediaWiki-extensions-InputBox
<noscript> | T47731: Allow <noscript> tag
<object> | T18316: Tags like <embed> are needed
<optgroup> |
<option> | MediaWiki-extensions-InputBox
<output> |
<param> |
<picture> | See discussion
<script> |
<select> |
<template> |
<textarea> |
<title> | overridable by {{DISPLAYTITLE:}}
<track> | [[File:]] syntax + TimedMediaHandler
<video> | [[File:]] syntax

Whitelisted for editor use:

<abbr>, <b>, <bdi>, <bdo>, <blockquote>, <br>, <caption>, <cite>, <code>, <data>, <dd>, <del>, <dfn>, <div>, <dl>, <dt>, <em>, <h1>, <h2>, <h3>, <h4>, <h5>, <h6>, <hr>, <i>, <ins>, <kbd>, <li>, <mark>, <ol>, <p>, <pre>, <q>, <rb>, <rp>, <rt>, <rtc>, <ruby>, <s>, <samp>, <small>, <span>, <strong>, <sub>, <sup>, <table>, <td>, <th>, <time>, <tr>, <u>, <ul>, <var>, <wbr>

Details

Reference
bz23932

Related Objects

Event Timeline


Is there anything holding this off now that T147199 is happening?

Re: T147199, you'll probably want to re-evaluate UA stats after Nov 17 to see what the true final fallout is. It *should* significantly reduce the population of certain ancient UAs (notably, IE7-8/XP), but there are a few ways such UAs can stick around as well, and we can't readily predict what the numbers will look like:

  1. There's a complex hack (too complex and off the beaten path for us to recommend it to users, anyways) where one can get better-than-3DES crypto with IE8/XP by manually applying some registry hack to convince MS update servers that it's the POSReady commercial variant of XP, which got a longer support lifetime and some crypto DLL updates. Users who go down this road might use IE8/XP longer than others (though I can't fathom why they'd make this choice, it's still horribly insecure in all other senses).
  2. Some IE7-8/XP users might sit behind TLS-intercepting proxies which upgrade their outbound crypto. E.g. there might be software on the host, or a network appliance, which accepts their crappy 3DES TLS connection, uses a fake root installed on the host to enable TLS proxying in general, and then does better crypto on the outbound side facing our servers.
  3. IE7 (and 8 presumably?) also exists for Windows Vista, and IE7/Vista is not shut out by our crypto changes, as it supports TLS_ECDHE_ECDSA_WITH_AES_128_CBC_SHA via TLSv1.0. This isn't the best crypto option in the world, but it's good enough that it's not going away this year. However, it will probably drop off significantly from whatever its current popularity level is by mid-2018, and depending on the drop in popularity, we may or may not shut it out at the crypto level when the stat drops off sufficiently. The driver here is that by mid-2018 (in many cases, it might get done earlier) all sites that accept credit cards have to eliminate TLSv1.0 to keep up with PCI-DSS standards. PCI-DSS isn't a requirement for our wikis, but it's expected that the lack of compatibility with most of the rest of the commercial internet will drive straggling users away from these older UAs for us.

> you'll probably want to re-evaluate UA stats after Nov 17 to see what the true final fallout is

If I'm reading https://analytics.wikimedia.org/dashboards/browsers/#all-sites-by-browser/browser-family-and-major-tabular-view correctly, then IE8 is at 0.5%. If we assume that 1% of those have disabled JavaScript, that makes it 0.005% of all users, or five in ten thousand. Is that good enough to start whitelisting a few HTML5 elements in wikitext?

Wait a minute, 0.005% isn't "five in ten thousand". It's five in a hundred thousand, isn't it?
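
For reference, the arithmetic behind this correction can be checked with a few lines of Python. This is only a sketch of the calculation discussed above; the 0.5% IE8 share is the figure read off the dashboard in the earlier comment, the 1% no-JavaScript share is that comment's own assumption, and the variable names are made up here.

```python
from fractions import Fraction

ie8_share = Fraction(5, 1000)    # 0.5% of all users (figure quoted from the dashboard)
no_js_share = Fraction(1, 100)   # assumed: 1% of those users have JavaScript disabled

affected = ie8_share * no_js_share   # fraction of all users affected

print(float(affected * 100))   # share expressed as a percentage
print(affected * 100_000)      # affected users per hundred thousand
```

The product is 0.005% of all users, which works out to 5 per 100,000 (equivalently 1 in 20,000), not five in ten thousand.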

Parsoid uses <section>, <figcaption>, and <figure> already in its output (and thus the main parser will too, as Parsoid is merged into core). <picture> could be considered as part of media layout, but we are using other better-supported mechanisms for responsive images at the moment. These tags should not be whitelisted in article content as they conflict with wikitext features.

A number of the remaining tags are related to overall page organization; they should be emitted by the theme, but should not be whitelisted in article content as such use would conflict with the <section> tags already emitted and with the document structure emitted by the theme: <footer>, <header>, <article>, <aside>, <main>, and <nav>.

I have no issue whitelisting the rest: <mark>, <progress>, and <time>. (Of these, <mark> seems the most obviously useful.)

What about <meter>, <details> and <summary>?

Krinkle renamed this task from Enable, whitelist, and incorporate semantic HTML5 elements to Allow use of semantic HTML5 elements in wikitext. Mar 19 2020, 8:52 PM

An element like main should not be enabled in wikitext, but there is no option beyond disabling for security reasons.

I've pulled some stuff out that has clear editor use and which really doesn't need a separate way to Do It.

[snip]

As I said at T104770#6685087, there are valid uses for the wikitext user, in fact, for many of those you put in the "never for wikitext users" buckets. It's not sufficient to say "you can't have this" based on your opinion that they're necessarily for 'overall page organization' or 'media', when in fact, based on the specification, they are not:

  • aside -> Sidebar, side box, quote box, etc. (And I'm pretty sure I could sell the HTML-cognizant folks onwiki that infoboxes fall under this)
  • article -> Main page, portals, talk page comments, Template:Excerpt, news reels a la Wikinews, Wikisource in general, etc.
  • figcaption and figure -> Infoboxes, Module:Gallery, Template:Video game reviews, other table uses in general where caption is insufficient for whatever reason, and this typical use for giving lists headings.
    • (Nb Module:Gallery shouldn't exist, but Wikimedia will always lag users in being able to provide options for display.)
  • header -> Main page, portals; particularly inside article (see MDN).
  • footer -> Same places as header
  • section -> Same places as header
  • nav -> Any navbox
  • picture - This one isn't just for responsiveness. Anyway, I have trouble with this one envisioning use cases for WMF wikis, keeping in mind how editors interact with images today ([[File:]]). The wikitext would need something like a |fallback_n parameter in [[File:]]? (The reason I'm on this task today is because there are 3rd parties that have a use case for this. I'm not one of them and not sure I can describe the use case in question. Saw it on the Discord.)
  • source - Similar to picture.

It's fine if these never have a wikitext implementation; I can't imagine the casual wikitext editor will want most of these. The predominant use is for template/module editors to mark up template-generated content.

For sake of completeness:

  • main is not editor-feasible (and moreover has a requirement in the spec of "1-only" per HTML document)
  • mark and time were implemented
  • progress probably needs a task for implementation, since it's not under dispute.

Eyeballing the rest, I don't see anything else that needs comment from an editor perspective.

Izno updated the task description.

Update re <figure> and <figcaption>:

HTML standard no longer allows an inline attribution inside <blockquote>.

Invalid:

<blockquote>
<p>Attribution for the quotation, if any, must be placed outside the blockquote element.</p>
<cite>HTML Living Standard</cite>
</blockquote>

<figure> and <figcaption> are the only way to semantically associate an inline attribution with a blockquote.

Valid:

<figure>
<blockquote>
<p>Attribution for the quotation, if any, must be placed outside the blockquote element.</p>
</blockquote>
<figcaption>
<cite>HTML Living Standard</cite>
</figcaption>
</figure>

As long as <figure> and <figcaption> remain blacklisted, inline attributions have to be separate, with no semantic association to the quote.

Valid, but no semantic association:

<blockquote>
<p>Attribution for the quotation, if any, must be placed outside the blockquote element.</p>
</blockquote>
<p>
<cite>HTML Living Standard</cite>
</p>