This is a down-scoped version of "T6740: thead, tbody, tfoot for wikitable syntax", which proposed "wikitext-ish" vertical-bar-and-braces syntax for these table features. This task is "just" for allowing these tags to be included as literal HTML in wikitext. (Previously: T5156, and I'm sure others.)
The core is a 1- or 2-line patch to the Sanitizer to allow these tags through when present as tag literals in wikitext. But that opens up a number of possible issues that would need to be understood and worked through:
- As discussed in T6740#8383265 and T289817#8225410, the jquery.tablesorter.js component "already does" this, so we'd want to make sure a literal <thead> doesn't break tablesorter when used.
- Need to think through what attributes should be allowed on these elements. Probably the attributes the sanitizer allows for <table> and <tr> are appropriate?
- As mentioned in T5156#74956, we'd have to verify that Remex Tidy handles corner cases, such as nesting or "orphaned" <tr>/<thead> elements, correctly. (This is *probably* correct and/or reasonable, since Remex is based on HTML5 semantics, but we haven't actually *tested* this corner of its behavior AFAIK...)
- There will be interactions with "wikitext" table syntax. What happens if you start a wikitext table with {| and then insert a literal <thead> tag? Is that behavior consistent in Parsoid and the legacy parser, and is the exact behavior something we want to support forever? The legacy parser's table-handling is pretty janky (T134469, etc) -- should we try to explicitly disable <thead> and friends if we're not in a "literal HTML" table, so we don't get inconsistencies between the legacy parser and Parsoid?
- Do the <thead> etc elements play nicely with the current WMF skins and article CSS?
- Do the <thead> etc elements play nicely with the transformations done for mobile?
- How does this interact with <caption>/<colgroup>/<col> and <table summary="..."> and the scope and headers attributes? (See https://developer.mozilla.org/en-US/docs/Learn/HTML/Tables/Advanced). Perhaps we want to roll this out as part of a more-complete "HTML5 table" feature?
- Should we also support a "default" <thead> element -- that is, in the absence of an explicit <thead>, if the first row(s) of a table contain nothing but "th" cells, should they be hoisted into a default <thead>? Is that even useful, if the <thead> contains no class or id attributes? (And how would that hoisting interact with something like {{#attr}} (T230658)?) (This is probably a big enough feature chunk to merit a subtask of its own.)
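To make the "default <thead>" question in the last bullet concrete, here is a rough sketch of the hoisting rule as described there. It is not based on any existing Parsoid or legacy-parser code. The row model (a row is just a list of cell tag names) and the function name split_default_thead are hypothetical, and the choice that an empty row ends the head is an assumption, not something decided in this task.

```python
from typing import List, Tuple

# Hypothetical model: a row is a list of cell tag names, "th" or "td".
Row = List[str]


def split_default_thead(rows: List[Row]) -> Tuple[List[Row], List[Row]]:
    """Split rows into (head_rows, body_rows).

    Leading rows made up entirely of "th" cells go to the head. The first
    row that is empty or contains a non-"th" cell ends the head
    (assumption: empty rows are never hoisted).
    """
    head_count = 0
    for row in rows:
        if row and all(cell == "th" for cell in row):
            head_count += 1
        else:
            break
    return rows[:head_count], rows[head_count:]
```

Under this rule a row with a leading "th" followed by "td" cells (a row header) stays in the body, and a table consisting only of "th" rows would be hoisted entirely. Whether that last case is desirable is one of the things a subtask would need to settle.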
Improving table support is probably worth putting on the Content-Transform-Team roadmap, but not until after the Parsoid-Read-Views migration.