Wikitext constructs across table and templates are not properly parsed
Closed, Declined · Public · BUG REPORT

Description

Steps to replicate the issue:

https://en.wikipedia.org/wiki/ALT_Linux#Version_history

  • Create a template that expands to something like:
<templatestyles src="Version/styles.css" />class="templateVersion co swatch-unsupported" style="color: var(--color-base, #202122); " title="Old version, not maintained" data-sort-value="1.1" | <span style="display: none;">Old version, not maintained:</span> 1.1

which consists of table-cell attributes followed by the cell content.

  • Use it in a table.
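
For illustration, a minimal table of this shape might look like the sketch below. The template name `Example cell` is hypothetical and stands in for any template whose expansion is the text shown above; the real page uses a different template and a larger table.

```wikitext
{| class="wikitable"
|-
! Version
|-
<!-- "Example cell" is a hypothetical template name. Its expansion is meant to
     supply both the cell attributes and, after the inner "|", the cell content. -->
| {{Example cell}}
|}
```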

What happens?:

The content expanded from the template is treated as a single cell, and the attributes are not parsed as attributes.

What should have happened instead?:

As with the legacy parser, the attributes expanded from the template should be recognized.

Other information:

According to docs/design.md, the code in Wt2Html/PP/Handlers/TableFixups.php should handle this case, but it is still broken, so I am reporting it as a bug.

Event Timeline

Investigation notes: getReparseType() and other code have incorrect checks for "in extension content". The wrapper node is both from an extension *and* from a template, so when the first node is also an extension tag, the check skips over the entire template content. Related: T87274: DOM nodes with multiple typeof values.

Setting the general parsing question aside, the example wikitext is invalid anyway. You cannot put a <templatestyles> tag in the middle of what eventually expands to <td class ...>, because the templatestyles tag itself expands to one of two HTML tags, <style> or <link>, depending on whether another templatestyles tag of the same kind is already on the page. If you want to keep the TemplateStyles tag, you need to move it into the cell content proper, that is, after the | character mid-template.
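
As a sketch of that rearrangement, here is the expansion from the task description with only the <templatestyles> tag moved to just after the | character. This is derived from the example above and is an assumption about the shape of the fix, not the actual change made to the template.

```wikitext
class="templateVersion co swatch-unsupported" style="color: var(--color-base, #202122); " title="Old version, not maintained" data-sort-value="1.1" | <templatestyles src="Version/styles.css" /><span style="display: none;">Old version, not maintained:</span> 1.1
```

With this ordering, everything before the | is plain attribute text, and the <style> or <link> element produced by TemplateStyles ends up inside the cell content.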

In the legacy parser (Parser.php), even without the template involvement that this example adds, this kind of wikitext drops the tag from the HTML entirely, which I would consider about as buggy as anything. Parsoid isn't much better because it gets sucked into a data-mw attribute for some reason, which is still buggy IMO, but at least you know there was supposed to be a tag there. It still should appropriately be resolved by fixing the source template.

... Parsoid isn't much better because it gets sucked into a data-mw attribute for some reason, which is still buggy IMO, but at least you know there was supposed to be a tag there. It still should appropriately be resolved by fixing the source template.

Parsoid's output here deliberately matches the legacy parser's behavior, to minimize rendering incompatibilities at switchover. Parsoid stores the tag in data-mw so that it can be emitted back during html -> wt transformations.

Per @Izno's comment above, it looks like @XtexChooser has updated the template to fix the issue. The table now renders identically with Parsoid and the legacy parser. As such, I am going to decline this issue. If a similar issue crops up in the future for something else, we can revisit whether something needs to happen in Parsoid.