Broken table wikitext pattern requires edits to templates on multiple wikis
Closed, Resolved (Public)

Description

At least cebwiki and svwiki seem to use a pattern that embeds a table inside another table in a fosterable position (directly inside the outer table rather than inside one of its cells) and relies on a Tidy fixup that inverts the order of the tables.

Ex: https://sv.wikipedia.org/wiki/Kugelstein and https://ceb.wikipedia.org/wiki/Wujieyue_Shan.

svwiki pages have this wikitext:

{| border="1"
{{klimatöversikt
...
}}
|}

and cebwiki pages have this:

{| border="1"
{{climate chart
...
}}
|}

In both cases, we effectively have HTML of the form:

<table border="1"> <---- GENERATED BY THE {| WIKITEXT
<table class="infobox" style="width: 19.5em; float: left; clear: left; margin-left:0em;margin-right:1em; text-align: center; border: solid 1px silver" cellspacing="0" cellpadding="0"> <---- GENERATED BY THE TEMPLATE
...
</table>
</table>

This is broken HTML, and an HTML5 parser (RemexHTML, HTML5Depurate, Balancer, Parsoid) will generate output of the form:

<table border="1"></table> <--- EMPTY TABLE
<table class="infobox" style="width: 19.5em; float: left; clear: left; margin-left:0em;margin-right:1em; text-align: center; border: solid 1px silver" cellspacing="0" cellpadding="0">
.... table content here ...
</table>

However, Tidy removes the infobox-class table shell and keeps the table border="1" in place, yielding:

<table border="1">
... table content here ...
</table>

Given this Tidy output (where the template's outermost table shell is removed), a good way to get similar output in Tidy, RemexHTML, and Parsoid would be to edit the templates and remove their opening <table> and closing </table>. This applies to cebwiki:climate chart, svwiki:klimatöversikt, and similar templates on other wikis where this pattern is used.

This difference has been found in the Tidy replacement project tests (row 3 of Replacing Tidy Test Results) and while comparing PHP parser output and Parsoid output.

A good way to do this would be to:

  1. Edit the relevant template on one of these wikis by removing the opening <table> and closing </table>
  2. Use the "Preview page with this template" functionality before saving the changes and verify that the new output is still similar.
  3. Save, and purge an affected page via ?action=purge
  4. Verify that RemexHTML output now matches the Tidy output via the URL https://<wiki>.wikipedia.org/w/index.php?title=<title>&action=parsermigration-edit.
  5. (Optional) Verify that Parsoid output now matches the Tidy output (by going to https://<wiki>.wikipedia.org/api/rest_v1/page/html/<title> and force-reloading the page).

See also: https://www.mediawiki.org/wiki/Parsing/Replacing_Tidy/FAQ#Simplified_instructions_for_fixing_pages
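The URLs used in steps 3 to 5 above can be generated per page. The following is a minimal Python sketch, assuming the URL shapes quoted in the steps; the function name verification_urls and the dictionary keys are hypothetical and only serve this illustration.

```python
from urllib.parse import quote, urlencode


def verification_urls(wiki, title):
    """Build the purge, parsermigration-edit, and Parsoid HTML URLs for a page.

    `wiki` is the language subdomain (e.g. "sv" or "ceb"); `title` is the page title.
    """
    # MediaWiki titles conventionally use underscores in place of spaces.
    title = title.replace(" ", "_")
    base = f"https://{wiki}.wikipedia.org"
    index = f"{base}/w/index.php"
    return {
        # Step 3: purge the cached rendering of an affected page.
        "purge": f"{index}?{urlencode({'title': title, 'action': 'purge'})}",
        # Step 4: compare Tidy and RemexHTML output.
        "parsermigration": f"{index}?{urlencode({'title': title, 'action': 'parsermigration-edit'})}",
        # Step 5 (optional): fetch the Parsoid HTML for the page.
        "parsoid_html": f"{base}/api/rest_v1/page/html/{quote(title, safe='')}",
    }


if __name__ == "__main__":
    for name, url in verification_urls("sv", "Kugelstein").items():
        print(f"{name}: {url}")
```

The sketch only builds the URLs; opening them and comparing the rendered output is still a manual step.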

Event Timeline

An alternative to doing the live template edit would be for someone to make copies of the affected templates to their local sandbox, and do this testing to verify that this will actually fix the problem.

ssastry triaged this task as Medium priority. Mar 24 2017, 9:01 PM

Similar problem with this enwiki template. Line 5 (and 6 too) can be safely deleted. This affects pages like [[en:Typhoon Parma]]

EDIT April 5, 13:53 CT: I edited the template and fixed the problem.

You can also use the "Preview page with this template" functionality when editing the template.

Aha! Where is the Like button when I need it? ;-) Can you update the task description accordingly? :-)

Similar problem with this enwiki template. Line 5 (and 6 too) can be safely deleted. This affects pages like [[en:Typhoon Parma]]

I felt bold and fixed this.

Change 346611 had a related patch set uploaded (by Subramanya Sastry):
[mediawiki/services/parsoid@master] WIP: Add linter detection for T161341

https://gerrit.wikimedia.org/r/346611

Note that we have a related test, "Table in fosterable position", and a related task, T60730.

Change 346611 merged by jenkins-bot:
[mediawiki/services/parsoid@master] Add linter detection for T161341

https://gerrit.wikimedia.org/r/346611

This is now mostly covered by the deletable-table-tag linter category and should let editors find these pages and templates easily. If there are any instances not covered by that linter category, we can update the detection code. I'll leave this open till Linter is reenabled on large wikis and we are able to get some experience with editors fixing templates and pages flagged by the linter.
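To illustrate the pattern being flagged, here is a minimal Python sketch that scans rendered HTML for a <table> opened directly inside another table, outside any cell or caption. It is not Parsoid's detection code: the class and function names are hypothetical, it uses only the standard-library tokenizer, and it does not model implied end tags, so treat it as a rough approximation.

```python
from html.parser import HTMLParser

# Elements inside a table whose content is not foster-parented:
# a nested table inside one of these is legitimate.
CELL_LIKE = {"td", "th", "caption"}


class FosteredTableFinder(HTMLParser):
    """Records the (line, column) of each <table> opened in fosterable position."""

    def __init__(self):
        super().__init__()
        self.stack = []     # open <table>/<td>/<th>/<caption> elements, innermost last
        self.fostered = []  # positions of offending <table> start tags

    def handle_starttag(self, tag, attrs):
        if tag == "table":
            # Look at the nearest enclosing table or cell-like element.
            for open_tag in reversed(self.stack):
                if open_tag in CELL_LIKE:
                    break  # nested inside a cell or caption: fine
                if open_tag == "table":
                    self.fostered.append(self.getpos())
                    break
        if tag == "table" or tag in CELL_LIKE:
            self.stack.append(tag)

    def handle_endtag(self, tag):
        # Pop back to the innermost matching open element, if any.
        if tag in self.stack:
            while self.stack.pop() != tag:
                pass


def find_fostered_tables(html):
    """Return the positions of <table> tags that sit in fosterable position."""
    finder = FosteredTableFinder()
    finder.feed(html)
    finder.close()
    return finder.fostered


if __name__ == "__main__":
    sample = (
        '<table border="1">\n'
        '<table class="infobox">\n'
        "<tr><td>table content here</td></tr>\n"
        "</table>\n"
        "</table>"
    )
    print(find_fostered_tables(sample))
```

A table nested inside a <td> is not reported, which matches the distinction drawn in the task description: only the inner table opened directly inside the outer one is hoisted out by an HTML5 parser.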

ssastry claimed this task.

The linter category tells editors what to fix and editors are doing these fixes at this point. Nothing more to do here for now.