Page MenuHomePhabricator

Copy-pasted table into VE from external site with <thead> and <tbody> sections breaks as <tr>s not created at section boundaries
Closed, ResolvedPublic0 Story Points

Description

I just copied a simple table from an external website into a VE edit.
Inside the edit mode it looked perfect :-) However, when I saved the edit and viewed my changes, the first column was broken. I imported the same table again, saved, and the same problem. I then simply added " |- " between the header descriptions and the first column of content, and it worked fine. So, this bug is both about why the pipe for the first column wasn't there automatically, and also, why this table displayed correctly in VE but differently once it saved.

The original table is here:
search for "Senate Quotas Ready Reckoner" at http://blogs.abc.net.au/antonygreen/2016/06/2016-senate-calculators.html

The Diff of the edit that I first saved is here:
https://en.wikipedia.org/w/index.php?title=Australian_federal_election,_2016&diff=prev&oldid=724452786
(the only change I made to the table was to remove the "caption". Otherwise, this is the code that was automatically created).

And the diff of the very simple fix I made in code-editor to make the table work properly once I noticed the problem:
https://en.wikipedia.org/w/index.php?title=Australian_federal_election,_2016&diff=next&oldid=724452786

Event Timeline

Restricted Application added subscribers: Zppix, Aklapper. · View Herald TranscriptJun 9 2016, 10:32 AM

(adding Parsoid just for a sanity check, thanks Sherry for suggesting it.)

This is a Parsoid bug assuming VE returned the HTML with a <thead> and <tbody> sections.

[subbu@earth parsoid] cat /tmp/tbl
<table>
<thead>
<tr><th>h1</th> </tr>
</thead>
<tbody>
<tr><td>a</td></tr>
<tr><td>b</td></tr>
</tbody>
</table>
[subbu@earth parsoid] parse.js --html2wt < /tmp/tbl
{|

!h1 

|a
|-
|b

|}

Without the thead and tbody, Parsoid handles it fine

[subbu@earth parsoid] cat /tmp/tbl
<table>
<tr><th>h1</th> </tr>
<tr><td>a</td></tr>
<tr><td>b</td></tr>
</table>
[subbu@earth parsoid] parse.js --html2wt < /tmp/tbl
{|
!h1 
|-
|a
|-
|b
|}

https://gerrit.wikimedia.org/r/#/c/291385/ would have fixed this issue cleanly ... but, that is a potentially controversial fix, so will provide a different fix for now.

Change 293881 had a related patch set uploaded (by Subramanya Sastry):
WIP: T137406: Emit |- between thead/tbody/tfoot

https://gerrit.wikimedia.org/r/293881

Jdforrester-WMF renamed this task from Copy-paste simple table from external site in VE breaks first row to Copy-pasted table into VE from external site with <thead> and <tbody> sections breaks as <tr>s not created at section boundaries.Jun 12 2016, 10:53 AM
Jdforrester-WMF assigned this task to ssastry.
Jdforrester-WMF triaged this task as Normal priority.
Jdforrester-WMF set the point value for this task to 0.
Jdforrester-WMF moved this task from To Triage to TR0: Interrupt on the VisualEditor board.

Change 293881 merged by jenkins-bot:
T137406: Emit |- between thead/tbody/tfoot

https://gerrit.wikimedia.org/r/293881

Well, I only got half way there it appears ... looks like there is some other corner case that didn't show up in parser tests but showed up in production .. https://en.wikipedia.org/w/index.php?title=User%3ASSastry_%28WMF%29%2FVE_Test&type=revision&diff=725462249&oldid=724047983 ... There is a missing newline. Well, will be fixed in the next round.

ssastry moved this task from Backlog to In Progress on the Parsoid board.Jun 15 2016, 8:21 PM

Second half of the fix now in gerrit.

ssastry closed this task as Resolved.Jun 23 2016, 5:52 PM

Deployed and verified fixed.

ssastry removed a subscriber: gerritbot.