P-wrapping doesn't match php parser's $inBlockElem
Closed, Resolved (Public)

Description

[subbu@earth:~/work/wmf/parsoid] cat /tmp/wt 
{|
|-

{|
| x
|}

a

b

c
[subbu@earth:~/work/wmf/parsoid] parse.js --normalize=parsoid < /tmp/wt
<table>
<tbody>
<tr class="mw-empty-elt"></tr>
</tbody>
</table>
<table>
<tbody>
<tr>
<td>x</td>
</tr>
</tbody>
</table>
<p>a b c</p>

Unclosed tables are a fairly common category of syntax error. So, we should probably try to fix this.

Event Timeline

ssastry triaged this task as Medium priority. May 16 2018, 3:17 AM
ssastry updated the task description.
ssastry moved this task from Needs Triage to Read Views on the Parsoid board.

So, we should probably try to fix this.

By "this", I'm going to assume you mean the paragraph wrapping of the abc. The rest of the render looks right to me.

Yes, p-wrapping.

Parsoid's p-wrapper gets its "pseudo-DOM info" from table stacks and, obviously, that goes haywire with unclosed tables. So it's unclear how this can actually be fixed in a reasonable manner, besides moving p-wrapping to the DOM (which keeps popping up every once in a while).

Alternatively, we can throw up our hands and say this is broken wikitext and behavior is undefined => we won't try to make it always do the right thing. But, if possible, we should recover from it.
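To make the table-stack problem above concrete, here is a minimal toy sketch in Python. It is not Parsoid's or core's actual algorithm: the function `wrap_paragraphs` and the single `table_depth` counter are hypothetical stand-ins for the real table stacks. It only shows how a line-based wrapper that trusts open/close counts loses track of paragraph context once a table is left unclosed.

```python
def wrap_paragraphs(wikitext: str) -> list[str]:
    """Toy line-based p-wrapper (hypothetical; not Parsoid's real code).

    Tracks table nesting with a counter and only wraps text in <p>
    when it believes it is outside every table.
    """
    out: list[str] = []
    para: list[str] = []
    table_depth = 0  # stand-in for the "table stack" pseudo-DOM info

    def flush() -> None:
        # Emit the pending paragraph, if any.
        if para:
            out.append("<p>" + " ".join(para) + "</p>")
            para.clear()

    for line in wikitext.splitlines():
        stripped = line.strip()
        if stripped.startswith("{|"):
            flush()
            table_depth += 1          # table opened
        elif stripped.startswith("|}") and table_depth > 0:
            table_depth -= 1          # table closed
        elif table_depth > 0:
            continue                  # "inside a table": no p-wrapping here
        elif stripped == "":
            flush()                   # blank line ends a paragraph
        else:
            para.append(stripped)
    flush()
    return out


# Balanced table followed by three paragraphs.
BALANCED = "{|\n| x\n|}\n\na\n\nb\n\nc"
# The input from the task description: the outer table is never closed.
UNCLOSED = "{|\n|-\n\n{|\n| x\n|}\n\na\n\nb\n\nc"

print(wrap_paragraphs(BALANCED))
print(wrap_paragraphs(UNCLOSED))
```

With the balanced input the toy yields three separate paragraphs. With the unclosed outer table the counter never returns to zero, so a, b and c are never treated as paragraph content at all. The real failure mode differs in detail (Parsoid emits a single `<p>a b c</p>`), but the root cause sketched here is the same kind of stale nesting state.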

I'm going to see if the table stacks actually match up with the categories grouped in core,
https://github.com/wikimedia/mediawiki/commit/1f907d500a257a8af6b7d67072843a1ab4d3becf

Notes to self ...

Parsoid's p-wrapper seems to at least get the "block on a line" part right, but the table stacks / hasOpenHTMLPTag don't exactly mimic the PHP parser's $inBlockElem. For example,

<p>


hi



</p>

parses as,

<p class="mw-empty-elt" data-parsoid='{"stx":"html"}'></p>
<p>
<br/>
hi</p>



<p></p>

Arlolra renamed this task from Edge case parsing bug to P-wrapping doesn't match php parser's $inBlockElem. May 29 2018, 8:31 PM

Change 436847 had a related patch set uploaded (by Arlolra; owner: Arlolra):
[mediawiki/services/parsoid@master] [WIP] Pare down some things in paragraph wrapping

https://gerrit.wikimedia.org/r/436847

Change 436847 merged by jenkins-bot:
[mediawiki/services/parsoid@master] Pare down some things in paragraph wrapping

https://gerrit.wikimedia.org/r/436847

Vvjjkkii renamed this task from P-wrapping doesn't match php parser's $inBlockElem to 1vcaaaaaaa. Jul 1 2018, 1:09 AM
Vvjjkkii reopened this task as Open.
Vvjjkkii removed Arlolra as the assignee of this task.
Vvjjkkii raised the priority of this task from Medium to High.
Vvjjkkii updated the task description.
Vvjjkkii edited subscribers, added: Arlolra; removed: gerritbot, Aklapper.
CommunityTechBot lowered the priority of this task from High to Medium. Jul 3 2018, 3:27 AM