
Edge case tokenizing difference
Closed, Resolved · Public

Description

{|
 !
 foo
 !
 bar
 |}

See the Parsoid PEG tokenizer trace and the resulting HTML output:

[subbu@earth:~/work/wmf/parsoid] parse.js --trace peg --normalize=parsoid< /tmp/wt
0-[peg]        | ---->   [{"type":"TagTk","name":"table","attribs":[],"dataAttribs":{"tsr":[0,2]}}]
0-[peg]        | ---->   [{"type":"NlTk","dataAttribs":{"tsr":[2,3]}}," ",{"type":"TagTk","name":"th","attribs":[],"dataAttribs":{"tsr":[4,5],"tmp":{"noAttrs":true}}},{"type":"NlTk","dataAttribs":{"tsr":[5,6]}}," ","foo",{"type":"NlTk","dataAttribs":{"tsr":[10,11]}}," ","!",{"type":"NlTk","dataAttribs":{"tsr":[13,14]}}," ","bar",{"type":"NlTk","dataAttribs":{"tsr":[18,19]}}," ",{"type":"SelfclosingTagTk","name":"meta","attribs":[{"k":"typeof","v":"mw:TSRMarker"},{"k":"data-etag","v":"th"}],"dataAttribs":{"tsr":[20,20]}}]
0-[peg]        | ---->   ["|}"]
0-[peg]        | ---->   [{"type":"NlTk","dataAttribs":{"tsr":[22,23]}}]
0-[peg]        | ---->   [{"type":"EOFTk"}]
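
To read the trace, it helps to map the `tsr` values back onto the input. The sketch below assumes that `tsr` pairs are half-open `[start, end)` character offsets into the wikitext and that `/tmp/wt` contained exactly the wikitext from the description followed by a single newline (both assumptions are consistent with the trace, but neither is stated in it). The names `slice_tsr` and `tsr_ranges` are illustrative only and are not part of Parsoid.

```python
# Wikitext from the description, assumed to end with a single newline.
wikitext = "{|\n !\n foo\n !\n bar\n |}\n"

# (label, tsr) pairs copied from the trace output.
tsr_ranges = [
    ("TagTk table", (0, 2)),
    ("TagTk th", (4, 5)),
    ("NlTk after foo", (10, 11)),
    ("NlTk after second !", (13, 14)),
    ("meta TSRMarker (zero width)", (20, 20)),
    ("final NlTk", (22, 23)),
]


def slice_tsr(text, tsr):
    """Return the source text covered by a [start, end) offset pair."""
    start, end = tsr
    return text[start:end]


for label, tsr in tsr_ranges:
    print(f"{label:28} {tsr} -> {slice_tsr(wikitext, tsr)!r}")

# The second "!" (offset 12) and the closing "|}" (offsets 20-22) carry no
# token of their own in the trace: they were emitted as plain strings.
print(repr(wikitext[12]), repr(wikitext[20:22]))
```

Under these assumptions the first `!` at offsets 4-5 becomes a `th` token, while the second `!` and the `|}` fall inside the text run that ends up in the `<pre>`.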

<table>
<tbody>
<tr>
<th><pre>foo
!
bar
|}</pre></th>
</tr>
</tbody>
</table>

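The PEG tokenizer keeps parsing the indented lines as indent-pre even though we are in a table context, where the second ! should start a new table heading. In the trace, the second ! and the closing |} are emitted as plain strings, so both end up inside the <pre>.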
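
The expected precedence can be sketched as a line classifier in which indented table syntax wins over indent-pre while a table is open. This is a simplified illustration of the rule described here, not Parsoid's PEG grammar or the fix that was merged; `classify_line`, `classify`, and the category names are hypothetical.

```python
def classify_line(line, in_table):
    """Classify one wikitext line, giving table syntax priority over indent-pre."""
    stripped = line.lstrip(" \t")
    if stripped.startswith("{|"):
        return "table-start"
    if in_table:
        # Inside a table, leading whitespace does not hide table markup.
        if stripped.startswith("|}"):
            return "table-end"
        if stripped.startswith("|-"):
            return "table-row"
        if stripped.startswith("!"):
            return "table-heading"
        if stripped.startswith("|"):
            return "table-cell"
    if line.startswith(" ") and stripped:
        return "indent-pre"
    return "text"


def classify(wikitext):
    """Classify every line while tracking whether a table is currently open."""
    in_table = False
    kinds = []
    for line in wikitext.splitlines():
        kind = classify_line(line, in_table)
        if kind == "table-start":
            in_table = True
        elif kind == "table-end":
            in_table = False
        kinds.append(kind)
    return kinds


if __name__ == "__main__":
    sample = "{|\n !\n foo\n !\n bar\n |}\n"
    for kind, line in zip(classify(sample), sample.splitlines()):
        print(f"{kind:14} {line!r}")
```

With this rule, each indented ! opens a heading and the indented |} closes the table, while " foo" and " bar" remain indent-pre content. Outside a table, an indented line that starts with ! is still treated as indent-pre.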

Event Timeline

ssastry created this task. · May 20 2018, 4:31 AM
Restricted Application added a subscriber: Aklapper. · May 20 2018, 4:31 AM
ssastry triaged this task as Normal priority. · May 20 2018, 4:32 AM
ssastry moved this task from Backlog to Read Views on the Parsoid board. · May 20 2018, 4:46 AM
Arlolra claimed this task. · May 22 2018, 4:13 PM

Yes, I think so.

Change 434608 had a related patch set uploaded (by Arlolra; owner: Arlolra):
[mediawiki/services/parsoid@master] Protect indented table syntax from indent pre parsing

https://gerrit.wikimedia.org/r/434608

Change 434608 merged by jenkins-bot:
[mediawiki/services/parsoid@master] Protect indented table syntax from indent pre parsing

https://gerrit.wikimedia.org/r/434608

Change 434725 had a related patch set uploaded (by Arlolra; owner: Arlolra):
[mediawiki/services/parsoid@master] Protect the pipe variable as well from indent pre

https://gerrit.wikimedia.org/r/434725

Change 434725 merged by jenkins-bot:
[mediawiki/services/parsoid@master] Protect the pipe variable as well from indent pre

https://gerrit.wikimedia.org/r/434725

Arlolra closed this task as Resolved. · May 23 2018, 4:25 PM
Vvjjkkii renamed this task from Edge case tokenizing difference to tlcaaaaaaa. · Jul 1 2018, 1:08 AM
Vvjjkkii reopened this task as Open.
Vvjjkkii removed Arlolra as the assignee of this task.
Vvjjkkii raised the priority of this task from Normal to High.
Vvjjkkii updated the task description.
Vvjjkkii edited subscribers, added: Arlolra; removed: gerritbot, Aklapper.
CommunityTechBot lowered the priority of this task from High to Normal. · Jul 3 2018, 3:26 AM