
Token stream patcher table start retokenizing doesn't handle non-string tokens in table attribute position
Open, Medium, Public

Event Timeline

Arlolra created this task. Oct 25 2018, 3:27 PM
Restricted Application added a subscriber: Aklapper. Oct 25 2018, 3:27 PM
Arlolra claimed this task. Oct 25 2018, 7:51 PM
Arlolra triaged this task as Medium priority.

Change 469684 had a related patch set uploaded (by Arlolra; owner: Arlolra):
[mediawiki/services/parsoid@master] [WIP] Throws from TSP re-tokenizing

https://gerrit.wikimedia.org/r/469684

While the above patch fixes the crasher and improves on the rendering we had before the crash was exposed, there's still some work to do to match the PHP parser's output.

It seems like the token stream patcher needs to buffer all the tokens in table start tag attribute position, up to the end of the line, and then stringify them if we're going to retokenize the start tag.
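
As a rough illustration of the buffering idea above (not Parsoid's actual code), here is a minimal Python sketch. The `Tok` class, its `src` field, and the `stringify_until_eol` helper are all hypothetical names invented for this example, and the sketch assumes that each newline arrives as its own token and that every non-string token can report the wikitext it came from.

```python
from dataclasses import dataclass
from typing import List, Tuple, Union


@dataclass
class Tok:
    """Hypothetical non-string token that remembers its original wikitext."""
    name: str
    src: str


Token = Union[str, Tok]


def is_newline(tok: Token) -> bool:
    # Assumption: a newline is always a separate token, either the
    # string "\n" or a Tok named "newline".
    return tok == "\n" or (isinstance(tok, Tok) and tok.name == "newline")


def stringify_until_eol(tokens: List[Token], start: int) -> Tuple[str, int]:
    """Buffer tokens from `start` up to the end of the line.

    Returns the joined source text of the buffered tokens and the index
    of the first token that was not consumed (the newline, if any).
    """
    parts: List[str] = []
    i = start
    while i < len(tokens) and not is_newline(tokens[i]):
        tok = tokens[i]
        # Strings are kept as-is; other tokens fall back to their source.
        parts.append(tok if isinstance(tok, str) else tok.src)
        i += 1
    return "".join(parts), i


# Tokens loosely modelled on the fiwiki source quoted in this task.
tokens: List[Token] = [
    "{|",
    " ",
    Tok("includeonly", '<includeonly>border="0"</includeonly>'),
    Tok("noinclude", '<noinclude>class="wikitable"'),
    "\n",
    "|}",
]
attr_src, next_index = stringify_until_eol(tokens, 1)
```

The resulting `attr_src` string could then be appended to `{|` and handed back to the tokenizer, so that the retokenizing step only ever sees plain text in attribute position.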

Just noting the original reduced test case as,

<noinclude>
<!--
--></noinclude>{|

There's a space after the table start tag.

Without digging into the details, is this lintable and classifiable as 'unsupported' behavior?

Change 469684 merged by jenkins-bot:
[mediawiki/services/parsoid@master] Accept spaces at the end of table_start_tag

https://gerrit.wikimedia.org/r/469684

The source on that fiwiki page is pretty much,

<noinclude>
<!-- 
--></noinclude>{| <includeonly>border="0" style="background: none" </includeonly><noinclude>class="wikitable"
</noinclude>

We probably support that kind of attribute tokenizing in other circumstances where we don't end up in the token stream patcher.

TBH, I'm not really sure what use cases the token stream patcher is handling or why we end up there at all in this case.

Going to repurpose this task now that the fix for the crasher is merged.

Arlolra renamed this task from Expected end of input but " " found. to Token stream patcher table start retokenizing doesn't handle non-string tokens in table attribute position. Oct 26 2018, 5:53 PM
Arlolra removed Arlolra as the assignee of this task. Jan 23 2019, 4:53 PM
ssastry edited projects, added Parsoid-Read-Views; removed Parsoid. Jun 10 2019, 7:57 PM
Aklapper edited projects, added Parsoid; removed Parsoid-Read-Views. Feb 29 2020, 5:14 PM
Arlolra moved this task from Needs Triage to Backlog on the Parsoid board. Mar 2 2020, 4:27 PM
LGoto moved this task from Backlog to Bugs & Crashers on the Parsoid board. May 8 2020, 4:24 PM