
Edge case bug in table tokenizing
Closed, Resolved (Public)

Description

The template is https://nl.wikipedia.org/wiki/Sjabloon:Infobox_luchthaven , and you can see examples at https://nl.wikipedia.org/w/index.php?title=Vliegbasis_Eindhoven&veaction=edit or https://nl.wikipedia.org/wiki/Luchthaven_Schiphol?veaction=edit . Tagging @Arlolra, who's a wizard with templates (and can tell whether Parsoiders need to be involved).

Event Timeline

Elitre raised the priority of this task from to Needs Triage.
Elitre updated the task description.
Elitre added a project: VisualEditor.
Elitre subscribed.
matmarex subscribed.

Clearly a Parsoid-related issue. The template (which generates a table) has several tables inside, and their closing |} are "indented" with some leading spaces (this is visible in the template expansion). Parsoid misparses one of these as preformatted text, so one of the tables is left unclosed, and everything following the template gets included in the table it generates. Perhaps related to the use of {{!}} to build the tables?

http://parsoid.wmflabs.org/nlwiki/Vliegbasis_Eindhoven

ssastry renamed this task from "Airport infobox at nl.wp swallows the whole article" to "Edge case bug in table tokenizing". Dec 31 2014, 7:30 PM
ssastry triaged this task as Medium priority.
ssastry removed a project: VisualEditor.
ssastry set Security to None.
ssastry subscribed.

Reproducible with this test snippet. There are space chars on the 3 lines after "|foo". It looks like at least 2 otherwise-empty lines, each containing only a leading space, are required for the indented " |}" to be tokenized as a string instead of a table end tag.

{|
|foo
 
 
 |}
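
Because the significant spaces in the snippet above are invisible and easily stripped when copying, here is a minimal Python sketch that rebuilds it with explicit whitespace. The names `SNIPPET`, `show_whitespace` and `table_end_lines` are hypothetical helpers for illustration only; the regular expression is a naive line-based check for an indented table end, not Parsoid's tokenizer or the fix in the patch.

```python
import re

# The test snippet with its whitespace spelled out: two lines holding a
# single space each, followed by a table end indented by one space.
SNIPPET = "{|\n|foo\n \n \n |}\n"

# Naive check (an assumption for illustration, not Parsoid's grammar):
# a line counts as a table end if it is "|}" after optional indentation.
TABLE_END = re.compile(r"^[ \t]*\|\}")


def show_whitespace(wikitext):
    """Return the lines with each space replaced by a visible middle dot."""
    return [line.replace(" ", "\u00b7") for line in wikitext.split("\n")]


def table_end_lines(wikitext):
    """Return the 1-based numbers of lines that look like a table end."""
    return [
        number
        for number, line in enumerate(wikitext.split("\n"), start=1)
        if TABLE_END.match(line)
    ]


if __name__ == "__main__":
    for line in show_whitespace(SNIPPET):
        print(line)
    print(table_end_lines(SNIPPET))
```

The fifth line is the one that should be tokenized as a table end tag; feeding `SNIPPET` to a parser is a way to check whether the indented " |}" is still being treated as a string.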

Change 182378 had a related patch set uploaded (by Subramanya Sastry):
T85627: Fix edge case tokenizing table lines

https://gerrit.wikimedia.org/r/182378

Patch-For-Review

Change 182378 merged by jenkins-bot:
T85627: Fix edge case tokenizing table lines

https://gerrit.wikimedia.org/r/182378

ssastry added a subscriber: Arlolra.