
Single-line definition lists don't work after html=ish tags with colon in their body
Open, Low, Public

Description

Consider the following wikitext:

;<b>foo:bar</b>:bat

In PHP, this will result in:

<dl><dt><b>foo:bar</b></dt><dd>bat</dd></dl>

However, in Parsoid we don't generate the <dd> for bat -- the final colon is treated literally, not as the start of the definition.
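The expected split can be sketched as a scan that tracks tag nesting and only accepts a colon at depth zero. This is a minimal illustration of the behaviour described above, not the PHP parser's or Parsoid's actual code; `split_definition_line` and `TAG_RE` are hypothetical names, and the tag pattern is deliberately simplistic (it does not know about void elements such as `<br>`).

```python
import re

# Hypothetical, simplified pattern for an html-ish tag: <b>, </b>, <b class="x">, <br/>.
# Assumption: attribute values never contain '>'.
TAG_RE = re.compile(r"<(/?)([A-Za-z][A-Za-z0-9]*)\b[^>]*?(/?)>")


def split_definition_line(line):
    """Split ';term:definition' at the first colon outside any open tag.

    Returns (term, definition); definition is None if no such colon exists.
    Returns None if the line is not a ';' definition-list line.
    """
    if not line.startswith(";"):
        return None
    body = line[1:]
    depth = 0  # number of currently open html-ish tags
    i = 0
    while i < len(body):
        m = TAG_RE.match(body, i)
        if m:
            closing, _name, self_closing = m.groups()
            if closing:
                depth = max(depth - 1, 0)
            elif not self_closing:
                depth += 1
            i = m.end()  # skip the whole tag, including colons in attributes
            continue
        if body[i] == ":" and depth == 0:
            return body[:i], body[i + 1:]
        i += 1
    return body, None


print(split_definition_line(";<b>foo:bar</b>:bat"))
```

Under these assumptions, the colon inside `<b>...</b>` is skipped because the depth is 1 there, so the term is `<b>foo:bar</b>` and the definition is `bat`, matching the PHP output above.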

Event Timeline

Arlolra triaged this task as Medium priority. Dec 22 2016, 6:26 PM

Well, this works fine,

;<b>foo bar</b>:bat

The problem is that in,

;foo:bar:bat

the second colon is rightly ignored.

But in the test case,

;<b>foo:bar</b>:bat

what happens is the first colon gets tokenized as a tag and, similarly, the second is ignored.

Then, in the listHandler's onListItem, the numOpenTags check suppresses the dd inside the tag, returning just the colon string instead, resulting in no dd's on the line. However, at that point it's too late to restore the ignored second colon to a tag, short of retokenizing to the end of the line.
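To make the failure mode concrete, here is a toy model of the two stages described in this comment. It is only a sketch under stated assumptions and not Parsoid's code: `toy_tokenize`, `toy_list_handler`, and `TOY_TAG_RE` are hypothetical names, the tokenizer is assumed to turn only the first colon on the line into a dd token, and the handler is assumed to downgrade that token to plain text whenever a tag is still open.

```python
import re

# Hypothetical, simplified tag pattern; not Parsoid's grammar.
TOY_TAG_RE = re.compile(r"<(/?)[A-Za-z][^>]*>")


def toy_tokenize(line):
    """Tokenize a ';' line. Only the FIRST colon becomes a ('dd', ':') token;
    any later colon is left as plain text."""
    tokens = []
    seen_colon = False
    i = 1  # skip the leading ';'
    while i < len(line):
        m = TOY_TAG_RE.match(line, i)
        if m:
            kind = "close" if m.group(1) else "open"
            tokens.append((kind, m.group(0)))
            i = m.end()
        elif line[i] == ":" and not seen_colon:
            tokens.append(("dd", ":"))
            seen_colon = True
            i += 1
        else:
            tokens.append(("text", line[i]))
            i += 1
    return tokens


def toy_list_handler(tokens):
    """Return (term, definition). A dd token seen while a tag is open is
    suppressed and emitted as the literal ':' string instead."""
    num_open_tags = 0
    term, definition = "", None
    for kind, value in tokens:
        if kind == "dd" and num_open_tags == 0:
            definition = ""  # start collecting the definition
            continue
        if kind == "open":
            num_open_tags += 1
        elif kind == "close":
            num_open_tags -= 1
        # Text, tags, and suppressed dd tokens are all appended verbatim.
        if definition is None:
            term += value
        else:
            definition += value
    return term, definition


print(toy_list_handler(toy_tokenize(";<b>foo:bar</b>:bat")))
```

In this model the only dd token is the colon inside `<b>`, which the handler suppresses, and the colon after `</b>` was never a token, so no definition is produced. The `class="fo:o"` case works because the attribute colon is consumed as part of the tag.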

Something else that works fine now is,

;<b class="fo:o">bar</b>:bat

Fixing this is going to be tricky. It's likely the same fix as in T99843 and looks like I started on a patch in https://gerrit.wikimedia.org/r/#/c/194587 but abandoned it for the time being.

Arlolra renamed this task from Single-line definition lists don't work after html=ish tags to Single-line definition lists don't work after html=ish tags with colon in their body. Dec 22 2016, 9:04 PM
Arlolra lowered the priority of this task from Medium to Low.
Arlolra subscribed.