Multiline code tag results in strange html
Closed, Duplicate
Public

Description

Author: dan

Description:
When creating a new wiki page with the following content:
<code>
foo()
bar()
</code>

The generated html is:
<div id="mw-content-text" lang="en" dir="ltr" class="mw-content-ltr">
<p><code></code></p><code>
<pre>foo()
bar()
</pre>
</code><p><code></code>
</p>
</div>

The empty code tags are a bit curious.


Version: 1.23.0
Severity: normal

Details

Reference
bz66655

Event Timeline

bzimport raised the priority of this task from to Low. Nov 22 2014, 3:13 AM
bzimport added a project: MediaWiki-Parser.
bzimport set Reference to bz66655.
bzimport added a subscriber: Unknown Object (MLST).

The output in the description happens when Tidy is enabled. When it's disabled, this happens instead:

<p><code>
</p>
<pre>foo()
bar()
</pre>
<p></code>
</p>

Note that this output is also wrong (the <p> and <code> tags are mis-nested), so we can't blame this bug on Tidy.
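
To make the mis-nesting concrete, here is a minimal sketch of a stack-based nesting check built on Python's standard html.parser module. The names NestingChecker, find_misnested and UNTIDIED are made up for this illustration and are not part of MediaWiki; the check only compares each closing tag with the innermost open tag, and it is not a full HTML validator.

```python
from html.parser import HTMLParser


class NestingChecker(HTMLParser):
    """Hypothetical helper: records close tags that do not match the innermost open tag."""

    def __init__(self):
        super().__init__()
        self.stack = []   # currently open tags, innermost last
        self.errors = []  # (closing tag, innermost open tag) pairs

    def handle_starttag(self, tag, attrs):
        self.stack.append(tag)

    def handle_endtag(self, tag):
        if self.stack and self.stack[-1] == tag:
            self.stack.pop()
        else:
            # Mismatch: note it and leave the stack untouched.
            innermost = self.stack[-1] if self.stack else None
            self.errors.append((tag, innermost))


def find_misnested(html):
    """Return the list of (closing tag, innermost open tag) mismatches."""
    checker = NestingChecker()
    checker.feed(html)
    checker.close()
    return checker.errors


# The output quoted above, produced with Tidy disabled.
UNTIDIED = "<p><code>\n</p>\n<pre>foo()\nbar()\n</pre>\n<p></code>\n</p>\n"

if __name__ == "__main__":
    for closing, innermost in find_misnested(UNTIDIED):
        print(f"</{closing}> found while <{innermost}> is the innermost open tag")
```

Run against the quoted output, this reports two mismatches: the first </p> arrives while <code> is still open, and </code> arrives while the second <p> is still open.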

This still happens with RemexHTML; see example. The generated code is slightly different, though.

This has nothing to do with Tidy or Remex; it is specific to how <code> is parsed in wikitext. It is a result of T134469: doBlockLevels() inserts <p> and </p> randomly with no regard for HTML validity.

@ssastry Thanks for the hint! Anyway, the bug on itwiki was also caused by PRE tags inside CODE, which I hadn't noticed because they were generated by indentation.