Page MenuHomePhabricator

Use of center tag ruins paragraph formatting
Closed, ResolvedPublic

Description

Author: puglisi

Description:
When using a center html tag, the following two paragraphs are not broken up,
but instead appear joined. Example test:

  • start wikitext --

<center>
foo
</center>

bar

foo

  • end wikitext --

after the first centered "foo", the page reads "bar foo" on a single line. See
the provided URL for an example.


Version: 1.9.x
Severity: normal
URL: http://it.wikipedia.org/wiki/Utente:Alfio/Test

Details

Reference
bz8293

Event Timeline

bzimport raised the priority of this task from to Medium.Nov 21 2014, 9:33 PM
bzimport added a project: MediaWiki-Parser.
bzimport set Reference to bz8293.
bzimport added a subscriber: Unknown Object (MLST).

I've no idea what's causing this, but I've added a parser test for it in r18402.

Note, this does not happen if the center tags are replaced with div, nor if the
opening and closing tag are on the same line. I suspects some confusion in
doBlockLevels(), but I'm finding it hard to make sense of that code.

Patch to doBlockLevels() Parser.php

The attached patch, which simply moves "</center" from $openmatch to
$closematch, to match the way "</div" is treated, seems to fix the bug and
causes no other parsertests to regress. However, I'd rather like to understand
_why_ this is so.

Attached:

ayg wrote:

Are you sure that doesn't break XHTML validity when you mix it with other stuff?
See bug 6200, which notes identical behavior for <blockquote> and <li>. I
tried a similar tweak for that and it resulted in invalid nesting.

I'm not sure if it's possible to fix this properly without rewriting the
relevant loop entirely. There's a comment in the code that seems to suggest it
was never properly implemented in the first place. In any case, both <li> and
<blockquote>, like <center>, most certainly should be able to contain <p> tags.

ayg wrote:

Yes, but did the patch you just committed cause the output pages to become
invalid XHTML? http://validator.w3.org/

It didn't break any existing parser tests, so if there's any breakage it must be
for some case that no-one has thought of testing (which in itself wouldn't be
surprising). In any case, I frankly consider a possibility of improper nesting
in some yet unknown case to be preferable to the parser just throwing in the
towel on a known input. (Which, by the way, I coincidentally just found a live
example of at [[Cellular automaton]] -- I worked around it there by switching to
<div align="center">.)

ayg wrote:

First of all, your patch doesn't seem to change the behavior for me. Paragraphs
in <center> still get eaten.

Second of all, it's a really bad idea to commit patches to code that you by your
own admission don't understand on the basis that they seem to fix the problem
and don't break parser tests. You run a high risk of screwing things up.

Finally, XHTML validity is not tested for by parser tests, but is nevertheless
important in part (such as validity of nesting), because clients and Tidy may
screw up appearance even more in trying to parse the broken tags. (Other
aspects, like multiple id's, aren't quite so important.) I strongly recommend
that you test for XHTML validity, which I would except that your patch doesn't
work for me (see point 1), and revert if you've broken it.