
Parsoid doesn't insert the same spacers when article text starts with two or more newlines.
Open, Low, Public

Description

PHP:

$ (echo 'x' ) | php maintenance/parse.php --quiet
<div class="mw-parser-output"><p>x
</p></div>
$ (echo; echo 'x' ) | php maintenance/parse.php --quiet
<div class="mw-parser-output"><p>x
</p></div>
$ (echo; echo; echo 'x' ) | php maintenance/parse.php --quiet
<div class="mw-parser-output"><p><br />
x
</p></div>
$ (echo; echo; echo; echo 'x' ) | php maintenance/parse.php --quiet
<div class="mw-parser-output"><p><br />
</p><p>x
</p></div>
$ (echo; echo; echo; echo; echo 'x' ) | php maintenance/parse.php --quiet
<div class="mw-parser-output"><p><br />
</p><p><br />
x
</p></div>
$ (echo; echo; echo; echo; echo; echo 'x' ) | php maintenance/parse.php --quiet
<div class="mw-parser-output"><p><br />
</p><p><br />
</p><p>x
</p></div>
$ (echo; echo; echo; echo; echo; echo; echo 'x' ) | php maintenance/parse.php --quiet
<div class="mw-parser-output"><p><br />
</p><p><br />
</p><p><br />
x
</p></div>

Parsoid:

$ (echo 'x' ) | bin/parse.js --normalize
<p>x</p>
$ (echo; echo 'x' ) | bin/parse.js --normalize
<p>x</p>
$ (echo ; echo; echo 'x' ) | bin/parse.js --normalize
<p>x</p>
$ (echo ; echo; echo; echo 'x' ) | bin/parse.js --normalize
<p><br/> x</p>
$ (echo ; echo; echo; echo; echo 'x' ) | bin/parse.js --normalize
<p><br/></p>
<p>x</p>
$ (echo ; echo; echo; echo; echo; echo 'x' ) | bin/parse.js --normalize
<p><br/></p>
<p>x</p>
$ (echo ; echo; echo; echo; echo; echo; echo 'x' ) | bin/parse.js --normalize
<p><br/></p>
<p><br/> x</p>

In general, Parsoid appears to generate the output that PHP produces for one fewer leading newline, but there are other differences with 4 and 5 newlines that should be looked into.
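The inputs in the transcripts above can also be generated by a small helper instead of chaining `echo` by hand. This is only a sketch: the function name `leading_newlines` is made up for illustration and is not part of MediaWiki or Parsoid. The commented-out loop reuses the two commands from the description as they appear there.

```shell
# Hypothetical helper (not part of MediaWiki or Parsoid):
# print N empty lines followed by a line containing 'x'.
leading_newlines() {
  n=$1
  i=0
  while [ "$i" -lt "$n" ]; do
    echo
    i=$((i + 1))
  done
  echo 'x'
}

# Example comparison loop, assuming both checkouts are reachable
# from the current directory (left commented out here):
# for n in 0 1 2 3 4 5 6; do
#   echo "== $n leading newlines =="
#   leading_newlines "$n" | php maintenance/parse.php --quiet
#   leading_newlines "$n" | bin/parse.js --normalize
# done
```

`leading_newlines 3` produces the same input as `(echo; echo; echo; echo 'x')`.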

Event Timeline

cscott created this task. Sep 8 2017, 8:49 PM
Restricted Application added a subscriber: Aklapper. Sep 8 2017, 8:49 PM
ssastry triaged this task as Low priority. Sep 11 2017, 7:10 PM
$ (echo; echo; echo; echo 'x' ) | php maintenance/parse.php --quiet
<div class="mw-parser-output"><p><br />
</p><p>x
</p></div>
$ (echo; echo; echo; echo 'x' ) | php maintenance/parse.php --quiet
<div class="mw-parser-output"><p><br />
</p><p><br />
x
</p></div>

What happened here? Is that some sort of non-determinism :)

cscott updated the task description. Sep 19 2017, 5:15 PM

What happened here? Is that some sort of non-determinism :)

Whoops, cut-and-paste error. Fixed.

LGoto moved this task from Needs Triage to Backlog on the Parsoid board. Feb 15 2020, 9:42 PM
LGoto moved this task from Backlog to Bugs & Crashers on the Parsoid board. May 28 2020, 6:28 PM
Aklapper removed cscott as the assignee of this task. Jun 19 2020, 4:27 PM

This task has been assigned to the same task owner for more than two years. Resetting task assignee due to inactivity, to decrease task cookie-licking and to get a slightly more realistic overview of plans. Please feel free to assign this task to yourself again if you still realistically work or plan to work on this task - it would be welcome!

For tips on how to manage individual work in Phabricator (noisy notifications, lists of tasks, etc.), see https://phabricator.wikimedia.org/T228575#6237124 for available options.
(For the record, two emails were sent to assignee addresses before resetting assignees. See T228575 for more info and for potential feedback. Thanks!)