New paragraph before section heading becomes line break
Open, Normal, Public

Description

As reported on German WP, whenever one tries to add a new paragraph before a section heading (by pressing enter either with the cursor before the first character of the heading or with the cursor after the last character of the preceding paragraph), VE apparently converts that to an HTML line break. I guess that is not supposed to happen, is it?

Event Timeline

Restricted Application added a subscriber: Aklapper. Jan 31 2019, 1:06 AM
matmarex added a subscriber: matmarex.

I suspect a relation to T184755 (in some cases we need to generate <br> tags internally when there are empty paragraphs in wikitext).

CC @ssastry does this look like a Parsoid issue?

Lofhi added a subscriber: Lofhi. Feb 20 2019, 2:54 PM

Similar problem observed on frwiki; diffs: 1 and 2.

JTannerWMF moved this task from To Triage to Freezer on the VisualEditor board. Feb 26 2019, 5:07 PM

I don't know offhand. Can someone reproduce this and report the VE-generated HTML for the edited area in question?

It's reproducible. I just added an empty paragraph between the lead and section heading in this edit: https://en.wikipedia.org/w/index.php?title=User:Matma_Rex/sandbox&diff=885235533&oldid=885235196&diffmode=source

There should be no <br /> in the wikitext output. Only a bunch of newline characters.

Recording:

VE HTML (ve.init.target.surface.getHtml()) when loading the page:

<p id="mwAg">Foo.</p>

<h2 id="Bar">Bar</h2>
<p id="mwBA">Baz.</p>

VE HTML after adding the empty paragraph:

<p id="mwAg">Foo.</p><p id="mwAg"></p>

<h2 id="Bar">Bar</h2>
<p id="mwBA">Baz.</p>
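
To make the difference between the two snapshots explicit, here is a minimal sketch (an illustration only, not VE or Parsoid code) that lists the empty `<p>` elements in an HTML fragment using Python's standard `html.parser`. The names `EmptyParagraphFinder` and `empty_paragraph_ids` are hypothetical; the two fragments are copied from the snapshots above.

```python
from html.parser import HTMLParser


class EmptyParagraphFinder(HTMLParser):
    """Collect the ids of <p> elements that have no text and no children."""

    def __init__(self):
        super().__init__()
        self.empty_ids = []        # ids of empty <p> elements, in document order
        self._in_p = False         # True while a <p> is open
        self._open_id = None       # id attribute of the currently open <p>
        self._has_content = False  # True once the open <p> has text or a child

    def handle_starttag(self, tag, attrs):
        if tag == "p":
            self._in_p = True
            self._has_content = False
            self._open_id = dict(attrs).get("id")
        elif self._in_p:
            # Any child element counts as content.
            self._has_content = True

    def handle_data(self, data):
        if self._in_p and data.strip():
            self._has_content = True

    def handle_endtag(self, tag):
        if tag == "p" and self._in_p:
            if not self._has_content:
                self.empty_ids.append(self._open_id)
            self._in_p = False


def empty_paragraph_ids(html):
    """Hypothetical helper: return the ids of all empty <p> elements."""
    finder = EmptyParagraphFinder()
    finder.feed(html)
    finder.close()
    return finder.empty_ids


# Fragments copied from the two snapshots in this comment.
BEFORE = '<p id="mwAg">Foo.</p>\n\n<h2 id="Bar">Bar</h2>\n<p id="mwBA">Baz.</p>'
AFTER = (
    '<p id="mwAg">Foo.</p><p id="mwAg"></p>\n\n'
    '<h2 id="Bar">Bar</h2>\n<p id="mwBA">Baz.</p>'
)

print(empty_paragraph_ids(BEFORE))
print(empty_paragraph_ids(AFTER))
```

The only structural difference this reports is the single empty paragraph in front of the heading, which in the snapshot carries the same id as the paragraph before it.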

See T184755: Consider not removing multiple blank lines/white space between paragraphs and https://gerrit.wikimedia.org/r/c/mediawiki/services/parsoid/+/465656

I know about that task (I even linked it here). There should be <br /> generated in the HTML output for multiple newlines in wikitext. There must not be any in the wikitext output for multiple empty paragraphs in HTML, though.

See the commit message of the linked gerrit patch. <p></p> will disappear during html->wt. If VE wants Parsoid to insert newlines in wikitext and inserts multiple <p></p>, Parsoid normalizes them to the wikitext-output form by inserting <br/> tags to mimic that. See T184755#4656731 specifically.

But you do not need <br> in wikitext to emit empty paragraphs in HTML? You just need newlines. I don’t understand why the <br> tags end up in wikitext.

The <br/> is an edge case (literally) because of the single newline needed before headings (and I suppose Parsoid's html->wt figures that this is the only way to preserve that newline). We could probably figure out what to do there if the <br /> is undesirable. In general, though, if multiple empty lines are inserted in VE, those newlines will make their way to wikitext (mentioning that because T217205 got closed as a dupe of this).

[subbu@earth:~/work/wmf/parsoid] echo "<p>a</p><p></p><p></p><p>b</p>" | parse.js --scrubWikitext --html2wt
a



b
[subbu@earth:~/work/wmf/parsoid] echo "<p>a</p><p></p><p>b</p>" | parse.js --scrubWikitext --html2wt
a


b
[subbu@earth:~/work/wmf/parsoid] echo "<p>a</p><p></p><p></p><h2>x</h2>" | parse.js --scrubWikitext --html2wt
a


== x ==
[subbu@earth:~/work/wmf/parsoid] echo "<p>a</p><p></p><h2>x</h2>" | parse.js --scrubWikitext --html2wt
a

<br />

== x ==
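
The runs above could be turned into a small regression check. The following is a minimal sketch and not part of Parsoid: the helper name `find_br_before_heading` and both regular expressions are assumptions made for illustration, and the sample strings are copied from the last two runs of the transcript. It flags any line that consists solely of a `<br>` tag and is followed, after optional blank lines, by a wikitext heading.

```python
import re

# Hypothetical patterns (not a Parsoid API): a line holding only a <br> tag,
# and a wikitext heading such as "== x ==".
BR_LINE = re.compile(r"^\s*<br\s*/?>\s*$", re.IGNORECASE)
HEADING = re.compile(r"^(={1,6})[^=].*\1\s*$")


def find_br_before_heading(wikitext: str) -> list[int]:
    """Return 1-based line numbers of <br> lines that precede a heading."""
    lines = wikitext.split("\n")
    hits = []
    for i, line in enumerate(lines):
        if not BR_LINE.match(line):
            continue
        # Skip blank lines after the <br> and inspect the next non-blank line.
        j = i + 1
        while j < len(lines) and lines[j].strip() == "":
            j += 1
        if j < len(lines) and HEADING.match(lines[j]):
            hits.append(i + 1)
    return hits


# Wikitext copied from the html2wt runs above.
ONE_EMPTY_P_BEFORE_HEADING = "a\n\n<br />\n\n== x =="
TWO_EMPTY_P_BEFORE_HEADING = "a\n\n\n== x =="

print(find_br_before_heading(ONE_EMPTY_P_BEFORE_HEADING))
print(find_br_before_heading(TWO_EMPTY_P_BEFORE_HEADING))
```

Only the single-empty-paragraph case is flagged; the output with two empty paragraphs before the heading contains newlines only.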

Cirdan added a subscriber: Cirdan. Mar 2 2019, 4:12 PM
Stryn added a subscriber: Stryn. Jul 22 2019, 6:38 PM

example edit (with link corrected)
And today there is a new message at the village pump: https://fi.wikipedia.org/wiki/Wikipedia:Kahvihuone_(tekniikka)#%22Br-merkinn%C3%A4t%22

I think this should be made a higher priority.
Users blame the visual editor when it leaves a mess behind.

ssastry triaged this task as Normal priority. Edited Aug 14 2019, 11:42 AM

We are currently in a feature freeze for Parsoid because of the ongoing port of Parsoid to PHP. As for bug fixes, we are only addressing critical bugs at this time. We should be out of this freeze soon (probably October) and can start working on bug fixes again after that.

JTannerWMF added a subscriber: JTannerWMF.

Moving this to External, since the Parsing team will work on it once they lift the feature freeze.