
Comment at the beginning of paragraph placed outside the paragraph
Closed, Declined; Public

Assigned To
None
Authored By
matmarex
Feb 21 2015, 3:47 AM
Referenced Files
F45810: Screen_Shot_2015-02-23_at_2.05.11_PM.png
Feb 23 2015, 10:48 PM
F44845: pasted_file
Feb 21 2015, 5:36 PM
F44843: pasted_file
Feb 21 2015, 5:36 PM

Description

Comment at the beginning of paragraph in wikitext is placed outside the paragraph in Parsoid's HTML.

Wikitext input:

<!-- comment -->foo

Actual:

<!-- comment --><p …>foo</p>

Expected:

<p …><!-- comment -->foo</p>

This will affect VE's rendering of such wikitext after T73085 gets fixed.

Event Timeline

matmarex set the priority of this task to Needs Triage.
matmarex updated the task description.
matmarex added a project: Parsoid.
matmarex subscribed.

I don't understand this ... Why does comment placement affect HTML rendering?

VisualEditor displays the comments as little clickable icons within the page. <p><!-- asd -->foo</p> will render like:

pasted_file (32×67 px, 651 B)
while <!-- asd --><p>foo</p> will render like:
pasted_file (63×39 px, 663 B)
(with the comment wrapped in its own paragraph).

matmarex set Security to None.

To test after the fix: in Read mode, both <p><!-- asd -->foo</p> and <!-- asd --><p>foo</p> are displayed as:

Screen_Shot_2015-02-23_at_2.05.11_PM.png (79×104 px, 3 KB)

Generally speaking, there should be no discrepancy between how content is displayed in VE and in Read mode.

This is still quite odd, but I think I understand why it matters: comments in wikitext are not really comments in the HTML sense. They are wikitext hacks for attaching metadata or annotations, so they need to stay closely tied to the paragraph they appear in for a meaningful editing experience. Does that sound about right?

ssastry triaged this task as Medium priority. Feb 24 2015, 6:39 PM

So, there are all kinds of scenarios here ... some input from you can help think about this in a systematic way:

<!--foo-->bar
<!--foo-->           bar <div>foo</div>
       <!--foo-->    bar <div> foo </div>
<!--foo-->[[Category:bar]]abc
[[Category:bar]]<!--foo--> abc
abc<!--foo-->
abc   [[Category:baz]]    <!--foo-->

So, basically, I am trying to figure out in which situations comments are tied to the paragraph. What happens when whitespace, categories, language links, and other kinds of "rendering-transparent" wikitext show up on that line? Do we then start wrapping all of that other stuff in the same paragraph? That would move in the opposite direction of the fixes we made for T69554 and T73361, although there are intervening line breaks in those scenarios, so we can probably distinguish between the two.

That brings me to other scenarios: "a\n<!--foo-->\nb" vs "<!--foo-->\nb" and then throw in some categories, whitespace, etc. there.

So, I am looking for some guidance about when a comment is considered an annotation on a paragraph and should be tied to it vs. be kept out of it.

First, obviously anything we do here will be a heuristic anyway, and I don't really think it is important. I'd be perfectly happy if the bug was declined entirely.

If there's a newline between normal text and the comment, then I'm pretty sure we should render it outside the paragraph – this would make sense with how I've seen these warning comments used in the past (sorry, I have no examples handy). Not sure what to do with other cases; I think it would be reasonable to special-case the simple situation only (with no categories or other funny things around).

So for these seven examples, the following pseudocode output would look sensible to me. Compared to current output, the paragraphs would have the comment nested inside them if they were direct siblings, or if there was only a text node containing only spaces between them (and maybe tabs, depending on how you handle tabs/spaces elsewhere).

<p><!--foo-->bar</p>
<p><!--foo-->           bar </p><div>foo</div>
       <p><!--foo-->    bar </p><div> foo </div>
<!--foo-->[[Category:bar]]<p>abc</p>
[[Category:bar]]<p><!--foo--> abc</p>
<p>abc<!--foo--></p>
<p>abc   </p>[[Category:baz]]    <!--foo-->
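The adjacency heuristic described above can be sketched in a few lines. This is only an illustration under stated assumptions: the `(kind, payload)` tuple representation and the function name `nest_adjacent_comments` are hypothetical and are not Parsoid's actual DOM or API. A comment is moved into a paragraph only when it is a direct sibling of that paragraph, or when a single text node made up of spaces is the only thing between them.

```python
import re

# A text node counts as a "gap" only if it holds nothing but spaces (no newlines).
SPACES_ONLY = re.compile(r" *\Z")


def is_space_text(node):
    return node[0] == "text" and SPACES_ONLY.match(node[1]) is not None


def nest_adjacent_comments(nodes):
    """Move comments into a neighbouring paragraph.

    `nodes` is a flat list of (kind, payload) tuples (a hypothetical
    stand-in for top-level DOM nodes). For kind "p", payload is a list
    of child nodes; for everything else it is a string.
    """
    out = []
    i = 0
    while i < len(nodes):
        node = nodes[i]
        if node[0] != "comment":
            out.append(node)
            i += 1
            continue

        # Leading case: comment, optional spaces-only text, then a paragraph.
        j = i + 1
        gap = []
        if j < len(nodes) and is_space_text(nodes[j]):
            gap = [nodes[j]]
            j += 1
        if j < len(nodes) and nodes[j][0] == "p":
            out.append(("p", [node] + gap + list(nodes[j][1])))
            i = j + 1
            continue

        # Trailing case: paragraph, optional spaces-only text, then the comment.
        if out and out[-1][0] == "p":
            out[-1] = ("p", list(out[-1][1]) + [node])
        elif len(out) >= 2 and is_space_text(out[-1]) and out[-2][0] == "p":
            space = out.pop()
            out[-1] = ("p", list(out[-1][1]) + [space, node])
        else:
            # Anything else in between (e.g. a category link) keeps the comment outside.
            out.append(node)
        i += 1
    return out
```

With this sketch, a comment separated from the paragraph by a category link (as in the fourth and seventh examples) stays outside the paragraph, while the first and sixth examples get the comment nested inside.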

By the way, the fifth example currently renders differently under Parsoid, as preformatted text rather than just a paragraph. (If anything, it would be sensible for the third one to do so? PHP parser renders no preformatted text anywhere.) http://en.wikipedia.beta.wmflabs.org/wiki/T90321

This is unrelated to the main topic of this bug report, but ..

> By the way, the fifth example currently renders differently under Parsoid, as preformatted text rather than just a paragraph. (If anything, it would be sensible for the third one to do so? PHP parser renders no preformatted text anywhere.) http://en.wikipedia.beta.wmflabs.org/wiki/T90321

I think Parsoid's behavior is consistent here .. [[Category:bar]] and comments don't affect start-of-line state and so that line is effectively " abc" which should render as a pre .. and I just verified by copying that line on enwiki sandbox. So, something else is going on there ... looks like the <hr/> from the "----" is causing the pre on the next line to be suppressed .. which is weird .. Won't fix for that one.

As for the 3rd line .. that is the expected behavior because the <div> (a "block" tag in HTML4 semantics) suppresses indent-pre and p-wrapping. The p-wrapper is then added by Tidy.

Try this:

[[Category:Foo]]<!--foo--> a
----
[[Category:Foo]]<!--foo--> a
----
<!--foo--> a
----
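The start-of-line reasoning above (categories and comments are transparent, so the line is effectively " a", while a block tag such as <div> suppresses indent-pre) can be approximated with a rough sketch. The regexes, the list of block tags, and the function names here are assumptions made for illustration and do not reflect Parsoid's real tokenizer; the <hr/> interaction mentioned above is not modeled.

```python
import re

# Constructs treated as "rendering-transparent" at the start of a line (assumed list).
COMMENT = re.compile(r"<!--.*?-->", re.DOTALL)
CATEGORY = re.compile(r"\[\[Category:[^\]|]*(?:\|[^\]]*)?\]\]", re.IGNORECASE)

# A small, illustrative subset of block-level tags that suppress indent-pre.
BLOCK_TAG = re.compile(r"<\s*(div|table|ul|ol|blockquote)\b", re.IGNORECASE)


def effective_line(line):
    """Return the line with comments and category links removed."""
    return CATEGORY.sub("", COMMENT.sub("", line))


def looks_like_indent_pre(line):
    """Guess whether a single wikitext line would become preformatted text."""
    if BLOCK_TAG.search(line):
        # A block tag on the line suppresses indent-pre (and p-wrapping).
        return False
    eff = effective_line(line)
    # Leading space plus some visible content means indent-pre.
    return eff.startswith(" ") and eff.strip() != ""


for sample in (
    "[[Category:Foo]]<!--foo--> a",
    "<!--foo--> a",
    "       <!--foo-->    bar <div> foo </div>",
    "<!--foo-->bar",
):
    print(repr(sample), looks_like_indent_pre(sample))
```

Under these assumptions the first two samples are flagged as indent-pre and the last two are not, which matches the behavior described for the third and fifth examples.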
LGoto lowered the priority of this task from Medium to Low. Jun 18 2020, 6:28 PM
LGoto moved this task from Backlog to Bugs & Crashers on the Parsoid board.

Since there are all manner of corner cases here, and p-wrapping is a mess anyway that will get sorted out in the future, I am going to decline this as not important. As we move to Parsoid being the default, the read / edit mode differences will disappear. If this becomes really important to someone at that point, they can refile the task.