Add support for extensions and templates within indented comments
Open, Needs Triage, Public

Description

When you include an extension within an indented comment, indentation syntax (e.g. : or *) will get prepended to each line in the comment.

This diff [1] demonstrates this issue; notice the :::: at the beginning of each line.

This task is about adding support for extensions (e.g. <syntaxhighlight></syntaxhighlight> and <math></math>) within indented comments before T246960 is resolved.

  1. Go to a talk page where the reply tool is enabled (e.g. https://en.wikipedia.beta.wmflabs.org/wiki/Talk:Cats)
  2. Click any "Reply" link
  3. Enter the following into the source comment area:
foo
<syntaxhighlight>
bar
baz
</syntaxhighlight>
quux

Actual behavior

  4. ⚠️ Notice the following appears in the preview/is posted:

foo

:bar
:baz
:

quux ~~~~

Expected behavior

  4. Notice the following appears in the preview/is posted:

foo

bar
baz

quux ~~~~

Done

Event Timeline

Potential approach
When I discussed this with @Esanders, he shared one approach we could take to add the support this task asks for:

Add logic to the Reply tool that removes indentation syntax (e.g. : and *) from content that is contained within an extension (e.g. <math></math>).

  • Concerns:
    • Such an approach would start us down the path of parsing comment contents, which means:
      • A. We would have more to maintain (in this case, a list of extensions), and
      • B. We could be duplicating work the parsers (legacy and Parsoid) are already doing at "save".
  • Considerations:
    • Any code we write to add this support now would become obsolete once new syntax for multi-line comments is implemented (T246960).
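
To make the approach concrete, here is a minimal Python sketch (the Reply tool itself is not written in Python; this is only an illustration). It prefixes each line of a reply with the indentation syntax, but skips lines that begin inside an extension tag. The function name `indent_reply` and the hard-coded `EXTENSION_TAGS` list are assumptions made for this example, and self-closing tags are not handled.

```python
import re

# Assumed tag list, for illustration only; this is the list the
# "more to maintain" concern above refers to.
EXTENSION_TAGS = ("syntaxhighlight", "math", "gallery")


def indent_reply(wikitext: str, prefix: str = ":", tags=EXTENSION_TAGS) -> str:
    """Prefix every line with `prefix`, except lines that start inside an extension tag."""
    names = "|".join(re.escape(tag) for tag in tags)
    # Opening tag (with optional attributes), anything up to the first matching close tag.
    block = re.compile(rf"<({names})\b[^>]*>.*?</\1\s*>", re.DOTALL | re.IGNORECASE)
    spans = [match.span() for match in block.finditer(wikitext)]

    indented = []
    offset = 0  # character offset of the start of the current line
    for line in wikitext.split("\n"):
        # A line is "inside" when it starts strictly after an opening tag
        # has begun and before the matching closing tag has ended.
        inside = any(start < offset < end for start, end in spans)
        indented.append(line if inside else prefix + line)
        offset += len(line) + 1  # +1 for the newline consumed by split()
    return "\n".join(indented)


reply = "foo\n<syntaxhighlight>\nbar\nbaz\n</syntaxhighlight>\nquux"
print(indent_reply(reply))
```

With the reproduction input from the task description, the lines `bar`, `baz` and `</syntaxhighlight>` are left without a `:` prefix, while `foo`, the line carrying the opening tag, and `quux` are indented. Whether the parser then renders that wikitext as a single indented comment is a separate question this sketch does not answer.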
ppelberg updated the task description.

Will discuss this next Tuesday and hopefully get some input from @cscott.

A note on timing after a quick chat with @Esanders:

  • If we work on this, that work should not start until we are confident that T234403, particularly switching between visual and source, is working well.

I think https://hu.wikipedia.org/w/index.php?title=Szerkesztővita:Pasztilla&diff=22578512&oldid=22578493 falls into the same category of problem, but it involves MediaWiki core's <gallery> tag. In this case, the automatically added signature was not correctly indented after the gallery tag.

My note on this is that extension parsing is in theory not one of the "evil" parts of wikitext parsing. After the open tag is seen, the close tag is found using a fairly braindead regexp, more-or-less deliberately. ("You can include any content in extension content except </closetag....".) The only tricky part might be attribute parsing inside the open tag, but IIRC the sanitizer does this with a regexp so you should be able to do that as well. So, yes, it's a slippery slope, but at least the slope isn't *too* steep in this general area.
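
As a rough illustration of that description (not the parser's actual code), the Python sketch below matches an opening tag with one regexp, pulls attributes out with a second one, and then takes everything up to the first closing tag as the content. The name `find_extension`, the `ATTR` pattern, and the shape of the returned dict are all assumptions made for this example; attribute values containing `>` are not handled.

```python
import re
from typing import Optional

# Simplified attribute pattern (an assumption, not the sanitizer's real regexp):
# name="double quoted", name='single quoted', or name=bare.
ATTR = re.compile(r"""([A-Za-z_:][\w:.-]*)\s*=\s*(?:"([^"]*)"|'([^']*)'|([^\s"'>]+))""")


def find_extension(wikitext: str, tag: str) -> Optional[dict]:
    """Return the first <tag ...>...</tag> occurrence, or None if there is none."""
    name = re.escape(tag)
    opening = re.search(rf"<{name}((?:\s[^>]*)?/?)>", wikitext, re.IGNORECASE)
    if opening is None:
        return None

    raw_attrs = opening.group(1) or ""
    attrs = {key.lower(): dq or sq or bare for key, dq, sq, bare in ATTR.findall(raw_attrs)}

    # Self-closing form such as <math />: there is no content and no close tag.
    if raw_attrs.rstrip().endswith("/"):
        return {"attrs": attrs, "content": "", "span": opening.span()}

    # The deliberately simple part: the content is everything up to the
    # first close tag, whatever it contains.
    closing = re.compile(rf"</{name}\s*>", re.IGNORECASE).search(wikitext, opening.end())
    if closing is None:
        return None
    return {
        "attrs": attrs,
        "content": wikitext[opening.end():closing.start()],
        "span": (opening.start(), closing.end()),
    }


text = 'foo <syntaxhighlight lang="python" line>print("</p>")</syntaxhighlight> bar'
print(find_extension(text, "syntaxhighlight"))
```

Note that the `</p>` inside the content does not confuse the search, because only the close tag for the tag that was opened ends the content.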

So that this does not get forgotten from T253482: this should apply to template parameters as well.

ppelberg renamed this task from Add support for extensions within indented comments to Add support for extensions and templates within indented comments. Dec 12 2020, 3:24 AM

Wikitext can contain normal HTML tags, and those might come without a closing pair, which makes tag parsing more annoying than it would otherwise be (and also makes it impossible to use JavaScript's built-in XML parsing capabilities). I suppose you can fetch the list of registered extension tags and only apply the logic to those.
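
A small Python sketch of that filtering idea follows. MediaWiki's siteinfo API can list registered extension tags (`action=query&meta=siteinfo&siprop=extensiontags`); the `SAMPLE_SITEINFO` dict below only imitates the shape of such a response and is an assumption to verify against a real wiki. The helper names `registered_tags` and `extension_tags_used` are hypothetical.

```python
import re

# Illustrative stand-in for a siteinfo response; the exact shape and the
# tag list are assumptions, not output captured from a real wiki.
SAMPLE_SITEINFO = {
    "query": {
        "extensiontags": ["<pre>", "<nowiki>", "<gallery>", "<math>", "<syntaxhighlight>"]
    }
}


def registered_tags(siteinfo: dict) -> set:
    """Turn entries like "<math>" into lowercase bare names like "math"."""
    return {tag.strip("<>").lower() for tag in siteinfo["query"]["extensiontags"]}


def extension_tags_used(wikitext: str, registered: set) -> list:
    """List opening tags in the text whose name is a registered extension tag."""
    # Matches only opening tags: "<", a name, then whitespace, "/" or ">".
    opened = re.findall(r"<([A-Za-z][\w-]*)[\s/>]", wikitext)
    return [name.lower() for name in opened if name.lower() in registered]


tags = registered_tags(SAMPLE_SITEINFO)
print(extension_tags_used("one<br>two <math>x^2</math> <b>bold", tags))
```

Here the unclosed `<br>` and `<b>` are ignored because they are plain HTML rather than registered extension tags, so only `math` would go on to the indentation logic.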