
Parsoid strips syntax-critical newlines from template parameters with inline formatting
Open, Medium | Public | BUG REPORT

Description

Users rarely run into this, but some whitespace in template transclusions is significant. For example, templates often don't work when called as {{example |content=* List}} or {{example |content= == Heading ==}}; they must instead be called like this:

{{example |content =
== Heading ==
}}

It appears that Parsoid strips this critical newline character together with all other, non-critical whitespace. While VisualEditor tries to avoid dirty diffs, there is nothing it can do when an intentional edit is made to such a field: from VE's perspective, there was never a newline.
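To illustrate the failure mode, here is a minimal Python sketch of a serializer that writes a transclusion in the inline format and trims every value. It is a toy model, not Parsoid's code; the function name `serialize_inline` and the blanket `strip()` are assumptions made for illustration only.

```python
def serialize_inline(name, params):
    """Toy serializer for the inline format {{_|_=_}}; trims every value."""
    parts = [name]
    for key, value in params.items():
        # Blanket trimming: this is the step that loses the leading newline.
        parts.append(f"{key}={value.strip()}")
    return "{{" + "|".join(parts) + "}}"


broken = serialize_inline("example", {"content": "\n== Heading ==\n"})
print(broken)  # {{example|content=== Heading ==}}
```

With the newline gone, the heading markup no longer starts on its own line, which is the situation the description above says templates often don't accept.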

Steps to reproduce:

  • Create a template with 2 parameters and the format inline (which corresponds to {{_|_=_}})
  • Using source code editing, include the template on a page, add a parameter and make the parameter's value start with an empty line followed by some text
  • Using the VE template dialog, edit the template transclusion and change some of the text in this or the other parameter
  • Apply the changes and save the page

What happens?

  • The newline is removed, creating a dirty diff and possibly breaking the transclusion in unexpected ways

Example dirty diff that broke the page: https://de.wikipedia.org/wiki/Special:Diff/220481946. The template in this example is https://de.wikipedia.org/wiki/Template:Mehrspaltige_Liste.

Event Timeline

WMDE-Fisch renamed this task from When I edit & save a template in VE, the first line moves up and devalidates the template to VE template dialog: Empty lines at the beginning of text input are trimmed with inline formatting. Mar 3 2022, 11:11 AM
WMDE-Fisch updated the task description.
thiemowmde edited projects, added Parsoid, Editing-team; removed WMDE-TechWish-Maintenance.
thiemowmde updated the task description.
thiemowmde added a subscriber: Lena_WMDE.
thiemowmde renamed this task from VE template dialog: Empty lines at the beginning of text input are trimmed with inline formatting to Parsoid strips syntax-critical newlines from template parameters with inline formatting. Apr 6 2022, 2:36 PM

I was about to type this --> I am not sure what we can realistically do here. If a template is marked as inline-formatted, the newlines will be stripped. Without more fine-grained type information about parameters (a type that says: "this parameter is a wikitext construct that needs a leading newline"), it is hard to do much there.

Then I went digging and saw that TemplateData allows types for its parameters, one of which is "content". I suppose we could suppress newline stripping for parameters with this type. I'm not sure whether I spaced out and forgot about this, or whether this type information about wikitext content was added in recent years.
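A rough sketch of that idea in Python, assuming the serializer has the template's TemplateData at hand. The `TEMPLATE_DATA` dictionary and the helper `normalize_value` are hypothetical; only the "content" type and the inline format come from the discussion above.

```python
# Hypothetical TemplateData for a template; only "params" and "type" matter here.
TEMPLATE_DATA = {
    "format": "inline",
    "params": {
        "title": {"type": "string"},
        "content": {"type": "content"},
    },
}


def normalize_value(param_name, value, template_data):
    """Trim a value unless TemplateData declares the parameter as wikitext content."""
    spec = template_data.get("params", {}).get(param_name, {})
    if spec.get("type") == "content":
        return value  # keep leading and trailing newlines exactly as written
    return value.strip()
```

For example, `normalize_value("content", "\n== Heading ==\n", TEMPLATE_DATA)` returns the value unchanged, while `normalize_value("title", " Intro ", TEMPLATE_DATA)` returns `"Intro"`. Parameters without TemplateData would still be trimmed under this approach.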

ssastry triaged this task as Medium priority. Jun 1 2022, 2:22 PM

I suggest taking the investigation in a different direction.

There are apparently two different kinds of whitespace: most of it is syntactically irrelevant, but some is not. Even if the format is set to inline, that should only affect irrelevant whitespace. Relevant whitespace should always stay untouched, exactly as it is in the wikitext. That should be independent of both the format and the parameter type.

Here is a simple test scenario. Create a Template:Echo that contains <div style="border: 1px solid #ccc;">{{{1}}}</div>. Try using it:

{{Echo|*First list item
*Another list item}}

This expands to:

<div style="border: 1px solid #ccc;">*First list item
*Another list item</div>

Which correctly but unexpectedly renders as:

*First list item
  • Another list item

When you add the relevant whitespace character it works as expected:

{{Echo|
*First list item
*Another list item}}

What this means is that the parser apparently does not simply trim all whitespace, but leaves some intact. Parsoid and VisualEditor should do the same. I vaguely remember a line of code in the old wikitext parser that said something like "trim all whitespace from template parameter values, unless it is a newline followed by some list-style syntax like * or #".

TL;DR: A leading newline followed by * or # is not syntax but content and should not be touched by the format, which is only about syntax.
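As a sketch of what such content-aware trimming could look like, here is a small Python helper. The name `trim_keeping_list_newline` and the exact rule (keep one leading newline only when the next character is * or #) are assumptions based on the wording above, not actual parser or Parsoid behaviour.

```python
import re

# Leading whitespace that ends in a newline directly followed by a list marker.
LEADING_NEWLINE_BEFORE_LIST = re.compile(r"^\s*\n(?=[*#])")


def trim_keeping_list_newline(value):
    """Trim a parameter value but keep one leading newline before * or #."""
    keep = "\n" if LEADING_NEWLINE_BEFORE_LIST.match(value) else ""
    return keep + value.strip()
```

With this rule, `"\n*First list item\n*Another list item"` passes through unchanged, while `"  plain text  "` is still trimmed to `"plain text"`. A real implementation would likely need to cover other line-start syntax as well, such as the heading example in the task description.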

Whitespace handling in wikitext is generally very messy. @Arlolra and @cscott are both wondering why we trim argument values at all, in any context. So I'm going to dump a WIP patch for now that explores that.

Good to hear that. Thanks! I experimented a bit more with my echo example and realized that I had slightly confused something: argument values aren't trimmed at all. The special kind of trimming I remember is in the context of ParserFunctions: https://phabricator.wikimedia.org/source/mediawiki/browse/REL1_37/includes/parser/Parser.php$3312. This makes it possible to write e.g. *Test{{#if:1|*Test}} (note there is nothing to trim here). An additional newline is inserted to make sure this renders as two list items. But I realized this is mostly unrelated to what we are discussing here.
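The behaviour described in that comment can be approximated with the following Python sketch. It is not a port of the linked Parser.php code: the helper name `splice_expansion` is made up, and restricting the check to * and # is an assumption (the real parser may treat more markers this way).

```python
def splice_expansion(preceding, expansion):
    """Append an expansion to the preceding text, adding a newline before
    list syntax that would otherwise land in the middle of a line."""
    at_line_start = preceding == "" or preceding.endswith("\n")
    if not at_line_start and expansion[:1] in ("*", "#"):
        expansion = "\n" + expansion
    return preceding + expansion


print(repr(splice_expansion("*Test", "*Test")))  # '*Test\n*Test'
```

Here the second `*Test` stands for the output of `{{#if:1|*Test}}`. The inserted newline is what turns it into a second list item.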

Change 802231 had a related patch set uploaded (by Subramanya Sastry; author: Subramanya Sastry):

[mediawiki/services/parsoid@master] WIP: Don't trim transclusion arg values

https://gerrit.wikimedia.org/r/802231