The newline added to a template, magic word, variable, or parser function that returns line-start wikicode formatting (*#:; {|) causes unexpected parsing

Description

Author: wiki.warx

Description:
Assume there is a template named color with the content #002255 (a normal color definition). If it is transcluded this way:

<span style="color:{{color}}"></span>

It works perfectly - it gives

<span style="color: #002255;">test</span>

But if you use it in a table (or anywhere else that is not inside a tag attribute), it breaks:

{| style="color:{{color}};"
|-
| test
|-
|}

gives:

<table>
<ol><li>002255;"
</li></ol>

<tr>
<td> test
</td></tr>
</table>

same:

<p>test {{color}} test</p>

gives:

<p>test 

<ol><li>002255 test</p>
</li></ol>

This has even broken tag nesting!!!


Version: unspecified
Severity: major
URL: http://test.wikipedia.org/wiki/Newline_through_parser_functions
See Also: T25674

bzimport added a subscriber: Unknown Object (MLST).Via ConduitNov 21 2014, 10:03 PM
bzimport set Reference to bz12974.
bzimport created this task.Via LegacyFeb 8 2008, 6:51 PM
cneubauer added a comment.Via ConduitFeb 8 2008, 9:15 PM

The pound sign (#) is getting interpreted as an ordered list. Like doing this:

  # first item
  # second item
  # etc

Do you have an extra new line in your template? If not, try removing the # or using a hex value or a named color.

bzimport added a comment.Via ConduitFeb 8 2008, 9:20 PM

wiki.warx wrote:

Sure, I did, but that does not change the fact that the example shows the parser interpreting such templates inconsistently.

cneubauer added a comment.Via ConduitFeb 8 2008, 9:43 PM

Hmm, can't duplicate this bug on 1.11 or 1.12. Do you have any third party extensions installed?

cneubauer added a comment.Via ConduitFeb 8 2008, 9:45 PM

No I was wrong. It does cause the output you described in 1.11. In 1.12 it seems to strip the style attribute out of a table at least.

cneubauer added a comment.Via ConduitFeb 14 2008, 3:28 PM

Okay, in 1.11, the relevant section is in Parser->braceSubstitution():

# If the template begins with a table or block-level
# element, it should be treated as beginning a new line.
if (!$piece['lineStart'] && preg_match('/^(?:{\\||:|;|#|\*)/', $text)) /*}*/{
    $text = "\n" . $text;
}

In 1.12, the same section says:

# Bug 529: if the template begins with a table or block-level
# element, it should be treated as beginning a new line.
# This behaviour is somewhat controversial.
if (!$piece['lineStart'] && preg_match('/^(?:{\\||:|;|#|\*)/', $text)) /*}*/{
    $text = "\n" . $text;
}

See bug 529. You can work around this by putting a space before the # in the template.
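The check quoted above can be modelled in a few lines of Python. This is only a sketch of the quoted condition, not the parser itself: the function name `apply_newline_hack` and the `at_line_start` flag (standing in for `$piece['lineStart']`) are made up for illustration.

```python
import re

# Same alternatives as the quoted PHP pattern: "{|", ":", ";", "#", "*".
LINE_START_MARKUP = re.compile(r"^(?:\{\||:|;|#|\*)")


def apply_newline_hack(text: str, at_line_start: bool) -> str:
    """Prepend a newline when an expansion starts with line-start markup mid-line.

    `at_line_start` is a hypothetical stand-in for $piece['lineStart'].
    """
    if not at_line_start and LINE_START_MARKUP.match(text):
        return "\n" + text
    return text


# Template content "#002255" transcluded in the middle of a line.
print(repr(apply_newline_hack("#002255", at_line_start=False)))
# The workaround mentioned above: a leading space avoids the match.
print(repr(apply_newline_hack(" #002255", at_line_start=False)))
```

In this model the first call gains a leading newline, which is what turns the `#` into an ordered-list marker, while the space-prefixed value passes through unchanged.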

Mormegil added a comment.Via ConduitNov 12 2008, 9:15 PM

This is a much more general problem, and is much worse: it affects not only templates but also parser functions (e.g. {{#if:}}), and it affects not only HTML colors, but everything which starts with colon, semicolon, asterisk, hashmark, or the “{|” table syntax. Check the linked URL for some examples where this breaks stuff.

The problem: when the result of a template, or parser function call starts with {|, :, ;, #, or *, a newline is prepended to it, forcing this character to be a syntax element, even though the author might not have intended it so (and wanted to just use the plain character). Especially in the case of parser functions, this is quite understandable.

This bug is caused by the fix to bug 529, which I believe is wrong, and should be reverted, even though a compromise version is possible – force the newline only for the table syntax. Table syntax is rare in other uses than tables (while colons and semicolons are perfectly normal in plain text), and tables seem to be the primary use case for that original fix.
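The compromise described here (force the newline only for table syntax) could look roughly like the following Python sketch. It illustrates the idea only; the names `apply_table_only_hack` and `at_line_start` are hypothetical and nothing here is taken from actual parser code.

```python
import re

# Only the table-start token "{|" triggers the newline in this variant.
TABLE_START = re.compile(r"^\{\|")


def apply_table_only_hack(text: str, at_line_start: bool) -> str:
    """Prepend a newline only when a mid-line expansion starts a wikitable."""
    if not at_line_start and TABLE_START.match(text):
        return "\n" + text
    return text


# Colons, semicolons, hashes and asterisks are left alone...
print(repr(apply_table_only_hack(":Talk:Foo/bar", at_line_start=False)))
# ...while a table still gets its own line.
print(repr(apply_table_only_hack("{|\n| cell\n|}", at_line_start=False)))
```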

(Changing summary, and marking as bug, not a feature request.)

bzimport added a comment.Via ConduitNov 12 2008, 9:56 PM

herd wrote:

Note also page-title magic words like {{PAGENAME}} on a page starting with such a character (like * or ;) cause much breakage, and I can't think of a workaround.

Try [[Special:Prefixindex/{{FULLPAGENAME}}|prefix search]] on a page like [[;Foo]] or [[*Foo]].

Mormegil added a comment.Via ConduitNov 12 2008, 10:04 PM

(In reply to comment #7)

Try [[Special:Prefixindex/{{FULLPAGENAME}}|prefix search]] on a page like
[[;Foo]] or [[*Foo]].

Cool! It also explains the totally broken noarticletext display at e.g. http://en.wikipedia.org/wiki/*Foo

Mormegil added a comment.Via ConduitDec 18 2008, 5:45 PM
*** Bug 13378 has been marked as a duplicate of this bug. ***
bzimport added a comment.Via ConduitApr 22 2009, 7:26 PM

happy.melon.wiki wrote:

This also breaks the colon delimiters in magic words:

*Test:

{{SUBJECTPAGENAME{{#if:yes|:Talk:Foo/bar}}}}

*Expected:

Foo/bar

*Actual:

{{SUBJECTPAGENAME
:Talk:Foo/bar}}

This wouldn't be a problem if the parser then recognised the split material as a complete magic word call, but of course it doesn't. This is ugly.
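A small Python model reproduces the split shown under "Actual". It assumes the inner `{{#if:}}` result is spliced into the surrounding text after the line-start check has run; `splice_expansion` is a hypothetical helper, not a MediaWiki function.

```python
import re

# Same alternatives as the parser pattern quoted earlier in this report.
LINE_START_MARKUP = re.compile(r"^(?:\{\||:|;|#|\*)")


def splice_expansion(before: str, expansion: str, after: str) -> str:
    """Insert `expansion` between `before` and `after`, applying the newline hack."""
    at_line_start = before == "" or before.endswith("\n")
    if not at_line_start and LINE_START_MARKUP.match(expansion):
        expansion = "\n" + expansion
    return before + expansion + after


# The inner {{#if:yes|:Talk:Foo/bar}} expands to ":Talk:Foo/bar".
outer = splice_expansion("{{SUBJECTPAGENAME", ":Talk:Foo/bar", "}}")
print(outer)
```

The printed text has the magic word name on one line and the colon-prefixed argument on the next, matching the broken output above.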

siebrand added a comment.Via ConduitApr 25 2009, 11:07 AM

adding keyword i18n. Also influences PLURAL and GENDER handling in messages. Raising priority.

bzimport added a comment.Via ConduitMay 13 2009, 6:52 PM

herd wrote:

*** Bug 10687 has been marked as a duplicate of this bug. ***

bzimport added a comment.Via ConduitMay 13 2009, 6:53 PM

herd wrote:

*** Bug 11262 has been marked as a duplicate of this bug. ***

bzimport added a comment.Via ConduitMay 13 2009, 6:53 PM

herd wrote:

*** Bug 8199 has been marked as a duplicate of this bug. ***

bzimport added a comment.Via ConduitMay 13 2009, 6:53 PM

herd wrote:

*** Bug 14036 has been marked as a duplicate of this bug. ***

bzimport added a comment.Via ConduitMay 13 2009, 6:58 PM

herd wrote:

Update summary to catch more dupes

bzimport added a comment.Via ConduitJun 10 2009, 11:03 AM

herd wrote:

*** Bug 19144 has been marked as a duplicate of this bug. ***

bzimport added a comment.Via ConduitSep 10 2009, 5:07 AM

herd wrote:

*** Bug 20574 has been marked as a duplicate of this bug. ***

bzimport added a comment.Via ConduitSep 11 2009, 12:00 PM

herd wrote:

*** Bug 20592 has been marked as a duplicate of this bug. ***

bzimport added a comment.Via ConduitSep 13 2009, 6:51 PM

rockmfr wrote:

On a page named "*", "a{{PAGENAME}}a" gives "a<ul><li>*</li></ul>a" instead of "a*a". This particular regression was caused by r29205. So I'm assuming that this whole class of bugs are all being tracked in this one bug?

We ran into this particular problem on enwiki at [[MediaWiki:Histlegend]]. No workaround yet.

bzimport added a comment.Via ConduitOct 29 2009, 2:22 PM

catlow wrote:

Can this behaviour not at least be disabled for such pseudo-templates as PAGENAME?

P.Copp added a comment.Via ConduitJan 12 2010, 3:33 PM

*** Bug 22086 has been marked as a duplicate of this bug. ***

bzimport added a comment.Via ConduitJan 25 2010, 10:01 PM

happy.melon.wiki wrote:

This behaviour is unjustifiable. The original bug has a trivial workaround: judicious use of newlines where appropriate. The 'solution' creates problems with no reasonable workarounds, such as noted in comments 6, 7, 10, 11, 20 above. However longstanding the feature, this functionality is broken.

Unless there are serious counterarguments, I intend to undo the newline-insertion added for bug 529, WONTFIXing that and FIXing this. CCing Tim for parsery-ness.

tstarling added a comment.Via ConduitJan 25 2010, 10:49 PM

You realise it will break many, many templates if it's removed, right?

bzimport added a comment.Via ConduitJan 25 2010, 11:18 PM

happy.melon.wiki wrote:

In the same way the original fix presumably broke many templates, given all these unexpected side effects, yes. However, those breakages can be fixed, unlike some of the breakages it causes. And since the syntax without the bug 529 hack is valid regardless, templates can be fixed any time, before or after they become broken. No one can fix {{talkpage}} on [[Talk:*-algebra]] (http://en.wikipedia.org/w/index.php?title=Talk:*-algebra&oldid=340022974) with this parsing in place.

tstarling added a comment.Via ConduitJan 25 2010, 11:24 PM

No, the original fix did not break many templates. It was 2004, there weren't many templates to break back then.

It's not as easy as you make out to produce line starts without the bug 529 hack. Look at what happens when an extension breaks it:

http://lists.wikimedia.org/pipermail/mediawiki-l/2010-January/033103.html

If the problem is parser function and variable output, then we can fix that specifically and leave template output as it is.

bzimport added a comment.Via ConduitJan 26 2010, 12:27 AM

happy.melon.wiki wrote:

testcases

It's generally trivial: just add a linebreak in the calling table:

{|
|-
|
{{template-with-block-level-wikimarkup}}
|-
|{{template-without-block-level-wikimarkup}}
|}

If the outer markup expects block-level content, it should be on a new line. The comment you refer to is just putting the cart before the horse to try and fix this in the subtemplate; templates generating extra whitespace is a big enough problem as it is. Of course, their specific problem is with the newline position of the #ask: parser function getting lost somewhere, but their implementation puts the contents of the #ask on a newline whether or not that's desired. Linestart status should be decided from the top down, where people can actually see what the transclusions are doing, not blind-guessed from the inside out. However, as was pointed out in that thread, adding anything, be it an nbsp, a <nowiki/> tag, etc., reproduces the effect they wanted.

In the testcases attached, the existing implementation (with the hack) breaks cases 5, 6, 7 & 8. Without the hack, cases 1 and 3 break, assuming that block-start functionality is always desired. If the inner template should sometimes exhibit block-level functionality and sometimes not, of course, there's no way to produce that with the hack in place, although that's an unlikely situation.

Attached: bug12974.html

tstarling added a comment.Via ConduitJan 26 2010, 12:42 AM

We could add a bug 529 tracking category to the parser output to determine how the hack is being used on Wikimedia wikis.

IAlex added a comment.Via ConduitApr 5 2010, 11:08 AM
*** Bug 23033 has been marked as a duplicate of this bug. ***
IAlex added a comment.Via ConduitApr 5 2010, 6:36 PM
*** Bug 5590 has been marked as a duplicate of this bug. ***
Bawolff added a comment.Via ConduitMay 1 2010, 6:32 AM
*** Bug 23355 has been marked as a duplicate of this bug. ***
He7d3r added a comment.Via ConduitMay 3 2010, 8:44 PM

Is the code for headers (=) a "line-start" code too?

Undesired line breaks are appearing in the headers here:
http://en.wikipedia.org/w/index.php?title=Special:ExpandTemplates&input=__TOC__%0D{{:User:Heldergeovane/Test/Template+for+titles%0D|Title+1%0D|Title+2%0D}}

and here:
http://en.wikipedia.org/w/index.php?title=Special:ExpandTemplates&input=__TOC__%0D{{:User:Heldergeovane/Test/Template+for+titles2%0D|Title+1%0D|Title+2%0D}}

The code of [[User:Heldergeovane/Test/Template for titles]] is:

{{#if:{{{1|}}}|<h1>{{{1}}}</h1>}}

Section 1.1

Text 1.1
{{#if:{{{2|}}}|<h1>{{{2}}}</h1>}}

Section 2.1

Text 2.1

And the code of [[User:Heldergeovane/Test/Template for titles2]] is:

{{#if:{{{1|}}}|={{{1}}}=}}

Section 1.1

Text 1.1
{{#if:{{{2|}}}|={{{2}}}=}}

Section 2.1

Text 2.1

Helder

Mormegil added a comment.Via ConduitMay 3 2010, 9:36 PM

(In reply to comment #32)

Is the code for headers (=) a "line-start" code too?

No

Undesired line breaks are appearing in the headers here:
http://en.wikipedia.org/w/index.php?title=Special:ExpandTemplates&input=__TOC__%0D{{:User:Heldergeovane/Test/Template+for+titles%0D|Title+1%0D|Title+2%0D}}

I fail to see the “undesired line breaks”. The only line breaks I see there are those you explicitly added yourself (and I do not even think they present any problem). If you remove them from the input, they disappear from the output:

http://en.wikipedia.org/w/index.php?title=Special:ExpandTemplates&input=__TOC__%0D{{:User:Heldergeovane/Test/Template+for+titles|Title+1|Title+2%7D%7D

http://en.wikipedia.org/w/index.php?title=Special:ExpandTemplates&input=__TOC__%0D{{:User:Heldergeovane/Test/Template+for+titles2|Title+1|Title+2%7D%7D

He7d3r added a comment.Via ConduitMay 3 2010, 11:59 PM

I forgot to mention that the example is based on a case where more than 100 parameters are used, so it is desirable to have one on each row (as in the example). The problem is that the line breaks are breaking the TOC, which shows

  • 1 Section 1.1
  • 2 Section 2.1 ----

instead of

  • Title 1
    • 1 Section 1.1
  • Title 2
    • 2 Section 2.1 ----

This should not be happening, I mean:
<h1>Title
</h1>
should also make "Title" appear in the TOC, as it does in
<h1>Title</h1>

Besides this, the code

{{:User:Heldergeovane/Test/Template with parameters
|FIRST
|SECOND
}}

should result in the same output as this:

{{:User:Heldergeovane/Test/Template with parameters
|1=FIRST
|2=SECOND
}}

without any undesired line breaks. Here is a link showing the differences:
http://bit.ly/bVLioV

Helder

Bawolff added a comment.Via ConduitNov 20 2010, 2:28 AM
*** Bug 26000 has been marked as a duplicate of this bug. ***
Umherirrender added a comment.Via ConduitDec 12 2010, 12:59 PM

In some cases, you can use <code>&#35;</code> for #, because the entity is replaced after the braceSubstitution.
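Why the entity slips past the check can be sketched in Python. This is a rough model only: it assumes, as the comment states, that entities are decoded at a later stage than brace substitution, and it uses Python's `html.unescape` as a stand-in for that later stage. `expand_then_decode` is a hypothetical name.

```python
import html
import re

# Same alternatives as the parser pattern quoted earlier in this report.
LINE_START_MARKUP = re.compile(r"^(?:\{\||:|;|#|\*)")


def expand_then_decode(template_text: str, at_line_start: bool) -> str:
    """Apply the newline hack first, then decode entities (assumed later stage)."""
    if not at_line_start and LINE_START_MARKUP.match(template_text):
        template_text = "\n" + template_text
    # Stand-in for the parser's later entity handling.
    return html.unescape(template_text)


# The entity form starts with "&", so no newline is prepended.
print(repr(expand_then_decode("&#35;002255", at_line_start=False)))
# The literal "#" form still triggers the hack.
print(repr(expand_then_decode("#002255", at_line_start=False)))
```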

bzimport added a comment.Via ConduitDec 12 2010, 1:02 PM

happy.melon.wiki wrote:

(In reply to comment #37)

In some cases, you can use <code>&#35;</code> for #, because the entity is
replaced after the braceSubstitution.

Ew, god, please no. Escaped entities are escaped entities, they should never be being interpreted as wikimarkup; if they are, that's a separate bug.

Umherirrender added a comment.Via ConduitDec 12 2010, 1:30 PM

(In reply to comment #38)

(In reply to comment #37)
> In some cases, you can use <code>&#35;</code> for #, because the entity is
> replaced after the braceSubstitution.
Ew, god, please no. Escaped entities are escaped entities, they should never
be being interpreted as wikimarkup; if they are, that's a separate bug.

That is why I said "in some cases". For Template:Color it is possible (comment 0), because the # is not wikimarkup there. Using

{{SUBJECTPAGENAME{{#if:yes|&#58;Talk:Foo/bar}}}}

does not work (comment 10), because the &#58; is for wikimarkup (for the parser function)

Dinoguy1000 added a comment.Via ConduitDec 12 2010, 2:36 PM

As HM said in comment 27, all you need generally is a preceding nbsp or <nowiki/>:

style="color:<nowiki/>{{#if:yes|#000;|#fff;}}"

...or something to that effect.

bzimport added a comment.Via ConduitJan 17 2011, 10:35 AM

happy.melon.wiki wrote:

I fixed this in r80430. The newline is now only added when the brace construct begins with a wikitable element {|

brion added a comment.Via ConduitJan 26 2011, 1:20 AM

I've provisionally reverted this in r81012. As noted in code review comments, this alters various existing edge cases, and causes unexpected changes in behavior for constructs that are already in use in pages and tables.

As we're in the middle of settling down work on trunk into the 1.17 deployment and release, I'd strongly recommend revisiting this in a few weeks when things have settled down.

Definitely recommend going ahead and testing things and checking to see what the best machine strategies for fixing up old code are, if these are the correct changes to make.

MarkAHershberger added a comment.Via ConduitApr 12 2011, 4:22 PM

Punting this to the new parser Brion has under development.

DieBuche added a comment.Via ConduitApr 14 2011, 8:41 PM

*** Bug 10781 has been marked as a duplicate of this bug. ***

bzimport added a comment.Via ConduitMay 3 2011, 2:06 PM

a.d.bergi wrote:

*** Bug 19302 has been marked as a duplicate of this bug. ***

P.Copp added a comment.Via ConduitMar 10 2012, 6:30 PM

*** Bug 35129 has been marked as a duplicate of this bug. ***

duplicatebug added a comment.Via ConduitMay 5 2012, 7:35 PM
*** Bug 36215 has been marked as a duplicate of this bug. ***
duplicatebug added a comment.Via ConduitJul 29 2012, 12:36 PM
*** Bug 38697 has been marked as a duplicate of this bug. ***
Danny_B added a comment.Via ConduitJul 29 2012, 2:02 PM

Bumping the importance - so many dupes and so many broken/non-working things because of this.

Fomafix added a comment.Via ConduitSep 18 2012, 6:20 AM

*** Bug 40294 has been marked as a duplicate of this bug. ***

Mormegil added a comment.Via ConduitOct 24 2012, 2:41 PM

Note that the worst case of this problem, unusability of things like {{PAGENAME}} on pages like “*Foo” was specifically solved in Bug 26781 with commits r80511 and r80512.

MrStradivarius added a comment.Via ConduitApr 9 2013, 4:36 PM

This bug is causing unexpected behaviour in {{#invoke:}} as well. In my case, I found this when writing [[Module:UrlToWiki]] that converts URLs into interwiki links. When the module generates the text for a link that uses the colon trick, the parser generates an unwanted new line. I made a demonstration module as well:

https://test2.wikipedia.org/wiki/Module:User:Mr._Stradivarius/colonbug
https://test2.wikipedia.org/wiki/User:Mr._Stradivarius/colonbug

Allow me to join the voices of those calling for this to be fixed. There would be a simple workaround for templates which rely on this behaviour should it be fixed, but as it is the bug makes certain things impossible.

Anomie added a comment.Via ConduitAug 5 2013, 3:45 PM
*** Bug 52548 has been marked as a duplicate of this bug. ***
duplicatebug added a comment.Via ConduitNov 16 2013, 8:58 PM
*** Bug 56562 has been marked as a duplicate of this bug. ***
Aklapper added a comment.Via ConduitDec 4 2013, 3:25 PM

Happy-melon: This issue has been assigned to you in January 2010.
Could you please provide a status update and inform us whether you are still working (or still plan to work) on this issue?
Only in case you do not plan to work on this issue anymore, should the assignee be set back to default? Thanks.

bzimport added a comment.Via ConduitDec 4 2013, 3:44 PM

happy.melon.wiki wrote:

(In reply to comment #55)

Happy-melon: This issue has been assigned to you in January 2010.
Could you please provide a status update and inform us whether you are still
working (or still plan to work) on this issue?

I pretty much *had finished* working on this, deployed a fix, etc; then Brion reverted. Essentially my code was rejected.

Only in case you do not plan to work on this issue anymore, should the
assignee
be set back to default? Thanks.

I'd say this should either be WONTFIXed if it's actually not going to happen, or my fix should be reinstated (I'd expect it still applies fairly cleanly, the Parser code is *very* stable). There's no reason for anyone *else* to be working on it.

gerritbot added a comment.Via ConduitDec 4 2013, 4:14 PM

Change 99133 had a related patch set uploaded by Bartosz Dziewoński:
Stop prepending newlines to templates starting with *#;:

https://gerrit.wikimedia.org/r/99133

matmarex added a comment.Via ConduitDec 4 2013, 4:22 PM

This behavior has annoyed me for long enough. I say we should break the wikis and fix it (after appropriate community nudging is done, and probably after we run a diff of resulting HTML on a large enough subset of pages – Parsoid testing infrastructure can probably help here a lot).

I tried reapplying Happy-melon patch from r80430 in Ifc6080cb linked above, fixing a few minor merge conflicts on the way and one larger one, and hopefully not breaking too many unrelated things in the process.

It is naturally failing parser tests right now due to how many other things changed in these three years, but that's nothing insurmountable, I can fix the tests myself if there is any chance of this actually getting merged again someday.

GWicke added a comment.Via ConduitDec 5 2013, 10:08 PM

Changes to this behavior will also very likely break a lot of existing content for a small gain in usability. This is also the reason why it was reverted in the past.

Removing the newline insertion would also make life for Parsoid harder. Example case:

{{random}}{{echo|* foo}}

The newline context of * foo now depends on the expansion of the random template. This makes independent parsing, correct WYSIWYG and efficient updates for template expansions very difficult to impossible. The newline insertion hack happens to help us here, even if the original author probably didn't think about future parser development affected by this.
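The dependency described here can be illustrated with a toy Python model. It assumes that markup such as `* foo` only becomes a list item when it ends up at the start of a line; `is_list_item` and its parameters are hypothetical names, and `preceding` stands for whatever the first template expanded to.

```python
import re

# Same alternatives as the parser pattern quoted earlier in this report.
LINE_START_MARKUP = re.compile(r"^(?:\{\||:|;|#|\*)")


def is_list_item(preceding: str, expansion: str, newline_hack: bool) -> bool:
    """Whether `expansion` ends up at a line start and so parses as block markup."""
    if not LINE_START_MARKUP.match(expansion):
        return False
    at_line_start = preceding == "" or preceding.endswith("\n")
    # With the hack, a newline is prepended whenever we are not at a line start.
    return at_line_start or newline_hack


# Three possible expansions of the preceding template.
for preceding in ["", "text", "text\n"]:
    with_hack = is_list_item(preceding, "* foo", newline_hack=True)
    without_hack = is_list_item(preceding, "* foo", newline_hack=False)
    print(repr(preceding), with_hack, without_hack)
```

With the hack the answer is the same for every preceding expansion; without it, the answer changes with the text that happens to come first, which is the context dependence the comment is concerned about.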

My preference is to focus efforts on better DOM-based templating rather than spending a lot of time moving sideways with wikitext templating.

Anomie added a comment.Via ConduitJan 7 2014, 6:45 PM

I don't think having the ability to have template output beginning with "#" and similar characters is a "small" gain in usability. This sort of thing seems to come up on enwiki every few months as new people run into this bug.

(In reply to comment #56)

I pretty much *had finished* working on this, deployed a fix, etc; then Brion
reverted. Essentially my code was rejected.

Looking at the history on this bug and the comment on r81012, it doesn't seem so much "no" as "not right now, we're trying to release 1.17" and then it never got followed up on after. And then Brion was supposed to be working on a new parser, etc.

Possibly the biggest help would be to identify what exactly on the wikis would be broken by making this change.

matmarex added a comment.Via ConduitJan 26 2014, 3:57 PM
*** Bug 60444 has been marked as a duplicate of this bug. ***
Anomie added a comment.Via ConduitFeb 4 2014, 2:35 PM
*** Bug 60827 has been marked as a duplicate of this bug. ***
Ciencia_Al_Poder added a comment.Via ConduitMar 29 2014, 3:27 PM

Someone hit this problem today on IRC (discussed privately).

A template with a link to IRC (although it will affect any protocol) and a parameter to supply the port.

Example:

[irc://{{{server|}}}{{#if:{{{port|}}}|:{{{port}}}}}/{{{channel}}} #{{{channel}}}]

If you specify a port, the colon in the port breaks the link, as it's being interpreted as a definition list.

But if you try to escape it by wrapping the colon inside nowiki tags, the link is broken anyway since the < character is interpreted as the end of the link.

See this test https://www.mediawiki.org/w/index.php?oldid=943856

Dinoguy1000 added a comment.Via ConduitMar 29 2014, 4:10 PM

I'm pretty sure that particular instance could be worked around by using [[Template:Colon]] on wikis that've created it, but that's very much not an ideal solution.

greg added a comment.Via ConduitJun 24 2014, 5:39 PM

Brad gave a pretty good summary in comment 60, with this as a good next step:

(In reply to Brad Jorsch from comment #60)

Possibly the biggest help would be to identify what exactly on the wikis
would be broken by making this change.

Matma: Can you do this? The current behavior also annoys you, so you might be inclined to help move this bug forward.

For the record, as of today there are 23 duplicate bugs for this issue.

Setting status to Assigned (from patchtoreview) since the next step isn't necessarily reviewing the (old) patch, but working out what will break if it's merged.

matmarex added a comment.Via ConduitJun 24 2014, 8:22 PM

(In reply to Greg Grossmeier from comment #65)

Matma: Can you do this? The current behavior also annoys you, so you might
be inclined to help move this bug forward.

I'd love to help, but I don't think I can test a representative sample of all articles in all Wikimedia wikis on the hardware and the Internet connection available to me.

Isn't there a whole infrastructure for Parsoid testing and comparing the results to current parser? I think it'd make sense to use that instead.

Nemo_bis added a comment.Via ConduitJul 2 2014, 3:07 PM

(In reply to Bartosz Dziewoński from comment #66)

Isn't there a whole infrastructure for Parsoid testing and comparing the
results to current parser? I think it'd make sense to use that instead.

There is [[mw:Parsoid/Setup]] and mediawiki/services/parsoid/tests/dumpGrepper.js, which could be run on any Wikimedia Labs instance over the Wikimedia projects dumps, but last time I tried I wasn't able to make it work and I ended up using bzgrep instead. :-)

It may get easier if someone familiar with parsoid improves the docs and/or testing infrastructure for this sort of thing, but in the current state assessing the effects of such a whitespace change certainly is not a few hours' job.

He7d3r awarded a token.Via WebNov 24 2014, 12:05 PM
Ciencia_Al_Poder edited the task description. (Show Details)Via WebDec 31 2014, 4:32 PM
Ciencia_Al_Poder removed a subscriber: Unknown Object (MLST).
Ciencia_Al_Poder set Security to None.
Ciencia_Al_Poder awarded a token.
MrStradivarius added a comment.Via WebJan 4 2015, 7:03 PM

Here's a thought: instead of parsing dumps, how about adding a tracking category, similarly to what we did with duplicate template arguments? Whatever the number of pages affected ends up being, it's going to be big, so we would end up needing to make several dump parses at regular intervals as pages get fixed. Using a tracking category will enable volunteers to fix the pages without having to worry about people updating the dump parse results.

A tracking category would also make it easier for non-Wikimedia wikis to fix their pages before upgrading their MediaWiki installations. If we just scan for affected pages on Wikimedia wikis without taking any other measures, then people on other wikis will lose out if they don't have the knowledge or resources to make their own scans.

The downside would be that most of the pages included in a tracking category would be transclusions of the pages with the problem, rather than being directly affected, as this behaviour is most often seen in templates. But that can be mitigated by looking only at transclusions in the template namespace, with tools such as CatScan.

(By the way, this bug has been mentioned on enwiki's technical village pump again.)

Ciencia_Al_Poder added a project: Epic.Via WebJan 4 2015, 7:17 PM
Anomie added a comment.Via WebJan 5 2015, 3:27 PM

Another disadvantage to adding a tracking category could be the difficulty in determining whether this behavior actually makes a difference to the resulting HTML in any particular instance, potentially leading to many false positives.

waldyrious added a subscriber: waldyrious.Via WebFeb 9 2015, 2:21 AM
Liuxinyu970226 added a subscriber: Liuxinyu970226.Via WebFeb 12 2015, 3:31 AM
Blahma added a subscriber: Blahma.Via WebMar 5 2015, 2:12 AM

Ran across this bug today while working with {{#invoke:String|…}}. This bug forces me to make nasty obfuscations in my code that sometimes push the limits of workability. And I guess that the more we use Lua/Scribunto, the more we will bump into annoyances caused by this bug.

In my specific case, I receive a string from the user and need to left trim it to start with the first asterisk I find in it. This bug breaks the simple {{#invoke:String|match|{{{1}}}|%*.*}} which, to implement the space-prepending workaround, needs to be replaced with {{#invoke:String|match|{{#invoke:String|replace|{{{1}}}|*| *|1}}| %*.*}}

The latter code is less comprehensible and doubles the number of invokes and level of nesting (and as I am processing lists of values with this code, parser time actually matters to me). Also, because I actually need to do further String processing on the output of this invoke, I find myself needing to keep that extra starting space in mind at every moment and to inflate any sub/find/replace String operations accordingly (so as to start working not from index 1, the default, but from index 2). I probably do not need to add that any nested call to a String operation that may possibly return a "dangerous" value must again be sanitized in this way, which means that you may easily run into nesting workarounds in workarounds. Very bad practice, I think :-(
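The cost of the space-prepending workaround can be seen in a small Python analog of the two calls. This is not the Lua String module: Python's `re` and `str.replace` stand in for its match and replace functions, `re.DOTALL` approximates Lua's `.` matching any character, returning an empty string on no match is an arbitrary choice, and both function names are hypothetical.

```python
import re


def trim_to_first_asterisk(value: str) -> str:
    """Analog of the simple match with pattern %*.*: text from the first '*' on."""
    found = re.search(r"\*.*", value, flags=re.DOTALL)
    return found.group(0) if found else ""


def trim_with_space_workaround(value: str) -> str:
    """Analog of the nested replace + match: the result starts with ' *'."""
    padded_value = value.replace("*", " *", 1)  # pad only the first asterisk
    found = re.search(r" \*.*", padded_value, flags=re.DOTALL)
    return found.group(0) if found else ""


direct = trim_to_first_asterisk("abc*def*ghi")
padded = trim_with_space_workaround("abc*def*ghi")
print(repr(direct))
print(repr(padded))
```

Every later operation on the padded result has to skip one extra character (here `padded[1:]` equals `direct`), which is the index shift described above.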

MarkAHershberger removed a subscriber: MarkAHershberger.Via EmailMar 5 2015, 5:37 PM
Ricordisamoa added a subscriber: Ricordisamoa.Via WebApr 5 2015, 8:57 PM
PeterBowman added a subscriber: PeterBowman.Via WebApr 6 2015, 11:24 PM
Zebulon84 added a subscriber: Zebulon84.Via WebMay 21 2015, 1:06 AM
Jdforrester-WMF removed a project: Newparser.Via WebJun 4 2015, 9:39 PM
Smith.dan added a subscriber: Smith.dan.Via WebJun 23 2015, 5:54 PM
ssastry added a comment.EditedVia WebJul 9 2015, 7:37 PM

I think the problem here is that we need different (contextual) behavior for different templates.

For templates that are expected to be used in attribute context (as in the {{color}} example), or templates that are expected to be used in phrasing content (http://www.w3.org/TR/2011/WD-html5-20110525/content-models.html#phrasing-content-0), the newline-addition behavior breaks expectations.

For everything else, newline addition seems the right behavior.

I think this goes back to various discussions that we've had in the Parsoid context -- about how to enforce content model requirements on template output. This has several benefits beyond the resolution of the conflicting requirements in this task (better visual editability, improved parsing performance, ability to edit popular templates without causing load spikes on the parse cluster, and improved ability to reason about templates and their usage -- as demonstrated in this bug report).

One of the ideas that came up in our conversations (that @tstarling had, I think) was to rely on some form of magic words that provide content model hints (potentially enforceable rules in the parser) at the start of template wikitext. This idea came up in the context of ideas / solutions to enforce DOM-scoping on template output (https://www.mediawiki.org/wiki/User:SSastry_%28WMF%29/Notes#DOM-scoping_of_template_output, https://www.mediawiki.org/wiki/Parsoid/domparse ) without having to introduce new markup in wikitext.

So, I think, if we pursued this idea further of introducing content-model (and other contextual-use) hints to the parser in templates, we might be able to have our cake and eat it too.

I do want to add the caveat here that I haven't fully thought through all the implications, but I am putting this out as an early idea for consideration.

cscott added a comment.Via WebJul 9 2015, 7:42 PM

It may be that the inAttribute serializer flag added by https://gerrit.wikimedia.org/r/219869 might help in some situations. For example, @Ciencia_Al_Poder's example with the inadvertent definition list might be suppressed if we are inAttribute. Not a 100% solution, but it might help.

Ireas removed a subscriber: Ireas.Via WebThu, Aug 20, 9:18 PM
