jqueryMsg should not parse plain messages as HTML
Closed, ResolvedPublic

Description

See https://en.wikipedia.org/w/index.php?title=User_talk:Sandipgshinde&action=history:

(cur | prev) 08:56, 29 January 2013‎ Σ (t | c)‎ . . (1,161 bytes) (+1,161)‎ . . (pagetriage-del-talk-page-notify-summary: Parse error at position 44 in input: Notifying author of deletion nomination for $1)

There were a few other edits that Σ made which had the same broken edit summary.

His browser/OS info:

"Mozilla/5.0 (X11; Linux x86_64; rv:14.0) Gecko/20100101 Firefox/14.0.1"


Version: 1.21.x
Severity: major

bzimport set Reference to bz44459.
Legoktm created this task.Via LegacyJan 29 2013, 9:44 AM
hashar added a comment.Via ConduitJan 29 2013, 9:45 AM

The message pagetriage-del-talk-page-notify-summary is apparently only used in the javascript side:

modules/ext.pageTriage.views.toolbar/ext.pageTriage.delete.js:719:
'summary': mw.msg( 'pagetriage-del-talk-page-notify-summary', pageName ),

Maybe pageName is an object or mw.msg has some serious bug :-D

Nischayn22 added a comment.Via ConduitJan 29 2013, 12:23 PM

The only assignment I see is
var pageName = new mw.Title( mw.config.get( 'wgPageName' ) ).getPrefixedText();

which returns a string, not sure what the problem is.

Krenair added a comment.Via ConduitJan 29 2013, 10:25 PM

mw.msg( 'pagetriage-del-talk-page-notify-summary', "User talk:Sandipgshinde" );
"pagetriage-del-talk-page-notify-summary: Parse error at position 44 in input: Notifying author of deletion nomination for [[$1]]"

Krenair added a comment.Via ConduitJan 30 2013, 12:53 AM

CCing Krinkle - I think this is a bug in mw.msg.

Mattflaschen added a comment.Via ConduitJan 30 2013, 10:53 PM

Hmm, it looks like this is a result of letting jqueryMsg take all messages containing a square bracket.

This was part of my change eaaec38d6fdba01b30d051711dc84865fbe388ad (before it did the same, but only for curly braces).

mw.msg (as used here) is plain by default. But there is a FIXME in mediawiki.js that plain and parse are wrongly treated exactly the same.

I think it is time to fix this. My initial idea is:

  1. Just pass the format to parser()
  2. If it is plain, fall back to old mw.msg, which should work fine for this, and similar cases where HTML is not wanted.

Later, jqueryMsg could be extended to support plain parsing.

Mattflaschen added a comment.Via ConduitJan 30 2013, 10:54 PM

Plain parsing in jqueryMsg could be useful if you don't want HTML but do want, e.g. int:, plural:, etc. support.

Schnark added a comment.Via ConduitJan 31 2013, 9:20 AM

There are similar reports for the WikiEditor extension, which might be duplicates of this one: Bug 44215, Bug 44195, Bug 42107.

Schnark added a comment.Via ConduitJan 31 2013, 9:36 AM

Another situation where this occurs: Go to https://de.wikipedia.org/wiki/Kategorie:!Hauptkategorie (or any other category page) and click the [+] to open a subcategory. It opens, but a "categorytree-collapse-bullet: Parse error at position 0 in input:" is inserted.

Mattflaschen added a comment.Via ConduitFeb 1 2013, 4:04 AM
Mattflaschen added a comment.Via ConduitFeb 1 2013, 8:10 AM

It doesn't affect the current fix, but part of my comment (6) above is incorrect.

If we want to support the curly braces but not HTML, I believe that would be text, not plain. From https://www.mediawiki.org/wiki/Manual:Messages_API#Format:

"wfMessage()->text(); transforms the message text (MessageCache::transform() which transforms all '{{}}')".

'plain' (which the fix is regarding) only has parameter substitution.

Schnark added a comment.Via ConduitFeb 1 2013, 8:56 AM

(In reply to comment #9)

Proposed fix and tests at https://gerrit.wikimedia.org/r/#/c/47044/

As I don't have a gerrit account, I just comment here: The test with categorytree-collapse-bullet seems useless, as the value used on en.wp doesn't contain any critical characters. Use the default content [<b>−</b>] instead.

Mattflaschen added a comment.Via ConduitFeb 1 2013, 9:10 AM

True. I realized that a while after I posted the change. I will fix it later today, though the pagetriage-del-talk-page-notify-summary covers similar ground.

You can sign up for Gerrit/Labs at https://labsconsole.wikimedia.org/wiki/Help:Getting_Started#Create_a_User_Account . It no longer requires manually asking anyone.

Mattflaschen added a comment.Via ConduitFeb 1 2013, 9:31 PM

Done in patch set 3, so please review.

matmarex added a comment.Via ConduitFeb 4 2013, 5:32 PM

Raising importance, as this is highly visible.

Nischayn22 added a comment.Via ConduitFeb 7 2013, 4:16 PM

*** Bug 44752 has been marked as a duplicate of this bug. ***

Mattflaschen added a comment.Via ConduitFeb 9 2013, 1:31 AM

Merged.

matmarex added a comment.Via ConduitFeb 9 2013, 12:37 PM
Nikerabbit added a comment.Via ConduitFeb 10 2013, 11:01 AM

Reopening. The fix is also causing breakage all over and needs to be replaced with a better fix.

mw.msg() is expected to parse plural in existing code.

Mattflaschen added a comment.Via ConduitFeb 10 2013, 12:15 PM

mw.msg is a shortcut for mw.message(...).plain(), and contrary to what the old mediawiki.js comment suggested, plain is not supposed to support {{plural}}. See https://www.mediawiki.org/wiki/Manual:Messages_API#Format and the source code of Message.php.

plain "returns the message text as-is; only parameters are substituted."

Code that needs {{-transformation on the client side should use 'parse' ('text' is not currently supported in JS):

mw.message(key).parse()

There is no good way (other than format, which is designed for this) to figure out how to parse message text like "[<b>−</b>]".

If you give the affected messages, I can help change them to use parse.

Mattflaschen added a comment.Via ConduitFeb 10 2013, 1:01 PM

Í've uploaded a Gerrit to update some tests that were using plain for plural and gender:

https://gerrit.wikimedia.org/r/#/c/48349/

All I had to do was change the format, not the expected output.

As we keep developing jqueryMsg and client-side message parsing, I don't think we can have this inconsistency (client and server) in the behavior of plain.

In general we should bring client-side message parsing in line with the server behavior.

Mattflaschen added a comment.Via ConduitFeb 10 2013, 1:02 PM

Reopening until that is merged.

Nikerabbit added a comment.Via ConduitFeb 10 2013, 5:46 PM

We are on the right track here.

"mw.msg is a shortcut for mw.message(...).plain()"

Here I disagree. mw.msg should be a shortcut to mw.message().text(), just like what wfMsg was in the PHP side. There is actual code out there depending on this behavior.

I am sorry that we have got into this mess, but I think there is only way out here, and that is, like you said, to make the JavaScript side match PHP side.

This means:

  • mw.message.plain() is now working correctly
  • mw.message.parse() output should be safe for direct inclusion in html, parsing {{ and link syntax
  • mw.message.text() should be introduced and it should function like mw.msg used to function: expand {{ syntax
  • mw.msg should be aliased to mw.message.text()
  • mw.message.escaped() should be the same as mw.message.text() with html escaping
  • fix the calls that assume something else than what was stated above
Nemo_bis added a comment.Via ConduitFeb 10 2013, 7:40 PM

Assigning to last patch's owner.
Matthew, are you going to do what above?

Mattflaschen added a comment.Via ConduitFeb 11 2013, 1:25 PM

"mw.msg is a shortcut for mw.message(...).plain()"

I was saying how mw.msg behaves now. It is currently plain.

I think this Niklas's proposal in comment 22 is a reasonable approach. I'm going to start implementing it.

The main part that be people need to be be careful about is the mw.msg default. wfMessage (the now-recommended API on the server-side) defaults to 'parse'.

We could change it to that, but that would lead to the same short-term problems (e.g. categorytree-collapse-bullet parsing wrong).

So I think we should go with 'text', but recommend people use the long-form approach going forward. This is already recommended for PHP at https://www.mediawiki.org/wiki/Manual:Messages_API#Format ("It is recommended to specify the output format at the end of wfMessage( 'message-key' )).

Mattflaschen added a comment.Via ConduitFeb 11 2013, 3:50 PM

Daniel Werner has put in a Gerrit reverting the change to use plain (https://gerrit.wikimedia.org/r/#/c/48456/).

Since there will be breakage either way (until I'm done the solution proposed in comment 22), I don't have a strong preference.

Jdlrobson added a comment.Via ConduitFeb 11 2013, 10:24 PM
  • Bug 44851 has been marked as a duplicate of this bug. ***
Mattflaschen added a comment.Via ConduitFeb 12 2013, 6:15 AM

The proposed fix/solution is at https://gerrit.wikimedia.org/r/#/c/48601/ . Please review.

DanielFriesen added a comment.Via ConduitFeb 12 2013, 6:26 AM

Note that I actually have some unreviewed changes for jqueryMsg that have been waiting to be merged in. Someone should check if the changes I made make any effect on this bug.
https://gerrit.wikimedia.org/r/#/c/4051/
https://gerrit.wikimedia.org/r/#/c/4055/

Mattflaschen added a comment.Via ConduitFeb 12 2013, 6:40 AM

I do wish I knew about those earlier, particularly text mode.

However, even for the newer one, it's been almost three months since anyone followed up in any way.

There has been significant work since then, even before this, so merging would be quite challenging.

So I think 48601 should go first. You can then rebase those changes that make sense, and I'll be glad to help review.

Aklapper added a comment.Via ConduitFeb 12 2013, 10:40 AM

Potential dups/deps: bug 44885, bug 44832.

Nemo_bis added a comment.Via ConduitFeb 12 2013, 10:49 AM

Thank you Matthew. Could you clarify why you removed 1.21.0 milestone? As far as I can see, we can definitely *not* ship a stable release with this bug.

bzimport added a comment.Via ConduitFeb 12 2013, 2:05 PM

sumanah wrote:

Gerrit changeset 48601 has now been merged.

Mattflaschen added a comment.Via ConduitFeb 13 2013, 1:33 AM

I didn't mean to change the milestone.

I'm marking this fixed. msg now uses text, which should work in almost all cases. In the remaining ones, we may have to change it to use a different format.

CC me if you think that's the case, and I can help assist if needed.

Add Comment

Column Prototype
This is a very early prototype of a persistent column. It is not expected to work yet, and leaving it open will activate other new features which will break things. Press "\" (backslash) on your keyboard to close it now.