Page MenuHomePhabricator

jqueryMsg should not parse plain messages as HTML
Closed, ResolvedPublic

Description

See https://en.wikipedia.org/w/index.php?title=User_talk:Sandipgshinde&action=history:

(cur | prev) 08:56, 29 January 2013‎ Σ (t | c)‎ . . (1,161 bytes) (+1,161)‎ . . (pagetriage-del-talk-page-notify-summary: Parse error at position 44 in input: Notifying author of deletion nomination for $1)

There were a few other edits that Σ made which had the same broken edit summary.

His browser/OS info:

"Mozilla/5.0 (X11; Linux x86_64; rv:14.0) Gecko/20100101 Firefox/14.0.1"


Version: 1.21.x
Severity: major

Details

Reference
bz44459

Event Timeline

bzimport raised the priority of this task from to High.Nov 22 2014, 1:19 AM
bzimport set Reference to bz44459.

The message pagetriage-del-talk-page-notify-summary is apparently only used in the javascript side:

modules/ext.pageTriage.views.toolbar/ext.pageTriage.delete.js:719:
'summary': mw.msg( 'pagetriage-del-talk-page-notify-summary', pageName ),

Maybe pageName is an object or mw.msg has some serious bug :-D

The only assignment I see is
var pageName = new mw.Title( mw.config.get( 'wgPageName' ) ).getPrefixedText();

which returns a string, not sure what the problem is.

mw.msg( 'pagetriage-del-talk-page-notify-summary', "User talk:Sandipgshinde" );
"pagetriage-del-talk-page-notify-summary: Parse error at position 44 in input: Notifying author of deletion nomination for [[$1]]"

CCing Krinkle - I think this is a bug in mw.msg.

Hmm, it looks like this is a result of letting jqueryMsg take all messages containing a square bracket.

This was part of my change eaaec38d6fdba01b30d051711dc84865fbe388ad (before it did the same, but only for curly braces).

mw.msg (as used here) is plain by default. But there is a FIXME in mediawiki.js that plain and parse are wrongly treated exactly the same.

I think it is time to fix this. My initial idea is:

  1. Just pass the format to parser()
  2. If it is plain, fall back to old mw.msg, which should work fine for this, and similar cases where HTML is not wanted.

Later, jqueryMsg could be extended to support plain parsing.

Plain parsing in jqueryMsg could be useful if you don't want HTML but do want, e.g. int:, plural:, etc. support.

There are similar reports for the WikiEditor extension, which might be duplicates of this one: Bug 44215, Bug 44195, Bug 42107.

Another situation where this occurs: Go to https://de.wikipedia.org/wiki/Kategorie:!Hauptkategorie (or any other category page) and click the [+] to open a subcategory. It opens, but a "categorytree-collapse-bullet: Parse error at position 0 in input:" is inserted.

It doesn't affect the current fix, but part of my comment (6) above is incorrect.

If we want to support the curly braces but not HTML, I believe that would be text, not plain. From https://www.mediawiki.org/wiki/Manual:Messages_API#Format:

"wfMessage()->text(); transforms the message text (MessageCache::transform() which transforms all '{{}}')".

'plain' (which the fix is regarding) only has parameter substitution.

(In reply to comment #9)

Proposed fix and tests at https://gerrit.wikimedia.org/r/#/c/47044/

As I don't have a gerrit account, I just comment here: The test with categorytree-collapse-bullet seems useless, as the value used on en.wp doesn't contain any critical characters. Use the default content [<b>−</b>] instead.

True. I realized that a while after I posted the change. I will fix it later today, though the pagetriage-del-talk-page-notify-summary covers similar ground.

You can sign up for Gerrit/Labs at https://labsconsole.wikimedia.org/wiki/Help:Getting_Started#Create_a_User_Account . It no longer requires manually asking anyone.

Done in patch set 3, so please review.

Raising importance, as this is highly visible.

*** Bug 44752 has been marked as a duplicate of this bug. ***

Reopening. The fix is also causing breakage all over and needs to be replaced with a better fix.

mw.msg() is expected to parse plural in existing code.

mw.msg is a shortcut for mw.message(...).plain(), and contrary to what the old mediawiki.js comment suggested, plain is not supposed to support {{plural}}. See https://www.mediawiki.org/wiki/Manual:Messages_API#Format and the source code of Message.php.

plain "returns the message text as-is; only parameters are substituted."

Code that needs {{-transformation on the client side should use 'parse' ('text' is not currently supported in JS):

mw.message(key).parse()

There is no good way (other than format, which is designed for this) to figure out how to parse message text like "[<b>−</b>]".

If you give the affected messages, I can help change them to use parse.

Í've uploaded a Gerrit to update some tests that were using plain for plural and gender:

https://gerrit.wikimedia.org/r/#/c/48349/

All I had to do was change the format, not the expected output.

As we keep developing jqueryMsg and client-side message parsing, I don't think we can have this inconsistency (client and server) in the behavior of plain.

In general we should bring client-side message parsing in line with the server behavior.

We are on the right track here.

"mw.msg is a shortcut for mw.message(...).plain()"

Here I disagree. mw.msg should be a shortcut to mw.message().text(), just like what wfMsg was in the PHP side. There is actual code out there depending on this behavior.

I am sorry that we have got into this mess, but I think there is only way out here, and that is, like you said, to make the JavaScript side match PHP side.

This means:

  • mw.message.plain() is now working correctly
  • mw.message.parse() output should be safe for direct inclusion in html, parsing {{ and link syntax
  • mw.message.text() should be introduced and it should function like mw.msg used to function: expand {{ syntax
  • mw.msg should be aliased to mw.message.text()
  • mw.message.escaped() should be the same as mw.message.text() with html escaping
  • fix the calls that assume something else than what was stated above

Assigning to last patch's owner.
Matthew, are you going to do what above?

"mw.msg is a shortcut for mw.message(...).plain()"

I was saying how mw.msg behaves now. It is currently plain.

I think this Niklas's proposal in comment 22 is a reasonable approach. I'm going to start implementing it.

The main part that be people need to be be careful about is the mw.msg default. wfMessage (the now-recommended API on the server-side) defaults to 'parse'.

We could change it to that, but that would lead to the same short-term problems (e.g. categorytree-collapse-bullet parsing wrong).

So I think we should go with 'text', but recommend people use the long-form approach going forward. This is already recommended for PHP at https://www.mediawiki.org/wiki/Manual:Messages_API#Format ("It is recommended to specify the output format at the end of wfMessage( 'message-key' )).

Daniel Werner has put in a Gerrit reverting the change to use plain (https://gerrit.wikimedia.org/r/#/c/48456/).

Since there will be breakage either way (until I'm done the solution proposed in comment 22), I don't have a strong preference.

  • Bug 44851 has been marked as a duplicate of this bug. ***

Note that I actually have some unreviewed changes for jqueryMsg that have been waiting to be merged in. Someone should check if the changes I made make any effect on this bug.
https://gerrit.wikimedia.org/r/#/c/4051/
https://gerrit.wikimedia.org/r/#/c/4055/

I do wish I knew about those earlier, particularly text mode.

However, even for the newer one, it's been almost three months since anyone followed up in any way.

There has been significant work since then, even before this, so merging would be quite challenging.

So I think 48601 should go first. You can then rebase those changes that make sense, and I'll be glad to help review.

Potential dups/deps: bug 44885, bug 44832.

Thank you Matthew. Could you clarify why you removed 1.21.0 milestone? As far as I can see, we can definitely *not* ship a stable release with this bug.

sumanah wrote:

Gerrit changeset 48601 has now been merged.

I didn't mean to change the milestone.

I'm marking this fixed. msg now uses text, which should work in almost all cases. In the remaining ones, we may have to change it to use a different format.

CC me if you think that's the case, and I can help assist if needed.