Page MenuHomePhabricator

mediawiki.jqueryMsg double escapes HTML entities when using mw.message(key).parse() on a message containing a link
Closed, DeclinedPublic

Description

original bug title:
mediawiki.jqueryMsg double escapes HTML entities when using mw.message(key).parse() on a message containing a link

how to reproduce:

mw.loader.using('mediawiki.jqueryMsg', function() {
  mw.messages.set({ simple: '→', subst: '→ $1 $2 $3', wlink: '[$1 linktext] $2 → $3' });
  console.log(mw.message('simple').parse());
  >> →
  console.log(mw.message('subst', 'a', 'b', 'c').parse());
  >> → a b c
  console.log(mw.message('wlink', 'a', 'b', 'c').parse());
  >> <a href="a">linktext</a> b &amp;rarr; c
ISSUE: ---------------------------^
});

Expected result:

console.log(mw.message('wlink', 'a', 'b', 'c').parse());
>> <a href="a">linktext</a> b &rarr; c

Details

Reference
bz53576
Related Gerrit Patches:

Event Timeline

bzimport raised the priority of this task from to High.Nov 22 2014, 2:06 AM
bzimport set Reference to bz53576.
bzimport added a subscriber: Unknown Object (MLST).
Rillke created this task.Aug 30 2013, 8:40 AM

Running example:

mw.loader.using('mediawiki.jqueryMsg', function() {

mw.messages.set({ simple: '&rarr;', subst: '&rarr; $1 $2 $3', wlink: '[$1

linktext] $2 &rarr; $3' });

console.log(mw.message('simple').parse());
>> &rarr;
console.log(mw.message('subst', 'a', 'b', 'c').parse());
>> &rarr; a b c
console.log(mw.message('wlink', 'a', 'b', 'c').parse());
>> <a href="a">linktext</a> b &amp;rarr; c

//ISSUE: -------------------------^
});

Confirmed still happening. However, I think the others are underescaping rather that the last one overescaping.

(In reply to Niklas Laxström from comment #3)

Confirmed still happening. However, I think the others are underescaping
rather that the last one overescaping.

I think the others are okay and the last one is over-escaping. &rarr; is valid wikitext for a right arrow (it's part of the subset of HTML that is allowed). Since it's valid wikitext, it should be allowed in parse mode.

[http://example.com linktext] b &rarr; c

renders to:

<a rel="nofollow" class="external text" href="http://example.com">linktext</a> b → c

(see https://en.wikipedia.org/w/index.php?title=User:Mattflaschen_(WMF)/Sandbox&oldid=632518564)

which is equivalent to (and renders the same as):

<a rel="nofollow" class="external text" href="http://example.com">linktext</a> b &rarr; c

Do we have concrete examples? Perhaps I have one, from UploadWizard:
2014-10-31 10.16 < Nemo_bis> tgr: Descrivi sinteticamente tutto quanto sia degno di nota a proposito di quest&#039;opera. Per le foto, indica le cose principali che vi sono rappresentate, l&#039;occasione e/o il luogo in cui sono state scattate.
2014-10-31 10.16 < Nemo_bis> (overescaping in UW)

That works fine for me. It doesn't seem to use entities (https://git.wikimedia.org/blob/mediawiki%2Fextensions%2FUploadWizard.git/077fc02922ae8023c54b78528195c4c7a77da5a3/i18n%2Fit.json#L191).

::1> @var_export( wfMessage( 'mwe-upwiz-tooltip-description' )->inLanguage( 'it' )->parse() );
@var_export( wfMessage( 'mwe-upwiz-tooltip-description' )->inLanguage( 'it' )->parse() );
'Descrivi sinteticamente tutto quanto sia degno di nota a proposito di quest\'opera.
Per le foto, indica le cose principali che vi sono rappresentate, l\'occasione e/o il luogo in cui sono state scattate.'


mw.message( 'mwe-upwiz-tooltip-description' ).parse()
"Descrivi sinteticamente tutto quanto sia degno di nota a proposito di quest'opera.
Per le foto, indica le cose principali che vi sono rappresentate, l'occasione e/o il luogo in cui sono state scattate."


However, I assume the report is valid; that just doesn't seem to be an example.

Restricted Application added a subscriber: Aklapper. · View Herald TranscriptOct 4 2015, 7:17 PM

Do we have concrete examples? Perhaps I have one, from UploadWizard:
2014-10-31 10.16 < Nemo_bis> tgr: Descrivi sinteticamente tutto quanto sia degno di nota a proposito di quest&#039;opera. Per le foto, indica le cose principali che vi sono rappresentate, l&#039;occasione e/o il luogo in cui sono state scattate.
2014-10-31 10.16 < Nemo_bis> (overescaping in UW)

That was eventually filed as T113615 and is a separate issue.

matmarex updated the task description. (Show Details)Oct 5 2015, 12:36 PM
matmarex set Security to None.
matmarex removed a subscriber: wikibugs-l-list.

I don't think we should burden mediawiki.jqueryMsg with understanding all possible HTML entities that can appear in wikitext.

Currently, it can correctly handle: &#039;, &quot;, &lt;, &gt;, &amp;, and I think that's enough. No messages in core use any other entities (except for &#32;, which is the entity for a space and is special-cased somewhere at a lower level to allow a message to begin or end with a space).

The only bug here is that it is inconsistent depending on what else is in the message, because we special-case "simple" messages (that do not contain {{, [, < or >) and don't fully parse them. I think we should just add & to this list, which will cause all entities to be consistently escaped, and document that jqueryMsg does not support entities other than the ones I listed above.

Change 243641 had a related patch set uploaded (by Bartosz Dziewoński):
mediawiki.jqueryMsg: Always parse messages with '&'

https://gerrit.wikimedia.org/r/243641

Currently, it can correctly handle: &#039;, &quot;, &lt;, &gt;, &amp;, and I think that's enough.

Actually, not quite, it can only handle them in HTML tag attributes, but not in text. Just… don't use entities with jqueryMsg, please. :)

Change 243641 merged by jenkins-bot:
mediawiki.jqueryMsg: Always parse messages with '&'

https://gerrit.wikimedia.org/r/243641

matmarex closed this task as Declined.Oct 6 2015, 11:56 AM