Page MenuHomePhabricator

Username beginning with asterisk renders as list in “restore”/“undo” edit summaries of Wikibase items
Closed, ResolvedPublic

Description

For example, see wikidata:Special:Diff/606650780. The edit summary

(Restore revision 597045630 by *Youngjin)

is rendered as

(‎

Restore revision 597045630 by

  • Youngjin

)

As I’m reporting this, Wikidata is on 1.31.0-wmf.11 (rMW6d47031161c3).

I have also reproduced this locally in “undo” edit messages. “rollback” messages, on the other hand, appear unaffected.

This only seems to happen on Wikibase entity pages.

I’m reporting this as a security issue because I feel like there might be a slight risk of an XSS vulnerability here (even though <script>alert('pwnd')</script> is not a valid username and would also presumably be escaped by the wikitext parser that is apparently applied being to these summaries).

Event Timeline

Its unlikely this is an XSS.

Most likely what needs to happen is that when passing parameters to the 'wikibase-item-summary-restore' message, the username should be run through wfEscapeWikitext().

Possible fix:

diff --git a/lib/includes/Formatters/AutoCommentFormatter.php b/lib/includes/Formatters/AutoCommentFormatter.php
index 0c77d8762..b57ab5580 100644
--- a/lib/includes/Formatters/AutoCommentFormatter.php
+++ b/lib/includes/Formatters/AutoCommentFormatter.php
@@ -100,7 +100,7 @@ public function formatAutoComment( $auto ) {
 		}
 
 		// render the autocomment
-		$auto = $msg->params( $args )->parse();
+		$auto = $msg->plaintextParams( $args )->parse();
 		return $auto;
 	}

But I have no idea if this breaks something else – it’s possible that some of the message parameters should be interpreted as Wikitext. (As far as I can tell, AutoCommentFormatter itself has no idea what the parameters mean – it just extracts the partial message key and the parameters from the edit summary and combines them. So it’s also possible that the escaping should happen before writing the username to the edit summary.)

@Bawolff if you think this is unlikely to be an XSS, perhaps you can make the task public and let the rest of the Wikidata team have a look? (I agree, FWIW, I just thought “better safe than sorry”.)

Bawolff changed the visibility from "Custom Policy" to "Public (No Login Required)".Dec 13 2017, 5:22 PM

Possible fix:

diff --git a/lib/includes/Formatters/AutoCommentFormatter.php b/lib/includes/Formatters/AutoCommentFormatter.php
index 0c77d8762..b57ab5580 100644
--- a/lib/includes/Formatters/AutoCommentFormatter.php
+++ b/lib/includes/Formatters/AutoCommentFormatter.php
@@ -100,7 +100,7 @@ public function formatAutoComment( $auto ) {
 		}
 
 		// render the autocomment
-		$auto = $msg->params( $args )->parse();
+		$auto = $msg->plaintextParams( $args )->parse();
 		return $auto;
 	}

But I have no idea if this breaks something else – it’s possible that some of the message parameters should be interpreted as Wikitext. (As far as I can tell, AutoCommentFormatter itself has no idea what the parameters mean – it just extracts the partial message key and the parameters from the edit summary and combines them. So it’s also possible that the escaping should happen before writing the username to the edit summary.)

@Bawolff if you think this is unlikely to be an XSS, perhaps you can make the task public and let the rest of the Wikidata team have a look? (I agree, FWIW, I just thought “better safe than sorry”.)

Made public.

Your proposed patch looks like it would fix the issue.

Change 398081 had a related patch set uploaded (by Lucas Werkmeister (WMDE); owner: Lucas Werkmeister (WMDE)):
[mediawiki/extensions/Wikibase@master] Render autocomment parameters as plain text

https://gerrit.wikimedia.org/r/398081

Ah, but plaintext parameters are, like raw parameters, processed after the message is parsed, so {{PLURAL:$3|claim|claims}} no longer works :/

And the username is actually used inside a {{GENDER}} magic word for both messages where it appears, so I assume we simply can’t use plaintext…

Hm, I’m not sure how to solve this. {{GENDER}} needs the unescaped username, but then how can we also include the username in the message without it being parsed? Do we need to pass the username into the message twice, once unescaped and once escaped?

Ah, but plaintext parameters are, like raw parameters, processed after the message is parsed, so {{PLURAL:$3|claim|claims}} no longer works :/

Oh, I didn't think about that. Just use a normal ->params() and wrap in wfEscapeWikiText() instead.

$msg->params( wfEscapeWikiText($args) )->parse();

Hm, {{GENDER}} doesn’t seem to support usernames with asterisks anyways – if I’m not mistaken, they’re treated as “unknown gender”. (Unless that’s just a problem with my local test wiki.) (It was just a problem with my local test wiki. Specifically, I still had the attempted fix applied, so {{GENDER}} didn’t actually see the user name.)

Why is this even being parsed as a list, anyways? The plain text of the message is:

Restore revision 597045630 by [[Special:Contributions/*Youngjin|{{GENDER:*Youngjin|*Youngjin}}]]

There’s no * anywhere near the beginning of a line, and yet a list is emitted. This can also be reproduced outside of this message, e. g. with foo{{GENDER:|*bar}}. Is that a bug of {{GENDER}}?

Hm, looks like that’s just standard behavior for any template (I tried it with foo{{identity|*bar}}, where Template:identity is just {{{1}}}). Odd, but not a bug.

But this suggests that moving the username out of the {{GENDER}} should be enough to fix the bug. If I understand it correctly, we’re not actually making use of the {{GENDER}} here – it’s just used in the English version of the message to give translators a hint that they can use the function, but they will not apply it to the username itself. For example, the Portuguese translation of wikibase-entity-summary-restore is

Restaurar a revisão $3 {{GENDER:$4|do utilizador|da utilizadora}} [[Special:Contributions/$4|$4]]

where the {{GENDER}} is used to switch between the male and female form of “user”, whereas the username itself ($4) stays the same.

But this suggests that moving the username out of the {{GENDER}} should be enough to fix the bug.

Well, it would fix the * bug. It won’t necessarily fix any other problems that may occur because we’re sending unescaped user names into a parsed message.

The curious thing is that I couldn’t reproduce this bug with a user named '''bold'''. It seems like in that case, the autocomment we get from MediaWiki already contains &#039; instead of '. There seems to be some confusion inside MediaWiki about this topic, too, because while logged in as '''bold''' (with gender set to female), {{GENDER:|he|she|they}} (gender for current user) results in “they”, while {{GENDER:'''bold'''|he|she|they}} (gender for '''bold''' user) correctly results in “she”.

Are there any other possible usernames that cause a bug like this? I could only find ;dl, definition list syntax (which is also only effective due to the {{GENDER}} template, and would otherwise have no special meaning in the middle of a line).

  • :, #, <, >, [, ], {, } are not allowed in usernames.
  • ' appears to get escaped.
  • & is allowed in usernames, and in fact MediaWiki seems to treat &quot; and " as the same user everywhere, so I assume allowing HTML entities in usernames is a feature(?). However, “double HTML entities” like &amp;quot; are not allowed.

Any obscure bits of wikitext I forgot?

Alternative fix: wrap the user name in <nowiki>.

diff --git a/repo/i18n/en.json b/repo/i18n/en.json
index c93cfa413..cacb8b4fe 100644
--- a/repo/i18n/en.json
+++ b/repo/i18n/en.json
@@ -369,8 +369,8 @@
 	"wikibase-self-conflict-patched": "Your edit was patched into the latest version, overriding some of your own intermediate changes.",
 	"wikibase-conflict-patched": "Your edit was patched into the latest version.",
 	"wikibase-restoreold": "restore",
-	"wikibase-entity-summary-restore": "Restore revision $3 by [[Special:Contributions/$4|{{GENDER:$4|$4}}]]",
-	"wikibase-entity-summary-undo": "Undo revision $3 by [[Special:Contributions/$4|{{GENDER:$4|$4}}]] ([[User talk:$4|talk]])",
+	"wikibase-entity-summary-restore": "Restore revision $3 by [[Special:Contributions/$4|{{GENDER:$4|<nowiki>$4</nowiki>}}]]",
+	"wikibase-entity-summary-undo": "Undo revision $3 by [[Special:Contributions/$4|{{GENDER:$4|<nowiki>$4</nowiki>}}]] ([[User talk:$4|talk]])",
 	"wikibase-non-entity-diff": "Can not diff an entity with a non-entity content.",
 	"wikibase-no-direct-editing": "Direct editing is disabled in namespace $1",
 	"wikibase-noentity": "This entity does not exist. You can [{{fullurl:{{#Special:Log}}|page={{FULLPAGENAMEE}}}} search the related logs] to find out where it went.",

I’m only half joking – I think this fix does make some sense, since AutoCommentFormatter has no idea which arguments need to be escaped and which don’t (it just does the same thing for all messages), while the individual messages know what each argument means. That said, this is clearly a hack, and what’s worse, a hack that’s shown to all of our translators. Many of them will probably copy the <nowiki> into their translations even though it shouldn’t be necessary outside of the {{GENDER}} template.

Why is this even being parsed as a list, anyways? The plain text of the message is:

Restore revision 597045630 by [[Special:Contributions/*Youngjin|{{GENDER:*Youngjin|*Youngjin}}]]

There’s no * anywhere near the beginning of a line, and yet a list is emitted. This can also be reproduced outside of this message, e. g. with foo{{GENDER:|*bar}}. Is that a bug of {{GENDER}}?

This is a well known feature of parser functions (That they add newlines). The behaviour is very controversial, but is probably here to stay for back compatibility if nothing else. See T14974

Alternative fix: wrap the user name in <nowiki>.

Just use the wfEscapeWikitext() global function (Defined in includes/GlobalFunctions.php), that's the standard solution the rest of MediaWiki uses.

 :, #, <, >, [, ], {, } are not allowed in usernames.
' appears to get escaped.
& is allowed in usernames, and in fact MediaWiki seems to treat &quot; and " as the same user everywhere, so I assume allowing HTML entities in usernames is a feature(?). However, “double HTML entities” like &amp;quot; are not allowed.

Well you've also got percent encoding behaves weirdly in usernames. Some parts of MediaWiki use the '#' to make "fake" users (Especially the blocking code), but that's not relavent in this context. ---- at the beginning of a line generates an <hr>. Urls cause all sorts of weird stuff to happen (But not inside a link). So do things like __TOC__, and magic links (RFC123) if they are enabled.

Just use the wfEscapeWikitext() global function (Defined in includes/GlobalFunctions.php), that's the standard solution the rest of MediaWiki uses.

I don’t think we can do that – {{GENDER}} needs the unescaped username. Otherwise we’ll address these users with the wrong gender. (I tried it out locally, {{GENDER:*asterisk|he|she|they}} produces “she” while {{GENDER:&#42;asterisk|he|she|they}} produces “they”.)

And thanks for the other Wikitext examples! But I think none of them affect this bug.

---- at the beginning of a line generates an <hg>.

Good point, but apparently not within {{GENDER:|----}}, though I’m not sure why.

__TOC__

Right, but in a username _ = , so trying to register as __TOC__ just results in the username “TOC”.

magic links

Also not expanded inside a link.

Hmm. Maybe thats a bug in GENDER: i would expect it to normalize titles.

Change 398081 abandoned by Lucas Werkmeister (WMDE):
Render autocomment parameters as plain text

Reason:
This was premature, see more discussion in the attached bug.

https://gerrit.wikimedia.org/r/398081

I don’t think we can do that – {{GENDER}} needs the unescaped username. Otherwise we’ll address these users with the wrong gender. (I tried it out locally, {{GENDER:*asterisk|he|she|they}} produces “she” while {{GENDER:&#42;asterisk|he|she|they}} produces “they”.)

And weirdly enough, {{GENDER:User:&#42;asterisk|he|she|they}} produces she.

Change 398821 had a related patch set uploaded (by Lucas Werkmeister (WMDE); owner: Lucas Werkmeister (WMDE)):
[mediawiki/extensions/Wikibase@master] Escape autocomment message arguments

https://gerrit.wikimedia.org/r/398821

Change 398821 merged by jenkins-bot:
[mediawiki/extensions/Wikibase@master] Escape autocomment message arguments

https://gerrit.wikimedia.org/r/398821

So core is fixed now, so we should be all clear to go ahead with the fix in wikidata

Wikibase part is merged as well, so we should be good to go. But unless someone thinks it’s serious enough for a backport, we’ll have to wait a bit for the next train (no train this and next week).

matmarex subscribed.

Looks like this has been fixed by the patches in January.