character filter for uselang
Closed, Resolved, Public

Description

Bug 36938 is fixed and adds escaping of uselang for HTML.

For the JavaScript variable mw.config.get( 'wgUserLanguage' ), many characters are still allowed, although some are filtered:

https://www.mediawiki.org/w/index.php?uselang= >>> "en"
https://www.mediawiki.org/w/index.php?uselang=%20 >>> " "
https://www.mediawiki.org/w/index.php?uselang=%21 >>> "!"
https://www.mediawiki.org/w/index.php?uselang=%22 >>> """
https://www.mediawiki.org/w/index.php?uselang=%23 >>> "en"
https://www.mediawiki.org/w/index.php?uselang=%24 >>> "$"
https://www.mediawiki.org/w/index.php?uselang=%25 >>> "%"
https://www.mediawiki.org/w/index.php?uselang=%2525 >>> "en"
https://www.mediawiki.org/w/index.php?uselang=%26 >>> "&"
https://www.mediawiki.org/w/index.php?uselang=%26amp >>> "&amp"
https://www.mediawiki.org/w/index.php?uselang=%26amp; >>> "en"
https://www.mediawiki.org/w/index.php?uselang=; >>> ";"
https://www.mediawiki.org/w/index.php?uselang=: >>> "en"
https://www.mediawiki.org/w/index.php?uselang=%3d >>> "="
https://www.mediawiki.org/w/index.php?uselang== >>> "="
https://www.mediawiki.org/w/index.php?uselang=/ >>> "en"
https://www.mediawiki.org/w/index.php?uselang=" >>> """
https://www.mediawiki.org/w/index.php?uselang=' >>> "'"

Many scripts use wgUserLanguage unescaped. Examples:
https://commons.wikimedia.org/wiki/MediaWiki:Common.js
https://commons.wikimedia.org/wiki/MediaWiki:Gadget-HotCat.js

When you open the following link on dewiki with the HotCat gadget enabled,

https://de.wikipedia.org/w/index.php?uselang=en%26curid=19891835

the page https://commons.wikimedia.org/wiki/User:Fomafix/xss.js is loaded.

Of course this is a bug in the gadget, but there are many gadgets that may contain the same error.
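
To illustrate the kind of mistake described above, here is a minimal sketch. It is not the actual HotCat code: the function names, the Gadget-Example.js title, and the URL layout are assumptions made up for illustration. It only shows how an unescaped "&" in wgUserLanguage can add a parameter such as curid to a script URL, and how percent-encoding the value avoids that.

```javascript
// Hypothetical gadget code: both functions and the URL layout are illustrative only.
const SCRIPT_BASE = 'https://commons.wikimedia.org/w/index.php';

// Unsafe: the language code is pasted into the query string as-is.
function buildScriptUrlUnsafe(lang) {
  return SCRIPT_BASE + '?title=MediaWiki:Gadget-Example.js/' + lang +
    '&action=raw&ctype=text/javascript';
}

// Safer: percent-encode the value so "&" and "=" cannot start a new parameter.
function buildScriptUrlSafe(lang) {
  return SCRIPT_BASE + '?title=' +
    encodeURIComponent('MediaWiki:Gadget-Example.js/' + lang) +
    '&action=raw&ctype=text/javascript';
}

// Value of wgUserLanguage for uselang=en%26curid=19891835
const injected = 'en&curid=19891835';

console.log(new URL(buildScriptUrlUnsafe(injected)).searchParams.has('curid')); // true
console.log(new URL(buildScriptUrlSafe(injected)).searchParams.has('curid'));   // false
```

In the unsafe variant the request carries an extra curid parameter, which presumably lets the attacker choose which page is fetched as script. Escaping in every gadget fixes the symptom; filtering uselang on the server side would protect gadgets that forget to escape.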

Expected result:
wgUserLanguage should only be set from uselang when uselang contains nothing but the necessary, allowed characters.


Version: 1.19.1
Severity: normal

bzimport added a subscriber: wikibugs-l.
bzimport set Reference to bz37587.
Fomafix created this task. Via Legacy, Jun 14 2012, 11:10 AM
Fomafix added a comment. Via Conduit, Jun 14 2012, 1:55 PM

BCP 47 states in https://tools.ietf.org/html/bcp47#section-7:

language tags use only the characters A-Z, a-z, 0-9, and HYPHEN-MINUS

These should be the only allowed characters.
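
A character filter along these lines could be sketched as follows. This is only an illustration of the proposal, not MediaWiki code: sanitizeLangCode and the 'en' fallback are assumed names and choices, and the check covers the character set only, not whether the value is a well-formed language tag.

```javascript
// Sketch only: sanitizeLangCode and the 'en' fallback are assumptions, not MediaWiki API.
// Accept values made up solely of A-Z, a-z, 0-9 and HYPHEN-MINUS.
const BCP47_CHARS = /^[A-Za-z0-9-]+$/;

function sanitizeLangCode(value, fallback = 'en') {
  // Anything containing another character (or an empty value) falls back.
  return typeof value === 'string' && BCP47_CHARS.test(value) ? value : fallback;
}

console.log(sanitizeLangCode('en-upload-ownwork')); // "en-upload-ownwork"
console.log(sanitizeLangCode('en&curid=19891835')); // "en"
```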

csteipp added a comment. Via Conduit, Jun 25 2012, 6:16 PM

Fomafix,

Thanks for reporting this too! I've been out on leave for the last two weeks, so apologies for the slow response.

I see exactly what you mean and yes, that is bad. We need to figure out the best place to put in the fix for this, but we will get it addressed asap.

tstarling added a comment. Via Conduit, Jun 25 2012, 10:01 PM

The uselang attribute commonly contains punctuation characters that aren't allowed by BCP-47, due to the {{int:}} hack commonly used on multilanguage wikis. Only the minimum set of characters required for security should be rejected, plus the ones rejected by Language::isValidCode().

Krinkle added a comment. Via Conduit, Jun 30 2012, 6:15 AM

(In reply to comment #4)

The uselang attribute commonly contains punctuation characters that aren't
allowed by BCP-47

Such as? I thought it was only used for things like en-upload-ownwork, which always stay within BCP-47 and are in general even stricter (never numbers or uppercase).

csteipp added a comment. Via Conduit, Jul 6 2012, 4:51 PM

Working with Tim on this yesterday, he pulled a list of all of the uselang values that hit WMF sites from the cache (http://paste.tstarling.com/p/qzhZBz.html). There were several obvious attack strings, and some that looked like they probably were errors. Almost all the rest were a-zA-Z0-9.-+ characters, with a few ?, =, and ncr-encoded characters where it was hard to figure out if they were errors or intentional.

From a security perspective, I think we should at least implement Nikerabbit's patch now, and if anyone was intentionally using ', ", or &, we can work with the site admins to get those cleaned up. Then we can later look at whitelisting [a-zA-Z+.-] only.
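
The two stages can be compared with a small sketch. It does not reproduce the patch mentioned above; classifyUselang and both patterns are assumptions based only on the characters named in this comment.

```javascript
// Sketch only: classifyUselang is a hypothetical helper, not part of any patch.
const REJECTED_NOW = /['"&]/;            // stage 1: reject only ', " and &
const ALLOWED_LATER = /^[a-zA-Z+.-]+$/;  // stage 2: whitelist as written in the comment

function classifyUselang(value) {
  return {
    stage1: !REJECTED_NOW.test(value),  // passes the minimal security filter
    stage2: ALLOWED_LATER.test(value),  // passes the stricter whitelist
  };
}

console.log(classifyUselang('en-upload-ownwork')); // { stage1: true, stage2: true }
console.log(classifyUselang('en&curid=19891835')); // { stage1: false, stage2: false }
console.log(classifyUselang('en?x'));              // { stage1: true, stage2: false }
```

Note that the whitelist exactly as written contains no digits, so a value with digits would pass stage 1 but fail stage 2.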

csteipp added a comment. Via Conduit, Aug 1 2012, 6:54 PM

With the rollout of wmf8 today on de.wikipedia.org, the particular issue reported by Fomafix appears to be resolved. Thanks everyone!

csteipp added a project: Security. Via Web, Thu, Mar 26, 8:39 PM
