Page MenuHomePhabricator

Do a security review of ReplaceText prior to it being bundled with tarball
Closed, ResolvedPublic

Description

Per T178349. ReplaceText might be bundled. We want to do a security review before next release so we could potentially bundle it.

Items from review:

  • Seems to have no CSRF check. Given the potential destructive nature of the tool, this is rather serious. (https://gerrit.wikimedia.org/r/#/q/Ib6a951152db222b6289b9b8d09608dfe75ed2de2)
  • It should use resource loader instead of inline javascript and $out->addScriptFile() (https://gerrit.wikimedia.org/r/#/c/425453/)
  • Maybe $wgReplaceTextUser should be added to the UserGetReservedNames hook so user's can't register that name for themselves.
  • The job should maybe use RequestContext::importScopedSession so that it is more compatible with things like CheckUser
  • "replaceAll.php" line 178 - summary should probably just be a ->plain() msg with normal parameters.
  • The code for Special:ReplaceText is a bit hard to follow in parts. It maybe could benefit from splitting long functions into smaller functions.
  • This won't work with $wgCompressRevisions or $wgExternalStores. Which is also fairly well known. However, ideally there would be some sort of error message for that case. It could perhaps be worked around (In a really slow way), by having it consider anything that matches 'old_flags LIKE \'%object%\' OR old_flags LIKE \'%external%\'' as matching the regex, and then have the job actually determine if they match, but then the preview step wouldn't work. - (Solved with just an error message)

Deferred items:

  • The extension sets the ungreedy and unicode flags for regexes in PHP but does not set it mariadb. Starting in 1.10.11 this can be set using https://mariadb.com/kb/en/library/server-system-variables/#default_regex_flags . Prior to that it would always be ungreedy (and prior to 1.10.5 mariadb didn't even use pcre). Most of the time, this only controls how the pattern matches not what the pattern matches, but if you use advanced pcre features the ungreedy flag could potentially change whether the regex matches at all.
  • "SpecialReplaceText.php" line 603 - Has weird escaping. I don't think its exploitable, but its a bit sketchy.
  • "ReplaceTextJob.php" line 39 might handle colons incorrectly. Should probably use makeTitleSafe + error handling for invalid page names. [However it may be hard to trigger this because of how code in Special:ReplaceText works]
  • There's also T142314 - Mostly irrelavent now due to being out of date, and ReplaceText isn't actually vulnerable. Might make sense to check patterns for null bytea and for /'s that aren't preceeding by a \, just as a paranoia measure.

No action taken:

  • This won't scale to large wikis due as the "search" part of the process is not done as part of the job (I think this is a well known fact, and replace text doesn't intend to scale to really big wikis, I'm just stating for the record)
  • It should of course be noted for the record if we're having a security discussion about this tool, that its a very powerful tool, that in the wrong hands can mass vandalize your wiki. (That kind of goes without saying of course)

Event Timeline

Bawolff set Security to Software security bug.Apr 10 2018, 8:59 AM
Bawolff added a project: acl*security.
Bawolff changed the visibility from "Public (No Login Required)" to "Custom Policy".

Overall thoughts - The CSRF issue is blocking bundling it with core, although that is a pretty easy fix.

Not security, but an argument could be made that we should convert to ResourceLoader before bundling, as people may use bundled extensions as examples, so we want it to follow best practise for JS. I also would like to see there be at least a warning on Special:ReplaceText if external storage or compressed revisions is setup.

Major issues

  • Seems to have no CSRF check. Given the potential destructive nature of the tool, this is rather serious.

Other comments & minor issues

Just my thoughts going through the extension:

  • Maybe $wgReplaceTextUser should be added to the UserGetReservedNames hook so user's can't register that name for themselves.
  • It should use resource loader instead of inline javascript and $out->addScriptFile()
  • The job should maybe use RequestContext::importScopedSession so that it is more compatible with things like CheckUser
  • "ReplaceTextJob.php" line 39 might handle colons incorrectly. Should probably use makeTitleSafe + error handling for invalid page names. [However it may be hard to trigger this because of how code in Special:ReplaceText works]
  • "replaceAll.php" line 178 - summary should probably just be a ->plain() msg with normal parameters.
  • This won't scale to large wikis due as the "search" part of the process is not done as part of the job (I think this is a well known fact, and replace text doesn't intend to scale to really big wikis, I'm just stating for the record)
  • This won't work with $wgCompressRevisions or $wgExternalStores. Which is also fairly well known. However, ideally there would be some sort of error message for that case. It could perhaps be worked around (In a really slow way), by having it consider anything that matches 'old_flags LIKE \'%object%\' OR old_flags LIKE \'%external%\'' as matching the regex, and then have the job actually determine if they match, but then the preview step wouldn't work.
  • The code for Special:ReplaceText is a bit hard to follow in parts. It maybe could benefit from splitting long functions into smaller functions.
  • There's also T142314 - Mostly irrelavent now due to being out of date, and ReplaceText isn't actually vulnerable. Might make sense to check patterns for null bytea and for /'s that aren't preceeding by a \, just as a paranoia measure.
  • "SpecialReplaceText.php" line 603 - Has weird escaping. I don't think its exploitable, but its a bit sketchy.
  • The extension sets the ungreedy and unicode flags for regexes in PHP but does not set it mariadb. Starting in 1.10.11 this can be set using https://mariadb.com/kb/en/library/server-system-variables/#default_regex_flags . Prior to that it would always be ungreedy (and prior to 1.10.5 mariadb didn't even use pcre). Most of the time, this only controls how the pattern matches not what the pattern matches, but if you use advanced pcre features the ungreedy flag could potentially change whether the regex matches at all.
  • It should of course be noted for the record if we're having a security discussion about this tool, that its a very powerful tool, that in the wrong hands can mass vandalize your wiki. (That kind of goes without saying of course)

@Legoktm, the issue noted in T191919 (that's a great task number) is also referenced above.

I intend to tackle at least some of the other issues as well.

Bawolff changed the visibility from "Custom Policy" to "Public (No Login Required)".

[Making this public again now that the CSRF issue is fixed]

Change 425734 had a related patch set uploaded (by Cicalese; owner: Cicalese):
[mediawiki/extensions/ReplaceText@master] Fix items from code review

https://gerrit.wikimedia.org/r/425734

Change 425734 merged by jenkins-bot:
[mediawiki/extensions/ReplaceText@master] Fix items from code review

https://gerrit.wikimedia.org/r/425734

@Bawolff - I actually didn't know that Replace Text won't work with "$wgExternalStores = true". (And I don't understand what that setting does, having read through its explanation.) Are you sure about this? If so, I'll add a check for it.

Also, what specifically was the line in SpecialReplaceText.php with weird escaping? The code has been modified quite a bit recently, so now I'm sure which line you're talking about. Was it somewhere in this piece of code?

if ( $use_regex ) {
        $targetStr = "/$target/Uu";
} else {
        $targetq = preg_quote( $this->convertWhiteSpaceToHTML( $target ), '/' );
        $targetStr = "/$targetq/i";
}

wgExternalStore causes the revisions to be stored in a different db, so you would no longer be able to query them from the text table as they wouldnt be in the text table anymore (more generally it doesnt matter so much what the current value of wgExternalStore or wgCompressRevisions is, but if they were true at any time in the past or if the user has ever run recompressOld.php). The text table has an old_flags field. If the flag object or external is set, you probably cannot use LIKE queries against that entry.

For the regex - it appears to actually be safe - it would be nicer if escaping happened after the search and replace as that would make it easier to see that it is safe, although it might be hard to rewrite the code that way.

e

CCicalese_WMF updated the task description. (Show Details)

@Bawolff would you be able to re-review now that the major issues are addressed, please?

Sorry for the delay.

The CSRF fix is definitely good, and that was the only thing actually blocking this.

For 6258d3300a781 - LinkRenderer::link expects its parameter to be not already be Html escaped (vs Linker::linkKnown which assumes you have previously already run htmlspecialchars() on the link text. So using the two interchangably like that isn't quite right. So In ReplaceTextUtils::link, the line should probably be return $linkRenderer->makeLink( $title, new HtmlArmor( $text ) ); since everything calling it has been escaping the link text prior to the call.

For the javascript, it would be cool if inline event handlers could also be removed (the onclick in line 638 of SpecialReplaceText.php) as that sort of thing is now discouraged in js, and so that it will be compatible with CSP when/if that ever lands.

Everything else looks good.

Change 429207 had a related patch set uploaded (by Cicalese; owner: Cicalese):
[mediawiki/extensions/ReplaceText@master] Remove inline JavaScript

https://gerrit.wikimedia.org/r/429207

Change 429207 merged by jenkins-bot:
[mediawiki/extensions/ReplaceText@master] Remove inline JavaScript

https://gerrit.wikimedia.org/r/429207

Change 430252 had a related patch set uploaded (by Cicalese; owner: Cicalese):
[mediawiki/extensions/ReplaceText@REL1_31] Remove inline JavaScript

https://gerrit.wikimedia.org/r/430252

Change 430252 merged by jenkins-bot:
[mediawiki/extensions/ReplaceText@REL1_31] Remove inline JavaScript

https://gerrit.wikimedia.org/r/430252

Change 430258 had a related patch set uploaded (by Cicalese; owner: Cicalese):
[mediawiki/extensions/ReplaceText@master] Added HtmlArmor.

https://gerrit.wikimedia.org/r/430258

Change 430258 merged by jenkins-bot:
[mediawiki/extensions/ReplaceText@master] Added HtmlArmor.

https://gerrit.wikimedia.org/r/430258

Change 430401 had a related patch set uploaded (by Cicalese; owner: Cicalese):
[mediawiki/extensions/ReplaceText@REL1_31] Added HtmlArmor.

https://gerrit.wikimedia.org/r/430401

Change 430401 merged by jenkins-bot:
[mediawiki/extensions/ReplaceText@REL1_31] Added HtmlArmor.

https://gerrit.wikimedia.org/r/430401