Page MenuHomePhabricator

Security Review For EventStreamConfig extension
Closed, ResolvedPublic

Description

Project Information

Description of the tool/project

See README.

Description of how the tool will be used at WMF

  • The EventLogging extension will use EventStreamConfig with ResourceLoader to load configs for streams used on certain pages to dynamically configure client stream settings, like sample rate.
  • Mobile apps will use the API endpoint to dynamically configure client stream settings like sample rate.
  • EventGate event intake service(s) use this to ensure that only events of a specific schema title are allowed into a stream.
  • EventBus and other server side event producers use this to figure out which event intake service a given stream should be produced to.

Dependencies

None.

Has this project been reviewed before?

https://gerrit.wikimedia.org/r/c/mediawiki/extensions/EventStreamConfig/+/545654

Working test environment

In mediawiki-vagrant, clone this extension into mediawiki/extensions. Edit LocalSettings.php and add:

wfLoadExtension('EventStreamConfig');


$wgEventStreams = [
	[
		'stream' => 'test.event',
		'schema_title' => 'test/event',
		'sample_rate' => 0.15,
		'destination' => 'http://10.11.12.13:8192/v1/events',
	],
	[
		'stream' => '/^mediawiki\.job\..+/',
		'schema_title' => 'mediawiki/job',
		'sample_rate' => 0.8,
		'EventServiceName' => 'eventgate-main',
	],
];

You should be able to get stream config from the API endpoint:

curl 'http://dev.wiki.local.wmftest.net:8080/w/api.php?action=streamconfigs&format=json'
curl 'http://dev.wiki.local.wmftest.net:8080/w/api.php?action=streamconfigs&format=json&streams=mediawiki.job.test'

Post-deployment

Analytics Engineering - Andrew Otto

Details

Related Gerrit Patches:
mediawiki/extensions/EventStreamConfig : masterAdd comment about keeping regexes simple in README

Event Timeline

Ottomata created this task.Jan 7 2020, 3:32 PM
Restricted Application added subscribers: Masumrezarock100, Aklapper. · View Herald TranscriptJan 7 2020, 3:32 PM
chasemp moved this task from Back Orders to Incoming on the Security Readiness Reviews board.
Jcross triaged this task as Low priority.Jan 7 2020, 6:27 PM
Ottomata moved this task from Backlog to Next Up on the Event-Platform board.Jan 7 2020, 8:42 PM

Bump! One of our (O)KRs/goals this quarter is to deploy an EventLogging usage using Modern Event Platform components, including this one. There are 3 or 4 moving parts to this, and this one blocks the others. We'd love to get this reviewed and then deplyed as soon as possible this quarter.

Ottomata updated the task description. (Show Details)Thu, Jan 23, 7:41 PM
Ottomata added subscribers: sbassett, Reedy.

@sbassett @Reedy is there anything I can do to help move this along?

Ottomata moved this task from Next Up to In Code Review on the Analytics-Kanban board.

Hey @Ottomata - this is still in our backlog, so it probably won't be reviewed until after All-hands. I did have a quick glance at the code and did not see anything outrageous, so I imagine that, once started, it will be a quick review.

Great, thanks!

sbassett claimed this task.Mon, Feb 3, 5:11 PM
sbassett added a project: user-sbassett.
sbassett moved this task from Incoming to In Progress on the Security Readiness Reviews board.
sbassett moved this task from Backlog to In Progress on the user-sbassett board.

I see this is In Progress...how goes?! :)

@Ottomata - hope to have results by EOD tomorrow (2019-02-07).

Security Review Summary - T242124 - 2020-02-07
Overall, this extension looks fine from a security perspective. See below for a handful of minor issues.

Vulnerable Packages
As reported by snyk test:

n.b.: both are for the dev dependency stylelint-config-wikimedia@0.8.0, overall risk within this context: [low]

  1. dot-prop@4.2.0, Prototype Pollution (medium risk)
  2. kind-of@6.0.2, Information Disclosure (low risk)
    • Introduced by stylelint-config-wikimedia@0.8.0 > stylelint@12.0.0 > global-modules@2.0.0 > global-prefix@3.0.0 > kind-of@6.0.2 and 44 other path(s). This issue was fixed in versions: 6.0.3. See also: https://snyk.io/vuln/SNYK-JS-KINDOF-537849.

General Security Issues

  1. I assume $wgEventStreams will always be highly-trusted config and intended to be publicly-exposed within some *Settings.php file. And that we're fine exposing things like the the sample-rate, destination (private IPs?) and any other key/value which might eventually reside here. It'd be nice if INTERNAL_SETTINGS within includes/StreamConfig.php was an allow-list instead of a deny-list, though I understand if that's not possible due to the extension needing to support any number of arbitrary key/value config items. [risk: low]
  2. includes/StreamConfig.php - lines 100, 111, 138 - using regular expression patterns dervied from dynamic data is always a bit risky, even if coming from trusted config. The validation happening within isValidRegex() seems reasonable, though it doesn't support the complete set of PHP's supported delimiters and the code is not preg_quoting the pattern either within matches() or isValidRegex(). Or checking for potential ReDOSes and similar vulnerabilities, though mitigating against that can be extremely difficult or even impossible, until something like T240884 is implemented. At the very least, it might be a good idea to note these limitations within the README.md and/or mediawiki.org doc. [risk: low]
sbassett moved this task from In Progress to Waiting on the user-sbassett board.

Change 570951 had a related patch set uploaded (by Ottomata; owner: Ottomata):
[mediawiki/extensions/EventStreamConfig@master] Add comment about keeping regexes simple in README

https://gerrit.wikimedia.org/r/570951

Thanks @sbassett!

Vulnerable Packages

Since both of these are from stylelint-config-wikimedia@0.8.0, and that is the latest version, I imagine the maintainer of that should be notified to update dependencies? Is there a change I should make in EventStreamConfig?

I assume $wgEventStreams will always be highly-trusted config

yes

And that we're fine exposing things like the the sample-rate, destination (private IPs?) and any other

There aren't really 'private IPs' here, but there may be internal discovery or LVS FQDN endpoints that would be in the config. These are public anyway, as mediawiki-config itself is a public repository.

the code is not preg_quoting the pattern either within matches() or isValidRegex()

Not sure I understand, wouldn't calling preg_quote on the pattern make it a different pattern? We want e.g "/mediawiki.job\..*/" to match "mediawiki.job.custom". preg_quote("/mediawiki.job\..*/") -> "/mediawiki\.job\\\.\.\*/" which does not match.

Or checking for potential ReDOSes and similar vulnerabilities

We really only need a very limited set of regexes I think. If there was an easy way to restrict the possible regex patterns to safe one's I'm happy to do that. Is there in PHP?

lines 100, 111, 138 - using regular expression patterns dervied from dynamic data is always a bit risky, even if coming from trusted config

These will come directly from mediawiki-config only which will generally only use static string patterns. I suppose one could dynamically generate the config and possibly patterns in a mediawiki-config .php file, but I'd hope someone wouldn't do that. :)

At the very least, it might be a good idea to note these limitations within the README.md

I added https://gerrit.wikimedia.org/r/c/mediawiki/extensions/EventStreamConfig/+/570951/1/README.md.

@sbassett are there other action items for me?

Change 570951 merged by Ottomata:
[mediawiki/extensions/EventStreamConfig@master] Add comment about keeping regexes simple in README

https://gerrit.wikimedia.org/r/570951

Since both of these are from stylelint-config-wikimedia@0.8.0, and that is the latest version, I imagine the maintainer of that should be notified to update dependencies? Is there a change I should make in EventStreamConfig?

Yep, I can do that, I just prefer to call these out in reviews as well.

There aren't really 'private IPs' here, but there may be internal discovery or LVS FQDN endpoints that would be in the config. These are public anyway, as mediawiki-config itself is a public repository.

Ok. I know we're generally fine with exposing these sorts of things, so that's fair.

the code is not preg_quoting the pattern either within matches() or isValidRegex()

Not sure I understand, wouldn't calling preg_quote on the pattern make it a different pattern? We want e.g "/mediawiki.job\..*/" to match "mediawiki.job.custom". preg_quote("/mediawiki.job\..*/") -> "/mediawiki\.job\\\.\.\*/" which does not match.

In this case it would, so it wouldn't be appropriate since it's expected that the config is trustable and contains already-valid patterns. If that were ever not to be the case, we'd want to pursue further hardening along these lines.

We really only need a very limited set of regexes I think. If there was an easy way to restrict the possible regex patterns to safe one's I'm happy to do that. Is there in PHP?

There are probably a couple of simple ways to harden isValidRegex() a bit more, either with a list of allowed characters for the pattern (even eliminating the use of [multiple] parentheses or certain quantifiers would be good, if possible) and/or checking for some basic ReDOS patterns (possibly overkill).

I added https://gerrit.wikimedia.org/r/c/mediawiki/extensions/EventStreamConfig/+/570951/1/README.md.

Looks good, thanks.

@sbassett are there other action items for me?

No, unless you wanted to take a crack at improving isValidRegex() as suggested above, though given the low risk, that certainly wouldn't be a blocker.

Ok, great! If you want to take a crack at improving isValidRegex() please do! But either way, since this is not a blocker, I'll ask the MW folks to move forward with deployment. THANK YOU!

Ottomata moved this task from Next Up to Done on the Event-Platform board.Sat, Feb 8, 1:58 AM

AH SORRY, somehow on Friday I read ', unless you wanted to take a crack at improving' as 'unless you wanted _me_ to ...'. Sorry about putting it back on you. I kinda rather not support all of PHP's valid delimiters; there's no reason to have a '/' in the stream name pattern. Delimiting with / is good and pretty standard.

sbassett added a comment.EditedMon, Feb 10, 3:45 PM

Heh, no problem, I thought that was kinda funny. Anyhow, I think the README.md clarification is helpful and that most folks editing/pushing $wgEventStreams to production should know what they're doing. So the most I was going to suggest (which is admittedly a little paranoid) at this time was adding a config like $wgEnforceSimpleRegExp which would add another validator inside of isValidRegex(), maybe something like:

$s = str_split( $string );
array_walk( $s, function( &$item ) {
  if( stripos( "abcdefghijklmnopqrstuvwxyz0123456789.+*^$/\\", $item ) === false ) {
    throw new Exception("Invalid pattern chars in wgEnforceSimpleRegExp mode");
  }
});
sbassett closed this task as Resolved.Mon, Feb 10, 3:53 PM
sbassett moved this task from Waiting to Our Part Is Done on the Security Readiness Reviews board.
sbassett moved this task from Waiting to Done on the user-sbassett board.
Ottomata added a comment.EditedMon, Feb 10, 3:55 PM

Actually, something like that for stream name validation in general would be good. Stream names must follow the same rules as Kafka topics. I'll add something...