Page MenuHomePhabricator

wfParseUrl() incorrectly parses hostnames in older PHP versions and HHVM due to bug in parse_url() (CVE-2016-10397 [PHP])
Closed, ResolvedPublic

Description

As reported on pentest:

The following js:

$.post( 'https://en.wikipedia.beta.wmflabs.org/w/api.php?action=shortenurl&format=json&url=https%3A%2F%2Fgoogle.com/?en%2Ewikipedia%2Ebeta%2Ewmflabs%2Eorg%2Fwiki%2Ffoo' );

Creates a short url, despite the fact the domain is incorrect.

Event Timeline

Debugging this, it appears the cause is related to how hosts are split by wfParseUrl().

on php7.0

> var_dump( wfParseUrl( 'https://google.com?en.wikipedia.beta.wmflabs.org/wiki/foo' ) );
array(4) {
  ["scheme"]=>
  string(5) "https"
  ["host"]=>
  string(10) "google.com"
  ["query"]=>
  string(38) "en.wikipedia.beta.wmflabs.org/wiki/foo"
  ["delimiter"]=>
  string(3) "://"
}

But on hhvm wmf servers:

> var_dump( wfParseUrl( 'https://google.com?en.wikipedia.beta.wmflabs.org/wiki/foo' ) );
array(4) {
  ["scheme"]=>
  string(5) "https"
  ["host"]=>
  string(40) "google.com?en.wikipedia.beta.wmflabs.org"
  ["path"]=>
  string(9) "/wiki/foo"
  ["delimiter"]=>
  string(3) "://"
}

Easiest fix (esp. since hhvm is going away) would be to change the domain whitelist entries from (.*\.)?wikipedia\.beta\.wmflabs\.org to something like ([a-zA-Z0-9.-]*\.)?wikipedia\.beta\.wmflabs\.org

According to https://3v4l.org/ko84p PHP only has the correct behavior in 5.6.28+. It was backported in https://bugs.php.net/bug.php?id=73192 as a security fix. Do we need to look at a core workaround for pre-1.31 branches that required PHP 5.5.9+?

I thought HHVM had moved back to the PHP 7 implementation of parse_url (c.f. https://github.com/facebook/hhvm/issues/8104#issuecomment-360226070) but I guess not.

For the easy fix, I imagine that'd be much simpler and safer to do within:

gerrit.wikimedia.org/g/mediawiki/extensions/UrlShortener/+/c79e331691aa00b137bb55a34090aeb60edaeb51/includes/UrlShortenerUtils.php#231

Er, rather: https://gerrit.wikimedia.org/g/operations/mediawiki-config/+/f8a92cd6dccaed15ee927df7d420adc65c66a43f/wmf-config/CommonSettings-labs.php#116

as opposed to wfParseUrl, which sees heavy use by a lot of things, according to codesearch.

According to https://3v4l.org/ko84p PHP only has the correct behavior in 5.6.28+. It was backported in https://bugs.php.net/bug.php?id=73192 as a security fix. Do we need to look at a core workaround for pre-1.31 branches that required PHP 5.5.9+?

PHP 7.0.0–7.0.12, which we currently support in master, also show the incorrect behavior.

Patches for wfParseUrl():

  • Versions without PHP5 support:
  • Versions with PHP5 support:

Test: https://3v4l.org/9SFUa

For the easy fix, I imagine that'd be much simpler and safer to do within:

gerrit.wikimedia.org/g/mediawiki/extensions/UrlShortener/+/c79e331691aa00b137bb55a34090aeb60edaeb51/includes/UrlShortenerUtils.php#231

Er, rather: https://gerrit.wikimedia.org/g/operations/mediawiki-config/+/f8a92cd6dccaed15ee927df7d420adc65c66a43f/wmf-config/CommonSettings-labs.php#116

I think we have to fix this in core, given the usage of wfParseUrl in other security related contexts, including ContentSecurityPolicy, $wgNoFollowDomainExceptions, bypassing Special:LinkSearch, E:SecureLinkFixer, ...

as opposed to wfParseUrl, which sees heavy use by a lot of things, according to codesearch.

Given that we'd just be applying an upstream PHP fix, I think it should be relatively safe.

@Legoktm Yep, makes sense and @Anomie's patches above look good in that respect.

Legoktm renamed this task from UrlShortener doesn't correctly validate url whitelists to wfParseUrl() incorrectly parses hostnames in older PHP versions and HHVM due to bug in parse_url().Dec 17 2018, 11:50 PM
Legoktm added a subscriber: Smalyshev.

Patches for wfParseUrl():

  • Versions without PHP5 support:
  • Versions with PHP5 support:

Test: https://3v4l.org/9SFUa

Tested both, +2. I think the commit message should mention that this was already fixed in https://bugs.php.net/bug.php?id=73192 and this is just for specific older versions.

I also discussed this issue with @Smalyshev on IRC earlier, and he noted that there are other issues with parse_url - he's sent an email to @Anomie and I with some more details. Once this security issue is dealt with, we should look at what other bug fixes have been made to parse_url upstream and whether they're important enough for us to work around for older versions.

chasemp triaged this task as Medium priority.Dec 9 2019, 4:40 PM
chasemp added a project: Security-Team.

@Legoktm @Anomie - looks like this task fell off the radar. Given where production is at with PHP7, I'm guessing the non-php5 patch from T212067#4828683 is no longer relevant for a security deploy. Does it still make sense to backport these to supported release branches?

Yeah, we no longer support any versions of PHP affected by this issue. But including the added unit tests in master would still be a good idea. I have a patch for that prepared, once I have the all-clear to put it in Gerrit.

As for supported releases:

  • 1.34 only supports unaffected versions of PHP.
  • 1.31–1.33 support PHP 7.0.13+, which is not affected, and HHVM 3.18.5, which is. They also state "Although HHVM 3.18.5 or later is supported, it is generally advised to use PHP 7.0.13 or later for long term support".

Yeah, we no longer support any versions of PHP affected by this issue. But including the added unit tests in master would still be a good idea. I have a patch for that prepared, once I have the all-clear to put it in Gerrit.

If you're waiting on a thumbs up from the Security-Team, this is fine by me. I can +2 if needed.

As for supported releases:

  • 1.31–1.33 support PHP 7.0.13+, which is not affected, and HHVM 3.18.5, which is. They also state "Although HHVM 3.18.5 or later is supported, it is generally advised to use PHP 7.0.13 or later for long term support".

I'd guess we don't really have a good idea of how many people are actually running MW 1.31-1.33 with HHVM 3.18.5. Still, if the patch picks cleanly to those branches, we should probably expend the effort - I'm happy to try. And to confirm - the conditional in the patch (if ( isset( $bits['host'] ) && strpos( $bits['host'], '?' ) !== false )) works so that anything PHP 7.0.13+ would be unaffected.

Yeah, we no longer support any versions of PHP affected by this issue. But including the added unit tests in master would still be a good idea. I have a patch for that prepared, once I have the all-clear to put it in Gerrit.

If you're waiting on a thumbs up from the Security-Team, this is fine by me. I can +1/+2 if needed.

Ok, I'll push it to gerrit probably sometime this week.

And to confirm - the conditional in the patch (if ( isset( $bits['host'] ) && strpos( $bits['host'], '?' ) !== false )) works so that anything PHP 7.0.13+ would be unaffected.

Correct.

Reedy subscribed.

0001-SECURITY-Work-around-PHP-bug-in-parse_url.patch applies cleanly to 1.31, 1.32 and 1.33

Yeah, we no longer support any versions of PHP affected by this issue. But including the added unit tests in master would still be a good idea. I have a patch for that prepared, once I have the all-clear to put it in Gerrit.

If you're waiting on a thumbs up from the Security-Team, this is fine by me. I can +1/+2 if needed.

Ok, I'll push it to gerrit probably sometime this week.

Now up as https://gerrit.wikimedia.org/r/#/c/mediawiki/core/+/557062

0001-SECURITY-Work-around-PHP-bug-in-parse_url.patch applies cleanly to 1.31, 1.32 and 1.33

Ok, so I see the tests for master were merged. I assume the rest (the original patch with tests) will be backported to the other supported release branches with the security release next week. I'm trying to decide if this warrants a CVE on our end since it was technically a workaround for something broken within PHP/HHVM.

0001-SECURITY-Work-around-PHP-bug-in-parse_url.patch applies cleanly to 1.31, 1.32 and 1.33

Ok, so I see the tests for master were merged. I assume the rest (the original patch with tests) will be backported to the other supported release branches with the security release next week. I'm trying to decide if this warrants a CVE on our end since it was technically a workaround for something broken within PHP/HHVM.

Yeah, I've backported the tests to REL1_34 for completeness, will do the other patches this week as part of the release

According to https://3v4l.org/ko84p PHP only has the correct behavior in 5.6.28+. It was backported in https://bugs.php.net/bug.php?id=73192 as a security fix. Do we need to look at a core workaround for pre-1.31 branches that required PHP 5.5.9+?

I thought HHVM had moved back to the PHP 7 implementation of parse_url (c.f. https://github.com/facebook/hhvm/issues/8104#issuecomment-360226070) but I guess not.

^ CVE's were discussed/mentioned upstream, but never assigned. I don't think it's necessarily worth assigning one ourselves, but certainly at least point upstream

Reedy assigned this task to Anomie.
Reedy changed the visibility from "Custom Policy" to "Public (No Login Required)".

Change 559480 merged by jenkins-bot:
[mediawiki/core@REL1_31] SECURITY: Work around PHP bug in parse_url

https://gerrit.wikimedia.org/r/559480

Change 559483 merged by jenkins-bot:
[mediawiki/core@REL1_32] SECURITY: Work around PHP bug in parse_url

https://gerrit.wikimedia.org/r/559483

Change 559485 merged by jenkins-bot:
[mediawiki/core@REL1_33] SECURITY: Work around PHP bug in parse_url

https://gerrit.wikimedia.org/r/559485

fyi, the issue in PHP (Sec Bug #73192) was assigned CVE-2016-10397. Seems like the Bug was never updated accordingly.

https://nvd.nist.gov/vuln/detail/CVE-2016-10397

sbassett renamed this task from wfParseUrl() incorrectly parses hostnames in older PHP versions and HHVM due to bug in parse_url() to wfParseUrl() incorrectly parses hostnames in older PHP versions and HHVM due to bug in parse_url() (CVE-2016-10397 [PHP]).Dec 30 2019, 11:40 PM