Page MenuHomePhabricator

PHP 8: libxml_disable_entity_loader() is deprecated
Open, Needs TriagePublic

Description

libxml_disable_entity_loader() calls are used in MediaWiki and dependencies to prevent XXE injection vulnerabilities when processing arbitrary XML code. In PHP 8 and above, PHP uses a newer version of libxml that disables XXE loading by default, and PHP now raises a deprecation message when this function is called:

Deprecated: Function libxml_disable_entity_loader() is deprecated in ... on line ...

MediaWiki code and dependencies' code will need to be updated, probably by making this call conditional on PHP/libxml version.

Event Timeline

Most of them only exist to stop false positive reports

Change 651985 had a related patch set uploaded (by Reedy; owner: Reedy):
[mediawiki/core@master] Wrap libxml_(dis|en)able_entity_loader() calls in version constraint

https://gerrit.wikimedia.org/r/651985

Change 651987 had a related patch set uploaded (by Reedy; owner: Reedy):
[mediawiki/libs/XMPReader@master] XML related cleanup

https://gerrit.wikimedia.org/r/651987

Change 651988 had a related patch set uploaded (by Reedy; owner: Reedy):
[mediawiki/extensions/CommonsMetadata@master] Wrap libxml_disable_entity_loader() calls in version constraint

https://gerrit.wikimedia.org/r/651988

Change 651989 had a related patch set uploaded (by Reedy; owner: Reedy):
[mediawiki/extensions/Flow@master] Wrap libxml_disable_entity_loader() calls in version constraint

https://gerrit.wikimedia.org/r/651989

Change 651990 had a related patch set uploaded (by Reedy; owner: Reedy):
[mediawiki/extensions/GWToolset@master] Wrap libxml_disable_entity_loader() calls in version constraint

https://gerrit.wikimedia.org/r/651990

Change 651991 had a related patch set uploaded (by Reedy; owner: Reedy):
[mediawiki/extensions/RSS@master] Wrap libxml_disable_entity_loader() calls in version constraint

https://gerrit.wikimedia.org/r/651991

Change 651991 merged by jenkins-bot:
[mediawiki/extensions/RSS@master] Wrap libxml_disable_entity_loader() calls in version constraint

https://gerrit.wikimedia.org/r/651991

Change 651990 merged by jenkins-bot:
[mediawiki/extensions/GWToolset@master] Wrap libxml_disable_entity_loader() calls in version constraint

https://gerrit.wikimedia.org/r/651990

Change 651988 merged by jenkins-bot:
[mediawiki/extensions/CommonsMetadata@master] Wrap libxml_disable_entity_loader() calls in version constraint

https://gerrit.wikimedia.org/r/651988

Change 651996 had a related patch set uploaded (by MaxSem; owner: MaxSem):
[HtmlFormatter@master] PHP 8 compatibility

https://gerrit.wikimedia.org/r/651996

Change 651996 merged by jenkins-bot:
[HtmlFormatter@master] PHP 8 compatibility

https://gerrit.wikimedia.org/r/651996

Change 651989 merged by jenkins-bot:
[mediawiki/extensions/Flow@master] Wrap libxml_disable_entity_loader() calls in version constraint

https://gerrit.wikimedia.org/r/651989

Change 651987 merged by jenkins-bot:
[mediawiki/libs/XMPReader@master] XML related cleanup

https://gerrit.wikimedia.org/r/651987

What do we do when/where we want it enabled... If it's disabled by default, and calling the function is deprecated...

Change 653975 had a related patch set uploaded (by Reedy; owner: Reedy):
[mediawiki/core@master] Wrap libxml_disable_entity_loader() calls in version constraint

https://gerrit.wikimedia.org/r/653975

Change 656915 had a related patch set uploaded (by Reedy; owner: Reedy):
[mediawiki/core@REL1_35] Wrap libxml_disable_entity_loader() calls in version constraint

https://gerrit.wikimedia.org/r/656915

Change 651985 merged by jenkins-bot:
[mediawiki/core@master] Wrap libxml_disable_entity_loader() calls in version constraint

https://gerrit.wikimedia.org/r/651985

Change 656915 merged by jenkins-bot:
[mediawiki/core@REL1_35] Wrap libxml_disable_entity_loader() calls in version constraint

https://gerrit.wikimedia.org/r/656915

I'm confused by https://gerrit.wikimedia.org/r/c/mediawiki/core/+/653975

Using LIBXML_VERSION causes a test failure, PHP_VERSION does not.

PHP 7.4.3 locally, libxml2 2.9.10...

$ php -a
Interactive mode enabled

php > echo LIBXML_VERSION;
20910
php > var_dump( libxml_disable_entity_loader( true ) );
php shell code:1:
bool(false)
php >

So it was not disabled by default?

https://www.php.net/manual/en/function.libxml-disable-entity-loader.php

Disable/enable the ability to load external entities. Note that disabling the loading of external entities may cause general issues with loading XML documents. However, as of libxml 2.9.0 entity substitution is disabled by default, so there is no need to disable the loading of external entities, unless there is the need to resolve internal entity references with LIBXML_NOENT. Generally, it is preferable to use libxml_set_external_entity_loader() to suppress loading of external entities.

Disable (true) or enable (false) libxml extensions (such as DOM, XMLWriter and XMLReader) to load external entities.

It seems it's not disabled by default on those versions? Or some ubuntu weirdness? Or am I missing something?

https://ubuntu.com/security/CVE-2013-0339
https://gitlab.gnome.org/GNOME/libxml2/commit/4629ee02ac649c27f9c0cf98ba017c6b5526070f

3v4l seems to agree - https://3v4l.org/pObcb

For input

<?php

var_dump( LIBXML_VERSION );
var_dump( libxml_disable_entity_loader( true ) );

Noting also, most documentation seems to be suggesting that

https://php.watch/versions/8.0/libxml_disable_entity_loader-deprecation LIBXML_VERSION < 20900 and PHP_VERSION_ID < 80000 should mostly have the same effect..

Per my gerrit comment, just removing the call will result in an XXE if entity substition is enabled. My test case is:

$reader = new XMLReader;
$reader->XML('<?xml version="1.0" encoding="ISO-8859-1"?>
<!DOCTYPE foo [
  <!ELEMENT foo ANY >
  <!ENTITY xxe SYSTEM "file:///etc/lsb-release" >]>
<foo>&xxe;</foo>', null, LIBXML_NOERROR | LIBXML_NOWARNING );
$reader->setParserProperty( XMLReader::SUBST_ENTITIES, true );
$reader->read();
$reader->read();
print $reader->readString();

In my testing, using an http/https URL causes an actual HTTP request to take place, although the results don't seem to end up in the output.

Any callers that use entity substitution will also need to do

libxml_set_external_entity_loader(
    function () {
        return null;
    }
);

That is a pretty close equivalent to libxml_disable_entity_loader(). But note that that is global, so the hander may need to be restored afterwards. I've confirmed that with such a call, it's still possible to define and substitute entities declared as string literals.

Change 658936 had a related patch set uploaded (by Reedy; owner: Reedy):
[mediawiki/core@master] Revert "Wrap libxml_disable_entity_loader() calls in version constraint"

https://gerrit.wikimedia.org/r/658936

Change 658937 had a related patch set uploaded (by Reedy; owner: Reedy):
[mediawiki/core@REL1_35] Revert "Wrap libxml_disable_entity_loader() calls in version constraint"

https://gerrit.wikimedia.org/r/658937

Change 658937 merged by jenkins-bot:
[mediawiki/core@REL1_35] Revert "Wrap libxml_disable_entity_loader() calls in version constraint"

https://gerrit.wikimedia.org/r/658937

Change 658936 merged by jenkins-bot:
[mediawiki/core@master] Revert "Wrap libxml_disable_entity_loader() calls in version constraint"

https://gerrit.wikimedia.org/r/658936

Should we introduce some wfLibXmlDisableEntityLoader helper function that does the appropriate thing depending on PHP / libxml versions?

There is also a test in mediawiki/core which is failing when the libxml_disable_entity_loader call is not done due to deprecation (as provided by the reverted patch sets).

1) UploadBaseTest::testCheckSvgScriptCallback with data set #39 ('<?xml version="1.0"?> <!DOCTY...</svg>', false, false, 'SVG with data: uri external entity')
SVG with data: uri external entity (filter match)
Failed asserting that true is identical to false.

It works with libxml_set_external_entity_loader, but there is no way to restore the loader - https://bugs.php.net/bug.php?id=76763

There is also a test in mediawiki/core which is failing when the libxml_disable_entity_loader call is not done due to deprecation (as provided by the reverted patch sets).

1) UploadBaseTest::testCheckSvgScriptCallback with data set #39 ('<?xml version="1.0"?> <!DOCTY...</svg>', false, false, 'SVG with data: uri external entity')
SVG with data: uri external entity (filter match)
Failed asserting that true is identical to false.

It works with libxml_set_external_entity_loader, but there is no way to restore the loader - https://bugs.php.net/bug.php?id=76763

Aha. Thanks for the Upstream bug; glad it wasn't just me that was confused.

And yeah, that test failure is part of the reason I stopped with https://gerrit.wikimedia.org/r/c/mediawiki/core/+/653975 (and Tim left CR on the patch and the comments here), and I reverted the other conditional adding patch out of an abundance of caution (rather than potentially regressing the XXE issue)

Some libs still using the version check - https://codesearch.wmcloud.org/search/?q=LIBXML_VERSION&i=nope&files=&excludeFiles=&repos=

libs / XMPReader
HtmlFormatter

and some extensions are also using it.

So it's not immediately an issue in all cases. Per Tim, cases where entity substitution is used/relied on is potentially problematic. And as such, probably the defined/expected behaviour in various places is very much YMMV, though in most cases we do want to be preventing external inclusion of "things" anyway (there are some exceptions where it explicitly turns it on instead of off).

Core had a few more "interesting" use cases

The XMPReader bump hasn't been brought into MW yet (patches are in Gerrit to do that though), the HtmlFormatter one has...

When using libxml_set_external_entity_loader returning null than SuiteDirectoryTest.php fails

There was 1 error:

1) SuiteDirectoryTest::testSuiteXmlDirectories
Error: Call to a member function getElementsByTagName() on null

/workspace/src/tests/phpunit/structure/SuiteDirectoryTest.php:16
/workspace/src/tests/phpunit/phpunit.php:75
/workspace/src/maintenance/doMaintenance.php:106
/workspace/src/tests/phpunit/phpunit.php:134

Not sure what for an error happens here

I am not sure if that is okay, but it works:

		libxml_set_external_entity_loader( function ( $public, $system, $context ) {
			if ( is_file( $system ) ) {
				return $system;
			}
			return null;
		} );

I am not sure if that is okay, but it works:

		libxml_set_external_entity_loader( function ( $public, $system, $context ) {
			if ( is_file( $system ) ) {
				return $system;
			}
			return null;
		} );

There is already a call to that function - https://gerrit.wikimedia.org/g/mediawiki/core/+/d7decde5f583960122ca22902d5c4717ccc41792/tests/phpunit/maintenance/DumpTestCase.php#181