Page MenuHomePhabricator

PHP 8: libxml_disable_entity_loader() is deprecated
Closed, ResolvedPublic

Description

libxml_disable_entity_loader() calls are used in MediaWiki and dependencies to prevent XXE injection vulnerabilities when processing arbitrary XML code. In PHP 8 and above, PHP uses a newer version of libxml that disables XXE loading by default, and PHP now raises a deprecation message when this function is called:

Deprecated: Function libxml_disable_entity_loader() is deprecated in ... on line ...

MediaWiki code and dependencies' code will need to be updated, probably by making this call conditional on PHP/libxml version.

Details

SubjectRepoBranchLines +/-
mediawiki/extensions/TEImaster+4 -2
mediawiki/extensions/FileAnnotationsmaster+4 -2
mediawiki/coremaster+10 -6
mediawiki/coremaster+0 -0
mediawiki/coreREL1_37+31 -16
mediawiki/coreREL1_35+31 -16
mediawiki/coreREL1_36+31 -16
mediawiki/coremaster+31 -16
mediawiki/coremaster+27 -56
mediawiki/coreREL1_35+27 -56
mediawiki/coremaster+56 -27
mediawiki/coreREL1_35+56 -27
mediawiki/libs/XMPReadermaster+11 -10
mediawiki/extensions/Flowmaster+9 -4
HtmlFormattermaster+11 -2
mediawiki/extensions/CommonsMetadatamaster+7 -2
mediawiki/extensions/GWToolsetmaster+7 -2
mediawiki/extensions/RSSmaster+10 -3
Show related patches Customize query in gerrit

Event Timeline

There are a very large number of changes, so older changes are hidden. Show Older Changes

Most of them only exist to stop false positive reports

Change 651985 had a related patch set uploaded (by Reedy; owner: Reedy):
[mediawiki/core@master] Wrap libxml_(dis|en)able_entity_loader() calls in version constraint

https://gerrit.wikimedia.org/r/651985

Change 651987 had a related patch set uploaded (by Reedy; owner: Reedy):
[mediawiki/libs/XMPReader@master] XML related cleanup

https://gerrit.wikimedia.org/r/651987

Change 651988 had a related patch set uploaded (by Reedy; owner: Reedy):
[mediawiki/extensions/CommonsMetadata@master] Wrap libxml_disable_entity_loader() calls in version constraint

https://gerrit.wikimedia.org/r/651988

Change 651989 had a related patch set uploaded (by Reedy; owner: Reedy):
[mediawiki/extensions/Flow@master] Wrap libxml_disable_entity_loader() calls in version constraint

https://gerrit.wikimedia.org/r/651989

Change 651990 had a related patch set uploaded (by Reedy; owner: Reedy):
[mediawiki/extensions/GWToolset@master] Wrap libxml_disable_entity_loader() calls in version constraint

https://gerrit.wikimedia.org/r/651990

Change 651991 had a related patch set uploaded (by Reedy; owner: Reedy):
[mediawiki/extensions/RSS@master] Wrap libxml_disable_entity_loader() calls in version constraint

https://gerrit.wikimedia.org/r/651991

Change 651991 merged by jenkins-bot:
[mediawiki/extensions/RSS@master] Wrap libxml_disable_entity_loader() calls in version constraint

https://gerrit.wikimedia.org/r/651991

Change 651990 merged by jenkins-bot:
[mediawiki/extensions/GWToolset@master] Wrap libxml_disable_entity_loader() calls in version constraint

https://gerrit.wikimedia.org/r/651990

Change 651988 merged by jenkins-bot:
[mediawiki/extensions/CommonsMetadata@master] Wrap libxml_disable_entity_loader() calls in version constraint

https://gerrit.wikimedia.org/r/651988

Change 651996 had a related patch set uploaded (by MaxSem; owner: MaxSem):
[HtmlFormatter@master] PHP 8 compatibility

https://gerrit.wikimedia.org/r/651996

Change 651996 merged by jenkins-bot:
[HtmlFormatter@master] PHP 8 compatibility

https://gerrit.wikimedia.org/r/651996

Change 651989 merged by jenkins-bot:
[mediawiki/extensions/Flow@master] Wrap libxml_disable_entity_loader() calls in version constraint

https://gerrit.wikimedia.org/r/651989

Change 651987 merged by jenkins-bot:
[mediawiki/libs/XMPReader@master] XML related cleanup

https://gerrit.wikimedia.org/r/651987

What do we do when/where we want it enabled... If it's disabled by default, and calling the function is deprecated...

Change 653975 had a related patch set uploaded (by Reedy; owner: Reedy):
[mediawiki/core@master] Wrap libxml_disable_entity_loader() calls in version constraint

https://gerrit.wikimedia.org/r/653975

Change 656915 had a related patch set uploaded (by Reedy; owner: Reedy):
[mediawiki/core@REL1_35] Wrap libxml_disable_entity_loader() calls in version constraint

https://gerrit.wikimedia.org/r/656915

Change 651985 merged by jenkins-bot:
[mediawiki/core@master] Wrap libxml_disable_entity_loader() calls in version constraint

https://gerrit.wikimedia.org/r/651985

Change 656915 merged by jenkins-bot:
[mediawiki/core@REL1_35] Wrap libxml_disable_entity_loader() calls in version constraint

https://gerrit.wikimedia.org/r/656915

I'm confused by https://gerrit.wikimedia.org/r/c/mediawiki/core/+/653975

Using LIBXML_VERSION causes a test failure, PHP_VERSION does not.

PHP 7.4.3 locally, libxml2 2.9.10...

$ php -a
Interactive mode enabled

php > echo LIBXML_VERSION;
20910
php > var_dump( libxml_disable_entity_loader( true ) );
php shell code:1:
bool(false)
php >

So it was not disabled by default?

https://www.php.net/manual/en/function.libxml-disable-entity-loader.php

Disable/enable the ability to load external entities. Note that disabling the loading of external entities may cause general issues with loading XML documents. However, as of libxml 2.9.0 entity substitution is disabled by default, so there is no need to disable the loading of external entities, unless there is the need to resolve internal entity references with LIBXML_NOENT. Generally, it is preferable to use libxml_set_external_entity_loader() to suppress loading of external entities.

Disable (true) or enable (false) libxml extensions (such as DOM, XMLWriter and XMLReader) to load external entities.

It seems it's not disabled by default on those versions? Or some ubuntu weirdness? Or am I missing something?

https://ubuntu.com/security/CVE-2013-0339
https://gitlab.gnome.org/GNOME/libxml2/commit/4629ee02ac649c27f9c0cf98ba017c6b5526070f

3v4l seems to agree - https://3v4l.org/pObcb

For input

<?php

var_dump( LIBXML_VERSION );
var_dump( libxml_disable_entity_loader( true ) );

Screenshot 2021-01-27 at 03.57.21.png (984×1 px, 127 KB)

Noting also, most documentation seems to be suggesting that

https://php.watch/versions/8.0/libxml_disable_entity_loader-deprecation LIBXML_VERSION < 20900 and PHP_VERSION_ID < 80000 should mostly have the same effect..

Per my gerrit comment, just removing the call will result in an XXE if entity substition is enabled. My test case is:

$reader = new XMLReader;
$reader->XML('<?xml version="1.0" encoding="ISO-8859-1"?>
<!DOCTYPE foo [
  <!ELEMENT foo ANY >
  <!ENTITY xxe SYSTEM "file:///etc/lsb-release" >]>
<foo>&xxe;</foo>', null, LIBXML_NOERROR | LIBXML_NOWARNING );
$reader->setParserProperty( XMLReader::SUBST_ENTITIES, true );
$reader->read();
$reader->read();
print $reader->readString();

In my testing, using an http/https URL causes an actual HTTP request to take place, although the results don't seem to end up in the output.

Any callers that use entity substitution will also need to do

libxml_set_external_entity_loader(
    function () {
        return null;
    }
);

That is a pretty close equivalent to libxml_disable_entity_loader(). But note that that is global, so the hander may need to be restored afterwards. I've confirmed that with such a call, it's still possible to define and substitute entities declared as string literals.

Change 658936 had a related patch set uploaded (by Reedy; owner: Reedy):
[mediawiki/core@master] Revert "Wrap libxml_disable_entity_loader() calls in version constraint"

https://gerrit.wikimedia.org/r/658936

Change 658937 had a related patch set uploaded (by Reedy; owner: Reedy):
[mediawiki/core@REL1_35] Revert "Wrap libxml_disable_entity_loader() calls in version constraint"

https://gerrit.wikimedia.org/r/658937

Change 658937 merged by jenkins-bot:
[mediawiki/core@REL1_35] Revert "Wrap libxml_disable_entity_loader() calls in version constraint"

https://gerrit.wikimedia.org/r/658937

Change 658936 merged by jenkins-bot:
[mediawiki/core@master] Revert "Wrap libxml_disable_entity_loader() calls in version constraint"

https://gerrit.wikimedia.org/r/658936

Should we introduce some wfLibXmlDisableEntityLoader helper function that does the appropriate thing depending on PHP / libxml versions?

There is also a test in mediawiki/core which is failing when the libxml_disable_entity_loader call is not done due to deprecation (as provided by the reverted patch sets).

1) UploadBaseTest::testCheckSvgScriptCallback with data set #39 ('<?xml version="1.0"?> <!DOCTY...</svg>', false, false, 'SVG with data: uri external entity')
SVG with data: uri external entity (filter match)
Failed asserting that true is identical to false.

It works with libxml_set_external_entity_loader, but there is no way to restore the loader - https://bugs.php.net/bug.php?id=76763

There is also a test in mediawiki/core which is failing when the libxml_disable_entity_loader call is not done due to deprecation (as provided by the reverted patch sets).

1) UploadBaseTest::testCheckSvgScriptCallback with data set #39 ('<?xml version="1.0"?> <!DOCTY...</svg>', false, false, 'SVG with data: uri external entity')
SVG with data: uri external entity (filter match)
Failed asserting that true is identical to false.

It works with libxml_set_external_entity_loader, but there is no way to restore the loader - https://bugs.php.net/bug.php?id=76763

Aha. Thanks for the Upstream bug; glad it wasn't just me that was confused.

And yeah, that test failure is part of the reason I stopped with https://gerrit.wikimedia.org/r/c/mediawiki/core/+/653975 (and Tim left CR on the patch and the comments here), and I reverted the other conditional adding patch out of an abundance of caution (rather than potentially regressing the XXE issue)

Some libs still using the version check - https://codesearch.wmcloud.org/search/?q=LIBXML_VERSION&i=nope&files=&excludeFiles=&repos=

libs / XMPReader
HtmlFormatter

and some extensions are also using it.

So it's not immediately an issue in all cases. Per Tim, cases where entity substitution is used/relied on is potentially problematic. And as such, probably the defined/expected behaviour in various places is very much YMMV, though in most cases we do want to be preventing external inclusion of "things" anyway (there are some exceptions where it explicitly turns it on instead of off).

Core had a few more "interesting" use cases

The XMPReader bump hasn't been brought into MW yet (patches are in Gerrit to do that though), the HtmlFormatter one has...

When using libxml_set_external_entity_loader returning null than SuiteDirectoryTest.php fails

There was 1 error:

1) SuiteDirectoryTest::testSuiteXmlDirectories
Error: Call to a member function getElementsByTagName() on null

/workspace/src/tests/phpunit/structure/SuiteDirectoryTest.php:16
/workspace/src/tests/phpunit/phpunit.php:75
/workspace/src/maintenance/doMaintenance.php:106
/workspace/src/tests/phpunit/phpunit.php:134

Not sure what for an error happens here

I am not sure if that is okay, but it works:

		libxml_set_external_entity_loader( function ( $public, $system, $context ) {
			if ( is_file( $system ) ) {
				return $system;
			}
			return null;
		} );

I am not sure if that is okay, but it works:

		libxml_set_external_entity_loader( function ( $public, $system, $context ) {
			if ( is_file( $system ) ) {
				return $system;
			}
			return null;
		} );

There is already a call to that function - https://gerrit.wikimedia.org/g/mediawiki/core/+/d7decde5f583960122ca22902d5c4717ccc41792/tests/phpunit/maintenance/DumpTestCase.php#181

Ok, so how's this:

  1. Implement a way to restore the loader, per the upstream bug. This would either involve changing libxml_set_external_entity_loader so it returns the previous loader, or implementing a new libxml_restore_external_entity_loader function, similar to how {set|restore}_error_handler work.
  1. Ask upstream not to remove libxml_disable_entity_loader before the aforementioned improvement lands.
  1. Continue using libxml_disable_entity_loader for the time being and suppress/ignore the deprecation warning.

Change 755841 had a related patch set uploaded (by Tim Starling; author: Tim Starling):

[mediawiki/core@master] Suppress deprecation warnings from libxml_disable_entity_loader()

https://gerrit.wikimedia.org/r/755841

Thanks Tim.

We should probably also decide if we want to revisit the patches made/merged to extensions and libraries to move away from the libxml version check, or those are "good enough" for now.

Change 755774 had a related patch set uploaded (by Reedy; author: Tim Starling):

[mediawiki/core@REL1_37] Suppress deprecation warnings from libxml_disable_entity_loader()

https://gerrit.wikimedia.org/r/755774

Change 755775 had a related patch set uploaded (by Reedy; author: Tim Starling):

[mediawiki/core@REL1_36] Suppress deprecation warnings from libxml_disable_entity_loader()

https://gerrit.wikimedia.org/r/755775

Change 755776 had a related patch set uploaded (by Reedy; author: Tim Starling):

[mediawiki/core@REL1_35] Suppress deprecation warnings from libxml_disable_entity_loader()

https://gerrit.wikimedia.org/r/755776

Change 755841 merged by jenkins-bot:

[mediawiki/core@master] Suppress deprecation warnings from libxml_disable_entity_loader()

https://gerrit.wikimedia.org/r/755841

Change 755775 merged by jenkins-bot:

[mediawiki/core@REL1_36] Suppress deprecation warnings from libxml_disable_entity_loader()

https://gerrit.wikimedia.org/r/755775

Change 755774 merged by jenkins-bot:

[mediawiki/core@REL1_37] Suppress deprecation warnings from libxml_disable_entity_loader()

https://gerrit.wikimedia.org/r/755774

Change 755776 merged by jenkins-bot:

[mediawiki/core@REL1_35] Suppress deprecation warnings from libxml_disable_entity_loader()

https://gerrit.wikimedia.org/r/755776

Change 755784 had a related patch set uploaded (by Reedy; author: Reedy):

[mediawiki/core@master] Suppress more deprecation warnings from libxml_disable_entity_loader()

https://gerrit.wikimedia.org/r/755784

Change 755784 abandoned by Reedy:

[mediawiki/core@master] Suppress more deprecation warnings from libxml_disable_entity_loader()

Reason:

Didn't look properly. Failures are in wikimedia/html-formatter so bumping there needed

https://gerrit.wikimedia.org/r/755784

Change 653975 abandoned by Reedy:

[mediawiki/core@master] Wrap libxml_disable_entity_loader() calls in version constraint

Reason:

https://gerrit.wikimedia.org/r/653975

My pull request has been merged, so there will be libxml_get_external_entity_loader() in PHP 8.2. The idea from https://bugs.php.net/bug.php?id=76763 of having libxml_set_external_entity_loader() return the previous loader was rejected on backwards compatibility grounds. So libxml_get_external_entity_loader() was added instead.

This task does not block PHP 8.0 support.

Change 882003 had a related patch set uploaded (by Umherirrender; author: Umherirrender):

[mediawiki/extensions/TEI@master] Ignore deprecation warning from libxml_disable_entity_loader

https://gerrit.wikimedia.org/r/882003

Change 882150 had a related patch set uploaded (by Umherirrender; author: Umherirrender):

[mediawiki/extensions/FileAnnotations@master] Ignore deprecation warning from libxml_disable_entity_loader

https://gerrit.wikimedia.org/r/882150

Change 882150 merged by Umherirrender:

[mediawiki/extensions/FileAnnotations@master] Ignore deprecation warning from libxml_disable_entity_loader

https://gerrit.wikimedia.org/r/882150

Change 882003 merged by jenkins-bot:

[mediawiki/extensions/TEI@master] Ignore deprecation warning from libxml_disable_entity_loader

https://gerrit.wikimedia.org/r/882003