
Remex double-decodes HTML entities on PHP (not HHVM)
Closed, Resolved (Public)

Description

For example, given

`[[File:Foobar.jpg|alt=&amp;amp;]]`

PHP will render this in HTML as alt="&amp;" while HHVM correctly renders alt="&amp;amp;".

This is a bug in the tokenizer: Tokenizer::handleCharRefs() takes a shortcut to decode the entities, and then we turn around and decode them again "the long way", so the text ends up decoded twice.
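To make the failure mode concrete, here is a minimal sketch of what a double decode does to a doubly escaped ampersand. It is written in Python with the standard `html` module purely for illustration and is not the RemexHtml code. The helper names (`decode_once`, `decode_twice`, `serialize_attr`) and the sample input `&amp;amp;` are assumptions made for this example.

```python
import html


def decode_once(text: str) -> str:
    """Decode character references a single time (the intended behavior)."""
    return html.unescape(text)


def decode_twice(text: str) -> str:
    """Decode character references twice, mimicking the double-decode bug."""
    return html.unescape(html.unescape(text))


def serialize_attr(value: str) -> str:
    """Re-escape a decoded value for output inside an HTML attribute."""
    return html.escape(value, quote=True)


# Hypothetical input: source text that spells out the literal text "&amp;".
source = "&amp;amp;"

correct = serialize_attr(decode_once(source))
buggy = serialize_attr(decode_twice(source))

print("single decode:", correct)
print("double decode:", buggy)
```

With a single decode the serialized value round-trips to the input; with two decodes one level of escaping is lost, which matches the symptom described in this task.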

Event Timeline

cscott created this task. Oct 15 2018, 8:15 PM
Restricted Application added a subscriber: Aklapper. Oct 15 2018, 8:15 PM

Change 467470 had a related patch set uploaded (by C. Scott Ananian; owner: C. Scott Ananian):
[mediawiki/libs/RemexHtml@master] Don't double-decode HTML entities on non-HHVM PHP

https://gerrit.wikimedia.org/r/467470

Change 467470 merged by jenkins-bot:
[mediawiki/libs/RemexHtml@master] Don't double-decode HTML entities on non-HHVM PHP

https://gerrit.wikimedia.org/r/467470

The patch is merged, but the bug won't actually be fixed for production uses of PHP 7 (i.e., when running the PHP 7 Jenkins tests) until a new version of RemexHtml is released to Composer and mediawiki-core is updated to require it.

Change 468799 had a related patch set uploaded (by Legoktm; owner: Legoktm):
[mediawiki/vendor@master] Upgrade wikimedia/remex-html to 2.0.1

https://gerrit.wikimedia.org/r/468799

Change 468800 had a related patch set uploaded (by Legoktm; owner: Legoktm):
[mediawiki/core@master] Upgrade wikimedia/remex-html to 2.0.1

https://gerrit.wikimedia.org/r/468800

Change 468799 merged by jenkins-bot:
[mediawiki/vendor@master] Upgrade wikimedia/remex-html to 2.0.1

https://gerrit.wikimedia.org/r/468799

Change 468800 merged by jenkins-bot:
[mediawiki/core@master] Upgrade wikimedia/remex-html to 2.0.1

https://gerrit.wikimedia.org/r/468800

Change 468852 had a related patch set uploaded (by Legoktm; owner: Legoktm):
[mediawiki/vendor@REL1_32] Upgrade wikimedia/remex-html to 2.0.1

https://gerrit.wikimedia.org/r/468852

Change 468853 had a related patch set uploaded (by Legoktm; owner: Legoktm):
[mediawiki/core@REL1_32] Upgrade wikimedia/remex-html to 2.0.1

https://gerrit.wikimedia.org/r/468853

Change 468852 merged by jenkins-bot:
[mediawiki/vendor@REL1_32] Upgrade wikimedia/remex-html to 2.0.1

https://gerrit.wikimedia.org/r/468852

Change 468853 merged by jenkins-bot:
[mediawiki/core@REL1_32] Upgrade wikimedia/remex-html to 2.0.1

https://gerrit.wikimedia.org/r/468853

Change 468862 had a related patch set uploaded (by Legoktm; owner: Legoktm):
[mediawiki/core@REL1_31] Upgrade wikimedia/remex-html to 2.0.1

https://gerrit.wikimedia.org/r/468862

Change 468864 had a related patch set uploaded (by Legoktm; owner: Legoktm):
[mediawiki/vendor@REL1_31] Upgrade wikimedia/remex-html to 2.0.1

https://gerrit.wikimedia.org/r/468864

Legoktm closed this task as Resolved. Oct 21 2018, 7:33 PM
Legoktm claimed this task.
Legoktm added a subscriber: Legoktm.

I've backported the fix to 1.32 and prepared the 1.31 backports.

Change 468864 merged by jenkins-bot:
[mediawiki/vendor@REL1_31] Upgrade wikimedia/remex-html to 2.0.1

https://gerrit.wikimedia.org/r/468864

Change 468862 merged by jenkins-bot:
[mediawiki/core@REL1_31] Upgrade wikimedia/remex-html to 2.0.1

https://gerrit.wikimedia.org/r/468862