Sanitizer::stripAllTags shouldn't expand legacy "semicolon-less" HTML5 entities
Open, Needs Triage, Public

Description

When Sanitizer::stripAllTags was converted to use Remex in ddb4913f53624c8ee0a2a91bd44bf750e378569d, it started decoding "semicolon-less" entities, which historically have not been allowed in wikitext. This caused a regression (T209236) when stripAllTags was invoked on link text. That regression was fixed by disabling semicolon-less entities in the specific case of alt/link option stripping, but we should probably disable them for every caller of Sanitizer::stripAllTags.
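
To illustrate the difference between the two decoding modes, here is a minimal Python sketch. It uses Python's standard html module as a stand-in for an HTML5-style decoder; it is not MediaWiki's Sanitizer or Remex. The helper name decode_terminated_only is hypothetical, and the entity table it consults is Python's HTML5 table, which is an assumption and may differ from the set of entities wikitext accepts.

```python
import html
import re
from html.entities import html5  # keys include the trailing ";" where required

# Matches only references that end in ";" (numeric or named).
_TERMINATED = re.compile(r"&(#[0-9]+|#[xX][0-9a-fA-F]+|[A-Za-z][A-Za-z0-9]*);")


def decode_terminated_only(text: str) -> str:
    """Hypothetical helper: decode entities only when they end in ';'."""

    def repl(match: re.Match) -> str:
        body = match.group(1)
        # Numeric references and known named entities are decoded;
        # anything else is left exactly as written.
        if body.startswith("#") or (body + ";") in html5:
            return html.unescape(match.group(0))
        return match.group(0)

    return _TERMINATED.sub(repl, text)


link_text = "x?a=1&copy=2"

# HTML5-style decoding expands the legacy semicolon-less "&copy".
html5_style = html.unescape(link_text)

# Strict decoding leaves the text untouched because there is no ";".
strict_style = decode_terminated_only(link_text)
```

In this sketch, html5_style has "&copy" replaced by the copyright sign, while strict_style is identical to the input. That mismatch is the kind of behavior change the description refers to.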

Event Timeline

Change 475821 had a related patch set uploaded (by C. Scott Ananian; owner: C. Scott Ananian):
[mediawiki/services/parsoid@master] Uniformly use "wikitext entities" not "HTML5 entities"

https://gerrit.wikimedia.org/r/475821

Change 475821 merged by jenkins-bot:
[mediawiki/services/parsoid@master] Uniformly use "wikitext entities" not "HTML5 entities"

https://gerrit.wikimedia.org/r/475821