SVG upload fails because $safeXmlEncodings doesn't contain US-ASCII
Closed, Resolved | Public | Bug Report

Description

Steps to Reproduce:

  • Try to upload an SVG file with the XML declaration <?xml version='1.0' encoding='us-ascii' standalone='no'?>

Actual Results:

  • Error message:

This file contains HTML or script code that may be erroneously interpreted by a web browser. See the FAQ for more information.

Expected Results:

Successful upload.

Workaround:

Change encoding='us-ascii' to encoding='utf-8'.
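
For anyone who needs to apply this workaround to many files, it could be scripted roughly as follows. This is only a sketch: the function name relabel_ascii_as_utf8 and the regular expression are illustrative assumptions, not part of MediaWiki or any upload tool. It refuses to relabel a file that contains non-ASCII bytes, because only a pure-ASCII file is guaranteed to mean the same thing under both labels.

```python
import re

# Hypothetical helper pattern: matches encoding='us-ascii' (either quote
# style, any letter case) inside an XML declaration at the start of the file.
_DECL_ENCODING = re.compile(
    rb"""^(<\?xml\b[^>]*?\bencoding\s*=\s*)(['"])us-ascii\2""",
    re.IGNORECASE,
)


def relabel_ascii_as_utf8(svg_bytes: bytes) -> bytes:
    """Return svg_bytes with a leading us-ascii declaration relabeled utf-8.

    Input without such a declaration is returned unchanged.
    """
    # Raises UnicodeDecodeError if the file is not really pure ASCII,
    # in which case the declaration should not be touched blindly.
    svg_bytes.decode("ascii")
    return _DECL_ENCODING.sub(rb"\g<1>\g<2>utf-8\g<2>", svg_bytes, count=1)


if __name__ == "__main__":
    sample = (
        b"<?xml version='1.0' encoding='us-ascii' standalone='no'?>\n"
        b"<svg xmlns='http://www.w3.org/2000/svg'/>"
    )
    print(relabel_ascii_as_utf8(sample).decode("ascii"))
```

The rest of the declaration (version, standalone) and the document body are left byte-for-byte identical.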

Notes:

After a discussion on IRC (#wikimedia-commons), @AntiCompositeNumber found that the problem is the declared encoding. The encoding is unusual, but I didn't choose it deliberately: several XML writers, including Python's xml.etree, now keep track of whether any character in the document is non-ASCII and, if none is, tag the whole document as us-ascii. Therefore, us-ascii should be added to $safeXmlEncodings. I can't think of a reason ASCII would be less safe than UTF-8.
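
To illustrate how such a declaration can end up in a file without the author asking for it, here is a small sketch using Python's standard xml.etree.ElementTree. It assumes Python 3.8 or later (where tostring accepts xml_declaration); as far as I know, the serializer there defaults to us-ascii when no encoding is passed and writes non-ASCII characters as numeric character references. The element names and text are made up for the example.

```python
import xml.etree.ElementTree as ET

# Build a tiny SVG-like tree; the content is purely illustrative.
svg = ET.Element("svg", {"xmlns": "http://www.w3.org/2000/svg"})
ET.SubElement(svg, "title").text = "caf\u00e9"  # contains one non-ASCII char

# No encoding argument is given, so the library's default applies.
serialized = ET.tostring(svg, xml_declaration=True)

# The first line of the output is the XML declaration.
declaration = serialized.split(b"\n", 1)[0]
print(declaration.decode("ascii"))

# The whole output is plain ASCII bytes even though the text was not.
serialized.decode("ascii")
```

A file produced this way is the kind of upload that triggers the error described above.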

Event Timeline

copypaste renamed this task from "SVG upload fails because $safeXmlEncodings doesn't encode US-ASCII" to "SVG upload fails because $safeXmlEncodings doesn't contain US-ASCII". (Mar 24 2021, 8:52 PM)
copypaste updated the task description.
TheDJ subscribed.

This is related to T49304: SVG JavaScript detection bypass

utf-8 is backwards compatible with us-ascii (every valid us-ascii document is also valid utf-8 with the same meaning), so this should be easy to fix and not a security problem. I actually think it was just an oversight that it didn't get included in that list (maybe because we didn't have any existing files in us-ascii back then).
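
The compatibility argument can be checked mechanically. The following sketch (plain Python, not MediaWiki code; the helper name decodes_identically is made up for illustration) confirms that all 128 ASCII byte values decode to the same characters under both encodings, so a pure-ASCII file cannot be interpreted differently when a parser treats it as UTF-8.

```python
# All 128 byte values that are valid in US-ASCII.
ascii_bytes = bytes(range(128))

# True if both codecs agree on every one of them.
all_ascii_same = ascii_bytes.decode("ascii") == ascii_bytes.decode("utf-8")


def decodes_identically(data: bytes) -> bool:
    """Return True if data is valid ASCII and reads the same as UTF-8."""
    try:
        return data.decode("ascii") == data.decode("utf-8")
    except UnicodeDecodeError:
        # Bytes >= 0x80 are not valid US-ASCII at all.
        return False


if __name__ == "__main__":
    print(all_ascii_same)
    print(decodes_identically(b"<svg xmlns='http://www.w3.org/2000/svg'/>"))
```

Any byte outside the ASCII range makes the helper return False, which is the case where the two labels would stop being interchangeable.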

Change 784759 had a related patch set uploaded (by TheDJ; author: TheDJ):

[mediawiki/core@master] Add us-ascii to safeXmlEncodings

https://gerrit.wikimedia.org/r/784759

Change 784759 merged by jenkins-bot:

[mediawiki/core@master] Add us-ascii to safeXmlEncodings

https://gerrit.wikimedia.org/r/784759