
TemplateData doesn't sanitize descriptions etc.
Closed, Resolved; Public

Description

If a template description (or other InterfaceText field) contains disallowed HTML, it is neither stripped nor escaped, e.g.

<templatedata>
{
  "description": "<script>alert('test');</script>",
  "params": {}
}
</templatedata>

The API returns this as:

 "pages": {
    "935": {
        "title": "Template:Test",
        "description": {
            "en": "<script>alert('test');</script>"
        },
        "params": {},
        "format": null,
        "paramOrder": [],
        "sets": [],
        "maps": {}
    }
},

Of course, clients should not use this as-is and should always treat it as plain text, but they might not know that. The value is of type InterfaceText (free-form string, no wikitext).

Is it worth sanitizing these outputs?

Event Timeline

The API should document this clearly, but I think it would be incorrect for the API to return escaped text (unless clearly documented): we have no idea whether the API data is being used in an HTML context.

Things should always be escaped as near as possible to where they are inserted into HTML.
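
As a rough illustration of escaping at the point of insertion, here is a minimal sketch assuming a Python client that has already fetched the description string from the API. The function name `render_description` and the `<p>` wrapper are hypothetical and not part of TemplateData; only `html.escape` is a real standard-library call.

```python
import html


def render_description(description: str) -> str:
    """Wrap a plain-text description in a paragraph, escaping it at insertion time.

    Hypothetical helper: the name and the <p> wrapper are illustrative only.
    """
    # html.escape converts &, < and >, and by default both quote characters.
    return "<p>" + html.escape(description) + "</p>"


# The value from the API stays untouched; only the HTML output is escaped.
raw = "<script>alert('test');</script>"
print(render_description(raw))
```

Keeping the stored and transmitted value as plain text means a non-HTML consumer (a command-line tool, say) can use it directly, while an HTML consumer escapes it once, right where it builds the markup.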

Yeah, that makes sense. So we should just make sure it's clear that the API result should never be used in an HTML context? That it's definitely plaintext, and should only ever be used as such?

I sort of started wondering about this because there are a bunch of template descriptions that contain newlines, and I think people expect them to be displayed as paragraphs. But if the descriptions are plaintext, then that wouldn't be a transformation we'd want to make, would it? (I.e. InterfaceText is not designed to be parsed as anything.)

Bawolff claimed this task.

I don't think there's anything to do here.

Bawolff changed the visibility from "Custom Policy" to "Public (No Login Required)". Mar 26 2019, 3:02 PM