
RawAction should set proper Content-Type header
Open, Medium, Public

Description

Currently, action=raw returns text/x-wiki for all content (except for JS and CSS pages, and even then the proper MIME type is only used if requested with the ctype or gen parameter).

This is a relic from the time before ContentHandler. With ContentHandler, page content knows its MIME type. RawAction should ask ContentHandler for the default serialization format if ctype and gen are not given. It should use Content::serialize to generate the desired output format. And finally, it should declare the actual format used for serialization in the Content-Type header.
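
The proposed flow can be sketched roughly as follows. This is a minimal, language-neutral model in Python, not MediaWiki code: the `Page` class, the `DEFAULT_FORMATS` and `GEN_TYPES` tables, and `raw_response` are hypothetical stand-ins for ContentHandler, Content::serialize and RawAction, and the MIME values in the tables are assumptions chosen for illustration.

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical mapping from content model to default serialization format.
# In MediaWiki this information would come from the ContentHandler.
DEFAULT_FORMATS = {
    "wikitext": "text/x-wiki",
    "javascript": "text/javascript",
    "css": "text/css",
    "json": "application/json",
}

# Assumed meaning of the gen parameter, for illustration only.
GEN_TYPES = {"js": "text/javascript", "css": "text/css"}


@dataclass
class Page:
    model: str
    text: str

    def default_format(self) -> str:
        # Stand-in for asking the content handler for its default format.
        return DEFAULT_FORMATS.get(self.model, "text/x-wiki")

    def serialize(self, fmt: str) -> str:
        # Stand-in for Content::serialize; a real handler may transform
        # the content depending on fmt, this sketch returns it unchanged.
        return self.text


def raw_response(page: Page, ctype: Optional[str] = None,
                 gen: Optional[str] = None):
    """Return (headers, body) for a raw request."""
    if ctype:
        fmt = ctype                  # explicit request wins
    elif gen in GEN_TYPES:
        fmt = GEN_TYPES[gen]
    else:
        fmt = page.default_format()  # the step this task asks for
    body = page.serialize(fmt)
    # Declare the format that was actually used for serialization.
    headers = {"Content-Type": f"{fmt}; charset=UTF-8"}
    return headers, body
```

With this sketch, `raw_response(Page("json", "{}"))` declares application/json without the client passing ctype, while an explicit ctype or gen still takes precedence.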

Event Timeline

Well, in general this sounds nice, but there is a potential security issue here if a content handler returns an unsafe MIME type that could execute scripts. E.g. returning text/html or image/svg+xml would be bad. Some browsers do a lot of sniffing on text/plain (not sure how much that is still the case). There is also the potential for a format that is downloaded and then executed unsafely (e.g. text/csv interpreted by Excel), which may or may not be an issue. Even if not executed, such content could cause a network request to be issued to a third-party server, which may be unacceptable privacy-wise.

That doesn't totally block doing this, and I'm certainly not opposed for obviously safe cases like application/json; we just need to be careful about these issues before doing this.
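
One way to address the concern above is an allowlist with a safe fallback. The sketch below is only an illustration in Python, not RawAction's actual logic: `SAFE_TYPES`, `FALLBACK_TYPE` and `safe_headers` are hypothetical names, and the contents of the allowlist are an assumption. Sending `X-Content-Type-Options: nosniff` is an extra hardening step not mentioned in this discussion; it asks browsers not to sniff the response into a different type.

```python
# Illustrative allowlist; the real set of allowed types is an assumption here.
SAFE_TYPES = {
    "text/x-wiki",
    "text/javascript",
    "text/css",
    "application/json",
}
FALLBACK_TYPE = "text/x-wiki"


def safe_headers(declared: str) -> dict:
    """Build response headers, downgrading any type not on the allowlist."""
    # Drop parameters such as "; charset=..." and normalise case.
    mime = declared.split(";", 1)[0].strip().lower()
    if mime not in SAFE_TYPES:
        # e.g. text/html or image/svg+xml could run scripts in a browser
        mime = FALLBACK_TYPE
    return {
        "Content-Type": f"{mime}; charset=UTF-8",
        "X-Content-Type-Options": "nosniff",
    }
```

Here a handler reporting text/html would still be served as text/x-wiki, while application/json passes through unchanged.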

Change 384089 had a related patch set uploaded (by Ladsgroup; owner: Amir Sarabadani):
[mediawiki/core@master] Add json to the allowed content types

https://gerrit.wikimedia.org/r/384089

Change 384096 had a related patch set uploaded (by Ladsgroup; owner: Amir Sarabadani):
[mediawiki/core@master] Use ContentHandler to get proper MIME type in RawAction

https://gerrit.wikimedia.org/r/384096

Change 384089 merged by jenkins-bot:
[mediawiki/core@master] RawAction: Add json to the allowed content types

https://gerrit.wikimedia.org/r/384089

In general, the API should be used for this. IMHO, the only use case for action=raw is when a client strictly requires the page content to be the entire response body. This is basically only true for JS and CSS. Anything else can use the API.

So perhaps instead of accommodating these non-wikitext types better, we could make them return HTTP 40x, since they are probably not yet commonly used for anything.

See also T279120.

That sounds like a step backwards to me. Why should we force clients to talk to our bespoke MediaWiki API instead of speaking standard HTTP?

Pppery subscribed.

Agreed with Lucas here that deliberately breaking this use seems like a step backwards, for what it's worth.

Change #384096 abandoned by Hashar:

[mediawiki/core@master] Use ContentHandler to get proper MIME type in RawAction

https://gerrit.wikimedia.org/r/384096

Change #384096 restored by Thcipriani:

[mediawiki/core@master] Use ContentHandler to get proper MIME type in RawAction

https://gerrit.wikimedia.org/r/384096

Change #384096 abandoned by Ladsgroup:

[mediawiki/core@master] Use ContentHandler to get proper MIME type in RawAction

Reason:

Not working on it atm

https://gerrit.wikimedia.org/r/384096