Make upload.wikimedia.org Cross Origin compatible (CORS)
Closed, Resolved
Public

Description

As with bug 25886, and as previously discussed such as http://www.gossamer-threads.com/lists/wiki/wikitech/220659 -- it would be useful for various next-generation user scripts, gadgets etc to have direct access to raw contents of uploaded files.

This is especially true for SVG and raster images, which if properly marked could be loaded in via XHR or img+canvas for editing and such. Currently, such tools must either have server-side support like the ApiSvgProxy plugin or use an offsite proxy on their own domain or via JSONP.

See Cross Origin Resource Sharing: http://www.w3.org/TR/access-control/


Version: unspecified
Severity: normal
See Also:
https://bugzilla.wikimedia.org/show_bug.cgi?id=20298

bzimport set Reference to bz28700.
brion created this task.Via LegacyApr 25 2011, 5:51 PM
Catrope added a comment.Via ConduitApr 25 2011, 6:26 PM

Could you provide the specific header that should be set?

brion added a comment.Via ConduitApr 25 2011, 9:20 PM

I believe we want:

Access-Control-Allow-Origin: *

A quick test confirms I can fetch files cross-domain in Firefox with a regular XHR with this header added. (Some browsers require a little more fiddling.)
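To illustrate what that header changes, here is a minimal Python sketch of the check a browser roughly applies to a non-credentialed cross-origin response before letting a script read it. The function name and the example origins are hypothetical, and real browsers apply further rules (credentials, redirects and so on) that are left out here.

```python
from typing import Mapping


def response_readable_cross_origin(
    response_headers: Mapping[str, str], request_origin: str
) -> bool:
    """Approximate the browser-side check for a non-credentialed request.

    Hypothetical helper for illustration only; not taken from any browser
    or from MediaWiki.
    """
    # HTTP header names are case-insensitive, so normalise before lookup.
    normalised = {
        name.lower(): value.strip() for name, value in response_headers.items()
    }
    allowed = normalised.get("access-control-allow-origin")
    if allowed is None:
        # No CORS header: the request is still sent, but the script
        # is not allowed to read the response.
        return False
    # "*" allows any origin; otherwise the value must match the origin exactly.
    return allowed == "*" or allowed == request_origin
```

With `Access-Control-Allow-Origin: *` every origin passes this check, which is why a single static header would be enough for publicly readable files.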

Brettz9 added a comment.Via ConduitApr 26 2011, 1:52 AM

Specification draft is at http://www.w3.org/TR/cors/. IE8 uses a different client-side object for cross-domain requests (not XMLHttpRequest), but relies on the same Access-Control-Allow-Origin header (http://msdn.microsoft.com/en-us/library/dd573303(v=vs.85).aspx), so even IE8 ought to be workable with it.

I would think this bug would ideally be expanded to allow CORS for the API page itself as well--that would allow JavaScript applications to access the API without the GET limitations of JSONP and also avoid its security problems (a site can execute arbitrary JavaScript based on JSONP's current lack of a specific content-type in browsers (it's not JSON, nor should it be JavaScript), not merely the callback requested by the user).

Besides that, my impression as a web developer is that JSONP is a lesser-known technique than Ajax, so I think you'd also be promoting the API usage more widely.

Catrope added a comment.Via ConduitApr 26 2011, 3:24 PM

(In reply to comment #3)

I would think this bug would ideally be expanded to allow CORS for the API page
itself as well--that would allow JavaScript applications to access the API
without the GET limitations of JSONP and also avoid its security problems (a
site can execute arbitrary JavaScript based on JSONP's current lack of a
specific content-type in browsers (it's not JSON, nor should it be JavaScript),
not merely the callback requested by the user).

Besides that, my impression as a web developer is that JSONP is a lesser-known
technique than Ajax, so I think you'd also be promoting the API usage more
widely.

The API already supports CORS; see bug 19907. This code is live on Wikimedia wikis already, but it's not configured, so no CORS headers are actually served right now.

MarkAHershberger added a comment.Via ConduitApr 27 2011, 5:08 AM

Giving this to Roan to deploy CORS headers with high hopes. :)

Catrope added a comment.Via ConduitApr 27 2011, 1:00 PM

(In reply to comment #5)

Giving this to Roan to deploy CORS headers with high hopes. :)

I can and will enable CORS for the API per bug 19907 comment 12, but this bug is about upload.wikimedia.org, and I don't have access to the box that serves that. Ariel typically does upload-related things, so passing the buck :)

ArielGlenn added a comment.Via ConduitMay 3 2011, 8:18 PM

We need the additional headers Access-Control-Allow-Methods and Access-Control-Max-Age for pre-flight requests, I believe. Am I correct in assuming we would only want this for retrieval of image files or for thumbnails (perhaps generated by a 404 handler), i.e. GET? Anything else would start to make me nervous.

I don't have a clue whether we would need to do something with the upload squids as well. Anybody?

Catrope added a comment.Via ConduitMay 3 2011, 8:41 PM

(In reply to comment #7)

We need the additional headers Access-Control-Allow-Methods and
Access-Control-Max-Age for pre-flight requests, I believe.

Nah, let's not bother with preflighted stuff. There's pretty much no use for that for upload.wikimedia.org

Am I correct in assuming we would only want this for retrieval of image files
or for thumbnails (perhaps generated by a 404 handler), i.e. GET? Anything else
would start to make me nervous.

For POST as well, but that doesn't actually *do* anything on upload anyway, does it? Besides, these requests are already allowed; the only thing that changes is that the requestor will be able to read the response. The only case in which this is dangerous, to my knowledge, is if the response contains anti-CSRF tokens, but those don't appear anywhere near upload.wikimedia.org.

I don't have a clue whether we would need to do something with the upload
squids as well. Anybody?

Sounds like we would have to, in order to also serve the headers on old cached images.
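On the preflight question discussed above: a plain GET, HEAD or POST with no unusual request headers is sent without a preflight, so Access-Control-Allow-Methods and Access-Control-Max-Age never come into play for it. The following Python sketch is a simplified approximation of the "simple request" rules in the CORS draft; the function and constant names are hypothetical, and browsers apply additional restrictions that are not modelled here.

```python
from typing import Mapping

# Methods and author-set request headers that do not trigger a preflight
# (simplified from the CORS draft; real browsers check more than this).
SIMPLE_METHODS = {"GET", "HEAD", "POST"}
SIMPLE_HEADERS = {"accept", "accept-language", "content-language", "content-type"}
SIMPLE_CONTENT_TYPES = {
    "application/x-www-form-urlencoded",
    "multipart/form-data",
    "text/plain",
}


def needs_preflight(method: str, author_headers: Mapping[str, str]) -> bool:
    """Return True if a cross-origin request would be preflighted (approximation)."""
    if method.upper() not in SIMPLE_METHODS:
        return True
    for name, value in author_headers.items():
        lower_name = name.lower()
        if lower_name not in SIMPLE_HEADERS:
            return True
        if lower_name == "content-type":
            # Ignore parameters such as "; charset=utf-8".
            media_type = value.split(";", 1)[0].strip().lower()
            if media_type not in SIMPLE_CONTENT_TYPES:
                return True
    return False
```

Under these rules, fetching an image or SVG from upload.wikimedia.org with a bare GET is not preflighted, which fits the suggestion to skip preflight support there.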

MarkAHershberger added a comment.Via ConduitJun 29 2011, 5:57 PM

Is anything else in MW needed before we can just DO this?

Ariel, are you the right person to do it, or would there be someone more suitable for doing this?

MarkAHershberger added a comment.Via ConduitJun 29 2011, 6:15 PM

Ariel will probably have time to rebuild squid (ugh!) on the week of
Jul 11, 2011. Check back then.

brion added a comment.Via ConduitJun 29 2011, 6:19 PM

Nothing in MediaWiki at all is needed for this to happen -- it's purely a web server configuration issue.

What's needed is:

  1. configuration of the backend web servers for upload.wikimedia.org to add headers on newly-served files
  2. configuration of the frontend squid servers for upload.wikimedia.org to add headers on already-cached files (optional: if we'd done #1 months ago the old entries would have long expired by now :)

Reedy added a comment.Via ConduitJul 6 2011, 8:10 PM

Removing "shell" keyword for things that aren't directly doable by shell users etc

brion added a comment.Via ConduitJul 6 2011, 8:12 PM

Restoring "shell" keyword for things that only a subset of shell users can possibly do.

Reedy added a comment.Via ConduitJul 6 2011, 8:31 PM

Adding ops keyword

Reedy added a comment.Via ConduitJul 6 2011, 8:32 PM

Removing shell keyword if exists

drdee added a comment.Via ConduitNov 17 2011, 3:33 AM

What is the current status of this? Has this been resolved?

Catrope added a comment.Via ConduitNov 17 2011, 11:46 AM

(In reply to comment #16)

What is the current status of this? Has this been resolved?

I don't believe so.

bzimport added a comment.Via ConduitDec 7 2011, 7:25 PM

afeldman wrote:

It doesn't seem that an ops request was ever made for this - I just opened RT 2107 for the apache/squid change.

faidon added a comment.Via ConduitNov 15 2012, 8:54 PM

So, I've committed https://gerrit.wikimedia.org/r/#/c/33652/ that adds a Swift middleware to push Access-Control-Allow-Origin: * to all unauthenticated GET/HEADs. No support for preflight requests or anything more complicated than that. I'd appreciate a review from a Python/WSGI speaker :)
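For readers unfamiliar with WSGI, a middleware of the kind described could look roughly like the sketch below. This is not the code from the Gerrit change: the class name is hypothetical, and treating the presence of an `X-Auth-Token` request header as the sign of an authenticated request is an assumption made for illustration.

```python
class CorsMiddleware:
    """Add Access-Control-Allow-Origin: * to unauthenticated GET/HEAD responses.

    Illustrative sketch only; no preflight (OPTIONS) handling.
    """

    def __init__(self, app):
        self.app = app

    def __call__(self, environ, start_response):
        method = environ.get("REQUEST_METHOD", "")
        # Assumption: an X-Auth-Token request header marks an authenticated request.
        authenticated = "HTTP_X_AUTH_TOKEN" in environ
        if method not in ("GET", "HEAD") or authenticated:
            # Pass everything else through untouched.
            return self.app(environ, start_response)

        def cors_start_response(status, headers, exc_info=None):
            # Drop any existing value so the header is not duplicated.
            headers = [
                (name, value)
                for name, value in headers
                if name.lower() != "access-control-allow-origin"
            ]
            headers.append(("Access-Control-Allow-Origin", "*"))
            return start_response(status, headers, exc_info)

        return self.app(environ, cors_start_response)
```

Wrapping an existing WSGI application is then a matter of `app = CorsMiddleware(app)`; responses already held by a cache in front of it are unaffected until they are revalidated.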

Now, this obviously won't solve this for all already cached content. We can either deploy this and wait for the content to be revalidated, or wait until we switch to Varnish so we can also (or instead) do CORS in VCL.

brion added a comment.Via ConduitNov 16 2012, 9:48 PM

I'd be inclined to just push it and let cached content revalidate over time... it'd be nice to start using it on new files at least. :)

faidon added a comment.Via ConduitNov 20 2012, 6:18 AM

Both https://gerrit.wikimedia.org/r/#/c/33652/ and https://gerrit.wikimedia.org/r/34264 have been merged, and new content in Swift comes with CORS now.

As explained earlier, the same doesn't apply for already cached content; not sure what you want to do about the bug report.

faidon added a comment.Via ConduitNov 20 2012, 7:01 AM

Oh, something a bit related too: for entirely different reasons, upload.wikimedia.org has also been serving a crossdomain.xml with <allow-access-from domain="*"/> for a while now.

ori added a comment.Via ConduitJan 9 2013, 11:11 AM

(In reply to comment #21)

As explained earlier, the same doesn't apply for already cached content; not
sure what you want to do about the bug report.

The dream of every engineer is that a problem will resolve itself if you just ignore it long enough. This is one of those rare cases where it actually happens to be true.
