
Enable cross-origin resource sharing (CORS) in Core REST API
Open, Needs Triage (Public)

Description

Origin

The Access-Control-Allow-Origin response header should be set on the responses to all requests and given a default value of * for wikis that are not on an intranet (i.e. behind a firewall). It is completely safe to set this as the default value. MediaWiki should allow this behavior to be disabled if you are running MediaWiki on an intranet.

Doing this will provide for a much better developer experience as developers will be able to use the API from another origin automatically.

Related: T210790

Proposed Solution
Add Access-Control-Allow-Origin: * to all requests (config option to disable)
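
A minimal sketch of this proposal, written in Python rather than MediaWiki's PHP purely for illustration. The function name `add_default_cors_header` and the `allow_any_origin` switch are hypothetical; they stand in for whatever config option would actually be added.

```python
def add_default_cors_header(response_headers: dict, allow_any_origin: bool = True) -> dict:
    """Return a copy of the response headers with the default CORS header applied.

    allow_any_origin is a hypothetical stand-in for the proposed config option;
    an intranet wiki would set it to False.
    """
    headers = dict(response_headers)
    if allow_any_origin:
        headers["Access-Control-Allow-Origin"] = "*"
    return headers
```

With the default setting every response carries `Access-Control-Allow-Origin: *`; with the switch turned off the header is simply not added and the browser's same-origin policy applies.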

Credentials

If the API allows for authorization with the authorization code grant (or some other authorization mechanism that does not force the client app to expose its own secrets), then it is safe to add Access-Control-Allow-Headers with a value of Authorization (this header only needs to be added in response to an OPTIONS request). This would allow non-whitelisted origins to make cross-origin authenticated requests.

If the API allows for browser-based authorization (i.e. Cookies) then the API will need to use the origin whitelist like the Action API does and add the Access-Control-Allow-Credentials header to responses to OPTIONS requests from those whitelisted origins.

Regardless, since Authorization and Cookie headers bypass the cache, it is not necessary to vary a request by Origin.

Proposed Solution

  1. Ignore all Cookies and HTTP Basic Authorization
  2. Allow OAuth2's authorization code grant
  3. Add Access-Control-Allow-Headers: Authorization to all OPTIONS requests

---OR---

  1. Use the existing origin whitelist and only do the following actions from one of those Origins:
  2. Add Access-Control-Allow-Origin: <Origin Requested> to all OPTIONS requests
  3. Add Access-Control-Allow-Credentials: true to all OPTIONS requests
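
The two alternatives above could be sketched roughly as follows. This is an illustration in Python, not MediaWiki code: both function names are hypothetical, each returns only the extra headers for an OPTIONS response, and the whitelist check is a plain exact match rather than the pattern matching the real whitelist uses.

```python
def preflight_extra_headers_token_only() -> dict:
    """First alternative: cookies and HTTP Basic are ignored, so any origin
    may be allowed to send an Authorization header."""
    return {"Access-Control-Allow-Headers": "Authorization"}


def preflight_extra_headers_whitelist(origin: str, allowed_origins: set) -> dict:
    """Second alternative: only whitelisted origins get credentialed access.

    Exact string matching is an assumption made to keep the sketch short.
    """
    if origin in allowed_origins:
        return {
            "Access-Control-Allow-Origin": origin,
            "Access-Control-Allow-Credentials": "true",
        }
    # Non-whitelisted origins get no extra headers in this sketch.
    return {}
```

For example, `preflight_extra_headers_whitelist("https://a.example", {"https://a.example"})` echoes the origin back and allows credentials, while any other origin gets nothing extra.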

Event Timeline

dbarratt renamed this task from Enable cross-origin resource sharing to Enable cross-origin resource sharing (CORS). Sep 6 2019, 12:09 AM
dbarratt created this task.
Anomie added a subscriber: Anomie.

Seems like effectively a duplicate of T210790: Should the Action API allow cross-origin requests by default? to me.

I'm not going to unilaterally close this as such, but such a close is my recommendation as, absent evidence that there exists some significant difference in considerations that apply to the two APIs, it would be best not to duplicate discussion.

Thanks @dbarratt ! @Anomie , I think the authentication profiles are different between the Action API and the core REST API so we should consider this separately.

I don't know what "authentication profiles" means. I do know that the current plan seems to be for the core REST API to use the same SessionManager sessions as the rest of MediaWiki rather than trying to reinvent everything.

I guess I'll copy over concerns here from the other ticket then. I hope I won't have to re-debate them too.

  • Every response would have to vary on the Origin header. It needs to be confirmed that that won't have harmful effects on WMF's caching layers.
    • Currently the Action API only varies on Origin for CORS requests claiming to be from a whitelisted origin. Non-CORS requests and "anonymous" CORS requests do not vary on Origin.
  • CSRF tokens must not be returned for CORS requests from unwhitelisted origins.
    • It should not create a session that the client will be unable to use thanks to Access-Control-Allow-Credentials not being true (cf. T125267).
    • It should not "succeed" with a token that will fail verification later, that's poor UX.
    • All of this also applies to other types of tokens that aren't strictly CSRF tokens (see T210790#5443645).
  • Ideally, it should ensure that the User is not a logged-in User for any CORS request from unwhitelisted origins.
    • It should already be so thanks to Access-Control-Allow-Credentials not being true, but multiple layers of defense are good.

Regarding "whitelisted origins", see $wgCrossSiteAJAXdomains and $wgCrossSiteAJAXdomainExceptions.
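
As a rough illustration of what a whitelist check with exceptions looks like, here is a Python sketch. It assumes each entry is a domain pattern in which `*` matches any run of characters and `?` matches a single character, and that an origin is compared with an `http://` or `https://` prefix; those matching rules are an assumption for the sketch and are not confirmed anywhere in this task. The function names are hypothetical.

```python
import re


def pattern_to_regex(pattern: str):
    """Turn a domain pattern with * and ? wildcards into a compiled regex."""
    escaped = re.escape(pattern).replace(r"\*", ".*").replace(r"\?", ".")
    return re.compile(r"https?://" + escaped)


def origin_is_whitelisted(origin: str, domains, exceptions=()) -> bool:
    """True if origin matches a whitelist pattern and no exception pattern."""
    if any(pattern_to_regex(p).fullmatch(origin) for p in exceptions):
        return False
    return any(pattern_to_regex(p).fullmatch(origin) for p in domains)
```

With `domains=["*.wikipedia.org"]` and `exceptions=["test.wikipedia.org"]`, `https://en.wikipedia.org` is accepted while `https://test.wikipedia.org` and `https://evil.example` are rejected.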

dbarratt added a comment. Edited Sep 10 2019, 11:14 PM

I don't know what "authentication profiles" means. I do know that the current plan seems to be for the core REST API to use the same SessionManager sessions as the rest of MediaWiki rather than trying to reinvent everything.

I believe he's referring to another conversation we had where I asked whether the REST API should be stateless (meaning it would not use sessions).

  • Every response would have to vary on the Origin header. It needs to be confirmed that that won't have harmful effects on WMF's caching layers.

This is not the case. Unless I'm missing something. What I have implemented in does not vary every request by Origin, and I'm not sure why that would be necessary.

  • CSRF tokens must not be returned for CORS requests from unwhitelisted origins.

If the API is stateless as discussed, then there won't be CSRF tokens. Also, per your feedback, I've whitelisted the tokens endpoint from enabling cross-origin resource sharing.

  • Ideally, it should ensure that the User is not a logged-in User for any CORS request from unwhitelisted origins.

Again, as I've implemented it, that is the case. Logged-in users continue to be subject to the same-origin policy unless they are coming from a whitelisted domain.

Related to CSRF tokens on stateless authentication: T126257

eprodromou renamed this task from Enable cross-origin resource sharing (CORS) to Enable cross-origin resource sharing (CORS) in Core REST API. Sep 11 2019, 2:15 PM
dbarratt updated the task description. Sep 11 2019, 3:01 PM
  • Every response would have to vary on the Origin header. It needs to be confirmed that that won't have harmful effects on WMF's caching layers.

This is not the case. Unless I'm missing something. What I have implemented in does not vary every request by Origin, and I'm not sure why that would be necessary.

If you don't vary on Origin, then what prevents a response to a request from a whitelisted origin (or a non-CORS request, for that matter) from being cached and served to a request from a non-whitelisted origin?

  • CSRF tokens must not be returned for CORS requests from unwhitelisted origins.

If the API is stateless as discussed, then there won't be CSRF tokens.

As I said when Evan first proposed that, I'd really recommend against trying to reimplement all authentication and session handling for rest.php.

That also seems pretty far from "stateless". You're merely communicating logged-in state via tokens in an Authorization header rather than a Cookie header.

Also of note is that the CSRF protection in that scheme has nothing to do with statelessness or lack thereof. It's relying on the fact that browsers typically don't automatically add OAuth Authorization headers to outgoing requests as they do add cookies and HTTP Basic or Digest auth.

If you don't vary on Origin, then what prevents a response to a request from a whitelisted origin (or a non-CORS request, for that matter) from being cached and served to a request from a non-whitelisted origin?

If all of the cached requests (from any origin) have Access-Control-Allow-Origin: * what difference does it make? Why would you need the whitelist? The only exception to this would be requests that contain a Cookie or Authorization header, but those requests are not cached anyways.
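
To make the two positions concrete, here is a toy model in Python of a shared cache sitting in front of the API. It is only an illustration and says nothing about how WMF's caching layers actually behave; `make_cache`, `uniform_backend` and `echo_backend` are hypothetical names.

```python
def make_cache(backend, vary_on_origin: bool):
    """Return a fetch function backed by a tiny in-memory cache.

    When vary_on_origin is False the cache key is the URL alone, so the
    first response stored for a URL is served to every later origin.
    """
    store = {}

    def fetch(url: str, origin: str) -> dict:
        key = (url, origin if vary_on_origin else None)
        if key not in store:
            store[key] = backend(url, origin)
        return store[key]

    return fetch


def uniform_backend(url: str, origin: str) -> dict:
    """Every response carries the same wildcard header."""
    return {"Access-Control-Allow-Origin": "*"}


def echo_backend(url: str, origin: str) -> dict:
    """Responses echo the requesting origin, as a whitelist scheme would."""
    return {"Access-Control-Allow-Origin": origin}
```

With `uniform_backend` it makes no difference which origin populated the cache, since every stored response is identical. With `echo_backend` and no variation on Origin, a second origin receives the header that was generated for the first one, which is the situation the question about varying on Origin is asking about.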

That also seems pretty far from "stateless". You're merely communicating logged-in state via tokens in an Authorization header rather than a Cookie header.

The difference between a standard Cookie (or Authorization) and a Session, is that with a session, the server is aware of the users who are logged in.

If you remove that state from the server, then you're right, both are stateless. However, the browser will automatically send all cookies with every request (even if you are not able to read those cookies from the client); it will not send the Authorization header automatically (unless it's Basic, I believe, but subsequent requests might not auto-attach if they are cross-origin anyways... I'm not certain about that one).

Also of note is that the CSRF protection in that scheme has nothing to do with statelessness or lack thereof. It's relying on the fact that browsers typically don't automatically add OAuth Authorization headers to outgoing requests as they do add cookies and HTTP Basic or Digest auth.

That is correct. Apologies, I didn't mean to imply that it was purely a result of sessions themselves. It is a side-effect of credentials (mostly Cookies) being automatically attached to requests by the browsers.

So, when I'm saying stateless I mean that

  1. the server is not aware of the logged-in state
  2. the client's authorization token is not transferred to the server in an automated way

Yes, technically those are two different things, but I'm not sure what would be the point of having 2 without 1 (though I can see a case for having 1 on its own). There is probably a better term for what I'm describing, I'm just not aware of it. :)

I created a task for discussing making MediaWiki unaware of the logged in status of its users: T232692

Ok, let me rephrase this in the context of this task...

If the REST API were to force authorization through a token in the Authorization header (regardless of how that session is stored or not stored on the server), that would allow authenticated cross-origin requests without a domain whitelist (since the risk of CSRF has been mitigated).

the server is not aware of the logged-in state

the client's authorization token is not transferred to the server in an automated way

I think what you mean, is that the server is able to evaluate what the user's session is without accessing any server-side state (e.g. You have an encrypted/authenticated cookie containing the entirety of the user's session).

While that is certainly possible, I'm not sure why it would be a good idea or help in this situation.

the client's authorization token is not transferred to the server in an automated way

I guess you mean something like SameSite=lax cookie parameter? That would probably be a good idea for CSRF prevention in general, and seems pretty independent of the first part.

@Bawolff I realized above, that I was conflating two different things (storage of the sessions and how the session token is transmitted to the server). What I mean is this:

If the REST API were to force authorization through a token in the Authorization header (regardless of how that session is stored or not stored on the server), that would allow authenticated cross-origin requests without a domain whitelist (since the risk of CSRF has been mitigated).

Credentials
If the API allows for authorization with the authorization code grant (or some other authorization mechanism that is stateless and does not force the client app to expose its own secrets), then it is safe to add Access-Control-Allow-Credentials. However, if the user adds the credentials the Access-Control-Allow-Origin will need to be specific to the Origin requested. Since credentialed requests are not cached anyways, this shouldn't be a problem.

I'm confused. Are you talking about setting Access-Control-Allow-Credentials: true when Access-Control-Allow-Origin is *? My reading of https://fetch.spec.whatwg.org/#cors-protocol-and-credentials is that that is banned in the spec.
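
For reference, the check a browser runs on the response can be sketched like this. It is a simplified Python rendering of the "CORS check" steps in the Fetch standard, not browser code: header lookup is a plain case-sensitive dict access, and `cors_check` is a name chosen for this sketch.

```python
def cors_check(request_origin: str, credentials_mode: str, response_headers: dict) -> bool:
    """Simplified CORS check: may the page read this cross-origin response?

    credentials_mode is one of "omit", "same-origin" or "include".
    """
    allow_origin = response_headers.get("Access-Control-Allow-Origin")
    if allow_origin is None:
        return False
    # The wildcard is only honoured when no credentials are being sent.
    if credentials_mode != "include" and allow_origin == "*":
        return True
    if allow_origin != request_origin:
        return False
    if credentials_mode != "include":
        return True
    return response_headers.get("Access-Control-Allow-Credentials") == "true"
```

So a response with `Access-Control-Allow-Origin: *` and `Access-Control-Allow-Credentials: true` still fails for a credentialed request, because `*` is compared literally against the request's origin in that mode.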

dbarratt updated the task description. Sep 12 2019, 4:44 PM

I'm confused. Are you talking about setting Access-Control-Allow-Credentials: true when Access-Control-Allow-Origin is *? My reading of https://fetch.spec.whatwg.org/#cors-protocol-and-credentials is that that is banned in the spec.

We would only add Access-Control-Allow-Credentials: true if the request included an Authorization header (which bypasses the cache), if it did it would reply back with Access-Control-Allow-Origin: <the request Origin>. (Again, this assumes we wouldn't be using Cookies, if we are then we would still need the whitelist like the Action API).

I'm confused. Are you talking about setting Access-Control-Allow-Credentials: true when Access-Control-Allow-Origin is *? My reading of https://fetch.spec.whatwg.org/#cors-protocol-and-credentials is that that is banned in the spec.

We would only add Access-Control-Allow-Credentials: true if the request included an Authorization header (which bypasses the cache), if it did it would reply back with Access-Control-Allow-Origin: <the request Origin>. (Again, this assumes we wouldn't be using Cookies, if we are then we would still need the whitelist like the Action API).

So in this scenario, if we're using solely the Authorization header and no cookies (and I presume using our own Authorization header, not HTTP Basic auth or anything like that), why would we need Access-Control-Allow-Credentials?

So in this scenario, if we're using solely the Authorization header and no cookies (and I presume using our own Authorization header, not HTTP Basic auth or anything like that), why would we need Access-Control-Allow-Credentials?

If we wanted to allow authenticated cross-origin requests.

Example: From a web application, I can have you login using OAuth2 (authorization code grant) and make authenticated edits from the web application without needing a proxy server.

Now if we don't want to do that, that's totally fine, but since the REST API is new anyways, you get to make these types of decisions that you can't necessarily make with the Action API. This explains the genesis of this conversation, which is how T210790 is different from this task. :)

But if your authentication is coming from an Authorization header, and no cookies are involved, then Access-Control-Allow-Credentials would have no effect, as that only controls browser-level credentials (cookies, TLS certs, HTTP Basic auth, etc.)

dbarratt added a comment. Edited Sep 12 2019, 5:15 PM

But if your authentication is coming from an Authorization header, and no cookies are involved, then Access-Control-Allow-Credentials would have no effect, as that only controls browser-level credentials (cookies, TLS certs, HTTP Basic auth, etc.)

hmm, so the mozilla docs say:

Credentials are cookies, authorization headers or TLS client certificates.

I assumed that meant any Authorization header, are you saying a non-basic value doesn't apply? (if so, that's fascinating!)

Bawolff added a comment. Edited Sep 12 2019, 5:20 PM

This is kind of an obscure aspect of CORS, and I certainly haven't tested it, so I might be wrong, but: https://fetch.spec.whatwg.org/#credentials says "Credentials are HTTP cookies, TLS client certificates, and authentication entries (for HTTP authentication). [COOKIES] [TLS] [HTTP-AUTH] ". My reading of that, combined with "A CORS non-wildcard request-header name is a byte-case-insensitive match for Authorization." would be that Access-Control-Allow-Credentials just affects browser-managed credentials, and if you explicitly put Authorization in the allowed headers (must be explicit, it is not included with a wildcard), then you can override the value of the Authorization header. But I haven't tested it, and CORS is complex, so I may be misunderstanding.

My reading of that, combined with "A CORS non-wildcard request-header name is a byte-case-insensitive match for Authorization." would be that Access-Control-Allow-Credentials just affects browser-managed credentials, and if you explicitly put Authorization in the allowed headers (must be explicit, it is not included with a wildcard), then you can override the value of the Authorization header.

If that is the case (which I think makes sense) then that's even better, because then we shouldn't need to do anything for the authenticated requests to work cross-origin (assuming we use a custom Authorization)?

Errr.. ok, we would need Access-Control-Allow-Headers: Authorization :)
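
A small Python sketch of the preflight rule being discussed, following the header-name steps of the Fetch standard's CORS-preflight algorithm in simplified form. The function name is made up for this sketch, and `request_header_names` is assumed to already contain only the non-safelisted headers the script wants to send.

```python
def preflight_allows_headers(request_header_names, allow_headers_value: str,
                             credentials_mode: str = "same-origin") -> bool:
    """Would a preflight response with this Access-Control-Allow-Headers value
    let the browser send the given request headers?"""
    allowed = {name.strip().lower() for name in allow_headers_value.split(",") if name.strip()}
    requested = [name.lower() for name in request_header_names]
    # Authorization is never covered by the wildcard; it must be listed by name.
    if "authorization" in requested and "authorization" not in allowed:
        return False
    # For other headers, * only counts when credentials are not included.
    wildcard_applies = "*" in allowed and credentials_mode != "include"
    return all(name in allowed or wildcard_applies for name in requested)
```

Under this reading, `Access-Control-Allow-Headers: *` is not enough for a request carrying a token in Authorization, while `Access-Control-Allow-Headers: Authorization` is.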

dbarratt updated the task description. Sep 12 2019, 5:51 PM

Ok, let me rephrase this in the context of this task...
If the REST API were to force authorization through a token in the Authorization header (regardless of how that session is stored or not stored on the server), that would allow authenticated cross-origin requests without a domain whitelist (since the risk of CSRF has been mitigated).

Also that would mean we'd be requiring registration for anyone to be able to use the REST API. Anons would be locked out.

I'm also disturbed at how complex this is getting. Now we're talking about a custom Authorization header, not even OAuth which is already significantly complex? All to try to work around a "problem" (CSRF tokens) that isn't even much of a problem in the first place.

Also that would mean we'd be requiring registration for anyone to be able to use the REST API. Anons would be locked out.

Why?

I'm also disturbed at how complex this is getting. Now we're talking about a custom Authorization header, not even OAuth which is already significantly complex? All to try to work around a "problem" (CSRF tokens) that isn't even much of a problem in the first place.

OAuth is a custom Authorization header. :) (or rather, in the context that it is not managed by the browser)

dbarratt updated the task description. Sep 12 2019, 8:22 PM