Page MenuHomePhabricator

Enable cross-origin resource sharing (CORS) for requests in Core REST API
Closed, ResolvedPublic

Description

Origin

The Access-Control-Allow-Origin response header should be set for all requests and given a default value of *. for wikis that are not on an intranet (i.e. behind a firewall). It is completely safe to set this as the default value. MediaWiki should allow this behavior to be disabled if you are running MediaWiki on an intranet.

Doing this will provide for a much better developer experience as developers will be able to use the API from another origin automatically.

Proposed Solution
Add Access-Control-Allow-Origin: * to all requests (config option to disable)

Credentials

If the API allows for authorization with the authorization code grant (or some other authorization mechanism that and does not force the client app to expose it's own secrets), then it is safe to add Access-Control-Allow-Headers with a value of Authorization (this header only needs to be added as as response to an OPTIONS request). This would allow non-whitelisted origins to make cross-origin authenticated requests.

If the API allows for browser-based authorization (i.e. Cookies) then the API will need to use the origin allowlist like the Action API does and add the Access-Control-Allow-Credentials to an OPTIONS request from those whitelisted origins.

Regardless, since Authorization and Cookie headers bypass the cache, it is not necessary to Vary the non-OPTIONS request by Origin.

Proposed Solution 1
Allow cross-origin requests using OAuth (requires OAuth2's authorization code grant):

  1. Add Access-Control-Allow-Origin: * to all OPTIONS requests
  2. Add Access-Control-Allow-Headers: Authorization, Content-Type to all OPTIONS requests
  3. Add Access-Control-Allow-Methods: * to all OPTIONS requests

Proposed Solution 2
Allow cross-origin requests using Cookies:

  1. Add Vary: Origin to all requests (maybe we can get away with just OPTIONS requests?)
  2. Use the existing origin allowlist and only do the following actions from one of those Origins:
    1. Add Access-Control-Allow-Credentials: true to all requests
    2. Add Access-Control-Allow-Origin: <Origin Requested> to all requests
    3. Add Access-Control-Allow-Headers: Content-Type to all OPTIONS requests
    4. Add Access-Control-Allow-Methods: HEAD, GET, POST, PUT, PATCH, DELETE to all OPTIONS requests

Proposed Solution 3
Allow cross-origin requests with OAuth2 (requires OAuth2's authorization code grant) and Cookies:

  1. Add Vary: Origin to all requests (maybe we can get away with just OPTIONS requests?)
  2. Add Access-Control-Allow-Headers: Authorization, Content-Type to all OPTIONS requests
  3. Use the existing origin allowlist and only do the following actions from one of those Origins:
    1. Add Access-Control-Allow-Origin: <Origin Requested> to all requests
    2. Add Access-Control-Allow-Credentials: true to all requests
    3. Add Access-Control-Allow-Methods: HEAD, GET, POST, PUT, PATCH, DELETE to all OPTIONS requests
  4. If the origin is not on the allowlist:
    1. Add Access-Control-Allow-Origin: * to all requests
    2. Add Access-Control-Allow-Methods: * to all OPTIONS requests
Related

Related Objects

Event Timeline

There are a very large number of changes, so older changes are hidden. Show Older Changes

I created a task for discussing making MediaWiki unaware of the logged in status of its users: T232692

Ok, let me rephrase this in the context of this task...

If the REST API were to force authorization through a token in the Authorization header (regardless of how that session is stored or not stored or the server), that would allow the authenticated, cross-origin requests without a domain whitelist (since the risk of CSRF has been mitigated).

the server is not aware of the logged-in state

the clients authorization token is not transferred to the server in an automated way

I think what you mean, is that the server is able to evaluate what the user's session is without accessing any server-side state (e.g. You have an encrypted/authenticated cookie containing the entirety of the user's session).

Well that is certainly possible, I'm not sure why it would be a good idea or help in this situation.

the clients authorization token is not transferred to the server in an automated way

I guess you mean something like SameSite=lax cookie parameter? That would probably be a good idea for CSRF prevention in general, and seems pretty independent of the first part.

@Bawolff I realized above, that I was conflating two different things (storage of the sessions and how the session token is transmitted to the server). What I mean is this:

If the REST API were to force authorization through a token in the Authorization header (regardless of how that session is stored or not stored or the server), that would allow the authenticated, cross-origin requests without a domain whitelist (since the risk of CSRF has been mitigated).

Credentials
If the API allows for authorization with the authorization code grant (or some other authorization mechanism that is stateless and does not force the client app to expose it's own secrets), then it is safe to add Access-Control-Allow-Credentials. >However, if the user adds the credentials the Access-Control-Allow-Origin will need to be specific to the Origin requested. Since credentialed requests are not cached anyways, this shouldn't be a problem.

I'm confused. Are you talking about setting Access-Control-Allow-Credentials: true when Access-Control-Allow-Origin: *. My reading of https://fetch.spec.whatwg.org/#cors-protocol-and-credentials is that that is banned in the spec.

I'm confused. Are you talking about setting Access-Control-Allow-Credentials: true when Access-Control-Allow-Origin: *. My reading of https://fetch.spec.whatwg.org/#cors-protocol-and-credentials is that that is banned in the spec.

We would only add Access-Control-Allow-Credentials: true if the request included an Authorization header (which bypasses the cache), if it did it would reply back with Access-Control-Allow-Origin: <the request Origin>. (Again, this assumes we wouldn't be using Cookies, if we are then we would still need the whitelist like the Action API).

I'm confused. Are you talking about setting Access-Control-Allow-Credentials: true when Access-Control-Allow-Origin: *. My reading of https://fetch.spec.whatwg.org/#cors-protocol-and-credentials is that that is banned in the spec.

We would only add Access-Control-Allow-Credentials: true if the request included an Authorization header (which bypasses the cache), if it did it would reply back with Access-Control-Allow-Origin: <the request Origin>. (Again, this assumes we wouldn't be using Cookies, if we are then we would still need the whitelist like the Action API).

So in this scenario, if we're using solely the authorization header and no cookies (And i presume using our own authorization header, not HTTP basic auth or anything like that), why would we need Access-Control-Allow-credentials?

So in this scenario, if we're using solely the authorization header and no cookies (And i presume using our own authorization header, not HTTP basic auth or anything like that), why would we need Access-Control-Allow-credentials?

If we wanted to allow authenticated cross-origin requests.

Example: From a web application, I can have you login using OAuth2 (authorization code grant) and make authenticated edits from the web application without needing a proxy server.

Now if we don't want to do that, that's totally fine, but since the REST API is new anyways, you get to make these types of decisions that you can't necessarily make with the Action API. This explains the genesis of this conversation, which is how T210790 is different from this task. :)

But if your authentication is coming from an Authorization header, and no cookies are involved, the Access-Control-Allow-credentials would have no effect, as that only controls browser level credentials (cookies, TLS certs, http basic auth, etc)

But if your authentication is coming from an Authorization header, and no cookies are involved, the Access-Control-Allow-credentials would have no effect, as that only controls browser level credentials (cookies, TLS certs, http basic auth, etc)

hmm, so the mozilla docs say:

Credentials are cookies, authorization headers or TLS client certificates.

I assumed that meant any Authorization header, are you saying a non-basic value doesn't apply? (if so, that's fascinating!)

This is kind of an obscure aspect of CORS, and i certainly haven't tested it, so i might be wrong, but: https://fetch.spec.whatwg.org/#credentials says "Credentials are HTTP cookies, TLS client certificates, and authentication entries (for HTTP authentication). [COOKIES] [TLS] [HTTP-AUTH] ". My reading of that, combined with "A CORS non-wildcard request-header name is a byte-case-insensitive match for Authorization." would be that allow-credential just affect browser managed credentials, and if you explicitly put Authorization in the allowed headers (must be explicit, it is not included with a wildcard), then you can override the value of the Authorization header. But I haven't tested it, and CORS is complex, so I may be misunderstanding.

My reading of that, combined with "A CORS non-wildcard request-header name is a byte-case-insensitive match for Authorization." would be that allow-credential just affect browser managed credentials, and if you explicitly put Authorization in the allowed headers (must be explicit, it is not included with a wildcard), then you can override the value of the Authorization header.

If that is the case (which I think that makes sense to me) then that's even better because then we shouldn't need to do anything for the authenticated requests to work cross-origin (assuming we use a custom Authorization)?

Errr.. ok, we would need Access-Control-Allow-Headers: Authorization :)

Ok, let me rephrase this in the context of this task...

If the REST API were to force authorization through a token in the Authorization header (regardless of how that session is stored or not stored or the server), that would allow the authenticated, cross-origin requests without a domain whitelist (since the risk of CSRF has been mitigated).

Also that would mean we'd be requiring registration for anyone to be able to use the REST API. Anons would be locked out.

I'm also disturbed at how complex this is getting. Now we're talking about a custom Authorization header, not even OAuth which is already significantly complex? All to try to work around a "problem" (CSRF tokens) that isn't even much of a problem in the first place.

Also that would mean we'd be requiring registration for anyone to be able to use the REST API. Anons would be locked out.

Why?

I'm also disturbed at how complex this is getting. Now we're talking about a custom Authorization header, not even OAuth which is already significantly complex? All to try to work around a "problem" (CSRF tokens) that isn't even much of a problem in the first place.

OAuth is a custom Authorization header. :) (or rather, in the context that it is not managed by the browser)

Change 608190 had a related patch set uploaded (by Dbarratt; owner: Dbarratt):
[mediawiki/core@master] Add option to enable cross-origin resource sharing (CORS) in REST API

https://gerrit.wikimedia.org/r/c/mediawiki/core/ /608190

Change 608190 merged by jenkins-bot:
[mediawiki/core@master] Add option to enable cross-origin resource sharing (CORS) in REST API

https://gerrit.wikimedia.org/r/608190

I guess a separate task should be opened for credential'd cross-origin requests?

@dbaratt My bad! Reopening it. No need to create a separate ticket for that.

@dbaratt My bad! Reopening it. No need to create a separate ticket for that.

Sweet. I should clarify that this task only covers GET requests, but seems logical to get that worked out first before moving on to writable methods.

dbarratt renamed this task from Enable cross-origin resource sharing (CORS) in Core REST API to Enable cross-origin resource sharing (CORS) for GET requests in Core REST API.Aug 17 2020, 2:25 PM
dbarratt renamed this task from Enable cross-origin resource sharing (CORS) for GET requests in Core REST API to Enable cross-origin resource sharing (CORS) for requests in Core REST API.Aug 17 2020, 2:27 PM
dbarratt updated the task description. (Show Details)

Sweet. I should clarify that this task only covers GET requests, but seems logical to get that worked out first before moving on to writable methods.

Errr.... Adding Access-Control-Allow-Methods to either option for supporting credentials will fix this for writable methods as well. :)

dbarratt updated the task description. (Show Details)

@eprodromou & @holger.knust I've updated that task description with three options. I think Option 3 is the best as it gives the most amount of flexibility for callers. It doesn't require that core knows whether OAuth2 is enabled or not as if MediaWiki has no way to handle a custom Authorization then the request will just fail authentication anyways.

We could also put the additional handling (a new hook?) in the OAuth extension. Basically it needs to handle OPTIONS requests and add additional headers (even if MediaWiki normally wouldn't).

@eprodromou & @holger.knust I've updated that task description with three options. I think Option 3 is the best as it gives the most amount of flexibility for callers. It doesn't require that core knows whether OAuth2 is enabled or not as if MediaWiki has no way to handle a custom Authorization then the request will just fail authentication anyways.

I would be happy with Option 1. I think requiring OAuth 2.0 for any authenticated cross-site requests is more than fair.

So, unauthenticated read-only requests (GET, HEAD, OPTIONS...?) are OK. OAuth 2.0 authenticated write requests (POST, PUT, PATCH, DELETE) are OK. Everything else is forbidden.

We don't currently have any legacy calls across sites to the REST API, so I think we can do the right thing without needing to support older structures.

I would be happy with Option 1. I think requiring OAuth 2.0 for any authenticated cross-site requests is more than fair.

Awesome. I can write a patch for this.

So, unauthenticated read-only requests (GET, HEAD, OPTIONS...?) are OK. OAuth 2.0 authenticated write requests (POST, PUT, PATCH, DELETE) are OK. Everything else is forbidden.

Right. And presumably, OAuth 2.0 authenticated GET reqeusts would be fine. There aren't many requests that require authentication for GET, but I can imagine there might be in the future, especially for things Anti-Harassment is working on.

We don't currently have any legacy calls across sites to the REST API, so I think we can do the right thing without needing to support older structures.

Just to clarify, this means that pages on ja.wikipedia.org will not be able to make authenticated requests to en.wikipedia.org without using OAuth. Is that acceptable? I think that is fine, but just want to make sure. Basically the user will need to explicitly authenticate to the other wiki there will be no passive authentication (like with the Action API). The only way I can see that being a problem is if someone wanted to query all or most of the wikis from the browser, but they probably shouldn't be doing that anyways. :)

Just to clarify, this means that pages on ja.wikipedia.org will not be able to make authenticated requests to en.wikipedia.org without using OAuth. Is that acceptable? I think that is fine, but just want to make sure. Basically the user will need to explicitly authenticate to the other wiki there will be no passive authentication (like with the Action API). The only way I can see that being a problem is if someone wanted to query all or most of the wikis from the browser, but they probably shouldn't be doing that anyways. :)

Err..... they could implement a passive authentication by sharing the same OAuth2 Authentication keys. :)

dbarratt updated the task description. (Show Details)

@eprodromou & @holger.knust Sorry, I have another question... I assume the REST API does support making an edit without authentication (i.e. a POST to an edit endpoint?) If so, we should probably reject an unauthenticated write (anything other than HEAD or GET) where the Origin header does not match the current wiki's origin. I think this should be safe to do without adding a Vary: Origin because (afaik) only HEAD and GET are cached. (thought it also shouldn't hurt anything to add it to be safe).

We need the application to handle this, because CORS should allow it to happen (they might be authenticated), but the application I think should prevent it, otherwise we end up with issues like T40417.

If it does not allow unauthenticated writes, then what I've outlined is fine. :)

Change 621900 had a related patch set uploaded (by Dbarratt; owner: Dbarratt):
[mediawiki/core@master] Handle CORS preflight request and prevent anon users from unsafe methods

https://gerrit.wikimedia.org/r/621900

The patch I've submitted should do the trick. But please review. :)

Just to clarify, this means that pages on ja.wikipedia.org will not be able to make authenticated requests to en.wikipedia.org without using OAuth.

Ok, this would also mean that none of the sites could use the new API Gateway (api.wikimedia.org). It would be fine for now since the gateway only exposes APIs that are also available under the wikis own domain, but as I understand ultimately we would want to use it in our own code as well.

I'd root for option 3 as the most flexible one to implement in core. If for some reason we'd want different CORS rules for APIs exposed under API.wikimedia.org, we'd be able to override them in envoy.

dbarratt updated the task description. (Show Details)

AH! I forgot that Access-Control-Allow-Credentials is a response header for simple requests (not just OPTIONS). This means we'd need to add Vary: Origin to all requests if we want to support Option 2 or Option 3. We could add a query parameter with the origin, which might save a little bit of cache variance, but I can't imagine it would be a ton.

An update from the discussion over T261358

We have agreed to go with Option 1 as a default, but leave a capability for the developer to opt-in to Option 3 by overriding a method in a Handler class. Manually mangling the CORS headers will still be disallowed.

@dbarratt is that your understanding as well? Do you want to amend your patch to support the revised proposal, or do you want me to take over? Alternatively, we can get your patch in as pure Option 1, and I'll do the revised proposal as a followup.

@dbarratt is that your understanding as well? Do you want to amend your patch to support the revised proposal, or do you want me to take over? Alternatively, we can get your patch in as pure Option 1, and I'll do the revised proposal as a followup.

I think the only thing that needs to be revised is what is in T261053 ? But if you could review both patches that would be much appreciated. :)

I had an idea in T261696#6427063 that we should use Option 1 by default, but add a config variable to enable Option 3. Effectively a wiki could opt-into allowing cookie auth for cross-origin requests. This would be enabled on Meta, but disabled on all other wikis. This should fix @Pchelolo's dilemma while also limiting the performance impact.

We could add a query parameter with the origin, which might save a little bit of cache variance, but I can't imagine it would be a ton.

It would reduce the number of cache entries per URL from the number of domains those requests come from to the number of whitelisted domains. That's a pretty big change. Chances are we only want to cache responses to requests which do not require authentication, so maybe we could only add Vary for those, but that's easy and dangerous to mess up.

I suppose we could mangle Origin headers at the edge cache layer if we are desperate.

Disallowing cross-domain cookie authentication would make life harder for many technical contributors who are probably not familiar with OAuth and write gadgets in their limited free time. Although maybe a good abstraction layer in the JS libraries could solve that (much like mw.Api hides the complexity of token mechanics).
It would also make testing and debugging harder (right now I can share an URL for an authenticated GET API call and it just works).

OTOH cross-domain via cookies is somewhat fragile as the cookie/session for some specific domain can expire without warning; OAuth does not have that problem.

So, I'm lost in this discussion, and we're still hitting CORS errors. What needs to happen to let our Web team use the REST search API endpoint?

So, I'm lost in this discussion, and we're still hitting CORS errors. What needs to happen to let our Web team use the REST search API endpoint?

https://gerrit.wikimedia.org/r/c/mediawiki/core/+/621900 needs to happen for CORS support.
Web team can just use the same domain since the search endpoint is exposed on per-domain basis.

Awesome. I'm not sure how this got jammed up. I asked for unauthenticated + OAuth 2.0 authenticated calls to get done ASAP, and we could work out what to do with Cookie-authenticated calls later. But it seems like we needed to have the Cookie-authenticated discussion first.

Regardless, I'm fine moving forward with this. The writeable REST endpoints require a CSRF token under Cookie authentication, so I think we're OK to move this forward. Consider proposal 3 to have PM sign-off, and please merge ASAP to get our Web team unblocked.

Web team can just use the same domain since the search endpoint is exposed on per-domain basis.

Apparently the Web team does their development on localhost or other domains, but make API calls to the production domains, so they are running into this problem.

Apparently the Web team does their development on localhost or other domains, but make API calls to the production domains, so they are running into this problem.

Could your team deploy a patch to enable $wgAllowCrossOrigin on all wikis? That would be the first step and enable anon GET/HEAD requests. :)

It looks like the Search endpoint is a GET so deploying a config patch should unblock the team. :)

Apparently the Web team does their development on localhost or other domains, but make API calls to the production domains, so they are running into this problem.

Could your team deploy a patch to enable $wgAllowCrossOrigin on all wikis? That would be the first step and enable anon GET/HEAD requests. :)

It looks like the Search endpoint is a GET so deploying a config patch should unblock the team. :)

I'm pretty sure @Pchelolo is working on this right now.

Change 621900 merged by BPirkle:
[mediawiki/core@master] Handle CORS preflight request and prevent anon users from unsafe methods

https://gerrit.wikimedia.org/r/621900