Page MenuHomePhabricator

Allow cross-origin requests by default in the Action API
Open, MediumPublic

Description

Problem
I have noticed that many developers are not aware of how to bypass the browser's same-origin policy when making requests to MediaWiki's API from the browser on a different domain. These developers typically fallback to using JSONP to achieve their goals.

I personally was tripped up by this when first learning the Action API. I think the developer experience can be better without sacrificing security.

Proposed Solution
MediaWiki could add configuration that would add the Access-Control-Allow-Origin: * to all API requests. This is safe as long as MediaWiki is not being run on an intranet:

It is completely safe to augment any resource with Access-Control-Allow-Origin: * as long as the resource is not part of an intranet (behind a firewall).

https://annevankesteren.nl/2012/12/cors-101

A wiki should be able to opt-out (by default?) of this behavior since it is not safe to do this if the wiki is non-private and on an intranet (i.e. the only thing that secures the content is the firewall itself). However, it is currently possible to bypass this T210791. Making this configurable, would therefore increase the security of these wikis.

The origin parameter should only be used by whitelisted domains and * would have no effect.
For same-origin requests, the request will either send the credentials (Cookie or Authorization headers) which will bypass the cache, or, if the user does not have a session, it wont, in which case the cached cross-origin request is fine (and perhaps preferable?).

Related

Event Timeline

There are a very large number of changes, so older changes are hidden. Show Older Changes

Note previous discussion on T62835 - perhaps it was unnecessarily paranoid.

Anomie subscribed.

Setting the origin parameter to * has side effects you wouldn't want on a non-CORS request:

  • Sets $wgUser and the ContextSource user to an anonymous user.
  • Sends Access-Control-Allow-Credentials: false.
  • Disables all CSRF token fetching methods.
  • Disables action=login.

There has been in the past a somewhat similar suggestion to remove the origin parameter entirely, and rely only on the Origin HTTP header. The main drawback there is that it would mean every API response would have to vary on the Origin header, which might have serious negative implications for what caching the action API does do.

I'll leave this request open for the time being, but if anyone wants to move forward on this request they'd need to address the above issues.

  • Sets $wgUser and the ContextSource user to an anonymous user.

By default, the browser will not send credentials unless the preflight response has Access-Control-Allow-Credentials: true
https://developer.mozilla.org/en-US/docs/Web/HTTP/Headers/Access-Control-Allow-Credentials
so this behavior should be removed.

  • Sends Access-Control-Allow-Credentials: false.

The default is false so this is pointless. In fact, Mozilla's documentation says it should be omitted rather than set to false
https://developer.mozilla.org/en-US/docs/Web/HTTP/Headers/Access-Control-Allow-Credentials#Directives
so this behavior should be removed.

  • Disables all CSRF token fetching methods.

By default, the browser will not send credentials unless the preflight response has Access-Control-Allow-Credentials: true
https://developer.mozilla.org/en-US/docs/Web/HTTP/Headers/Access-Control-Allow-Credentials
so this behavior should be removed.

  • Disables action=login.

By default, the browser will not send credentials unless the preflight response has Access-Control-Allow-Credentials: true
https://developer.mozilla.org/en-US/docs/Web/HTTP/Headers/Access-Control-Allow-Credentials
so this behavior should be removed.

Note previous discussion on T62835 - perhaps it was unnecessarily paranoid.

Seems to be.

  • Sets $wgUser and the ContextSource user to an anonymous user.

By default, the browser will not send credentials unless the preflight response has Access-Control-Allow-Credentials: true
https://developer.mozilla.org/en-US/docs/Web/HTTP/Headers/Access-Control-Allow-Credentials
so this behavior should be removed.

If it's supposed to be an anonymous CORS request, it's safer to explicitly reject the credentials than to rely on the browser to not send them.

  • Disables all CSRF token fetching methods.

[copy-pasta]

We want an attempt to fetch CSRF tokes via anonymous CORS to explicitly fail instead of returning an anonymous user's token, for reasons I won't go into here per WP:BEANS.

  • Disables action=login.

[copy-pasta]

We don't want people logging in over anonymous CORS, particularly since if the browser isn't going to send the session cookie back then it won't work anyway but in a less-obvious manner.

If it's supposed to be an anonymous CORS request, it's safer to explicitly reject the credentials than to rely on the browser to not send them.

How is it any safer? The same-origin policy is exclusive to the browser. Any client application (other than a web browser) can send the credentials along and the server will happily accept them.

We want an attempt to fetch CSRF tokes via anonymous CORS to explicitly fail instead of returning an anonymous user's token, for reasons I won't go into here per WP:BEANS.
We don't want people logging in over anonymous CORS, particularly since if the browser isn't going to send the session cookie back then it won't work anyway but in a less-obvious manner.

The user has to explicitly set withCredentials to true when sending the request, so the functionality already exists so we are duplicating the functionality with a non-standard implementation. Even if you did set withCredentials to true the request would fail because the Access-Control-Allow-Credentials header would not be present in the preflight request. Therefor, the current behavior (which obfuscates the session change) is actually less obvious than the standard behavior.

If it's supposed to be an anonymous CORS request, it's safer to explicitly reject the credentials than to rely on the browser to not send them.

How is it any safer? The same-origin policy is exclusive to the browser. Any client application (other than a web browser) can send the credentials along and the server will happily accept them.

That's basically the point. If the request is supposed to be anonymous CORS, we shouldn't happily accept them.

We want an attempt to fetch CSRF tokes via anonymous CORS to explicitly fail instead of returning an anonymous user's token, for reasons I won't go into here per WP:BEANS.
We don't want people logging in over anonymous CORS, particularly since if the browser isn't going to send the session cookie back then it won't work anyway but in a less-obvious manner.

The user has to explicitly set withCredentials to true when sending the request, so the functionality already exists so we are duplicating the functionality with a non-standard implementation. Even if you did set withCredentials to true the request would fail because the Access-Control-Allow-Credentials header would not be present in the preflight request. Therefor, the current behavior (which obfuscates the session change) is actually less obvious than the standard behavior.

You seem to have completely missed the point. If a client sets withCredentials to true, then sure, it should fail after the preflight when the server doesn't tell it that credentials are allowed. I'm talking about the case where the client doesn't set withCredentials to true but tries to use action=login anyway. If action=login weren't disabled in that case, they'd wind up getting mysterious "session expired" errors. Better, IMO, to instead tell them that login is disabled when using request methods that bypass same-origin security (anonymous CORS, jsonp, etc).

That's basically the point. If the request is supposed to be anonymous CORS, we shouldn't happily accept them.

There is no way to stop that. The same-origin policy is already doing this as a feature of the browser. There is nothing we need to do to allow it with CORS.

You seem to have completely missed the point. If a client sets withCredentials to true, then sure, it should fail after the preflight when the server doesn't tell it that credentials are allowed. I'm talking about the case where the client doesn't set withCredentials to true but tries to use action=login anyway. If action=login weren't disabled in that case, they'd wind up getting mysterious "session expired" errors. Better, IMO, to instead tell them that login is disabled when using request methods that bypass same-origin security (anonymous CORS, jsonp, etc).

I understand your point, I'm saying it doesn't really matter, the user will get a cookie back, but the browser will ignore the cookies anyways. The server, imho, shouldn't be concerned with how a client is going to use the login cookie, that's up to the client to decide (or not decide). I think it's generally understood that if you omit withCredentials (or set it to false) then you are not going to get a cookie back. If you don't get a cookie back (at least not one you can read) then you can't make a follow-up request with that cookie.

I tried to do a login and you get WrongToken because the token doesn't match the token from the session (cookie). So this is already disabled, there's no need to have special logic based on CORS.

Exactly, you get "wrong token" rather than "you can't log in while using anonymous CORS".

Exactly, you get "wrong token" rather than "you can't log in while using anonymous CORS".

I mean we could just update the error message to indicate that a session/cookie wasn't passed. That seems more explicit and better for non-browser clients who might not know they need to pass the cookie.

You also get the error when you really do use a wrong token though.

You also get the error when you really do use a wrong token though.

That's fine, you should. But when there is no Cookie header in the request, we should probably throw a different error message. It's not that the token doesn't match, it is that there is nothing to match it against.

Sessions don't necessarily need Cookie headers.

Sessions don't necessarily need Cookie headers.

uhhh... unless you're creating one, they would need "credentials" i.e. Cookie or Authorization header (ok, you could probably pass some sort of API key or something in the url, but that's a bad idea for obvious reasons). A cross-origin request will lack credentials. Or a client who forgets to add them to the request.

you could probably pass some sort of API key or something in the url, but that's a bad idea for obvious reasons

It is, but for example the OAuth spec explicitly allows it and the third-party OAuth library we are using accepts it.

See T40417: MediaWiki's anonymous edit token leaves wiki installations (incl. Wikipedia) open to mass anonymous spam we can't block for another class of problems this could cause.

As for splitting cache on Origin, I imagine we'll have to bite that bullet for the REST API so we might want to reconsider it for the action API as well.

dbarratt updated the task description. (Show Details)
dbarratt renamed this task from Action API should default to origin=* on Wikimedia Wikis to Action API should default to origin=*.Jul 25 2019, 1:02 PM
dbarratt renamed this task from Action API should default to origin=* to Should the Action API allow cross-origin requests by default?.Jul 25 2019, 3:42 PM
dbarratt updated the task description. (Show Details)

I think defaulting to origin=* is unwise, because it would easily let a script accidentally make unauthenticated write actions, while the user thinks they are logged in, thus making the user's IP address public.

I think defaulting to origin=* is unwise, because it would easily let a script accidentally make unauthenticated write actions, while the user thinks they are logged in, thus making the user's IP address public.

I'm not sure I understand how this hypothetical could happen with the solution I'm proposing. A script has to intentionally add credentials to a cross-origin request, they aren't automatically added anyways.

WDoranWMF subscribed.

Not sure if this is correct for CPT, but please redirect if we've mistakenly retagged.

@dbarratt Sorry, I don't understand your response.

To elaborate on what I said: currently, if you have a script that is supposed to make cross-wiki edits, and you accidentally forget to set the 'origin' parameter (there's a bug in the code), the script just doesn't work. If I understand your proposal correctly, then in the same situation, the script would instead make the edits as an anonymous users, and thus expose your IP address.

To elaborate on what I said: currently, if you have a script that is supposed to make cross-wiki edits, and you accidentally forget to set the 'origin' parameter (there's a bug in the code), the script just doesn't work. If I understand your proposal correctly, then in the same situation, the script would instead make the edits as an anonymous users, and thus expose your IP address.

I'm not sure that's really a problem?

Assuming you are making a request from https://example.com to https://en.wikipedia.org:

Currently
If your script is leaving off the origin parameter, the browser will make a preflight request (since the Content-Type will be application/json) and that preflight will fail. This means all edits from your script, will fail and your script will not work at all.

If you fix your script and add origin=* to it, then the browser will make the preflight and the preflight will allow the request, then the browser will make the edit as an anonymous user (because Access-Control-Allow-Credentials defaults to false) and will expose your IP address.

After the change
If your script is leaving off the origin parameter, the browser will make the preflight and the preflight will allow the request, then the browser will make the edit as an anonymous user (because Access-Control-Allow-Credentials defaults to false) and will expose your IP address.


It feels like this is an edge-case of protecting scripts that are already not working? In essence, we are fixing them. There isn't another way to fix them other than to whitelist https://example.com, but that also requires a change to the config on https://en.wikipedia.org. Do you have any evidence there are non-working scripts that this change will fix? If so, is that a bad thing?

Change 531966 had a related patch set uploaded (by Dbarratt; owner: Dbarratt):
[mediawiki/core@master] Add a configuration option to make API requests allow cross-origin requests

https://gerrit.wikimedia.org/r/531966

@matmarex oops, it looks like I was wrong, based on T40417 the edit will happen regardless of whether a developer sets the origin parameter. This change will allow the response to that request to be read by the browser (but the edit will happen either way).

@dbarratt Thanks for this. I've moved this to our PM review column for @eprodromou or @CCicalese_WMF to review and determine if this is a direction we wish to go. We have some concerns about the patch but we should complete the discussion here first, I think.

The task doesn't have a priority, what's the level of severity for this?

The task doesn't have a priority, what's the level of severity for this?

Uhh, idk, it doesn't prevent anything from working, I just noticed that it trips up a lot of developers who are new to the API (and even some who have used it a lot but haven't used it cross-origin). This can be solved with better education, but from what I can tell it's completely unnecessary.

Great, thanks, we'll review and figure it out on our end.

Currently
If your script is leaving off the origin parameter, the browser will make a preflight request (since the Content-Type will be application/json) and that preflight will fail. This means all edits from your script, will fail and your script will not work at all.

If you fix your script and add origin=* to it, then the browser will make the preflight and the preflight will allow the request, then the browser will make the edit as an anonymous user (because Access-Control-Allow-Credentials defaults to false) and will expose your IP address.

The browser will be unable to make the edit as an anonymous user with origin=* because it will receive an error when attempting to fetch the CSRF token.

As you noted T40417 describes a way around that, but that's a different bug.

After the change
If your script is leaving off the origin parameter, the browser will make the preflight and the preflight will allow the request, then the browser will make the edit as an anonymous user (because Access-Control-Allow-Credentials defaults to false) and will expose your IP address.

We'd have to implement similar behavior here to make this attempted edit fail, differentiating local anonymous requests from cross-origin requests somehow.

Note that, while the Origin header is required for CORS, the presence of an Origin header does not necessarily mean it's a non-local request. RFC 6454 § 7.3 allows it on any request, while the Fetch standard indicates that it is to be set for any POST.

The browser will be unable to make the edit as an anonymous user with origin=* because it will receive an error when attempting to fetch the CSRF token.

Right, sorry I got my wires crossed, T40417 better describes the problem. This task doesn't change anything regarding this.

Even if you are able to see the CSRF tokens (which the patch now allows), if they are attached to a session (which they by definition are), the edit will still fail since it will be missing the session Cookie. If the client sends the session cookie, the browser will fail the request on the preflight.

if they are attached to a session (which they by definition are)

Perhaps by strict definition, but not in MediaWiki parlance. The eventual fix for T40417 may or may not wind up depending on cookies (and so may or may not be affected by the credential check), but certainly it won't be a session cookie.

Read it as "CSRF token or cross-site anon action prevention token" if you'd like.

Read it as "CSRF token or cross-site anon action prevention token" if you'd like.

I'll update the patch to prevent the Access-Control-Allow-Origin header on the tokens endpoint since I'd rather continue to block it than allow it for a small period and then block it again.

@dbarratt I want this to go through our PMs before we decide on moving it forward or not as they are the arbiters of product, I don't want the patch to merge in between, is it reasonable for me to -2 it to avoid that happening accidentally while discussion is on going or should we mark it as work in process?

@dbarratt I want this to go through our PMs before we decide on moving it forward or not as they are the arbiters of product, I don't want the patch to merge in between, is it reasonable for me to -2 it to avoid that happening accidentally while discussion is on going or should we mark it as work in process?

I think a -2 with a comment on why would be perfect. :)

So, we've taken a look at this on the CPT Product call, and I feel like it's a question we need to talk to Security team about before we move forward. I hate having to think about CORS, but I assume our client developers also hate having to think about it, so I guess we should take on that hassle for them.

I'm going to jump in to see what we need to get help on this, and I'll report back here when I've got more info.

So, the big question this raises to me is whether the API gateway T235270 would make this obsolete. I think we don't need CORS at all if calls are going through a domain that's only serving API results.

So, the big question this raises to me is whether the API gateway T235270 would make this obsolete. I think we don't need CORS at all if calls are going through a domain that's only serving API results.

Uhh.. right, it could set Access-Control-Allow-Origin: * for all non-credentialed requests and that would be perfect. How far away are we from that being a reality?

dbarratt renamed this task from Should the Action API allow cross-origin requests by default? to Allow cross-origin requests by default in the Action API.Aug 22 2020, 3:32 PM