Page MenuHomePhabricator

Client Developer makes unauthenticated sample API calls
Closed, ResolvedPublic

Description

"As a Client Developer, I want to make some API calls without providing an OAuth 2.0 client ID, so that I can test how the API calls work."

Most API calls should include an OAuth 2.0 Authorization: Bearer <xxxx> header.

Idempotent GET, OPTIONS and HEAD API calls can be made without an OAuth 2.0 Authorization header. (All API endpoints that use GET, OPTIONS and HEAD should be idempotent.)

API calls without an OAuth Authorization header are subject to per-IP rate limiting.

PUT, DELETE, POST, PATCH and any other write methods are not allowed without an OAuth Authorization: Bearer <xxxx> header. If they're received, they should return a 401 Unauthorized HTTP status.

Event Timeline

eprodromou triaged this task as High priority.
eprodromou added a subscriber: apaskulin.

Hugh, this is a configuration issue for Envoy; it needs to bounce non-GET/HEAD requests that don't have an Authorization: header.

It came out of discussions with @apaskulin about workflows in T249776.

Change 613650 had a related patch set uploaded (by Hnowlan; owner: Hnowlan):
[operations/deployment-charts@master] api-gateway: Restrict unauthenticated write HTTP methods, permit read HTTP methods

https://gerrit.wikimedia.org/r/613650

I would argue that this should not be done in envoy.

I our infrastructure authorization is handled by MediaWiki - it's the ultimate source of truth with regards to authentication and access control, and it should be the judge of whether to allow or prohibit the request. Individual services should as well consult MediaWiki regarding access control.

This restriction will put a wall in front of any PUT/POST request for anonymous users of the APIs which goes to the contrary to:
a) Some communities have configured their wikis to be editable by anons. Adding an impenetrable wall in front of their wikis with no control from wiki admins seems like a very wrong thing to do.
b) Some APIs, for example math transform API from RESTBase, or Parsoid transform APIs are POST for technical reasons, and not because they require auth or are modifying the content - these APIs will become inaccessible for anons.

TLDR: MediaWiki is the source of truth for access control in our infrastructure, and it does it in a way more granular fashion then this blank restriction. Splitting the auth responsibility is wrong both from technical and ethical standpoint.

Setting aside how antithetical this requirement seems for a Wikimedia project...

I would argue that this should not be done in envoy.

I our infrastructure authorization is handled by MediaWiki - it's the ultimate source of truth with regards to authentication and access control, and it should be the judge of whether to allow or prohibit the request. Individual services should as well consult MediaWiki regarding access control.

Yeah, this has us distributing the problem of authorization over disparate systems (not awesome).

This restriction will put a wall in front of any PUT/POST request for anonymous users of the APIs which goes to the contrary to:
a) Some communities have configured their wikis to be editable by anons. Adding an impenetrable wall in front of their wikis with no control from wiki admins seems like a very wrong thing to do.
b) Some APIs, for example math transform API from RESTBase, or Parsoid transform APIs are POST for technical reasons, and not because they require auth or are modifying the content - these APIs will become inaccessible for anons.

Unless you go about making it conditional on the resource, at which point you'll see the complexity getting out of hand in a hurry. We've worked really hard to avoid this.

@Joe thoughts?

Yep, makes sense to not have this in Envoy. At base, if we don't implement this logic in Mediawiki itself in the first place it feels like we're doing it wrong to begin with, and even then we'd be doing it twice if we keep it in Envoy. The only case I can imagine is a usecase where Envoy checking or managing these headers saves the appservers from some kind of resource issues and that really doesn't seem like it'd be the case.

Change 613650 abandoned by Hnowlan:
[operations/deployment-charts@master] api-gateway: Restrict unauthenticated write HTTP methods, permit read HTTP methods

Reason:
Abandoning in favour of implementing this in Mediawiki

https://gerrit.wikimedia.org/r/613650

I'm fine with this being implemented in MediaWiki, although I think it's going to be difficult to do in a way that the behaviour when we're hit through the per-wiki domains is different than through the gateway. (We don't require OAuth 2.0 in MediaWiki normally, just through the gateway.) It seems way easier to implement a simple filter at the gateway level.

I'll also need to get some idea of how it will work for non-MediaWiki backends, like NodeJS services.

I'll push this back to "Ready", and reassign to @Pchelolo, since he seems to have an idea of how to do it in MediaWiki.

Also, @Pchelolo, you know that an OAuth 2.0 request can be authorised for a client ID without an associated user, correct?

https://www.oauth.com/oauth2-servers/access-tokens/client-credentials/

Unauthenticated users can still participate through the API server, but the apps they use will need to use a client ID.

@daniel @BPirkle I want to double-check on how we handled the page-creation and page-update endpoints in the MediaWiki REST API. I believe we set them up so you needed either Cookie authorization plus a CSRF token, OR OAuth 2.0 authorization. I believe that means that you could do an anonymous page edit with OAuth 2.0 client credentials...?

@eprodromou that is my recollection, although I don't recall if I ever specifically tested that use case.

I'm going to give it a whirl and see what happens.

I want to double-check on how we handled the page-creation and page-update endpoints in the MediaWiki REST API. I believe we set them up so you needed either Cookie authorization plus a CSRF token, OR OAuth 2.0 authorization.

See also slightly related (?) T261053: REST API unnecessarily asks for CSRF tokens

Ok, I would like to argue that the correct thing to do here is to do nothing.

In my understanding, the intention of the task is "We want to allow requests with no Authorization for testing purposes". The proposed solution is to allow idempotent requests and block POST/PUT. The intent of blocking POST/PUT in my understanding is to make people register the consumers. I argue that it's possible to have enough incentive to register a consumer by other means, without blocking.

My arguments are:

  • POST/PUT can be idempotent, if the http method was chosen only for technical reasons. Example endpoints that already exist: transform wikitext/HTML in Parsoid, render a math formulae in Mathoid, etc. Some of these routes actually exist ONLY for testing, like Parsoid full transform.
  • We already are stripping cookies from requests to the API via API Gateway, so the only method of authentication is via OAuth2. Any client that needs to do any non-anon actions (and most of the POST/PUT routes are indeed non-idempotent and prefer or require non-anon). This should be enough of an incentive for the developers to register a client.
  • We can have more additional incentives for using OAuth. We can write in our docs that requests without registering a consumer are for testing purposes only and support can be discontinued at any time etc. We can lower the rate limit. We can lower the rate limit to POST/PUT endpoints even further.

As for making the decision about blocking in MW, I indeed have misunderstood the ticket initially, that was not a valid argument.

@Pchelolo I appreciate your argument. However, it's important that this system have one simple way of making calls.

POST, PUT and eventually DELETE to the MediaWiki REST API requires using the Action API to get a CSRF token, and that's just a complicated way to get things done when we also support OAuth.

I realise that we can leave that undocumented, but if we leave it working that way, people are going to use it, and then we're going to have to support it.

When we enable APIs that have anonymous POST, PUT, PATCH or DELETE that don't require CSRF tokens, like the Parsoid API, we can revisit this on a per-API basis.

When we enable APIs that have anonymous POST, PUT, PATCH or DELETE that don't require CSRF tokens, like the Parsoid API, we can revisit this on a per-API basis.

We should try *VERY* hard not to have different logic on a per-endpoint basis in the gateway. This will put waaaaay to much logic into the gateway - basically we'd need to complicate the routing logic infinitely to support blocking some routes and not blocking other routes. The unit of routing in the gateway is a (backend service) * (project) * (language), if we make the unit any smaller, the complexity will explode exponentially. Maybe in a faaaaaar faaaaaar future it could be possible to do with introduction of XDS

So, we already know that we will have idempotent POST endpoints, Parsoid and Mathoid are good examples. So once we put them behind gateway (they're in the Next on the roadmap, so hopefully soon), would you be ok with repeating this conversation again and unblocking all POST/PUTs back?

Anyway, I want to wait for the resolution of T261053 as well, implementing this one is 5 minutes of work, so we can afford some bikeshedding.

Change 613650 restored by Ppchelko:
[operations/deployment-charts@master] api-gateway: Restrict unauthenticated write HTTP methods, permit read HTTP methods

https://gerrit.wikimedia.org/r/613650

Change 613650 merged by jenkins-bot:
[operations/deployment-charts@master] api-gateway: Restrict unauthenticated write HTTP methods, permit read HTTP methods

https://gerrit.wikimedia.org/r/613650

Change 622996 had a related patch set uploaded (by Hnowlan; owner: Hnowlan):
[operations/deployment-charts@master] api-gateway: fix config indentation

https://gerrit.wikimedia.org/r/622996

Change 622996 merged by jenkins-bot:
[operations/deployment-charts@master] api-gateway: fix config indentation

https://gerrit.wikimedia.org/r/622996

Ok, I donno what it is to deployy here left, it is all deployed:

curl -i -XPOST -H 'Content-Length: 0' 'https://api.wikimedia.org/core/v1/wikipedia/en/search/page?q=pizza'

HTTP/2 401 
content-length: 47
content-type: application/json
date: Wed, 09 Sep 2020 20:45:47 GMT
server: envoy
age: 0
x-cache: cp4031 miss, cp4032 pass
x-cache-status: pass
server-timing: cache;desc="pass"
strict-transport-security: max-age=106384710; includeSubDomains; preload
set-cookie: WMF-Last-Access=09-Sep-2020;Path=/;HttpOnly;secure;Expires=Sun, 11 Oct 2020 12:00:00 GMT
x-client-ip: 2600:1700:3a60:38c0:9cc7:b402:dd7a:7c02

{"httpCode":401,"httpReason":"Jwt is missing"}

and the GET just works, can't verify now cause I've rate-limited myself for an hour :)

When a JWT is provided, we can see the POST going through to mediawiki as this localised error is not from Envoy.

$ curl -H "Authorization: Bearer $TOKEN" -XPOST -H 'Content-Length: 0' 'https://api.wikimedia.org/core/v1/wikipedia/en/search/page?q=pizza'
{"messageTranslations":{"en":"The request method (POST) was not the allowed method for this path (GET)"},"httpCode":405,"httpReason":"Method Not Allowed"}