REST: Introduce support for private modules
Open, Low, Public

Description

We need a way to define REST endpoints that are restricted to trusted clients under the control of the wiki maintainer, typically inside the local network. The driving use case is the need for a mechanism to execute jobs through HTTP requests from changeprop-jobrunner; see T175146.

Proposal

Each REST module represents a collection of related endpoints covered by a versioned spec (see T362480). Each module should be able to declare an "audience designation" (see T366567: REST: introduce audience designations (proposal)). We should add a way to define a mechanism for protecting modules against unwanted access based on their audience designation. That would provide a way to flag modules as "private", so they can implement functionality that would be unsafe to offer to the general public.

Multiple protection mechanisms can be made available that can be applied separately or in combination. The following mechanisms should probably be supported from the start:

  • user-group: This will deny access to any user not in any of the listed groups. This is a natural way of restricting access, but may be impractical or cumbersome, depending on the available authentication mechanism. (TBD: we could also require user rights rather than groups.)
  • network: This will allow access only for requests coming from an address matching one of the network ranges provided. This is a convenient way to allow access to services on the local network, but can easily be misconfigured to allow access to anyone when requests are routed through a proxy.
  • signed: Require requests to be signed according to RFC 9421, based on a shared secret, such as $wgSecretKey. Libraries for request signing are available for a wide range of languages (though mostly based on earlier draft versions of the RFC, which is fairly young). This is useful e.g. in the context of end-to-end testing using a framework like Mocha: a key can easily be shared between the test code and the server under test, and the same configuration will work on test systems and local development setups without having to worry about changing network addresses.
  • allow: Can be set to either true (no protection) or false (the module is disabled). This could be used to allow access during testing, or prevent access on servers that are serving public traffic. This would be false by default, so private modules are not accessible on vanilla installs. It could be set to true in DevelopmentSettings.php.
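
As a rough illustration of the network mechanism, the check could look something like the sketch below. The function name ipInRanges is hypothetical and not part of MediaWiki, and the sketch handles IPv4 only and assumes 64-bit PHP. It also assumes the address passed in is the real client address: behind a proxy, the proxy's own address would match the local range for every request, which is the misconfiguration risk described for this mechanism.

```php
<?php
// Hypothetical helper for illustration only (IPv4, 64-bit PHP assumed).
// Returns true if $ip falls inside any of the CIDR ranges in $ranges.
function ipInRanges( string $ip, array $ranges ): bool {
	$addr = ip2long( $ip );
	if ( $addr === false ) {
		return false; // not a valid IPv4 address
	}
	foreach ( $ranges as $range ) {
		// A bare address without "/bits" is treated as a /32 range.
		[ $base, $bits ] = array_pad( explode( '/', $range, 2 ), 2, '32' );
		$baseAddr = ip2long( $base );
		if ( $baseAddr === false || !ctype_digit( $bits ) || (int)$bits > 32 ) {
			continue; // skip malformed ranges instead of matching everything
		}
		$bits = (int)$bits;
		// Build the netmask; a /0 prefix gets an all-zero mask.
		$mask = $bits === 0 ? 0 : ( ~0 << ( 32 - $bits ) ) & 0xFFFFFFFF;
		if ( ( $addr & $mask ) === ( $baseAddr & $mask ) ) {
			return true;
		}
	}
	return false;
}
```

Malformed ranges are skipped rather than treated as wildcards, so a typo in the configuration fails closed.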

Example:

// Enable HMAC authorization for private modules
$wgRestAPIProtection['private'] = [
	'network' => [ '192.168.0.0/24' ],
	'signed' => [
		'hmac-keys' => [ 'default' => $wgSecretKey ],
	],
];
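
For the signed mechanism, a client and server sharing a key might compute and check signatures roughly as follows. This is a simplified sketch of RFC 9421 with the hmac-sha256 algorithm that covers only the @method and @path components. The function names, the sig1 label, and the example path are assumptions made for illustration, not the API of the related patch. A real implementation would also need a proper structured-field parser and a freshness check on the created parameter.

```php
<?php
// Simplified sketch of RFC 9421 message signing with HMAC-SHA256.
// All function names here are hypothetical.

// Build the signature parameters covering @method and @path.
function signatureParams( int $created, string $keyId ): string {
	return '("@method" "@path");created=' . $created
		. ';keyid="' . $keyId . '";alg="hmac-sha256"';
}

// Build the signature base: one line per covered component,
// followed by the @signature-params line (no trailing newline).
function signatureBase( string $method, string $path, string $params ): string {
	return '"@method": ' . strtoupper( $method ) . "\n"
		. '"@path": ' . $path . "\n"
		. '"@signature-params": ' . $params;
}

// Client side: produce the two headers to attach to the request.
function signRequest(
	string $method, string $path, string $key, string $keyId, int $created
): array {
	$params = signatureParams( $created, $keyId );
	$mac = hash_hmac( 'sha256', signatureBase( $method, $path, $params ), $key, true );
	return [
		'Signature-Input' => 'sig1=' . $params,
		'Signature' => 'sig1=:' . base64_encode( $mac ) . ':',
	];
}

// Server side: recompute the MAC and compare in constant time.
function verifyRequest( string $method, string $path, array $headers, string $key ): bool {
	$input = $headers['Signature-Input'] ?? '';
	// Insist that the signature covers exactly the components we check.
	if ( strpos( $input, 'sig1=("@method" "@path");' ) !== 0 ) {
		return false;
	}
	$params = substr( $input, strlen( 'sig1=' ) );
	$mac = hash_hmac( 'sha256', signatureBase( $method, $path, $params ), $key, true );
	return hash_equals( 'sig1=:' . base64_encode( $mac ) . ':', $headers['Signature'] ?? '' );
}

// Usage with a made-up path and key:
$headers = signRequest( 'POST', '/jobqueue/run', 'shared-secret', 'default', time() );
$ok = verifyRequest( 'POST', '/jobqueue/run', $headers, 'shared-secret' );
```

Because the key never travels with the request, the same configuration works regardless of which network address the client connects from.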

Use at WMF

The driver for this proposal is T175146: JobQueue: Unify JobRunner entry points. To address that task, we could define a jobqueue module and designate it as private. We can use the trivial allow protection to disable private modules on hosts that serve page views, and allow them on the jobrunner cluster:

Example:

$wgRestAPIProtection['private'] = [
	'allow' => true,
];

This would be equivalent to the way we currently restrict access to the RunSingleJob.php script (T362480). It would be sufficient for our own use in production. Other environments would have to use different protection mechanisms.

Rationale for using audience designations

All this could be achieved by having a way to flag modules as "private", and then configuring what protection should apply for private modules.

However, it seems useful to generalize the concept of "protection" for "audience designations" to allow different levels of protection for different sets of modules. For example, beyond the very strict "private" designation, we may also have a "bots" designation that would limit access to certain APIs to users in the "bot" group. Or we could define very weak protection based on the User-Agent, to (nominally) restrict access to certain modules to the Wikipedia App.

$wgRestModuleDesignations = [
	'edit.v2' => [ 'bots' ],
	'pci.v2' => [ 'apps' ],
];

$wgRestAPIProtection['bots'] = [
	'user-group' => [ 'bot' ],
];
$wgRestAPIProtection['apps'] = [
	'user-agent' => '/^Wikipedia-App/',
];
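
To make the relationship between the two settings concrete, here is a minimal sketch of how a module's designations might be resolved to protection rules. The function mayAccess is hypothetical, and the sketch makes several assumptions the proposal does not settle: every configured mechanism must pass, a module without a designation is public, and a designation without configured protection denies access.

```php
<?php
// Local copies of the settings above, for a self-contained example.
$designations = [
	'edit.v2' => [ 'bots' ],
	'pci.v2' => [ 'apps' ],
];
$protection = [
	'bots' => [ 'user-group' => [ 'bot' ] ],
	'apps' => [ 'user-agent' => '/^Wikipedia-App/' ],
];

// Hypothetical check: may a caller with these groups and this
// User-Agent access the given module?
function mayAccess(
	string $module, array $userGroups, string $userAgent,
	array $designations, array $protection
): bool {
	// Assumption: a module without a designation is public.
	foreach ( $designations[$module] ?? [] as $designation ) {
		$rules = $protection[$designation] ?? null;
		if ( $rules === null ) {
			return false; // assumption: unconfigured designations fail closed
		}
		// user-group: the caller must be in at least one listed group.
		if ( isset( $rules['user-group'] )
			&& !array_intersect( $userGroups, $rules['user-group'] )
		) {
			return false;
		}
		// user-agent: weak, nominal protection via a regular expression.
		if ( isset( $rules['user-agent'] )
			&& !preg_match( $rules['user-agent'], $userAgent )
		) {
			return false;
		}
	}
	return true;
}
```

Only the user-group and user-agent mechanisms are covered here; the others would slot into the same loop.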

For more information on audience designations, see https://docs.google.com/document/d/1yarF_xQkFzQJUOvP3rMooFTFL6tKgK3C-Bf8zV00QR4/edit

Event Timeline

Change #1033698 had a related patch set uploaded (by Daniel Kinzler; author: Daniel Kinzler):

[mediawiki/core@master] REST: Add support for signed requests

https://gerrit.wikimedia.org/r/1033698

Change #1033227 had a related patch set uploaded (by Daniel Kinzler; author: Daniel Kinzler):

[mediawiki/core@master] REST: add support for private modules

https://gerrit.wikimedia.org/r/1033227

daniel updated the task description.

Read through the google doc, have some thoughts. Most of them are critical. That's not because I think module designations are a bad idea, just to poke at it to see how solid it is. Some of them are questions that you've probably already thought through, so "that's not a problem" is a great response. My comments below are mostly based on the linked google doc (which talks about module designations in a general sense), not the task description (which is more focused on private modules).

First off, this is a new idea to me. Are you familiar with other APIs/API infrastructures that use a similar concept? If so, I'd like to take a look to compare and contrast, and see how this looks for them in practice. If not, we should pound on this pretty thoroughly before implementing it, as this concept would become pretty central to the REST API, and breaking new ground can come with unexpected consequences.

Part of the point of having a REST API is to make access to our data feel "normal" for people unfamiliar with our tech stack. Module designations don't feel "normal" to me. That doesn't mean they're bad, or that we shouldn't do this. As a caller, I'd probably stop and try to look up what they were before being comfortable calling the API. Maybe that's a good thing. And maybe most things that the average caller accesses will be public and this won't even affect them. But it is probably worth making sure we're happy with adding something unusual to our conventions.

The prefixes seem powerful, but also combine multiple ideas, some relevant to callers and some not. For example, callers care that endpoints are performant, but they don't really care what happens on the server side to achieve that. Using a designation that exists mostly to guide caching/routing behavior seems like exposing details the caller shouldn't have to care about. A specific example is "app" vs "internal", although I realize another difference is that versioning is optional in one of those. But if we consider (per the document) use of "app" by something other than our apps to be a TOS violation, and if we distinguish this (again per the document) by user agent, then why wouldn't we just make routing/caching behavior dependent on the user agent? Does the designation gain us anything meaningful?

I see these quotes:

"Internal module names do not need to contain a version number."
"Private module names do not need to contain a version number."

Is this really what we want? I see this says "do not need to" and not "are prohibited from having". In practice, promising that we'll always update callers in "hours" feels optimistic. As all this is just part of the path, I suppose we could always introduce version numbers after the fact. But going from "internal:rcfeed" to "internal:rcfeed.v1" seems a little awkward. I'm not sure having an eternal "v1" that never changes, which is what we've done for both RESTBase and (so far) the MW REST API is any better, but it feels like holding a spot for the version might be preferred. Maybe I'm just used to seeing versions, but even if we don't require these module names to have versions, I'd be pretty careful about omitting them.

I'm unsure whether I prefer the term "beta" or the term "unstable" for that designation. "beta" feels to me like we're implying that the endpoint is on its way to production, even if we're not promising it. That's pretty inconsequential, as the actual meaning is the same either way, but it is something I thought as I read the doc, so mentioning it here.

From the description of the "apps" designation, it sounds like the same endpoint might be exposed under "apps", "internal", and "public"? I guess with the way that designations are assigned in module definitions, this isn't syntactically burdensome. But might exposing the same endpoint under multiple paths lead to undesirable fragmentation of things like metrics? Or even caching, if they are not in practice used to route to separate caches? Maybe this has already been vetted by someone more familiar than me with our caching implementation?

"enterprise": the doc says this currently uses a separate domain. Is anyone unhappy about that? Would exposing an enterprise API from MW mean integrating Enterprise's authentication in MW? What would that involve? Is that better than proxying enterprise endpoints through enterprise's existing domain?

This has the advantage that it'd be pretty hard for callers to not realize they were (for example) calling an "internal" url. However, would it make it harder for them to see what endpoints are available, and how to construct calls to them? As I understand how the module definition files would work: modules could optionally specify one or more designations (and would default to public if nothing is specified). If anything is specified, the designation would be added by the infrastructure to the path. Callers inspecting a module definition file to determine available paths would therefore need to understand how to assemble the full prefix, including designation, in order to successfully call an endpoint. Any resulting confusion might be mitigated by making OpenAPI specs available, which presumably would list the full path including designation, module name, and version. So implementing module designations might raise the priority for us to be able to generate and publish OpenAPI specs, especially via an interactive sandbox. That'd make it less necessary for prospective callers to inspect and understand the endpoint code just to know how to call things.

Use of the colon character as a separator seems technically fine. The URL specification (https://datatracker.ietf.org/doc/html/rfc3986) defines the colon character as the "scheme component delimiter" (section 1.2.3). Colon is listed as a reserved character of type "gen-delim" in section 2.2. Per section 3.3, the first path segment cannot contain a colon character, but this won't be our first path segment, so we're good there. However, use of the colon character might cause minor annoyance in some cases. I can imagine developers colloquially referring to urls as something like "private:jobqueue/run" rather than "/api/private:jobqueue/run". Some editors may try to interpret that first bit as a scheme (ex. "mailto:"), find it invalid, and give unexpected results. This seems like a minor concern, and I don't have a better alternative. But as long as we're still at the discussion phase it seemed worth at least mentioning.

Thank you for the detailed analysis, Bill! I have since updated the doc with some more details. Since the discussion about "audience designations" is beyond the scope of this ticket, I have filed T366567: REST: introduce audience designations (proposal). I will copy your post and respond there.