
RFC: Core REST API namespace and version
Open, Needs Triage, Public

Description

This proposal is to implement a namespace and version policy for routes in the MediaWiki REST API.

Problem

This RFC is intended to alleviate the following issues:

  • APIs change and develop over time. We'd like a way to indicate that the API has had a change that is backwards-incompatible.
  • Extensions are able to expose new API endpoints. They should not conflict with core endpoints, existing or future, nor should they conflict with other extensions.

Proposed solution

In general, routes exposed as part of the REST API in MediaWiki by T229661 should follow this pattern:

/v<version>/<rest of path>

Here, version is the major part of the semantic version of the API interface (not of MediaWiki itself!). As is typical for semantic versions, minor version numbers should be incremented when backwards-compatible changes are made (usually additional endpoints or fields in result objects), and major version numbers should be incremented when breaking changes are made.

(The minor version is not part of the path, but will be noted in documentation.)

For all the routes exposed by MediaWiki core we'll use the initial API version '1.0'.

rest of path is the RESTful path for the endpoint, like <type>/<id> or <type>/<id>/<attribute>. An example path:

/v1/revision/12345

Endpoints that have been deprecated because a newer endpoint provides the same functionality in a better way will have the Deprecation header set on their responses.
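
As a purely illustrative sketch, a core route following this pattern might be declared roughly like this; the handler class name is invented for the example, and the keys mirror the route definitions quoted later in this task:

{
  "path": "/v1/revision/{id}",
  "class": "MediaWiki\\Rest\\Handler\\RevisionHandler",
  "method": "GET"
}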

For REST API endpoints provided by extensions, the pattern will be:

/<component>/<rest of path>

Here, component is a URL-friendly string (short, lowercase Latin alphabet preferred) identifying the component that provides the API. This will usually be a lower-cased version of an extension name, like 'confirmedit' or 'popups'. It should not match the "v<digits>" pattern.

rest of path is up to the extension to decide. However, a version prefix for the extension's API version, independent of the core API version, is recommended, since it is helpful to client developers. Extensions are developed independently, so they should also be able to change the versions of their API interfaces independently. For example:

/<component>/v<component major version>/<rest of path>
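
For instance, a hypothetical extension called Example might register a versioned route in its extension.json roughly as follows; the extension name, path, and handler class are made up for illustration, and the structure mirrors the RestRoutes snippet quoted later in this discussion:

"RestRoutes": [
  {
    "method": "GET",
    "path": "/example/v1/widget/{id}",
    "class": "MediaWiki\\Extension\\Example\\Rest\\WidgetHandler"
  }
]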

Interface strategy implications

Including a major semantic version number in the path of API calls dampens the rate of breaking changes and rearchitecture in the interface. APIs that use this technique are often append-only for the lifetime of a major version. That is, new endpoints are added and new properties of existing objects are added, but nothing is deleted from the interface.

Consider, for example, an API at version 1.0 with endpoints A, B, and C, each of which returns a JSON object with a number of properties a1, a2, ...

  • Endpoint A: properties a1, a2, a3
  • Endpoint B: b1, b2, b3
  • Endpoint C: c1, c2, c3

If additional information is needed by the client for Endpoint B, a new property can be added, and a new version 1.1 released:

  • A: a1, a2, a3
  • B: b1, b2, b3, b4
  • C: c1, c2, c3

Note that this is backwards compatible. A client application that was developed for API version 1.0 will still run and all the properties and endpoints it expects to find will still be there.

We can also add new endpoints, so that version 1.2 with new functionality at endpoint D looks like:

  • A: a1, a2, a3
  • B: b1, b2, b3, b4
  • C: c1, c2, c3
  • D: d1, d2, d3

Again, we've maintained backwards compatibility with previous 1.x minor versions.

If property a3 is of the wrong type (say, it's a string and should be an array) or the property name is misspelled or unclear (we have user_id instead of actor_id), instead of removing it and breaking backwards compatibility, we can add another property a4 with the right type and name, and deprecate property a3 in the documentation, so the interface version 1.3 is:

  • A: a1, a2, a3 <deprecated>, a4
  • B: b1, b2, b3, b4
  • C: c1, c2, c3
  • D: d1, d2, d3

Programs using the older 1.x interface definitions will continue to run correctly, even if they access a deprecated property.
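
To make the property-deprecation step concrete, take the user_id/actor_id case described above. During the 1.x line, a hypothetical response could carry both properties side by side (all field names and values here are illustrative only):

{
  "id": 12345,
  "user_id": 42,
  "actor_id": 42,
  "timestamp": "2019-09-10T17:45:00Z"
}

Clients written against 1.2 keep reading user_id, newer clients read actor_id, and user_id is only dropped at the next major version.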

It may happen that adding a duplicate property is too heavyweight at run-time (for example, we're renaming a property containing the full wikitext for an article, hundreds of thousands of bytes). In this case, it makes sense to add a new endpoint with the correct properties. If we did this with endpoint C, we'd add a new endpoint E, and deprecate C, making version 1.4:

  • A: a1, a2, a3 <deprecated>, a4
  • B: b1, b2, b3, b4
  • C <deprecated>: c1, c2, c3
  • D: d1, d2, d3
  • E: e1, e2, e3

Again, programs depending on older interface definitions will continue to run correctly, even if they access a deprecated endpoint.

At some point, the collective drag of supporting and maintaining the deprecated endpoints and properties might be too much. Or, there might be a major rearchitecture of the underlying platform that we want to expose, or a security issue that can't be solved without removing a property entirely (say, for example, property d3 leaks user passwords, so adding property d4 and deprecating property d3 wouldn't fix the problem). At this point, we would create a new major version, 2.0, and drop all the deprecated properties and endpoints:

  • A: a1, a2, a4
  • B: b1, b2, b3, b4
  • D: d1, d2, d4
  • E: e1, e2, e3

The API endpoints here would be prefixed with /v2/. Whether or not we continue to support /v1/ endpoints for some bounded period depends on the nature of the change (in the case of a security problem, probably not), the amount of traffic the API gets, and so on. It probably does not make sense to decide that at this time.

Unstable or experimental interfaces

Unstable or experimental interfaces should be implemented in extensions. Once the interface is stable, the same routes can be mounted in the main, stable namespace.

So, an endpoint for playing audio files could be at:

/audio/v0/sound-player?sound=12345

The endpoint could be changed without preserving backwards compatibility because it has a "0" for its major version. So, the same functionality might move to:

/audio/v0/sound/12345/play

Once the interface has stabilized, it can either stay in the extension, maybe with a 1.x semantic version:

/audio/v1/sound/12345/play

Or it could be moved to the core routes namespace, if it's part of core:

/v1/sound/12345/play

From this point, it must retain backwards-compatibility or else cause a major version change in the core API.

Versions and namespaces in other APIs

For comparison, these are URL patterns for some other APIs that client developers may be familiar with.

API | Namespace | Version | Example root
RESTBase | domain name by project | major version in path | https://en.wikipedia.org/api/rest_v1/
Stripe | none | major version in path, additional version by date in custom header | https://api.stripe.com/v1/
Twitter API | domain name (api, upload, ads-api, ...) | major.minor version in path | https://api.twitter.com/1.1/
Facebook | none | optional major.minor version | https://graph.facebook.com/v4.0/
Twilio | none | date-based version | https://api.twilio.com/2010-04-01/
Google | domain name (maps, googleads, ...) | major version in path | https://googleads.googleapis.com/v2/
Apple | domain name (appstoreconnect, ...) | major version in path | https://api.appstoreconnect.apple.com/v1/
Uber | none | major.minor in path | https://api.uber.com/v1.2/
Sendgrid | none | major version in path | https://api.sendgrid.com/v3/
Amazon Web Services (AWS) | domain name (ec2, s3, ...) | varies, usually date as "Version" parameter | https://ec2.amazonaws.com/?Action=...&Version=2016-11-15
Microsoft Graph | none | major.minor in path | https://graph.microsoft.com/v1.0/

Event Timeline


@daniel I meant that client developers will probably appreciate any drag on breaking changes. My guess is that we'll need to do it pretty often to begin with, but that breaking changes will get much more costly as we get more users on this API.

> @daniel I meant that client developers will probably appreciate any drag on breaking changes. My guess is that we'll need to do it pretty often to begin with, but that breaking changes will get much more costly as we get more users on this API.

Higher module granularity means fewer breaking changes per module, right?

Anyway - as Pchelolo mentioned, we can always pull stuff out into a separate module on a major version bump.

> @daniel I meant that client developers will probably appreciate any drag on breaking changes. My guess is that we'll need to do it pretty often to begin with, but that breaking changes will get much more costly as we get more users on this API.

As I understand, we intend to bump versions for non-breaking changes. Having a minor version bump for any new field or new endpoint in the whole core, I could easily imagine API v1.100500 which makes the cache split problem more and more intimidating.

> As I understand, we intend to bump versions for non-breaking changes. Having a minor version bump for any new field or new endpoint in the whole core, I could easily imagine API v1.100500 which makes the cache split problem more and more intimidating.

Yea, I see that problem. We'll need to talk to ops about the best solution for that.

eprodromou added a comment (edited). Sep 10 2019, 5:45 PM

> Changes between minor API versions should be backwards-compatible, so there's no real need for the client to request a specific non-latest version of the API.

I think that's the case for developers who are using Wikimedia sites, where the API version will increase monotonically.

It's less so for people who are making general tools for third-party MediaWiki, where they won't be sure that endpoints added in later versions will be available.

So, for example, if API version 3.1 has endpoints A, B, C, and D, and API version 3.2 adds endpoint E, then a developer who needs endpoint E could put "v3.2" into their requests. They'll either get a 404 or the request will succeed.

More subtle is when a response has different fields between versions. If endpoint E returns {name, rank, serialNumber} in version 3.2 and has {name, rank, serialNumber, lottoNumbers} in version 3.3, then using 'v3.3' in the request means "I expect this result to have the lottoNumbers property".

I think there are a couple of other ways to satisfy this need, though.

  1. The client is just careful with requests. That can be difficult, since a 404 for a request for /foo/v3/bar/12345 might mean "this site has an earlier version of MediaWiki installed, before the /bar/<id> route was implemented" or it might mean "there's no such 'bar' with ID '12345'".
  2. We get into a convention of including introspection methods in the API. So /foo/v3/apiversioninfo would tell me that API version 3.3 is available, so I should be able to call endpoint E and expect the lottoNumbers property to be there (a possible response shape is sketched below).
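
For illustration, such an introspection endpoint (only a suggestion here, not part of the proposal; the field names are invented) might return something like:

{
  "component": "foo",
  "apiVersion": "3.3"
}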

Overall, I am pretty OK with leaving the minor version numbers out of the endpoints. It seems slightly useful, but there's some overhead for caching (which I think we can manage with redirects, though that introduces its own problems...) as well as for implementation.

@eprodromou So, if I am requesting v1.1, that means I want v1.1+, more specifically [1.1, 2.0). Thus, if the current version of the API is v1.3, anything before should probably redirect to v1.3, so that we can utilize a shared cache entry. If we are ever to purge the frontend cache for the REST API (as we do for RESTBase, which allows us to have a really high hit ratio), having one shared entry per resource is especially important. We need to chat with the Traffic team about what would be available to us in ATS.

As for non-WMF installations of MediaWiki: if I'm requesting version v3.3 on an installation that only has v3.2, it would return something like 501 Not Implemented, right? Thus, my client will be entirely broken anyway unless I do some API version introspection beforehand and build a client that can cope with not having the latest and greatest API available. Arguably, having an entirely broken client is better than having a partially broken client (if it only needs v3.3 for a new field on a user, but all the rest of the resources are OK with v3.2, the client will partially work).

I'm not strongly advocating for not including the minor version. I think we need to talk to the traffic team and see if they have any info/preferences with regards to this before making a decision.

> @eprodromou So, if I am requesting v1.1, that means I want v1.1+, more specifically [1.1, 2.0). Thus, if the current version of the API is v1.3, anything before should probably redirect to v1.3, so that we can utilize a shared cache entry.

That's probably a solid solution.

Here's what I think: this minor-version-in-the-URL thing is pretty unusual; it's uncommon to find in the wild, and it barely solves a "problem" that nobody has actually asked about.

I'm going to take it out of the bug description. If someone needs it in the future, they can ask for it. If nobody needs it, we saved a lot of trouble.

eprodromou updated the task description. Sep 10 2019, 7:31 PM
Anomie added a subscriber: Anomie. Sep 10 2019, 9:17 PM

I don't have a strong position here (versus for the Action API, see T41592). But I think this comment might be helpful.

Having version numbers in the path means we're committing to keeping the endpoint working exactly the same long-term, with fixes going to a different-version path. In my experience with the Action API, it's fairly common that a breaking change is because the underlying business logic is changing, and trying to maintain API compatibility would mean significant effort outside of the API layer itself. Looking back at 2019 announcements to mediawiki-api-announce:

  • Reporting of edit failures due to AbuseFilter and SpamBlacklist: The "version" would have to be communicated from ApiEditPage → EditPage → EditFilterMergedContent hook → extension hook functions where the decision is made as to how to report the failure. Doable, but somewhat excessively complex IMO and would likely have to be repeated all over the codebase.
    • Or else the old-version ApiEditPage would need hacks to detect those specific extensions and back-convert their error messages to the old format. That would likely become unmanageable once we start considering extensions not deployed by WMF, and extensions not even hosted in Gerrit.
    • Or we'd need to use/add hooks for extensions to use to back-convert API responses (and cf. T86210).
  • POST without Content-Type: PHP 7 handles it differently from HHVM, we'd have to reimplement parsing post bodies into $_POST to keep it working the same.
  • Improved timestamp support: Our timestamp library changed to stop ignoring timezones, and to reject some invalid formats it used to accept. To avoid breaking this, we'd have to reimplement all those bugs in the API layer.
  • CSRF for action=logout: Security change. We don't want to keep the old version around, that would be the same as not fixing the security issue.

Versioning might have helped for Deprecation of list=allusers 'recenteditcount' result property, but in practice the help would be largely in allowing us to count how many requests still used the ancient version when we finally end-of-lifed it.

I also note that choosing "/core/v3" as the format (as in the current proposal) is probably necessary if we're going to have versions. If we did something like "/page/v3" we'd have the problem of syncing versions between core and all extensions that want to provide something under "/page".

I would support having a major version number which is associated with each route, not a global version number. It should start at v1, not v0. I think it should be incremented extremely rarely: in most of the cases which @Anomie mentions above, it would not be incremented. I don't think we should commit to supporting old versions forever. For example, if you need to add a CSRF token to a vulnerable endpoint, you could add the v2 endpoint and remove the v1 endpoint in the same commit.

In the discussion on T221177, the consensus was that paths should not strictly reflect the module hierarchy. That conflicts with the scheme proposed in this task which requires that the path strictly follows the module of the handler.

I think I prefer /page/{title}/{action}/{version}, with {action}/{version} being extensible and {version} being incremented when the handler changes in a backwards-incompatible way.

Why not /core/page/{title}/{action}/{version}? Because that implies that core/page is the name of the page concept. The /core prefix seems redundant, since most concepts will be core concepts, and confusing, since paths leading to extension handlers would be prefixed with /core.

What if SecurePoll had a concept of a page, distinct from a core page? We could have /securepoll-page/{id}/{action}/{version}. The resource name here is securepoll-page. Using a hyphen instead of a slash makes it clear to the developer that the desired interpretation has securepoll binding tightly to page, rather than securepoll binding to the whole following path. It is like /(securepoll/page)/{id}/{action}/{version} not /securepoll/(page/{id}/{action}/{version})

What if AbuseFilter let you see the AbuseFilter log of SecurePoll's pages? Then you could have /securepoll-page/{id}/abusefilter-log/v1 . It is clear here that the abusefilter-log is being requested of a resource of type securepoll-page.

The use of /page/ instead of /core-page/ can be justified in terms of core being at the root of the hierarchy. You are querying a MediaWiki API, so MediaWiki is implicitly at the root of the tree and owns the root namespace.

I am suggesting {action}/{version} not {version}/{action} to make it clear that the version is associated with the specific endpoint, not with the module which contains the endpoint. Something like /page/Foo/v1/abusefilter-logs might incorrectly imply that it is referring to v1 of abusefilter not v1 of abusefilter-logs. But it is a minor point.

Quick summary from a conversation I had with Tim just now:

Version modifiers could just be added when needed, and we could start without them. However, this implies that our initial interface is stable from the start, and we will maintain backwards compatibility with it. Establishing such an interface would require an RFC for each set of endpoints.

If we want to be able to deploy something internal/experimental without going through an RFC, it needs to have a prefix that clearly marks it as internal or experimental.

All this said, a quick word about per-endpoint versions vs. prefixes: both make sense, and they can be freely combined. To me, a prefix like /page/v1 indicates that all paths with that prefix comply with version 1 of the page model (what a page is, how it is represented, how it is structured, what entities exist, etc). At the same time, an extension may apply its own versioning for an endpoint under /page/v1, so we may end up with something like /page/v1/Foo/abusefilter-logs/v2. Or even /page.v1/Foo/abusefilter-logs.v2

> so we may end up with something like /page/v1/Foo/abusefilter-logs/v2

I'm curious what a new client developer would make of two different version numbers in the path. What is the "v1" actually controlling, versus the "v2"?

FWIW, The Parsoid PHP port already has REST API routes containing versions. For example:

		{
			"path": "/{domain}/v3/page/{format}/{title}",
			"class": "MWParsoid\\Rest\\Handler\\PageHandler",
			"factory": "MWParsoid\\Rest\\Handler\\PageHandler::factory",
			"method": "GET"
		},

> FWIW, The Parsoid PHP port already has REST API routes containing versions.

I think those are outside the scope of this proposal, as they're exactly matching the existing Parsoid-JS routes for compatibility reasons and they're not actually being intended for public use (public use would go through Restbase for now, probably eventually new MW REST routes in the future).

I understand Daniel's desire to avoid putting a collection of half-baked endpoints into production with generic unversioned names. How about we split the whole iPhone app support project into an extension and give it its own prefix? Then it can act as a prototype for similar endpoints to be introduced to core.

eprodromou added a comment (edited). Tue, Sep 17, 3:45 PM

I thought it might help to compare some other APIs and how they handle versioning and namespacing.

API | Namespace | Version | Example root
RESTBase | domain name by project | major version in path | https://en.wikipedia.org/api/rest_v1/
Stripe | none | major version in path, additional version by date in custom header | https://api.stripe.com/v1/
Twitter API | domain name (api, upload, ads-api, ...) | major.minor version in path | https://api.twitter.com/1.1/
Facebook | none | optional major.minor version | https://graph.facebook.com/v4.0/
Twilio | none | date-based version | https://api.twilio.com/2010-04-01/
Google | domain name (maps, googleads, ...) | major version in path | https://googleads.googleapis.com/v2/
Apple | domain name (appstoreconnect, ...) | major version in path | https://api.appstoreconnect.apple.com/v1/
Uber | none | major.minor in path | https://api.uber.com/v1.2/
Sendgrid | none | major version in path | https://api.sendgrid.com/v3/
Amazon Web Services (AWS) | domain name (ec2, s3, ...) | varies, usually date as "Version" parameter | https://ec2.amazonaws.com/?Action=...&Version=2016-11-15
Microsoft Graph | none | major.minor in path | https://graph.microsoft.com/v1.0/

Given these examples, ideally I'd like to do a namespace in the URL, like "core.api.wikipedia.org" or "your-extension.api.wikipedia.org". However, for the API that goes out with MediaWiki, we can't mess with the domain name, so the easier solution is a namespace prefix as close to the domain name as possible, like "/core/" or "/your-extension/".

For the version, it seems like "v<major>" is the main pattern, although I was surprised to see so many major.minor versions. I think that "v<major>" will be fine for us.

I have not seen any APIs that include different versions for different components in the path. I see the value in @tstarling's design, but I agree with @Anomie that this is confusing for the developer.

A common pattern in REST APIs is to define a single "root URL" that all the other paths descend from. "The root URL for the API" is a common definition in client libraries. The complexity of namespaces and versions is tucked into that root URL and doesn't clutter up the client code. Constructing URLs consists of concatenating the root URL definition with a resource path, like "/revision/12345". We should probably stick with that as a design pattern.
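
For instance (purely illustrative values), a client configuration might carry the root URL once and append resource paths to it:

{
  "rootUrl": "https://en.wikipedia.org/w/rest.php/v1",
  "examplePath": "/revision/12345"
}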

For T231338, we'll use "/core/v0" unless there are strenuous objections. I'll prepare an RFC for answering the general question. Hopefully we can balance flexibility on the implementation side with pragmatism for making client developers' lives easier.

I think this task should be an RFC. It proposes a set of public endpoints which will inevitably be used by clients other than iOS. It proposes a path scheme which implies a trivial extension (v0 -> v1) to a permanent, official, public API. It proposes introducing a /core namespace despite contrary views on that point.

tstarling renamed this task from "Namespace and version" to "Core REST API namespace and version". Tue, Sep 17, 11:05 PM

I would request having the language code as part of the API path rather than in the domain name. It is an important semantic value affecting API output. Having it in the domain name, as in https://en.wikipedia.org/api/rest_v1/, has multiple issues that became obvious while writing APIs to host at wikipedia.org/api and xx.wikipedia.org/api at the same time. See https://wikimedia.org/api/rest_v1/ and https://en.wikipedia.org/api/rest_v1/

Maybe out of scope for this discussion: but having the REST APIs in only a single place instead of 292 domains is my wish; it simplifies documentation and the discoverability of our APIs.

Krinkle renamed this task from "Core REST API namespace and version" to "RFC: Core REST API namespace and version". Wed, Sep 18, 8:35 PM
daniel moved this task from Inbox to Under discussion on the TechCom-RFC board. Wed, Sep 18, 8:36 PM

Having the language or domain in the path would be very difficult to do within MediaWiki. Configuration based on language and project has already happened by the time the REST router is called. At Wikimedia sites, this "configuration" includes selection of the particular deployment branch to use. Changing all that would be difficult to say the least.

It could potentially be done on Wikimedia sites with routing magic that turns "https://wikimedia.org/some/prefix/wikipedia/en/resty/path/params" into "https://en.wikipedia.org/w/rest.php/resty/path/params" internally, or MWMultiVersion magic like that used for /w/thumb.php on upload.wikimedia.org. I believe that's outside the scope of this task, as I believe this task concerns just the "/resty/path/params" bits of those URLs.

@santhosh it's out of scope for this discussion. We're also considering a new service for aggregating REST API endpoints for all the projects under a single virtual server, like api.wiki[mp]edia.org. In this case, we'd have language and project in the path, like api.wikimedia.org/guide/en/Lisbon for a travel guide to Lisbon in English.

The current discussion is about the REST API in MediaWiki, which would be exposed for all the projects as well as in third-party MediaWiki sites.

@Krinkle @daniel So, I guess this is an RFC now? What do I need to do next?

@eprodromou I'd recommend updating the task description to have the problem statement and desired outcome stand out, e.g. in their own sections (neither about a specific solution/proposal yet). Optionally, a summary of one or more proposed solutions can also be placed under a heading each, but that's not needed per se at this stage.

eprodromou updated the task description. Thu, Sep 19, 5:59 PM
eprodromou updated the task description.
eprodromou updated the task description. Thu, Sep 19, 6:02 PM
eprodromou added a comment (edited). Thu, Sep 19, 6:04 PM

When I discussed this proposal with Tim on Tuesday, he strongly objected to having the word "core" in the prefix, since none of the other APIs referenced above have the word "core" in them. So, I've removed the "core" from the proposal.

Since we're proposing this for full discussion, I've changed the version from 0 to 1, so we don't have to go through this exercise again. As an added benefit, many of the other APIs have "/v1/" as a prefix, so it should trigger everyone's pattern-recognition.

> When I discussed this proposal with Tim on Tuesday, he strongly objected to having the word "core" in the prefix, since none of the other APIs referenced above have the word "core" in them. So, I've removed the "core" from the proposal.

None of the others have an ecosystem of 3rd party server side extensions, like we do...

eprodromou updated the task description. Thu, Sep 19, 6:21 PM
eprodromou updated the task description.

> @Krinkle @daniel So, I guess this is an RFC now? What do I need to do next?

TechCom has penciled this in for a public IRC meeting on #wikimedia-office next Wednesday, 2pm PDT, 23:00 CEST, 21:00 UTC. @eprodromou would you be available at that time? If not, we can reschedule.

eprodromou updated the task description. Thu, Sep 19, 6:23 PM

@daniel sounds great, I'll be there!

> None of the others have an ecosystem of 3rd party server side extensions, like we do...

Nope. A lot of them have different components that are namespaced in the domain name; Twitter, Facebook, Google all use this. I don't think that's feasible for the MediaWiki REST API, however.

> TechCom has penciled this in for a public IRC meeting on #wikimedia-office next Wednesday, 2pm PDT, 23:00 CEST, 21:00 UTC. @eprodromou would you be available at that time? If not, we can reschedule.

Great! I'm in.

eprodromou updated the task description. Thu, Sep 19, 10:16 PM
eprodromou updated the task description. Thu, Sep 19, 10:21 PM

I added a section with some detailed explanation of how semantic versions work for APIs. For people who haven't built public APIs before, the strategy can seem a little strange ("Why can't I just make whatever changes I want, whenever I want?"), so I went into some detail.

eprodromou updated the task description. Thu, Sep 19, 10:26 PM
eprodromou updated the task description. Thu, Sep 19, 10:29 PM

There are a couple of Open Source and extension-related issues that aren't covered here.

  1. Anybody can hack the source and change their third-party MediaWiki site to have different core API endpoints or behaviour of those endpoints. That's out of our hands, and it's up to the third-party to decide what they want to do about versions.
  2. By convention we're asking extensions to use a namespace, but we don't have a technical way right now to prevent them from adding endpoints in the "main" namespace.
  3. If we add hooks to API endpoints, extensions might modify the behaviour of those endpoints so they aren't backwards-compatible. I'm not sure how to engineer for that: either don't add hooks, or have a social convention to stay compatible, or when we put hooks in, make sure they don't allow breaking changes.

> By convention we're asking extensions to use a namespace, but we don't have a technical way right now to prevent them from adding endpoints in the "main" namespace.

I doubt it's worth the complexity. This seems like the sort of thing that should be handled by code review, not code.

> If we add hooks to API endpoints, extensions might modify the behaviour of those endpoints so they aren't backwards-compatible. I'm not sure how to engineer for that: either don't add hooks, or have a social convention to stay compatible, or when we put hooks in, make sure they don't allow breaking changes.

Note that the hooks may be in the business logic used by the API endpoints, rather than in the endpoints themselves. Although those seem less likely to be able to modify API behavior in incompatible ways.

For that matter, the business logic itself changing could also unexpectedly break an API endpoint.

tstarling added a subscriber: Tgr. Wed, Sep 25, 12:35 PM

I did propose a technical way to encourage extensions to use a prefix at T221177#5142785, but it was shouted down. The opposing view was that extensions can and should register additional actions which relate to resources provided by core. This is what I described above, at T232485#5486509, but it was first proposed by @Tgr at T221177#5145523.

Joe added a subscriber: Joe. Wed, Sep 25, 6:56 PM

I can see arguments in favour of requiring each extension to have its own namespace for APIs, in terms of maintainability and code modularization. It would also make it easier to undeploy an extension and provide easy documentation of the change to our users: "namespace someextension is gone".

On the other hand, I'm mostly a user of the API, so I can try to provide the point of view of someone who is more or less a third-party user.

I'd expect to find all API endpoints exposed under a structure that is consistent.

So for instance, I'd expect to find the basic metadata about a revision at a URL like $prefix/revision/<ID>, its content at $prefix/revision/<ID>/content, and its corresponding ORES score at some URL like $prefix/revision/<ID>/score, or some variation on what I just described.

It would look like a much friendlier interface to interact with rather than having to call $prefix/score/vX/revision/<ID>. I don't want to know that this is an extension to the core api - that's an inner implementation detail of the software to me.

My version of the URL lacks the major version information for the specific API. That could be provided in the request as a header or as part of the request data, but do we need to provide that kind of independence to extension authors?

I would also see the use for experimental endpoints having a separate namespace under /alpha or /beta in production, to allow developers a fast iteration on features that might be completely removed (alpha) or break compatibility (beta) before being added to the stable API.

eprodromou added a comment (edited). Wed, Sep 25, 7:03 PM

@Joe I think that's definitely a downside to having extensions write into their own namespace.

I don't see a way to have it both ways.

> @Joe I think that's definitely a downside to having extensions write into their own namespace.
> I don't see a way to have it both ways.

Why not, actually? Extensions could be free to do either: add something to a prefix owned by core and follow core's versioning, or define their own "root", and control their own versioning. A single extension could even do both, e.g. offer custom actions under its own "root", and add some properties under the revision endpoint.

Joe added a comment. Wed, Sep 25, 8:00 PM

>> @Joe I think that's definitely a downside to having extensions write into their own namespace.
>> I don't see a way to have it both ways.

> Why not, actually? Extensions could be free to do either: add something to a prefix owned by core and follow core's versioning, or define their own "root", and control their own versioning. A single extension could even do both, e.g. offer custom actions under its own "root", and add some properties under the revision endpoint.

Sorry, I might not have been very clear in my comment above. Of course an extension /should/ be able to define its own namespaces, but I think API consistency should be encouraged as much as possible.

jijiki added a subscriber: jijiki. Wed, Sep 25, 8:01 PM

> Since we're proposing this for full discussion, I've changed the version from 0 to 1, so we don't have to go through this exercise again. As an added benefit, many of the other APIs have "/v1/" as a prefix, so it should trigger everyone's pattern-recognition.

In my understanding, this would mean we don't deploy experimental endpoints under v1, and we guarantee the stability of anything deployed under that prefix. I thought that's what we were trying to avoid? In my understanding, this would imply having RFCs for the endpoints we put there.

eprodromou updated the task description. Wed, Sep 25, 10:06 PM
eprodromou updated the task description.
eprodromou added a comment (edited). Wed, Sep 25, 10:09 PM

I've updated the proposal with an explanation of how to deploy unstable or experimental interfaces.

eprodromou updated the task description. Wed, Sep 25, 10:17 PM
eprodromou updated the task description.
eprodromou updated the task description.

@Nikerabbit I also added a note that the minor version only shows up in documentation.

tstarling added a comment (edited). Thu, Sep 26, 5:31 AM

I've talked with @Joe on IRC about his objections to this RFC. I think he would be prepared to accept the RFC with the following changes:

  • Allow a route to be marked "deprecated". This would correspond with the "deprecated" flag in the OpenAPI operationObject spec. When a route is deprecated, responses would carry a deprecation warning. It could follow the example of the warning array in the Action API.
  • Also allow a route to be marked "internal". This is by analogy with the internal flag in the Action API. In the Action API, the internal flag does not add a warning, it only causes a notice to be added to the documentation.
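
For illustration only, such per-route flags might end up looking roughly like this in a route definition; neither "deprecated" nor "internal" is part of the current route registration format or of this proposal, so the keys below are assumptions based on the OpenAPI and Action API analogies above:

{
  "method": "GET",
  "path": "/v1/user/{userName}/hello",
  "class": "HelloHandler",
  "deprecated": true,
  "internal": false
}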

Joe added a comment. Thu, Sep 26, 5:35 AM

As a side note - having the process of graduating resources from unstable status to published in core in the way described above can work, but we will realistically have to keep both working for some amount of time, at least during the transition of internal clients.

I would like the timeframe for such a transition to be clearly defined, at least as an order of magnitude (minutes? months?).

> As a side note - having the process of graduating resources from unstable status to published in core in the way described above can work, but we will realistically have to keep both working for some amount of time, at least during the transition of internal clients.
> I would like the timeframe for such a transition to be clearly defined, at least as an order of magnitude (minutes? months?).

Let's take an example of the iOS history API we're working on. I'd probably suggest that we would keep the unstable version available, after the stable version was established, for long enough that a major percentage of our users upgraded. So, I don't know, maybe something like 6 months to a year?

@tstarling can correct me if I'm wrong, but I think providing the exact same functionality from two different endpoints would be relatively easy to do. Tim, we'd just need to map the same route handler to two different routes, correct? So something like this to support both /greetings/v0/user/{userName}/hello and /v1/user/{userName}/hello:

"RestRoutes": [
  {
    "method": "GET",
    "path": "/greetings/v0/user/{userName}/hello",
    "class": "HelloHandler",
    "parameters": [
      {
        "name": "userName",
        "in": "path",
        "required": true,
        "type": "string"
      }
    ]
  },
  {
    "method": "GET",
    "path": "/v1/user/{userName}/hello",
    "class": "HelloHandler",
    "parameters": [
      {
        "name": "userName",
        "in": "path",
        "required": true,
        "type": "string"
      }
    ]
  }
]

Not ideal, but I think it's a relatively minor maintenance burden...?

> • Allow a route to be marked "deprecated". This would correspond with the "deprecated" flag in the OpenAPI operationObject spec. When a route is deprecated, responses would carry a deprecation warning. It could follow the example of the warning array in the Action API.

Sure. I don't love putting that data into the response, but it would be fine as a header. In REST APIs, the result is the thing, and putting a "warning" property into the output for fetching a revision doesn't make sense; the revision doesn't have a "warning" property. That's in contrast to RPC-style APIs like the Action API.

Is there a reason we want to send deprecation info to the end user? It would make more sense to send it to the developer, right?

When we have API keys working, we could notify developers whose apps are using deprecated API endpoints by email, say, and give them instructions on how to upgrade. (Not on every API call, of course.)

> • Also allow a route to be marked "internal". This is by analogy with the internal flag in the Action API. In the Action API, the internal flag does not add a warning, it only causes a notice to be added to the documentation.

Sure. Again, would we send this info to the end user, rather than the developer?

Anomie added a comment (edited). Thu, Sep 26, 7:49 PM

> Is there a reason we want to send deprecation info to the end user? It would make more sense to send it to the developer, right?

Immediate feedback when the end user is also the developer.

> When we have API keys working, we could notify developers whose apps are using deprecated API endpoints by email, say, and give them instructions on how to upgrade. (Not on every API call, of course.)

It seems somewhat unhelpful if a developer is getting spammed by "version 1.2.3 of your app is using deprecated endpoints!" when they've already fixed that in version 1.2.4 and people just haven't upgraded to the new version of the app yet. That might wind up encouraging developers to make a new API key for every version of their app and aggressively disabling the old versions' keys. Or it might wind up encouraging developers to filter the emails straight to the trash.

It would be even less helpful if someone can steal the API key out of an app for use in their own app. See T221161 for further discussion on that possibility.

For the Action API I created ApiFeatureUsage so developers could proactively look up (by the user agent) what deprecated features their apps are using. The REST API could write to the same log channel that feeds that, if it wants. I don't know whether anyone actually makes much use of it though.

Pchelolo added a comment (edited). Thu, Sep 26, 7:56 PM

> Not ideal, but I think it's a relatively minor maintenance burden...?

It's not only a maintenance burden to return the same content from 2 routes. It doubles the amount of front-end cache purges we need to do for every route. The proper way of doing it would be to 301 Redirect old routes to new routes.

> Sure. I don't love putting that data into the response, but it would be fine as a header. In REST APIs, the result is the thing, and putting a "warning" property into the output for fetching a revision doesn't make sense; the revision doesn't have a "warning" property. That's in contrast to RPC-style APIs like the Action API.

It's not very RESTy to include the warnings property in the response. Instead we could adopt a standard HTTP Warning header. There's also a draft RFC for a Deprecation header.

Including this information in a header vs as a property in the response doesn't make it less visible for the end user, you can ignore a response property as successfully as an http header.

> Instead we could adopt a standard HTTP Warning header. There's also a draft RFC for a Deprecation header.

That seems just like what we're looking for.

I tried looking for a link-rel that would work in a Link: header, but the closest I could come up with is "help".

eprodromou updated the task description. Thu, Sep 26, 8:27 PM

I updated to note that we'll set the Deprecation header.

eprodromou updated the task description. Thu, Sep 26, 8:31 PM
daniel added a comment (edited). Thu, Sep 26, 8:36 PM

> I updated to note that we'll set the Deprecation header.

That sounds good, though it may not be obvious to the casual observer.
Is the idea to follow this? https://tools.ietf.org/html/draft-dalal-deprecation-header-00 Or is there another draft spec for this header?

EDIT: I just came across https://zapier.com/engineering/api-geriatrics/. Seems relevant. Now reading.

Joe added a comment. Wed, Oct 2, 8:27 AM

> [CUT]

> Not ideal, but I think it's a relatively minor maintenance burden...?

Yes, I think it's a good solution.

One could argue the right thing to do is to make a redirect from the old URL to the new one, and ask clients to follow it. There are downsides to both approaches (one is HTTP-correct, the other reduces latency for the end user and grows the amount of data we cache), so I don't have a clear preference. Maybe do what you proposed for the first few months, then move to a redirect? Anyway, this isn't something that needs to be decided in the scope of the RFC - the important part is that we have a way to manage such transitions easily. Thanks!

Is there another meeting for this? Is my presence needed or helpful?

eprodromou updated the task description. Wed, Oct 9, 1:56 PM

Based on feedback, I've added a section on private, internal API endpoints. We carved out a namespace /private/ for endpoints that are internal to MediaWiki. I also added some danger signs, pointing out how private APIs can become public, and then get difficult to change or support. Lastly, I added a note that the private nature of /private/ might be enforced using API keys. (We're just starting to get API keys into the mix, so this is more a shot across the bow than a design plan.)

I added the "Unstable" and "Internal" sections based on requests from previous discussions. My team won't use those; they were added by request from TechCom.

If TechCom decides to drop them from the RFC, I'm 100% OK with that.

Please don't hold up moving forward with this RFC to get my feedback on dropping those sections.

eprodromou updated the task description. Wed, Oct 9, 7:11 PM

Based on internal discussions inside CPT, I've taken out the section on Internal or Private APIs. Our rough consensus is that this is not a good pattern to encourage with this RFC, so we'd like to see it go through without the section.

I've duplicated it in this comment in case there's need to review the text.


Private or internal API endpoints

For MediaWiki core, we'll reserve a private namespace, /private/, for any internal API endpoints used by MediaWiki that aren't intended for public use.

We recognize this is problematic; private, internal interfaces have a tendency to become de-facto public interfaces if they're the only way to get some functionality to work. We encourage any developers who are thinking of putting an endpoint into /private/ to think again, and think once more, before using it. The overhead of designing a public interface that can be supported in a stable, backwards-compatible way is not as high as the overhead of supporting a private endpoint that's become public over time.

Client API developers should question anyone who tries to get them to use a /private/ API.

The /private/ namespace will use semantic versioning. It will keep the major version at zero, reflecting the unstable nature of this namespace. So, a private API endpoint might be:

/private/v0/leftpad?char=0

Private API endpoints may use other mechanisms, such as API keys, to restrict access to authorized clients only.

Per the TechCom meeting on October 9, this RFC is entering the Last Call period. If there are no pertinent concerns raised and left unaddressed by October 23, this RFC will be approved as proposed.