
Client Developer knows semantic version of API
Closed, Resolved | Public | 3 Estimated Story Points

Description

"As a Client Developer, I want to know the semantic version of the API, so that I can depend on the endpoints I use."

One of the most difficult parts of developing an API client for a live site is coping with changes in the interface. As we develop the REST API, we should make it clear which endpoints and functionality are supported in production and subject to our version policy.

Ideally, we will have minor releases that more or less map to the epics in phabricator. So:

  • version 1.0: Page History T231338
  • version 1.1: Page History + Minimal Client T229662
  • version 1.2: Page History + Minimal Client + Media T234944
  • version 1.3: Page History + Minimal Client + Media + Extended History T234951

To keep our development in accord with the versioning and namespace RFC T232485, especially w/r/t unstable interfaces, we should:

  1. Add new endpoints to a development namespace, /coredev/v0. coredev here means "core development", and v0 means a semantic version 0, or "no promises, can change at any time".
  2. When we add new features to an existing endpoint (by adding properties to the output, or by accepting different parameters or input), we should have two versions: the version with new features in /coredev/v0, and the version without them in /v1.
  3. When we finish an epic and are ready to release, we remove all the endpoints for that epic from the development namespace, and add them to the production /v1 namespace, and we increment the minor version of the API (1.0 -> 1.1, 1.1 -> 1.2, ...).
  4. We should have an API version information endpoint; see below.
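The release step in item 3 can be sketched as a small transformation of a route table. This is only an illustration under stated assumptions: the `Route` shape, the `promote_epic` function, the epic tags, and the example paths are hypothetical and are not the actual MediaWiki route configuration.

```python
from dataclasses import dataclass

DEV_PREFIX = "/coredev/v0"   # unstable namespace from the proposal
STABLE_PREFIX = "/v1"        # production namespace


@dataclass(frozen=True)
class Route:
    path: str   # hypothetical example paths, not real endpoints
    epic: str   # hypothetical tag linking a route to its epic


def promote_epic(routes, epic, version):
    """Move every development route of `epic` to /v1 and bump the minor version.

    `version` is a (major, minor) tuple; returns (new_routes, new_version).
    """
    promoted = []
    for route in routes:
        if route.epic == epic and route.path.startswith(DEV_PREFIX + "/"):
            # The route leaves the development namespace entirely.
            new_path = STABLE_PREFIX + route.path[len(DEV_PREFIX):]
            promoted.append(Route(new_path, route.epic))
        else:
            promoted.append(route)
    major, minor = version
    return promoted, (major, minor + 1)


routes = [
    Route("/v1/page/{title}/history", "page-history"),
    Route("/coredev/v0/search/page", "minimal-client"),
    Route("/coredev/v0/file/{title}", "media"),
]
new_routes, new_version = promote_epic(routes, "minimal-client", (1, 0))
print([r.path for r in new_routes], new_version)
```

Routes belonging to other epics stay where they are, so an unfinished epic keeps living under /coredev/v0 across the release.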

Event Timeline

I'm open to other options, but I think once something hits production in the /v1 namespace, it should be subject to semantic version rules (no breaking changes unless there is a major version change).

@BPirkle when I talked about this in our last checkin, you questioned the idea. Can you let me know if this makes sense for you now?

I think steps 1 and 3 above are relatively easy, although we'll need to also update integration tests.

I think step 2 is more difficult, since it requires having variants of the same functionality mounted at different endpoints. I can see three ways of dealing with it.

  1. Just have new functionality on existing endpoints go right into /v1, without bothering with the development version. This is probably OK, but client developers also often "follow their nose" on endpoints. "Oh, this undocumented property exists in the output. Since it's in the stable part of the API, it's going to stay there, so I'll just start using it..."
  2. Have multiple *Handler classes, mounted at the different endpoints. Inheritance might help here.
  3. Pass some kind of parameter to the handler class to say, "support this functionality" or "don't support this functionality".
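Options 2 and 3 can be compared in a rough sketch. All class names, the `editor_count` property, and the route mapping are made up for illustration; none of this is the MediaWiki Handler API.

```python
class PageHistoryHandler:
    """Stable variant, as it would be mounted under /v1 (hypothetical class)."""

    def run(self, title):
        return {"title": title, "revisions": [101, 102]}


class PageHistoryDevHandler(PageHistoryHandler):
    """Option 2: a subclass mounted under /coredev/v0 adds the unreleased property."""

    def run(self, title):
        body = super().run(title)
        body["editor_count"] = 2  # hypothetical unreleased property
        return body


class FlaggedPageHistoryHandler:
    """Option 3: one class; the route configuration passes a feature flag."""

    def __init__(self, include_unreleased=False):
        self.include_unreleased = include_unreleased

    def run(self, title):
        body = {"title": title, "revisions": [101, 102]}
        if self.include_unreleased:
            body["editor_count"] = 2  # same hypothetical property as above
        return body


# Option 2 mounts two classes; option 3 would mount one class twice
# with different flag values.
ROUTES = {
    "/v1/page/{title}/history": PageHistoryHandler(),
    "/coredev/v0/page/{title}/history": PageHistoryDevHandler(),
}

print(ROUTES["/v1/page/{title}/history"].run("Earth"))
print(ROUTES["/coredev/v0/page/{title}/history"].run("Earth"))
```

Either way, the stable mount never emits the unreleased property, which is the point of keeping it out of /v1 until the epic ships.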

This isn't going to get easier as our API gets more popular, so it's important to hammer it out now.

Note also: there may be some point in the future where we want to get the full version information as in Special:Version but in a structured form. I don't think we need that for the first pass on the version endpoint.

@BPirkle when I talked about this in our last checkin, you questioned the idea. Can you let me know if this makes sense for you now?

I'm struggling to remember what specifically I was questioning. I recall raising the point that our development/review/deployment workflow is geared toward only merging things that are "done", because once they're in production, they're available for whoever wants to use them, in whatever way they want to. This proposal addresses that point, and I have no fundamental objections.

Just to make sure I understand the proposal, I'll restate some of the key points as they affect the development workflow. Please let me know if I misunderstood anything:

  1. we'll generally release changes to the REST API in batches (minor versions) rather than trickling changes out one-at-a-time as they are implemented.
  2. new endpoints should generally be added under /coredev/v0
  3. we may need to have both "old" and "new" versions of an endpoint: a released version for public consumption and a modified version under /coredev/v0 that is awaiting the next minor version release
  4. the minor version exists in documentation and in the version endpoint, but not in the path for stable endpoints

Note: I used words like "generally" and "may" above, because I wouldn't be surprised if we make exceptions from time to time.

I have four thoughts:

  1. the minor version will be present in the documentation and in the version endpoint but not in the stable paths. I understand why, but I also anticipate confusion and questions. We should document this well.
  2. I share your concerns about how best to maintain two versions of an endpoint. I think we'll have to deal with it on a case-by-case basis. It might not be a bad idea to modify the Handler class so that implementations can query at runtime whether they are being invoked via a stable or unstable path. In a lot of cases, it might be simple to use that information to make one implementation support both versions.
  3. I wonder what a minor version release patch will look like. We could try, in one code change, to modify the version endpoint (or whatever it pulls its version data from) as well as all the affected handlers, to guarantee that there's never an inconsistent point. Alternatively, assuming a train deploy, we could do this in multiple patches, but we'd have to carefully make sure they all rode the train together. Regardless of how we approach this, it seems to require more coordination than an average change.
  4. there are endpoints in progress that should, per the proposal, go under /coredev/v0. If we move forward with this proposal, we should be sure to update those endpoints.
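The idea in thought 2, letting an implementation ask at runtime whether it was invoked via a stable or unstable path, might look roughly like this. The `Handler` base class, the `matched_path` attribute, and the `suggestion` field are assumptions made for the sketch, not the real Handler class.

```python
UNSTABLE_PREFIXES = ("/coredev/v0/",)  # assumption: only this namespace is unstable


class Handler:
    """Hypothetical base class; the router records the matched path on it."""

    def __init__(self, matched_path):
        self.matched_path = matched_path

    def is_unstable(self):
        # str.startswith accepts a tuple of prefixes.
        return self.matched_path.startswith(UNSTABLE_PREFIXES)


class SearchHandler(Handler):
    """One implementation serving both the stable and the development mount."""

    def run(self, query):
        result = {"query": query, "pages": []}
        if self.is_unstable():
            # Only the development mount exposes the not-yet-released field.
            result["suggestion"] = None
        return result


print(SearchHandler("/v1/search/page").run("earth"))
print(SearchHandler("/coredev/v0/search/page").run("earth"))
```

This keeps a single implementation per endpoint, at the cost of stability checks scattered through handler code, so it probably suits small additions better than large behavioral differences.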

From a documentation perspective, it makes sense to version the docs by major version. Major versions ensure backwards compatibility and give developers confidence in the API. We'll need to incorporate this into our specifications for the automated API docs.

Can you clarify the value of the minor version? From the user's perspective, the minor version seems redundant with the MediaWiki version. For example, MediaWiki 1.35 will contain a certain set of endpoints; whether we classify those endpoints as 1.1 or 1.2 doesn’t impact API functionality. Changes to a given endpoint between version 1.1 and 1.2 must be backwards compatible, so a user shouldn't have to be aware of a version change there. I appreciate the neatness of semantic minor versioning as it aligns with the product development epics, but I’d like to make sure the value for users is worth the added complexity.

Can you clarify the value of the minor version?

I think that it would be useful to say, "This endpoint has been available since API version 1.3" or "This property has been available since API version 1.2". This is probably more useful for developers working with non-WMF wikis.
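As a sketch of how a client might use such "available since" notes: assuming the version endpoint reports a "MAJOR.MINOR" string (its actual output format has not been decided here), a check could look like this. The function names are hypothetical.

```python
def parse_version(text):
    """Parse "MAJOR.MINOR" or "MAJOR.MINOR.PATCH" into a tuple of ints."""
    return tuple(int(part) for part in text.split("."))


def is_available(server_version, since):
    """True if a feature introduced in `since` exists on `server_version`.

    Under semantic versioning, a feature added in 1.2 stays available in every
    later 1.x release, but nothing is promised across a major version change.
    """
    server = parse_version(server_version)
    needed = parse_version(since)
    return server[0] == needed[0] and server[1:] >= needed[1:]


# A wiki reporting API version 1.3 supports a property documented as "since 1.2".
print(is_available("1.3", "1.2"))
```

A client for a third-party wiki could run this once at startup and fall back to older endpoints when the check fails.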

We do that with MediaWiki versions in the Action API, but without the semantic version commitment.

  1. the minor version will be present in the documentation and in the version endpoint but not in the stable paths. I understand why, but I also anticipate confusion and questions. We should document this well.

Agreed. Maybe it's not something that needs to be surfaced that often?

  2. I share your concerns about how best to maintain two versions of an endpoint. I think we'll have to deal with it on a case-by-case basis. It might not be a bad idea to modify the Handler class so that implementations can query at runtime whether they are being invoked via a stable or unstable path. In a lot of cases, it might be simple to use that information to make one implementation support both versions.

I think it's a low priority. The main concern is for developers who see an undocumented property and start using it while it's still unstable. That seems like a bad practice on their part!

  3. I wonder what a minor version release patch will look like. We could try, in one code change, to modify the version endpoint (or whatever it pulls its version data from) as well as all the affected handlers, to guarantee that there's never an inconsistent point. Alternatively, assuming a train deploy, we could do this in multiple patches, but we'd have to carefully make sure they all rode the train together. Regardless of how we approach this, it seems to require more coordination than an average change.

I was thinking that all it would mean would be changing the routes of the new endpoints from /coredev/v0 to /v1 in coreroutes.json. I guess the integration tests would need to be updated too.

  4. there are endpoints in progress that should, per the proposal, go under /coredev/v0. If we move forward with this proposal, we should be sure to update those endpoints.

Agreed!

  3. I wonder what a minor version release patch will look like. We could try, in one code change, to modify the version endpoint (or whatever it pulls its version data from) as well as all the affected handlers, to guarantee that there's never an inconsistent point. Alternatively, assuming a train deploy, we could do this in multiple patches, but we'd have to carefully make sure they all rode the train together. Regardless of how we approach this, it seems to require more coordination than an average change.

I was thinking that all it would mean would be changing the routes of the new endpoints from /coredev/v0 to /v1 in coreroutes.json. I guess the integration tests would need to be updated too.

Depends exactly how we handle the "two versions at the same time" situation, which may differ from case to case. Hopefully we'll create some standard practices that cleanly and conveniently deal with this. But I don't yet have a clear picture of how that'll work out.

The core REST API is already a part of MediaWiki core, and MW has its own versioning scheme. Introducing one more version will create a lot of confusion. Installing MW LTS release v1.xx - which core REST API minor version do I get? Where do I look it up? What does the semantic REST API version give me? Can I install a specific version on top of a different MW release? All of these are rhetorical questions for the purpose of the ticket. TLDR: If core REST was implemented as an extension, this could make sense. Since it's a part of the core, it should follow the core versioning scheme.

Documenting features as being added/deprecated in a certain version of MediaWiki seems like a great idea, but I think we probably already do that, right, @apaskulin? I could imagine the docs being entirely versioned by MW version, so you can independently look up docs for MediaWiki 1.xx and 1.yy.

Introducing a version endpoint that gives info similar to Special:Version in a structured format also seems like a good idea, but it probably requires some design or a separate ticket.

Regarding releasing endpoints in the coredev prefix: I have seen one justification for this in the previous discussion - it prevents users from depending on undocumented properties. Correct me if I have missed something.

If the user decides to depend on the coredev api, we will break the client when moving the uri. If the user decides to use undocumented/experimental properties - we will break the client. Both breakages are OK according to the contract of an 'experimental' API. I don't see a lot of benefit in dancing around with the coredev prefix. Having two independent versions running in parallel at the same time seems like a straight way to hell. If we are backwards compatible - why not release in place? If we are not - why not introduce a new version under v2 and run those in parallel?

In general, I understand what you're trying to achieve. You want to have staging feature releases until the feature is ready, and then merge the feature into master as soon as it's complete. This is a valid need, but it's done in a different way normally - you'd have a feature branch of the software, and release it into a staging environment for testing, merging into master once a development cycle for the feature is complete. Unfortunately right now it's impossible (hard, not impossible) in our release environment. However, leaking these concepts into the software doesn't seem like a correct approach to me. It feels like we're working around a flaw in our internal processes at the users' expense.

@Pchelolo thanks for the thoughtful comments.

One of the major features that people ask me about our upcoming API is versioning. Will we use a stable versioning policy? Will it remain backwards compatible?

Using a separate semantic version for the API handles this. It means we can have API versions that don't match up with our MediaWiki versions. We might have multiple API versions within a single MW version cycle, or a single API version that spans multiple MW versions.

As for whether to use the prefix; it's outlined in the RFC on versions and namespaces. We've been planning to do this for unstable interfaces, as detailed in T232485: RFC: Core REST API namespace and version.

I know that carefully making versioned changes today doesn't seem like a big deal. It will become a big deal as this API gets used more often. I'd like to introduce the versioning discipline now so we don't have issues in the future.

I appreciate that there are other ways we can do this. But I don't see any of them as obviously better.

There should be very few actual users of endpoints currently in development for core. I don't think having them use a development namespace is a big impediment.

@Pchelolo, I agree with most or all of your concerns, but I think I'm less opposed than you are to the proposal, as it seems to me one of the less bad ways to deal with the problem. Here are my thoughts in more detail.

In general, I understand what you're trying to achieve. You want to have staging feature releases until the feature is ready, and then merge the feature into master as soon as it's complete. This is a valid need, but it's done in a different way normally - you'd have a feature branch of the software, and release it into a staging environment for testing, merging into master once a development cycle for the feature is complete.

I agree with this - it is a very legitimate need. The API work is a little different than some of our other projects in that it involves the interaction between two teams working on different but related software. API/core developers need a way to provide speculative endpoints to client developers. Client developers need a way to try out speculative endpoints to see if they'll actually meet their needs. We can (and should, and do) work out as much as we can via specs and discussions before implementing anything. But there's no substitute for actually trying out the real technology against real data. I don't see a way around the need for this.

Unfortunately right now it's impossible (hard, not impossible) in our release environment.

Yes, and this still surprises me. I guess the people who run our release environment have a lot to do, and this has just never bubbled up high enough on the priority list for it to happen. When I joined, I was surprised we didn't have a good solution for this.

However, leaking these concepts into the software doesn't seem like a correct approach to me. It feels like we're working around a flaw in our internal processes at the users' expense.

I mostly agree with this. However, I doubt we'll be able to adjust our internal processes quickly enough to meet our short-term needs. Regarding "at the users' expense", there are multiple sets of users with different needs. Right now, the API work is focused on the needs of the iOS team, but I'm hopeful that the API will become useful to other groups, both internal and external, on both our wikis and third-party wikis. That's a lot of users with probably conflicting needs. Some of them will probably want to collaborate with us on speculative endpoints, so it seems important to have a way to satisfy that need. So yes, we're working around a flaw in our internal processes and yes, this is probably not the ideal approach. And yes, some people may find it confusing or may not like it, but others may find it beneficial.

Regarding having the API follow the MediaWiki version scheme - I'm not convinced that is realistic. We'll be changing things in the API much more frequently than MediaWiki does minor releases. For sure faster than major ones.

If the user decides to depend on the coredev api, we will break the client when moving the uri. If the user decides to use undocumented/experimental properties - we will break the client. Both breakages are OK according to the contract of an 'experimental' API. I don't see a lot of benefit in dancing around with the coredev prefix.

Regarding this point - the one benefit to "coredev" that I see is that it makes it pretty hard for a client developer to not be aware that an endpoint is experimental. If we were to just document (in a code comment, or in the actual documentation) that an endpoint was experimental, it'd be easier for client developers to miss. Maybe that's their problem and not ours, but it seems friendly to help them avoid mistakes.

Having two independent versions running in parallel at the same time seems like a straight way to hell.

I agree completely. That is my biggest concern about this proposal.

If we are backwards compatible - why not release in place? If we are not - why not introduce a new version under v2 and run those in parallel?

I'm concerned about version explosion, especially at the beginning when the API is changing quickly. "coredev" sounds less intrusive than frequent major version changes.

If core REST was implemented as an extension, this could make sense

This is the best alternative I've heard. I liked this when it was first suggested, and I still like it. None of our new handlers are in MediaWiki 1.34, so we could still probably do this without unduly affecting anyone. I'm sure it is not difficult for the iOS team to adjust paths, as long as the endpoints themselves behave identically.

Ok, I can live with the coredev thing.

But.

The problem I have with independent versioning for the REST API is that we have no way to release it other than together with MW core. What's the point of having version x.y if it's only a temporary version within MW w.z and it's impossible to install otherwise? This will create a bunch of versions that, once superseded within MW core, are effectively unreachable - third parties cannot install them, we will never be able to get back to them, etc.

We might have multiple API versions within a single MW version cycle,

Within a current MW release cycle. Once an MW version is released, all API versions that lived inside that release are effectively unreachable. I understand what you're trying to achieve. The MW release cycle is too long and you want to move faster, right? The coredev thing can help with that. Introducing yet another versioning policy... Not so much?

My assumption is that we will not run multiple versions of the API simultaneously, right? There will be no universal way to request v1.1 and v1.2 from the same running software. If so, how does having separate versioning for the API help anything? Once the API is deployed and 'publicized' there will be usage considerations regardless of the policy, but purely from a policy standpoint, I don't see how introducing yet another version buys us any additional flexibility.

or a single API version that spans multiple MW versions.

The same API can span multiple software versions; we don't need special versioning for it.

Ok, I can live with the coredev thing.

Thank you!

But.

Understood. Thanks for the flexibility. Here's what I'd like to do: let's take a strict approach to API versioning for 2-3 versions. If we find it's not working for us, let's ease up.

It looks like we've been sticking with this. Thanks, all.