Page MenuHomePhabricator

Document and implement the REST API format versioning and negotiation policy
Closed, ResolvedPublic

Description

Following the resolution of T124365, we should clearly document our policy for REST API format versioning. It should be prominently mentioned in https://www.mediawiki.org/wiki/API_versioning, and referenced from Swagger documentations. Ideally, each end point's documentation will encourage users to explicitly provide an Accept header.

The implementation in RESTBase will start with the Parsoid content end points, to support the migration towards separate data-mw metadata (see T78676). Some Varnish work is needed to efficiently vary on Accept headers.

More general validation for Accept headers vs. supported response types might follow at a later point.

Related Objects

Event Timeline

Change 273997 had a related patch set uploaded (by Alex Monk):
Send Accept header to restbase

https://gerrit.wikimedia.org/r/273997

We have updated https://www.mediawiki.org/wiki/API_versioning, and the documentation of individual entry points is providing the supported content types, along with a CURL example line for sending the Accept header.

We should further see if we can more prominently link to https://www.mediawiki.org/wiki/API_versioning, and perhaps tweak swagger-ui to automatically provide even stronger encouragement to send the default content-type in Accept headers.

Next steps:

  • Normalize Accept headers in Varnish: Clamp to default value unless profile=https://www.mediawiki.org/wiki/Specs/ is supplied.
  • Send Vary: Accept on entry points that consider the Accept header.
  • Check & convert versions in specific end points. For Parsoid, we can either use Parsoid's upcoming HTML2HTML end point (T114413), or re-render content from scratch.
  • Store a stash of non-default composite responses, so that we can provide data-parsoid and data-mw consistent with the specific render, in the specific content-type.
  • Cache non-default type responses for less time than the stash TTL.

Change 281042 had a related patch set uploaded (by GWicke):
Normalize REST API Accept headers

https://gerrit.wikimedia.org/r/281042

Change 281042 merged by BBlack:
Normalize REST API Accept headers

https://gerrit.wikimedia.org/r/281042

Are there plans to support any kind of non-exact specification?

E.g. specifying ~1.2 or 1.2 to mean 1.2.*?

See P3094. We requested 1.2.0 and it's rejected since it only allows 1.2.1 or 2.0.0. (This particular issue is being addressed, I'm asking about the general scenario).

The general intention behind this versioning scheme is to avoid breaking clients. This means that the priority is clearly on breaking changes, and not on patch versions.

Following semver, I think we should basically ignore patch versions in external Accept headers. Minor versions are debatable, as they are still guaranteed to be backwards-compatible. A client requesting the older minor version is still expected to work when consuming a newer minor version.

This leaves us primarily with major versions, which is also the focus of the upcoming Parsoid format migration (1.2.1 to 2.0.0 in that case).

I think we should encourage clients to specify the exact version they have tested, but take liberties on the server side to return the latest compatible version. This is allowed and arguably in the spirit of RFC 7231:

If an Accept header field is present in a request and none of the available representations for the response have a media type that is listed as acceptable, the origin server MAY either honor the Accept header field by sending a 406 (Not Acceptable) response or disregard the Accept header field by treating the response as if it is not subject to content negotiation.

To summarize, I'm proposing to

  • return an error 406 if the requested major version cannot be satisfied,
  • try to match the minor version, but return another (later) compatible version matching the major if that is not possible, and
  • ignore patch versions in client headers.

It turns out this is what Parsoid is currently doing. They're using caret semantics, so if you request e.g. 1.0.0 (or 1.2.0, or 1.1.*, etc.), it will serve you 1.2.1. However, you can't use 0.0.0, or some future version (1.3.0) it doesn't know about.

The errors before were because the format string (not the version itself, which was an allowed change, but the other parts) changed. See https://gerrit.wikimedia.org/r/#/c/288725/1/includes/Conversion/Utils.php .

I have expanded the documentation at https://www.mediawiki.org/wiki/API_versioning#Content_format_stability_and_-negotiation with details on semver handling in line with the discussion & the Parsoid implementation.

The infrastructure is now all in place, and the policies are documented. We are now working on gradually using the Accept header to vary responses in specific end points. Whenever we do so, we need to keep the following in mind:

  • Set Vary: Accept, so that Varnish stores varying responses. Varnish already normalizes accept headers unless profile=https://www.mediawiki.org/wiki/Specs/ is supplied.
  • Keep an eye on usage of older versions, and make sure that conversions perform adequately.

First step towards enforcing content type versions of Parsoid end points: https://github.com/wikimedia/restbase/pull/803

This does not use the client-supplied accept header yet, but does guarantee that the latest content type as documented by the API spec is returned.

As the first iteration of this, we would like to implement on-demand downgrading of the content without storage or stashing. Then we can verify correctness and latency impact of the downgrading and make an informed decision how to proceed regarding storage part of it.

The semantics will be the following:

  1. Semver patch version differences are ignored.
  2. If the client accept header is greater then what we have in storage, we simply make a no-cache request. If the rerender provided a version matching the requested version -return. If the rerender provided version greater then requested - go to point 3. If the rerender returned version semver minor less then requested - return. If rerender provided version semver major less the requested - 406
  3. If the client accept header is less then stored - try to downgrade and follow https://phabricator.wikimedia.org/T128040#2294397
  4. If the client didn't supply accept header - return whatever we have in storage. This will be revisited later when we make sure the vast majority of the clients specify the header.

If we all agree on the semantics, I can start coding this up as soon as I have some Parsoid endpoint to test agains

Just talked with @ssastry over IRC on the spec described in the previous comment and he agrees this is how it should be done. Implemented in https://github.com/wikimedia/restbase/pull/1052

If the rerender returned version semver minor less then requested - return. why isn't that a 406 as well?

If the rerender returned version semver minor less then requested - return. why isn't that a 406 as well?

Ye that's controversial and it is a bug in the client, but I thought we've agreed on the proposal in https://phabricator.wikimedia.org/T128040#2294397

I'm pretty indifferent whether to return 406 or just return the content, slightly leaning to 406 as a more correct approach

That proposal says,

try to match the minor version, but return another (later) compatible version matching the major if that is not possible, and

Carrot semantics are such that it's ok to return a greater than or equal minor version. Here you're returning a less than minor version, which may be missing the feature that's depended on (ie. the difference is adding a feature vs making a backwards incompatible change).

Oh right! I've misread that comment! Of course, you're right.

Except a small hack with dummy data-parsoid, RESTBase PR is ready for review https://github.com/wikimedia/restbase/pull/1052

Mentioned in SAL (#wikimedia-operations) [2018-09-27T16:25:19Z] <ppchelko@deploy1001> Started deploy [restbase/deploy@0f11d5d]: Canary on 2001 for content negotiations T128040

Mentioned in SAL (#wikimedia-operations) [2018-09-27T16:29:36Z] <ppchelko@deploy1001> Finished deploy [restbase/deploy@0f11d5d]: Canary on 2001 for content negotiations T128040 (duration: 04m 17s)

Mentioned in SAL (#wikimedia-operations) [2018-09-27T16:39:54Z] <ppchelko@deploy1001> Started deploy [restbase/deploy@0f11d5d]: Full deploy for content negotiations T128040

Mentioned in SAL (#wikimedia-operations) [2018-09-27T16:43:01Z] <ppchelko@deploy1001> Finished deploy [restbase/deploy@0f11d5d]: Full deploy for content negotiations T128040 (duration: 03m 06s)

Mentioned in SAL (#wikimedia-operations) [2018-09-27T16:43:48Z] <ppchelko@deploy1001> Started deploy [restbase/deploy@0f11d5d]: Full deploy for content negotiations T128040 take 2, feeds

Mentioned in SAL (#wikimedia-operations) [2018-09-27T16:47:37Z] <ppchelko@deploy1001> Finished deploy [restbase/deploy@0f11d5d]: Full deploy for content negotiations T128040 take 2, feeds (duration: 03m 49s)

Mentioned in SAL (#wikimedia-operations) [2018-09-27T16:47:46Z] <ppchelko@deploy1001> Started deploy [restbase/deploy@0f11d5d]: Full deploy for content negotiations T128040 take 3, feeds

Mentioned in SAL (#wikimedia-operations) [2018-09-27T16:51:45Z] <ppchelko@deploy1001> Finished deploy [restbase/deploy@0f11d5d]: Full deploy for content negotiations T128040 take 3, feeds (duration: 03m 59s)

Mentioned in SAL (#wikimedia-operations) [2018-09-27T16:52:15Z] <ppchelko@deploy1001> Started deploy [restbase/deploy@0f11d5d]: Full deploy for content negotiations T128040 take 4, feeds....

Mentioned in SAL (#wikimedia-operations) [2018-09-27T16:57:01Z] <ppchelko@deploy1001> Finished deploy [restbase/deploy@0f11d5d]: Full deploy for content negotiations T128040 take 4, feeds.... (duration: 04m 46s)

Mentioned in SAL (#wikimedia-operations) [2018-09-27T16:57:12Z] <ppchelko@deploy1001> Started deploy [restbase/deploy@0f11d5d]: Full deploy for content negotiations T128040 take 5, feeds....

Mentioned in SAL (#wikimedia-operations) [2018-09-27T17:03:44Z] <ppchelko@deploy1001> Finished deploy [restbase/deploy@0f11d5d]: Full deploy for content negotiations T128040 take 5, feeds.... (duration: 06m 32s)

Mentioned in SAL (#wikimedia-operations) [2018-10-01T19:23:58Z] <ppchelko@deploy1001> Started deploy [restbase/deploy@7caf4d8]: Content-negotiation filter going live T128040

Mentioned in SAL (#wikimedia-operations) [2018-10-01T19:27:36Z] <ppchelko@deploy1001> Finished deploy [restbase/deploy@7caf4d8]: Content-negotiation filter going live T128040 (duration: 03m 38s)

Pchelolo claimed this task.

I think this is done and can be resolved now.