Page MenuHomePhabricator
Paste P2631

Full log from ArchCom RFC meeting 2016-02-17
ActivePublic

Authored by RobLa-WMF on Feb 18 2016, 6:15 AM.
Referenced Files
F3368204: Full log from ArchCom RFC meeting 2016-02-17
Feb 18 2016, 6:15 AM
Subscribers
None
[22:01:10] <gwicke> #startmeeting REST API response format version negotiation: https://phabricator.wikimedia.org/T124365
[22:01:31] <gwicke> hello
[22:01:50] <TimStarling> meetbot broken?
[22:01:54] * legoktm waves
[22:01:56] <gwicke> yeah, not sure it's around
[22:01:58] <Scott_WUaS> greetings All
[22:02:33] <gwicke> anybody knowledgeable about meetbot around?
[22:02:48] * gwicke looks at https://wikitech.wikimedia.org/wiki/Nova_Resource:Tools/Tools/meetbot
[22:03:20] <TimStarling> I'm doing it
[22:03:24] <gwicke> I don't have an account in tool labs
[22:03:28] <TimStarling> hmm, maybe not
[22:04:16] <TimStarling> maybe I don't have an account either, or the restricted bastion is not documented
[22:04:17] <mobrovac> tech problems - so typical for engineers :D
[22:04:23] <gwicke> well, worst case we can just pretend it's working and then grep everything with #info etc from our own logs
[22:04:50] <gwicke> let me give a short intro on the RFC
[22:05:00] <gwicke> #link https://phabricator.wikimedia.org/T124365
[22:05:24] <gwicke> the scope of this RFC is negotiating the response format of REST API end points
[22:05:49] <gwicke> the REST API has a global version (currently v1) that is used to version the API as a whole
[22:06:24] <gwicke> it will be incremented whenever the API layout or the request interface of stable end points is changed in incompatible ways
[22:06:50] <TimStarling> I have just been reading http://www.troyhunt.com/2014/02/your-api-versioning-is-wrong-which-is.html
[22:06:52] <gwicke> the 'stable' in there refers to the per-entrypoint stability markers
[22:07:04] <gwicke> as described in https://www.mediawiki.org/wiki/API_versioning
[22:07:04] <TimStarling> and http://www.lexicalscope.com/blog/2012/03/12/how-are-rest-apis-versioned/
[22:07:52] <gwicke> now, getting back to the RFC, we still have a need to sometimes change the response returned from a *stable* entry point in incompatible ways
[22:08:49] <gwicke> concretely, the Parsoid team is about to move data-mw (template metadata etc) from inline attributes to a separate JSON blob; this change would break existing clients if we don't provide a means to continue to receive the old response
[22:09:19] <mobrovac> if they don't want to switch to the new layout, that is
[22:09:48] <gwicke> the main questions in this context are: 1) How to request a specific response format, and 2) What to do if no format was specified
[22:10:20] <subbu> or if they cannot be switched to accept the new layout at the exact same moment the new format is dpeloyed.
[22:10:33] <gwicke> for 1), there are two "wrong" proposals, meaning that both have pros & cons
[22:10:52] <gwicke> https://www.mediawiki.org/wiki/Talk:API_versioning has a list of pros / cons for each
[22:11:21] <TimStarling> ah yes, I see it already links to that Troy Hunt article
[22:11:39] <gwicke> *nod*
[22:11:49] <TimStarling> his approach was to just do it all the ways
[22:12:07] <TimStarling> that way you have a URL if you need one for some reason, but normally you can use an Accept header
[22:12:28] <mobrovac> for 1) i'd be for the accept header
[22:12:36] <gwicke> the downside is that you get the complexity of both
[22:12:57] <mobrovac> looks much cleaner and is more API-like
[22:12:58] <gwicke> wrt VCL complexity and documentation
[22:13:30] <gwicke> but, the pro is that you can easily link to a specific version
[22:13:45] <gwicke> *and* link to the latest
[22:13:46] <mobrovac> right, but with query params you might get into the mw api query params caching trap we have right now
[22:14:40] <gwicke> yeah
[22:14:49] <mobrovac> wrt linking, the content render doesn't actually change
[22:14:56] <TimStarling> it's doable though, with a lot of logic in varnish
[22:15:18] <TimStarling> it's apparently more common to have the version in the path though
[22:15:27] <mobrovac> TimStarling: right, but it's less of an effort to do so with clean headers
[22:16:04] * mobrovac "it's doable" should be every engineer's moto :P
[22:16:06] <gwicke> global api versions are very common in the path, per-entrypoint versions a bit less so
[22:16:14] <DanielK_urlaub> what do the URIs identify? An API entry point, or the data returned by that entry point (or eve nthe thing described by the data returned by the entry point)?
[22:16:34] <DanielK_urlaub> in wikidata, we use separate URIs fpr the Thing and the data describing the Thing.
[22:16:46] <DanielK_urlaub> the first can't be versioned, the second one can
[22:17:08] <gwicke> it's typically slightly different representations of the same logical resource
[22:17:16] <subbu> DanielK_urlaub, example urls for 'the thing' and 'the data describing the thing'?
[22:17:52] <gwicke> so very much in the spirit of HTTP content negotiation
[22:18:09] <TimStarling> parsoid already has /v2
[22:18:15] <DanielK_urlaub> subbu: https://wikidata.org/entity/Q42 and https://www.wikidata.org/wiki/Special:EntityData/Q42.rdf
[22:18:29] <subbu> TimStarling, and /v3
[22:18:31] <gwicke> TimStarling: that's the global API version, which is a different issue
[22:18:52] <subbu> because the api uris completely changed.
[22:19:17] <DanielK_urlaub> subbu: ...and https://www.wikidata.org/wiki/Special:EntityData/Q42.rdf?revision=293317199
[22:19:34] <gwicke> DanielK_urlaub: how do you represent changes in your rdf format?
[22:19:51] <DanielK_urlaub> gwicke: my point is: if your api returns data that is itself versioned, you have multiple dimensions of versioning to deal with
[22:19:57] <gwicke> and how do you request a specific RDF format version?
[22:20:10] <DanielK_urlaub> you may have to encode both in the uri without creating confusion
[22:20:19] <TimStarling> ah right, /:domain/v3
[22:20:26] <TimStarling> it's not at the top level
[22:20:33] <subbu> DanielK_urlaub, so, clients making requests to Q42 have to specify the rdf version in the request?
[22:20:34] <SMalyshev> gwicke: there's https://www.wikidata.org/wiki/Special:EntityData/Q42.rdf?flavor=dump for example
[22:20:36] <DanielK_urlaub> gwicke: specific rdf version https://www.wikidata.org/wiki/Special:EntityData/Q42.rdf?revision=293317199
[22:21:03] <gwicke> DanielK_urlaub: revision as in MW revision is a different resource already
[22:21:17] <DanielK_urlaub> of course
[22:21:37] <TimStarling> (Article titles should be capitalized, which will prevent an article named 'v3' from being an additional source of ambiguity.)
[22:21:42] <subbu> i.e. a request to https://wikidata.org/entity/Q42 might return a new data object after the rdf is updated.
[22:21:45] <DanielK_urlaub> but if i use ?version?1234 in the uri, it's not obvious if that refers to the version of the resource, or the version of the api
[22:22:31] <gwicke> the core question is whether we should use Accept headers, a query parameter, or both
[22:22:37] <DanielK_urlaub> subbu: indeed it will. and if you resolve it, it actually redirects to the document URI (also without a fixed version, though)
[22:22:51] <Scott_WUaS> *nod*
[22:22:51] <mobrovac> so, to clarify, we are discussing here returning *the same revision* of the content, only slightly differently represented
[22:23:26] <gwicke> yes, the revision question is orthogonal / OT
[22:23:36] <DanielK_urlaub> gwicke: i guess my point is: if it's a query parameter, make the name very clear. apiversion=123, not just version=123
[22:23:54] <gwicke> DanielK_urlaub: makes sense
[22:24:15] <gwicke> the proposal presents a strawman of ?accept=text/html;profile=mediawiki.org/specs/html/1.0.0
[22:24:18] <DanielK_urlaub> yea, the revision thing is only relevant in as much we should avoid clashes and confusion when naming parameters
[22:24:46] <gwicke> #link https://www.mediawiki.org/wiki/Talk:API_versioning
[22:24:58] <DanielK_urlaub> gwicke: does the profile only govern the output representation, or also the interpretation of the input?
[22:25:13] <gwicke> it's the mime type of the response
[22:25:21] <gwicke> so only affects the response
[22:25:41] <DanielK_urlaub> so how would you version breaking changes to the parameters an api accepts?
[22:25:48] <gwicke> backwards-incompatible changes to the request part would force an increment of the major API version
[22:26:10] <gwicke> (or the creation of a new entry point, to avoid breakage)
[22:26:21] <TimStarling> so gwicke, your preferred solution is the Accept header?
[22:26:23] <DanielK_urlaub> and how does the client specify the major api version in a request?
[22:26:56] <mobrovac> DanielK_urlaub: it's part of the URI - /api/rest_v1/ ...
[22:27:05] <gwicke> TimStarling: if I have to pick one, I think I am leaning slightly towards the Accept header
[22:27:17] <mobrovac> +1
[22:27:41] <DanielK_urlaub> mobrovac: i thought that was only one of the alternatives, and not the one that gwicke is prefering
[22:27:42] <subbu> if I had to pick as a user of an API, uri query params for versioning would be simpler from a usability PoV, but understand it complicates aspects of the implementation .. and accept header feels cleaner.
[22:27:44] <mobrovac> fwiw, i think allowing both could create too much confusion
[22:28:14] <subbu> uri query params for per-endpoint format versioning, to be clear.
[22:28:21] <TimStarling> the Accept header would be forwarded to parsoid?
[22:28:21] <gwicke> one thing that's strong about Accept is the symmetric use of the content type
[22:28:31] <DanielK_urlaub> gwicke: i'm getting the idea that the accept header is good for specifying the expected output format, but versioning of the api input should go into the url.
[22:28:35] <mobrovac> DanielK_urlaub: the overall api version is contained in the url itself, but here the focus is on small mime-type-like changes in the returned content
[22:29:04] <mobrovac> subbu: why would it be simpler for you?
[22:29:18] <gwicke> TimStarling: potentially; there are optimizations we can apply if an older version of the same content is in storage
[22:29:19] <mobrovac> it's an API - so you have to build an application that talks to it
[22:29:35] <TimStarling> yeah but didn't you previously say that version conversion is not RESTBase's responsibility?
[22:29:51] <gwicke> TimStarling: we have been discussing a html2html end point, which would convert one version of html/data-parsoid/etc to another
[22:30:13] <gwicke> yes, the conversion itself is not, but making the request to parsoid to ask for the conversion might be
[22:30:13] <subbu> mobrovac, testing via browser params vs headers.
[22:30:13] <subbu> grr.
[22:30:14] <subbu> in the browser via params
[22:30:15] <subbu> urls are easy to construct
[22:30:20] <gwicke> that's all behind the scenes, though
[22:30:33] <TimStarling> parsoid could provide a format upgrade/downgrade API
[22:30:45] <DanielK_urlaub> oh right, sorry - i was under the impression that gwicke was proposing to also move the major version marker into the accept header.
[22:30:50] <gwicke> #link https://phabricator.wikimedia.org/T114413
[22:31:09] <subbu> TimStarling, yes, we need to provide that.
[22:31:10] <mobrovac> subbu: that's what curl is here for
[22:31:12] <mobrovac> :P
[22:31:19] <subbu> mobrovac, i was afraid you were going to say that. :)
[22:31:19] <gwicke> TimStarling: ^^ is a task discussing a html2html end point
[22:31:43] <gwicke> subbu: I agree, that's a strong point for URLs
[22:32:00] <TimStarling> we could have a testing gateway which does nothing but maps query strings to request headers
[22:32:00] <DanielK_urlaub> so why not allow both?
[22:32:06] <TimStarling> if that is really needed
[22:32:14] <gwicke> a mitigating factor is that the swagger sandbox shows accept headers by default: https://en.wikipedia.org/api/rest_v1/?doc#!/Page_content/get_page_html_title
[22:32:19] <subbu> TimStarling, fair enough
[22:32:44] <mobrovac> subbu: with our current content types which contain slashes and whatnot, i'd like to see you uri-enconding that by hand :P
[22:33:03] <subbu> DanielK_urlaub, you mean by having the query string one to do an internal redirect to the accept header version?
[22:33:12] <TimStarling> it's pretty normal for REST APIs to be broken when you can't set custom request headers
[22:33:17] <gwicke> TimStarling: yeah, that could be an option if we can manage to do so without overly complicating the VCL config
[22:33:24] <mobrovac> exactly TimStarling
[22:33:26] <TimStarling> and I think gateways are a normal solution to that
[22:34:11] <gwicke> my proposal would be to start with one option, and consider adding a second alternative in a second step
[22:34:28] <gwicke> pragmatically, we probably don't have the time to do both at once
[22:34:39] <mobrovac> nor does it make sense really
[22:35:46] <gwicke> picking one version doesn't stop us from adding an alternative later
[22:36:11] <subbu> i think we should move to part (2) of the rfc if part (1) feels resolved .. which it seems it is .. in terms of accept-header being the preferred solution.
[22:36:16] <mobrovac> s/version/mechanism/ (to avoid confusion)
[22:36:34] * mobrovac looks forward to (2)
[22:36:54] <gwicke> TimStarling, DanielK_urlaub: does subbu's summary reflect your view?
[22:38:00] <DanielK_urlaub> gwicke: yep
[22:38:02] <TimStarling> yes
[22:38:41] <gwicke> #summary overall slight preference for going with Accept headers, but option for adding query strings later remains
[22:38:45] <gwicke> okay, great
[22:38:48] <gwicke> so, lets move to 2)
[22:39:04] <DanielK_urlaub> subbu: redirect, rewrite, whatever. can be done by the api itself, or the webserver, or a proxy
[22:39:07] <gwicke> What to do if no format was specified
[22:39:28] <subbu> DanielK_urlaub, makes sense.
[22:40:03] <gwicke> I think general consensus is to strongly encourage clients to specify a version in their request, so that they can benefit from stable / predictable responses
[22:40:14] <mobrovac> +1
[22:40:23] <mobrovac> which implies serving the latest version
[22:40:33] <mobrovac> which i'm +1 for
[22:40:43] <subbu> i think option 3. from the RFC is really a strawman.
[22:40:46] <gwicke> one option is to take this even further, to the point of *requiring* clients to specify a version, and reject requests without
[22:40:53] <subbu> i don't think it needs to be considered.
[22:41:02] <DanielK_urlaub> gwicke: one way to do that is to make a version-les request redirect to a versioned request. but that only works if the version is in the url
[22:41:31] <gwicke> DanielK_urlaub: that's effectively the same as returning a specific version
[22:41:43] <mobrovac> gwicke: requiring them to specify it seems a bit like a high-entry bar
[22:41:44] <gwicke> which brings us to 1) or 2)
[22:41:56] <DanielK_urlaub> except that the cleint actually gets told "no, look there!" in http speak.
[22:42:35] <DanielK_urlaub> option (1) means making the new versio nthe default without giving people a chance to test agains the new version, if i understand correctly
[22:43:15] <gwicke> DanielK_urlaub: yeah, that's true -- in a URL-based scheme this would also slightly reduce cache fragmentation
[22:43:50] <gwicke> yes, option (1) very strongly encourages people to set a version explicitly
[22:44:02] <gwicke> if they don't, then the format can change at relatively short notice
[22:44:07] <TimStarling> not as strongly as option 3
[22:44:15] <DanielK_urlaub> ...by breaking their stuff a couple of times...
[22:44:28] <gwicke> yeah, lets call it "intermediate encouragement"
[22:44:28] <TimStarling> for the particular change under discussion, I guess option 1 is OK
[22:44:39] <TimStarling> since a lot of clients are not going to care about data-mw
[22:44:49] <subbu> right .. option (1) only makes sense when enforced via (3).
[22:45:07] <DanielK_urlaub> in theory, (2) the the best option. but is it worth the wait, given that people genenrally don't do anything before their stuff breaks, no matter how much you announce?
[22:45:26] * DanielK_urlaub is one of the people who only notices this tsuff when something breaks
[22:45:26] <subbu> TimStarling, right .. so the qn. is whether this rfc is deciding a general policy for all future API changes or for this current upcoming one.
[22:45:29] <gwicke> legoktm brought up the issue of no delay being long enough for everybody
[22:45:39] <Scott_WUaS> (appreciating the focus of this and recent #wikimedia-office meetings, and the developing phabricator focus too)
[22:45:40] <TimStarling> in other cases I have argued for a fatal error because terrible things happen when you serve the new API to an old client
[22:46:06] <TimStarling> terrible as in subtly terrible, which is worse than obviously broken
[22:46:52] <subbu> i agree .. based on this discussion it seems it is either (2) or (1)+(3).
[22:46:54] <gwicke> option 2) might encourage clients to not set the explicit version, and rely on announcements instead
[22:48:02] <TimStarling> 1-2 months is an extremely short deprecation period
[22:48:19] <gwicke> I'm personally a bit more optimistic than subbu about clients having enough pressure to be explicit with 1)
[22:48:48] <DanielK_urlaub> gwicke: you could put an annoying warning in every response to an unversioned request. "use the versioned url! we are otherwise not responsible fo rbroken software and curly toenails!"
[22:49:01] <gwicke> DanielK_urlaub: hehe, yes
[22:49:08] <Scott_WUaS> :)
[22:49:08] <gwicke> and.. delay the response by 10s
[22:49:15] <subbu> gwicke, i guess the question is what I asked TimStarling earlier .. is this about this particular format change only? or is this a policy discussion for all future api changes?
[22:49:42] <TimStarling> for mobile apps the developer does not fully control the update schedule
[22:49:44] <DanielK_urlaub> gwicke: limit batch size to 5 ;)
[22:49:49] <gwicke> subbu: we would like to establish a consistent policy for this
[22:49:58] <mobrovac> i'd like this to be a general discussion, and not focused on data-mw so much
[22:50:04] <TimStarling> I don't know what the adoption curve looks like when a new version is pushed out
[22:50:14] <gwicke> so that we can document expectations, and also be a bit more aggressive about declaring specific entry points stable
[22:50:45] <subbu> alternatively, we say that decision made here applies to future changes and if necessary, a new rfc gets filed when the policy needs to be amended based on experience .. or maybe that is stating the obvious.
[22:50:46] <TimStarling> but I know android offers lots of options to limit data usage due to updates, like disabling updates for specific apps or just not scheduling updates at all
[22:51:22] <gwicke> TimStarling: long-term stability is only possible with an explicit version in the request
[22:51:51] <gwicke> with that, we also have data about usage of specific versions, which we can use to make decisions about dropping support
[22:52:03] <subbu> that seems reasonable.
[22:52:47] <gwicke> so overall I am hearing a concern about clients not being explicit unless forced
[22:53:40] <gwicke> I guess one option would be to go with strong encouragement first, and then see what the uptake is like
[22:54:15] <TimStarling> ok
[22:54:24] <mobrovac> gwicke: but, as DanielK_urlaub pointed out, most people seem to feel concerned only once their stuff break
[22:54:41] <mobrovac> so in that regard opts (1) and (2) might not be that different after all
[22:54:53] <subbu> mobrovac, what?
[22:55:09] <gwicke> yeah, my feeling is also that 2) doesn't add much over 1)
[22:55:22] <gwicke> the main question seems to be 1) or 3)
[22:55:49] <subbu> i think i had a preference for option (2) but as gwicke said i am being conservative / not as optimistic as him about client adoption rate in general .. but for the immediate change at hand, it is a non-issue.
[22:56:03] <mobrovac> subbu: (14:45:26) ***DanielK_urlaub is one of the people who only notices this tsuff when something breaks
[22:56:23] <subbu> i didn't understand why (1) and (2) are not different at all.
[22:56:38] <gwicke> such breakage can produce learnable momentsâ„¢
[22:57:06] <Scott_WUaS> mobrovac: what else do you have in mind in terms of a general discussion?
[22:57:09] <mobrovac> subbu: because if you announce it and wait for 2 months, and only once you change it people start reacting then you've effectively wasted 2 months and could have bumped the version in the first place
[22:57:12] <subbu> there is a delta between when the default format changes ... ah, but i think you guys are saying no one pays attention till stuff breaks.
[22:57:15] <gwicke> subbu: in terms of how strongly clients are encouraged to be explicit, 2) seems to be slightly weaker than 1)
[22:57:39] <DanielK_urlaub> subbu: that was the point, yea :)
[22:57:44] <TimStarling> we're just about out of time
[22:57:45] <gwicke> while the concern with 2) is that clients not paying attention won't be helped much by a two-month delay
[22:57:54] <TimStarling> is there an interim solution that everyone is happy with?
[22:58:06] <mobrovac> Scott_WUaS: nah, i just didn't want the discussion to be confined to the data-mw change at hand
[22:58:08] <subbu> got it .. i can go with (1).
[22:58:22] <mobrovac> :)
[22:58:23] <Scott_WUaS> sounds good
[22:58:33] <mobrovac> subbu: see, we made you optimistic suddenly :D
[22:58:45] <subbu> DanielK_urlaub's argument is a good one.
[22:58:59] <gwicke> yeah
[22:58:59] <subbu> no, you clarified that my pessimism was warranted. :)
[22:59:32] <gwicke> how about: we start with option 1), and monitor the percentage of clients who are explicit over the first few changes
[22:59:56] <gwicke> when take-up is too low and people keep complaining, we revisit and consider switching to 3)
[23:00:16] <gwicke> does this sound like a sensible approach?
[23:00:26] <DanielK_urlaub> option 1 sounds fine to me. though it would be nice to give people who *do* pay attention a chance to test. but i guess that would be the people who'd be using a verisoned request in the first place.
[23:00:28] <subbu> wfm
[23:00:32] <DanielK_urlaub> so yea.
[23:00:34] <TimStarling> yes
[23:01:02] <gwicke> mobrovac: you on board as well?
[23:01:13] <mobrovac> yup
[23:01:33] <gwicke> awesome, just in time ;)
[23:01:42] <mobrovac> but DanielK_urlaub's idea is actually good
[23:01:42] <TimStarling> documentation is key
[23:01:56] <mobrovac> that could be achieved by putting it up in labs a week earlier or so
[23:01:56] <TimStarling> make sure the documentation implies that Accept is required
[23:02:03] <gwicke> TimStarling: yes, agreed
[23:03:13] <gwicke> #info broad support for starting with option 1), and monitoring the percentage of clients who are explicit over the first few changes; if take-up is too low and people keep complaining, revisit and consider switching to 3) (enforce version in the request)
[23:03:43] <TimStarling> make sure you copy the IRC log to the phab event, since there is no meetbot
[23:03:52] <gwicke> thank you, this was a very productive and enjoyable discussion!
[23:04:17] <gwicke> TimStarling: yes, will copy a summary & link to a full log
[23:04:22] <gwicke> #endmeeting