Route external requests to RESTbase via API gateway
Open, In Progress, High, Public

Description

We want all external requests to RESTbase to go through the API gateway, so we can easily re-route them to standalone services later.

The URL structure we expose to the public needs to stay exactly as it is now, per https://en.wikipedia.org/api/rest_v1/

Event Timeline

We (on API platform) need to consider some of the following from the API gateway perspective as regards our historical decisions - the restbase migration is what we built the gateway for, so we can/should/must move on these issues:

  • we only respond to api.wikimedia.org, we will need to respond to every language on every project - this probably means we should respond for all hostnames if an API route is valid
  • A top level /api/ path is a pretty totalising decision especially when we have other conventions established around pathing, but it's not really one we can budge on.
  • scaling and capacity - an ongoing discussion/concern but this work increases the priority
  • Currently we require JWTs for any POST requests on the gateway and this was a decision that was made as part of the envisaged path from restbase deprecation to the API gateway before it was built - are we going to have to budge on this or can we make this part of the pattern of migration?

Those aside, adding this routing within the gateway is currently quite easy in the abstract. We even already have restbase configured as a service, albeit only for /feeds/. It seems to me that adding entries up front for all of the routes we want to re-route later would be a better move than just routing /api/* now and adding more routes later, so a complete list to add when we start this work would be best.

Currently we require JWTs for any POST requests on the gateway and this was a decision that was made as part of the envisaged path from restbase deprecation to the API gateway before it was built - are we going to have to budge on this or can we make this part of the pattern of migration?

For the purpose of RESTbase sunsetting, the move to the API Gateway needs to be completely transparent to clients. So the requirement of JWT would have to be dropped for backwards compat paths (perhaps anything that doesn't come in via api.wikimedia.org?)
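
To make the idea concrete, here is a minimal sketch of that policy, assuming the hostname is used as the discriminator as floated above. The names `JWT_HOSTS` and `requires_jwt` are hypothetical and not taken from the actual gateway configuration, which is Envoy-based rather than Python.

```python
# Hypothetical sketch: which requests would need a JWT under the proposed rule.
# JWT_HOSTS and requires_jwt are illustrative names, not real gateway settings.
JWT_HOSTS = {"api.wikimedia.org"}


def requires_jwt(host: str, method: str) -> bool:
    """Return True if the request would have to carry a JWT.

    Assumption: only POST requests arriving via api.wikimedia.org keep the
    JWT requirement; anything arriving via another hostname (the
    backwards-compatible paths) is exempt.
    """
    return method.upper() == "POST" and host.lower() in JWT_HOSTS
```

Under this rule a POST to a /transform/ style endpoint on en.wikipedia.org would pass without a token, while the same POST on api.wikimedia.org would still need one.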

We may be able to get around this if it turns out that the relevant existing APIs don't need POST requests. But I doubt it.

I guess API gateway was originally envisioned for use by new APIs. I want to use it specifically as an aid for migrating old APIs. Maybe it shouldn't be the same thing? Maybe we want to have a "Legacy API Gateway", that is similar to the "Shiny API Gateway", but not quite the same?

I am fairly certain that much of the initial thinking within the API gateway was done with the assumption that one of the main clients the gateway was guaranteed to support _was_ RESTbase itself, so I think we shouldn't split our work and just need to move to accommodate the needs of restbase.

VirginiaPoundstone renamed this task from Route external requests to RESTbase via API gatewa to Route external requests to RESTbase via API gateway. Nov 1 2022, 8:09 PM

Regarding the historical question, it wouldn't surprise me if there was more than one perspective, even at the time. That happens a lot. :)

My personal recollection from my involvement during the Evan Prodromou era matches Daniel's: the API Gateway was primarily intended for new APIs. We did discuss limited migration of one or more legacy APIs (Feeds came up a lot, mostly because it seemed straightforward) as proof of concept. But I recall Evan being quite focused on new APIs. Notably, the term "RESTbase" does not appear in the initiative document.

However, I would not be at all surprised if others at WMF had different ideas, even back then.

All of that is, in my opinion, irrelevant to any decision we make at this time regarding whether we continue with one API Gateway or split things up. We should make that decision strictly based on the technical merits. If Hugh is more comfortable with a single API Gateway from a technical perspective, then I'm good to continue down that path.

Final thought on history: I'm very happy to see the API Gateway being useful for RESTbase sunset, regardless of how we got here.

I'm curious about anticipated time frame (if we're brave enough to have one yet). We're currently trying to move the AQS 2.0 Unique Devices service toward production deployment (T288298: AQS 2.0: Device Analytics service, T320967: [AQS 2.0] New Service Request device_analytics, T320983: Move uniqueDevices service repo from Gitlab to Gerrit, T320976: Obtain security review of uniqueDevices, etc.). But we still have a number of things to complete before that actually happens, and some of those depend on other teams who likely also have full schedules. So our time frame to be ready for the new Unique Devices service to be deployed and ready to accept production traffic is measured in at least weeks, if not longer. It sounds like it would be helpful and appropriate for the API Gateway work being discussed in this task to be completed before the Unique Devices deployment. Do we anticipate these efforts dovetailing happily?

As far as I understand, requests to AQS are routed through RESTbase at the moment. If we have the gateway in front of RESTbase, we could route AQS endpoints directly, and bypass RESTbase. As far as I understand, this should be possible right away. That would be nice!

Am I getting that right?

PS: If we can use a single API gateway for all our needs, I'm happy with that!

Change 852165 had a related patch set uploaded (by Hnowlan; author: Hnowlan):

[operations/deployment-charts@master] api-gateway: expose restbase /api/ endpoint

https://gerrit.wikimedia.org/r/852165

As far as I understand, requests to AQS are routed through RESTbase at the moment. If we have the gateway in front of RESTbase, we could route AQS endpoints directly, and bypass RESTbase. As far as I understand, this should be possible right away. That would be nice!

Am I getting that right?

PS: If we can use a single API gateway for all our needs, I'm happy with that!

Currently AQS is an independent service but shares a lot of its codebase with RESTbase. The new version is even more independent with a completely distinct codebase, so as far as the gateway is concerned these are two totally parallel streams of work that aren't really interdependent from a technical perspective.

My understanding was that public access to AQS is routed through RESTbase at the moment, per https://phabricator.wikimedia.org/diffusion/GRES/browse/master/v1/metrics.yaml. Did I misunderstand? I was thinking that, with the gateway in place, we could take RESTbase out of the picture.

We may be able to get around this if it turns out that the relevant existing APIs don't need POST requests. But I doubt it.

As far as I can tell from grepping through https://phabricator.wikimedia.org/diffusion/GRES/browse/master/v1/, the only thing that needs POST requests is the /transform/ endpoint.

The biggest external user by far seems to be Google's REST crawler (according to the UA breakdown - note that the log is sampled, so numbers need to be scaled up accordingly). The rate seems to be about 6 req/sec.

The level of external traffic to RESTbase (behind the web cache layer) seems to be on the order of 6k req/sec, see https://grafana.wikimedia.org/d/000000068/restbase and https://grafana.wikimedia.org/d/000000577/restbase-external-overview.

If I understood the problem correctly, the plan is to first convert the services to reply to the same urls we now handle via restbase directly, then convert the internal clients to call them directly and not go via restbase, then to move the public API away from restbase. Is this correct?

If this is the case, I think we have two options that are much simpler than shoehorning everything inside the api gateway:

  1. Just transition pointing from restbase to the backend from the CDN. I'm not sure how well segregated the APIs for the various backends are (I fear, not at all), but if we can just do prefix routing, this seems simple and avoids adding another element in the call chain
  2. We can reuse most of the api gateway work and just create an envoy configuration tailored to routing the legacy restbase api; we can also switch rate-limiting logic to pure per-ip global ratelimiting (like restbase does now)

Given the api gateway and this "other" gateway work with different hostnames and with different url structures, there is really no value, IMHO, in concocting them together in the same configuration.

If the above makes sense, do we have a precise url mapping for restbase url prefix => backend service somewhere? That specification would help us pick the best option based on complexity: specifically, if most mappings are simple, say /foo => pcs, /bar => parsoid, then I'd go with just separating requests at the traffic layer. Otherwise, we can create a "legacy api gateway" by just changing the configuration of the current gateway and deploying it separately.
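
As an illustration of what "simple" mappings would look like, here is a minimal sketch of prefix routing. The table reuses the placeholder examples /foo => pcs and /bar => parsoid from the paragraph above; the real RESTbase prefix => backend mapping is exactly what is being asked for and is not known here. `PREFIX_TO_BACKEND` and `pick_backend` are hypothetical names.

```python
# Hypothetical prefix table built from the placeholder examples in the text.
PREFIX_TO_BACKEND = {
    "/foo": "pcs",
    "/bar": "parsoid",
}


def pick_backend(path: str, default: str = "restbase") -> str:
    """Longest-prefix match on whole path segments; fall back to `default`."""
    best = ""
    for prefix in PREFIX_TO_BACKEND:
        # Only match on a segment boundary, so /foobar does not match /foo.
        on_boundary = path == prefix or path.startswith(prefix + "/")
        if on_boundary and len(prefix) > len(best):
            best = prefix
    return PREFIX_TO_BACKEND.get(best, default)
```

If the real mapping fits a table like this, separating requests at the traffic layer looks plausible; if it needs per-route rewriting or fan-out, that points toward the separately deployed gateway configuration.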

If I understood the problem correctly, the plan is to first convert the services to reply to the same urls we now handle via restbase directly, then convert the internal clients to call them directly and not go via restbase, then to move the public API away from restbase. Is this correct?

A few things to unpack here...

first convert the services to reply to the same urls we now handle via restbase directly

How? We need something to route to them. ATS or Envoy or... something.

then convert the internal clients to call them directly and not go via restbase

Most services are not called internally at all. It's mostly Parsoid, and that will be replaced by direct php method calls inside MediaWiki.

then to move the public API away from restbase

This needs to happen before/when we move them out of RESTbase. We need to keep the public URLs stable, and if the service is no longer in RESTbase, the routing needs to be handled elsewhere.

If this is the case, I think we have two options that are much simpler than shoehorning everything inside the api gateway:

  1. Just transition pointing from restbase to the backend from the CDN. I'm not sure how well segregated the APIs for the various backends are (I fear, not at all), but if we can just do prefix routing, this seems simple and avoids adding another element in the call chain

My understanding was that one reason to have the API gateway in the first place is that it's much easier to change routing in the gateway than it is to change routing in the CDN. So my naive idea was to change the routing in the CDN from pointing to RESTbase to pointing to the API gateway. For a while, that adds another layer to the stack, but only temporarily. This would allow us to move things out of RESTbase simply by changing configuration of the gateway to route to the respective service directly.

  2. We can reuse most of the api gateway work and just create an envoy configuration tailored to routing the legacy restbase api; we can also switch rate-limiting logic to pure per-ip global ratelimiting (like restbase does now)

My understanding was that is exactly what the gateway does...

Given the api gateway and this "other" gateway work with different hostnames and with different url structures, there is really no value, IMHO, in concocting them together in the same configuration.

I personally don't care if we have one gateway config, or a separate gateway config for the routes currently handled by RESTbase. Whatever makes more sense for you and Hugh.

If the above makes sense, do we have a precise url mapping for restbase url prefix => backend service somewhere? That specification would help us pick the best option based on complexity: specifically, if most mappings are simple, say /foo => pcs, /bar => parsoid, then I'd go with just separating requests at the traffic layer. Otherwise, we can create a "legacy api gateway" by just changing the configuration of the current gateway and deploying it separately.

I have been looking for that mapping myself. It must be buried in the RESTbase config somehow, but I have been unable to find it. Perhaps @Jgiannelos can help.

If I understood the problem correctly, the plan is to first convert the services to reply to the same urls we now handle via restbase directly, then convert the internal clients to call them directly and not go via restbase, then to move the public API away from restbase. Is this correct?

A few things to unpack here...

first convert the services to reply to the same urls we now handle via restbase directly

How? We need something to route to them. ATS or Envoy or... something.

I meant that if restbase does any url translation, say from /foo/bar/baz to /bar/baz, the service must first be updated to be able to reply to /foo/bar/baz first.
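
For illustration, the kind of translation meant here can be sketched as a prefix strip. This is a hypothetical helper, not RESTbase or gateway code, and /foo is the placeholder prefix from the sentence above.

```python
def strip_prefix(path: str, prefix: str) -> str:
    """Rewrite /foo/bar/baz to /bar/baz when `prefix` is /foo.

    Paths that are not under the prefix are returned unchanged.
    """
    if path == prefix:
        return "/"
    if path.startswith(prefix + "/"):
        return path[len(prefix):]
    return path
```

Either the backend service learns to answer at the unstripped path itself, or whatever sits in front of it has to apply a rewrite like this on every request.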

then convert the internal clients to call them directly and not go via restbase

Most services are not called internally at all. It's mostly Parsoid, and that will be replaced by direct php method calls inside MediaWiki.

This is very incorrect. Most services *are* called internally, *via* restbase, from other services. Examples: parsoid for sure, but also recommendation-api, mobileapps... wikifeeds calls N different backends mostly via restbase.

then to move the public API away from restbase

This needs to happen before/when we move them out of RESTbase. We need to keep the public URLs stable, and if the service is no longer in RESTbase, the routing needs to be handled elsewhere.

If this is the case, I think we have two options that are much simpler than shoehorning everything inside the api gateway:

  1. Just transition pointing from restbase to the backend from the CDN. I'm not sure how well segregated the APIs for the various backends are (I fear, not at all), but if we can just do prefix routing, this seems simple and avoids adding another element in the call chain

My understanding was that one reason to have the API gateway in the first place is that it's much easier to change routing in the gateway than it is to change routing in the CDN. So my naive idea was to change the routing in the CDN from pointing to RESTbase to pointing to the API gateway. For a while, that adds another layer to the stack, but only temporarily. This would allow us to move things out of RESTbase simply by changing configuration of the gateway to route to the respective service directly.

I would object to this characterization of what is easy and what is not. What is true is that we don't want complex business logic in the CDN layer, hence why we've tended to allow api aggregators outside of it.

  2. We can reuse most of the api gateway work and just create an envoy configuration tailored to routing the legacy restbase api; we can also switch rate-limiting logic to pure per-ip global ratelimiting (like restbase does now)

My understanding was that is exactly what the gateway does...

No, the gateway does rate-limiting based on a series of parameters, including a jwt token offered. The configuration would be much simpler in this case.

Given the api gateway and this "other" gateway work with different hostnames and with different url structures, there is really no value, IMHO, in concocting them together in the same configuration.

I personally don't care if we have one gateway config, or a separate gateway config for the routes currently handled by RESTbase. Whatever makes more sense for you and Hugh.

If the above makes sense, do we have a precise url mapping for restbase url prefix => backend service somewhere? That specification would help us pick the best option based on complexity: specifically, if most mappings are simple, say /foo => pcs, /bar => parsoid, then I'd go with just separating requests at the traffic layer. Otherwise, we can create a "legacy api gateway" by just changing the configuration of the current gateway and deploying it separately.

I have been looking for that mapping myself. It must be buried in the RESTbase config somehow, but I have been unable to find it. Perhaps @Jgiannelos can help.

How? We need something to route to them. ATS or Envoy or... something.

I meant that if restbase does any url translation, say from /foo/bar/baz to /bar/baz, the service must first be updated to be able to reply to /foo/bar/baz first.

My naive assumption was that in the future, the gateway will do that translation.

Most services are not called internally at all. It's mostly Parsoid, and that will be replaced by direct php method calls inside MediaWiki.

This is very incorrect. Most services *are* called internally, *via* restbase, from other services. Examples: parsoid for sure, but also recommendation-api, mobileapps... wikifeeds calls N different backends mostly via restbase.

Ah, sorry - I was thinking of services called by MediaWiki. Between node services, it's also mostly parsoid, but it's not the only thing.

My understanding is that any internal calls should go via the service grid, not via the gateway.

My understanding was that one reason to have the API gateway in the first place is that it's much easier to change routing in the gateway than it is to change routing in the CDN. So my naive idea was to change the routing in the CDN from pointing to RESTbase to pointing to the API gateway. For a while, that adds another layer to the stack, but only temporarily. This would allow us to move things out of RESTbase simply by changing configuration of the gateway to route to the respective service directly.

I would object to this characterization of what is easy and what is not. What is true is that we don't want complex business logic in the CDN layer, hence why we've tended to allow api aggregators outside of it.

I'll leave it to @hnowlan and you to duke that out :) I just need *something* to do that routing that is currently done in RESTbase, in a way that can easily be changed, ideally per site. E.g. we'd want to try out a new route without RESTbase on mediawiki.org first, then try it on a few more wikis a week later, etc.
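
A rough sketch of the per-site switch I have in mind, purely as an illustration. `DIRECT_SITES`, `upstream_for` and the backend name passed in are hypothetical; how this would actually be expressed in the gateway or CDN configuration is an open question.

```python
# Hypothetical rollout list: sites whose traffic for a given route
# bypasses RESTbase and goes straight to the backend service.
DIRECT_SITES = {"www.mediawiki.org"}


def upstream_for(host: str, direct_backend: str) -> str:
    """Send listed sites to the backend; everyone else stays on RESTbase."""
    return direct_backend if host.lower() in DIRECT_SITES else "restbase"
```

Widening the rollout a week later would then just mean adding more hostnames to the set, and rolling back would mean removing them.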

  2. We can reuse most of the api gateway work and just create an envoy configuration tailored to routing the legacy restbase api; we can also switch rate-limiting logic to pure per-ip global ratelimiting (like restbase does now)

My understanding was that is exactly what the gateway does...

No, the gateway does rate-limiting based on a series of parameters, including a jwt token offered. The configuration would be much simpler in this case.

I thought this is just one of several things it is designed to do. If it doesn't do routing/address translation, I need to take two steps back and rethink. @hnowlan can you clarify?

We can reuse most of the api gateway work and just create an envoy configuration tailored to routing the legacy restbase api; we can also switch rate-limiting logic to pure per-ip global ratelimiting (like restbase does now)

Where would this Envoy instance be running? Not opposed to the idea but just curious as to how it'd be configured. Would it just be another instance of the api-gateway chart that we only route to for previously-restbase calls via the CDN? I can see this possibly working rather than duplicating or splitting stuff out of the gateway, and it would potentially be less complicated.

Given the api gateway and this "other" gateway work with different hostnames and with different url structures, there is really no value, IMHO, in concocting them together in the same configuration.

The URL structure itself isn't really the end of the world as regards the API namespace in the gateway, nor is the hostname issue. How requests get to the gateway itself as regards the hostnames in question when we transition the endpoints is a concern though of course.

  2. We can reuse most of the api gateway work and just create an envoy configuration tailored to routing the legacy restbase api; we can also switch rate-limiting logic to pure per-ip global ratelimiting (like restbase does now)

My understanding was that is exactly what the gateway does...

No, the gateway does rate-limiting based on a series of parameters, including a jwt token offered. The configuration would be much simpler in this case.

I thought this is just one of several things it is designed to do. If it doesn't do routing/address translation, I need to take two steps back and rethink. @hnowlan can you clarify?

I think what is meant here is that the gateway can do rate limiting based on a series of things - at base in the gateway when JWTs are not factored in, rate limiting is done on a per-IP level. In the absence of JWTs (which will be the case with restbase in general) we will just get per-IP rate limiting which is the current status quo as far as I understand it?
The gateway also does API routing and address translation, yes.
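
As a simplified illustration of that fallback, the choice of rate-limit bucket could be sketched like this. The function and key format are hypothetical, and keying on the JWT subject when a token is present is an assumption; the gateway's real rate-limit descriptors are not spelled out in this task.

```python
from typing import Optional


def rate_limit_key(client_ip: str, jwt_subject: Optional[str] = None) -> str:
    """Choose the bucket a request is counted against.

    Assumption: a request with a valid JWT is counted per token subject;
    a request without one (the expected case for restbase traffic) is
    counted per client IP.
    """
    if jwt_subject:
        return f"jwt:{jwt_subject}"
    return f"ip:{client_ip}"
```

With no JWTs in play, every request lands in an `ip:` bucket, which matches the per-IP status quo described above.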

Change 852165 merged by jenkins-bot:

[operations/deployment-charts@master] api-gateway: expose restbase /api/ endpoint

https://gerrit.wikimedia.org/r/852165

Change 865683 had a related patch set uploaded (by Hnowlan; author: Hnowlan):

[operations/deployment-charts@master] api-gateway: add restbase routing, enable in staging

https://gerrit.wikimedia.org/r/865683

Change 865683 merged by jenkins-bot:

[operations/deployment-charts@master] api-gateway: add restbase routing, enable in staging

https://gerrit.wikimedia.org/r/865683

daniel triaged this task as High priority. Mar 6 2023, 10:37 AM