Add Link engineering: Allow external traffic to linkrecommendation service
Closed, Resolved; Public

Description

I noticed in deployment-charts/helmfile.d/services/README.md this line:

  • if you need external traffic, serviceops also needs to configure LVS and DNS to enable you to receive traffic from outside or from other WMF services.

I was under the impression that we couldn't allow external traffic to the new link recommendation service. But if we could, it would be really nice for:

  1. local devs not having to set up their own service
  2. QA and Product Managers can query the service directly to assess quality of results
  3. Community members can build gadgets that integrate with the service outside of Growth Team's planned way to present this tool's output to users

So, if we could allow external traffic, that would be quite nice. But it should not block initial deployment.

Details

Repo                                    Branch      Lines +/-
mediawiki/extensions/GrowthExperiments  master      +21 -2
mediawiki/extensions/GrowthExperiments  master      +5 -4
research/mwaddlink                      main        +174 -186
operations/deployment-charts            master      +20 -0
operations/deployment-charts            master      +86 -2
mediawiki/extensions/GrowthExperiments  master      +5 -4
research/mwaddlink                      main        +58 -57
mediawiki/extensions/GrowthExperiments  master      +30 -2
mediawiki/extensions/GrowthExperiments  master      +79 -5
operations/puppet                       production  +5 -0
operations/dns                          master      +3 -0
operations/dns                          master      +4 -0
operations/puppet                       production  +4 -4
operations/puppet                       production  +3 -3
operations/puppet                       production  +3 -3
operations/puppet                       production  +72 -0
operations/deployment-charts            master      +8 -2
operations/dns                          master      +2 -4
operations/dns                          master      +9 -3

Event Timeline

Arguably, in the interest of https://en.wikipedia.org/wiki/Separation_of_concerns it's probably better that whatever instance of the service mediawiki queries is NOT exposed to the public. That way requests of the infrastructure won't be mixed with external ones, allowing for better capacity planning, service level support etc.

We could however instantiate a second deployment of the software, e.g. as a second helm/helmfile release and expose that one. Depending on the timeline it might end up being easier than expected, as we are working on some changes in the infrastructure which might remove a lot of manual work from SRE's plate (the stuff in the README).

> Arguably, in the interest of https://en.wikipedia.org/wiki/Separation_of_concerns it's probably better that whatever instance of the service mediawiki queries is NOT exposed to the public. That way requests of the infrastructure won't be mixed with external ones, allowing for better capacity planning, service level support etc.

That makes sense.

> We could however instantiate a second deployment of the software, e.g. as a second helm/helmfile release and expose that one. Depending on the timeline it might end up being easier than expected, as we are working on some changes in the infrastructure which might remove a lot of manual work from SRE's plate (the stuff in the README).

If it were quick and easy, I was hoping to do this soon so that other developers would not need to set up the software themselves for testing and development. But I could also figure out an interim solution where the service is running on e.g. Toolforge or WMCS instead of a kubernetes prod instance.

Deploying it is quick and easy, exposing it is a bit more of a mess (which is why we are trying to figure out a different way of handling all that). We might have something to test against in 4-6 months from now, but no promises, covid-19 is a really unpredictable thing. Depending on your timelines, let us know if that sounds ok or not.

If it's easy to selectively address one or the other from MediaWiki, we could just have a special page or API proxy queries to the secondary service as an interim solution. (That would be sort of nice for exposing it to editors anyway - we could link to it from action=info.)

Yeah I was thinking this also, this could be a good compromise. I imagine we would need to apply some rate limiting to requests, and we'd have to decide if requests proxied via MediaWiki should be cached.
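To make the rate-limiting idea concrete, here is a minimal token-bucket sketch. It is only an illustration of the general technique, not how MediaWiki or the API gateway actually limit requests; the class name, the parameters and the injectable clock are all assumptions made for this example.

```python
import time


class TokenBucket:
    """Hypothetical limiter: allows bursts of `capacity` requests,
    refilled at `rate` tokens per second."""

    def __init__(self, capacity, rate, clock=time.monotonic):
        self.capacity = float(capacity)
        self.rate = float(rate)
        self.clock = clock          # injectable so the limiter can be tested
        self.tokens = self.capacity
        self.last = clock()

    def allow(self):
        """Return True if the request may proceed, False if it should be rejected."""
        now = self.clock()
        # Refill proportionally to elapsed time, never above capacity.
        self.tokens = min(self.capacity, self.tokens + (now - self.last) * self.rate)
        self.last = now
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False
```

A proxying endpoint could keep one bucket per client (keyed by IP or user, for instance) and answer with HTTP 429 whenever allow() returns False. Whether to cache proxied responses is a separate decision that this sketch does not address.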

Can/should it maybe be integrated into https://wikitech.wikimedia.org/wiki/API_Gateway instead of going through MW?

That could work too, as long as it avoids the problems @akosiaris mentioned, and it doesn't come with SLA/stability requirements (I don't think we want to advertise the API as stable for outside clients to use at this point).

Change 656430 had a related patch set uploaded (by Alexandros Kosiaris; owner: Alexandros Kosiaris):
[operations/dns@master] Introduce linkrecommendation{,-external}

https://gerrit.wikimedia.org/r/656430

Change 656430 merged by Alexandros Kosiaris:
[operations/dns@master] Introduce linkrecommendation{,-external}

https://gerrit.wikimedia.org/r/656430

Change 657855 had a related patch set uploaded (by Alexandros Kosiaris; owner: Alexandros Kosiaris):
[operations/deployment-charts@master] Add a linkrecommendation-external release

https://gerrit.wikimedia.org/r/657855

Change 658303 had a related patch set uploaded (by Alexandros Kosiaris; owner: Alexandros Kosiaris):
[operations/dns@master] Remove linkrecommendation-external

https://gerrit.wikimedia.org/r/658303

Change 658303 merged by Alexandros Kosiaris:
[operations/dns@master] Remove linkrecommendation-external

https://gerrit.wikimedia.org/r/658303

Change 657855 merged by jenkins-bot:
[operations/deployment-charts@master] Add a linkrecommendation-external release

https://gerrit.wikimedia.org/r/657855

Change 658576 had a related patch set uploaded (by Alexandros Kosiaris; owner: Alexandros Kosiaris):
[operations/puppet@production] services: Create LVS services for linkrecommendation

https://gerrit.wikimedia.org/r/658576

Change 658576 merged by Alexandros Kosiaris:
[operations/puppet@production] services: Create LVS services for linkrecommendation

https://gerrit.wikimedia.org/r/658576

Change 658579 had a related patch set uploaded (by Alexandros Kosiaris; owner: Alexandros Kosiaris):
[operations/puppet@production] similar-users, linkrecommendation: Switch to lvs_setup

https://gerrit.wikimedia.org/r/658579

Change 658579 merged by Alexandros Kosiaris:
[operations/puppet@production] similar-users, linkrecommendation: Switch to lvs_setup

https://gerrit.wikimedia.org/r/658579

Change 658599 had a related patch set uploaded (by Alexandros Kosiaris; owner: Alexandros Kosiaris):
[operations/puppet@production] similar-users, linkrecommendation: Switch to lvs_setup

https://gerrit.wikimedia.org/r/658599

Change 658599 merged by Alexandros Kosiaris:
[operations/puppet@production] similar-users, linkrecommendation: Switch to monitoring_setup

https://gerrit.wikimedia.org/r/658599

Change 658630 had a related patch set uploaded (by Alexandros Kosiaris; owner: Alexandros Kosiaris):
[operations/puppet@production] similar-users, linkrecommendation: Switch to production

https://gerrit.wikimedia.org/r/658630

Change 658630 merged by Alexandros Kosiaris:
[operations/puppet@production] similar-users, linkrecommendation: Switch to production

https://gerrit.wikimedia.org/r/658630

Change 658980 had a related patch set uploaded (by Alexandros Kosiaris; owner: Alexandros Kosiaris):
[operations/dns@master] similar-users, linkrecommendation: Add discovery

https://gerrit.wikimedia.org/r/658980

Change 658980 merged by Alexandros Kosiaris:
[operations/dns@master] similar-users, linkrecommendation: Add discovery

https://gerrit.wikimedia.org/r/658980

Change 659314 had a related patch set uploaded (by Alexandros Kosiaris; owner: Alexandros Kosiaris):
[operations/puppet@production] linkrecommendation: Add linkrecommendation.wikimedia.org

https://gerrit.wikimedia.org/r/659314

Change 659315 had a related patch set uploaded (by Alexandros Kosiaris; owner: Alexandros Kosiaris):
[operations/dns@master] linkrecommendation: Point to dyna.w.o

https://gerrit.wikimedia.org/r/659315

@hnowlan, hi. So, TL;DR this task is about exposing an instance of the new linkrecommendation service publicly. We could expose it under a dedicated domain name (e.g. linkrecommendation.wikimedia.org), but since we are talking about an API, it makes way more sense to me to expose it under the api-gateway. Opinions?

On the back end side, we've created 2 different LVS services, 1 to expose to all the internal services (e.g. mediawiki) and 1 to expose to the public for separation of concerns, at least on the kubernetes level (the database is still the same). At the beginning the API will only be exposed for QA and PM purposes, but it's not inconceivable that later on it could be added e.g. to gadgets.

This seems very well suited to the API gateway. All that really needs to be configured is an external path and an internal hostname to map it to. Currently URLs are of the form /core/v1/SERVICE/PATH.

For the appservers currently we do rewriting of URLs on a per-wiki and per-language basis. It's a bit convoluted but it works - I assume this wouldn't be needed if we're just connecting to an LVS endpoint though?
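As a rough illustration of the "external path plus internal hostname" mapping, the sketch below resolves a /core/v1/SERVICE/PATH request to an internal URL. It is not the gateway's real configuration or code: the ROUTES table and the resolve function are hypothetical, and the internal address is the one given for the externally exposed release elsewhere in this thread.

```python
from typing import Dict, Optional

# Hypothetical routing table: service name in the external path -> internal base URL.
ROUTES: Dict[str, str] = {
    "linkrecommendation": "https://linkrecommendation.discovery.wmnet:4006",
}


def resolve(external_path: str, routes: Dict[str, str] = ROUTES) -> Optional[str]:
    """Map /core/v1/SERVICE/PATH to the internal URL, or None if nothing matches."""
    parts = external_path.lstrip("/").split("/", 3)
    if len(parts) < 3 or parts[0] != "core" or parts[1] != "v1":
        return None
    base = routes.get(parts[2])
    if base is None:
        return None
    rest = parts[3] if len(parts) > 3 else ""
    return f"{base}/{rest}"
```

For example, resolve("/core/v1/linkrecommendation/healthz") yields the /healthz URL on the internal host, while an unknown service name yields None.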

If the usage is mostly for developers not having to run their own instance with their own data, I would suggest we expose what follows:

  1. under /core/v1/linkrecommendation/PATH a connection to the actual service, with aggressive rate-limiting of clients
  2. under some other url, we could expose the staging instances, for use by our developers for testing/verification

We might think of various ways to do this, one of which is to handle x-wikimedia-debug headers in the api gateway (and use the same external URL, but pointing to staging instead of the production servers).

We already have some support for x-mediawiki-debug for appservers in the gateway. Given that we'll need to make some slight configuration changes to point to k8s hosts instead of appservers, adding the ability to specify staging services as x-mediawiki-debug services would be easy.
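The header-based idea (same external URL, different backend depending on a debug header) could look roughly like the sketch below. This is an assumption-laden illustration, not the gateway's implementation: the function name is made up, the staging address is a placeholder that does not exist, the header name follows the x-wikimedia-debug spelling used earlier in the thread, and the production address is the one given elsewhere in this task.

```python
from typing import Mapping

PRODUCTION = "https://linkrecommendation.discovery.wmnet:4006"
STAGING = "https://staging.example.invalid:4006"  # hypothetical placeholder, not a real host
DEBUG_HEADER = "x-wikimedia-debug"


def pick_backend(headers: Mapping[str, str]) -> str:
    """Return the staging backend when the debug header is present, else production."""
    # HTTP header names are case-insensitive, so compare in lowercase.
    lowered = {name.lower() for name in headers}
    return STAGING if DEBUG_HEADER in lowered else PRODUCTION
```

The external URL stays identical for both cases; only the presence of the header changes where the request is sent.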

> If the usage is mostly for developers not having to run their own instance with their own data, I would suggest we expose what follows:

That's the current plan, but it might end up being used in other things as well.

>   1. under /core/v1/linkrecommendation/PATH a connection to the actual service, with aggressive rate-limiting of clients
>   2. under some other url, we could expose the staging instances, for use by our developers for testing/verification

+1 on the former, I am not so fond of the latter. And the reason is that staging is not meant for testing (it's actually a safety net right before we deploy to production and it's there to allow us to reach a certain degree of confidence before deploying to production) and we should not overload the meaning.

> We might think of various ways to do this, one of which is to handle x-wikimedia-debug headers in the api gateway (and use the same external URL, but pointing to staging instead of the production servers).

Yes, but let's not abuse staging for this.

@kostajh I think the path described above (that is, having the external linkrecommendation service exposed under e.g. https://api.wikimedia.org/linkrecommendation/v1/wikipedia or something like /core/v1/linkrecommendation) would be far more sustainable long term than linkrecommendation.wikimedia.org. The api-gateway is new, but so is this service. api-gateway provides rate limiting, documentation, examples of how to use the API, authentication/authz if required, etc.

That sounds fine to me. What would be needed from Growth-Team to implement that?

>> If the usage is mostly for developers not having to run their own instance with their own data, I would suggest we expose what follows:

> That's the current plan, but it might end up being used in other things as well.

Indeed, potential uses for the external API are: QA, product managers, community members who want to explore algorithm quality (especially as we are not generating recommendations for all content), someone might want to write a gadget that uses the service, etc.

PET would be happy to implement this for you if it's the desired path forward. All we need is the internal services to connect to (I'm assuming linkrecommendation.discovery.wmnet) and a URL path we can test against once implemented.

Thanks @hnowlan. I'll leave it for @akosiaris and @Joe to confirm.

The internal services URL is https://linkrecommendation.discovery.wmnet:4005. We haven't yet loaded the datasets, so the service is not functional, but curl https://linkrecommendation.discovery.wmnet:4005/healthz and curl https://linkrecommendation.discovery.wmnet:4005/apispec_1.json should both work without error.

https://linkrecommendation.discovery.wmnet:4005 is the internal service. We want to expose the one on port 4006 (separation of concerns), so please use https://linkrecommendation.discovery.wmnet:4006. The rest applies as is, and they use the same database (the separation of concerns currently happens only at the deployment level, not the datastore layer), so once the internal service is ready, the one to be exposed will also be ready.
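The two curl checks mentioned above can be scripted for both releases. The sketch below is only a convenience example: the host, ports and paths come from this thread, while the function names and the timeout are assumptions, and the .wmnet hostname presumably resolves only from inside the production network.

```python
import urllib.request
from typing import List

HOST = "linkrecommendation.discovery.wmnet"
INTERNAL_PORT = 4005   # release used by internal services such as MediaWiki
EXTERNAL_PORT = 4006   # release intended to sit behind the API gateway
CHECK_PATHS = ("/healthz", "/apispec_1.json")


def check_urls(port: int, host: str = HOST) -> List[str]:
    """Build the URLs that should respond without error for the given release."""
    return [f"https://{host}:{port}{path}" for path in CHECK_PATHS]


def probe(url: str, timeout: float = 5.0) -> bool:
    """Return True if the URL answers with HTTP 200, False on any network or HTTP error."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.status == 200
    except OSError:  # URLError and HTTPError are subclasses of OSError
        return False
```

Calling probe() on each entry of check_urls(INTERNAL_PORT) and check_urls(EXTERNAL_PORT) covers both releases. Since they share one database, a dataset problem would be expected to show up on both at once.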

Change 662692 had a related patch set uploaded (by Hnowlan; owner: Hnowlan):
[operations/deployment-charts@master] api-gateway: generic discovery service config option, add linkrecommendation

https://gerrit.wikimedia.org/r/662692

Change 666097 had a related patch set uploaded (by Kosta Harlan; owner: Kosta Harlan):
[mediawiki/extensions/GrowthExperiments@master] Allow lookup of uncached link recommendations via REST route

https://gerrit.wikimedia.org/r/666097

Change 666105 had a related patch set uploaded (by Kosta Harlan; owner: Kosta Harlan):
[mediawiki/extensions/GrowthExperiments@master] [WIP] link recommendations: Allow querying for arbitrary wikis

https://gerrit.wikimedia.org/r/666105

Change 659315 abandoned by Alexandros Kosiaris:
[operations/dns@master] linkrecommendation: Point to dyna.w.o

Reason:
Per discussion in the task, this is going under the api-gateway instead. Abandoning this

https://gerrit.wikimedia.org/r/659315

Change 659314 abandoned by Alexandros Kosiaris:
[operations/puppet@production] linkrecommendation: Add linkrecommendation.wikimedia.org

Reason:
Per discussion in the task, this is going under the api-gateway instead. Abandoning this

https://gerrit.wikimedia.org/r/659314

Change 666105 abandoned by Kosta Harlan:
[mediawiki/extensions/GrowthExperiments@master] [WIP] link recommendations: Allow querying for arbitrary wikis

Reason:
This is going to be more trouble than it's worth. Implementing a GET endpoint in the mwaddlink repo will be easier and more flexible

https://gerrit.wikimedia.org/r/666105

Change 666194 had a related patch set uploaded (by Kosta Harlan; owner: Kosta Harlan):
[research/mwaddlink@main] Add API endpoint for GET requests

https://gerrit.wikimedia.org/r/666194

Change 666307 had a related patch set uploaded (by Kosta Harlan; owner: Kosta Harlan):
[mediawiki/extensions/GrowthExperiments@master] linkrecommendation: Use updated API endpoint path

https://gerrit.wikimedia.org/r/666307

Change 666097 abandoned by Kosta Harlan:
[mediawiki/extensions/GrowthExperiments@master] Allow lookup of uncached link recommendations via REST route

Reason:
See I3667220ef1a4616e39b64e92c3da6f08e381bfc8 for alternative

https://gerrit.wikimedia.org/r/666097

Moving into our current sprint as we're doing some work to support the API gateway integration.

Change 666194 merged by jenkins-bot:
[research/mwaddlink@main] Add API endpoint for GET requests

https://gerrit.wikimedia.org/r/666194

Our side of this is done, so I'm moving into External for us to keep an eye on.

@hnowlan I'm assigning to you as you're working on the patch(es) to make this work. Thanks!

Change 666307 merged by jenkins-bot:
[mediawiki/extensions/GrowthExperiments@master] linkrecommendation: Use updated API endpoint path

https://gerrit.wikimedia.org/r/666307

Change 667645 had a related patch set uploaded (by Hnowlan; owner: Hnowlan):
[operations/puppet@production] prometheus::postgres_exporter: Load additional rules on stretch

https://gerrit.wikimedia.org/r/667645

Change 662692 merged by jenkins-bot:
[operations/deployment-charts@master] api-gateway: generic discovery service config option, add linkrecommendation

https://gerrit.wikimedia.org/r/662692

Change 667844 had a related patch set uploaded (by Hnowlan; owner: Hnowlan):
[operations/deployment-charts@master] api-gateway: allow access to linkrecommendation service

https://gerrit.wikimedia.org/r/667844

Change 667844 merged by jenkins-bot:
[operations/deployment-charts@master] api-gateway: allow access to linkrecommendation service

https://gerrit.wikimedia.org/r/667844

Change 667850 had a related patch set uploaded (by Kosta Harlan; owner: Kosta Harlan):
[research/mwaddlink@main] [WIP] Use single swagger definition for linkrecommendations endpoint

https://gerrit.wikimedia.org/r/667850

Change 667856 had a related patch set uploaded (by Kosta Harlan; owner: Kosta Harlan):
[mediawiki/extensions/GrowthExperiments@master] linkrecommendation: Threshold and max_recommendations are query params

https://gerrit.wikimedia.org/r/667856

Change 667850 merged by jenkins-bot:
[research/mwaddlink@main] Use single swagger definition for linkrecommendations endpoint

https://gerrit.wikimedia.org/r/667850

Change 667856 merged by jenkins-bot:
[mediawiki/extensions/GrowthExperiments@master] linkrecommendation: Threshold and max_recommendations are query params

https://gerrit.wikimedia.org/r/667856

Change 667941 had a related patch set uploaded (by Kosta Harlan; owner: Kosta Harlan):
[mediawiki/extensions/GrowthExperiments@master] Link recommendation: Use access token for access to external release

https://gerrit.wikimedia.org/r/667941

Change 667941 merged by jenkins-bot:
[mediawiki/extensions/GrowthExperiments@master] Link recommendation: Use access token for access to external release

https://gerrit.wikimedia.org/r/667941