Migrate recommendation-api to kubernetes
Closed, Resolved · Public

Description

  • Add a .pipeline/blubber.yaml to the repo
  • Enable the pipeline for that repo. A good example would be 07351ff6bc8252cf4876298b91126
  • Get an image and create the helm chart. https://gerrit.wikimedia.org/g/operations/deployment-charts has a README.md on how to do that, which takes most of the complexity away (do read the generated files though, you probably want to change some stuff)
  • Get the chart reviewed and merged.
  • Ask ServiceOps SRE to create the kubernetes namespaces/tokens/network policies to allow deploying the service
  • Deploy
  • Switch traffic from the old instance to the pipeline one. That last part is again SRE actions only (but should be done in cooperation with research).
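
A minimal sketch of a pre-flight check for the first step in the list above. Only the `.pipeline/blubber.yaml` path comes from this task; the function name and the "non-empty file" criterion are assumptions made for illustration, and the check does not validate the file's contents.

```python
from pathlib import Path

# Path taken from the checklist; everything else in this sketch is hypothetical.
BLUBBER_CONFIG = Path(".pipeline") / "blubber.yaml"


def has_blubber_config(repo_root: Path) -> bool:
    """Return True if repo_root contains a non-empty .pipeline/blubber.yaml."""
    config = repo_root / BLUBBER_CONFIG
    return config.is_file() and config.stat().st_size > 0


if __name__ == "__main__":
    # Run from the root of a service checkout.
    if has_blubber_config(Path.cwd()):
        print("blubber.yaml found; the pipeline can be enabled for this repo")
    else:
        print("missing .pipeline/blubber.yaml; add it before enabling the pipeline")
```

Run from the root of the service repository, this only confirms that the file exists; whether the pipeline accepts it is still decided by CI.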

Event Timeline

Restricted Application added a subscriber: Aklapper. Dec 20 2019, 12:15 PM
bmansurov added a subscriber: Pchelolo.

@Pchelolo Hey, any documentation on how to move the service to node-js 10?

Release-Engineering-Team any documentation on how to move the service to the deployment pipeline?

Change 560454 had a related patch set uploaded (by Bmansurov; owner: Bmansurov):
[operations/puppet@production] Recommendation API: upgrade node to version 10

https://gerrit.wikimedia.org/r/560454

Change 563185 had a related patch set uploaded (by Alexandros Kosiaris; owner: Alexandros Kosiaris):
[operations/docker-images/production-images@master] nodejs10: Add buster image

https://gerrit.wikimedia.org/r/563185

@akosiaris thanks for the patch. What are the next steps once your patch is merged?

Change 563185 merged by Alexandros Kosiaris:
[operations/docker-images/production-images@master] nodejs10: Add buster image

https://gerrit.wikimedia.org/r/563185

Images for buster and nodejs10 have been created and are present in the registry.

> @akosiaris thanks for the patch. What are the next steps once your patch is merged?

@bmansurov, off the top of my head (these belong in a page under wikitech, will get to that)

  • Add a .pipeline/blubber.yaml to the repo
  • Enable the pipeline for that repo. A good example would be 07351ff6bc8252cf4876298b91126
  • Get an image and create the helm chart. https://gerrit.wikimedia.org/g/operations/deployment-charts has a README.md on how to do that, which takes most of the complexity away (do read the generated files though, you probably want to change some stuff)
  • Get the chart reviewed and merged.
  • Ask ServiceOps SRE to create the kubernetes namespaces/tokens/network policies to allow deploying the service
  • Deploy
  • Switch traffic from the old instance to the pipeline one. That last part is again SRE actions only (but should be done in cooperation with research).
bmansurov updated the task description. Jan 19 2020, 11:45 PM

Change 565788 had a related patch set uploaded (by Bmansurov; owner: Bmansurov):
[operations/deployment-charts@master] Add recommendation-api chart

https://gerrit.wikimedia.org/r/565788

@akosiaris thanks! The first two points were already done. I've created a chart and uploaded a patch. Would you please review it.

Also, do I need to create a separate ticket to ask "ServiceOps SRE to create the kubernetes namespaces/tokens/network"? Or is this ticket good for that purpose too?

> @akosiaris thanks! The first two points were already done. I've created a chart and uploaded a patch. Would you please review it.

Done.

> Also, do I need to create a separate ticket to ask "ServiceOps SRE to create the kubernetes namespaces/tokens/network"? Or is this ticket good for that purpose too?

Yes please do create one.

Change 577627 had a related patch set uploaded (by Mholloway; owner: Michael Holloway):
[mediawiki/services/recommendation-api@master] Blubber: Use nodejs 10 versions of nodejs base images

https://gerrit.wikimedia.org/r/577627

Change 577627 merged by jenkins-bot:
[mediawiki/services/recommendation-api@master] Blubber: Use nodejs 10 versions of nodejs base images

https://gerrit.wikimedia.org/r/577627

Change 580294 had a related patch set uploaded (by Alexandros Kosiaris; owner: Alexandros Kosiaris):
[labs/private@master] Add k8s dummy tokens for 3 new services.

https://gerrit.wikimedia.org/r/580294

Change 580295 had a related patch set uploaded (by Alexandros Kosiaris; owner: Alexandros Kosiaris):
[operations/puppet@production] Kubernetes: Create token stanzas for some new services

https://gerrit.wikimedia.org/r/580295

Change 565788 merged by jenkins-bot:
[operations/deployment-charts@master] Add recommendation-api chart

https://gerrit.wikimedia.org/r/565788

Change 599812 had a related patch set uploaded (by Alexandros Kosiaris; owner: Alexandros Kosiaris):
[operations/deployment-charts@master] Create namespaces/calico rules for new services

https://gerrit.wikimedia.org/r/599812

Change 580294 merged by Alexandros Kosiaris:
[labs/private@master] Add k8s dummy tokens for 3 new services.

https://gerrit.wikimedia.org/r/580294

Change 580295 merged by Alexandros Kosiaris:
[operations/puppet@production] Kubernetes: Create token stanzas for some new services

https://gerrit.wikimedia.org/r/580295

Change 599812 merged by jenkins-bot:
[operations/deployment-charts@master] Create namespaces/calico rules for new services

https://gerrit.wikimedia.org/r/599812

akosiaris updated the task description. May 29 2020, 4:14 PM

@bmansurov namespaces, rules, tokens have been created. Chart has been merged and published. You are free to deploy. You will require a change like 968132909b4d24192b2f69a657c14bb30acd7a42 in order to instantiate the first deploy, feel free to add me as a reviewer.

Reedy renamed this task from Migrate recommendation-api to kubernetes to Migrate recommendation-api to kubernetes. Jun 2 2020, 11:19 PM

Change 602527 had a related patch set uploaded (by Bmansurov; owner: Bmansurov):
[operations/deployment-charts@master] Add recommendation-api helmfile stanzas

https://gerrit.wikimedia.org/r/602527

PI will want to track this since Recommendation-API contains a couple of endpoints we maintain and the apps consume.

jijiki moved this task from Incoming 🐫 to Unsorted on the serviceops board. Aug 17 2020, 11:46 PM
hashar added a subscriber: hashar. Sep 3 2020, 10:13 AM

Hi, is this still being worked on? Asking because CI still has to maintain a Jessie-based image / NodeJS 6.

@hashar o/ Yes, we're still working on it. I'll try to speed it up.

Change 560454 abandoned by Hashar:
[operations/puppet@production] Recommendation API: upgrade node to version 10

Reason:
Per Alexandros, the only path is to migrate to a container / k8s. T241230

https://gerrit.wikimedia.org/r/560454

@akosiaris Thanks for reviewing my patchsets. I was wondering if you've seen my last comment. My understanding is that the patch is very close to being merged, which would be great. Could you please take a look at my benchmark numbers and let me know what you think? Thanks. Here's the patch: https://gerrit.wikimedia.org/r/c/operations/deployment-charts/+/602527

Change 641439 had a related patch set uploaded (by Alexandros Kosiaris; owner: Bmansurov):
[operations/deployment-charts@master] recommendation-api: Supply more configuration

https://gerrit.wikimedia.org/r/641439

Change 641439 merged by jenkins-bot:
[operations/deployment-charts@master] recommendation-api: Supply more configuration

https://gerrit.wikimedia.org/r/641439

Change 602527 merged by jenkins-bot:
[operations/deployment-charts@master] Add recommendation-api helmfile stanzas

https://gerrit.wikimedia.org/r/602527

Change 641713 had a related patch set uploaded (by Alexandros Kosiaris; owner: Alexandros Kosiaris):
[operations/deployment-charts@master] recommendation-api: Open up access to 3306

https://gerrit.wikimedia.org/r/641713

Change 641713 merged by jenkins-bot:
[operations/deployment-charts@master] recommendation-api: Open up access to 3306

https://gerrit.wikimedia.org/r/641713

Change 641742 had a related patch set uploaded (by Alexandros Kosiaris; owner: Alexandros Kosiaris):
[operations/deployment-charts@master] recommendation-api: Allow overriding mysql_tables

https://gerrit.wikimedia.org/r/641742

Change 641742 merged by jenkins-bot:
[operations/deployment-charts@master] recommendation-api: Allow overriding mysql_tables

https://gerrit.wikimedia.org/r/641742

Change 641749 had a related patch set uploaded (by Alexandros Kosiaris; owner: Alexandros Kosiaris):
[operations/puppet@production] conftool: Add recommendation-api to kubernetes nodes

https://gerrit.wikimedia.org/r/641749

Change 641750 had a related patch set uploaded (by Alexandros Kosiaris; owner: Alexandros Kosiaris):
[operations/puppet@production] lvs: Add new TLS enabled recommendation-api service

https://gerrit.wikimedia.org/r/641750

Change 641753 had a related patch set uploaded (by Alexandros Kosiaris; owner: Alexandros Kosiaris):
[operations/deployment-charts@master] recommendation-api: Switch to using envoy based discovery

https://gerrit.wikimedia.org/r/641753

Change 641749 merged by Alexandros Kosiaris:
[operations/puppet@production] conftool: Add recommendation-api to kubernetes nodes

https://gerrit.wikimedia.org/r/641749

Change 641753 merged by jenkins-bot:
[operations/deployment-charts@master] recommendation-api: Switch to using envoy based discovery

https://gerrit.wikimedia.org/r/641753

Mentioned in SAL (#wikimedia-operations) [2020-11-18T17:13:56Z] <akosiaris> T241230 pool codfw kubernetes for recommendation-api at a very low weight
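
To illustrate why pooling at a very low weight makes for a cautious first step, here is a rough sketch of how relative weights map to expected traffic share under simple weight-proportional balancing. The backend names and weight values are hypothetical, not taken from this task, and the actual load balancer scheduling may distribute requests differently.

```python
from typing import Dict


def traffic_share(weights: Dict[str, float]) -> Dict[str, float]:
    """Map each backend's weight to its expected fraction of requests.

    Assumes plain weight-proportional balancing, which is a simplification.
    """
    if any(w < 0 for w in weights.values()):
        raise ValueError("weights must be non-negative")
    total = sum(weights.values())
    if total <= 0:
        raise ValueError("at least one backend needs a positive weight")
    return {name: weight / total for name, weight in weights.items()}


if __name__ == "__main__":
    # Hypothetical values: legacy hosts keep most of the weight,
    # the new kubernetes backend is pooled with a small one.
    shares = traffic_share({"legacy": 99, "kubernetes": 1})
    for name, share in shares.items():
        print(f"{name}: {share:.1%} of requests")
```

With these example weights the new backend would receive roughly one request in a hundred, which limits the impact if the new deployment misbehaves.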

Change 641750 merged by Alexandros Kosiaris:
[operations/puppet@production] lvs: Add new TLS enabled recommendation-api service

https://gerrit.wikimedia.org/r/641750

akosiaris updated the task description. Thu, Nov 19, 9:56 AM
akosiaris closed this task as Resolved. Thu, Nov 19, 9:59 AM
akosiaris claimed this task.
akosiaris triaged this task as Medium priority.

The service was deployed yesterday, and the traffic switch happened today. Per https://grafana.wikimedia.org/d/Y5wk80oGk/recommendation-api?orgId=1&var-dc=thanos&var-site=eqiad&var-service=recommendation-api&var-prometheus=k8s&var-container_name=All&from=now-3h&to=now traffic is now flowing to the kubernetes based deployment (alas there is no corresponding dashboard for the legacy infrastructure). There is some cleanup work left, but otherwise this is done. I am gonna resolve it as successful, but feel free to reopen. Thanks to @bmansurov for working through getting the container created and the helm chart ready.

Change 641937 had a related patch set uploaded (by Alexandros Kosiaris; owner: Alexandros Kosiaris):
[operations/puppet@production] lvs: Switch old nontls recommendation-api to lvs_setup

https://gerrit.wikimedia.org/r/641937

Change 641938 had a related patch set uploaded (by Alexandros Kosiaris; owner: Alexandros Kosiaris):
[operations/puppet@production] recommendation-api: Switch to service_setup

https://gerrit.wikimedia.org/r/641938

Change 641939 had a related patch set uploaded (by Alexandros Kosiaris; owner: Alexandros Kosiaris):
[operations/puppet@production] recommendation-api: Cleanups

https://gerrit.wikimedia.org/r/641939

Change 641937 merged by Alexandros Kosiaris:
[operations/puppet@production] lvs: Switch old nontls recommendation-api to lvs_setup

https://gerrit.wikimedia.org/r/641937