Deploy the HDFS synchronizer (blunderbuss) service to the dse-k8s cluster
Closed, ResolvedPublic

Description

As developers of the HDFS synchronizer service, which runs as part of a GitLab CI/CD pipeline and has access to an HDFS cluster, we need an environment in which we can iterate on the develop -> test -> modify loop under real-world conditions.

Here is a list of things needed to achieve this functionality:

  • A kubernetes namespace.
  • A kubectl config file.
  • A GitLab CI pipeline that builds a container image of your application.
  • A helm chart that describes your application's execution environment, including a kerberos authentication sidecar definition.
  • A helmfile deployment definition, which describes how your chart will be deployed to the dse-k8s cluster.
  • A kerberos keytab, which is deployed as a kubernetes secret, for your helmfile deployment to use.
  • An iptables firewall rule to allow GitLab trusted runners to talk to the k8s service.

Details

Other Assignee
bking
Related Changes in Gerrit:
Related Changes in GitLab:
Title: Add hdfs-synchronizer to trusted projects
Reference: repos/releng/gitlab-trusted-runner!98
Author: amastilovic
Source Branch: T371994_Add_HDFS_Synchronizer_project
Dest Branch: main

Event Timeline

BTullis subscribed.

I'd be happy to work on this with you @amastilovic

Many of the steps above are things that you can drive yourself, so hopefully we will just be able to guide you through and assist with reviews.
One or two of the steps, such as obtaining a kerberos principal and keytab, must be completed by a member of the Data-Platform-SRE team (or any SRE), so we can do those for you.

There are some general guidelines about deploying Kubernetes services here: https://wikitech.wikimedia.org/wiki/Kubernetes/Add_a_new_service

There are also some notes about the GitLab-CI integration and the process of creating your application image here: https://www.mediawiki.org/wiki/GitLab/Workflows/Deploying_services_to_production

When it comes to the helm chart, there is an overview of how things work here: https://wikitech.wikimedia.org/wiki/Kubernetes/Deployment_Charts
...as well as a useful README file here, which explains how to use the create_new_service.sh script.

Bear in mind that many of these notes have been written by the ServiceOps new team, with specific reference to the wikikube (a.k.a. main) Kubernetes clusters.
This application will be deployed to the dse-k8s cluster so, for example, the helmfile deployment will be in the helmfile.d/dse-k8s-services subdirectory. Other relevant patches may also have to reference dse-k8s-eqiad instead of main, in various places.
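To illustrate the directory note above, here is a minimal sketch of where the helmfile deployment would sit in the deployment-charts repository. Only the `helmfile.d/dse-k8s-services` prefix comes from this task; the per-service subdirectory named after the service, and the helper name `helmfile_dir`, are assumptions made for illustration.

```python
from pathlib import PurePosixPath

# Prefix taken from the comment above; dse-k8s deployments do not live
# alongside the wikikube (main) ones.
DSE_K8S_SERVICES = PurePosixPath("helmfile.d") / "dse-k8s-services"


def helmfile_dir(service: str) -> PurePosixPath:
    """Return the assumed helmfile directory for a dse-k8s service.

    Assumption: each service gets a subdirectory named after itself.
    """
    if not service or "/" in service:
        raise ValueError(f"not a plain service name: {service!r}")
    return DSE_K8S_SERVICES / service


if __name__ == "__main__":
    print(helmfile_dir("blunderbuss"))
```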

BTullis renamed this task from Obtain SRE resources needed to test the HDFS synchronizer service to Deploy the HDFS synchronizer service to the dse-k8s cluster. Sep 18 2024, 11:46 AM
Gehel triaged this task as High priority. Sep 20 2024, 7:47 AM

Change #1077096 had a related patch set uploaded (by Bking; author: Bking):

[operations/puppet@production] dse-k8s: add kube_env config for net-new service

https://gerrit.wikimedia.org/r/1077096

Change #1077096 merged by Bking:

[operations/puppet@production] dse-k8s: add kube_env config for net-new service

https://gerrit.wikimedia.org/r/1077096

Ottomata renamed this task from Deploy the HDFS synchronizer service to the dse-k8s cluster to Deploy the HDFS synchronizer (blunderbuss) service to the dse-k8s cluster. Oct 24 2024, 4:10 PM
bking updated Other Assignee, added: bking; removed: BTullis. Nov 8 2024, 4:40 PM
bking updated the task description.

Change #1077106 had a related patch set uploaded (by Bking; author: Aleksandar Mastilovic):

[operations/deployment-charts@master] Adding a Helm chart for HDFS Synchronizer

https://gerrit.wikimedia.org/r/1077106

Change #1077106 merged by jenkins-bot:

[operations/deployment-charts@master] Adding a Helm chart for HDFS Synchronizer

https://gerrit.wikimedia.org/r/1077106

Change #1090894 had a related patch set uploaded (by Bking; author: Bking):

[operations/deployment-charts@master] hdfs-synchronizer: remove unneeded mcrouter config

https://gerrit.wikimedia.org/r/1090894

Change #1088608 had a related patch set uploaded (by Bking; author: Aleksandar Mastilovic):

[operations/deployment-charts@master] Added helmfile.d dse-k8s-services entries for HDFS synchronizer

https://gerrit.wikimedia.org/r/1088608

Change #1090894 merged by jenkins-bot:

[operations/deployment-charts@master] hdfs-synchronizer: remove unneeded mcrouter config

https://gerrit.wikimedia.org/r/1090894

Change #1090929 had a related patch set uploaded (by Bking; author: Bking):

[operations/deployment-charts@master] hdfs-synchronizer: remove more unneeded mcrouter config

https://gerrit.wikimedia.org/r/1090929

Change #1090929 merged by jenkins-bot:

[operations/deployment-charts@master] hdfs-synchronizer: remove more unneeded mcrouter config

https://gerrit.wikimedia.org/r/1090929

Change #1090932 had a related patch set uploaded (by Bking; author: Bking):

[operations/deployment-charts@master] hdfs-sychronizer: once again, remove unneeded mcrouter config

https://gerrit.wikimedia.org/r/1090932

Change #1090932 merged by jenkins-bot:

[operations/deployment-charts@master] hdfs-sychronizer: once again, remove unneeded mcrouter config

https://gerrit.wikimedia.org/r/1090932

Change #1090935 had a related patch set uploaded (by Bking; author: Bking):

[operations/deployment-charts@master] hdfs-synchronizer: remove unneeded mcrouter config, part 4

https://gerrit.wikimedia.org/r/1090935

Change #1090935 merged by jenkins-bot:

[operations/deployment-charts@master] hdfs-synchronizer: remove unneeded mcrouter config, part 4

https://gerrit.wikimedia.org/r/1090935

Change #1090945 had a related patch set uploaded (by Bking; author: Bking):

[operations/dns@master] dse-k8s-services: add CNAME for hdfs-synchronizer

https://gerrit.wikimedia.org/r/1090945

Change #1090945 merged by Bking:

[operations/dns@master] dse-k8s-services: add CNAME for hdfs-synchronizer

https://gerrit.wikimedia.org/r/1090945

Change #1090972 had a related patch set uploaded (by Bking; author: Bking):

[operations/dns@master] dse-k8s-services: add CNAME for blunderbuss (nee hdfs-synchronizer)

https://gerrit.wikimedia.org/r/1090972

Change #1090977 had a related patch set uploaded (by Bking; author: Bking):

[operations/puppet@production] dse-k8s: add ingress config for net-new service

https://gerrit.wikimedia.org/r/1090977

Change #1090972 merged by Bking:

[operations/dns@master] dse-k8s-services: add CNAME for blunderbuss (nee hdfs-synchronizer)

https://gerrit.wikimedia.org/r/1090972

Change #1091756 had a related patch set uploaded (by Bking; author: Bking):

[operations/puppet@production] dse-k8s-service: fix name of net-new service

https://gerrit.wikimedia.org/r/1091756

Change #1091756 merged by Bking:

[operations/puppet@production] dse-k8s-service: fix name of net-new service

https://gerrit.wikimedia.org/r/1091756

Change #1091776 had a related patch set uploaded (by Bking; author: Bking):

[operations/deployment-charts@master] admin-ng: replace hdfs-synchronizer namespace w/blunderbuss (dse-k8s)

https://gerrit.wikimedia.org/r/1091776

Change #1091776 merged by jenkins-bot:

[operations/deployment-charts@master] admin-ng: replace hdfs-synchronizer namespace w/blunderbuss (dse-k8s)

https://gerrit.wikimedia.org/r/1091776

Change #1091801 had a related patch set uploaded (by Bking; author: Bking):

[operations/deployment-charts@master] dse-k8s-services: add net-new chart for blunderbuss

https://gerrit.wikimedia.org/r/1091801

Change #1091801 merged by jenkins-bot:

[operations/deployment-charts@master] dse-k8s-services: add net-new chart for blunderbuss

https://gerrit.wikimedia.org/r/1091801

Change #1091827 had a related patch set uploaded (by Bking; author: Bking):

[operations/deployment-charts@master] dse-k8s-services: introduce Blunderbuss config

https://gerrit.wikimedia.org/r/1091827

Change #1092311 had a related patch set uploaded (by Bking; author: Bking):

[operations/deployment-charts@master] dse-k8s: raise quota for blunderbuss

https://gerrit.wikimedia.org/r/1092311

Change #1092311 merged by jenkins-bot:

[operations/deployment-charts@master] dse-k8s: raise quota for blunderbuss

https://gerrit.wikimedia.org/r/1092311

Change #1090977 merged by Bking:

[operations/puppet@production] dse-k8s: add ingress config for net-new service

https://gerrit.wikimedia.org/r/1090977

Mentioned in SAL (#wikimedia-operations) [2024-11-20T19:08:28Z] <inflatador> bking@krb1001 add kerberos keytab for blunderbuss https://phabricator.wikimedia.org/P71106 T371994

Change #1101925 had a related patch set uploaded (by Aleksandar Mastilovic; author: Aleksandar Mastilovic):

[operations/puppet@production] Add Blunderbuss firewall rule to GitLab runner set

https://gerrit.wikimedia.org/r/1101925

Change #1101925 merged by Btullis:

[operations/puppet@production] Add Blunderbuss firewall rule to GitLab runner set

https://gerrit.wikimedia.org/r/1101925

Mentioned in SAL (#wikimedia-operations) [2024-12-12T21:32:52Z] <inflatador> bking@gitlab-runner2004 restart docker to troubleshoot missing iptables rules T371994

Mentioned in SAL (#wikimedia-operations) [2024-12-12T21:35:24Z] <inflatador> bking@gitlab-runner2004 restart ferm to troubleshoot missing iptables rules T371994

bking added a subscriber: Jelto.

@Jelto see above, Puppet laid down the ferm rules when the patch you reviewed was merged, but ferm didn't actually load them, even when I reloaded ferm.service.

I had to restart ferm on gitlab-runner2004 to actually get the new rule into iptables. I didn't want to do this on all gitlab runner hosts without your team's permission...let us know if this is OK. This is blocking us, but it's not an emergency.
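For anyone wanting to confirm whether a rule written to the ferm configuration has actually reached iptables, the check can be sketched roughly as below. This is not what was run on gitlab-runner2004; it is a hedged illustration that parses `iptables -S` output. The function names, the destination address (taken from the 192.0.2.0/24 documentation range) and the port are all hypothetical, and the real rule may match on different fields.

```python
import subprocess


def _option_value(tokens, flag):
    """Return the token following `flag`, or None if the flag is absent."""
    if flag in tokens:
        i = tokens.index(flag)
        if i + 1 < len(tokens):
            return tokens[i + 1]
    return None


def rule_present(rules_text: str, dest: str, dport: int) -> bool:
    """True if an ACCEPT rule for dest:dport appears in `iptables -S` style text."""
    for line in rules_text.splitlines():
        tokens = line.split()
        if "ACCEPT" not in tokens:
            continue
        # iptables prints single hosts with a /32 suffix.
        dest_ok = _option_value(tokens, "-d") in (dest, f"{dest}/32")
        port_ok = _option_value(tokens, "--dport") == str(dport)
        if dest_ok and port_ok:
            return True
    return False


def live_rules() -> str:
    """Dump the rules currently loaded in the kernel (requires root)."""
    result = subprocess.run(
        ["iptables", "-S"], check=True, capture_output=True, text=True
    )
    return result.stdout


if __name__ == "__main__":
    # Hypothetical destination and port; substitute the values from the ferm rule.
    print(rule_present(live_rules(), "192.0.2.10", 443))
```

If this reports False after Puppet has written the rule to disk, the rule exists in the ferm configuration but not in the running ruleset, which matches the symptom described above.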

Mentioned in SAL (#wikimedia-operations) [2024-12-12T22:08:02Z] <inflatador> bking@cumin2002 sudo cumin A:gitlab-runner 'systemctl restart ferm.service' T371994

amastilovic updated the task description.

Change #1091827 merged by Bking:

[operations/deployment-charts@master] dse-k8s-services: introduce Blunderbuss config

https://gerrit.wikimedia.org/r/1091827