Page MenuHomePhabricator

Clarify common k8s label and service conventions in our helm charts
Open, Needs TriagePublic

Description

I have some questions and thoughts about common k8s labels and template defines in our helm charts. I mostly maintain the various eventgate deployments, so my perspective comes from running multiple deployment services all from the same Helm chart. I'll use eventgate in the examples below.

Perhaps we just need some good documentation of what these terms mean, or perhaps we should refactor some of them.


First, the wmf.* defines in _helpers.tpl:

{{- define "wmf.chartname" -}}
{{- default .Chart.Name .Values.chartName | trunc 63 | trimSuffix "-" -}}
  • Why do we need to support .Values.chartName?
{{- define "wmf.releasename" -}}
{{- $name := default .Chart.Name .Values.chartName -}}
{{- printf "%s-%s" $name .Release.Name | trunc 63 | trimSuffix "-" -}}
  • For eventgate, this makes wmf.releasename == 'eventgate-production' for each eventgate deployment. I learned that this is safe because each deployment is in its own namespace, e.g. eventgate-main. I assume that since this concats chart + release name, this is intended to fully identify the WMF deployed release. It doesn't. Perhaps this is just poorly named?
{{- define "wmf.chartid" -}}
{{ .Chart.Name }}-{{ .Chart.Version | replace "+" "_" }}
  • If we support .Values.chartName, shouldn't wmf.chartid also use it?
  • This probably needs normalized to 63 chars too, right?
{{- define "wmf.appbaseurl" -}}
http://{{ template "wmf.releasename" . }}:{{ .Values.main_app.port }}
  • Because wmf.releasename is e.g. eventgate-production, wmf.appbaseurl is probably useless, no? Or, perhaps I am misunderstanding what wmf.appbbaseurl is for. Maybe we just need more doc comments in _helpers.tpl?

And Service naming label question. From _tls_helpers.yaml:

apiVersion: v1
kind: Service
metadata:
  name: {{ template "wmf.releasename" . }}-tls-service
  labels:
    app: {{ template "wmf.chartname" . }}
    chart: {{ template "wmf.chartid" . }}
    release: {{ .Release.Name }}
    heritage: {{ .Release.Service }}
spec:
  type: NodePort
  selector:
    app: {{ template "wmf.chartname" . }}
    routed_via: {{ .Release.Name }}
  ports:
    - name: {{ template "wmf.releasename" . }}-https
      protocol: TCP
      port: {{ .Values.tls.public_port }}
      nodePort: {{ .Values.tls.public_port }}
  • Calling the Service e.g. 'eventgate-production-tls-service' is confusing. For eventgate deployments: there will be 4 distinct Services named this in k8s (even though each one is in its own namespace). Is that ok / desired?
  • app != wmf.chartname, unless...what is app supposed to mean in this case? Would it be more correct to name this label chart?
  • If so, then the chart label would be more correctly named chartid.
  • What is heritage?

Event Timeline

Ottomata renamed this task from Clarify helm common label and service conventions to Clarify common k8s label and service conventions in our helm charts.Sep 27 2021, 3:36 PM
  • For eventgate, this makes wmf.releasename == 'eventgate-production' for each eventgate deployment. I learned that this is safe because each deployment is in its own namespace, e.g. eventgate-main. I assume that since this concats chart + release name, this is intended to fully identify the WMF deployed release. It doesn't. Perhaps this is just poorly named?
  • Calling the Service e.g. 'eventgate-production-tls-service' is confusing. For eventgate deployments: there will be 4 distinct Services named this in k8s (even though each one is in its own namespace). Is that ok / desired?

When considering naming, you need to also factor in the namespace, like you'd do for classes in a program.

So when you read eventgate-production-tls-proxy is just the name within the namespace, and it's properly eventgate-logging-external::eventgate-production-tls-proxy; and indeed, when you use kubectl, you won't ever see two eventgate-production-tls-proxy services, because you have to indicate the namespace, either explicitly on the command line or via the kubeconfig file you're using.

What I think is partially misleading is adding the release name to the tlsproxy name, as it's actually routing more than just that release. But at the same time, this allows us to define multiple releases in the same namespace with different endpoints if we need to (note: that would need some more work on the charts anyways, but naming is the hardest thing to change afterwards).

The routed_via value then refers to the release name we want to funnel tls traffic through even for releases that don't have their own TLS proxy.

The whole app/chart/release/heritage combo of labels comes from what helm best practices suggest.

Specifically:

  • app indicates the general chart we're using. So e.g. eventgate, mediawiki, shellbox, termbox, mathoid, etc.
  • chart indicates the specific chart used in an unique way, so e.g. eventgate-0.60, mediawiki-0.0.38, etc.
  • release refers to the release within the namespace, so I guess usually production
  • heritage is helm everywhere and it's a way to clearly identify that the object is managed by helm. It's a standard label that's mostly redundant for us (we manage everything via helm) but it's useful in general-purpose charts

Change 724489 had a related patch set uploaded (by Ottomata; author: Ottomata):

[operations/deployment-charts@master] Add docs about template, label, and conary conventions

https://gerrit.wikimedia.org/r/724489

Ok, added some docs in patch, lemme know what you think. Still not sure what appbaseurl is for, or if we should make chartid also truncated to 63 chars.

The whole app/chart/release/heritage combo of labels comes from what helm best practices suggest.

Specifically:

  • app indicates the general chart we're using. So e.g. eventgate, mediawiki, shellbox, termbox, mathoid, etc.
  • chart indicates the specific chart used in an unique way, so e.g. eventgate-0.60, mediawiki-0.0.38, etc.
  • release refers to the release within the namespace, so I guess usually production
  • heritage is helm everywhere and it's a way to clearly identify that the object is managed by helm. It's a standard label that's mostly redundant for us (we manage everything via helm) but it's useful in general-purpose charts

FWIW: Those recommendations have already changes as well, see T253395 :/

Change 724489 merged by jenkins-bot:

[operations/deployment-charts@master] Add docs about template, label, and canary conventions

https://gerrit.wikimedia.org/r/724489