Test deployment-charts for kubernetes 1.19 compatibility
Closed, Resolved, Public

Description

Unfortunately not just a simple Rakefile patch...

The kubeyaml situation has changed a bit, as the new version has been rewritten as a React app without a CLI.
We could probably stick to the old implementation and maintain future updates (one for each k8s version) ourselves, but that does not sound very appealing.

I had a bookmark for https://github.com/instrumenta/kubeval in my browser (that's all the research I have done on it so far). If it does what we need/got from kubeyaml, we can maybe migrate with less effort than patching kubeyaml.

Event Timeline

It looks like kubeval basically works and could easily replace kubeyaml. Its disadvantage is that it fetches the schema from https://kubernetesjsonschema.dev (https://github.com/instrumenta/kubernetes-json-schema), so we would need to include a copy of that in the CI image and run it like:
kubeval -v 1.11.0 -s file:///tmp/kubernetes-json-schema/
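
For use from a CI helper, the invocation above could be assembled roughly as follows. This is only a sketch: the function name kubeval_command and the manifest file name are made up for illustration, and only the -v and -s flags shown above are used.

```python
import shlex


def kubeval_command(manifests, k8s_version, schema_location):
    """Build the kubeval argv for validating manifests against a local schema copy."""
    # -v selects the Kubernetes version, -s points kubeval at the schema location
    return ["kubeval", "-v", k8s_version, "-s", schema_location, *manifests]


if __name__ == "__main__":
    # "deployment.yaml" is a placeholder manifest name
    cmd = kubeval_command(["deployment.yaml"], "1.11.0", "file:///tmp/kubernetes-json-schema/")
    print(shlex.join(cmd))
```

The resulting list can be handed to subprocess.run(cmd, check=True) so that a validation failure fails the job.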

The schema repo currently does not include 1.19 schemata and does not seem particularly fast/automated in adding them. So we will probably have to do that ourselves and/or send PRs upstream.

I fiddled with this a bit, and it is possible to use a local version/checkout of the schema, which we can also generate ourselves with something like:

#!/usr/bin/env python3
import json
import os

import requests
import link_header
from datetime import datetime
from openapi2jsonschema.command import default as oapi2json

K8S_RELEASES_URL = "https://api.github.com/repos/kubernetes/kubernetes/releases?per_page=100"


def fetch_all_k8s_releases(recurse=True):
    # Fetch a list of all k8s releases
    def _fetch(url):
        resp = requests.get(url)
        if resp.status_code == 403:
            limit_reset = datetime.utcfromtimestamp(int(resp.headers["X-RateLimit-Reset"]))
            print("Rate limited until %s UTC" % limit_reset)
        resp.raise_for_status()
        body = resp.json()
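        if "link" not in resp.headers:
            return (body, None)

        # Follow the "next" relation of the Link header, if there is one
        for link in link_header.parse(resp.headers["link"]).links:
            if link.rel == "next":
                return (body, link.href)
        # Last page: a Link header is present but has no "next" relation
        return (body, None)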

    all_releases, next_link = _fetch(K8S_RELEASES_URL)
    while recurse and next_link is not None:
        releases, next_link = _fetch(next_link)
        all_releases.extend(releases)

    return all_releases


def filter_stable_k8s_release_versions(in_releases):
    # Return a list of only stable k8s releases version numbers
    releases = []
    for rel in in_releases:
        if rel.get("draft", False) or rel.get("prerelease", False):
            continue
        # tag_name is the git tag, which openapi_to_json uses as a git ref in schema_url
        releases.append(rel["tag_name"])

    return releases


def openapi_to_json(path, version, force=False):
    schema_url = "https://raw.githubusercontent.com/kubernetes/kubernetes/%s/api/openapi-spec/swagger.json" % version
    # FIXME This should probably be a local, maybe relative url
    prefix_url = "https://kubernetesjsonschema.dev/%s/_definitions.json" % version

    def _o2j(output, stand_alone, expanded, kubernetes, strict, prefix=""):
        if not prefix:
            prefix = os.path.join(output, "_definitions.json")
        return oapi2json.callback(output=output,
                                  schema=schema_url,
                                  prefix=prefix,
                                  stand_alone=stand_alone,
                                  expanded=expanded,
                                  kubernetes=kubernetes,
                                  strict=strict)

    output = os.path.join(path, version)
    if force or not os.path.isdir(output):
        _o2j(output=output, prefix=prefix_url, stand_alone=False, expanded=True, kubernetes=True, strict=False)
        _o2j(output=output, prefix=prefix_url, stand_alone=False, expanded=False, kubernetes=True, strict=False)

    for variant in ("standalone-strict", "standalone", "local"):
        output = os.path.join(path, "-".join((version, variant)))
        if not force and os.path.isdir(output):
            continue
        stand_alone = "standalone" in variant
        strict = "strict" in variant
        _o2j(output=output, stand_alone=stand_alone, expanded=True, kubernetes=True, strict=strict)
        _o2j(output=output, stand_alone=stand_alone, expanded=False, kubernetes=True, strict=strict)


if __name__ == "__main__":
    releases = fetch_all_k8s_releases(recurse=False)
    stable_versions = filter_stable_k8s_release_versions(releases)
    for version in sorted(stable_versions, reverse=True):
        print("Generating spec for %s" % version)
        openapi_to_json("/tmp/kubernetes-json-schema", version)
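
One small caveat with the script above: sorted() on version strings compares them character by character, so a 1.9.x release sorts above 1.19.x. That only affects the order in which specs are generated, but if the order matters, a numeric sort key along these lines might help. The helper name version_key is hypothetical, and it assumes plain "vMAJOR.MINOR.PATCH" strings.

```python
import re


def version_key(version):
    """Turn 'v1.19.3' into (1, 19, 3) so versions compare numerically."""
    match = re.fullmatch(r"v?(\d+)\.(\d+)\.(\d+)", version)
    if match is None:
        raise ValueError("unexpected version string: %r" % version)
    return tuple(int(part) for part in match.groups())


versions = ["v1.9.11", "v1.19.3", "v1.16.15"]
# Plain string sort puts v1.9.11 first
print(sorted(versions, reverse=True))
# Numeric sort puts v1.19.3 first
print(sorted(versions, key=version_key, reverse=True))
```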

My current plan is to build a kubeval deb and add a git repo with the needed kubernetes api schema. I think we don't need fancy auto-upgrade stuff for the schema repo as we don't upgrade k8s that often. I will add a script like the above, so that we can just run that locally and commit the new kubernetes version schema to the repo once we plan to upgrade to a new version (or once we want to test for compatibility with a new version).
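
Before committing a new version to such a schema repo, it may be handy to check that all variants were generated. The following is a minimal sketch that assumes the directory layout produced by the script above (the plain version directory plus the -standalone-strict, -standalone and -local variants); the function name missing_schema_dirs is made up.

```python
import os

# Directory suffixes the generator script writes for each version
VARIANTS = ("", "-standalone-strict", "-standalone", "-local")


def missing_schema_dirs(path, version):
    """Return the expected schema directories for version that are absent under path."""
    expected = [version + suffix for suffix in VARIANTS]
    return [name for name in expected if not os.path.isdir(os.path.join(path, name))]


if __name__ == "__main__":
    # An empty list means every variant for that version is present
    print(missing_schema_dirs("/tmp/kubernetes-json-schema", "v1.19.3"))
```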

[27.10.20 14:03] <jayme> akosiaris: so unfortunately, kubeval is way less picky than kubeyaml is. I guess that's simply because it just validates against the spec rather then actually parsing into the go structures (which I think kubeyaml did)
[27.10.20 14:04] <jayme> means we won't get those "type errors" we had seen (like "[spec.template.spec.volumes] key volumes has wrong type <nil> (should be []interface{})")
[27.10.20 14:07] <akosiaris> umf
[27.10.20 14:09] <jayme> the nodejs kubeyaml otoh still seems to detect those
[27.10.20 14:10] <jayme> and still has this weird issue of just parsing the first object of a yaml stream :-)
[27.10.20 14:17] <jayme> but that difference also means that kubeyaml still uses go for the backend, which turns out to be true...
[27.10.20 14:17] <jayme> sooo...we could a) run the kubeyaml backend internally or b) hack together a cli
[27.10.20 14:31] <akosiaris> jayme: there is a c). Just figure out what needs to be done to import the versions we are missing to the current one
[27.10.20 14:31] <akosiaris> it means we are forking it of course.
[27.10.20 14:31] <akosiaris> Which given upstream's stance, it might not be a bad idea. I am just not sure we want to pay that cost. Lemme try to do it and gauge how much of a pain it is
[27.10.20 14:32] <jayme> akosiaris: oh, yeah. That ofc...no problem - I can figure out what it needs to add 1.19, I'm looking into all that now anyways
[27.10.20 14:33] <jayme> on the long run, maybe someone will re-add CLI to the current backend code ...
[27.10.20 14:46] <jayme> akosiaris: the backend code seems to be API compatible to the old CLI interface :D
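
To illustrate the kind of "type error" mentioned in the chat above (a key such as volumes rendered with a nil value where a list is expected), here is a rough sketch of such a check on an already parsed manifest. It is not how kubeyaml works; the function name find_null_lists and the hand-picked set of list-typed keys are assumptions made for this example only.

```python
def find_null_lists(obj, list_keys, path=""):
    """Yield dotted paths where a key that should hold a list holds None."""
    if isinstance(obj, dict):
        for key, value in obj.items():
            child = "%s.%s" % (path, key) if path else key
            if key in list_keys and value is None:
                yield child
            else:
                yield from find_null_lists(value, list_keys, child)
    elif isinstance(obj, list):
        for index, item in enumerate(obj):
            yield from find_null_lists(item, list_keys, "%s[%d]" % (path, index))


# What a YAML parser yields when a template renders "volumes:" with no entries
manifest = {"spec": {"template": {"spec": {"containers": [{"name": "app"}], "volumes": None}}}}
print(list(find_null_lists(manifest, {"volumes", "containers"})))
# prints ['spec.template.spec.volumes']
```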

So I created https://github.com/chuckha/kubeyaml/pull/22, imported that tarball as tags/upstream/0.0.3_20201027+git5f5556c and updated vendor & stuff in https://gerrit.wikimedia.org/r/c/operations/debs/kubeyaml/+/636713

Change 636878 had a related patch set uploaded (by JMeybohm; owner: JMeybohm):
[integration/config@master] helm-linter: Update kubeyaml

https://gerrit.wikimedia.org/r/636878

Change 636879 had a related patch set uploaded (by JMeybohm; owner: JMeybohm):
[integration/config@master] jjb: update job to releng/helm-linter:0.2.10

https://gerrit.wikimedia.org/r/636879

Change 636881 had a related patch set uploaded (by JMeybohm; owner: JMeybohm):
[operations/deployment-charts@master] Test charts/deployments for compatibility with k8s 1.19

https://gerrit.wikimedia.org/r/636881

Change 636878 merged by jenkins-bot:
[integration/config@master] helm-linter: Update kubeyaml

https://gerrit.wikimedia.org/r/636878

Mentioned in SAL (#wikimedia-releng) [2020-10-28T10:00:02Z] <hashar> Successfully tagged docker-registry.discovery.wmnet/releng/helm-linter:0.2.10 # T266032

Change 636879 merged by jenkins-bot:
[integration/config@master] jjb: update job to releng/helm-linter:0.2.10

https://gerrit.wikimedia.org/r/636879

Change 636881 merged by jenkins-bot:
[operations/deployment-charts@master] Test charts/deployments for compatibility with k8s 1.19

https://gerrit.wikimedia.org/r/636881

Change 639736 had a related patch set uploaded (by JMeybohm; owner: JMeybohm):
[operations/deployment-charts@master] Add kubernetes 1.16 to the list of tested versions

https://gerrit.wikimedia.org/r/639736

Change 639736 merged by jenkins-bot:
[operations/deployment-charts@master] Add kubernetes 1.16 to the list of tested versions

https://gerrit.wikimedia.org/r/639736