
Add minimal yaml file linting as part of ci for security ci templates repository
Closed, Resolved (Public)

Description

Something like Python's yaml.safe_load is likely good enough for now.
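
As a rough illustration of what a yaml.safe_load-based check could look like (a minimal sketch, assuming the third-party PyYAML package is installed; the function names `lint_yaml_text` and `lint_files` are hypothetical and this is not the linter that ended up in the repository):

```python
import yaml  # PyYAML, a third-party package (assumed to be installed)


def lint_yaml_text(text):
    """Return None if `text` parses as YAML, otherwise a one-line error message."""
    try:
        # safe_load only constructs plain Python types (dict, list, str, ...),
        # so it does not execute arbitrary object constructors.
        yaml.safe_load(text)
    except yaml.YAMLError as exc:
        # PyYAML error messages span several lines; collapse them to one.
        return " ".join(str(exc).split())
    return None


def lint_files(paths):
    """Return a list of (path, error) pairs for the files that fail to parse."""
    failures = []
    for path in paths:
        with open(path, encoding="utf-8") as handle:
            error = lint_yaml_text(handle.read())
        if error is not None:
            failures.append((path, error))
    return failures
```

A CI job could call `lint_files` on the repository's YAML files and exit non-zero if the returned list is not empty. Note that an empty document loads as `None` without raising, so a linter that wants to reject empty files has to check for that separately.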

Event Timeline

Made some progress on this and can maybe call it done for now. Here's the linter I came up with within the repo's .gitlab-ci.yml:

https://gitlab.wikimedia.org/security/gitlab-ci-security-templates/-/blob/6ad4315aff1491f7c176a40e39ab006dd3c10221/.gitlab-ci.yml

It uses the WMF python3:0.0.2-20211024 Docker image, and I tried to keep the apt/pip installs on top of that minimal. I think there's still an open question as to what is "secure enough" within these environments (see also T291978). I would note that being too restrictive would make it extremely difficult to quickly and conveniently build and modify GitLab CI functionality. But we also probably don't want people installing hundreds of packages or random binaries.

Anyhow, here's an example of the yaml linter failing and exiting (and the "bad" file that triggered it). And here's one where it succeeds.

Some remaining questions in my mind:

  1. As previously mentioned, what will be considered "secure enough" regarding external dependencies for various GitLab CI functionality?
  2. I don't believe a GitLab CI coding standards guide exists yet, but maybe there should be one? I don't think my experiment here is too awful, but it does feature a probably-too-lengthy Python one-liner to do the YAML linting, which maybe should be its own in-repo script that gets called, or be shortened in some other way. I'm not exactly sure what constitutes best and worst practices for GitLab CI.
  3. Aside from gitlab-replica.wikimedia.org (and I'm not sure how closely that is synced to gitlab.wikimedia.org), there isn't really a great dev environment for building and testing more advanced GitLab CI functionality, other than a random shell on your laptop :)
  4. Maybe we'd want unit/integration tests for some of the GitLab CI stuff? It seems a little funny to have tests for the testing infrastructure, but I'd imagine these will become critical code/configuration once gitlab.wikimedia.org more fully begins to replace Gerrit.
  5. For this YAML linter specifically, maybe we want an env var to allow empty files to pass? Currently they fail.
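
For the last question, one possible shape for the opt-in is sketched below. It is only an illustration: the variable name `YAML_LINT_ALLOW_EMPTY` and both function names are hypothetical, and nothing in the linked .gitlab-ci.yml defines them. The sketch works on the value that yaml.safe_load already returned, which is `None` for an empty or comment-only file (and also for a document that is a literal null).

```python
import os

# Hypothetical variable name, chosen for this sketch only.
ALLOW_EMPTY_VAR = "YAML_LINT_ALLOW_EMPTY"


def allow_empty(environ=os.environ):
    """True when the opt-in variable is set to a recognisably truthy value."""
    value = environ.get(ALLOW_EMPTY_VAR, "")
    return value.strip().lower() in {"1", "true", "yes"}


def document_ok(parsed, environ=os.environ):
    """Decide whether an already-parsed YAML document should pass the lint.

    `parsed` is the return value of yaml.safe_load for one file.
    """
    if parsed is None:
        # Empty (or null) document: fail by default, pass only on opt-in.
        return allow_empty(environ)
    return True
```

Keeping the default strict preserves the current behaviour (empty files fail), and a job that wants the relaxed behaviour could set the variable under its `variables:` key.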
sbassett triaged this task as Medium priority.
sbassett moved this task from In Progress to Done on the user-sbassett board.
sbassett moved this task from In Progress to Our Part Is Done on the Security-Team board.

A quick blog post that could be helpful to some future work in this area: https://cipherstash.com/blog/2021-11-25-linting-your-github-actio

This post is specifically about GitHub Actions, but the same ideas apply to GitLab CI, particularly the other two checks the author mentions (since we already validate the YAML):

  1. That it has all the required fields, and no unknown fields, by validating the structure against published schemas
  2. That all globs mentioned in paths and paths-ignore lists match at least one file in the repo.
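
Both checks can be approximated with the standard library alone. The sketch below is a hand-rolled stand-in, not real schema validation: `check_fields` compares a parsed mapping against sets of required and allowed keys supplied by the caller, and `unmatched_globs` uses Python's `glob` module, whose pattern semantics may differ from those of the CI system. All names and the example field sets are assumptions made for illustration.

```python
import glob
import os


def check_fields(mapping, required, allowed):
    """Return messages for missing required keys and for keys not in `allowed`."""
    present = set(mapping)
    problems = [
        f"missing required field: {key}"
        for key in sorted(set(required) - present, key=str)
    ]
    problems += [
        f"unknown field: {key}"
        for key in sorted(present - set(allowed), key=str)
    ]
    return problems


def unmatched_globs(patterns, repo_root):
    """Return the patterns that match no regular file under `repo_root`."""
    unmatched = []
    for pattern in patterns:
        # recursive=True lets "**" span any number of directories.
        matches = glob.glob(os.path.join(repo_root, pattern), recursive=True)
        if not any(os.path.isfile(match) for match in matches):
            unmatched.append(pattern)
    return unmatched
```

For example, `check_fields({"script": ["make"], "stagee": "test"}, {"script", "stage"}, {"script", "stage", "image"})` reports both the missing `stage` and the misspelled `stagee`. Validating against a published schema would be more thorough than this key comparison, since it also covers value types and nesting.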