
Move cloud runner CI jobs to trusted runners
Closed, Resolved, Public

Description

The gitlab-cloud-runner project uses Terraform with a broad Digital Ocean access token and Helm to provision the cloud runner infrastructure. These are sensitive operations that should be moved to the trusted runners.

However, our trusted runners will not allow use of the currently used terraform-images from the GitLab container registry, so we'll need to vendor the scripts in that image with Terraform first and publish our image to docker-registry.wikimedia.org.

Details

Title | Reference | Author | Source Branch | Dest Branch
wmf: Use binary distribution of terraform | repos/releng/gitlab-terraform-images!9 | dduvall | review/lets-use-binary-dist | wmf/stable
Refactor to use WMF gitlab-terraform-images/blubber/kokkuri | repos/releng/gitlab-cloud-runner!16 | dduvall | review/use-wmf-image | main
Add repos/releng/gitlab-terraform-images to trusted runners | repos/releng/gitlab-trusted-runner!5 | dduvall | review/add-gitlab-terraform-images | main

Event Timeline

Change 852961 had a related patch set uploaded (by Dduvall; author: Dduvall):

[operations/puppet@production] aptrepo: Add thirdparty/terraform

https://gerrit.wikimedia.org/r/852961

Change 852961 merged by Dzahn:

[operations/puppet@production] aptrepo: Add thirdparty/terraform

https://gerrit.wikimedia.org/r/852961

Thanks for the review/merge, @MoritzMuehlenhoff and @Dzahn! I don't see the packages yet but I'm assuming the actual import is a manual step?

Looks like this is needed: https://wikitech.wikimedia.org/wiki/Reprepro#Adding_a_new_external_repository but puppet already did steps 1-3. Let me try to follow up.

Mentioned in SAL (#wikimedia-operations) [2022-11-07T22:07:41Z] <mutante> [apt1001:~] $ sudo -E reprepro --verbose --component thirdparty/terraform update bullseye-wikimedia - T322344

[apt1001:~] $ sudo -E reprepro --verbose --component  thirdparty/terraform-bullseye update bullseye-wikimedia 
Error: Component 'thirdparty/terraform-bullseye' as given to --component is not know.
(it does not appear as component in /srv/wikimedia/conf/distributions (did you mistype?))


[apt1001:~] $ grep terraform /srv/wikimedia/conf/distributions 
 thirdparty/terraform
 thirdparty/terraform-bullseye
[apt1001:~] $ sudo -E reprepro --verbose --component  thirdparty/terraform update bullseye-wikimedia 

...
ERROR: Condition 'DA418C88A3219F7B' not fulfilled for '/srv/wikimedia/lists/thirdparty%2Fterraform-bullseye_bullseye_InRelease'.
Signatures in '/srv/wikimedia/lists/thirdparty%2Fterraform-bullseye_bullseye_InRelease':
'DA418C88A3219F7B' (signed 1970-01-01): bad signature
Error: Not enough signatures found for remote repository thirdparty/terraform-bullseye (https://apt.releases.hashicorp.com bullseye)!
There have been errors!
[apt1001:~] $ sudo -E reprepro --verbose --component  thirdparty/terraform-bullseye update bullseye-wikimedia 

...

Nothing to do found. (Use --noskipold to force processing)
...
ERROR: Condition 'DA418C88A3219F7B' not fulfilled for '/srv/wikimedia/lists/thirdparty%2Fterraform-bullseye_bullseye_InRelease'.
Signatures in '/srv/wikimedia/lists/thirdparty%2Fterraform-bullseye_bullseye_InRelease':
'DA418C88A3219F7B' (signed 1970-01-01): bad signature
Error: Not enough signatures found for remote repository thirdparty/terraform-bullseye (https://apt.releases.hashicorp.com bullseye)!
There have been errors!

I don't know enough about reprepro to say why this error would occur, but FWIW the Release.gpg file does contain a signature from DA418C88A3219F7B.

$ gpg --list-packets <(curl -sSL 'https://apt.releases.hashicorp.com/dists/bullseye/Release.gpg')
# off=0 ctb=89 tag=2 hlen=3 plen=540
:signature packet: algo 1, keyid DA418C88A3219F7B
        version 4, created 1667502332, md5len 0, sigclass 0x00
        digest algo 8, begin of digest 99 42
        hashed subpkt 2 len 4 (sig created 2022-11-03)
        subpkt 16 len 8 (issuer key ID DA418C88A3219F7B)
        data: [4095 bits]

Is it possible that reprepro does not yet know about the key I added and there's some gpg import step? The puppet code suggests to me that it should all happen automatically, but maybe it's worth manually verifying that the key is present in the keyring.

Hmmh, this is quite strange. There's no obvious error in the config (though the claimed signature date of 1970 may be the culprit) and the key has been imported correctly on apt1001. This will need some further debugging. I'll try to get to it in the next few days; I'm currently quite busy.

@Dzahn In the meantime, to unblock the work, could you fetch the deb via secure apt (by following the upstream docs to set up the apt source and then running "apt-get download terraform") and then import the package using "reprepro includedeb"?

Thanks to you both for helping with this.

@Dzahn In the meantime, to unblock the work, could you fetch the deb via secure apt (by following the upstream docs to set up the apt source and then running "apt-get download terraform") and then import the package using "reprepro includedeb"?

I followed the exact steps from https://developer.hashicorp.com/terraform/tutorials/aws-get-started/install-cli -> Linux but got:

Get:5 https://apt.releases.hashicorp.com bullseye InRelease [12.0 kB]                            
Err:5 https://apt.releases.hashicorp.com bullseye InRelease                
  The following signatures couldn't be verified because the public key is not available: NO_PUBKEY DA418C88A3219F7B
E: The repository 'https://apt.releases.hashicorp.com bullseye InRelease' is not signed.

I have the key for sure. I can list it with both gpg --list-keys and apt-key list, and it's in the APT keyring. I am also aware that apt-key is deprecated, so I dropped the keyring into /etc/apt/trusted.gpg.d, but it would not work either way.

$ gpg --list-keys DA418C88A3219F7B
pub   rsa4096 2020-05-07 [SC]
      E8A032E094D8EB4EA189D270DA418C88A3219F7B
uid           [ unknown] HashiCorp Security (HashiCorp Package Signing) <security+packaging@hashicorp.com>
sub   rsa4096 2020-05-07 [E]

$ apt-key list | grep -B1 Hash
Warning: apt-key is deprecated. Manage keyring files in trusted.gpg.d instead (see apt-key(8)).
      E8A0 32E0 94D8 EB4E A189  D270 DA41 8C88 A321 9F7B
uid           [ unknown] HashiCorp Security (HashiCorp Package Signing) <security+packaging@hashicorp.com>

/etc/apt/trusted.gpg.d$ file * | grep hashi
hashicorp-archive-keyring.gpg:                  PGP/GPG key public ring (v4) created Thu May  7 11:09:42 2020 RSA (Encrypt or Sign) 4096 bits MPI=0xcb4b530fed7acc95...

But none of this made the NO_PUBKEY go away.
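
One cheap sanity check when chasing a NO_PUBKEY error is to confirm that the key ID in the error really belongs to the fingerprint that gpg and apt-key print. A minimal sketch, assuming the usual OpenPGP v4 convention that the long key ID is the last 16 hex digits of the fingerprint; the function name fingerprint_matches_keyid is hypothetical and not part of gpg or apt:

```shell
# Hypothetical helper: succeed if KEYID is the trailing part of FINGERPRINT.
# Spaces (as printed by "apt-key list") and letter case are ignored.
fingerprint_matches_keyid() {
    fpr=$(printf '%s' "$1" | tr -d ' ' | tr 'a-f' 'A-F')
    keyid=$(printf '%s' "$2" | tr -d ' ' | tr 'a-f' 'A-F')
    # An empty key ID would trivially match, so reject it explicitly.
    [ -n "$keyid" ] || return 1
    case "$fpr" in
        *"$keyid") return 0 ;;
        *) return 1 ;;
    esac
}

# Values copied from the listings above.
if fingerprint_matches_keyid "E8A0 32E0 94D8 EB4E A189  D270 DA41 8C88 A321 9F7B" "DA418C88A3219F7B"; then
    echo "key ID matches fingerprint"
fi
```

A match only shows that the right key is installed; it says nothing about whether APT can actually read the keyring file.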

Eventually I skipped the entire APT part of this and simply used:

wget https://apt.releases.hashicorp.com/dists/bullseye/InRelease

and it has a BAD signature.

$ gpg --verify InRelease
gpg: Signature made Wed 09 Nov 2022 10:31:46 AM PST
gpg:                using RSA key DA418C88A3219F7B
gpg: BAD signature from "HashiCorp Security (HashiCorp Package Signing) <security+packaging@hashicorp.com>" [unknown]

So I could remove the "signed-by" option from the APT sources entry and look at the checksums in the InRelease file, but I couldn't verify who made that file.

I opened a ticket with upstream. (#89323)

I opened a ticket with upstream. (#89323)

Can you provide a link please?

FWIW I was able to add the APT source and install the package this way (very similar to the docs you linked).

FROM docker-registry.wikimedia.org/bullseye

RUN apt-get update && \
    apt-get install -y \
      ca-certificates \
      gnupg \
      software-properties-common \
      wget

RUN wget -O - https://apt.releases.hashicorp.com/gpg | \
    gpg --dearmor > /etc/apt/trusted.gpg.d/hashicorp-archive.gpg

RUN echo "deb https://apt.releases.hashicorp.com bullseye main" > \
    /etc/apt/sources.list.d/hashicorp.list && \
    apt-get update

RUN apt-get install terraform

Would it be ok to add GetInRelease: no to the reprepro updates file? According to the manpage:

GetInRelease: no
    IF this is present, no InRelease file is downloaded but only Release (and Release.gpg ) are tried.

@dduvall The way you describe above works but it seems to me that is simply because it skips the "signed by" line in APT sources and does not verify. It would also work if you skip the installation of the gpg key entirely.

That being said, your comment did make me confirm that:

  • It works if the key is in /etc/apt/trusted.gpg.d/ (combined with signed-by line in sources)
  • It does not work if the identical key is in /usr/share/keyrings/ as suggested by the upstream docs.

So this works:

deb [signed-by=/etc/apt/trusted.gpg.d/hashicorp-archive-keyring.gpg] https://apt.releases.hashicorp.com bullseye main

After additional debugging, it turns out it was just a permission issue on the key when copied to /usr/share. The _apt user needs to be able to read it, and it couldn't.
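
A quick way to catch this kind of problem is to check the "other" read bit on the keyring file before blaming the signature. This is a minimal sketch, assuming GNU stat (as on Debian); the function name keyring_world_readable and the example path are illustrative, not something taken from the APT tooling:

```shell
# Hypothetical helper: succeed if FILE is readable by "other" users,
# which is what an unprivileged user such as _apt needs when it is
# neither the owner nor in the file's group.
keyring_world_readable() {
    # GNU stat prints the octal permission bits, e.g. 644 or 600.
    mode=$(stat -c '%a' "$1") || return 2
    # The last octal digit holds the permissions for "other".
    last=${mode#"${mode%?}"}
    case "$last" in
        4|5|6|7) return 0 ;;
        *) return 1 ;;
    esac
}

# Example path, assumed from the upstream docs mentioned above.
keyring=/usr/share/keyrings/hashicorp-archive-keyring.gpg
if [ -e "$keyring" ] && ! keyring_world_readable "$keyring"; then
    echo "$keyring is not world-readable; try: chmod 644 $keyring"
fi
```

The check covers only the file itself; the directories leading to it also need to be traversable by that user.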

I have now successfully installed the .deb locally with signature verification (this has to be done with apt-get rather than apt, or the package is not saved in /var/cache/apt/archives/) and uploaded it to apt1001.

Then I imported the .deb manually with:

[apt1001:/tmp] $ sudo -E reprepro -C thirdparty/terraform includedeb bullseye-wikimedia /tmp/terraform_1.3.4_amd64.deb 
Exporting indices...

@dduvall It's available for bullseye now:

[apt1001:/tmp] $ sudo -E reprepro ls terraform
terraform | 1.3.4 | bullseye-wikimedia | amd64

@dduvall The way you describe above works but it seems to me that is simply because it skips the "signed by" line in APT sources and does not verify. It would also work if you skip the installation of the gpg key entirely.

I don't think it needs the signed-by parameter if the key is under /etc/apt/trusted.gpg.d. I verified this by omitting the key installation and watching it fail.

FROM docker-registry.wikimedia.org/bullseye

RUN apt-get update && \
    apt-get install -y \
      ca-certificates \
      gnupg \
      software-properties-common \
      wget

#RUN wget -O - https://apt.releases.hashicorp.com/gpg | \
#    gpg --dearmor > /etc/apt/trusted.gpg.d/hashicorp-archive.gpg

RUN echo "deb https://apt.releases.hashicorp.com bullseye main" > \
    /etc/apt/sources.list.d/hashicorp.list && \
    apt-get update

RUN apt-get install terraform

$ DOCKER_BUILDKIT=0 docker build .
Sending build context to Docker daemon  2.048kB
Step 1/4 : FROM docker-registry.wikimedia.org/bullseye
 ---> 42fdecba0c20
Step 2/4 : RUN apt-get update &&     apt-get install -y       ca-certificates       gnupg       software-properties-common       wget
 ---> Using cache
 ---> 0f399f25d409
Step 3/4 : RUN echo "deb https://apt.releases.hashicorp.com bullseye main" >     /etc/apt/sources.list.d/hashicorp.list &&     apt-get update
 ---> Running in 5f5f291322ad
Hit:1 http://security.debian.org/debian-security bullseye-security InRelease
Get:2 https://apt.releases.hashicorp.com bullseye InRelease [12.0 kB]
Hit:3 http://mirrors.wikimedia.org/debian bullseye InRelease
Hit:4 http://apt.wikimedia.org/wikimedia bullseye-wikimedia InRelease
Hit:5 http://mirrors.wikimedia.org/debian bullseye-updates InRelease
Err:2 https://apt.releases.hashicorp.com bullseye InRelease
  The following signatures couldn't be verified because the public key is not available: NO_PUBKEY DA418C88A3219F7B
Hit:6 http://mirrors.wikimedia.org/debian bullseye-backports InRelease
Reading package lists...
W: GPG error: https://apt.releases.hashicorp.com bullseye InRelease: The following signatures couldn't be verified because the public key is not available: NO_PUBKEY DA418C88A3219F7B
E: The repository 'https://apt.releases.hashicorp.com bullseye InRelease' is not signed.
The command '/bin/sh -c echo "deb https://apt.releases.hashicorp.com bullseye main" >     /etc/apt/sources.list.d/hashicorp.list &&     apt-get update' returned a non-zero code: 100

  • It does not work if the identical key is in /usr/share/keyrings/ as suggested by the upstream docs.

That is truly odd behavior!

So where do we go from here? I really don't get what apt is doing that reprepro does differently.

@dduvall It's available for bullseye now:

[apt1001:/tmp] $ sudo -E reprepro ls terraform
terraform | 1.3.4 | bullseye-wikimedia | amd64

\o/ thank you!!!

I don't think it needs the signed-by parameter if the key is under /etc/apt/trusted.gpg.d. I verified this by omitting the key installation and watching it fail.

I can remove the key from /etc/apt/trusted.gpg.d/ and drop the signed-by line and it still works, but if I also remove it from apt-key then it no longer works.

That is truly odd behavior!

Just a permission issue after all.

But this does not explain why I get "BAD signature" when bypassing APT entirely and just using wget and gpg on the InRelease file.

Anyway, I hope you are unblocked for now regardless.

What's needed here to close this task? The cloud runner CI jobs have used Wikimedia images since https://gitlab.wikimedia.org/repos/releng/gitlab-cloud-runner/-/commit/5c3df5c0c3c0088b85e27057c864cc8a2210f431 and the jobs are executed on Trusted Runners now. Is there something open with the terraform debian packages?

Also thanks for migrating the CI jobs to the Trusted Runners! I think that's a good use-case for the Trusted Runners and to secure the tokens/kubernetes configs.

Is there something open with the terraform debian packages?

Our status is: the change to add the external terraform repo to the WMF repos was reverted. The reason was that it still had issues with the keys that signed the repo and, which I did not expect, this even affected other thirdparty repos and their users.

original change: https://gerrit.wikimedia.org/r/c/operations/puppet/+/852961

revert: https://gerrit.wikimedia.org/r/c/operations/puppet/+/858315

Afaict the issue is / (was?) in upstream, not us. But it needs to be fixed at some point.

That being said, since the package was already installed and not removed I don't think it currently blocks this ticket. Correct me if I'm wrong.

We're now seeing errors during our image build. It seems the component and package may no longer be in our repo.

W: Skipping acquire of configured file 'thirdparty/terraform/binary-amd64/Packages' as repository 'http://apt.wikimedia.org/wikimedia bullseye-wikimedia InRelease' doesn't have the component 'thirdparty/terraform' (component misspelt in sources.list?)
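
The warning means the repository's Release metadata no longer lists the component. A minimal sketch of checking that directly, assuming the standard Debian "Components:" field in a downloaded Release or InRelease file; the function name release_has_component is hypothetical:

```shell
# Hypothetical helper: succeed if COMPONENT appears in the "Components:"
# field of a Debian Release/InRelease file saved locally as FILE.
release_has_component() {
    # Take the first Components: line and strip the field name.
    components=$(sed -n 's/^Components:[[:space:]]*//p' "$1" | head -n 1)
    # The field is a whitespace-separated list of component names.
    for c in $components; do
        if [ "$c" = "$2" ]; then
            return 0
        fi
    done
    return 1
}

# Usage, assuming the Release file was downloaded beforehand:
#   release_has_component ./InRelease thirdparty/terraform || echo "component missing"
```

Comparing whole names rather than grepping for a substring avoids a false match between thirdparty/terraform and a longer name such as thirdparty/terraform-bullseye.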

I'm thinking the best course for us at this point might be to rely on binaries verified by checksum for now and not the upstream deb package.
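
For reference, checksum verification of a downloaded binary could look roughly like this. It is only a sketch, assuming GNU sha256sum is available and that the expected digest comes from a trusted source; the function name verify_sha256 and the file names in the usage comment are placeholders:

```shell
# Hypothetical helper: succeed only if FILE's SHA-256 digest equals EXPECTED.
verify_sha256() {
    file=$1
    expected=$2
    # sha256sum prints "<digest>  <filename>"; keep only the digest.
    actual=$(sha256sum "$file" | cut -d' ' -f1) || return 2
    [ -n "$actual" ] && [ "$actual" = "$expected" ]
}

# Usage with placeholder values:
#   verify_sha256 ./terraform.zip "<expected digest>" || { echo "checksum mismatch" >&2; exit 1; }
```

A checksum only proves the file matches the published digest, so the digest itself still has to come from a source you trust, which is the same trust question the signed APT repository is meant to answer.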

Change 868623 had a related patch set uploaded (by Muehlenhoff; author: Muehlenhoff):

[operations/puppet@production] Readd thirdparty/terraform components

https://gerrit.wikimedia.org/r/868623

Change 868623 merged by Muehlenhoff:

[operations/puppet@production] Readd thirdparty/terraform components

https://gerrit.wikimedia.org/r/868623

Mentioned in SAL (#wikimedia-operations) [2022-12-16T09:44:55Z] <moritzm> import terraform 1.3.6 to thirdparty/terraform for buster/bullseye T322344

I'm thinking the best course for us at this point might be to rely on binaries verified by checksum for now and not the upstream deb package.

Nah, let's not do that. thirdparty/terraform was missing because the older config caused issues with reprepro and had been reverted. I have now merged a patch to readd thirdparty/terraform and import terraform 1.3.6, please give your image build another shot.

Image build job for terraform works again, thanks for the quick help!

Thanks for the help, Moritz. It looks like you re-added it to the distributions file, but are we still unable to get it re-added to the updates file, and how will that impact package updates going forward? Will we have to request a manual import of each new release?

I'm worried about the long-term maintainability if there continue to be problems with upstream's APT repo configuration.

how will that impact package updates going forward? Will we have to request a manual import of each new release?

You would have needed that with the previous configuration as well; the updates config only makes this easier on the SRE end. I have set up a systemd-nspawn container which has the terraform source config, so when we need an update I can simply run "apt-get download terraform" and import the deb on apt1001. Some time next year we'll update reprepro to the latest version. It was dormant for some years, but Debian switched to a fork which integrates all the patches and bugfixes floating around. I'd expect that this will allow us to switch to the automated sync again.

dduvall claimed this task.

You would have needed that with the previous configuration as well; the updates config only makes this easier on the SRE end. I have set up a systemd-nspawn container which has the terraform source config, so when we need an update I can simply run "apt-get download terraform" and import the deb on apt1001.

Wow, nice. :)

Some time next year we'll update reprepro to the latest version, it was dormant for some years but Debian switched to a fork which integrates all the patches and bugfixes floating around. I'd expect that this will allow us to switch to the automated sync again.

Sounds good! Thanks again.

I'm going to close this task out then, as the terraform package is available again, the WMF image is built and available, and we are indeed running all the gitlab-cloud-runner jobs on trusted runners. Thanks, all.