Upload docker-ce 18.06.3 upstream package for Stretch
Closed, Declined (Public)

Description

CI instances use Docker from upstream. We could thus use https://download.docker.com/linux/debian/dists/stretch/pool/stable/amd64/docker-ce_18.06.3~ce~3-0~debian_amd64.deb to be uploaded to apt.wikimedia.org for Stretch.

The jessie one is under thirdparty/ci. But we can use another component name if that is better.

Event Timeline

And maybe we could use some reprepro configuration to ease further upgrades?

hashar triaged this task as Medium priority. Jun 24 2019, 1:44 PM
hashar updated the task description.

Adding @MoritzMuehlenhoff since he seems to know best about the reprepro config in modules/aptrepo/files/updates.

MoritzMuehlenhoff claimed this task.

This was resolved via T215975

I need the package for Stretch!

Ah, I eventually found the entry:

Name: thirdparty/kubeadm-k8s-docker.com
Method: https://download.docker.com/linux/debian/
Suite: stretch
Components: stable>thirdparty/kubeadm-k8s
UDebComponents:
Architectures: amd64
VerifyRelease: 7EA0A9C3F273FCD8
ListShellHook: grep-dctrl -e -P '^docker-ce|docker-ce-cli$' || [ $? -eq 1 ]
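As a side note on the ListShellHook pattern above: the alternation binds more loosely than the anchors, so it reads as "starts with docker-ce" OR "ends with docker-ce-cli". Here is a minimal Python sketch of that behaviour. It assumes grep-dctrl's -e flag does an unanchored extended-regex search on the Package field, which Python's `re.search` mimics for this simple pattern. The candidate package names are hypothetical and only there for illustration.

```python
import re

# Pattern copied verbatim from the ListShellHook line in the entry above.
PATTERN = re.compile(r"^docker-ce|docker-ce-cli$")

# Hypothetical package names, for illustration only.
candidates = [
    "docker-ce",
    "docker-ce-cli",
    "docker-ce-rootless-extras",
    "containerd.io",
    "kubeadm",
]


def selected(names):
    """Return the names the pattern matches anywhere in the string."""
    return [name for name in names if PATTERN.search(name)]


print(selected(candidates))
```

With these inputs, every name beginning with `docker-ce` is selected, including `docker-ce-rootless-extras`, because `^docker-ce` is only a prefix match. If only the two exact packages were wanted, the whole alternation would need to be grouped and anchored, for example `^(docker-ce|docker-ce-cli)$`.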

My primary concern is that the component was created for the Toolforge Kubernetes cluster, and I would want the versions used on CI decoupled from that. This is to avoid unattended upgrades due to Toolforge moving at a different pace.

The component also has mixed purposes, since it brings in both Docker and the Kubernetes tools. That is fine for Toolforge, but might not be when one just needs the fresh Docker version while keeping the k8s tools from Debian.org. E.g. contint1001 uses the Kubernetes commands; it is still on Jessie, but once it switches to Stretch that would tie it to Toolforge upgrades.

Would it be possible to separate the concerns? Maybe by uploading the docker-ce 18.06.3 to stretch-wikimedia thirdparty-ci (which currently only has Jenkins).

My primary concern is that the component was created for the Toolforge Kubernetes cluster, and I would want the versions used on CI decoupled from that.

Well, if there's a security update for Docker we'll want it for both (and same for important bugfixes), so using the same component seems actually better for consistency.

This is to avoid unattended upgrades due to Toolforge moving at a different pace.

Upgrades to this component are not unattended, they are synced manually by an SRE:

Would it be possible to separate the concerns? Maybe by uploading the docker-ce 18.06.3 to stretch-wikimedia thirdparty-ci (which currently only has Jenkins).

It creates extra effort, but I don't currently see a good justification? CI should not diverge from current docker releases to the same extent that Toolforge should.

Even if packages are manually updated on apt.wikimedia.org, whenever we create a new instance we get the version currently available on apt.wikimedia.org, which might not match the rest of the fleet. That has hit me multiple times in the past, when others introduced a newer version that was slightly different, if not outright backward incompatible. There are also components that bring in a different major version than the one provided by Debian. That is the case with thirdparty/kubeadm-k8s, which besides Docker also brings the Kubernetes command line tool, and that in turn has side effects on the deployment pipeline, which uses k8s. I would rather decouple those as well.

Down the road, whenever CI or Toolforge needs a newer version of Docker or the K8s tools, the other infrastructure is more or less forced to upgrade as well. I would rather have each team decide when and how they conduct the upgrade.

I agree we should track upstream releases closely, especially for security updates. But the upgrades themselves should be able to happen at different times on the CI and Toolforge infrastructure. And I would certainly prefer not to bring in the Kubernetes CLI tools from Toolforge when CI targets the production Kubernetes, who knows what might end up breaking :-\

Change 522434 had a related patch set uploaded (by Muehlenhoff; owner: Muehlenhoff):
[operations/puppet@production] Sync docker packages to thirdparty/ci

https://gerrit.wikimedia.org/r/522434

Change 522434 merged by Muehlenhoff:
[operations/puppet@production] Sync docker packages to thirdparty/ci

https://gerrit.wikimedia.org/r/522434

Mentioned in SAL (#wikimedia-operations) [2019-07-16T11:57:55Z] <moritzm> synched docker-ce, docker-ce-cli, containerd.io to thirdparty/ci for stretch-wikimedia (T226236)

Packages have been synched to thirdparty/ci for stretch-wikimedia.

Thanks, I can confirm the component is around and it addresses the concern of mixing up upgrades with Toolforge. However that imports 18.09.7 but we need the previous version 18.06.x for now :-\
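To make the version concern concrete, here is a minimal Python sketch of the selection a version-pinning filter would have to perform on a package index. It is not the hook used on apt.wikimedia.org. The function names `parse_stanzas` and `pin` are made up for this example, the parser ignores continuation lines, and the sample stanzas are illustrative: only the 18.06.3 version string is taken from the package filename in the task description.

```python
def parse_stanzas(text):
    """Split Debian control-style text into a list of field dicts.

    Simplified: assumes one "Key: value" pair per line, no continuation lines.
    """
    stanzas = []
    for block in text.strip().split("\n\n"):
        fields = {}
        for line in block.splitlines():
            # Split on the first colon only, so epochs such as "5:..." survive.
            key, _, value = line.partition(":")
            fields[key.strip()] = value.strip()
        stanzas.append(fields)
    return stanzas


def pin(stanzas, package, version_prefix):
    """Keep only stanzas for `package` whose Version starts with `version_prefix`."""
    return [
        s for s in stanzas
        if s.get("Package") == package
        and s.get("Version", "").startswith(version_prefix)
    ]


# Illustrative index excerpt (assumed data, not copied from the real repository).
SAMPLE = """\
Package: docker-ce
Version: 18.06.3~ce~3-0~debian

Package: docker-ce
Version: 5:18.09.7~3-0~debian-stretch

Package: docker-ce-cli
Version: 5:18.09.7~3-0~debian-stretch
"""

print(pin(parse_stanzas(SAMPLE), "docker-ce", "18.06."))
```

Run against the sample, only the 18.06.3 stanza for docker-ce is kept. A plain name-based filter, by contrast, lets whichever version the upstream suite currently offers through.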

/me curious, what's the reason for 18.06.x? Is there something that does not work with 18.09.7?

I am back from vacation!

CI currently runs 18.06. 18.09 introduces a bunch of changes I am not comfortable applying while also migrating from Jessie to Stretch. So my idea was to first do the Stretch migration, dispose of the Jenkins agent and thus help phase out Jessie. Then later look into upgrading Docker :)

Sorry, no. We're not going to intentionally downgrade to an old, known-insecure version. If there's an issue with any of the 18.06->18.09 changes affecting CI, then we should just fix them as part of the migration.

It is not about downgrading Docker, but rather about keeping the same version we are currently using on the Jessie instances. My primary intent was just to migrate to Stretch, not to have to deal with a Docker migration and more Puppet work. For example, containerd is no longer managed by Docker but by systemd, and the 18.09 Docker package is no longer provided for Jessie. It is just too risky and too slow to migrate both the OS and the Docker engine at the same time.

Anyway, I am declining this and postponing the migration to Stretch until I have the time to handle it.

Ok, but note that there's a hard deadline for the contint* migration given that it's still running Jessie: end of Q3 of the FY, so March 2020.

You should really aim for Buster. Per https://wikitech.wikimedia.org/wiki/Operating_system_upgrade_policy we'll quit using Stretch by June 2021.

A few days later I noticed jobs were slower than expected and spent a couple of days narrowing it down. It is due to the Docker version used for Stretch :-\ See T236675.

It's March 2020. As part of an attempt to unblock T224591 (which releng pinged me about recently) I made these changes:

https://gerrit.wikimedia.org/r/q/topic:%22contint2001%22+(status:open%20OR%20status:merged)

but it's blocked by T236675. How can we move forward?