Migrate deployment-prep away from Debian Jessie to Debian Stretch/Buster
Closed, Resolved (Public)

Description

Ubuntu Trusty is gone (at least, from our project) and Debian Jessie instance creation just got disabled (see T218119: Disable jessie VM creation in VPS).
Therefore the following instances are not reproducible in their current state. If they get lost to a hardware failure and are not able to be set up on stretch, the service they ran may be SOL.
So it's time to begin migrating our 34 Jessie instances towards Stretch.
You'll notice a Buster prerelease image is available to deployment-prep alongside Stretch. Please don't use this unless production is running the same service on buster, or you are setting up a fresh service that will be on buster when deployed to production, or you are working on migrating the production service to buster. Buster is now released and available publicly, go nuts.

The following deployment-prep instances are running Jessie, as of 2021-03-19:

Name | Status | Task | Notes
deployment-logstash2 | | T238707 | Underway, puppet challenges
deployment-restbase01 | | T250574 | pending deletion
deployment-restbase02 | | T250574 | pending deletion
deployment-sca01 | | ??? | Pending deletion. Services on them are too old and broken to be useful according to the SREs, and have no maintainers. Ideally those should be moved to a Beta Cluster k8s cluster(?) (T276650)
deployment-sca02 | | ??? | Pending deletion. Services on them are too old and broken to be useful according to the SREs, and have no maintainers. Ideally those should be moved to a Beta Cluster k8s cluster(?) (T276650)

Related Objects

Status | Subtype | Assigned | Task
Invalid | | None |
Open | | None |
Resolved | BUG REPORT | Bstorm |
Resolved | | MoritzMuehlenhoff |
Duplicate | | None |
Resolved | | None |
Resolved | | Eevans |
Resolved | | Krenair |
Resolved | | Krenair |
Resolved | | elukey |
Resolved | | None |
Declined | | None |
Declined | | None |
Resolved | | None |
Resolved | | Jdforrester-WMF |
Resolved | | Krenair |
Resolved | | thcipriani |
Resolved | | taavi |
Resolved | | taavi |
Resolved | | hashar |
Resolved | | taavi |
Resolved | | taavi |
Resolved | | taavi |

Event Timeline

Krenair renamed this task from Migrate away from Debian Jessie to Debian Stretch to Migrate away from Debian Jessie to Debian Stretch/Buster. Jul 9 2019, 10:57 PM
Krenair updated the task description.
Krenair renamed this task from Migrate away from Debian Jessie to Debian Stretch/Buster to Migrate deployment-prep away from Debian Jessie to Debian Stretch/Buster. Sep 12 2019, 11:28 PM

@fgiunchedi Do we need to do anything else to get rid of deployment-logstash2 and use deployment-logstash03 instead? logstash2 now has a puppet error due to https://gerrit.wikimedia.org/r/c/operations/puppet/+/522406 (T198092)

If deployment-logstash03 has the same classes applied as deployment-logstash2 and no puppet errors, I'd say the next step would be to switch producers to use deployment-logstash03 and to point the logstash-beta.wmflabs.org proxy at it. It might help with T233134: logstash-beta.wmflabs.org does not receive any mediawiki events too.

(I've removed most subscribers that came from the merger of T236575 to avoid spamming half of the Wikimedia technical community. In the meantime, the logstash discussion moved to T238707 and I need to follow up.)
Note from https://wikitech.wikimedia.org/wiki/News/Jessie_deprecation#Cloud_VPS_projects:

In December 2019, deadline. Evaluate if Jessie VMs not migrated are actually in use and why they weren't migrated.
In January 2020, shutdown all Jessie instances (unless special arrangements have been made for extension of deadline).

FYI @bd808 the most common reason some of these haven't been migrated is probably that I overestimated the number of services that would be running inside containers in production by this time. Other reasons include things like the production equivalent hosts not yet having been upgraded.
For deployment-prep I would like to request an exemption up to the equivalent deadline for prod (end of security support probably?)

Ack. I assumed that deployment-prep would be among the last projects to fully remove Jessie. Thanks for all your hard work coordinating things here @Krenair.

Just to remind everyone: We have one week left before the following would be running an unsupported OS, and I don't know exactly what the policy will be:
deployment-imagescaler01 (see subtask)
deployment-cpjobqueue
deployment-fluorine02 (prod equivalent is jessie still)
deployment-restbase02 (being replaced)
deployment-restbase01 (being replaced)
deployment-sca04
deployment-sca01
deployment-etcd-01
deployment-sca02
deployment-memc05 (prod seems to have a lot of jessie memc boxes still)
deployment-memc04 (prod seems to have a lot of jessie memc boxes still)
deployment-memc07 (purpose? see subtask)
deployment-sentry01
deployment-memc06 (purpose? see subtask)
deployment-logstash2 (made a replacement for this but something was wrong, I think someone decided to just resurrect this instead of fix it)
deployment-mcs01
deployment-ircd (prod equivalent is jessie still)
deployment-changeprop

We're coming up on a year since Jessie support ended. Here are the remaining Jessie VMs in deployment-prep:

deployment-etcd-01.deployment-prep.eqiad1.wikimedia.cloud
deployment-fluorine02.deployment-prep.eqiad1.wikimedia.cloud
deployment-ircd.deployment-prep.eqiad1.wikimedia.cloud
deployment-mcs01.deployment-prep.eqiad1.wikimedia.cloud
deployment-memc[04-07].deployment-prep.eqiad1.wikimedia.cloud
deployment-restbase[01-02].deployment-prep.eqiad1.wikimedia.cloud
deployment-sca[01-02].deployment-prep.eqiad1.wikimedia.cloud

On the one-year anniversary of Jessie's EOL (June 30, 2021), I'll delete these unless there's some indication of activity by then.
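
The list above uses bracketed ranges such as `memc[04-07]` as shorthand for several VMs. As a rough illustration of what that shorthand covers, here is a minimal Python sketch that expands the ranges into individual hostnames. The `expand_hosts` helper and the `remaining` variable are hypothetical names made up for this example; they are not part of any Wikimedia tooling, and the sketch assumes the ranges are zero-padded decimal numbers as they appear in the list.

```python
import re
from typing import List

# Matches one bracketed numeric range such as "[04-07]".
_RANGE = re.compile(r"\[(\d+)-(\d+)\]")


def expand_hosts(pattern: str) -> List[str]:
    """Expand every "[NN-MM]" range in a host pattern (hypothetical helper)."""
    match = _RANGE.search(pattern)
    if match is None:
        return [pattern]  # nothing to expand
    start, end = match.group(1), match.group(2)
    width = len(start)  # preserve zero padding, e.g. "04"
    hosts: List[str] = []
    for n in range(int(start), int(end) + 1):
        expanded = (
            pattern[: match.start()] + str(n).zfill(width) + pattern[match.end():]
        )
        # Recurse in case the pattern contains further ranges.
        hosts.extend(expand_hosts(expanded))
    return hosts


# The remaining Jessie VMs, copied from the list in this comment.
remaining = [
    "deployment-etcd-01.deployment-prep.eqiad1.wikimedia.cloud",
    "deployment-fluorine02.deployment-prep.eqiad1.wikimedia.cloud",
    "deployment-ircd.deployment-prep.eqiad1.wikimedia.cloud",
    "deployment-mcs01.deployment-prep.eqiad1.wikimedia.cloud",
    "deployment-memc[04-07].deployment-prep.eqiad1.wikimedia.cloud",
    "deployment-restbase[01-02].deployment-prep.eqiad1.wikimedia.cloud",
    "deployment-sca[01-02].deployment-prep.eqiad1.wikimedia.cloud",
]

all_hosts = [host for pattern in remaining for host in expand_hosts(pattern)]
for host in all_hosts:
    print(host)
```

Expanded this way, the seven entries correspond to twelve individual VMs.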

deployment-mcs01 is obsolete, superseded by deployment-docker-mobileapps01. I'll shut it down momentarily.

I think we can simply remove deployment-sca01/sca02? The respective hosts in production have been removed (hardware is still up, but services gone) and removal doesn't need to wait for an eventual k8s installation in beta.

Thank you for all your work on this, @Majavah !

I think a few services are still running there, at least apertium and the recommendation API, according to a quick codesearch and a look at the deployment-prep proxy list.

In T218729#6907904, @Majavah wrote:

I think a few services are still running there, at least apertium and the recommendation API, according to a quick codesearch and a look at the deployment-prep proxy list.

At this point the scb role, which is applied to the sca* hosts, only provides apertium and graphoid, so the recommendation API config is stale. The entire scb role is going to be removed very soon anyway, so I'd propose to ditch the sca* hosts and revisit this once there's some k8s setup for beta.

Mentioned in SAL (#wikimedia-releng) [2021-03-19T12:48:11Z] <Majavah> shutdown deployment-sca*, services on them are too old and broken to be useful according to the SREs, have no maintainers and the hosts are running Jessie, T218729

Mentioned in SAL (#wikimedia-releng) [2021-03-26T07:05:34Z] <Majavah> delete remaining shutdown deployment-prep jessies: deployment-sca[01-02], deplyoment-logstash2 (T218729)