Toolforge: migrate redis servers to Debian Buster or later
Closed, Resolved · Public

Description

There are a couple of redis servers that need to migrate from Debian Stretch to Debian Buster.

Event Timeline

Andrew triaged this task as Medium priority. (Apr 13 2021, 4:15 PM)
Andrew moved this task from Inbox to Soon! on the cloud-services-team (Kanban) board.
taavi renamed this task from "Toolforge: migrate redis servers to Debian Buster" to "Toolforge: migrate redis servers to Debian Buster or later". (May 16 2021, 3:50 PM)

Here's a proposed migration plan, how does it sound?

  • Create a redis.svc.tools.eqiad1.wikimedia.cloud DNS record pointing to the current redis servers and use it in the documentation so that it becomes the canonical name (T153810#7090361)
  • Wait for Bullseye to come out?
  • Figure out if we want to do this without downtime and if so, how
  • Decide that we likely don't: the downtime will likely be just a few minutes, we would still need some read-only time, and the hassle of avoiding downtime is way too much
  • Make 3 new redis instances and set them up in a Sentinel cluster
  • Create a VIP, allow the new instances to use it, and set up Keepalived
  • Announce migration to cloud mailing lists
  • Wait for maintenance window to start
  • Stop redis service on old nodes
  • Copy data file from old primary node to all new ones
  • Start new ones, starting from the one designated as initial primary in hiera
  • Ensure all replicas have the data
  • Ensure Sentinel starts and detects all replicas
  • Change DNS to point to the VIP
  • Be happy because Toolforge Redis is now running on a modern operating system and can automatically fail over in case of node failure
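
For the "Ensure all replicas have the data" step, a check along these lines could be run against the output of `redis-cli INFO replication` on the new primary before changing DNS. This is only a rough sketch, not something from the actual migration: the function names, the sample addresses and offsets, and the `max_lag_bytes` tolerance are all made up for illustration. It relies only on the standard `INFO replication` fields (`role`, `master_repl_offset`, and the `slaveN:ip=...,port=...,state=...,offset=...,lag=...` lines).

```python
from dataclasses import dataclass


@dataclass
class ReplicaStatus:
    name: str    # e.g. "slave0", as reported by the primary
    state: str   # "online" once the replica has finished syncing
    offset: int  # replication offset acknowledged by the replica


def parse_info_replication(text: str) -> dict:
    """Turn the key:value lines of `INFO replication` output into a dict."""
    fields = {}
    for line in text.splitlines():
        line = line.strip()
        # Skip blank lines and the "# Replication" section header.
        if not line or line.startswith("#"):
            continue
        key, _, value = line.partition(":")
        fields[key] = value
    return fields


def replica_statuses(fields: dict) -> list:
    """Extract the per-replica entries (slave0, slave1, ...) from parsed fields."""
    replicas = []
    for key, value in fields.items():
        if key.startswith("slave") and key[5:].isdigit():
            attrs = dict(pair.split("=", 1) for pair in value.split(","))
            replicas.append(ReplicaStatus(key, attrs["state"], int(attrs["offset"])))
    return replicas


def replicas_caught_up(text: str, expected_replicas: int, max_lag_bytes: int = 0) -> bool:
    """Return True if this node is a primary and every expected replica is
    online and within max_lag_bytes of the primary's replication offset."""
    fields = parse_info_replication(text)
    if fields.get("role") != "master":
        return False
    primary_offset = int(fields["master_repl_offset"])
    replicas = replica_statuses(fields)
    if len(replicas) != expected_replicas:
        return False
    return all(
        r.state == "online" and primary_offset - r.offset <= max_lag_bytes
        for r in replicas
    )


# Hypothetical sample output; the addresses and offsets are placeholders.
SAMPLE = """\
# Replication
role:master
connected_slaves:2
slave0:ip=192.0.2.11,port=6379,state=online,offset=1400,lag=0
slave1:ip=192.0.2.12,port=6379,state=online,offset=1400,lag=1
master_repl_offset:1400
"""

if __name__ == "__main__":
    print(replicas_caught_up(SAMPLE, expected_replicas=2))
```

With three new instances, the primary would be expected to report two replicas. A similar comparison could presumably be made against what Sentinel reports, but that is not covered here.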

Mentioned in SAL (#wikimedia-cloud) [2022-01-30T14:22:29Z] <taavi> creating a cluster of 3 bullseye redis hosts for T278541

Mentioned in SAL (#wikimedia-cloud) [2022-01-30T14:41:12Z] <taavi> created a neutron port with ip 172.16.2.46 for a service ip for toolforge redis automatic failover T278541

Mentioned in SAL (#wikimedia-cloud) [2022-05-03T08:20:11Z] <taavi> redis: start replication from the old cluster to the new one (T278541)

Change 788675 had a related patch set uploaded (by Majavah; author: Majavah):

[operations/puppet@production] openstack: update tools-redis to a 'new' style name

https://gerrit.wikimedia.org/r/788675

Change 788675 merged by David Caro:

[operations/puppet@production] openstack: update tools-redis to a 'new' style name

https://gerrit.wikimedia.org/r/788675

Mentioned in SAL (#wikimedia-cloud) [2022-05-22T17:04:50Z] <taavi> failover tools-redis to the updated cluster T278541

Deleted the VMs. Closing.