Use encrypted rsync for releases
Closed, Resolved, Public

Description

While auditing codfw/eqiad traffic during the switchover (T286038), I came across plaintext rsync for the releases hosts. Please consider switching to encrypted rsync.

Event Timeline

fgiunchedi triaged this task as Medium priority. Aug 30 2021, 7:50 AM

T289857: Use encrypted rsync for deployment::rsync has some notes on how to enable stunnel for this. However, the MW-on-K8s image building process also performs an rsync against the releases host, so it might need an update to use stunnel as well.

All files sent to releases are meant to be available to the world though. Does it still matter to encrypt traffic internally for something like this?

IMHO yes, we should encrypt traffic unless we have a reason not to (e.g. the system is going to be retired, or the implementation is too hard/complex relative to the advantages).

I don't think this should be considered a blocker for T327920: March 2023 Datacenter Switchover.
However, we should address it for mw-on-k8s and releases.

Edit: I may have misunderstood. Was this unencrypted cross-datacenter traffic?

Tagging collab, since we are probably the ones who need to get back to this nowadays. Sorry for the delay; this slipped off the radar.

LSobanski lowered the priority of this task from Medium to Low. Nov 24 2025, 4:28 PM
LSobanski moved this task from Incoming to Backlog on the collaboration-services board.

Change #1217572 had a related patch set uploaded (by Dzahn; author: Dzahn):

[operations/puppet@production] releases: add stunnel to rsync data copy

https://gerrit.wikimedia.org/r/1217572

Change #1217572 merged by Dzahn:

[operations/puppet@production] releases: add stunnel to rsync data copy

https://gerrit.wikimedia.org/r/1217572

Change #1217594 had a related patch set uploaded (by Dzahn; author: Dzahn):

[operations/puppet@production] releases: use stunnel with rsync from deployment server

https://gerrit.wikimedia.org/r/1217594

There are several different "rsyncs" involved here.

This is now resolved, using stunnel, for the data transfers from one releases server to another. There are two different ones!

Then there is another one where releases hosts pull from the deployment host. The second patch for that is up for review but not deployed yet.

I need to double-check whether the deployment host works with this. scap::master declares a couple of rsync::server::module resources, but it does not use the rsync::quickdatacopy abstraction, which offers the servers_uses_stunnel parameter.

Change #1217594 merged by Dzahn:

[operations/puppet@production] releases: use stunnel with rsync from deployment server

https://gerrit.wikimedia.org/r/1217594

Mentioned in SAL (#wikimedia-operations) [2025-12-15T18:10:56Z] <dzahn@cumin2002> DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on releases1003.eqiad.wmnet with reason: T289858

Mentioned in SAL (#wikimedia-operations) [2025-12-15T18:12:21Z] <dzahn@cumin2002> DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on releases2003.codfw.wmnet with reason: T289858

Change #1218348 had a related patch set uploaded (by Dzahn; author: Dzahn):

[operations/puppet@production] deployment::server: allow releases hosts encrypted rsync

https://gerrit.wikimedia.org/r/1218348

Change #1218348 merged by Dzahn:

[operations/puppet@production] deployment::server: allow releases hosts encrypted rsync

https://gerrit.wikimedia.org/r/1218348

Mentioned in SAL (#wikimedia-operations) [2025-12-15T21:44:29Z] <dzahn@cumin2002> DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on releases2003.codfw.wmnet with reason: T289858

Mentioned in SAL (#wikimedia-operations) [2025-12-15T21:44:52Z] <dzahn@cumin2002> DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on releases1003.eqiad.wmnet with reason: T289858

Mentioned in SAL (#wikimedia-operations) [2025-12-15T22:23:19Z] <dzahn@cumin2002> DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on releases1003.eqiad.wmnet with reason: T289858

Mentioned in SAL (#wikimedia-operations) [2025-12-15T22:23:33Z] <dzahn@cumin2002> DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on releases2003.codfw.wmnet with reason: T289858

File transfers to and between releases servers are now encrypted.
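
For anyone who wants to spot-check a similar setup, here is a minimal sketch in Python, not the procedure used for this task. It relies on the fact that a plaintext rsync daemon greets a new connection with a line starting with `@RSYNCD:`, while a TLS endpoint such as stunnel stays silent until the client sends its hello. The function names are made up for illustration, and the host and port are whatever you pass on the command line. An empty banner only shows that the port is not answering as a plaintext rsync daemon; it does not prove that TLS is configured correctly.

```python
import socket
import sys

# A plaintext rsync daemon opens every connection with "@RSYNCD: <version>".
RSYNC_GREETING = b"@RSYNCD:"


def is_plaintext_rsync_banner(banner: bytes) -> bool:
    """Return True if the bytes look like a plaintext rsync daemon greeting."""
    return banner.startswith(RSYNC_GREETING)


def read_banner(host: str, port: int, timeout: float = 3.0) -> bytes:
    """Connect and return whatever the server sends first (possibly nothing)."""
    with socket.create_connection((host, port), timeout=timeout) as sock:
        try:
            return sock.recv(64)
        except socket.timeout:
            # A TLS endpoint waits for the client hello, so silence is expected.
            return b""


if __name__ == "__main__":
    # Usage: python check_rsync.py <host> <port>
    host, port = sys.argv[1], int(sys.argv[2])
    banner = read_banner(host, port)
    if is_plaintext_rsync_banner(banner):
        print(f"{host}:{port} answers as a plaintext rsync daemon")
    else:
        print(f"{host}:{port} sent no plaintext rsync greeting")
```

Keeping the banner check separate from the network call means the classification logic can be exercised without touching any host.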