In T104826#1438122, @mmodell wrote: I can imagine a few scenarios where a lack of locking/coordination among deployers could cause this to go horribly wrong.
For one thing, a sync from mira could clobber any in-progress work being done on tin (and vice versa).
We currently avoid conflicting deployment work by following a schedule, manually checking for other logged-in users on tin, and coordinating via IRC. However, none of these methods is ideal, and checking for other logged-in users is easily defeated when there are two deployment servers.
Shouldn't we create some sort of flag / mutex that must be obtained on both servers before beginning deployment work, and is then released at the end of a sync?
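To make the "obtain on both servers, release at the end" idea concrete, here is a minimal sketch. It is not taken from any existing deployment tooling: `deploy_lock`, `DeployLockHeld`, and the `deploy.lock` file name are hypothetical, and plain local directories stand in for the lock locations on tin and mira. The point is the shape: take every flag in a fixed order, and give back whatever was taken if any one of them is already held.

```python
import os
from contextlib import contextmanager


class DeployLockHeld(Exception):
    """Raised when another deployer already holds one of the flags."""


@contextmanager
def deploy_lock(lock_dirs, owner, name="deploy.lock"):
    """Hold a lock file in every directory in lock_dirs, or in none of them."""
    held = []
    try:
        # Always lock in the same (sorted) order so two deployers
        # cannot each end up holding one half of the pair.
        for directory in sorted(lock_dirs):
            path = os.path.join(directory, name)
            try:
                # O_EXCL makes creation fail if the file already exists.
                fd = os.open(path, os.O_CREAT | os.O_EXCL | os.O_WRONLY)
            except FileExistsError:
                raise DeployLockHeld(path) from None
            with os.fdopen(fd, "w") as handle:
                handle.write(owner)  # record who holds the flag
            held.append(path)
        yield
    finally:
        # Release only what this caller managed to take, newest first.
        for path in reversed(held):
            os.remove(path)
```

A deployer would wrap the whole session in `with deploy_lock([...], "username"):` so the flags are dropped when the sync finishes or fails. Lock files on two separate hosts would still need some remote mechanism to create and remove them, which this sketch does not cover.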
Event Timeline
We might be able to use the new etcd infrastructure for this. Ideally we would set up something that not only keeps multiple masters from running a sync operation at the same time, but also keeps multiple deployers from modifying the state of /srv/mediawiki-staging concurrently (both on the same host and across masters).
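As a rough illustration of what a shared store would buy us, the sketch below models the two primitives such a lock relies on: an atomic "create only if absent" write and a TTL so that a crashed deployer cannot hold the lock forever. `FakeStore` is an in-memory stand-in and `GlobalLock` is a hypothetical wrapper; neither is the python-etcd API, and the key name and TTL are made up. A real version would replace `FakeStore` with conditional writes against etcd.

```python
import time
import uuid


class FakeStore:
    """In-memory stand-in for a key-value store with atomic create and TTL."""

    def __init__(self, clock=time.monotonic):
        self._data = {}  # key -> (value, expiry time)
        self._clock = clock

    def create(self, key, value, ttl):
        """Set key only if it is absent or expired; return True on success."""
        current = self._data.get(key)
        if current is not None and current[1] > self._clock():
            return False
        self._data[key] = (value, self._clock() + ttl)
        return True

    def delete_if_equal(self, key, value):
        """Delete key only if it still holds value; return True on success."""
        current = self._data.get(key)
        if current is None or current[0] != value:
            return False
        if current[1] <= self._clock():
            return False  # the lease already expired
        del self._data[key]
        return True


class GlobalLock:
    """One lock key shared by every deployer on every master."""

    def __init__(self, store, key, ttl=3600):
        self.store, self.key, self.ttl = store, key, ttl
        self.token = uuid.uuid4().hex  # identifies this particular holder

    def acquire(self):
        return self.store.create(self.key, self.token, self.ttl)

    def release(self):
        # Compare the token so one deployer cannot release another's lock.
        return self.store.delete_if_equal(self.key, self.token)
```

Because every deployer contends for the same key regardless of which host they are on, one lock covers both the same-host and the cross-master case. The TTL is a trade-off: too short and a long sync loses its lock midway, too long and a crashed session blocks everyone until it expires.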
See https://github.com/jplana/python-etcd/pull/114 for support in python-etcd for global locks.