== Conclusions ==
The easiest solution from a development perspective would be to use [[ http://redis.io/topics/partitioning#redis-cluster | Redis Cluster ]], which is available in Redis >= 3.0.0 and is recommended by the Redis authors as the best current practice. It transparently supports sharding, automatic failover, and node migration. Unfortunately, this has not yet landed in Ubuntu's stable distro, and I can't find any examples of successful production deployment, so it's out of the question for now.
Instead, we're going to shard manually and use simple master-slave replication. The topology will be:
* Payments 1001-1003: Each run a Redis master.
* Payments 1004: Runs three Redis slaves, one for each master.
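A minimal sketch of the slave side of this topology, assuming the masters listen on the default port 6379 and the three slave instances on payments1004 are distinguished by port (all hostnames, ports, and file paths here are illustrative, not our actual config):

```
# payments1004 runs one Redis instance per master, each on its own port.

# /etc/redis/redis-shard1.conf
port 6381
slaveof payments1001 6379

# /etc/redis/redis-shard2.conf
port 6382
slaveof payments1002 6379

# /etc/redis/redis-shard3.conf
port 6383
slaveof payments1003 6379
```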
The following changes will need to be made to client code:
* The DonationInterface frontends will read and write Redis on localhost, each box maintaining its own limbo. If the completed-transaction hook cannot find a limbo entry for the donation being processed, it will check whether the data is available on any of the three slaves. Failing to pop from the FIFO queue during transaction completion is fine; the orphan slayer is already robust against that.
* The orphan slayer will pop from each master, or from its slave if the master isn't available, going round-robin and taking one message from each queue.
* When we read from slaves, we won't be able to pop the record from the FIFO queue, which means the orphan slayer needs some mechanism to avoid repeatedly chasing its tail. The slayer should probably only use slaves for actual read operations, and not substitute a slave read for a master pop the way the frontend does; otherwise, I'm afraid the answer might be to run another Redis master on the slayer box which tracks keys that have been read but not popped.
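The frontend's fallback lookup described above can be sketched as follows, using plain dicts to stand in for the local Redis master and the three slaves (all names here are illustrative, not existing DonationInterface code):

```python
# Sketch of the completed-transaction hook's lookup order: local master
# first, then each slave in turn. Dicts stand in for Redis instances.

def lookup_limbo(order_id, local_master, slaves):
    """Return the limbo entry for order_id, or None if it's nowhere to be found."""
    # First try the box's own master on localhost.
    entry = local_master.get(order_id)
    if entry is not None:
        return entry
    # Fall back to checking each of the three slaves in turn.
    for slave in slaves:
        entry = slave.get(order_id)
        if entry is not None:
            return entry
    return None

local = {}
slaves = [{"limbo-42": {"amount": "10.00"}}, {}, {}]
print(lookup_limbo("limbo-42", local, slaves))  # found on a slave
```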
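The slayer's round-robin draining can be sketched like this, with lists standing in for the per-shard FIFO queues (the function name and structure are illustrative, not the actual orphan slayer code):

```python
# Sketch of the orphan slayer's round-robin pop: one message from each
# shard's queue per pass, so no single shard starves the others.

def round_robin_pop(queues):
    """Yield messages one shard at a time until every queue is drained."""
    while any(queues):
        for q in queues:
            if q:                # skip shards that are already empty
                yield q.pop(0)   # FIFO: oldest message first

shards = [["a1", "a2"], ["b1"], ["c1", "c2", "c3"]]
print(list(round_robin_pop(shards)))
# → ['a1', 'b1', 'c1', 'a2', 'c2', 'c3']
```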
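If the slayer did have to fall back to slave reads, the "track keys read but not popped" idea could look roughly like the following, sketched with an in-memory set (in practice this state would live in Redis; all names here are hypothetical):

```python
# One possible tail-chase guard: the slayer remembers keys it has read
# from a slave but could not pop from the unavailable master, and skips
# them on later passes so it never processes the same record twice.

def read_unpopped(slave_queue, already_read):
    """Return the next not-yet-seen message from a slave's queue copy."""
    for key in slave_queue:          # we can't pop, so scan in FIFO order
        if key not in already_read:
            already_read.add(key)
            return key
    return None                      # everything has been seen already

seen = set()
queue = ["msg1", "msg2"]
print(read_unpopped(queue, seen))  # → msg1
print(read_unpopped(queue, seen))  # → msg2
print(read_unpopped(queue, seen))  # → None
```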