Two QoL improvements for this cookbook:
- Add a start-datetime flag. Any host that has already been rebooted since the start-datetime should be skipped. This allows resuming work after the cookbook fails or is paused by the operator, without unnecessarily rebooting hosts that were already rebooted.
- Add a log line telling the operator when the cookbook can safely be killed (i.e. between batches).
We can use the sre.elasticsearch.rolling-operation cookbook as a model for these two changes.
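A rough sketch of what the two changes could look like, in plain Python. This is not taken from either cookbook: `parse_start_datetime`, `hosts_needing_reboot`, `run_batches`, and the `last_boot_times` mapping are hypothetical names, and how the last boot time of each host is actually collected is left out.

```python
from datetime import datetime, timezone
import logging

logger = logging.getLogger(__name__)


def parse_start_datetime(value):
    """Parse an ISO 8601 string; naive values are assumed to be UTC (assumption)."""
    parsed = datetime.fromisoformat(value)
    if parsed.tzinfo is None:
        parsed = parsed.replace(tzinfo=timezone.utc)
    return parsed


def hosts_needing_reboot(last_boot_times, start_datetime):
    """Return hosts whose last boot predates start_datetime.

    last_boot_times is a hypothetical {hostname: timezone-aware datetime} mapping.
    """
    return sorted(
        host for host, booted in last_boot_times.items() if booted < start_datetime
    )


def run_batches(batches, reboot_batch, log=logger.info):
    """Reboot each batch, then tell the operator it is safe to stop."""
    for index, batch in enumerate(batches, start=1):
        reboot_batch(batch)
        log(
            f"Batch {index}/{len(batches)} done. "
            "It is safe to kill the cookbook now, before the next batch starts."
        )
```

On a resumed run, the operator would pass the start-datetime of the original run, so only hosts that have not rebooted since then are batched again.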
Also tack on the following:
- Fix the (I believe) erroneous use of math.floor instead of math.ceil, which results in the batch size not being respected in some circumstances.
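To illustrate how the rounding choice could matter, here is a minimal sketch. It assumes the number of batches is derived from the host count divided by the batch size and that hosts are then spread evenly across that many batches; I have not confirmed that the cookbook works exactly this way, and `split_into_batches` is a hypothetical helper.

```python
import math


def split_into_batches(hosts, batch_size, rounding=math.ceil):
    """Spread hosts evenly over rounding(len(hosts) / batch_size) batches."""
    n_batches = max(1, rounding(len(hosts) / batch_size))
    base, extra = divmod(len(hosts), n_batches)
    batches = []
    start = 0
    for i in range(n_batches):
        # The first `extra` batches take one additional host.
        size = base + (1 if i < extra else 0)
        batches.append(hosts[start:start + size])
        start += size
    return batches


hosts = [f"host{i}" for i in range(10)]
floor_sizes = [len(b) for b in split_into_batches(hosts, 4, rounding=math.floor)]
ceil_sizes = [len(b) for b in split_into_batches(hosts, 4, rounding=math.ceil)]
print(floor_sizes)  # [5, 5]: two batches, each larger than the requested size of 4
print(ceil_sizes)   # [4, 3, 3]: three batches, none larger than 4
```

Under these assumptions, rounding down produces too few batches whenever the host count is not a multiple of the batch size, so at least one batch ends up larger than requested.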