I noticed that cloudgw's keepalived daemon flaps when the primary server is rebooted:
- primary server A, secondary server B
- reboot A, B takes over as primary
- A boots, takes over as primary
This adds an unnecessary transition. Instead, the sequence should be:
- primary server A, secondary server B
- reboot A, B takes over as primary
- A boots, nothing else happens, B stays as primary and A is secondary.
The extra transition could cause instability in the network, because when A takes over after booting, the conntrack information might not be synced yet.
We currently configure keepalived with nopreempt and initial state BACKUP, so I suspect there is a bug somewhere.
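
For reference, here is a minimal sketch of the kind of vrrp_instance block that setup implies. This is not the actual cloudgw configuration: the instance name, interface, virtual_router_id, priority, and address are placeholders I made up for illustration. Only `state BACKUP` and `nopreempt` come from the description above.

```
# Illustrative only: all names and values except "state BACKUP" and
# "nopreempt" are placeholders, not the real cloudgw settings.
vrrp_instance VRRP_EXAMPLE {
    state BACKUP              # initial state; nopreempt requires BACKUP here
    interface eth0            # placeholder interface
    virtual_router_id 51      # placeholder VRID, must match on both servers
    priority 100              # placeholder; the peer would use a different value
    nopreempt                 # a higher-priority node should not take over from a working master
    advert_int 1              # placeholder advertisement interval in seconds
    virtual_ipaddress {
        192.0.2.10/24         # placeholder address (documentation range)
    }
}
```

With a block like this on both servers, I would expect A to come back as BACKUP after the reboot and stay there for as long as B keeps sending advertisements, which is why the takeover by A looks like a bug.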