As a follow-up to T351507: VMs in Cloud VPS share the same machine-id and T400223: Investigate daily disconnections of IRC bots hosted in Toolforge, we must make sure that /etc/machine-id is actually unique across all Cloud VPS VMs.
The process is simple: rm /etc/machine-id, then either run systemd-machine-id-setup and restart the affected daemons, or reboot. The problematic case we have seen so far is systemd-networkd using the same DHCP client ID on multiple VMs, which left leases unable to be renewed.
Action plan:
- Audit the machine-id of all VMs and identify which ones need to be fixed.
- Decide what to do with un-auditable VMs, i.e. those that are either shut down or up but inaccessible to cumin. See T402185: Audit and potentially fix VMs not reachable by cloudcumin root key
- Proceed to fix the problem in batches, perhaps after announcing it to users beforehand. We have observed a brief bounce of network connections when systemd-networkd is restarted, and no ill side effects other than that.
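
The audit step could be sketched roughly as follows. This is only an illustration, not the tooling actually used: it assumes the machine-ids have already been collected into lines of the form "<hostname> <machine-id>" (for example by reading /etc/machine-id on each VM), and the function name and sample hostnames are hypothetical.

```python
from collections import defaultdict


def find_duplicate_machine_ids(report_lines):
    """Group hostnames by machine-id; return only ids shared by more than one VM.

    Assumed input format (not from any real tool): "<hostname> <machine-id>" per line.
    Blank lines and lines starting with "#" are skipped.
    """
    hosts_by_id = defaultdict(list)
    for line in report_lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        hostname, machine_id = line.split()
        # Normalise case so the same id is not counted twice.
        hosts_by_id[machine_id.lower()].append(hostname)
    return {
        machine_id: sorted(hosts)
        for machine_id, hosts in hosts_by_id.items()
        if len(hosts) > 1
    }


if __name__ == "__main__":
    # Hypothetical sample data, for illustration only.
    sample = [
        "vm-a.example 0123456789abcdef0123456789abcdef",
        "vm-b.example 0123456789abcdef0123456789abcdef",
        "vm-c.example fedcba9876543210fedcba9876543210",
    ]
    for machine_id, hosts in find_duplicate_machine_ids(sample).items():
        print(machine_id, ", ".join(hosts))
```

Every VM listed under a shared id would be a candidate for the fix; VMs missing from the input entirely fall into the un-auditable group.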