Mailman will be down for approximately 2 hours on Tuesday, 18th of June, from 10:00 UTC to 12:00 UTC to facilitate migration to a new host. Mailing list delivery and the web archives will be stopped for most of the window, and search may be intermittently unavailable for several hours. Mail sent during the window will be delayed and delivered later; it should not be lost.
The rough outline for migration is:
1: stop mail arriving inbound, wait for queues to clear out
2: migrate data, VIPs and service from old host to new host
3: run the [[ https://docs.mailman3.org/en/latest/upgrade-guide.html#upgrade-from-3-3-1-to-3-3-1 | required upgrade steps ]]
4: test web UI on new host
5: allow mail to arrive inbound
A more detailed step-by-step plan for migrating from the old host to the new host (lists1001 -> lists1004):
Prep:
- [] Merge puppet change to block incoming mail on lists1001 and lists1004
- [] Ensure the queue is empty on lists1001; the count should be 0 (lists1001: `sudo find /var/lib/mailman3/queue/{in,out} -type f | wc -l`)
- [] Stop mailman on lists1001 (lists1001: `sudo systemctl stop mailman3; sudo systemctl stop mailman3-web`)
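The queue check in the prep steps can be wrapped in a small helper so that "empty" is an exit status rather than a number to read. This is only a sketch: the function names `queue_file_count` and `queue_is_empty` are hypothetical, and it assumes that a drained queue means no regular files remain under the queue directories.

```shell
# Hypothetical helper: count regular files under the given queue directories.
# Directories themselves are not counted, so a drained queue yields 0.
queue_file_count() {
    find "$@" -type f | wc -l | tr -d ' '
}

# Hypothetical helper: succeed only if every directory exists and holds no files.
queue_is_empty() {
    for dir in "$@"; do
        [ -d "$dir" ] || { echo "missing queue directory: $dir" >&2; return 2; }
    done
    [ "$(queue_file_count "$@")" -eq 0 ]
}
```

In a root shell on lists1001 this could be used as `queue_is_empty /var/lib/mailman3/queue/in /var/lib/mailman3/queue/out && echo "queues drained"`.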
Migrate:
- [] Ensure data is synced from lists1001 to lists1004/lists2001 (`sudo /usr/local/sbin/sync-var-lib-mailman`)
- [] Merge CR migrating VIPs from lists1001, and switching primary host to lists1004 (https://gerrit.wikimedia.org/r/c/operations/puppet/+/1036610)
- [] Run puppet agent on lists1001, ensure VIPs are removed and exim4 config does **not** contain the lists VIPs for routing mail (lists1001: `sudo grep 208.80.154.21 /etc/exim4/exim4.conf`)
- [] Run puppet agent on lists1004, ensure VIPs are added and exim4 config **does** contain the lists VIPs (lists1004: `sudo grep 208.80.154.21 /etc/exim4/exim4.conf`)
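The two exim4 checks above are the same grep with opposite expectations, which is easy to misread under time pressure. A possible sketch of a helper that makes the expectation explicit follows; `expect_vip` is a hypothetical name, and the use of `grep -wF` (fixed string, whole word) is an assumption chosen so that `208.80.154.21` does not also match an address such as `208.80.154.210`.

```shell
# Hypothetical helper: assert that an address is present in, or absent from,
# a config file. Usage: expect_vip present|absent ADDRESS FILE
expect_vip() {
    # Refuse to answer if the file cannot be read, so that a missing file
    # is never mistaken for "address absent".
    [ -r "$3" ] || { echo "cannot read: $3" >&2; return 2; }
    if grep -qwF -- "$2" "$3"; then
        [ "$1" = "present" ]
    else
        [ "$1" = "absent" ]
    fi
}
```

Run as root, this would be `expect_vip absent 208.80.154.21 /etc/exim4/exim4.conf` on lists1001 and `expect_vip present 208.80.154.21 /etc/exim4/exim4.conf` on lists1004.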
Post-upgrade:
- [] Run the following post-upgrade steps on the new host, lists1004:
-- [] `mailman-web migrate`
-- [] `mailman-web compress`
-- [] `mailman-web collectstatic`
-- [] `mailman-web compilemessages`
-- [] `mailman-web rebuild_index` (may not be needed, test if archive search works before running this)
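If it helps, the post-upgrade steps above could be run as one sequence that stops at the first failure. This is a sketch only: `run_post_upgrade` and the `MAILMAN_WEB` override are hypothetical, introduced here so the sequence can be dry-run. `rebuild_index` is deliberately left out, since it should only be run if archive search turns out to be broken.

```shell
# Hypothetical wrapper: run the mailman-web post-upgrade subcommands in order
# and stop at the first one that fails.
# MAILMAN_WEB (an assumption, not an existing setting) overrides the command,
# e.g. MAILMAN_WEB="sudo mailman-web" or MAILMAN_WEB=echo for a dry run.
run_post_upgrade() {
    for step in migrate compress collectstatic compilemessages; do
        echo "running: ${MAILMAN_WEB:-mailman-web} $step"
        # Left unquoted on purpose so a multi-word override splits into words.
        ${MAILMAN_WEB:-mailman-web} "$step" || { echo "failed: $step" >&2; return 1; }
    done
}
```

A dry run with `MAILMAN_WEB=echo run_post_upgrade` prints the four subcommands without executing them.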
Restore:
- [] Start mailman-web on lists1004 and verify (lists1004: `sudo systemctl start mailman-web`)
- [] Test mail delivery locally
- [] Merge puppet change to unblock incoming mail on lists1004
- [] Re-enable puppet on all hosts (cumin: `sudo cumin 'A:lists' 'sudo puppet agent --enable'`)
Rolling back:
We can undo this at any point before mail is allowed to arrive on the new host, by reverting the puppet change that migrates the VIPs and service. After that point, some mail may have been sent to exim but not yet delivered; we will deal with this as it comes.