Quoting John Lewis: "hold lists.wikimedia.org with exim (disable puppet on sodium; apply locally rather than via operations/puppet, unless we want to hold all emails to fermium as well for 'safety'?)"
Description
Details
| Subject | Repo | Branch | Lines +/- |
|---|---|---|---|
| lists: hold mail to lists.wm.o | operations/puppet | production | +3 -0 |

| Status | Assigned | Task |
|---|---|---|
| Resolved | faidon | T84041 Replace all instances of lighttpd with nginx |
| Resolved | faidon | T84053 mailman - replace lighttpd |
| Resolved | LSobanski | T111653 Encrypt all the things |
| Resolved | faidon | T82576 Enable STARTTLS (both inbound and outbound) on lists |
| Resolved | JanZerebecki | T55259 Add Forward Secrecy to all HTTPS sites |
| Resolved | Dzahn | T90351 Improve SSL of lists.wikimedia.org |
| Resolved | Dzahn | T83541 Upgrade Exim to >=4.73 |
| Duplicate | None | T97492 Upgrade to Mailman 3.0 |
| Resolved | Dzahn | T110141 TTL back up to normal 1H |
| Resolved | MZMcBride | T27231 Mailman mailing list archiver truncates if a line begins with "From" |
| Resolved | None | T66818 Mitigate strict DMARC policy on the mailing lists |
| Resolved | Dzahn | T80945 Get rid of all Ubuntu Lucid (10.04) installs |
| Resolved | Dzahn | T82698 shutdown sodium after mailman has migrated to jessie VM |
| Resolved | Dzahn | T105756 Mailman Upgrade (Jessie & Mailman 2.x) and migration to a VM |
| Resolved | Dzahn | T110136 hold lists.wikimedia.org with exim |
Event Timeline
I've looked at this. If we add lists.wikimedia.org to exim's hold domains and increase the retry time (to at least 4 hours, the length of the migration window), then all emails should be written to /var/spool/exim4/input and stay there.
From there, we can rsync sodium's spool directory onto fermium's (exim4 will hold on fermium too, so this can be a puppet change, and exim on fermium will already be using that directory), then allow exim to process the emails once we're confident nothing else is going to sodium's exim.
This creates two levels of stoppage: exim holds incoming mail, and Mailman is stopped on both servers so that it processes nothing. Mailman should be stopped only after exim has begun holding emails.
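As a rough way to confirm that the rsync step copied every held message, the sketch below compares the message IDs in two Exim input directories. It relies on Exim storing each queued message as an `<id>-H` file (envelope and headers) plus an `<id>-D` file (body). The function names are hypothetical, and it assumes both directories are readable from wherever the script runs, which on two separate hosts would mean running the listing on each side and comparing the results.

```python
from pathlib import Path


def spool_message_ids(input_dir):
    """Return the set of message IDs found in an Exim input directory.

    A message is counted only when both its -H and -D files exist.
    rglob is used so that a split spool (messages stored in
    subdirectories of input/) is handled as well.
    """
    root = Path(input_dir)
    headers = {p.name[:-2] for p in root.rglob("*-H") if p.is_file()}
    bodies = {p.name[:-2] for p in root.rglob("*-D") if p.is_file()}
    return headers & bodies


def missing_after_sync(source_dir, dest_dir):
    """Return the sorted IDs present in source_dir but absent from dest_dir."""
    return sorted(spool_message_ids(source_dir) - spool_message_ids(dest_dir))
```

An empty list from `missing_after_sync` only says that every complete message in the source is also present in the destination; it does not check file contents.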
I propose:
- hold lists.wikimedia.org for exim (via puppet/git), deploy it
- wait 10 minutes, check that mailman is not doing anything mail-wise, then stop it.
- do the migration
- start mailman on fermium only.
- stop holding emails on fermium only. Sodium should keep puppet disabled and continue to queue emails; otherwise any unlucky senders will never have their emails acknowledged by mailman, and those emails will just bounce after 24-48 hours.
And assigning back to Daniel.
Change 233750 had a related patch set uploaded (by John F. Lewis):
lists: hold mail to lists.wm.o