
Replace primary mail relays (polonium/lead)
Closed, Resolved · Public

Description

Our current e-mail relays, polonium and lead, need to be replaced:

  • They are both in eqiad, which was always meant to be temporary (when they were provisioned, codfw was not yet in production)
  • They need to be converted to jessie
  • They need to be moved into VMs
  • (Optional) They need to get cluster-based hostnames as we'll need to issue certificates for them

The systems are well-puppetized, so it should be relatively simple. The only tricky part is handling the migration, as polonium/lead's IP addresses are hardcoded elsewhere, e.g. in the Google Apps panel that OIT is handling.

Details

Related Gerrit Patches:
operations/puppet : production · Add mx1001/mx2001 as role mail::mx

Event Timeline

faidon created this task. · Sep 21 2015, 7:59 AM
faidon claimed this task.
faidon raised the priority of this task to High.
faidon updated the task description.
faidon added projects: acl*sre-team, Mail.
faidon added a subscriber: faidon.
Restricted Application added subscribers: Matanya, Aklapper. · Sep 21 2015, 7:59 AM

Change 239784 had a related patch set uploaded (by Faidon Liambotis):
Add mx1001/mx2001 as role mail::mx

https://gerrit.wikimedia.org/r/239784

Change 239784 merged by Faidon Liambotis:
Add mx1001/mx2001 as role mail::mx

https://gerrit.wikimedia.org/r/239784

faidon set Security to None.
faidon added a comment. · Edited · Sep 21 2015, 1:07 PM

The new hosts, mx1001/mx2001, are up and running. I've already notified WMF's Office IT team to update Google Apps with the new IPs (#8564 in their ticketing system).

Google Apps was updated by OIT. MXes for all domains except wikimedia.org and its subdomains have been switched. wiki-mail-eqiad was switched as well.

wikimedia.org and subdomains will follow next; the generic wiki-mail CNAME too, as well as a new wiki-mail-codfw. ETA is tomorrow, Sep 22nd.

All of the above are done. polonium still gets a fair share of emails (spammers don't really obey DNS TTLs); I'll be monitoring it over the next few days to find any stray email flows and switch those as well. After that, a ticket to properly decom polonium/lead will follow.
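For the DNS side of that follow-up, a check along these lines could flag domains whose MX records still name an old relay. This is only a sketch: the FQDNs for polonium, lead and mx1001, the sample domains, and the `stale_mx_domains` helper are all assumptions made for illustration, and the MX data is passed in as a plain dict rather than looked up live. It would not catch senders that have the old IP addresses hardcoded.

```python
# Hypothetical FQDNs for the old relays; the task only names them as polonium and lead.
OLD_RELAYS = {"polonium.wikimedia.org", "lead.wikimedia.org"}


def stale_mx_domains(mx_records, old_relays=OLD_RELAYS):
    """Return the sorted list of domains whose MX set still includes an old relay.

    mx_records maps a domain name to an iterable of MX target hostnames.
    Hostnames are compared case-insensitively, ignoring a trailing dot.
    """
    normalized_old = {host.rstrip(".").lower() for host in old_relays}
    stale = []
    for domain, targets in mx_records.items():
        normalized = {target.rstrip(".").lower() for target in targets}
        if normalized & normalized_old:
            stale.append(domain)
    return sorted(stale)


if __name__ == "__main__":
    # Made-up sample data, not taken from the real zones.
    sample = {
        "switched.example": ["mx1001.wikimedia.org."],
        "forgotten.example": ["polonium.wikimedia.org."],
    }
    print(stale_mx_domains(sample))
```

In practice the dict would be filled from whatever source of MX data is at hand, such as zone files or the output of a resolver.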

faidon closed this task as Resolved. · Oct 8 2015, 9:32 AM

This has essentially been done for a few days now. See T113962 for the decom task.