Setup new mirror server (mirror1001.wikimedia.org)
Closed, Resolved (Public)

Description

We have a new mirror host. We need to get the mirrors updated and then switch over the mirrors.wikimedia.org CNAME.

  • cut over DNS
  • monitor Debian mirroring: tail -F auth.log | grep ftpsync
  • decommission sodium
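The monitoring step above amounts to filtering a log stream for lines that mention ftpsync. A minimal Python sketch of the filtering half of `tail -F auth.log | grep ftpsync` follows; the function name `ftpsync_lines` and the sample lines are hypothetical placeholders, not real auth.log output or anything taken from the actual mirror configuration.

```python
from typing import Iterable, Iterator


def ftpsync_lines(lines: Iterable[str], needle: str = "ftpsync") -> Iterator[str]:
    """Yield only the lines that contain `needle`, without the trailing newline."""
    for line in lines:
        if needle in line:
            yield line.rstrip("\n")


# Placeholder lines standing in for log content (not real sshd output).
sample = [
    "example line that mentions ftpsync\n",
    "unrelated example line\n",
    "another ftpsync example line\n",
]

matches = list(ftpsync_lines(sample))
for match in matches:
    print(match)
```

Because the function accepts any iterable of strings, it could also be fed `sys.stdin` when the script sits at the end of a `tail -F` pipe, which keeps the follow-the-file logic in `tail` rather than reimplementing it.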

Event Timeline

MoritzMuehlenhoff renamed this task from Setup new mirror server to Setup new mirror server (copernicium.wikimedia.org). Jul 19 2021, 11:06 AM
MoritzMuehlenhoff triaged this task as Medium priority.

Change 745612 had a related patch set uploaded (by JHathaway; author: JHathaway):

[operations/puppet@production] debian mirrors: add new mirror, copernicium in eqiad

https://gerrit.wikimedia.org/r/745612

Cookbook cookbooks.sre.hosts.reimage was started by jhathaway@cumin1001 for host mirror1001.wikimedia.org with OS bullseye

Cookbook cookbooks.sre.hosts.reimage started by jhathaway@cumin1001 for host mirror1001.wikimedia.org with OS bullseye executed with errors:

  • mirror1001 (FAIL)
    • Removed from Puppet and PuppetDB if present
    • Deleted any existing Puppet certificate
    • Removed from Debmonitor if present
    • Forced PXE for next reboot
    • Host rebooted via IPMI
    • Host up (Debian installer)
    • The reimage failed, see the cookbook logs for the details

Cookbook cookbooks.sre.hosts.reimage was started by jhathaway@cumin1001 for host mirror1001.wikimedia.org with OS bullseye

Cookbook cookbooks.sre.hosts.reimage started by jhathaway@cumin1001 for host mirror1001.wikimedia.org with OS bullseye executed with errors:

  • mirror1001 (FAIL)
    • Removed from Puppet and PuppetDB if present
    • Deleted any existing Puppet certificate
    • Removed from Debmonitor if present
    • The reimage failed, see the cookbook logs for the details

Cookbook cookbooks.sre.hosts.reimage was started by jhathaway@cumin1001 for host mirror1001.wikimedia.org with OS bullseye

Cookbook cookbooks.sre.hosts.reimage started by jhathaway@cumin1001 for host mirror1001.wikimedia.org with OS bullseye executed with errors:

  • mirror1001 (FAIL)
    • Removed from Puppet and PuppetDB if present
    • Deleted any existing Puppet certificate
    • Removed from Debmonitor if present
    • The reimage failed, see the cookbook logs for the details

Cookbook cookbooks.sre.hosts.reimage was started by jhathaway@cumin1001 for host mirror1001.wikimedia.org with OS bullseye

Cookbook cookbooks.sre.hosts.reimage started by jhathaway@cumin1001 for host mirror1001.wikimedia.org with OS bullseye executed with errors:

  • mirror1001 (FAIL)
    • Removed from Puppet and PuppetDB if present
    • Deleted any existing Puppet certificate
    • Removed from Debmonitor if present
    • The reimage failed, see the cookbook logs for the details

Cookbook cookbooks.sre.hosts.reimage was started by jhathaway@cumin1001 for host mirror1001.wikimedia.org with OS bullseye

Cookbook cookbooks.sre.hosts.reimage started by jhathaway@cumin1001 for host mirror1001.wikimedia.org with OS bullseye completed:

  • mirror1001 (PASS)
    • Removed from Puppet and PuppetDB if present
    • Deleted any existing Puppet certificate
    • Removed from Debmonitor if present
    • Forced PXE for next reboot
    • Host rebooted via IPMI
    • Host up (Debian installer)
    • Host up (new fresh bullseye OS)
    • Generated Puppet certificate
    • Signed new Puppet certificate
    • Run Puppet in NOOP mode to populate exported resources in PuppetDB
    • Found Nagios_host resource for this host in PuppetDB
    • Downtimed the new host on Icinga
    • First Puppet run completed and logged in /var/log/spicerack/sre/hosts/reimage/202112141633_jhathaway_30634_mirror1001.out
    • Checked BIOS boot parameters are back to normal
    • Rebooted
    • Automatic Puppet run was successful
    • Forced a re-check of all Icinga services for the host
    • Icinga status is optimal
    • Icinga downtime removed
    • Updated Netbox data from PuppetDB
    • Updated Netbox status planned -> staged

Change 745612 merged by JHathaway:

[operations/puppet@production] debian mirrors: add new mirror, mirror1001 in eqiad

https://gerrit.wikimedia.org/r/745612

jhathaway renamed this task from Setup new mirror server (copernicium.wikimedia.org) to Setup new mirror server (mirror1001.wikimedia.org). Dec 16 2021, 7:59 PM
jhathaway claimed this task.

Change 747933 had a related patch set uploaded (by JHathaway; author: JHathaway):

[operations/dns@master] mirrors.wikimedia.org: point to new mirror

https://gerrit.wikimedia.org/r/747933

Change 749146 had a related patch set uploaded (by Jbond; author: jbond):

[operations/puppet@production] mirrors.wikimedia.org: Add new mirror server to dmz_cidr

https://gerrit.wikimedia.org/r/749146

Change 749146 merged by Jbond:

[operations/puppet@production] mirrors.wikimedia.org: Add new mirror server to dmz_cidr

https://gerrit.wikimedia.org/r/749146

Change 747933 merged by JHathaway:

[operations/dns@master] mirrors.wikimedia.org: point to new mirror

https://gerrit.wikimedia.org/r/747933

Not sure if this has been flagged by anyone else or considered, but note that our mirror is an official mirror for Debian, Ubuntu and Tails. For at least Debian, sodium's IPs are in the ftp.us.debian.org rotation (and thus have to be A/AAAA records rather than a CNAME). I still see sodium's IPs there. I think Debian has some automated machinery to update these IPs, but I'm not sure what triggers it, so be careful when turning off sodium. We're also a push mirror, which means that Debian's infrastructure triggers an update through SSH; not sure if this works yet?

I haven't touched all this for some years now, so I may also be missing steps (or maybe I'm too paranoid and everything will just work out fine :). I also realize that our mirror configs may not be adequately documented; give the config a closer look if you haven't, and don't hesitate to ask questions :)

Thanks Faidon, definitely not too paranoid. I missed the need to have Debian update their side of the DNS. I have reached out to the Debian mirrors team to have them update the DNS records; in the meantime we will keep sodium up to date until that change is made. I will also reach out to the Ubuntu and Tails teams to confirm they have the updated IP for mirrors.wikimedia.org. You may not have touched it for years, but your memory seems spot on! Also, I will update the docs to reflect the need to have the A/AAAA records updated.
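One way to confirm that it is safe to turn off sodium is to check whether any of its addresses still appear in the ftp.us.debian.org rotation. The Python sketch below is illustrative only: the helper names `resolve_all` and `lingering_addresses` are made up for this example, and the addresses are taken from the reserved documentation ranges (192.0.2.0/24, 2001:db8::/32), not from the real hosts.

```python
import ipaddress
import socket
from typing import Iterable, Set


def resolve_all(hostname: str) -> Set[str]:
    """Return every A/AAAA address the local resolver reports for `hostname`."""
    infos = socket.getaddrinfo(hostname, None, proto=socket.IPPROTO_TCP)
    return {info[4][0] for info in infos}


def normalise(addresses: Iterable[str]) -> Set[str]:
    """Parse each address so equivalent IPv6 spellings compare as equal."""
    return {str(ipaddress.ip_address(addr)) for addr in addresses}


def lingering_addresses(rotation: Iterable[str], old_host: Iterable[str]) -> Set[str]:
    """Addresses of the old host that are still present in the rotation."""
    return normalise(rotation) & normalise(old_host)


# Documentation-range placeholders, not the real rotation or sodium's IPs.
rotation = ["192.0.2.10", "192.0.2.20", "2001:db8::10"]
old_host = ["192.0.2.20", "2001:DB8:0::10"]

still_listed = lingering_addresses(rotation, old_host)
print(sorted(still_listed))
```

With network access, the same comparison could be run as `lingering_addresses(resolve_all("ftp.us.debian.org"), resolve_all(old_fqdn))`, where `old_fqdn` is sodium's fully qualified name (not stated in this task). An empty result reflects only one resolver's view at one point in time, so it is a hint rather than proof that the rotation has been updated.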

Mentioned in SAL (#wikimedia-operations) [2022-01-12T19:12:10Z] <mutante> mirror1001 - CRITICAL - degraded: The following units failed: update-ubuntu-mirror.service - T286898

yup, thanks for the reminder, reverted