
VRTS eqiad replacement
Closed, Resolved (Public)

Description

This is a task for failing over from otrs1001 (buster) to vrts1001 (bullseye) in eqiad.

  • Create scheduled maintenance window on VRTS dashboard to notify users of upcoming downtime
  • Downtime otrs1001 for two hours on cumin:
    sudo cookbook sre.hosts.downtime -r "Replacing Host" -H 2 otrs1001.eqiad.wmnet
  • Disable Puppet on otrs1001
  • Disable all services on otrs1001:
    sudo systemctl stop cron
    sudo systemctl stop exim4
    sudo systemctl stop apache2
    sudo systemctl stop vrts-daemon
  • Create patch and merge with the following changes:
    • Apply the vrts role to vrts1001.
    • Change active_host in Puppet hiera to vrts1001.eqiad.wmnet.
  • Run sudo /usr/local/bin/install_vrts 6.0.48 on vrts1001
  • Run sudo -u www-data /opt/otrs/bin/otrs.Console.pl Admin::Package::ReinstallAll
  • Ensure all services are running normally on vrts1001:
    sudo systemctl status cron
    sudo systemctl status exim4
    sudo systemctl status apache2
    sudo systemctl status vrts-daemon
  • Create patch in DNS repo. In the wmnet template, point ticket to vrts1001.eqiad.wmnet. Merge and run authdns-update
  • Access https://ticket.wikimedia.org, log in, and check that everything is working normally.
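
The stop and status steps in the checklist repeat the same four unit names, so they could be wrapped in a small helper. This is only a sketch: the unit names are taken from the checklist, but SERVICES, stop_services and check_services are hypothetical names, and the real procedure was run by hand.

```shell
#!/bin/sh
# Sketch only. Unit names come from the checklist above; the variable and
# function names are hypothetical and not part of any Wikimedia tooling.
SERVICES="cron exim4 apache2 vrts-daemon"

# Stop every service (intended for the old host, otrs1001).
# A failed stop is reported on stderr but does not abort the loop.
stop_services() {
    stop_rc=0
    for svc in $SERVICES; do
        sudo systemctl stop "$svc" || { echo "failed to stop $svc" >&2; stop_rc=1; }
    done
    return "$stop_rc"
}

# Report the state of every service (intended for the new host, vrts1001).
# Returns non-zero if at least one unit is not active.
check_services() {
    check_rc=0
    for svc in $SERVICES; do
        if systemctl is-active --quiet "$svc"; then
            echo "$svc: active"
        else
            echo "$svc: NOT active"
            check_rc=1
        fi
    done
    return "$check_rc"
}
```

After sourcing the file, stop_services would be run on otrs1001 and check_services on vrts1001. Unlike systemctl status, systemctl is-active --quiet only sets an exit code, so the result is easy to script against, but it shows none of the log output that status does.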

If everything is working fine, extend downtime on otrs1001 and prepare for decom.
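
Part of the "working fine" check can be approximated with a quick HTTP smoke test before logging in. The helper names below (http_status, smoke_test) are hypothetical, and a 2xx or 3xx response only shows that the web frontend answers; it does not replace logging in and opening a test ticket.

```shell
#!/bin/sh
# Sketch only: http_status and smoke_test are hypothetical helper names.

# Print the HTTP status code for a URL.
# curl prints 000 when no HTTP response was received at all.
http_status() {
    curl -s -o /dev/null -w '%{http_code}' --max-time 10 "$1"
}

# Succeed if the URL answers with a 2xx or 3xx status, fail otherwise.
smoke_test() {
    code="$(http_status "$1")" || true
    case "$code" in
        2??|3??) echo "OK: $1 returned $code" ;;
        *)       echo "FAIL: $1 returned $code"; return 1 ;;
    esac
}
```

With the functions sourced, the check would be invoked as smoke_test https://ticket.wikimedia.org.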

Event Timeline

Change 930245 had a related patch set uploaded (by AOkoth; author: AOkoth):

[operations/puppet@production] vrts: failing over to vrts1001

https://gerrit.wikimedia.org/r/930245

Change 930245 merged by AOkoth:

[operations/puppet@production] vrts: failing over to vrts1001

https://gerrit.wikimedia.org/r/930245

Change 930251 had a related patch set uploaded (by AOkoth; author: AOkoth):

[operations/dns@master] ticket: otrs1001 -> vrts1001

https://gerrit.wikimedia.org/r/930251

Change 930251 merged by AOkoth:

[operations/dns@master] ticket: otrs1001 -> vrts1001

https://gerrit.wikimedia.org/r/930251

Thanks again @Dzahn @eoghan

Seems to be working fine for now.

Test ticket: https://ticket.wikimedia.org/otrs/index.pl?Action=AgentTicketZoom;TicketID=12843552

Will wait till tomorrow to close this if nothing is reported.

@Arnoldokoth Great that all worked smoothly! Thank you. There is this other ticket (T295416) that was about upgrading OTRS (all hosts) to bullseye.

Seems like that is also resolved now with eqiad and codfw both having a bullseye machine. Should we close that as resolved? Or wait for decom of otrs1001?

@Dzahn Yeah, I created this as a sub-task of that one. I will close this first and create another sub-task under T295416 for the decom of otrs1001.