Upgrade cloudnet1003 and cloudnet1004 to Debian Buster
Closed, Resolved, Public

Description

I just did these upgrades in codfw1dev and things went very smoothly. Steps:

  • Merge an install_server patch moving these hosts to Buster
  • Determine which host is active and which is standby:
      root@cloudcontrol1003:~# neutron l3-agent-list-hosting-router d93771ba-2711-4f88-804a-8df6fd03978a
  • Re-image the standby host, confirm puppet stability, then reboot
  • Confirm (with the above command) that the standby node is up and running
  • Fail over by stopping the l3-agent on the active host:
      service neutron-l3-agent stop
  • Confirm that networking is still working and that the active/standby hosts have traded places
  • Re-image the remaining host
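
The active/standby check in the steps above could be scripted roughly as follows. This is only a sketch: `ha_hosts` is a hypothetical helper name, and it assumes the neutron command prints its usual pipe-delimited table with `host` and `ha_state` header columns. If the output looks different on a given release, the column names in the awk program need adjusting.

```shell
#!/bin/sh
# Hypothetical helper: print the host(s) whose ha_state matches $1
# ("active" or "standby"). Reads the table printed by
# `neutron l3-agent-list-hosting-router <router-id>` on stdin.
# Assumption: the table is pipe-delimited and has "host" and "ha_state"
# header columns.
ha_hosts() {
    awk -F'|' -v want="$1" '
        # Trim surrounding blanks from a table cell.
        function trim(s) { gsub(/^[ \t]+|[ \t]+$/, "", s); return s }

        # Header row: remember which columns hold host and ha_state.
        !seen_header && /\|/ {
            for (i = 1; i <= NF; i++) {
                name = trim($i)
                if (name == "host") host_col = i
                if (name == "ha_state") state_col = i
            }
            if (host_col && state_col) seen_header = 1
            next
        }

        # Data rows: print the host when its state matches.
        seen_header && /\|/ {
            if (trim($state_col) == want) print trim($host_col)
        }
    '
}

# Example use on a cloudcontrol host (router id taken from the steps above):
#   neutron l3-agent-list-hosting-router d93771ba-2711-4f88-804a-8df6fd03978a | ha_hosts active
#   neutron l3-agent-list-hosting-router d93771ba-2711-4f88-804a-8df6fd03978a | ha_hosts standby
```

Running both queries before and after stopping neutron-l3-agent makes it easy to see whether the two hosts have actually traded places. The helper only reads command output and does not change anything on the hosts.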

Event Timeline

Andrew triaged this task as Medium priority. May 19 2020, 4:03 PM
Andrew moved this task from Inbox to Doing on the cloud-services-team (Kanban) board.

Change 597560 had a related patch set uploaded (by Andrew Bogott; owner: Andrew Bogott):
[operations/puppet@production] cloudnet1003 and cloudnet1004: next install with Buster

https://gerrit.wikimedia.org/r/597560

Change 597560 merged by Andrew Bogott:
[operations/puppet@production] cloudnet1003 and cloudnet1004: next install with Buster

https://gerrit.wikimedia.org/r/597560

Script wmf-auto-reimage was launched by andrew on cumin1001.eqiad.wmnet for hosts:

cloudnet1004.eqiad.wmnet

The log can be found in /var/log/wmf-auto-reimage/202005211243_andrew_213507_cloudnet1004_eqiad_wmnet.log.

Mentioned in SAL (#wikimedia-operations) [2020-05-21T12:44:22Z] <andrewbogott> reimaging cloudnet1004.eqiad.wmnet for T253124

Completed auto-reimage of hosts:

['cloudnet1004.eqiad.wmnet']

Of which those FAILED:

['cloudnet1004.eqiad.wmnet']

Script wmf-auto-reimage was launched by andrew on cumin1001.eqiad.wmnet for hosts:

cloudnet1004.eqiad.wmnet

The log can be found in /var/log/wmf-auto-reimage/202005211259_andrew_214571_cloudnet1004_eqiad_wmnet.log.

Completed auto-reimage of hosts:

['cloudnet1004.eqiad.wmnet']

Of which those FAILED:

['cloudnet1004.eqiad.wmnet']

Script wmf-auto-reimage was launched by andrew on cumin1001.eqiad.wmnet for hosts:

cloudnet1004.eqiad.wmnet

The log can be found in /var/log/wmf-auto-reimage/202005211332_andrew_218658_cloudnet1004_eqiad_wmnet.log.

Completed auto-reimage of hosts:

['cloudnet1004.eqiad.wmnet']

Of which those FAILED:

['cloudnet1004.eqiad.wmnet']

Script wmf-auto-reimage was launched by andrew on cumin1001.eqiad.wmnet for hosts:

cloudnet1003.eqiad.wmnet

The log can be found in /var/log/wmf-auto-reimage/202005211547_andrew_239426_cloudnet1003_eqiad_wmnet.log.

Mentioned in SAL (#wikimedia-operations) [2020-05-21T15:48:44Z] <andrewbogott> rebuilding cloudnet1003.eqiad.wmnet with Debian Buster for T253124

Completed auto-reimage of hosts:

['cloudnet1003.eqiad.wmnet']

and were ALL successful.

Andrew updated the task description.