
upgrade cloud-vps openstack to Openstack version 'Xena'
Closed, Resolved, Public

Description

The current Horizon deploy is already W. So that leaves the cloudservices, cloudcontrol, cloudnet, and cloudvirt nodes to upgrade.

  • update IRC topic
  • downtime everything in icinga through 14:00CDT

    aborrero@cumin1001:~ $ sudo cookbook sre.hosts.downtime -r "upgrading openstack" --min 120 lab*

    aborrero@cumin1001:~ $ sudo cookbook sre.hosts.downtime -r "upgrading openstack" --min 120 cloud*

Start with cloudservices100[34].wikimedia.org (T304880).

  • dump databases on cloudcontrol1005: nova_eqiad1, nova_api_eqiad1, nova_cell0_eqiad1, neutron, cinder, glance, placement, keystone, trove_eqiad1:
    1. mysqldump -u root nova_eqiad1 > /root/xenadbbackups/nova_eqiad1.sql
    2. mysqldump -u root nova_api_eqiad1 > /root/xenadbbackups/nova_api_eqiad1.sql
    3. mysqldump -u root nova_cell0_eqiad1 > /root/xenadbbackups/nova_cell0_eqiad1.sql
    4. mysqldump -u root neutron > /root/xenadbbackups/neutron.sql
    5. mysqldump -u root cinder > /root/xenadbbackups/cinder.sql
    6. mysqldump -u root glance > /root/xenadbbackups/glance.sql
    7. mysqldump -u root placement > /root/xenadbbackups/placement.sql
    8. mysqldump -u root keystone > /root/xenadbbackups/keystone.sql
    9. mysqldump -u root trove_eqiad1 > /root/xenadbbackups/trove_eqiad1.sql
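
The nine dumps above could also be driven from a loop. This is a minimal sketch, assuming the same database names and the /root/xenadbbackups directory used in the list; the function names and the DRY_RUN switch are hypothetical and not part of the original procedure. By default it only prints the commands it would run.

```shell
# Sketch only: database names and backup directory come from the list above;
# dump_commands, run_dumps and DRY_RUN are illustrative assumptions.
BACKUP_DIR="${BACKUP_DIR:-/root/xenadbbackups}"
DATABASES="nova_eqiad1 nova_api_eqiad1 nova_cell0_eqiad1 neutron cinder glance placement keystone trove_eqiad1"

# Print one mysqldump command line per database.
dump_commands() {
    local db
    for db in $DATABASES; do
        echo "mysqldump -u root ${db} > ${BACKUP_DIR}/${db}.sql"
    done
}

# With DRY_RUN=1 (the default) only print the commands;
# with DRY_RUN=0 run the dumps and stop at the first failure.
run_dumps() {
    local db
    if [ "${DRY_RUN:-1}" = "1" ]; then
        dump_commands
        return 0
    fi
    mkdir -p "$BACKUP_DIR" || return 1
    for db in $DATABASES; do
        mysqldump -u root "$db" > "${BACKUP_DIR}/${db}.sql" || return 1
    done
}
```

Calling `run_dumps` prints the command list for review; `DRY_RUN=0 run_dumps` performs the dumps.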

Cloudcontrols:

All open database connections post-upgrade: https://phabricator.wikimedia.org/P10999
Checking haproxy status: echo "show stat" | socat /var/run/haproxy/haproxy.sock stdio | grep DOWN
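
A plain grep for DOWN matches the string anywhere in a line. As a slightly stricter variant, the sketch below reads the "show stat" CSV and prints only rows whose status column starts with DOWN, locating that column by its header name. The function name haproxy_down is hypothetical; the socket path is the one from the command above.

```shell
# Sketch: list proxy/server pairs whose haproxy status is DOWN.
# Reads "show stat" CSV on stdin; the status column is found via the header row.
haproxy_down() {
    awk -F, '
        NR == 1 {
            # Header row looks like "# pxname,svname,...,status,..."
            for (i = 1; i <= NF; i++) if ($i == "status") col = i
            next
        }
        # Matches "DOWN" as well as transitional states such as "DOWN 1/2".
        col && $col ~ /^DOWN/ { print $1 "/" $2 }
    '
}

# On a cloudcontrol:
#   echo "show stat" | socat /var/run/haproxy/haproxy.sock stdio | haproxy_down
```

No output means no backend is reported as DOWN.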

First cloudcontrol100x.wikimedia.org:

  • puppet agent --enable && puppet agent -tv
  • apt-get update
  • DEBIAN_FRONTEND=noninteractive apt-get install glance python3-eventlet glance-api glance-common keystone nova-api nova-conductor nova-scheduler nova-common glance trove-api trove-conductor trove-taskmanager neutron-server python3-requests python3-urllib3 placement-api cinder-volume cinder-scheduler cinder-api python3-oslo.messaging python3-tooz -o "Dpkg::Options::=--force-confdef" -o "Dpkg::Options::=--force-confold"
  • DEBIAN_FRONTEND=noninteractive apt-get upgrade -o "Dpkg::Options::=--force-confdef" -o "Dpkg::Options::=--force-confold"
  • puppet agent -tv
  • nova-manage api_db sync
  • nova-manage db sync
  • placement-manage db sync
  • glance-manage db_sync
  • keystone-manage db_sync
  • cinder-manage db sync
  • cinder-manage db online_data_migrations
  • trove-manage db_sync
  • puppet agent -tv
  • nova-manage db online_data_migrations
  • systemctl list-units --failed (should show nothing failed, or only keystone; if keystone has failed, clear it with systemctl reset-failed)
  • neutron-db-manage upgrade heads
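
The schema-sync commands in the list above must run in order, and a failure should stop the sequence. A minimal sketch of that, covering only the sync commands between the two puppet runs: the puppet runs, the nova online data migrations and the neutron upgrade are left manual here, partly because puppet agent -t uses detailed exit codes and exits non-zero when it applies changes. The variable and function names and the DRY_RUN switch are assumptions, not part of the original procedure.

```shell
# Sketch: run the schema-sync commands from the list above in order,
# stopping at the first failure. DRY_RUN=1 (the default) only prints them.
SYNC_STEPS='nova-manage api_db sync
nova-manage db sync
placement-manage db sync
glance-manage db_sync
keystone-manage db_sync
cinder-manage db sync
cinder-manage db online_data_migrations
trove-manage db_sync'

run_sync_steps() {
    echo "$SYNC_STEPS" | while IFS= read -r step; do
        echo "+ ${step}"
        if [ "${DRY_RUN:-1}" != "1" ]; then
            # Word splitting is intended: each step is a command plus arguments.
            # stdin is redirected so the command cannot consume the step list.
            $step </dev/null || { echo "FAILED: ${step}" >&2; exit 1; }
        fi
    done
}
```

`run_sync_steps` prints the plan; `DRY_RUN=0 run_sync_steps` executes it and returns non-zero if any step fails.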

Remaining cloudcontrol100x.wikimedia.org:

  • puppet agent --enable && puppet agent -tv
  • apt-get update
  • DEBIAN_FRONTEND=noninteractive apt-get install glance python3-eventlet glance-api glance-common keystone nova-api nova-conductor nova-scheduler nova-common glance trove-api trove-conductor trove-taskmanager neutron-server python3-requests python3-urllib3 placement-api cinder-volume cinder-scheduler cinder-api python3-oslo.messaging python3-tooz -o "Dpkg::Options::=--force-confdef" -o "Dpkg::Options::=--force-confold"
  • DEBIAN_FRONTEND=noninteractive apt-get upgrade -o "Dpkg::Options::=--force-confdef" -o "Dpkg::Options::=--force-confold"
  • puppet agent -tv
  • systemctl list-units --failed (should show nothing failed, or only keystone; if keystone has failed, clear it with systemctl reset-failed)

Cloudnets: wait for the network outage window (one at a time, please):

Begin with the standby node, as determined with:

$ neutron l3-agent-list-hosting-router cloudinstances2b-gw
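
To pick the standby node out of that output without reading the table by eye, something like the following could be used. It is a sketch that assumes the command prints an ASCII table with "host" and "ha_state" columns; that layout is an assumption about the client output, and the function name standby_host is hypothetical.

```shell
# Sketch: print the host whose ha_state is "standby".
# Reads the table on stdin; ASSUMES columns named "host" and "ha_state".
standby_host() {
    awk -F'|' '
        function trim(s) { gsub(/^[ \t]+|[ \t]+$/, "", s); return s }
        # First row that starts with "|" is treated as the header.
        /^\|/ && !hdr {
            for (i = 1; i <= NF; i++) {
                if (trim($i) == "host") h = i
                if (trim($i) == "ha_state") s = i
            }
            hdr = 1
            next
        }
        hdr && h && s && trim($s) == "standby" { print trim($h) }
    '
}

# neutron l3-agent-list-hosting-router cloudinstances2b-gw | standby_host
```

If the columns are named differently the function prints nothing, so an empty result should be treated as "check manually" rather than "no standby".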

Standby node (cloudnet1004.eqiad.wmnet):

  • puppet agent --enable && puppet agent -tv
  • apt-get update
  • DEBIAN_FRONTEND=noninteractive apt-get install -o "Dpkg::Options::=--force-confdef" -o "Dpkg::Options::=--force-confold" neutron-l3-agent python3-oslo.messaging python3-neutronclient python3-glanceclient
  • DEBIAN_FRONTEND=noninteractive apt-get upgrade -o "Dpkg::Options::=--force-confdef" -o "Dpkg::Options::=--force-confold"
  • puppet agent -tv
  • run neutron-db-manage upgrade heads on cloudcontrol1005.wikimedia.org

Active node (cloudnet1003.eqiad.wmnet):

  • puppet agent --enable && puppet agent -tv
  • apt-get update
  • DEBIAN_FRONTEND=noninteractive apt-get install -o "Dpkg::Options::=--force-confdef" -o "Dpkg::Options::=--force-confold" neutron-l3-agent python3-oslo.messaging python3-neutronclient python3-glanceclient
  • DEBIAN_FRONTEND=noninteractive apt-get upgrade -o "Dpkg::Options::=--force-confdef" -o "Dpkg::Options::=--force-confold"
  • puppet agent -tv

Break Time

Cloudvirts (start with one test host first, cloudvirt1039):

  • puppet agent --enable && puppet agent -tv
  • apt-get update
  • DEBIAN_FRONTEND=noninteractive apt-get install -y python3-libvirt python3-eventlet python3-os-brick python3-os-vif nova-compute neutron-common nova-compute-kvm neutron-linuxbridge-agent python3-neutron python3-oslo.messaging python3-taskflow python3-tooz python3-keystoneauth1 python3-requests python3-urllib3 -o "Dpkg::Options::=--force-confdef" -o "Dpkg::Options::=--force-confold"
  • DEBIAN_FRONTEND=noninteractive apt-get dist-upgrade -y --allow-downgrades -o "Dpkg::Options::=--force-confdef" -o "Dpkg::Options::=--force-confold"
  • puppet agent -tv
  • systemctl restart neutron-linuxbridge-agent libvirtd nova-compute

cloudbackup200[12].codfw.wmnet:

  • puppet agent --enable && puppet agent -tv
  • apt-get update
  • DEBIAN_FRONTEND=noninteractive apt upgrade cinder-backup
  • puppet agent -tv
  • (test from cloudcontrol1005.wikimedia.org) sudo wmcs-cinder-backup-manager
  • update IRC topic
  • enable puppet on all cloud* hosts

    $ sudo cumin 'cloud*' "enable-puppet 'Upgrading to openstack Wallaby - T281275 - ${USER}'"

Things to check

  • Check 'openstack region list'. There should be exactly one region, eqiad1-r. If there is a second region named 'RegionOne' (this happened in codfw1dev), delete it; otherwise scripts that enumerate regions will be confused.
  • Clean up VMs in the admin-monitoring project that leaked during upgrade; delete them.
  • Create a new VM and confirm that DNS and ssh work properly
  • Logs will be extremely noisy about policy deprecations and value checks; this is expected because OpenStack is poised between two different policy systems; our existing policies are still (noisily) supported in U.
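
The region check in the first item above can be scripted. This is a minimal sketch: check_regions is a hypothetical helper that reads region names on stdin, and the `-f value -c Region` options in the usage comment assume the client names that column "Region".

```shell
# Sketch: succeed only if the single region listed is the expected one.
# Reads region names on stdin, one per line; blank lines are ignored.
check_regions() {
    local expected regions
    expected="${1:-eqiad1-r}"
    regions="$(grep -v '^[[:space:]]*$')"
    if [ "$regions" = "$expected" ]; then
        echo "OK: only region is ${expected}"
    else
        echo "UNEXPECTED regions: $(echo "$regions" | tr '\n' ' ')"
        return 1
    fi
}

# openstack region list -f value -c Region | check_regions
```

A stray 'RegionOne' makes the function return non-zero and prints every region it saw.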

Details

Repo               Branch      Lines +/-
operations/puppet  production  +2 -7
operations/puppet  production  +1 -1
operations/puppet  production  +2 -2
operations/puppet  production  +4 -0
operations/puppet  production  +0 -14
operations/puppet  production  +1 -5
operations/puppet  production  +2 -2
operations/puppet  production  +45 -0
operations/puppet  production  +166 -0
operations/puppet  production  +42 -0
operations/puppet  production  +59 -0
operations/puppet  production  +273 -0
operations/puppet  production  +71 -0
operations/puppet  production  +172 -0
operations/puppet  production  +43 -0
operations/puppet  production  +176 -0
operations/puppet  production  +79 -0
operations/puppet  production  +23 -0
operations/puppet  production  +49 -0
operations/puppet  production  +207 -0
operations/puppet  production  +80 -0
operations/puppet  production  +0 -3
operations/puppet  production  +16 K -0

Related Objects

Status       Assigned
Resolved     Andrew
Resolved     Andrew
Resolved     rook
Resolved     Andrew
Resolved     Andrew
Resolved     taavi
Resolved     aborrero
Resolved     aborrero
Resolved     dcaro
Resolved     dcaro
Resolved     dcaro
Resolved     dcaro
Duplicate    dcaro
Resolved     dcaro
Resolved     taavi
Resolved     dcaro
Resolved     aborrero
Resolved     dcaro
In Progress  None
Resolved     Andrew
Open         Andrew
Open         Andrew
Open         None
Open         None
Open         None
Open         None
Open         None
Open         Andrew
Open         None
Open         Andrew
Resolved     aborrero
Resolved     aborrero
Duplicate    aborrero
Resolved     Andrew
Resolved     Andrew
Resolved     Andrew
Resolved     Andrew
Resolved     Andrew
Resolved     ayounsi
Resolved     rook
Resolved     aborrero
Resolved     rook
Resolved     rook
Duplicate    None
Resolved     Andrew

Event Timeline

Change 824831 had a related patch set uploaded (by Andrew Bogott; author: Andrew Bogott):

[operations/puppet@production] OpenStack: add files and templates for release Xena

https://gerrit.wikimedia.org/r/824831

Change 824832 had a related patch set uploaded (by Andrew Bogott; author: Andrew Bogott):

[operations/puppet@production] Remove refs to cinder v2 api -- it was removed in X.

https://gerrit.wikimedia.org/r/824832

Change 824833 had a related patch set uploaded (by Andrew Bogott; author: Andrew Bogott):

[operations/puppet@production] Add openstack serverpackages manifest for Xena

https://gerrit.wikimedia.org/r/824833

Change 824834 had a related patch set uploaded (by Andrew Bogott; author: Andrew Bogott):

[operations/puppet@production] Add openstack client package manifests for Xena

https://gerrit.wikimedia.org/r/824834

Change 824835 had a related patch set uploaded (by Andrew Bogott; author: Andrew Bogott):

[operations/puppet@production] Add glance manifest for Openstack Xena

https://gerrit.wikimedia.org/r/824835

Change 824836 had a related patch set uploaded (by Andrew Bogott; author: Andrew Bogott):

[operations/puppet@production] Add Magnum manifest for OpenStack Xena

https://gerrit.wikimedia.org/r/824836

Change 824837 had a related patch set uploaded (by Andrew Bogott; author: Andrew Bogott):

[operations/puppet@production] Add manifests for openstack Designate version Xena

https://gerrit.wikimedia.org/r/824837

Change 824838 had a related patch set uploaded (by Andrew Bogott; author: Andrew Bogott):

[operations/puppet@production] Add manifests for Openstack Cinder Xena

https://gerrit.wikimedia.org/r/824838

Change 824839 had a related patch set uploaded (by Andrew Bogott; author: Andrew Bogott):

[operations/puppet@production] Neutron: add manifest for Xena services

https://gerrit.wikimedia.org/r/824839

Change 824840 had a related patch set uploaded (by Andrew Bogott; author: Andrew Bogott):

[operations/puppet@production] Openstack Trove: replace file overlays with patch files for Xena

https://gerrit.wikimedia.org/r/824840

Change 824841 had a related patch set uploaded (by Andrew Bogott; author: Andrew Bogott):

[operations/puppet@production] Keystone: add manifest for Xena

https://gerrit.wikimedia.org/r/824841

Change 824842 had a related patch set uploaded (by Andrew Bogott; author: Andrew Bogott):

[operations/puppet@production] Keystone: replace file overlay with patch file for Xena

https://gerrit.wikimedia.org/r/824842

Change 824843 had a related patch set uploaded (by Andrew Bogott; author: Andrew Bogott):

[operations/puppet@production] Add manifest for Openstack Heat version Xena

https://gerrit.wikimedia.org/r/824843

Change 824844 had a related patch set uploaded (by Andrew Bogott; author: Andrew Bogott):

[operations/puppet@production] Add manifest for openstack barbican version Xena

https://gerrit.wikimedia.org/r/824844

Change 824845 had a related patch set uploaded (by Andrew Bogott; author: Andrew Bogott):

[operations/puppet@production] Add manifests for Openstack Nova version Xena

https://gerrit.wikimedia.org/r/824845

Change 824846 had a related patch set uploaded (by Andrew Bogott; author: Andrew Bogott):

[operations/puppet@production] Add manifest for openstack Placement service, version Xena

https://gerrit.wikimedia.org/r/824846

Change 824885 had a related patch set uploaded (by Andrew Bogott; author: Andrew Bogott):

[operations/puppet@production] Openstack Designate codfw1dev to Xena

https://gerrit.wikimedia.org/r/824885

Change 824886 had a related patch set uploaded (by Andrew Bogott; author: Andrew Bogott):

[operations/puppet@production] Openstack codfw1dev to version Xena

https://gerrit.wikimedia.org/r/824886

Change 824831 merged by Andrew Bogott:

[operations/puppet@production] OpenStack: add files and templates for release Xena

https://gerrit.wikimedia.org/r/824831

Change 824832 merged by Andrew Bogott:

[operations/puppet@production] Trove: remove refs to cinder v2 api -- it was removed in X.

https://gerrit.wikimedia.org/r/824832

Change 824833 merged by Andrew Bogott:

[operations/puppet@production] Add openstack serverpackages manifest for Xena

https://gerrit.wikimedia.org/r/824833

Change 824834 merged by Andrew Bogott:

[operations/puppet@production] Add openstack client package manifests for Xena

https://gerrit.wikimedia.org/r/824834

Change 824836 merged by Andrew Bogott:

[operations/puppet@production] Add Magnum manifest for OpenStack Xena

https://gerrit.wikimedia.org/r/824836

Change 824837 merged by Andrew Bogott:

[operations/puppet@production] Add manifests for openstack Designate version Xena

https://gerrit.wikimedia.org/r/824837

Change 824838 merged by Andrew Bogott:

[operations/puppet@production] Add manifests for Openstack Cinder Xena

https://gerrit.wikimedia.org/r/824838

Change 824840 merged by Andrew Bogott:

[operations/puppet@production] Openstack Trove: replace file overlays with patch files for Xena

https://gerrit.wikimedia.org/r/824840

Change 824841 merged by Andrew Bogott:

[operations/puppet@production] Keystone: add manifest for Xena

https://gerrit.wikimedia.org/r/824841

Change 824842 merged by Andrew Bogott:

[operations/puppet@production] Keystone: replace file overlay with patch file for Xena

https://gerrit.wikimedia.org/r/824842

Change 824843 merged by Andrew Bogott:

[operations/puppet@production] Add manifest for Openstack Heat version Xena

https://gerrit.wikimedia.org/r/824843

Change 824844 merged by Andrew Bogott:

[operations/puppet@production] Add manifest for openstack barbican version Xena

https://gerrit.wikimedia.org/r/824844

Change 824845 merged by Andrew Bogott:

[operations/puppet@production] Add manifests for Openstack Nova version Xena

https://gerrit.wikimedia.org/r/824845

Change 824846 merged by Andrew Bogott:

[operations/puppet@production] Add manifest for openstack Placement service, version Xena

https://gerrit.wikimedia.org/r/824846

Change 824835 merged by Andrew Bogott:

[operations/puppet@production] Add glance manifest for Openstack Xena

https://gerrit.wikimedia.org/r/824835

Change 824839 merged by Andrew Bogott:

[operations/puppet@production] Neutron: add manifest for Xena services

https://gerrit.wikimedia.org/r/824839

Change 824885 merged by Andrew Bogott:

[operations/puppet@production] Openstack Designate codfw1dev to Xena

https://gerrit.wikimedia.org/r/824885

Change 824886 merged by Andrew Bogott:

[operations/puppet@production] Openstack codfw1dev to version Xena

https://gerrit.wikimedia.org/r/824886

Change 825394 had a related patch set uploaded (by Andrew Bogott; author: Andrew Bogott):

[operations/puppet@production] Openstack Trove: remove some file resources no longer needed in X

https://gerrit.wikimedia.org/r/825394

Change 825394 merged by Andrew Bogott:

[operations/puppet@production] Openstack Trove: remove some file resources no longer needed in X

https://gerrit.wikimedia.org/r/825394

Andrew updated the task description.

Change 825399 had a related patch set uploaded (by Andrew Bogott; author: Andrew Bogott):

[operations/puppet@production] Move cloudbackup100[12]-dev to Xena

https://gerrit.wikimedia.org/r/825399

Change 825399 merged by Andrew Bogott:

[operations/puppet@production] Move cloudbackup100[12]-dev to Xena

https://gerrit.wikimedia.org/r/825399

Change 825927 had a related patch set uploaded (by Andrew Bogott; author: Andrew Bogott):

[operations/puppet@production] Eqiad designate -> OpenStack version Xena

https://gerrit.wikimedia.org/r/825927

Change 825927 merged by Andrew Bogott:

[operations/puppet@production] Eqiad designate -> OpenStack version Xena

https://gerrit.wikimedia.org/r/825927

Codfw1dev is now running X for all services. Eqiad1 cloudservice nodes are running Designate X; the remaining services will be upgraded next week.

Change 828020 had a related patch set uploaded (by Andrew Bogott; author: Andrew Bogott):

[operations/puppet@production] Horizon: put into maintenance mode for Xena upgrade

https://gerrit.wikimedia.org/r/828020

Change 828024 had a related patch set uploaded (by Andrew Bogott; author: Andrew Bogott):

[operations/puppet@production] Upgrade eqiad1 to openstack Xena

https://gerrit.wikimedia.org/r/828024

Mentioned in SAL (#wikimedia-cloud) [2022-08-30T14:59:31Z] <andrewbogott> manually marking most eqiad1 cloud* servers down in icinga for T296561

Change 828020 merged by FNegri:

[operations/puppet@production] Horizon: put into maintenance mode for Xena upgrade

https://gerrit.wikimedia.org/r/828020

Change 828024 merged by Andrew Bogott:

[operations/puppet@production] Upgrade eqiad1 to openstack Xena

https://gerrit.wikimedia.org/r/828024

Andrew claimed this task.

This upgrade is done. A few issues came up during the upgrade; they are tracked in subtasks.