
Investigate options to improve CloudVPS backend database architecture
Closed, ResolvedPublic

Description

As we continue to improve and grow the OpenStack control plane, the limits of the standard database configuration (number of database connections, tuning options, and failover procedures) have become more of a concern.

This task is to investigate the architecture and design for a highly available database that can be tuned specifically for the CloudVPS OpenStack services.

Update: We're going to move the OpenStack control plane databases to a Galera cluster. The cluster will be hosted on the cloudcontrol nodes, of which there are three in each deployment.
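
As a rough illustration of what "healthy" could mean for a three-node cluster like this one, here is a minimal Python sketch that evaluates the result of `SHOW GLOBAL STATUS LIKE 'wsrep_%'` after it has been loaded into a dict. The `wsrep_*` names are standard Galera status variables; the function name and the expected node count default are assumptions made for illustration, and this is not the check shipped in the puppet module.

```python
EXPECTED_NODES = 3  # assumption: one Galera node per cloudcontrol host


def galera_problems(status, expected_nodes=EXPECTED_NODES):
    """Return a list of human-readable problems; an empty list means healthy.

    `status` maps wsrep status variable names to their string values.
    """
    problems = []
    size = int(status.get("wsrep_cluster_size", 0))
    if size != expected_nodes:
        problems.append(f"cluster size is {size}, expected {expected_nodes}")
    if status.get("wsrep_cluster_status") != "Primary":
        problems.append("node is not part of the primary component")
    if status.get("wsrep_local_state_comment") != "Synced":
        problems.append("node is not synced")
    if status.get("wsrep_ready") != "ON":
        problems.append("node is not ready for queries")
    return problems
```

A monitoring check could run the status query on each node and alert whenever the returned list is non-empty.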

codfw1dev:

  • Install and launch galera cluster
  • Move Glance db use to galera
  • Move 'labspuppet' db use to galera
  • Move Nova (+ nova_api + nova_cell0) db use to galera
  • Move Keystone db use to galera
  • Move Neutron db use to galera
  • Move Designate db use to galera

eqiad1:

  • Install and launch galera cluster
  • Move Glance db use to galera
  • Move Nova (+ nova_api + nova_cell0) db use to galera
  • Move Keystone db use to galera
  • Move Neutron db use to galera
  • Move Designate db use to galera

cleanup:

  • rename tables (as preamble to dropping)
    • glance.* -> glance_old.*
    • nova_eqiad1.* -> nova_eqiad1_old.*
    • nova_api_eqiad1.* -> nova_api_eqiad1_old.*
    • nova_cell0_eqiad1.* -> nova_cell0_eqiad1_old.*
    • keystone.* -> keystone_old.*
    • neutron.* -> neutron_old.*
    • designate.* -> designate_old.*
  • remove databases from backup jobs (if they're enumerated?)
  • drop databases from m5
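
MariaDB has no statement for renaming a whole database, so the rename step above has to move tables one at a time with `RENAME TABLE`. The following Python sketch only generates the SQL for one database; the function name and the example table name are hypothetical, and the `_old` suffix follows the list above.

```python
def rename_statements(database, tables, suffix="_old"):
    """Build SQL that moves every table of `database` into `<database><suffix>`.

    The target database must exist before RENAME TABLE can move tables
    into it, so a CREATE DATABASE statement is emitted first.
    """
    target = f"{database}{suffix}"
    statements = [f"CREATE DATABASE IF NOT EXISTS `{target}`;"]
    for table in tables:
        statements.append(
            f"RENAME TABLE `{database}`.`{table}` TO `{target}`.`{table}`;"
        )
    return statements


if __name__ == "__main__":
    # "images" is only an example table name.
    for statement in rename_statements("glance", ["images"]):
        print(statement)
```

Leaving the renamed tables in place for a while before dropping them gives any service that is still pointed at the old host a chance to fail loudly.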

Details

Repo               Branch      Lines +/-
operations/puppet  production  +1 -1
operations/puppet  production  +1 -1
operations/puppet  production  +1 -1
operations/puppet  production  +21 -1
operations/puppet  production  +1 -1
operations/puppet  production  +1 -1
operations/puppet  production  +14 -1
operations/puppet  production  +66 -0
operations/puppet  production  +5 -2
operations/puppet  production  +2 -0
operations/puppet  production  +1 -1
operations/puppet  production  +1 -1
operations/puppet  production  +9 -1
operations/puppet  production  +1 -1
operations/puppet  production  +1 -1
operations/puppet  production  +1 -1
operations/puppet  production  +80 -0
operations/puppet  production  +265 -5
operations/puppet  production  +1 -1
operations/puppet  production  +2 -2
operations/puppet  production  +1 -1
operations/puppet  production  +47 -20
operations/puppet  production  +9 -0
operations/puppet  production  +2 -2
operations/puppet  production  +20 -4
operations/puppet  production  +2 -2
operations/puppet  production  +1 -3
operations/puppet  production  +282 -0

Event Timeline


Change 604856 merged by Andrew Bogott:
[operations/puppet@production] Initial module and profile for galera + mariadb

https://gerrit.wikimedia.org/r/604856

Change 605604 had a related patch set uploaded (by Andrew Bogott; owner: Andrew Bogott):
[operations/puppet@production] wmcs galera: rearrange hiera settings

https://gerrit.wikimedia.org/r/605604

Change 605604 merged by Andrew Bogott:
[operations/puppet@production] wmcs galera: rearrange hiera settings

https://gerrit.wikimedia.org/r/605604

Change 605607 had a related patch set uploaded (by Andrew Bogott; owner: Andrew Bogott):
[operations/puppet@production] wmcs galera codfw1dev: enable

https://gerrit.wikimedia.org/r/605607

Change 605607 merged by Andrew Bogott:
[operations/puppet@production] wmcs galera codfw1dev: enable

https://gerrit.wikimedia.org/r/605607

Change 605620 had a related patch set uploaded (by Andrew Bogott; owner: Andrew Bogott):
[operations/puppet@production] wmcs galera: fix mysqld process check

https://gerrit.wikimedia.org/r/605620

Change 605620 merged by Andrew Bogott:
[operations/puppet@production] wmcs galera: fix mysqld process check

https://gerrit.wikimedia.org/r/605620

Change 605336 merged by Andrew Bogott:
[operations/puppet@production] Galera: move behind haproxy

https://gerrit.wikimedia.org/r/605336

Change 605622 had a related patch set uploaded (by Andrew Bogott; owner: Andrew Bogott):
[operations/puppet@production] wmcs galera: move backend port to 23306; 13306 is already occupied by prometheus

https://gerrit.wikimedia.org/r/605622

Change 605622 merged by Andrew Bogott:
[operations/puppet@production] wmcs galera: move backend port to 23306; 13306 is already occupied by prometheus

https://gerrit.wikimedia.org/r/605622

Change 605625 had a related patch set uploaded (by Andrew Bogott; owner: Andrew Bogott):
[operations/puppet@production] wmcs galera: move codfw1dev mysql port behind haproxy

https://gerrit.wikimedia.org/r/605625

Change 605625 merged by Andrew Bogott:
[operations/puppet@production] wmcs galera: move codfw1dev mysql port behind haproxy

https://gerrit.wikimedia.org/r/605625

Change 605651 had a related patch set uploaded (by Andrew Bogott; owner: Andrew Bogott):
[operations/puppet@production] wmcs haproxy: fix up mysql config

https://gerrit.wikimedia.org/r/605651

Change 605651 merged by Andrew Bogott:
[operations/puppet@production] wmcs haproxy: fix up mysql config

https://gerrit.wikimedia.org/r/605651

Change 605897 had a related patch set uploaded (by Andrew Bogott; owner: Andrew Bogott):
[operations/puppet@production] icinga galera monitoring: typo fix

https://gerrit.wikimedia.org/r/605897

Change 605897 merged by Andrew Bogott:
[operations/puppet@production] icinga galera monitoring: typo fix

https://gerrit.wikimedia.org/r/605897

Change 605928 had a related patch set uploaded (by Andrew Bogott; owner: Andrew Bogott):
[operations/puppet@production] rename check-galera to check_galera

https://gerrit.wikimedia.org/r/605928

Change 605928 merged by Andrew Bogott:
[operations/puppet@production] rename check-galera to check_galera

https://gerrit.wikimedia.org/r/605928

Change 606042 had a related patch set uploaded (by Andrew Bogott; owner: Andrew Bogott):
[operations/puppet@production] wmcs galera: add a second icinga check

https://gerrit.wikimedia.org/r/606042

Change 606042 merged by Andrew Bogott:
[operations/puppet@production] wmcs galera: add a second icinga check

https://gerrit.wikimedia.org/r/606042

Change 606266 had a related patch set uploaded (by Andrew Bogott; owner: Andrew Bogott):
[operations/puppet@production] openstack: add templatized database grants for openstack services

https://gerrit.wikimedia.org/r/606266

Change 606266 merged by Andrew Bogott:
[operations/puppet@production] openstack: add templatized database grants for openstack services

https://gerrit.wikimedia.org/r/606266

Change 606466 had a related patch set uploaded (by Andrew Bogott; owner: Andrew Bogott):
[operations/puppet@production] codfw1dev: move keystone db to galera

https://gerrit.wikimedia.org/r/606466

Change 606467 had a related patch set uploaded (by Andrew Bogott; owner: Andrew Bogott):
[operations/puppet@production] codfw1dev: move nova db to galera

https://gerrit.wikimedia.org/r/606467

Change 606468 had a related patch set uploaded (by Andrew Bogott; owner: Andrew Bogott):
[operations/puppet@production] codfw1dev: move neutron db to galera

https://gerrit.wikimedia.org/r/606468

Change 606469 had a related patch set uploaded (by Andrew Bogott; owner: Andrew Bogott):
[operations/puppet@production] codfw1dev: move designate db to galera

https://gerrit.wikimedia.org/r/606469

Change 606470 had a related patch set uploaded (by Andrew Bogott; owner: Andrew Bogott):
[operations/puppet@production] codfw1dev: move puppet enc storage to galera

https://gerrit.wikimedia.org/r/606470

Change 606466 merged by Andrew Bogott:
[operations/puppet@production] codfw1dev: move keystone db to galera

https://gerrit.wikimedia.org/r/606466

Change 606469 merged by Andrew Bogott:
[operations/puppet@production] codfw1dev: move designate db to galera

https://gerrit.wikimedia.org/r/606469

Change 606483 had a related patch set uploaded (by Andrew Bogott; owner: Andrew Bogott):
[operations/puppet@production] designate database: expand grants to include cloudcontrol nodes

https://gerrit.wikimedia.org/r/606483

Change 606483 merged by Andrew Bogott:
[operations/puppet@production] designate database: expand grants to include cloudcontrol nodes

https://gerrit.wikimedia.org/r/606483

Change 606470 merged by Andrew Bogott:
[operations/puppet@production] codfw1dev: move puppet enc storage to galera

https://gerrit.wikimedia.org/r/606470

Change 606468 merged by Andrew Bogott:
[operations/puppet@production] codfw1dev: move neutron db to galera

https://gerrit.wikimedia.org/r/606468

Change 606467 merged by Andrew Bogott:
[operations/puppet@production] codfw1dev: move nova db to galera

https://gerrit.wikimedia.org/r/606467

Change 606534 had a related patch set uploaded (by Andrew Bogott; owner: Andrew Bogott):
[operations/puppet@production] Galera/mysql: increase max_connections to 500

https://gerrit.wikimedia.org/r/606534

Change 606535 had a related patch set uploaded (by Andrew Bogott; owner: Andrew Bogott):
[operations/puppet@production] haproxy: add some settings for tcp backends

https://gerrit.wikimedia.org/r/606535

Change 606534 merged by Andrew Bogott:
[operations/puppet@production] Galera/mysql: increase max_connections to 500

https://gerrit.wikimedia.org/r/606534

Change 606535 merged by Andrew Bogott:
[operations/puppet@production] haproxy: add some settings for tcp backends

https://gerrit.wikimedia.org/r/606535

Change 607078 had a related patch set uploaded (by Andrew Bogott; owner: Andrew Bogott):
[operations/puppet@production] wmcs galera: add daily backups of each OpenStack db

https://gerrit.wikimedia.org/r/607078

Change 607078 merged by Andrew Bogott:
[operations/puppet@production] wmcs galera: add daily backups of each OpenStack db

https://gerrit.wikimedia.org/r/607078

Change 607318 had a related patch set uploaded (by Andrew Bogott; owner: Andrew Bogott):
[operations/puppet@production] wmcs: install galera on eqiad1 cloudcontrol nodes

https://gerrit.wikimedia.org/r/607318

Change 607318 merged by Andrew Bogott:
[operations/puppet@production] wmcs: install galera on eqiad1 cloudcontrol nodes

https://gerrit.wikimedia.org/r/607318

Change 607321 had a related patch set uploaded (by Andrew Bogott; owner: Andrew Bogott):
[operations/puppet@production] eqiad1: enable galera

https://gerrit.wikimedia.org/r/607321

Change 607321 merged by Andrew Bogott:
[operations/puppet@production] eqiad1: enable galera

https://gerrit.wikimedia.org/r/607321

In order to get something resembling a fresh start, I'm copying the data over to Galera but creating new tables there; that way we should avoid encoding issues and other cruft left over from having run OpenStack since the stone age.

  • On the old db host, run "mysqldump --no-create-db --no-create-info --complete-insert".
  • Remove BY HAND anything related to schema or versioning from the dump.
  • On the new db host, create the new database directly, then use the OpenStack tool's schema upgrader to create the tables &c. (e.g. 'glance-manage db sync').
  • Import the data from the above dump, correcting as needed for duplicate records.

It's graceless but seems better than carrying along our old schema.
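
The by-hand cleanup step could in principle be scripted. This is a minimal Python sketch, not what was actually run: it builds the mysqldump command quoted above and drops INSERT statements that target schema-versioning tables. The table names `alembic_version` and `migrate_version` are assumptions about what the versioning tables are called, and both function names are hypothetical.

```python
import re

# Assumed names of the schema-versioning tables; check the real dump first.
VERSION_TABLES = {"alembic_version", "migrate_version"}

# With --complete-insert, data lines start with: INSERT INTO `table` (`col`, ...)
INSERT_RE = re.compile(r"^INSERT INTO `(?P<table>[^`]+)`")


def dump_command(database):
    """Argument list for a data-only dump of one database."""
    return [
        "mysqldump",
        "--no-create-db",
        "--no-create-info",
        "--complete-insert",
        database,
    ]


def strip_version_rows(dump_lines):
    """Return the dump lines without INSERTs into versioning tables."""
    kept = []
    for line in dump_lines:
        match = INSERT_RE.match(line)
        if match and match.group("table") in VERSION_TABLES:
            continue  # the schema upgrader on the new host owns these rows
        kept.append(line)
    return kept
```

The filtered dump would then be imported after the schema upgrader has created the tables on the new host; duplicate records still need to be resolved by hand.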

Change 607374 had a related patch set uploaded (by Andrew Bogott; owner: Andrew Bogott):
[operations/puppet@production] eqiad1: move glance db to galera

https://gerrit.wikimedia.org/r/607374

Change 607374 merged by Andrew Bogott:
[operations/puppet@production] eqiad1: move glance db to galera

https://gerrit.wikimedia.org/r/607374

Change 610365 had a related patch set uploaded (by Andrew Bogott; owner: Andrew Bogott):
[operations/puppet@production] wmcs galera: rework use of prometheus-mysqld-exporter

https://gerrit.wikimedia.org/r/610365

Change 610365 merged by Andrew Bogott:
[operations/puppet@production] wmcs galera: rework use of prometheus-mysqld-exporter

https://gerrit.wikimedia.org/r/610365

Change 611421 had a related patch set uploaded (by Andrew Bogott; owner: Andrew Bogott):
[operations/puppet@production] Openstack Nova: move database access to galera on cloudcontrol nodes

https://gerrit.wikimedia.org/r/611421

Change 611421 merged by Andrew Bogott:
[operations/puppet@production] Openstack Nova: move database access to galera on cloudcontrol nodes

https://gerrit.wikimedia.org/r/611421

Change 611935 had a related patch set uploaded (by Andrew Bogott; owner: Andrew Bogott):
[operations/puppet@production] eqiad1 keystone: move database to galera on cloudcontrol hosts

https://gerrit.wikimedia.org/r/611935

Change 611935 merged by Andrew Bogott:
[operations/puppet@production] eqiad1 keystone: move database to galera on cloudcontrol hosts

https://gerrit.wikimedia.org/r/611935

Change 612378 had a related patch set uploaded (by Andrew Bogott; owner: Andrew Bogott):
[operations/puppet@production] eqiad1 designate: move to galera db host

https://gerrit.wikimedia.org/r/612378

Change 612378 merged by Andrew Bogott:
[operations/puppet@production] eqiad1 designate: move to galera db host

https://gerrit.wikimedia.org/r/612378

Andrew updated the task description.

@Andrew If it wouldn't be a lot of overhead on you, could you check documentation of misc dbs at https://wikitech.wikimedia.org/wiki/MariaDB/misc#m5 to make sure that is accurate to reality? I am guessing it would be mostly removal of things.

Yep, I removed the references to OpenStack dbs there.