Create staging-db* (databases)
Closed, Declined; Public

Event Timeline

greg raised the priority of this task to Medium.
greg updated the task description.
greg added a project: Staging.
greg added subscribers: thcipriani, Aklapper, demon and 3 others.

using puppet modules:

standard
mariadb::packages
mariadb::config

There seem to be a small handful of things that need to be tweaked:
Something should address: https://gerrit.wikimedia.org/r/#/c/195328/
This can be worked around with a few failed puppet runs, but still: https://gerrit.wikimedia.org/r/#/c/194925/

Also, not quite sure what the solution is here: https://phabricator.wikimedia.org/T91797 — possibly a manual piece of the puzzle for the time being.

It seems that we have to ensure apt-get update is run before anything from the mariadb::packages class is run. The easy way to do this is just:

require => Exec['apt-get update'],

The nice way to do this, and the pattern I've seen applied elsewhere, would be to use:

require => Apt::Repository['wikimedia_mariadb'],

The problem here is subtle. The notify => Exec['apt-get update'] in apt::repository means that the file "/etc/apt/sources.list.d/${name}.list" has to exist before apt-get update is run, and adding require => Apt::Repository['wikimedia_mariadb'] to mariadb::packages means that all mariadb packages will be installed after that file gets added. However, the mariadb packages currently have no relationship with apt-get update.

This means puppet may try to install all mariadb packages after adding /etc/apt/sources.list.d/wikimedia_mariadb.list, but before running apt-get update. Puppet will succeed with some packages (libmysqlclient18), fail on others, and then create unresolvable dependency conflicts (dependency hell) on the next run (after apt-get update has run). These are the errors seen here: https://phabricator.wikimedia.org/P376
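
Until the ordering is fixed in puppet, a manual guard on a fresh box might look something like the sketch below. This is not what the puppet modules do; the function name and the stamp file are made up for illustration. The idea is just to check whether the repository file is newer than the last known apt-get update before letting puppet install packages.

```shell
# Sketch only: decide whether `apt-get update` still needs to run after a
# sources file has been added. Both paths are supplied by the caller; the
# stamp file is a hypothetical marker touched after each successful update.

# Exit status 0 means "index is stale, run apt-get update".
apt_index_stale() {
    sources_file=$1
    stamp_file=$2
    [ -e "$sources_file" ] || return 1   # repo file not added yet: nothing to refresh
    [ -e "$stamp_file" ] || return 0     # never updated since the stamp was introduced
    [ "$sources_file" -nt "$stamp_file" ]
}

# Hypothetical usage before a puppet run (stamp path is an assumption):
#   repo=/etc/apt/sources.list.d/wikimedia_mariadb.list
#   stamp=/var/tmp/apt-updated.stamp
#   if apt_index_stale "$repo" "$stamp"; then
#       apt-get update && touch "$stamp"
#   fi
```

This only papers over the race for a single host; the puppet-side fix is still needed.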

To fix this we'd have to refactor apt::repository and apt.

From apt, we'd need to remove:

exec { 'apt-get update':
...
}

and move it to its own class, say, apt::update. Then we'd add an anchor to apt::repository that requires the apt::update class:

anchor { "apt::repository::${name}":
    require => Class['apt::update'],
}

This will likely cause new problems in places where the implicit assumption that apt-get update has been run has not formerly caused issues.

Change 195779 had a related patch set uploaded (by Thcipriani):
Ensure apt update before sql libraries install

https://gerrit.wikimedia.org/r/195779

  1. Provision box, sign puppet, first run, etc
  2. xtrabackup clone & prepare data from another server
  3. Start MariaDB service, wait for replication to catch up
  4. mediawiki-config commit -- pool server with low load for warm up
  5. mediawiki-config commit -- raise server to normal load

https://wikitech.wikimedia.org/wiki/Setting_up_a_MySQL_replica

The catch is that step #2 should really be from a depooled slave so it completes in reasonable time without saturating the network. I.e., an extra depool/repool step for another node in there.
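
For step 3, "wait for replication to catch up" could be scripted roughly as follows. This is a sketch, not an existing tool: the function names and the lag threshold are assumptions. It relies only on the Seconds_Behind_Master field that SHOW SLAVE STATUS reports.

```shell
# Sketch only: helpers for waiting until a replica has caught up.

# Print the Seconds_Behind_Master value from `SHOW SLAVE STATUS\G` output
# read on stdin (prints NULL when replication is not running).
seconds_behind_master() {
    awk -F': *' '/^[[:space:]]*Seconds_Behind_Master:/ { print $2; exit }'
}

# Exit status 0 when the lag read from stdin is numeric and at most $1
# seconds (default 0); non-zero for NULL, missing, or larger values.
replica_caught_up() {
    max_lag=${1:-0}
    lag=$(seconds_behind_master)
    case $lag in
        ''|*[!0-9]*) return 1 ;;   # NULL or absent: not replicating
    esac
    [ "$lag" -le "$max_lag" ]
}

# Hypothetical usage on the new replica (threshold and sleep are arbitrary):
#   until mysql -e 'SHOW SLAVE STATUS\G' | replica_caught_up 5; do
#       sleep 10
#   done
```

Treating NULL as "not caught up" means the loop keeps waiting if replication has stopped, so a real script would probably also want a timeout.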

Next steps: make manual steps ↑ as painless as possible.

(Thoughts on how to handle manual steps for staging cluster at T88702#1109512)

See also: T73212: Make it possible to quickly and programmatically pool and depool application servers

@Springle How would this be handled for master, where there's nowhere to clone from?

Masters and slaves are the same. Cloning a new master is not really a thing; instead, a new server always comes online as a slave, and may be promoted to master at some later point.

It would be nice to make de/re-pooling not require a mediawiki-config commit. That's been discussed from time to time, and the effort to bring proxies into the mix will help.

greg set Security to None.

Currently using:

classes:
  - mariadb::packages_wmf
  - mariadb::config

Manual steps used:

/opt/wmf-mariadb10/install
cd /opt/wmf-mariadb10 && ./scripts/mysql_install_db --defaults-file=/etc/my.cnf
service mysql start

mv "$HOME/.my.cnf" "$HOME/.my.cnf.bak" && \
  mysqladmin -u root password "$(grep ^password "$HOME/.my.cnf.bak" | cut -c12-)" && \
  mv "$HOME/.my.cnf.bak" "$HOME/.my.cnf"
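
The cut -c12- above assumes the line is spelled exactly "password = " (11 characters) before the value. A slightly more forgiving way to pull the value out might be the following sketch; the function name is made up, and it does not handle quoted values or trailing whitespace.

```shell
# Sketch only: print the first "password" value found in a my.cnf-style
# file, tolerating optional whitespace around the "=" sign.
mycnf_password() {
    sed -n 's/^[[:space:]]*password[[:space:]]*=[[:space:]]*//p' "$1" | head -n 1
}

# Hypothetical usage in place of the grep | cut pipeline:
#   mysqladmin -u root password "$(mycnf_password "$HOME/.my.cnf.bak")"
```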

Without an initial master db from which to clone, initial staging master setup for mediawiki will have to be done via something similar to:

#!/usr/bin/env bash
# Look up the MariaDB root password for staging-db1 in hiera via staging-palladium.
sql_root_pass=$(ssh staging-palladium -- RUBYLIB=/var/lib/puppet/lib hiera mariadb::config::password ::hostname=staging-db1 ::instanceproject=staging -c /etc/puppet/hiera.yaml)
# Placeholder credentials for the wiki's own database user.
db_user='wiki'
db_pass='wikipass'
# Run the MediaWiki installer against the staging master.
[path-to-]/maintenance/install.php --dbname mediawiki \
  --dbserver staging-db1.eqiad.wmflabs \
  --dbtype mysql --dbuser "${db_user}" \
  --dbpass "${db_pass}" \
  --installdbpass "${sql_root_pass}" \
  --installdbuser root

greg changed the task status from Open to Stalled. Apr 29 2015, 4:37 PM

Change 195779 abandoned by Thcipriani:
Ensure apt update before sql libraries install

https://gerrit.wikimedia.org/r/195779