Failover all db1115 services to db1215
Closed, Resolved (Public)

Event Timeline

Change 908157 had a related patch set uploaded (by Marostegui; author: Marostegui):

[operations/puppet@production] kormat/bashrc.wmf: Change alias location

https://gerrit.wikimedia.org/r/908157

Change 908157 merged by Marostegui:

[operations/puppet@production] kormat/bashrc.wmf: Change alias location

https://gerrit.wikimedia.org/r/908157

Change 909324 had a related patch set uploaded (by Jcrespo; author: Jcrespo):

[operations/puppet@production] .bashrc: Change alias location

https://gerrit.wikimedia.org/r/909324

I am going to create a specific task for the switchover itself and use this one for the "services" migrations. It is a complex process and I want to split it for my own sanity.
However, given that tomorrow is the designated last day for maintenance before the DC switch back to eqiad, I am not going to attempt this today, as many things can break.

Change 909964 had a related patch set uploaded (by Marostegui; author: Marostegui):

[operations/software@master] section: Update zarcillo location

https://gerrit.wikimedia.org/r/909964

Change 909964 merged by Marostegui:

[operations/software@master] section: Update zarcillo location

https://gerrit.wikimedia.org/r/909964

Change 909965 had a related patch set uploaded (by Marostegui; author: Marostegui):

[operations/software@master] host-to-instance: Change zarcillo location

https://gerrit.wikimedia.org/r/909965

Change 909965 merged by jenkins-bot:

[operations/software@master] host-to-instance: Change zarcillo location

https://gerrit.wikimedia.org/r/909965

Change 909966 had a related patch set uploaded (by Marostegui; author: Marostegui):

[operations/software@master] check-master-heartbeat.sh: Change zarcillo location

https://gerrit.wikimedia.org/r/909966

Change 909966 merged by Marostegui:

[operations/software@master] check-master-heartbeat.sh: Change zarcillo location

https://gerrit.wikimedia.org/r/909966

Change 909967 had a related patch set uploaded (by Marostegui; author: Marostegui):

[operations/puppet@production] common.yaml: Add db1215 to mysql clients

https://gerrit.wikimedia.org/r/909967

Change 909967 merged by Marostegui:

[operations/puppet@production] common.yaml: Add db1215 to mysql clients

https://gerrit.wikimedia.org/r/909967

Change 909972 had a related patch set uploaded (by Marostegui; author: Marostegui):

[operations/puppet@production] prometheus.yaml: Change zarcillo location

https://gerrit.wikimedia.org/r/909972

Change 909972 merged by Marostegui:

[operations/puppet@production] prometheus.yaml: Change zarcillo location

https://gerrit.wikimedia.org/r/909972

Change 910076 had a related patch set uploaded (by Cwhite; author: Cwhite):

[operations/puppet@production] prometheus: Change zarcillo location

https://gerrit.wikimedia.org/r/910076

Change 910076 merged by Cwhite:

[operations/puppet@production] prometheus: Change zarcillo location

https://gerrit.wikimedia.org/r/910076

Change 910414 had a related patch set uploaded (by Marostegui; author: Marostegui):

[operations/software@master] change_mw_mysql_pass.sh: Change zarcillo host

https://gerrit.wikimedia.org/r/910414

Change 910414 merged by jenkins-bot:

[operations/software@master] change_mw_mysql_pass.sh: Change zarcillo host

https://gerrit.wikimedia.org/r/910414

Change 911693 had a related patch set uploaded (by Marostegui; author: Marostegui):

[operations/software@master] switchover-tmpl.py: Replace zarcillo host

https://gerrit.wikimedia.org/r/911693

Change 911693 merged by jenkins-bot:

[operations/software@master] switchover-tmpl.py: Replace zarcillo host

https://gerrit.wikimedia.org/r/911693

This ID is so easy to remember that the task goes so smoothly

In T334455#8815619, @I wrote:

This ID is so easy to remember that the task goes so smoothly

Please don't post unhelpful and off topic comments on phabricator tasks.

Change 917313 had a related patch set uploaded (by Marostegui; author: Marostegui):

[operations/puppet@production] orchestrator: Change database

https://gerrit.wikimedia.org/r/917313

Change 917320 had a related patch set uploaded (by Marostegui; author: Marostegui):

[operations/software/wmfmariadbpy@master] switchover.py: Replace zarcillo host

https://gerrit.wikimedia.org/r/917320

Change 917313 merged by Marostegui:

[operations/puppet@production] orchestrator: Change database

https://gerrit.wikimedia.org/r/917313

Change 917320 merged by Marostegui:

[operations/software/wmfmariadbpy@master] switchover.py: Replace zarcillo host

https://gerrit.wikimedia.org/r/917320

Change 917434 had a related patch set uploaded (by Marostegui; author: Marostegui):

[operations/software@master] report_users.sh: Change zarcillo host

https://gerrit.wikimedia.org/r/917434

Change 909324 merged by Marostegui:

[operations/puppet@production] .bashrc: Change alias location

https://gerrit.wikimedia.org/r/909324

Change 917434 merged by Marostegui:

[operations/software@master] report_users.sh: Change zarcillo host

https://gerrit.wikimedia.org/r/917434

I think everything is failed over. In around 1h I am going to stop mariadb on db1115 and see if anything breaks.
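
Before stopping the old host, one way to look for leftover references is to search local checkouts of the affected repositories for its hostname. The following is only a minimal sketch: `find_references` is a hypothetical helper, not part of wmfdb or wmfmariadbpy, and the two demo files and their contents are made up for illustration.

```python
import tempfile
from pathlib import Path


def find_references(root: Path, needle: str) -> list[tuple[Path, int, str]]:
    """Return (file, line number, stripped line) for each line under root containing needle."""
    hits = []
    for path in sorted(root.rglob("*")):
        if not path.is_file():
            continue
        try:
            text = path.read_text(encoding="utf-8")
        except (UnicodeDecodeError, OSError):
            continue  # skip binary or unreadable files
        for number, line in enumerate(text.splitlines(), start=1):
            if needle in line:
                hits.append((path, number, line.strip()))
    return hits


# Demo on a throwaway directory; file names and contents are invented.
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    (root / "example_a.sh").write_text("ZARCILLO_HOST=db1115.eqiad.wmnet\n")
    (root / "example_b.py").write_text("zarcillo_host = 'db1215.eqiad.wmnet'\n")
    leftovers = [(p.name, n, l) for p, n, l in find_references(root, "db1115")]

print(leftovers)
```

A text search like this only covers what is checked out locally; stopping mariadb and watching for breakage, as planned above, is still what catches clients configured elsewhere.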

Change 917587 had a related patch set uploaded (by Marostegui; author: Marostegui):

[operations/puppet@production] db1115: Disable notifications

https://gerrit.wikimedia.org/r/917587

Change 917587 merged by Marostegui:

[operations/puppet@production] db1115: Disable notifications

https://gerrit.wikimedia.org/r/917587

Mentioned in SAL (#wikimedia-operations) [2023-05-09T08:40:01Z] <marostegui> Stop mariadb on db1115 (old zarcillo master) T334455

Change 917833 had a related patch set uploaded (by Marostegui; author: Marostegui):

[operations/software/wmfdb@master] db_mysql.py: Replace zarcillo master

https://gerrit.wikimedia.org/r/917833

Change 917833 merged by jenkins-bot:

[operations/software/wmfdb@master] db_mysql.py: Replace zarcillo master

https://gerrit.wikimedia.org/r/917833

I don't know how wmfdb is deployed. Do we need a release and packaging of it too?

To be honest, I don't know; there is not much documentation: https://wikitech.wikimedia.org/wiki/Wmfdb
From this task https://phabricator.wikimedia.org/T304915 I understand it is "just" a matter of creating the debs and installing them.

root@cumin1001:/usr# dpkg -l | grep wmfdb
ii  python3-wmfdb                        0.1.2+deb11u1                  amd64        Libraries for interacting with WMF's mariadb deployments
ii  wmfdb-admin                          0.1.2+deb11u1                  amd64        Utilities for maintaining WMF's mariadb deployments

In the debian directory of the wmfdb repo there are Debian packaging files:

compat				changelog			source
control				python3-wmfdb.install		wmfdb-admin.install
copyright			rules				wmfdb-admin.lintian-overrides

I am not sure if @MoritzMuehlenhoff was involved and might recall something, or could help us figure this out.
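
The installed versions shown in the `dpkg -l` output above can also be extracted programmatically, for example to compare them across hosts. This is a minimal sketch that assumes the standard column layout of `dpkg -l` (status, name, version, architecture, description); `parse_dpkg_list` is a hypothetical helper, not something shipped with wmfdb.

```python
def parse_dpkg_list(output: str) -> dict[str, str]:
    """Map package name to version for installed ("ii") rows of `dpkg -l` output."""
    versions = {}
    for line in output.splitlines():
        fields = line.split()
        # Each row is: status, name, version, architecture, description...
        # Only rows whose status is "ii" describe fully installed packages.
        if len(fields) >= 4 and fields[0] == "ii":
            versions[fields[1]] = fields[2]
    return versions


# Sample input copied from the dpkg output quoted in this task.
sample = """\
ii  python3-wmfdb                        0.1.2+deb11u1                  amd64        Libraries for interacting with WMF's mariadb deployments
ii  wmfdb-admin                          0.1.2+deb11u1                  amd64        Utilities for maintaining WMF's mariadb deployments
"""

installed = parse_dpkg_list(sample)
print(installed)
```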

I wasn't involved in this, but I can take care of updating the deb; just let me know. It also currently seems to lack puppet integration: at least, I don't see anything in puppet.git which installs it on the cumin hosts.

@Ladsgroup I want to close this task; you are following up on the new version of wmfmariadbpy in T336174.
Do you want me to create a specific task for packaging wmfdb?

Sorry, it looks like I dropped the ball on this. I'm not being lazy; wmfdb just has so many changes since the last release, and I want to package and release it when you're around in case things go sideways. It's my top priority today, and I don't mind either way regarding the ticket.

Change 920214 had a related patch set uploaded (by Ladsgroup; author: Amir Sarabadani):

[operations/software/wmfdb@master] Prepare for v0.1.3 release

https://gerrit.wikimedia.org/r/920214

Change 920214 merged by jenkins-bot:

[operations/software/wmfdb@master] Prepare for v0.1.3 release

https://gerrit.wikimedia.org/r/920214

ladsgroup@build2001:~/wmfdb$ ls  -la /var/cache/pbuilder/result/bullseye-amd64/*wmfdb*
-rw-r--r-- 1 ladsgroup wikidev  6540 May 16 19:14 /var/cache/pbuilder/result/bullseye-amd64/python3-wmfdb_0.1.3_amd64.deb
-rw-r--r-- 1 ladsgroup wikidev  6311 May 16 19:14 /var/cache/pbuilder/result/bullseye-amd64/wmfdb_0.1.3_amd64.buildinfo
-rw-r--r-- 1 ladsgroup wikidev  1888 May 16 19:14 /var/cache/pbuilder/result/bullseye-amd64/wmfdb_0.1.3_amd64.changes
-rw-r--r-- 1 ladsgroup wikidev   702 May 16 19:14 /var/cache/pbuilder/result/bullseye-amd64/wmfdb_0.1.3.dsc
-rw-r--r-- 1 ladsgroup wikidev  1176 May 16 19:14 /var/cache/pbuilder/result/bullseye-amd64/wmfdb_0.1.3_source.changes
-rw-r--r-- 1 ladsgroup wikidev 19456 May 16 19:14 /var/cache/pbuilder/result/bullseye-amd64/wmfdb_0.1.3.tar.xz
-rw-r--r-- 1 ladsgroup wikidev  4412 May 16 19:14 /var/cache/pbuilder/result/bullseye-amd64/wmfdb-admin_0.1.3_amd64.deb

Pushed to apt. Gonna try it out in cumin2002 tomorrow

<3

ladsgroup@cumin1001:~$ sudo debdeploy deploy -u 2023-05-17-wmfdb.yaml -s cumin-all
Rolling out wmfdb:
Non-daemon update, no service restart needed

python3-wmfdb was updated: 0.1.2+deb11u1 -> 0.1.3
  cumin1001.eqiad.wmnet (1 hosts)

wmfdb-admin was updated: 0.1.2+deb11u1 -> 0.1.3
  cumin1001.eqiad.wmnet (1 hosts)

These hosts are already up-to-date:
  cumin2002.codfw.wmnet (1 hosts)

The package to be updated isn't installed on these hosts:
  cloudcumin2001.codfw.wmnet,cloudcumin1001.eqiad.wmnet,cuminunpriv1001.eqiad.wmnet (3 hosts)

ladsgroup@cumin1001:~$

It should be done now.