
Productionize db2096 on x1
Closed, Resolved · Public

Description

db2096 was installed (T206191) and is now ready for DBAs to take over.
This host needs to go to x1.

Event Timeline

Restricted Application added a subscriber: Aklapper. Oct 10 2018, 5:29 AM
Marostegui triaged this task as Medium priority. Oct 10 2018, 5:30 AM
Marostegui moved this task from Triage to Next on the DBA board.
Banyek claimed this task. Oct 10 2018, 9:14 AM
Banyek moved this task from Backlog to next on the User-Banyek board. Oct 10 2018, 10:09 AM

Change 466846 had a related patch set uploaded (by Banyek; owner: Banyek):
[operations/puppet@production] mariadb: productionize db2096

https://gerrit.wikimedia.org/r/466846

Change 466847 had a related patch set uploaded (by Banyek; owner: Banyek):
[operations/mediawiki-config@master] mariadb: produtionize db2096

https://gerrit.wikimedia.org/r/466847

Change 466856 had a related patch set uploaded (by Banyek; owner: Banyek):
[operations/puppet@production] mariadb: reimage db2096

https://gerrit.wikimedia.org/r/466856

Banyek moved this task from next to In progress on the User-Banyek board. Oct 12 2018, 5:09 PM

Mentioned in SAL (#wikimedia-operations) [2018-10-15T07:32:12Z] <banyek> reimaging db2096(T206593)

Change 466856 merged by Banyek:
[operations/puppet@production] mariadb: reimage db2096

https://gerrit.wikimedia.org/r/466856

Script wmf-auto-reimage was launched by banyek on neodymium.eqiad.wmnet for hosts:

['db2096.codfw.wmnet']

The log can be found in /var/log/wmf-auto-reimage/201810150748_banyek_22243.log.

Banyek moved this task from Next to In progress on the DBA board. Oct 15 2018, 7:51 AM

Completed auto-reimage of hosts:

['db2096.codfw.wmnet']

and were ALL successful.

Change 466846 merged by Banyek:
[operations/puppet@production] mariadb: productionize db2096

https://gerrit.wikimedia.org/r/466846

Change 467288 had a related patch set uploaded (by Banyek; owner: Banyek):
[operations/mediawiki-config@master] mariadb: depool db2069

https://gerrit.wikimedia.org/r/467288

Mentioned in SAL (#wikimedia-operations) [2018-10-15T08:48:54Z] <banyek> depooling db2033 (T206593)

Change 467288 merged by Banyek:
[operations/mediawiki-config@master] mariadb: depool db2069

https://gerrit.wikimedia.org/r/467288

Mentioned in SAL (#wikimedia-operations) [2018-10-15T08:58:50Z] <banyek@deploy1001> Synchronized wmf-config/db-codfw.php: T206593: depooling db2069 (duration: 00m 48s)

db2096 is being recloned from db2069

Change 466847 merged by Banyek:
[operations/mediawiki-config@master] mariadb: productionize db2096

https://gerrit.wikimedia.org/r/466847

Mentioned in SAL (#wikimedia-operations) [2018-10-15T13:30:50Z] <banyek@deploy1001> Synchronized wmf-config/db-eqiad.php: T206593: adding db2096 to hosts (and repooling db2069) (duration: 00m 49s)

Mentioned in SAL (#wikimedia-operations) [2018-10-15T13:32:00Z] <banyek@deploy1001> Synchronized wmf-config/db-codfw.php: T206593: adding db2096 to hosts (and repooling db2069) (duration: 00m 49s)

What's pending here?

Nothing, we just agreed yesterday not to put it into production until today, iirc.

Make sure you enable notifications first and update zarcillo DB before repooling.

Is there a tool for updating zarcillo with the host, or should I just 'INSERT INTO ...'?

Marostegui added a subscriber: jcrespo. Edited Oct 16 2018, 8:34 AM

Is there a tool for updating zarcillo with the host, or should I just 'INSERT INTO ...'?

I would ask @jcrespo to be sure.
You also need to update tendril:

mwmaint1002:/home/jynus/home-terbium/iron/tendril/bin# host=db20XX; bash tendril-host-add.sh $host.codfw.wmnet 3306 tendril | mysql -h db1115 tendril && bash tendril-host-enable.sh $host.codfw.wmnet 3306 | mysql -h db1115 tendril
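
As a hedged sketch, the one-liner above can be wrapped in a dry-run helper that only prints the pipeline for a given host, so it can be reviewed before anything is piped into tendril. The function name `tendril_add_cmd` and its two arguments are hypothetical; the script names, port 3306, and `db1115` are taken unchanged from the command above, and nothing is executed.

```shell
# Hypothetical dry-run helper: prints the tendril registration pipeline
# for one host instead of running it. Script names, port, and db1115 are
# copied from the command quoted in this task; the function is illustrative.
tendril_add_cmd() {
    host="$1"   # short hostname, e.g. db2096
    dc="$2"     # datacenter, e.g. codfw
    fqdn="${host}.${dc}.wmnet"
    printf 'bash tendril-host-add.sh %s 3306 tendril | mysql -h db1115 tendril && bash tendril-host-enable.sh %s 3306 | mysql -h db1115 tendril\n' "$fqdn" "$fqdn"
}

# Print the command for db2096 in codfw; review it, then run it by hand.
tendril_add_cmd db2096 codfw
```

Printing first rather than executing keeps the `db20XX` placeholder from being run by accident.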

ok, waiting on @jcrespo's answer then I'll update zarcillo and tendril at once

Change 467722 had a related patch set uploaded (by Banyek; owner: Banyek):
[operations/puppet@production] mariadb: enable notifications for db2096

https://gerrit.wikimedia.org/r/467722

I have added it to tendril.

Banyek added a comment. Edited Oct 17 2018, 11:47 AM

After checking the zarcillo database, I believe the following queries will add the host there as well; however, some points are still unclear to me:

INSERT INTO instances (name, server, port) VALUES ('db2096','db2096.codfw.wmnet',3306);
INSERT INTO section_instances (instance, section) VALUES ('db2096','x1');
INSERT INTO servers (fqdn, hostname, dc) VALUES ('db2096.codfw.wmnet', 'db2096', 'codfw');

Will the remaining columns that default to NULL be filled in automatically later, or do we have to add initial values for them? See:

instances.version
instances.last_start
servers.ipv4
servers.ipv6
servers.last_boot

@jcrespo is it OK to run these 3 inserts?

@jcrespo is it OK to run these 3 inserts?

Yes
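
For future hosts, the three statements discussed above could be generated rather than typed by hand. The following is a minimal sketch, not an existing tool: the function name `zarcillo_inserts` and its arguments are hypothetical, and the table and column names are the ones quoted in this task. Two things are assumptions on my part: the `servers` row is emitted first on the guess that `instances.server` refers to `servers.fqdn`, and the statements are wrapped in a transaction so that a failure does not leave a half-registered host. Columns that default to NULL are left untouched.

```shell
# Hypothetical helper: print the zarcillo INSERT statements for one host.
# Table/column names are taken from this task; statement order and the
# transaction wrapper are assumptions, not confirmed zarcillo practice.
zarcillo_inserts() {
    host="$1"            # short hostname, e.g. db2096
    dc="$2"              # datacenter, e.g. codfw
    section="$3"         # section name, e.g. x1
    port="${4:-3306}"    # MariaDB port, defaults to 3306
    fqdn="${host}.${dc}.wmnet"
    cat <<EOF
START TRANSACTION;
INSERT INTO servers (fqdn, hostname, dc) VALUES ('${fqdn}', '${host}', '${dc}');
INSERT INTO instances (name, server, port) VALUES ('${host}', '${fqdn}', ${port});
INSERT INTO section_instances (instance, section) VALUES ('${host}', '${section}');
COMMIT;
EOF
}

# Print the statements for db2096 in x1; review them before piping to mysql.
zarcillo_inserts db2096 codfw x1
```

The output is plain SQL on stdout, so it can be checked by eye and then piped into a `mysql` client pointed at the zarcillo database.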

Change 467954 had a related patch set uploaded (by Banyek; owner: Banyek):
[operations/mediawiki-config@master] mariadb: enable db2096

https://gerrit.wikimedia.org/r/467954

Merging this patch is also needed for enabling notifications
https://gerrit.wikimedia.org/r/#/c/operations/puppet/+/467722/

Mentioned in SAL (#wikimedia-operations) [2018-10-17T12:56:10Z] <banyek> enabling notifications on db2096 (T206593)

Change 467722 merged by Banyek:
[operations/puppet@production] mariadb: enable notifications for db2096

https://gerrit.wikimedia.org/r/467722

Mentioned in SAL (#wikimedia-operations) [2018-10-17T15:28:11Z] <banyek> enabling db2096 for cluster x1 (T206593)

Mentioned in SAL (#wikimedia-operations) [2018-10-17T15:30:59Z] <banyek@deploy1001> Synchronized wmf-config/db-codfw.php: T206593: Enabling db2096 for x1 (duration: 00m 56s)

Change 467954 merged by jenkins-bot:
[operations/mediawiki-config@master] mariadb: enable db2096

https://gerrit.wikimedia.org/r/467954

Mentioned in SAL (#wikimedia-operations) [2018-10-17T15:34:30Z] <banyek@deploy1001> Synchronized wmf-config/db-codfw.php: T206593: Enabling db2096 for x1 (duration: 00m 56s)

Banyek closed this task as Resolved. Oct 17 2018, 3:35 PM