
Upgrade the Hadoop coordinators to Debian Buster
Closed, Resolved · Public

Description

This is probably the most challenging of the Buster upgrades :)

In the past we just took an outage for the duration of the reimage (stopping hive/oozie and some functionality of Druid/Superset/etc.), so we may do the same this time as well. Alternatively, we could test our procedure for failing over all services, including the Meta database:

https://wikitech.wikimedia.org/wiki/Analytics/Systems/Cluster/Mysql_Meta#Failover (but replacing the db1108 bits with an-coord1002)

We can proceed with an-coord1002's upgrade first, preserving data on all partitions except root, and then quickly test it via a rapid hive failover. Then we could move all the Analytics Meta clients to 1002, as well as hive/presto/etc., and finally reimage an-coord1001.

Event Timeline

Restricted Application added a subscriber: Aklapper.

Change 677121 had a related patch set uploaded (by Elukey; author: Elukey):

[operations/puppet@production] install_server: add reuse recipe for an-coord100* nodes

https://gerrit.wikimedia.org/r/677121

Change 677121 merged by Elukey:

[operations/puppet@production] install_server: add reuse recipe for an-coord100* nodes

https://gerrit.wikimedia.org/r/677121

Change 677122 had a related patch set uploaded (by Elukey; author: Elukey):

[operations/puppet@production] install_server: set Buster for an-(master|coord) nodes

https://gerrit.wikimedia.org/r/677122

Change 677122 merged by Elukey:

[operations/puppet@production] install_server: set Buster for an-(master|coord) nodes

https://gerrit.wikimedia.org/r/677122

The an-coord1002 host is now running Buster, and puppet seems to work fine (already tested in Hadoop test). The only thing that I had to do was to chown mysql:mysql /srv/sqldata, since after the reimage the mysql user got a different uid (same thing for the group).
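Since the uid/gid can change across a reimage while the preserved files keep the old numeric owner, a quick ownership check on the preserved data is cheap. A minimal sketch, assuming a find that supports -user; the helper name not_owned_by is hypothetical and not part of any existing tooling:

```shell
# Hypothetical helper: list up to N (default 10) paths under DIR that are not
# owned by USER. Empty output means everything under DIR is owned by USER.
not_owned_by() {
    find "$1" ! -user "$2" | head -n "${3:-10}"
}

# Example on the reimaged host (path and user taken from this task):
#   not_owned_by /srv/sqldata mysql
```

If anything is listed, the chown mentioned above (recursive, if files below the directory are affected too) should bring it back in line.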

One thing that differs between 1001 and 1002 is the partition layout. The former has two separate partitions for /srv and /var/lib/mysql, while the latter has only a single /srv partition.

/dev/mapper/an--coord1001--vg-srv    102G   46G   56G  46% /srv
/dev/mapper/an--coord1001--vg-mysql   59G   49G   11G  82% /var/lib/mysql

Since the Data Persistence team stores things under /srv, I'd suggest making it uniform and folding the separate /var/lib/mysql partition into /srv. We could do something like:

  • stop all services etc..
  • copy /var/lib/mysql under /srv
  • delete the /var/lib/mysql partition
  • expand the /srv volume and ext4 partition

This could be done before the reimage, and then we could re-use the partman recipe that I used for 1002 (preserving /srv only) for 1001 as well. I know that @Ottomata has some reservations about /srv so I'll not proceed :)
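Before copying, it is worth confirming that the MySQL data fits in the free space on /srv. Per the df output above, /var/lib/mysql holds 49G and /srv has 56G available. A minimal sketch of that check; the helper name fits_in and the optional margin argument are hypothetical:

```shell
# Hypothetical helper: succeed if the data to copy fits in the available space.
#   $1 = KiB to copy, $2 = KiB available, $3 = optional safety margin in KiB
fits_in() {
    [ $(( $1 + ${3:-0} )) -le "$2" ]
}

# On the host the inputs could come from GNU coreutils (paths from this task):
#   used_kb=$(du -sk /var/lib/mysql | cut -f1)
#   avail_kb=$(df -k --output=avail /srv | tail -n 1)
# Here the rounded figures from the df output above are used instead.
if fits_in $((49 * 1024 * 1024)) $((56 * 1024 * 1024)); then
    echo "fits"
else
    echo "does not fit"
fi
```

With those figures the copy fits with roughly 7G to spare, so a margin larger than that would make the check fail.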

Another thought - I don't think that we are completely ready for a database failover, since it will surely require some prep work. We could simply drain the cluster and announce a half-hour outage for some tools (like Superset), reimage to Buster with the new partition layout, and then set up a failover day as a separate project later on.

I know that @Ottomata has some reservations about /srv so I'll not proceed :)

Oh oh oh, no no, my reservations are not valid here. We use /srv at WMF and we should be consistent and do it too. My reservations are more philosophical about that choice for WMF than about whether we should do it for our analytics systems.

I'd suggest making it uniform and folding the separate /var/lib/mysql partition into /srv.

+1 let's do it.

Change 679356 had a related patch set uploaded (by Elukey; author: Elukey):

[operations/puppet@production] aptrepo: add component libmysql-java to buster-wikimedia

https://gerrit.wikimedia.org/r/679356

Change 679356 merged by Elukey:

[operations/puppet@production] aptrepo: add component libmysql-java to buster-wikimedia

https://gerrit.wikimedia.org/r/679356

Change 679368 had a related patch set uploaded (by Elukey; author: Elukey):

[operations/puppet@production] bigtop::mysql_jdbc: use component/libmysql-java for buster

https://gerrit.wikimedia.org/r/679368

Change 679387 had a related patch set uploaded (by Elukey; author: Elukey):

[analytics/refinery@master] Move sqoop-mediawiki-tables back to the com.mysql.jdbc.Driver

https://gerrit.wikimedia.org/r/679387

Change 679368 merged by Elukey:

[operations/puppet@production] bigtop::mysql_jdbc: use component/libmysql-java for buster

https://gerrit.wikimedia.org/r/679368

I opened https://issues.apache.org/jira/browse/HIVE-25020 with Hive upstream: it seems that the buster mariadb jdbc driver doesn't play well with the Metastore. For the moment I have added a special component to buster called component/libmysql-java, forward porting libmysql-java from stretch to buster-wikimedia so that we can use its jars as an interim solution.

Change 679387 merged by Elukey:

[analytics/refinery@master] Move sqoop-mediawiki-tables back to the com.mysql.jdbc.Driver

https://gerrit.wikimedia.org/r/679387

Change 681358 had a related patch set uploaded (by Elukey; author: Elukey):

[operations/puppet@production] role::analytics_cluster::coordinator: move mysql to /srv/sqldata

https://gerrit.wikimedia.org/r/681358

Change 681359 had a related patch set uploaded (by Elukey; author: Elukey):

[operations/dns@master] Move analytics-hive to an-coord1002

https://gerrit.wikimedia.org/r/681359

Procedure:

  • drain the cluster of running applications
  • stop druid load timers on an-launcher1002
  • disable puppet on an-coord1001
  • disable replication on an-coord1002 and db1108
  • stop hive, presto, oozie, mysql on an-coord1001
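  • reshape partitions:
        # move the MySQL data onto /srv
        mkdir /srv/sqldata
        mv /var/lib/mysql/* /srv/sqldata
        umount /var/lib/mysql
        umount /srv
        # remove the logical volume that backed /var/lib/mysql
        lvremove /dev/an-coord1001-vg/mysql
        # grow the srv volume into the freed space, then the ext4 filesystem on it
        lvextend -l +100%FREE /dev/an-coord1001-vg/srv
        resize2fs /dev/an-coord1001-vg/srv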

The above assumes a reuse partman recipe for an-coord1001; I am going to create one now.
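Since lvremove is destructive, the reshape commands could be rehearsed first with a dry-run wrapper. This is only a sketch: run, reshape_partitions and DRY_RUN are hypothetical names, and the commands are copied from the procedure as written.

```shell
# Hypothetical dry-run wrapper: commands are only printed unless DRY_RUN=0.
DRY_RUN=${DRY_RUN:-1}

run() {
    if [ "$DRY_RUN" = "0" ]; then
        "$@"
    else
        echo "would run: $*"
    fi
}

reshape_partitions() {
    # Each step runs only if the previous one succeeded.
    # sh -c keeps the glob unexpanded until the command is really executed.
    run mkdir /srv/sqldata &&
    run sh -c 'mv /var/lib/mysql/* /srv/sqldata' &&
    run umount /var/lib/mysql &&
    run umount /srv &&
    run lvremove /dev/an-coord1001-vg/mysql &&
    run lvextend -l +100%FREE /dev/an-coord1001-vg/srv &&
    run resize2fs /dev/an-coord1001-vg/srv
}

# Rehearsal (prints the seven commands, executes nothing):
#   DRY_RUN=1 reshape_partitions
# Real run, as root and only after the services above are stopped:
#   DRY_RUN=0 reshape_partitions
```

Chaining with && means a failed umount (for example because a process still holds files open) stops the sequence before lvremove. Note that resize2fs may ask for an e2fsck -f pass first when the filesystem is unmounted.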

Change 681359 merged by Elukey:

[operations/dns@master] Move analytics-hive to an-coord1002

https://gerrit.wikimedia.org/r/681359

Change 681396 had a related patch set uploaded (by Elukey; author: Elukey):

[operations/puppet@production] install_server: add custom recipe for an-coord1001

https://gerrit.wikimedia.org/r/681396

Change 681396 merged by Elukey:

[operations/puppet@production] install_server: add custom recipe for an-coord1001

https://gerrit.wikimedia.org/r/681396

Change 681358 merged by Elukey:

[operations/puppet@production] role::analytics_cluster::coordinator: move mysql to /srv/sqldata

https://gerrit.wikimedia.org/r/681358

Script wmf-auto-reimage was launched by elukey on cumin1001.eqiad.wmnet for hosts:

['an-coord1001.eqiad.wmnet']

The log can be found in /var/log/wmf-auto-reimage/202104210707_elukey_27139.log.

Completed auto-reimage of hosts:

['an-coord1001.eqiad.wmnet']

Of which those FAILED:

['an-coord1001.eqiad.wmnet']

Script wmf-auto-reimage was launched by elukey on cumin1001.eqiad.wmnet for hosts:

['an-coord1001.eqiad.wmnet']

The log can be found in /var/log/wmf-auto-reimage/202104210737_elukey_32351.log.

Completed auto-reimage of hosts:

['an-coord1001.eqiad.wmnet']

and were ALL successful.

All done! I'll follow up on https://issues.apache.org/jira/browse/HIVE-25020 but we should be good :)

@Ottomata @razzi there is some potential cleanup to do in /srv:

elukey@an-coord1001:~$ sudo du -hs /srv/* | sort -h
4.0K	/srv/tmp
16K	/srv/lost+found
6.0M	/srv/event-schemas
57M	/srv/presto
313M	/srv/superset_production_1608135231.sql   <===============
362M	/srv/superset_production_1613687763.sql   <===============
480M	/srv/mediawiki-config
865M	/srv/backup
7.1G	/srv/backup_hivemeta           <===============
8.0G	/srv/an-coord1002-backup   <===============
32G	/srv/deployment
48G	/srv/sqldata

Can you double check whether the entries marked with arrows are ok to drop?
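To get an idea of how much space the marked entries would free, their sizes can be summed. A small sketch, assuming du -h style suffixes (K, M, G, T); the helper name sum_mib is hypothetical:

```shell
# Hypothetical helper: read du -h style sizes (e.g. 313M, 7.1G) from stdin,
# one per line, and print their total in MiB.
sum_mib() {
    awk '
        {
            n = $1 + 0                    # numeric part, e.g. 7.1 from "7.1G"
            u = substr($1, length($1))    # last character, i.e. the unit suffix
            if (u == "K") n /= 1024
            else if (u == "G") n *= 1024
            else if (u == "T") n *= 1024 * 1024
            else if (u != "M") n /= 1024 * 1024   # no suffix: plain bytes
            total += n
        }
        END { printf "%.0f\n", total }
    '
}

# Sizes of the four entries marked in the du output above:
printf '%s\n' 313M 362M 7.1G 8.0G | sum_mib
```

For the four marked entries this comes to 16137 MiB, a bit under 16G.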

Mentioned in SAL (#wikimedia-analytics) [2021-04-27T08:33:31Z] <elukey> run mysql_upgrade for analytics-meta on an-coord1002 (should be part of the upgrade process) - T278424

Mentioned in SAL (#wikimedia-analytics) [2021-04-29T14:57:31Z] <elukey> run mysql_upgrade on an-coord1001 to complete the buster upgrade - T278424