
Bring an-coord100[3-4] into service
Closed, Resolved (Public)

Description

We have two new Hadoop coordinator servers, ready to be brought into service.

These are: an-coord100[3-4]

Acceptance criteria

The following services have been migrated to one or both of an-coord100[3-4]:

  • hive-server2
  • hive-metastore
  • presto-coordinator

Event Timeline

Gehel triaged this task as High priority. Nov 15 2023, 9:46 AM

I have made new kerberos principals and keytabs for the new coordinators.

hive/an-coord1003.eqiad.wmnet@WIKIMEDIA
analytics/an-coord1003.eqiad.wmnet@WIKIMEDIA
hadoop/an-coord1003.eqiad.wmnet@WIKIMEDIA
presto/an-coord1003.eqiad.wmnet@WIKIMEDIA
hive/an-coord1004.eqiad.wmnet@WIKIMEDIA
analytics/an-coord1004.eqiad.wmnet@WIKIMEDIA
hadoop/an-coord1004.eqiad.wmnet@WIKIMEDIA
presto/an-coord1004.eqiad.wmnet@WIKIMEDIA
Entry for principal hive/an-coord1003.eqiad.wmnet@WIKIMEDIA with kvno 1, encryption type aes256-cts-hmac-sha1-96 added to keytab WRFILE:/srv/kerberos/keytabs/an-coord1003.eqiad.wmnet/hive/hive.keytab.
Entry for principal analytics/an-coord1003.eqiad.wmnet@WIKIMEDIA with kvno 1, encryption type aes256-cts-hmac-sha1-96 added to keytab WRFILE:/srv/kerberos/keytabs/an-coord1003.eqiad.wmnet/analytics/analytics.keytab.
Entry for principal hadoop/an-coord1003.eqiad.wmnet@WIKIMEDIA with kvno 1, encryption type aes256-cts-hmac-sha1-96 added to keytab WRFILE:/srv/kerberos/keytabs/an-coord1003.eqiad.wmnet/hadoop/hadoop.keytab.
Entry for principal presto/an-coord1003.eqiad.wmnet@WIKIMEDIA with kvno 1, encryption type aes256-cts-hmac-sha1-96 added to keytab WRFILE:/srv/kerberos/keytabs/an-coord1003.eqiad.wmnet/presto/presto.keytab.
Entry for principal hive/an-coord1004.eqiad.wmnet@WIKIMEDIA with kvno 1, encryption type aes256-cts-hmac-sha1-96 added to keytab WRFILE:/srv/kerberos/keytabs/an-coord1004.eqiad.wmnet/hive/hive.keytab.
Entry for principal analytics/an-coord1004.eqiad.wmnet@WIKIMEDIA with kvno 1, encryption type aes256-cts-hmac-sha1-96 added to keytab WRFILE:/srv/kerberos/keytabs/an-coord1004.eqiad.wmnet/analytics/analytics.keytab.
Entry for principal hadoop/an-coord1004.eqiad.wmnet@WIKIMEDIA with kvno 1, encryption type aes256-cts-hmac-sha1-96 added to keytab WRFILE:/srv/kerberos/keytabs/an-coord1004.eqiad.wmnet/hadoop/hadoop.keytab.
Entry for principal presto/an-coord1004.eqiad.wmnet@WIKIMEDIA with kvno 1, encryption type aes256-cts-hmac-sha1-96 added to keytab WRFILE:/srv/kerberos/keytabs/an-coord1004.eqiad.wmnet/presto/presto.keytab.
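
The principals and keytab paths above follow a regular pattern of service, host FQDN and realm. A minimal Python sketch of that pattern follows, in case it helps when checking that nothing was missed. The function names are hypothetical and are not part of any existing tooling; the realm, keytab root and service list are copied from the output above.

```python
from pathlib import PurePosixPath

# Values taken from the kadmin output above.
REALM = "WIKIMEDIA"
KEYTAB_ROOT = PurePosixPath("/srv/kerberos/keytabs")
SERVICES = ("hive", "analytics", "hadoop", "presto")


def principal(service, fqdn, realm=REALM):
    """Build a host-based principal name, e.g. hive/<fqdn>@WIKIMEDIA."""
    return f"{service}/{fqdn}@{realm}"


def keytab_path(service, fqdn, root=KEYTAB_ROOT):
    """Build the keytab location: <root>/<fqdn>/<service>/<service>.keytab."""
    return root / fqdn / service / f"{service}.keytab"


def expected_entries(hosts, services=SERVICES):
    """List every (principal, keytab path) pair expected for the given hosts."""
    return [(principal(s, h), keytab_path(s, h)) for h in hosts for s in services]
```

Calling expected_entries() with the two new FQDNs should yield eight pairs, one per "Entry for principal" line above.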

I am going to look at whether it's possible/preferable to have both of an-coord100[3-4] in the same role analytics_cluster::coordinator.
I think that this might be easier now that we have migrated the analytics_meta MariaDB away from an-coord100[1-2]. At the moment I can't see the point of retaining the analytics_cluster::coordinator::replica role if we don't need it.

It is crucial that the hive and presto keytabs deployed to the new coordinators contain principals matching the service names (the DNS CNAMEs) that we use for running these services.

This allows us to fail over the hive and presto services between the hosts, with only a DNS change.

On the kerberos kdc we can see the hive keytabs for the two existing coordinators, and the two new coordinators.

root@krb1001:/srv/kerberos/keytabs# find an-coord100*/hive/hive.keytab -exec klist -k {} \;
Keytab name: FILE:an-coord1001.eqiad.wmnet/hive/hive.keytab
KVNO Principal
---- --------------------------------------------------------------------------
   1 hive/an-coord1001.eqiad.wmnet@WIKIMEDIA
   1 hive/analytics-hive.eqiad.wmnet@WIKIMEDIA
Keytab name: FILE:an-coord1002.eqiad.wmnet/hive/hive.keytab
KVNO Principal
---- --------------------------------------------------------------------------
   1 hive/an-coord1002.eqiad.wmnet@WIKIMEDIA
   1 hive/analytics-hive.eqiad.wmnet@WIKIMEDIA
Keytab name: FILE:an-coord1003.eqiad.wmnet/hive/hive.keytab
KVNO Principal
---- --------------------------------------------------------------------------
   1 hive/an-coord1003.eqiad.wmnet@WIKIMEDIA
Keytab name: FILE:an-coord1004.eqiad.wmnet/hive/hive.keytab
KVNO Principal
---- --------------------------------------------------------------------------
   1 hive/an-coord1004.eqiad.wmnet@WIKIMEDIA
root@krb1001:/srv/kerberos/keytabs#

Note that the keytab files for an-coord100[1-2] contain not only the FQDN based principal, but also hive/analytics-hive.eqiad.wmnet@WIKIMEDIA

I need to add these manually to the new keytabs, before adding them to the secrets repo.

As per the guidelines here: https://wikitech.wikimedia.org/wiki/Data_Engineering/Systems/Kerberos/Administration#Create_a_custom_principal_and_keytab_entry

Adding the existing hive/analytics-hive.eqiad.wmnet@WIKIMEDIA principal to the new keytabs.

root@krb1001:/srv/kerberos/keytabs# kadmin.local ktadd -norandkey -k an-coord1003.eqiad.wmnet/hive/hive.keytab hive/analytics-hive.eqiad.wmnet@WIKIMEDIA
Entry for principal hive/analytics-hive.eqiad.wmnet@WIKIMEDIA with kvno 1, encryption type aes256-cts-hmac-sha1-96 added to keytab WRFILE:an-coord1003.eqiad.wmnet/hive/hive.keytab.

root@krb1001:/srv/kerberos/keytabs# kadmin.local ktadd -norandkey -k an-coord1004.eqiad.wmnet/hive/hive.keytab hive/analytics-hive.eqiad.wmnet@WIKIMEDIA
Entry for principal hive/analytics-hive.eqiad.wmnet@WIKIMEDIA with kvno 1, encryption type aes256-cts-hmac-sha1-96 added to keytab WRFILE:an-coord1004.eqiad.wmnet/hive/hive.keytab.

Now this looks correct.

root@krb1001:/srv/kerberos/keytabs# find an-coord100*/hive/hive.keytab -exec klist -k {} \;
Keytab name: FILE:an-coord1001.eqiad.wmnet/hive/hive.keytab
KVNO Principal
---- --------------------------------------------------------------------------
   1 hive/an-coord1001.eqiad.wmnet@WIKIMEDIA
   1 hive/analytics-hive.eqiad.wmnet@WIKIMEDIA
Keytab name: FILE:an-coord1002.eqiad.wmnet/hive/hive.keytab
KVNO Principal
---- --------------------------------------------------------------------------
   1 hive/an-coord1002.eqiad.wmnet@WIKIMEDIA
   1 hive/analytics-hive.eqiad.wmnet@WIKIMEDIA
Keytab name: FILE:an-coord1003.eqiad.wmnet/hive/hive.keytab
KVNO Principal
---- --------------------------------------------------------------------------
   1 hive/an-coord1003.eqiad.wmnet@WIKIMEDIA
   1 hive/analytics-hive.eqiad.wmnet@WIKIMEDIA
Keytab name: FILE:an-coord1004.eqiad.wmnet/hive/hive.keytab
KVNO Principal
---- --------------------------------------------------------------------------
   1 hive/an-coord1004.eqiad.wmnet@WIKIMEDIA
   1 hive/analytics-hive.eqiad.wmnet@WIKIMEDIA

There is something funny with the presto keytab for an-coord1002. It looks like we have the shared service name presto/analytics-presto.eqiad.wmnet@WIKIMEDIA correct, but the host name part is that of an-coord1001.

root@krb1001:/srv/kerberos/keytabs# klist -k an-coord1002.eqiad.wmnet/presto/presto.keytab 
Keytab name: FILE:an-coord1002.eqiad.wmnet/presto/presto.keytab
KVNO Principal
---- --------------------------------------------------------------------------
   1 presto/an-coord1001.eqiad.wmnet@WIKIMEDIA
   1 presto/analytics-presto.eqiad.wmnet@WIKIMEDIA

This isn't deployed at the moment anyway: https://github.com/wikimedia/operations-puppet/blob/production/hieradata/role/common/analytics_cluster/coordinator/replica.yaml#L35-L44
...so I don't know whether it's worth fixing, but it's noteworthy.
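
A mismatch like this one can be detected mechanically, because the keytab's directory already names the host and the service. The sketch below derives the expected host principal from the relative keytab path and checks whether it is present. It assumes the <fqdn>/<service>/<service>.keytab layout seen on krb1001; the function names are hypothetical.

```python
from pathlib import PurePosixPath


def expected_host_principal(keytab, realm="WIKIMEDIA"):
    """Derive <service>/<fqdn>@<realm> from a path ending <fqdn>/<service>/<file>."""
    parts = PurePosixPath(keytab).parts
    fqdn, service = parts[-3], parts[-2]
    return f"{service}/{fqdn}@{realm}"


def has_own_host_principal(keytab, principals, realm="WIKIMEDIA"):
    """True if the keytab lists the host principal matching its own directory."""
    return expected_host_principal(keytab, realm) in principals
```

For the an-coord1002 presto keytab listed above, has_own_host_principal() should return False, since the only host-based entry names an-coord1001.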

I've added the service principals to the presto keytabs for the new coordinators.

root@krb1001:/srv/kerberos/keytabs# kadmin.local ktadd -norandkey -k an-coord1003.eqiad.wmnet/presto/presto.keytab presto/analytics-presto.eqiad.wmnet@WIKIMEDIA
Entry for principal presto/analytics-presto.eqiad.wmnet@WIKIMEDIA with kvno 1, encryption type aes256-cts-hmac-sha1-96 added to keytab WRFILE:an-coord1003.eqiad.wmnet/presto/presto.keytab.

root@krb1001:/srv/kerberos/keytabs# kadmin.local ktadd -norandkey -k an-coord1004.eqiad.wmnet/presto/presto.keytab presto/analytics-presto.eqiad.wmnet@WIKIMEDIA
Entry for principal presto/analytics-presto.eqiad.wmnet@WIKIMEDIA with kvno 1, encryption type aes256-cts-hmac-sha1-96 added to keytab WRFILE:an-coord1004.eqiad.wmnet/presto/presto.keytab.
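
The ktadd invocations used for hive and presto differ only in the keytab path and the service principal. As an illustration only, this Python sketch builds the same command lines for a list of hosts. It does not run anything, and the helper names are hypothetical. The -norandkey flag is kept because, as far as I understand it, ktadd would otherwise generate a new random key for the principal, which would invalidate the copies already present in the an-coord100[1-2] keytabs.

```python
import shlex


def ktadd_command(keytab, principal):
    """Return the argv used above to add an existing principal to a keytab."""
    return ["kadmin.local", "ktadd", "-norandkey", "-k", keytab, principal]


def service_ktadd_commands(hosts, service, service_principal):
    """Build one printable ktadd command line per host for a shared principal."""
    commands = []
    for fqdn in hosts:
        # Relative keytab path, as used from /srv/kerberos/keytabs on krb1001.
        keytab = f"{fqdn}/{service}/{service}.keytab"
        commands.append(shlex.join(ktadd_command(keytab, service_principal)))
    return commands
```

Printing the result of service_ktadd_commands() for the two new hosts, the presto service and the presto/analytics-presto.eqiad.wmnet@WIKIMEDIA principal should reproduce the two commands shown above.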

Adding these new keytabs to the puppet secret repo now.

Change 979086 had a related patch set uploaded (by Btullis; author: Btullis):

[operations/puppet@production] Bring an-coord1003 into service as a hadoop coordinator

https://gerrit.wikimedia.org/r/979086

Change 979087 had a related patch set uploaded (by Muehlenhoff; author: Muehlenhoff):

[operations/puppet@production] Add explicit Hiera records to mark the new coordinator nodes as running Puppet 7

https://gerrit.wikimedia.org/r/979087

Change 979087 merged by Muehlenhoff:

[operations/puppet@production] Add explicit Hiera records to mark the new coordinator nodes as running Puppet 7

https://gerrit.wikimedia.org/r/979087

Change 979088 had a related patch set uploaded (by Btullis; author: Btullis):

[labs/private@master] Add dummy keytabs for new hadoop coordinators

https://gerrit.wikimedia.org/r/979088

Change 979088 merged by Btullis:

[labs/private@master] Add dummy keytabs for new hadoop coordinators

https://gerrit.wikimedia.org/r/979088

Change 979086 merged by Btullis:

[operations/puppet@production] Bring an-coord1003 into service as a hadoop coordinator

https://gerrit.wikimedia.org/r/979086

Mentioned in SAL (#wikimedia-analytics) [2023-12-04T13:53:29Z] <btullis> bringing an-coord1003 into service as an analytics_cluster::coordinator for T336045

Change 979973 had a related patch set uploaded (by Btullis; author: Btullis):

[operations/puppet@production] Prevent removal of python2 on hadoop coordinators

https://gerrit.wikimedia.org/r/979973

Change 979973 merged by Btullis:

[operations/puppet@production] Prevent removal of python2 on hadoop coordinators

https://gerrit.wikimedia.org/r/979973

Change 980372 had a related patch set uploaded (by Btullis; author: Btullis):

[operations/puppet@production] Bring an-coord1004 into service as a hadoop coordinator

https://gerrit.wikimedia.org/r/980372

Change 980372 merged by Btullis:

[operations/puppet@production] Bring an-coord1004 into service as a hadoop coordinator

https://gerrit.wikimedia.org/r/980372

Change 980396 had a related patch set uploaded (by Btullis; author: Btullis):

[analytics/refinery/scap@master] Add the scap targets for the new hadoop coordinators

https://gerrit.wikimedia.org/r/980396

Change 980405 had a related patch set uploaded (by Btullis; author: Btullis):

[analytics/hdfs-tools/deploy@master] Update list of scap targets to match where hdfs_tools is deployed

https://gerrit.wikimedia.org/r/980405

Change 980405 merged by Ottomata:

[analytics/hdfs-tools/deploy@master] Update list of scap targets to match where hdfs_tools is deployed

https://gerrit.wikimedia.org/r/980405

I have discovered something interesting! Until now, we have only been running a single presto coordinator, which has been on an-coord1001.
Our standby server (an-coord1002) was not running a presto coordinator. It was in a different puppet role as well (analytics_cluster::coordinator::replica)

Now that an-coord1003 and an-coord1004 are in service in the analytics_cluster::coordinator role as well, they are both running presto coordinators.
However, I had expected that they were all independent of each other and that therefore the coordinators on the new hosts were unused.

It seems that I was wrong though. We already have a disaggregated coordinator configuration in place.

If I view the presto UI from an-coord1003 or an-coord1004 I can still see the 15 registered workers and can view the queries happening on the cluster. This is great because it means that our migration process should be simpler. We still want to end up with clients connecting to https://analytics-presto.eqiad.wmnet in order to register, but it means that we can upgrade the settings on each of the presto coordinators individually and then test, before switching the hostname here in wmfdata.

I think that we are ready to move the analytics-hive.eqiad.wmnet DNS CNAME from an-coord1001 to an-coord1003.

I have tested by running sudo -u analytics kerberos-run-command analytics hive on an-coord1003 and I can see that the hive process is indeed talking to the hive metastore process on its own hostname, which is what is specified in /etc/hive/conf/hive-site.xml

btullis@an-coord1003:/var/log/hive$ ps -ef | grep -i CliDriver
analyti+  954223  954221 99 15:15 pts/0    00:00:25 /usr/lib/jvm/java-8-openjdk-amd64/jre/bin/java -Xmx256m -Dhadoop.log.dir=/usr/lib/hadoop/logs -Dhadoop.log.file=hadoop.log -Dhadoop.home.dir=/usr/lib/hadoop -Dhadoop.id.str= -Dhadoop.root.logger=INFO,console -Djava.library.path=/usr/lib/hadoop/lib/native -Dhadoop.policy.file=hadoop-policy.xml -Dproc_hivecli -Dlog4j2.formatMsgNoLookups=true -Dlog4j.configurationFile=hive-log4j2.properties -Djava.util.logging.config.file=/usr/lib/hive/conf/parquet-logging.properties -Dhadoop.security.logger=INFO,NullAppender org.apache.hadoop.util.RunJar /usr/lib/hive/lib/hive-cli-2.3.6.jar org.apache.hadoop.hive.cli.CliDriver

btullis@an-coord1003:/var/log/hive$ sudo lsof -p 954223 | grep ESTABLISHED
lsof: WARNING: can't stat() fuse.fuse_dfs file system /mnt/hdfs
      Output information may be incomplete.
java    954223 analytics  654u     IPv6          206031088      0t0       TCP an-coord1003.eqiad.wmnet:37304->an-coord1003.eqiad.wmnet:9083 (ESTABLISHED)

Based on this, I will make a patch to the DNS repository and move analytics-hive to an-coord1003.
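
Once a CNAME change like this is live, one way to confirm where the alias points is to resolve it and compare the canonical name with the intended host. The sketch below uses Python's socket.gethostbyname_ex, which returns a (hostname, aliaslist, ipaddrlist) tuple. It assumes the resolver reports the CNAME target as the primary hostname, which is typical but not guaranteed, and it only covers IPv4. The helper names are hypothetical.

```python
import socket


def resolve(alias):
    """Resolve a name; returns (canonical hostname, alias list, IPv4 addresses)."""
    return socket.gethostbyname_ex(alias)


def points_at(resolved, expected_host):
    """Check that a gethostbyname_ex-style result has the expected canonical name."""
    canonical = resolved[0]
    # Compare case-insensitively and ignore any trailing dot.
    return canonical.rstrip(".").lower() == expected_host.rstrip(".").lower()
```

From a host inside the production network, points_at(resolve("analytics-hive.eqiad.wmnet"), "an-coord1003.eqiad.wmnet") would be the check to run after the DNS change has propagated.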

Change 987152 had a related patch set uploaded (by Btullis; author: Btullis):

[operations/dns@master] Migrate analytics-hive to a new coordinator

https://gerrit.wikimedia.org/r/987152

Change 987152 merged by Btullis:

[operations/dns@master] Migrate analytics-hive to a new coordinator

https://gerrit.wikimedia.org/r/987152

Mentioned in SAL (#wikimedia-analytics) [2024-01-02T15:36:23Z] <btullis> migrating analytics-hive.eqiad.wmnet to an-coord1003 for T336045

This looks to be OK so far. I have run the same test from a stat client and I can see that the metastore connection is going to an-coord1003.

btullis@stat1004:~$ pgrep -u btullis -fa hive.cli
21412 /usr/lib/jvm/java-8-openjdk-amd64/jre/bin/java -Xmx256m -Dhadoop.log.dir=/usr/lib/hadoop/logs -Dhadoop.log.file=hadoop.log -Dhadoop.home.dir=/usr/lib/hadoop -Dhadoop.id.str= -Dhadoop.root.logger=INFO,console -Djava.library.path=/usr/lib/hadoop/lib/native -Dhadoop.policy.file=hadoop-policy.xml -Dproc_hivecli -Dlog4j.configurationFile=hive-log4j2.properties -Djava.util.logging.config.file=/usr/lib/hive/conf/parquet-logging.properties -Dhadoop.security.logger=INFO,NullAppender org.apache.hadoop.util.RunJar /usr/lib/hive/lib/hive-cli-2.3.6.jar org.apache.hadoop.hive.cli.CliDriver

btullis@stat1004:~$ sudo lsof -p 21412 | grep -i established
lsof: WARNING: can't stat() fuse.fuse_dfs file system /mnt/hdfs
      Output information may be incomplete.
java    21412 btullis  642u     IPv6          290961277      0t0       TCP stat1004.eqiad.wmnet:56352->an-coord1003.eqiad.wmnet:9083 (ESTABLISHED)
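
The same check can be scripted by extracting the remote end of each established TCP connection from the lsof output. This Python sketch assumes the line format shown above and that 9083 is the metastore port, as seen in both transcripts. The function names are hypothetical.

```python
import re

# Matches e.g. "TCP stat1004.eqiad.wmnet:56352->an-coord1003.eqiad.wmnet:9083 (ESTABLISHED)"
ESTABLISHED = re.compile(r"TCP\s+(\S+):(\d+)->(\S+):(\d+)\s+\(ESTABLISHED\)")


def established_peers(lsof_output):
    """Return (remote host, remote port) for every established TCP connection."""
    peers = []
    for line in lsof_output.splitlines():
        match = ESTABLISHED.search(line)
        if match:
            peers.append((match.group(3), int(match.group(4))))
    return peers


def metastore_hosts(lsof_output, port=9083):
    """Return the sorted set of remote hosts connected to on the metastore port."""
    return sorted({host for host, p in established_peers(lsof_output) if p == port})
```

Feeding it the lsof output from stat1004 above should give a single host, an-coord1003.eqiad.wmnet; the warning lines about /mnt/hdfs are ignored because they do not match the pattern.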

I deployed https://gerrit.wikimedia.org/r/c/operations/puppet/+/709713 so now the presto cluster is using PKI certificates.

All of the servers can now also use the alternative DNS name analytics-presto.eqiad.wmnet, because it is included in each certificate's Subject Alternative Name.

root@an-coord1004:/etc/presto/ssl# openssl x509 -in discovery__an-coord1004_eqiad_wmnet.pem -noout -text|grep -A1 'X509v3 Subject Alternative Name'
            X509v3 Subject Alternative Name: 
                DNS:analytics-presto.eqiad.wmnet, DNS:an-coord1004.eqiad.wmnet
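
To repeat this check across several hosts, the DNS names can be pulled out of the openssl text output. The Python sketch below works on the text produced by openssl x509 -noout -text and assumes the names appear on the line following the "X509v3 Subject Alternative Name" header, as in the output above. The function name is hypothetical.

```python
import re


def subject_alt_dns_names(openssl_text):
    """Extract DNS entries from the Subject Alternative Name section."""
    lines = openssl_text.splitlines()
    for index, line in enumerate(lines):
        if "X509v3 Subject Alternative Name" in line and index + 1 < len(lines):
            # The names follow on the next line as "DNS:a, DNS:b".
            return re.findall(r"DNS:([^,\s]+)", lines[index + 1])
    return []
```

A certificate is usable behind the shared name if "analytics-presto.eqiad.wmnet" appears in the returned list.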

Currently, the coordinators are not using this alternative name. Each of them is still using the Kerberos principal based on its hostname:

root@an-coord1004:/etc/presto/ssl# klist -k /etc/security/keytabs/presto/presto.keytab 
Keytab name: FILE:/etc/security/keytabs/presto/presto.keytab
KVNO Principal
---- --------------------------------------------------------------------------
   1 presto/an-coord1004.eqiad.wmnet@WIKIMEDIA
   1 presto/analytics-presto.eqiad.wmnet@WIKIMEDIA

The next step is to update the presto configuration for an-coord100[3-4] so that it uses the presto/analytics-presto.eqiad.wmnet@WIKIMEDIA principal for authentication. Once I have verified that this is OK, I can make another change to wmfdata-python to use this CNAME.

This is now waiting on: https://github.com/wikimedia/wmfdata-python/pull/50 and T345482: Wmfdata should connect to Presto using the analytics-presto CNAME.
Once that is merged, we will create a new conda-analytics package containing the updated wmfdata library and then we can plan to migrate the coordinator via a DNS change.

Change 980396 abandoned by Btullis:

[analytics/refinery/scap@master] Add the scap targets for the new hadoop coordinators

Reason:

Achieved via another commit

https://gerrit.wikimedia.org/r/980396

Change 998425 had a related patch set uploaded (by Btullis; author: Btullis):

[operations/puppet@production] Use the analytics-presto CNAME for workers and clients

https://gerrit.wikimedia.org/r/998425

Change 998425 merged by Btullis:

[operations/puppet@production] Use the analytics-presto CNAME for workers and clients

https://gerrit.wikimedia.org/r/998425

Change 998440 had a related patch set uploaded (by Btullis; author: Btullis):

[operations/dns@master] Migrate analytics-presto to a new hadoop coordinator

https://gerrit.wikimedia.org/r/998440

@Stevemunene - I created the following patch to the DNS repo: https://gerrit.wikimedia.org/r/c/operations/dns/+/998440

This is the switch to flick, when we finally want to move the presto-coordinator service from an-coord1001.

I told users in this thread (and via email) that they would have a while to update their wmfdata and/or conda-analytics environments before we flick the switch.

We will also need to update superset here when we make the switch:

image.png (653×1 px, 81 KB)

I tried using the analytics-presto CNAME here, but it didn't work, so I think that we will have to switch it to an-coord1003 manually.

BTullis added a subscriber: Stevemunene.

I have scheduled a maintenance window for Monday 26th at 11:00, when @Stevemunene and I will migrate this role to its new host.

Change 998440 merged by Btullis:

[operations/dns@master] Migrate analytics-presto to a new hadoop coordinator

https://gerrit.wikimedia.org/r/998440

We have carried out the presto coordinator migration and all went as planned.

image.png (517×1 px, 78 KB)

Although we initially started by using the cookbook to roll-restart the workers, in fact it was more effective to restart them all synchronously using cumin.

We also updated Superset manually here:

image.png (579×1 px, 55 KB)

Woohoo! Exciting to see this complete 😁