Prepare the Hadoop Analytics cluster for Kerberos
Closed, Resolved | Public | 21 Estimated Story Points

Description

This task should list all the steps to take before enabling Kerberos:

  • Create user principals for various teams - T237605
  • Create system users (e.g. analytics-search) and deploy them and their keytabs to hosts
  • T238306
  • Kerberize all the nodes that need it, and deploy the keytabs generated in the step above (this can be done separately, and it is a good way to test that Kerberos works on all the nodes).
  • Add analytics users (without SSH access) to all Hadoop worker nodes
  • Add TLS keys for Master nodes if needed (may be needed only for workers, to be checked)
  • Add druid and analytics search system users to all Hadoop worker nodes
  • Disable Meta DB backup to HDFS
  • T231208
  • Find a solution for labstore crons
  • T234229
  • Create Oozie automation to get a snapshot of the current jobs and restart them
  • T237271
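
The suggestion above to test that Kerberos works on each node could be scripted along these lines. This is a minimal sketch, not the procedure used in this task: it assumes MIT Kerberos `kinit` is on the PATH, and the function names `kinit_command` and `keytab_works` are hypothetical.

```python
import subprocess


def kinit_command(principal, keytab):
    """Build the kinit argv that requests a ticket using a keytab."""
    # -k: authenticate with a key from a keytab instead of a password prompt
    # -t: path of the keytab file to read the key from
    return ["kinit", "-k", "-t", keytab, principal]


def keytab_works(principal, keytab, run=subprocess.run):
    """Return True if kinit exits with status 0 for this principal/keytab.

    `run` is injectable (an assumption of this sketch) so the check can be
    exercised without a real KDC.
    """
    result = run(kinit_command(principal, keytab), capture_output=True, text=True)
    return result.returncode == 0
```

On a host, the call would look like `keytab_works("presto/analytics1030.eqiad.wmnet@WIKIMEDIA", "/etc/security/keytabs/presto/presto.keytab")`, using the Presto principal and keytab path quoted later in this task. Note that `kinit` writes to the calling user's default credential cache, so it is best run as the service user that owns the keytab.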

Details

Repo                Branch       Lines +/-
operations/puppet   production   +3 -2
analytics/refinery  master       +1 K -90
operations/puppet   production   +11 -3
operations/puppet   production   +136 -2
operations/puppet   production   +63 -0
operations/puppet   production   +11 -1
operations/puppet   production   +8 -4
operations/puppet   production   +9 -0
operations/puppet   production   +1 -1
operations/puppet   production   +11 -1
operations/puppet   production   +9 -1
operations/puppet   production   +22 -0
operations/puppet   production   +18 -0
operations/puppet   production   +97 -2
operations/puppet   production   +25 -0
operations/puppet   production   +6 -0
operations/puppet   production   +1 -0
operations/puppet   production   +7 -0

Event Timeline

Change 548293 had a related patch set uploaded (by Elukey; owner: Elukey):
[operations/puppet@production] Add analytics users (without ssh keys) to all Hadoop worker nodes

https://gerrit.wikimedia.org/r/548293

Change 548294 had a related patch set uploaded (by Elukey; owner: Elukey):
[operations/puppet@production] Add druid/analytics/search system users to all Hadoop worker nodes

https://gerrit.wikimedia.org/r/548294

fdans moved this task from Incoming to Operational Excellence on the Analytics board.

Change 548293 merged by Elukey:
[operations/puppet@production] Add analytics users (without ssh keys) to all Hadoop worker nodes

https://gerrit.wikimedia.org/r/548293

Change 548294 merged by Elukey:
[operations/puppet@production] Add druid/analytics/search system users to all Hadoop worker nodes

https://gerrit.wikimedia.org/r/548294

elukey updated the task description.

Change 548759 had a related patch set uploaded (by Elukey; owner: Elukey):
[operations/puppet@production] Add TLS certificates to Hadoop Analytics master nodes

https://gerrit.wikimedia.org/r/548759

Change 548759 merged by Elukey:
[operations/puppet@production] Add TLS certificates to Hadoop Analytics master nodes

https://gerrit.wikimedia.org/r/548759

Change 549083 had a related patch set uploaded (by Elukey; owner: Elukey):
[operations/puppet@production] Include Kerberos profiles in the Analytics infrastructure

https://gerrit.wikimedia.org/r/549083

Change 549083 merged by Elukey:
[operations/puppet@production] Include Kerberos profiles in the Analytics infrastructure

https://gerrit.wikimedia.org/r/549083

Change 549103 had a related patch set uploaded (by Elukey; owner: Elukey):
[operations/puppet@production] Deploy Kerberos keytabs to Analytics Hadoop hosts

https://gerrit.wikimedia.org/r/549103

Change 549103 merged by Elukey:
[operations/puppet@production] Deploy Kerberos keytabs to Analytics Hadoop hosts

https://gerrit.wikimedia.org/r/549103

Change 549565 had a related patch set uploaded (by Elukey; owner: Elukey):
[operations/puppet@production] Deploy kerberos keytabs on stat100[4,5,7]

https://gerrit.wikimedia.org/r/549565

Change 549565 merged by Elukey:
[operations/puppet@production] Deploy kerberos keytabs on stat100[4,5,7]

https://gerrit.wikimedia.org/r/549565

Change 549566 had a related patch set uploaded (by Elukey; owner: Elukey):
[operations/puppet@production] Enable Kerberos in Hadoop Analytics and Druid Analytics/Public

https://gerrit.wikimedia.org/r/549566

Change 550099 had a related patch set uploaded (by Elukey; owner: Elukey):
[operations/puppet@production] profile::kerberos::client: add MOTD to help users

https://gerrit.wikimedia.org/r/550099

Change 550099 merged by Elukey:
[operations/puppet@production] profile::kerberos::client: add MOTD to help users

https://gerrit.wikimedia.org/r/550099

Change 550713 had a related patch set uploaded (by Elukey; owner: Elukey):
[operations/puppet@production] Deploy Kerberos keytabs for analytics-search and presto system users

https://gerrit.wikimedia.org/r/550713

Change 550713 merged by Elukey:
[operations/puppet@production] Deploy Kerberos keytabs for analytics-search and presto system users

https://gerrit.wikimedia.org/r/550713

Change 550945 had a related patch set uploaded (by Joal; owner: Joal):
[analytics/refinery@master] Update hive and spark oozie jobs for kerberos

https://gerrit.wikimedia.org/r/550945

Change 551392 had a related patch set uploaded (by Elukey; owner: Elukey):
[operations/puppet@production] hadoop: add analytics kerberos keytab for test/prod master host

https://gerrit.wikimedia.org/r/551392

Change 551392 merged by Elukey:
[operations/puppet@production] hadoop: add analytics kerberos keytab for test/prod master host

https://gerrit.wikimedia.org/r/551392

Change 551398 had a related patch set uploaded (by Elukey; owner: Elukey):
[operations/puppet@production] profile::analytics::cluster::client: fix nagios' sudo permissions

https://gerrit.wikimedia.org/r/551398

Change 551398 merged by Elukey:
[operations/puppet@production] profile::analytics::cluster::client: fix nagios' sudo permissions

https://gerrit.wikimedia.org/r/551398

I was able to make Presto run with the following config settings on analytics1030:

/etc/presto/catalog/analytics_test_hive.properties:

hive.metastore.authentication.type=KERBEROS
hive.metastore.service.principal=hive/analytics1030.eqiad.wmnet@WIKIMEDIA
hive.metastore.client.principal=presto/analytics1030.eqiad.wmnet@WIKIMEDIA
hive.metastore.client.keytab=/etc/security/keytabs/presto/presto.keytab

hive.hdfs.authentication.type=KERBEROS
hive.hdfs.impersonation.enabled=true
hive.hdfs.presto.principal=presto/analytics1030.eqiad.wmnet@WIKIMEDIA
hive.hdfs.presto.keytab=/etc/security/keytabs/presto/presto.keytab
hive.hdfs.wire-encryption.enabled=true

/etc/presto/config.properties:

http-server.authentication.type=KERBEROS
http.server.authentication.krb5.service-name=presto
http.server.authentication.krb5.keytab=/etc/security/keytabs/presto/presto.keytab
http.authentication.krb5.config=/etc/krb5.conf

TLS seems optional, which makes sense since the Kerberos authentication exchange is itself protected, but it means that the traffic between the user and the Presto coordinator is unencrypted. I think we can live with this for the moment, and possibly review it after Kerberos is enabled?
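
Settings like the ones above are easy to get subtly wrong when copied between hosts (a repeated key, or a principal without its keytab). A small sanity check could look like the following. This is only a sketch under stated assumptions: the helper names are hypothetical, the parser handles simple `key=value` lines only, and the required-key lists are taken from the properties shown in this comment rather than from an exhaustive reading of the Presto documentation.

```python
from collections import Counter

# Keys that appear alongside each KERBEROS switch in the comment above.
REQUIRED_WHEN_KERBEROS = {
    "hive.metastore.authentication.type": [
        "hive.metastore.service.principal",
        "hive.metastore.client.principal",
        "hive.metastore.client.keytab",
    ],
    "hive.hdfs.authentication.type": [
        "hive.hdfs.presto.principal",
        "hive.hdfs.presto.keytab",
    ],
}


def parse_properties(text):
    """Return (key, value) pairs, skipping blank lines and # comments."""
    pairs = []
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue
        key, sep, value = line.partition("=")
        if sep:
            pairs.append((key.strip(), value.strip()))
    return pairs


def duplicate_keys(pairs):
    """Return the sorted list of keys that occur more than once."""
    counts = Counter(key for key, _ in pairs)
    return sorted(key for key, n in counts.items() if n > 1)


def missing_kerberos_keys(pairs):
    """Return required keys that are absent for each enabled KERBEROS switch."""
    values = dict(pairs)
    missing = []
    for switch, required in REQUIRED_WHEN_KERBEROS.items():
        if values.get(switch) == "KERBEROS":
            missing.extend(key for key in required if key not in values)
    return sorted(missing)


# Illustrative input only: a shortened fragment with one repeated key
# and one keytab line deliberately left out.
SAMPLE = """
hive.hdfs.authentication.type=KERBEROS
hive.hdfs.impersonation.enabled=true
hive.hdfs.presto.principal=presto/analytics1030.eqiad.wmnet@WIKIMEDIA
hive.hdfs.impersonation.enabled=true
"""
```

Calling `duplicate_keys(parse_properties(SAMPLE))` reports the repeated `hive.hdfs.impersonation.enabled` key, and `missing_kerberos_keys(parse_properties(SAMPLE))` reports the absent `hive.hdfs.presto.keytab`. The same two calls can be pointed at the contents of a real catalog file read from disk.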

Change 551870 had a related patch set uploaded (by Elukey; owner: Elukey):
[operations/puppet@production] role::analytics_test_cluster_coordinator: enable Kerberos for Presto

https://gerrit.wikimedia.org/r/551870

Change 551870 merged by Elukey:
[operations/puppet@production] role::analytics_test_cluster_coordinator: enable Kerberos for Presto

https://gerrit.wikimedia.org/r/551870

Change 551881 had a related patch set uploaded (by Elukey; owner: Elukey):
[operations/puppet@production] role::analytics_test_cluster::coordinator: fix Presto kerberos config

https://gerrit.wikimedia.org/r/551881

Change 551881 merged by Elukey:
[operations/puppet@production] role::analytics_test_cluster::coordinator: fix Presto kerberos config

https://gerrit.wikimedia.org/r/551881

Change 551957 had a related patch set uploaded (by Elukey; owner: Elukey):
[operations/puppet@production] Add hdfs kerberos keytab to Analytics Hadoop coordinators

https://gerrit.wikimedia.org/r/551957

Change 551957 merged by Elukey:
[operations/puppet@production] Add hdfs kerberos keytab to Analytics Hadoop coordinators

https://gerrit.wikimedia.org/r/551957

Change 556190 had a related patch set uploaded (by Elukey; owner: Elukey):
[operations/puppet@production] profile::hadoop::worker: add set_yarn_dir_ownership script

https://gerrit.wikimedia.org/r/556190

Change 556190 merged by Elukey:
[operations/puppet@production] profile::hadoop::worker: add set_yarn_dir_ownership script

https://gerrit.wikimedia.org/r/556190

Change 549566 merged by Elukey:
[operations/puppet@production] Enable Kerberos in Hadoop Analytics and Druid Analytics/Public

https://gerrit.wikimedia.org/r/549566

Change 558067 had a related patch set uploaded (by Ottomata; owner: Ottomata):
[operations/puppet@production] Set g+s on all yarn user cache directories

https://gerrit.wikimedia.org/r/558067

Change 558067 merged by Ottomata:
[operations/puppet@production] Set g+s on all yarn user cache directories

https://gerrit.wikimedia.org/r/558067

Change 550945 merged by Mforns:
[analytics/refinery@master] Update hive and spark oozie jobs for kerberos

https://gerrit.wikimedia.org/r/550945

Change 558128 had a related patch set uploaded (by Ottomata; owner: Ottomata):
[operations/puppet@production] Put hive-site.xml into HDFS as analytics user

https://gerrit.wikimedia.org/r/558128

Change 558128 merged by Ottomata:
[operations/puppet@production] Put hive-site.xml into HDFS as analytics user

https://gerrit.wikimedia.org/r/558128

elukey set the point value for this task to 21. (Dec 17 2019, 9:32 AM)
elukey moved this task from In Progress to Done on the Analytics-Kanban board.
elukey moved this task from Kerberos to Done on the User-Elukey board.