
Set up a Kerberos KDC service in production with minimal puppet automation
Closed, Resolved | Public | 21 Estimated Story Points

Description

Set up a Kerberos KDC service in production with minimal puppet automation (for example, requiring manual provisioning of the keytabs/principals).

Event Timeline

Nuria triaged this task as High priority. Dec 18 2018, 9:32 PM
Nuria created this task.

Created a subtask to request a Ganeti VM on which to deploy a bare-minimum Kerberos service. The goal is to use it only for testing with the Hadoop test cluster; the final prod service will probably be a setup of two bare-metal hosts.

@MoritzMuehlenhoff kerberos1001 created (with role::spare::system), let's coordinate (whenever you have time) about how to proceed :)

Milimetric raised the priority of this task from High to Needs Triage. Mar 14 2019, 4:24 PM
Milimetric triaged this task as High priority.
Milimetric moved this task from Operational Excellence to Deprioritized on the Analytics board.
Milimetric moved this task from Deprioritized to Operational Excellence on the Analytics board.
elukey added a comment. Jun 3 2019, 6:09 PM

Added the following to the Analytics VLAN firewall rules:

elukey@re0.cr1-eqiad# show | compare
[edit firewall family inet filter analytics-in4]
       term schema { ... }
+      term kerberos {
+          from {
+              destination-address {
+                  /* kerberos1001 */
+                  10.64.0.182/32;
+              }
+              protocol tcp;
+              destination-port 88;
+          }
+          then accept;
+      }
       term default { ... }

I was able to kinit with my user from analytics1030!

Change 514092 had a related patch set uploaded (by Elukey; owner: Elukey):
[operations/puppet@production] Add profile::kerberos::client to the Hadoop testing cluster

https://gerrit.wikimedia.org/r/514092

Change 514092 merged by Elukey:
[operations/puppet@production] Add profile::kerberos::client to the Hadoop testing cluster

https://gerrit.wikimedia.org/r/514092

Change 470566 had a related patch set uploaded (by Elukey; owner: Muehlenhoff):
[operations/puppet@production] kerberos: add script to generate service principals/keytabs

https://gerrit.wikimedia.org/r/470566

Added the following to the Analytics VLAN firewall rules:

elukey@re0.cr1-eqiad# show | compare
[edit firewall family inet filter analytics-in4]
       term schema { ... }
+      term kerberos {
+          from {
+              destination-address {
+                  /* kerberos1001 */
+                  10.64.0.182/32;
+              }
+              protocol tcp;
+              destination-port 88;

Kerberos primarily uses UDP and falls back to TCP (e.g. for protocol messages that exceed UDP size limits, similar to DNS), so we should also allow UDP here.

elukey added a comment. Edited Jun 4 2019, 3:57 PM

Done!

elukey@re0.cr2-eqiad# show | compare
[edit firewall family inet filter analytics-in4 term kerberos from]
-       protocol tcp;
+       protocol [ tcp udp ];

kinit seems much faster now, I think I was falling back to TCP in my previous tries.
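To double-check the TCP half of a rule like the one above from a client host, a small probe along these lines could be used. This is only a sketch: the helper name `tcp_port_open` is hypothetical, and it says nothing about UDP, which cannot be verified with a plain connect and is better tested with kinit itself.

```python
import socket


def tcp_port_open(host: str, port: int = 88, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port can be established.

    Port 88 is the standard Kerberos KDC port. A False result can mean a
    firewall drop, a refused connection, or a name resolution failure.
    """
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False
```

For example, `tcp_port_open("kerberos1001.eqiad.wmnet")` would be the call to run from a host in the Analytics VLAN; the fully qualified name is an assumption based on the other hostnames in this task.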

elukey moved this task from Next Up to In Progress on the Analytics-Kanban board.
elukey added a comment. Edited Jun 4 2019, 4:50 PM

Adding some ideas about a possible layout:

  • keytabs will be stored on puppetmaster1001's puppet private repo under /srv/private/modules/secrets/secrets/kerberos/FQDN/role (basically grouping keytabs by host and role - for example, analytics1028.eqiad.wmnet/hadoop/hdfs.keytab).
  • every puppet kerberos client will have a hiera argument to set whether or not puppet should look into the private repo to deploy keytabs. If set, the keytabs will be deployed in a pre-defined location (/etc/security/kerberos/keytabs/.. for example).
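
As an illustration of the proposed grouping by host and role, the private repo path of a keytab could be derived as follows. This is a minimal sketch, not the puppet code: the function name and its parameters are hypothetical, and only the base directory and the FQDN/role/keytab ordering come from the layout described above.

```python
from pathlib import PurePosixPath

# Base directory in the puppet private repo, as proposed in the layout above.
PRIVATE_REPO_BASE = PurePosixPath("/srv/private/modules/secrets/secrets/kerberos")


def private_repo_keytab_path(fqdn: str, role: str, service: str) -> PurePosixPath:
    """Return the private repo location of a keytab, grouped by host and role."""
    for part in (fqdn, role, service):
        # Reject empty or path-like components so the result stays under the base.
        if not part or "/" in part or part in (".", ".."):
            raise ValueError(f"invalid path component: {part!r}")
    return PRIVATE_REPO_BASE / fqdn / role / f"{service}.keytab"
```

Calling `private_repo_keytab_path("analytics1028.eqiad.wmnet", "hadoop", "hdfs")` yields the example path from the first bullet, rooted at the base directory.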

If the above is acceptable, there is now the problem of how keytabs will get to the puppet private repo. They are generated via a script, but we have two options:

  1. the script keeps using kadmin.local, runs on kerberos1001 (or the future KDC master) and generates keytabs under something like /srv/kerberos/keytabs/etc... A manual rsync to puppetmaster1001 is required to copy keytabs over.
  2. the script is deployed on puppetmaster1001, and uses kadmin. It authenticates via kerberos to the kadmind daemon on kerberos1001 (so puppetmaster1001 will need to be a kerberos client) and generates the keytabs directly on the host.
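
For option 1, the core of such a script could look roughly like the sketch below. It is not the script from the Gerrit changes in this task: the function name, the realm, and the keytab path used in the example are assumptions. It only builds the `kadmin.local` invocations (`addprinc -randkey` to create the principal, `ktadd -k` to export its keys) and leaves running them to the caller.

```python
from typing import List


def kadmin_local_commands(service: str, fqdn: str, realm: str,
                          keytab_path: str) -> List[List[str]]:
    """Build kadmin.local commands that create a service principal and its keytab.

    The commands are returned as argument lists (suitable for subprocess.run)
    rather than executed, so they can be reviewed or logged first.
    """
    principal = f"{service}/{fqdn}@{realm}"
    return [
        # Create the principal with a random key instead of a password.
        ["kadmin.local", "-q", f"addprinc -randkey {principal}"],
        # Export the principal's keys to the given keytab file.
        ["kadmin.local", "-q", f"ktadd -k {keytab_path} {principal}"],
    ]
```

On the KDC host, each returned list could be passed to `subprocess.run(cmd, check=True)` as root. The realm (for instance `EXAMPLE.ORG`) and the output directory are placeholders that would come from the real deployment.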

@MoritzMuehlenhoff Any preference/suggestion/etc.? Maybe option 1) is cleaner, since it avoids adding Kerberos client dependencies to puppetmaster1001. Otherwise let me know your idea if what I wrote doesn't make sense :)

Adding some ideas about a possible layout:

  • keytabs will be stored on puppetmaster1001's puppet private repo under /srv/private/modules/secrets/secrets/kerberos/FQDN/role (basically grouping keytabs by host and role - for example, analytics1028.eqiad.wmnet/hadoop/hdfs.keytab).

+1

  • every puppet kerberos client will have a hiera argument to set whether or not puppet should look into the private repo to deploy keytabs. If set, the keytabs will be deployed in a pre-defined location (/etc/security/kerberos/keytabs/.. for example).

+1

If the above is acceptable, there is now the problem of how keytabs will get to the puppet private repo. They are generated via a script, but we have two options:

  1. the script keeps using kadmin.local, runs on kerberos1001 (or the future KDC master) and generates keytabs under something like /srv/kerberos/keytabs/etc... A manual rsync to puppetmaster1001 is required to copy keytabs over.
  2. the script is deployed on puppetmaster1001, and uses kadmin. It authenticates via kerberos to the kadmind daemon on kerberos1001 (so puppetmaster1001 will need to be a kerberos client) and generates the keytabs directly on the host.

Option 2 seems like a good solution for a later full rollout beyond the experimental setup, but until then we should probably implement something along the lines of option 1 (rsync or SFTP) to contain Kerberos to the test environment. Having to do a manual pull also seems fine, given that it's only needed when we add new kerberized servers and/or new principals (both of which should be relatively rare).

Change 514447 had a related patch set uploaded (by Elukey; owner: Elukey):
[operations/puppet@production] profile::kerberos::kadminserver: allow puppetmaster to rsync keytabs

https://gerrit.wikimedia.org/r/514447

Change 514471 had a related patch set uploaded (by Elukey; owner: Elukey):
[operations/puppet@production] profile::kerberos::kadminserver: add generate_keytabs.py

https://gerrit.wikimedia.org/r/514471

Change 514447 merged by Elukey:
[operations/puppet@production] profile::kerberos::kadminserver: allow puppetmaster to rsync keytabs

https://gerrit.wikimedia.org/r/514447

Change 514471 merged by Elukey:
[operations/puppet@production] profile::kerberos::kadminserver: add generate_keytabs.py

https://gerrit.wikimedia.org/r/514471

Change 514749 had a related patch set uploaded (by Elukey; owner: Elukey):
[operations/puppet@production] profile::kerberos::kadminserver: add auth_users to rsync's module config

https://gerrit.wikimedia.org/r/514749

Change 514749 merged by Elukey:
[operations/puppet@production] profile::kerberos::kadminserver: add auth_users to rsync's module config

https://gerrit.wikimedia.org/r/514749

Change 515010 had a related patch set uploaded (by Elukey; owner: Elukey):
[operations/puppet@production] Allow Hadoop-related profiles to deploy Kerberos keytabs

https://gerrit.wikimedia.org/r/515010

Change 517623 had a related patch set uploaded (by Elukey; owner: Elukey):
[operations/puppet/cdh@master] Add hive/yarn/oozie/hive users to the puppet catalog

https://gerrit.wikimedia.org/r/517623

Change 517623 merged by Elukey:
[operations/puppet/cdh@master] Add hive/yarn/oozie/hive users to the puppet catalog

https://gerrit.wikimedia.org/r/517623

Change 515010 merged by Elukey:
[operations/puppet@production] Allow Hadoop-related profiles to deploy Kerberos keytabs

https://gerrit.wikimedia.org/r/515010

Change 517812 had a related patch set uploaded (by Elukey; owner: Elukey):
[operations/puppet@production] role::analytics_test_cluster::coordinator: deploy Kerberos keytabs

https://gerrit.wikimedia.org/r/517812

Change 517812 merged by Elukey:
[operations/puppet@production] role::analytics_test_cluster::coordinator: deploy Kerberos keytabs

https://gerrit.wikimedia.org/r/517812

Change 517814 had a related patch set uploaded (by Elukey; owner: Elukey):
[operations/puppet@production] profile::kerberos::keytabs: ensure the keytab's parent dir

https://gerrit.wikimedia.org/r/517814

Change 517814 merged by Elukey:
[operations/puppet@production] profile::kerberos::keytabs: ensure the keytab's parent dir

https://gerrit.wikimedia.org/r/517814

Change 517817 had a related patch set uploaded (by Elukey; owner: Elukey):
[operations/puppet@production] Deploy keytabs to the Analytics Hadoop test cluster

https://gerrit.wikimedia.org/r/517817

Change 517817 merged by Elukey:
[operations/puppet@production] Deploy keytabs to the Analytics Hadoop test cluster

https://gerrit.wikimedia.org/r/517817

Change 517997 had a related patch set uploaded (by Elukey; owner: Elukey):
[operations/puppet@production] profile::kerberos::keytabs: add parent_dir_grp option

https://gerrit.wikimedia.org/r/517997

Change 517997 merged by Elukey:
[operations/puppet@production] profile::kerberos::keytabs: add parent_dir_grp option

https://gerrit.wikimedia.org/r/517997

Change 518001 had a related patch set uploaded (by Elukey; owner: Elukey):
[operations/puppet@production] hadoop: add HTTP kerberos keytabs

https://gerrit.wikimedia.org/r/518001

Change 518001 merged by Elukey:
[operations/puppet@production] hadoop: add HTTP kerberos keytabs

https://gerrit.wikimedia.org/r/518001

Change 518002 had a related patch set uploaded (by Elukey; owner: Elukey):
[operations/puppet@production] hadoop: fix HTTP keytab name in Hadoop testing cluster

https://gerrit.wikimedia.org/r/518002

Change 518002 merged by Elukey:
[operations/puppet@production] hadoop: fix HTTP keytab name in Hadoop testing cluster

https://gerrit.wikimedia.org/r/518002

Change 518030 had a related patch set uploaded (by Elukey; owner: Elukey):
[operations/puppet@production] profile::kerberos::kerberos-puppet-wrapper: add principals

https://gerrit.wikimedia.org/r/518030

Change 518030 merged by Elukey:
[operations/puppet@production] profile::kerberos::kerberos-puppet-wrapper: add principals

https://gerrit.wikimedia.org/r/518030

Verified on the testing cluster that all the keytabs are working, task completed :)

elukey set the point value for this task to 21. Jun 20 2019, 2:11 PM
elukey moved this task from In Progress to Done on the Analytics-Kanban board.

Change 470566 abandoned by Muehlenhoff:
kerberos: add script to generate service principals/keytabs

Reason:
Superseded by another change

https://gerrit.wikimedia.org/r/470566

Nuria closed this task as Resolved. Jun 24 2019, 9:10 PM