Run critical Analytics Hadoop jobs and make sure that they work with the new auth settings.
Details
Status | Subtype | Assigned | Task
---|---|---|---
Restricted Task | | |
Resolved | | elukey | T211836 Enable Security (stronger authentication and data encryption) for the Analytics Hadoop cluster and its dependent services
Resolved | | elukey | T212257 Set up a Kerberos KDC service in production with minimal puppet automation
Resolved | | elukey | T212259 Run critical Analytics Hadoop jobs and make sure that they work with the new auth settings.
Restricted Task | | |
Restricted Task | | |
Resolved | | elukey | T226232 Update the Camus checker to be able to authenticate via Kerberos
Event Timeline
Change 489243 had a related patch set uploaded (by Elukey; owner: Elukey):
[operations/puppet@production] role::analytics_test_cluster::coordinator: add basic camus support
Change 489243 merged by Elukey:
[operations/puppet@production] role::analytics_test_cluster::coordinator: add basic camus support
Change 490004 had a related patch set uploaded (by Elukey; owner: Elukey):
[operations/puppet@production] role::analytics_test_cluster::coordinator: add admin settings
Change 490004 merged by Elukey:
[operations/puppet@production] role::analytics_test_cluster::coordinator: add admin settings
Change 490067 had a related patch set uploaded (by Elukey; owner: Elukey):
[operations/puppet@production] role::analytics_test_cluster::coord: add kafkatee instance
Change 490067 merged by Elukey:
[operations/puppet@production] role::analytics_test_cluster::coord: add kafkatee instance
Change 491777 had a related patch set uploaded (by Elukey; owner: Elukey):
[operations/puppet@production] camus: make webrequest_text config more similar to prod
Change 491777 merged by Elukey:
[operations/puppet@production] camus: make webrequest_text config more similar to prod
Change 491779 had a related patch set uploaded (by Elukey; owner: Elukey):
[operations/puppet@production] Rename kafka webrequest test topic
Change 491779 merged by Elukey:
[operations/puppet@production] Rename kafka webrequest test topic
hdfs@analytics1030:/mnt/hdfs/wmf/data/raw/webrequest$ ls webrequest_test_text
So if I got it correctly, it is now only a matter of using the right bundle.xml/properties and we should be able to refine this data too!
Change 491791 had a related patch set uploaded (by Elukey; owner: Elukey):
[analytics/refinery@master] Add oozie webrequest test bundle
Change 491944 had a related patch set uploaded (by Elukey; owner: Elukey):
[operations/puppet@production] profile::analytics::refinery::job::test::camus: fix topic whitelist
Change 491944 merged by Elukey:
[operations/puppet@production] profile::analytics::refinery::job::test::camus: fix topic whitelist
Change 491949 had a related patch set uploaded (by Elukey; owner: Elukey):
[operations/puppet@production] profile::analytics::refinery::job::test::camus: fix checked topic
Change 491949 merged by Elukey:
[operations/puppet@production] profile::analytics::refinery::job::test::camus: fix checked topic
Change 491969 had a related patch set uploaded (by Elukey; owner: Elukey):
[operations/puppet@production] role::analytics_test_cluster::coordinator: ensure hive-site.xml in HDFS
Change 491969 merged by Elukey:
[operations/puppet@production] role::analytics_test_cluster::coordinator: ensure hive-site.xml in HDFS
Change 491973 had a related patch set uploaded (by Elukey; owner: Elukey):
[operations/puppet@production] Move ensure hive-site.xml from (test) hadoo coord to ui
Change 491973 merged by Elukey:
[operations/puppet@production] Move ensure hive-site.xml from (test) hadoo coord to ui
Just saw this on the Confluent mailing list. Parking it here for future reference:
We don't yet use Kafka Connect, but if/when we do and we have a Kerberized Hadoop Cluster, we'll need to be aware of that. :/
Change 518097 had a related patch set uploaded (by Elukey; owner: Elukey):
[operations/puppet/cdh@master] Add cdh::systemd_timer
Change 518220 had a related patch set uploaded (by Elukey; owner: Elukey):
[operations/puppet@production] profile::hue: add a parameter to selectively enable oozie security
Change 518220 merged by Elukey:
[operations/puppet@production] profile::hue: add a parameter to selectively enable oozie security
Change 518469 had a related patch set uploaded (by Elukey; owner: Elukey):
[operations/puppet@production] role::analytics_test_cluste::hadoop::master|stanby: allow https port
Change 518469 merged by Elukey:
[operations/puppet@production] role::analytics_test_cluste::hadoop::master|stanby: allow https port
Change 518646 had a related patch set uploaded (by Elukey; owner: Elukey):
[operations/puppet@production] hadoop: set 'hdfs' as admin user for the Hadoop test cluster
Change 518646 merged by Elukey:
[operations/puppet@production] hadoop: set 'hdfs' as admin user for the Hadoop test cluster
Change 518648 had a related patch set uploaded (by Elukey; owner: Elukey):
[operations/puppet@production] hadoop: format dfs.cluster.administrators correctly
Change 518648 merged by Elukey:
[operations/puppet@production] hadoop: format dfs.cluster.administrators correctly
Change 518915 had a related patch set uploaded (by Elukey; owner: Elukey):
[operations/puppet@production] Introduce the kerberos module
Change 518915 merged by Elukey:
[operations/puppet@production] Introduce the kerberos module
Change 518954 had a related patch set uploaded (by Elukey; owner: Elukey):
[operations/puppet@production] Replace profile::analytics::systemd_timer with kerberos::systemd_timer
Change 518958 had a related patch set uploaded (by Elukey; owner: Elukey):
[operations/puppet@production] camus: add support for kerberos
Change 519057 had a related patch set uploaded (by Elukey; owner: Elukey):
[operations/puppet@production] Set oozie as proxy for the Hadoop testing cluster
Change 519057 merged by Elukey:
[operations/puppet@production] Set oozie as proxy for the Hadoop testing cluster
Change 518954 merged by Elukey:
[operations/puppet@production] Replace profile::analytics::systemd_timer with kerberos::systemd_timer
Change 518958 merged by Elukey:
[operations/puppet@production] camus: add support for kerberos
Change 519263 had a related patch set uploaded (by Elukey; owner: Elukey):
[operations/puppet@production] profile::hadoop::common: set r+o to the trustore file
Change 519263 merged by Elukey:
[operations/puppet@production] profile::hadoop::common: set r+o to the trustore file
Summary of things done:
1) Hue now works with Oozie
2) Oozie is kerberized
3) Camus works fine with Kerberos
4) webrequest_load runs successfully after replacing hive actions with hive2 actions
Example for point 4):
diff --git a/oozie/util/hive/partition/add/workflow.xml b/oozie/util/hive/partition/add/workflow.xml
index ca07199..4e6d892 100644
--- a/oozie/util/hive/partition/add/workflow.xml
+++ b/oozie/util/hive/partition/add/workflow.xml
@@ -42,13 +42,26 @@
             <description>HDFS path(s) naming the input dataset.</description>
         </property>
     </parameters>
 
+    <credentials>
+        <credential name='my-hive-creds' type='hive2'>
+            <property>
+                <name>hive2.server.principal</name>
+                <value>hive/analytics1030.eqiad.wmnet@WIKIMEDIA</value>
+            </property>
+            <property>
+                <name>hive2.jdbc.url</name>
+                <value>jdbc:hive2://analytics1030.eqiad.wmnet:10000/default</value>
+            </property>
+        </credential>
+    </credentials>
+
     <start to="add_partition"/>
 
-    <action name="add_partition">
-        <hive xmlns="uri:oozie:hive-action:0.2">
+    <action name="add_partition" cred="my-hive-creds">
+        <hive2 xmlns="uri:oozie:hive2-action:0.1">
             <job-tracker>${job_tracker}</job-tracker>
             <name-node>${name_node}</name-node>
             <job-xml>${hive_site_xml}</job-xml>
             <configuration>
                 <!--make sure oozie:launcher runs in a low priority queue -->
@@ -65,13 +78,13 @@
                     <value>${queue_name}</value>
                 </property>
             </configuration>
-
+            <jdbc-url>jdbc:hive2://analytics1030.eqiad.wmnet:10000/default</jdbc-url>
             <script>${hive_script}</script>
             <param>database=${replaceAll(table, "\\..*", "")}</param>
             <param>table=${replaceAll(table, "^.*\\.", "")}</param>
             <param>location=${location}</param>
             <param>partition_spec=${partition_spec}</param>
-        </hive>
+        </hive2>
             <ok to="end"/>
             <error to="kill"/>
         </action>
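Since the same hive to hive2 change would have to be repeated in every workflow that uses a hive action, here is a rough sketch of how the edit in the diff could be scripted. It is only an illustration, not what was actually used for the test above: the function name `to_hive2` and its parameters are made up for this example, and it assumes the workflow has a `<start>` element and that `<jdbc-url>` belongs right before `<script>`, as in the diff.

```python
import xml.etree.ElementTree as ET

# Namespaces taken from the diff above.
HIVE_NS = "uri:oozie:hive-action:0.2"
HIVE2_NS = "uri:oozie:hive2-action:0.1"


def to_hive2(workflow_xml, jdbc_url, principal, cred_name="my-hive-creds"):
    """Return workflow_xml with hive actions rewritten as hive2 actions.

    Hypothetical helper: mirrors the manual edit shown in the diff.
    Note that ElementTree's default parser drops XML comments.
    """
    root = ET.fromstring(workflow_xml)
    # Namespace prefix of the workflow itself, e.g. "{uri:oozie:workflow:0.4}".
    wf_ns = root.tag[: root.tag.index("}") + 1] if root.tag.startswith("{") else ""

    converted = False
    for action in list(root.iter(wf_ns + "action")):
        hive = action.find("{%s}hive" % HIVE_NS)
        if hive is None:
            continue
        # <action name="..." cred="..."> and <hive> -> <hive2>.
        action.set("cred", cred_name)
        hive.tag = "{%s}hive2" % HIVE2_NS
        # Move every child element into the hive2 namespace as well.
        for child in hive.iter():
            if child.tag.startswith("{%s}" % HIVE_NS):
                child.tag = child.tag.replace(HIVE_NS, HIVE2_NS)
        # Add <jdbc-url> right before <script> (or at the end if there is none).
        children = list(hive)
        script_tag = "{%s}script" % HIVE2_NS
        idx = next((i for i, c in enumerate(children) if c.tag == script_tag),
                   len(children))
        jdbc = ET.Element("{%s}jdbc-url" % HIVE2_NS)
        jdbc.text = jdbc_url
        hive.insert(idx, jdbc)
        converted = True

    if converted:
        # Build the <credentials> block and place it before <start>.
        creds = ET.Element(wf_ns + "credentials")
        cred = ET.SubElement(creds, wf_ns + "credential",
                             name=cred_name, type="hive2")
        for key, value in (("hive2.server.principal", principal),
                           ("hive2.jdbc.url", jdbc_url)):
            prop = ET.SubElement(cred, wf_ns + "property")
            ET.SubElement(prop, wf_ns + "name").text = key
            ET.SubElement(prop, wf_ns + "value").text = value
        children = list(root)
        idx = next((i for i, c in enumerate(children)
                    if c.tag == wf_ns + "start"), len(children))
        root.insert(idx, creds)

    return ET.tostring(root, encoding="unicode")
```

The output is equivalent XML but not a clean textual diff: ElementTree serializes with generated namespace prefixes (`ns0:`, `ns1:`) and loses comments, so for files kept in refinery a hand edit or a comment-preserving parser would probably be preferable.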
Change 519355 had a related patch set uploaded (by Elukey; owner: Elukey):
[operations/puppet@production] cdh::oozie: add hive2/hcat credentials classes
Change 519355 merged by Elukey:
[operations/puppet@production] cdh::oozie: add hive2/hcat credentials classes
For the scope of this task, I would call it done. Several follow-ups are still needed, but overall most of the critical analytics tools are working!
Change 519368 had a related patch set uploaded (by Elukey; owner: Elukey):
[operations/puppet@production] role::analytics_test_cluster::hadoop::ui: configure hive for hue
Change 519368 merged by Elukey:
[operations/puppet@production] role::analytics_test_cluster::hadoop::ui: configure hive for hue
Change 491791 had a related patch set uploaded (by Elukey; owner: Elukey):
[analytics/refinery@master] Add oozie webrequest test bundle
Change 491791 abandoned by Elukey:
[analytics/refinery@master] Add oozie webrequest test bundle
Reason: