
Refactor analytics/cdh roles to use hiera, setup Analytics Cluster in beta labs. [21 pts]
Closed, Resolved (Public)

Description

It is about time we created and maintained an official Analytics cluster in Beta. This includes CDH (Hadoop) and Kafka. It might also include varnishkafka on Beta varnishes.

There will be a little bit of refactoring in puppet that needs to happen for this, but for the most part the existing puppetization will do everything it needs to do. Building this cluster won't be too much work (maybe half a week), but maintaining it might be.

This work does not have to be done by me. I will work on any puppet tweaks needed, but I think it would be good to get more analytics engineers involved here.
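
To illustrate why moving role configuration into hiera helps here: hiera resolves a key by walking an ordered list of data sources and, by default, returning the first match. That lets one set of role classes serve both production and Beta, with Beta overriding only what differs. The sketch below simulates that first-match lookup in Python. It is not the actual puppet code, and every key name, hostname, and value in it is hypothetical.

```python
# Hypothetical data layers, ordered from most specific to least specific.
# Key names and values are illustrative only, not taken from operations/puppet.
LABS_PROJECT_DATA = {
    "hadoop_namenode_hosts": ["hypothetical-master.labs.example"],
}
COMMON_DATA = {
    "hadoop_namenode_hosts": ["hypothetical-master.prod.example"],
    "hadoop_heap_mb": 2048,
}

_MISSING = object()  # sentinel so that None can be a legitimate default


def hiera_lookup(key, hierarchy, default=_MISSING):
    """Return the value for `key` from the first layer that defines it."""
    for layer in hierarchy:
        if key in layer:
            return layer[key]
    if default is _MISSING:
        raise KeyError(key)
    return default


# A labs host consults its project-specific layer first, then the shared one.
labs_hierarchy = [LABS_PROJECT_DATA, COMMON_DATA]
# A production host (in this simplified sketch) only sees the shared layer.
prod_hierarchy = [COMMON_DATA]

labs_hosts = hiera_lookup("hadoop_namenode_hosts", labs_hierarchy)
prod_hosts = hiera_lookup("hadoop_namenode_hosts", prod_hierarchy)
labs_heap = hiera_lookup("hadoop_heap_mb", labs_hierarchy)
```

In this sketch the labs lookup picks up the overridden hostname while still inheriting the shared heap setting, which is the behaviour the refactor relies on.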

Details

Repo                   Branch      Lines +/-
operations/puppet      production  +6 -2
operations/puppet      production  +16 -0
operations/puppet      production  +0 -1
operations/puppet      production  +2 -1
operations/puppet      production  +7 -78
operations/puppet      production  +62 -44
operations/puppet      production  +298 -40
operations/puppet      production  +4 -9
operations/puppet      production  +2 -2
operations/puppet      production  +12 -20
operations/puppet      production  +1 -1
operations/puppet      production  +6 -1
operations/puppet      production  +1 -1
operations/puppet      production  +106 -104
operations/puppet      production  +1 -1
operations/puppet      production  +66 -0
operations/puppet      production  +64 -52
operations/puppet      production  +13 -0
operations/puppet/cdh  master      +76 -0
operations/puppet      production  +99 -99
operations/puppet      production  +829 -2
operations/puppet/cdh  master      +124 -138

Event Timeline

Ottomata raised the priority of this task from to Needs Triage.
Ottomata updated the task description. (Show Details)
Milimetric set Security to None.
Milimetric moved this task from Incoming to Prioritized on the Analytics-Backlog board.

@elukey I'll make sure you get a beer-supplied night at the next offsite for that one :)

Change 267797 had a related patch set uploaded (by Ottomata):
[WIP] Refactor manifests/role/analytics/* into modules/role, use hiera to configure

https://gerrit.wikimedia.org/r/267797

Ottomata renamed this task from Create and maintain an Analytics Cluster in Beta Cluster in labs. to Create and maintain an Analytics Cluster in Beta Cluster in labs. [21 pts]. (Feb 4 2016, 6:03 PM)
Ottomata renamed this task from Create and maintain an Analytics Cluster in Beta Cluster in labs. [21 pts] to Refactor analytics/cdh roles to use hiera, setup Analytics Cluster in beta labs. [21 pts]. (Feb 8 2016, 11:37 PM)

Change 269340 had a related patch set uploaded (by Ottomata):
Updates to work with hiera, Hive/Oozie MySQL db can now be hosted on remote node

https://gerrit.wikimedia.org/r/269340

Change 269340 merged by Ottomata:
Updates to work with hiera, Hive/Oozie MySQL db can now be hosted on remote node

https://gerrit.wikimedia.org/r/269340

Change 267797 merged by Ottomata:
Refactor manifests/role/analytics/* into modules/role, use hiera to configure

https://gerrit.wikimedia.org/r/267797

Change 269825 had a related patch set uploaded (by Ottomata):
Rename analytics_new to analytics_cluster

https://gerrit.wikimedia.org/r/269825

Change 269825 merged by Ottomata:
Rename analytics_new to analytics_cluster

https://gerrit.wikimedia.org/r/269825

Change 269841 had a related patch set uploaded (by Ottomata):
Add hive and oozie database grant defines

https://gerrit.wikimedia.org/r/269841

Change 269841 merged by Ottomata:
Add hive and oozie database grant defines

https://gerrit.wikimedia.org/r/269841

Change 269844 had a related patch set uploaded (by Ottomata):
Include analytics_cluster::client role on analytics1026 for testing

https://gerrit.wikimedia.org/r/269844

Change 269981 had a related patch set uploaded (by Ottomata):
Add role::analytics_cluster::java class

https://gerrit.wikimedia.org/r/269981

Change 269981 merged by Ottomata:
Add role::analytics_cluster::java class

https://gerrit.wikimedia.org/r/269981

Change 269991 had a related patch set uploaded (by Ottomata):
Move analytics_cluster prod hiera vars to eqiad/cdh/

https://gerrit.wikimedia.org/r/269991

Change 269991 merged by Ottomata:
Move analytics_cluster prod hiera vars to eqiad/cdh/

https://gerrit.wikimedia.org/r/269991

Change 269994 had a related patch set uploaded (by Ottomata):
Add missing net-topology.py.erb to analytics_cluster role

https://gerrit.wikimedia.org/r/269994

Change 269994 merged by Ottomata:
Add missing net-topology.py.erb to analytics_cluster role

https://gerrit.wikimedia.org/r/269994

Change 269996 had a related patch set uploaded (by Ottomata):
Fix use of variable in analytics_cluster/hadoop/client.pp

https://gerrit.wikimedia.org/r/269996

Change 269996 merged by Ottomata:
Fix use of variable in analytics_cluster/hadoop/client.pp

https://gerrit.wikimedia.org/r/269996

Change 269998 had a related patch set uploaded (by Ottomata):
Move hadoop memory settings for production above class declaration

https://gerrit.wikimedia.org/r/269998

Change 269998 merged by Ottomata:
Move hadoop memory settings for production above class declaration

https://gerrit.wikimedia.org/r/269998

Change 269844 merged by Ottomata:
Include analytics_cluster::client role on analytics1026 for testing

https://gerrit.wikimedia.org/r/269844

Change 270033 had a related patch set uploaded (by Ottomata):
Include new analytics_cluster::hadoop::worker for testing on analytics1057

https://gerrit.wikimedia.org/r/270033

Change 270033 merged by Ottomata:
Include new analytics_cluster::hadoop::worker for testing on analytics1057

https://gerrit.wikimedia.org/r/270033

Change 270035 had a related patch set uploaded (by Ottomata):
Apply new analytics_cluster::hadoop::standby role on analytics1002

https://gerrit.wikimedia.org/r/270035

Change 270035 merged by Ottomata:
Apply new analytics_cluster::hadoop::standby role on analytics1002

https://gerrit.wikimedia.org/r/270035

Change 270037 had a related patch set uploaded (by Ottomata):
Use new analytics_cluster::hadoop::master role on analytics1001

https://gerrit.wikimedia.org/r/270037

Change 270040 had a related patch set uploaded (by Ottomata):
Move hadoop_user_posix_groups setting to eqiad/cdh/hadoop/users.yaml

https://gerrit.wikimedia.org/r/270040

Change 270040 merged by Ottomata:
Move hadoop_user_posix_groups setting to eqiad/cdh/hadoop/users.yaml

https://gerrit.wikimedia.org/r/270040

Change 270037 merged by Ottomata:
Use new analytics_cluster::hadoop::master role on analytics1001

https://gerrit.wikimedia.org/r/270037

Change 270050 had a related patch set uploaded (by Ottomata):
Use new analytics_cluster::hadoop::worker role on all Hadoop workers

https://gerrit.wikimedia.org/r/270050

Change 270050 merged by Ottomata:
Use new analytics_cluster::hadoop::worker role on all Hadoop workers

https://gerrit.wikimedia.org/r/270050

Change 270103 had a related patch set uploaded (by Ottomata):
Create refinery classes in analytics_cluster role, apply them to stat1002

https://gerrit.wikimedia.org/r/270103

Change 270103 merged by Ottomata:
Create refinery classes in analytics_cluster role, apply them to stat1002

https://gerrit.wikimedia.org/r/270103

Change 270795 had a related patch set uploaded (by Ottomata):
Apply new analytics_cluster role to analytics1027

https://gerrit.wikimedia.org/r/270795

Change 270795 merged by Ottomata:
Apply new analytics_cluster role to analytics1027

https://gerrit.wikimedia.org/r/270795

Change 270808 had a related patch set uploaded (by Ottomata):
Remove some unsed role::analytics::* classes, more to come

https://gerrit.wikimedia.org/r/270808

Change 270808 merged by Ottomata:
Remove some unsed role::analytics::* classes, more to come

https://gerrit.wikimedia.org/r/270808

Change 270851 had a related patch set uploaded (by Ottomata):
Remove unused analytics role classes

https://gerrit.wikimedia.org/r/270851

Change 270851 merged by Ottomata:
Remove unused analytics role classes

https://gerrit.wikimedia.org/r/270851

Change 270857 had a related patch set uploaded (by Ottomata):
Remove unused anayltics role from analytics kafka brokers

https://gerrit.wikimedia.org/r/270857

Change 270857 merged by Ottomata:
Remove unused anayltics role from analytics kafka brokers

https://gerrit.wikimedia.org/r/270857

Change 271044 had a related patch set uploaded (by Ottomata):
Configure Analytics Cluster in beta deployment-prep

https://gerrit.wikimedia.org/r/271044

Change 271044 merged by Ottomata:
Configure Analytics Cluster in beta deployment-prep

https://gerrit.wikimedia.org/r/271044

Change 271062 had a related patch set uploaded (by Ottomata):
Use project-$labsproject in labs for /var/log/refinery

https://gerrit.wikimedia.org/r/271062

Change 271062 merged by Ottomata:
Use project-$labsproject in labs for /var/log/refinery

https://gerrit.wikimedia.org/r/271062

Yeehaw! Done!

deployment-analytics03 should be our main entry point for using Hadoop clients (Hive, Spark, etc.).