This is the tracking task. Ops might need to sync with Analytics Engineering, but this is already in use at WMF.
Here are some recommendations from folks working on analytics at WMF:
From Mikhail Popov:
Relevant Puppet code for ideas and reference:
- profile::swap - https://github.com/wikimedia/puppet/blob/production/modules/profile/manifests/swap.pp
- jupyterhub module - https://github.com/wikimedia/puppet/tree/production/modules/jupyterhub
- profile::hadoop::firewall::master - https://github.com/wikimedia/puppet/blob/production/modules/profile/manifests/hadoop/firewall/master.pp
- profile::hive::client - https://github.com/wikimedia/puppet/blob/production/modules/profile/manifests/hive/client.pp
WIP design document on how SWAP works and problems with it:
I would encourage @Jgreen to focus on tools we use broadly, to reduce the maintenance cost of updates. For exploration and daily work by analysts, Jupyter notebooks are the best solution; for dashboarding, Superset. Superset is flexible in that it can display data from a number of data sources, especially if Presto is used with it. It also has more sophisticated authentication, as it can be used with Kerberos. I would discourage setting up another dashboarding tool such as RStudio.