We are currently running a Cloudera Hadoop distribution for the Analytics cluster, specifically CDH 5.10. This distribution has served us well, but it has shown some shortcomings:
* Limited community support for reporting bugs when needed (and getting issues fixed upstream).
* Absence of Debian source packages (limiting our ability to apply patches promptly, mostly for security CVEs).
A few days ago Cloudera released [[ https://www.cloudera.com/documentation/enterprise/6/release-notes/topics/rg_cdh_6_release_notes.html | CDH 6 ]], a Hadoop 3.0 based distribution containing many software upgrades (among others, Hive 2.1). Since we are currently running Hadoop 2.6.0, the jump to a new major version would require a lot of work and testing, likely spread over multiple quarters.
This could be a good time to consider whether we want to keep going with CDH or switch to another distribution, such as:
* Hortonworks
* Apache Bigtop
Some more details about each distribution:
**Hortonworks**
The last 2.x series release seems to be 2.6.5; here is the [[ https://docs.hortonworks.com/HDPDocuments/HDP2/HDP-2.6.5/bk_command-line-installation/content/config-remote-repositories.html | documentation ]] about installing it manually. The repository seems to deny directory listing, so it is difficult to explore, but as far as I can see support only goes up to Debian 7 (very old: for comparison, Debian Stretch is 9).
The latest release is 3.1.0, and it seems to [[ https://docs.hortonworks.com/HDPDocuments/Ambari-2.7.0.0/bk_ambari-installation/content/hdp_30_repositories.html | support ]] Debian Stretch.
It is very nice that Apache Ambari and Ranger are integrated with the distribution.
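Since directory listing is denied, one way to explore a repository is to probe the well-known paths of the Debian repository layout directly (`dists/<suite>/Release`). The following is only a rough sketch: the base URL and suite names are placeholders, not the real Hortonworks values, and the actual suite name is whatever the vendor's `sources.list` line specifies. The helper names `release_urls` and `probe` are made up for this example.

```python
from urllib.error import HTTPError, URLError
from urllib.parse import urljoin
from urllib.request import Request, urlopen


def release_urls(base_url, suites):
    """Build the candidate Release-file URL for each suite of an APT repository."""
    base = base_url if base_url.endswith("/") else base_url + "/"
    return {suite: urljoin(base, f"dists/{suite}/Release") for suite in suites}


def probe(url, timeout=10):
    """Return the HTTP status of a HEAD request, or None if the host is unreachable."""
    try:
        with urlopen(Request(url, method="HEAD"), timeout=timeout) as resp:
            return resp.status
    except HTTPError as err:
        # 403/404 still tell us something: the path is denied or does not exist.
        return err.code
    except URLError:
        return None


if __name__ == "__main__":
    # Placeholder base URL and suite names (assumptions, replace with real values).
    candidates = release_urls("https://repo.example.org/debian", ["wheezy", "stretch"])
    for suite, url in candidates.items():
        print(suite, url, probe(url))
```

A 200 on a `Release` file would suggest that the suite is published even when the parent directory cannot be listed; a 404 is not conclusive, since a vendor may use non-standard suite names.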
**Apache Bigtop**
I don't see any mention of Debian in the [[http://apache.panu.it/bigtop/bigtop-1.2.1/repos/ | list of repos]].
**CDH 6**
The [[https://www.cloudera.com/documentation/enterprise/6/release-notes/topics/rg_cdh_600_release_notes.html#cdh600_release_notes | release notes]] are very interesting to read. From the [[https://www.cloudera.com/documentation/enterprise/6/release-notes/topics/rg_cdh_60_packaging.html#cdh_packaging_600 | packages list ]] it is clear, though, that CDH 6 ships Hadoop 3.0.