The Analytics Hadoop cluster will change Hadoop distribution, from Cloudera CDH 5.16 to Apache Bigtop 1.5.
This brings several big changes; the most important ones are:
- Hadoop 2.6.x -> 2.10.1
- Hive 1.1 -> 2.3.6
Even though the Hadoop version bump looks minor, in practice it is a complex upgrade that requires downtime of the whole cluster for some hours. We have backed up all the important data that cannot be re-created (such as Pageviews) to a separate cluster, so if the upgrade goes sideways we'll have a good recovery path.
The upgrade is scheduled for Tuesday, February 9th, and should take 2 to 4 hours during the EU morning (more precise timings will be added).
Procedure: https://etherpad.wikimedia.org/p/analytics-bigtop-upgrade
High-level maintenance window: 08:00 -> 20:00 CET
UPDATE: we extended the maintenance window to 14:00 CET due to backups taking a long time to complete.
UPDATE 2: due to an upgrade issue, we extended the maintenance window again, to 16:00 CET. Really sorry :(
UPDATE 3: due to an upgrade issue, we extended the maintenance window again, to 20:00 CET :(