To work around the Hive/Parquet/Spark timestamp field issues, we should upgrade to CDH 5.4 as soon as possible. This will also get us the Spark Oozie action! :)
Description
Details
Project | Branch | Lines +/- | Subject |
---|---|---|---|
operations/puppet | production | +1 -1 | Mirror cdh5.4.0 in apt |
Related Objects
- Mentioned In
- rOPUP972ef003f8f7: Mirror cdh5.4.0 in apt
Event Timeline
This has been tested in labs and in vagrant, and an upgrade of the production cluster is scheduled for Monday.
I am starting this now. Here is the general plan:
# CDH 5.3.1 to CDH 5.4.0 upgrade

## Shutdown and backup

http://www.cloudera.com/content/cloudera/en/documentation/core/latest/topics/cdh_ig_earlier_cdh5_upgrade.html

Stop puppet everywhere.

    apt-get update

Put the namenode in safemode and save the namespace:

    sudo -u hdfs hdfs dfsadmin -safemode enter
    sudo -u hdfs hdfs dfsadmin -saveNamespace

Shut down all services:

    salt "analytics*" cmd.run 'for x in `cd /etc/init.d ; ls hadoop-*` ; do sudo service $x stop ; done; service impala-server stop'
    service hive-server2 stop
    service hive-metastore stop
    service oozie stop
    service hue stop
    service llama stop
    service impala-state-store stop
    service impala-catalog stop

Back up namenode metadata:

    cd /var/lib/hadoop/name/
    tar -cvf /root/hadoop_name_backup_$(date +%s).tar .

## Upgrade Hadoop

On active master:

    apt-get install hadoop-hdfs-namenode hadoop-yarn-resourcemanager hadoop-mapreduce-historyserver hadoop-httpfs

On standby master:

    apt-get install hadoop-hdfs-namenode

On workers:

    apt-get install hadoop-yarn-nodemanager hadoop-hdfs-datanode hadoop-mapreduce

On clients (stat1002, analytics1026, analytics1027):

    apt-get install hadoop-client

On journalnodes:

    apt-get install hadoop-hdfs-journalnode

Start journalnodes:

    service hadoop-hdfs-journalnode start

Upgrade hdfs metadata on active master:

    service hadoop-hdfs-namenode upgrade

Bootstrap standby and start namenode:

    sudo -u hdfs hdfs namenode -bootstrapStandby
    service hadoop-hdfs-namenode start

Start DataNodes:

    service hadoop-hdfs-datanode start

On master:

    service hadoop-yarn-resourcemanager start
    service hadoop-httpfs start

Start worker nodemanagers:

    service hadoop-yarn-nodemanager start

Start historyserver on master:

    service hadoop-mapreduce-historyserver start

Make sure stuff works!

## Upgrade Components

## HCatalog

    apt-get install hive-hcatalog

Take note of differences in hue.ini. See if anything needs updating.
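For the safemode step in the shutdown section above, it may be worth confirming that the namenode really is in safemode before running `-saveNamespace`. A minimal sketch: `safemode_is_on` is a hypothetical helper (not part of the plan), and it assumes that `hdfs dfsadmin -safemode get` prints one line per namenode starting with `Safe mode is ON` or `Safe mode is OFF`.

```shell
#!/bin/sh
# Hypothetical helper, not part of the original plan.
# Reads safemode status text on stdin and succeeds only if every
# line starts with "Safe mode is ON" (assumed output format).
safemode_is_on() {
    out=$(cat)
    # No output at all is treated as "not confirmed".
    [ -n "$out" ] || return 1
    # grep -v selects lines that do NOT report ON; any such line means failure.
    if printf '%s\n' "$out" | grep -v -q '^Safe mode is ON'; then
        return 1
    fi
    return 0
}
```

It could be used as `sudo -u hdfs hdfs dfsadmin -safemode get | safemode_is_on && sudo -u hdfs hdfs dfsadmin -saveNamespace`, so that the namespace is only saved once safemode is confirmed.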

## Mahout

Everywhere, do:

    apt-get install mahout

## Pig

http://www.cloudera.com/content/cloudera/en/documentation/core/latest/topics/cdh_ig_pig_installation.html

    apt-get install pig

## Sqoop

http://www.cloudera.com/content/cloudera/en/documentation/core/latest/topics/cdh_ig_sqoop_package_install.html

    apt-get install sqoop

## Hive

Back up the mysql metastore db:

    mysqldump -u root hive_metastore > /root/hive_metastore-backup.$(date +%s).sql

analytics1027:

    apt-get install hive hive-metastore hive-server2

All other nodes:

    apt-get install hive

TODO: increase heap size for hive-metastore??
http://www.cloudera.com/content/cloudera/en/documentation/core/latest/topics/cdh_ig_hive_install.html

Upgrade the metastore on analytics1027:

    /usr/lib/hive/bin/schematool -dryRun -dbType mysql -upgradeSchemaFrom 0.13.0
    # if ok, then:
    /usr/lib/hive/bin/schematool -dbType mysql -upgradeSchemaFrom 0.13.0

Start hive services:

    service hive-metastore restart
    service hive-server2 restart

## Oozie

http://www.cloudera.com/content/cloudera/en/documentation/core/latest/topics/cdh_ig_oozie_configure.html?scroll=topic_17_6

On clients:

    apt-get install oozie-client

On oozie server:

    sudo service oozie stop
    # backup database
    mysqldump -u root oozie > /root/oozie-backup.$(date +%s).sql
    apt-get install oozie oozie-client
    # upgrade database
    sudo -u oozie /usr/lib/oozie/bin/ooziedb.sh upgrade -run

Upgrade oozie shared library:

    sudo -u oozie hadoop fs -rmr /user/oozie/share
    oozie-setup sharelib create -fs hdfs://analytics-hadoop/ -locallib /usr/lib/oozie/oozie-sharelib-yarn

Start oozie server:

    service oozie restart

TODO: Check back on oozie web console, does it work better now? extjs???
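Both the Hive metastore and the Oozie database are dumped with `mysqldump` right before a schema upgrade, so a quick sanity check on the dump files could be useful before going further. A minimal sketch: `dump_looks_complete` is a hypothetical helper, and it assumes `mysqldump` ran with its default options, in which case the last line of the file is a `-- Dump completed` comment.

```shell
#!/bin/sh
# Hypothetical helper, not part of the original plan.
# Succeeds if the given dump file is non-empty and its last line is the
# "-- Dump completed" trailer that mysqldump writes by default
# (assumes comments were not disabled, e.g. via --skip-comments).
dump_looks_complete() {
    dump_file=$1
    # -s: file exists and has a size greater than zero.
    [ -s "$dump_file" ] || return 1
    tail -n 1 "$dump_file" | grep -q '^-- Dump completed'
}
```

For example, running `dump_looks_complete /root/oozie-backup.<timestamp>.sql || echo "backup looks truncated"` before `ooziedb.sh upgrade -run` would flag a dump that was cut short. Here `<timestamp>` stands for whatever `date +%s` produced when the backup was taken.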

## Spark

On all nodes:

    apt-get install spark-core spark-python

Upload new spark assembly jar:

    sudo -u spark hdfs dfs -rm /user/spark/share/lib/spark-assembly.jar
    sudo -u spark hdfs dfs -put /usr/lib/spark/lib/spark-assembly.jar /user/spark/share/lib/spark-assembly.jar

## Impala

http://www.cloudera.com/content/cloudera/en/documentation/core/latest/topics/impala_upgrading.html#upgrade_cm_pkgs_unique_1

All impala services should already be stopped.

On worker nodes:

    apt-get install impala-server

On client nodes:

    apt-get install impala-shell

On impala master node (analytics1026):

    apt-get install impala-catalog impala-state-store llama

Start impala services.

On impala master node (analytics1026):

    service impala-catalog start
    service impala-state-store start
    service llama start

On worker nodes:

    service impala-server start

## Hue

    service hue stop
    # SOMETHING IS WEIRD HERE, I had the precise versions installed via apt, HMmMm.
    apt-get install hue
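Given the note about unexpected package versions for Hue, a small check for packages that did not pick up the expected build might help after each `apt-get install`. A minimal sketch: `packages_missing_marker` is a hypothetical helper, and it assumes the installed version strings contain a recognizable marker such as `cdh5.4.0` (the exact version format is an assumption, so adjust the marker to whatever `dpkg-query` actually shows).

```shell
#!/bin/sh
# Hypothetical helper, not part of the original plan.
# Reads "package version" lines on stdin and prints the name of every
# package whose version string does not contain the given marker.
packages_missing_marker() {
    marker=$1
    # index() does a literal substring search; 0 means "not found".
    awk -v m="$marker" 'NF >= 2 && index($2, m) == 0 { print $1 }'
}
```

One way to feed it is `dpkg-query -W -f='${Package} ${Version}\n' 'hue*' | packages_missing_marker cdh5.4.0`. Any package name it prints would still be on a version without the marker.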