Hadoop-test cluster servers to be upgraded:
- hadoop-workers-test - 3 - cumin 'P{F:lsbdistcodename = buster} and A:hadoop-worker-test' - 2 out of 3 done
- hadoop-coordinator-test - 1 - an-test-coord1001.eqiad.wmnet
- hadoop-coordinator-standby-test - 1 - an-test-coord1002.eqiad.wmnet
- hadoop-master-test - 1 - an-test-master1001.eqiad.wmnet
- hadoop-standby-test - 1 - an-test-master1002.eqiad.wmnet
- hadoop-client-test - 1 - an-test-client1001.eqiad.wmnet - replace with an-test-client1002.eqiad.wmnet
- Recommission an-test-worker1001, which is currently in the host exclude list
We need to make sure that the reinstall does not format the following volumes on these servers:
- an-test-worker100[1-3] - /srv/hadoop
- an-test-coord1001 - /srv/
- an-test-master1001 - /srv/
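One way to confirm that a volume survived is to record which filesystem backs each of the paths above before the reinstall and compare afterwards. The following is a minimal sketch of that check, not an existing tool: the function names are hypothetical, the input is assumed to be text in the format of /proc/mounts, and the device names in the usage note are made up for illustration.

```python
import posixpath


def parse_mounts(text):
    """Parse /proc/mounts-style text into (device, mount_point, fstype) tuples."""
    entries = []
    for line in text.splitlines():
        fields = line.split()
        if len(fields) >= 3:
            entries.append((fields[0], fields[1], fields[2]))
    return entries


def backing_mount(path, text):
    """Return the (device, mount_point, fstype) entry that contains `path`.

    The longest matching mount point wins, so /srv/hadoop resolves to /srv
    when /srv is a separate volume, and to / otherwise.
    """
    path = posixpath.normpath(path)
    best = None
    for device, mount_point, fstype in parse_mounts(text):
        mp = posixpath.normpath(mount_point)
        if path == mp or mp == "/" or path.startswith(mp + "/"):
            if best is None or len(mp) > len(best[1]):
                best = (device, mp, fstype)
    return best
```

For example, with a hypothetical mount table in which /dev/md0 is mounted on / and /dev/mapper/vg-srv on /srv, backing_mount("/srv/hadoop", ...) returns the /srv entry. If the device or filesystem type differs after the reinstall, or the path now resolves to /, the volume was probably not reused.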
The contents of /home on an-test-client will be lost, so we should ask users whether they would like to back up anything before it is reinstalled.
Here are the largest home directories, according to sudo ncdu -x /home
The Debian installer is configured not to format volumes as shown here:
https://phabricator.wikimedia.org/rOPUP8457d3f0007143f0772e9a8dae0b5d088c3d7978
The reuse-partition recipes should already be in place for all of these servers here:
https://phabricator.wikimedia.org/source/operations-puppet/browse/production/modules/install_server/files/autoinstall/netboot.cfg$95
...but it's worth checking that they look good and work as expected.
There is an optional reuse-parts-test.cfg file that pauses the installer before committing the changes to disk. Not sure whether it makes sense to use it on these test servers, but it's worth knowing about.
These are the remaining errors running puppet on a Hadoop test worker:
- Error: /Stage[main]/Profile::Python37/Package[python3.7]/ensure: change from 'purged' to 'present' failed
- Error: /Stage[main]/Ores::Base/Package[enchant]/ensure: change from 'purged' to 'present' failed
- Error: /Stage[main]/Ores::Base/Package[myspell-de-at]/ensure: change from 'purged' to 'present' failed (likewise for myspell-de-ch and myspell-de-de)
- Error: /Stage[main]/Conda_analytics/Package[conda-analytics]/ensure: change from 'purged' to 'present' failed
- Error: /Stage[main]/Bigtop::Hive/File[/usr/lib/hive/bin/ext/hiveserver2.sh]/ensure: change from 'absent' to 'file' failed
- Error: /Stage[main]/Profile::Hadoop::Spark2/Package[spark2]/ensure: change from 'purged' to 'present' failed
- Error: /Stage[main]/Profile::Hadoop::Spark2/File[/etc/spark2/conf/hive-site.xml]/ensure: change from 'absent' to 'link' failed
- Error: /Stage[main]/Bigtop::Hadoop::Nodemanager/Systemd::Service[hadoop-yarn-nodemanager]/Service[hadoop-yarn-nodemanager]/ensure: change from 'stopped' to 'running' failed
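To track these failures as fixes land, it can help to group them by resource type. The sketch below is a hypothetical helper, not part of Puppet: it assumes error lines shaped like the ones listed above (an "ensure" change that failed) and ignores anything else.

```python
import re
from collections import defaultdict

# Matches lines such as:
#   Error: /Stage[main]/Some::Class/Package[name]/ensure: change from 'a' to 'b' failed
# The innermost resource (the one directly before /ensure) is captured.
FAILURE_RE = re.compile(
    r"Error: /Stage\[main\]/(?P<cls>[^/]+)/.*?"
    r"(?P<rtype>[A-Z][\w:]*)\[(?P<title>[^\]]+)\]/ensure: "
    r"change from '(?P<old>[^']*)' to '(?P<new>[^']*)' failed"
)


def parse_failure(line):
    """Return a dict describing one failed ensure change, or None."""
    match = FAILURE_RE.search(line)
    return match.groupdict() if match else None


def failures_by_type(lines):
    """Group failed resource titles by resource type, dropping duplicates."""
    grouped = defaultdict(set)
    for line in lines:
        failure = parse_failure(line)
        if failure:
            grouped[failure["rtype"]].add(failure["title"])
    return {rtype: sorted(titles) for rtype, titles in grouped.items()}
```

Passing the error lines from a Puppet run to failures_by_type gives one entry per resource type (Package, File, Service), which makes it easier to see that most of the failures are missing packages rather than configuration problems.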