In T207192, an-worker1078->1095 were configured as Hadoop worker nodes (datanode partitions set up). We need to remove analytics1028->1042 from service and add the newer nodes.
- Decomming a node is explained in https://wikitech.wikimedia.org/wiki/Analytics/Systems/Cluster/Hadoop/Administration#Decommissioning
- Adding new workers should be a matter of updating the rack awareness config for HDFS and assigning them the right puppet role
- analytics1028 and analytics1035 are journal nodes, so the journal config needs to move to two other servers before they can be fully decommissioned.
A safe procedure to move journal nodes could be the following:
- Set HDFS in safe mode (no writes accepted)
- stop/mask the two journal node daemons (on analytics1028 and analytics1035)
- merge a puppet config change that moves the journal config to two other nodes, copy the journal node partition to them, and roll restart all the journal nodes
- Take HDFS out of safe mode
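
To make the ordering of the journal node move explicit, here is a rough sketch in Python that only builds the list of commands and does not run anything. Several things in it are assumptions and not taken from this task: the systemd unit name `hadoop-hdfs-journalnode`, the journal directory `/var/lib/hadoop/journal`, copying the partition with `rsync`, and the placeholder host names for the master and the new journal nodes. The `hdfs dfsadmin -safemode enter|leave` commands are the standard HDFS ones.

```python
from typing import List, Tuple

# A step is (where to run it, what to run). "(manual)" marks an operator action.
Step = Tuple[str, str]


def journal_move_plan(
    old_nodes: List[str],
    new_nodes: List[str],
    other_journal_nodes: List[str],
    master: str = "HADOOP_MASTER",                # placeholder host name (assumption)
    journal_dir: str = "/var/lib/hadoop/journal",  # assumed journal path
    service: str = "hadoop-hdfs-journalnode",      # assumed systemd unit name
) -> List[Step]:
    """Return the ordered steps for moving journal nodes; nothing is executed."""
    if len(old_nodes) != len(new_nodes):
        raise ValueError("each old journal node needs exactly one replacement")

    plan: List[Step] = []

    # 1. Safe mode: the namenode stops accepting writes.
    plan.append((master, "sudo -u hdfs hdfs dfsadmin -safemode enter"))

    # 2. Stop and mask the journal node daemons that are going away.
    for host in old_nodes:
        plan.append((host, f"sudo systemctl stop {service}"))
        plan.append((host, f"sudo systemctl mask {service}"))

    # 3. Puppet change that points the journal config at the new nodes.
    plan.append(("(manual)", "merge the puppet change moving the journal config"))

    # 4. Copy each old journal partition to its replacement (rsync is an assumption).
    for old, new in zip(old_nodes, new_nodes):
        plan.append((old, f"sudo rsync -a {journal_dir}/ {new}:{journal_dir}/"))

    # 5. Roll restart every journal node in the new set, one host at a time.
    for host in other_journal_nodes + new_nodes:
        plan.append((host, f"sudo systemctl restart {service}"))

    # 6. Leave safe mode so writes are accepted again.
    plan.append((master, "sudo -u hdfs hdfs dfsadmin -safemode leave"))
    return plan


if __name__ == "__main__":
    steps = journal_move_plan(
        old_nodes=["analytics1028", "analytics1035"],
        new_nodes=["NEW_JN_A", "NEW_JN_B"],   # placeholders, targets not decided here
        other_journal_nodes=["OTHER_JN"],     # placeholder for the remaining journal node
    )
    for where, what in steps:
        print(f"{where}: {what}")
```

Printing the plan first gives something to review on the task before anyone touches the cluster; the actual unit name and journal path should be checked against the puppet config before use.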