stop-dfs.sh reports that there are no DataNodes running on some nodes, e.g.:
hdfs-node-000208: no datanode to stop
However, DataNode processes are still running on those nodes. How can I clean up these stray processes on many (hundreds of) nodes?
You can use a bash loop like this:
for i in $(cat hadoop/etc/hadoop/slaves); do
  echo "$i"
  ssh "$i" 'jps | grep DataNode | cut -d" " -f1 | xargs --no-run-if-empty -I@ bash -c "echo -- killing @; kill @"'
done
For each host listed in the slaves file, it runs a command over ssh that finds DataNode Java processes with jps, extracts their process IDs with cut, and passes each ID to kill via xargs. The --no-run-if-empty flag makes xargs do nothing on hosts where no DataNode process is found.
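To see what the remote pipeline actually extracts, here is a local sketch using simulated jps output (the PIDs and process names are made up for illustration); real jps prints one "<pid> <ClassName>" line per running JVM:

```shell
# Simulated output of jps on a worker node (PIDs are hypothetical).
jps_output='12345 DataNode
67890 NodeManager
24680 Jps'

# Same extraction as in the one-liner: keep DataNode lines,
# then take field 1 (the PID) using a space as the delimiter.
pids=$(printf '%s\n' "$jps_output" | grep DataNode | cut -d" " -f1)
echo "$pids"   # prints 12345
```

On a real node, that `$pids` value is what xargs hands to kill; grepping for the class name rather than a port or path keeps the match independent of how the DataNode was launched.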