Get Hadoop up and running without DNS
Over the past couple of days, I have been trying to get the Hadoop Word Count example running on my local cluster of three CentOS boxes.
Thanks to Running Hadoop On Ubuntu Linux (Multi-Node Cluster), 90% of the setup was as easy as described.
But two problems cost me a lot of time.
1. RSA authentication with SSH
The authorized_keys file must be accessible only by the user who owns it. Don't forget to remove all permissions for group and others.
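The permission fix can be sketched as a small Python helper. This is only a minimal sketch: the function name restrict_to_owner is made up for illustration, and the usual OpenSSH location ~/.ssh/authorized_keys is assumed for the file.

```python
import os
import stat


def restrict_to_owner(path):
    """Strip every group/other permission bit from path and return the new mode."""
    current = stat.S_IMODE(os.stat(path).st_mode)
    owner_only = current & stat.S_IRWXU  # keep only the owner's rwx bits
    if owner_only != current:
        os.chmod(path, owner_only)
    return owner_only
```

Calling restrict_to_owner(os.path.expanduser("~/.ssh/authorized_keys")) on each box should leave the file readable and writable by its owner only, which has the same effect as chmod go-rwx on that file.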
2. Host name resolution
Example hosts files:

On master:
::1 localhost6.localdomain6 localhost6
192.168.10.21 master
192.168.10.22 slave.yellow
192.168.10.23 slave.red
* master must map to an IP address that the slaves can reach (not ::1 or 127.0.0.1).

On slave.yellow:
127.0.0.1 slave.yellow localhost.localdomain localhost
::1 localhost6.localdomain6 localhost6
192.168.10.21 master
192.168.10.23 slave.red

On slave.red:
127.0.0.1 slave.red localhost.localdomain localhost
::1 localhost6.localdomain6 localhost6
192.168.10.21 master
192.168.10.22 slave.yellow
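To double-check the rule about master, a hosts file can be scanned for names that are mapped to a loopback address. The following is a rough sketch, not part of Hadoop: the function loopback_names and the BROKEN_HOSTS sample are hypothetical and exist only for illustration, while MASTER_HOSTS repeats the master entries from this post.

```python
import ipaddress


def loopback_names(hosts_text):
    """Return the set of host names that a hosts file maps to a loopback address."""
    names = set()
    for line in hosts_text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop comments and surrounding blanks
        if not line:
            continue
        fields = line.split()
        address, hostnames = fields[0], fields[1:]
        try:
            ip = ipaddress.ip_address(address)
        except ValueError:
            continue  # skip lines that do not start with an IP address
        if ip.is_loopback:
            names.update(hostnames)
    return names


# Entries from the master box in this post.
MASTER_HOSTS = """\
::1 localhost6.localdomain6 localhost6
192.168.10.21 master
192.168.10.22 slave.yellow
192.168.10.23 slave.red
"""

# Hypothetical bad configuration: master is listed on the 127.0.0.1 line.
BROKEN_HOSTS = """\
127.0.0.1 master localhost.localdomain localhost
192.168.10.22 slave.yellow
"""

print("master" in loopback_names(MASTER_HOSTS))  # False
print("master" in loopback_names(BROKEN_HOSTS))  # True
```

If master shows up in the result for the file on the master box, the slaves will probably not be able to connect to it.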
Be careful with these network settings, and the Word Count example should run.
Tested on 2008/9/14 with Hadoop-0.18.0, JDK 1.6.0_10, and CentOS 5.2.