Get Hadoop up and running without DNS

Over the past couple of days, I have been trying to get the Hadoop Word Count example running on my local cluster of 3 CentOS boxes.

Thanks to the tutorial "Running Hadoop On Ubuntu Linux (Multi-Node Cluster)", 90% of the setup was as easy as described.

But there were two problems that cost me a lot of time.

1. RSA authentication with SSH

The authorized_keys file must be accessible only by its owner. Don't forget to remove all access for the group and for others.
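
The permission fix can be sketched in Python as below. This is only an illustration under some assumptions: the helper names `is_owner_only` and `restrict_to_owner` are made up for this example, the key file is assumed to be at the usual `~/.ssh/authorized_keys` location, and the mode 0600 (plus 0700 for the `~/.ssh` directory) is the conventional choice rather than something taken from the tutorial.

```python
import os
import stat


def is_owner_only(path):
    """Return True if group and others have no permission bits set on path."""
    mode = stat.S_IMODE(os.stat(path).st_mode)
    return mode & (stat.S_IRWXG | stat.S_IRWXO) == 0


def restrict_to_owner(path, mode=0o600):
    """Set path to the given owner-only mode and report whether it is now private.

    Hypothetical helper: the default 0o600 means read/write for the owner only.
    """
    os.chmod(path, mode)
    return is_owner_only(path)


if __name__ == "__main__":
    # Assumed default locations; adjust if your keys live elsewhere.
    ssh_dir = os.path.expanduser("~/.ssh")
    key_file = os.path.join(ssh_dir, "authorized_keys")
    restrict_to_owner(ssh_dir, 0o700)  # a directory also needs the execute bit
    print(key_file, "owner-only:", restrict_to_owner(key_file))
```

Run it as the Hadoop user on every node. The same result can be had with chmod from the shell; the point is only that no group or other bits remain set.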

2. Host name resolution

Example /etc/hosts files for each machine:

master

::1 localhost6.localdomain6 localhost6
192.168.10.21 master   # master must map to an IP address the slaves can reach (not ::1 or 127.0.0.1)
192.168.10.22 slave.yellow
192.168.10.23 slave.red

slave.yellow

127.0.0.1 slave.yellow localhost.localdomain localhost
::1 localhost6.localdomain6 localhost6
192.168.10.21 master
192.168.10.23 slave.red

slave.red

127.0.0.1 slave.red localhost.localdomain localhost
::1 localhost6.localdomain6 localhost6
192.168.10.21 master
192.168.10.22 slave.yellow

Get these network settings right, and Word Count should run.
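
A quick way to catch the master-on-loopback mistake is to check the hosts file before starting Hadoop. The following is a minimal sketch, not part of Hadoop: `parse_hosts` and `resolves_to_loopback` are hypothetical helper names, and the parser assumes a simple hosts file (one address followed by names per line, `#` for comments).

```python
import ipaddress


def parse_hosts(text):
    """Map each host name in hosts-file text to the list of addresses bound to it."""
    table = {}
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop comments and surrounding blanks
        if not line:
            continue
        fields = line.split()
        addr, names = fields[0], fields[1:]
        for name in names:
            table.setdefault(name, []).append(addr)
    return table


def resolves_to_loopback(table, name):
    """Return True if any address bound to name is a loopback address."""
    return any(ipaddress.ip_address(a).is_loopback for a in table.get(name, []))


# The master's hosts file from the example above.
MASTER_HOSTS = """\
::1 localhost6.localdomain6 localhost6
192.168.10.21 master
192.168.10.22 slave.yellow
192.168.10.23 slave.red
"""

if __name__ == "__main__":
    table = parse_hosts(MASTER_HOSTS)
    print(resolves_to_loopback(table, "master"))  # False
```

To check a real machine, read the file with `open("/etc/hosts").read()` instead of using the inline string. If the function returns True for `master` on the master node, the slaves will not be able to connect to it.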

Tested on 2008/9/14 with Hadoop 0.18.0, JDK 1.6.0_10, and CentOS 5.2.
