Common Hadoop Problems

Today we will look at some common problems that people face while installing Hadoop. A few of them are listed below.

1. Problem with ssh configuration
    error: connection refused to port 22

2. Namenode not reachable
    error: Retrying to connect 127.0.0.1

1. Problem with ssh configuration: You may face many kinds of errors here, but the most common one while installing Hadoop is connection refused to port 22. In this case, check that the machine you are trying to log in to has an ssh server installed.
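
Before installing anything, it can help to confirm that nothing is listening on port 22. The sketch below is a hypothetical helper (port_open is not a standard tool); it assumes bash and the coreutils timeout command are available, and uses bash's /dev/tcp pseudo-device to attempt a TCP connection.

```shell
# Hypothetical helper: succeeds when a TCP connection to host:port can be opened.
port_open() {
    # $1 = host, $2 = port
    # bash -c receives the host as $0 and the port as $1.
    timeout 3 bash -c 'exec 3<>"/dev/tcp/$0/$1"' "$1" "$2" 2>/dev/null
}

# Example usage:
#   port_open localhost 22 && echo "ssh port is open" || echo "connection refused"
```

If the check fails on the machine you want to log in to, the ssh server is most likely missing or not running.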
   
If you are using Ubuntu/Lubuntu, you can install the ssh server using the following command.
   
   sudo apt-get install openssh-server
   
   On CentOS or Red Hat you can install the ssh server using the yum package manager
   
   sudo yum install openssh-server
   
   After installing the ssh server, make sure you have configured the keys properly and shared your public key with the machine that you want to log in to. If the problem persists, check the ssh configuration on your machine. You can find it in the /etc/ssh/sshd_config file. Use the following command to open this file
   
   sudo gedit /etc/ssh/sshd_config
   
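   In this file PubkeyAuthentication should be set to yes so that key-based (passwordless) login is allowed. On older OpenSSH versions RSAAuthentication should also be set to yes; that keyword applied to SSH protocol 1 and is no longer used by newer releases.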
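
   To check such settings without opening an editor, a small helper can print the value sshd would pick up for a given keyword. The sketch below is a hypothetical helper (sshd_option is not an OpenSSH tool). It assumes a flat config file and ignores Match blocks and Include files. It relies on sshd using the first value it finds for a keyword. PubkeyAuthentication in the usage comment is the OpenSSH keyword for key-based login.

```shell
# Hypothetical helper: print the first uncommented value of an sshd_config keyword.
sshd_option() {
    # $1 = keyword (case-insensitive), $2 = config file (defaults to the system one)
    awk -v key="$1" '
        /^[ \t]*#/ { next }                              # skip commented-out lines
        tolower($1) == tolower(key) { print $2; exit }   # first occurrence wins
    ' "${2:-/etc/ssh/sshd_config}"
}

# Example usage:
#   sshd_option PubkeyAuthentication
#   sshd_option RSAAuthentication
```

   An empty result means the keyword is not set explicitly, so the built-in default applies. Running sudo sshd -T should also print the full effective configuration.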
   
   After this, close the file and restart ssh with the following command
   
   sudo /etc/init.d/ssh restart
   
   Now your problem should be resolved. Apart from this error, you can face one more issue: even though you have configured the keys correctly, ssh still prompts for a password. In that case, check whether your keys are being managed by the ssh agent. Your keys should be in the $HOME/.ssh folder. Run the following command to add them to the agent.
   
   ssh-add
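
   The key setup described in this section can be sketched as follows. Both functions are hypothetical helpers, not part of OpenSSH. authorize_key appends a public key to an authorized_keys file only once and tightens the permissions, since sshd may ignore keys when these files are writable by other users. can_login_without_password uses the BatchMode option so that ssh fails instead of prompting for a password.

```shell
# Generate a key pair first if you do not have one (empty passphrase):
#   ssh-keygen -t rsa -P "" -f "$HOME/.ssh/id_rsa"

# Hypothetical helper: add a public key to authorized_keys without duplicating it.
authorize_key() {
    # $1 = public key file, $2 = authorized_keys file (defaults to the current user's)
    pub=$1
    auth=${2:-$HOME/.ssh/authorized_keys}
    dir=$(dirname "$auth")
    mkdir -p "$dir" && chmod 700 "$dir"
    touch "$auth" && chmod 600 "$auth"
    # Append the key only if the identical line is not already present.
    grep -qxF -f "$pub" "$auth" || cat "$pub" >> "$auth"
}

# Hypothetical helper: succeeds only when key-based login works.
can_login_without_password() {
    # $1 = host; BatchMode=yes disables password prompts
    ssh -o BatchMode=yes -o ConnectTimeout=5 "$1" true
}

# Example usage on a single-node setup:
#   authorize_key "$HOME/.ssh/id_rsa.pub"
#   can_login_without_password localhost && echo "passwordless login works"
```

   For a remote machine, ssh-copy-id user@host does the same job as authorize_key from the client side.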
   
 2. If your namenode is not reachable, the first thing you should check is the daemons running on the namenode machine. You can check that with the following command

   jps
   
   This command lists all Java processes running on your machine. If you do not see NameNode in the output, do the following. Stop Hadoop with the following command.
   
   $HADOOP_HOME/bin/stop-all.sh
   
   Format the namenode using the following command. Formatting wipes the existing HDFS metadata, so do this only on a fresh installation.
   
   $HADOOP_HOME/bin/hadoop namenode -format
   
   Start Hadoop with the following command
   
   $HADOOP_HOME/bin/start-all.sh
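
   The jps check from earlier can be scripted so that you do not have to read the output by eye. The sketch below is a hypothetical helper (daemon_running is not a Hadoop command). It assumes the usual jps output of one "pid name" pair per line and compares the name exactly, so NameNode does not match SecondaryNameNode.

```shell
# Hypothetical helper: succeeds when the named daemon appears in jps output on stdin.
daemon_running() {
    # $1 = daemon name as printed by jps, e.g. NameNode
    awk -v name="$1" '$2 == name { found = 1 } END { exit !found }'
}

# Example usage:
#   jps | daemon_running NameNode && echo "NameNode is up" || echo "NameNode is not running"
```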
   
   This time the namenode should run. If you are still not able to start the namenode, check the core-site.xml file in the conf directory of Hadoop with the following command
   
   gedit $HADOOP_HOME/conf/core-site.xml
   
   Check the value of the property hadoop.tmp.dir. It should be set to a path where the user who is trying to run Hadoop has write permissions. If you don't want to scratch your head over this, set it to the $HOME/hadoop_tmp directory, written out as an absolute path. Now save and close the file. Format the namenode again and try starting Hadoop again. Things should work this time.
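
   The permission check on hadoop.tmp.dir can also be done from the command line. The sketch below uses two hypothetical helpers (get_property and dir_usable are not Hadoop commands). It assumes each <name> and <value> element sits on a single line, as in a typical core-site.xml, and it does not expand Hadoop variables such as ${user.name} inside the value.

```shell
# Hypothetical helper: print the value of a property from a Hadoop XML config file.
get_property() {
    # $1 = config file, $2 = property name
    awk -v name="$2" '
        index($0, "<name>" name "</name>") { found = 1 }
        found && match($0, /<value>[^<]*<\/value>/) {
            # Strip the 7-character <value> and 8-character </value> tags.
            print substr($0, RSTART + 7, RLENGTH - 15)
            exit
        }
    ' "$1"
}

# Hypothetical helper: create the directory if needed and check that it is writable.
dir_usable() {
    # $1 = directory path
    mkdir -p "$1" 2>/dev/null && [ -w "$1" ]
}

# Example usage:
#   dir=$(get_property "$HADOOP_HOME/conf/core-site.xml" hadoop.tmp.dir)
#   dir_usable "$dir" && echo "$dir is writable" || echo "fix permissions on $dir"
```

   Run this as the same user that starts Hadoop, since the write check applies to the current user only.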

   That's all for this post. Please share the problems that you are facing and we will try to solve them together. Stay tuned for more stuff :)
   

Comments

  1. I too faced similar problems during my initial Hadoop installation.
    This saves a lot of time for the beginners.
    Again nice post Harjeet.

  2. Hi Harjeet,

    I found your post quite informative and helpful. Could you please add some more common types of issues/problems coming while working with hadoop.

    Thanks for your posts! Keep posting :)
