HADOOP 2.6.0 INSTALLATION GUIDE
By Mr. Gopal Krishna, Sr. Hadoop Architect

STEP 1: Update the system software repositories:

    sudo apt-get update

STEP 2: Install Java 1.7. This is a mandatory prerequisite for the Hadoop installation:

    sudo apt-get install openjdk-7-jdk

STEP 3: Install ssh (secure shell):

    sudo apt-get install ssh

By default, each SSH connection prompts for a password. To achieve password-less SSH communication (needed so the Hadoop scripts can start daemons without prompting), enter the below commands at any terminal:

    gopalkrishna@ubuntu:~$ ssh localhost
    gopalkrishna@ubuntu:~$ ssh-keygen -t dsa -P '' -f ~/.ssh/id_dsa
    gopalkrishna@ubuntu:~$ cat ~/.ssh/id_dsa.pub >> ~/.ssh/authorized_keys

After this, ssh localhost should connect without prompting for a password.

STEP 4: Install the Eclipse IDE:

    sudo apt-get install eclipse

STEP 5: Download the Hadoop 2.6.0 tarball from the Apache mirrors:

    http://apache.mirrors.tds.net/hadoop/core/

STEP 6: Create an 'INSTALL' directory in the '/home/gopalkrishna' folder.

STEP 7: Copy the downloaded 'hadoop-2.6.0.tar.gz' file to the INSTALL directory.

STEP 8: Extract the copied tarball:

    tar -xzf hadoop-2.6.0.tar.gz

Untarring produces the hadoop-2.6.0 directory.

STEP 9: Go to the below path to update the configuration files:

    /home/gopalkrishna/INSTALL/hadoop-2.6.0/etc/hadoop

STEP 10: We have to edit the below 8 configuration files as part of the Hadoop installation:

    1. core-site.xml
    2. mapred-site.xml
    3. mapred-env.sh
    4. yarn-site.xml
    5. hdfs-site.xml
    6. hadoop-env.sh
    7. yarn-env.sh
    8. slaves

STEP 11: Open core-site.xml and add the below 2 properties between the <configuration> and </configuration> tags. The first sets the Name Node URI, whose default
RPC port is 8020; the second, hadoop.tmp.dir, tells Hadoop where to store its working data in the local file system:

    <property>
        <name>fs.defaultFS</name>
        <value>hdfs://localhost:8020</value>
    </property>
    <property>
        <name>hadoop.tmp.dir</name>
        <value>file:/home/gopalkrishna/data/tmp</value>
    </property>

STEP 12: Open hadoop-env.sh and update the JAVA_HOME path:

    export JAVA_HOME=/usr/lib/jvm/java-7-openjdk-amd64

STEP 13: Open mapred-env.sh and update JAVA_HOME in the same way.

STEP 14: Open hdfs-site.xml and add the below properties. The first configures the replication factor, which is always 1 in a single-node cluster (the default is 3); the second specifies the path where the Name Node stores its metadata; the third specifies the path where the Data Node stores the actual data:

    <property>
        <name>dfs.replication</name>
        <value>1</value>
    </property>
    <property>
        <name>dfs.namenode.name.dir</name>
        <value>file:/home/gopalkrishna/data/dfs/name</value>
    </property>
    <property>
        <name>dfs.datanode.data.dir</name>
        <value>file:/home/gopalkrishna/data/dfs/data</value>
    </property>

STEP 15: Open mapred-site.xml and set the framework architecture to 'yarn'. This property specifies that MapReduce 2 runs on the YARN architecture:

    <property>
        <name>mapreduce.framework.name</name>
        <value>yarn</value>
    </property>

STEP 16: Open yarn-env.sh and update the JAVA_HOME path, as in STEP 12.

STEP 17: Open yarn-site.xml and add the below properties to configure the Resource Manager and Node Manager details:

    <property>
        <name>yarn.resourcemanager.hostname</name>
        <value>localhost</value>
    </property>
    <property>
        <name>yarn.nodemanager.aux-services</name>
        <value>mapreduce_shuffle</value>
    </property>
    <property>
        <name>yarn.nodemanager.aux-services.mapreduce_shuffle.class</name>
        <value>org.apache.hadoop.mapred.ShuffleHandler</value>
    </property>

These 3 properties specify the ResourceManager and NodeManager configuration.
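Malformed XML in these configuration files is a common reason the daemons later fail to start, so it is worth sanity-checking each edit before moving on. The sketch below regenerates core-site.xml from a heredoc and greps for the expected values; CONF_DIR is a stand-in for /home/gopalkrishna/INSTALL/hadoop-2.6.0/etc/hadoop (a temporary directory is used here so the sketch is safe to run anywhere).

```shell
# CONF_DIR stands in for the real etc/hadoop directory; a temp dir keeps
# this sketch harmless to run on any machine.
CONF_DIR=$(mktemp -d)

# Write core-site.xml with the two properties from STEP 11
cat > "$CONF_DIR/core-site.xml" <<'EOF'
<?xml version="1.0"?>
<configuration>
  <property>
    <name>fs.defaultFS</name>
    <value>hdfs://localhost:8020</value>
  </property>
  <property>
    <name>hadoop.tmp.dir</name>
    <value>file:/home/gopalkrishna/data/tmp</value>
  </property>
</configuration>
EOF

# Sanity check: the Name Node URI should be present before starting daemons
grep -q 'hdfs://localhost:8020' "$CONF_DIR/core-site.xml" && echo "core-site.xml looks good"
```

The same grep-after-edit check applies to hdfs-site.xml, mapred-site.xml, and yarn-site.xml.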
STEP 18: Open the slaves file and check whether the hostname is localhost. In a single-node pseudo-distributed cluster setup, the slaves file always points to "localhost", whereas in a distributed cluster setup we must specify the hostname of every individual Data Node here.

STEP 19: Open the bashrc file ( ~/.bashrc ) and update the JAVA_HOME, HADOOP_HOME, and PATH variables ($HADOOP_HOME/bin holds the hadoop and hdfs commands, and $HADOOP_HOME/sbin holds the start/stop scripts):

    export JAVA_HOME=/usr/lib/jvm/java-7-openjdk-amd64
    export HADOOP_HOME=/home/gopalkrishna/INSTALL/hadoop-2.6.0
    export PATH=$PATH:$JAVA_HOME/bin:$HADOOP_HOME/bin:$HADOOP_HOME/sbin

STEP 20: To check the bashrc changes, open a new terminal and type the below command:

    echo $HADOOP_HOME

STEP 21: Before starting the Name Node for the first time, we must format it; otherwise the daemons won't start:

    hadoop namenode -format

STEP 22: To start all the Hadoop daemons in 2.x.x, use the start-all.sh command. This starts:

    HDFS storage daemons: Name Node, Data Node, Secondary Name Node
    MapReduce processing daemons: Resource Manager, Node Manager

STEP 23: To access the Name Node information in the web UI: http://localhost:50070

STEP 24: To start the Job History Server in 2.x.x:

    mr-jobhistory-daemon.sh start historyserver

STEP 25: To access the Resource Manager: http://localhost:8088

STEP 26: To access the Job History Server: http://localhost:19888

STEP 27: To stop all the Hadoop daemons in 2.x.x, use the stop-all.sh command.

STEP 28: To stop the Job History Server in 2.x.x:

    mr-jobhistory-daemon.sh stop historyserver

Note: yarn-site.xml should also define the property yarn.nodemanager.aux-services.mapreduce_shuffle.class with the value org.apache.hadoop.mapred.ShuffleHandler (see STEP 17).
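The environment setup from STEP 19 can be exercised in a shell before committing it to ~/.bashrc. The paths below are the guide's example locations and will differ on other machines; this is a sketch of the verification, not a required step.

```shell
# Paths are the guide's example locations; adjust them for your machine.
export JAVA_HOME=/usr/lib/jvm/java-7-openjdk-amd64
export HADOOP_HOME=/home/gopalkrishna/INSTALL/hadoop-2.6.0
# bin/ holds the hadoop and hdfs commands; sbin/ holds start-all.sh and stop-all.sh
export PATH=$PATH:$JAVA_HOME/bin:$HADOOP_HOME/bin:$HADOOP_HOME/sbin

# Both checks should print a non-empty path before any hadoop command is run
echo "JAVA_HOME=$JAVA_HOME"
echo "HADOOP_HOME=$HADOOP_HOME"
```

If echo $HADOOP_HOME prints nothing in a new terminal, re-check that the export lines were saved to ~/.bashrc and that the terminal actually sourced it.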