Single-Node Hadoop Installation
1.1.1 Environment Preparation
The Hadoop platform is built on a single CentOS virtual server. Table 1 lists the machine details:
Table 1 Host environment
Name        Value
IP          10.1.1.20
hostname    master.hadoop
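For convenience, the host settings that need to be changed are listed below:
● Changing the IP address
The IP address is configured in the /etc/sysconfig/network-scripts/ directory. Edit the ifcfg-eth0 file with vi so that it looks like this: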
[root@master network-scripts]# cd /etc/sysconfig/network-scripts/
[root@master network-scripts]# cat ifcfg-eth0
DEVICE="eth0"
ONBOOT=yes
TYPE=Ethernet
BOOTPROTO=none
IPADDR=10.1.1.20
PREFIX=24
GATEWAY=10.1.1.1
DEFROUTE=yes
HWADDR=00:30:16:AF:00:D1
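After editing the file, it can help to read individual settings back rather than eyeballing the whole listing. The following is a minimal sketch; ifcfg_get is a hypothetical helper name (not a CentOS command) and it assumes the simple KEY=value layout shown above.

```shell
# Print the value of KEY from an ifcfg-style KEY=value file, without quotes.
# ifcfg_get is a hypothetical helper for illustration, not part of CentOS.
ifcfg_get() {
    file=$1
    key=$2
    # Keep only the line starting with "KEY=", drop the key, strip double quotes.
    sed -n "s/^${key}=//p" "$file" | tr -d '"'
}
```

For example, ifcfg_get /etc/sysconfig/network-scripts/ifcfg-eth0 IPADDR should print 10.1.1.20 for the file above.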
● Changing the hostname
The hostname is set in the /etc/sysconfig/network file. After the change, the file looks like this:
[root@master network-scripts]# cat /etc/sysconfig/network
NETWORKING=yes
HOSTNAME=master.hadoop
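● Changing hostname resolution
Hostname resolution is configured in the /etc/hosts file, which is a local lookup table rather than a DNS server. After the change, the file looks like this: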
[root@master network-scripts]# cat /etc/hosts
10.1.1.20 master.hadoop master
127.0.0.1 localhost.localdomain localhost
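To confirm which address a name maps to in a hosts-format file, a small lookup can be sketched as follows. hosts_ip is a hypothetical helper written for this illustration; it only reads the file it is given and does not consult the system resolver.

```shell
# Print the IP address that a hosts-format file assigns to a name.
# hosts_ip is a hypothetical helper for illustration.
hosts_ip() {
    file=$1
    name=$2
    awk -v name="$name" '
        /^[[:space:]]*#/ { next }          # skip comment lines
        {
            for (i = 2; i <= NF; i++)      # fields after the IP are host names
                if ($i == name) { print $1; exit }
        }
    ' "$file"
}
```

For the file above, hosts_ip /etc/hosts master.hadoop should print 10.1.1.20. On a live system, getent hosts master.hadoop performs a similar lookup through the system resolver.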
● Testing the environment
Use ping to check that master.hadoop resolves and is reachable:
[root@master network-scripts]# ping master.hadoop
PING master.hadoop (10.1.1.20) 56(84) bytes of data.
64 bytes from master.hadoop (10.1.1.20): icmp_seq=1 ttl=64 time=0.040 ms
64 bytes from master.hadoop (10.1.1.20): icmp_seq=2 ttl=64 time=0.016 ms
--- master.hadoop ping statistics ---
2 packets transmitted, 2 received, 0% packet loss, time 1467ms
rtt min/avg/max/mdev = 0.016/0.028/0.040/0.012 ms
1.1.2 Java Installation and Deployment
Hadoop requires a Java environment, normally Java 1.6 or later. The JDK can be downloaded from the official Java website at:
http://www.oracle.com/technetwork/java/javase/downloads/jdk-6u25-download-346242.html
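On that page, select jdk-6u25-linux-x64-rpm.bin. The download starts once you accept the license agreement.
● Installing Java
Copy the downloaded file to the /home directory on the master.hadoop host, then install it: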
[root@master home]# chmod u+x jdk-6u25-linux-x64-rpm.bin
[root@master home]# ./jdk-6u25-linux-x64-rpm.bin
● Configuring Java
After Java is installed, add its directory settings as environment variables to /etc/profile, as shown below:
[root@master home]# vi /etc/profile
JAVA_HOME=/usr/java/jdk1.6.0_25
CLASSPATH=.:$JAVA_HOME/lib
PATH=$PATH:$JAVA_HOME/bin:$JAVA_HOME/jre/bin
export JAVA_HOME CLASSPATH PATH
Once the environment variables are configured, reload the file so that they take effect in the current shell:
[root@master jdk1.6.0_25]# source /etc/profile
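The step above reloads the profile but does not itself verify anything. One way to check the result is sketched below; check_java_home is a hypothetical helper that only tests whether an executable bin/java exists under the given directory.

```shell
# Return 0 if the given directory looks like a usable JDK home.
# check_java_home is a hypothetical helper for illustration.
check_java_home() {
    dir=$1
    if [ -x "$dir/bin/java" ]; then
        echo "OK: $dir/bin/java is executable"
    else
        echo "FAIL: no executable java under $dir/bin" >&2
        return 1
    fi
}
```

A typical use would be check_java_home "$JAVA_HOME" && java -version, which prints the installed Java version only if the directory check passes.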
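1.1.3 SSH Configuration
Configure SSH for passwordless, public-key login. The steps are: create a new hadoop account, generate an SSH key pair for that account, configure the authorized keys file, and set the SSH service login method. The details are given below:
● Creating the hadoop account
[root@master jdk1.6.0_25]# useradd hadoop #create the account
[root@master jdk1.6.0_25]# passwd hadoop #set the password
● Generating the key pair
[hadoop@master ~]$ ssh-keygen #generate the SSH key pair; press Enter at each prompt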
Generating public/private rsa key pair.
Enter file in which to save the key (/home/hadoop/.ssh/id_rsa):
Enter passphrase (empty for no passphrase):
Enter same passphrase again:
Your identification has been saved in /home/hadoop/.ssh/id_rsa.
Your public key has been saved in /home/hadoop/.ssh/id_rsa.pub.
The key fingerprint is:
86:b5:d9:6a:ea:03:4e:5a:97:e5:24:5b:1f:65:41:89 hadoop@master.hadoop
The key's randomart image is:
+--[ RSA 2048]----+
| ooo |
| E + |
| . o |
| .o++. |
| .OS... |
| + +.... |
| = o o |
| . . .o |
| .o. |
+-----------------+
[hadoop@master ~]$ cd .ssh/
[hadoop@master .ssh]$ ls
id_rsa id_rsa.pub
● Configuring authorization
[hadoop@master ~]$ cat ~/.ssh/id_rsa.pub >> ~/.ssh/authorized_keys
[hadoop@master ~]$ chmod 700 ~/.ssh
[hadoop@master ~]$ chmod 600 ~/.ssh/authorized_keys
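Because the authorization step appends with >>, running it a second time leaves a duplicate entry in authorized_keys. The sketch below shows one way to make the step repeatable; authorize_key is a hypothetical helper, and it assumes the public key file contains a single line.

```shell
# Append a public key to an authorized_keys file only if it is not already there.
# authorize_key is a hypothetical helper for illustration.
authorize_key() {
    pub=$1
    auth=$2
    mkdir -p "$(dirname "$auth")"
    touch "$auth"
    # -x: whole-line match, -F: fixed string, -f: read the pattern from the key file
    if ! grep -qxFf "$pub" "$auth"; then
        cat "$pub" >> "$auth"
    fi
    # Same permissions as in the steps above.
    chmod 700 "$(dirname "$auth")"
    chmod 600 "$auth"
}
```

Run as the hadoop user, authorize_key ~/.ssh/id_rsa.pub ~/.ssh/authorized_keys has the same effect as the three commands above but can be repeated safely.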
● Testing
[hadoop@master jdk1.6.0_25]$ ssh master.hadoop
Last login: Wed Jun 13 18:29:29 2012 from master.hadoop
1.1.4 Hadoop Installation and Configuration
The Hadoop release used here is hadoop-0.20.2.tar.gz, available at:
http://mirror.bit.edu.cn/apache/hadoop/common/hadoop-0.20.2/hadoop-0.20.2.tar.gz
● Installing Hadoop
[root@master home]# tar xzvf hadoop-0.20.2.tar.gz
[root@master home]# mv hadoop-0.20.2 /usr/local
[root@master home]# cd /usr/local
[root@master local]# ls
bin etc games hadoop-0.20.2 include lib lib64 libexec sbin share src
[root@master local]# mv hadoop-0.20.2/ hadoop
[root@master local]# ls
bin etc games hadoop include lib lib64 libexec sbin share src
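[root@master local]# chown -R hadoop:hadoop /usr/local/hadoop/ #change ownership
● Configuring environment variables
As with Java, add the Hadoop environment variables to /etc/profile. Hadoop's own environment file, $HADOOP_HOME/conf/hadoop-env.sh, must be edited as well. The details are shown below: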
[root@master local]# vi /etc/profile
HADOOP_HOME=/usr/local/hadoop
HADOOP_CONF_DIR=$HADOOP_HOME/conf
CLASSPATH=.:$JAVA_HOME/lib:$HADOOP_HOME/lib
PATH=$PATH:$JAVA_HOME/bin:$JAVA_HOME/jre/bin:$HADOOP_HOME/bin
"/etc/profile" 73L, 1660C written
[root@master local]# source /etc/profile
[root@master conf]# vi hadoop-env.sh
export JAVA_HOME=/usr/java/jdk1.6.0_25
export HADOOP_CLASSPATH="$HADOOP_CLASSPATH"
export HADOOP_HEAPSIZE=2048
export HADOOP_LOG_DIR=/var/local/logs
export HADOOP_PID_DIR=/var/local/pids
[root@master bin]# export JAVA_HOME
[root@master bin]# export HADOOP_HOME
[root@master bin]# export HADOOP_CONF_DIR
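Variables assigned in /etc/profile without export are visible in the current shell but not to the Hadoop scripts it launches, which is why the export commands above matter. A minimal check is sketched below; check_exported is a hypothetical helper that relies on printenv, which only sees exported variables.

```shell
# Report which of the named variables are missing from the exported environment.
# check_exported is a hypothetical helper; printenv only sees exported variables.
check_exported() {
    status=0
    for name in "$@"; do
        if printenv "$name" > /dev/null; then
            echo "exported: $name"
        else
            echo "NOT exported: $name" >&2
            status=1
        fi
    done
    return $status
}
```

For this setup, the call would be check_exported JAVA_HOME HADOOP_HOME HADOOP_CONF_DIR.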
● Hadoop configuration files
Configure three XML files: core-site.xml, hdfs-site.xml, and mapred-site.xml. The resulting contents are shown below:
File: core-site.xml
[root@master conf]# vi core-site.xml
<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<!-- Put site-specific property overrides in this file. -->
<configuration>
<property>
<name>fs.default.name</name>
<value>hdfs://localhost:9000</value>
</property>
</configuration>
File: hdfs-site.xml
[root@master conf]# vi hdfs-site.xml
<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<!-- Put site-specific property overrides in this file. -->
<configuration>
<property>
<name>dfs.replication</name>
<value>1</value>
</property>
</configuration>
File: mapred-site.xml
[root@master conf]# vi mapred-site.xml
<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<!-- Put site-specific property overrides in this file. -->
<configuration>
<property>
<name>mapred.job.tracker</name>
<value>localhost:9001</value>
</property>
</configuration>
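To double-check the values in the three files after editing, a property can be read back with a short sed script. This is only a sketch: site_prop is a hypothetical helper, and it assumes that each <name> line is immediately followed by its <value> line, as in the listings above. It is not a real XML parser.

```shell
# Print the <value> that follows a given <name> in a Hadoop *-site.xml file.
# site_prop is a hypothetical helper. It assumes <name> and <value> sit on
# consecutive lines; it does not parse XML in general.
site_prop() {
    file=$1
    name=$2
    sed -n "/<name>${name}<\/name>/{
        n
        s/.*<value>\(.*\)<\/value>.*/\1/p
    }" "$file"
}
```

For example, site_prop core-site.xml fs.default.name should print hdfs://localhost:9000 for the configuration above.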
● Formatting the Hadoop file system
Change to the bin directory, which contains the hadoop executable, and format the file system:
[root@master bin]# hadoop namenode -format
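Formatting an already formatted NameNode discards the existing file system metadata, so it can be worth guarding the command. The sketch below uses a hypothetical helper, needs_format, which treats a name directory as formatted when it contains current/VERSION. The directory used in the usage note, /tmp/hadoop-root/dfs/name, is an assumption: it is the expected default for the root user when neither hadoop.tmp.dir nor dfs.name.dir is set, as in the configuration above.

```shell
# Return 0 if the NameNode storage directory has not been formatted yet.
# needs_format is a hypothetical helper; a formatted name directory
# is assumed to contain the file current/VERSION.
needs_format() {
    name_dir=$1
    [ ! -f "$name_dir/current/VERSION" ]
}
```

A guarded format would then read: needs_format /tmp/hadoop-root/dfs/name && hadoop namenode -format.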
● Starting Hadoop
[root@master bin]# ./start-all.sh
starting namenode, logging to /var/local/logs/hadoop-root-namenode-master.hadoop.out
localhost: starting datanode, logging to /var/local/logs/hadoop-root-datanode-master.hadoop.out
localhost: starting secondarynamenode, logging to /var/local/logs/hadoop-root-secondarynamenode-master.hadoop.out
starting jobtracker, logging to /var/local/logs/hadoop-root-jobtracker-master.hadoop.out
localhost: starting tasktracker, logging to /var/local/logs/hadoop-root-tasktracker-master.hadoop.out
1.1.5 Testing Hadoop
Run jps to confirm that the Hadoop daemons are running:
[root@master hadoop]# jps
2459 JobTracker
2284 DataNode
2204 NameNode
2860 Jps
2382 SecondaryNameNode
2575 TaskTracker
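Checking the jps listing by eye is easy to get wrong when a daemon has silently exited. The following sketch reads jps output and prints any of the five daemons shown above that are absent; missing_daemons is a hypothetical helper name.

```shell
# Read jps output on stdin and print the expected daemons that are absent.
# missing_daemons is a hypothetical helper; the list matches the five
# daemons in the jps output above.
missing_daemons() {
    running=$(awk '{ print $2 }')      # second column of jps output is the class name
    for daemon in NameNode DataNode SecondaryNameNode JobTracker TaskTracker; do
        # -x matches whole lines, so "NameNode" does not match "SecondaryNameNode"
        if ! printf '%s\n' "$running" | grep -qx "$daemon"; then
            echo "$daemon"
        fi
    done
}
```

Used as jps | missing_daemons, it prints nothing when all five daemons are up.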