[综合]Apache Hadoop 2.2.0集群安装(2)[翻译]
$ $HADOOP_PREFIX/bin/hdfs namenode -format <cluster_name>
在NameNode执行如下命令去启动hdfs:
?
?
$ $HADOOP_PREFIX/sbin/hadoop-daemon.sh --config $HADOOP_CONF_DIR --script hdfs start namenode
在所有的从节点上执行如下命令启动DataNodes?:
?
?
$ $HADOOP_PREFIX/sbin/hadoop-daemon.sh --config $HADOOP_CONF_DIR --script hdfs start datanode
在ResourceManager上执行如下命令去启动YARN
$ $HADOOP_YARN_HOME/sbin/yarn-daemon.sh --config $HADOOP_CONF_DIR start resourcemanager
在所有的从节点上执行如下命令去启动NodeManagers?:
?
?
?
$ $HADOOP_YARN_HOME/sbin/yarn-daemon.sh --config $HADOOP_CONF_DIR start nodemanager
单独启动一个web服务器,如果需要负载均衡的话那么在每个机子上都执行如下脚本:
?
?
$ $HADOOP_YARN_HOME/sbin/yarn-daemon.sh start proxyserver --config $HADOOP_CONF_DIR
在任何一台机子上执行如下命令去启动MapReduce JobHistory?服务:
?
?
$HADOOP_PREFIX/sbin/mr-jobhistory-daemon.sh start historyserver --config $HADOOP_CONF_DIR
Hadoop集群关闭
?
在NameNode?节点上执行如下命令去关闭NameNode进程:
?
$ $HADOOP_PREFIX/sbin/hadoop-daemon.sh --config $HADOOP_CONF_DIR --script hdfs stop namenode
在所有的从节点上执行如下脚本去停止DataNodes?进程:
?
?
$ $HADOOP_PREFIX/sbin/hadoop-daemon.sh --config $HADOOP_CONF_DIR --script hdfs stop datanode
在ResourceManager?节点上执行如下命令可以停止ResourceManager?进程:
?
?
$ $HADOOP_YARN_HOME/sbin/yarn-daemon.sh --config $HADOOP_CONF_DIR stop resourcemanager
在所有从节点执行如下命令去停止NodeManagers?进程:
?
?
$ $HADOOP_YARN_HOME/sbin/yarn-daemon.sh --config $HADOOP_CONF_DIR stop nodemanager
在运行WebAppProxy?的节点上执行如下命令可以停止WebAppProxy?服务:
?
?
$ $HADOOP_YARN_HOME/sbin/yarn-daemon.sh stop proxyserver --config $HADOOP_CONF_DIR
在运行MapReduce JobHistory?服务的节点上执行如下命令去停止MapReduce JobHistory?服务:
?
?
$ $HADOOP_PREFIX/sbin/mr-jobhistory-daemon.sh stop historyserver --config $HADOOP_CONF_DIR
?
?
Hadoop在安全模式下运行
本节将讲述一些在安全模式下运行的参数,安全模式是可靠的基于Kerberos协议认证的。
Hadoop进程的用户账户
确保HDFS和YARN进程是由不同的Unix用户启动的,如hdfs,yarn,并且MapReduce JobHistory?是由mapred启动的。
推荐他们都属于同一个组如Hadoop:
$ /usr/kerberos/bin/klist -e -k -t /etc/security/keytab/nn.service.keytabKeytab name: FILE:/etc/security/keytab/nn.service.keytabKVNO Timestamp Principal 4 07/18/11 21:08:09 nn/full.qualified.domain.name@REALM.TLD (AES-256 CTS mode with 96-bit SHA-1 HMAC) 4 07/18/11 21:08:09 nn/full.qualified.domain.name@REALM.TLD (AES-128 CTS mode with 96-bit SHA-1 HMAC) 4 07/18/11 21:08:09 nn/full.qualified.domain.name@REALM.TLD (ArcFour with HMAC/md5) 4 07/18/11 21:08:09 host/full.qualified.domain.name@REALM.TLD (AES-256 CTS mode with 96-bit SHA-1 HMAC) 4 07/18/11 21:08:09 host/full.qualified.domain.name@REALM.TLD (AES-128 CTS mode with 96-bit SHA-1 HMAC) 4 07/18/11 21:08:09 host/full.qualified.domain.name@REALM.TLD (ArcFour with HMAC/md5)Secondary NameNode 的keytab文件如下:
?
?
$ /usr/kerberos/bin/klist -e -k -t /etc/security/keytab/sn.service.keytabKeytab name: FILE:/etc/security/keytab/sn.service.keytabKVNO Timestamp Principal 4 07/18/11 21:08:09 sn/full.qualified.domain.name@REALM.TLD (AES-256 CTS mode with 96-bit SHA-1 HMAC) 4 07/18/11 21:08:09 sn/full.qualified.domain.name@REALM.TLD (AES-128 CTS mode with 96-bit SHA-1 HMAC) 4 07/18/11 21:08:09 sn/full.qualified.domain.name@REALM.TLD (ArcFour with HMAC/md5) 4 07/18/11 21:08:09 host/full.qualified.domain.name@REALM.TLD (AES-256 CTS mode with 96-bit SHA-1 HMAC) 4 07/18/11 21:08:09 host/full.qualified.domain.name@REALM.TLD (AES-128 CTS mode with 96-bit SHA-1 HMAC) 4 07/18/11 21:08:09 host/full.qualified.domain.name@REALM.TLD (ArcFour with HMAC/md5)
DataNode 的keytab文件如下:
?
?
$ /usr/kerberos/bin/klist -e -k -t /etc/security/keytab/dn.service.keytabKeytab name: FILE:/etc/security/keytab/dn.service.keytabKVNO Timestamp Principal 4 07/18/11 21:08:09 dn/full.qualified.domain.name@REALM.TLD (AES-256 CTS mode with 96-bit SHA-1 HMAC) 4 07/18/11 21:08:09 dn/full.qualified.domain.name@REALM.TLD (AES-128 CTS mode with 96-bit SHA-1 HMAC) 4 07/18/11 21:08:09 dn/full.qualified.domain.name@REALM.TLD (ArcFour with HMAC/md5) 4 07/18/11 21:08:09 host/full.qualified.domain.name@REALM.TLD (AES-256 CTS mode with 96-bit SHA-1 HMAC) 4 07/18/11 21:08:09 host/full.qualified.domain.name@REALM.TLD (AES-128 CTS mode with 96-bit SHA-1 HMAC) 4 07/18/11 21:08:09 host/full.qualified.domain.name@REALM.TLD (ArcFour with HMAC/md5)
YARN:
?
ResourceManager?节点上的ResourceManager?keytab文件如下:
?
$ /usr/kerberos/bin/klist -e -k -t /etc/security/keytab/rm.service.keytabKeytab name: FILE:/etc/security/keytab/rm.service.keytabKVNO Timestamp Principal 4 07/18/11 21:08:09 rm/full.qualified.domain.name@REALM.TLD (AES-256 CTS mode with 96-bit SHA-1 HMAC) 4 07/18/11 21:08:09 rm/full.qualified.domain.name@REALM.TLD (AES-128 CTS mode with 96-bit SHA-1 HMAC) 4 07/18/11 21:08:09 rm/full.qualified.domain.name@REALM.TLD (ArcFour with HMAC/md5) 4 07/18/11 21:08:09 host/full.qualified.domain.name@REALM.TLD (AES-256 CTS mode with 96-bit SHA-1 HMAC) 4 07/18/11 21:08:09 host/full.qualified.domain.name@REALM.TLD (AES-128 CTS mode with 96-bit SHA-1 HMAC) 4 07/18/11 21:08:09 host/full.qualified.domain.name@REALM.TLD (ArcFour with HMAC/md5)
NodeManager节点上的keytab文件如下:
?
?
$ /usr/kerberos/bin/klist -e -k -t /etc/security/keytab/nm.service.keytabKeytab name: FILE:/etc/security/keytab/nm.service.keytabKVNO Timestamp Principal 4 07/18/11 21:08:09 nm/full.qualified.domain.name@REALM.TLD (AES-256 CTS mode with 96-bit SHA-1 HMAC) 4 07/18/11 21:08:09 nm/full.qualified.domain.name@REALM.TLD (AES-128 CTS mode with 96-bit SHA-1 HMAC) 4 07/18/11 21:08:09 nm/full.qualified.domain.name@REALM.TLD (ArcFour with HMAC/md5) 4 07/18/11 21:08:09 host/full.qualified.domain.name@REALM.TLD (AES-256 CTS mode with 96-bit SHA-1 HMAC) 4 07/18/11 21:08:09 host/full.qualified.domain.name@REALM.TLD (AES-128 CTS mode with 96-bit SHA-1 HMAC) 4 07/18/11 21:08:09 host/full.qualified.domain.name@REALM.TLD (ArcFour with HMAC/md5)
MapReduce JobHistory Server:
?
MapReduce JobHistory Server keytab 文件如下:
?
$ /usr/kerberos/bin/klist -e -k -t /etc/security/keytab/jhs.service.keytabKeytab name: FILE:/etc/security/keytab/jhs.service.keytabKVNO Timestamp Principal 4 07/18/11 21:08:09 jhs/full.qualified.domain.name@REALM.TLD (AES-256 CTS mode with 96-bit SHA-1 HMAC) 4 07/18/11 21:08:09 jhs/full.qualified.domain.name@REALM.TLD (AES-128 CTS mode with 96-bit SHA-1 HMAC) 4 07/18/11 21:08:09 jhs/full.qualified.domain.name@REALM.TLD (ArcFour with HMAC/md5) 4 07/18/11 21:08:09 host/full.qualified.domain.name@REALM.TLD (AES-256 CTS mode with 96-bit SHA-1 HMAC) 4 07/18/11 21:08:09 host/full.qualified.domain.name@REALM.TLD (AES-128 CTS mode with 96-bit SHA-1 HMAC) 4 07/18/11 21:08:09 host/full.qualified.domain.name@REALM.TLD (ArcFour with HMAC/md5)
?
安全模式配置:
conf/core-site.xml:
$ mvn package -Dcontainer-executor.conf.dir=/etc/hadoop/通过?-Dcontainer-executor.conf.dir传过来的路径集群节点上必须有且是本地的路径,执行文件必须在$HADOOP_YARN_HOME/bin中有。执行文件必须有权限:6050 or --Sr-s---?,NodeManager?的unix用户必须同组,这个组必须是个特殊的组,如果其他应用程序具有这个组的权限那么他将是不安全的,这个组的名称需要在?yarn.nodemanager.linux-container-executor.group?属性中配置涉及到conf/yarn-site.xml?and?conf/container-executor.cfg两个文件。
?
如:NodeManager?的启动用户为yarn?为hadoop组,users组中有如下两个用户yarn?和alice(应用程序提交者)?同时alice?不属于hadoop组如上所述那么setuid/setgid 执行文件必须设置权限为 6050 or --Sr-s---?,yarn?用户和hadoop?组(这样alice?就不能执行了)。
LinuxTaskController 需要的目录?yarn.nodemanager.local-dirs?andyarn.nodemanager.log-dirs他们的权限设置为755?。
conf/container-executor.cfg:
执行文件需要一个配置文件container-executor.cfg上面mvn提到的,此文件必须为运行NodeManager?的用户所有(如上面的yarn?),任意组那么权限为:0400 or r--------.
执行文件需要下属参数在conf/container-executor.cfg配置,以key-value对出现,并且一行一个。
[hdfs]$ $HADOOP_PREFIX/bin/hdfs namenode -format <cluster_name>在NameNode?节点上启动hdfs,用户为hdfs用户:
[hdfs]$ $HADOOP_PREFIX/sbin/hadoop-daemon.sh --config $HADOOP_CONF_DIR --script hdfs start namenode
在DataNodes?节点上启动DataNodes?用户为root,设置环境变量HADOOP_SECURE_DN_USER为hdfs:
[root]$ HADOOP_SECURE_DN_USER=hdfs $HADOOP_PREFIX/sbin/hadoop-daemon.sh --config $HADOOP_CONF_DIR --script hdfs start datanode
在ResourceManager?节点上执行如下命令启动YARN,用户为yarn:
[yarn]$ $HADOOP_YARN_HOME/sbin/yarn-daemon.sh --config $HADOOP_CONF_DIR start resourcemanager
在其他从节点上执行如下命令启动NodeManagers,用户为yarn:
[yarn]$ $HADOOP_YARN_HOME/sbin/yarn-daemon.sh --config $HADOOP_CONF_DIR start nodemanager
用户yarn启动一个WebAppProxy?服务如果需要启动多个去负载均衡那么就用同样的方式启动多个:
[yarn]$ $HADOOP_YARN_HOME/bin/yarn start proxyserver --config $HADOOP_CONF_DIR
用mapred用户启动MapReduce JobHistory Server?:
[mapred]$ $HADOOP_PREFIX/sbin/mr-jobhistory-daemon.sh start historyserver --config $HADOOP_CONF_DIR
?
hadoop集群关闭:
用户hdfs执行如下命令关闭NameNode?:
[hdfs]$ $HADOOP_PREFIX/sbin/hadoop-daemon.sh --config $HADOOP_CONF_DIR --script hdfs stop namenode
root用户在所有从节点上执行如下命令停止DataNodes?:
[root]$ $HADOOP_PREFIX/sbin/hadoop-daemon.sh --config $HADOOP_CONF_DIR --script hdfs stop datanode
yarn用户在ResourceManager?节点上执行如下命令关闭ResourceManager:
[yarn]$ $HADOOP_YARN_HOME/sbin/yarn-daemon.sh --config $HADOOP_CONF_DIR stop resourcemanager
yarn用户在所有的从节点上执行如下命令结束NodeManagers:
[yarn]$ $HADOOP_YARN_HOME/sbin/yarn-daemon.sh --config $HADOOP_CONF_DIR stop nodemanager
yarn用户在WebAppProxy server.节点上执行如下命令停止WebAppProxy server.如果有多台那么依次:
[yarn]$ $HADOOP_YARN_HOME/bin/yarn stop proxyserver --config $HADOOP_CONF_DIR
mapred用户执行如下命令停止MapReduce JobHistory Server:
[mapred]$ $HADOOP_PREFIX/sbin/mr-jobhistory-daemon.sh stop historyserver --config $HADOOP_CONF_DIR
?
Web监控页面
一旦集群启动之后可以通过web-ui监控进程运行情况:
MapReduce JobHistory Serverhttp://jhs_host:port/Default HTTP port is 19888.???? ?????
?
?