Hadoop Cluster Installation & HBase Experiment Environment Setup
1. Install the Ubuntu 10.04 operating system

Install and configure telnet:
1) Install:
# apt-get install xinetd telnetd
2) After the installation succeeds, edit the inetd configuration:
sudo vi /etc/inetd.conf
and add the following line:
telnet stream tcp nowait telnetd /usr/sbin/tcpd /usr/sbin/in.telnetd
3) sudo vi /etc/xinetd.conf and add the following:
# Simple configuration file for xinetd
#
# Some defaults, and include /etc/xinetd.d/
defaults
{
    # Please note that you need a log_type line to be able to use log_on_success
    # and log_on_failure. The default is the following:
    # log_type = SYSLOG daemon info

    instances       = 60
    log_type        = SYSLOG authpriv
    log_on_success  = HOST PID
    log_on_failure  = HOST
    cps             = 25 30
}
includedir /etc/xinetd.d

4) sudo vi /etc/xinetd.d/telnet and add the following:

# default: on
# description: The telnet server serves telnet sessions; it uses \
#              unencrypted username/password pairs for authentication.
service telnet
{
    disable         = no
    flags           = REUSE
    socket_type     = stream
    wait            = no
    user            = root
    server          = /usr/sbin/in.telnetd
    log_on_failure  = USERID
}

5) Reboot the machine, or just restart the service:
sudo /etc/init.d/xinetd restart
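Before moving on, it may be worth a quick sanity check that the service actually came up; a minimal sketch (telnet listens on port 23 by default):

# xinetd should now be listening on the telnet port (23)
netstat -tlnp | grep :23
# or simply try a local login prompt
telnet localhost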
Install vim (this also fixes the well-known Ubuntu problem where the arrow keys type letters in vi):
sudo apt-get remove vim-common && sudo apt-get install vim

2. Configure the update sources (optional; not strictly necessary)

Configure the NetEase (163.com) mirror for Ubuntu 10.04:
vi /etc/apt/sources.list
deb http://mirrors.163.com/ubuntu/ lucid main universe restricted multiverse
deb-src http://mirrors.163.com/ubuntu/ lucid main universe restricted multiverse
deb http://mirrors.163.com/ubuntu/ lucid-security universe main multiverse restricted
deb-src http://mirrors.163.com/ubuntu/ lucid-security universe main multiverse restricted
deb http://mirrors.163.com/ubuntu/ lucid-updates universe main multiverse restricted
deb http://mirrors.163.com/ubuntu/ lucid-proposed universe main multiverse restricted
deb-src http://mirrors.163.com/ubuntu/ lucid-proposed universe main multiverse restricted
deb http://mirrors.163.com/ubuntu/ lucid-backports universe main multiverse restricted
deb-src http://mirrors.163.com/ubuntu/ lucid-backports universe main multiverse restricted
deb-src http://mirrors.163.com/ubuntu/ lucid-updates universe main multiverse restricted

3. Install Java 1.6
Download: http://www.oracle.com/technetwork/java/javase/downloads/jdk6downloads-1902814.html

Installation steps:
1) Add execute permission to the JDK installer:
# chmod +x jdk-6u43-linux-x64.bin
2) Run the installer:
# ./jdk-6u43-linux-x64.bin
3) Add the JAVA_HOME environment variable (adjust to your actual Java install path):
[root@test src]# vi /etc/profile
Append at the end:
# set java environment
export JAVA_HOME=/usr/java/jdk1.6.0_43
export CLASSPATH=.:$JAVA_HOME/lib/dt.jar:$JAVA_HOME/lib/tools.jar
export PATH=$PATH:$JAVA_HOME/bin
export JAVA_HOME CLASSPATH PATH
Save and exit.
4) Make the new variables take effect:
# source /etc/profile
5) Link the binaries into /usr/bin (the paths must match the version actually installed, here jdk1.6.0_43):
# cd /usr/bin
# ln -s -f /usr/java/jdk1.6.0_43/jre/bin/java
# ln -s -f /usr/java/jdk1.6.0_43/bin/javac
6) Verify on the command line:
root@ubuntu:/usr/bin# java -version
java version "1.6.0_43"
Java(TM) SE Runtime Environment (build 1.6.0_43-b01)
Java HotSpot(TM) 64-Bit Server VM (build 20.14-b01, mixed mode)
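As an alternative to the manual symlinks in step 5, Ubuntu's update-alternatives can manage the java/javac links; a sketch, assuming the JDK landed in /usr/java/jdk1.6.0_43:

# Register the Oracle JDK binaries with the alternatives system
sudo update-alternatives --install /usr/bin/java java /usr/java/jdk1.6.0_43/bin/java 100
sudo update-alternatives --install /usr/bin/javac javac /usr/java/jdk1.6.0_43/bin/javac 100
# Select them explicitly if other JVMs are present
sudo update-alternatives --config java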
Download Hadoop 1.0.4 (the stable release): http://mirrors.cnnic.cn/apache/hadoop/common/
Edit the hosts file on every machine:
vi /etc/hosts
IP address plan:
192.168.123.11 master
192.168.123.12 slave1
192.168.123.13 slave2
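After editing /etc/hosts on each machine, a quick check that every planned hostname resolves and answers:

# Each host should reply from its planned 192.168.123.x address
for h in master slave1 slave2; do ping -c 1 $h; done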
Install the lrzsz package (convenient for transferring files with SecureCRT).

Add users: create a dedicated Hadoop system user, hadoop:hduser:
$ sudo addgroup hadoop
$ sudo adduser --ingroup hadoop hduser

Clone the virtual machine twice!!! This takes 40-50 minutes.

4. Configure the SSH service

Run on every machine:
ssh-keygen -t rsa -P ""

Configure SSH (Hadoop manages its nodes over SSH).

Master node:
hduser@master:~$ scp .ssh/authorized_keys root@slave1:/home/root
hduser@master:~$ scp .ssh/authorized_keys root@slave2:/home/root

Slave1 node:
root@ubuntu:/home/hduser# chown -R hduser:hadoop authorized_keys
hduser@ubuntu:~$ cat .ssh/id_rsa.pub >> authorized_keys
hduser@ubuntu:~$ scp authorized_keys hduser@slave2:/home/hduser

Slave2 node:
hduser@ubuntu:~$ cat .ssh/id_rsa.pub >> authorized_keys
hduser@ubuntu:~$ cp authorized_keys .ssh/
hduser@ubuntu:~$ scp authorized_keys hduser@master:/home/hduser/.ssh
hduser@ubuntu:~$ scp authorized_keys hduser@slave1:/home/hduser/.ssh
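Before calling SSH done, it is worth verifying passwordless login from master to every node (the first connection to each host will ask to accept its host key):

# Every command must return the remote hostname without a password prompt
for h in master slave1 slave2; do ssh $h hostname; done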
If every login succeeds without a password, the SSH configuration is complete.

5. Disable IPv6

vi /etc/sysctl.conf and add to it:

# disable ipv6
net.ipv6.conf.all.disable_ipv6 = 1
net.ipv6.conf.default.disable_ipv6 = 1
net.ipv6.conf.lo.disable_ipv6 = 1
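A reboot will pick these up, but on most systems the new settings can also be applied immediately:

# Reload kernel parameters from /etc/sysctl.conf without rebooting
sudo sysctl -p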
Then reboot the machine and check whether IPv6 is really off:
$ cat /proc/sys/net/ipv6/conf/all/disable_ipv6
0 means IPv6 is still enabled; 1 means it is disabled.

Alternatively, just stop Hadoop itself from using IPv6 by adding to hadoop-env.sh:
export HADOOP_OPTS=-Djava.net.preferIPv4Stack=true

6. Install Hadoop

Change the owner of the tarball uploaded to /home/root to hduser (or simply upload it as hduser in the first place):
root@ubuntu:~# chown -R hduser:hadoop *.gz

(As hduser) extract hadoop-x.x.x.tar.gz into /usr/local:
~$ sudo tar xzf hadoop-1.0.4.tar.gz

This produced an error:
hduser is not in the sudoers file. This incident will be reported.

Fix:
1) Find where sudoers lives:
root@master:~# whereis sudoers
sudoers: /etc/sudoers.d /etc/sudoers /usr/share/man/man5/sudoers.5.gz
2) Make the file writable:
root@slave2:~# chmod u+w /etc/sudoers
3) Grant hduser sudo rights:
root@slave2:~# vi /etc/sudoers
Add:
hduser ALL=(ALL:ALL) ALL
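Note that editing /etc/sudoers by hand is risky; visudo validates the syntax before saving, and the file's permissions should go back to the stock 0440 mode afterwards. A safer variant of steps 2)-3), run as root:

# visudo refuses to save a syntactically broken sudoers file
visudo
#   (add the line: hduser ALL=(ALL:ALL) ALL)
# restore the standard permissions if they were loosened
chmod 0440 /etc/sudoers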
Extract Hadoop:
With sudo rights in place, run the extraction again:
hduser@master:/usr/local$ sudo tar xzf hadoop-1.0.4.tar.gz
Rename the directory:
hduser@master:/usr/local$ sudo mv hadoop-1.0.4 hadoop
Change its owner:
hduser@master:/usr/local$ sudo chown -R hduser:hadoop hadoop
Update $HOME/.bashrc (must be changed on every machine):

# Set Hadoop-related environment variables
export HADOOP_HOME=/usr/local/hadoop
# Set JAVA_HOME (we will also configure JAVA_HOME directly for Hadoop later on)
export JAVA_HOME=/usr/java/jdk1.6.0_43

# Some convenient aliases and functions for running Hadoop-related commands
unalias fs &> /dev/null
alias fs="hadoop fs"
unalias hls &> /dev/null
alias hls="fs -ls"

# If you have LZO compression enabled in your Hadoop cluster and
# compress job outputs with LZOP (not covered in this tutorial):
# conveniently inspect an LZOP compressed file from the command
# line; run via:
#
# $ lzohead /hdfs/path/to/lzop/compressed/file.lzo
#
# Requires installed 'lzop' command.
#
lzohead () {
    hadoop fs -cat $1 | lzop -dc | head -1000 | less
}

# Add Hadoop bin/ directory to PATH
export PATH=$PATH:$HADOOP_HOME/bin
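A quick way to confirm the environment is wired up correctly (after re-sourcing the profile):

# Reload .bashrc and check that the hadoop binary is on the PATH
source ~/.bashrc
hadoop version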
7. Configure Hadoop
1) Edit conf/hadoop-env.sh and set JAVA_HOME:
export JAVA_HOME=/usr/java/jdk1.6.0_43
2) On the Master node only:
Change the contents of conf/masters to:
master
Change the contents of conf/slaves to:
master
slave1
slave2
3) Edit conf/*-site.xml on all nodes.

conf/core-site.xml (all machines):
<property>
<name>fs.default.name</name>
  <value>hdfs://master:54310</value>
<description>The name of the default file system. A URI whose
scheme and authority determine the FileSystem implementation. The
uri's scheme determines the config property (fs.SCHEME.impl) naming
the FileSystem implementation class. The uri's authority is used to
determine the host, port, etc. for a filesystem.</description>
</property>
conf/mapred-site.xml (all machines):
<property>
<name>mapred.job.tracker</name>
<value>master:54311</value>
<description>The host and port that the MapReduce job tracker runs
at. If "local", then jobs are run in-process as a single map
and reduce task.
</description>
</property>
conf/hdfs-site.xml (all machines):
<property>
<name>dfs.replication</name>
<value>2</value>
<description>Default block replication.
The actual number of replications can be specified when the file is created.
The default is used if replication is not specified in create time.
</description>
</property>
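Since all three *-site.xml files must be identical on every node, one option is to finish them on master and push them out; a sketch, assuming Hadoop lives in /usr/local/hadoop on all nodes:

# Copy the finished configuration from master to both slaves
for h in slave1 slave2; do
  scp /usr/local/hadoop/conf/*-site.xml /usr/local/hadoop/conf/hadoop-env.sh \
      hduser@$h:/usr/local/hadoop/conf/
done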
8. Format HDFS
1) hduser@master:/usr/local/hadoop$ bin/hadoop namenode -format
13/04/09 06:50:03 INFO namenode.NameNode: STARTUP_MSG:
/************************************************************
STARTUP_MSG: Starting NameNode
STARTUP_MSG: host = master/127.0.1.1
STARTUP_MSG: args = [-format]
STARTUP_MSG: version = 1.0.4
STARTUP_MSG: build = https://svn.apache.org/repos/asf/hadoop/common/branches/branch-1.0 -r 1393290; compiled by 'hortonfo' on Wed Oct 3 05:13:58 UTC 2012
************************************************************/
13/04/09 06:50:04 INFO util.GSet: VM type = 64-bit
13/04/09 06:50:04 INFO util.GSet: 2% max memory = 19.33375 MB
13/04/09 06:50:04 INFO util.GSet: capacity = 2^21 = 2097152 entries
13/04/09 06:50:04 INFO util.GSet: recommended=2097152, actual=2097152
13/04/09 06:50:05 INFO namenode.FSNamesystem: fsOwner=hduser
13/04/09 06:50:05 INFO namenode.FSNamesystem: supergroup=supergroup
13/04/09 06:50:05 INFO namenode.FSNamesystem: isPermissionEnabled=true
13/04/09 06:50:05 INFO namenode.FSNamesystem: dfs.block.invalidate.limit=100
13/04/09 06:50:05 INFO namenode.FSNamesystem: isAccessTokenEnabled=false accessKeyUpdateInterval=0 min(s), accessTokenLifetime=0 min(s)
13/04/09 06:50:05 INFO namenode.NameNode: Caching file names occuring more than 10 times
13/04/09 06:50:07 INFO common.Storage: Image file of size 112 saved in 0 seconds.
13/04/09 06:50:07 INFO common.Storage: Storage directory /tmp/hadoop-hduser/dfs/name has been successfully formatted.
13/04/09 06:50:07 INFO namenode.NameNode: SHUTDOWN_MSG:
/************************************************************
SHUTDOWN_MSG: Shutting down NameNode at master/127.0.1.1
************************************************************/

The HDFS name table is stored on the NameNode's local filesystem, in the directory given by dfs.name.dir; the name table is used to track and analyze DataNode information.
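Note that the format log above placed the image under /tmp/hadoop-hduser/dfs/name: when neither hadoop.tmp.dir nor dfs.name.dir is set, Hadoop 1.x keeps all of its state under /tmp, which is typically wiped on reboot. A commonly used addition to conf/core-site.xml on all machines (the path is only an example; create it and chown it to hduser:hadoop first):

<property>
  <name>hadoop.tmp.dir</name>
  <value>/usr/local/hadoop/tmp</value>
  <description>A base for other temporary directories.</description>
</property>

Changing this after the fact requires re-running the format step above.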
9. Testing

1. Start the cluster
1) Start the HDFS daemons: the NameNode process on master and the DataNode processes on the slave nodes (master is also listed in conf/slaves here, so it runs a DataNode as well).
2) Then start the MapReduce daemons: the JobTracker on master and the TaskTracker processes on the slave nodes.
HDFS daemons: run bin/start-dfs.sh on the master node:
hduser@master:/usr/local/hadoop$ bin/start-dfs.sh
starting namenode, logging to /usr/local/hadoop/libexec/../logs/hadoop-hduser-namenode-master.out
The authenticity of host 'master (127.0.1.1)' can't be established.
ECDSA key fingerprint is ec:c2:e2:5f:c7:72:de:4f:7a:c0:f1:e7:2b:eb:84:3f.
Are you sure you want to continue connecting (yes/no)? slave1: starting datanode, logging to /usr/local/hadoop/libexec/../logs/hadoop-hduser-datanode-slave1.out
slave2: starting datanode, logging to /usr/local/hadoop/libexec/../logs/hadoop-hduser-datanode-slave2.out
master: Host key verification failed.
The authenticity of host 'master (127.0.1.1)' can't be established.
ECDSA key fingerprint is ec:c2:e2:5f:c7:72:de:4f:7a:c0:f1:e7:2b:eb:84:3f.
Are you sure you want to continue connecting (yes/no)? yes
master: Warning: Permanently added 'master' (ECDSA) to the list of known hosts.
master: starting secondarynamenode, logging to /usr/local/hadoop/libexec/../logs/hadoop-hduser-secondarynamenode-master.out
On the master node, the Java processes should now look like this:
hduser@master:/usr/local/hadoop$ jps
3217 NameNode
3526 SecondaryNameNode
4455 DataNode
4697 Jps
A slave node should look like this:
hduser@slave2:/usr/local/hadoop/conf$ jps
3105 DataNode
3743 Jps
Run on the master node: bin/start-mapred.sh
Then inspect a TaskTracker log (shown here for master, which runs a TaskTracker too since it is listed in conf/slaves):
hduser@master:/usr/local/hadoop/logs$ cat hadoop-hduser-tasktracker-master.log
2013-04-09 07:27:15,895 INFO org.apache.hadoop.mapred.TaskTracker: STARTUP_MSG:
/************************************************************
STARTUP_MSG: Starting TaskTracker
STARTUP_MSG: host = master/127.0.1.1
STARTUP_MSG: args = []
STARTUP_MSG: version = 1.0.4
STARTUP_MSG: build = https://svn.apache.org/repos/asf/hadoop/common/branches/branch-1.0 -r 1393290; compiled by 'hortonfo' on Wed Oct 3 05:13:58 UTC 2012
************************************************************/
2013-04-09 07:27:17,558 INFO org.apache.hadoop.metrics2.impl.MetricsConfig: loaded properties from hadoop-metrics2.properties
2013-04-09 07:27:17,681 INFO org.apache.hadoop.metrics2.impl.MetricsSourceAdapter: MBean for source MetricsSystem,sub=Stats registered.
2013-04-09 07:27:17,683 INFO org.apache.hadoop.metrics2.impl.MetricsSystemImpl: Scheduled snapshot period at 10 second(s).
2013-04-09 07:27:17,684 INFO org.apache.hadoop.metrics2.impl.MetricsSystemImpl: TaskTracker metrics system started
2013-04-09 07:27:18,445 INFO org.apache.hadoop.metrics2.impl.MetricsSourceAdapter: MBean for source ugi registered.
2013-04-09 07:27:18,459 WARN org.apache.hadoop.metrics2.impl.MetricsSystemImpl: Source name ugi already exists!
2013-04-09 07:27:23,961 INFO org.mortbay.log: Logging to org.slf4j.impl.Log4jLoggerAdapter(org.mortbay.log) via org.mortbay.log.Slf4jLog
2013-04-09 07:27:24,178 INFO org.apache.hadoop.http.HttpServer: Added global filter 'safety' (class=org.apache.hadoop.http.HttpServer$QuotingInputFilter)
2013-04-09 07:27:24,305 INFO org.apache.hadoop.mapred.TaskLogsTruncater: Initializing logs' truncater with mapRetainSize=-1 and reduceRetainSize=-1
2013-04-09 07:27:24,320 INFO org.apache.hadoop.mapred.TaskTracker: Starting tasktracker with owner as hduser
2013-04-09 07:27:24,327 INFO org.apache.hadoop.mapred.TaskTracker: Good mapred local directories are: /tmp/hadoop-hduser/mapred/local
2013-04-09 07:27:24,345 INFO org.apache.hadoop.util.NativeCodeLoader: Loaded the native-hadoop library
2013-04-09 07:27:24,397 INFO org.apache.hadoop.metrics2.impl.MetricsSourceAdapter: MBean for source jvm registered.
2013-04-09 07:27:24,406 INFO org.apache.hadoop.metrics2.impl.MetricsSourceAdapter: MBean for source TaskTrackerMetrics registered.
2013-04-09 07:27:24,492 INFO org.apache.hadoop.metrics2.impl.MetricsSourceAdapter: MBean for source RpcDetailedActivityForPort39355 registered.
2013-04-09 07:27:24,493 INFO org.apache.hadoop.metrics2.impl.MetricsSourceAdapter: MBean for source RpcActivityForPort39355 registered.
2013-04-09 07:27:24,498 INFO org.apache.hadoop.ipc.Server: Starting SocketReader
2013-04-09 07:27:24,509 INFO org.apache.hadoop.ipc.Server: IPC Server Responder: starting
2013-04-09 07:27:24,519 INFO org.apache.hadoop.ipc.Server: IPC Server handler 0 on 39355: starting
2013-04-09 07:27:24,519 INFO org.apache.hadoop.ipc.Server: IPC Server handler 1 on 39355: starting
2013-04-09 07:27:24,520 INFO org.apache.hadoop.ipc.Server: IPC Server handler 2 on 39355: starting
2013-04-09 07:27:24,521 INFO org.apache.hadoop.mapred.TaskTracker: TaskTracker up at: localhost/127.0.0.1:39355
2013-04-09 07:27:24,521 INFO org.apache.hadoop.mapred.TaskTracker: Starting tracker tracker_master:localhost/127.0.0.1:39355
2013-04-09 07:27:24,532 INFO org.apache.hadoop.ipc.Server: IPC Server handler 3 on 39355: starting
2013-04-09 07:27:24,533 INFO org.apache.hadoop.ipc.Server: IPC Server listener on 39355: starting
2013-04-09 07:27:24,941 INFO org.apache.hadoop.mapred.TaskTracker: Starting thread: Map-events fetcher for all reduce tasks on tracker_master:localhost/127.0.0.1:39355
2013-04-09 07:27:24,998 INFO org.apache.hadoop.util.ProcessTree: setsid exited with exit code 0
2013-04-09 07:27:25,028 INFO org.apache.hadoop.mapred.TaskTracker: Using ResourceCalculatorPlugin : org.apache.hadoop.util.LinuxResourceCalculatorPlugin@74c6eff5
2013-04-09 07:27:25,030 WARN org.apache.hadoop.mapred.TaskTracker: TaskTracker's totalMemoryAllottedForTasks is -1. TaskMemoryManager is disabled.
2013-04-09 07:27:25,034 INFO org.apache.hadoop.mapred.IndexCache: IndexCache created with max memory = 10485760
2013-04-09 07:27:25,042 INFO org.apache.hadoop.metrics2.impl.MetricsSourceAdapter: MBean for source ShuffleServerMetrics registered.
2013-04-09 07:27:25,045 INFO org.apache.hadoop.http.HttpServer: Port returned by webServer.getConnectors()[0].getLocalPort() before open() is -1. Opening the listener on 50060
2013-04-09 07:27:25,046 INFO org.apache.hadoop.http.HttpServer: listener.getLocalPort() returned 50060 webServer.getConnectors()[0].getLocalPort() returned 50060
2013-04-09 07:27:25,046 INFO org.apache.hadoop.http.HttpServer: Jetty bound to port 50060
2013-04-09 07:27:25,046 INFO org.mortbay.log: jetty-6.1.26
2013-04-09 07:27:25,670 INFO org.mortbay.log: Started SelectChannelConnector@0.0.0.0:50060
2013-04-09 07:27:25,670 INFO org.apache.hadoop.mapred.TaskTracker: FILE_CACHE_SIZE for mapOutputServlet set to : 2000
At this point, the master node should show:
hduser@master:/usr/local/hadoop$ jps
3217 NameNode
3526 SecondaryNameNode
4455 DataNode
5130 Jps
4761 JobTracker
4988 TaskTracker
hduser@master:/usr/local/hadoop$
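The running daemons also expose web interfaces, which are handy for checking cluster health: in Hadoop 1.x the NameNode UI is at http://master:50070/, the JobTracker UI at http://master:50030/, and each TaskTracker listens on port 50060 (the port visible in the log above).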
And a slave node should show:
hduser@slave2:/usr/local/hadoop/conf$ jps
3901 TaskTracker
3105 DataNode
3958 Jps
To stop the MapReduce daemons, run on the master node:
hduser@master:/usr/local/hadoop$ bin/stop-mapred.sh
stopping jobtracker
slave2: stopping tasktracker
master: stopping tasktracker
slave1: stopping tasktracker

The Java processes on the master node afterwards:
hduser@master:/usr/local/hadoop$ jps
3217 NameNode
3526 SecondaryNameNode
4455 DataNode
5427 Jps

Afterwards on a slave node:
hduser@slave2:/usr/local/hadoop/conf$ jps
3105 DataNode
4140 Jps

Stopping the HDFS layer (on master):
hduser@master:/usr/local/hadoop$ bin/stop-dfs.sh
stopping namenode
slave1: stopping datanode
slave2: stopping datanode
master: stopping datanode
localhost: stopping secondarynamenode
hduser@master:/usr/local/hadoop$ jps
5871 Jps
On a slave node at this point:
hduser@slave2:/usr/local/hadoop/conf$ jps
4305 Jps
10. MapReduce test

hduser@master:/usr/local/hadoop$ mkdir /tmp/test
hduser@master:/usr/local/hadoop$ cd /tmp/test
hduser@master:/tmp/test$ rz
rz waiting to receive.
Starting zmodem transfer. Press Ctrl+C to cancel.
100% 615 KB 615 KB/s 00:00:01 0 Errors
100% 502 KB 502 KB/s 00:00:01 0 Errors
100% 813 KB 813 KB/s 00:00:01 0 Errors
hduser@master:/tmp/test$ /usr/local/hadoop/bin/start-all.sh
starting namenode, logging to /usr/local/hadoop/libexec/../logs/hadoop-hduser-namenode-master.out
slave2: starting datanode, logging to /usr/local/hadoop/libexec/../logs/hadoop-hduser-datanode-slave2.out
slave1: starting datanode, logging to /usr/local/hadoop/libexec/../logs/hadoop-hduser-datanode-slave1.out
master: starting datanode, logging to /usr/local/hadoop/libexec/../logs/hadoop-hduser-datanode-master.out
localhost: starting secondarynamenode, logging to /usr/local/hadoop/libexec/../logs/hadoop-hduser-secondarynamenode-master.out
starting jobtracker, logging to /usr/local/hadoop/libexec/../logs/hadoop-hduser-jobtracker-master.out
slave1: starting tasktracker, logging to /usr/local/hadoop/libexec/../logs/hadoop-hduser-tasktracker-slave1.out
slave2: starting tasktracker, logging to /usr/local/hadoop/libexec/../logs/hadoop-hduser-tasktracker-slave2.out
master: starting tasktracker, logging to /usr/local/hadoop/libexec/../logs/hadoop-hduser-tasktracker-master.out
hduser@master:~$ /usr/local/hadoop/bin/hadoop dfs -copyFromLocal /tmp/test /user/hduser/test
hduser@master:~$ /usr/local/hadoop/bin/hadoop dfs -ls /user/hduser
Found 1 items
drwxr-xr-x - hduser supergroup 0 2013-04-09 08:06 /user/hduser/test
hduser@master:~$ /usr/local/hadoop/bin/hadoop dfs -ls /user/hduser/test
Found 3 items
-rw-r--r-- 2 hduser supergroup 630010 2013-04-09 08:06 /user/hduser/test/1.epub
-rw-r--r-- 2 hduser supergroup 514824 2013-04-09 08:06 /user/hduser/test/2.epub
-rw-r--r-- 2 hduser supergroup 832882 2013-04-09 08:06 /user/hduser/test/3.epub
hduser@master:~$ cd /usr/local/hadoop/
hduser@master:/usr/local/hadoop$ bin/hadoop jar hadoop-examples-1.0.4.jar wordcount /user/hduser/test /user/hduser/test-output
13/04/09 08:18:45 INFO input.FileInputFormat: Total input paths to process : 3
13/04/09 08:18:45 INFO util.NativeCodeLoader: Loaded the native-hadoop library
13/04/09 08:18:45 WARN snappy.LoadSnappy: Snappy native library not loaded
13/04/09 08:18:45 INFO mapred.JobClient: Running job: job_201304090758_0001
13/04/09 08:18:46 INFO mapred.JobClient: map 0% reduce 0%
13/04/09 08:19:10 INFO mapred.JobClient: map 66% reduce 0%
13/04/09 08:19:19 INFO mapred.JobClient: map 100% reduce 0%
13/04/09 08:19:31 INFO mapred.JobClient: map 100% reduce 100%
13/04/09 08:19:36 INFO mapred.JobClient: Job complete: job_201304090758_0001
13/04/09 08:19:37 INFO mapred.JobClient: Counters: 29
13/04/09 08:19:37 INFO mapred.JobClient: Job Counters
13/04/09 08:19:37 INFO mapred.JobClient: Launched reduce tasks=1
13/04/09 08:19:37 INFO mapred.JobClient: SLOTS_MILLIS_MAPS=42658
13/04/09 08:19:37 INFO mapred.JobClient: Total time spent by all reduces waiting after reserving slots (ms)=0
13/04/09 08:19:37 INFO mapred.JobClient: Total time spent by all maps waiting after reserving slots (ms)=0
13/04/09 08:19:37 INFO mapred.JobClient: Launched map tasks=3
13/04/09 08:19:37 INFO mapred.JobClient: Data-local map tasks=3
13/04/09 08:19:37 INFO mapred.JobClient: SLOTS_MILLIS_REDUCES=20867
13/04/09 08:19:37 INFO mapred.JobClient: File Output Format Counters
13/04/09 08:19:37 INFO mapred.JobClient: Bytes Written=3216032
13/04/09 08:19:37 INFO mapred.JobClient: FileSystemCounters
13/04/09 08:19:37 INFO mapred.JobClient: FILE_BYTES_READ=3421949
13/04/09 08:19:37 INFO mapred.JobClient: HDFS_BYTES_READ=1978040
13/04/09 08:19:37 INFO mapred.JobClient: FILE_BYTES_WRITTEN=6930267
13/04/09 08:19:37 INFO mapred.JobClient: HDFS_BYTES_WRITTEN=3216032
13/04/09 08:19:37 INFO mapred.JobClient: File Input Format Counters
13/04/09 08:19:37 INFO mapred.JobClient: Bytes Read=1977716
13/04/09 08:19:37 INFO mapred.JobClient: Map-Reduce Framework
13/04/09 08:19:37 INFO mapred.JobClient: Map output materialized bytes=3421961
13/04/09 08:19:37 INFO mapred.JobClient: Map input records=14841
13/04/09 08:19:37 INFO mapred.JobClient: Reduce shuffle bytes=2449921
13/04/09 08:19:37 INFO mapred.JobClient: Spilled Records=64440
13/04/09 08:19:37 INFO mapred.JobClient: Map output bytes=3685555
13/04/09 08:19:37 INFO mapred.JobClient: CPU time spent (ms)=10830
13/04/09 08:19:37 INFO mapred.JobClient: Total committed heap usage (bytes)=496644096
13/04/09 08:19:37 INFO mapred.JobClient: Combine input records=36177
13/04/09 08:19:37 INFO mapred.JobClient: SPLIT_RAW_BYTES=324
13/04/09 08:19:37 INFO mapred.JobClient: Reduce input records=32220
13/04/09 08:19:37 INFO mapred.JobClient: Reduce input groups=31501
13/04/09 08:19:37 INFO mapred.JobClient: Combine output records=32220
13/04/09 08:19:37 INFO mapred.JobClient: Physical memory (bytes) snapshot=614944768
13/04/09 08:19:37 INFO mapred.JobClient: Reduce output records=31501
13/04/09 08:19:37 INFO mapred.JobClient: Virtual memory (bytes) snapshot=3845349376
13/04/09 08:19:37 INFO mapred.JobClient: Map output records=36177
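The output can also be inspected directly on HDFS without copying it to local disk; with the single reducer used here, the result should sit in one part file (assuming the standard naming):

hduser@master:/usr/local/hadoop$ bin/hadoop dfs -cat /user/hduser/test-output/part-r-00000 | head -20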
hduser@master:/usr/local/hadoop$ mkdir /tmp/test-output
hduser@master:/usr/local/hadoop$ bin/hadoop dfs -getmerge /user/hduser/test-output /tmp/test-output
13/04/09 08:22:48 INFO util.NativeCodeLoader: Loaded the native-hadoop library
$ head /tmp/test-output/test-output
Inspecting the content shows mojibake, because the imported files are Chinese; this still needs looking into. Stopping here for now. HBase is not installed yet.
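The mojibake is most likely a terminal-encoding mismatch rather than corrupt data. Two things worth trying, assuming the source text was UTF-8 or GBK:

# Force a UTF-8 locale for this one command
LANG=en_US.UTF-8 head /tmp/test-output/test-output
# or convert explicitly if the sources were GBK-encoded
iconv -f gbk -t utf-8 /tmp/test-output/test-output | head

Note too that .epub files are zip archives, so part of the word-count output will be binary noise regardless of the locale.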