How to deploy your own map/reduce program
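【1】 : The first step is packaging. The program (class files), the configuration directory (conf/), and the dependency jars need to sit at the same directory level. Here is the build.xml: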
<?xml version="1.0" encoding="UTF-8"?>
<project name="mapreducetest" default="dist">
    <property name="name" value="mapreducetest" />
    <property name="version" value="0.1" />
    <property name="build" value="build" />
    <property name="build.classes" value="${build}/classes" />
    <property name="dist" value="dist" />
    <property name="src" value="src" />
    <property name="lib" value="lib" />
    <property name="conf" value="conf" />
    <!-- Used for Built-By in the manifest; defaults to the current OS user -->
    <property name="author" value="${user.name}" />

    <!-- Compile-time classpath: every jar under lib/ -->
    <path id="project.class.path">
        <fileset dir="${lib}">
            <include name="*.jar" />
        </fileset>
    </path>

    <!-- Start from empty build/ and dist/ directories -->
    <target name="init">
        <!-- tstamp defines the TODAY property used in the manifest -->
        <tstamp />
        <delete dir="${build}" />
        <delete dir="${dist}" />
        <mkdir dir="${build}" />
        <mkdir dir="${build.classes}" />
        <mkdir dir="${dist}" />
    </target>

    <target name="compile" depends="init">
        <javac debug="true" srcdir="${src}" destdir="${build.classes}">
            <classpath refid="project.class.path" />
        </javac>
    </target>

    <target name="dist" depends="compile">
        <!-- Copy resource files and the lib/ jars next to the compiled classes -->
        <copy todir="${build.classes}">
            <fileset dir="${src}">
                <include name="**/*.xml" />
                <include name="**/*.dtd" />
                <include name="**/*.properties" />
                <include name="**/*.vm" />
                <include name="**/*.xsd" />
            </fileset>
            <fileset dir="${lib}">
                <include name="*.jar" />
            </fileset>
        </copy>
        <mkdir dir="${dist}" />
        <!-- Classes, jars, and the contents of conf/ all land at the root of the job jar -->
        <jar jarfile="${dist}/${name}.jar" index="true">
            <fileset dir="${build.classes}">
                <include name="**" />
            </fileset>
            <fileset dir="${conf}">
                <include name="**" />
            </fileset>
            <manifest>
                <attribute name="Built-By" value="${author}" />
                <section name="main">
                    <attribute name="Specification-Version" value="${version}" />
                    <attribute name="Implementation-Title" value="main" />
                    <attribute name="Implementation-Version" value="${version} ${TODAY}" />
                </section>
            </manifest>
        </jar>
    </target>

    <target name="clean">
        <delete dir="${build}" />
        <delete dir="${dist}" />
    </target>
</project>
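The build file refers to three directories relative to itself: src, lib, and conf. As a rough sketch of how a project could be laid out before running the build, the helper below creates those directories. The function name make_layout and the project path in the usage notes are hypothetical, and the sketch assumes build.xml is your own project's file (not one shipped with Hadoop) and that Apache Ant is installed.

```shell
#!/bin/sh
# Create the directory layout that build.xml refers to.
# make_layout is a hypothetical helper; pass any project directory you like.
make_layout() {
    project_dir="$1"
    # src/  : Java sources plus resource files (*.xml, *.properties, ...)
    # lib/  : jars needed at compile time
    # conf/ : configuration files that end up at the root of the jar
    mkdir -p "$project_dir/src" "$project_dir/lib" "$project_dir/conf"
}

# Usage (paths are examples only):
#   make_layout "$HOME/mapreducetest"
#   cp build.xml "$HOME/mapreducetest/"
#   cd "$HOME/mapreducetest" && ant     # "dist" is the default target
#   ls dist/mapreducetest.jar
```

All three directories should exist even if lib or conf is empty, since Ant generally reports an error when a fileset points at a missing directory.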
【2】 : Copy dist/mapreducetest.jar into the Hadoop directory (here: hadoop@user-desktop:~/hadoop-0.19.2$). The next step is to run it.
【3】 : Run the job with the command below. If you see output like the following, your map/reduce job ran successfully.
bin/hadoop jar mapreducetest.jar com.mapreduce.demo.DB.ReadAccess /hadoop/readout
10/07/07 15:39:58 WARN mapred.JobClient: Use GenericOptionsParser for parsing the arguments. Applications should implement Tool for the same.
10/07/07 15:40:02 INFO mapred.JobClient: Running job: job_201007071102_0002
10/07/07 15:40:03 INFO mapred.JobClient: map 0% reduce 0%
10/07/07 15:40:08 INFO mapred.JobClient: map 50% reduce 0%
10/07/07 15:40:10 INFO mapred.JobClient: map 100% reduce 0%
10/07/07 15:40:15 INFO mapred.JobClient: map 100% reduce 100%
10/07/07 15:40:16 INFO mapred.JobClient: Job complete: job_201007071102_0002
10/07/07 15:40:16 INFO mapred.JobClient: Counters: 14
10/07/07 15:40:16 INFO mapred.JobClient: File Systems
10/07/07 15:40:16 INFO mapred.JobClient: HDFS bytes written=80
10/07/07 15:40:16 INFO mapred.JobClient: Local bytes read=134
10/07/07 15:40:16 INFO mapred.JobClient: Local bytes written=330
10/07/07 15:40:16 INFO mapred.JobClient: Job Counters
10/07/07 15:40:16 INFO mapred.JobClient: Launched reduce tasks=1
10/07/07 15:40:16 INFO mapred.JobClient: Launched map tasks=2
10/07/07 15:40:16 INFO mapred.JobClient: Map-Reduce Framework
10/07/07 15:40:16 INFO mapred.JobClient: Reduce input groups=6
10/07/07 15:40:16 INFO mapred.JobClient: Combine output records=0
10/07/07 15:40:16 INFO mapred.JobClient: Map input records=6
10/07/07 15:40:16 INFO mapred.JobClient: Reduce output records=6
10/07/07 15:40:16 INFO mapred.JobClient: Map output bytes=116
10/07/07 15:40:16 INFO mapred.JobClient: Map input bytes=6
10/07/07 15:40:16 INFO mapred.JobClient: Combine input records=0
10/07/07 15:40:16 INFO mapred.JobClient: Map output records=6
10/07/07 15:40:16 INFO mapred.JobClient: Reduce input records=6
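Rather than reading the log by eye, the same check can be scripted. The following is a minimal sketch, assuming the console output was saved to a file and that success is indicated by a "Job complete:" line as in the log above. The function names job_completed and completed_job_id and the file name run.log are hypothetical.

```shell
#!/bin/sh
# Succeeds (exit status 0) if the saved log contains a "Job complete:" line.
job_completed() {
    grep -q 'Job complete: job_' "$1"
}

# Prints the id of the completed job, e.g. job_201007071102_0002.
completed_job_id() {
    sed -n 's/.*Job complete: \(job_[0-9_]*\).*/\1/p' "$1"
}

# Usage (run.log is an example file name):
#   bin/hadoop jar mapreducetest.jar com.mapreduce.demo.DB.ReadAccess /hadoop/readout 2>&1 | tee run.log
#   job_completed run.log && echo "finished: $(completed_job_id run.log)"
```

The 2>&1 redirect is there so the log lines are captured regardless of which stream they are written to.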
【4】 : View the output with this command:
bin/hadoop fs -cat /hadoop/readout/part-00000 | head -13
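The file name part-00000 matches the single reduce task reported in the log (Launched reduce tasks=1). As a small sketch, assuming the job uses Hadoop's usual file output naming in which reducer N writes part-NNNNN, the helper below builds the path for a given reducer index. The name part_file is hypothetical.

```shell
#!/bin/sh
# Print the output path for reducer number $2 under output directory $1.
# Assumes the five-digit part-NNNNN naming seen in the command above.
part_file() {
    printf '%s/part-%05d\n' "$1" "$2"
}

# Usage:
#   bin/hadoop fs -ls /hadoop/readout
#   bin/hadoop fs -cat "$(part_file /hadoop/readout 0)" | head -13
```

Listing the output directory first shows how many part files the job actually produced.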
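#1 yangxuanlun 2011-12-14 Hello. I am a bit slow, and even after reading your post I still don't understand how to deploy my own mapreduce program. I hope you can explain in more detail. For example: 1. Is that build.xml the one under the Hadoop installation directory, and which parts of it need to be changed? 2. The classes and jars go in the same directory. Is that a fixed directory, or one I can choose myself? 3. Which directory does "copy /dist/mapreducetest.jar to the directory" refer to? 4. Should the jars also be copied to all of the nodes? I hope to hear from you soon. My email is yangxuanlun@163.com. Thank you.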