
Configuring a Hadoop Development Environment in MyEclipse

2012-11-26 

3. Install the Hadoop development plugin: download hadoop-1.0.3-eclipse-plugin.jar (http://ishare.iask.sina.com.cn/f/24669534.html) and copy it into the dropins directory under the eclipse root directory of your MyEclipse installation.
4. Start Eclipse and open the perspective: 【Window】->【Open Perspective】->【Other...】->【Map/Reduce】->【OK】
5. Open a view: 【Window】->【Show View】->【Other...】->【MapReduce Tools】->【Map/Reduce Locations】->【OK】

6. Add a Hadoop location:
In Advanced parameters, modify:

hadoop.tmp.dir=/home/xsj/hadoop/hadoop-xsj
mapred.child.java.opts=-Xmx512m

Note: changing hadoop.tmp.dir also affects mapred.local.dir, mapred.system.dir, mapred.temp.dir, fs.s3.buffer.dir, fs.checkpoint.dir, fs.checkpoint.edits.dir, dfs.name.dir, dfs.name.edits.dir, dfs.data.dir and other properties whose defaults resolve against it, so restart Eclipse after changing it. The mapred.child.java.opts entry may not exist at first and can be added later; the 512 here matches the memory of my Ubuntu virtual machine, so adjust it to your own situation.
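The same two properties can also be set programmatically on a Hadoop Configuration object, which is handy when experimenting. A minimal sketch (the class name ConfDemo is mine; the values are this article's example values and should be adapted to your machine):

import org.apache.hadoop.conf.Configuration;

public class ConfDemo {
  public static void main(String[] args) {
    Configuration conf = new Configuration();
    // Same settings as in the Advanced parameters tab; derived properties
    // such as mapred.local.dir and dfs.data.dir resolve against this base.
    conf.set("hadoop.tmp.dir", "/home/xsj/hadoop/hadoop-xsj");
    conf.set("mapred.child.java.opts", "-Xmx512m");
    System.out.println(conf.get("hadoop.tmp.dir"));
  }
}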
After these changes you can see the HDFS view.

7. Add text files:

$ ./hadoop fs -mkdir /user/xsj/input
$ ./hadoop fs -put ./*.txt /user/xsj/input
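The same upload can be done from Java through the HDFS FileSystem API instead of the shell. A minimal sketch, assuming the NameNode address hdfs://localhost:9000 from this article's setup (UploadDemo and file1.txt are illustrative names):

import java.net.URI;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class UploadDemo {
  public static void main(String[] args) throws Exception {
    Configuration conf = new Configuration();
    // Connect to the NameNode from this article's setup.
    FileSystem fs = FileSystem.get(URI.create("hdfs://localhost:9000"), conf);
    Path input = new Path("/user/xsj/input");
    fs.mkdirs(input);                                     // like: hadoop fs -mkdir
    fs.copyFromLocalFile(new Path("./file1.txt"), input); // like: hadoop fs -put
    fs.close();
  }
}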

8. Create a Map/Reduce project: 【File】->【New】->【Project...】->【Map/Reduce】->【Map/Reduce Project】->【Project name: WordCount】->【Configure Hadoop install directory...】->【Hadoop installation directory: /home/xsj/hadoop/hadoop-0.20.2】->【Apply】->【OK】->【Next】->【Allow output folders for source folders】->【Finish】

9. Create the WordCount class: 【WordCount】->【src】->【New】->【Class】->【Package: org.apache.hadoop.examples】->【Name: WordCount】->【Finish】

Add the source code, which ships with Hadoop at /home/xsj/hadoop/hadoop-0.20.2/src/examples/org/apache/hadoop/examples/WordCount.java:
package org.apache.hadoop.examples;

import java.io.IOException;
import java.util.StringTokenizer;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.Reducer;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;
import org.apache.hadoop.util.GenericOptionsParser;

public class WordCount {

  // Emits (word, 1) for every token of every input line.
  public static class TokenizerMapper
       extends Mapper<Object, Text, Text, IntWritable> {

    private final static IntWritable one = new IntWritable(1);
    private Text word = new Text();

    public void map(Object key, Text value, Context context
                    ) throws IOException, InterruptedException {
      StringTokenizer itr = new StringTokenizer(value.toString());
      while (itr.hasMoreTokens()) {
        word.set(itr.nextToken());
        context.write(word, one);
      }
    }
  }

  // Sums the counts for each word; also used as the combiner.
  public static class IntSumReducer
       extends Reducer<Text,IntWritable,Text,IntWritable> {
    private IntWritable result = new IntWritable();

    public void reduce(Text key, Iterable<IntWritable> values,
                       Context context
                       ) throws IOException, InterruptedException {
      int sum = 0;
      for (IntWritable val : values) {
        sum += val.get();
      }
      result.set(sum);
      context.write(key, result);
    }
  }

  public static void main(String[] args) throws Exception {
    Configuration conf = new Configuration();
    String[] otherArgs = new GenericOptionsParser(conf, args).getRemainingArgs();
    if (otherArgs.length != 2) {
      System.err.println("Usage: wordcount <in> <out>");
      System.exit(2);
    }
    Job job = new Job(conf, "word count");
    job.setJarByClass(WordCount.class);
    job.setMapperClass(TokenizerMapper.class);
    job.setCombinerClass(IntSumReducer.class);
    job.setReducerClass(IntSumReducer.class);
    job.setOutputKeyClass(Text.class);
    job.setOutputValueClass(IntWritable.class);
    FileInputFormat.addInputPath(job, new Path(otherArgs[0]));
    FileOutputFormat.setOutputPath(job, new Path(otherArgs[1]));
    System.exit(job.waitForCompletion(true) ? 0 : 1);
  }
}
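To make the data flow concrete, consider a hypothetical run (the file names and contents below are illustrative, not this article's actual input):

file1.txt: Hello World
file2.txt: Hello Hadoop

Each map() call emits (word, 1) per token; the combiner (IntSumReducer run on the map side) pre-sums duplicates within each map's output; the reducer then writes the final tab-separated counts:

Hadoop	1
Hello	2
World	1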
10. Configure the run parameters: 【Run】->【Run Configurations】->【Java Application】->【Word Count】->【Arguments】->【Program arguments: hdfs://localhost:9000/user/xsj/input/* hdfs://localhost:9000/user/xsj/output】->【VM arguments: -Xms512m -Xmx512m】->【Apply】->【Close】->【Run】->【Run As】->【Run On Hadoop】
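The two program arguments end up in otherArgs because GenericOptionsParser first strips any generic Hadoop options (such as -D key=value or -fs) into the Configuration and returns the rest. A minimal sketch illustrating this behavior (ArgsDemo is an illustrative class name):

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.util.GenericOptionsParser;

public class ArgsDemo {
  public static void main(String[] args) throws Exception {
    Configuration conf = new Configuration();
    // Generic options are absorbed into conf; for the run configuration
    // above, what remains is the input pattern and the output directory.
    String[] remaining = new GenericOptionsParser(conf, args).getRemainingArgs();
    for (String arg : remaining) {
      System.out.println(arg);
    }
  }
}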

11. Run the job:
The console shows normal output:

12/06/01 10:23:31 INFO jvm.JvmMetrics: Initializing JVM Metrics with processName=JobTracker, sessionId=
12/06/01 10:23:33 WARN mapred.JobClient: No job jar file set.  User classes may not be found. See JobConf(Class) or JobConf#setJar(String).
12/06/01 10:23:36 INFO input.FileInputFormat: Total input paths to process : 2
12/06/01 10:23:37 INFO mapred.JobClient: Running job: job_local_0001
12/06/01 10:23:37 INFO input.FileInputFormat: Total input paths to process : 2
12/06/01 10:23:37 INFO mapred.MapTask: io.sort.mb = 100
12/06/01 10:23:40 INFO mapred.MapTask: data buffer = 79691776/99614720
12/06/01 10:23:40 INFO mapred.MapTask: record buffer = 262144/327680
12/06/01 10:23:44 INFO mapred.JobClient:  map 0% reduce 0%
12/06/01 10:23:52 INFO mapred.MapTask: Starting flush of map output
12/06/01 10:23:59 INFO mapred.LocalJobRunner: 
12/06/01 10:23:59 INFO mapred.MapTask: Finished spill 0
12/06/01 10:24:00 INFO mapred.JobClient:  map 100% reduce 0%
12/06/01 10:24:00 INFO mapred.TaskRunner: Task:attempt_local_0001_m_000000_0 is done. And is in the process of commiting
12/06/01 10:24:03 INFO mapred.LocalJobRunner: 
12/06/01 10:24:03 INFO mapred.TaskRunner: Task 'attempt_local_0001_m_000000_0' done.
12/06/01 10:24:04 INFO mapred.MapTask: io.sort.mb = 100
12/06/01 10:24:08 INFO mapred.MapTask: data buffer = 79691776/99614720
12/06/01 10:24:08 INFO mapred.MapTask: record buffer = 262144/327680
12/06/01 10:24:10 INFO mapred.MapTask: Starting flush of map output
12/06/01 10:24:10 INFO mapred.MapTask: Finished spill 0
12/06/01 10:24:11 INFO mapred.TaskRunner: Task:attempt_local_0001_m_000001_0 is done. And is in the process of commiting
12/06/01 10:24:12 INFO mapred.LocalJobRunner: 
12/06/01 10:24:12 INFO mapred.TaskRunner: Task 'attempt_local_0001_m_000001_0' done.
12/06/01 10:24:14 INFO mapred.LocalJobRunner: 
12/06/01 10:24:16 INFO mapred.Merger: Merging 2 sorted segments
12/06/01 10:24:17 INFO mapred.Merger: Down to the last merge-pass, with 2 segments left of total size: 77 bytes
12/06/01 10:24:17 INFO mapred.LocalJobRunner: 
12/06/01 10:24:20 INFO mapred.TaskRunner: Task:attempt_local_0001_r_000000_0 is done. And is in the process of commiting
12/06/01 10:24:20 INFO mapred.LocalJobRunner: 
12/06/01 10:24:20 INFO mapred.TaskRunner: Task attempt_local_0001_r_000000_0 is allowed to commit now
12/06/01 10:24:21 INFO output.FileOutputCommitter: Saved output of task 'attempt_local_0001_r_000000_0' to hdfs://localhost:9000/user/xsj/output
12/06/01 10:24:21 INFO mapred.LocalJobRunner: reduce > reduce
12/06/01 10:24:21 INFO mapred.TaskRunner: Task 'attempt_local_0001_r_000000_0' done.
12/06/01 10:24:22 INFO mapred.JobClient:  map 100% reduce 100%
12/06/01 10:24:22 INFO mapred.JobClient: Job complete: job_local_0001
12/06/01 10:24:22 INFO mapred.JobClient: Counters: 14
12/06/01 10:24:22 INFO mapred.JobClient:   FileSystemCounters
12/06/01 10:24:22 INFO mapred.JobClient:     FILE_BYTES_READ=50488
12/06/01 10:24:22 INFO mapred.JobClient:     HDFS_BYTES_READ=120
12/06/01 10:24:22 INFO mapred.JobClient:     FILE_BYTES_WRITTEN=102748
12/06/01 10:24:22 INFO mapred.JobClient:     HDFS_BYTES_WRITTEN=41
12/06/01 10:24:22 INFO mapred.JobClient:   Map-Reduce Framework
12/06/01 10:24:22 INFO mapred.JobClient:     Reduce input groups=5
12/06/01 10:24:22 INFO mapred.JobClient:     Combine output records=6
12/06/01 10:24:22 INFO mapred.JobClient:     Map input records=4
12/06/01 10:24:22 INFO mapred.JobClient:     Reduce shuffle bytes=0
12/06/01 10:24:22 INFO mapred.JobClient:     Reduce output records=5
12/06/01 10:24:22 INFO mapred.JobClient:     Spilled Records=12
12/06/01 10:24:22 INFO mapred.JobClient:     Map output bytes=81
12/06/01 10:24:22 INFO mapred.JobClient:     Combine input records=8
12/06/01 10:24:22 INFO mapred.JobClient:     Map output records=8
12/06/01 10:24:22 INFO mapred.JobClient:     Reduce input records=6
12. View the results: the file /user/xsj/output/part-r-00000 is the output file.
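Besides browsing the DFS view, the result can also be read back programmatically. A minimal sketch using the FileSystem API, assuming the paths from this article's setup (ReadOutputDemo is an illustrative class name):

import java.io.BufferedReader;
import java.io.InputStreamReader;
import java.net.URI;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class ReadOutputDemo {
  public static void main(String[] args) throws Exception {
    Configuration conf = new Configuration();
    FileSystem fs = FileSystem.get(URI.create("hdfs://localhost:9000"), conf);
    // Each line of part-r-00000 is "word<TAB>count".
    BufferedReader in = new BufferedReader(new InputStreamReader(
        fs.open(new Path("/user/xsj/output/part-r-00000"))));
    String line;
    while ((line = in.readLine()) != null) {
      System.out.println(line);
    }
    in.close();
    fs.close();
  }
}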
