
Spring Hadoop Setup Example: Configuring Spring

Author: Anonymous   Source: JZ5U   Date: 2019/11/21 14:22:50

When you start trying out Spring Hadoop, you will run into all kinds of odd problems, and people have already begun reporting them.

If you just want a quick trial and do not want to chase down those tricky issues yourself, the steps below will give you a quick taste of what Spring Hadoop can do.

Spring Hadoop Quick Start

Step 1. Download Spring Hadoop. Here we fetch it with git; if you are not familiar with git, you can also download the archive from the official site and unpack it.

The example below uses my home directory; remember to change it to your own.

/home/evanshsu mkdir springhadoop

/home/evanshsu cd springhadoop

/home/evanshsu/springhadoop git init

/home/evanshsu/springhadoop git pull "git://github.com/SpringSource/spring-hadoop.git"

Step 2. Build spring-hadoop.jar

After the build, copy all the jar files into /home/evanshsu/springhadoop/lib so that we can later bundle every jar into a single jar.

/home/evanshsu/springhadoop ./gradlew jar

/home/evanshsu/springhadoop mkdir lib

/home/evanshsu/springhadoop cp build/libs/spring-data-hadoop-1.0.0.BUILD-SNAPSHOT.jar lib/

Step 3. Get spring-framework

Since Spring Hadoop depends on spring-framework, we also need to put the spring-framework jars into lib.

/home/evanshsu/spring wget "http://s3.amazonaws.com/dist.springframework.org/release/SPR/spring-framework-3.1.1.RELEASE.zip"

/home/evanshsu/spring unzip spring-framework-3.1.1.RELEASE.zip

/home/evanshsu/spring cp spring-framework-3.1.1.RELEASE/dist/*.jar /home/evanshsu/springhadoop/lib/

Step 4. Modify the build file so that all the jar files can be packaged into a single jar.

/home/evanshsu/spring/samples/wordcount vim build.gradle

[code=groovy]

description = 'Spring Hadoop Samples - WordCount'

apply plugin: 'base'
apply plugin: 'java'
apply plugin: 'idea'
apply plugin: 'eclipse'

repositories {
    flatDir(dirs: '/home/evanshsu/springhadoop/lib/')
    // Public Spring artefacts
    maven { url "http://repo.springsource.org/libs-release" }
    maven { url "http://repo.springsource.org/libs-milestone" }
    maven { url "http://repo.springsource.org/libs-snapshot" }
}

dependencies {
    compile fileTree('/home/evanshsu/springhadoop/lib/')
    compile "org.apache.hadoop:hadoop-examples:$hadoopVersion"
    // see HADOOP-7461
    runtime "org.codehaus.jackson:jackson-mapper-asl:$jacksonVersion"

    testCompile "junit:junit:$junitVersion"
    testCompile "org.springframework:spring-test:$springVersion"
}

jar {
    // merge every dependency jar into the sample jar, but drop their own
    // spring.schemas / spring.handlers files (merged copies are added in Step 8)
    from configurations.compile.collect {
        it.isDirectory() ? it : zipTree(it).matching {
            exclude 'META-INF/spring.schemas'
            exclude 'META-INF/spring.handlers'
        }
    }
}

[code/]

Step 5. There is a special hadoop.properties file here; it holds the Hadoop-related settings.

Basically, change wordcount.input.path and wordcount.output.path to the directories that the wordcount run will use later, and remember to put a few text files under wordcount.input.path.

Also, change hd.fs to match your own HDFS setup.

If you are using the NCHC (National Center for High-performance Computing) Hadoop cluster, set it to hd.fs=hdfs://gm2.nchc.org.tw:8020

/home/evanshsu/spring/samples/wordcount vim src/main/resources/hadoop.properties

[code=properties]

wordcount.input.path=/user/evanshsu/input.txt
wordcount.output.path=/user/evanshsu/output

hive.host=localhost
hive.port=12345
hive.url=jdbc:hive://${hive.host}:${hive.port}
hd.fs=hdfs://localhost:9000
mapred.job.tracker=localhost:9001

path.cat=bin${file.separator}stream-bin${file.separator}cat
path.wc=bin${file.separator}stream-bin${file.separator}wc

input.directory=logs
log.input=/logs/input/
log.output=/logs/output/

distcp.src=${hd.fs}/distcp/source.txt
distcp.dst=${hd.fs}/distcp/dst

[code/]

Step 6. This is the most important configuration file; anyone who has used Spring knows that this file is the soul of a Spring application. The context below declares a property placeholder that loads hadoop.properties, an hdp:configuration element that points the client at HDFS, and an hdp:job element that wires up the mapper and reducer.

/home/evanshsu/spring/samples/wordcount vim src/main/resources/META-INF/spring/context.xml

xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"

xmlns:context="http://www.springframework.org/schema/context"

xmlns:hdp="http://www.springframework.org/schema/hadoop"

xmlns:p="http://www.springframework.org/schema/p"

xsi:schemaLocation="http://www.springframework.org/schema/beanshttp://www.springframework.org/schema/beans/spring-beans.xsd

http://www.springframework.org/schema/context http://www.springframework.org/schema/context/spring-context.xsd

http://www.springframework.org/schema/hadoop http://www.springframework.org/schema/hadoop/spring-hadoop.xsd">

fs.default.name=${hd.fs}

input-path="${wordcount.input.path}" output-path="${wordcount.output.path}"

mapper="org.springframework.data.hadoop.samples.wordcount.WordCountMapper"

reducer="org.springframework.data.hadoop.samples.wordcount.WordCountReducer"

jar-by-class="org.springframework.data.hadoop.samples.wordcount.WordCountMapper" />

Step 7. Add your own mapper and reducer.

/home/evanshsu/spring/samples/wordcount vim src/main/java/org/springframework/data/hadoop/samples/wordcount/WordCountMapper.java

package org.springframework.data.hadoop.samples.wordcount;

import java.io.IOException;
import java.util.StringTokenizer;

import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Mapper;

public class WordCountMapper extends Mapper<Object, Text, Text, IntWritable> {

    private final static IntWritable one = new IntWritable(1);
    private Text word = new Text();

    @Override
    public void map(Object key, Text value, Context context)
            throws IOException, InterruptedException {
        StringTokenizer itr = new StringTokenizer(value.toString());
        while (itr.hasMoreTokens()) {
            word.set(itr.nextToken());
            context.write(word, one);
        }
    }
}

/home/evanshsu/spring/samples/wordcount vim src/main/java/org/springframework/data/hadoop/samples/wordcount/WordCountReducer.java

package org.springframework.data.hadoop.samples.wordcount;

import java.io.IOException;

import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Reducer;

public class WordCountReducer extends Reducer<Text, IntWritable, Text, IntWritable> {

    private IntWritable result = new IntWritable();

    @Override
    public void reduce(Text key, Iterable<IntWritable> values, Context context)
            throws IOException, InterruptedException {
        int sum = 0;
        for (IntWritable val : values) {
            sum += val.get();
        }
        result.set(sum);
        context.write(key, result);
    }
}

Step 8. Add spring.schemas and spring.handlers. Because the jar task in Step 4 strips these files out of the dependency jars when merging them, we have to supply our own merged copies so that Spring can still resolve the context and hadoop XML namespaces.

/home/evanshsu/spring/samples/wordcount vim src/main/resources/META-INF/spring.schemas

http\://www.springframework.org/schema/context/spring-context.xsd=org/springframework/context/config/spring-context-3.1.xsd

http\://www.springframework.org/schema/hadoop/spring-hadoop.xsd=/org/springframework/data/hadoop/config/spring-hadoop-1.0.xsd

/home/evanshsu/spring/samples/wordcount vim src/main/resources/META-INF/spring.handlers

http\://www.springframework.org/schema/p=org.springframework.beans.factory.xml.SimplePropertyNamespaceHandler

http\://www.springframework.org/schema/context=org.springframework.context.config.ContextNamespaceHandler

http\://www.springframework.org/schema/hadoop=org.springframework.data.hadoop.config.HadoopNamespaceHandler

Step 9. We have finally reached the last step. Here we package all the jars into one and submit it to Hadoop.

/home/evanshsu/spring/samples/wordcount ../../gradlew jar

/home/evanshsu/spring/samples/wordcount hadoop jar build/libs/wordcount-1.0.0.M1.jar org.springframework.data.hadoop.samples.wordcount.Main
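
The Main class referenced in the command above ships with the spring-hadoop sample; it is just a small bootstrap that loads the Spring context from Step 6 and kicks off the job. A rough sketch of such an entry point (assuming the hdp:job bean is registered under the id wordcountJob, as in the context.xml above) could look like this:

[code=java]
package org.springframework.data.hadoop.samples.wordcount;

import org.apache.hadoop.mapreduce.Job;
import org.springframework.context.support.AbstractApplicationContext;
import org.springframework.context.support.ClassPathXmlApplicationContext;

public class Main {

    public static void main(String[] args) throws Exception {
        // Load the Spring context from Step 6; this builds the Hadoop configuration and the job bean.
        AbstractApplicationContext ctx =
                new ClassPathXmlApplicationContext("/META-INF/spring/context.xml", Main.class);
        ctx.registerShutdownHook();

        // Fetch the job bean (assumed id "wordcountJob") and run it to completion.
        Job wordcountJob = ctx.getBean("wordcountJob", Job.class);
        wordcountJob.waitForCompletion(true);
    }
}
[code/]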

Step 10. Finally, let's check whether the results actually came out.

/home/evanshsu/spring/samples/wordcount hadoop fs -cat /user/evanshsu/output/*
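
If you would rather check the output from code instead of the hadoop CLI, a minimal sketch using the HDFS FileSystem API could look like the following; the OutputReader class and the hard-coded paths are illustrative only and not part of the sample:

[code=java]
package org.springframework.data.hadoop.samples.wordcount;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FSDataInputStream;
import org.apache.hadoop.fs.FileStatus;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IOUtils;

// Hypothetical helper: dumps every part-* file under the wordcount output directory.
public class OutputReader {

    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        conf.set("fs.default.name", "hdfs://localhost:9000"); // assumed to match hd.fs in hadoop.properties

        FileSystem fs = FileSystem.get(conf);
        for (FileStatus status : fs.listStatus(new Path("/user/evanshsu/output"))) {
            // reducer output files are named part-r-00000, part-r-00001, ...
            if (!status.getPath().getName().startsWith("part-")) {
                continue;
            }
            FSDataInputStream in = fs.open(status.getPath());
            try {
                IOUtils.copyBytes(in, System.out, 4096, false);
            } finally {
                in.close();
            }
        }
    }
}
[code/]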
