Reducer class does not work as expected in Hadoop MapReduce
I am trying to implement a simple group-by in MapReduce.
My input file is given below:
7369,SMITH,CLERK,800,20
7499,ALLEN,SALESMAN,1600,30
7521,WARD,SALESMAN,1250,30
7566,JONES,MANAGER,2975,20
7654,MARTIN,SALESMAN,1250,30
7698,BLAKE,MANAGER,2850,30
7782,CLARK,MANAGER,2450,10
7788,SCOTT,ANALYST,3000,20
7839,KING,PRESIDENT,5000,10
7844,TURNER,SALESMAN,1500,30
7876,ADAMS,CLERK,1100,20
7900,JAMES,CLERK,950,30
7902,FORD,ANALYST,3000,20
7934,MILLER,CLERK,1300,10
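My mapper class:
import java.io.IOException;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Mapper;

public class Groupmapper extends Mapper<Object, Text, IntWritable, IntWritable> {
    @Override
    public void map(Object key, Text value, Context context) throws IOException, InterruptedException {
        String line = value.toString();
        String[] parts = line.split(",");
        // parts[3] is the salary, parts[4] the department number
        String token1 = parts[3];
        String token2 = parts[4];
        int deptno = Integer.parseInt(token2);
        int sal = Integer.parseInt(token1);
        // Emit (deptno, sal) so that salaries are grouped by department
        context.write(new IntWritable(deptno), new IntWritable(sal));
    }
}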
Reducer class:
import java.io.IOException;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.mapreduce.Reducer;

public class Groupreducer extends Reducer<IntWritable, IntWritable, IntWritable, IntWritable> {
    IntWritable result = new IntWritable();

    public void Reduce(IntWritable key, Iterable<IntWritable> values, Context context) throws IOException, InterruptedException {
        // Sum all salaries received for this department
        int sum = 0;
        for (IntWritable val : values) {
            sum += val.get();
        }
        result.set(sum);
        context.write(key, result);
    }
}
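Driver class:
import java.io.IOException;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

public class Group {
    public static void main(String[] args) throws IOException, InterruptedException, ClassNotFoundException {
        Configuration conf = new Configuration();
        Job job = Job.getInstance(conf, "Group");
        job.setJarByClass(Group.class);
        job.setMapperClass(Groupmapper.class);
        // The same class is used as combiner and as reducer
        job.setCombinerClass(Groupreducer.class);
        job.setReducerClass(Groupreducer.class);
        job.setOutputKeyClass(IntWritable.class);
        job.setOutputValueClass(IntWritable.class);
        // args[0] is the input path, args[1] the output path
        FileInputFormat.addInputPath(job, new Path(args[0]));
        FileOutputFormat.setOutputPath(job, new Path(args[1]));
        System.exit(job.waitForCompletion(true) ? 0 : 1);
    }
}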
The expected output is:
10 8750
20 10875
30 9400
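As a sanity check on these totals, here is a minimal sketch in plain Java (no Hadoop involved) that sums the salary column per department for the input above. The class name `ExpectedTotals` and its helper `totals` are made up for this illustration; the parsing mirrors what the mapper does with `parts[3]` and `parts[4]`.

```java
import java.util.Map;
import java.util.TreeMap;

public class ExpectedTotals {
    // The input file from the question, one record per element.
    static final String[] LINES = {
        "7369,SMITH,CLERK,800,20",
        "7499,ALLEN,SALESMAN,1600,30",
        "7521,WARD,SALESMAN,1250,30",
        "7566,JONES,MANAGER,2975,20",
        "7654,MARTIN,SALESMAN,1250,30",
        "7698,BLAKE,MANAGER,2850,30",
        "7782,CLARK,MANAGER,2450,10",
        "7788,SCOTT,ANALYST,3000,20",
        "7839,KING,PRESIDENT,5000,10",
        "7844,TURNER,SALESMAN,1500,30",
        "7876,ADAMS,CLERK,1100,20",
        "7900,JAMES,CLERK,950,30",
        "7902,FORD,ANALYST,3000,20",
        "7934,MILLER,CLERK,1300,10"
    };

    // Hypothetical helper: sums column 3 (salary) grouped by column 4 (department).
    static Map<Integer, Integer> totals(String[] lines) {
        Map<Integer, Integer> sums = new TreeMap<>(); // TreeMap keeps departments sorted
        for (String line : lines) {
            String[] parts = line.split(",");
            int sal = Integer.parseInt(parts[3]);
            int deptno = Integer.parseInt(parts[4]);
            sums.merge(deptno, sal, Integer::sum);
        }
        return sums;
    }

    public static void main(String[] args) {
        // Prints one "deptno total" pair per line: 10 8750, 20 10875, 30 9400
        totals(LINES).forEach((dept, sum) -> System.out.println(dept + " " + sum));
    }
}
```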
But it prints the output given below. It has not aggregated the values; it behaves like an identity reducer.
10 1300
10 5000
10 2450
20 1100
20 3000
20 800
20 2975
20 3000
30 1500
30 1600
30 2850
30 1250
30 1250
30 950
The reduce function is not working properly.
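One likely cause: the method in `Groupreducer` is named `Reduce` with a capital R and carries no `@Override`. Java is case-sensitive, so this declares a new method instead of overriding `reduce`, and the framework keeps calling the inherited default `reduce`, which in `org.apache.hadoop.mapreduce.Reducer` writes every value through unchanged. The sketch below reproduces that effect without Hadoop. `BaseReducer`, `MisnamedReducer`, `SummingReducer`, `run` and `sampleData` are all hypothetical stand-ins written for this illustration, not Hadoop classes.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

public class OverrideDemo {
    // Stand-in for Hadoop's Reducer: the default reduce() passes every value through.
    static class BaseReducer {
        protected void reduce(int key, List<Integer> values, List<String> out) {
            for (int v : values) {
                out.add(key + " " + v);
            }
        }
    }

    // Same mistake as in the question: "Reduce" is a new method, not an override.
    static class MisnamedReducer extends BaseReducer {
        public void Reduce(int key, List<Integer> values, List<String> out) {
            int sum = 0;
            for (int v : values) sum += v;
            out.add(key + " " + sum);
        }
    }

    // Lower-case name plus @Override: the compiler verifies this replaces reduce().
    static class SummingReducer extends BaseReducer {
        @Override
        protected void reduce(int key, List<Integer> values, List<String> out) {
            int sum = 0;
            for (int v : values) sum += v;
            out.add(key + " " + sum);
        }
    }

    // Mimics the framework: it only ever calls reduce(), once per key.
    static List<String> run(BaseReducer reducer, Map<Integer, List<Integer>> grouped) {
        List<String> out = new ArrayList<>();
        for (Map.Entry<Integer, List<Integer>> e : grouped.entrySet()) {
            reducer.reduce(e.getKey(), e.getValue(), out);
        }
        return out;
    }

    // Salaries from the question's input, already grouped by department.
    static Map<Integer, List<Integer>> sampleData() {
        Map<Integer, List<Integer>> grouped = new TreeMap<>();
        grouped.put(10, Arrays.asList(2450, 5000, 1300));
        grouped.put(20, Arrays.asList(800, 2975, 3000, 1100, 3000));
        grouped.put(30, Arrays.asList(1600, 1250, 1250, 2850, 1500, 950));
        return grouped;
    }

    public static void main(String[] args) {
        // Misnamed method: 14 pass-through records, nothing is summed.
        System.out.println(run(new MisnamedReducer(), sampleData()));
        // Proper override: [10 8750, 20 10875, 30 9400]
        System.out.println(run(new SummingReducer(), sampleData()));
    }
}
```

If this is indeed the cause, renaming the method in `Groupreducer` to `reduce` and adding `@Override` should give the aggregated output. With `@Override` in place, the compiler rejects a misspelled method name instead of silently accepting it.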