java hadoop mapreduce sequencefile

Java MapReduce: use a SequenceFile as reducer output


I have a working Java MapReduce program with two jobs. The output of the first reducer is written to a file and read by the second mapper.

I would like to change the first reducer output to be a SequenceFile.

How can I do this?

This is the main method of my program:

public static void main(String[] args) throws Exception {
    //setup first job
    Configuration conf = new Configuration();
    conf.set("mapred.textoutputformat.separator", "&");
    Job job = Job.getInstance(conf, "First Job");
    job.setJarByClass(Prova.class);
    job.setMapperClass(FirstMapper.class);
    job.setReducerClass(FirstReducer.class);
    job.setOutputKeyClass(Text.class);
    job.setOutputValueClass(IntWritable.class);
    FileInputFormat.addInputPath(job, new Path(args[0]));
    Path tempOutput=new Path("FirstMapper");
    FileOutputFormat.setOutputPath(job, tempOutput);

    job.waitForCompletion(true);

    //setup second job
    Configuration conf2 = new Configuration();
    conf2.set("mapred.textoutputformat.separator", " ");
    conf2.set("numberOfELements", args[2]);
    Job job2 = Job.getInstance(conf2, "Second Job");
    job2.setJarByClass(Prova.class);
    job2.setMapperClass(SecondMapper.class);
    job2.setReducerClass(SecondReducer.class);
    job2.setOutputKeyClass(Text.class);
    job2.setOutputValueClass(Text.class);
    FileInputFormat.addInputPath(job2, tempOutput);
    FileOutputFormat.setOutputPath(job2, new Path(args[1]));

    System.exit(job2.waitForCompletion(true) ? 0 : 1);
}

I already tried adding the following lines:

job.setOutputFormatClass(SequenceFileOutputFormat.class);
job2.setInputFormatClass(SequenceFileInputFormat.class);

but I get the following error: wrong value class: org.apache.hadoop.io.Text is not class org.apache.hadoop.io.IntWritable. The error happens when I call context.write(Text, Text) in the first reducer.


Solution

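  • context.write(Text, Text) and job.setOutputValueClass(IntWritable.class) disagree with one another. The SequenceFile writer checks every value it is given against the declared output value class, whereas the text output format simply writes out whatever it receives, which is why the mismatch only surfaced after the switch. Make them consistent (for example, declare Text.class as the output value class if the reducer really emits Text values) and it should work.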
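
A minimal sketch of a consistent setup, under assumptions not confirmed by the question: that the first mapper emits (Text, IntWritable) while the first reducer emits (Text, Text). The class name SequenceFileJobSetup, the helper method names, and the PassThroughMapper skeleton are hypothetical; only the Hadoop calls themselves come from the library. Note that once the second job reads with SequenceFileInputFormat, its mapper receives the stored (Text, Text) pairs instead of the (LongWritable, Text) line records that the default text input produces, so the second mapper's input types may need to change as well.

```java
import java.io.IOException;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.lib.input.SequenceFileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.SequenceFileOutputFormat;

// Hypothetical helper class, for illustration only.
public class SequenceFileJobSetup {

    // First job: write the reducer output as a SequenceFile of (Text, Text).
    static Job configureFirstJob(Configuration conf) throws IOException {
        Job job = Job.getInstance(conf, "First Job");
        job.setOutputFormatClass(SequenceFileOutputFormat.class);

        // ASSUMPTION: the mapper emits (Text, IntWritable). Declare the map
        // output types explicitly, because they now differ from the final
        // (reducer) output types.
        job.setMapOutputKeyClass(Text.class);
        job.setMapOutputValueClass(IntWritable.class);

        // Must match what the reducer passes to context.write(...).
        job.setOutputKeyClass(Text.class);
        job.setOutputValueClass(Text.class);
        return job;
    }

    // Second job: read the SequenceFile produced by the first job.
    static Job configureSecondJob(Configuration conf) throws IOException {
        Job job2 = Job.getInstance(conf, "Second Job");
        job2.setInputFormatClass(SequenceFileInputFormat.class);
        job2.setMapperClass(PassThroughMapper.class);
        job2.setOutputKeyClass(Text.class);
        job2.setOutputValueClass(Text.class);
        return job2;
    }

    // Hypothetical second mapper skeleton: its input types are the key and
    // value types stored in the SequenceFile, here (Text, Text).
    public static class PassThroughMapper extends Mapper<Text, Text, Text, Text> {
        @Override
        protected void map(Text key, Text value, Context context)
                throws IOException, InterruptedException {
            context.write(key, value);
        }
    }
}
```

If the first reducer is actually meant to emit IntWritable values, the alternative is to leave job.setOutputValueClass(IntWritable.class) as it is and change the reducer's context.write call instead; either way the declared classes and the written objects have to agree.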