I see a discrepancy in the console log when I run a MapReduce job with and without MultipleOutputs.
I have a map-only job that writes its output to a text file.
Without MultipleOutputs configured,
Code Snippet in my Mapper:
context.write(null,new Text(value));
Console output excerpt
Map-Reduce Framework
Map input records=2
Map output records=2
With MultipleOutputs configured,
Code Snippet in my Mapper:
multipleOutputs.write(null,new Text(value),FileOutputFormat.getOutputPath(context).toString() + Path.SEPARATOR + "v");
Console output excerpt
Map-Reduce Framework
Map input records=2
Map output records=0
Driver code to avoid an empty part file:
LazyOutputFormat.setOutputFormatClass(job, TextOutputFormat.class);
Note the number of map output records. Although it shows as 0 in the second case, I still see the correct output in the file. The generated file name is v-m-00000.
Am I missing something?
Map output records counts the number of key-value pairs that the mappers emit using context.write(). This is the only way to pass records from mappers to reducers, and that is the reason why this counter exists. Records written through MultipleOutputs do not go through context.write(), so they are not included in it, even though they do reach the output file.
If you want to count the records written by any other method, or in fact anything else, you have to define your own custom counter, which is what I recommend in your case.
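A custom counter for this case could look roughly like the sketch below. The class name RecordCountingMapper and the counter name RECORDS_WRITTEN are made up for illustration, and the input types (LongWritable, Text) assume the job reads with TextInputFormat, which the question does not state. NullWritable is used here in place of the null key from the question; TextOutputFormat omits the key in both cases.

```java
import java.io.IOException;

import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.NullWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;
import org.apache.hadoop.mapreduce.lib.output.MultipleOutputs;

// Hypothetical mapper: writes every record through MultipleOutputs and
// tracks the number of written records in a user-defined counter.
public class RecordCountingMapper
        extends Mapper<LongWritable, Text, NullWritable, Text> {

    // Hypothetical counter; the enum class name becomes the counter group
    // in the job's console summary.
    public enum OutputCounters { RECORDS_WRITTEN }

    private MultipleOutputs<NullWritable, Text> multipleOutputs;
    private String basePath;

    @Override
    protected void setup(Context context) {
        multipleOutputs = new MultipleOutputs<>(context);
        // Same base path as in the question: <job output dir>/v
        basePath = FileOutputFormat.getOutputPath(context).toString()
                + Path.SEPARATOR + "v";
    }

    @Override
    protected void map(LongWritable key, Text value, Context context)
            throws IOException, InterruptedException {
        // This write bypasses context.write(), so the built-in
        // "Map output records" counter is not incremented.
        multipleOutputs.write(NullWritable.get(), value, basePath);

        // Count the record ourselves instead.
        context.getCounter(OutputCounters.RECORDS_WRITTEN).increment(1);
    }

    @Override
    protected void cleanup(Context context)
            throws IOException, InterruptedException {
        // Close MultipleOutputs so the underlying writers are flushed.
        multipleOutputs.close();
    }
}
```

The counter is printed with the other job counters when the job finishes. It can also be read in the driver after job.waitForCompletion(true) with job.getCounters().findCounter(RecordCountingMapper.OutputCounters.RECORDS_WRITTEN).getValue(). As an alternative, calling MultipleOutputs.setCountersEnabled(job, true) in the driver should make MultipleOutputs keep its own per-output counters, which are disabled by default.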