What is a Combiner? How to write one?

A Combiner improves performance by reducing the volume of data transferred between the Map and Reduce phases.
A combiner runs on the map side, on the output of each mapper. It takes the mapper's output as input, aggregates the values for each key locally, and hands the smaller result to the shuffle and sort step, which delivers it to the reducer for further processing. It does not perform the shuffle and sort itself. Like a reducer, it receives key-value pairs from the Map phase, processes them, and emits key-value pairs.
A Combiner is also called a mini reducer. It acts between the Mapper and the Reducer.

Let's say you have the input file below:


hello hi saroj rout kumar
hey hello what hi rout hello
hi nishant saroj rout what

The Mapper takes the input file and emits one (word,1) pair per word, as below:


(hello,1) (hi,1) (saroj,1) (rout,1) (kumar,1)
(hey,1) (hello,1) (what,1) (hi,1) (rout,1) (hello,1)
(hi,1) (nishant,1) (saroj,1) (rout,1) (what,1)
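
The map step above can be sketched in plain Java, without any Hadoop dependency. This is only an illustration of the logic: the class name MapSketch and the method map are hypothetical, and a real Hadoop mapper would write its pairs through the framework's context rather than return a list.

```java
import java.util.AbstractMap;
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for the map step; not a Hadoop class.
class MapSketch {
    // Emit one (word, 1) pair for every whitespace-separated token in the line.
    static List<Map.Entry<String, Integer>> map(String line) {
        List<Map.Entry<String, Integer>> pairs = new ArrayList<>();
        for (String word : line.trim().split("\\s+")) {
            if (!word.isEmpty()) {
                pairs.add(new AbstractMap.SimpleEntry<>(word, 1));
            }
        }
        return pairs;
    }

    public static void main(String[] args) {
        // Second line of the sample input file.
        System.out.println(map("hey hello what hi rout hello"));
    }
}
```

Note that a repeated word such as hello is emitted once per occurrence; nothing is added up at this stage.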

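Before the Combiner runs, the Mapper output (shown above) is sorted and grouped by key, so the Combiner sees each key together with its collection of values, as below (assuming a single mapper processes the whole file):


(hello,1,1,1) (hey,1) (hi,1,1,1) (kumar,1)
(nishant,1) (rout,1,1,1) (saroj,1,1) (what,1,1)

A summing Combiner then emits one pair per key, such as (hello,3), so 8 records are sent to the reducer instead of 16.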
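
To make the saving concrete, here is a minimal plain-Java sketch of what a summing combiner does with the mapper output of the sample file, assuming one mapper handled all three lines. The names CombineSketch and combine are hypothetical and this is not Hadoop code; it only mimics the local group-and-sum step.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical stand-in for the combine step; not a Hadoop class.
class CombineSketch {
    // Each element of mapperOutputKeys represents one (word, 1) pair.
    // A TreeMap keeps the keys in sorted order.
    static Map<String, Integer> combine(List<String> mapperOutputKeys) {
        Map<String, Integer> partial = new TreeMap<>();
        for (String word : mapperOutputKeys) {
            partial.merge(word, 1, Integer::sum);
        }
        return partial;
    }

    public static void main(String[] args) {
        List<String> words = Arrays.asList(
            "hello", "hi", "saroj", "rout", "kumar",
            "hey", "hello", "what", "hi", "rout", "hello",
            "hi", "nishant", "saroj", "rout", "what");
        Map<String, Integer> combined = combine(words);
        // Compare how many records enter and leave the combine step.
        System.out.println(words.size() + " records in, " + combined.size() + " records out");
        System.out.println(combined);
    }
}
```

The fewer records that leave the map side, the less data has to cross the network during the shuffle.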

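For word count, there is no difference between the Reducer and Combiner code, so the same class can serve as both. This works because summing is associative and commutative, and because the Reducer's input and output key-value types match the Mapper's output types. Hadoop treats the combiner as an optional optimization and may run it zero, one, or several times, so the job must produce the same result with or without it.
We just need to set the combiner class in the driver program, as below: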


job.setMapperClass(WordCountMapper.class);
job.setCombinerClass(WordCountReducer.class);
job.setReducerClass(WordCountReducer.class);
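
Why is it safe to plug the same class in twice? The plain-Java sketch below illustrates the idea for the key hello, assuming each input line was handled by its own mapper. The names ReuseSketch and sumValues are hypothetical; sumValues only stands in for the summing logic one would expect inside a word-count reduce method, which this post does not show.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical illustration; not a Hadoop class.
class ReuseSketch {
    // Stand-in for the body of a word-count reduce(): add up all values of one key.
    static int sumValues(List<Integer> values) {
        int total = 0;
        for (int v : values) {
            total += v;
        }
        return total;
    }

    public static void main(String[] args) {
        // Values for the key "hello", assuming one mapper per input line.
        List<Integer> mapper1 = Arrays.asList(1);    // line 1 contains hello once
        List<Integer> mapper2 = Arrays.asList(1, 1); // line 2 contains hello twice

        // Without a combiner, the reducer receives every 1.
        int reduceOnly = sumValues(Arrays.asList(1, 1, 1));

        // With a combiner, each mapper pre-sums and the reducer adds the partial sums.
        int withCombiner = sumValues(Arrays.asList(sumValues(mapper1), sumValues(mapper2)));

        System.out.println(reduceOnly + " " + withCombiner);
    }
}
```

Both paths give the same total, which is the property a reducer needs before it can double as a combiner.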

The reducer output would be as below:


(hello,3) (hey,1) (hi,3) (kumar,1) (nishant,1) (rout,3)
(saroj,2) (what,2)
