I am implementing a reduce-side join in Hadoop MapReduce (Java). For that purpose I am using multiple inputs: for example, there are two files, Customers and Orders, and I join them on cid (customer_id).
My questions:
- In the above program, if I write a combiner class, how is it going to work? As far as I know, a combiner is a mapper-level aggregator; however, in this case we have two mapper logics.
- Will the combiner logic be applied to both mapper logics?
- Is there any way to apply the combiner logic to only one of the mapper logics?
A combiner aggregates mapper output locally, on the map side, before it is sent to the reducers, and you can implement it with whatever logic suits your job. It is often called a mini-reducer, and a combiner class extends the Reducer class.
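Remember that the combiner is not guaranteed to run: the framework may call it zero, one, or several times. Your mapper output must therefore always be valid reducer input, and the combiner's output must have the same key and value types as the mapper's output.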
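To make the point about a combiner that may or may not run more concrete, here is a minimal sketch in plain Java. It uses no Hadoop classes and only simulates the idea. In a real job the combiner is registered once on the job (Job.setCombinerClass), so it receives output from every mapper; one way to restrict its logic to a single mapper is to have each mapper tag its values and let the combiner aggregate only one tag. The "CUST"/"ORD" tags, the record layout, and the class and method names below are all hypothetical, chosen only for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Plain-Java simulation (no Hadoop classes) of a tag-aware combiner for a
// reduce-side join. The "CUST"/"ORD" tags and record layout are hypothetical.
final class TagAwareCombinerSketch {

    // Combiner step for one key (cid): customer records pass through
    // unchanged, order amounts are collapsed into a single partial sum.
    static List<String> combine(List<String> values) {
        List<String> out = new ArrayList<>();
        long orderTotal = 0;
        boolean sawOrder = false;
        for (String v : values) {
            if (v.startsWith("ORD\t")) {
                orderTotal += Long.parseLong(v.substring(4));
                sawOrder = true;
            } else {
                out.add(v); // e.g. "CUST\tAlice" is left untouched
            }
        }
        if (sawOrder) {
            out.add("ORD\t" + orderTotal); // same format as the mapper output
        }
        return out;
    }

    // Reducer step for one key: joins the customer name with the order total.
    // It accepts raw or combined values, because both use the same format.
    static String reduce(String cid, List<String> values) {
        String name = "unknown";
        long total = 0;
        for (String v : values) {
            if (v.startsWith("CUST\t")) {
                name = v.substring(5);
            } else if (v.startsWith("ORD\t")) {
                total += Long.parseLong(v.substring(4));
            }
        }
        return cid + "\t" + name + "\t" + total;
    }

    public static void main(String[] args) {
        List<String> mapOutput = Arrays.asList("CUST\tAlice", "ORD\t30", "ORD\t12");
        // The joined record is the same whether or not the combiner runs.
        System.out.println(reduce("c1", mapOutput));
        System.out.println(reduce("c1", combine(mapOutput)));
    }
}
```

Because the combiner emits records in the same format it consumes, the reducer does not need to know whether the combiner ran zero, one, or several times.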
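I don't fully get your question: whatever your mapper input is, the mapper output will be key-value data. The combiner is set on the job, not on an individual mapper, so it is applied to the output of both mappers. It just aggregates those pairs (in the simplest case, adds the values up). Say your mapper output is: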
After combining, your output will be: