I've been reading a nice answer to Difference between reduce and foldLeft/fold in functional programming (particularly Scala and Scala APIs)? provided by samthebest and I am not sure if I understand all the details:
According to the answer (
reducevsfoldLeft):A big big difference (...) is that reduce should be given a commutative monoid, (...)
This distinction is very important for Big Data / MPP / distributed computing, and the entire reason why reduce even exists.
and
Reduce is defined formally as part of the MapReduce paradigm,
I am not sure how this two statements combine. Can anyone put some light on that?
I tested different collections and I haven't seen performance difference between
reduceandfoldLeft. It looks likeParSeqis a special case, is that right?Do we really need order to define
fold?we cannot define fold because chunks do not have an ordering and fold only requires associativity, not commutativity.
Why it couldn't be generalized to unordered collection?
                        
As mentioned in the comments, the term reduce means different thing when used in the context of MapReduce and when used in the context of functional programming.
In MapReduce, the system groups the results of the
mapfunction by a given key and then calls thereduceoperation to aggregate values for each group (soreduceis called once for each group). You can see it as a function(K, [V]) -> Rtaking the group keyKtogether with all the values belonging to the group[V]and producing some result.In functional programming,
reduceis a function that aggregates elements of some collection when you give it an operation that can combine two elements. In other words, you define a function(V, V) -> Vand thereducefunction uses it to aggregate a collection[V]into a single valueV.When you want to add numbers
[1,2,3,4]using+as the function, thereducefunction can do it in a number of ways:((1+2)+3)+4)a = 1+2andb = 3+4in parallel and then adda+b!The
foldLeftoperation is, by definition always proceeding from the left and so it always uses the evaluation strategy of (1). In fact, it also takes an initial value, so it evaluates something more like(((0+1)+2)+3)+4). This makesfoldLeftuseful for operations where the order matters, but it also means that it cannot be implemented for unordered collections (because you do not know what "left" is).