[swift-evolution] Proposal: Pattern Matching Partial Function (#111)

Fri Feb 5 13:41:07 CST 2016

Actually with Scala (and from what I gather from Big Data) — reduce and fold from the outside look like synonyms — BUT they are not.  

inline from http://stackoverflow.com/questions/25158780/difference-between-reduce-and-foldleft-fold-in-functional-programming-particula <http://stackoverflow.com/questions/25158780/difference-between-reduce-and-foldleft-fold-in-functional-programming-particula>

reduce vs foldLeft

A big big difference, not mentioned in any other stackoverflow answer relating to this topic clearly, is that reduce should be given a commutative monoid, i.e. an operation that is both commutative and associative. This means the operation can be parallelized.

This distinction is very important for Big Data / MPP / distributed computing, and the entire reason why reduce even exists. The collection can be chopped up and the reduce can operate on each chunk, then the reduce can operate on the results of each chunk - in fact the level of chunking need not stop one level deep. We could chop up each chunk too. This is why summing integers in a list is O(log N) if given an infinite number of CPUs.

If you just look at the signatures there is no reason for reduce to exist because you can achieve everything you can with reduce with a foldLeft. The functionality of foldLeft is a greater than the functionality of reduce.

But you cannot parallelize a foldLeft, so its runtime is always O(N) (even if you feed in a commutative monoid). This is because it's assumed the operation is not a commutative monoid and so the cumulated value will be computed by a series of sequential aggregations.

foldLeft does not assume commutativity nor associativity. It's associativity that gives the ability to chop up the collection, and it's commutativity that makes cumulating easy because order is not important (so it doesn't matter which order to aggregate each of the results from each of the chunks). Strictly speaking commutativity is not necessary for parallelization, for example distributed sorting algorithms, it just makes the logic easier because you don't need to give your chunks an ordering.

If you have a look at the Spark documentation for reduce it specifically says "... commutative and associative binary operator"

http://spark.apache.org/docs/1.0.0/api/scala/index.html#org.apache.spark.rdd.RDD <http://spark.apache.org/docs/1.0.0/api/scala/index.html#org.apache.spark.rdd.RDD>
Here is proof that reduce is NOT just a special case of foldLeft

scala> val intParList: ParSeq[Int] = (1 to 100000).map(_ => scala.util.Random.nextInt()).par

scala> timeMany(1000, intParList.reduce(_ + _))
Took 462.395867 milli seconds

scala> timeMany(1000, intParList.foldLeft(0)(_ + _))
Took 2589.363031 milli seconds
reduce vs fold

Now this is where it gets a little closer to the FP Category Theory roots, and a little trickier to explain.

There is no fold method in Scalding because under the (strict) Map Reduce programming model we cannot define fold because chunks do not have an ordering and fold only requires associativity, not commutativity. Spark does have fold because its framework is a super-set of the Map Reduce programming model and can order its chunks. Well you can actually do this in Hadoop too, but Scalding doesn't seem to expose this functionality in the version I'm familiar with.

Put simply, reduce works without an order of cumulation, fold requires an order of cumulation and it is that order of cumulation that necessitates a zero value NOT the existence of the zero value that distinguishes them. Strictly speaking reduce should work on an empty collection, because its zero value can by deduced by taking an arbitrary value x and then solving x op y = x, but that doesn't work with a non-commutative operation as there can exist a left and right zero value that are distinct (i.e. x op y != y op x). Of course Scala doesn't bother to work out what this zero value is as that would require doing some mathematics (which are probably uncomputable), so just throws an exception.

This difference between reduce* and fold* is a FP historical convention and has its roots in Category Theory. I hope now this deep difference relating to parallelization will no longer go unnoticed :)

> On 2016-02-06, at 2:30:01, Dave Abrahams via swift-evolution <swift-evolution at swift.org> wrote:
> 
> 
> on Fri Feb 05 2016, Craig Cruden <swift-evolution at swift.org> wrote:
> 
>> Swift reduce has a starting value and likely is just a fold, not a
>> reduce (at least when it is compared to Scala; not sure about when it
>> comes to big data).
> 
> IME fold and reduce have always been synonyms.  Did I miss some memo?
> 
> -- 
> -Dave
> 
> _______________________________________________
> swift-evolution mailing list
> swift-evolution at swift.org
> https://lists.swift.org/mailman/listinfo/swift-evolution

-------------- next part --------------
An HTML attachment was scrubbed...
URL: <https://lists.swift.org/pipermail/swift-evolution/attachments/20160206/072dcc68/attachment.html>