The MapReduce paradigm was inspired by functional programming techniques, so why not take things full circle and rewrite a MapReduce job in a functional language? Here is a one-line Scala implementation of the classic “word-count” program.
List("to be or", "not to", "be").par.flatMap(_.split("""\s+""")). foldLeft(Map[String, Int]())((m,s) => m + (s -> (m.getOrElse(s, 0)+1))) // produces Map(to -> 2, be -> 2, or -> 1, not -> 1)
Okay so I used “flatMap” instead of “map” and “foldLeft” instead of “reduce”, but you get the idea. Note the
par keyword, which invokes Scala’s parallel collections framework, transparently parallelizing the map step. Who needs a compute cluster when you’ve got the Scala REPL?
This is just a little stunt, of course, but if you’re serious about working with MapReduce and Scala the ScalaHadoop project looks like a good place to start.