Ok, in the last post I tantalized you with a silly problem I was trying to solve. I was taking items from a list and using a list comprehension to insert them each into their own map, each with a value of 1
.
```clojure-repl
user=> (for [i [:a :b :c]] {i 1})
({:a 1} {:b 1} {:c 1})
That list comprehension is actually part of a function:
(defn counts-merge [coll]
(->> (for [i coll] {i 1})
(apply (partial merge-with +))))
Here's what it does:
(should= {\a 1 \b 2 \c 3 \d 4}
(counts-merge "abbcccdddd"))
Yeah, since teasing you in the last post I've discovered that, as usual, coding in Clojure is basically cheating (if you know the right functions to reach for):
(should= {\a 1 \b 2 \c 3 \d 4}
(frequencies "abbcccdddd"))
Yup, there's a function in clojure.core
that does just what I was trying to do. And, arguably, it has a better name.
I was curious to examine its implementation:
(persistent!
(reduce (fn [counts x]
(assoc! counts x (inc (get counts x 0))))
(transient {}) coll))
Ok, so there are a few strange bits (persistent!
and transient
). I presume those are performance-related (more on that in a bit). Here's a simplified implementation:
(defn- counts-inc [counts x]
(assoc counts x (inc (get counts x 0))))
(defn counts-reduce [coll]
(reduce counts-inc {} coll))
It also works as advertised:
(should= {\a 1 \b 2 \c 3 \d 4}
(sut/counts-reduce "abbcccdddd"))
Tt's not hard to imagine that this solution is just all-around more performant to begin with compared with my original count-merge
function. So, let's benchmark it:
And the winner is...
1. counts-reduce - 10 x 10000 ops, average per-op: 3036ns
2. frequencies - 10 x 10000 ops, average per-op: 3209ns
3. counts-merge - 10 x 10000 ops, average per-op: 6764ns
My simplified counts-reduce
function??
Ok, I'm not surprised that counts-merge
came in last at roughly 2x
average duration, but on my machine, the counts-reduce
consistently beats out the built-in frequencies
. Not by much, but still. Maybe that's something to explore in a future blog post...