Ok, in the last post I tantalized you with a silly problem I was trying to solve. I was taking items from a list and using a list comprehension to insert them each into their own map, each with a value of 1.
user=> (for [i [:a :b :c]] {i 1}) ({:a 1} {:b 1} {:c 1})
That list comprehension is actually part of a function:
(defn counts-merge [coll] (->> (for [i coll] {i 1})
(apply (partial merge-with +))))
Here's what it does:
(should= {\a 1 \b 2 \c 3 \d 4}
(counts-merge "abbcccdddd"))
Yeah, since teasing you in the last post I've discovered that, as usual, [coding in Clojure is basically cheating](/coding-in-clojure-feels-like-cheating/) (if you know the right functions to reach for):
(should= {\a 1 \b 2 \c 3 \d 4}
(frequencies "abbcccdddd"))
Yup, [there's a function in `clojure.core`](https://clojuredocs.org/clojure.core/frequencies) that does just what I was trying to do. And, arguably, it has a better name.
I was curious to examine its [implementation](https://github.com/clojure/clojure/blob/clojure-1.10.1/src/clj/clojure/core.clj#L7203):
(persistent! (reduce (fn [counts x]
(assoc! counts x (inc (get counts x 0))))
(transient {}) coll))
Ok, so there are a few strange bits ([`persistent!`](https://clojuredocs.org/clojure.core/persistent!) and [`transient`](https://clojuredocs.org/clojure.core/transient)). I presume those are performance-related (more on that in a bit). Here's a simplified implementation:
(defn- counts-inc [counts x] (assoc counts x (inc (get counts x 0))))
(defn counts-reduce [coll] (reduce counts-inc {} coll))
It also works as advertised:
(should= {\a 1 \b 2 \c 3 \d 4}
(sut/counts-reduce "abbcccdddd"))
Tt's not hard to imagine that this solution is just all-around more performant to begin with compared with my original `count-merge` function. So, [let's benchmark it](/benchmarking-clojure-code/):
And the winner is...
- counts-reduce - 10 x 10000 ops, average per-op: 3036ns
- frequencies - 10 x 10000 ops, average per-op: 3209ns
counts-merge - 10 x 10000 ops, average per-op: 6764ns
My simplified counts-reduce function??
Ok, I'm not surprised that counts-merge came in last at roughly 2x average duration, but on my machine, the counts-reduce consistently beats out the built-in frequencies. Not by much, but still. Maybe that's something to explore in a future blog post...