# Counting Frequencies In Clojure

## "There's a function for that"

#### October 15, 2021

Ok, in the last post I tantalized you with a silly problem I was trying to solve. I was taking items from a list and using a list comprehension to insert them each into their own map, each with a value of `1`.

`````````clojure-repl
user=> (for [i [:a :b :c]] {i 1})
({:a 1} {:b 1} {:c 1})
``````

That list comprehension is actually part of a function:

``````(defn counts-merge [coll]
(->> (for [i coll] {i 1})
(apply (partial merge-with +))))
``````

Here's what it does:

``````(should= {\a 1 \b 2 \c 3 \d 4}
(counts-merge "abbcccdddd"))
``````

Yeah, since teasing you in the last post I've discovered that, as usual, coding in Clojure is basically cheating (if you know the right functions to reach for):

``````(should= {\a 1 \b 2 \c 3 \d 4}
(frequencies "abbcccdddd"))
``````

Yup, there's a function in `clojure.core` that does just what I was trying to do. And, arguably, it has a better name.

I was curious to examine its implementation:

``````  (persistent!
(reduce (fn [counts x]
(assoc! counts x (inc (get counts x 0))))
(transient {}) coll))
``````

Ok, so there are a few strange bits (`persistent!` and `transient`). I presume those are performance-related (more on that in a bit). Here's a simplified implementation:

``````(defn- counts-inc [counts x]
(assoc counts x (inc (get counts x 0))))

(defn counts-reduce [coll]
(reduce counts-inc {} coll))
``````

``````(should= {\a 1 \b 2 \c 3 \d 4}
(sut/counts-reduce "abbcccdddd"))
``````

Tt's not hard to imagine that this solution is just all-around more performant to begin with compared with my original `count-merge` function. So, let's benchmark it:

And the winner is...

``````1. counts-reduce - 10 x 10000 ops, average per-op: 3036ns
2. frequencies   - 10 x 10000 ops, average per-op: 3209ns
3. counts-merge  - 10 x 10000 ops, average per-op: 6764ns
``````

My simplified `counts-reduce` function??

Ok, I'm not surprised that `counts-merge` came in last at roughly `2x` average duration, but on my machine, the `counts-reduce` consistently beats out the built-in `frequencies`. Not by much, but still. Maybe that's something to explore in a future blog post...

-Michael Whatcott