Michael Whatcott - Benchmarking Clojure Code, Part 2

In part 1 I eagerly showed off a little utility function for benchmarking bits of code. Since then I've had a chance to use the code several times, share it with coworkers, and think about possible improvements. There's a saying in the Go community which I mostly agree with:

A little copying is better than a little dependency.

In this case I really wanted to publish something and experience the full workflow, from creating a library to using it. So, without further ado, here it is:

Example usage:

(require '[benchmarks.bench :as bench])

(println "big reduce:        " (bench/report #(reduce * (range 10000000))))
(println "int/float division:" (bench/report #(int (/ 101 10))))
(println "float division:    " (bench/report #(/ 101 10)))
(println "integer division:  " (bench/report #(quot 101 10)))

Example output:

big reduce:         {:total-ops 10**1, :total-time 3.1s, :per-op-time 314.1ms}
int/float division: {:total-ops 10**7, :total-time 1.7s, :per-op-time 173ns}
float division:     {:total-ops 10**7, :total-time 865.0ms, :per-op-time 86ns}
integer division:   {:total-ops 10**7, :total-time 74.7ms, :per-op-time 7ns}

Summary of improvements compared with part 1

The benchmark function no longer emits a plain text report to stdout. Instead, it returns a data structure with low-level values about the execution.
The report function translates the result structure of benchmark into another data structure with more human-readable values.
Best of all, there's no need to specify any numeric values directing the library how many times to execute the code. All of that is handled by the benchmark function. Like other benchmarking tools I've used recently (Go), this library calls the provided function repeatedly in successively increasing batches (powers of 10) until:
- it has gathered enough measurements to give a stable average, or
- the runtime exceeds 1 second.
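To make the split between benchmark and report a bit more concrete, here is a rough sketch of how a raw result might be turned into the human-readable map shown in the example output. The helper names (format-duration, format-magnitude, report-sketch) and the rounding rules are my own assumptions for illustration and are not taken from the library. The :total-ns and :total-ops keys are the ones the threshold functions later in this post destructure.

```clojure
;; Hypothetical helpers, not the library's actual implementation.

(defn format-duration
  "Renders a nanosecond count with a unit suffix, e.g. 173ns or 1.7s."
  [ns]
  (let [fmt (fn [pattern x]
              ;; Fixed locale so the decimal separator is always a dot.
              (String/format java.util.Locale/US pattern (to-array [(double x)])))]
    (cond
      (< ns 1e3) (str (long ns) "ns")
      (< ns 1e6) (fmt "%.1fµs" (/ ns 1e3))
      (< ns 1e9) (fmt "%.1fms" (/ ns 1e6))
      :else      (fmt "%.1fs" (/ ns 1e9)))))

(defn format-magnitude
  "Renders a count as 10**k. Assumes n is an exact power of 10."
  [n]
  (str "10**" (dec (count (str n)))))

(defn report-sketch
  "Translates a raw result map into human-readable strings."
  [{:keys [total-ns total-ops]}]
  {:total-ops   (format-magnitude total-ops)
   :total-time  (format-duration total-ns)
   :per-op-time (format-duration (quot total-ns total-ops))})

;; A hand-written raw result, not a real measurement:
(report-sketch {:total-ns 1730000000 :total-ops 10000000})
;; => {:total-ops "10**7", :total-time "1.7s", :per-op-time "173ns"}
```

Keeping the raw numbers in one map and the formatted strings in another means callers can still compare or aggregate results before anything is rounded for display.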

New Learning

I'm pretty happy with the main algorithm:

(->> (iterate powers-of-10 1)
     (map (partial bench f))
     (drop-while within-thresholds)
     first)

This is the bit that calls the provided function f in increasingly larger batches. You might be curious about the implementation of powers-of-10:

(def powers-of-10 (partial * 10))

An example:

user=> (take 10 (iterate (partial * 10) 1))
(1 10 100 1000 10000 100000 1000000 10000000 100000000 1000000000)

What does within-thresholds do? Well, there are two threshold conditions, and when either one is met, it's a signal to stop executing the benchmark.

(defn- within-time-threshold [{:keys [total-ns]}] (< total-ns max-duration))
(defn- within-ops-threshold [{:keys [total-ops]}] (< total-ops max-ops))

They get stitched together with the following def:

(def within-thresholds
  (every-pred within-time-threshold
              within-ops-threshold))

This is the first time I've used every-pred, which seemed more elegant than a full-blown function that uses and to stitch things together:

(defn within-thresholds [result]
  (and (within-time-threshold result)
       (within-ops-threshold result)))
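For anyone who wants to experiment without the library, the pieces above can be assembled into a self-contained sketch. Everything with a -sketch suffix is a stand-in I wrote for illustration: the library's real bench function is not shown in this post, and the limits below are assumed values. The timing is deliberately naive (no JIT warm-up, no garbage-collection handling), so treat the numbers it produces as rough.

```clojure
;; Assumed limits; the real values are not shown in this post.
(def max-duration 1000000000) ; 1 second, in nanoseconds
(def max-ops 10000000)        ; 10**7 operations

(defn bench-sketch
  "Stand-in for bench: calls f n times and records the elapsed nanoseconds."
  [f n]
  (let [start (System/nanoTime)]
    (dotimes [_ n] (f))
    {:total-ops n
     :total-ns  (- (System/nanoTime) start)}))

(defn within-thresholds [{:keys [total-ns total-ops]}]
  (and (< total-ns max-duration)
       (< total-ops max-ops)))

(defn benchmark-sketch
  "Runs f in batches of 1, 10, 100, ... and returns the first batch result
  that reaches either limit."
  [f]
  (->> (iterate (partial * 10) 1)
       (map (partial bench-sketch f))
       (drop-while within-thresholds)
       first))

;; Returns a map with :total-ops and :total-ns for the final batch.
(benchmark-sketch #(quot 101 10))
```

Since iterate and map are lazy here, each batch only runs when drop-while asks for the next result, so no work is done beyond the batch that trips a limit.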

Feel free to take it for a spin!