Destructuring, records, protocols and named arguments

optimization, pondering — cgrand, 8 May 2010 @ 12 h 19 min

Warning: this post is full of microbenchmarks so take it with a pinch of salt.

Destructuring a record

Field access through keyword-lookups on records is fast:

user=> (defrecord Foo [a b])
user.Foo
user=> (let [x (Foo. 1 2)] (dotimes [_ 5] (time (dotimes [_ 1e7] (let [a (:a x) b (:b x)] [a b])))))
"Elapsed time: 114.787424 msecs"
"Elapsed time: 102.568273 msecs"
"Elapsed time: 71.150593 msecs"
"Elapsed time: 72.217418 msecs"
"Elapsed time: 70.127489 msecs"
nil

But as far as I know, destructure still emits (get x :k). Let’s check:

user=> (let [x (Foo. 1 2)] (dotimes [_ 5] (time (dotimes [_ 1e7] (let [{:keys [a b]} x] [a b])))))
"Elapsed time: 968.616612 msecs"
"Elapsed time: 945.704133 msecs"
"Elapsed time: 911.290751 msecs"
"Elapsed time: 927.658125 msecs"
"Elapsed time: 916.796408 msecs"
nil

Actually, destructuring a record is even slightly slower than destructuring a small map:

user=> (let [x {:a 1 :b 2}] (dotimes [_ 5] (time (dotimes [_ 1e7] (let [{:keys [a b]} x] [a b])))))
"Elapsed time: 866.942377 msecs"
"Elapsed time: 746.273968 msecs"
"Elapsed time: 734.366239 msecs"
"Elapsed time: 729.346188 msecs"
"Elapsed time: 746.96393 msecs"
nil

One patch later

I patched destructure to emit keyword-lookups:

user=> (let [x (Foo. 1 2)] (dotimes [_ 5] (time (dotimes [_ 1e7] (let [{:keys [a b]} x] [a b])))))
"Elapsed time: 479.911064 msecs"
"Elapsed time: 488.895167 msecs"
"Elapsed time: 464.617431 msecs"
"Elapsed time: 480.944575 msecs"
"Elapsed time: 464.969779 msecs"
nil

It’s better but still slower than without destructuring. Let’s see what slows us down:

user=> (macroexpand-1 '(let [{:keys [a b]} x] [a b]))
(let* [map__3838 x map__3838 (if (clojure.core/seq? map__3838) (clojure.core/apply clojure.core/hash-map map__3838) map__3838) b (:b map__3838) a (:a map__3838)] [a b])

My bet is on the if+seq? check, so I remove it:

user=> (let [x (Foo. 1 2)] (dotimes [_ 5] (time (dotimes [_ 1e7] (let* [map__3838 x map__3838 map__3838 b (:b map__3838) a (:a map__3838)] [a b])))))
"Elapsed time: 125.188397 msecs"
"Elapsed time: 103.041099 msecs"
"Elapsed time: 70.061558 msecs"
"Elapsed time: 70.793984 msecs"
"Elapsed time: 69.759146 msecs"
nil
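
The hand-edited let* above can also be generated by a small macro. This is only a sketch: let-keys is a hypothetical name, not something in clojure.core and not the patch mentioned earlier. It binds each symbol to a direct keyword lookup and never checks for a seq.

```clojure
;; Hypothetical helper (not part of clojure.core): destructures the given
;; symbols out of m using plain keyword lookups, with no seq? check.
(defmacro let-keys
  [ks m & body]
  (let [msym (gensym "map__")]
    `(let [~msym ~m
           ;; for each symbol a, emit the binding pair: a (:a map__NNN)
           ~@(mapcat (fn [k] [k (list (keyword (name k)) msym)]) ks)]
       ~@body)))

(defrecord Point [x y]) ; Point is an illustrative record for this sketch

(let-keys [x y] (->Point 1 2) [x y])
;=> [1 2]
```

The price is that a sequence of alternating keys and values is no longer accepted: the lookups simply return nil on it.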

This if+seq? check is what makes named arguments work. I wonder if this behaviour should be an option of map-destructuring (e.g. {:keys [a b] :named-args true}). Anyway, I had in mind that such a dispatch could be optimized with protocols.
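
For reference, here is what named arguments look like from the caller's side. The function name make-point is made up for this example. The rest arguments arrive as a seq, so the seq? branch of the expansion turns them into a map before the lookups happen.

```clojure
;; make-point is an illustrative name. The map pattern after & is applied
;; to the seq of rest arguments, which is why destructure tests for seq?.
(defn make-point
  [& {:keys [x y]}]
  [x y])

(make-point :x 1 :y 2)
;=> [1 2]

;; The same conversion happens when a map pattern meets any seq:
(let [{:keys [x y]} '(:x 1 :y 2)] [x y])
;=> [1 2]
```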

Optimizing dispatch with protocols

user=> (defprotocol Seq (my-seq [this]) (my-seq? [this]))
Seq
user=> (extend-protocol Seq
         clojure.lang.ISeq
         (my-seq [this] (.seq this))
         (my-seq? [this] true)
         Object
         (my-seq [this] (clojure.lang.RT/seq this))
         (my-seq? [this] false))
nil

Let’s see how my-seq? compares to seq?:

user=> (let [x (Foo. 1 2)] (dotimes [_ 5] (time (dotimes [_ 1e7] (let* [map__3838 x map__3838 (if (my-seq? map__3838) (clojure.core/apply clojure.core/hash-map map__3838) map__3838) b (:b map__3838) a (:a map__3838)] [a b])))))
"Elapsed time: 179.282982 msecs"
"Elapsed time: 161.781526 msecs"
"Elapsed time: 154.307042 msecs"
"Elapsed time: 155.567677 msecs"
"Elapsed time: 153.716604 msecs"
nil

Hence the my-seq? version runs about 3x faster than the seq? version (roughly 155 ms versus 470 ms for the whole loop), which means that protocols are indeed speedy: yet another incredible piece of work by Rich Hickey!

I’m still exploring how low-level protocol fns behave in a concurrent setting, stay tuned!

Meanwhile, don’t forget to register for the next European Clojure training session taught by Lau and me.

Graph Structured Stacks in Clojure

pondering — cgrand, 16 January 2010 @ 17 h 41 min

I am currently pondering how best to represent Graph Structured Stacks in Clojure. My first idea was to use maps.

So this GSS would be represented as:

{7 {3 {1 {0 {}}},
    4 {1 {0 {}}},
    5 {2 {0 {}}}},
 8 {6 {2 {0 {}}}}}

Please note that repeated maps in this representation would be shared by construction.
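
A minimal sketch of what "shared by construction" means here. The names bottom, n1, n2 and push are mine and only serve the illustration: each sub-stack is built once and reused, so equal sub-maps are the very same object in memory.

```clojure
;; Illustrative names: each sub-stack is built once, then reused.
(def bottom {0 {}})
(def n1 {1 bottom})
(def n2 {2 bottom})

(def gss {7 {3 n1, 4 n1, 5 n2}
          8 {6 n2}})

;; The literal from the text is equal to the shared construction...
(= gss {7 {3 {1 {0 {}}}, 4 {1 {0 {}}}, 5 {2 {0 {}}}}, 8 {6 {2 {0 {}}}}})
;=> true

;; ...and the repeated sub-stacks are one and the same object.
(identical? (get-in gss [7 3]) (get-in gss [7 4]))
;=> true

;; Hypothetical stack op: pushing a label on top of every current top
;; is just wrapping the whole map.
(defn push [stacks label] {label stacks})
```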

That is all well and good: this representation allows me to easily perform stack ops. But I also want to be able to traverse the stacks from the bottom. This means making the graph cyclic, so I thought that I needed an adjacency list or, more realistically, two indexed versions (maps of sets) of this list to be able to traverse the GSS in any direction. Or the map-based representation for direct traversal and an adjacency map for reverse traversal.

To me, the most tedious part of these adjacency lists is that I have to name the GSS nodes. Wait! There are natural identifiers for these nodes: their key-value pairs in the map-based representation! [2 {0 {}}] is a natural identifier for node #2 and [7 {3 {1 {0 {}}}, 4 {1 {0 {}}}, 5 {2 {0 {}}}}] is the one for node #7.

Now I’m able to represent the GSS as:

[;; direct representation
 {7 {3 {1 {0 {}}},
     4 {1 {0 {}}},
     5 {2 {0 {}}}},
  8 {6 {2 {0 {}}}}}
 ;; adjacency map for reverse traversal
 {nil #{[0 {}]},
  [0 {}] #{[1 {0 {}}]
           [2 {0 {}}]},
  [1 {0 {}}] #{[3 {1 {0 {}}}]
               [4 {1 {0 {}}}]},
  [2 {0 {}}] #{[5 {2 {0 {}}}]
               [6 {2 {0 {}}}]},
  [3 {1 {0 {}}}] #{[7 {3 {1 {0 {}}},
                       4 {1 {0 {}}},
                       5 {2 {0 {}}}}]},
  [4 {1 {0 {}}}] #{[7 {3 {1 {0 {}}},
                       4 {1 {0 {}}},
                       5 {2 {0 {}}}}]},
  [5 {2 {0 {}}}] #{[7 {3 {1 {0 {}}},
                       4 {1 {0 {}}},
                       5 {2 {0 {}}}}]},
  [6 {2 {0 {}}}] #{[8 {6 {2 {0 {}}}}]}}]

Granted, the serialization is verbose, but all nodes are shared in memory by construction and I find this model rather simple.
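
The adjacency map does not have to be written by hand: it can be derived from the direct representation. Here is a sketch under my own assumptions. reverse-adjacency is a hypothetical helper, and it re-walks a shared sub-stack once per path leading to it, so it is only meant for small examples like this one.

```clojure
;; Hypothetical helper: maps each node id (a [label below] pair, or nil for
;; "below the bottom") to the set of node ids sitting directly on top of it.
(defn reverse-adjacency
  [gss]
  (letfn [(walk [adj node]
            (let [[_ below] node
                  id (vec node)] ; plain [label below] vector as identifier
              (if (empty? below)
                ;; a bottom node is reachable from nil
                (update-in adj [nil] (fnil conj #{}) id)
                ;; otherwise register this node above each node below it
                (reduce (fn [adj child]
                          (walk (update-in adj [(vec child)] (fnil conj #{}) id)
                                child))
                        adj
                        below))))]
    (reduce walk {} gss)))

(def direct {7 {3 {1 {0 {}}}, 4 {1 {0 {}}}, 5 {2 {0 {}}}}
             8 {6 {2 {0 {}}}}})

(get (reverse-adjacency direct) [0 {}])
;=> #{[1 {0 {}}] [2 {0 {}}]}
```

Pairing the direct map with this derived map gives the two-part vector shown above.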

Does anyone have other ideas on how to functionally implement such a structure?

(c) 2010 Clojure and me