Clojure and me

A* in Clojure

unsorted — cgrand, 4 September 2010 @ 4 h 22 min

Thanks to Mark Engelberg there’s now a nice priority map in contrib. With such a facility, it becomes really easy to write a nice implementation of the A* algorithm.

(ns net.cgrand.clj-me.a-star
  (:use [clojure.contrib.priority-map :only [priority-map]]))

(defn A*
 "Finds a path between start and goal inside the graph described by edges
  (a map of edge to distance); estimate is an heuristic for the actual
  distance. Accepts a named option: :monotonic (default to true).
  Returns the path if found or nil."
 [edges estimate start goal & {mono :monotonic :or {mono true}}]
  (let [f (memoize #(estimate % goal)) ; unsure the memoization is worthy
        neighbours (reduce (fn [m [a b]] (assoc m a (conj (m a #{}) b)))
                      {} (keys edges))]
    (loop [q (priority-map start (f start))
           preds {}
           shortest {start 0}
           done #{}]
      (when-let [[x hx] (peek q)]
        (if (= goal x)
          (reverse (take-while identity (iterate preds goal)))
          (let [dx (- hx (f x))
                bn (for [n (remove done (neighbours x))
                         :let [hn (+ dx (edges [x n]) (f n))
                               sn (shortest n Double/POSITIVE_INFINITY)]
                         :when (< hn sn)]
                     [n hn])]
            (recur (into (pop q) bn)
              (into preds (for [[n] bn] [n x]))
              (into shortest bn)
              (if mono (conj done x) done))))))))

Example:

(defn euclidian-distance [a b] ; multidimensional
  (Math/sqrt (reduce + (map #(let [c (- %1 %2)] (* c c)) a b))))

;; generate a grid graph whose outlying edges are one-way
(defn grid [x y w h]
  (into {}
    (for [i (range w) j (range h)
          :let [x0 (+ x i) y0 (+ y j) x1 (inc x0) y1 (inc y0)]]
      {[[x0 y0] [x1 y0]] 1
       [[x1 y0] [x1 y1]] 1
       [[x1 y1] [x0 y1]] 1
       [[x0 y1] [x0 y0]] 1})))

(def g (apply dissoc (grid 0 0 4 4) (keys (grid 1 1 2 2))))

(comment 
user=> (A* g euclidian-distance [0 3] [4 2])
([0 3] [1 3] [2 3] [3 3] [3 2] [4 2])
)

Shameless plug: if you are in Europe and want to learn Clojure, don’t forget to register for the Frankfurt course, October 26-28.

Comments (13)

Quick update

unsorted — cgrand, 26 August 2010 @ 0 h 05 min

Long time no post. No code in this one though.
However I’m pleased to announce that Lau and me are going to give another Clojure course: October 26-28 in Frankfurt, beginners welcome. So if you want to learn Clojure, register!

Comments (0)

Primitive types support for fns coming to a Clojure branch near you

unsorted — cgrand, 10 June 2010 @ 8 h 57 min

Rich Hickey is working on primitive support in the prim branch. As of now, one can write:

(defn ^:static fib ^long [^long n]
  (if (>= (long 1) n)
    (long 1)
    (+ (fib (dec n)) (fib (- n (long 2))))))

and computes (fib 38) on my laptop in 650ms where my (or even Rich’s) best attempt took about 2s! If I tweak the above code to use unchecked ops (regular arithmetic ops on primitive types in Clojure throw an exception on overflow, unchecked ones don’t), the performance comes real real close (< 5%) to the equivalent Java code.

What’s this `^:static` thing?

Primitive (double and long only: they subsume any other type) args and return values are only allowed on statics (note that the return type hint is put on the arglist so as to allow different hints for different arities). Statics are still fns in vars but they are backed by a static method and, when called by name, a direct static method call is emitted rather than going through the var and the IFn interface — as such static calls replace direct binding.

(my-static 42) ; direct call
(apply my-static 42 nil) ; regular call through the var

About the syntax: ^:keyword is a new reader shorthand for ^{:keyword true} (and keep in mind that in Clojure 1.2 ^ is the new #^), and metadata are merged: ^:static ^long ^{:doc "docstring"} x is now equivalent to ^{:static true :tag long :doc "docstring"} x.

Interesting times… as usual in the Clojure community. Thanks Rich!

Comments (6)

Primitive types support for fns

unsorted — cgrand, 4 June 2010 @ 10 h 25 min

Follow up See here.

Update: I was wrong and Rich Hickey set me right: I didn’t measure the gains I’m expecting because the inline-expanded form still go through the var lookup. See here (or the comments) for real gains (around 800ms on my laptop).

This is a quick and dirty hack to emulate primitive types support for globally-bound fns:

(defmacro defhintedfn [name args & body]
  (let [iface (gensym "HintedFn")
        hname (vary-meta name assoc :tag iface)
        vname (vary-meta name assoc 
                :inline (fn [& args] `(.invoke ~hname ~@args)))]
    `(do
       (definterface ~iface
         (~(with-meta 'invoke {:tag (:tag name Object)}) ~args))
       (def ~vname nil) ; workaround for a soon-to-be-fixed bug
       (def ~vname
          (reify 
            ~iface
              (invoke [this# ~@args] 
                ~@body) 
            clojure.lang.IFn
              (^Object invoke [this# ~@(map #(vary-meta % dissoc :tag) args)]
                 ~@body))))))

As you can see the trick is to generate a dedicated interface and to inline calls to the defined fn to go through the specific interface rather than through IFn.

Let’s define a dumb numeric fn generating tons of calls: the fibonacci function! Below it comes in three flavors: hinted as int, hinted as long and plain.

(defhintedfn ^int hfib [^int n] (if (>= (int 1) n) (int 1) (+ (hfib (dec n)) (hfib (- n (int 2))))))
(defhintedfn ^long lfib [^long n] (if (>= (long 1) n) (long 1) (+ (lfib (dec n)) (lfib (- n (long 2))))))
(defn fib [n] (if (>= 1 n) 1 (+ (fib (dec n)) (fib (- n 2)))))

And here are the timings:

user=> (time (hfib 38))
"Elapsed time: 2016.128098 msecs"
63245986
user=> (time (lfib 38))
"Elapsed time: 2896.46198 msecs"
63245986
user=> (time (fib 38))
"Elapsed time: 4704.449867 msecs"
63245986

Please note that hinted fns are still regular fns:

user=> (map hfib (range 10))
(1 1 2 3 5 8 13 21 34 55)

Comments (1)

Destructuring, records, protocols and named arguments

optimization,pondering — cgrand, 8 May 2010 @ 12 h 19 min

Warning: this post is full of microbenchmarks so take it with a pinch of salt.

Destructuring a record

Field access through keyword-lookups on records is fast:

user=> (defrecord Foo [a b])
user.Foo
user=> (let [x (Foo. 1 2)] (dotimes [_ 5] (time (dotimes [_ 1e7] (let [a (:a x) b (:b x)] [a b])))))
"Elapsed time: 114.787424 msecs"
"Elapsed time: 102.568273 msecs"
"Elapsed time: 71.150593 msecs"
"Elapsed time: 72.217418 msecs"
"Elapsed time: 70.127489 msecs"
nil

But as far as I know destructure still emits (get x :k), let check:

user=> (let [x (Foo. 1 2)] (dotimes [_ 5] (time (dotimes [_ 1e7] (let [{:keys [a b]} x] [a b])))))
"Elapsed time: 968.616612 msecs"
"Elapsed time: 945.704133 msecs"
"Elapsed time: 911.290751 msecs"
"Elapsed time: 927.658125 msecs"
"Elapsed time: 916.796408 msecs"
nil

Actually it’s slightly slower than lookup on small maps:

user=> (let [x {:a 1 :b 2}] (dotimes [_ 5] (time (dotimes [_ 1e7] (let [{:keys [a b]} x] [a b])))))
"Elapsed time: 866.942377 msecs"
"Elapsed time: 746.273968 msecs"
"Elapsed time: 734.366239 msecs"
"Elapsed time: 729.346188 msecs"
"Elapsed time: 746.96393 msecs"
nil

One patch later

I patched destructure to emit keyword-lookups:

user=> (let [x (Foo. 1 2)] (dotimes [_ 5] (time (dotimes [_ 1e7] (let [{:keys [a b]} x] [a b])))))
"Elapsed time: 479.911064 msecs"
"Elapsed time: 488.895167 msecs"
"Elapsed time: 464.617431 msecs"
"Elapsed time: 480.944575 msecs"
"Elapsed time: 464.969779 msecs"
nil

It’s better but still slower than without destructuring. Let see what slows us:

user=> (macroexpand-1 '(let [{:keys [a b]} x] [a b]))
(let* [map__3838 x map__3838 (if (clojure.core/seq? map__3838) (clojure.core/apply clojure.core/hash-map map__3838) map__3838) b (:b map__3838) a (:a map__3838)] [a b])

My bet is on the if+seq? so I remove it:

user=> (let [x (Foo. 1 2)] (dotimes [_ 5] (time (dotimes [_ 1e7] (let* [map__3838 x map__3838 map__3838 b (:b map__3838) a (:a map__3838)] [a b])))))
"Elapsed time: 125.188397 msecs"
"Elapsed time: 103.041099 msecs"
"Elapsed time: 70.061558 msecs"
"Elapsed time: 70.793984 msecs"
"Elapsed time: 69.759146 msecs"
nil

This if+seq? allows for named arguments. I wonder if this behaviour should be an option of map-destructuring (eg {:keys [a b] :named-args true}). Anyway I had in mind that such a dispatch could be optimized with protocols.

Optimizing dispatch with protocols

user=> (defprotocol Seq (my-seq [this]) (my-seq? [this]))
Seq
user=> (extend-protocol Seq
clojure.lang.ISeq
(my-seq [this] (.seq this))
(my-seq? [this] true)
Object
(my-seq [this] (clojure.lang.RT/seq this))
(my-seq? [this] false))
nil

Let see how my-seq? compares to seq?

user=> (let [x (Foo. 1 2)] (dotimes [_ 5] (time (dotimes [_ 1e7] (let* [map__3838 x map__3838 (if (my-seq? map__3838) (clojure.core/apply clojure.core/hash-map map__3838) map__3838) b (:b map__3838) a (:a map__3838)] [a b])))))
"Elapsed time: 179.282982 msecs"
"Elapsed time: 161.781526 msecs"
"Elapsed time: 154.307042 msecs"
"Elapsed time: 155.567677 msecs"
"Elapsed time: 153.716604 msecs"
nil

Hence my-seq? is 3x faster than seq? which means that protocols are indeed speedy: yet another incredible piece of work by Rich Hickey!

I’m still exploring how low-level protocols fns behave in a concurrent setting, stay tuned!

Meanwhile don’t forget to register to the next European Clojure training session taught by Lau and me.

Comments (0)

European Clojure Training, Brussels, late June

unsorted — cgrand, 14 April 2010 @ 12 h 29 min

Following a recent tweet, I’m setting up a Clojure training session with Lau B. Jensen and we’d like to know more about our potential attendees. Please fill this short survey:

Loading…
(click here if you don’t see the above form.)

Comments (1)

Pipe dreams aren’t necessarily made of promises

unsorted — cgrand, 2 April 2010 @ 14 h 31 min

Because of the spinning nature of atoms, it’s kind of a hack (a fun hack but still a hack) to build queues on it. Here is the same pipe function built on Java queues:

(defn pipe []
  (let [q (java.util.concurrent.LinkedBlockingQueue.)
        EOQ (Object.)
        NIL (Object.)
        s (fn s [] (lazy-seq (let [x (.take q)] 
                               (when-not (= EOQ x) 
                                 (cons (when-not (= NIL x) x) (s))))))]
    [(s) (fn ([] (.put q EOQ)) ([x] (.put q (or x NIL))))]))

Comments (25)

The reasons why Enlive templates return seqs of strings

unsorted — cgrand, 10 February 2010 @ 12 h 03 min

The main reasons behind Enlive templates returning seqs of strings are:

Ring does not expose the output stream (and I think it is a good thing)
I don’t want to allocate the whole response as a single string

Hence it seemed to me the simplest thing to do. (Simpler than implementing InputStream or making templates and Ring to communicate through a kind of reduce-based IO.)

Nodes serializations are cached when the source of the template is read:

user=> (deftemplate hello (-> "<h1>This is a static title<h2>This is a dynamic title" java.io.StringReader. html-resource) [name] [:h2] (content name))
#'user/hello
user=> (hello "Replaced dynamic title")
("<html>" "<body>" "<h1>This is a static title</h1>" "<h2>" "Replaced dynamic title" "</h2>" "</body>" "</html>")
user=> (hello "Replaced dynamic title II")
("<html>" "<body>" "<h1>This is a static title</h1>" "<h2>" "Replaced dynamic title II" "</h2>" "</body>" "</html>")
user=> (map identical? *1 *2)
(true true true true false true true true)

Thus no string allocation except for the dynamic parts but the real raison d’être of the static nodes serializations cache is to reduce the time spent traversing the tree and the number of allocated Cons (since the strings are longer).

Even without this cache there’s no String allocation when serializing a tree (but many Cons instances are allocated):

user=> (let [x {:tag :html :content [{:tag :body :content [{:tag :h1 :content ["This is a static title"]} {:tag :h2 :content ["This is a dynamic title"]}]}]}]
         (every? (partial apply identical?) (map vector (emit* x) (emit* x))))
true

Comments (3)

Clojure refactoring: flattening reduces

mistakes — cgrand, 19 January 2010 @ 12 h 48 min

This morning I wrote some code which looked like:

(reduce (fn [acc x]
          (reduce (fn [acc y]
                    (reduce f acc y)) acc x)) init xs)

(it was slightly more complex with some filtering and destructuring thrown in for good measure).

I wasn’t happy with those nested reduces and it occured to me that I could refactor it to use a single one:

(reduce f init (for [x xs, y x, z y] z))

Now that reads better!

Comments (34)

Graph Structured Stacks in Clojure

pondering — cgrand, 16 January 2010 @ 17 h 41 min

I am currently pondering how best to represent Graph Structured Stacks in Clojure. My first idea was to use maps.

So this GSS would be represented as:

{7 {3 {1 {0 {}}}, 
    4 {1 {0 {}}}, 
    5 {2 {0 {}}}}, 
 8 {6 {2 {0 {}}}}}

Please note that repeated maps in this representation would be shared by construction.

It is all well and good this representation allow me to easily perform stack ops but I want to also be able to traverse the stacks from the bottom. This means to make the graph cyclic so I thought that I need an adjency list or, more realistically, two indexed versions (maps of sets) of this list to be able to traverse the GSS in any direction. Or the map-based representation for direct traversal and an adjency map for reverse traversal.

To me the more boring part with this adjency lists is that I have to name the GSS nodes. Wait! There are natural identifiers for these nodes: their key-value pairs in the map-based representation! [2 {0 {}}] is a natural identifier for node #2 and [2 {0 {}}] the one for node #7.

Now I’m able to represent the GSS as:

[;; direct representation
 {7 {3 {1 {0 {}}}, 
     4 {1 {0 {}}}, 
     5 {2 {0 {}}}}, 
  8 {6 {2 {0 {}}}}}
 ;; adjency map for reverse traversal
 {nil #{[0 {}]},
  [0 {}] #{[1 {0 {}}] 
           [2 {0 {}}]},
  [1 {0 {}}] #{[3 {1 {0 {}}}]
               [4 {1 {0 {}}}]},
  [2 {0 {}}] #{[5 {2 {0 {}}}]
               [6 {2 {0 {}}}]},
  [3 {1 {0 {}}}] #{[7 {3 {1 {0 {}}}, 
                       4 {1 {0 {}}}, 
                       5 {2 {0 {}}}}]},
  [4 {1 {0 {}}}] #{[7 {3 {1 {0 {}}}, 
                       4 {1 {0 {}}}, 
                       5 {2 {0 {}}}}]},
  [5 {2 {0 {}}}] #{[7 {3 {1 {0 {}}}, 
                       4 {1 {0 {}}}, 
                       5 {2 {0 {}}}}]},
  [6 {2 {0 {}}}] #{[8 {6 {2 {0 {}}}}]}}]

Granted the serialization is verbose but all nodes are shared in memory by construction and I find this model rather simple.

Does anyone have other ideas on how to functionally implement such a structure?

Comments (6)

« Previous Page — Next Page »

What’s this ^:static thing?

Destructuring a record

One patch later

Optimizing dispatch with protocols

What’s this `^:static` thing?