An optimization job is never done

optimization — cgrand, 18 November 2009 @ 12 h 26 min

… when you don’t set measurable goals.

Yesterday, in Life of Brian I showed several tricks to make an implementation of Brian’s Brain faster. I tought it was enough but since some disagree here comes another perf boost.

Repetitive lookups

In neighbours-count the first arg (which happens to be a seq of 3 items) is destructured:

(defn neighbours-count [[above current below] i w]

which means that there’s 3 lookups per :off cell while these lookups results are constant for a given row!

So let’s move these lookups in step1:

(defn neighbours-count [above current below i w]
  (let [j (mod (inc i) w)
        k (mod (dec i) w)
        s #(if (= (aget #^objects %1 (int %2)) :on) 1 0)]
    (+ (+ (+ (s above j) (s above i))
         (+ (s above k) (s current j)))
      (+ (+ (s current k) (s below j))
        (+ (s below i) (s below k))))))
 
(defn step1 [[above #^objects current below]]
  (let [w (count current)]
    (loop [i (int (dec w)), #^objects row (aclone current)]
      (if (neg? i)
        row
        (let [c (aget current i)]
          (cond
            (= c :on)
              (recur (dec i) (doto row (aset i :dying)))
            (= c :dying)
              (recur (dec i) (doto row (aset i :off)))
            (= 2 (neighbours-count above current below i w))
              (recur (dec i) (doto row (aset i :on)))
            :else
              (recur (dec i) row)))))))

Benchmark:

user=> (do  (let [b (rand-board 80 80)] (time ((apply comp (repeat 1000 step)) b))) nil)
"Elapsed time: 2058.372014 msecs"
nil
user=> (do  (let [b (rand-board 200 200)] (time ((apply comp (repeat 100 step)) b))) nil)
"Elapsed time: 1105.499303 msecs"
nil

Wow, twice as fast! Again. 2.1ms/step on a 80*80 board, 11.1ms/step on a 200*200 board.

This version on a 200*200 board is then nearly 10x faster than my first one!

11 Comments »

  1. [...] this subsequent post for more optimizations (another 2x improvement). Comments [...]

    Pingback by Clojure and me » Life of Brian — 18 November 2009 @ 12 h 29 min
  2. Hey Christophe.

    It’s official: You’re insane! :)

    Great work on tweaking and tuning – Please let me know once you’ve enhanced the Clojure compiler to do this type of stuff on it’s own!

    Keep blogging, Lau

    Comment by Lau Jensen — 18 November 2009 @ 12 h 58 min
  3. Unable to resolve classname: objects

    Comment by rkrastev — 18 November 2009 @ 17 h 26 min
  4. @rkrastev all my code use the development (master) branch of Clojure.

    Comment by cgrand — 18 November 2009 @ 18 h 41 min
  5. why not unroll neighbours-count using a macro or definline instead of the blob of (+ …)

    Comment by hiredman — 19 November 2009 @ 0 h 18 min
  6. @hiredman: this unrolling is a one-shot (how often is count+for is a bottleneck?) so I thought that introducing a macro would just make the code more complex. But if you have a simple macro for the job I’m eager to see it .

    Comment by cgrand — 19 November 2009 @ 9 h 08 min
  7. Thanks, it works with the newest clojure.jar but the question still stands. What is objects and where we can read more about it? It will be very helpful to learn Clojure if you explain in more details your code.

    Comment by rkrastev — 19 November 2009 @ 10 h 45 min
  8. @rkrastev #^objects is a hinting shortcut for objects arrays. Prior to its addition to master, the only way to hint an objects array was the ugly #^”[Ljava.lang.Object;”. #^objects is similar to shortcuts for hinting primitive arrays right now it’s only documented here. Note that clojure.org documents reflects the last stable release.

    Comment by cgrand — 19 November 2009 @ 11 h 33 min
  9. Hi,

    interestingly this works on my Mac fine with a fresh build from github but gets stuck on the first frame on my AMD based RedHat Linux box and also on a Sun T2000 Sparc box.

    Lau’s originals run fine.

    Anyone else having these problems?

    James.

    Comment by James Swift — 20 November 2009 @ 6 h 27 min
  10. [...] the get-coord function use primitives to get a further performance boost. However, as cgrand says, an optimisation job is never done and for now I believe the performance is “good [...]

  11. I’m not surprised you are interested in this kind of stuff: Conway, Turing, Penrose; mathematical toys for geeky persons. By the way, it’s funny a coincidence I just played yesterday (during office time) with the game of life (I had just told Jean-François Totain about: he had never heard of it).

    PS: didn’t even try to understand your posts! Can’t you use a humanly understandable dialect rather that sigma centauri inhabitants’ native tongue?

    Comment by Olivier — 17 December 2009 @ 15 h 15 min

RSS feed for comments on this post. TrackBack URI

Leave a comment

(c) 2024 Clojure and me | powered by WordPress with Barecity