Taming multidim Arrays

interop — cgrand, 15 October 2009 @ 20 h 04 min

Utilities

(defn array? [x] (-> x class .isArray))
(defn see [x] (if (array? x) (map see x) x))

Creation

Blank and rectangular:

(see (make-array Double/TYPE 3 2))
; ((0.0 0.0) (0.0 0.0) (0.0 0.0))

From a nested collection:

(see (into-array (map (partial into-array Double/TYPE) [[1 2 3 4] [5 6]])))
; ((1.0 2.0 3.0 4.0) (5.0 6.0))

Reading

A multidim array:

(def dd (make-array Double/TYPE 1000 1000))

Naive method:

(time (dotimes [i 1000] (dotimes [j 1000] (aget dd i j))))
; "Elapsed time: 455.514388 msecs"

Step-by-step (allow inlining):

(time (dotimes [i 1000] (dotimes [j 1000] (-> dd (aget i) (aget j)))))
; "Elapsed time: 257.625829 msecs"

(Not so) Hinted step-by-step (#^doubles is ignored because of inline-expansion):

(time (dotimes [i 1000] (dotimes [j 1000] (-> #^objects dd (#^doubles aget i) (aget j)))))
; "Elapsed time: 157.095175 msecs"

Fully hinted step-by-step:

(time (dotimes [i 1000] (dotimes [j 1000] (let [#^doubles a (aget #^objects dd i)] (aget a j)))))
; "Elapsed time: 17.412548 msecs"

Let’s embody this knowledge in a nice macro:

(defmacro deep-aget
  ([hint array idx]
    `(aget ~(vary-meta array assoc :tag hint) ~idx))
  ([hint array idx & idxs]
    `(let [a# (aget ~(vary-meta array assoc :tag 'objects) ~idx)]
       (deep-aget ~hint a# ~@idxs))))

Does it work?

(time (dotimes [i 1000] (dotimes [j 1000] (deep-aget doubles dd i j))))
; "Elapsed time: 16.937279 msecs"

Updating

Naive method:

(time (dotimes [i 1000] (dotimes [j 1000] (aset dd i j 42.0))))
; "Elapsed time: 13859.404896 msecs"

Partially hinted step-by-step:

(time (dotimes [i 1000] (dotimes [j 1000] (-> #^objects dd (aget i) (aset j 42.0)))))
; "Elapsed time: 94.699536 msecs"

Fully hinted step-by-step (take 1):

(time (dotimes [i 1000] (dotimes [j 1000] (let [#^doubles a (aget #^objects dd i)] (aset a j 42.0)))))
; java.lang.IllegalArgumentException: More than one matching method found: aset

The compiler doesn’t know which aset to pick between (Object[], int, Object) and (double[], int, double) according to the hints (double[], int, Object). That’s why one needs to hint 42.0:

(time (dotimes [i 1000] (dotimes [j 1000] (let [#^doubles a (aget #^objects dd i)] (aset a j (double 42.0))))))
; "Elapsed time: 19.561983 msecs"

A nice macro to abstract away:

(defmacro deep-aset [hint array & idxsv]
  (let [hints '{doubles double ints int} ; writing a comprehensive map is left as an exercise to the reader
        [v idx & sxdi] (reverse idxsv)
        idxs (reverse sxdi)
        v (if-let [h (hints hint)] (list h v) v)
        nested-array (if (seq idxs)
                       `(deep-aget ~'objects ~array ~@idxs)
                        array)
        a-sym (with-meta (gensym "a") {:tag hint})]
      `(let [~a-sym ~nested-array]
         (aset ~a-sym ~idx ~v))))

Does it work?

(time (dotimes [i 1000] (dotimes [j 1000] (deep-aset doubles dd i j 42.0))))
; "Elapsed time: 19.439133 msecs"

(All tests run on jdk 1.6.0_14 32bit, options -server -XX:+DoEscapeAnalysis on a processor with a 6MB L2 cache.)

15 Comments »

  1. [...] tried to tap directly into Javas Arrays following the great performance boosting tips found: here. But it still gave me about 200 msecs on an 80×80 board. I then tried to reduce it to a [...]

  2. [...] Taming multidim Arrays [...]

  3. [...] Clojure and me » Taming multidim Arrays (tags: clojure array matrix optimization) [...]

    Pingback by links for 2010-03-27 « Blarney Fellow — 28 March 2010 @ 3 h 42 min
  4. [...] as Christophe Grand points out in his fantastic blogpost on the matter, the Clojure compiler doesn’t support this type of reflection warning… yet. [...]

  5. Out of interest, could you explain why the “Step-by-step (allow inlining)” reading approach twice as fast as the naive approach?

    Comment by Sam Aaron — 30 March 2010 @ 18 h 11 min
  6. @Sam Aaron: because the Clojure compiler inlines aget only for two args.
    Plus I fixed typehints propagation so, in 1.1, (-> #^objects dd #^doubles (aget i) (aget j)) simply works.

    Comment by cgrand — 30 March 2010 @ 18 h 37 min
  7. Hi Christopher,
    I have a defrecord defined as follows ..

    (defrecord Complex [^double i ^double r])

    and I tried your deep-aset in the following way

    (deep-aset Complex array-name i j (Complex. 10 20))

    I didn’t see a significant improvement in speed like I saw in doubles when compared to the naive usage of aset… Is this expected?

    Sunil.

    Comment by Sunil Nandihalli — 28 December 2010 @ 12 h 23 min
  8. @Sunil Try (deep-aset objects array-name i j (Complex. 10 20)). The hint passed to deep-aset is on the most deeply nested array — that is an array of Complex instances, hence an array of objects. To be sure, can you call the class fn on a most deeply nested array? Or paste (eg on gist) more code to give me more context.

    Comment by cgrand — 1 January 2011 @ 20 h 37 min
  9. Hi Christophe,

    This article was very helpful for me. Thank you for this.

    – John

    Comment by John — 4 July 2013 @ 4 h 12 min
  10. Hi,

    I was experimenting with your code. It is very insightful. I have noticed that :

    (time (dotimes [i 1000] (dotimes [j 1000] (let [#^doubles a (aget #^objects dd i)] (aget a j)))))

    can be expressed as:

    (time (dotimes [i 1000] (dotimes [j 1000] (aget #^doubles (aget #^objects dd i) j))))

    and it has the same performance characteristics. So I presume the hinting in the second example works as intended. What I don’t fully understand is what ^doubles actually hints? Since ^doubles macro is a reader macro it looks to me as it hints not the result of evaluating (aget #^objects dd i) but it hints a form which is ‘ (aget #^objects dd i) [ using another #=(type …) reader-macor one can verify that the type is not [D but clojure.lang.PersistentList ). Anyway, since we are hinting a list which at runtime will be evaluated to array how the outer aget can make use of the hint? What am I missing here?

    Comment by Daniel — 16 September 2013 @ 16 h 03 min
  11. @Daniel this post is 4-year old, At the time your re-expression wouldn’t have run as fast.

    ^doubles gives a static hint (all hints are static) on the type of the evaluation result. It’s not the outer aget which is the consumer of the type hint but the compiler. aget has an inline replacement (kind of macro expansion) when it is used in function position. This expansion is to a Java static methods and the right overload is picked based on the type hint on the array argument.

    I hope this makes things clearer to you.

    Comment by cgrand — 16 September 2013 @ 16 h 11 min
  12. a small followup:

    Why

    (time (dotimes [i 1000] (dotimes [j 1000] (aset ^doubles (aget ^objects dd i) j (double 42.0)))))

    takes 9 msec to execute, but

    (time (dotimes [i 1000] (dotimes [j 1000] (aset ^doubles (aget ^objects (make-array Double/TYPE 1000 1000) i) j (double 42.0)))))

    takes 874000 msec to execute?

    Comment by Daniel — 16 September 2013 @ 16 h 36 min
  13. HI Christophe,

    I know the post is old but is still up-to date. There are really no performance difference between:

    user=> (time (dotimes [i 1000] (dotimes [j 1000] (let [^doubles a (aget ^objects dd i)] (aget a j)))))
    “Elapsed time: 9.476 msecs”

    and

    user=> (time (dotimes [i 1000] (dotimes [j 1000] (aget ^doubles (aget ^objects dd i) j))))
    “Elapsed time: 9.438 msecs”

    So the hinting must be identical.

    Comment by Daniel — 16 September 2013 @ 16 h 42 min
  14. @Daniel in you exmpel with make-array, you arae allocating 1 milion times 1000 arrays of 1000 doubles.

    In my previous comment I just tried to explain why your hinting is valid now but wanst’ years ago (and required to introduce a let).

    Comment by cgrand — 16 September 2013 @ 16 h 53 min
  15. ahh, ofcourse ;)

    Comment by Daniel — 16 September 2013 @ 17 h 33 min

RSS feed for comments on this post. TrackBack URI

Leave a comment

(c) 2018 Clojure and me | powered by WordPress with Barecity