Taming multidim Arrays
Utilities
(defn array? [x] (-> x class .isArray)) (defn see [x] (if (array? x) (map see x) x))
Creation
Blank and rectangular:
(see (make-array Double/TYPE 3 2)) ; ((0.0 0.0) (0.0 0.0) (0.0 0.0))
From a nested collection:
(see (into-array (map (partial into-array Double/TYPE) [[1 2 3 4] [5 6]]))) ; ((1.0 2.0 3.0 4.0) (5.0 6.0))
Reading
A multidim array:
(def dd (make-array Double/TYPE 1000 1000))
Naive method:
(time (dotimes [i 1000] (dotimes [j 1000] (aget dd i j)))) ; "Elapsed time: 455.514388 msecs"
Step-by-step (allow inlining):
(time (dotimes [i 1000] (dotimes [j 1000] (-> dd (aget i) (aget j))))) ; "Elapsed time: 257.625829 msecs"
(Not so) Hinted step-by-step (#^doubles
is ignored because of inline-expansion):
(time (dotimes [i 1000] (dotimes [j 1000] (-> #^objects dd (#^doubles aget i) (aget j))))) ; "Elapsed time: 157.095175 msecs"
Fully hinted step-by-step:
(time (dotimes [i 1000] (dotimes [j 1000] (let [#^doubles a (aget #^objects dd i)] (aget a j))))) ; "Elapsed time: 17.412548 msecs"
Let’s embody this knowledge in a nice macro:
(defmacro deep-aget ([hint array idx] `(aget ~(vary-meta array assoc :tag hint) ~idx)) ([hint array idx & idxs] `(let [a# (aget ~(vary-meta array assoc :tag 'objects) ~idx)] (deep-aget ~hint a# ~@idxs))))
Does it work?
(time (dotimes [i 1000] (dotimes [j 1000] (deep-aget doubles dd i j)))) ; "Elapsed time: 16.937279 msecs"
Updating
Naive method:
(time (dotimes [i 1000] (dotimes [j 1000] (aset dd i j 42.0)))) ; "Elapsed time: 13859.404896 msecs"
Partially hinted step-by-step:
(time (dotimes [i 1000] (dotimes [j 1000] (-> #^objects dd (aget i) (aset j 42.0))))) ; "Elapsed time: 94.699536 msecs"
Fully hinted step-by-step (take 1):
(time (dotimes [i 1000] (dotimes [j 1000] (let [#^doubles a (aget #^objects dd i)] (aset a j 42.0))))) ; java.lang.IllegalArgumentException: More than one matching method found: aset
The compiler doesn’t know which aset to pick between (Object[], int, Object) and (double[], int, double) according to the hints (double[], int, Object). That’s why one needs to hint 42.0:
(time (dotimes [i 1000] (dotimes [j 1000] (let [#^doubles a (aget #^objects dd i)] (aset a j (double 42.0)))))) ; "Elapsed time: 19.561983 msecs"
A nice macro to abstract away:
(defmacro deep-aset [hint array & idxsv] (let [hints '{doubles double ints int} ; writing a comprehensive map is left as an exercise to the reader [v idx & sxdi] (reverse idxsv) idxs (reverse sxdi) v (if-let [h (hints hint)] (list h v) v) nested-array (if (seq idxs) `(deep-aget ~'objects ~array ~@idxs) array) a-sym (with-meta (gensym "a") {:tag hint})] `(let [~a-sym ~nested-array] (aset ~a-sym ~idx ~v))))
Does it work?
(time (dotimes [i 1000] (dotimes [j 1000] (deep-aset doubles dd i j 42.0)))) ; "Elapsed time: 19.439133 msecs"
(All tests run on jdk 1.6.0_14 32bit, options -server -XX:+DoEscapeAnalysis
on a processor with a 6MB L2 cache.)
[...] tried to tap directly into Javas Arrays following the great performance boosting tips found: here. But it still gave me about 200 msecs on an 80×80 board. I then tried to reduce it to a [...]
[...] Taming multidim Arrays [...]
[...] Clojure and me » Taming multidim Arrays (tags: clojure array matrix optimization) [...]
[...] as Christophe Grand points out in his fantastic blogpost on the matter, the Clojure compiler doesn’t support this type of reflection warning… yet. [...]
Out of interest, could you explain why the “Step-by-step (allow inlining)” reading approach twice as fast as the naive approach?
@Sam Aaron: because the Clojure compiler inlines aget only for two args.
Plus I fixed typehints propagation so, in 1.1, (-> #^objects dd #^doubles (aget i) (aget j)) simply works.
Hi Christopher,
I have a defrecord defined as follows ..
(defrecord Complex [^double i ^double r])
and I tried your deep-aset in the following way
(deep-aset Complex array-name i j (Complex. 10 20))
I didn’t see a significant improvement in speed like I saw in doubles when compared to the naive usage of aset… Is this expected?
Sunil.
@Sunil Try (deep-aset objects array-name i j (Complex. 10 20)). The hint passed to deep-aset is on the most deeply nested array — that is an array of Complex instances, hence an array of objects. To be sure, can you call the class fn on a most deeply nested array? Or paste (eg on gist) more code to give me more context.
Hi Christophe,
This article was very helpful for me. Thank you for this.
– John
Hi,
I was experimenting with your code. It is very insightful. I have noticed that :
(time (dotimes [i 1000] (dotimes [j 1000] (let [#^doubles a (aget #^objects dd i)] (aget a j)))))
can be expressed as:
(time (dotimes [i 1000] (dotimes [j 1000] (aget #^doubles (aget #^objects dd i) j))))
and it has the same performance characteristics. So I presume the hinting in the second example works as intended. What I don’t fully understand is what ^doubles actually hints? Since ^doubles macro is a reader macro it looks to me as it hints not the result of evaluating (aget #^objects dd i) but it hints a form which is ‘ (aget #^objects dd i) [ using another #=(type …) reader-macor one can verify that the type is not [D but clojure.lang.PersistentList ). Anyway, since we are hinting a list which at runtime will be evaluated to array how the outer aget can make use of the hint? What am I missing here?
@Daniel this post is 4-year old, At the time your re-expression wouldn’t have run as fast.
^doubles gives a static hint (all hints are static) on the type of the evaluation result. It’s not the outer aget which is the consumer of the type hint but the compiler. aget has an inline replacement (kind of macro expansion) when it is used in function position. This expansion is to a Java static methods and the right overload is picked based on the type hint on the array argument.
I hope this makes things clearer to you.
a small followup:
Why
(time (dotimes [i 1000] (dotimes [j 1000] (aset ^doubles (aget ^objects dd i) j (double 42.0)))))
takes 9 msec to execute, but
(time (dotimes [i 1000] (dotimes [j 1000] (aset ^doubles (aget ^objects (make-array Double/TYPE 1000 1000) i) j (double 42.0)))))
takes 874000 msec to execute?
HI Christophe,
I know the post is old but is still up-to date. There are really no performance difference between:
user=> (time (dotimes [i 1000] (dotimes [j 1000] (let [^doubles a (aget ^objects dd i)] (aget a j)))))
“Elapsed time: 9.476 msecs”
and
user=> (time (dotimes [i 1000] (dotimes [j 1000] (aget ^doubles (aget ^objects dd i) j))))
“Elapsed time: 9.438 msecs”
So the hinting must be identical.
@Daniel in you exmpel with make-array, you arae allocating 1 milion times 1000 arrays of 1000 doubles.
In my previous comment I just tried to explain why your hinting is valid now but wanst’ years ago (and required to introduce a let).
ahh, ofcourse ;)
Today, even though technological advancements have allowed humans beings to live comfortable lives, human beings still live in a very chaotic world….
anxiety disorder meaning in kannada
Motlow, george dickel, manchester, bonnaroo, coffee county, winchester, monteagle, tims ford, beechcraft, lynchburg, a.e.d.c., sign dept. anoxia villosa