Valued functions
Functions are black boxes and determining when two functions are equivalent is undecidable. That’s why:(= (fn [x] x) (fn [x] x))
is false (a clever compiler could sove this naive specific case but not all).
However from time to time I’d like functions constructed the same way be considered equals. For example (= (comp inc inc) (comp inc inc))
could be true and it wouldn’t necessitate a smart enough compiler or memoization.
It’s not without precedent: keywords, symbols, maps, sets and vectors are functions and also have value semantics: (= (hash-set 1) (hash-set 1))
is true!
In indexed-set I use maps whose values are functions, sometimes functions returned by higher-order functions and the user having no reference to the function returned by the hof can’t look it up in the map — if she calls the hof agin with the same arguments she gets another function which is not equal to the first one. This means that I should always let the user call the hof by herself and keep the result around. It may be ok for a low-level API but not for more user-friendly functions.
Helpfully, records have value semantics, can implement Java interfaces and do not already implement clojure.lang.IFn
and that’s what I have done:
(defrecord Op-By [key f init] clojure.lang.IFn (invoke [this m v] (let [k (key v)] (assoc m k (f (m k init) v))))) (defn- op-by [key f init] (Op-By. key f init))
instead of
(defn- op-by [key f init] (fn [m v]) (let [k (key v)] (assoc m k (f (m k init) v))))
I’m still not happy with that I’d like something more fluid (closer to fn, maybe built on reify but ideally a metadata flag on fn
e.g. (^:valued fn [m v] ...)
) ; I don’t think the type name is valueable, I’m on the fence concerning features I get for free from defrecord
(namely conj
, assoc
and dissoc
).
Hi Christophe,
interesting post. The question what a function is and where data ends and where code starts pops up in a couple of places. It reminded me of an issue I stumbled upon but never had the time to really dig into – how do you serialize and deserialize a sorted map/set that has a comparator?
Is a comparator just compiled code? Then it’s hard to (de)serialize.
Or should you store the source “AST” alongside the function as data (not that big an issue in a LISP)? Then you could (de)serialize it, and you could also compare those functions easily – what your post above is actually about.
Sometimes compilation feels a bit like ‘premature optimization’…
Otoh there’d probably be a lot of issues to tackle, like what subset of clojure would you allow for these functions (at least if you serialize and perhaps store the function source code, you wouldn’t want to allow a lot, think of different clojure versions with different sets of functions etc), although maybe that’s not an issue if you do everything in-process, rather than serializing out and deserializing back into another process.
Another issue would be people passing in closures – can you ensure that doesn’t happen? Or maybe that’s actually not that hard, like running (eval) on the function with some dummy data and see what happens…
P.S. Minor issue, but there seems to be an HTML escaping issue with “emdash” in your post.
Hi Eugen,
While writing this post I was only thinking “in-process”, serialization was out of scope.
To me, the problem with generalized serialization as a *data* serialization mechanism is to know when to stop to not serialize half the environment and to understand what are the requirements you implicitly put on the consumer.
I think serialization is a complex term which covers multiple distinct needs. I tend to have a polarized view on it:
* either you want to serialize data for consumption by another application (including a future new version of the original application or the same application in a different environement etc.) in which case I lean towards a greatest common subset which is easy to map to json, sql tables or XML. Sorted collection for example have no place in it. The consuming application has to know, as part of the format, this — the knowledge can be a library.
* either you want to checkpoint the whole application state and external os-level tools are a better fit for that.
PS: fixed, thanks.
In addition to all specified above, I can think of many uses of keeping the ‘non compiled’ version of a function.
As one example, it can be very useful to be able to query a function for its closure, understanding in run time all the inputs affect its output.
I would find it useful when implementing memoization strategies that depend on a function, its inputs, and its closure. Today I confine myself to memoizing closureless functions only which is usually ok, but some cases really would benefit from partially binding the first arguments of a function.
for example:
(let [a 5
b 5
fa (partial + a)
fb (partial + b)
c (fa 7)
d (fb 7)]
(= c d))
fa and fb are equal, but there is no runtime mechanism that gives you the tools to start trying to understand that.