Extensible macros (A flatter cond follow-up)

unsorted — cgrand, 21 September 2012 @ 8 h 51 min

I pushed Utils out. It features reduce-by but also variations on cond (the ones introduced in the comments of A Flatter Cond) and if-let.

cond supports the following operators: :let, :when, :when-let and vectors in test position are treated as the binding form to a if-let (thus setting the local bindings) for the “then” expression not for the rest of the cond.

if-let supports :let to introduce new bindings without testing their expression for truth.

One notable twist to these macros is that thanks to net.cgrand.xmacros they are extensible: new operators can safely be participated to the macros (proper namespacing is enforced). See the cond implementation, abbreviated below:

(defmacro cond
 "An extensible variation on cond. Trailing else is supported.
keywords in test position are treated as operators.

Operators impls signature is [rhs else] where rhs is the \"then\"
expression next to the operator."
  [& clauses]
  (when-let [[op rhs & more-clauses] (seq clauses)]
    (if (next clauses)
      (x/expand-op op rhs `(cond ~@more-clauses))

(x/defdefault-op cond [test-expr then else]
  (if (vector? test-expr)
    `(if-let ~test-expr ~then ~else)
    `(if ~test-expr ~then ~else)))

(x/defop cond :let
  "Introduces local bindings."
  [bindings cont]
  `(let ~bindings ~cont))

(x/defop cond :when
  "Short-circuits the rest of the cond if false"
  [test-expr cont]
  `(when ~test-expr ~cont))

(x/defop cond :when-let
 "Short-circuits the rest of the cond if false and introduces local
  [bindings cont]
  `(when-let ~bindings ~cont))

Follow-up: A world in a ref

unsorted — cgrand, @ 8 h 35 min

I recently published Megaref a refined (and working) version of the idea described in A World in a Ref.

New and notesworthy:

  • actual working code,
  • the concurrency (number of guards) can be tuned at runtime,
  • tuning happens through an uniform interface (get-options, set-option!),
  • subrefs (and backward compatible replacements for usual STM functions) allow to easily migrate “polyref” code to megaref (see the conversion of Rich Hickey’s ant demo — mind the typo in the commit message).

Feedback welcome!

Fair conjunction: status report

unsorted — cgrand, 6 April 2012 @ 15 h 54 min
Shameless plugs:
Clojure Programming is out! Available now as an ebook and in a few days in paper!
Come to my Clojure training in Geneva it will be in a mixed language (French/English) setting — for an English-only training stay tuned!

Making conjunction fair is the reason why I messed up with core.logic, and this post document the approach I explore in the fair-conj2 branch.

In The Reasoned Scheduler I said that conjunction is easy: you wait for the first stream to provide a answer that you pass to the next goal. It looks easy but it suffers from being ordered: when the first goal diverges (produces an infinite stream of results or never halts) and the next goal is known to fail, the conjuction diverges too! This is unfair conjunction.

Changing the order of the goals would fix this particular example but like noted by William Byrd in Relational Programming in miniKanren: Techniques, Applications, and Implementations (page 53 and 54):

However, reordering goals has its disadvantages. For many programs, no ordering of goals will result in finite failure (see the remaining example in this chapter). Also, by committing to a certain goal ordering we are giving up on the declarative nature of relational programming: we are specifying how the program computes, rather than only what it computes. For these reasons, we should consider alternative solutions.

[T]he goal ordering we picked might diverge when we ask for a second answer, while the other ordering may fail finitely after producing a single answer.

So, to make conjunction fair (commutative), one has to not commit to an ordering of goal, which means that both goals must be executed in parallel (interleaved execution or even true parallelism). As soon as one of the goal produces an answer, this substitution has to be communicated to the other threadgoal so as to narrow its search space.

As a consequence of the disappearance of ordering, the distinction between a goal and a stream disappears too. I introduced a Search abstraction. A Search may yield a substitution (or nil)
and progress with step. It sounds awfully familiar with first and next.

Let’s name plus, join and narrow respectively the disjunction, conjunction and narrowing operators. Conjunction is distributive over disjunction as usual and narrowing is distributive over both.

Here is how the key expression is derived:

(join a b)
(join (plus (yield a) (step a)) b) ; decompose a 
(plus (join (yield a) b) (join (step a) b)) ; distribute join
(plus (narrow b (yield a)) (join (step a) b)) ; joining with a substitution (or nil) is narrowing
(plus (narrow b (yield a)) (join b (step a))) ; commute the operands of join for interleaving

This expression can be found nearly as-is in the code.

Narrowing is performed lazily: when one steps a suspended narrowing (a Narrow instance), the narrowing is propagated (thanks to its distributivity) to the children searches of its argument.

To make narrowing efficient, I introduced the min-yield function which, given a Search, returns the minimal substitution yielded by this search. That is: if this Search ever yield results they will be extensions of the min-yield substitution. This allows to discard a Search in its entirety when the narrowing substitution and the min-yield substitutions are not compatible!

To compute min-yield substitutions, the narrowing process leaves nodes (Scope instances) in the computation tree labeled with the narrowing substitution. By default min-yield returns empty-s the empty substitution.

This is the gist of my approach to fair conjunction. In a subsequent post, I’ll cover heuristics I use to prioritize execution of Searches.

Practically, what does it mean? It means that logic programs are easier to write, it also means that, at the moment, a program written with unfair conjunction in mind run slower with fair conjunction. Interestingly, under fair conjunction, naive pluso (p. 63) seems refutationally complete.

I intend to explore how constraints can be added to this branch and how it will impact (positively) performance.

If you feel like dabbling with this branch, I’ll be more than happy to hear your feedback. Thanks!

A poor man’s interval tree

unsorted — cgrand, 16 March 2012 @ 22 h 49 min

Sorted collections can be put to good (ab)use, here I define interval-lt a partial order on intervals and points — where an interval is a vector [from to] (from inclusive, to exclusive; they can be nil to denote infinity) and a point is [n n].

(defn interval-lt
  [[a b] [c d]]
  (boolean (and b c 
                (if (= a b)
                  (neg? (compare b c))
                  (<= (compare b c) 0)))))

Using this partial order and a sorted-map, I can build an interval tree:

(def empty-interval-map 
  (sorted-map-by interval-lt [nil nil] #{}))

(defn- isplit-at [interval-map x]
  (if x
    (let [[[a b :as k] vs] (find interval-map [x x])]
      (if (or (= a x) (= b x))
        (-> interval-map (dissoc k) (assoc [a x] vs [x b] vs))))

(defn- ialter [interval-map from to f & args]
  (let [interval-map (-> interval-map (isplit-at from) (isplit-at to))
        kvs (for [[r vs] 
                    (and from to)
                    (subseq interval-map >= [from from] < [to to])
                    (subseq interval-map >= [from from])
                    (subseq interval-map < [to to])
              [r (apply f vs args)])]
    (into interval-map kvs)))

(defn iassoc [interval-map from to v]
  (ialter interval-map from to conj v))

(defn idissoc [interval-map from to v]
  (ialter interval-map from to disj v))

(defn iget [interval-map x]
  (get interval-map [x x]))

Demo, let’s state who’s present and at which time:

(-> empty-interval-map
  (iassoc 9 17 "Ethel")
  (iassoc 7 12 "Lucy")
  (iassoc 11 20 "Fred"))
; {[nil 7] #{}, [7 9] #{"Lucy"}, 
;  [9 11] #{"Ethel" "Lucy"}, 
;  [11 12] #{"Ethel" "Fred" "Lucy"}, 
;  [12 17] #{"Ethel" "Fred"}, 
;  [17 20] #{"Fred"}, 
;  [29 nil] #{}}

And now one can query by time:

(iget *1 8)
; #{"Lucy"}
(iget *2 15)
; #{"Ethel" "Fred"}
(iget *3 20)
; #{}

Sure, this is not the best interval tree implementation ever, it suffers from severe fragmentation and not ideal update complexity. However if what only matters is lookup, it works nicely in O(log n) — each iassoc adds at most two entries to the map so for n intervals you have at most 1+2n entries, lookup is O(log (2n+1)), that is O(log n).

The Reasoned Scheduler

unsorted — cgrand, 30 January 2012 @ 19 h 28 min

tl;dr How I realized that core.logic (or miniKanren) is essentially a scheduler.

Some times ago I wanted to experiment with core.logic’s codebase and the best way to get intimate with a codebase is to mess with it. Doing so I repeatedly failed but I had several epiphanies on the essence of core.logic (or miniKanren) implementation. Epiphanies that I’m going to share here.

At first glance, when I looked at this codebase (codebase which started as a faithful port from Scheme), I noticed a lazy stream implementation, a substitutions type and a monadic structure. A lazy stream impl? in Clojure? Let me change this to use a good old lazy sequence!

Before diving in, let me explain what core.logic does: given a problem specification (a program) and some unknowns it tries to solve the equation program. One executes a program with run or run* which returns a lazy sequence of values for the specified unknown. So the magic happens inside run.

run calls the program — because the program is just a function — which returns a lazy stream of substitutions maps. A subsitutions map is a map from unknowns to their values, values which may involve other unknowns. A program is made of goals — functions of one substitutions map to a lazy stream of substitutions maps. A program is kickstarted with an empty subsitutions map.

The crux of miniKanren or core.logic is then how to combine goals and lazy streams. The two basic combinations are the disjunction (conde) and the conjunction (fresh, all and the implicit conjunction inside each conde clause).

Conjunction is easy: it takes a lazy stream and a goal, if the lazy-stream has come to an end then the conjunction too, in the other case, take the next substitutions map returned by the stream, pass it to the goal, you get another stream that you concatenate to the sream returned by the conjunction of the beheaded stream and the goal — in truth concatenation happens through disjunction.

Disjunction takes two lazy streams and returns another lazy stream but is a bit tricky to get right. The naive approach is to simply concatenate the two streams but this is problematic: if the first stream is infinite, results contributed by the second stream are never returned — but at least some results keep being returend. The problem gets even worse if the first stream does not terminate (because its search space is infinite and all solutions has already been found): it simply blocks the evaluation of the second stream. The right approach is to interleave the two streams so as to consume them concurrently.

So I happily proceeded to the conversion of all core.logic’s lazy stream code to lazy sequences… and it failed on some tests; the cause of this failure led to my first ahah moment.

The reason is to be found in the interleaving: if at some point one of the sequences diverges (a call to seq never returns) then one can’t call first on it and interaleaves its items with those of the other sequence!

How did the interleaving manage to work for lazy streams? Streams are either nil, a thunk or an instance of Choice (the lazy stream pendant of Cons). When you call a thunk to force the lazy stream you may get another thunk, which means that you don’t have realized the stream but you have advanced towards its realization. So to really force the stream you have to trampoline until a Choice or nil is returned. On the other hand, lazy sequences, when forced, always return nil or a Cons because the trampolining happens inside LazySeq.

Hence miniKanren lazy streams give more control on their evaluation to the user and that’s why they can’t be replaced by Clojure lazy sequences! This means that mplus (the interleaving function) is all about scheduling evaluation of the two streams. This realization struck me: at its heart core.logic business is process management, not about streams mangling!

A disjunction for example can be seen a process forking two child processes and directly passing children results to its own parent process. One can certainly even make a toy implementation on top of agents…

I said that conjunction was easy a few paragraphs ago, it’s a lie but it may be the subject of a subsequent post.

Valued functions

unsorted — cgrand, 17 October 2011 @ 14 h 55 min

Functions are black boxes and determining when two functions are equivalent is undecidable. That’s why:(= (fn [x] x) (fn [x] x)) is false (a clever compiler could sove this naive specific case but not all).

However from time to time I’d like functions constructed the same way be considered equals. For example (= (comp inc inc) (comp inc inc)) could be true and it wouldn’t necessitate a smart enough compiler or memoization.

It’s not without precedent: keywords, symbols, maps, sets and vectors are functions and also have value semantics: (= (hash-set 1) (hash-set 1)) is true!

In indexed-set I use maps whose values are functions, sometimes functions returned by higher-order functions and the user having no reference to the function returned by the hof can’t look it up in the map — if she calls the hof agin with the same arguments she gets another function which is not equal to the first one. This means that I should always let the user call the hof by herself and keep the result around. It may be ok for a low-level API but not for more user-friendly functions.

Helpfully, records have value semantics, can implement Java interfaces and do not already implement clojure.lang.IFn and that’s what I have done:

(defrecord Op-By [key f init]
  (invoke [this m v]
    (let [k (key v)]
      (assoc m k (f (m k init) v)))))

(defn- op-by [key f init] (Op-By. key f init))

instead of

(defn- op-by [key f init] 
  (fn [m v])
    (let [k (key v)]
      (assoc m k (f (m k init) v))))

I’m still not happy with that I’d like something more fluid (closer to fn, maybe built on reify but ideally a metadata flag on fn e.g. (^:valued fn [m v] ...)) ; I don’t think the type name is valueable, I’m on the fence concerning features I get for free from defrecord (namely conj, assoc and dissoc).

A world in a ref

unsorted — cgrand, 6 October 2011 @ 13 h 00 min

At times I struggle deciding on the granularity I should give to my refs. If I put a big map in a single ref (what I call a megaref), updates to unrelated parts of the map may conflict and cause transactions to retry, but it’s dead easy to snapshot the whole state. If I use a ref for every entry of the map, concurrency is excellent again but snapshotting the whole state may be tricky (you need to tune the ref to have a history long enough) when there is a large number of rapidly changing refs or when the system is slow/loaded.

What I’d like is an alter-in function, a mix of alter and update-in, with the following guarantee: two alter-ins conflict when their paths are either equal or prefix from one another.

This is not something new: this problem bugs me since 2008 I think. I had several (failed/unfinished and private) attempts to patch the STM to accommodate for a new reference type.

Some weeks ago I realized that I didn’t need to hack the STM to create such a new reference type.

The basic idea behind all my attempts was to use lock-striping. What I didn’t realize is that I can leverage the existing STM by using ref-striping!

Let’s say the whole map is stored in a single ref, the root ref and you have N guard refs (whose value is nil). When I want to alter the value for a given key, I compute its hash modulo N which gives me an index into the guards vector. I ref-set the corresponding guard ref (to nil, the actual value doesn’t matter) thus claiming exclusive commit rights for this key. Once that done I simply commute on the root ref being sure that the operation will only commute with non-conflicting other ops.

NB: An alternative design is to simply have N roots and no guards and to merge the N maps on deref — efficient merging would need a didcated map implementation. However it doesn’t help much with nested maps but it helps a lot with concurrency since in the other design everything is serialized on the single root. I think an hybrid solution (N roots, each roots having M guards) woud bring an interesting trade-off between concurrency, number of retries and ease of snapshotting.

To support nested maps, instead of a single key, one has to deal with a path, eg [:a :b :c]. Here the idea is to ensure guards for path prefixes ([:a] and [:a :b]) and to ref-set the guard for the full-path as before.

;; except ensure-in and alter-in, the others are here because
;; they can be optimized in a mega ref wth several roots
(defprotocol AssociativeRef
  (-alter-in [aref ks f args])
  (-commute-in [aref ks f args])
  (ref-set-in [aref ks v])
  (ensure-in [aref ks])
  (deref-in [aref ks]))
(defn alter-in [aref ks f & args]
  (-alter-in aref ks f args))

(defn commute-in [aref ks f & args]
  (-commute-in aref ks f args))

(defn- ensure-path [guard-for path]
  (loop [path path]
    (when-not (= [] path)
      (ensure (guard-for path))
      (recur (pop path)))))

(defn- guards-fn [guards]
  (let [n (count guards)]
    #(nth guards (mod (hash %) n))))

(deftype MegaRef [r guards]
  (-alter-in [this ks f args]
    (if (seq ks)
      (let [guard-for (guards-fn guards)]
        (ensure-path guard-for (pop (vec ks)))
        (ref-set (guard-for ks) nil)
        (let [v (apply f (get-in @r ks) args)]
          ; v is precomputed to not evaluate it twice because of commute
          (commute r assoc-in ks v)))
      (throw (IllegalArgumentException. "ks must not be empty."))))
  (-commute-in [this ks f args]
    (apply commute r update-in ks f args))
  (ref-set-in [this ks v]
    (-alter-in this ks (constantly v) nil))
  (ensure-in [this ks]
    (ensure-path  (vec ks) (guards-fn guards))
    (deref-in this ks))
  (deref-in [this ks]
    (get-in @r ks))
  (deref [this]

(defn megaref [entries & options]
  (let [guards (:guards options 16)
        root-options (select-keys options [:validator :min-history :max-history])]
    (MegaRef. (apply ref (into {} entries) root-options)
              (vec (repeatedly guards #(ref nil))))))

NB: Write skew anomalies can now occur at the “sub ref” level: when you deref-in a path unrelated to the ones you update; ensure-in is the answer.

So, some questions:

  • Is this of interest to anyone except me? If yes, once mature (eg multiroot) should it belong in contrib?
  • Is my implementation sound? Do I rely on implementation details?
  • Should the STM expose a “token” facility so that I don’t have to use refs as guards?

Conway’s Game of Life

unsorted — cgrand, 19 August 2011 @ 20 h 01 min

APL is famous for having a 1-liner for Conway’s game of life.
Being very efficient at implementing a matrix-based solution of Conway’s game of life should come to no suprise from an array-oriented language.

The way you model data determines your code. Clojure encourages what I call relational-oriented programming. That is modeling with sets, natural identifiers (thanks to composite values) and maps-as-indexes.

If you pick the right representation for the state of the board, you end up with a succinct implementation:

(defn neighbours [[x y]]
  (for [dx [-1 0 1] dy (if (zero? dx) [-1 1] [-1 0 1])] 
    [(+ dx x) (+ dy y)]))

(defn step [cells]
  (set (for [[loc n] (frequencies (mapcat neighbours cells))
             :when (or (= n 3) (and (= n 2) (cells loc)))]

Let’s see how it behaves with the “blinker” configuration:

(def board #{[1 0] [1 1] [1 2]})
; #'user/board
(take 5 (iterate step board))
; (#{[1 0] [1 1] [1 2]} #{[2 1] [1 1] [0 1]} #{[1 0] [1 1] [1 2]} #{[2 1] [1 1] [0 1]} #{[1 0] [1 1] [1 2]})

Great, it oscillates as expected!

From this step can be distilled a generic topology-agnostic life-like automatons stepper factory (phew!) but this is a subject for another post or — shameless plug — a book.

A* in Clojure

unsorted — cgrand, 4 September 2010 @ 4 h 22 min

Thanks to Mark Engelberg there’s now a nice priority map in contrib. With such a facility, it becomes really easy to write a nice implementation of the A* algorithm.

(ns net.cgrand.clj-me.a-star
  (:use [clojure.contrib.priority-map :only [priority-map]]))

(defn A*
 "Finds a path between start and goal inside the graph described by edges
  (a map of edge to distance); estimate is an heuristic for the actual
  distance. Accepts a named option: :monotonic (default to true).
  Returns the path if found or nil."
 [edges estimate start goal & {mono :monotonic :or {mono true}}]
  (let [f (memoize #(estimate % goal)) ; unsure the memoization is worthy
        neighbours (reduce (fn [m [a b]] (assoc m a (conj (m a #{}) b)))
                      {} (keys edges))]
    (loop [q (priority-map start (f start))
           preds {}
           shortest {start 0}
           done #{}]
      (when-let [[x hx] (peek q)]
        (if (= goal x)
          (reverse (take-while identity (iterate preds goal)))
          (let [dx (- hx (f x))
                bn (for [n (remove done (neighbours x))
                         :let [hn (+ dx (edges [x n]) (f n))
                               sn (shortest n Double/POSITIVE_INFINITY)]
                         :when (< hn sn)]
                     [n hn])]
            (recur (into (pop q) bn)
              (into preds (for [[n] bn] [n x]))
              (into shortest bn)
              (if mono (conj done x) done))))))))


(defn euclidian-distance [a b] ; multidimensional
  (Math/sqrt (reduce + (map #(let [c (- %1 %2)] (* c c)) a b))))

;; generate a grid graph whose outlying edges are one-way
(defn grid [x y w h]
  (into {}
    (for [i (range w) j (range h)
          :let [x0 (+ x i) y0 (+ y j) x1 (inc x0) y1 (inc y0)]]
      {[[x0 y0] [x1 y0]] 1
       [[x1 y0] [x1 y1]] 1
       [[x1 y1] [x0 y1]] 1
       [[x0 y1] [x0 y0]] 1})))

(def g (apply dissoc (grid 0 0 4 4) (keys (grid 1 1 2 2))))

user=> (A* g euclidian-distance [0 3] [4 2])
([0 3] [1 3] [2 3] [3 3] [3 2] [4 2])

Shameless plug: if you are in Europe and want to learn Clojure, don’t forget to register for the Frankfurt course, October 26-28.

Quick update

unsorted — cgrand, 26 August 2010 @ 0 h 05 min

Long time no post. No code in this one though.
However I’m pleased to announce that Lau and me are going to give another Clojure course: October 26-28 in Frankfurt, beginners welcome. So if you want to learn Clojure, register!

« Previous PageNext Page »
(c) 2018 Clojure and me | powered by WordPress with Barecity