The Reasoned Scheduler
tl;dr How I realized that core.logic (or miniKanren) is essentially a scheduler.
Some times ago I wanted to experiment with core.logic’s codebase and the best way to get intimate with a codebase is to mess with it. Doing so I repeatedly failed but I had several epiphanies on the essence of core.logic (or miniKanren) implementation. Epiphanies that I’m going to share here.
At first glance, when I looked at this codebase (codebase which started as a faithful port from Scheme), I noticed a lazy stream implementation, a substitutions type and a monadic structure. A lazy stream impl? in Clojure? Let me change this to use a good old lazy sequence!
Before diving in, let me explain what core.logic does: given a problem specification (a program) and some unknowns it tries to solve the equation program. One executes a program with run
or run*
which returns a lazy sequence of values for the specified unknown. So the magic happens inside run
.
run
calls the program — because the program is just a function — which returns a lazy stream of substitutions maps. A subsitutions map is a map from unknowns to their values, values which may involve other unknowns. A program is made of goals — functions of one substitutions map to a lazy stream of substitutions maps. A program is kickstarted with an empty subsitutions map.
The crux of miniKanren or core.logic is then how to combine goals and lazy streams. The two basic combinations are the disjunction (conde
) and the conjunction (fresh
, all
and the implicit conjunction inside each conde
clause).
Conjunction is easy: it takes a lazy stream and a goal, if the lazy-stream has come to an end then the conjunction too, in the other case, take the next substitutions map returned by the stream, pass it to the goal, you get another stream that you concatenate to the sream returned by the conjunction of the beheaded stream and the goal — in truth concatenation happens through disjunction.
Disjunction takes two lazy streams and returns another lazy stream but is a bit tricky to get right. The naive approach is to simply concatenate the two streams but this is problematic: if the first stream is infinite, results contributed by the second stream are never returned — but at least some results keep being returend. The problem gets even worse if the first stream does not terminate (because its search space is infinite and all solutions has already been found): it simply blocks the evaluation of the second stream. The right approach is to interleave the two streams so as to consume them concurrently.
So I happily proceeded to the conversion of all core.logic’s lazy stream code to lazy sequences… and it failed on some tests; the cause of this failure led to my first ahah moment.
The reason is to be found in the interleaving: if at some point one of the sequences diverges (a call to seq
never returns) then one can’t call first
on it and interaleaves its items with those of the other sequence!
How did the interleaving manage to work for lazy streams? Streams are either nil
, a thunk or an instance of Choice
(the lazy stream pendant of Cons
). When you call a thunk to force the lazy stream you may get another thunk, which means that you don’t have realized the stream but you have advanced towards its realization. So to really force the stream you have to trampoline until a Choice
or nil
is returned. On the other hand, lazy sequences, when forced, always return nil
or a Cons
because the trampolining happens inside LazySeq
.
Hence miniKanren lazy streams give more control on their evaluation to the user and that’s why they can’t be replaced by Clojure lazy sequences! This means that mplus
(the interleaving function) is all about scheduling evaluation of the two streams. This realization struck me: at its heart core.logic business is process management, not about streams mangling!
A disjunction for example can be seen a process forking two child processes and directly passing children results to its own parent process. One can certainly even make a toy implementation on top of agents…
I said that conjunction was easy a few paragraphs ago, it’s a lie but it may be the subject of a subsequent post.