A world in a ref

unsorted — cgrand, 6 October 2011 @ 13 h 00 min

At times I struggle deciding on the granularity I should give to my refs. If I put a big map in a single ref (what I call a megaref), updates to unrelated parts of the map may conflict and cause transactions to retry, but it’s dead easy to snapshot the whole state. If I use a ref for every entry of the map, concurrency is excellent again but snapshotting the whole state may be tricky (you need to tune the ref to have a history long enough) when there is a large number of rapidly changing refs or when the system is slow/loaded.

What I’d like is an alter-in function, a mix of alter and update-in, with the following guarantee: two alter-ins conflict when their paths are either equal or prefix from one another.

This is not something new: this problem bugs me since 2008 I think. I had several (failed/unfinished and private) attempts to patch the STM to accommodate for a new reference type.

Some weeks ago I realized that I didn’t need to hack the STM to create such a new reference type.

The basic idea behind all my attempts was to use lock-striping. What I didn’t realize is that I can leverage the existing STM by using ref-striping!

Let’s say the whole map is stored in a single ref, the root ref and you have N guard refs (whose value is nil). When I want to alter the value for a given key, I compute its hash modulo N which gives me an index into the guards vector. I ref-set the corresponding guard ref (to nil, the actual value doesn’t matter) thus claiming exclusive commit rights for this key. Once that done I simply commute on the root ref being sure that the operation will only commute with non-conflicting other ops.

NB: An alternative design is to simply have N roots and no guards and to merge the N maps on deref — efficient merging would need a didcated map implementation. However it doesn’t help much with nested maps but it helps a lot with concurrency since in the other design everything is serialized on the single root. I think an hybrid solution (N roots, each roots having M guards) woud bring an interesting trade-off between concurrency, number of retries and ease of snapshotting.

To support nested maps, instead of a single key, one has to deal with a path, eg [:a :b :c]. Here the idea is to ensure guards for path prefixes ([:a] and [:a :b]) and to ref-set the guard for the full-path as before.

;; except ensure-in and alter-in, the others are here because
;; they can be optimized in a mega ref wth several roots
(defprotocol AssociativeRef
  (-alter-in [aref ks f args])
  (-commute-in [aref ks f args])
  (ref-set-in [aref ks v])
  (ensure-in [aref ks])
  (deref-in [aref ks]))
 
(defn alter-in [aref ks f & args]
  (-alter-in aref ks f args))

(defn commute-in [aref ks f & args]
  (-commute-in aref ks f args))

(defn- ensure-path [guard-for path]
  (loop [path path]
    (when-not (= [] path)
      (ensure (guard-for path))
      (recur (pop path)))))

(defn- guards-fn [guards]
  (let [n (count guards)]
    #(nth guards (mod (hash %) n))))

(deftype MegaRef [r guards]
  AssociativeRef
  (-alter-in [this ks f args]
    (if (seq ks)
      (let [guard-for (guards-fn guards)]
        (ensure-path guard-for (pop (vec ks)))
        (ref-set (guard-for ks) nil)
        (let [v (apply f (get-in @r ks) args)]
          ; v is precomputed to not evaluate it twice because of commute
          (commute r assoc-in ks v)))
      (throw (IllegalArgumentException. "ks must not be empty."))))
  (-commute-in [this ks f args]
    (apply commute r update-in ks f args))
  (ref-set-in [this ks v]
    (-alter-in this ks (constantly v) nil))
  (ensure-in [this ks]
    (ensure-path  (vec ks) (guards-fn guards))
    (deref-in this ks))
  (deref-in [this ks]
    (get-in @r ks))
  clojure.lang.IDeref
  (deref [this]
    @r))

(defn megaref [entries & options]
  (let [guards (:guards options 16)
        root-options (select-keys options [:validator :min-history :max-history])]
    (MegaRef. (apply ref (into {} entries) root-options)
              (vec (repeatedly guards #(ref nil))))))

NB: Write skew anomalies can now occur at the “sub ref” level: when you deref-in a path unrelated to the ones you update; ensure-in is the answer.

So, some questions:

  • Is this of interest to anyone except me? If yes, once mature (eg multiroot) should it belong in contrib?
  • Is my implementation sound? Do I rely on implementation details?
  • Should the STM expose a “token” facility so that I don’t have to use refs as guards?

14 Comments »

  1. Nice implementation. I think we definitively need the choice of lock granularity in the STM. In some cases you can easily “shard” your mega-refs into single refs, which make handling harder. I tought about the problem for some time now, an I came to the conclusion that one must define something like tx boundaries on a data structure. The basic mechanisms are to use multiple refs and “ensure” but this gets messy soon. Using lock striping and commute as you pointed out is a great alternative. However this should be supported as first class citizens on clojure to make the STM a really
    handsome tool.

    Comment by Philipp Meier — 6 October 2011 @ 14 h 43 min
  2. I have run into similar issues and I think this is an interesting solution. We built something similar (in Java) at Terracotta to get tunable lock granularity in a variant of ConcurrentHashMap. Generally you want concurrency somewhere between one ref per key and one ref per map, not either extreme.

    I think the idea of commuting the root ref and using N guard refs seems sound to me. I am less clear on the path implementation. I think only the root ref is commuted, which means alters of [:a :b] and [:a :c] would still conflict (even though they wouldn’t need to), right? Seems like you really want to commute at *every* guard above the change.

    It would be conceptually nice if you could have the impl handle nested maps using composition of the type rather than baking it into the type definition. I’m not sure that’s possible, just a thought.

    Comment by Alex Miller — 6 October 2011 @ 17 h 15 min
  3. @Alex [:a :b] and [:a :c] don’t conflict: the guard for [:a] is ensured (basically it guards against a [:a] vs [:a :x] conflict) then the guards for [:a :b] and [:a :c] are set and the root commuted. I’m not sure composition is feasible but I haven’t tried too hard so…

    Comment by cgrand — 6 October 2011 @ 17 h 26 min
  4. Ah, right. All the paths are just ensured other than the write path which is ref-set.

    Comment by Alex Miller — 6 October 2011 @ 21 h 49 min
  5. Clojure and me » A world in a ref…

    Clojure and me » A world in a ref…

    Trackback by pligg.com — 11 October 2011 @ 6 h 16 min
  6. [...] A World in a Ref — Christophe never fails to make me think and this post is no exception. This time he tackles the problem of STM refs containing uber-structures. [...]

    Pingback by fogus: The best things and stuff of 2011 — 31 December 2011 @ 17 h 01 min
  7. [...] I recently published Megaref a refined (and working) version of the idea described in A World in a Ref. [...]

    Pingback by Clojure and me » Follow-up: A world in a ref — 21 September 2012 @ 8 h 35 min
  8. Hmm. I think this solves a problem I was wrestling with. Could it work also for vecs?

    Comment by Eric Normand — 22 September 2012 @ 3 h 36 min
  9. @Eric it works on any associative collection so yes you can use vectors. The ants demo uses vectors by the way.

    Comment by cgrand — 22 September 2012 @ 8 h 48 min
  10. Greetings from Florida! I’m bored to tears at work so I decided to check out your website on my iphone during lunch break. I really like the info you present here and can’t wait to take a look when I get
    home. I’m surprised at how quick your blog loaded on my mobile .. I’m not even using WIFI, just 3G .
    . Anyhow, fantastic blog!

    Comment by espresso maker — 18 June 2013 @ 2 h 11 min
  11. [...] A world in a ref http://clj-me.cgrand.net/2011/10/06/a-world-in-a-ref/ [...]

  12. I will right away seize your rss feed as I can not find your e-mail subscription link or e-newsletter service.
    Do you’ve any? Kindly permit me understand so that I may subscribe.
    Thanks.

    Look into my web blog … Mylene Farmer Timeless 2013 le film telecharger

  13. Thanks for finally writing about > Clojure
    and me

    Comment by Isaac — 15 June 2014 @ 9 h 03 min
  14. My relatives always say that I am killing my time here at net, but I know I am getting experience all the time by reading such good articles or
    reviews.

RSS feed for comments on this post. TrackBack URI

Leave a comment

(c) 2014 Clojure and me | powered by WordPress with Barecity