Decaying lists: log scale for lists
The concept of exponentially decaying lists by David Barbour piqued my interest and I implemented it. It can be summed up as: log scale for lists!
As for log scale, decaying lists allow to manage large range of values and thus to get a better understanding of the data.
So a decaying list grows logarithmically with the number of conj
ed items. It follows that some items are dropped when others are inserted. Let’s see:
=> (seq (into (decaying-list) (range 100))) (99 98 97 96 95 94 92 91 90 89 88 87 86 84 83 80 70 65 61 59 57 49 44 30 29 27 25 23 22 18 12 4 0)
So most of the recent items are there but the older elements have a greater probability to have been dropped.
A decaying list is parametrized by its half-life. The default half-life is 20, it means that an item in the list has one chance out of two to “survive” the insertion of 20 other items.
=> (seq (into (decaying-list 5) (range 100))) ; a half-life of 5 conjs (99 98 97 96 95 94 93 91 83 81 79 78 68 61 57 54 52 31 15)
You can know where the next decay (if any: nil
is returned when no decay) is going to happen:
=> (def dl (into (decaying-list 5) (range 100))) #'net.cgrand.decay/dl => (decay-loc dl) [(93 95 97 98 99) (92 91 89 87 85 81 77 72 71 63 54 52 46 31 20 16)]
So here the decay is going to happen between 93 and 92 (the first items of the “left” and “right” sequences). By default the most recent one is kept.
=> (seq (conj dl 100)) (100 99 98 97 95 93 91 89 87 85 81 77 72 71 63 54 52 46 31 20 16)
Indeed 92 has been dropped. (Repeated calls to decay-loc
always yield the same result.)
However we may elect to synthetize a new value rather than just keep one of the two candidates to decay. For example we can compute their average:
=> (seq (into (decaying-list 5 :collapse (fn [dl] (let [[[l] [r]] (decay-loc dl)] (/ (+ l r) 2.0)))) (range 100))) (99 98 97 95.5 93.5 91.5 90 89 87.125 82.75 79.5 72.875 64.8125 60.875 58 50.732421875 27.0 17.75 11.25 2.8125)
Or something a bit more elaborate (and precise):
=> (defn stats [x] (if (map? x) x {:min x :max x :avg x :n 1})) #'net.cgrand.decay/stats => (defn merge-stats [{mina :min maxa :max avga :avg na :n} {minb :min maxb :max avgb :avg nb :n}] {:min (min mina minb) :max (max maxa maxb) :avg (/ (+ (* na avga) (* nb avgb)) (double (+ na nb))) :n (+ na nb)}) #'net.cgrand.decay/merge-stats => (seq (into (decaying-list 5 :collapse (fn [dl] (let [[[l] [r]] (decay-loc dl)] (merge-stats (stats l) (stats r))))) (range 100))) (99 98 97 {:min 92, :max 96, :avg 94.0, :n 5} 91 {:min 87, :max 90, :avg 88.5, :n 4} 86 85 84 83 {:min 80, :max 82, :avg 81.0, :n 3} {:min 78, :max 79, :avg 78.5, :n 2} 77 {:min 70, :max 76, :avg 73.0, :n 7} 69 68 {:min 49, :max 67, :avg 58.0, :n 19} {:min 41, :max 48, :avg 44.5, :n 8} {:min 36, :max 40, :avg 38.0, :n 5} 35 {:min 32, :max 34, :avg 33.0, :n 3} {:min 10, :max 31, :avg 20.5, :n 22} {:min 3, :max 9, :avg 6.0, :n 7} {:min 0, :max 2, :avg 1.0, :n 3})
Decaying lists can also maintain a state – that is updated after each conj or decay. Below the implementation, there is an example of state management.
You can also cap the length of the decaying list by using the :capacity
option. A quick note on capacity: increasing the capacity by the half-life value (eg going from 1000 to 1050 when half-life is 50), doubles the range the decaying list remembers.