Optimal justification of text with Dynamic Programming

unsorted — cgrand, 29 August 2014 @ 11 h 46 min

This is an exercise in dynamic programming I found on /ftp.

The goal is to tightly arrange a given sequence of n words within page margins, maximizing overall neatness. To be more precise, we wish to minimize the sum, over all lines except the last, of the cubes of the number of blank characters at the end of each line. See the comments in the code for more details.

The algorithm has O(n^2) time and space complexities.

Below is my take on it and I believe the complexity to be O(n*width). I achieve linearity in the words count by laying out the text back to front: the key insight is that the optimal layout of some words only depends on the amount of space left on the current line, it does not depend on the layout of the words before them.

(defn layout [words width]
  (let [cat (fn [word {u :ugliness [l & ls] :lines}] ; adds the word to the start of the first line
              {:ugliness u :lines (cons (cons word l) ls)})
        br (fn [rem {u :ugliness ls :lines}] ; adds a break (creates a new line)
              {:ugliness (+ u (* rem rem rem)) :lines (cons () ls)})
        layout ; layout words, knowing that the first line shouldn't have more than rem characters
          (fn [layout words rem]
            (if-let [[word & ws] (seq words)]
                (= rem width) ; blank line ahead
                  (cat word (layout ws (- rem (count word))))
                (< (count word) rem) ; enough room for a space and the current word
                  (cat word
                    (min-key :ugliness
                      (layout ws (- rem (count word) 1))
                      (br rem (layout ws width))))
                :else (br rem (layout words width)))
              {:ugliness 0 :lines (list ())}))
        mlayout (memoize layout)
        layout (fn self [words rem] (mlayout self words rem))]
    (:lines (layout words width))))

This is an exercise I prepared for a Lambda Next workshop but that we never used.

1 Comment »

  1. It would be appropriate to mention that this is similar to the technique invented by Knuth and Plass (1980) for Professor Knuth’s TeX software. http://www.google.ca/search?client=safari&rls=en&q=knuth+paragraphs+breaking

    Comment by Toby — 29 August 2014 @ 14 h 48 min

RSS feed for comments on this post. TrackBack URI

Leave a comment

(c) 2024 Clojure and me | powered by WordPress with Barecity