Data Structures and Algorithms with Object-Oriented Design Patterns in Java

Application: Typesetting Problem

 

Consider the problem of typesetting a paragraph of justified text. A paragraph can be viewed as a sequence of $n>0$ words, $w_1, w_2, \ldots, w_n$. The objective is to determine how to break the sequence into individual lines of text of the appropriate size. Each word is separated from the next by some amount of space. By stretching or compressing the space between the words, the left and right ends of consecutive lines of text are made to line up. A paragraph looks best when the amount of stretching or compressing is minimized.

We can formulate the problem as follows: Assume that we are given the lengths of the words, $l_1, l_2, \ldots, l_n$, and that the desired length of a line is $D$. Let $W_{i,j}$ represent the sequence of words from $w_i$ to $w_j$ (inclusive). That is,

\[ W_{i,j} = \{ w_i, w_{i+1}, \ldots, w_j \}, \]

for $1 \le i \le j \le n$.

Let $L_{i,j}$ be the sum of the lengths of the words in the sequence $W_{i,j}$. That is,

\[ L_{i,j} = \sum_{k=i}^{j} l_k. \]

The natural length of the sequence $W_{i,j}$ is the sum of the lengths of the words, $L_{i,j}$, plus the normal amount of space between those words. Let $s$ be the normal size of the space between two words. A line of $j-i+1$ words contains $j-i$ spaces, so the natural length of $W_{i,j}$ is $L_{i,j} + (j-i)s$. Note that we can also define $L_{i,j}$ recursively as follows:

\[ L_{i,j} = \begin{cases} l_i & i = j, \\ L_{i,j-1} + l_j & i < j. \end{cases} \]
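The recursive definition of the word-length sums translates directly into a table-filling loop. The following is a minimal sketch, not the book's own program: the class name `WordLengthSums` and its method names are hypothetical, and the indices are 0-based rather than the 1-based indices used in the text.

```java
final class WordLengthSums {
    // Builds a table in which sums[i][j] (0-based, i <= j) holds the sum
    // of lengths[i..j]. Each entry extends the entry to its left by one
    // word, mirroring the recursive definition of the sum.
    static int[][] sums(int[] lengths) {
        int n = lengths.length;
        int[][] sums = new int[n][n];
        for (int i = 0; i < n; i++) {
            sums[i][i] = lengths[i];                        // base case: i = j
            for (int j = i + 1; j < n; j++) {
                sums[i][j] = sums[i][j - 1] + lengths[j];   // case: i < j
            }
        }
        return sums;
    }

    // Natural length of words i..j: the word lengths plus (j - i) spaces
    // of normal size s.
    static int naturalLength(int[][] sums, int i, int j, int s) {
        return sums[i][j] + (j - i) * s;
    }
}
```

Filling the table this way takes time proportional to the number of entries, since each entry costs one addition.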

In general, when we typeset the sequence $W_{i,j}$ all on a single line, we need to stretch or compress the spaces between the words so that the length of the line is the desired length $D$. Therefore, the amount of stretching or compressing is given by the difference $D - (L_{i,j} + (j-i)s)$. However, if the sum of the lengths of the words, $L_{i,j}$, is longer than the desired line length $D$, it is not possible to typeset the sequence on a single line.
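As a small worked example, suppose (with made-up numbers, chosen only for illustration) that three words have lengths 3, 2 and 4, and that the normal space has size $s = 1$. The sum of the word lengths is 9 and the line contains two spaces, so the natural length is

\[ 9 + 2 \cdot 1 = 11. \]

If the desired line length is $D = 12$, the difference is $12 - 11 = 1$, so the spaces must be stretched by a total of 1. If $D = 10$, the difference is $10 - 11 = -1$, so the spaces must be compressed by a total of 1. If $D = 8$, the words alone have total length $9 > 8$, so the three words cannot be typeset on a single line.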

Let $P_{i,j}$ be the penalty associated with typesetting the sequence $W_{i,j}$ on a single line. Then,

\[ P_{i,j} = \begin{cases} \left| D - L_{i,j} - (j-i)s \right| & L_{i,j} \le D, \\ \infty & L_{i,j} > D. \end{cases} \]

This definition of penalty is consistent with the stated objectives: The penalty increases as the difference between the natural length of the sequence and the desired length increases, and the infinite penalty disallows lines that are too long.
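The penalty for a single line can be sketched as a small function. This is an illustrative sketch under stated assumptions: the class `LinePenalty` and its parameter names are hypothetical, the penalty is taken to be the absolute difference between the desired length and the natural length, and `Double.POSITIVE_INFINITY` stands in for the infinite penalty so that sums of penalties remain well defined.

```java
final class LinePenalty {
    // wordSum: total length of the words on the line
    // gaps:    number of spaces on the line (one fewer than the word count)
    // s:       normal size of one space
    // D:       desired line length
    static double penalty(int wordSum, int gaps, int s, int D) {
        if (wordSum > D) {
            // The words alone are too long, so the line is not allowed.
            return Double.POSITIVE_INFINITY;
        }
        int naturalLength = wordSum + gaps * s;
        // Stretching and compressing are penalized equally.
        return Math.abs(D - naturalLength);
    }
}
```

Using a floating-point infinity rather than a large integer sentinel avoids overflow when penalties are later added together.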

Finally, we define the quantity $c_{i,j}$ for $1 \le i \le j \le n$ as the minimum total penalty required to typeset the sequence $W_{i,j}$. In this case, the text may be all on one line or it may be split over more than one line. The quantity $c_{i,j}$ is given by

\[ c_{i,j} = \begin{cases} P_{i,j} & i = j, \\ \min\left\{ P_{i,j},\ \min\limits_{1 \le k \le j-i} \left( P_{i,i+k-1} + c_{i+k,j} \right) \right\} & i < j. \end{cases} \]

We obtain this recurrence as follows: When $i=j$ there is only one word in the paragraph. The minimum total penalty associated with typesetting the paragraph in this case is just the penalty which results from putting the one word on a single line.

In the general case, there is more than one word in the sequence $W_{i,j}$. In order to determine the optimal way in which to typeset the paragraph, we consider the cost of putting the first $k$ words of the sequence on the first line of the paragraph, $P_{i,i+k-1}$, plus the minimum total cost associated with typesetting the rest of the paragraph, $c_{i+k,j}$, for each $k$ with $1 \le k \le j-i$. We also consider the cost of putting the entire sequence on a single line. The value of $k$ which minimizes the total cost also specifies where the line break should occur.
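The recurrence for the minimum total penalty can be evaluated bottom-up while recording the minimizing value of k for each subsequence. The following is a minimal sketch, not the book's own program: the class `ParagraphBreaker`, its method names, and the 0-based indexing are assumptions made for illustration, and the penalty is assumed to be the absolute difference between the desired and natural line lengths.

```java
import java.util.ArrayList;
import java.util.List;

final class ParagraphBreaker {
    // Penalty for putting words i..j (0-based, inclusive) on one line.
    // prefix[m] holds the sum of the first m word lengths.
    static double penalty(int[] prefix, int i, int j, int s, int D) {
        int wordSum = prefix[j + 1] - prefix[i];
        if (wordSum > D) {
            return Double.POSITIVE_INFINITY;   // line is too long
        }
        return Math.abs(D - (wordSum + (j - i) * s));
    }

    // Returns the number of words placed on each line, in order.
    // If some word is longer than D, every layout has infinite penalty
    // and the result is not meaningful.
    static List<Integer> wordsPerLine(int[] lengths, int D, int s) {
        int n = lengths.length;
        int[] prefix = new int[n + 1];
        for (int i = 0; i < n; i++) {
            prefix[i + 1] = prefix[i] + lengths[i];
        }

        double[][] c = new double[n][n];   // c[i][j]: min total penalty for words i..j
        int[][] first = new int[n][n];     // first[i][j]: words on the first line
        for (int j = 0; j < n; j++) {
            // i runs downward so that c[i + k][j] is ready when needed.
            for (int i = j; i >= 0; i--) {
                c[i][j] = penalty(prefix, i, j, s, D);   // everything on one line
                first[i][j] = j - i + 1;
                for (int k = 1; k <= j - i; k++) {
                    double cost = penalty(prefix, i, i + k - 1, s, D) + c[i + k][j];
                    if (cost < c[i][j]) {
                        c[i][j] = cost;
                        first[i][j] = k;
                    }
                }
            }
        }

        // Follow the recorded choices from the first word to the last.
        List<Integer> lines = new ArrayList<>();
        for (int i = 0; i < n; i += first[i][n - 1]) {
            lines.add(first[i][n - 1]);
        }
        return lines;
    }
}
```

For example, with word lengths 3, 2, 4, 3, a desired line length of 10 and a normal space of 1, calling `ParagraphBreaker.wordsPerLine(new int[] {3, 2, 4, 3}, 10, 1)` yields two lines of two words each, with a total penalty of 4 + 2 = 6. The three nested loops make the running time cubic in the number of words.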





Copyright © 1998 by Bruno R. Preiss, P.Eng. All rights reserved.