Steven S. Skiena Miguel A. Revilla

PROGRAMMING CHALLENGES
The Programming Contest Training Manual

With 65 Illustrations

Steven S. Skiena
Department of Computer Science
SUNY Stony Brook
Stony Brook, NY 11794-4400, USA
skiena@programming-challenges.com

Miguel A. Revilla
Department of Applied Mathematics
and Computer Science
Faculty of Sciences
University of Valladolid
Valladolid, 47011, SPAIN
revilla@programming-challenges.com

Library of Congress Cataloging-in-Publication Data
Skiena, Steven S.
Programming challenges : the programming contest training manual / Steven S. Skiena,
Miguel A. Revilla.
p. cm. — (Texts in computer science)
Includes bibliographical references and index.
ISBN 0-387-00163-8 (softcover : alk. paper)
1. Computer programming. I. Revilla, Miguel A. II. Title. III. Series.
QA76.6.S598 2003
005.1—dc21
2002044523
ISBN 0-387-00163-8

Printed on acid-free paper.

Preface

There are many distinct pleasures associated with computer programming. Craftsmanship has its quiet rewards, the satisfaction that comes from building a useful object and
making it work. Excitement arrives with the flash of insight that cracks a previously
intractable problem. The spiritual quest for elegance can turn the hacker into an artist.
There are pleasures in parsimony, in squeezing the last drop of performance out of clever
algorithms and tight coding.
The games, puzzles, and challenges of problems from international programming competitions are a great way to experience these pleasures while improving your algorithmic
and coding skills. This book contains over 100 problems that have appeared in previous
programming contests, along with discussions of the theory and ideas necessary to attack them. Instant online grading for all of these problems is available from two WWW
robot judging sites. Combining this book with a judge gives an exciting new way to
challenge and improve your programming skills.
This book can be used for self-study, for teaching innovative courses in algorithms
and programming, and in training for international competition.

To the Reader
The problems in this book have been selected from over 1,000 programming problems at
the Universidad de Valladolid online judge, available at http://online-judge.uva.es. The
judge has ruled on well over one million submissions from 27,000 registered users around
the world to date. We have taken only the best of the best, the most fun, exciting, and
interesting problems available.
We have organized these problems by topic and provided enough tutorial material
(primarily in mathematics and algorithms) to give you a fair chance to solve them.

Sample programs are provided to illustrate many important concepts. By reading this
book and trying the problems you will gain a concrete understanding of algorithmic
techniques such as backtracking and dynamic programming, and advanced topics such
as number theory and computational geometry. These subjects are well worth your
attention even if you never intend to compete in programming contests.
Many of the problems are flat-out fun. They address fascinating topics in computer
science and mathematics, sometimes disguised by an amusing story. These make interesting subjects for additional study, so we provide notes with further readings where
appropriate.
We have found that people whose training is in the pragmatics of programming and
software engineering often fail to appreciate the power of algorithmics. Similarly, the
theoretically inclined typically underestimate what it takes to turn an algorithm into a
program, and how clever programming can make short work of a tough problem.
For this reason, the first portion of the book focuses primarily on programming
techniques, such as the proper use of data types and program libraries. This lays the
foundation for the more algorithmic sections in the second part of the book. Mastery
of both is required to be a complete problem solver.

To the Instructor
This book has been designed to serve as a textbook for three types of courses:
• Algorithm courses focusing on programming.
• Programming courses focusing on algorithms.
• Elective courses designed to train students to participate in competitions such
as the Association for Computing Machinery (ACM) International Collegiate
Programming Contest and the International Olympiad in Informatics.
Such courses can be a lot of fun for all involved. Students are easily motivated by
the thrill of competition, and get positive feedback each time the judge accepts their
solution. The most obvious algorithm may result in a “Time Limit Exceeded” message
from the judge, thus motivating a search for efficiency. The correct insight can make for
a dozen-line program instead of a huge mass of code. The best students will be inspired
to try extra problems just for kicks.
Such courses are fun to teach, too. Many problems are quite clever, putting a fresh
face on standard topics in programming and algorithms. Finding the best solution
requires insight and inspiration. It is exciting to figure out the right way to do each of
the problems, and even more exciting when the students figure it out for themselves.
Pedagogical features of this book include:
• Complements Standard Algorithm Texts — Although this book is self-contained,
it has been written with the understanding that most students will have some prior
exposure to algorithm design. This book has been designed (and priced) so it can
serve as a supplementary text for traditional algorithms courses, complementing
abstract descriptions with concrete implementations and theoretical analysis with
hands-on experience. Further, it covers several interesting topics that are not
universally included in standard algorithm texts.
• Provides Complete Implementations of Classical Algorithms — Many students
have a difficult time going from abstract algorithm descriptions to working code.
To help them, we provide carefully written implementations of all important algorithms we discuss using a subset of C designed to be easily readable by C++
and Java programmers. Several of our programming challenge problems can be
solved by appropriately modifying these routines, thus providing a concrete path
to get students started.
• Integrated Course Management Environment — We have created a special
course management environment that makes it shamefully easy to administer
such a course, as it will handle all testing and grading for you! Our website
http://www.programming-challenges.com lets you assign problems to students,
maintain rosters, view each student’s score and programs, and even detect
suspicious similarity among their solutions!
• Help for Students at All Levels — The challenges included in this book have
been selected to span a wide range of difficulty. Many are suitable for introductory students, while others will prove challenging to those ready for international
competition. Hints for most problems are provided.
To help identify the most appropriate problems for any given student, we have
annotated each problem with three distinct measures of difficulty. The popularity
of a problem (A, B, or C) refers to how many people try it, while the success rate
(low to high) measures how often they succeed. Finally, the level of a problem (1
to 4, corresponding roughly from freshman to senior) indicates how advanced a
student needs to be in order to have a fair chance of solving the problem.

To the Coach or Competitor
This book has been particularly designed to serve as a training manual for programming competitions at the high school and collegiate levels. We provide a convenient
summary/reference of important topics in mathematics and algorithms, along with
appropriate challenges to help you master the material.
The robot judge checks the correctness of submitted programs just like the human
judges of the ACM International Collegiate Programming Contest. Once you set up a
personal account with the judge, you can submit solutions written in C, C++, Pascal, or
Java and wait for the verdict announcing success or failure. The judge keeps statistics on
how you are doing, so you can compare yourself to the thousands of other participants.
To help the competitor, we include an appendix with training secrets from finalists
for the three major programming contest venues: the ACM International Collegiate
Programming Contest (ICPC), the International Olympiad in Informatics (IOI), and
the TopCoder Programmer Challenge. We present the history of these competitions,
show how you can get involved, and help you make your best possible showing.

Roughly 80% of all finalists in the most recent ACM contest trained on the Universidad de Valladolid online judge. That the finals are held in exotic locales like Hawaii
provides extra incentive to study. Good luck!

Associated Websites
This book has been designed to work hand-in-hand with two websites. Online grading
for all problems is available at http://www.programming-challenges.com, along with
lots of supporting material. In particular, we provide complete source code of all the
programs that appear in the text as well as lecture notes to help integrate this material
into courses.
All of the problems in this book (and many, many more) can also be graded by the
Universidad de Valladolid online judge, http://online-judge.uva.es. In particular each
programming challenge in this book has been given an ID number on both judging
websites, so you can take advantage of their special features.

Acknowledgments
The existence of this book is due in great part to the generosity of all the people who let
us incorporate their contest problems into the robot judge as well as in this book. No less
than 17 people contributed problems to this volume, from four different continents. We
are particularly indebted to Gordon Cormack and Shahriar Manzoor, problem posers
on the scale of Sam Loyd and H. E. Dudeney!
A complete mapping of people to problems appears in the appendix, but we particularly thank the following contest organizers for their contributions: Gordon Cormack
(38 problems), Shahriar Manzoor (28), Miguel Revilla (10), Pedro Demasi (8), Manuel
Carro (4), Rujia Liu (4), Petko Minkov (4), Owen Astrakan (3), Alexander Denisjuk (3),
Long Chong (2), Ralf Engels (2), Alex Gevak (1), Walter Guttmann (1), Arun Kishore
(1), Erick Moreno (1), Udvranto Patik (1), and Marcin Wojciechowski (1). Several of
these problems were developed by third parties, who are acknowledged in the appendix.
Tracking down the original authors of some of these problems proved almost as difficult as tracking down the author of the Bible. We have tried very hard to identify the
author of each problem, and in each case received permission from someone claiming to
speak for the author. We apologize in advance if there are any oversights. If so, please
let us know so we can award proper credit.
The robot judge project is the work of many hands. Ciriaco Garcı́a is the primary author of the robot judge software and a key supporter of the project. Fernando P. Nájera
is responsible for many of the tools that help the judge in a friendly manner. Carlos M.
Casas maintains the correctness of the test files, ensuring that they are both fair and
demanding. José A. Caminero and Jesús Paúl help with problem curation and solution
integrity. We particularly thank Miguel Revilla, Jr. for building and maintaining the
http://www.programming-challenges.com website.
This book was partially debugged during a course taught at Stony Brook by Vinhthuy
Phan and Pavel Sumazin in spring 2002. The students from our terrific programming teams this year (Larry Mak, Dan Ports, Tom Rothamel, Alexey Smirnov, Jeffrey
Versoza, and Charles Wright) helped review the manuscript and we thank them for
their interest and feedback. Haowen Zhang made a significant contribution by carefully
reading the manuscript, testing the programs, and tightening the code.
We thank Wayne Yuhasz, Wayne Wheeler, Frank Ganz, Lesley Poliner, and Rich Putter of Springer-Verlag for all their help turning a manuscript into a published book. We
thank Gordon Cormack, Lauren Cowles, David Gries, Joe O’Rourke, Saurabh Sethia,
Tom Verhoeff, Daniel Wright, and Stan Wagon for thoughtful manuscript reviews that
significantly improved the final product. The Fulbright Foundation and the Department
of Applied Mathematics and Computation at the Universidad de Valladolid provided
essential support, enabling the two authors to work together face to face. Citigroup CIB,
through the efforts of Peter Remch and Debby Z. Beckman, significantly contributed
to the ACM ICPC effort at Stony Brook. Its involvement helped spark the writing of
this book.
Steven S. Skiena
Stony Brook, NY
Miguel A. Revilla
Valladolid, Spain
February 2003

Contents

1 Getting Started
1.1 Getting Started With the Judge
1.1.1 The Programming Challenges Robot Judge
1.1.2 The Universidad de Valladolid Robot Judge
1.1.3 Feedback From the Judge
1.2 Choosing Your Weapon
1.2.1 Programming Languages
1.2.2 Reading Our Programs
1.2.3 Standard Input/Output
1.3 Programming Hints
1.4 Elementary Data Types
1.5 About the Problems
1.6 Problems
1.6.1 The 3n + 1 Problem
1.6.2 Minesweeper
1.6.3 The Trip
1.6.4 LCD Display
1.6.5 Graphical Editor
1.6.6 Interpreter
1.6.7 Check the Check
1.6.8 Australian Voting
1.7 Hints
1.8 Notes

2 Data Structures
2.1 Elementary Data Structures
2.1.1 Stacks
2.1.2 Queues
2.1.3 Dictionaries
2.1.4 Priority Queues
2.1.5 Sets
2.2 Object Libraries
2.2.1 The C++ Standard Template Library
2.2.2 The Java java.util Package
2.3 Program Design Example: Going to War
2.4 Hitting the Deck
2.5 String Input/Output
2.6 Winning the War
2.7 Testing and Debugging
2.8 Problems
2.8.1 Jolly Jumpers
2.8.2 Poker Hands
2.8.3 Hartals
2.8.4 Crypt Kicker
2.8.5 Stack ’em Up
2.8.6 Erdös Numbers
2.8.7 Contest Scoreboard
2.8.8 Yahtzee
2.9 Hints
2.10 Notes

3 Strings
3.1 Character Codes
3.2 Representing Strings
3.3 Program Design Example: Corporate Renamings
3.4 Searching for Patterns
3.5 Manipulating Strings
3.6 Completing the Merger
3.7 String Library Functions
3.8 Problems
3.8.1 WERTYU
3.8.2 Where’s Waldorf?
3.8.3 Common Permutation
3.8.4 Crypt Kicker II
3.8.5 Automated Judge Script
3.8.6 File Fragmentation
3.8.7 Doublets
3.8.8 Fmt
3.9 Hints
3.10 Notes

4 Sorting
4.1 Sorting Applications
4.2 Sorting Algorithms
4.3 Program Design Example: Rating the Field
4.4 Sorting Library Functions
4.5 Rating the Field
4.6 Problems
4.6.1 Vito’s Family
4.6.2 Stacks of Flapjacks
4.6.3 Bridge
4.6.4 Longest Nap
4.6.5 Shoemaker’s Problem
4.6.6 CDVII
4.6.7 ShellSort
4.6.8 Football (aka Soccer)
4.7 Hints
4.8 Notes

5 Arithmetic and Algebra
5.1 Machine Arithmetic
5.1.1 Integer Libraries
5.2 High-Precision Integers
5.3 High-Precision Arithmetic
5.4 Numerical Bases and Conversion
5.5 Real Numbers
5.5.1 Dealing With Real Numbers
5.5.2 Fractions
5.5.3 Decimals
5.6 Algebra
5.6.1 Manipulating Polynomials
5.6.2 Root Finding
5.7 Logarithms
5.8 Real Mathematical Libraries
5.9 Problems
5.9.1 Primary Arithmetic
5.9.2 Reverse and Add
5.9.3 The Archeologist’s Dilemma
5.9.4 Ones
5.9.5 A Multiplication Game
5.9.6 Polynomial Coefficients
5.9.7 The Stern-Brocot Number System
5.9.8 Pairsumonious Numbers
5.10 Hints
5.11 Notes

6 Combinatorics
6.1 Basic Counting Techniques
6.2 Recurrence Relations
6.3 Binomial Coefficients
6.4 Other Counting Sequences
6.5 Recursion and Induction
6.6 Problems
6.6.1 How Many Fibs?
6.6.2 How Many Pieces of Land?
6.6.3 Counting
6.6.4 Expressions
6.6.5 Complete Tree Labeling
6.6.6 The Priest Mathematician
6.6.7 Self-describing Sequence
6.6.8 Steps
6.7 Hints
6.8 Notes

7 Number Theory
7.1 Prime Numbers
7.1.1 Finding Primes
7.1.2 Counting Primes
7.2 Divisibility
7.2.1 Greatest Common Divisor
7.2.2 Least Common Multiple
7.3 Modular Arithmetic
7.4 Congruences
7.4.1 Operations on Congruences
7.4.2 Solving Linear Congruences
7.4.3 Diophantine Equations
7.5 Number Theoretic Libraries
7.6 Problems
7.6.1 Light, More Light
7.6.2 Carmichael Numbers
7.6.3 Euclid Problem
7.6.4 Factovisors
7.6.5 Summation of Four Primes
7.6.6 Smith Numbers
7.6.7 Marbles
7.6.8 Repackaging
7.7 Hints
7.8 Notes

8 Backtracking
8.1 Backtracking
8.2 Constructing All Subsets
8.3 Constructing All Permutations
8.4 Program Design Example: The Eight-Queens Problem
8.5 Pruning Search
8.6 Problems
8.6.1 Little Bishops
8.6.2 15-Puzzle Problem
8.6.3 Queue
8.6.4 Servicing Stations
8.6.5 Tug of War
8.6.6 Garden of Eden
8.6.7 Color Hash
8.6.8 Bigger Square Please...
8.7 Hints
8.8 Notes

9 Graph Traversal
9.1 Flavors of Graphs
9.2 Data Structures for Graphs
9.3 Graph Traversal: Breadth-First
9.3.1 Breadth-First Search
9.3.2 Exploiting Traversal
9.3.3 Finding Paths
9.4 Graph Traversal: Depth-First
9.4.1 Finding Cycles
9.4.2 Connected Components
9.5 Topological Sorting
9.6 Problems
9.6.1 Bicoloring
9.6.2 Playing With Wheels
9.6.3 The Tourist Guide
9.6.4 Slash Maze
9.6.5 Edit Step Ladders
9.6.6 Tower of Cubes
9.6.7 From Dusk Till Dawn
9.6.8 Hanoi Tower Troubles Again!
9.7 Hints

10 Graph Algorithms
10.1 Graph Theory
10.1.1 Degree Properties
10.1.2 Connectivity
10.1.3 Cycles in Graphs
10.1.4 Planar Graphs
10.2 Minimum Spanning Trees
10.3 Shortest Paths
10.3.1 Dijkstra’s Algorithm
10.3.2 All-Pairs Shortest Path
10.4 Network Flows and Bipartite Matching
10.5 Problems
10.5.1 Freckles
10.5.2 The Necklace
10.5.3 Fire Station
10.5.4 Railroads
10.5.5 War
10.5.6 Tourist Guide
10.5.7 The Grand Dinner
10.5.8 The Problem With the Problem Setter
10.6 Hints

11 Dynamic Programming
11.1 Don’t Be Greedy
11.2 Edit Distance
11.3 Reconstructing the Path
11.4 Varieties of Edit Distance
11.5 Program Design Example: Elevator Optimization
11.6 Problems
11.6.1 Is Bigger Smarter?
11.6.2 Distinct Subsequences
11.6.3 Weights and Measures
11.6.4 Unidirectional TSP
11.6.5 Cutting Sticks
11.6.6 Ferry Loading
11.6.7 Chopsticks
11.6.8 Adventures in Moving: Part IV
11.7 Hints
11.8 Notes

12 Grids
12.1 Rectilinear Grids
12.1.1 Traversal
12.1.2 Dual Graphs and Representations
12.2 Triangular and Hexagonal Grids
12.2.1 Triangular Lattices
12.2.2 Hexagonal Lattices
12.3 Program Design Example: Plate Weight
12.4 Circle Packings
12.5 Longitude and Latitude
12.6 Problems
12.6.1 Ant on a Chessboard
12.6.2 The Monocycle
12.6.3 Star
12.6.4 Bee Maja
12.6.5 Robbery
12.6.6 (2/3/4)-D Sqr/Rects/Cubes/Boxes?
12.6.7 Dermuba Triangle
12.6.8 Airlines
12.7 Hints

13 Geometry
13.1 Lines
13.2 Triangles and Trigonometry
13.2.1 Right Triangles and the Pythagorean Theorem
13.2.2 Trigonometric Functions
13.2.3 Solving Triangles
13.3 Circles
13.4 Program Design Example: Faster Than a Speeding Bullet
13.5 Trigonometric Function Libraries
13.6 Problems
13.6.1 Dog and Gopher
13.6.2 Rope Crisis in Ropeland!
13.6.3 The Knights of the Round Table
13.6.4 Chocolate Chip Cookies
13.6.5 Birthday Cake
13.6.6 The Largest/Smallest Box ...
13.6.7 Is This Integration?
13.6.8 How Big Is It?
13.7 Hints

14 Computational Geometry
14.1 Line Segments and Intersection
14.2 Polygons and Angle Computations
14.3 Convex Hulls
14.4 Triangulation: Algorithms and Related Problems
14.4.1 Van Gogh’s Algorithm
14.4.2 Area Computations
14.4.3 Point Location
14.5 Algorithms on Grids
14.5.1 Range Queries
14.5.2 Lattice Polygons and Pick’s Theorem
14.6 Geometry Libraries
14.7 Problems
14.7.1 Herding Frosh
14.7.2 The Closest Pair Problem
14.7.3 Chainsaw Massacre
14.7.4 Hotter Colder
14.7.5 Useless Tile Packers
14.7.6 Radar Tracking
14.7.7 Trees on My Island
14.7.8 Nice Milk
14.8 Hints

A Appendix
A.1 The ACM International Collegiate Programming Contest
A.1.1 Preparation
A.1.2 Strategies and Tactics
A.2 International Olympiad in Informatics
A.2.1 Participation
A.2.2 Format
A.2.3 Preparation
A.3 Topcoder.com
A.4 Go to Graduate School!
A.5 Problem Credits

References

Index

1
Getting Started

We kick off this book with a collection of relatively elementary programming problems,
none of which require ideas more advanced than arrays and iteration.
Elementary does not necessarily mean easy, however! These problems provide a
good introduction to the demanding nature of the robot judge, and the need to read
carefully and understand specifications. They also provide an opportunity to discuss
programming styles best suited to getting the job done.
To help you get started, we begin with a description of the robot judges and their
idiosyncrasies. We follow with a discussion of basic programming style and data structures before introducing our first set of problems. As in all chapters in this book, we
follow with hints for selected problems and notes for further study.
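
To give a feel for what "arrays and iteration" level means in practice, here is a minimal sketch built around the well-known 3n + 1 iteration (halve an even number, replace an odd number n by 3n + 1, stop at 1). It is an illustration only, not the solution given later in this book: the function names cycle_length and max_cycle_length are our own, and we assume that long long is wide enough for every intermediate value, which holds for small starting values but should be checked against the actual problem limits.

```c
/* Sketch only: names and types are illustrative assumptions. */

/* Number of terms in the 3n+1 sequence that starts at n,
   counting both n itself and the final 1. Assumes n >= 1. */
long long cycle_length(long long n)
{
    long long length = 1;          /* the starting value counts as one term */

    while (n != 1) {
        if (n % 2 == 0)
            n = n / 2;             /* even: halve */
        else
            n = 3 * n + 1;         /* odd: triple and add one */
        length++;
    }
    return length;
}

/* Largest cycle length over all starting values between i and j,
   inclusive. The bounds may be given in either order. */
long long max_cycle_length(long long i, long long j)
{
    long long lo = (i < j) ? i : j;
    long long hi = (i < j) ? j : i;
    long long best = 0;
    long long n;

    for (n = lo; n <= hi; n++) {
        long long len = cycle_length(n);
        if (len > best)
            best = len;
    }
    return best;
}
```

Calling max_cycle_length(1, 10) walks the ten starting values in turn and keeps the longest sequence found. Swapping the bounds inside the function is a deliberate choice: an input specification that does not promise the smaller number comes first is the kind of detail that trips up a submission that otherwise looks correct.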

1.1 Getting Started With the Judge
This book is designed to be used in tandem with one (or both) of two robot judging
websites. The Programming Challenges judge http://www.programming-challenges.com
has been set up specifically to help you get the most from the challenges in this book.
The Universidad de Valladolid judge http://online-judge.uva.es has a different interface
as well as hundreds of additional problems available.
All the problems in the book can be judged on either website; both are
administered by Miguel Revilla. In this section, we describe how to use the two judges
and explain the differences between them. Be aware that both sites are living, breathing
projects, so these procedures may evolve over time. Check the current instructions at
each site for clarification.

Your first task is to get an account for the judge of your choice. You will be asked
to give a password governing access to your personal data, specifically your name and
your email address. If you forget your password, clicking the appropriate button will
get it emailed back to you.
Note that the contestant rosters of the two sites are currently kept distinct, but there
is no reason why you should not register for both of them and enjoy their distinct
advantages.

1.1.1 The Programming Challenges Robot Judge

The Programming Challenges website (http://www.programming-challenges.com) provides special features associated with each of the problems in this book. For example,
a description of each challenge appearing in the book is given on the site, along with
downloadable input and output files to eliminate the need for you to type this test
data.
The Programming Challenges site uses a web interface for submission (the Submit-o-Matic) instead of the email interface of the UVa judge. This makes submission much
easier and more reliable, and provides for quicker response.
Each problem in this book has two associated ID numbers, one for each judge. One
advantage of the web interface is that the identiﬁer for the Programming Challenges
site (the PC ID) is not necessary for most submissions. The problem descriptions in
this book have been rewritten for clarity; thus they often diﬀer from the descriptions
on the UVa judge in minor ways. However, the problems they describe are identical.
Thus any solution scored as correct on one judge should be scored correct on the other
as well.
The Programming Challenges site has a special course management interface, which
permits an instructor to maintain a roster of students in each class and see their submissions and results. It also contains a program similarity tester so the instructor can verify
that the solutions each student submits are indeed his or her own work. This makes it
“bad karma” to hunt for solutions on the web or in your classmates’ directories.

1.1.2 The Universidad de Valladolid Robot Judge

All of the problems in this book and many more appear on the Universidad de Valladolid
robot judge http://online-judge.uva.es, the largest collection of programming problems
in the world. We encourage anyone whose appetite has been whetted by our challenges
to continue their studies there.
After registering on the UVa judge, you will receive email containing an ID number
which will uniquely identify your programs to the judge. You will need this ID number
for every solution you submit.
The UVa judge is gradually adopting a web interface but currently uses email submission. Solutions are emailed directly to judge@uva.es after being annotated with enough
information to tell the judge which problem you are trying to solve, who the author is,
and what programming language you are using.

Specifically, each submitted program must contain a line (at any location) with an
@JUDGE_ID: field. Usually, this line is placed inside a comment. For example,
/* @JUDGE_ID: 1000AA 100 C "Dynamic Programming" */
The argument after the @JUDGE_ID: is your user ID (1000AA in the example). This is
followed by the problem number (100 in the example), and then by the language used.
Make sure you use the UVa ID number for all submissions to this judge! Upper- and
lowercase letters are indistinguishable. If you fail to specify the programming language,
the judge will try to auto-detect it – but why play games? Finally, if you have used any
interesting algorithm or method, you may include a note to that eﬀect between quotes,
such as Dynamic Programming in the example above.
Bracketing your program with beginning/end of source comments is a good way to
make sure the judge is not confused by junk appended by your mailer.
/* @BEGIN_OF_SOURCE_CODE */
your program here
/* @END_OF_SOURCE_CODE */
Certain mysterious errors will go away when you do this.

1.1.3 Feedback From the Judge

Students should be aware that both judges are often very picky as to what denotes a
correct solution. It is very important to interpret the problem speciﬁcations properly.
Never make an assumption which is not explicitly stated in the specs. There is absolutely
no reason to believe that the input is sorted, the graphs are connected, or that the
integers used in a problem are positive and reasonably small unless it says so in the
speciﬁcation.
Just like the human judges of the ACM International Collegiate Programming Contest, the online judge provides you with very little feedback about what is wrong with
your program. The judge is likely to return one of the following verdicts:
• Accepted (AC) — Congratulations! Your program is correct, and runs within the
given time and memory limits.
• Presentation Error (PE) — Your program outputs are correct but are not presented in the speciﬁed format. Check for spaces, left/right justiﬁcation, line feeds,
etc.
• Accepted (PE) — Your program has a minor presentation error, but the judge is
letting you oﬀ with a warning. Don’t be concerned, because many problems have
somewhat ambiguous output specifications. Usually the problem is something
as trivial as an extra blank at the end of a line, so stop here and declare victory.

• Wrong Answer (WA) — This should concern you, because your program
returned an incorrect answer to one or more of the judge’s secret test cases. You
have some more debugging to do.
• Compile Error (CE) — The compiler could not ﬁgure out how to compile
your program. The resulting compiler messages will be returned to you. Warning
messages that do not interfere with compilation are ignored by the judge.
• Runtime Error (RE) — Your program failed during execution due to a segmentation fault, ﬂoating point exception, or similar problem. Its dying message will
be sent back to you. Check for invalid pointer references or division by zero.
• Time Limit Exceeded (TL) — Your program took too much time on at least one
of the test cases, so you likely have a problem with eﬃciency. Just because you
ran out of time on one input does not mean you were correct on all the others,
however!
• Memory Limit Exceeded (ML) — Your program tried to use more memory than
the judge’s default settings.
• Output Limit Exceeded (OL) — Your program tried to print too much output.
This usually means it is trapped in an infinite loop.
• Restricted Function (RF) — Your source program tried to use an illegal system
function such as fork() or fopen(). Behave yourself.
• Submission Error (SE) — You did not correctly specify one or more of the
information ﬁelds, perhaps giving an incorrect user ID or problem number.
Just to reiterate: if your program is found guilty of having a wrong answer, the judge
will not show you which test case it failed on, or give you any additional hints as to
why it failed. This is why it is so essential to review the speciﬁcations carefully. Even
when you may be sure that your program is correct, the judge may keep saying no.
Perhaps you are overlooking a boundary case or assuming something which just ain’t
so. Resubmitting the program without change does you absolutely no good. Read the
problem again to make sure it says what you thought it did.
The judge occasionally returns a more exotic verdict which is essentially independent
of your solution. See the appropriate website for details.

1.2 Choosing Your Weapon
What programming language should you use in your battles with the judge? Most likely,
the language which you know best. The judge currently accepts programs written in
C, C++, Pascal, and Java, so your favorite language is probably available. One programming language may well be better than another for a speciﬁc programming task.
However, these problems test general problem-solving skills far more than portability, modularity, or eﬃciency, which are the usual dimensions by which languages are
compared.

[Figure 1.1: line plot of submissions per month (y-axis: Submissions, 0 to 80,000) against time (x-axis: Year, 1997 to 2002), with one curve each for C, C++, Pascal, Java, and All.]

Figure 1.1. Robot Judge Submissions by Programming Language Through December 2002.

1.2.1 Programming Languages

The four languages supported by the judge were designed at diﬀerent times with
diﬀerent goals in mind:
• Pascal — The most popular educational programming language of the 1980s,
Pascal was designed to encourage good structured-programming habits. Its popularity has eroded almost to the point of extinction, but it retains a foothold in
high schools and in Eastern Europe.
• C — The original language of the UNIX operating system, C was designed to
provide experienced programmers with the power to do whatever needs to be
done. This includes the power to hang yourself by invalid pointer references and
invalid type casting. Developments in object-oriented programming during the
1990s led to the new and improved. . .
• C++ — The ﬁrst commercially successful object-oriented language pulled oﬀ
the neat trick of maintaining backward compatibility with C while incorporating
new data abstraction and inheritance mechanisms. C++ became the primary
programming language for teaching and industry during the mid-to-late 1990s,
but now it looks over its shoulder at. . .
• Java — Designed as a language to support mobile programs, Java has special
security mechanisms to avoid common programmer errors such as array out-of-bounds violations and illegal pointer access. It is a full-featured programming
language which can do everything the others can and more.

Lang      Total     AC      PE     WA      CE      RE     TL     ML     OL     RF
C         451447    31.9%   6.7%   35.4%   8.6%    9.1%   6.2%   0.4%   1.1%   0.6%
C++       639565    28.9%   6.3%   36.8%   9.6%    9.0%   7.1%   0.6%   1.0%   0.7%
Java      16373     17.9%   3.6%   36.2%   29.8%   0.5%   8.5%   1.0%   0.5%   2.0%
Pascal    149408    27.8%   5.5%   41.8%   10.1%   6.2%   7.2%   0.4%   0.4%   0.5%
All       1256793   29.7%   6.3%   36.9%   9.6%    8.6%   6.8%   0.5%   1.0%   0.6%

Table 1.1. The Judge’s Verdicts by Programming Language (Through December 2002).

Note that each of the judge’s programming languages has compiler- and system-specific idiosyncrasies. Thus a program which runs on your machine may not run on the
judge. Read the language notes on your judge’s website carefully to minimize trouble,
particularly if you are using Java.
It is interesting to look at which languages people have been using. As of December
2002, over 1,250,000 program submissions have been sent to the robot judge. About
half of them were in C++, with a bit more than a third in C. Only a tiny fraction were
written in Java, but that is not a fair test since the judge did not accept Java programs
until November 2001.
These submissions are broken down by month in Figure 1.1. C proved the most
popular language until late 1999, when C++ surged ahead. It is interesting to note the
annual spike in demand each fall as students train for the ACM International Collegiate
Programming Contest regional competitions. Every year the judge gets busier, as more
and more students seek trial in its court.
It is also interesting to look at the judge’s verdicts by programming language. These
are tabulated in Table 1.1, according to the response codes described in Section 1.1.3.
The verdicts are quite consistent across the board. However, the frequencies of certain
types of errors appear to be language dependent. C++ programs run out of time and
memory more often than C language programs, a sign that C++ is a relative resource
hog. C has a slightly higher acceptance rate than C++, presumably reflecting its popularity at an earlier stage in the judge’s development. Pascal has the lowest rate of
restricted function errors, reﬂecting its origins as a nice, safe language for students to
play with. Java has had far more than its share of compiler errors to date, but it also
crashes much less often than the other languages. Safety is indeed a virtue.
But the basic lesson is that the tools do not make the man. Your language doesn’t
solve the problems – you do.

1.2.2 Reading Our Programs

Several programming examples appear in this book, illustrating programming techniques and providing complete implementations of fundamental algorithms. All of this
code is available at http://www.programming-challenges.com for you to use and experiment with. There is no ﬁner way to debug programs than having them read by several
thousand bright students, so look there for errata and revised solutions.
Our programming examples are implemented in a vanilla subset of C, which we hope
will be understandable by all of our readers with relatively little effort. C itself is
essentially a subset of C++, and its syntax is quite similar to Java. We have taken care to avoid
using weird C-specific constructs, pointer structures and dynamic memory allocation
throughout this book, so what remains should be familiar to users of all four of the
judge’s programming languages.
We provide a few hints about C below which may be helpful in reading our programs:
• Parameter Passing — All parameters in C are passed by call-by-value, meaning
that copies of all arguments are made on function calls. This would seem to suggest
that it is impossible to write functions that have side eﬀects. Instead, C encourages
you to pass a pointer to any argument that you intend to modify within the body
of the function.
Our only use of pointers will be in parameter passing. The pointer to x is
denoted by &x, while the item pointed to by p is denoted *p. Do not get confused
between multiplication and pointer dereferencing!
• Data Types — C supports several basic data types, including int, float, and
char, which should all be self-explanatory. Higher precision ints and floats are
denoted long and double, respectively. All functions return a value of type int
if not otherwise speciﬁed.
• Arrays — C array indices always range from 0 to n − 1, where n is the number
of elements in the array. Thus if we want to start with a ﬁrst index of 1 for
convenience, we had better remember to allocate room for n + 1 elements in the
array. No runtime checking is performed on the validity of array bounds, so such
errors are a common cause of program crashes.
We are not always consistent as to where the ﬁrst element of each array is
located. Starting from 0 is the traditional C style. However, it is sometimes clearer
or easier to start at 1, and we are willing to pay one extra memory location for
the privilege. Try not to be confused when reading our code.
• Operators — C contains a few essential operators which may be mysterious to
some readers. The integer remainder or mod operation is denoted %. The logical-and and logical-or operations which appear in conditional statements are denoted
&& and ||, respectively.
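The hints above can be seen working together in a short sketch. The function names swap and count_even_in_range and the constant NITEMS are illustrative assumptions, not code from the book's distribution. The sketch passes pointers to modify arguments, stores its data in positions 1 through n of an array declared with n + 1 elements, and uses % and && in a condition.

```c
#include <stdio.h>

#define NITEMS 5    /* number of real elements, stored in a[1..NITEMS] */

/* Exchange two longs. C passes arguments by value, so the caller
   hands over pointers (&x, &y) and we modify the items they point to. */
void swap(long *p, long *q)
{
    long tmp = *p;      /* *p is the item pointed to by p */
    *p = *q;
    *q = tmp;
}

/* Count the entries of a[1..n] that are even and lie in [lo, hi].
   The array must have been declared with at least n+1 elements,
   because position 0 is deliberately left unused. */
int count_even_in_range(int a[], int n, int lo, int hi)
{
    int i;              /* array position, running from 1 to n */
    int count = 0;      /* number of qualifying entries seen so far */

    for (i = 1; i <= n; i++)
        if ((a[i] % 2 == 0) && (a[i] >= lo) && (a[i] <= hi))
            count = count + 1;

    return count;
}
```

A caller would write swap(&x, &y) to exchange two long variables, and would declare the array as int a[NITEMS + 1] before calling count_even_in_range(a, NITEMS, lo, hi).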

1.2.3 Standard Input/Output

UNIX programmers are familiar with notions of ﬁlters and pipes–programs that take
one stream of input and produce one stream of output. The output of such programs
is suitable to feed to other programs as input. The paradigm is one of stringing lots of
little programs together rather than producing big, complicated software systems that
try to do everything.
This software tools philosophy has taken somewhat of a beating in recent years due
to the popularity of graphical user interfaces. Many programmers instinctively put a
point-and-click interface on every program. But such interfaces can make it very diﬃcult
to transfer data from one program to another. It is easy to manipulate text output in
another program, but what can you do with an image other than look at it?

#include <stdio.h>

int main() {
    long p,q,r;

    while (scanf("%ld %ld",&p,&q) != EOF) {
        if (q>p) r=q-p;
        else r=p-q;
        printf("%ld\n",r);
    }
    return 0;
}

#include <iostream>
using namespace std;

int main()
{
    long long a,b,c;

    while (cin>>a>>b) {
        if (b>a)
            c=b-a;
        else
            c=a-b;
        cout << c << endl;
    }
    return 0;
}

{$N+}
program acm;
var
  a, b, c : integer;
begin
  while not eof do
  begin
    readln(a, b);
    if b > a then
    begin
      c := b;
      b := a;
      a := c
    end;
    writeln(a - b);
  end
end.

Figure 1.2. Standard Input/Output Examples in C (top), C++ (middle), and Pascal (bottom).

The judge’s I/O standards reﬂect the oﬃcial ACM programming contest rules. Each
program must read the test data from the standard input and print the results to the
standard output. Programs are not allowed to open ﬁles or to execute certain system
calls.
Standard input/output is fairly easy in C, C++, and Pascal. Figure 1.2 provides
a simple example in each language that reads in two numbers per line and reports
the absolute value of their diﬀerence. Note how your favorite language tests for the
end-of-ﬁle terminating condition. Most problems make input processing even easier by
specifying a count of the number of examples or describing a special termination line.
Most languages provide powerful formatted I/O functions. When used properly,
single-line commands can render unnecessary certain painful parsing and formatting
routines written by those who didn’t read the manual.
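As a minimal sketch of reading input that ends with a special termination line, the routine below processes pairs of integers until it sees the line "0 0". The function name process_pairs, the choice of "0 0" as the terminator, and the FILE parameters are assumptions made for illustration; a submitted program would simply pass stdin and stdout.

```c
#include <stdio.h>

/* Read pairs of integers from `in` until the terminating line "0 0",
   writing the absolute value of each difference to `out`.
   Returns the number of pairs processed.
   Assumption: the problem statement promises a final "0 0" line. */
int process_pairs(FILE *in, FILE *out)
{
    long p, q;          /* the two numbers on the current line */
    int cases = 0;      /* number of pairs handled so far */

    /* fscanf returns the number of items converted, so comparing
       against 2 also stops cleanly at end-of-file or on bad input */
    while (fscanf(in, "%ld %ld", &p, &q) == 2) {
        if (p == 0 && q == 0) break;        /* special termination line */
        fprintf(out, "%ld\n", (q > p) ? q - p : p - q);
        cases = cases + 1;
    }

    return cases;
}
```

Calling process_pairs(stdin, stdout) from main gives the usual filter behavior. Taking the streams as parameters is only a convenience for testing the routine on a prepared file.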
Standard input/output is not easy in Java, however. An electronic template for Java
I/O (35 lines long) is available from http://www.programming-challenges.com. Set it up
once and use it for all your entries.
Java programs submitted to the judge must consist of a single source code ﬁle. They
are currently compiled and run as native applications using the gcj compiler, although
this may change in the future. Note that java.io use is restricted, which implies that
some features are not available. Network functions and threads are also unavailable.
However, methods from math, util and other common packages are authorized. All
programs must begin in a static main method in a Main class. Do not use public classes:
even Main must be non-public to avoid compile errors. However, you can add and
instantiate as many classes as needed.
If you ﬁnd yourself using an operating system/compiler which makes it diﬃcult to use
standard input/output, note that the judge always defines the ONLINE_JUDGE symbol
while compiling your program. Thus, you can test for it and redirect the input/output
to a ﬁle when running on your own system.
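One way to act on this, sketched under assumptions: the helper below redirects the standard streams to local files only when ONLINE_JUDGE is not defined. The function name use_local_files and the file names input.txt and output.txt are placeholders chosen for illustration. Because the preprocessor removes the freopen calls when the judge compiles the program, the submitted code never touches a file.

```c
#include <stdio.h>

/* When compiled without the judge's ONLINE_JUDGE symbol, attach the
   standard input and output to local files, so the rest of the
   program can keep using scanf and printf unchanged.
   The file names are placeholders, not anything the judge requires. */
void use_local_files(void)
{
#ifndef ONLINE_JUDGE
    if (freopen("input.txt", "r", stdin) == NULL)
        fprintf(stderr, "Warning: cannot open input.txt\n");
    if (freopen("output.txt", "w", stdout) == NULL)
        fprintf(stderr, "Warning: cannot open output.txt\n");
#endif
}
```

Calling use_local_files() as the first statement of main is enough. On the judge the function body is empty, and the program reads the standard input and writes the standard output as required.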

1.3 Programming Hints
It is not our purpose in this book to teach you how to program, only how to program better. We assume you are familiar with such fundamental concepts as variables,
conditional statements (e.g., if-then-else, case), iteration primitives (e.g., for-do,
while-do, repeat-until), subroutines, and functions. If you are unfamiliar with these
concepts you may have picked up the wrong book, but buy it anyway for later use.
It is important to realize how much power there is in what you already know. In
principle, every interesting algorithm/program can be built from what you learn in a
ﬁrst programming course. The powerful features of modern programming languages are
not really necessary to build interesting things – only to do them in cleaner, better
ways.
To put it another way, one becomes a good writer not by learning additional vocabulary words but by ﬁnding something to say. After one or two programming courses,
you know all the words you need to make yourself understood. The problems in this
book strive to give you something interesting to say.
We oﬀer a few low-level coding hints that are helpful in building quality programs.
The bad examples all come from actual submissions to the robot judge.
• Write the Comments First — Start your programs and procedures by writing a
few sentences explaining what they are supposed to do. This is important because
if you can’t easily write these comments, you probably don’t really understand
what the program does. We ﬁnd it much easier to debug our comments than our
programs, and believe the additional typing is time very well spent. Of course,
with the time pressure of a contest comes a tendency to get sloppy, but do so at
your own risk.
• Document Each Variable — Write a one-line comment for each variable when
you declare it so you know what it does. Again, if you can’t describe it easily, you
don’t know why it is there. You will likely be living with the program for at least
a few debug cycles, and this is a modest investment in readability which you will
come to appreciate.
• Use Symbolic Constants — Whenever you have a constant in your program (input
size, mathematical constant, data structure size, etc.) declare it to be so at the
top of your program. Horribly insidious errors can result from using inconsistent
constants in a program. Of course, the symbolic name helps only if you actually
use it in your program whenever you need the constant. . .
• Use Enumerated Types for a Reason — Enumerated types (i.e., symbolic variables
such as Booleans (true,false)) can be terrific aids to understanding. However, they
are often unnecessary in short programs. Note this example representing the suit
(club, diamond, heart, spade) of a deck of cards:
switch(cursuit) {
    case 'C':
        newcard.suit = C;
        break;
    case 'D':
        newcard.suit = D;
        break;
    case 'H':
        newcard.suit = H;
        break;
    case 'S':
        newcard.suit = S;
    ...
No additional clarity arises from using the enumerated variables (C,D,H,S)
over the original character representation ('C','D','H','S'), only additional
opportunities for error.
• Use Subroutines To Avoid Redundant Code — Read the following program fragment managing the state of a rectangular board, and think how you might shorten
and simplify it:
...
while (c != '0') {
    scanf("%c", &c);
    if (c == 'A') {
        if (row-1 >= 0) {
            temp = b[row-1][col];
            b[row-1][col] = ' ';
            b[row][col] = temp;
            row = row-1;
        }
    }
    else if (c == 'B') {
        if (row+1 <= BOARDSIZE-1) {
            temp = b[row+1][col];
            b[row+1][col] = ' ';
            b[row][col] = temp;
            row = row+1;
        }
    }
...
In the full program, there were four blocks of three lines each moving a value to a
neighboring cell. Mistyping a single + or − would have lethal consequences. Much
safer would be to write a single move-swap routine and call it with the proper
arguments.

• Make Your Debugging Statements Meaningful — Learn to use the debugging
environment on your system. This will enable you to stop execution at a given
statement or condition, so you can see what the values of all associated variables
are. This is usually faster and easier than typing in a bunch of print statements.
But if you are going to insert debugging print statements, make them say something. Print out all relevant variables, and label the printed quantity with the
variable name. Otherwise it is easy to get lost in your own debugging output.
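A rough sketch of the single move routine suggested above follows. It assumes, as the fragment implies, a square board b with one blank cell whose position is kept in row and col. The board size, the function names move_blank and do_command, and the command letters 'L' and 'R' are assumptions, since the original fragment shows only 'A' and 'B'. The optional debugging line labels every value it prints.

```c
#include <stdio.h>

#define BOARDSIZE 5     /* board is BOARDSIZE x BOARDSIZE (assumed size) */

char b[BOARDSIZE][BOARDSIZE];   /* the board; exactly one cell holds ' ' */
int row, col;                   /* current position of the blank cell */

/* Slide the blank by (dr, dc), exchanging it with the neighbor it
   lands on. Returns 1 if the move was made, 0 if it would leave
   the board. All four directions share this one piece of code. */
int move_blank(int dr, int dc)
{
    int r = row + dr;   /* destination row of the blank */
    int c = col + dc;   /* destination column of the blank */

#ifdef DEBUG
    fprintf(stderr, "move_blank: row=%d col=%d dr=%d dc=%d\n",
            row, col, dr, dc);
#endif

    if (r < 0 || r >= BOARDSIZE || c < 0 || c >= BOARDSIZE)
        return 0;

    b[row][col] = b[r][c];
    b[r][c] = ' ';
    row = r;
    col = c;
    return 1;
}

/* Translate a command letter into a call of the single move routine.
   'A' and 'B' follow the fragment above; 'L' and 'R' are assumed. */
int do_command(char cmd)
{
    switch (cmd) {
        case 'A': return move_blank(-1, 0);    /* up */
        case 'B': return move_blank(+1, 0);    /* down */
        case 'L': return move_blank(0, -1);    /* left */
        case 'R': return move_blank(0, +1);    /* right */
        default:  return 0;                    /* unknown command */
    }
}
```

The sign of each move now appears exactly once, in the table of calls inside do_command, so a mistyped + or - is easy to spot.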
Most computer science students are now well-versed in object-oriented programming,
a software engineering philosophy designed to construct reusable software components
and exploit them. Object-oriented programming is useful to build large, reusable programs. However, most of the programming challenge problems in this book are designed
to be solved by short, clever programs. The basic assumption of object-oriented programming just does not apply in this domain, so deﬁning complicated new objects (as
opposed to using predeﬁned objects) is likely to be a waste of time.
The trick to successful programming is not abandoning style, but using one
appropriate to the scale of the job.

1.4 Elementary Data Types
Simple data structures such as arrays have an important advantage over more sophisticated data structures such as linked lists: they are simple. Many kinds of errors in
pointer-based structures simply cannot happen with static arrays.
The sign of a mature professional is keeping the simple jobs simple. This is particularly
challenging for those who are just learning a new subject. Medical students provide a
classic example of this problem. After sitting through a few lectures on obscure tropical
diseases, a young doctor worries that any patient with a sniﬄe and a rash might have
the Ebola virus or bubonic plague, while a more experienced physician just sends them
home with a bottle of aspirin.
Likewise, you may have recently learned about balanced binary search trees, exception
handling, parallel processing, and various models of object inheritance. These are all
useful and important subjects. But they are not necessarily the best way to get a correct
program working for a simple job.
So, yes, pointer-based structures are very powerful if you do not know the maximum
size of the problem in advance, or in supporting fast search and update operations.
However, many of the problems you will be solving here have maximum sizes speciﬁed.
Further, the robot judge typically allows several seconds for your job to complete, which
is a lot of computation time when you stop to think about it. You don’t get extra points
for ﬁnishing faster.
So what is the simple, mature approach to data structures? First, be familiar with
the basic primitive data types built into your programming language. In principle, you
can build just about anything you want from just these:
• Arrays — This workhorse data type permits access to data by position, not
content, just like the house numbers on a street permit access by address, not
name. They are used to store sequences of single-type elements such as integers,
reals, or compound objects such as records. Arrays of characters can be used to
represent text strings, while arrays of text strings can be used to represent just
about anything.
Sentinels can be a useful technique to simplify array-based programming. A
sentinel is a guard element, implicitly checking that the program does not run
beyond the bounds of the array without performing an explicit test. Consider the
case of inserting element x into the proper position among n elements in a sorted
array a, stored in a[1] through a[n]. We can explicitly test each step to see whether
we have hit the bottom of the array:
i = n;
while ((i>=1) && (a[i]>=x)) {
    a[i+1] = a[i];
    i=i-1;
}
a[i+1] = x;
or, we can make sure that fake element a[0] is smaller than anything it will
encounter:
i = n;
a[0] = - MAXINT;
while (a[i] >= x) {
    a[i+1] = a[i];
    i=i-1;
}
a[i+1] = x;
Proper use of sentinels, and making sure that your
array is a little larger than it presumably needs to be, can help avoid many
boundary errors.
• Multidimensional Arrays — Rectangular grid structures such as chessboards and
images come to mind first when thinking about two-dimensional arrays, but
more generally they can be used to group together homogeneous data records.
For example, an array of n points in the x − y plane can be thought of as an
n × 2 array, where the second index j (0 or 1) of A[i][j] dictates whether we are
referring to the x or y coordinate of the point.
• Records — These are used to group together heterogeneous data records. For example, an array of people-records can lump together people’s names, ID numbers,
heights, and weights into a simple package. Records are important for conceptual
clarity in large programs, but such ﬁelds can often be harmlessly represented using
separate arrays in programs of modest size.
Whether it is better to use records or multidimensional arrays in a problem
is not always a clear-cut decision. Think of the problem of representing points in
the x − y plane discussed above. The obvious representation would be a record or
structure such as:
struct point {
int x, y;
};
instead of as a two-element array. A big plus for records is that the notation p.x
and p.y is more akin to our natural notation for working with points. However, a
disadvantage of the record representation is that you cannot loop over individual
variables as you can elements in an array.
Suppose you wanted to change a geometric program to work with three-dimensional points instead of two, or even in arbitrary dimensions. Sure, you can
easily add extra ﬁelds to the record, but every place where you did calculations on
x and y you now must replicate them for z. But by using the array representation,
changing distance computations from two to three dimensions can be as simple
as changing a constant:
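#include <math.h>       /* for sqrt() */
typedef int point[DIMENSION];
double distance(point a, point b)
{
    int i;
    double d=0.0;
    for (i=0; i<DIMENSION; i++)
        d = d + (a[i]-b[i]) * (a[i]-b[i]);
    return( sqrt(d) );
}
...
void init_queue(queue *q)
{
    q->first = 0;
    q->last = QUEUESIZE-1;
    q->count = 0;
}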
void enqueue(queue *q, int x)
{
    if (q->count >= QUEUESIZE)
        printf("Warning: queue overflow enqueue x=%d\n",x);
    else {
        q->last = (q->last+1) % QUEUESIZE;
        q->q[ q->last ] = x;
        q->count = q->count + 1;
    }
}

int dequeue(queue *q)
{
    int x = 0;      /* value returned if the queue turns out to be empty */

    if (q->count <= 0) printf("Warning: empty queue dequeue.\n");
    else {
        x = q->q[ q->first ];
        q->first = (q->first+1) % QUEUESIZE;
        q->count = q->count - 1;
    }
    return(x);
}

int empty(queue *q)
{
    if (q->count <= 0) return (TRUE);
    else return (FALSE);
}
Queues are one of the few data structures which are easier to program using linked
lists than arrays, because they eliminate the need to test for the wrap-around condition.
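To make the comparison concrete, here is a minimal linked-list queue, offered as a sketch rather than as the book's implementation. The type and function names (node, list_queue, list_enqueue, and so on) are assumptions, chosen so they do not clash with the array-based routines above. No index ever wraps around: enqueueing appends a node at the tail and dequeueing removes one from the head.

```c
#include <stdio.h>
#include <stdlib.h>

typedef struct node {
    int item;               /* the stored value */
    struct node *next;      /* next node toward the tail, or NULL */
} node;

typedef struct {
    node *first;            /* head: the next item to leave */
    node *last;             /* tail: the most recent arrival */
    int count;              /* number of items in the queue */
} list_queue;

void init_list_queue(list_queue *q)
{
    q->first = NULL;
    q->last = NULL;
    q->count = 0;
}

void list_enqueue(list_queue *q, int x)
{
    node *n = malloc(sizeof(node));

    if (n == NULL) {
        printf("Warning: out of memory enqueue x=%d\n", x);
        return;
    }
    n->item = x;
    n->next = NULL;
    if (q->last == NULL) q->first = n;      /* queue was empty */
    else q->last->next = n;
    q->last = n;
    q->count = q->count + 1;
}

int list_dequeue(list_queue *q)
{
    int x = 0;              /* value returned if the queue is empty */
    node *n = q->first;

    if (n == NULL) {
        printf("Warning: empty queue dequeue.\n");
        return x;
    }
    x = n->item;
    q->first = n->next;
    if (q->first == NULL) q->last = NULL;   /* queue became empty */
    free(n);
    q->count = q->count - 1;
    return x;
}
```

The price is dynamic memory allocation, which the rest of this book's code avoids. When a bound on the queue size is known, the array version remains a reasonable choice.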

2.1.3 Dictionaries

Dictionaries permit content-based retrieval, unlike the position-based retrieval of stacks
and queues. They support three primary operations –
• Insert(x,d) — Insert item x into dictionary d.
• Delete(x,d) — Remove item x (or the item pointed to by x) from dictionary d.
• Search(k,d) — Return an item with key k if one exists in dictionary d.
A data structures course may well present a dozen diﬀerent ways to implement dictionaries, including sorted/unsorted linked lists, sorted/unsorted arrays, and a forest
full of random, splay, AVL, and red-black trees – not to mention all the variations on
hashing.
The primary issue in algorithm analysis is performance, namely, achieving the best
possible trade-oﬀ between the costs of these three operations. But what we usually want
in practice is the simplest way to get the job done under the given time constraints.
The right answer depends on how much the contents of your dictionary change over the
course of execution:
• Static Dictionaries — These structures get built once and never change. Thus
they need to support search, but not insertion or deletion.
The right answer for static dictionaries is typically an array. The only real
question is whether to keep it sorted, in order to use binary search to provide
fast membership queries. Unless you have tight time constraints, it probably isn’t
worth using binary search until n > 100 or so. You might even get away with
sequential search to n = 1,000 or more, provided you will not be doing too many
searches.
Sorting algorithms and binary search always prove harder to debug than they
should. Library sort/search routines are available for C, C++, and Java, and will
be presented in Chapter 4.
• Semi-dynamic Dictionaries — These structures support insertion and search
queries, but not deletion. If we know an upper bound on the number of elements
to be inserted we can use an array, but otherwise we must use a linked structure.

Hash tables are excellent dictionary data structures, particularly if deletion
need not be supported. The idea is to apply a function to the search key so we can
determine where the item will appear in an array without looking at the other
items. To make the table of reasonable size, we must allow for collisions, two
distinct keys mapped to the same location.
The two components to hashing are (1) deﬁning a hash function to map keys
to integers in a certain range, and (2) setting up an array as big as this range, so
that the hash function value can specify an index.
The basic hash function converts the key to an integer, and takes the value
of this integer mod the size of the hash table. Selecting a table size to be a prime
number (or at least avoiding obvious composite numbers like 1,000) is helpful to
avoid trouble. Strings can be converted to integers by using the letters to form a
base “alphabet-size” number system. To convert “steve” to a number, observe that
e is the 5th letter of the alphabet, s is the 19th letter, t is the 20th letter, and v is
the 22nd letter. Thus “steve” ⇒ $26^4 \times 19 + 26^3 \times 20 + 26^2 \times 5 + 26^1 \times 22 + 26^0 \times 5 = 9{,}038{,}021$.
The first, last, or middle ten characters or so will likely suffice for
a good index. Tricks on how to do the modular arithmetic eﬃciently will be
discussed in Chapter 7.
The absence of deletion makes open addressing a nice, easy way to resolve
collisions. In open addressing, we use a simple rule to decide where to put a new
item when the desired space is already occupied. Suppose we always put it in
the next unoccupied cell. On searching for a given item, we go to the intended
location and search sequentially. If we ﬁnd an empty cell before we ﬁnd the item,
it does not exist anywhere in the table.
Deletion in an open addressing scheme is very ugly, since removing one element can break a chain of insertions, making some elements inaccessible. The
key to eﬃciency is using a large-enough table that contains many holes. Don’t be
cheap in selecting your table size or else you will pay the price later.
• Fully Dynamic Dictionaries — Hash tables are great for fully dynamic dictionaries as well, provided we use chaining as the collision resolution mechanism.
Here we associate a linked list with each table location, so insertion, deletion, and
query reduce to the same problem in linked lists. If the hash function does a good
job the m keys will be distributed uniformly in a table of size n, so each list will
be short enough to search quickly.
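The hashing scheme described above, with a base-26 string hash and open addressing, can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the book's own code: the names `hash_string`, `ht_insert`, `ht_search`, `TABLE_SIZE`, and `MAX_KEY` are hypothetical, keys are assumed to be non-empty lowercase words shorter than `MAX_KEY`, and deletion is deliberately unsupported.

```c
#include <string.h>

#define TABLE_SIZE 1009   /* a prime table size (assumed for this sketch) */
#define MAX_KEY    32     /* longest stored key, including '\0' (assumed) */
#define ALPHABET   26     /* lowercase letters a..z only (assumed) */

static char table[TABLE_SIZE][MAX_KEY];   /* an empty string marks a hole */

/* Treat a lowercase string as a base-26 number (a=1, ..., z=26), reducing
   mod m after every digit so the running value stays small. */
unsigned long hash_string(const char *s, unsigned long m)
{
    unsigned long h = 0;
    for (; *s != '\0'; s++)
        h = (h * ALPHABET + (unsigned long)(*s - 'a' + 1)) % m;
    return h;
}

/* Insert key, probing sequentially from its home cell.
   Returns the cell used, or -1 if the table is full. */
int ht_insert(const char *key)
{
    int start = (int) hash_string(key, TABLE_SIZE);
    int i, pos;
    for (i = 0; i < TABLE_SIZE; i++) {
        pos = (start + i) % TABLE_SIZE;
        if (table[pos][0] == '\0') {            /* next unoccupied cell */
            strncpy(table[pos], key, MAX_KEY - 1);
            table[pos][MAX_KEY - 1] = '\0';
            return pos;
        }
        if (strcmp(table[pos], key) == 0)       /* already present */
            return pos;
    }
    return -1;
}

/* Search from the home cell; an empty cell proves the key is absent.
   Returns the cell holding key, or -1 if it is not in the table. */
int ht_search(const char *key)
{
    int start = (int) hash_string(key, TABLE_SIZE);
    int i, pos;
    for (i = 0; i < TABLE_SIZE; i++) {
        pos = (start + i) % TABLE_SIZE;
        if (table[pos][0] == '\0') return -1;
        if (strcmp(table[pos], key) == 0) return pos;
    }
    return -1;
}
```

With a modulus larger than the key's value, `hash_string("steve", 100000000UL)` reproduces the 9,038,021 computed above. Taking the remainder after every character is one simple way to keep the arithmetic from overflowing.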

2.1.4

Priority Queues

Priority queues are data structures on sets of items supporting three operations –
• Insert(x,p) — Insert item x into priority queue p.
• Maximum(p) — Return the item with the largest key in priority queue p.
• ExtractMax(p) — Return and remove the item with the largest key in p.

Priority queues are used to maintain schedules and calendars. They govern who goes
next in simulations of airports, parking lots, and the like, whenever we need to schedule
events according to a clock. In a human life simulation, it may be most convenient to
determine when someone will die immediately after they are born. We can then stick
this date in a priority queue so as to be reminded when to bury them.
Priority queues are used to schedule events in the sweep-line algorithms common to
computational geometry. Typically, we use the priority queue to store the points we
have not yet encountered, ordered by x-coordinate, and push the line forward one step
at a time.
The most famous implementation of priority queues is the binary heap, which can
be eﬃciently maintained in either a top-down or bottom-up manner. Heaps are very
slick and eﬃcient, but may be tricky to get right under time pressure. Far simpler
is maintaining a sorted array, particularly if you will not be performing too many
insertions.
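The sorted-array alternative mentioned above might look like the following sketch. The type `sorted_pq`, the function names, and the capacity `PQ_SIZE` are assumptions made for illustration. Insertion costs linear time because larger keys must be shifted, which is why this approach suits programs that do few insertions.

```c
#define PQ_SIZE 1000   /* capacity of the queue (assumed for this sketch) */

typedef struct {
    int q[PQ_SIZE];    /* keys kept in increasing order */
    int n;             /* number of keys currently stored */
} sorted_pq;

void pq_init(sorted_pq *p) { p->n = 0; }

int pq_empty(const sorted_pq *p) { return p->n == 0; }

/* Insert x, shifting larger keys one slot right to keep the array sorted.
   Returns 1 on success, 0 if the queue is full. */
int pq_insert(sorted_pq *p, int x)
{
    int i;
    if (p->n >= PQ_SIZE) return 0;
    for (i = p->n; i > 0 && p->q[i - 1] > x; i--)
        p->q[i] = p->q[i - 1];
    p->q[i] = x;
    p->n++;
    return 1;
}

/* The largest key always sits at the end of the array.
   Both routines assume the caller has checked pq_empty() first. */
int pq_maximum(const sorted_pq *p) { return p->q[p->n - 1]; }

int pq_extract_max(sorted_pq *p) { return p->q[--p->n]; }
```

Both `pq_maximum` and `pq_extract_max` run in constant time here, so the whole cost of the structure is concentrated in `pq_insert`.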

2.1.5

Sets

Sets (or more strictly speaking subsets) are unordered collections of elements drawn
from a given universal set U . Set data structures get distinguished from dictionaries
because there is at least an implicit need to encode which elements from U are not in
the given subset.
The basic operations on subsets are —
• Member(x,S) — Is an item x an element of subset S?
• Union(A,B) — Construct subset A ∪ B of all elements in subset A or in subset
B.
• Intersection(A,B) — Construct subset A ∩ B of all elements in subset A and in
subset B.
• Insert(x,S), Delete(x,S) — Insert/delete element x into/from subset S.
For sets of a large or unbounded universe, the obvious solution is representing a
subset using a dictionary. Using sorted dictionaries makes union and intersection operations much easier, basically reducing the problem to merging two sorted dictionaries.
An element is in the union if it appears at least once in the merged list, and in the
intersection if it appears exactly twice.
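As a rough illustration of the merging idea, the sketch below computes union and intersection of two sets stored as sorted integer arrays without duplicates. The function names and the array representation are assumptions. An element seen in both inputs during the merge goes to the intersection, and every distinct element goes to the union.

```c
/* Intersection of sorted, duplicate-free arrays a and b.
   Writes the result to out (room for min(na, nb) items) and returns its size. */
int sorted_intersection(const int a[], int na, const int b[], int nb, int out[])
{
    int i = 0, j = 0, k = 0;
    while (i < na && j < nb) {
        if (a[i] < b[j]) i++;
        else if (a[i] > b[j]) j++;
        else { out[k++] = a[i]; i++; j++; }   /* appears in both inputs */
    }
    return k;
}

/* Union of sorted, duplicate-free arrays a and b.
   Writes the result to out (room for na + nb items) and returns its size. */
int sorted_union(const int a[], int na, const int b[], int nb, int out[])
{
    int i = 0, j = 0, k = 0;
    while (i < na && j < nb) {
        if (a[i] < b[j]) out[k++] = a[i++];
        else if (a[i] > b[j]) out[k++] = b[j++];
        else { out[k++] = a[i]; i++; j++; }   /* keep a single copy */
    }
    while (i < na) out[k++] = a[i++];          /* leftovers from a */
    while (j < nb) out[k++] = b[j++];          /* leftovers from b */
    return k;
}
```

Each routine makes a single pass over both arrays, so the cost is linear in the total number of elements.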
For sets drawn from small, unchanging universes, bit vectors provide a convenient
representation. An n-bit vector or array can represent any subset S from an n-element
universe. Bit i will be 1 iﬀ i ∈ S. Element insertion and deletion operations simply
ﬂip the appropriate bit. Intersection and union are done by “and-ing” or “or-ing” the
corresponding bits together. Since only one bit is used per element, bit vectors can
be space eﬃcient for surprisingly large values of |U |. For example, an array of 1,000
standard four-byte integers can represent any subset on 32,000 elements.
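A bit-vector set along these lines can be sketched as follows. The type `bitset`, the function names, and the universe size `UNIVERSE` are hypothetical choices for illustration, and elements are assumed to be integers in the range 0 to `UNIVERSE - 1`. The sketch sets and clears bits explicitly rather than toggling them, so inserting an element twice is harmless.

```c
#include <limits.h>
#include <string.h>

#define UNIVERSE  32000                                /* |U|, assumed */
#define WORD_BITS (sizeof(unsigned int) * CHAR_BIT)    /* bits per array word */
#define NWORDS    ((UNIVERSE + WORD_BITS - 1) / WORD_BITS)

typedef struct { unsigned int w[NWORDS]; } bitset;

/* Make s the empty set. */
void bs_clear(bitset *s) { memset(s->w, 0, sizeof s->w); }

/* Turn on the bit for element i. */
void bs_insert(bitset *s, int i) { s->w[i / WORD_BITS] |= 1u << (i % WORD_BITS); }

/* Turn off the bit for element i. */
void bs_delete(bitset *s, int i) { s->w[i / WORD_BITS] &= ~(1u << (i % WORD_BITS)); }

/* Return 1 if element i is in s, 0 otherwise. */
int bs_member(const bitset *s, int i)
{
    return (int) ((s->w[i / WORD_BITS] >> (i % WORD_BITS)) & 1u);
}

/* out = a OR b, one word at a time. */
void bs_union(const bitset *a, const bitset *b, bitset *out)
{
    size_t k;
    for (k = 0; k < NWORDS; k++) out->w[k] = a->w[k] | b->w[k];
}

/* out = a AND b, one word at a time. */
void bs_intersection(const bitset *a, const bitset *b, bitset *out)
{
    size_t k;
    for (k = 0; k < NWORDS; k++) out->w[k] = a->w[k] & b->w[k];
}
```

On a machine with four-byte integers this uses 1,000 words for the 32,000-element universe, matching the figure quoted above.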

2.2 Object Libraries
Users of modern object-oriented languages such as C++ and Java have implementations
of these basic data structures available using standard library classes.

2.2.1

The C++ Standard Template Library

A C library of general-purpose data structures, like stacks and queues, cannot really
exist because functions in C can’t tell the type of their arguments. Thus we would have
to define separate routines such as push_int() and push_char() for every possible data
type. Further, such an approach couldn’t generalize to construct stacks on user-deﬁned
data types.
Templates are C++’s mechanism to deﬁne abstract objects which can be parameterized by type. The C++ Standard Template Library (STL) provides implementations of
all the data structures deﬁned above and much more. Each data object must have the
type of its elements ﬁxed (i.e., templated) at compilation time, so
#include <stack>
using namespace std;

stack<int> S;
stack<char> T;
declares two stacks with diﬀerent element types.
Good references on STL include [MDS01] and http://www.sgi.com/tech/stl/. Brief
descriptions of our featured data structures follow below –
• Stack — Methods include S.push(), S.top(), S.pop(), and S.empty(). Top
returns but does not remove the element on top; while pop removes but does
not return the element. Thus always top on pop [Seu63]. Linked implementations
ensure the stack will never be full.
• Queue — Methods include Q.front(), Q.back(), Q.push(), Q.pop(), and
Q.empty() and have the same idiosyncrasies as stack.
• Dictionaries — STL contains a wide variety of containers, including hash_map,
a hashed associative container binding keys to data items. Methods include
H.erase(), H.find(), and H.insert().
• Priority Queues — Declared as, for example, priority_queue<int> Q;, methods include
Q.top(), Q.push(), Q.pop(), and Q.empty().
• Sets — Sets are represented as sorted associative containers, declared as, for example, set<int> S;. Set algorithms include set_union and set_intersection, as
well as other standard set operators.

2.2.2

The Java java.util Package

Useful standard Java objects appear in the java.util package. Almost all of
java.util is available on the judge, except for a few libraries which provide
an unhealthy degree of power to the contestant. For details, see Sun’s website
http://java.sun.com/products/jdk.
The collection of all Java classes deﬁnes an inheritance hierarchy, which means that
subclasses are built on superclasses by adding methods and variables. As you move
up the inheritance hierarchy, classes become more general and more abstract. The sole
purpose of an abstract class is to provide an appropriate superclass from which other
classes may inherit its interface and/or implementation. The abstract classes can only
declare objects, not instantiate them. Classes from which objects can be instantiated
are called concrete classes.
If you want to declare a general data structure object, declare it with an interface or
abstract class and instantiate it using a concrete class. For example,
Map myMap = new HashMap();
In this case, myMap is treated as an object of Map class. Otherwise, you can just declare
and instantiate an object with concrete class, like
HashMap myMap = new HashMap();
Here, myMap is only an object of HashMap class.
To use java.util include import java.util.*; at the beginning of your program
to import the whole package, or replace the star to import a speciﬁc class, like import
java.util.HashMap;.
Appropriate implementations of the basic data structures include —
Data Structure    Abstract class    Concrete class           Methods
Stack             No interface      Stack                    pop, push, empty, peek
Queue             List              ArrayList, LinkedList    add, remove, clear
Dictionaries      Map               HashMap, Hashtable       put, get, contains
Priority Queue    SortedMap         TreeMap                  firstKey, lastKey, headMap
Sets              Set               HashSet                  add, remove, contains

2.3 Program Design Example: Going to War
In the children’s card game War, a standard 52-card deck is dealt to two players (1 and
2) such that each player has 26 cards. Players do not look at their cards, but keep them
in a packet face down. The object of the game is to win all the cards.
Both players play by turning their top cards face up and putting them on the table.
Whoever turned the higher card takes both cards and adds them (face down) to the
bottom of their packet. Cards rank as usual from high to low: A, K, Q, J, T, 9, 8, 7,
6, 5, 4, 3, 2. Suits are ignored. Both players then turn up their next card and repeat.
The game continues until one player wins by taking all the cards.
When the face up cards are of equal rank there is a war. These cards stay on the
table as both players play the next card of their pile face down and then another card
face up. Whoever has the higher of the new face up cards wins the war, and adds all
six cards to the bottom of his or her packet. If the new face up cards are equal as well,
the war continues: each player puts another card face down and one face up. The war
goes on like this as long as the face up cards continue to be equal. As soon as they are
diﬀerent, the player with the higher face up card wins all the cards on the table.
If someone runs out of cards in the middle of a war, the other player automatically
wins. The cards are added to the back of the winner’s hand in the exact order they
were dealt, speciﬁcally 1’s ﬁrst card, 2’s ﬁrst card, 1’s second card, etc.
As anyone with a five-year-old nephew knows, a game of War can take a long time to
ﬁnish. But how long? Your job is to write a program to simulate the game and report
the number of moves.
————————————
Solution starts below
————————————
How do we read such a problem description? Keep the following in mind as you
design, code, test, and debug your solutions:
• Read the Problem Carefully — Read each line of the problem statement carefully,
and reread it when the judge complains about an error. Skim the passage ﬁrst,
since much of the description may be background/history that does not impact
the solution. Pay particular attention to the input and output descriptions, and
sample input/output, but . . .
• Don’t Assume — Reading and understanding speciﬁcations is an important part
of contest (and real-life) programming. Speciﬁcations often leave unspeciﬁed traps
to fall in.
Just because certain examples exhibit some nice property does not mean that
all the test data will. Be on the lookout for unspeciﬁed input orders, unbounded
input numbers, long line lengths, negative numbers, etc. Any input which is not
explicitly forbidden should be assumed to be permitted!
• Not So Fast, Louie — Eﬃciency is often not an important issue, unless we are
using exponential algorithms for problems where polynomial algorithms suﬃce.
Don’t worry too much about eﬃciency unless you have seen or can predict trouble. Read the speciﬁcation to learn the maximum possible input size, and decide
whether the most straightforward algorithm will suﬃce on such inputs.
Even though a game of war seems interminable when you are playing with
your nephew (and in fact can go on forever), we see no reason to worry about
eﬃciency with this particular problem description.

2.4 Hitting the Deck
What is the best data structure to represent a deck of cards? The answer depends upon
what you are going to do with them. Are you going to shuﬄe them? Compare their
values? Search for patterns in the deck? Your intended actions deﬁne the operations of
the data structure.
The primary action we need from our deck is dealing cards out from the top and
adding them to the rear of our deck. Thus it is natural to represent each player’s hand
using the FIFO queue we deﬁned earlier.
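For reference, a hand of cards can be held in a small circular-array queue like the sketch below. The operation names mirror those used by the war routine (`empty`, `enqueue`, `dequeue`, and a `count` field), but the layout shown here is an assumption for illustration and is not necessarily identical to the queue defined earlier.

```c
#define QUEUESIZE 52   /* a hand never holds more than the whole deck */

typedef struct {
    int q[QUEUESIZE];  /* body of the queue, used circularly */
    int first;         /* position of the front (top) card */
    int count;         /* number of cards in the queue */
} queue;

void init_queue(queue *q) { q->first = 0; q->count = 0; }

int empty(const queue *q) { return q->count == 0; }

/* Add card x to the rear of the queue. Returns 1 on success, 0 if full. */
int enqueue(queue *q, int x)
{
    if (q->count >= QUEUESIZE) return 0;
    q->q[(q->first + q->count) % QUEUESIZE] = x;
    q->count++;
    return 1;
}

/* Remove and return the front card. The caller must check empty() first. */
int dequeue(queue *q)
{
    int x = q->q[q->first];
    q->first = (q->first + 1) % QUEUESIZE;
    q->count--;
    return x;
}
```

Dealing a card is a `dequeue` from the front, and winning cards are appended with `enqueue`, so both operations take constant time.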

But there is an even more fundamental problem. How do we represent each card?
Remember that cards have both suits (clubs, diamonds, hearts, and spades) and values
(ace, 2–10, jack, queen, king). We have several possible choices. We can represent each
card by a pair of characters or numbers specifying the suit and value. In the war problem,
we might even throw away the suit entirely – but such thinking can get us in trouble.
What if we had to print the winning card, or needed strong evidence that our queue
implementation was working perfectly? An alternate approach might represent each
card with a distinct integer, say 0 to 51, and map back and forth between numbers
and cards as needed.
The primary operation in war is comparing cards by their face value. This is tricky
to do with the ﬁrst character representation, because we must compare according to
the historical but arbitrary ordering of face cards. Ad hoc logic might seem necessary
to deal with this problem.
Instead, we will present the mapping approach as a generally useful programming
technique. Whenever we can create a numerical ranking function and a dual unranking
function which hold over a particular set of items s ∈ S, we can represent any item
by its integer rank. The key property is that s = unrank(rank(s)). Thus the ranking
function can be thought of as a hash function without collisions.
How can we rank and unrank playing cards? We order the card values from lowest
to highest, and note that there are four distinct cards of each value. Multiplication and
division are the key to mapping them from 0 to 51:
#define NCARDS  52      /* number of cards */
#define NSUITS  4       /* number of suits */

char values[] = "23456789TJQKA";
char suits[] = "cdhs";

int rank_card(char value, char suit)
{
        int i,j;        /* counters */

        for (i=0; i<(NCARDS/NSUITS); i++)
                if (values[i]==value)
                        for (j=0; j ...

...

                        if (value(x) > value(y))
                                clear_queue(&c,a);
                        else if (value(x) < value(y))
                                clear_queue(&c,b);
                        else if (value(y) == value(x))
                                inwar = TRUE;
                }
        }

        if (!empty(a) && empty(b))
                printf("a wins in %d steps \n",steps);
        else if (empty(a) && !empty(b))
                printf("b wins in %d steps \n",steps);
        else if (!empty(a) && !empty(b))
                printf("game tied after %d steps, |a|=%d |b|=%d \n",
                        steps,a->count,b->count);
        else
                printf("a and b tie in %d steps \n",steps);
}

/* move every card from queue a to the rear of queue b */
void clear_queue(queue *a, queue *b)
{
        while (!empty(a)) enqueue(b,dequeue(a));
}
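The ranking and unranking scheme described in Section 2.4 (order the values from lowest to highest, with four cards per value) can be sketched as follows. This is one plausible completion under stated assumptions rather than the original listing: the helper names `face_rank` and `count_rank_errors` are hypothetical, and returning -1 for an unrecognized card is a choice made here.

```c
#define NCARDS 52   /* number of cards */
#define NSUITS 4    /* number of suits */

static const char values[] = "23456789TJQKA";   /* lowest to highest */
static const char suits[]  = "cdhs";

/* Rank: map a (value, suit) pair to an integer 0..51 by multiplication.
   Returns -1 for a card that is not in the deck (an assumption here). */
int rank_card(char v, char s)
{
    int i, j;
    for (i = 0; i < NCARDS / NSUITS; i++)
        if (values[i] == v)
            for (j = 0; j < NSUITS; j++)
                if (suits[j] == s)
                    return i * NSUITS + j;
    return -1;
}

/* Unrank: recover the suit and value characters by remainder and division. */
char suit(int card)  { return suits[card % NSUITS]; }
char value(int card) { return values[card / NSUITS]; }

/* Numeric face rank (0 for a deuce up to 12 for an ace), safe to compare. */
int face_rank(int card) { return card / NSUITS; }

/* Invariant check: counts cards i for which rank(unrank(i)) != i.
   A result of 0 means the two mappings are inverses of each other. */
int count_rank_errors(void)
{
    int i, errors = 0;
    for (i = 0; i < NCARDS; i++)
        if (rank_card(value(i), suit(i)) != i)
            errors++;
    return errors;
}
```

Because ranks are assigned in increasing order of face value, comparing `face_rank` results compares cards the way War requires, without any special-case logic for face cards.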

2.7 Testing and Debugging
Debugging can be particularly frustrating with the robot judge, since you never get to
see the test case on which your program failed. Thus you can’t fake your way around a
problem – you have to do it right.
This makes it very important to test your program systematically before submission.
Catching stupid errors yourself will save you time in the long run, particularly in contest
situations where incorrect submissions are penalized. Several ideas go into designing
good test ﬁles:
• Test the Given Input — Most problem speciﬁcations include sample input and
output. Often (but not always) they match each other. Getting the sample input
right is a necessary but not suﬃcient condition for correctness.
• Test Incorrect Input — If the problem speciﬁcation tells you that your program
must take action on illegal inputs, be sure to test for such problematic instances.

• Test Boundary Conditions — Many bugs in programs are due to “oﬀ-by-one”
errors. Explicitly test your code for such conditions as empty input, one item, two
items, and values which are zero.
• Test Instances Where You Know the Correct Answer — A critical part of developing a good test case is being sure you know what the right answer is. Your test
cases should focus on simple enough instances that you can solve them by hand.
It is easy to be fooled into accepting plausible-looking output without completely
analyzing the desired behavior.
• Test Big Examples Where You Don’t Know the Correct Answer — Usually only
small examples are solvable by hand. This makes it diﬃcult to validate your
program on larger inputs. Try a few easily constructed large instances, such as
random data or the numbers 1 to n inclusive, just to make certain the program
doesn’t crash or do anything stupid.
Testing is the art of revealing bugs. Debugging is the art of exterminating them. We
designed this programming problem and coded it up ourselves for the purpose of this
example. Yet getting it working without bugs took us signiﬁcantly longer than expected.
This is no surprise, for all programmers are inherent optimists. But how can you avoid
falling into such traps?
• Get To Know Your Debugger — Any reasonable programming environment comes
with a source-level debugger, which lets you stop execution at given positions
or logical conditions, look at the contents of a variable, and change its value
to see what happens. Source-level debuggers are well worth their weight in print
statements; learn to use them. The sooner you start, the more time and frustration
you will save.
• Display Your Data Structures — At one point in debugging our War program,
we had an off-by-one error in our queue implementation. This could only be revealed
by displaying the contents of the queue to see what was missing. Write
special-purpose display routines for all non-trivial data structures, as debuggers
often have a hard time making sense out of them.
• Test Invariants Rigorously — The card ranking and unranking functions are a
potential source of error whenever they are not inverses of each other. An invariant
is a property of the program which is true regardless of the input. A simple
invariant test like
for (i=0; i ...

A sequence of n > 0 integers is called a jolly jumper if the absolute values of the
diﬀerences between successive elements take on all possible values 1 through n − 1. For
instance,
1 4 2 3
is a jolly jumper, because the absolute diﬀerences are 3, 2, and 1, respectively. The
deﬁnition implies that any sequence of a single integer is a jolly jumper. Write a program
to determine whether each of a number of sequences is a jolly jumper.

Input
Each line of input contains an integer n < 3,000 followed by n integers representing the
sequence.

Output
For each line of input generate a line of output saying “Jolly” or “Not jolly”.

Sample Input
4 1 4 2 3
5 1 4 2 -1 6

Sample Output
Jolly
Not jolly

2.8. Problems

2.8.2

Poker Hands

PC/UVa IDs: 110202/10315, Popularity: C, Success rate: average Level: 2
A poker deck contains 52 cards. Each card has a suit of either clubs, diamonds, hearts,
or spades (denoted C, D, H, S in the input data). Each card also has a value of either 2
through 10, jack, queen, king, or ace (denoted 2, 3, 4, 5, 6, 7, 8, 9, T, J, Q, K, A). For
scoring purposes card values are ordered as above, with 2 having the lowest and ace the
highest value. The suit has no impact on value.
A poker hand consists of ﬁve cards dealt from the deck. Poker hands are ranked by
the following partial order from lowest to highest.
High Card. Hands which do not ﬁt any higher category are ranked by the value of
their highest card. If the highest cards have the same value, the hands are ranked
by the next highest, and so on.
Pair. Two of the ﬁve cards in the hand have the same value. Hands which both contain
a pair are ranked by the value of the cards forming the pair. If these values are
the same, the hands are ranked by the values of the cards not forming the pair,
in decreasing order.
Two Pairs. The hand contains two diﬀerent pairs. Hands which both contain two pairs
are ranked by the value of their highest pair. Hands with the same highest pair
are ranked by the value of their other pair. If these values are the same the hands
are ranked by the value of the remaining card.
Three of a Kind. Three of the cards in the hand have the same value. Hands which
both contain three of a kind are ranked by the value of the three cards.
Straight. Hand contains ﬁve cards with consecutive values. Hands which both contain
a straight are ranked by their highest card.
Flush. Hand contains ﬁve cards of the same suit. Hands which are both ﬂushes are
ranked using the rules for High Card.
Full House. Three cards of the same value, with the remaining two cards forming a
pair. Ranked by the value of the three cards.
Four of a Kind. Four cards with the same value. Ranked by the value of the four
cards.
Straight Flush. Five cards of the same suit with consecutive values. Ranked by the
highest card in the hand.
Your job is to compare several pairs of poker hands and to indicate which, if either,
has a higher rank.

Input
The input ﬁle contains several lines, each containing the designation of ten cards: the
ﬁrst ﬁve cards are the hand for the player named “Black” and the next ﬁve cards are
the hand for the player named “White”.

Output
For each line of input, print a line containing one of the following:
Black wins.
White wins.
Tie.

Sample Input
2H 3D 5S 9C KD 2C 3H 4S 8C AH
2H 4S 4C 2D 4H 2S 8S AS QS 3S
2H 3D 5S 9C KD 2C 3H 4S 8C KH
2H 3D 5S 9C KD 2D 3H 5C 9S KH

Sample Output
White wins.
Black wins.
Black wins.
Tie.

2.8.3

Hartals

PC/UVa IDs: 110203/10050, Popularity: B, Success rate: high Level: 2
Political parties in Bangladesh show their muscle by calling for regular hartals
(strikes), which cause considerable economic damage. For our purposes, each party
may be characterized by a positive integer h called the hartal parameter that denotes
the average number of days between two successive strikes called by the given party.
Consider three political parties. Assume h1 = 3, h2 = 4, and h3 = 8, where hi is
the hartal parameter for party i. We can simulate the behavior of these three parties
for N = 14 days. We always start the simulation on a Sunday. There are no hartals on
either Fridays or Saturdays.

Days       1   2   3   4   5   6   7   8   9  10  11  12  13  14
          Su  Mo  Tu  We  Th  Fr  Sa  Su  Mo  Tu  We  Th  Fr  Sa
Party 1            x           x           x           x
Party 2                x               x               x
Party 3                                x
Hartals            1   2               3   4           5

There will be exactly ﬁve hartals (on days 3, 4, 8, 9, and 12) over the 14 days. There
is no hartal on day 6 since it falls on Friday. Hence we lose ﬁve working days in two
weeks.
Given the hartal parameters for several political parties and the value of N , determine
the number of working days lost in those N days.

Input
The ﬁrst line of the input consists of a single integer T giving the number of test cases
to follow. The first line of each test case contains an integer N (7 ≤ N ≤ 3,650), giving
the number of days over which the simulation must be run. The next line contains
another integer P (1 ≤ P ≤ 100) representing the number of political parties. The ith
of the next P lines contains a positive integer hi (which will never be a multiple of 7)
giving the hartal parameter for party i (1 ≤ i ≤ P ).

Output
For each test case, output the number of working days lost on a separate line.

Sample Input
2
14
3
3
4
8
100
4
12
15
25
40

Sample Output
5
15

2.8.4

Crypt Kicker

PC/UVa IDs: 110204/843, Popularity: B, Success rate: low Level: 2
A common but insecure method of encrypting text is to permute the letters of the
alphabet. In other words, each letter of the alphabet is consistently replaced in the text
by some other letter. To ensure that the encryption is reversible, no two letters are
replaced by the same letter.
Your task is to decrypt several encoded lines of text, assuming that each line uses
a diﬀerent set of replacements, and that all words in the decrypted text are from a
dictionary of known words.

Input
The input consists of a line containing an integer n, followed by n lowercase words, one
per line, in alphabetical order. These n words compose the dictionary of words which
may appear in the decrypted text. Following the dictionary are several lines of input.
Each line is encrypted as described above.
There are no more than 1,000 words in the dictionary. No word exceeds 16 letters.
The encrypted lines contain only lower case letters and spaces and do not exceed 80
characters in length.

Output
Decrypt each line and print it to standard output. If there are multiple solutions, any
one will do. If there is no solution, replace every letter of the alphabet by an asterisk.

Sample Input
6
and
dick
jane
puff
spot
yertle
bjvg xsb hxsn xsb qymm xsb rqat xsb pnetfn
xxxx yyy zzzz www yyyy aaa bbbb ccc dddddd

Sample Output
dick and jane and puff and spot and yertle
**** *** **** *** **** *** **** *** ******

2.8.5

Stack ’em Up

PC/UVa IDs: 110205/10205, Popularity: B, Success rate: average Level: 1
The Big City has many casinos. In one of them, the dealer cheats. She has perfected
several shuﬄes; each shuﬄe rearranges the cards in exactly the same way whenever it
is used. A simple example is the “bottom card” shuﬄe, which removes the bottom card
and places it at the top. By using various combinations of these known shuﬄes, the
crooked dealer can arrange to stack the cards in just about any particular order.
You have been retained by the security manager to track this dealer. You are given
a list of all the shuﬄes performed by the dealer, along with visual cues that allow you
to determine which shuﬄe she uses at any particular time. Your job is to predict the
order of the cards after a sequence of shuﬄes.
A standard playing card deck contains 52 cards, with 13 values in each of four suits.
The values are named 2, 3, 4, 5, 6, 7, 8, 9, 10, Jack, Queen, King, Ace. The suits
are named Clubs, Diamonds, Hearts, Spades. A particular card in the deck can be
uniquely identified by its value and suit, typically denoted <value> of <suit>. For
example, “9 of Hearts” or “King of Spades.” Traditionally a new deck is ordered ﬁrst
alphabetically by suit, then by value in the order given above.

Input
The input begins with a single positive integer on a line by itself indicating the number
of test cases, followed by a blank line. There is also a blank line between two consecutive
inputs.
Each case consists of an integer n ≤ 100, the number of shuﬄes that the dealer knows.
Then follow n sets of 52 integers, each comprising all the integers from 1 to 52 in some
order. Within each set of 52 integers, i in position j means that the shuﬄe moves the
ith card in the deck to position j.
Several lines follow, each containing an integer k between 1 and n. These indicate
that you have observed the dealer applying the kth shuﬄe given in the input.

Output
For each test case, assume the dealer starts with a new deck ordered as described above.
After all the shuffles have been performed, give the names of the cards in the deck, in
the new order. The output of two consecutive cases will be separated by a blank line.

Sample Input
1

2
2 1 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26
27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 52 51
52 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26
27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 1
1
2

Sample Output
King of Spades
2 of Clubs
4 of Clubs
5 of Clubs
6 of Clubs
7 of Clubs
8 of Clubs
9 of Clubs
10 of Clubs
Jack of Clubs
Queen of Clubs
King of Clubs
Ace of Clubs
2 of Diamonds
3 of Diamonds
4 of Diamonds
5 of Diamonds
6 of Diamonds
7 of Diamonds
8 of Diamonds
9 of Diamonds
10 of Diamonds
Jack of Diamonds
Queen of Diamonds
King of Diamonds
Ace of Diamonds
2 of Hearts
3 of Hearts
4 of Hearts
5 of Hearts
6 of Hearts
7 of Hearts
8 of Hearts
9 of Hearts
10 of Hearts
Jack of Hearts
Queen of Hearts
King of Hearts
Ace of Hearts
2 of Spades
3 of Spades
4 of Spades
5 of Spades
6 of Spades
7 of Spades
8 of Spades
9 of Spades
10 of Spades
Jack of Spades
Queen of Spades
Ace of Spades
3 of Clubs

2.8.6

Erdös Numbers

PC/UVa IDs: 110206/10044, Popularity: B, Success rate: low Level: 2
The Hungarian Paul Erdös (1913–1996) was one of the most famous mathematicians
of the 20th century. Every mathematician having the honor of being a co-author of
Erdös is well respected.
Unfortunately, not everybody got the chance to write a paper with Erdös, so the
best they could do was publish a paper with somebody who had published a scientiﬁc
paper with Erdös. This gave rise to the so-called Erdös numbers. An author who has
jointly published with Erdös has Erdös number 1. An author who has not published
with Erdös but with somebody with Erdös number 1 has Erdös number 2, and so
on.
Your task is to write a program which computes Erdös numbers for a given set of
papers and scientists.

Input
The ﬁrst line of the input contains the number of scenarios. Each scenario consists of
a paper database and a list of names. It begins with the line P N, where P and N are
natural numbers. Following this line is the paper database, with P lines each containing
the description of one paper speciﬁed in the following way:
Smith, M.N., Martin, G., Erdos, P.: Newtonian forms of prime factors
Note that umlauts, like “ö,” are simply written as “o”. After the P papers follow N
lines with names. Such a name line has the following format:
Martin, G.

Output
For every scenario you are to print a line containing a string “Scenario i” (where i is
the number of the scenario), followed by one line for each author in the list of names
giving the author's name and Erdös number. The authors should appear in the same order as they
appear in the list of names. The Erdös number is based on the papers in the paper
database of this scenario. Authors who do not have any relation to Erdös via the
papers in the database have Erdös number “infinity.”

Sample Input
1
4 3
Smith, M.N., Martin, G., Erdos, P.: Newtonian forms of prime factors
Erdos, P., Reisig, W.: Stuttering in petri nets
Smith, M.N., Chen, X.: First order derivates in structured programming
Jablonski, T., Hsueh, Z.: Selfstabilizing data structures
Smith, M.N.
Hsueh, Z.
Chen, X.

Sample Output
Scenario 1
Smith, M.N. 1
Hsueh, Z. infinity
Chen, X. 2

2.8.7

Contest Scoreboard

PC/UVa IDs: 110207/10258, Popularity: B, Success rate: average Level: 1
Want to compete in the ACM ICPC? Then you had better know how to keep score!
Contestants are ranked ﬁrst by the number of problems solved (the more the better),
then by least penalty time (the less the better). If two or more contestants are tied in
both problems solved and penalty time, they are displayed in order of increasing team
numbers.
A problem is considered solved by a contestant if any of the submissions for that
problem was judged correct. Penalty time is computed as the number of minutes it
took until the ﬁrst correct submission for a problem was received, plus 20 minutes for
each incorrect submission prior to the correct solution. Unsolved problems incur no
time penalties.
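The penalty rule can be written as a one-line computation per problem. The sketch below is only an illustration of the arithmetic: the function name `problem_penalty` and its parameters are hypothetical.

```c
#define PENALTY_PER_WRONG 20   /* minutes charged per earlier incorrect submission */

/* Penalty contributed by a single problem.
   solved:       nonzero if some submission for the problem was judged correct
   solve_time:   minute of the first correct submission
   wrong_before: number of incorrect submissions before that correct one
   Unsolved problems contribute nothing, however many attempts were made. */
int problem_penalty(int solved, int solve_time, int wrong_before)
{
    if (!solved) return 0;
    return solve_time + PENALTY_PER_WRONG * wrong_before;
}
```

In this problem's sample data, contestant 1 solves problem 2 at minute 21 after one incorrect attempt (21 + 20 = 41) and problem 1 at minute 25 with no incorrect attempts, for a total of 41 + 25 = 66.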

Input
The input begins with a single positive integer on a line by itself indicating the number
of cases, each described as below. This line is followed by a blank line. There is also a
blank line between two consecutive inputs.
The input consists of a snapshot of the judging queue, containing entries from some
or all of contestants 1 through 100 solving problems 1 through 9. Each line of input
consists of three numbers and a letter in the format contestant problem time L, where
L can be C, I, R, U, or E. These stand for Correct, Incorrect, clariﬁcation Request,
Unjudged, and Erroneous submission. The last three cases do not aﬀect scoring.
The lines of input appear in the order in which the submissions were received.

Output
The output for each test case will consist of a scoreboard, sorted by the criteria described
above. Each line of output will contain a contestant number, the number of problems
solved by the contestant and the total time penalty accumulated by the contestant.
Since not all contestants are actually participating, only display those contestants who
have made a submission.
The output of two consecutive cases will be separated by a blank line.

Sample Input
1

1 2 10 I
3 1 11 C
1 2 19 R
1 2 21 C
1 1 25 C

Sample Output
1 2 66
3 1 11

2.8.8

Yahtzee

PC/UVa IDs: 110208/10149, Popularity: C, Success rate: average Level: 3
The game of Yahtzee involves ﬁve dice, which are thrown in 13 rounds. A score card
contains 13 categories. Each round may be scored in a category of the player’s choosing,
but each category may be scored only once in the game. The 13 categories are scored
as follows:
• ones - sum of all ones thrown
• twos - sum of all twos thrown
• threes - sum of all threes thrown
• fours - sum of all fours thrown
• ﬁves - sum of all ﬁves thrown
• sixes - sum of all sixes thrown
• chance - sum of all dice
• three of a kind - sum of all dice, provided at least three have same value
• four of a kind - sum of all dice, provided at least four have same value
• ﬁve of a kind - 50 points, provided all ﬁve dice have same value
• short straight - 25 points, provided four of the dice form a sequence (that is,
1,2,3,4 or 2,3,4,5 or 3,4,5,6)
• long straight - 35 points, provided all dice form a sequence (1,2,3,4,5 or 2,3,4,5,6)
• full house - 40 points, provided three of the dice are equal and the other two
dice are also equal. (for example, 2,2,5,5,5)
Each of the last six categories may be scored as 0 if the criteria are not met.
The score for the game is the sum of all 13 categories plus a bonus of 35 points if the
sum of the ﬁrst six categories is 63 or greater.
Your job is to compute the best possible score for a sequence of rounds.
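
The category rules above translate almost directly into code once the dice are tallied by value. The following is a minimal sketch, not the authors' solution: the function name score_category, the numbering of categories from 0 to 12 in the order listed, and the decision not to count five of a kind as a full house are all assumptions made for illustration.

```c
#define NDICE 5

/* Score one roll (five values in 1..6) in category cat, where the
   categories are numbered 0..12 in the order listed above.
   Assumption: five of a kind does not count as a full house. */
int score_category(const int dice[NDICE], int cat)
{
    int count[7] = {0};   /* count[v] = number of dice showing v */
    int sum = 0;          /* sum of all five dice */
    int maxcount = 0;     /* size of the largest group of equal dice */
    int pairs = 0;        /* number of values appearing exactly twice */
    int run = 0;          /* current run of consecutive values present */
    int longest = 0;      /* longest such run */
    int i, v;

    for (i = 0; i < NDICE; i++) {
        count[dice[i]]++;
        sum += dice[i];
    }
    for (v = 1; v <= 6; v++) {
        if (count[v] > maxcount) maxcount = count[v];
        if (count[v] == 2) pairs++;
        run = (count[v] > 0) ? run + 1 : 0;
        if (run > longest) longest = run;
    }

    if (cat < 6) return (cat + 1) * count[cat + 1];     /* ones .. sixes */
    switch (cat) {
    case 6:  return sum;                                /* chance */
    case 7:  return (maxcount >= 3) ? sum : 0;          /* three of a kind */
    case 8:  return (maxcount >= 4) ? sum : 0;          /* four of a kind */
    case 9:  return (maxcount == 5) ? 50 : 0;           /* five of a kind */
    case 10: return (longest >= 4) ? 25 : 0;            /* short straight */
    case 11: return (longest == 5) ? 35 : 0;            /* long straight */
    case 12: return (maxcount == 3 && pairs == 1) ? 40 : 0;  /* full house */
    }
    return 0;
}
```

Calling this for every round and every category gives a 13 by 13 table of scores; the remaining work is the search for the best assignment of rounds to categories.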

Input
Each line of input contains ﬁve integers between 1 and 6, indicating the values of the
ﬁve dice thrown in each round. There are 13 such lines for each game, and there may
be any number of games in the input data.

Output
Your output should consist of a single line for each game containing 15 numbers: the
score in each category (in the order given), the bonus score (0 or 35), and the total
score. If there is more than one categorization that yields the same total score, any one will
do.

Sample Input
1 2 3 4 5
1 2 3 4 5
1 2 3 4 5
1 2 3 4 5
1 2 3 4 5
1 2 3 4 5
1 2 3 4 5
1 2 3 4 5
1 2 3 4 5
1 2 3 4 5
1 2 3 4 5
1 2 3 4 5
1 2 3 4 5
1 1 1 1 1
6 6 6 6 6
6 6 6 1 1
1 1 1 2 2
1 1 1 2 3
1 2 3 4 5
1 2 3 4 6
6 1 2 6 6
1 4 5 5 5
5 5 5 5 6
4 4 4 5 6
3 1 3 6 3
2 2 2 4 6

Sample Output
1 2 3 4 5 0 15 0 0 0 25 35 0 0 90
3 6 9 12 15 30 21 20 26 50 25 35 40 35 327

2.9 Hints
2.2 Can we reduce the value of a poker hand to a single numerical value to make the
comparison easier?
2.3 Do we need to build the actual table in order to compute the number of hartals?
2.4 Does it pay to partition the words into equivalence classes based on repeated
letters and length?
2.7 What is the easiest way to sort on the multiple criteria?
2.8 Do we need to try all possible mappings of rounds to categories, or can we make
certain assignments in a more straightforward way?

2.10 Notes
2.1 A jolly number is a special case of a graceful graph labeling. A graph is called
graceful if there exists a way to label the n vertices with integers such that the
absolute value of the diﬀerence between the endpoints of all m of the edges yields
a distinct value between 1 and m. A jolly jumper represents a graceful labeling
of an n-vertex path. The famous graceful tree conjecture asks whether every tree
has a graceful labeling. Graceful graphs make a great topic for undergraduate
student research. See Gallian’s dynamic survey [Gal01] for a list of accessible
open problems.
2.5 The mathematics of card shuﬄing is a fascinating topic. A perfect shuﬄe splits
the deck in half into piles A and B, and then interleaves the cards: top of A, top of
B, top of A, . . . . Amazingly, eight perfect shuﬄes take a deck back to its original
state! This can be shown using either modular arithmetic or the theory of cycles
in permutations. See [DGK83, Mor98] for more on perfect shuﬄes.
2.6 The first author of this book has an Erdős number of 2, giving the second a number
of ≤ 3. Erdős was famous for posing beautiful, easy-to-understand but hard-to-solve
problems in combinatorics, graph theory, and number theory. Read one of
the popular biographies of his life [Hof99, Sch00] for more about this fascinating
man.
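
The claim in note 2.5 that eight perfect shuffles restore the deck is easy to check by simulation. The sketch below assumes a standard 52-card deck and that pile A is the top half, so the top card stays on top; the function names are illustrative only.

```c
#define DECK 52

/* One perfect shuffle: split the deck into top half A and bottom half B,
   then interleave top of A, top of B, top of A, ... */
void perfect_shuffle(int deck[DECK])
{
    int tmp[DECK];
    int half = DECK / 2;
    int i;

    for (i = 0; i < half; i++) {
        tmp[2 * i]     = deck[i];          /* ith card of pile A */
        tmp[2 * i + 1] = deck[half + i];   /* ith card of pile B */
    }
    for (i = 0; i < DECK; i++) deck[i] = tmp[i];
}

/* Number of perfect shuffles needed before a deck first returns
   to its starting order. */
int shuffles_to_restore(void)
{
    int deck[DECK];
    int i, n = 0, sorted;

    for (i = 0; i < DECK; i++) deck[i] = i;
    do {
        perfect_shuffle(deck);
        n++;
        sorted = 1;
        for (i = 0; i < DECK; i++)
            if (deck[i] != i) sorted = 0;
    } while (!sorted);
    return n;
}
```

The modular-arithmetic view mentioned in the note explains the result: under this shuffle the card at position p (for p < 51) moves to position 2p mod 51, and 2^8 = 256 = 5 · 51 + 1.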

3
Strings

Text strings are a fundamental data structure of growing importance. Internet search
engines such as Google search billions of documents almost instantaneously. The sequencing of the human genome has given us three billion characters of text describing
all the proteins we are built from. In searching this string for interesting patterns, we
are literally looking for the secret of life.
The stakes in solving the programming problems in this chapter are considerably
lower than that. However, they provide insight into how characters and text strings are
represented in modern computers, and the clever algorithms which search and manipulate this data. We refer the interested reader to [Gus97] for a more advanced discussion
of string algorithms.

3.1 Character Codes
Character codes are mappings between numbers and the symbols which make up a particular alphabet. Computers are fundamentally designed to work with numerical data.
All they know about a given alphabet is which symbol is assigned to each possible
number. When changing the font in a print program, all that really changes are the image bit-maps associated with each character. When changing an operating system from
English to Russian all that really changes is the mapping of symbols in the character
code.
It is useful to understand something about character code designs when working with
text strings. The American Standard Code for Information Interchange (ASCII) is a
single-byte character code where 2^7 = 128 characters are specified.¹ Bytes are eight-bit
entities; so that means the highest-order bit is left as zero.

  0 NUL    1 SOH    2 STX    3 ETX    4 EOT    5 ENQ    6 ACK    7 BEL
  8 BS     9 HT    10 NL    11 VT    12 NP    13 CR    14 SO    15 SI
 16 DLE   17 DC1   18 DC2   19 DC3   20 DC4   21 NAK   22 SYN   23 ETB
 24 CAN   25 EM    26 SUB   27 ESC   28 FS    29 GS    30 RS    31 US
 32 SP    33 !     34 "     35 #     36 $     37 %     38 &     39 '
 40 (     41 )     42 *     43 +     44 ,     45 -     46 .     47 /
 48 0     49 1     50 2     51 3     52 4     53 5     54 6     55 7
 56 8     57 9     58 :     59 ;     60 <     61 =     62 >     63 ?
 64 @     65 A     66 B     67 C     68 D     69 E     70 F     71 G
 72 H     73 I     74 J     75 K     76 L     77 M     78 N     79 O
 80 P     81 Q     82 R     83 S     84 T     85 U     86 V     87 W
 88 X     89 Y     90 Z     91 [     92 \     93 ]     94 ^     95 _
 96 `     97 a     98 b     99 c    100 d    101 e    102 f    103 g
104 h    105 i    106 j    107 k    108 l    109 m    110 n    111 o
112 p    113 q    114 r    115 s    116 t    117 u    118 v    119 w
120 x    121 y    122 z    123 {    124 |    125 }    126 ~    127 DEL

Figure 3.1. The ASCII character code.

Consider the ASCII character code table presented in Figure 3.1, where the left entry
in each pair is the decimal (base-ten) value of the speciﬁcation, while the right entry
is the associated symbol. These symbol assignments were not done at random. Several
interesting properties of the design make programming tasks easier:
• All non-printable characters have either the ﬁrst three bits as zero or all seven
lowest bits as one. This makes it very easy to eliminate them before displaying
junk, although somehow very few programs seem to do so.
• Both the upper- and lowercase letters and the numerical digits appear sequentially.
Thus we can iterate through all the letters/digits simply by looping from the value
of the first symbol (say, “a”) to the value of the last symbol (say, “z”).
• Another consequence of this sequential placement is that we can convert a character (say, “I”) to its rank in the collating sequence (eighth, if “A” is the zeroth
character) simply by subtracting oﬀ the ﬁrst symbol (“A”).
• We can convert a character (say “C”) from upper- to lowercase by adding the
diﬀerence of the upper and lowercase starting character (“C”-“A”+“a”). Similarly,
a character x is uppercase if and only if it lies between “A” and “Z”.
• Given the character code, we can predict what will happen when naively sorting
text ﬁles. Which of “x” or “3” or “C” appears ﬁrst in alphabetical order? Sorting alphabetically means sorting by character code. Using a diﬀerent collating
sequence requires more complicated comparison functions, as will be discussed in
Chapter 4.
• Non-printable character codes for new-line (10) and carriage return (13) are designed to delimit the end of text lines. Inconsistent use of these codes is one of
the pains in moving text ﬁles between UNIX and Windows systems.
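
Several of the properties listed above reduce to one-line expressions in C. The helper names below are made up for this sketch (the standard library offers its own versions of some of them), and the code assumes plain 7-bit ASCII input.

```c
/* Rank of an uppercase letter in the collating sequence, 'A' being the zeroth. */
int rank_of(char c) { return c - 'A'; }

/* x is uppercase if and only if it lies between 'A' and 'Z'. */
int is_upper(char x) { return x >= 'A' && x <= 'Z'; }

/* Upper- to lowercase by adding the difference of the starting characters;
   anything that is not an uppercase letter is returned unchanged. */
char to_lower(char c) { return is_upper(c) ? (char) (c - 'A' + 'a') : c; }

/* Non-printable: the first three bits of the byte are zero (codes 0-31),
   or all seven lowest bits are one (code 127). Assumes 0 <= c <= 127. */
int is_nonprintable(int c) { return (c & 0xE0) == 0 || (c & 0x7F) == 0x7F; }

/* The letters appear sequentially, so a plain loop visits all of them.
   out must have room for 27 characters. */
void alphabet(char *out)
{
    char c;

    for (c = 'a'; c <= 'z'; c++) *out++ = c;
    *out = '\0';
}
```

Comparing character codes also answers the sorting question posed above: '3' (51) precedes 'C' (67), which precedes 'x' (120).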
More modern international character code designs such as Unicode use two or even
three bytes per symbol, and can represent virtually any symbol in every language on
earth. However, good old ASCII remains alive embedded in Unicode. Whenever the
high-order bits are 0, the text gets interpreted as single-byte characters instead of two-byte
entities. Thus we can still use the simpler, more memory-efficient encoding while
opening the door to thousands of additional symbols.

¹ Be aware that there are literally dozens of variations on ASCII. Perhaps the most important is
ISO Latin-1, which is a full 8-bit code that includes European accented characters.

All of this makes a big diﬀerence in manipulating text in diﬀerent programming languages. Older languages, like Pascal, C, and C++, view the char type as virtually
synonymous with 8-bit entities. Thus the character data type is the choice for manipulating raw ﬁles, even for those which are not intended to be printable. Java, on the
other hand, was designed to support Unicode, so characters are 16-bit entities. The
upper byte is all zeros when working with ASCII/ISO Latin 1 text. Be aware of this
diﬀerence when you switch between programming languages.

3.2 Representing Strings
Strings are sequences of characters, where order clearly matters. It is important to be
aware of how your favorite programming language represents strings, because there are
several diﬀerent possibilities:
• Null-terminated Arrays — C/C++ treats strings as arrays of characters. The
string ends the instant it hits the null character “\0”, i.e., zero ASCII. Failing to
end your string explicitly with a null typically extends it by a bunch of unprintable
characters. In deﬁning a string, enough array must be allocated to hold the largest
possible string (plus the null) unless you want a core dump. The advantage of this
array representation is that all individual characters are accessible by index as
array elements.
• Array Plus Length — Another scheme uses the ﬁrst array location to store the
length of the string, thus avoiding the need for any terminating null character.
Presumably this is what Java implementations do internally, even though the
user’s view of strings is as objects with a set of operators and methods acting on
them.
• Linked Lists of Characters — Text strings can be represented using linked lists,
but this is typically avoided because of the high space-overhead associated with
having a several-byte pointer for each single byte character. Still, such a representation might be useful if you were to insert or delete substrings frequently within
the body of a string.
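
To make the second representation concrete, here is a small sketch of an array-plus-length string in C, with the length kept in the first array location as described above. The type name pstring and both helper functions are hypothetical; with a single byte for the length, such strings cannot exceed 255 characters.

```c
#include <string.h>

/* p[0] holds the length; p[1..p[0]] hold the characters. No terminator. */
typedef unsigned char pstring[256];

/* Convert a null-terminated C string into array-plus-length form,
   truncating anything beyond 255 characters. */
void to_pstring(pstring p, const char *s)
{
    size_t n = strlen(s);

    if (n > 255) n = 255;
    p[0] = (unsigned char) n;
    memcpy(p + 1, s, n);
}

/* Return the ith character (counting from 0), or -1 if i is out of bounds.
   The stored length makes the bounds check a constant-time comparison. */
int pstring_at(const unsigned char *p, int i)
{
    if (i < 0 || i >= p[0]) return -1;
    return p[i + 1];
}
```

A null-terminated array would need a scan for the terminator to perform the same bounds check, which is one of the trade-offs between the representations.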
The underlying string representation can have a big impact on which operations are
easily or eﬃciently supported. Compare each of these three data structures with respect
to the following properties:
• Which uses the least amount of space? On what sized strings?
• Which constrains the content of the strings which can possibly be represented?
• Which allow constant-time access to the ith character?

• Which allow eﬃcient checks that the ith character is in fact within the string,
thus avoiding out-of-bounds errors?
• Which allow eﬃcient deletion or insertion of new characters at the ith position?
• Which representation is used when users are limited to strings of length at most
255, e.g., ﬁle names in Windows?

3.3 Program Design Example: Corporate Renamings
Corporate name changes are occurring with ever greater frequency, as companies merge,
buy each other out, try to hide from bad publicity, or even raise their stock price –
remember when adding a .com to a company’s name was the secret to success!
These changes make it diﬃcult to ﬁgure out the current name of a company when
reading old documents. Your company, Digiscam (formerly Algorist Technologies), has
put you to work on a program which maintains a database of corporate name changes
and does the appropriate substitutions to bring old documents up to date.
Your program should take as input a ﬁle with a given number of corporate name
changes, followed by a given number of lines of text for you to correct. Only exact
matches of the string should be replaced. There will be at most 100 corporate changes,
and each line of text is at most 1,000 characters long. A sample input is —
4
"Anderson Consulting" to "Accenture"
"Enron" to "Dynegy"
"DEC" to "Compaq"
"TWA" to "American"
5
Anderson Accounting begat Anderson Consulting, which
offered advice to Enron before it DECLARED bankruptcy,
which made Anderson
Consulting quite happy it changed its name
in the first place!
Which should be transformed to —
Anderson Accounting begat Accenture, which
offered advice to Dynegy before it CompaqLARED bankruptcy,
which made Anderson
Consulting quite happy it changed its name
in the first place!
The speciﬁcations do not ask you to respect word delimiters (such as blank), so
transforming DECLARED to CompaqLARED is indeed the right thing to do.
————————————

Solution starts below

————————————

What kind of string operations do we need to do to solve this problem? We must
be able to read strings and store them, search strings for patterns, modify them, and
ﬁnally print them.
Observe that the input ﬁle has been segmented into two parts. The ﬁrst section,
the dictionary of name changes, must be completely read and stored before starting to
convert the text. To declare the relevant data structures:
#include <stdio.h>
#include <string.h>

#define MAXLEN          1001    /* longest possible string */
#define MAXCHANGES      101     /* maximum number of name changes */

typedef char string[MAXLEN];

string mergers[MAXCHANGES][2];  /* store before/after corporate names */
int nmergers;                   /* number of different name changes */

We represent the dictionary as a two-dimensional array of strings. We do not need to
sort the keys in any particular order, since we will be scanning through each of them
on each line of text.
Reading the list of company names is somewhat complicated by the fact that we
must parse each input line to extract the stuﬀ between quotes. The trick is ignoring
text before the ﬁrst quote, and collecting it until the second quote:
read_changes()
{
        int i;                          /* counter */

        scanf("%d\n",&nmergers);
for (i=0; i= ylen)
for (i=(pos+xlen); i<=slen; i++) s[i+(ylen-xlen)] = s[i];
else

3.6. Completing the Merger

for (i=slen; i>=(pos+xlen); i--) s[i+(ylen-xlen)] = s[i];
for (i=0; i
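
The two core steps of this program, extracting the text between quotes and substituting one name for another inside a line, can be sketched compactly with standard library calls. This is an illustrative alternative, not the listing from the text: the names next_quoted and replace_all are invented here, and strchr, strstr, and memmove stand in for hand-written scanning and shifting loops.

```c
#include <string.h>

/* Copy the text between the next pair of double quotes in *src into dst
   and advance *src past the closing quote. Returns 1 on success and 0 if
   no complete quoted string remains. dst must be large enough. */
int next_quoted(const char **src, char *dst)
{
    const char *p = strchr(*src, '"');   /* ignore text before the first quote */
    const char *q;

    if (p == NULL) return 0;
    q = strchr(p + 1, '"');              /* collect until the second quote */
    if (q == NULL) return 0;
    memcpy(dst, p + 1, (size_t) (q - p - 1));
    dst[q - p - 1] = '\0';
    *src = q + 1;
    return 1;
}

/* Replace every exact occurrence of x in s by y, in place.
   Assumes the array holding s has room for the longer result. */
void replace_all(char *s, const char *x, const char *y)
{
    size_t xlen = strlen(x), ylen = strlen(y);
    char *pos = s;

    if (xlen == 0) return;
    while ((pos = strstr(pos, x)) != NULL) {
        /* move the tail, including the terminating null, to its new place */
        memmove(pos + ylen, pos + xlen, strlen(pos + xlen) + 1);
        memcpy(pos, y, ylen);            /* copy the new name into the gap */
        pos += ylen;                     /* resume after the replacement */
    }
}
```

Applying replace_all with "Enron"/"Dynegy" and then "DEC"/"Compaq" to the second sample line yields the expected text, including CompaqLARED. Because a replacement can lengthen a line, the buffer should be sized for the worst case rather than for the 1,000-character input limit alone.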

#include <ctype.h>      /* include the character library */

int isalpha(int c);     /* true if c is either upper or lower case */
int isupper(int c);     /* true if c is upper case */
int islower(int c);     /* true if c is lower case */
int isdigit(int c);     /* true if c is a numerical digit (0-9) */
int ispunct(int c);     /* true if c is a punctuation symbol */
int isxdigit(int c);    /* true if c is a hexadecimal digit (0-9,A-F,a-f) */
int isprint(int c);     /* true if c is any printable character */

int toupper(int c);     /* convert c to upper case -- no error checking */
int tolower(int c);     /* convert c to lower case -- no error checking */

Check the deﬁnition of each carefully before assuming it does exactly what you want.
The following functions appear in the C language string library string.h. The full
library has more functions and options, so check it out.
#include <string.h>     /* include the string library */

char *strcat(char *dst, const char *src);       /* concatenation */
int strcmp(const char *s1, const char *s2);     /* is s1 == s2? (returns 0 if so) */
char *strcpy(char *dst, const char *src);       /* copy src to dst */
size_t strlen(const char *s);                   /* length of string */
char *strstr(const char *s1, const char *s2);   /* search for s2 in s1 */
char *strtok(char *s1, const char *s2);         /* iterate words in s1 */
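
A few of these routines in action may help. The wrappers below are a small sketch with invented names (upcase, find_offset, count_words); only the library calls themselves come from ctype.h and string.h.

```c
#include <ctype.h>
#include <string.h>

/* Convert every letter of s to upper case, in place. */
void upcase(char *s)
{
    for (; *s != '\0'; s++)
        *s = (char) toupper((unsigned char) *s);
}

/* Offset of the first occurrence of pat in s, or -1 if there is none. */
int find_offset(const char *s, const char *pat)
{
    const char *p = strstr(s, pat);

    return (p == NULL) ? -1 : (int) (p - s);
}

/* Count the blank-separated words in line. strtok overwrites the
   delimiters it finds, so it is run on a private copy of the line. */
int count_words(const char *line)
{
    char copy[1001];
    char *w;
    int n = 0;

    strncpy(copy, line, sizeof(copy) - 1);
    copy[sizeof(copy) - 1] = '\0';
    for (w = strtok(copy, " \t\n"); w != NULL; w = strtok(NULL, " \t\n"))
        n++;
    return n;
}
```

Two details are easy to trip over: strcmp returns 0, not a true value, when the strings are equal, and the character-class functions expect an argument in the range of unsigned char, hence the casts.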

C++ String Library Functions
In addition to supporting C-style strings, C++ has a string class which contains
methods for these operations and more:
string::size()                  /* string length */
string::empty()                 /* is it empty */

string::c_str()                 /* return a pointer to a C style string */

string::operator [](size_type i)        /* access the ith character */

string::append(s)               /* append to string */
string::erase(n,m)              /* delete a run of characters */
string::insert(size_type n, const string&s) /* insert string s at n */

string::find(s)
string::rfind(s)                /* search left or right for the given string */

string::front()
string::back()                  /* get first and last characters (C++11); also there are iterators */

Overloaded operators exist for concatenation and string comparison.

Java String Objects
Java strings are ﬁrst-class objects deriving either from the String class or the
StringBuffer class. The String class is for static strings which do not change, while
StringBuffer is designed for dynamic strings. Recall that Java was designed to support
Unicode, so its characters are 16-bit entities.
The java.text package contains more advanced operations on strings, including
routines to parse dates and other structured text.

3.8 Problems

3.8.1 WERTYU

PC/UVa IDs: 110301/10082, Popularity: A, Success rate: high, Level: 1

A common typing error is to place your hands on the keyboard one row to the right
of the correct position. Then “Q” is typed as “W” and “J” is typed as “K” and so on.
Your task is to decode a message typed in this manner.

Input
Input consists of several lines of text. Each line may contain digits, spaces, uppercase
letters (except “Q”, “A”, “Z”), or punctuation shown above [except back-quote (‘)].
Keys labeled with words [Tab, BackSp, Control, etc.] are not represented in the input.

Output
You are to replace each letter or punctuation symbol by the one immediately to its left
on the QWERTY keyboard shown above. Spaces in the input should be echoed in the
output.

Sample Input
O S, GOMR YPFSU/

Sample Output
I AM FINE TODAY.
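
One table-driven way to perform the replacement is to lay the keys out in a single string, so that each key's left neighbor is simply the preceding character. The sketch below assumes a standard US QWERTY layout, since the keyboard figure is not reproduced here, and the function name decode is made up for illustration.

```c
#include <string.h>

/* Replace each key in s by the key immediately to its left.
   Characters not in the table, such as spaces, are left alone. */
void decode(char *s)
{
    /* the four keyboard rows, left to right, joined end to end;
       assumed layout: standard US QWERTY */
    static const char keys[] =
        "`1234567890-=QWERTYUIOP[]\\ASDFGHJKL;'ZXCVBNM,./";

    for (; *s != '\0'; s++) {
        const char *p = strchr(keys, *s);

        if (p != NULL && p > keys)
            *s = *(p - 1);
    }
}
```

Joining the rows means the first key of a row has the last key of the previous row as its table neighbor, which would be wrong for "Q", "A", and "Z"; the input format rules those characters out, so the shortcut is safe here.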

3.8.2 Where’s Waldorf?

PC/UVa IDs: 110302/10010, Popularity: B, Success rate: average, Level: 2
Given an m by n grid of letters and a list of words, ﬁnd the location in the grid at
which the word can be found.
A word matches a straight, uninterrupted line of letters in the grid. A word can
match the letters in the grid regardless of case (i.e., upper- and lowercase letters are
to be treated as the same). The matching can be done in any of the eight horizontal,
vertical, or diagonal directions through the grid.

Input
The input begins with a single positive integer on a line by itself indicating the number
of cases, followed by a blank line. There is also a blank line between each two consecutive
cases.
Each case begins with a pair of integers m followed by n on a single line, where
1 ≤ m, n ≤ 50 in decimal notation. The next m lines contain n letters each, representing
the grid of letters where the words must be found. The letters in the grid may be in
upper- or lowercase. Following the grid of letters, another integer k appears on a line by
itself (1 ≤ k ≤ 20). The next k lines of input contain the list of words to search for, one
word per line. These words may contain upper- and lowercase letters only – no spaces,
hyphens, or other non-alphabetic characters.

Output
For each word in each test case, output a pair of integers representing its location in
the corresponding grid. The integers must be separated by a single space. The ﬁrst
integer is the line in the grid where the ﬁrst letter of the given word can be found (1
represents the topmost line in the grid, and m represents the bottommost line). The
second integer is the column in the grid where the ﬁrst letter of the given word can
be found (1 represents the leftmost column in the grid, and n represents the rightmost
column in the grid). If a word can be found more than once in the grid, then output
the location of the uppermost occurrence of the word (i.e., the occurrence which places
the first letter of the word closest to the top of the grid). If two or more occurrences are
uppermost, output the leftmost of these occurrences. All words can be found at least
once in the grid.
The output of two consecutive cases must be separated by a blank line.

Sample Input
1
8 11
abcDEFGhigg
hEbkWalDork
FtyAwaldORm
FtsimrLqsrc
byoArBeDeyv
Klcbqwikomk
strEBGadhrb
yUiqlxcnBjf
4
Waldorf
Bambi
Betty
Dagbert

Sample Output
2 5
2 3
1 2
7 8

3.8.3 Common Permutation

PC/UVa IDs: 110303/10252, Popularity: A, Success rate: average, Level: 1
Given two strings a and b, print the longest string x of letters such that there is a
permutation of x that is a subsequence of a and there is a permutation of x that is a
subsequence of b.

Input
The input ﬁle contains several cases, each case consisting of two consecutive lines. This
means that lines 1 and 2 are a test case, lines 3 and 4 are another test case, and so on.
Each line contains one string of lowercase characters, with ﬁrst line of a pair denoting
a and the second denoting b. Each string consists of at most 1,000 characters.

Output
For each set of input, output a line containing x. If several x satisfy the criteria above,
choose the ﬁrst one in alphabetical order.

Sample Input
pretty
women
walking
down
the
street

Sample Output
e
nw
et
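
Since only permutations matter, each string can be summarized by how many times each letter occurs. A possible sketch follows; the function name and the caller-supplied output buffer are assumptions made for illustration.

```c
/* Write into x the longest string whose letters, in some order, form a
   subsequence of both a and b, choosing the alphabetically first answer.
   x must have room for min(strlen(a), strlen(b)) + 1 characters. */
void common_permutation(const char *a, const char *b, char *x)
{
    int ca[26] = {0};   /* letter counts for a */
    int cb[26] = {0};   /* letter counts for b */
    int i, k, n = 0;

    for (; *a != '\0'; a++)
        if (*a >= 'a' && *a <= 'z') ca[*a - 'a']++;
    for (; *b != '\0'; b++)
        if (*b >= 'a' && *b <= 'z') cb[*b - 'a']++;

    /* each letter can be used as often as it appears in both strings;
       emitting letters from 'a' to 'z' gives alphabetical order */
    for (i = 0; i < 26; i++) {
        k = (ca[i] < cb[i]) ? ca[i] : cb[i];
        while (k-- > 0) x[n++] = (char) ('a' + i);
    }
    x[n] = '\0';
}
```

On the sample pairs this produces "e", "nw", and "et".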

3.8.4 Crypt Kicker II

PC/UVa IDs: 110304/850, Popularity: A, Success rate: average, Level: 2
A popular but insecure method of encrypting text is to permute the letters of the
alphabet. That is, in the text, each letter of the alphabet is consistently replaced by
some other letter. To ensure that the encryption is reversible, no two letters are replaced
by the same letter.
A powerful method of cryptanalysis is the known plain text attack. In a known plain
text attack, the cryptanalyst manages to have a known phrase or sentence encrypted by
the enemy, and by observing the encrypted text then deduces the method of encoding.
Your task is to decrypt several encrypted lines of text, assuming that each line uses
the same set of replacements, and that one of the lines of input is the encrypted form
of the plain text the quick brown fox jumps over the lazy dog.

Input
The input begins with a single positive integer on a line by itself indicating the number
of test cases, followed by a blank line. There will also be a blank line between each two
consecutive cases.
Each case consists of several lines of input, encrypted as described above. The encrypted lines contain only lowercase letters and spaces and do not exceed 80 characters
in length. There are at most 100 input lines.

Output
For each test case, decrypt each line and print it to standard output. If there is more
than one possible decryption, any one will do. If decryption is impossible, output
No solution.
The output of each two consecutive cases must be separated by a blank line.

Sample Input
1
vtz ud xnm xugm itr pyy jttk gmv xt otgm xt xnm puk ti xnm fprxq
xnm ceuob lrtzv ita hegfd tsmr xnm ypwq ktj
frtjrpgguvj otvxmdxd prm iev prmvx xnmq

Sample Output
now is the time for all good men to come to the aid of the party
the quick brown fox jumps over the lazy dog
programming contests are fun arent they

3.8.5 Automated Judge Script

PC/UVa IDs: 110305/10188, Popularity: B, Success rate: average, Level: 1
Human programming contest judges are known to be very picky. To eliminate the
need for them, write an automated judge script to judge submitted solution runs.
Your program should take a ﬁle containing the correct output as well as the output
of submitted program and answer either Accepted, Presentation Error, or Wrong
Answer, deﬁned as follows:
Accepted: You are to report “Accepted” if the team’s output matches the standard
solution exactly. All characters must match and must occur in the same order.
Presentation Error: Give a “Presentation Error” if all numeric characters match
in the same order, but there is at least one non-matching non-numeric character.
For example, “15 0” and “150” would receive “Presentation Error”, whereas
“15 0” and “1 0” would receive “Wrong Answer,” described below.
Wrong Answer: If the team output cannot be classiﬁed as above, then you have no
alternative but to score the program a ‘Wrong Answer’.
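
The three verdicts can be decided by one exact comparison followed by a comparison of the digits alone. The sketch below assumes each output has already been read into a single string with its line breaks included; the enum and function names are invented for illustration.

```c
#include <ctype.h>
#include <string.h>

enum verdict { ACCEPTED, PRESENTATION_ERROR, WRONG_ANSWER };

/* Advance p to the next decimal digit, or to the terminating null. */
static const char *next_digit(const char *p)
{
    while (*p != '\0' && !isdigit((unsigned char) *p)) p++;
    return p;
}

/* Classify the team's output against the correct output. */
enum verdict judge(const char *correct, const char *team)
{
    const char *a, *b;

    if (strcmp(correct, team) == 0) return ACCEPTED;

    /* walk both strings in step, looking only at numeric characters */
    a = next_digit(correct);
    b = next_digit(team);
    while (*a != '\0' && *b != '\0') {
        if (*a != *b) return WRONG_ANSWER;
        a = next_digit(a + 1);
        b = next_digit(b + 1);
    }
    /* both must run out of digits at the same time */
    return (*a == '\0' && *b == '\0') ? PRESENTATION_ERROR : WRONG_ANSWER;
}
```

With this definition, "15 0" against "150" is a Presentation Error and "15 0" against "1 0" is a Wrong Answer, as in the examples above. Two outputs with no digits at all that differ in their text also come out as Presentation Error.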

Input
The input will consist of an arbitrary number of input sets. Each input set begins with
a line containing a positive integer n < 100, which describes the number of lines of the
correct solution. The next n lines contain the correct solution. Then comes a positive
integer m < 100, alone on its line, which describes the number of lines of the team’s
submitted output. The next m lines contain this output. The input is terminated by a
value of n = 0, which should not be processed.
No line will have more than 100 characters.

Output
For each set, output one of the following:
Run #x: Accepted
Run #x: Presentation Error
Run #x: Wrong Answer
where x stands for the number of the input set (starting from 1).

Sample Input
2
The answer is: 10
The answer is: 5
2
The answer is: 10
The answer is: 5
2
The answer is: 10
The answer is: 5
2
The answer is: 10
The answer is: 15
2
The answer is: 10
The answer is: 5
2
The answer is: 10
The answer is: 5
3
Input Set #1: YES
Input Set #2: NO
Input Set #3: NO
3
Input Set #0: YES
Input Set #1: NO
Input Set #2: NO
1
1 0 1 0
1
1010
1
The judges are mean!
1
The judges are good!
0

Sample Output
Run #1: Accepted
Run #2: Wrong Answer
Run #3: Presentation Error
Run #4: Wrong Answer
Run #5: Presentation Error
Run #6: Presentation Error

3.8.6 File Fragmentation

PC/UVa IDs: 110306/10132, Popularity: C, Success rate: average, Level: 2
Your friend, a biochemistry major, tripped while carrying a tray of computer ﬁles
through the lab. All of the ﬁles fell to the ground and broke. Your friend picked up all
the ﬁle fragments and called you to ask for help putting them back together again.
Fortunately, all of the ﬁles on the tray were identical, all of them broke into exactly
two fragments, and all of the ﬁle fragments were found. Unfortunately, the ﬁles didn’t
all break in the same place, and the fragments were completely mixed up by their fall
to the ﬂoor.
The original binary fragments have been translated into strings of ASCII 1’s and 0’s.
Your job is to write a program that determines the bit pattern the ﬁles contained.

Input
The input begins with a single positive integer on its own line indicating the number
of test cases, followed by a blank line. There will also be a blank line between each two
consecutive cases.
Each case will consist of a sequence of “ﬁle fragments,” one per line, terminated by
the end-of-ﬁle marker or a blank line. Each fragment consists of a string of ASCII 1’s
and 0’s.

Output
For each test case, the output is a single line of ASCII 1’s and 0’s giving the bit pattern
of the original ﬁles. If there are 2N fragments in the input, it should be possible to
concatenate these fragments together in pairs to make N copies of the output string. If
there is no unique solution, any of the possible solutions may be output.
Your friend is certain that there were no more than 144 ﬁles on the tray, and that
the ﬁles were all less than 256 bytes in size.
The output from two consecutive test cases will be separated by a blank line.

Sample Input
011
0111
01110
111
0111
10111

Sample Output
01110111

3.8.7 Doublets

PC/UVa IDs: 110307/10150, Popularity: C, Success rate: average, Level: 3
A doublet is a pair of words that diﬀer in exactly one letter; for example, “booster”
and “rooster” or “rooster” and “roaster” or “roaster” and “roasted”.
You are given a dictionary of up to 25,143 lowercase words, not exceeding 16 letters
each. You are then given a number of pairs of words. For each pair of words, ﬁnd the
shortest sequence of words that begins with the ﬁrst word and ends with the second,
such that each pair of adjacent words is a doublet. For example, if you were given the
input pair “booster” and “roasted”, a possible solution would be (“booster,” “rooster,”
“roaster,” “roasted”), provided that these words are all in the dictionary.
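
The doublet test itself is a single pass over the two words. The sketch below assumes, as in all the examples given, that the words of a doublet must have the same length; the function name is illustrative.

```c
/* Return 1 if a and b have the same length and differ in exactly one
   position, and 0 otherwise. */
int is_doublet(const char *a, const char *b)
{
    int diff = 0;   /* number of positions at which the words differ */

    for (; *a != '\0' && *b != '\0'; a++, b++)
        if (*a != *b) diff++;

    /* both words must end together, otherwise the lengths differ */
    return *a == '\0' && *b == '\0' && diff == 1;
}
```

A word is not a doublet of itself under this definition, since the count of differing positions must be exactly one.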

Input
The input ﬁle contains the dictionary followed by a number of word pairs. The dictionary
consists of a number of words, one per line, and is terminated by an empty line. The
pairs of input words follow; each pair of words occurs on a line separated by a space.

Output
For each input pair, print a set of lines starting with the ﬁrst word and ending with the
last. Each pair of adjacent lines must be a doublet.
If there are several minimal solutions, any one will do. If there is no solution, print a
line: “No solution.” Leave a blank line between cases.

Sample Input
booster
rooster
roaster
coasted
roasted
coastal
postal

booster roasted
coastal postal

Sample Output
booster
rooster
roaster
roasted

No solution.

3.8.8 Fmt

PC/UVa IDs: 110308/848, Popularity: C, Success rate: low, Level: 2
The UNIX program fmt reads lines of text, combining and breaking them so as
to create an output ﬁle with lines as close to 72 characters long as possible without
exceeding this limit. The rules for combining and breaking lines are as follows:
• A new line may be started anywhere there is a space in the input. When a new
line is started, blanks at the end of the previous line and at the beginning of the
new line are eliminated.
• A line break in the input may be eliminated in the output unless (1) it is at the
end of a blank or empty line, or (2) it is followed by a space or another line break.
When a line break is eliminated, it is replaced by a space.
• Spaces must be removed from the end of each output line.
• Any input word containing more than 72 characters must appear on an output
line by itself.
You may assume that the input text does not contain any tabbing characters.

Sample Input
Unix fmt
The unix fmt program reads lines of text, combining
and breaking lines so as to create an
output file with lines as close to without exceeding
72 characters long as possible. The rules for combining and breaking
lines are as follows.
1. A new line may be started anywhere there is a space in the input.
If a new line is started, there will be no trailing blanks at the
end of the previous line or at the beginning of the new line.
2. A line break in the input may be eliminated in the output, provided
it is not followed by a space or another line break. If a line
break is eliminated, it is replaced by a space.

Sample Output
Unix fmt
The unix fmt program reads lines of text, combining and breaking lines
so as to create an output file with lines as close to without exceeding
72 characters long as possible. The rules for combining and breaking
lines are as follows.
1. A new line may be started anywhere there is a space in the input.
If a new line is started, there will be no trailing blanks at the end of
the previous line or at the beginning of the new line.
2. A line break in the input may be eliminated in the output,
provided it is not followed by a space or another line break. If a line
break is eliminated, it is replaced by a space.

3.9 Hints
3.1 Should you use hard-coded logic to perform the character replacement, or would
a table-driven strategy of initialized arrays be easier?
3.2 Can you write a single comparison routine with arguments which can handle
comparison in all eight directions when called with the right arguments? Does it
pay to specify directions as pairs of integers (δx, δy) instead of by name?
3.3 Can you rearrange the letters of each word so that the common permutation
becomes more apparent?
3.5 What is the easiest way to compare just the numeric characters, as required for
identifying presentation errors?
3.6 Can you easily ﬁgure out which pairs of fragments go together, if not their order?
3.7 Can we model this problem as a path problem in graphs? It might pay to
look ahead to Chapter 9 where we present graph data structures and traversal
algorithms.
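
To illustrate hint 3.2, the eight directions can be stored as pairs of integer offsets so that one comparison routine serves all of them. This is a sketch under stated assumptions rather than a reference solution: the grid is taken to be an array of row strings, the offsets are named dr and dc (row and column change), and the function names are invented.

```c
#include <ctype.h>

/* The eight directions as (row change, column change) pairs. */
static const int dr[8] = {-1, -1, -1,  0, 0,  1, 1, 1};
static const int dc[8] = {-1,  0,  1, -1, 1, -1, 0, 1};

/* Does word appear in the m-by-n grid starting at row r, column c
   (both counted from 0) and heading in direction d? Case is ignored. */
static int match_from(const char *grid[], int m, int n,
                      int r, int c, int d, const char *word)
{
    int k;

    for (k = 0; word[k] != '\0'; k++) {
        int rr = r + k * dr[d];
        int cc = c + k * dc[d];

        if (rr < 0 || rr >= m || cc < 0 || cc >= n) return 0;
        if (tolower((unsigned char) grid[rr][cc]) !=
            tolower((unsigned char) word[k])) return 0;
    }
    return 1;
}

/* Find the uppermost, then leftmost, start of word. On success store the
   1-based row and column in *row and *col and return 1; else return 0. */
int find_word(const char *grid[], int m, int n, const char *word,
              int *row, int *col)
{
    int r, c, d;

    for (r = 0; r < m; r++)
        for (c = 0; c < n; c++)
            for (d = 0; d < 8; d++)
                if (match_from(grid, m, n, r, c, d, word)) {
                    *row = r + 1;
                    *col = c + 1;
                    return 1;
                }
    return 0;
}
```

Scanning rows from the top and columns from the left means the first match found is already the uppermost and, among those, the leftmost.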

3.10 Notes
3.4 Although it has a history dating back thousands of years, cryptography has been
revolutionized by computational advances and new algorithms. Read Schneier’s
[Sch94] and/or Stinson’s [Sti02] books to learn more about this fascinating area.
3.8 The gold standard among text-formatting programs is LaTeX, the system we used
to typeset this book. It is built on top of TeX, developed by master computer
scientist Don Knuth. He is the author of the famous Art of Computer Programming
books [Knu73a, Knu81, Knu73b], which are still fascinating and unsurpassed more
than 30 years after their original publication.

4
Sorting

Sorting is the most fundamental algorithmic problem in computer science and a rich
source of programming problems for two distinct reasons. First, sorting is a useful
operation which eﬃciently solves many tasks that every programmer encounters. As
soon as you recognize your job is a special case of sorting, proper use of library routines
makes short work of the problem.
Second, literally dozens of diﬀerent sorting algorithms have been developed, each
of which rests on a particular clever idea or observation. Most algorithm design
paradigms lead to interesting sorting algorithms, including divide-and-conquer, randomization, incremental insertion, and advanced data structures. Many interesting
programming/mathematical problems follow from properties of these algorithms.
In this chapter, we will review the primary applications of sorting, as well as the
theory behind the most important algorithms. Finally, we will describe the sorting
library routines provided by all modern programming languages, and show how to use
them on a non-trivial problem.

4.1 Sorting Applications
The key to understanding sorting is seeing how it can be used to solve many important
programming tasks:
• Uniqueness Testing — How can we test if the elements of a given collection of
items S are all distinct? Sort them into either increasing or decreasing order so
that any repeated items will fall next to each other. One pass through the elements
testing if S[i] = S[i + 1] for any 1 ≤ i < n then ﬁnishes the job.

• Deleting Duplicates — How can we remove all but one copy of any repeated
elements in S? Sort and sweep again does the job. Note that the sweeping is
best done by maintaining two indices — back, pointing to the last element in the
cleaned-out preﬁx array, and i, pointing to the next element to be considered. If
S[back] ≠ S[i], increment back and copy S[i] to S[back].
• Prioritizing Events — Suppose we are given a set of jobs to do, each with its
own deadline. Sorting the items according to the deadline date (or some related
criteria) puts the jobs in the right order to process them. Priority queue data
structures are useful for maintaining calendars or schedules when there are insertions and deletions, but sorting does the job if the set of events does not change
during execution.
• Median/Selection — Suppose we want to find the kth smallest item in set S.
After sorting the items in increasing order, this fellow sits in location S[k]; the
kth largest item sits in location S[n − k + 1]. This approach can be used to find
(in a slightly inefficient manner) the smallest, largest, and median elements as
special cases.
• Frequency Counting — Which is the most frequently occurring element in S, i.e.,
the mode? After sorting, a linear sweep lets us count the number of times each
element occurs.
• Reconstructing the Original Order — How can we restore the original arrangement
of a set of items after we permute them for some application? Add an extra ﬁeld
to the data record for the item, such that the ith record sets this ﬁeld to i. Carry
this ﬁeld along whenever you move the record, and later sort on it when you want
the initial order back.
• Set Intersection/Union — How can we intersect or union the elements of two
containers? If both of them have been sorted, we can merge them by repeatedly
taking the smaller of the two head elements, placing it into the new set if
desired, and then deleting the head from the appropriate list.
• Finding a Target Pair — How can we test whether there are two integers x, y ∈ S
such that x + y = z for some target z? Instead of testing all possible pairs, sort
the numbers in increasing order and sweep. As S[i] increases with i, its possible
partner j such that S[j] = z − S[i] decreases. Thus decreasing j appropriately as
i increases gives a nice solution.
• Eﬃcient Searching — How can we eﬃciently test whether element s is in set S?
Sure, ordering a set so as to permit eﬃcient binary search queries is perhaps the
most common application of sorting. Just don’t forget all the others!
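As one illustration of the sort-and-sweep idea, here is a minimal sketch of the target pair test in C. The names cmp_int and has_target_pair are hypothetical, the array is indexed from 0 rather than 1, and the pair is assumed to come from two distinct positions of S:

```c
#include <stdlib.h>

/* qsort comparison for ints in increasing order */
static int cmp_int(const void *a, const void *b)
{
    int x = *(const int *) a;
    int y = *(const int *) b;
    return (x > y) - (x < y);
}

/* Return 1 if two elements at distinct positions of s[0..n-1] sum to z,
   and 0 otherwise.  The array is sorted in place as a side effect. */
int has_target_pair(int s[], int n, int z)
{
    int i = 0;              /* scans up from the smallest element */
    int j = n - 1;          /* scans down from the largest element */

    qsort(s, n, sizeof(int), cmp_int);
    while (i < j) {
        long long sum = (long long) s[i] + s[j];
        if (sum == z) return 1;
        if (sum < z) i++;   /* s[i] needs a larger partner */
        else j--;           /* s[j] needs a smaller partner */
    }
    return 0;
}
```

After the sort, the sweep itself is linear, since each step moves one of the two indices toward the other.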

4.2 Sorting Algorithms
You have quite possibly seen a dozen or more diﬀerent algorithms for sorting data. Do
you remember bubblesort, insertion sort, selection sort, heapsort, mergesort, quicksort,
radix sort, distribution/bin sort, Shell sort, in-order tree traversal, and sorting networks?
Most likely your eyes started to glaze by the time you made it halfway through the list.
Who needs to know so many ways to do the same thing, especially when there already
exists a sorting library function included with your favorite programming language?
The real reason to study sorting algorithms is that the ideas behind them reappear
as the ideas behind algorithms for many other problems. Understand that heapsort
is really about data structures, that quicksort is really about randomization, and that
mergesort is really about divide-and-conquer, and you have a wide range of algorithmic
tools to work with.
We review a few particularly instructive algorithms below. Be sure to note what
useful properties (such as minimizing data movement) come with each algorithm.
• Selection Sort — This algorithm splits the input array into sorted and unsorted
parts, and with each iteration ﬁnds the smallest element remaining in the unsorted
region and moves it to the end of the sorted region:
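void selection_sort(int s[], int n)
{
    int i,j;            /* counters */
    int min;            /* index of minimum */

    for (i=0; i<n; i++) {
        min = i;
        for (j=i+1; j<n; j++)
            if (s[j] < s[min]) min = j;
        swap(&s[i],&s[min]);
    }
}
• Insertion Sort — This algorithm also maintains sorted and unsorted regions of
the array. In each iteration, the next unsorted element moves down to its proper
position in the sorted region:
void insertion_sort(int s[], int n)
{
    int i,j;            /* counters */

    for (i=1; i<n; i++) {
        j = i;
        while ((j>0) && (s[j] < s[j-1])) {
            swap(&s[j],&s[j-1]);
            j = j-1;
        }
    }
}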
Insertion sort is particularly signiﬁcant as the algorithm which minimizes the
amount of data movement. An inversion in a permutation p is a pair of elements
which are out of order, i.e., an i, j such that i < j yet p[i] > p[j]. Each swap
in insertion sort erases exactly one inversion, and no element is otherwise moved,
so the number of swaps equals the number of inversions. Since an almost-sorted
permutation has few inversions, insertion sort can be very eﬀective on such data.
• Quicksort — This algorithm reduces the job of sorting one big array to the job
of sorting two smaller arrays by performing a partition step. The partition separates the array into those elements that are less than the pivot/divider element,
and those which are greater than or equal to this pivot/divider element. Because no
element need ever move out of its region after the partition, each subarray can be
sorted independently. To facilitate sorting subarrays, the arguments to quicksort
include the indices of the ﬁrst (l) and last (h) elements in the subarray.
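int partition(int s[], int l, int h);   /* defined below */

void quicksort(int s[], int l, int h)
{
    int p;              /* index of partition */

    if ((h-l)>0) {
        p = partition(s,l,h);
        quicksort(s,l,p-1);
        quicksort(s,p+1,h);
    }
}
int partition(int s[], int l, int h)
{
    int i;              /* counter */
    int p;              /* pivot element index */
    int firsthigh;      /* divider position for pivot */

    p = h;
    firsthigh = l;
    for (i=l; i<h; i++)
        if (s[i] < s[p]) {
            swap(&s[i],&s[firsthigh]);
            firsthigh ++;
        }
    swap(&s[p],&s[firsthigh]);
    return(firsthigh);
}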
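Sorting and Searching in C
The stdlib.h library contains functions for sorting and searching. For sorting, there is
the function qsort:
#include <stdlib.h>
void qsort(void *base, size_t nel, size_t width,
           int (*compare) (const void *, const void *));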
The key to using qsort is realizing what its arguments do. It sorts the ﬁrst nel
elements of an array (pointed to by base), where each element is width-bytes long.
Thus we can sort arrays of 1-byte characters, 4-byte integers, or 100-byte records, all
by changing the value of width.
The ultimate desired order is determined by the function compare. It takes as arguments pointers to two width-byte elements, and returns a negative number if the
ﬁrst belongs before the second in sorted order, a positive number if the second belongs
before the ﬁrst, or zero if they are the same.
Here is a comparison function for sorting integers in increasing order:
int intcompare(const void *a, const void *b)
{
    const int *i = a;   /* qsort passes pointers to the elements */
    const int *j = b;

    if (*i > *j) return (1);
    if (*i < *j) return (-1);
    return (0);
}
This comparison function can be used to sort an array a, of which the ﬁrst cnt
elements are occupied, as follows:
qsort((char *) a, cnt, sizeof(int), intcompare);
A more sophisticated example of qsort in action appears in Section 4.5. The name
qsort suggests that quicksort is the algorithm implemented in this library function,
although this is usually irrelevant to the user.
Note that qsort rearranges the array in place and so destroys its original order. If you
need to restore the original order, make a copy or add an extra field to the record as
described in Section 4.1.
Binary search is an amazingly tricky algorithm to implement correctly under pressure.
The best solution is not to try, since the stdlib.h library contains an implementation
called bsearch(). Except for the search key, the arguments are the same as for qsort.
To search in the previously sorted array, try
bsearch(key, (char *) a, cnt, sizeof(int), intcompare);
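The two library calls can be combined as in the following minimal sketch. The names cmp_int_asc and sort_and_find are hypothetical. Note that bsearch takes a pointer to the key and returns a pointer to a matching element, or NULL if there is none:

```c
#include <stdlib.h>

/* comparison function with the signature that qsort and bsearch expect */
static int cmp_int_asc(const void *a, const void *b)
{
    int x = *(const int *) a;
    int y = *(const int *) b;
    return (x > y) - (x < y);
}

/* Sort a[0..cnt-1] in increasing order, then return the index of key
   in the sorted array, or -1 if key does not occur. */
int sort_and_find(int a[], size_t cnt, int key)
{
    int *hit;           /* pointer to the matching element, if any */

    qsort(a, cnt, sizeof(int), cmp_int_asc);
    hit = bsearch(&key, a, cnt, sizeof(int), cmp_int_asc);
    return (hit == NULL) ? -1 : (int) (hit - a);
}
```

The same comparison function must be used for both calls, since bsearch assumes the array is ordered consistently with it.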

Sorting and Searching in C++
The C++ Standard Template Library (STL), discussed in Section 2.2.1, includes methods for sorting, searching, and more. Serious C++ users should get familiar with STL.
To sort with STL, we can either use the default comparison (e.g., <) function defined
for the class, or override it with a special-purpose comparison function op:
void sort(RandomAccessIterator bg, RandomAccessIterator end)
void sort(RandomAccessIterator bg, RandomAccessIterator end,
BinaryPredicate op)
STL also provides a stable sorting routine, where keys of equal value are guaranteed
to remain in the same relative order. This can be useful if we are sorting by multiple
criteria:
void stable_sort(RandomAccessIterator bg, RandomAccessIterator end)
void stable_sort(RandomAccessIterator bg, RandomAccessIterator end,
BinaryPredicate op)
Other STL functions implement some of the applications of sorting described in
Section 4.1, including,
• nth_element – Rearrange the container so that the nth position holds the item that would be there in sorted order.
• set_union, set_intersection, set_difference – Construct the union, intersection, and set difference of two sorted containers.
• unique – Remove all consecutive duplicates.

Sorting and Searching in Java
The java.util.Arrays class contains various methods for sorting and searching. In
particular,
static void sort(Object[] a)
static void sort(Object[] a, Comparator c)
sorts the speciﬁed array of objects into ascending order using either the natural ordering
of its elements or a speciﬁc comparator c. Stable sorts are also available.
Methods for searching a sorted array for a speciﬁed object using either the natural
comparison function or a new comparator c are also provided:
binarySearch(Object[] a, Object key)
binarySearch(Object[] a, Object key, Comparator c)

4.5 Rating the Field
Our solution to Polly’s dating diﬃculties revolved around making the multi-criteria
sorting step as simple as possible. First, we had to set up the basic data structures:
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

#define NAMELENGTH      30      /* maximum length of name */
#define NSUITORS        100     /* maximum number of suitors */

#define BESTHEIGHT      180     /* best height in centimeters */
#define BESTWEIGHT      75      /* best weight in kilograms */

typedef struct {
    char first[NAMELENGTH];     /* suitor's first name */
    char last[NAMELENGTH];      /* suitor's last name */
    int height;                 /* suitor's height */
    int weight;                 /* suitor's weight */
} suitor;

suitor suitors[NSUITORS];       /* database of suitors */
int nsuitors;                   /* number of suitors */


Then we had to read the input. Note that we did not store each fellow’s actual height
and weight! Polly’s rating criteria for heights and weights were quite fussy, revolving
around how these quantities compare to a reference height/weight instead of a usual
linear order (i.e., increasing or decreasing). Instead, we altered each height and weight
appropriately so the quantities were linearly ordered by desirability:

void read_suitors()
{
    int height, weight;         /* raw values as read from the input */

    nsuitors = 0;
    while (scanf("%s %s %d %d\n",suitors[nsuitors].first,
                 suitors[nsuitors].last, &height, &weight) != EOF) {
        suitors[nsuitors].height = abs(height - BESTHEIGHT);
        if (weight > BESTWEIGHT)
            suitors[nsuitors].weight = weight - BESTWEIGHT;
        else
            suitors[nsuitors].weight = - weight;
        nsuitors ++;
    }
}
Finally, observe that we used scanf to read the ﬁrst and last names as tokens, instead
of character by character.
The critical comparison routine takes a pair of suitors a and b, and decides whether
a is better, b is better, or they are of equal rank. To satisfy the demands of qsort, we
must assign −1, 1, and 0 in these three cases, respectively. The following comparison
function does the job:
int suitor_compare(const void *x, const void *y)
{
    const suitor *a = x;        /* qsort passes pointers to the records */
    const suitor *b = y;
    int result;                 /* result of comparison */

    if (a->height < b->height) return(-1);
    if (a->height > b->height) return(1);
    if (a->weight < b->weight) return(-1);
    if (a->weight > b->weight) return(1);
    if ((result=strcmp(a->last,b->last)) != 0) return result;
    return(strcmp(a->first,b->first));
}
With the comparison function and input routines in place, all that remains is a driver
program which actually calls qsort and produces the output:
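int main()
{
    int i;              /* counter */

    read_suitors();
    qsort(suitors, nsuitors, sizeof(suitor), suitor_compare);
    for (i=0; i<nsuitors; i++)      /* print the suitors in ranked order */
        printf("%s, %s\n",suitors[i].last, suitors[i].first);
    return 0;
}

void print_bignum(bignum *n)
{
    int i;              /* counter */

    if (n->signbit == MINUS) printf("- ");
    for (i=n->lastdigit; i>=0; i--)
        printf("%c",'0'+ n->digits[i]);
    printf("\n");
}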

For simplicity, our coding examples will ignore the possibility of overﬂow.
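The routines in this chapter rely on a bignum record, the sign constants PLUS and MINUS, and the helpers initialize_bignum and max. The following is only a sketch consistent with how the code uses them: the field names and the sign values of 1 and −1 are inferred from that code, while the value of MAXDIGITS and the bodies of the two helpers are assumptions.

```c
#define MAXDIGITS 100   /* assumed maximum number of digits */
#define PLUS      1     /* positive sign bit */
#define MINUS     (-1)  /* negative sign bit */

typedef struct {
    char digits[MAXDIGITS];     /* digits[0] is the low-order digit */
    int signbit;                /* PLUS or MINUS */
    int lastdigit;              /* index of the high-order digit */
} bignum;

/* Set n to zero: every digit cleared and the sign positive. */
void initialize_bignum(bignum *n)
{
    int i;              /* counter */

    for (i = 0; i < MAXDIGITS; i++) n->digits[i] = 0;
    n->signbit = PLUS;
    n->lastdigit = 0;
}

/* Larger of two ints, used when sizing the result of an operation. */
int max(int a, int b)
{
    return (a > b) ? a : b;
}
```

Storing the digits low-order first means that a carry into a new high-order digit only requires incrementing lastdigit, with no data movement.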

5.3 High-Precision Arithmetic
The ﬁrst algorithms we learned in school were those for computing the four standard
arithmetical operations: addition, subtraction, multiplication, and division. We learned
to execute them without necessarily understanding the underlying theory.
Here we review these grade-school algorithms, with the emphasis on understanding
why they work and how you can teach them to a computer. For all four operations, we
interpret the arguments as c = a ⋆ b, where ⋆ is +, −, ∗, or /.
• Addition — Adding two integers is done from right to left, with any overﬂow
rippling to the next ﬁeld as a carry. Allowing negative numbers complicates matters by turning addition into subtraction. This is best handled by reducing it to
a special case:
void add_bignum(bignum *a, bignum *b, bignum *c)
{
    int carry;          /* carry digit */
    int i;              /* counter */

    initialize_bignum(c);

    if (a->signbit == b->signbit) c->signbit = a->signbit;
    else {
        if (a->signbit == MINUS) {
            a->signbit = PLUS;
            subtract_bignum(b,a,c);
            a->signbit = MINUS;
        } else {
            b->signbit = PLUS;
            subtract_bignum(a,b,c);
            b->signbit = MINUS;
        }
        return;
    }

    c->lastdigit = max(a->lastdigit,b->lastdigit)+1;
    carry = 0;

    for (i=0; i<=(c->lastdigit); i++) {
        c->digits[i] = (char)
            (carry+a->digits[i]+b->digits[i]) % 10;
        carry = (carry + a->digits[i] + b->digits[i]) / 10;
    }

    zero_justify(c);
}
Note a few things about the code. Manipulating the signbit is a non-trivial
headache. We reduced certain cases to subtraction by negating the numbers
and/or permuting the order of the operands, but took care to restore the signs
afterward.
The actual addition is quite simple, and made simpler by initializing all the
high-order digits to 0 and treating the final carry over as a special case of digit
addition. The zero_justify operation adjusts lastdigit to avoid leading zeros.
It is harmless to call after every operation, particularly as it corrects for −0:
void zero_justify(bignum *n)
{
    while ((n->lastdigit > 0) && (n->digits[ n->lastdigit ]==0))
        n->lastdigit --;

    if ((n->lastdigit == 0) && (n->digits[0] == 0))
        n->signbit = PLUS;      /* hack to avoid -0 */
}
• Subtraction — Subtraction is trickier than addition because it requires borrowing.
To ensure that borrowing terminates, it is best to make sure that the larger-magnitude number is on top.
void subtract_bignum(bignum *a, bignum *b, bignum *c)
{
    int borrow;         /* anything borrowed? */
    int v;              /* placeholder digit */
    int i;              /* counter */

    initialize_bignum(c);       /* clear digits and sign of the result */

    if ((a->signbit == MINUS) || (b->signbit == MINUS)) {
        b->signbit = -1 * b->signbit;
        add_bignum(a,b,c);
        b->signbit = -1 * b->signbit;
        return;
    }

    if (compare_bignum(a,b) == PLUS) {
        subtract_bignum(b,a,c);
        c->signbit = MINUS;
        return;
    }

    c->lastdigit = max(a->lastdigit,b->lastdigit);
    borrow = 0;

    for (i=0; i<=(c->lastdigit); i++) {
        v = (a->digits[i] - borrow - b->digits[i]);
        if (a->digits[i] > 0)
            borrow = 0;
        if (v < 0) {
            v = v + 10;
            borrow = 1;
        }
        c->digits[i] = (char) v % 10;
    }

    zero_justify(c);
}
• Comparison — Deciding which of two numbers is larger requires a comparison
operation. Comparison proceeds from highest-order digit to the right, starting
with the sign bit:
int compare_bignum(bignum *a, bignum *b)
{
    int i;              /* counter */

    if ((a->signbit==MINUS) && (b->signbit==PLUS)) return(PLUS);
    if ((a->signbit==PLUS) && (b->signbit==MINUS)) return(MINUS);

    if (b->lastdigit > a->lastdigit) return (PLUS * a->signbit);
    if (a->lastdigit > b->lastdigit) return (MINUS * a->signbit);

    for (i = a->lastdigit; i>=0; i--) {
        if (a->digits[i] > b->digits[i])
            return(MINUS * a->signbit);
        if (b->digits[i] > a->digits[i])
            return(PLUS * a->signbit);
    }

    return(0);
}
• Multiplication — Multiplication seems like a more advanced operation than addition or subtraction. A people as sophisticated as the Romans had a diﬃcult time
multiplying, even though their numbers look impressive on building cornerstones
and Super Bowls.
The Romans' problem was that they did not use a radix (or base) number
system. Certainly multiplication can be viewed as repeated addition and thus
solved in that manner, but it will be hopelessly slow. Squaring 999,999 by repeated
addition requires on the order of a million operations, but is easily doable by hand
using the row-by-row method we learned in school:
void multiply_bignum(bignum *a, bignum *b, bignum *c)
{
    bignum row;         /* represent shifted row */
    bignum tmp;         /* placeholder bignum */
    int i,j;            /* counters */

    initialize_bignum(c);

    row = *a;

    for (i=0; i<=b->lastdigit; i++) {
        for (j=1; j<=b->digits[i]; j++) {
            add_bignum(c,&row,&tmp);
            *c = tmp;
        }
        digit_shift(&row,1);
    }

    c->signbit = a->signbit * b->signbit;
    zero_justify(c);
}
Each operation involves adding the shifted first number d times to the total, where
d is the appropriate digit of the second number, and then shifting the first number
one more place to the left. We might have gotten fancier than using
repeated addition, but since the loop cannot spin more than nine times per digit,
any possible time savings will be relatively small. Shifting a radix-number one
place to the left is equivalent to multiplying it by the base of the radix, or 10
for decimal numbers:
void digit_shift(bignum *n, int d)      /* multiply n by 10^d */
{
    int i;              /* counter */

    if ((n->lastdigit == 0) && (n->digits[0] == 0)) return;

    for (i=n->lastdigit; i>=0; i--)
        n->digits[i+d] = n->digits[i];

    for (i=0; i<d; i++) n->digits[i] = 0;

    n->lastdigit = n->lastdigit + d;
}
• Division — Although long division is an operation feared by schoolchildren and
computer architects, it too can be handled with a simpler core loop than might
be imagined. Division by repeated subtraction is again far too slow to work with
large numbers, but the basic repeated loop of shifting the remainder to the left,
including the next digit, and subtracting oﬀ instances of the divisor is far easier
to program than “guessing” each quotient digit as we were taught in school:
void divide_bignum(bignum *a, bignum *b, bignum *c)
{
    bignum row;         /* represent shifted row */
    bignum tmp;         /* placeholder bignum */
    int asign, bsign;   /* temporary signs */
    int i,j;            /* counters */

    initialize_bignum(c);

    c->signbit = a->signbit * b->signbit;

    asign = a->signbit;
    bsign = b->signbit;

    a->signbit = PLUS;
    b->signbit = PLUS;

    initialize_bignum(&row);
    initialize_bignum(&tmp);

    c->lastdigit = a->lastdigit;

    for (i=a->lastdigit; i>=0; i--) {
        digit_shift(&row,1);
        row.digits[0] = a->digits[i];
        c->digits[i] = 0;
        while (compare_bignum(&row,b) != PLUS) {
            c->digits[i] ++;
            subtract_bignum(&row,b,&tmp);
            row = tmp;
        }
    }

    zero_justify(c);

    a->signbit = asign;
    b->signbit = bsign;
}
This routine performs integer division and throws away the remainder. If you want
to compute the remainder of a ÷ b, you can always do a − b(a ÷ b). Slicker methods
will follow when we discuss modular arithmetic in Section 7.3. The correct sign
for the quotient and remainder when one or more of the operands is negative is
somewhat ill-deﬁned, so don’t be surprised if the answer varies with programming
language.
• Exponentiation — Exponentiation is repeated multiplication, and hence subject
to the same performance problems as repeated addition on large numbers. The
trick is to observe that
$a^n = a^{n \div 2} \times a^{n \div 2} \times a^{n \bmod 2}$
so it can be done using only a logarithmic number of multiplications.
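The recurrence translates directly into a short recursive routine. This is a minimal sketch on machine integers rather than bignums; the name power is hypothetical, and the result is assumed to fit in an unsigned long long:

```c
/* Compute a^n using O(log n) multiplications.
   Assumes the result fits in an unsigned long long. */
unsigned long long power(unsigned long long a, unsigned int n)
{
    unsigned long long half;    /* a^(n div 2) */

    if (n == 0) return 1;
    half = power(a, n / 2);
    if (n % 2 == 0) return half * half;
    return half * half * a;     /* extra factor a^(n mod 2) */
}
```

The key point is that a^(n div 2) is computed once and squared, so each level of recursion halves n at the cost of at most two multiplications.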

5.4 Numerical Bases and Conversion
The digit representation of a given radix-number is a function of which numerical base
is used. Particularly interesting numerical bases include:
• Binary — Base-2 numbers are made up of the digits 0 and 1. They provide the
integer representation used within computers, because these digits map naturally
to on/oﬀ or high/low states.
• Octal — Base-8 numbers are useful as a shorthand to make it easier to read
binary numbers, since the bits can be read oﬀ from the right in groups of three.
Thus $11111001_2 = 371_8 = 249_{10}$. They also play a role in the only base-conversion
joke ever written. Why do programmers think Christmas is Halloween? Because
31 Oct = 25 Dec!
• Decimal — We use base-10 numbers because we learned to count on our ten
ﬁngers. The ancient Mayan people used a base-20 number system, presumably
because they counted on both ﬁngers and toes.
• Hexadecimal — Base-16 numbers are an even easier shorthand to represent binary
numbers, once you get over the fact that the digits representing 10 through 15
are “A” to “F.”
• Alphanumeric — Occasionally, one sees even higher numerical bases. Base-36
numbers are the highest you can represent using the 10 numerical digits with the
26 letters of the alphabet. Any integer can be represented in base-X provided you
can display X diﬀerent symbols.
There are two distinct algorithms you can use to convert base-a number x to a base-b
number y:
• Left to Right — Here, we find the most-significant digit of y first. It is the integer
$d_l$ such that
$(d_l + 1) b^k > x \ge d_l b^k$
where $1 \le d_l \le b - 1$ and k is the position of this digit, i.e., the largest integer
such that $b^k \le x$. In principle, this can be found by trial and error, although
you must be able to compare the magnitude of numbers in different bases. This
is analogous to the long-division algorithm described above.
• Right to Left — Here, we ﬁnd the least-signiﬁcant digit of y ﬁrst. This is the
remainder of x divided by b. Remainders are exactly what is computed when
doing modular arithmetic in Section 7.3. The cute thing is that we can compute
the remainder of x on a digit-by-digit basis, making it easy to work with large
integers.
Right-to-left translation is similar to how we translated conventional integers to our
bignum representation. Taking the long integer mod 10 (using the % operator) enables us
to peel oﬀ the low-order digit:
void int_to_bignum(int s, bignum *n)
{
    int i;              /* counter */
    int t;              /* int to work with */

    if (s >= 0) n->signbit = PLUS;
    else n->signbit = MINUS;

    /* MAXDIGITS is the capacity of the digits array */
    for (i=0; i<MAXDIGITS; i++) n->digits[i] = (char) 0;

    n->lastdigit = -1;

    t = abs(s);

    while (t > 0) {
        n->lastdigit ++;
        n->digits[ n->lastdigit ] = (t % 10);
        t = t / 10;
    }

    if (s == 0) n->lastdigit = 0;
}
Using a diﬀerent modulus than 10 is the key to converting numbers to alternate bases.
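A minimal sketch of right-to-left conversion to an arbitrary base follows. The name to_base and the symbol table are assumptions for illustration. The digits come out low-order first, so they are collected in a temporary buffer and then reversed:

```c
/* Write the base-b representation of x into out, most-significant digit
   first and null-terminated.  Assumes 2 <= b <= 36 and that out has room
   for at least 65 characters (64 binary digits plus the terminator). */
void to_base(unsigned long long x, int b, char out[])
{
    const char *symbols = "0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZ";
    char tmp[65];       /* digits in low-order-first order */
    int len = 0;        /* number of digits produced */
    int i;              /* counter */

    do {
        tmp[len++] = symbols[x % b];    /* peel off the low-order digit */
        x = x / b;
    } while (x > 0);

    for (i = 0; i < len; i++) out[i] = tmp[len - 1 - i];
    out[len] = '\0';
}
```

For example, converting 25 to base 8 this way yields the string "31", in line with the joke above.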

5.5 Real Numbers
The branches of mathematics designed to work with real numbers are real important in
understanding the real world. Newton had to develop calculus before he could develop
the basic laws of motion. The need to integrate or solve systems of equations occurs in
every area of science. The ﬁrst computers were designed as number-crunching machines,
and the numbers they were designed to crunch were real numbers.
Working with real numbers on computers is very challenging because ﬂoating point
arithmetic has limited precision. The most important thing to remember about real
numbers is that they are not real real numbers:
• Much of mathematics relies on the continuity of the reals, the fact that there
always exists a number c between a and b if a < b. This is not true in real
numbers as they are represented in a computer.
• Many algorithms rely on an assumption of exact computation. This is not true of
real numbers as they are represented in a computer. The associativity of addition
guarantees that
(a + b) + c = a + (b + c)
Unfortunately, this is not necessarily true in computer arithmetic because of
round-oﬀ errors.
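The failure of associativity is easy to observe. In the following sketch (sums_differ is a hypothetical name, and IEEE double-precision arithmetic is assumed), the two groupings of 1e20, −1e20, and 1 give different answers, because 1 is too small to survive being added to −1e20:

```c
/* Return 1 if (a + b) + c and a + (b + c) differ in double arithmetic. */
int sums_differ(double a, double b, double c)
{
    /* volatile keeps each sum stored as a double */
    volatile double left = (a + b) + c;
    volatile double right = a + (b + c);

    return left != right;
}
```

With a = 1e20, b = −1e20, and c = 1, the left grouping yields 1 while the right grouping yields 0.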
There are several diﬀerent types of numbers which we may well want to work with:
• Integers — These are the counting numbers, −∞, . . . , −2, −1, 0, 1, 2, . . . , ∞. Subsets of the integers include the natural numbers (integers starting from 0) and the
positive integers (those starting from 1), although the notation is not universal.
A limiting aspect of integers is that there are gaps between them. An April Fool’s
Day edition of a newspaper once had a headline announcing, “Scientists Discover
New Number Between 6 and 7.” This is funny because while there is always a
rational number between any two rationals x and y ((x + y)/2 is a good example),
it would indeed be newsworthy if they found an integer between 6 and 7.
• Rational Numbers — These are the numbers which can be expressed as the ratio
of two integers, i.e. c is rational if c = a/b for integers a and b. Every integer can
be represented by a rational, namely, c/1. The rational numbers are synonymous
with fractions, provided we include improper fractions a/b where a > b.
• Irrational Numbers — There are many interesting numbers which are not rational
numbers. Examples include π = 3.1415926 . . . , √2 = 1.41421 . . . , and
e = 2.71828 . . . . It can be proven that there does not exist any pair of integers x
and y such that x/y equals any of these numbers.
So how can you represent them on a computer? If you really need the values
to arbitrary precision, they can be computed using Taylor series expansions. But
for all practical purposes it suffices to approximate them using ten digits or
so.

5.5.1 Dealing With Real Numbers

The internal representation of ﬂoating point numbers varies from computer to computer,
language to language, and compiler to compiler. This makes them a big pain to deal
with.
There is an IEEE standard for ﬂoating point arithmetic which an increasing number
of vendors adhere to, but you must always expect trouble on computations which require
very high precision. Floating point numbers are represented in scientiﬁc notation, i.e.,
$a \times 2^c$, with a limited number of bits assigned to both the mantissa a and exponent
c. Operating on two numbers with vastly diﬀerent exponents often results in overﬂow
or underﬂow errors, since the mantissa does not have enough bits to accommodate the
answer.
Such issues are the source of many diﬃculties with roundoﬀ errors. The most important problem occurs in testing for equality of real numbers, since there is usually enough
garbage in the low-order bits of mantissa to render such tests meaningless. Never test
whether a float is equal to zero, or any other float for that matter. Instead, test if it
lies within some small value ε, plus or minus, of the target.
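A tolerance test of this kind might look as follows. The name nearly_equal and the value of EPSILON are assumptions; a suitable tolerance depends on the magnitudes involved in the problem at hand:

```c
#define EPSILON 1e-9    /* assumed tolerance; tune it to the problem */

/* Return 1 if x and y lie within EPSILON of each other, else 0. */
int nearly_equal(double x, double y)
{
    double diff = x - y;

    if (diff < 0) diff = -diff;     /* absolute value of the difference */
    return diff <= EPSILON;
}
```

For instance, 0.1 + 0.2 and 0.3 are not identical as doubles, yet they pass this test.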
Many problems will ask you to display an answer to a given number of digits of
precision to the right of the decimal point. Here we must distinguish between rounding
and truncating. Truncation is exemplified by the floor function, which converts a real
number to an integer by chopping off the fractional part. Rounding is used to get a
more accurate value for the least significant digit. To round a number X to k decimal
digits, use the formula
$\mathrm{round}(X, k) = \lfloor 10^k X + 1/2 \rfloor / 10^k$
Use your language’s formatted output function to display only the desired number of
digits when so requested.
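In C, the printf family rounds to the requested precision rather than truncating. A minimal sketch, where format_rounded is a hypothetical helper name:

```c
#include <stdio.h>

/* Format x into buf with exactly k digits after the decimal point,
   letting the library do the rounding.  size is the capacity of buf. */
void format_rounded(double x, int k, char buf[], size_t size)
{
    snprintf(buf, size, "%.*f", k, x);
}
```

The `*` in the format takes the precision from the argument list, so the number of digits can be chosen at run time. Values that lie exactly halfway between two candidates are a special case whose treatment may differ from the formula above.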

5.5.2 Fractions

Exact rational numbers x/y are best represented by pairs of integers x, y, where x is
the numerator and y is the denominator of the fraction.
The basic arithmetic operations on rationals $c = x_1/y_1$ and $d = x_2/y_2$ are easy to
program:
• Addition — We must find a common denominator before adding fractions, so
$c + d = \frac{x_1 y_2 + x_2 y_1}{y_1 y_2}$
• Subtraction — Same as addition, since $c - d = c + (-1) \times d$, so
$c - d = \frac{x_1 y_2 - x_2 y_1}{y_1 y_2}$
• Multiplication — Since multiplication is repeated addition, it is easily shown that
$c \times d = \frac{x_1 x_2}{y_1 y_2}$
• Division — To divide fractions you multiply by the reciprocal of the divisor,
so
$c/d = \frac{x_1}{y_1} \times \frac{y_2}{x_2} = \frac{x_1 y_2}{x_2 y_1}$
But why does this work? Because under this deﬁnition, d(c/d) = c, which is
exactly what we want division to mean.
Blindly implementing these operations leads to a signiﬁcant danger of overﬂows. It
is important to reduce fractions to their simplest representation, i.e., replace 2/4 by
1/2. The secret is to cancel out the greatest common divisor of the numerator and the
denominator, i.e., the largest integer which divides both of these integers.
Finding the greatest common divisor by trial and error or exhaustive search can be
very expensive. However, the Euclidean algorithm for gcd is eﬃcient, very simple to
program, and discussed in Section 7.2.1.
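The following sketch shows one way to keep fractions reduced as they are combined. The type fraction and all function names are hypothetical, and the gcd routine is a plain Euclidean loop included only to make the example self-contained:

```c
typedef struct {
    long long num;      /* numerator */
    long long den;      /* denominator, kept positive */
} fraction;

/* Greatest common divisor of |a| and |b| by the Euclidean algorithm. */
static long long gcd(long long a, long long b)
{
    long long t;

    if (a < 0) a = -a;
    if (b < 0) b = -b;
    while (b != 0) { t = a % b; a = b; b = t; }
    return a;
}

/* Build x/y in lowest terms with a positive denominator (y != 0). */
fraction make_fraction(long long x, long long y)
{
    fraction f;
    long long g = gcd(x, y);

    if (y < 0) { x = -x; y = -y; }
    f.num = x / g;
    f.den = y / g;
    return f;
}

/* c + d = (x1 y2 + x2 y1) / (y1 y2), then reduced */
fraction add_fractions(fraction c, fraction d)
{
    return make_fraction(c.num * d.den + d.num * c.den, c.den * d.den);
}

/* c * d = (x1 x2) / (y1 y2), then reduced */
fraction multiply_fractions(fraction c, fraction d)
{
    return make_fraction(c.num * d.num, c.den * d.den);
}
```

Reducing after every operation keeps the numbers as small as possible, although the intermediate products can still overflow for large enough inputs.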

5.5.3 Decimals

The decimal representation of real numbers is just a special case of the rational numbers.
A decimal number represents the sum of two numbers; the integer part to the left of the
decimal point and the fractional part to the right of the decimal. Thus the fractional
representation of the ﬁrst ﬁve decimal digits of π is
3.1415 = (3/1) + (1415/10000) = 6283/2000
The denominator of the fractional part is $10^i$ if the rightmost non-zero digit lies i
places to the right of the decimal point.
Converting a rational to a decimal number is easy, in principle; just divide the
numerator by the denominator. The catch is that many fractions do not have
a ﬁnite decimal representation. For example, 1/3 = 0.3333333 . . . , and 1/7 =
0.14285714285714285714 . . . . Usually a decimal representation with the ﬁrst ten or so
signiﬁcant digits will suﬃce, but sometimes we want to know the exact representation,
i.e., $1/30 = 0.0\overline{3}$ or $1/7 = 0.\overline{142857}$.
What fraction goes with a given repeating decimal number? We can ﬁnd it by explicitly simulating the long division. The decimal expansion of fraction 1/7 is obtained by
dividing 7 into 1.0000000. . . . The next digit of the quotient is obtained by multiplying
the remainder by ten, adding the last digit (always zero), and ﬁnding how many times
the denominator ﬁts into this quantity. Notice that we get into an inﬁnite loop the
instant this quantity repeats. Thus the decimal digits between these positions repeat
forever.
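The long-division simulation can be sketched as follows. The routine records the digit position at which each remainder first appears, and stops when a remainder recurs or reaches zero. The name repeating_decimal, the bound MAXDEN, and the convention of writing the repeat in parentheses are all assumptions made for this example:

```c
#define MAXDEN 1000     /* assumed bound on the denominator */

/* Write the decimal expansion of a/b into out, with the repeating block
   in parentheses, e.g. 1/7 gives "0.(142857)".  Assumes 0 < a < b <= MAXDEN
   and that out has room for at least MAXDEN + 5 characters. */
void repeating_decimal(int a, int b, char out[])
{
    int seen[MAXDEN];       /* digit index where each remainder first arose */
    char digits[MAXDEN];    /* quotient digits produced so far */
    int ndigits = 0;        /* number of quotient digits */
    int r = a;              /* current remainder */
    int start;              /* index where the repeat begins */
    int i, k = 0;           /* counters */

    for (i = 0; i < b; i++) seen[i] = -1;

    while (r != 0 && seen[r] == -1) {
        seen[r] = ndigits;
        r = r * 10;                         /* bring down a zero */
        digits[ndigits++] = '0' + r / b;    /* next quotient digit */
        r = r % b;
    }
    start = (r == 0) ? ndigits : seen[r];

    out[k++] = '0';
    out[k++] = '.';
    for (i = 0; i < ndigits; i++) {
        if (i == start) out[k++] = '(';
        out[k++] = digits[i];
    }
    if (start < ndigits) out[k++] = ')';
    out[k] = '\0';
}
```

Since there are only b possible remainders, the loop must stop within b steps, which also bounds the length of the repeat.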
A simpler method results if we know (or guess) the length of the repeat. Suppose that
the simple fraction a/b has a repeat R of length l. Then $10^l (a/b) - (a/b) = R$, and
hence $a/b = R/(10^l - 1)$. To demonstrate, suppose we want the fraction associated with
a/b = 0.0123123 . . .. The repeat length is three digits, and R = 12.3 by the formula
above. Thus a/b = 12.3/999 = 123/9990.

5.6 Algebra
In its full glory, algebra is the study of groups and rings. High-school algebra is basically
limited to the study of equations, deﬁned over the operators addition and multiplication.
The most important class of formulae are the polynomials, such that
$P(x) = c_0 + c_1 x + c_2 x^2 + \ldots$, where x is the variable and $c_i$ is the coefficient
of the ith term $x^i$. The degree of a polynomial is the largest i such that $c_i$ is non-zero.

5.6.1 Manipulating Polynomials

The most natural representation for an nth-degree univariate (one variable) polynomial
is as an array of n + 1 coefficients $c_0$ through $c_n$. Such a representation makes short
work of the basic arithmetic operations on polynomials:
• Evaluation — Computing P(x) for some given x can easily be done by brute force,
namely, computing each term $c_i x^i$ independently and adding them together. The
trouble is that this will cost $O(n^2)$ multiplications where O(n) suffice. The secret
is to note that $x^i = x^{i-1} x$, so if we compute the terms from smallest degree to
highest degree we can keep track of the current power of x, and get away with
two multiplications per term ($x^{i-1} \times x$, and then $c_i \times x^i$).
Alternately, one can employ Horner's rule, an even slicker way to do the same
job:
$a_n x^n + a_{n-1} x^{n-1} + \ldots + a_0 = ((a_n x + a_{n-1})x + \ldots)x + a_0$
• Addition/Subtraction — Adding and subtracting polynomials is even easier than
the same operations on long integers, since there is no borrowing or carrying.
Simply add or subtract the coeﬃcients of the ith terms for all i from zero to the
maximum degree.
• Multiplication — The product of polynomials P(x) and Q(x) is the sum of the
product of every pair of terms, where each term comes from a different polynomial:
$P(x) \times Q(x) = \sum_{i=0}^{\mathrm{degree}(P)} \sum_{j=0}^{\mathrm{degree}(Q)} (c_i c_j) x^{i+j}$

Such an all-against-all operation is called a convolution. Other convolutions in
this book include integer multiplication (all digits against all digits) and string
matching (all possible positions of the pattern string against all possible text
positions). There is an amazing algorithm (the fast Fourier transform, or FFT)
which computes convolutions in $O(n \log n)$ time instead of $O(n^2)$, but it is well
beyond the scope of this book. Still, it is nice to recognize when you are doing a
convolution so you know that such tools exist.
• Division — Dividing polynomials is a tricky business, since the polynomials
are not closed under division. Note that 1/x may or may not be thought of as a
polynomial, since it is $x^{-1}$, but $2x/(x^2 + 1)$ certainly isn't. It is a rational function.
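The evaluation and multiplication operations in the list above can be sketched as follows, using the array-of-coefficients representation. The polynomial type, the bound MAXDEGREE, and the function names are assumptions made for this illustration only; evaluation uses Horner's rule and multiplication is the quadratic all-against-all convolution, not the FFT.

```c
#include <assert.h>

#define MAXDEGREE 100           /* largest degree handled (an assumption) */

typedef struct {
    int degree;                 /* degree of the polynomial */
    double c[MAXDEGREE + 1];    /* c[i] is the coefficient of x^i */
} polynomial;

/* Evaluate p at x with Horner's rule: one multiplication per term. */
double evaluate(const polynomial *p, double x)
{
    double result = 0.0;
    int i;

    for (i = p->degree; i >= 0; i--)
        result = result * x + p->c[i];
    return result;
}

/* r = p * q by convolution: every term of p against every term of q.
   r must not be the same object as p or q. */
void multiply(const polynomial *p, const polynomial *q, polynomial *r)
{
    int i, j;

    assert(p->degree + q->degree <= MAXDEGREE);
    r->degree = p->degree + q->degree;
    for (i = 0; i <= r->degree; i++)
        r->c[i] = 0.0;
    for (i = 0; i <= p->degree; i++)
        for (j = 0; j <= q->degree; j++)
            r->c[i + j] += p->c[i] * q->c[j];
}
```

For instance, multiplying 1 + x by itself fills r with the coefficients 1, 2, 1, and evaluating that product at x = 3 gives 16.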
Sometimes polynomials are sparse, meaning that for many coefficients $c_i = 0$. Sufficiently sparse polynomials should be represented as linked lists of coefficient/degree
pairs. Multivariate polynomials are defined over more than one variable. The bivariate
polynomial f(x, y) can be represented by a matrix C of coefficients, such that C[i][j] is
the coefficient of $x^i y^j$.

5.6.2 Root Finding

Given a polynomial P (x) and a target number t, the problem of root ﬁnding is identifying
any or all x such that P (x) = t.
If P(x) is a first-degree polynomial, the root is simply $x = (t - a_0)/a_1$, where $a_i$ is
the coefficient of $x^i$ in P(x). If P(x) is a second-degree polynomial, then the quadratic
formula applies:
$x = \frac{-a_1 \pm \sqrt{a_1^2 - 4 a_2 (a_0 - t)}}{2 a_2}$
There are more complicated formulae for solving third- and fourth-degree polynomials
before the good times end. No general closed form in radicals exists for the roots of
fifth-degree (quintic) or higher-degree equations.
Beyond quadratic equations, numerical methods are typically used. Any text on numerical analysis will describe a variety of root-finding algorithms, including Newton's
method (also known as Newton-Raphson), as well as many potential traps such as numerical instability. But the basic idea is that of binary search. Suppose a function f(x) is monotonically
increasing between l and u, meaning that f(i) ≤ f(j) for all l ≤ i ≤ j ≤ u. Now suppose we want to find the x such that f(x) = t. We can compare f((l + u)/2) with t.
If t < f((l + u)/2), then the root lies between l and (l + u)/2; if not, it lies between
(l + u)/2 and u. We can keep recursing until the window is narrow enough for our taste.
This method can be used to compute square roots because this is equivalent to solving
$x^2 = t$ between 1 and t for all t ≥ 1. However, a simpler method to find the ith root of
t uses exponential functions and logarithms to compute $t^{1/i}$.
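The binary search idea can be sketched for the square-root case as follows. The function name bisect_sqrt and the tolerance parameter are our own assumptions; the early exit when the midpoint coincides with an endpoint is a guard we added so the loop cannot spin forever once the window is narrower than floating-point precision allows.

```c
#include <assert.h>

/* Square root of t >= 1 by binary search on the monotonically
   increasing function x*x over the interval [1, t]. */
double bisect_sqrt(double t, double tolerance)
{
    double l = 1.0, u = t, mid;

    assert(t >= 1.0 && tolerance > 0.0);
    while (u - l > tolerance) {
        mid = (l + u) / 2.0;
        if (mid <= l || mid >= u)
            break;              /* window can no longer be halved */
        if (t < mid * mid)
            u = mid;            /* root lies in the lower half */
        else
            l = mid;            /* root lies in the upper half */
    }
    return (l + u) / 2.0;
}
```

Each iteration halves the window, so roughly log2((t - 1)/tolerance) iterations suffice.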

5.7 Logarithms
You have probably noticed the log and exp buttons on your calculator, but quite likely
have never used them. You may even have forgotten why they are there. A logarithm
is simply an inverse exponential function. Saying $b^x = y$ is equivalent to saying that
$x = \log_b y$.

The b term is known as the base of the logarithm. Two bases are of particular importance for mathematical and historical reasons. The natural log, usually denoted ln x,
is a base e = 2.71828... logarithm. The inverse of ln x is the exponential function
$\exp x = e^x$. Thus by composing these functions we get
exp(ln x) = x
Less common today is the base-10 or common logarithm, usually denoted log x. Common logarithms were particularly important in the days before pocket calculators.1
Logarithms provided the easiest way to multiply big numbers by hand, either implicitly
using a slide rule or explicitly by using a book of logarithms.
Logarithms are still useful for multiplication, particularly for exponentiation. Recall
that $\log_a xy = \log_a x + \log_a y$; i.e., the log of a product is the sum of the logs. A direct
consequence of this is that
$\log_a n^b = b \cdot \log_a n$
So how can we compute $a^b$ for any positive a and any b using the exp(x) and ln(x) functions? We
know
$a^b = \exp(\ln(a^b)) = \exp(b \ln a)$
so the problem is reduced to one multiplication plus one call of each of these functions.
We can use this method to compute square roots since $\sqrt{x} = x^{1/2}$, and for any
other fractional power as well. Such applications are one reason why the mathematics
library of any reasonable programming language includes the ln and exp functions. Be
aware that these are complicated numerical functions (computed using series
approximations such as Taylor expansions) which have inherent computational uncertainty, so do not expect that
exp(0.5 ln 4) will give you exactly 2.
The other important fact to remember about logarithms is that it is easy to convert
the logarithm from one base to another, as a consequence of the following formula:
$\log_a b = \frac{\log_c b}{\log_c a}$
Thus changing the base of log b from base-c to base-a simply involves dividing by $\log_c a$.
It is therefore easy to write a common log function from a natural log function, and vice
versa.
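As a quick numerical check of the base-conversion formula, here is a worked example of our own (values rounded to six decimal places), taking c = e and b = 1000 with a = 2 and then a = 10:

```latex
\log_2 1000 = \frac{\ln 1000}{\ln 2} \approx \frac{6.907755}{0.693147} \approx 9.9658,
\qquad
\log_{10} 1000 = \frac{\ln 1000}{\ln 10} \approx \frac{6.907755}{2.302585} \approx 3.0000
```

The first value is plausible because $2^{10} = 1024$ is only slightly larger than 1000, and the second agrees with $10^3 = 1000$.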

1 The authors of this book are old enough to remember this pre-1972 era.

5.8 Real Mathematical Libraries
Math Libraries in C/C++
The standard C/C++ math library has several useful functions for working with real
numbers:

#include <math.h>                /* include the math library */

double floor(double x);          /* largest integer not greater than x */
double ceil(double x);           /* smallest integer not less than x */
double fabs(double x);           /* compute the absolute value of x */

double sqrt(double x);           /* compute square roots */
double exp(double x);            /* compute e^x */
double log(double x);            /* compute the base-e logarithm */
double log10(double x);          /* compute the base-10 logarithm */
double pow(double x, double y);  /* compute x^y */

Math Libraries in Java
The Java class java.lang.Math has all of these functions and a few more, most obviously
a round function to take a real to the nearest integer.

5.9 Problems
5.9.1 Primary Arithmetic

PC/UVa IDs: 110501/10035, Popularity: A, Success rate: average Level: 1
Children are taught to add multi-digit numbers from right to left, one digit at a time.
Many ﬁnd the “carry” operation, where a 1 is carried from one digit position to the
next, to be a signiﬁcant challenge. Your job is to count the number of carry operations
for each of a set of addition problems so that educators may assess their diﬃculty.

Input
Each line of input contains two unsigned integers of fewer than 10 digits. The last line of
input contains “0 0”.

Output
For each line of input except the last, compute the number of carry operations that
result from adding the two numbers and print them in the format shown below.

Sample Input
123 456
555 555
123 594
0 0

Sample Output
No carry operation.
3 carry operations.
1 carry operation.

5.9.2 Reverse and Add

PC/UVa IDs: 110502/10018, Popularity: A, Success rate: low Level: 1
The reverse and add function starts with a number, reverses its digits, and adds the
reverse to the original. If the sum is not a palindrome (meaning it does not give the
same number read from left to right and right to left), we repeat this procedure until
it does.
For example, if we start with 195 as the initial number, we get 9,339 as the resulting
palindrome after the fourth addition:
  195      786    1,473    5,214
  591      687    3,741    4,125
+ ---    + ---  + -----  + -----
  786    1,473    5,214    9,339

This method leads to palindromes in a few steps for almost all of the integers. But
there are interesting exceptions. 196 is the ﬁrst number for which no palindrome has
been found. It has never been proven, however, that no such palindrome exists.
You must write a program that takes a given number and gives the resulting
palindrome (if one exists) and the number of iterations/additions it took to ﬁnd it.
You may assume that all the numbers used as test data will terminate in an answer
with fewer than 1,000 iterations (additions), and yield a palindrome that is not greater
than 4,294,967,295.

Input
The ﬁrst line will contain an integer N (0 < N ≤ 100), giving the number of test cases,
while the next N lines each contain a single integer P whose palindrome you are to
compute.

Output
For each of the N integers, print a line giving the minimum number of iterations to ﬁnd
the palindrome, a single space, and then the resulting palindrome itself.

Sample Input
3
195
265
750

Sample Output
4 9339
5 45254
3 6666

5.9.3 The Archeologist’s Dilemma

PC/UVa IDs: 110503/701, Popularity: A, Success rate: low Level: 1
An archaeologist, seeking proof of the presence of extraterrestrials in the Earth’s past,
has stumbled upon a partially destroyed wall containing strange chains of numbers. The
left-hand part of these lines of digits is always intact, but unfortunately the right-hand
one is often lost because of erosion of the stone. However, she notices that all the
numbers with all their digits intact are powers of 2, so that the hypothesis that all of
them are powers of 2 is obvious. To reinforce her belief, she selects a list of numbers
on which it is apparent that the number of legible digits is strictly smaller than the
number of lost ones, and asks you to find the smallest power of 2 (if any) whose first
digits coincide with those of the list.
Thus you must write a program that, given an integer, determines the smallest exponent E (if it exists) such that the first digits of $2^E$ coincide with the integer (remember
that more than half of the digits are missing).

Input
Each line contains a positive integer N not bigger than 2,147,483,648.

Output
For every one of these integers, print a line containing the smallest positive integer E
such that the first digits of $2^E$ are precisely the digits of N, or, if there isn’t one, the
sentence “no power of 2”.

Sample Input
1
2
10

Sample Output
7
8
20

5.9.4 Ones

PC/UVa IDs: 110504/10127, Popularity: A, Success rate: high Level: 2
Given any integer 0 ≤ n ≤ 10,000 not divisible by 2 or 5, some multiple of n is a
number which in decimal notation is a sequence of 1’s. How many digits are in the
smallest such multiple of n?

Input
A ﬁle of integers at one integer per line.

Output
Each output line gives the smallest integer x > 0 such that $p = \sum_{i=0}^{x-1} 1 \times 10^i$, where a
is the corresponding input integer, $p = a \times b$, and b is an integer greater than zero.

Sample Input
3
7
9901

Sample Output
3
6
12

5.9.5 A Multiplication Game

PC/UVa IDs: 110505/847, Popularity: A, Success rate: high Level: 3
Stan and Ollie play the game of multiplication by multiplying an integer p by one of
the numbers 2 to 9. Stan always starts with p = 1, does his multiplication, then Ollie
multiplies the number, then Stan, and so on. Before a game starts, they draw an integer
1 < n < 4,294,967,295 and the winner is whoever reaches p ≥ n first.

Input
Each input line contains a single integer n.

Output
For each line of input, output one line – either
Stan wins.
or
Ollie wins.
assuming that both of them play perfectly.

Sample Input
162
17
34012226

Sample Output
Stan wins.
Ollie wins.
Stan wins.

5.9.6 Polynomial Coefficients

PC/UVa IDs: 110506/10105, Popularity: B, Success rate: high Level: 1
This problem seeks the coefficients resulting from the expansion of the polynomial
$P = (x_1 + x_2 + \ldots + x_k)^n$

Input
The input will consist of a set of pairs of lines. The first line of the pair consists of two
integers n and k separated with space (0 < k, n < 13). These integers define the power
of the polynomial and the number of variables. The second line in each pair consists of
k non-negative integers $n_1, \ldots, n_k$, where $n_1 + \ldots + n_k = n$.

Output
For each input pair of lines the output line should consist of one integer, the coefficient
of the monomial $x_1^{n_1} x_2^{n_2} \ldots x_k^{n_k}$ in the expansion of the polynomial $(x_1 + x_2 + \ldots + x_k)^n$.

Sample Input
2 2
1 1
2 12
1 0 0 0 0 0 0 0 0 0 1 0

Sample Output
2
2

5.9.7 The Stern-Brocot Number System

PC/UVa IDs: 110507/10077, Popularity: C, Success rate: high Level: 1
The Stern-Brocot tree is a beautiful way of constructing the set of all non-negative
fractions m/n where m and n are relatively prime. The idea is to start with two fractions
(0/1, 1/0) and then repeat the following operation as many times as desired:
Insert (m + m')/(n + n') between two adjacent fractions m/n and m'/n'.
For example, the first step gives us one new entry between 0/1 and 1/0,
0/1, 1/1, 1/0
and the next gives two more:
0/1, 1/2, 1/1, 2/1, 1/0
The next gives four more:
0/1, 1/3, 1/2, 2/3, 1/1, 3/2, 2/1, 3/1, 1/0
The entire array can be regarded as an infinite binary tree structure whose top levels
look like this:
[Figure: the top levels of the Stern-Brocot tree]

This construction preserves order, and thus we cannot possibly get the same fraction
in two diﬀerent places.
We can, in fact, regard the Stern-Brocot tree as a number system for representing
rational numbers, because each positive, reduced fraction occurs exactly once. Let us
use the letters “L” and “R” to stand for going down the left or right branch as we
proceed from the root of the tree to a particular fraction; then a string of L’s and R’s
uniquely identifies a place in the tree. For example, LRRL means that we go left from 1/1
down to 1/2, then right to 2/3, then right to 3/4, then left to 5/7. We can consider LRRL to be
a representation of 5/7. Every positive fraction gets represented in this way as a unique
string of L’s and R’s.

Well, almost every fraction. The fraction 1/1 corresponds to the empty string. We will
denote it by I, since that looks something like 1 and stands for “identity.”
In this problem, given a positive rational fraction, represent it in the Stern-Brocot
number system.

Input
The input ﬁle contains multiple test cases. Each test case consists of a line containing
two positive integers m and n, where m and n are relatively prime. The input terminates
with a test case containing two 1’s for m and n, and this case must not be processed.

Output
For each test case in the input ﬁle, output a line containing the representation of the
given fraction in the Stern-Brocot number system.

Sample Input
5 7
878 323
1 1

Sample Output
LRRL
RRLRRLRLLLLRLRRR

5.9.8 Pairsumonious Numbers

PC/UVa IDs: 110508/10202, Popularity: B, Success rate: high Level: 4
Any set of n integers forms n(n − 1)/2 sums by adding every possible pair. Your task
is to find the n integers given the set of sums.

Input
Each line of input contains n followed by n(n − 1)/2 integer numbers separated by a
space, where 2 < n < 10.

Output
For each line of input, output one line containing n integers in non-descending order
such that the input numbers are pairwise sums of the n numbers. If there is more than
one solution, any one will do. If there is no solution, print “Impossible”.

Sample Input
3 1269 1160 1663
3 1 1 1
5 226 223 225 224 227 229 228 226 225 227
5 216 210 204 212 220 214 222 208 216 210
5 -1 0 -1 -2 1 0 -1 1 0 -1
5 79950 79936 79942 79962 79954 79972 79960 79968 79924 79932

Sample Output
383 777 886
Impossible
111 112 113 114 115
101 103 107 109 113
-1 -1 0 0 1
39953 39971 39979 39983 39989

5.10 Hints
5.1 Do we need to implement complete high-precision addition for this problem, or
can we extract the number of carry operations using a simpler method?
5.3 Do we need to implement complete high-precision multiplication for this problem,
or does the fact that we are looking for a power of 2 simplify matters?
5.4 Do we actually have to compute the number in order to ﬁnd the number of digits
it contains?
5.5 Might it be easier to solve a more general problem – who wins if they start with
number x and end on number n?
5.6 Do we need to compute the resulting polynomial, or is there an easier way to
calculate the resulting coeﬃcient? Does the binomial theorem help?
5.8 Is an exhaustive search of the possibilities necessary? If so, look ahead to
backtracking in Chapter 8.

5.11 Notes
5.2 A three-year computer search for an addition palindrome from 196 went
up to 2 million digits without ever ﬁnding such a palindrome. It becomes
progressively less likely that a palindrome exists the longer we search. See
http://www.fourmilab.ch/documents/threeyears/threeyears.html for details.

6
Combinatorics

Combinatorics is the mathematics of counting. There are several basic counting
problems that occur repeatedly throughout computer science and programming.
Combinatorics problems are notorious for their reliance on cleverness and insight.
Once you look at the problem in the right way, the answer suddenly becomes obvious.
This aha! phenomenon makes them ideal for programming contests, because the right
observation can replace the need to write a complicated program that generates and
counts all solutions with one call to a simple formula. It sometimes leads to “oﬀ-line”
contest solutions. If the resulting computations are tractable only on small integers or
are in fact the same for all input, one might be able to compute all possible solutions
using (say) a pocket calculator and then write a program to print out the answers on
demand. Remember, the judge can’t look into your heart or your program to see your
intentions – it only checks the results.

6.1 Basic Counting Techniques
Here we review certain basic counting rules and formulas you may have seen but possibly
forgotten. In particular, there are three basic counting rules from which many counting
formulae are generated. It is important to see which rule applies for your particular
problem:
• Product Rule — The product rule states that if there are |A| possibilities from
set A and |B| possibilities from set B, then there are |A| × |B| ways to combine
one from A and one from B. For example, suppose you own 5 shirts and 4 pants.
Then there are 5 × 4 = 20 diﬀerent ways you can get dressed tomorrow.

• Sum Rule — The sum rule states that if there are |A| possibilities from set A
and |B| possibilities from set B, then there are |A| + |B| ways for either A or B
to occur – assuming the elements of A and B are distinct. For example, given
that you own 5 shirts and 4 pants and the laundry ruined one of them, there are
9 possible ruined items.1
• Inclusion-Exclusion Formula — The sum rule is a special case of a more general
formula when the two sets can overlap, namely,
|A ∪ B| = |A| + |B| − |A ∩ B|
For example, let A represent the set of colors of my shirts and B the colors of my
pants. Via inclusion-exclusion, I can calculate the total number of colors given the
number of color-matched garments or vice versa. The reason this works is that
summing the sets double counts certain possibilities, namely, those occurring in
both sets. The inclusion-exclusion formula generalizes to three sets and beyond
in a natural way:
|A ∪ B ∪ C| = |A| + |B| + |C| − |A ∩ B| − |A ∩ C| − |B ∩ C| + |A ∩ B ∩ C|
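To see the three-set formula at work, consider a small example of our own that is not from the text: counting the integers from 1 to n divisible by at least one of 2, 3, or 5. Let A, B, and C be the multiples of 2, 3, and 5 respectively; their intersections are the multiples of 6, 10, 15, and 30. The function name count_divisible is a hypothetical choice.

```c
#include <assert.h>

/* Number of integers in 1..n divisible by at least one of 2, 3, or 5,
   by inclusion-exclusion on A (multiples of 2), B (multiples of 3),
   and C (multiples of 5). */
long count_divisible(long n)
{
    assert(n >= 0);
    return n / 2 + n / 3 + n / 5        /* |A| + |B| + |C| */
         - n / 6 - n / 10 - n / 15      /* subtract the pairwise intersections */
         + n / 30;                      /* add back the triple intersection */
}
```

For n = 30 this gives 15 + 10 + 6 − 5 − 3 − 2 + 1 = 22, which matches the fact that exactly 8 of the integers from 1 to 30 have no factor of 2, 3, or 5.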
Double counting is a slippery aspect of combinatorics, which can make it diﬃcult
to solve problems via inclusion-exclusion. Another powerful technique is establishing a
bijection. A bijection is a one-to-one mapping between the elements of one set and the
elements of another. Whenever you have such a mapping, counting the size of one of
the sets automatically gives you the size of the other set.
For example, if we count the number of pants currently being worn in a given class,
and can assume that every student wears pants, then this tells us the number of people
in the class. It works because there is a one-to-one mapping between pants and people,
and would break if we exchanged pants for socks or removed the dress code and allowed
people to wear skirts instead.
Exploiting bijections requires us to have a repertoire of sets which we know how to
count, so we can map other objects to them. Basic combinatorial objects you should be
familiar with include the following. It is useful to have a feeling for how fast the number
of objects grows, to know when exhaustive search breaks down as a possible technique:
• Permutations — A permutation is an arrangement of n items, where every item
appears exactly once. There are $n! = \prod_{i=1}^{n} i$ different permutations. The 3! = 6
permutations of three items are 123, 132, 213, 231, 312, and 321. For n = 10,
n! = 3,628,800, so we start to approach the limits of exhaustive search.
• Subsets — A subset is a selection of elements from n possible items. There are $2^n$
distinct subsets of n things. Thus there are $2^3 = 8$ subsets of three items, namely,
1, 2, 3, 12, 13, 23, 123, and the empty set: never forget the empty set. For n = 20,
$2^n = 1,048,576$, so we start to approach the limits of exhaustive search.
• Strings — A string is a sequence of items which are drawn with repetition. There
are $m^n$ distinct sequences of n items drawn from m items. The 27 length-3 strings
on 123 are 111, 112, 113, 121, 122, 123, 131, 132, 133, 211, 212, 213, 221, 222,
223, 231, 232, 233, 311, 312, 313, 321, 322, 323, 331, 332, and 333. The number of
binary strings of length n is identical to the number of subsets of n items (why?),
and the number of possibilities increases even more rapidly with larger m.

1 This is not true in practice, because the ruined item is certain to be your favorite of the bunch.
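Exhaustive search over subsets is often coded by letting an integer counter run through all bit patterns of length n, treating bit i as the statement that item i + 1 is in the subset. The sketch below is a hypothetical illustration: the function name, the subset-sum task, and the limit of 20 items are our own choices. It counts the subsets of {1, ..., n} whose elements sum to a target.

```c
#include <assert.h>

/* Count the subsets of {1,...,n} whose elements sum to target by
   running through all 2^n bit patterns; bit i set means that item
   i+1 belongs to the subset. */
long count_subsets_with_sum(int n, int target)
{
    long mask, count = 0;
    int i, sum;

    assert(n >= 0 && n <= 20);          /* 2^20 patterns is still quick */
    for (mask = 0; mask < (1L << n); mask++) {
        sum = 0;
        for (i = 0; i < n; i++)
            if (mask & (1L << i))
                sum += i + 1;
        if (sum == target)
            count++;
    }
    return count;
}
```

Note that mask = 0 is the empty set, which is why a target of 0 is always matched exactly once.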

6.2 Recurrence Relations
Recurrence relations make it easy to count a variety of recursively deﬁned structures.
Recursively defined structures include trees, lists, well-formed formulae, and divide-and-conquer algorithms – so they lurk wherever computer scientists do.
What is a recurrence relation? It is an equation which is defined in terms of itself.
Why are they good things? Because many natural functions are easily expressed as
recurrences! Any polynomial can be represented by a recurrence, including the linear
function:
$a_n = a_{n-1} + 1, \; a_1 = 1 \longrightarrow a_n = n$
Any exponential can be represented by a recurrence:
$a_n = 2a_{n-1}, \; a_1 = 2 \longrightarrow a_n = 2^n$
Finally, certain weird but interesting functions which are not easily represented using
conventional notation can be described by recurrences:
$a_n = n a_{n-1}, \; a_1 = 1 \longrightarrow a_n = n!$
Thus recurrence relations are a very versatile way to represent functions. It is often
easy to ﬁnd a recurrence as the answer to a counting problem. Solving the recurrence
to get a nice closed form can be somewhat of an art, but as we shall see, computer
programs can easily evaluate the value of a given recurrence even without the existence
of a nice closed form.
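As a minimal sketch of evaluating a recurrence directly, the function below fills a table from the factorial recurrence given earlier, one entry per step. The names factorial_table and MAXFACT are assumptions of ours, and the bound reflects that 20! is the largest factorial that fits in a signed 64-bit integer.

```c
#include <assert.h>

#define MAXFACT 20      /* 20! still fits in a 64-bit long long; 21! does not */

/* Fill a[1..n] from the recurrence a_n = n * a_{n-1}, with a_1 = 1.
   The array a must have room for at least n + 1 entries. */
void factorial_table(long long a[], int n)
{
    int i;

    assert(n >= 1 && n <= MAXFACT);
    a[1] = 1;                           /* basis case */
    for (i = 2; i <= n; i++)
        a[i] = i * a[i - 1];            /* general case */
}
```

The same loop shape works for any recurrence whose terms depend only on earlier terms: set the basis entries, then fill the table in increasing order of n.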

6.3 Binomial Coeﬃcients

The most important class of counting numbers are the binomial coefficients, where $\binom{n}{k}$
counts the number of ways to choose k things out of n possibilities. What do they
count?
• Committees — How many ways are there to form a k-member committee from
n people? By definition, $\binom{n}{k}$ is the answer.
• Paths Across a Grid — How many ways are there to travel from the upper-left
corner of an n × m grid to the lower-right corner by walking only down and to
the right? Every path must consist of n + m steps, n downward and m to the
right. Every path with a different set of downward moves is distinct, so there are
$\binom{n+m}{n}$ such sets/paths.
• Coefficients of $(a + b)^n$ — Observe that
$(a + b)^3 = 1a^3 + 3a^2 b + 3ab^2 + 1b^3$
What is the coefficient of the term $a^k b^{n-k}$? Clearly $\binom{n}{k}$, because it counts the
number of ways we can choose the k a-terms out of n possibilities.
• Pascal’s Triangle — No doubt you played with this arrangement of numbers in
high school. Each number is the sum of the two numbers directly above it:
                1
              1   1
            1   2   1
          1   3   3   1
        1   4   6   4   1
      1   5  10  10   5   1
Why did you or Pascal care? Because this table constructs the binomial coefficients!
The (n + 1)st row of the table gives the values $\binom{n}{i}$ for 0 ≤ i ≤ n. The neat
thing about the triangle is how it reveals certain interesting identities, such as that
the sum of the entries on the (n + 1)st row equals $2^n$.

How do you compute the binomial coefficients? First, $\binom{n}{k} = n!/((n-k)!\,k!)$, so in
principle you can compute them straight from factorials. However, this method has a
serious drawback. Intermediate calculations can easily cause arithmetic overflow even
when the final coefficient fits comfortably within an integer.
A more stable way to compute binomial coefficients is using the recurrence relation
implicit in the construction of Pascal’s triangle, namely, that
$\binom{n}{k} = \binom{n-1}{k-1} + \binom{n-1}{k}$
Why does this work? Consider whether the nth element appears in one of the $\binom{n}{k}$
subsets of k elements. If so, we can complete the subset by picking k − 1 other items
from the other n − 1. If not, we must pick all k items from the remaining n − 1. There
is no overlap between these cases, and all possibilities are included, so the sum counts
all k-subsets.
No recurrence is complete without basis cases. What binomial coefficient values do
we know without computing them? The left term of the sum eventually drives us down
to $\binom{n-k}{0}$. How many ways are there to choose 0 things from a set? Exactly one, the
empty set. If this is not convincing, then it is equally good to accept that $\binom{m}{1} = m$. The
right term of the sum drives us up to $\binom{k}{k}$. How many ways are there to choose k things
from a k-element set? Exactly one, the complete set. Together with the recurrence these
basis cases define the binomial coefficients on all interesting values.

6.4. Other Counting Sequences


The best way to evaluate such a recurrence is to build a table of all possible values,
at least up to the size that you are interested in. Study the function below to see how
we did it.
#define MAXN    100                         /* largest n or m */

long binomial_coefficient(int n, int m)     /* compute n choose m */
{
        int i,j;                            /* counters */
        long bc[MAXN+1][MAXN+1];            /* table of binomial coefficients */

        for (i=0; i<=n; i++) bc[i][0] = 1;
        for (j=0; j<=n; j++) bc[j][j] = 1;
        for (i=1; i<=n; i++)
                for (j=1; j<i; j++)         /* Pascal's triangle recurrence */
                        bc[i][j] = bc[i-1][j-1] + bc[i-1][j];

        return( bc[n][m] );
}
… n.
The interested student should read [GKP89] for more on these and other interesting
counting sequences. It is also worth visiting Sloane’s Handbook of Integer Sequences
on the web at http://www.research.att.com/~njas/sequences/ to help identify virtually
any interesting sequence of integers.

6.5 Recursion and Induction
Mathematical induction provides a useful tool to solve recurrences. When we ﬁrst
learned about mathematical induction in high school it seemed like complete magic.
You proved the formula for some basis case like 1 or 2, then assumed it was true all the
way to n − 1 before proving it was true for general n using the assumption. That was
a proof? Ridiculous!
When we ﬁrst learned the programming technique of recursion in college it also
seemed like complete magic. Your program tested whether the input argument was
some basis case like 1 or 2. If not, you solved the bigger case by breaking it up into
pieces and calling the subprogram itself to solve these pieces. That was a program?
Ridiculous!
The reason both seemed like magic is that recursion is mathematical induction! In
both, we have general and boundary conditions, with the general condition breaking the
problem into smaller and smaller pieces. The initial or boundary condition terminates
the recursion. Once you understand either recursion or induction, you should be able
to turn it around to see why the other one also works.
A powerful way to solve recurrence relations is to guess a solution and then prove it
by induction. When trying to guess a solution, it pays to tabulate small values of the
function and stare at them until you see a pattern.

For example, consider the following recurrence relation:
$T_n = 2T_{n-1} + 1, \; T_0 = 0$
Building a table of values yields the following:
n      0    1    2    3    4    5    6    7
T_n    0    1    3    7   15   31   63  127

Can you guess what the solution is? You should notice that things look like they
are doubling each time, no surprise considering the formula. But it is not quite $2^n$. By
playing around with variations of this function, you should be able to stumble on the
conjecture that $T_n = 2^n - 1$. To finish the job, we must prove this conjecture, using the
three steps of induction:
1. Show that the basis is true: $T_0 = 2^0 - 1 = 0$.
2. Now assume it is true for $T_{n-1}$.
3. Use this assumption to complete the argument:
$T_n = 2T_{n-1} + 1 = 2(2^{n-1} - 1) + 1 = 2^n - 1$
Guessing the solution is usually the hard part of the job, and where the art and
experience comes in. The key is playing around with small values for insight, and having
some feel for what kind of closed form the answer will be.
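Playing around with small values is easy to mechanize. The sketch below, whose function name and upper limit are our own assumptions, tabulates the recurrence from the example and compares each value against the conjectured closed form. Such a check can refute a bad guess quickly, but it is no substitute for the induction proof.

```c
#include <assert.h>

/* Returns 1 if T_n = 2^n - 1 for every 0 <= n <= limit, where
   T_n = 2 T_{n-1} + 1 and T_0 = 0; returns 0 at the first mismatch. */
int conjecture_holds(int limit)
{
    long long t = 0;            /* T_0 */
    long long power = 1;        /* 2^0 */
    int n;

    assert(limit >= 0 && limit <= 62);  /* 2^62 still fits in a long long */
    for (n = 0; n <= limit; n++) {
        if (t != power - 1)
            return 0;
        if (n < limit) {        /* advance both sides to n + 1 */
            t = 2 * t + 1;
            power = 2 * power;
        }
    }
    return 1;
}
```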

6.6 Problems
6.6.1 How Many Fibs?

PC/UVa IDs: 110601/10183, Popularity: B, Success rate: average Level: 1
Recall the definition of the Fibonacci numbers:
$f_1 := 1$
$f_2 := 2$
$f_n := f_{n-1} + f_{n-2}$ (n ≥ 3)

Given two numbers a and b, calculate how many Fibonacci numbers are in the range
[a, b].

Input
The input contains several test cases. Each test case consists of two non-negative integer
numbers a and b. Input is terminated by a = b = 0. Otherwise, $a \le b \le 10^{100}$. The
numbers a and b are given with no superfluous leading zeros.

Output
For each test case output on a single line the number of Fibonacci numbers fi with
a ≤ fi ≤ b.

Sample Input
10 100
1234567890 9876543210
0 0

Sample Output
5
4

6.6.2 How Many Pieces of Land?

PC/UVa IDs: 110602/10213, Popularity: B, Success rate: average Level: 2
You are given an elliptical piece of land and you are asked to choose n arbitrary points
on its boundary. Then you connect each point with every other point using straight lines,
forming n(n − 1)/2 connections. What is the maximum number of pieces of land you
will get by choosing the points on the boundary carefully?

Dividing the land when n = 6.

Input
The first line of the input file contains one integer s (0 < s < 3,500), which indicates
how many input instances there are. The next s lines describe s input instances, each
consisting of exactly one integer n ($0 \le n < 2^{31}$).

Output
For each input instance output the maximum possible number of pieces of land defined
by n points, each printed on its own line.

Sample Input
4
1
2
3
4

Sample Output
1
2
4
8

6.6.3 Counting

PC/UVa IDs: 110603/10198, Popularity: B, Success rate: high Level: 2
Gustavo knows how to count, but he is just now learning how to write numbers.
He has already learned the digits 1, 2, 3, and 4. But he does not yet realize that 4 is
different from 1, so he thinks that 4 is just another way to write 1.
He is having fun with a little game he created: he makes numbers with the four digits
that he knows and sums their values. For example:
132 = 1 + 3 + 2 = 6
112314 = 1 + 1 + 2 + 3 + 1 + 1 = 9 (remember that Gustavo thinks that 4 = 1)
Gustavo now wants to know how many such numbers he can create whose sum is a
number n. For n = 2, he can make 5 numbers: 11, 14, 41, 44, and 2. (He knows how to
count up beyond ﬁve, just not how to write it.) However, he can’t ﬁgure out this sum
for n greater than 2, and asks for your help.

Input
Input will consist of an arbitrary number of integers n such that 1 ≤ n ≤ 1,000. You
must read until you reach the end of file.

Output
For each integer read, output a single integer on a line stating how many numbers
Gustavo can make such that the sum of their digits is equal to n.

Sample Input
2
3

Sample Output
5
13

6.6.4 Expressions

PC/UVa IDs: 110604/10157, Popularity: C, Success rate: average Level: 2
Let X be the set of correctly built parenthesis expressions. The elements of X are
strings consisting only of the characters “(” and “)”, deﬁned as follows:
• The empty string belongs to X.
• If A belongs to X, then (A) belongs to X.
• If both A and B belong to X, then the concatenation AB belongs to X.
For example, the strings ()(())() and (()(())) are correctly built parenthesis expressions, and therefore belong to the set X. The expressions (()))(() and ())(() are
not correctly built parenthesis expressions and are thus not in X.
The length of a correctly built parenthesis expression E is the number of single
parentheses (characters) in E. The depth D(E) of E is defined as follows:

\[
D(E) =
\begin{cases}
0 & \text{if } E \text{ is empty} \\
D(A) + 1 & \text{if } E = (A), \text{ and } A \text{ is in } X \\
\max(D(A), D(B)) & \text{if } E = AB, \text{ and } A, B \text{ are in } X
\end{cases}
\]

For example, ()(())() has length 8 and depth 2. Write a program which reads in n
and d and computes the number of correctly built parenthesis expressions of length n
and depth d.

Input
The input consists of pairs of integers n and d, with at most one pair per line and
2 ≤ n ≤ 300, 1 ≤ d ≤ 150. The input may contain empty lines, which you don’t need
to consider.

Output
For every pair of integers in the input, output a single integer on one line – the number
of correctly built parenthesis expressions of length n and depth d.

Sample Input
6 2
300 150

Sample Output
3
1

Note: The three correctly built parenthesis expressions of length 6 and depth 2 are
(())(), ()(()), and (()()).

6.6.5 Complete Tree Labeling

PC/UVa IDs: 110605/10247, Popularity: C, Success rate: average Level: 2
A complete k-ary tree is a k-ary tree in which all leaves have the same depth and all
internal nodes have degree or (equivalently) branching factor k. It is easy to determine
the number of nodes of such a tree.
Given the depth and branching factor of such a tree, you must determine in how
many different ways you can number the nodes of the tree so that the label of each
node is less than that of its descendants. This is the property which defines the binary
heap priority queue data structure for k = 2. In numbering a tree with N nodes, assume
you have the labels (1, 2, 3, . . . , N − 1, N ) available.

Input
The input ﬁle will contain several lines of input. Each line will contain two integers
k and d. Here k > 0 is the branching factor of the complete k-ary tree and d > 0 is
the depth of the complete k-ary tree. Your program must work for all pairs such that
k × d ≤ 21.

Output
For each line of input, produce one line of output containing an integer counting the
number of ways the k-ary tree can be labeled, maintaining the constraints described
above.

Sample Input
2 2
10 1

Sample Output
80
3628800

6.6.6 The Priest Mathematician

PC/UVa IDs: 110606/10254, Popularity: C, Success rate: high Level: 2
The ancient folklore behind the “Towers of Hanoi” puzzle is quite well known. A
more recent legend tells us that once the Brahmin monks discovered how long it would
take to ﬁnish transferring the 64 discs from the needle which they were on to one of the
other needles, they decided to ﬁnd a faster strategy and be done with it.

The Four Needle (Peg) Tower of Hanoi
One of the priests at the temple informed his colleagues that they could achieve the
transfer in a single afternoon at a one disc-per-second rhythm by using an additional
needle. He proposed the following strategy:
• First move the topmost discs (say the top k discs) to one of the spare needles.
• Then use the standard three needles strategy to move the remaining n − k discs
(for a general case with n discs) to their destination.
• Finally, move the top k discs into their ﬁnal destination using the four needles.
He calculated the value of k which minimized the number of movements and found
that 18,433 transfers would suffice. Thus they could spend just 5 hours, 7 minutes, and
13 seconds with this scheme versus over 500,000 million years without the additional
needle!
Try to follow the clever priest’s strategy and calculate the number of transfers using
four needles, where the priest can move only one disc at a time and must place each disc
on a needle such that there is no smaller disc below it. Calculate the k that minimizes
the number of transfers under this strategy.

Input
The input file contains several lines of input. Each line contains a single integer 0 ≤
N ≤ 10,000 giving the number of disks to be transferred. Input is terminated by end
of file.

Output
For each line of input produce one line of output which indicates the number of
movements required to transfer the N disks to the ﬁnal needle.

Sample Input
1
2
28
64

Sample Output
1
3
769
18433

6.6.7 Self-describing Sequence

PC/UVa IDs: 110607/10049, Popularity: C, Success rate: high Level: 2
Solomon Golomb’s self-describing sequence f (1), f (2), f (3), . . . is the only nondecreasing sequence of positive integers with the property that it contains exactly f (k)
occurrences of k for each k. A few moments’ thought reveals that the sequence must
begin as follows:

n      1  2  3  4  5  6  7  8  9  10  11  12
f(n)   1  2  2  3  3  4  4  4  5  5   5   6

In this problem you are expected to write a program that calculates the value of f (n)
given the value of n.

Input
The input may contain multiple test cases. Each test case occupies a separate line and
contains an integer n (1 ≤ n ≤ 2,000,000,000). The input terminates with a test case
containing a value 0 for n and this case must not be processed.

Output
For each test case in the input, output the value of f (n) on a separate line.

Sample Input
100
9999
123456
1000000000
0

Sample Output
21
356
1684
438744

6.6.8 Steps

PC/UVa IDs: 110608/846, Popularity: A, Success rate: high Level: 2
Consider the process of stepping from integer x to integer y along integer points of
the straight line. The length of each step must be non-negative and can be one bigger
than, equal to, or one smaller than the length of the previous step.
What is the minimum number of steps in order to get from x to y? The length of
both the ﬁrst and the last step must be 1.

Input
The input begins with a line containing n, the number of test cases. Each test case that
follows consists of a line with two integers: 0 ≤ x ≤ y < 2^31.

Output
For each test case, print a line giving the minimum number of steps to get from x to y.

Sample Input
3
45 48
45 49
45 50

Sample Output
3
3
4

6.7 Hints
6.1 Can the closed form for F_n be used to minimize the need for arbitrary-precision
arithmetic?
6.2 Can you ﬁnd a recurrence for the desired quantity?
6.3 Can you ﬁnd a recurrence for the desired sum?
6.4 Can you formulate a recurrence, maybe a two-parameter version of the Catalan
numbers?
6.5 Can you ﬁnd a recurrence for the desired quantity?
6.7 Can you explicitly build the sequence, or must you do something more clever
because of limited memory?
6.8 What kind of step sequences are deﬁned by optimal solutions?

6.8 Notes
6.6 Although the problem asks for a fast way to solve the four-peg Tower of Hanoi
problem under this strategy, it is not known that this strategy is in fact optimal!
See [GKP89] for more discussion.

7
Number Theory

Number theory is perhaps the most interesting and beautiful area of mathematics.
Euclid’s proof that there are an inﬁnite number of primes remains just as clear and
elegant today as it was more than two thousand years ago. Innocent-looking questions
like whether the equation a^n + b^n = c^n has solutions for integer values of a, b, c, and
n > 2 often turn out not to be so innocent. Indeed, this is the statement of Fermat’s
last theorem!
Number theory is great training in formal, rigorous reasoning, because number-theoretic proofs are clear and decisive. Studying the integers is interesting because they
are such concrete and important objects. Discovering new properties of the integers is
discovering something exciting about the natural world.
Computers have long been used in number theoretic research. Performing interesting
number-theoretic computations on large integers requires great eﬃciency. Fortunately
there are many clever algorithms to help us out.

7.1 Prime Numbers
A prime number is an integer p > 1 which is only divisible by 1 and itself. Said another
way, if p is a prime number, then p = a · b for integers a ≤ b implies that a = 1 and
b = p. The first ten prime numbers are 2, 3, 5, 7, 11, 13, 17, 19, 23, and 29.
Prime numbers are important because of the fundamental theorem of arithmetic.
Despite the impressive title, all it states is that every integer greater than 1 can be expressed in only
one way as the product of primes. For example, 105 is uniquely expressed as 3 × 5 × 7,
while 32 is uniquely expressed as 2×2×2×2×2. This unique set of numbers multiplying
to n is called the prime factorization of n. Order doesn’t matter in a prime factorization,
so we can canonically list the numbers in sorted order. But multiplicity does; it is what
distinguishes the prime factorization of 4 from 8.
We say a prime number p is a factor of x if it appears in its prime factorization. Any
number which is not prime is said to be composite.

7.1.1 Finding Primes

The easiest way to test if a given number x is prime uses repeated division. Start from
the smallest candidate divisor, and then try all possible divisors up from there. Since
2 is the only even prime, once we verify that x isn’t even we only need try the odd
numbers as candidate factors. Further, we can bless x as prime the instant we have
shown that it has no non-trivial prime factors up to √x. Why? Suppose not – i.e., x
is composite but its smallest non-trivial prime factor p is greater than √x.
Then x/p must also divide x, and must be at least as large as p, or else we would have seen
it earlier. But the product of two numbers greater than √x must be larger than x, a
contradiction.
Computing the prime factorization involves not only ﬁnding the ﬁrst prime factor,
but stripping oﬀ all occurrences of this factor and recurring on the remaining product:
#include <math.h>
#include <stdio.h>

void prime_factorization(long x)
{
    long i;    /* counter */
    long c;    /* remaining product to factor */

    if (x < 2) return;    /* nothing to factor; also keeps x = 0 from looping forever */

    c = x;
    while ((c % 2) == 0) {
        printf("%ld\n",2L);
        c = c / 2;
    }

    i = 3;
    while (i <= (sqrt(c)+1)) {
        if ((c % i) == 0) {
            printf("%ld\n",i);
            c = c / i;
        }
        else
            i = i + 2;
    }

    if (c > 1) printf("%ld\n",c);
}

Testing the terminating condition i > √c is somewhat problematic, because sqrt()
is a numerical function with imperfect precision. Just to be safe, we let i run an extra
iteration. Another approach would be to avoid floating point computation altogether
and terminate when i*i > c. However, the multiplication might cause overflow when
working on very large integers. Multiplication can be avoided by observing that (i+1)^2 =
i^2 + 2i + 1, so adding i + i + 1 to i^2 yields (i + 1)^2.
For higher performance, we could move the sqrt(c) computation outside the main
loop and only update it when c changes value. However, this program responds instantly
on my computer for the prime 2,147,483,647. There exist fascinating randomized algorithms which are more eﬃcient for testing primality on very large integers, but these are
not something for us to worry about at this scale – except as the source of interesting
contest problems themselves.
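
The multiplication-free termination test described above can be sketched as follows. This is only an illustration under stated assumptions, not code from the text: the function name is_prime_int and the unsigned accumulator sq are hypothetical choices. Because the loop steps through odd candidates only, the identity is applied twice per step, giving (i+2)^2 = i^2 + 4i + 4.

```c
/* Hypothetical helper (not from the text): primality test that uses
   neither sqrt() nor a per-iteration multiplication i*i.
   Returns 1 if x is prime and 0 otherwise. */
int is_prime_int(long x)
{
    long i;              /* current odd candidate divisor */
    unsigned long sq;    /* invariant: sq == i*i, maintained by addition */

    if (x < 2) return 0;
    if ((x % 2) == 0) return (x == 2);

    i = 3;
    sq = 9;
    while (sq <= (unsigned long) x) {
        if ((x % i) == 0) return 0;
        /* (i+2)^2 = i^2 + 4i + 4, so no product i*i is ever formed */
        sq = sq + 4 * (unsigned long) i + 4;
        i = i + 2;
    }
    return 1;
}
```

Keeping sq in an unsigned long is a design choice made here so that the one update which pushes sq past x cannot overflow the signed range.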

7.1.2 Counting Primes

How many primes are there? It makes sense that primes become rarer and rarer as we
consider larger and larger numbers, but do they ever vanish? The answer is no, as shown
by Euclid’s proof that there are an inﬁnite number of primes. It uses an elegant proof by
contradiction. Knowing this proof is not strictly necessary to compete in programming
contests, but it is one sign of being an educated person. Thus there is no shame in
reviewing it here.
Let us assume the contrary, that there are only a finite number of primes,
p_1, p_2, ..., p_n. Let m = 1 + ∏_{i=1}^{n} p_i, i.e., the product of all of these primes, plus one.
Since this is bigger than any of the primes on our list, m must be composite. Therefore
some prime must divide it.
But which prime? We know that m is not divisible by p_1, because it leaves a remainder
of 1. Further, m is not divisible by p_2, because it also leaves a remainder of 1. In fact, m
leaves a remainder of 1 when divided by any prime p_i, for 1 ≤ i ≤ n. Thus p_1, p_2, ..., p_n
cannot be the complete list of primes, because if so m must also be prime.
Since this contradicts the assumption it means there cannot exist such a complete
list of primes; therefore the number of primes must be infinite! QED! (See footnote 1.)
Not only are there an inﬁnite number of primes, but in fact the primes are relatively
common. There are roughly x/ ln x primes less than or equal to x, or put another way,
roughly one out of every ln x numbers is prime.
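
To get a feel for this density, one can count primes directly and compare against x/ ln x. The sketch below uses a sieve of Eratosthenes, which is not described in the text above; the function name count_primes_upto is a hypothetical choice. For x = 1,000,000 the exact count is 78,498, while x/ ln x is about 72,382, so the estimate is the right order of magnitude but somewhat low.

```c
#include <stdlib.h>

/* Hypothetical helper (not from the text): count the primes <= x using a
   sieve of Eratosthenes. Returns -1 if memory cannot be allocated. */
long count_primes_upto(long x)
{
    char *composite;     /* composite[j] != 0 once j is known to be composite */
    long i, j, count;

    if (x < 2) return 0;
    composite = calloc((size_t) x + 1, sizeof(char));
    if (composite == NULL) return -1;

    count = 0;
    for (i = 2; i <= x; i++) {
        if (composite[i]) continue;
        count++;                       /* i survived, so it is prime */
        if (i > x / i) continue;       /* i*i > x: nothing left to cross off */
        for (j = i * i; j <= x; j += i)
            composite[j] = 1;
    }

    free(composite);
    return count;
}
```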

7.2 Divisibility
Number theory is the study of integer divisibility. We say b divides a (denoted b|a) if
a = bk for some integer k. Equivalently, we say that b is a divisor of a or a is a multiple
of b if b|a.
Footnote 1: Now a little puzzle to test your understanding of the proof. Suppose we take the first n primes,
multiply them together, and add one. Does this number have to be prime? Give a proof or a
counterexample.

As a consequence of this definition, the smallest natural divisor of every non-zero
integer is 1. Why? It should be clear that there is in general no integer k such that
a = 0 · k.
How do we find all divisors of a given integer? From the fundamental theorem of arithmetic, we
know that x is uniquely represented by the product of its prime factors. Every divisor
is the product of some subset of these prime factors. Such subsets can be constructed
using backtracking techniques as discussed in Chapter 8, but we must be careful about
duplicate prime factors. For example, the prime factorization of 12 has three terms (2,
2, and 3) but 12 has only 6 divisors (1, 2, 3, 4, 6, 12).
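
The effect of duplicate factors is easiest to see by counting. A prime appearing e times in the factorization can contribute anywhere from 0 to e copies to a divisor, so the number of divisors is the product of (e + 1) over the distinct primes; for 12 = 2^2 × 3 this gives 3 × 2 = 6. The sketch below counts divisors this way rather than enumerating them by backtracking; the name count_divisors is a hypothetical choice and x ≥ 1 is assumed.

```c
/* Hypothetical helper (not from the text): number of positive divisors
   of x, assuming x >= 1, from the multiplicities of its prime factors. */
long count_divisors(long x)
{
    long count = 1;    /* running product of (multiplicity + 1) */
    long p;            /* candidate prime factor */
    long e;            /* multiplicity of p in x */

    for (p = 2; p <= x / p; p++) {
        e = 0;
        while ((x % p) == 0) {
            x = x / p;
            e++;
        }
        count = count * (e + 1);
    }
    if (x > 1) count = count * 2;   /* one prime left over, multiplicity 1 */

    return count;
}
```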

7.2.1 Greatest Common Divisor

Since 1 divides every integer, the least common divisor of every pair of integers a, b is
1. Far more interesting is the greatest common divisor, or gcd, the largest divisor shared
by a given pair of integers. Consider a fraction x/y, say, 24/36. The reduced form of
this fraction comes after we divide both the numerator and denominator by gcd(x, y),
in this case 12. We say two integers are relatively prime if their greatest common divisor
is 1.
Euclid’s algorithm for ﬁnding the greatest common divisor of two integers has been
called history’s ﬁrst interesting algorithm. The naive way to compute gcd would be to
test all divisors of the ﬁrst integer explicitly on the second, or perhaps to ﬁnd the prime
factorization of both integers and take the product of all factors in common. But both
approaches involve computationally intensive operations.
Euclid’s algorithm rests on two observations. First,
If b|a, then gcd(a, b) = b.
This should be pretty clear. If b divides a, then a = bk for some integer k, and thus
gcd(bk, b) = b. Second,
If a = bt + r for integers t and r, then gcd(a, b) = gcd(b, r).
Why? By deﬁnition, gcd(a, b) = gcd(bt + r, b). Any common divisor of a and b must rest
totally with r, because bt clearly must be divisible by any divisor of b.
Euclid’s algorithm is recursive, repeatedly replacing the bigger integer by its remainder
mod the smaller integer. This typically cuts one of the arguments down by about half,
and so after a logarithmic number of iterations gets down to the base case. Consider
the following example. Let a = 34398 and b = 2132.
gcd(34398, 2132) = gcd(34398 mod 2132, 2132) = gcd(2132, 286)
gcd(2132, 286) = gcd(2132 mod 286, 286) = gcd(286, 130)
gcd(286, 130) = gcd(286 mod 130, 130) = gcd(130, 26)
gcd(130, 26) = gcd(130 mod 26, 26) = gcd(26, 0)

Therefore, gcd(34398, 2132) = 26.
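
The chain of reductions above can be sketched as a short loop. This is a minimal iterative version for non-negative arguments; the name gcd_euclid is a hypothetical choice, picked so it does not clash with any other gcd routine.

```c
/* Hypothetical helper (not from the text): gcd of two non-negative
   integers by repeatedly replacing (a, b) with (b, a mod b). */
long gcd_euclid(long a, long b)
{
    long r;    /* remainder of the current division */

    while (b != 0) {
        r = a % b;
        a = b;
        b = r;
    }
    return a;  /* gcd(a, 0) = a is the base case */
}
```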

However, Euclid’s algorithm can give us more than just the gcd(a, b). It can also ﬁnd
integers x and y such that
a · x + b · y = gcd(a, b)
which will prove quite useful in solving linear congruences. We know that gcd(a, b) =
gcd(b, a′), where a′ = a − b⌊a/b⌋. Further, assume we know integers x′ and y′ such that
b · x′ + a′ · y′ = gcd(a, b)
by recursion. Substituting our formula for a′ into the above expression gives us
b · x′ + (a − b⌊a/b⌋) · y′ = gcd(a, b)
and rearranging the terms gives a · y′ + b · (x′ − ⌊a/b⌋ · y′) = gcd(a, b), so our desired
coefficients are x = y′ and y = x′ − ⌊a/b⌋ · y′. We need a basis case to
complete our algorithm, but that is easy since a · 1 + 0 · 0 = gcd(a, 0).
For the previous example, we get that 34398 × 15 + 2132 × −242 = 26. An
implementation of this algorithm follows below:
/* Find the gcd(p,q) and x,y such that p*x + q*y = gcd(p,q) */

long gcd(long p, long q, long *x, long *y)
{
    long x1,y1;    /* previous coefficients */
    long g;        /* value of gcd(p,q) */

    if (q > p) return(gcd(q,p,y,x));

    if (q == 0) {
        *x = 1;
        *y = 0;
        return(p);
    }

    g = gcd(q, p%q, &x1, &y1);

    *x = y1;
    *y = (x1 - (p/q)*y1);    /* integer p/q is floor(p/q) for p,q > 0 */

    return(g);
}

7.2.2 Least Common Multiple

Another useful function on two integers is the least common multiple (lcm), the smallest
positive integer which is divisible by both of a given pair of integers. For example, the least
common multiple of 24 and 36 is 72.
Least common multiple arises when we want to compute the simultaneous periodicity
of two distinct periodic events. When is the next year (after 2000) that the presidential
election (which happens every 4 years) will coincide with the census (which happens every
10 years)? The events coincide every twenty years, because lcm(4, 10) = 20.
It is self-evident that lcm(x, y) ≥ max(x, y). Similarly, since x · y is a multiple of both
x and y, lcm(x, y) ≤ xy. The only way that there can be a smaller common multiple is
if there is some non-trivial factor shared between x and y.
This observation, coupled with Euclid’s algorithm, gives an eﬃcient way to compute
least common multiple, namely, lcm(x, y) = xy/gcd(x, y). A slicker algorithm appears
in [Dij76], which avoids the multiplication and hence possibility of overﬂow.
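
The formula lcm(x, y) = xy/gcd(x, y) can be sketched as follows. Note that this is not the algorithm from [Dij76]; it merely divides before multiplying so that no intermediate value exceeds the final answer. The name lcm_via_gcd is a hypothetical choice and both arguments are assumed positive.

```c
/* Hypothetical helper (not from the text): lcm of two positive integers
   using lcm(x, y) = x * y / gcd(x, y). */
long lcm_via_gcd(long x, long y)
{
    long a = x, b = y, r;

    while (b != 0) {       /* Euclid's algorithm; leaves gcd(x, y) in a */
        r = a % b;
        a = b;
        b = r;
    }
    return (x / a) * y;    /* divide first: the exact quotient x/gcd is small */
}
```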

7.3 Modular Arithmetic
In Chapter 5, we reviewed the basic arithmetic algorithms for integers, such as addition
and multiplication. We are not always interested in the full answers, however. Sometimes
the remainder suﬃces for our purposes. For example, suppose your birthday this year
falls on a Wednesday. What day of the week will it fall on next year? All you
need to know is the remainder of the number of days between now and then (either 365
or 366) when dividing by the 7 days of the week. Thus it will fall on Wednesday plus
one (365 mod 7) or two (366 mod 7) days, i.e., Thursday or Friday depending upon
whether it is aﬀected by a leap year.
The key to such eﬃcient computations is modular arithmetic. Of course, we can in
principle explicitly compute the entire number and then ﬁnd the remainder. But for
large enough integers, it can be much easier to just work with remainders via modular
arithmetic.
The number we are dividing by is called the modulus, and the remainder left over
is called the residue. The key to eﬃcient modular arithmetic is understanding how the
basic operations of addition, subtraction, and multiplication work over a given modulus:
• Addition — What is (x + y) mod n? We can simplify this to
((x mod n) + (y mod n)) mod n
to avoid adding big numbers. How much small change will I have if given $123.45
by my mother and $94.67 by my father?
(12,345 mod 100) + (9,467 mod 100) = (45 + 67) mod 100 = 12 mod 100
• Subtraction — Subtraction is just addition with negative values. How much small
change will I have after spending $52.53?
(12 mod 100) − (53 mod 100) = −41 mod 100 = 59 mod 100
Notice how we can convert a negative number mod n to a positive number by
adding a multiple of n to it. Further, this answer makes sense in this change
example. It is usually best to keep the residue between 0 and n − 1 to ensure we
are working with the smallest-magnitude numbers possible.

• Multiplication — Since multiplication is just repeated addition,
xy mod n = (x mod n)(y mod n) mod n
How much change will you have if you earn $17.28 per hour for 2,143 hours?
(1,728 × 2,143) mod 100 = (28 mod 100) × (43 mod 100) = 4 mod 100
Further, since exponentiation is just repeated multiplication,
x^y mod n = (x mod n)^y mod n
Since exponentiation is the quickest way to produce really large integers, this is
where modular arithmetic really proves its worth.
• Division — Division proves considerably more complicated to deal with, and will
be discussed in Section 7.4.
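
The three identities above can be sketched as small C functions. The names mod_norm, mod_add, mod_sub, and mod_mul are hypothetical, n is assumed positive, and mod_mul assumes n is small enough that the product of two residues fits in a long.

```c
/* Hypothetical helpers (not from the text). */

/* Bring a into the range 0..n-1, even when a is negative. */
long mod_norm(long a, long n)
{
    long r = a % n;
    return (r < 0) ? r + n : r;
}

/* (x + y) mod n, reducing the operands first */
long mod_add(long x, long y, long n)
{
    return mod_norm(mod_norm(x, n) + mod_norm(y, n), n);
}

/* (x - y) mod n, with the result kept between 0 and n-1 */
long mod_sub(long x, long y, long n)
{
    return mod_norm(mod_norm(x, n) - mod_norm(y, n), n);
}

/* (x * y) mod n; assumes (n-1)*(n-1) does not overflow a long */
long mod_mul(long x, long y, long n)
{
    return mod_norm(mod_norm(x, n) * mod_norm(y, n), n);
}
```

With these, mod_add(12345, 9467, 100), mod_sub(12, 53, 100), and mod_mul(1728, 2143, 100) reproduce the three change examples from the text.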
Modular arithmetic has many interesting applications, including:
• Finding the Last Digit — What is the last digit of 2^100? Sure we can use infinite precision arithmetic and look at the last digit, but why? We can do this
computation by hand. What we really want to know is what 2^100 mod 10 is. By
doing repeated squaring, and taking the remainder mod 10 at each step we make
progress very quickly:
2^3 mod 10 = 8
2^6 mod 10 = 8 × 8 mod 10 → 4
2^12 mod 10 = 4 × 4 mod 10 → 6
2^24 mod 10 = 6 × 6 mod 10 → 6
2^48 mod 10 = 6 × 6 mod 10 → 6
2^96 mod 10 = 6 × 6 mod 10 → 6
2^100 mod 10 = 2^96 × 2^3 × 2^1 mod 10 → 6

• RSA Encryption Algorithm — A classic application of modular arithmetic on
large integers arises in public-key cryptography, namely, the RSA algorithm. Here,
our message is encrypted by coding it as an integer m, raising it to a power k,
where k is the so-called public-key or encryption key, and taking the results mod
n. Since m, n, and k are all huge integers, computing m^k mod n efficiently requires
the tools we developed above.
• Calendrical Calculations — As demonstrated with the birthday example, computing the day of the week a certain number of days from today, or the time a
certain number of seconds from now, are both applications of modular arithmetic.
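
The repeated-squaring idea used for the last-digit example, and needed for RSA, can be sketched as below. This is the standard binary variant, which scans the bits of the exponent, rather than the exact sequence of powers worked by hand above. The name mod_pow is a hypothetical choice; the sketch assumes y ≥ 0, n ≥ 1, and n small enough that the product of two residues fits in a long.

```c
/* Hypothetical helper (not from the text): x^y mod n by repeated
   squaring, reducing mod n after every multiplication. */
long mod_pow(long x, long y, long n)
{
    long result = 1 % n;    /* 1 % n handles the modulus n == 1 */

    x = x % n;
    if (x < 0) x = x + n;   /* keep the base between 0 and n-1 */

    while (y > 0) {
        if (y & 1)                     /* this bit of y is set: */
            result = (result * x) % n; /* fold the current power in */
        x = (x * x) % n;               /* square for the next bit */
        y = y >> 1;
    }
    return result;
}
```

A call such as mod_pow(2, 100, 10) performs only a handful of multiplications, since the loop runs once per bit of the exponent.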

7.4 Congruences
Congruences are an alternate notation for representing modular arithmetic. We say that
a ≡ b(mod m) if m|(a − b). By deﬁnition, if a mod m is b, then a ≡ b(mod m).
Congruences are an alternate notation for modular arithmetic, not an inherently
diﬀerent idea. Yet the notation is important. It gets us thinking about the set of integers
with a given remainder n, and gives us equations for representing them. Suppose that
x is a variable. What integers x satisfy the congruence x ≡ 3(mod 9)?
For such a simple congruence, the answer is easy. Clearly x = 3 must be a solution.
Further, adding or subtracting the modulus (9 in this instance) gives another solution. The
set of solutions is all integers of the form 9y + 3, where y is any integer.
What about complicated congruences, such as 2x ≡ 3(mod 9) and 2x ≡ 3(mod 4)?
Trial and error should convince you that exactly the integers of the form 9y + 6 satisfy
the ﬁrst example, while the second has no solutions at all.
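
The trial and error suggested here is easy to mechanize for small moduli. The sketch below tries every residue x from 0 to n − 1; the name solve_by_trial is a hypothetical choice, and the caller is assumed to supply an array with room for n entries.

```c
/* Hypothetical helper (not from the text): find every x in 0..n-1 with
   a*x congruent to b (mod n) by brute force. Solutions are stored in
   sol[] (which must hold n entries); the return value is their count. */
int solve_by_trial(long a, long b, long n, long sol[])
{
    int count = 0;
    long x;

    for (x = 0; x < n; x++)
        if (((a * x - b) % n) == 0)    /* n divides a*x - b */
            sol[count++] = x;

    return count;
}
```

For 2x ≡ 3 (mod 9) this reports the single residue 6, and for 2x ≡ 3 (mod 4) it reports no solutions, matching the claims above.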
There are two important problems on congruences, namely, performing arithmetic
operations on them, and solving them. These are discussed in the sections below.

7.4.1 Operations on Congruences

Congruences support addition, subtraction, and multiplication, as well as a limited form
of division – provided they share the same modulus:
• Addition and Subtraction — Suppose a ≡ b(mod n) and c ≡ d(mod n). Then
a + c ≡ b + d(mod n). For example, suppose I know that 4x ≡ 7(mod 9) and
3x ≡ 3(mod 9). Then
4x − 3x ≡ 7 − 3(mod 9) → x ≡ 4(mod 9)

• Multiplication — It is apparent that a ≡ b(mod n) implies that a·d ≡ b·d(mod n)
by adding the reduced congruence to itself d times. In fact, general multiplication
also holds, i.e., a ≡ b(mod n) and c ≡ d(mod n) implies ac ≡ bd(mod n).
• Division — However, we cannot cavalierly cancel common factors from congruences. Note that 6 · 2 ≡ 6 · 1 (mod 3), but clearly 2 ≢ 1 (mod 3). To see what the
problem is, note that we can redefine division as multiplication by an inverse, so
a/b is equivalent to a · b^(-1). Thus we can compute a/b (mod n) if we can find the
inverse b^(-1) such that b · b^(-1) ≡ 1 (mod n). This inverse does not always exist – try
to find a solution to 2x ≡ 1 (mod 4).
We can simplify a congruence ad ≡ bd(mod dn) to a ≡ b(mod n), so we
can divide all three terms by a mutually common factor if one exists. Thus 170 ≡
30(mod 140) implies that 17 ≡ 3(mod 14). However, the congruence a ≡ b(mod n)
must be false (i.e., has no solution) if gcd(a, n) does not divide b.

7.4.2 Solving Linear Congruences

A linear congruence is an equation of the form ax ≡ b(mod n). Solving this equation
means identifying which values of x satisfy it.
Not all such equations have solutions. We have seen integers which do not have
multiplicative inverses over a given modulus, meaning that ax ≡ 1(mod n) has no
solution. In fact, ax ≡ 1(mod n) has a solution if and only if the modulus and the
multiplier are relatively prime, i.e., gcd(a, n) = 1. We may use Euclid’s algorithm to
find this inverse through the solution to a · x′ + n · y′ = gcd(a, n) = 1. Thus
ax ≡ 1 (mod n) → ax ≡ a · x′ + n · y′ (mod n)
Clearly n · y′ ≡ 0 (mod n), so in fact this inverse is simply the x′ from Euclid’s algorithm.
In general, there are three cases, depending on the relationship between a, b, and n:
• gcd(a, b, n) > 1 — Then we can divide all three terms by this divisor to get
an equivalent congruence. This gives us a single solution mod the new base, or
equivalently gcd(a, b, n) solutions (mod n).
• gcd(a, n) does not divide b — Then, as described above, the congruence can have
no solution.
• gcd(a, n) = 1 — Then there is one solution (mod n). Further, x = a^(-1) b works,
since a a^(-1) b ≡ b (mod n). As shown above, this inverse exists and can be found
using Euclid’s algorithm.
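
The relatively prime case can be sketched in code as follows. The names ext_gcd and solve_coprime are hypothetical; the sketch assumes n ≥ 1 and that n is small enough for the product of two residues to fit in a long. It returns −1 when gcd(a, n) ≠ 1, leaving the other two cases to the caller.

```c
/* Hypothetical helper (not from the text): extended Euclid for
   non-negative a, b. Returns g = gcd(a, b) and sets *x, *y so that
   a*(*x) + b*(*y) == g. */
long ext_gcd(long a, long b, long *x, long *y)
{
    long x1, y1, g;

    if (b == 0) {
        *x = 1;
        *y = 0;
        return a;
    }
    g = ext_gcd(b, a % b, &x1, &y1);
    *x = y1;
    *y = x1 - (a / b) * y1;
    return g;
}

/* Hypothetical helper: solve a*x congruent to b (mod n) when
   gcd(a, n) = 1. Returns the solution in 0..n-1, or -1 otherwise. */
long solve_coprime(long a, long b, long n)
{
    long inv, unused;

    a = a % n;  if (a < 0) a = a + n;
    b = b % n;  if (b < 0) b = b + n;

    if (ext_gcd(a, n, &inv, &unused) != 1) return -1;

    inv = inv % n;                 /* a^(-1), possibly negative so far */
    if (inv < 0) inv = inv + n;

    return (inv * b) % n;          /* x = a^(-1) * b (mod n) */
}
```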
The Chinese remainder theorem gives us a tool for working with systems of congruences over different moduli. Suppose there exists an integer x such that x ≡
a_1 (mod m_1) and x ≡ a_2 (mod m_2). Then x is uniquely determined (mod m_1 m_2) if m_1
and m_2 are relatively prime.
To find this x, and thus solve the system of two congruences, we begin by solving
the linear congruences m_2 b_1 ≡ 1 (mod m_1) and m_1 b_2 ≡ 1 (mod m_2) to find b_1 and b_2
respectively. Then it can be readily verified that
x = a_1 b_1 m_2 + a_2 b_2 m_1
is a solution to both of the original congruences. Further, the theorem readily extends to
systems of an arbitrary number of congruences whose moduli are all pairwise relatively
prime.
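
The two-congruence construction can be sketched as below. The names inverse_mod and crt2 are hypothetical; the sketch assumes m1, m2 > 1 are relatively prime, 0 ≤ a1 < m1, 0 ≤ a2 < m2, and that m1 · m2 fits comfortably in a long. For example, x ≡ 2 (mod 3) and x ≡ 3 (mod 5) give b1 = 2 and b2 = 2, so x = 2 · 2 · 5 + 3 · 2 · 3 = 38 ≡ 8 (mod 15).

```c
/* Hypothetical helper (not from the text): inverse of a modulo m by an
   iterative extended Euclid. Assumes m > 1 and gcd(a, m) = 1. */
long inverse_mod(long a, long m)
{
    long r0 = m, r1 = a % m;    /* remainder sequence */
    long t0 = 0, t1 = 1;        /* invariant: t_k * a == r_k (mod m) */
    long q, tmp;

    if (r1 < 0) r1 = r1 + m;
    while (r1 != 0) {
        q = r0 / r1;
        tmp = r0 - q * r1;  r0 = r1;  r1 = tmp;
        tmp = t0 - q * t1;  t0 = t1;  t1 = tmp;
    }
    return (t0 < 0) ? t0 + m : t0;   /* now r0 == 1 and t0*a == 1 (mod m) */
}

/* Hypothetical helper: smallest non-negative x with x == a1 (mod m1)
   and x == a2 (mod m2), for relatively prime m1 and m2. */
long crt2(long a1, long m1, long a2, long m2)
{
    long b1 = inverse_mod(m2, m1);   /* m2 * b1 == 1 (mod m1) */
    long b2 = inverse_mod(m1, m2);   /* m1 * b2 == 1 (mod m2) */
    long term1 = ((a1 * b1) % m1) * m2;
    long term2 = ((a2 * b2) % m2) * m1;

    return (term1 + term2) % (m1 * m2);
}
```

Reducing a1 · b1 mod m1 before multiplying by m2 does not change the result mod m1 · m2, and it keeps each term below the product of the moduli.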

7.4.3 Diophantine Equations

Diophantine equations are formulae in which the variables are restricted to integers.
For example, Fermat’s last theorem concerned answers to the equation a^n + b^n = c^n.
Solving such an equation for real numbers is no big deal. It is only if all variables are
restricted to integers that the problem becomes diﬃcult.
Diophantine equations are diﬃcult to work with because division is not a routine operation with integer formulae. However, there are certain classes of Diophantine equations
which are known to be solvable and these tend to arise frequently.

The most important class are linear Diophantine equations of the form ax − ny = b,
where x and y are the integer variables and a, b, and n are integer constants. These are
readily shown to be equivalent to solving the congruence ax ≡ b (mod n) and hence
can be solved using the techniques of the previous section.
More advanced Diophantine analysis is beyond the scope of this book, but we refer
the reader to standard references in number theory such as Niven and Zuckerman
[ZMNN91] and Hardy and Wright [HW79] for more on this fascinating subject.

7.5 Number Theoretic Libraries
The Java BigInteger class (java.math.BigInteger) includes a variety of useful
number-theoretic functions. Most important, of course, is the basic support for arithmetic operations on arbitrary-precision integers as discussed in Chapter 5. But there
are also several functions of purely number-theoretic interest:
• Greatest Common Divisor — BigInteger gcd(BigInteger val) returns the
BigInteger whose value is the gcd of abs(this) and abs(val).
• Modular Exponentiation — BigInteger modPow(BigInteger exp, BigInteger
m) returns a BigInteger whose value is this^exp mod m.
• Modular Inverse — BigInteger modInverse(BigInteger m) returns a BigInteger whose value is this^(-1) (mod m), i.e., solves the congruence y · this ≡ 1 (mod m)
by returning an appropriate integer y if it exists.
• Primality Testing — public boolean isProbablePrime(int certainty) uses
a randomized primality test to return true if this BigInteger is probably prime
and false if it’s definitely composite. If the call returns true, the probability of
primality is ≥ 1 − 1/2^certainty.

7.6 Problems
7.6.1 Light, More Light

PC/UVa IDs: 110701/10110, Popularity: A, Success rate: average Level: 1
There is a man named Mabu who switches the lights on and off along a corridor at our
university. Every bulb has its own toggle switch that changes the state of the light. If
the light is oﬀ, pressing the switch turns it on. Pressing it again will turn it oﬀ. Initially
each bulb is oﬀ.
He does a peculiar thing. If there are n bulbs in a corridor, he walks along the corridor
back and forth n times. On the ith walk, he toggles only the switches whose position
is divisible by i. He does not press any switch when coming back to his initial position.
The ith walk is deﬁned as going down the corridor (doing his peculiar thing) and coming
back again. Determine the ﬁnal state of the last bulb. Is it on or oﬀ?

Input
The input will be an integer indicating the nth bulb in a corridor, which is less than or
equal to 2^32 − 1. A zero indicates the end of input and should not be processed.

Output
Output “yes” or “no” to indicate if the light is on, with each case appearing on its own
line.

Sample Input
3
6241
8191
0

Sample Output
no
yes
no

7.6.2 Carmichael Numbers

PC/UVa IDs: 110702/10006, Popularity: A, Success rate: average Level: 2
Certain cryptographic algorithms make use of big prime numbers. However, checking
whether a big number is prime is not so easy.
Randomized primality tests exist that oﬀer a high degree of conﬁdence of accurate
determination at low cost, such as the Fermat test. Let a be a random number between
2 and n − 1, where n is the number whose primality we are testing. Then, n is probably
prime if the following equation holds:
a^n mod n = a
If a number passes the Fermat test several times, then it is prime with a high probability.
Unfortunately, there is bad news. Certain composite numbers (non-primes) still pass
the Fermat test with every number smaller than themselves. These numbers are called
Carmichael numbers.
Write a program to test whether a given integer is a Carmichael number.

Input
The input will consist of a series of lines, each containing a small positive number n
(2 < n < 65,000). A number n = 0 will mark the end of the input, and must not be
processed.

Output
For each number in the input, print whether it is a Carmichael number or not as shown
in the sample output.

Sample Input
1729
17
561
1109
431
0

Sample Output
The number 1729 is a Carmichael number.
17 is normal.
The number 561 is a Carmichael number.
1109 is normal.
431 is normal.

7.6.3 Euclid Problem

PC/UVa IDs: 110703/10104, Popularity: A, Success rate: average Level: 1
From Euclid, it is known that for any positive integers A and B there exist
integers X and Y such that AX + BY = D, where D is the greatest common divisor of A
and B. The problem is to find the corresponding X, Y, and D for a given A and B.

Input
The input will consist of a set of lines with the integer numbers A and B, separated
with space (A, B < 1,000,000,001).

Output
For each input line the output line should consist of three integers X, Y , and D,
separated with space. If there are several such X and Y , you should output that pair
for which X ≤ Y and |X| + |Y | is minimal.

Sample Input
4 6
17 17

Sample Output
-1 1 2
0 1 17

7.6.4 Factovisors

PC/UVa IDs: 110704/10139, Popularity: A, Success rate: average Level: 2
The factorial function, n! is defined as follows for all non-negative integers n:
0! = 1
n! = n × (n − 1)!   (n > 0)

We say that a divides b if there exists an integer k such that
k × a = b

Input
The input to your program consists of several lines, each containing two non-negative
integers, n and m, both less than 2^31.

Output
For each input line, output a line stating whether or not m divides n!, in the format
shown below.

Sample Input
6 9
6 27
20 10000
20 100000
1000 1009

Sample Output
9 divides 6!
27 does not divide 6!
10000 divides 20!
100000 does not divide 20!
1009 does not divide 1000!

7.6.5 Summation of Four Primes

PC/UVa IDs: 110705/10168, Popularity: A, Success rate: average Level: 2
Waring’s prime number conjecture states that every odd integer is either prime or
the sum of three primes. Goldbach’s conjecture is that every even integer is the sum of
two primes. Both problems have been open for over 200 years.
In this problem you have a slightly less demanding task. Find a way to express a
given integer as the sum of exactly four primes.

Input
Each input case consists of one integer n (n ≤ 10000000) on its own line. Input is
terminated by end of ﬁle.

Output
For each input case n, print one line of output containing four prime numbers which
sum up to n. If the number cannot be expressed as a summation of four prime numbers
print the line “Impossible.” in a single line. There can be multiple solutions. Any
good solution will be accepted.

Sample Input
24
36
46

Sample Output
3 11 3 7
3 7 13 13
11 11 17 7

7.6.6 Smith Numbers

PC/UVa IDs: 110706/10042, Popularity: B, Success rate: average Level: 1
While skimming his phone directory in 1982, mathematician Albert Wilansky noticed
that the telephone number of his brother-in-law H. Smith had the following peculiar
property: The sum of the digits of that number was equal to the sum of the digits of
the prime factors of that number. Got it? Smith’s telephone number was 493-7775. This
number can be written as the product of its prime factors in the following way:
4937775 = 3 · 5 · 5 · 65837
The sum of all digits of the telephone number is 4 + 9 + 3 + 7 + 7 + 7 + 5 = 42, and
the sum of the digits of its prime factors is equally 3 + 5 + 5 + 6 + 5 + 8 + 3 + 7 = 42.
Wilansky named this type of number after his brother-in-law: the Smith numbers.
As this property is true for every prime number, Wilansky excluded them from the
deﬁnition. Other Smith numbers include 6,036 and 9,985.
Wilansky was not able to ﬁnd a Smith number which was larger than the telephone
number of his brother-in-law. Can you help him out?
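
The definition can be checked mechanically. The sketch below tests a single number by trial division, assuming the value fits in a long and is below 10^9. The names digit_sum and is_smith are ours, and the search for the next Smith number is left to the reader:

```c
#include <assert.h>

/* Sum of the decimal digits of n. */
static int digit_sum(long n)
{
    int s = 0;

    assert(n >= 0);
    while (n > 0) {
        s += n % 10;
        n /= 10;
    }
    return s;
}

/* Return 1 if n is composite and its digit sum equals the digit sum
   of its prime factors, counted with multiplicity; 0 otherwise. */
int is_smith(long n)
{
    long m = n;                 /* unfactored remainder */
    long p;                     /* trial divisor */
    int factor_digits = 0;      /* digit sum over prime factors */
    int nfactors = 0;           /* number of prime factors found */

    if (n < 2) return 0;
    for (p = 2; p * p <= m; p++)
        while (m % p == 0) {
            factor_digits += digit_sum(p);
            nfactors++;
            m /= p;
        }
    if (m > 1) {                /* what remains is one last prime factor */
        factor_digits += digit_sum(m);
        nfactors++;
    }
    return nfactors > 1 && factor_digits == digit_sum(n);
}
```

With this sketch, is_smith(4937775) returns 1, as do is_smith(6036) and is_smith(9985), while every prime is rejected because it has only one prime factor.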

Input
The input consists of several test cases, the number of which you are given in the ﬁrst
line of the input. Each test case consists of one line containing a single positive integer
smaller than 10^9.

Output
For every input value n, compute the smallest Smith number which is larger than n
and print it on a single line. You can assume that such a number exists.

Sample Input
1
4937774

Sample Output
4937775

7.6.7 Marbles

PC/UVa IDs: 110707/10090, Popularity: B, Success rate: low Level: 1
I collect marbles (colorful small glass balls) and want to buy boxes to store them.
The boxes come in two types:
Type 1: each such box costs c1 dollars and can hold exactly n1 marbles
Type 2: each such box costs c2 dollars and can hold exactly n2 marbles
I want each box to be ﬁlled to its capacity, and also to minimize the total cost of
buying them. Help me ﬁnd the best way to distribute my marbles among the boxes.

Input
The input ﬁle may contain multiple test cases. Each test case begins with a line containing the integer n (1 ≤ n ≤ 2,000,000,000). The second line contains c1 and n1 , and
the third line contains c2 and n2 . Here, c1 , c2 , n1 , and n2 are all positive integers having
values smaller than 2,000,000,000.
A test case containing a zero for the number of marbles terminates the input.

Output
For each test case in the input print a line containing the minimum cost solution (two
nonnegative integers m1 and m2, where mi = number of type i boxes required) if one
exists. Otherwise print “failed”.
If a solution exists, you may assume that it is unique.

Sample Input
43
1 3
2 4
40
5 9
5 12
0

Sample Output
13 1
failed

7.6.8 Repackaging

PC/UVa IDs: 110708/10089, Popularity: C, Success rate: low Level: 2
Coﬀee cups of three diﬀerent sizes (size 1, size 2, and size 3) are manufactured by
the Association of Cup Makers (ACM) and are sold in various packages. Each type of
package is identiﬁed by three positive integers (S1 , S2 , S3 ), where Si (1 ≤ i ≤ 3) denotes
the number of size i cups included in the package. Unfortunately, there is no package
such that S1 = S2 = S3 .
Market research has discovered there is great demand for packages containing equal
numbers of cups of all three sizes. To exploit this opportunity, ACM has decided to
unpack the cups from some of the packages in its unlimited stock of unsold products
and repack them as packages having equal number of cups of all three sizes. For example,
suppose ACM has the following packages in its stock: (1, 2, 3), (1, 11, 5), (9, 4, 3), and
(2, 3, 2). Then we can unpack three (1, 2, 3) packages, one (9, 4, 3) package, and two
(2, 3, 2) packages and repack the cups to produce sixteen (1, 1, 1) packages. One can
even produce eight (2, 2, 2) packages or four (4, 4, 4) packages or two (8, 8, 8) packages
or one (16, 16, 16) package, etc. Note that all the unpacked cups must be used to produce
the new packages; i.e., no unpacked cup is wasted.
ACM has hired you to write a program to decide whether it is possible to produce
packages containing an equal number of all three types of cups using all the cups that
can be found by unpacking any combination of existing packages in stock.

Input
The input may contain multiple test cases. Each test case begins with a line containing
an integer N (3 ≤ N ≤ 1,000) indicating the number of diﬀerent types of packages that
can be found in the stock. Each of the next N lines contains three positive integers
denoting, respectively, the number of size 1, size 2, and size 3 cups in a package. No
two packages in a test case will have the same speciﬁcation.
A test case containing a zero for N in the ﬁrst line terminates the input.

Output
For each test case print a line containing “Yes” if packages can be produced as desired.
Print “No” if they cannot be produced.

Sample Input
4
1 2 3
1 11 5
9 4 3
2 3 2
4
1 3 3
1 11 5
9 4 3
2 3 2
0

Sample Output
Yes
No

7.7 Hints
7.1 Can we ﬁgure out the state of the nth bulb without testing all numbers from 1
to n?
7.2 How can we compute a^n (mod n) efficiently?
7.3 Are we sure the construction in the text gives the minimal such pair?
7.4 Can we test the divisibility without explicitly computing n!?
7.7 Can you compute the possible exact solutions independent of cost? Which one of
these will be the cheapest?
7.8 Can we solve these Diophantine equations using the techniques discussed in this
chapter?

7.8 Notes
7.5 The Goldbach and Waring conjectures are almost certainly true, but perhaps
because of brute force instead of deep properties of the primes. Do a back-of-the-envelope calculation of the expected number of solutions for each problem,
assuming that there are n/ ln n primes less than n. Is it promising to hunt further
for a counter-example when none has been found before n = 1,000,000?
7.6 Papers on the properties of Smith numbers include [Wil82, McD87].

8
Backtracking

Modern computers are so fast that brute force can be an eﬀective and honorable way
to solve problems. For example, sometimes it is easier to count the number of items
in a set by actually constructing them than by using sophisticated combinatorial arguments. Of course, this requires the number of items searched to be small enough for the
computation to complete.
A modern personal computer has a clock rate of about 1 gigahertz, meaning one
billion operations per second. Figure that doing anything interesting takes a few hundred
instructions or even more. Thus you can hope to search a few million items per second
on contemporary machines.
It is important to realize how big (or how small) one million is. One million permutations means all arrangements of roughly 10 or 11 objects, but not more. One million
subsets means all combinations of roughly 20 items, but not more. Solving signiﬁcantly
larger problems requires carefully pruning the search space to make sure we look at
only the elements which really matter.
In this chapter, we look at backtracking algorithms for exhaustive search and
designing eﬀective pruning techniques to make them as powerful as possible.

8.1 Backtracking
Backtracking is a systematic method to iterate through all the possible conﬁgurations
of a search space. It is a general algorithm/technique which must be customized for
each individual application.

In the general case, we will model our solution as a vector a = (a1 , a2 , ..., an ), where
each element ai is selected from a ﬁnite ordered set Si . Such a vector might represent
an arrangement where ai contains the ith element of the permutation. Or the vector
might represent a given subset S, where ai is true if and only if the ith element of the
universe is in S. The vector can even represent a sequence of moves in a game or a path
in a graph, where ai contains the ith event in the sequence.
At each step in the backtracking algorithm, we start from a given partial solution,
say, a = (a1 , a2 , ..., ak ), and try to extend it by adding another element at the end. After
extending it, we must test whether what we have so far is a solution – if so, we should
print it, count it, or do what we want with it. If not, we must then check whether the
partial solution is still potentially extendible to some complete solution. If so, recur and
continue. If not, we delete the last element from a and try another possibility for that
position, if one exists.
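
To see these steps in isolation, consider a minimal sketch that counts all subsets of n items, where a[i] records whether item i is in the subset. The names backtrack_subsets and count_subsets are hypothetical, every position has the same two candidates, and no pruning is attempted:

```c
#include <assert.h>

#define MAXN 20                 /* assumed limit on the number of items */

static long nsolutions;         /* complete vectors seen so far */

/* a[1..k] is the partial solution; extend it one position at a time. */
void backtrack_subsets(int a[], int k, int n)
{
    int c;                      /* candidate for position k+1: out (0) or in (1) */

    if (k == n) {               /* a[1..n] is a complete solution */
        nsolutions++;
        return;
    }
    for (c = 0; c <= 1; c++) {
        a[k + 1] = c;           /* extend the partial solution */
        backtrack_subsets(a, k + 1, n);
        /* on return, the next iteration overwrites a[k+1],
           which is how the last element gets "deleted" */
    }
}

long count_subsets(int n)
{
    int a[MAXN + 1];

    assert(n >= 0 && n <= MAXN);
    nsolutions = 0;
    backtrack_subsets(a, 0, n);
    return nsolutions;
}
```

Here count_subsets(10) returns 1024, one for each of the 2^10 subsets.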
The honest working code is given below. We include a global finished ﬂag to allow
for premature termination, which could be set in any application-speciﬁc routine.
bool finished = FALSE;                  /* found all solutions yet? */

backtrack(int a[], int k, data input)
{
    int c[MAXCANDIDATES];               /* candidates for next position */
    int ncandidates;                    /* next position candidate count */
    int i;                              /* counter */

    if (is_a_solution(a,k,input))
        process_solution(a,k,input);
    else {
        k = k+1;
        construct_candidates(a,k,input,c,&ncandidates);
for (i=0; iedges[x]. The degree ﬁeld counts the number of
meaningful entries for the given vertex. An undirected edge (x, y) appears twice in any
adjacency-based graph structure, once as y in x’s list, and once as x in y’s list.

9.2. Data Structures for Graphs


To demonstrate the use of this data structure, we show how to read in a graph from
a ﬁle. A typical graph format consists of an initial line featuring the number of vertices
and edges in the graph, followed by a listing of the edges at one vertex pair per line.
read_graph(graph *g, bool directed)
{
    int i;                          /* counter */
    int m;                          /* number of edges */
    int x, y;                       /* vertices in edge (x,y) */

    initialize_graph(g);
    scanf("%d %d",&(g->nvertices),&m);
    for (i=1; i<=m; i++) {
        scanf("%d %d",&x,&y);
        insert_edge(g,x,y,directed);
    }
}

initialize_graph(graph *g)
{
    int i;                          /* counter */

    g -> nvertices = 0;
    g -> nedges = 0;
    for (i=1; i<=MAXV; i++) g->degree[i] = 0;
}
The critical routine is insert_edge. We parameterize it with a Boolean flag directed
to identify whether we need to insert two copies of each edge or only one. Note the use
of recursion to solve the problem:
insert_edge(graph *g, int x, int y, bool directed)
{
    if (g->degree[x] > MAXDEGREE)
        printf("Warning: insertion(%d,%d) exceeds max degree\n",x,y);

    g->edges[x][g->degree[x]] = y;
    g->degree[x] ++;

    if (directed == FALSE)
        insert_edge(g,y,x,TRUE);
    else
        g->nedges ++;
}
Printing the associated graph is now simply a matter of nested loops:
print_graph(graph *g)
{
    int i,j;                        /* counters */

    for (i=1; i<=g->nvertices; i++) {
        printf("%d: ",i);
        for (j=0; j<g->degree[i]; j++)
            printf(" %d",g->edges[i][j]);
        printf("\n");
    }
}

9.3 Graph Traversal: Breadth-First
The basic operation in most graph algorithms is completely and systematically traversing the graph. We want to visit every vertex and every edge exactly once in some
well-deﬁned order. There are two primary traversal algorithms: breadth-ﬁrst search
(BFS) and depth-first search (DFS). For certain problems, it makes absolutely no
difference which one you use, but in other cases the distinction is crucial.
Both graph traversal procedures share one fundamental idea, namely, that it is necessary to mark the vertices we have seen before so we don’t try to explore them again.
Otherwise we get trapped in a maze and can’t ﬁnd our way out. BFS and DFS diﬀer
only in the order in which they explore vertices.
Breadth-ﬁrst search is appropriate if (1) we don’t care which order we visit the vertices
and edges of the graph, so any order is appropriate or (2) we are interested in shortest
paths on unweighted graphs.

9.3.1

Breadth-First Search

Our breadth-ﬁrst search implementation bfs uses two Boolean arrays to maintain our
knowledge about each vertex in the graph. A vertex is discovered the ﬁrst time we
visit it. A vertex is considered processed after we have traversed all outgoing edges
from it. Thus each vertex passes from undiscovered to discovered to processed over the
course of the search. This information could be maintained using one enumerated type
variable; we used two Boolean variables instead.
Once a vertex is discovered, it is placed on a queue, such as we implemented in Section
2.1.2. Since we process these vertices in ﬁrst-in, ﬁrst-out order, the oldest vertices are
expanded ﬁrst, which are exactly those closest to the root:
bool processed[MAXV];               /* which vertices have been processed */
bool discovered[MAXV];              /* which vertices have been found */
int parent[MAXV];                   /* discovery relation */

bfs(graph *g, int start)
{
    queue q;                        /* queue of vertices to visit */
    int v;                          /* current vertex */
    int i;                          /* counter */

    init_queue(&q);
    enqueue(&q,start);
    discovered[start] = TRUE;

    while (empty(&q) == FALSE) {
        v = dequeue(&q);
        process_vertex(v);
        processed[v] = TRUE;
        for (i=0; i<g->degree[v]; i++)
            if (valid_edge(g->edges[v][i]) == TRUE) {
                if (discovered[g->edges[v][i]] == FALSE) {
                    enqueue(&q,g->edges[v][i]);
                    discovered[g->edges[v][i]] = TRUE;
                    parent[g->edges[v][i]] = v;
                }
                if (processed[g->edges[v][i]] == FALSE)
                    process_edge(v,g->edges[v][i]);
            }
    }
}

initialize_search(graph *g)
{
    int i;                          /* counter */

    for (i=1; i<=g->nvertices; i++) {
        processed[i] = discovered[i] = FALSE;
        parent[i] = -1;
    }
}

9.3.2

Exploiting Traversal

The exact behavior of bfs depends upon the functions process_vertex() and
process_edge(). Through these functions, we can easily customize what the traversal does as it makes one official visit to each edge and each vertex. By setting the
functions to

process_vertex(int v)
{
printf("processed vertex %d\n",v);
}
process_edge(int x, int y)
{
printf("processed edge (%d,%d)\n",x,y);
}
we print each vertex and edge exactly once. By setting the functions to
process_vertex(int v)
{
}
process_edge(int x, int y)
{
nedges = nedges + 1;
}
we get an accurate count of the number of edges. Many problems perform diﬀerent
actions on vertices or edges as they are encountered. These functions give us the freedom
to easily customize our response.
One final degree of customization is provided by the Boolean predicate valid_edge,
which allows us to ignore the existence of certain edges in the graph during our traversal.
Setting valid_edge to return true for all edges results in a full breadth-first search of
the graph, and will be the case for our examples except netflow in Section 10.4.

9.3.3

Finding Paths

The parent array set within bfs() is very useful for ﬁnding interesting paths through
a graph. The vertex which discovered vertex i is deﬁned as parent[i]. Every vertex
is discovered during the course of traversal, so except for the root every node has a
parent. The parent relation deﬁnes a tree of discovery with the initial search node as
the root of the tree.
Because vertices are discovered in order of increasing distance from the root, this tree
has a very important property. The unique tree path from the root to any node x ∈ V
uses the smallest number of edges (or equivalently, intermediate nodes) possible on any
root-to-x path in the graph.
We can reconstruct this path by following the chain of ancestors from x to the root.
Note that we have to work backward. We cannot ﬁnd the path from the root to x, since
that does not follow the direction of the parent pointers. Instead, we must ﬁnd the path
from x to the root.

Figure 9.1. An undirected 4 × 4 grid-graph (l), with the DAG deﬁned by edges going to
higher-numbered vertices (r).

Since this is the reverse of how we normally want the path, we can either (1) store it
and then explicitly reverse it using a stack, or (2) let recursion reverse it for us, as in
the following slick routine:
find_path(int start, int end, int parents[])
{
if ((start == end) || (end == -1))
printf("\n%d",start);
else {
find_path(start,parents[end],parents);
printf(" %d",end);
}
}
On our grid graph example (Figure 9.1) our algorithm generated the following parent
relation:
vertex   1  2  3  4  5  6  7  8  9 10 11 12 13 14 15 16
parent  -1  1  2  3  1  2  3  4  5  6  7  8  9 10 11 12

For the shortest path from the lower-left corner of the grid to the upper-right corner,
this parent relation yields the path {1, 2, 3, 4, 8, 12, 16}. Of course, this shortest path is
not unique; the number of such paths in this graph is counted in Section 6.3.
There are two points to remember about using breadth-ﬁrst search to ﬁnd a shortest
path from x to y: First, the shortest path tree is only useful if BFS was performed with
x as the root of the search. Second, BFS only gives the shortest path if the graph is
unweighted. We will present algorithms for ﬁnding shortest paths in weighted graphs
in Section 10.3.1.
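
The parent relation above can be reproduced with a small self-contained sketch. It assumes the grid is numbered row by row starting from the lower-left corner, and that each vertex lists its neighbors in the order right, up, left, down. Both are our assumptions, as are the names grid_bfs and build_path, and neighbors are generated on the fly instead of being read from a graph structure:

```c
#include <assert.h>

#define SIDE 4                      /* grid is SIDE x SIDE */
#define NV   (SIDE * SIDE)          /* vertices are numbered 1..NV */

int parent[NV + 1];                 /* discovery relation */
int discovered[NV + 1];             /* which vertices have been found */

/* Store the neighbors of v in nbr[] (right, up, left, down); return the count. */
static int grid_neighbors(int v, int nbr[])
{
    int n = 0;
    int col = (v - 1) % SIDE;
    int row = (v - 1) / SIDE;

    if (col < SIDE - 1) nbr[n++] = v + 1;
    if (row < SIDE - 1) nbr[n++] = v + SIDE;
    if (col > 0)        nbr[n++] = v - 1;
    if (row > 0)        nbr[n++] = v - SIDE;
    return n;
}

void grid_bfs(int start)
{
    int queue[NV];                  /* each vertex is enqueued at most once */
    int head = 0, tail = 0;
    int nbr[4];
    int v, i, n;

    assert(start >= 1 && start <= NV);
    for (i = 1; i <= NV; i++) {
        discovered[i] = 0;
        parent[i] = -1;
    }
    queue[tail++] = start;
    discovered[start] = 1;
    while (head < tail) {
        v = queue[head++];
        n = grid_neighbors(v, nbr);
        for (i = 0; i < n; i++)
            if (!discovered[nbr[i]]) {
                discovered[nbr[i]] = 1;
                parent[nbr[i]] = v;
                queue[tail++] = nbr[i];
            }
    }
}

/* Write the root-to-end path into path[]; return the number of vertices on it. */
int build_path(int end, int path[])
{
    int len = 0, i, v, tmp;

    for (v = end; v != -1; v = parent[v])   /* follow ancestors back to the root */
        path[len++] = v;
    for (i = 0; i < len / 2; i++) {         /* reverse into root-first order */
        tmp = path[i];
        path[i] = path[len - 1 - i];
        path[len - 1 - i] = tmp;
    }
    return len;
}
```

After grid_bfs(1), calling build_path(16, path) fills path with 1 2 3 4 8 12 16 and returns 7. Here the reversal is done explicitly in an array, the first of the two options mentioned above.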

9.4 Graph Traversal: Depth-First
Depth-ﬁrst search uses essentially the same idea as backtracking. Both involve exhaustively searching all possibilities by advancing if it is possible, and backing up as soon
as there is no unexplored possibility for further advancement. Both are most easily
understood as recursive algorithms.
Depth-ﬁrst search can be thought of as breadth-ﬁrst search with a stack instead of
a queue. The beauty of implementing dfs recursively is that recursion eliminates the
need to keep an explicit stack:
dfs(graph *g, int v)
{
    int i;                          /* counter */
    int y;                          /* successor vertex */

    if (finished) return;           /* allow for search termination */

    discovered[v] = TRUE;
    process_vertex(v);

    for (i=0; i<g->degree[v]; i++) {
        y = g->edges[v][i];
        if (valid_edge(g->edges[v][i]) == TRUE) {
            if (discovered[y] == FALSE) {
                parent[y] = v;
                dfs(g,y);
            } else
                if (processed[y] == FALSE)
                    process_edge(v,y);
        }
        if (finished) return;
    }

    processed[v] = TRUE;
}
Rooted trees are a special type of graph (directed, acyclic, in-degrees of at most 1,
with an order deﬁned on the outgoing edges of each node). In-order, pre-order, and
post-order traversals are all basically DFS, diﬀering only in how they use the ordering
of out-edges and when they process the vertex.

9.4.1

Finding Cycles

Depth-first search of an undirected graph partitions the edges into two classes, tree
edges and back edges. The tree edges are those encoded in the parent relation, the edges
which discover new vertices. Back edges are those whose other endpoint is an ancestor
of the vertex being expanded, so they point back into the tree.
That all edges fall into these two classes is an amazing property of depth-ﬁrst search.
Why can’t an edge go to a brother or cousin node instead of an ancestor? In DFS, all
nodes reachable from a given vertex v are expanded before we ﬁnish with the traversal
from v, so such topologies are impossible for undirected graphs. The case of DFS on
directed graphs is somewhat more complicated but still highly structured.
Back edges are the key to ﬁnding a cycle in an undirected graph. If there is no back
edge, all edges are tree edges, and no cycle exists in a tree. But any back edge going
from x to an ancestor y creates a cycle with the path in the tree from y to x. Such a
cycle is easy to ﬁnd using dfs:
process_edge(int x, int y)
{
if (parent[x] != y) {
/* found back edge! */
printf("Cycle from %d to %d:",y,x);
find_path(y,x,parent);
finished = TRUE;
}
}
process_vertex(int v)
{
}
We use the finished ﬂag to terminate after ﬁnding the ﬁrst cycle in our 4 × 4 grid
graph, which is 3 4 8 7 with (7, 3) as the back edge.

9.4.2

Connected Components

A connected component of an undirected graph is a maximal set of vertices such that
there is a path between every pair of vertices. These are the separate “pieces” of the
graph such that there is no connection between the pieces.
An amazing number of seemingly complicated problems reduce to ﬁnding or counting
connected components. For example, testing whether a puzzle such as Rubik’s cube or
the 15-puzzle can be solved from any position is really asking whether the graph of legal
conﬁgurations is connected.
Connected components can easily be found using depth-ﬁrst search or breadth-ﬁrst
search, since the vertex order does not matter. Basically, we search from the ﬁrst vertex.
Anything we discover during this search must be part of the same connected component.
We then repeat the search from any undiscovered vertex (if one exists) to deﬁne the
next component, and so on until all vertices have been found:
connected_components(graph *g)
{
    int c;                          /* component number */
    int i;                          /* counter */

    initialize_search(g);

    c = 0;
    for (i=1; i<=g->nvertices; i++)
        if (discovered[i] == FALSE) {
            c = c+1;
            printf("Component %d:",c);
            dfs(g,i);
            printf("\n");
        }
}
process_vertex(int v)
{
printf(" %d",v);
}
process_edge(int x, int y)
{
}
Variations on connected components are discussed in Section 10.1.2.
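
For readers who want to experiment without the rest of the graph library, here is a minimal self-contained sketch of the same idea. It assumes a small adjacency matrix in place of the adjacency lists used in this chapter, and the names cc_reset, cc_add_edge, and label_components are hypothetical. Each vertex is labeled with its component number instead of being printed:

```c
#include <assert.h>

#define MAXV 100                    /* assumed limit on the number of vertices */

int nvertices;                      /* vertices are numbered 1..nvertices */
int adj[MAXV + 1][MAXV + 1];        /* adj[x][y] is 1 if edge (x,y) exists */
int component[MAXV + 1];            /* component label; 0 means undiscovered */

void cc_reset(int n)
{
    int i, j;

    assert(n >= 0 && n <= MAXV);
    nvertices = n;
    for (i = 1; i <= n; i++) {
        component[i] = 0;
        for (j = 1; j <= n; j++) adj[i][j] = 0;
    }
}

void cc_add_edge(int x, int y)
{
    adj[x][y] = adj[y][x] = 1;      /* undirected: record both directions */
}

/* Label everything reachable from v with component number c. */
static void cc_dfs(int v, int c)
{
    int y;

    component[v] = c;
    for (y = 1; y <= nvertices; y++)
        if (adj[v][y] && component[y] == 0)
            cc_dfs(y, c);
}

/* Label all vertices; return the number of connected components. */
int label_components(void)
{
    int c = 0;
    int i;

    for (i = 1; i <= nvertices; i++) component[i] = 0;
    for (i = 1; i <= nvertices; i++)
        if (component[i] == 0)      /* an undiscovered vertex starts a new component */
            cc_dfs(i, ++c);
    return c;
}
```

On five vertices with edges (1,2), (2,3), and (4,5), label_components() returns 2, with vertices 1, 2, 3 in the first component and 4, 5 in the second.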

9.5 Topological Sorting
Topological sorting is the fundamental operation on directed acyclic graphs (DAGs). It
constructs an ordering of the vertices such that all directed edges go from left to right.
Such an ordering clearly cannot exist if the graph contains any directed cycles, because
there is no way you can keep going right on a line and still return back to where you
started from!
The importance of topological sorting is that it gives us a way to process each vertex
before any of its successors. Suppose the edges represented precedence constraints, such
that edge (x, y) means job x must be done before job y. Then any topological sort
deﬁnes a legal schedule. Indeed, there can be many such orderings for a given DAG.
But the applications go deeper. Suppose we seek the shortest (or longest) path from
x to y in a DAG. Certainly no vertex appearing after y in the topological order can
contribute to any such path, because there will be no way to get back to y. We can
appropriately process all the vertices from left to right in topological order, considering
the impact of their outgoing edges, and know that we will look at everything we need
before we need it.
Topological sorting can be performed efficiently by using a version of depth-first
search. However, a more straightforward algorithm is based on an analysis of the
in-degrees of each vertex in a DAG. If a vertex has no incoming edges, i.e., has in-degree 0,
we may safely place it first in topological order. Deleting its outgoing edges may create
new in-degree 0 vertices. This process continues until all vertices have been placed in
the ordering; if it stalls before then, the graph contained a cycle and was not a DAG in
the first place. Study the following implementation:

topsort(graph *g, int sorted[])
{
    int indegree[MAXV];             /* indegree of each vertex */
    queue zeroin;                   /* vertices of indegree 0 */
    int x, y;                       /* current and next vertex */
    int i, j;                       /* counters */

    compute_indegrees(g,indegree);
    init_queue(&zeroin);
    for (i=1; i<=g->nvertices; i++)
        if (indegree[i] == 0) enqueue(&zeroin,i);

    j=0;
    while (empty(&zeroin) == FALSE) {
        j = j+1;
        x = dequeue(&zeroin);
        sorted[j] = x;
        for (i=0; i<g->degree[x]; i++) {
            y = g->edges[x][i];
            indegree[y] --;
            if (indegree[y] == 0) enqueue(&zeroin,y);
        }
    }

    if (j != g->nvertices)
        printf("Not a DAG -- only %d vertices found\n",j);
}

compute_indegrees(graph *g, int in[])
{
    int i,j;                        /* counters */

    for (i=1; i<=g->nvertices; i++) in[i] = 0;

    for (i=1; i<=g->nvertices; i++)
        for (j=0; j<g->degree[i]; j++) in[ g->edges[i][j] ] ++;
}

There are several things to observe. Our ﬁrst step is computing the in-degrees of each
vertex of the DAG, since the degree ﬁeld of the graph data type records the out-degree
of a vertex. These are the same for undirected graphs, but not directed ones.
Next, note that we use a queue here to maintain the in-degree 0 vertices, but only
because we had one sitting around from Section 2.1.2. Any container will do, since the
processing order does not matter for correctness. Diﬀerent processing orders will yield
diﬀerent topological sorts.
The impact of processing orders is apparent in topologically sorting the directed grid
in Figure 9.1, where all edges go from lower- to higher-numbered vertices. The sorted
permutation {1, 2, . . . , 15, 16} is a topological ordering, but our program repeatedly
stripped oﬀ diagonals to ﬁnd
1 2 5 3 6 9 4 7 10 13 8 11 14 12 15 16
Many other orderings are also possible.
Finally, note that this implementation does not actually delete the edges from the
graph! It is suﬃcient to consider their impact on the in-degree and traverse them rather
than delete them.
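
The diagonal ordering above can be reproduced with a small self-contained sketch of the in-degree method. It assumes the 4 × 4 grid is numbered row by row from the lower-left corner, and that each vertex lists its right neighbor before its upper neighbor. Both are our assumptions, as are the names grid_successors and grid_topsort:

```c
#include <assert.h>

#define SIDE 4                      /* grid is SIDE x SIDE */
#define NV   (SIDE * SIDE)          /* vertices are numbered 1..NV */

/* Successors of v in the directed grid: right neighbor, then upper neighbor. */
static int grid_successors(int v, int succ[])
{
    int n = 0;

    if ((v - 1) % SIDE < SIDE - 1) succ[n++] = v + 1;
    if ((v - 1) / SIDE < SIDE - 1) succ[n++] = v + SIDE;
    return n;
}

/* Fill sorted[0..NV-1] with a topological order; return the number placed. */
int grid_topsort(int sorted[])
{
    int indegree[NV + 1] = {0};     /* indegree of each vertex */
    int queue[NV];                  /* vertices of indegree 0, first in first out */
    int head = 0, tail = 0;
    int placed = 0;                 /* vertices written to sorted[] so far */
    int succ[2];
    int v, i, n;

    for (v = 1; v <= NV; v++) {
        n = grid_successors(v, succ);
        for (i = 0; i < n; i++) indegree[succ[i]]++;
    }
    for (v = 1; v <= NV; v++)
        if (indegree[v] == 0) queue[tail++] = v;

    while (head < tail) {
        v = queue[head++];
        sorted[placed++] = v;
        n = grid_successors(v, succ);
        for (i = 0; i < n; i++)     /* "delete" the outgoing edges of v */
            if (--indegree[succ[i]] == 0) queue[tail++] = succ[i];
    }
    assert(placed == NV);           /* the directed grid has no cycles */
    return placed;
}
```

Under these assumptions the sketch produces 1 2 5 3 6 9 4 7 10 13 8 11 14 12 15 16. Replacing the queue with a stack would give a different, equally valid ordering.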

9.6 Problems

9.6.1 Bicoloring

PC/UVa IDs: 110901/10004, Popularity: A, Success rate: high Level: 1
The four-color theorem states that every planar map can be colored using only four
colors in such a way that no region is colored using the same color as a neighbor. After
being open for over 100 years, the theorem was proven in 1976 with the assistance of a
computer.
Here you are asked to solve a simpler problem. Decide whether a given connected
graph can be bicolored, i.e., can the vertices be painted red and black such that no two
adjacent vertices have the same color.
To simplify the problem, you can assume the graph will be connected, undirected,
and not contain self-loops (i.e., edges from a vertex to itself).

Input
The input consists of several test cases. Each test case starts with a line containing the
number of vertices n, where 1 < n < 200. Each vertex is labeled by a number from 0
to n − 1. The second line contains the number of edges l. After this, l lines follow, each
containing two vertex numbers specifying an edge.
An input with n = 0 marks the end of the input and is not to be processed.

Output
Decide whether the input graph can be bicolored, and print the result as shown below.

Sample Input
3
3
0 1
1 2
2 0
9
8
0 1
0 2
0 3
0 4
0 5
0 6
0 7
0 8
0

Sample Output
NOT BICOLORABLE.
BICOLORABLE.

9.6.2 Playing With Wheels

PC/UVa IDs: 110902/10067, Popularity: C, Success rate: average Level: 2
Consider the following mathematical machine. Digits ranging from 0 to 9 are printed
consecutively (clockwise) on the periphery of each wheel. The topmost digits of the
wheels form a four-digit integer. For example, in the following ﬁgure the wheels form
the integer 8,056. Each wheel has two buttons associated with it. Pressing the button
marked with a left arrow rotates the wheel one digit in the clockwise direction and
pressing the one marked with the right arrow rotates it by one digit in the opposite
direction.

We start with an initial conﬁguration of the wheels, with the topmost digits forming
the integer S1 S2 S3 S4 . You will be given a set of n forbidden conﬁgurations Fi1 Fi2 Fi3 Fi4
(1 ≤ i ≤ n) and a target conﬁguration T1 T2 T3 T4 . Your job is to write a program
to calculate the minimum number of button presses required to transform the initial
conﬁguration to the target conﬁguration without passing through a forbidden one.

Input
The ﬁrst line of the input contains an integer N giving the number of test cases. A
blank line then follows.
The ﬁrst line of each test case contains the initial conﬁguration of the wheels, speciﬁed
by four digits. Two consecutive digits are separated by a space. The next line contains
the target conﬁguration. The third line contains an integer n giving the number of forbidden conﬁgurations. Each of the following n lines contains a forbidden conﬁguration.
There is a blank line between two consecutive input sets.

Output
For each test case in the input print a line containing the minimum number of button
presses required. If the target conﬁguration is not reachable print “-1”.

Sample Input
2

8 0 5 6
6 5 0 8
5
8 0 5 7
8 0 4 7
5 5 0 8
7 5 0 8
6 4 0 8

0 0 0 0
5 3 1 7
8
0 0 0 1
0 0 0 9
0 0 1 0
0 0 9 0
0 1 0 0
0 9 0 0
1 0 0 0
9 0 0 0

Sample Output
14
-1

9.6.3 The Tourist Guide

PC/UVa IDs: 110903/10099, Popularity: B, Success rate: average Level: 3
Mr. G. works as a tourist guide in Bangladesh. His current assignment is to show a
group of tourists a distant city. As in all countries, certain pairs of cities are connected by
two-way roads. Each pair of neighboring cities has a bus service that runs only between
those two cities and uses the road that directly connects them. Each bus service has a
particular limit on the maximum number of passengers it can carry. Mr. G. has a map
showing the cities and the roads connecting them, as well as the service limit for each
bus service.
It is not always possible for him to take all the tourists to the destination city in
a single trip. For example, consider the following road map of seven cities, where the
edges represent roads and the number written on each edge indicates the passenger
limit of the associated bus service.

It will take at least ﬁve trips for Mr. G. to take 99 tourists from city 1 to city 7, since
he has to ride the bus with each group. The best route to take is 1 - 2 - 4 - 7.
Help Mr. G. ﬁnd the route to take all his tourists to the destination city in the
minimum number of trips.

Input
The input will contain one or more test cases. The ﬁrst line of each test case will contain
two integers: N (N ≤ 100) and R, representing the number of cities and the number
of road segments, respectively. Each of the next R lines will contain three integers (C1 ,
C2 , and P ) where C1 and C2 are the city numbers and P (P > 1) is the maximum
number of passengers that can be carried by the bus service between the two cities.
City numbers are positive integers ranging from 1 to N . The (R + 1)th line will contain
three integers (S, D, and T ) representing, respectively, the starting city, the destination
city, and the number of tourists to be guided.
The input will end with two zeros for N and R.

Output
For each test case in the input, ﬁrst output the scenario number and then the minimum
number of trips required for this case on a separate line. Print a blank line after the
output for each test case.

Sample Input
7 10
1 2 30
1 3 15
1 4 10
2 4 25
2 5 60
3 4 40
3 6 20
4 7 35
5 7 20
6 7 30
1 7 99
0 0

Sample Output
Scenario #1
Minimum Number of Trips = 5

9.6.4 Slash Maze

PC/UVa IDs: 110904/705, Popularity: B, Success rate: average Level: 2
By ﬁlling a rectangle with slashes (/) and backslashes (\), you can generate nice little
mazes. Here is an example:

As you can see, paths in the maze cannot branch, so the whole maze contains only
(1) cyclic paths and (2) paths entering somewhere and leaving somewhere else. We are
only interested in the cycles. There are exactly two of them in our example.
Your task is to write a program that counts the cycles and ﬁnds the length of the
longest one. The length is deﬁned as the number of small squares the cycle consists of
(the ones bordered by gray lines in the picture). In this example, the long cycle has
length 16 and the short one length 4.

Input
The input contains several maze descriptions. Each description begins with one line
containing two integers w and h (1 ≤ w, h ≤ 75), representing the width and the height
of the maze. The next h lines describe the maze itself and contain w characters each;
all of these characters will be either “/” or “\”.
The input is terminated by a test case beginning with w = h = 0. This case should
not be processed.

Output
For each maze, ﬁrst output the line “Maze #n:”, where n is the number of the maze.
Then, output the line “k Cycles; the longest has length l.”, where k is the number of cycles in the maze and l the length of the longest of the cycles. If the maze is
acyclic, output “There are no cycles.”
Output a blank line after each test case.


Sample Input
6 4
\//\\/
\///\/
//\\/\
\/\///
3 3
///
\//
\\\
0 0

Sample Output
Maze #1:
2 Cycles; the longest has length 16.
Maze #2:
There are no cycles.


9.6.5 Edit Step Ladders

PC/UVa IDs: 110905/10029, Popularity: B, Success rate: low Level: 3
An edit step is a transformation from one word x to another word y such that x
and y are words in the dictionary, and x can be transformed to y by adding, deleting,
or changing one letter. The transformations from dig to dog and from dog to do are
both edit steps. An edit step ladder is a lexicographically ordered sequence of words
w1 , w2 , . . . , wn such that the transformation from wi to wi+1 is an edit step for all i
from 1 to n − 1.
For a given dictionary, you are to compute the length of the longest edit step ladder.

Input
The input to your program consists of the dictionary: a set of lowercase words in lexicographic order at one word per line. No word exceeds 16 letters and there are at most
25,000 words in the dictionary.

Output
The output consists of a single integer, the number of words in the longest edit step
ladder.

Sample Input
cat
dig
dog
fig
fin
fine
fog
log
wine

Sample Output
5

9.6.6 Tower of Cubes

PC/UVa IDs: 110906/10051, Popularity: C, Success rate: high Level: 3
You are given N colorful cubes, each having a distinct weight. Cubes are not
monochromatic – indeed, every face of a cube is colored with a diﬀerent color. Your
job is to build the tallest possible tower of cubes subject to the restrictions that (1)
we never put a heavier cube on a lighter one, and (2) the bottom face of every cube
(except the bottom one) must have the same color as the top face of the cube below it.

Input
The input may contain several test cases. The ﬁrst line of each test case contains an
integer N (1 ≤ N ≤ 500) indicating the number of cubes you are given. The ith of the
next N lines contains the description of the ith cube. A cube is described by giving the
colors of its faces in the following order: front, back, left, right, top, and bottom face.
For your convenience colors are identiﬁed by integers in the range 1 to 100. You may
assume that cubes are given in increasing order of their weights; that is, cube 1 is the
lightest and cube N is the heaviest.
The input terminates with a value 0 for N .

Output
For each case, start by printing the test case number on its own line as shown in the
sample output. On the next line, print the number of cubes in the tallest possible tower.
The next line describes the cubes in your tower from top to bottom with one description
per line. Each description gives the serial number of this cube in the input, followed
by a single whitespace character and then the identification string (front, back, left,
right, top, or bottom) of the top face of the cube in the tower. There may be multiple
solutions, but any one of them is acceptable.
Print a blank line between two successive test cases.

Sample Input
3
1 2 2 2 1 2
3 3 3 3 3 3
3 2 1 1 1 1
10
1 5 10 3 6 5
2 6 7 3 6 9
5 7 3 2 1 9
1 3 3 5 8 10
6 6 2 2 4 4
1 2 3 4 5 6
10 9 8 7 6 5
6 1 2 3 4 7
1 2 3 3 2 1
3 2 1 1 2 3
0

Sample Output
Case #1
2
2 front
3 front
Case #2
8
1 bottom
2 back
3 right
4 left
6 top
8 front
9 front
10 top

9.6.7 From Dusk Till Dawn

PC/UVa IDs: 110907/10187, Popularity: B, Success rate: average Level: 3
Vladimir has white skin, very long teeth and is 600 years old, but this is no problem
because Vladimir is a vampire. Vladimir has never had any problems with being a
vampire. In fact, he is a successful doctor who always takes the night shift and so has
made many friends among his colleagues. He has an impressive trick which he loves to
show at dinner parties: he can tell blood group by taste. Vladimir loves to travel, but
being a vampire he must overcome three problems.
1. He can only travel by train, because he must take his coﬃn with him. Fortunately
he can always travel ﬁrst class because he has made a lot of money through long
term investments.
2. He can only travel from dusk till dawn, namely, from 6 P.M. to 6 A.M. During
the day he must stay inside a train station.
3. He has to take something to eat with him. He needs one litre of blood per day,
which he drinks at noon (12:00) inside his coﬃn.
Help Vladimir to ﬁnd the shortest route between two given cities, so that he can
travel with the minimum amount of blood. If he takes too much with him, people ask
him funny questions like, “What are you doing with all that blood?”

Input
The ﬁrst line of the input will contain a single number telling you the number of test
cases.
Each test case speciﬁcation begins with a single number telling you how many route
speciﬁcations follow. Each route speciﬁcation consists of the names of two cities, the
departure time from city one, and the total traveling time, with all times in hours.
Remember, Vladimir cannot use routes departing earlier than 18:00 or arriving later
than 6:00.
There will be at most 100 cities and less than 1,000 connections. No route takes less
than 1 hour or more than 24 hours, but Vladimir can use only routes within the 12
hours travel time from dusk till dawn.
All city names are at most 32 characters long. The last line contains two city names.
The ﬁrst is Vladimir’s start city; the second is Vladimir’s destination.

Output
For each test case you should output the number of the test case followed by “Vladimir
needs # litre(s) of blood.” or “There is no route Vladimir can take.”


Sample Input
2
3
Ulm Muenchen 17 2
Ulm Muenchen 19 12
Ulm Muenchen 5 2
Ulm Muenchen
10
Lugoj Sibiu 12 6
Lugoj Sibiu 18 6
Lugoj Sibiu 24 5
Lugoj Medias 22 8
Lugoj Medias 18 8
Lugoj Reghin 17 4
Sibiu Reghin 19 9
Sibiu Medias 20 3
Reghin Medias 20 4
Reghin Bacau 24 6
Lugoj Bacau

Sample Output
Test Case 1.
There is no route Vladimir can take.
Test Case 2.
Vladimir needs 2 litre(s) of blood.

9.6.8 Hanoi Tower Troubles Again!

PC/UVa IDs: 110908/10276, Popularity: B, Success rate: high Level: 3
There are many interesting variations on the Tower of Hanoi problem. This version
consists of N pegs and one ball containing each number from 1, 2, 3, . . . , ∞. Whenever
the sum of the numbers on two balls is not a perfect square (i.e., c^2 for some integer c),
they will repel each other with such force that they can never touch each other.

[Figure: balls placed on 4 pegs]

The player must place balls on the pegs one by one, in order of increasing ball
number (i.e., first ball 1, then ball 2, then ball 3, ...). The game ends when there is no
non-repelling move.
The goal is to place as many balls on the pegs as possible. The ﬁgure above gives a
best possible result for 4 pegs.

Input
The ﬁrst line of the input contains a single integer T indicating the number of test cases
(1 ≤ T ≤ 50). Each test case contains a single integer N (1 ≤ N ≤ 50) indicating the
number of pegs available.

Output
For each test case, print a line containing an integer indicating the maximum number
of balls that can be placed. Print “-1” if an inﬁnite number of balls can be placed.

Sample Input
2
4
25

Sample Output
11
337


9.7 Hints
9.1 Can we color the graph during a single traversal?
9.2 What is the graph underlying this problem?
9.3 Can we reduce this problem to connectivity testing?
9.4 Does it pay to represent the graph explicitly, or just work on the matrix of slashes?
9.5 What is the graph underlying this problem?
9.6 Can we deﬁne a directed graph on the cubes such that the desired tower is a path
in the graph?
9.7 Can this be represented as an unweighted graph problem for BFS?
9.8 Can the constraints be usefully modeled using a DAG?

10 Graph Algorithms

The graph representations and traversal algorithms of Chapter 9 provide the basic
building blocks for any computation on graph structures. In this chapter, we consider
more advanced graph theory and algorithms.
Graph theory is the study of the properties of graph structures. It provides us with
a language with which to talk about graphs. The key to solving many problems is
identifying the fundamental graph-theoretic notion underlying the situation and then
using classical algorithms to solve the resulting problem.
We begin with an overview of basic graph theory and follow with algorithms for finding
important structures such as minimum spanning trees, shortest paths, and maximum
ﬂows.

10.1 Graph Theory
In this section, we provide a quick review of basic graph theory. Several excellent books
on graph theory [PS03, Wes00] are available for more detailed information. We outline relevant algorithms which should be fairly simple to program given the machinery
developed in the previous chapter.

10.1.1 Degree Properties

Graphs are made up of vertices and edges. The simplest property of a vertex is its
degree, the number of edges incident upon it.


There are several important properties of vertex degrees. The sum of the vertex degrees in any undirected graph is twice the number of edges, since every edge contributes
one to the degree of both adjacent vertices. A corollary to this is that every graph contains an even number of odd degree vertices. For directed graphs, the relevant degree
condition is that the sum of the in-degrees of all vertices equals the sum of all out-degrees. The parity of vertex degrees has an important role in recognizing Eulerian
cycles as discussed in Section 10.1.3.
Trees are undirected graphs which contain no cycles. Vertex degrees are important
in the analysis of trees. A leaf of a tree is a vertex of degree 1. Every n-vertex tree
contains n − 1 edges, so all non-trivial trees contain at least two leaf vertices. Deleting
a leaf leaves a smaller tree, trimming the tree instead of disconnecting it.
Rooted trees are directed graphs where every node except the root has in-degree 1.
The leaves are the nodes with out-degree 0. Binary trees are rooted trees where every
vertex has an out-degree of 0 or 2. At least half the vertices of all such binary trees
must be leaves.
A spanning tree of a graph G = (V, E) is a subset of edges E′ ⊂ E such that E′ is a
tree on V . Spanning trees exist for any connected graph; the parent relation encoding
vertex discovery for either breadth-ﬁrst or depth-ﬁrst search suﬃces to construct one.
The minimum spanning tree is an important property of weighted graphs, and is discussed
in Section 10.2.

10.1.2 Connectivity

A graph is connected if there is an undirected path between every pair of vertices. The
existence of a spanning tree is suﬃcient to prove connectivity. A depth-ﬁrst search-based
connected components algorithm was presented in Section 9.4.2.
However, there are other notions of connectivity to be aware of. The vertex (edge)
connectivity is the smallest number of vertices (edges) which must be deleted to disconnect the graph. The most interesting special case is when there is a single weak link in the
graph. A single vertex whose deletion disconnects the graph is called an articulation
vertex; any graph without such a vertex is said to be biconnected. A single edge whose
deletion disconnects the graph is called a bridge; any graph without such an edge is said
to be edge-biconnected.
Testing for articulation vertices or bridges is easy via brute force. For each vertex/edge, delete it from the graph and test whether the resulting graph remains
connected. Be sure to add that vertex/edge back before doing the next deletion!
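The brute-force test for articulation vertices can be sketched as follows. This is a minimal illustration, assuming a small connected undirected graph stored as a 0/1 adjacency matrix with vertices numbered from 0; the names NV, reach_count, and is_articulation are hypothetical and not part of the library developed in Chapter 9. Instead of physically deleting and restoring the vertex, the traversal simply skips it, which sidesteps the risk of forgetting to add it back.

```c
#include <stdbool.h>

#define NV 6   /* number of vertices in this illustration (assumption) */

/* Depth-first search that pretends vertex `removed` is not in the graph.
   Returns the number of vertices reached from v. */
static int reach_count(int adj[NV][NV], int v, int removed, bool seen[NV])
{
    int count = 1;
    seen[v] = true;
    for (int w = 0; w < NV; w++)
        if (adj[v][w] && w != removed && !seen[w])
            count += reach_count(adj, w, removed, seen);
    return count;
}

/* Vertex x is an articulation vertex of a connected graph if the
   remaining NV-1 vertices can no longer all be reached from one another. */
bool is_articulation(int adj[NV][NV], int x)
{
    bool seen[NV] = { false };
    int start = (x == 0) ? 1 : 0;      /* any vertex other than x */
    return reach_count(adj, start, x, seen) < NV - 1;
}
```

Each call costs one traversal, so testing every vertex this way takes one traversal per vertex. The same idea works for bridges by skipping a single edge instead of a vertex.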
In directed graphs we are often concerned with strongly connected components, that
is, partitioning the graph into chunks such that there are directed paths between all
pairs of vertices within a given chunk. Road networks should be strongly connected, or
else there will be places you can drive to but not drive home from without violating
one-way signs.
The following idea enables us to identify the strongly connected components in a
graph. It is easy to ﬁnd a directed cycle using depth-ﬁrst search, since any back edge
plus the down path in the DFS tree gives such a cycle. All vertices in this cycle must be
in the same strongly connected component. Thus we can shrink (contract) the vertices
on this cycle down to a single vertex representing the component, and then repeat.
This process terminates when no directed cycle remains, and each vertex represents
one strongly connected component.

10.1.3 Cycles in Graphs

All non-tree connected graphs contain cycles. Particularly interesting are cycles which
visit all the edges or vertices of the graph.
An Eulerian cycle is a tour which visits every edge of the graph exactly once. The
children’s puzzle of drawing a geometric ﬁgure without ever lifting your pencil from the
paper is an instance of ﬁnding an Eulerian cycle (or path), where the vertices are the
junctions in the drawing and the edges represent the lines to trace. A mailman’s route
is ideally an Eulerian cycle, so he can visit every street (edge) in the neighborhood
once before returning home. Strictly speaking, Eulerian cycles are circuits, not cycles,
because they may visit vertices more than once.
An undirected graph contains an Eulerian cycle if it is connected and every vertex is of
even degree. Why? The circuit must enter and exit every vertex it encounters, implying
that all degrees must be even. This idea also suggests a way to ﬁnd an Eulerian cycle,
by building it one cycle at a time. We can ﬁnd a simple cycle in the graph using the
DFS-based algorithm discussed in Section 9.4.1. Deleting the edges on this cycle leaves
each vertex with even degree. Once we have partitioned the edges into edge-disjoint
cycles, we can merge these cycles arbitrarily at common vertices to build an Eulerian
cycle.
In the case of directed graphs, the relevant condition is that all vertices have the
same in-degree as out-degree. Peeling oﬀ any cycle preserves this property, and thus
Eulerian cycles of directed graphs can be built in the same manner. Eulerian paths are
tours that visit every edge exactly once but might not end up where they started from.
These allow exactly two vertices to have parity violations, one of which must be the
starting node and the other the ending node.
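The degree conditions for undirected graphs translate into a very short test. The sketch below assumes the graph is already known to be connected and that the caller supplies an array of vertex degrees; the function name euler_kind and its return codes are illustrative choices, not taken from the text.

```c
/* Classify a connected undirected graph by its vertex degrees.
   degree[0..n-1] holds the degree of each vertex.
   Returns 2 if an Eulerian cycle exists, 1 if only an Eulerian path
   exists, and 0 if neither exists. Connectivity is NOT checked here. */
int euler_kind(int n, const int degree[])
{
    int odd = 0;                       /* number of odd-degree vertices */
    for (int i = 0; i < n; i++)
        if (degree[i] % 2 != 0)
            odd++;
    if (odd == 0) return 2;            /* every vertex can be entered and exited */
    if (odd == 2) return 1;            /* the two odd vertices are the endpoints */
    return 0;
}
```

A square (all degrees 2) has an Eulerian cycle, a three-vertex path (degrees 1, 2, 1) has only an Eulerian path, and the complete graph on four vertices (all degrees 3) has neither.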
A Hamiltonian cycle is a tour which visits every vertex of the graph exactly once.
The traveling salesman problem asks for the shortest such tour on a weighted graph. An
Eulerian cycle problem in G = (V, E) can be reduced to a Hamiltonian cycle problem
by constructing a graph G′ = (V′, E′) such that each vertex in V′ represents an edge
of E and there are edges in E′ connecting all neighboring pairs of edges from G.
Unfortunately, no efficient algorithm is known for solving Hamiltonian cycle problems.
Thus you have two options on encountering one. If the graph is suﬃciently small, it
can be solved via backtracking. Each Hamiltonian cycle is described by a permutation
of the vertices. We backtrack whenever there does not exist an edge from the latest
vertex to an unvisited one. If the graph is too large for such an attack we must try to
ﬁnd an alternate formulation of the problem, perhaps as an Eulerian cycle problem on
a diﬀerent graph.
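The backtracking search can be sketched in a few lines. This is a minimal illustration under stated assumptions: the graph is small, undirected, and stored as a 0/1 adjacency matrix with vertices numbered from 0, and the names HN, extend, and hamiltonian_cycle are hypothetical. The search builds the permutation one vertex at a time and retreats whenever the latest vertex has no usable edge to an unvisited one.

```c
#include <stdbool.h>

#define HN 5   /* number of vertices in this illustration (assumption) */

/* Try to extend the partial tour path[0..k-1] to a Hamiltonian cycle. */
static bool extend(int adj[HN][HN], int path[HN], bool used[HN], int k)
{
    if (k == HN)                                /* all vertices placed: */
        return adj[path[HN-1]][path[0]] != 0;   /* can we close the cycle? */
    for (int v = 0; v < HN; v++) {
        if (!used[v] && adj[path[k-1]][v]) {
            used[v] = true;
            path[k] = v;
            if (extend(adj, path, used, k + 1))
                return true;
            used[v] = false;                    /* backtrack */
        }
    }
    return false;       /* no unvisited neighbor leads to a full cycle */
}

/* Returns true and fills path[] if a Hamiltonian cycle exists. */
bool hamiltonian_cycle(int adj[HN][HN], int path[HN])
{
    bool used[HN] = { false };
    path[0] = 0;        /* a cycle may start anywhere, so fix vertex 0 */
    used[0] = true;
    return extend(adj, path, used, 1);
}
```

Fixing the first vertex avoids exploring the same cycle from every possible starting point. The running time is still exponential in the worst case, so this is only suitable for small graphs.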

10.1.4 Planar Graphs

Planar graphs are those which can be drawn in the plane such that no two edges
cross each other. Many of the graphs we commonly encounter are planar. Every tree is
planar: can you describe how to construct a non-crossing drawing for a given tree? Every
road network in the absence of concrete/steel bridges must be planar. The adjacency
structure of convex polyhedra also yields planar graphs.
Planar graphs have several important properties. First, there is a tight relation between the number of vertices n, edges m, and faces f of any connected planar graph. Euler’s
formula states that n − m + f = 2. Trees contain n − 1 edges, so any planar drawing
of a tree has exactly one face, namely, the outside face. Any embedding of a cube (8
vertices and 12 edges) must contain six faces, as those who have played dice can attest
to.
Eﬃcient algorithms exist for testing the planarity of a graph and ﬁnding non-crossing
embeddings, but all are fairly complicated to implement. Euler’s formula gives an easy
way to prove that certain graphs are not planar, however. Every planar graph contains
at most 3n−6 edges for n > 2. This bound means that every planar graph must contain
a vertex of degree at most 5, and deleting this vertex leaves a smaller planar graph with
this same property. Testing whether a given drawing is a planar embedding is the same
as testing whether any of a given set of line segments intersect, which will be discussed
when we get to geometric algorithms.
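Both counting facts are easy to encode. The sketch below is illustrative only; the function names may_be_planar and planar_faces are assumptions, and the graph is taken to be simple and connected. Note that the edge bound is a necessary condition, not a sufficient one: the complete bipartite graph on 3 + 3 vertices has 9 edges, which satisfies the bound for n = 6, yet it is not planar. A true result therefore settles nothing, while a false result proves non-planarity.

```c
#include <stdbool.h>

/* Necessary (not sufficient) condition for planarity of a simple graph
   with n vertices and m edges: m <= 3n - 6 whenever n > 2. */
bool may_be_planar(int n, int m)
{
    if (n <= 2) return true;
    return m <= 3 * n - 6;
}

/* Number of faces in a planar drawing of a connected planar graph,
   obtained by solving Euler's formula n - m + f = 2 for f. */
int planar_faces(int n, int m)
{
    return 2 - n + m;
}
```

For the cube, planar_faces(8, 12) gives the six faces mentioned above, and for any tree the result is one. The complete graph on five vertices has 10 edges against a bound of 9, so it cannot be planar.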

10.2 Minimum Spanning Trees
A spanning tree of a graph G = (V, E) is a subset of edges from E forming a tree
connecting all vertices of V . For edge-weighted graphs, we are particularly interested in
the minimum spanning tree, the spanning tree whose sum of edge weights is the smallest
possible.
Minimum spanning trees are the answer whenever we need to connect a set of points
(representing cities, junctions, or other locations) by the smallest amount of roadway,
wire, or pipe. Any tree is the smallest possible connected graph in terms of number of
edges, while the minimum spanning tree is the smallest connected graph in terms of
edge weight.
The two main algorithms for computing minimum spanning trees are Kruskal’s and
Prim’s, and both are covered in most any algorithms course. We will present Prim’s
algorithm here because we think it is simpler to program, and because it gives us
Dijkstra’s shortest path algorithm with very minimal changes.
First, we must generalize the graph data structure from Chapter 9 to support edge-weighted graphs. Each edge-entry previously contained only the other endpoint of the
given edge. We must replace this by a record allowing us to annotate the edge with
weights:
typedef struct {
    int v;                              /* neighboring vertex */
    int weight;                         /* edge weight */
} edge;

typedef struct {
    edge edges[MAXV+1][MAXDEGREE];      /* adjacency info */
    int degree[MAXV+1];                 /* outdegree of each vertex */
    int nvertices;                      /* number of vertices in graph */
    int nedges;                         /* number of edges in graph */
} graph;

and update the various initialization and traversal algorithms appropriately. This is not
a complicated task.
Prim’s algorithm grows the minimum spanning tree in stages starting from a given
vertex. At each iteration, we add one new vertex into the spanning tree. A greedy
algorithm suﬃces for correctness: we always add the lowest-weight edge linking a vertex
in the tree to a vertex on the outside.
The simplest implementation of this idea would assign each vertex a Boolean variable
denoting whether it is already in the tree (the array intree in the code below), and
then search all edges at each iteration to find the minimum weight edge with exactly
one intree vertex.
Our implementation is somewhat smarter. It keeps track of the cheapest edge from
any tree vertex to every non-tree vertex in the graph. The cheapest edge over all remaining non-tree vertices gets added in each iteration. We must update the costs of
getting to the non-tree vertices after each insertion. However, since the new vertex is
the only change in the tree, all possible edge-weight updates come from its outgoing
edges:
void prim(graph *g, int start)
{
    int i;                          /* counter */
    bool intree[MAXV+1];            /* is vertex in the tree yet? */
    int distance[MAXV+1];           /* vertex distance from start */
    int v;                          /* current vertex to process */
    int w;                          /* candidate next vertex */
    int weight;                     /* edge weight */
    int dist;                       /* shortest current distance */

    /* parent[] is assumed to be the global array used by the Chapter 9 traversals */
    for (i=1; i<=g->nvertices; i++) {
        intree[i] = FALSE;
        distance[i] = MAXINT;
        parent[i] = -1;
    }

    distance[start] = 0;
    v = start;

    while (intree[v] == FALSE) {
        intree[v] = TRUE;
        for (i=0; i<g->degree[v]; i++) {
            w = g->edges[v][i].v;
            weight = g->edges[v][i].weight;
            if ((distance[w] > weight) && (intree[w]==FALSE)) {
                distance[w] = weight;
                parent[w] = v;
            }
        }

        /* pick the cheapest vertex not yet in the tree */
        v = 1;
        dist = MAXINT;
        for (i=1; i<=g->nvertices; i++)
            if ((intree[i]==FALSE) && (dist > distance[i])) {
                dist = distance[i];
                v = i;
            }
    }
}
The minimum spanning tree itself or its cost can be reconstructed in two diﬀerent
ways. The simplest method would be to augment this procedure with statements that
print the edges as they are found or total the weight of all selected edges in a variable
for later return. Alternatively, since the tree topology is encoded by the parent array, that
array plus the original graph tells you everything about the minimum spanning tree.
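For instance, the total cost can be recovered from the parent array alone, given some way to look up edge weights. The sketch below assumes vertices numbered 1 to TN and a hypothetical weight[][] lookup table indexed by the two endpoints; neither TN nor tree_cost comes from the text, and a real program would fetch the weights from the graph structure instead.

```c
#define TN 4    /* vertices are numbered 1..TN in this illustration */

/* Sum the weights of the tree edges (parent[v], v). The start vertex
   has parent -1 and contributes no edge. weight[][] is a hypothetical
   lookup table for edge weights in the original graph. */
int tree_cost(const int parent[TN+1], int weight[TN+1][TN+1])
{
    int total = 0;
    for (int v = 1; v <= TN; v++)
        if (parent[v] != -1)
            total += weight[parent[v]][v];
    return total;
}
```

Printing the pairs (parent[v], v) inside the same loop lists the tree edges themselves.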
This minimum spanning tree algorithm has several interesting properties which help
solve many closely related problems:
• Maximum Spanning Trees — Suppose we hire an evil telephone company to
connect a bunch of houses together, and that this company will be paid a price
proportional to the amount of wire they install. Naturally, they will want to
build as expensive a spanning tree as possible. The maximum spanning tree of
any graph can be found by simply negating the weights of all edges and running
Prim’s algorithm. The most negative tree in the negated graph is the maximum
spanning tree in the original.
Most graph algorithms do not adapt so nicely to negative numbers. Indeed,
shortest path algorithms have trouble with negative numbers, and certainly do
not generate the longest possible path using this technique.
• Minimum Product Spanning Trees — Suppose we want the spanning tree with
the smallest product of edge weights, assuming all edge weights are positive. Since
lg(a · b) = lg(a) + lg(b), the minimum spanning tree on a graph where each edge
weight is replaced with its logarithm gives the minimum product spanning tree.


• Minimum Bottleneck Spanning Tree — Sometimes we seek a spanning tree which
minimizes the maximum edge weight over all such trees. In fact, the minimum
spanning tree has this property. The proof follows directly from the correctness
of Kruskal’s algorithm.
Such bottleneck spanning trees have interesting applications when the edge
weights are interpreted as costs, capacities, or strengths. A less eﬃcient but simpler way to solve such problems might be to delete all “heavy” edges from the
graph and ask whether the result is still connected. These kinds of tests can be
done with simple BFS/DFS.
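The simpler bottleneck test mentioned in the last item can be sketched as follows. This is a minimal illustration, assuming a small undirected graph stored as a weight matrix with non-negative weights, vertices numbered from 0, and a hypothetical NO_EDGE marker for missing edges; the names BN, visit, and connected_within are likewise assumptions. The traversal ignores every edge heavier than the given limit, which has the same effect as deleting those edges.

```c
#include <stdbool.h>

#define BN 4          /* number of vertices in this illustration (assumption) */
#define NO_EDGE (-1)  /* marker for a missing edge; weights assumed >= 0 */

/* Depth-first search that uses only edges of weight at most limit. */
static void visit(int w[BN][BN], int limit, int v, bool seen[BN])
{
    seen[v] = true;
    for (int u = 0; u < BN; u++)
        if (!seen[u] && w[v][u] != NO_EDGE && w[v][u] <= limit)
            visit(w, limit, u, seen);
}

/* Is the graph still connected after deleting every edge heavier than limit? */
bool connected_within(int w[BN][BN], int limit)
{
    bool seen[BN] = { false };
    visit(w, limit, 0, seen);
    for (int v = 0; v < BN; v++)
        if (!seen[v]) return false;
    return true;
}
```

The smallest limit for which connected_within returns true is the bottleneck value, and it can be found by trying the distinct edge weights in increasing order or by binary search over them.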
The minimum spanning tree of a graph is unique if all m edge weights in the graph
are distinct. If not, the order in which Prim’s algorithm breaks ties determines which
minimum spanning tree the algorithm returns.

10.3 Shortest Paths
The problem of ﬁnding shortest paths in unweighted graphs was discussed in Section
9.3.1; breadth-ﬁrst search does the job, and that is all she wrote. BFS does not suﬃce
for ﬁnding shortest paths in weighted graphs, because the shortest weighted path from
a to b does not necessarily contain the fewest number of edges. We all have favorite
back-door driving/walking routes which use more turns than the simplest path, but
which magically get us there faster by avoiding traﬃc, lights, etc.
In this section, we will implement two distinct algorithms for ﬁnding shortest paths
in weighted graphs.

10.3.1 Dijkstra’s Algorithm

Dijkstra’s algorithm is the method of choice for ﬁnding the shortest path between two
vertices in an edge- and/or vertex-weighted graph. Given a particular start vertex s, it
ﬁnds the shortest path from s to every other vertex in the graph, including your desired
destination t.
The basic idea is very similar to Prim’s algorithm. In each iteration, we are going to
add exactly one vertex to the tree of vertices for which we know the shortest path from
s. Just as in Prim’s, we will keep track of the best path seen to date for all vertices
outside the tree, and insert them in order of increasing cost.
The diﬀerence between Dijkstra’s and Prim’s algorithms is how they rate the desirability of each outside vertex. In the minimum spanning tree problem, all we care
about is the weight of the next potential tree edge. In shortest path, we want to include
the outside vertex which is closest (in shortest-path distance) to the start. This is a
function of both the new edge weight and the distance from the start of the tree-vertex
it is adjacent to.
In fact, this change is very minor. Below we give an implementation of Dijkstra’s
algorithm based on changing exactly three lines from our Prim’s implementation – one
of which is simply the name of the function!


void dijkstra(graph *g, int start)          /* WAS prim(g,start) */
{
    int i;                          /* counter */
    bool intree[MAXV+1];            /* is vertex in the tree yet? */
    int distance[MAXV+1];           /* vertex distance from start */
    int v;                          /* current vertex to process */
    int w;                          /* candidate next vertex */
    int weight;                     /* edge weight */
    int dist;                       /* shortest current distance */

    /* parent[] is assumed to be the global array used by the Chapter 9 traversals */
    for (i=1; i<=g->nvertices; i++) {
        intree[i] = FALSE;
        distance[i] = MAXINT;
        parent[i] = -1;
    }

    distance[start] = 0;
    v = start;

    while (intree[v] == FALSE) {
        intree[v] = TRUE;
        for (i=0; i<g->degree[v]; i++) {
            w = g->edges[v][i].v;
            weight = g->edges[v][i].weight;
            if (distance[w] > (distance[v]+weight)) {       /* CHANGED */
                distance[w] = distance[v]+weight;           /* CHANGED */
                parent[w] = v;
            }
        }

        /* pick the closest vertex not yet in the tree */
        v = 1;
        dist = MAXINT;
        for (i=1; i<=g->nvertices; i++)
            if ((intree[i]==FALSE) && (dist > distance[i])) {
                dist = distance[i];
                v = i;
            }
    }
    /* distance[] is local here; make it global or copy it out to read distance[t] afterward */
}
How do we use dijkstra to ﬁnd the length of the shortest path from start to a
given vertex t? This is exactly the value of distance[t]. How can we reconstruct the
actual path? By following the backward parent pointers from t until we hit start (or
-1 if no such path exists), exactly as was done in the find_path() routine of Section
9.3.3.
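Walking the parent pointers can be sketched as below. This is an illustrative helper, not the find_path() routine itself: the name collect_path, the explicit parent parameter, and the output array are assumptions made so the example stands alone. The caller must supply an out[] array with room for every vertex in the graph.

```c
/* Fill out[] with the vertices on the path start -> ... -> t by walking
   parent pointers backward from t, then reversing. Returns the number of
   vertices on the path, or 0 if t cannot be reached from start. */
int collect_path(int start, int t, const int parent[], int out[])
{
    int len = 0;
    for (int v = t; v != -1; v = parent[v]) {
        out[len++] = v;
        if (v == start) break;             /* reached the source */
    }
    if (len == 0 || out[len-1] != start)
        return 0;                          /* ran into -1 before start */
    for (int i = 0, j = len - 1; i < j; i++, j--) {
        int tmp = out[i];                  /* reverse into start-to-t order */
        out[i] = out[j];
        out[j] = tmp;
    }
    return len;
}
```

Collecting into an array and reversing avoids recursion; a recursive version that prints on the way back out produces the same order.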


Unlike Prim’s, Dijkstra’s algorithm only works on graphs without negative-cost edges.
The reason is that midway through the execution we may encounter an edge with weight
so negative that it changes the cheapest way to get from s to some other vertex already
in the tree. Indeed, the most cost-eﬀective way to get from your house to your next-door
neighbor would be through the lobby of any bank oﬀering you enough money to make
the detour worthwhile.
Most applications do not feature negative-weight edges, making this discussion academic. Floyd’s algorithm, discussed below, works correctly unless there are negative
cost cycles, which grossly distort the shortest-path structure. Unless that bank limits
its reward to one per customer, you could so beneﬁt by making an inﬁnite number of
trips through the lobby that it would never pay to actually reach your destination!

10.3.2 All-Pairs Shortest Path

Many applications need to know the length of the shortest path between all pairs of
vertices in a given graph. For example, suppose you want to ﬁnd the “center” vertex,
the one which minimizes the longest or average distance to all the other nodes. This
might be the best place to start a new business. Or perhaps you need to know the
diameter of the graph, the longest shortest-path distance between all pairs of vertices.
This might correspond to the longest possible time it takes a letter or network packet
to be delivered between two arbitrary destinations.
We could solve this problem by calling Dijkstra’s algorithm from each of the n possible
starting vertices. But Floyd’s all-pairs shortest-path algorithm is an amazingly slick way
to construct this distance matrix from the original weight matrix of the graph.
Floyd’s algorithm is best employed on an adjacency matrix data structure, which
is no extravagance since we have to store all n^2 pairwise distances anyway. Our
adjacency matrix type allocates space for the largest possible matrix, and keeps track
of how many vertices are in the graph:
typedef struct {
    int weight[MAXV+1][MAXV+1];     /* adjacency/weight info */
    int nvertices;                  /* number of vertices in graph */
} adjacency_matrix;

A critical issue in any adjacency matrix implementation is how we denote the edges
which are not present in the graph. For unweighted graphs, a common convention is
that graph edges are denoted by 1 and non-edges by 0. This gives exactly the wrong
interpretation if the numbers denote edge weights, for the non-edges get interpreted as
a free ride between vertices. Instead, we should initialize each non-edge to MAXINT. This
way we can both test whether it is present and automatically ignore it in shortest-path
computations, since only real edges will be used unless MAXINT is less than the diameter
of your graph.
void initialize_adjacency_matrix(adjacency_matrix *g)
{
    int i,j;                        /* counters */

    g -> nvertices = 0;

    for (i=1; i<=MAXV; i++)
        for (j=1; j<=MAXV; j++)
            g->weight[i][j] = MAXINT;
}

void read_adjacency_matrix(adjacency_matrix *g, bool directed)
{
    int i;                          /* counter */
    int m;                          /* number of edges */
    int x,y,w;                      /* placeholder for edge/weight */

    initialize_adjacency_matrix(g);

    scanf("%d %d\n",&(g->nvertices),&m);

    for (i=1; i<=m; i++) {
        scanf("%d %d %d\n",&x,&y,&w);
        g->weight[x][y] = w;
        if (directed==FALSE) g->weight[y][x] = w;
    }
}
All this is fairly trivial. How do we ﬁnd shortest paths in such a matrix? Floyd’s
algorithm starts by numbering the vertices of the graph from 1 to n, using these numbers
not to label the vertices but to order them.
We will perform n iterations, where the kth iteration allows only the ﬁrst k vertices
as possible intermediate steps on the path between each pair of vertices x and y. When
k = 0, we are allowed no intermediate vertices, so the only allowed paths consist of the
original edges in the graph. Thus the initial all-pairs shortest-path matrix consists of
the initial adjacency matrix. At each iteration, we allow a richer set of possible shortest
paths. Allowing the kth vertex as a new possible intermediary helps only if there is a
short path that goes through k, so
$$W[i,j]^{k} = \min\left(W[i,j]^{k-1},\; W[i,k]^{k-1} + W[k,j]^{k-1}\right)$$
The correctness of this is somewhat subtle, and we encourage you to convince yourself
of it. But there is nothing subtle about how short and sweet the implementation is:
void floyd(adjacency_matrix *g)
{
    int i,j;                        /* dimension counters */
    int k;                          /* intermediate vertex counter */
    int through_k;                  /* distance through vertex k */

    /* the sum below assumes MAXINT is small enough that MAXINT+MAXINT fits in an int */
    for (k=1; k<=g->nvertices; k++)
        for (i=1; i<=g->nvertices; i++)
            for (j=1; j<=g->nvertices; j++) {
                through_k = g->weight[i][k]+g->weight[k][j];
                if (through_k < g->weight[i][j])
                    g->weight[i][j] = through_k;
            }
}
The output of Floyd’s algorithm, as it is written, does not enable one to reconstruct
the actual shortest path between any given pair of vertices. Use Dijkstra’s algorithm if
you care about the actual path. Note, however, that most all-pairs applications need
only the resulting distance matrix. These jobs are what Floyd’s algorithm was designed
for.
Floyd’s algorithm has another important application, that of computing the transitive
closure of a directed graph. In analyzing a directed graph, we are often interested in
which vertices are reachable from a given node.
For example, consider the blackmail graph deﬁned on a set of n people, where there
is a directed edge (i, j) if i has sensitive-enough private information on j so that i can
get him to do whatever he wants. You wish to hire one of these n people to be your
personal representative. Who has the most power in terms of blackmail potential?
A simplistic answer would be the vertex of highest degree, but an even better representative would be the person who has blackmail chains to the most other parties. Steve
might only be able to blackmail Miguel directly, but if Miguel can blackmail everyone
else then Steve is the man you want to hire.
The vertices reachable from any single node can be computed using breadth-first
or depth-first search. But the whole batch can be computed as an all-pairs shortest-path
problem. If the shortest path from i to j remains MAXINT after running Floyd’s
algorithm, you can be sure there is no directed path from i to j. Any vertex pair of
weight less than MAXINT must be reachable, both in the graph-theoretic and blackmail
senses of the word.
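When only reachability matters, the same triple loop can run on a 0/1 matrix instead of on distances. The sketch below is one way to do this for the blackmail example. The names transitive_closure, most_powerful, reach, and MAXV are assumptions made for illustration, and most_powerful expects the closure to have been computed already.

```c
#include <assert.h>

#define MAXV 100   /* assumed maximum number of people (vertices) */

/* On entry reach[i][j] != 0 means there is a directed edge (i,j);
   on exit it means there is a directed path from i to j. */
void transitive_closure(int reach[MAXV+1][MAXV+1], int n)
{
    int i, j, k;
    assert(n >= 0 && n <= MAXV);
    for (k = 1; k <= n; k++)
        for (i = 1; i <= n; i++)
            for (j = 1; j <= n; j++)
                if (reach[i][k] && reach[k][j])
                    reach[i][j] = 1;
}

/* Return the person with blackmail chains to the most other people;
   the lowest-numbered person wins ties. Returns 0 if n is 0. */
int most_powerful(int reach[MAXV+1][MAXV+1], int n)
{
    int i, j, best = 0, best_count = -1;
    for (i = 1; i <= n; i++) {
        int count = 0;
        for (j = 1; j <= n; j++)
            if (j != i && reach[i][j]) count++;
        if (count > best_count) { best_count = count; best = i; }
    }
    return best;
}
```

If person 1 can blackmail only person 2, and person 2 can blackmail persons 3 and 4, the closure gives person 1 chains to all three others, so person 1 is selected.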

10.4 Network Flows and Bipartite Matching
Any edge-weighted graph can be thought of as a network of pipes, where the weight of
edge (i, j) measures the capacity of the pipe. Capacities can be thought of as a function
of the cross-sectional area of the pipe – a wide pipe might be able to carry 10 units of
ﬂow in a given time where a narrower pipe might only be able to carry 5 units. For a
given weighted graph G and two vertices s and t, the network ﬂow problem asks for the
maximum amount of ﬂow which can be sent from s to t while respecting the maximum
capacities of each pipe.
While the network ﬂow problem is of independent interest, its primary importance
is that of being able to solve other important graph problems. A classic example is
bipartite matching. A matching in a graph G = (V, E) is a subset of edges E′ ⊂ E such
that no two edges in E′ share a vertex. Thus a matching pairs off certain vertices such
that every vertex is in at most one such pair.
Graph G is bipartite or two-colorable if the vertices can be divided into two sets, say,
L and R, such that all edges in G have one vertex in L and one vertex in R. Many
naturally deﬁned graphs are bipartite. For example, suppose certain vertices represent
jobs to be done and the remaining vertices people who can potentially do them. The
existence of edge (j, p) means that job j can potentially be done by person p. Or let certain
vertices represent boys and certain vertices girls, with edges representing compatible
pairs. Matchings in these graphs have natural interpretations as job assignments or as
marriages.
The largest possible bipartite matching can be found using network ﬂow. Create a
source node s which is connected to every vertex in L by an edge of weight 1. Create
a sink node t which is connected to every vertex in R by an edge of weight 1. Assign
every edge in the bipartite graph G a weight of 1. Now, the maximum possible ﬂow from
s to t deﬁnes the largest matching in G. Certainly we can ﬁnd a ﬂow as large as the
matching, by taking exactly the matching edges and their source-to-sink connections.
Further, there can be no greater possible ﬂow. How can we ever hope to get more than
one ﬂow unit through any vertex?
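The construction above can be sketched directly on a small residual-capacity matrix. This is an illustrative version only: it uses breadth-first search to find each augmenting path over a dense matrix rather than the adjacency-list structures developed in this section, and the names bipartite_matching, find_augmenting_path, compatible, MAXN, and MAXNODES are assumptions. Node 0 is the source, nodes 1..nl are L, the next nr nodes are R, and the last node is the sink.

```c
#include <assert.h>
#include <string.h>

#define MAXN 50                 /* assumed maximum size of each side */
#define MAXNODES (2*MAXN + 2)   /* L + R + source + sink */

static int residual[MAXNODES][MAXNODES];  /* residual capacities */
static int parent[MAXNODES];              /* BFS tree; -1 if undiscovered */

/* Breadth-first search along edges with positive residual capacity. */
static int find_augmenting_path(int nnodes, int source, int sink)
{
    int queue[MAXNODES], head = 0, tail = 0, u, v;
    for (v = 0; v < nnodes; v++) parent[v] = -1;
    parent[source] = source;
    queue[tail++] = source;
    while (head < tail) {
        u = queue[head++];
        for (v = 0; v < nnodes; v++)
            if (parent[v] == -1 && residual[u][v] > 0) {
                parent[v] = u;
                queue[tail++] = v;
            }
    }
    return parent[sink] != -1;
}

/* compatible[l][r] != 0 means vertex l of L may be paired with vertex r
   of R (both 0-based). Returns the size of a maximum matching. */
int bipartite_matching(int nl, int nr, int compatible[MAXN][MAXN])
{
    int source = 0, sink = nl + nr + 1, nnodes = nl + nr + 2;
    int l, r, v, matched = 0;
    assert(nl >= 0 && nl <= MAXN && nr >= 0 && nr <= MAXN);
    memset(residual, 0, sizeof(residual));
    for (l = 0; l < nl; l++) residual[source][1 + l] = 1;
    for (r = 0; r < nr; r++) residual[1 + nl + r][sink] = 1;
    for (l = 0; l < nl; l++)
        for (r = 0; r < nr; r++)
            if (compatible[l][r]) residual[1 + l][1 + nl + r] = 1;
    while (find_augmenting_path(nnodes, source, sink)) {
        /* all capacities are 1, so each path carries exactly one unit */
        for (v = sink; v != source; v = parent[v]) {
            residual[parent[v]][v] -= 1;
            residual[v][parent[v]] += 1;
        }
        matched++;
    }
    return matched;
}
```

The reverse residual entries are what allow a later path to undo an earlier pairing, which is why a greedy first choice does not prevent the flow from reaching the maximum matching.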
The simplest network ﬂow algorithm to implement is the Ford-Fulkerson augmenting
path algorithm. For each edge, we will keep track of both the amount of ﬂow currently
going through the edge as well as its remaining residual capacity. Thus we must modify
our edge structure to accommodate the extra ﬁelds:
typedef struct {
int v;                  /* neighboring vertex */
int capacity;           /* capacity of edge */
int flow;               /* flow through edge */
int residual;           /* residual capacity of edge */
} edge;

We look for any path from source to sink that increases the total ﬂow and use it to
augment the total. We terminate on the optimal ﬂow when no such augmenting path
exists.
netflow(flow_graph *g, int source, int sink)
{
int volume;             /* weight of the augmenting path */
add_residual_edges(g);
initialize_search(g);
bfs(g,source);
volume = path_volume(g, source, sink, parent);
while (volume > 0) {
augment_path(g,source,sink,parent,volume);
initialize_search(g);
bfs(g,source);
volume = path_volume(g, source, sink, parent);
}
}
Any augmenting path from source to sink increases the ﬂow, so we can use bfs to
ﬁnd such a path in the appropriate graph. We are only allowed to walk along network
edges which have remaining capacity or, in other words, positive residual ﬂow. We use
this predicate to help bfs distinguish between saturated and unsaturated edges:
bool valid_edge(edge e)
{
if (e.residual > 0) return (TRUE);
else return(FALSE);
}
Augmenting a path transfers the maximum possible volume from the residual capacity
into positive ﬂow. The amount we can move is limited by the path-edge with the smallest
amount of residual capacity, just as the rate at which traﬃc can ﬂow is limited by the
most congested point.
int path_volume(flow_graph *g, int start, int end, int parents[])
{
edge *e;                /* edge in question */
edge *find_edge();
if (parents[end] == -1) return(0);
e = find_edge(g,parents[end],end);
if (start == parents[end])
return(e->residual);
else
return( min(path_volume(g,start,parents[end],parents),
e->residual) );
}
edge *find_edge(flow_graph *g, int x, int y)
{
int i;                  /* counter */
for (i=0; i<g->degree[x]; i++)
if (g->edges[x][i].v == y)
return( &g->edges[x][i] );
return(NULL);
}
Sending an additional unit of ﬂow along directed edge (i, j) reduces the residual
capacity of edge (i, j) but increases the residual capacity of edge (j, i). Thus the act of
augmenting a path requires looking up both forward and reverse edges for each link on
the path.
augment_path(flow_graph *g, int start, int end, int parents[], int volume)
{
edge *e;                /* edge in question */
edge *find_edge();
if (start == end) return;
e = find_edge(g,parents[end],end);
e->flow += volume;
e->residual -= volume;
e = find_edge(g,end,parents[end]);
e->residual += volume;
augment_path(g,start,parents[end],parents,volume);
}
Initializing the ﬂow graph requires creating directed ﬂow edges (i, j) and (j, i) for
each network edge e = (i, j). The initial ﬂows are all set to 0. The initial residual ﬂow
of (i, j) is set to the capacity of e, while the initial residual ﬂow of (j, i) is set to 0.
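The initialization just described might look like the sketch below. The edge fields and the g->edges[x][i] and g->degree[x] layout follow the listings above, but the flow_graph definition, the bounds MAXV and MAXDEGREE, and the helper names insert_directed and insert_flow_edge are assumptions made for illustration. The sketch also assumes the network does not contain both (i, j) and (j, i) as separate edges, since find_edge returns the first match.

```c
#include <assert.h>

#define MAXV 100        /* assumed maximum number of vertices */
#define MAXDEGREE 50    /* assumed maximum outdegree of a vertex */

typedef struct {
    int v;          /* neighboring vertex */
    int capacity;   /* capacity of edge */
    int flow;       /* flow through edge */
    int residual;   /* residual capacity of edge */
} edge;

typedef struct {
    edge edges[MAXV+1][MAXDEGREE];  /* adjacency info */
    int degree[MAXV+1];             /* outdegree of each vertex */
    int nvertices;                  /* number of vertices */
} flow_graph;

/* Append one directed edge x->y with zero initial flow. */
static void insert_directed(flow_graph *g, int x, int y,
                            int capacity, int residual)
{
    edge *e;
    assert(g->degree[x] < MAXDEGREE);
    e = &g->edges[x][g->degree[x]++];
    e->v = y;
    e->capacity = capacity;
    e->flow = 0;
    e->residual = residual;
}

/* Create both flow edges for the network edge (i,j). */
void insert_flow_edge(flow_graph *g, int i, int j, int capacity)
{
    insert_directed(g, i, j, capacity, capacity);  /* forward: full residual */
    insert_directed(g, j, i, 0, 0);                /* reverse: residual 0 */
}
```

Calling insert_flow_edge once per network edge on a zero-initialized flow_graph leaves every flow at 0 and every reverse edge with no residual capacity, matching the description above.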
Network ﬂows are an advanced algorithmic technique, and recognizing whether a
particular problem can be solved by network ﬂow requires experience. We point the
reader to books by Cook and Cunningham [CC97] and Ahuja, Magnanti, and Orlin
[AMO93] for more detailed treatments of the subject.

10.5 Problems
10.5.1 Freckles

PC/UVa IDs: 111001/10034, Popularity: B, Success rate: average Level: 2
In an episode of the Dick Van Dyke show, little Richie connects the freckles on his
Dad’s back to form a picture of the Liberty Bell. Alas, one of the freckles turns out to
be a scar, so his Ripley’s engagement falls through.
Consider Dick’s back to be a plane with freckles at various (x, y) locations. Your job
is to tell Richie how to connect the dots so as to minimize the amount of ink used.
Richie connects the dots by drawing straight lines between pairs, possibly lifting the
pen between lines. When Richie is done there must be a sequence of connected lines
from any freckle to any other freckle.

Input
The input begins with a single positive integer on a line by itself indicating the number
of test cases, followed by a blank line.
The ﬁrst line of each test case contains 0 < n ≤ 100, giving the number of freckles
on Dick’s back. For each freckle, a line follows; each following line contains two real
numbers indicating the (x, y) coordinates of the freckle.
There is a blank line between each two consecutive test cases.

Output
For each test case, your program must print a single real number to two decimal places:
the minimum total length of ink lines that can connect all the freckles. The output of
each two consecutive cases must be separated by a blank line.

Sample Input
1
3
1.0 1.0
2.0 2.0
2.0 4.0

Sample Output
3.41

10.5.2 The Necklace

PC/UVa IDs: 111002/10054, Popularity: B, Success rate: low Level: 3
My little sister had a beautiful necklace made of colorful beads. Each two successive
beads in the necklace shared a common color at their meeting point, as shown below:

But, alas! One day, the necklace tore and the beads were scattered all over the ﬂoor.
My sister did her best to pick up all the beads, but she is not sure whether she found
them all. Now she has come to me for help. She wants to know whether it is possible to
make a necklace using all the beads she has in the same way that her original necklace
was made. If so, how can the beads be so arranged?
Write a program to solve the problem.

Input
The ﬁrst line of the input contains the integer T , giving the number of test cases. The
ﬁrst line of each test case contains an integer N (5 ≤ N ≤ 1,000) giving the number
of beads my sister found. Each of the next N lines contains two integers describing the
colors of a bead. Colors are represented by integers ranging from 1 to 50.

Output
For each test case, print the test case number as shown in the sample output. If reconstruction is impossible, print the sentence “some beads may be lost” on a line
by itself. Otherwise, print N lines, each with a single bead description such that for
1 ≤ i ≤ N − 1, the second integer on line i must be the same as the ﬁrst integer on line
i + 1. Additionally, the second integer on line N must be equal to the ﬁrst integer on
line 1. There may be many solutions, any one of which is acceptable.
Print a blank line between two successive test cases.

Sample Input
2
5
1 2
2 3
3 4
4 5
5 6
5
2 1
2 2
3 4
3 1
2 4

Sample Output
Case #1
some beads may be lost
Case #2
2 1
1 3
3 4
4 2
2 2

10.5.3 Fire Station

PC/UVa IDs: 111003/10278, Popularity: B, Success rate: low Level: 2
A city is served by a number of ﬁre stations. Residents have complained that the
distance between certain houses and the nearest station is too far, so a new station is to
be built. You are to choose the location of the new station so as to reduce the distance
to the nearest station from the houses of the poorest-served residents.
The city has up to 500 intersections, connected by road segments of various lengths.
No more than 20 road segments intersect at a given intersection. The locations of houses
and ﬁre stations alike are considered to be at intersections. Furthermore, we assume
that there is at least one house associated with every intersection. There may be more
than one ﬁre station per intersection.

Input
The input begins with a single line indicating the number of test cases, followed by a
blank line. There will also be a blank line between each two consecutive inputs.
The ﬁrst line of input contains two positive integers: the number of existing ﬁre
stations f (f ≤ 100) and the number of intersections i (i ≤ 500). Intersections are numbered from 1 to i consecutively. Then f lines follow, each containing the intersection
number at which an existing ﬁre station is found. A number of lines follow, each containing three positive integers: the number of an intersection, the number of a diﬀerent
intersection, and the length of the road segment connecting the intersections. All road
segments are two-way (at least as far as ﬁre engines are concerned), and there will exist
a route between any pair of intersections.

Output
For each test case, output the lowest intersection number at which a new ﬁre station can
be built so as to minimize the maximum distance from any intersection to its nearest
ﬁre station. Separate the output of each two consecutive cases by a blank line.

Sample Input
1

1 6
2
1 2 10
2 3 10
3 4 10
4 5 10
5 6 10
6 1 10

Sample Output
5

10.5.4 Railroads

PC/UVa IDs: 111004/10039, Popularity: C, Success rate: average Level: 3
Tomorrow morning Jill must travel from Hamburg to Darmstadt to compete in the
regional programming contest. Since she is afraid of arriving late and being excluded
from the contest, she is looking for the train which gets her to Darmstadt as early
as possible. However, she dislikes getting to the station too early, so if there are several schedules with the same arrival time then she will choose the one with the latest
departure time.
Jill asks you to help her with her problem. You are given a set of railroad schedules from which you must compute the train with the earliest arrival time and the
fastest connection from one location to another. Fortunately, Jill is very experienced in
changing trains and can do this instantaneously, i.e., in zero time!

Input
The very ﬁrst line of the input gives the number of scenarios. Each scenario consists
of three parts. The ﬁrst part lists the names of all cities connected by the railroads.
It starts with a number 1 < C ≤ 100, followed by C lines containing city names. All
names consist only of letters.
The second part describes all the trains running during a day. It starts with a number
T ≤ 1,000 followed by T train descriptions. Each of them consists of one line with a
number ti ≤ 100 and then ti more lines, each with a time and a city name, meaning
that passengers can get on or oﬀ the train at that time at that city.
The ﬁnal part consists of three lines: the ﬁrst containing the earliest possible starting
time, the second the name of the city where she starts, and the third with the destination
city. The start and destination cities are always diﬀerent.

Output
For each scenario print a line containing “Scenario i”, where i is the scenario number
starting from 1.
If a connection exists, print two lines containing zero-padded timestamps and
locations as shown in the example. Use blanks to achieve the indentation. If no connection exists on the same day (i.e., arrival before midnight), print a line containing “No
connection”.
Print a blank line after each scenario.

Sample Input
2
3
Hamburg
Frankfurt
Darmstadt
3
2
0949 Hamburg
1006 Frankfurt
2
1325 Hamburg
1550 Darmstadt
2
1205 Frankfurt
1411 Darmstadt
0800
Hamburg
Darmstadt
2
Paris
Tokyo
1
2
0100 Paris
2300 Tokyo
0800
Paris
Tokyo

Sample Output
Scenario 1
Departure 0949 Hamburg
Arrival   1411 Darmstadt
Scenario 2
No connection

10.5.5 War

PC/UVa IDs: 111005/10158, Popularity: B, Success rate: average Level: 3
A war is being fought between two countries, A and B. As a loyal citizen of C, you
decide to help your country by secretly attending the peace talks between A and B.
There are n other people at the talks, but you do not know which person belongs to
which country. You can see people talking to each other, and by observing their behavior
during occasional one-to-one conversations you can guess if they are friends or enemies.
Your country needs to know whether certain pairs of people are from the same country, or whether they are enemies. You can expect to receive such questions from your
government during the peace talks, and will have to give replies on the basis of your
observations so far.
Now, more formally, consider a black box with the following operations:
setFriends(x,y)     shows that x and y are from the same country
setEnemies(x,y)     shows that x and y are from different countries
areFriends(x,y)     returns true if you are sure that x and y are friends
areEnemies(x,y)     returns true if you are sure that x and y are enemies

The ﬁrst two operations should signal an error if they contradict your former knowledge. The two relations “friends” (denoted by ∼) and “enemies” (denoted by ∗) have
the following properties:
∼ is an equivalence relation: i.e.,
1. If x ∼ y and y ∼ z, then x ∼ z (The friends of my friends are my friends as
well.)
2. If x ∼ y, then y ∼ x (Friendship is mutual.)
3. x ∼ x (Everyone is a friend of himself.)
∗ is symmetric and irreﬂexive:
1. If x ∗ y then y ∗ x (Hatred is mutual.)
2. Not x ∗ x (Nobody is an enemy of himself.)
3. If x ∗ y and y ∗ z then x ∼ z (A common enemy makes two people friends.)
4. If x ∼ y and y ∗ z then x ∗ z (An enemy of a friend is an enemy.)

Operations setFriends(x,y) and setEnemies(x,y) must preserve these properties.

Input
The ﬁrst line contains a single integer, n, the number of people. Each subsequent line
contains a triple of integers, c x y, where c is the code of the operation,
c = 1, setFriends
c = 2, setEnemies
c = 3, areFriends
c = 4, areEnemies
and x and y are its parameters, integers in the range [0, n) identifying two diﬀerent
people. The last line contains 0 0 0.
All integers in the input ﬁle are separated by at least one space or line break. There
are at most 10,000 people, but the number of operations is unconstrained.

Output
For every areFriends and areEnemies operation write “0” (meaning no) or “1” (meaning yes) to the output. For every setFriends or setEnemies operation which conﬂicts
with previous knowledge, output a “-1” to the output; such an operation should produce
no other eﬀect and execution should continue. A successful setFriends or setEnemies
gives no output.
All integers in the output ﬁle must be separated by one line break.

Sample Input
10
1 0 1
1 1 2
2 0 5
3 0 2
3 8 9
4 1 5
4 1 2
4 8 9
1 8 9
1 5 2
3 5 2
0 0 0

Sample Output
1
0
1
0
0
-1
0

10.5.6 Tourist Guide

PC/UVa IDs: 111006/10199, Popularity: B, Success rate: average Level: 3
Rio de Janeiro is a very beautiful city, but there are so many places to visit that
sometimes you feel overwhelmed. Fortunately, your friend Bruno has promised to be
your tour guide.
Unfortunately, Bruno is a terrible driver. He has a lot of traffic fines to pay and is
eager to avoid paying more. Therefore he wants to know where all the police cameras
are located so he can drive more carefully when passing by them. These cameras are
strategically distributed over the city, in locations that a driver must pass through in
order to travel from one zone of the city to another. A location C will have a camera if
and only if there are two city locations A and B such that all paths from A to B pass
through a location C.
For instance, suppose that we have six locations (A, B, C, D, E, and F ) with seven
bidirectional routes B − C, A − B, C − A, D − C, D − E, E − F , and F − C. There
must be a camera on C because to go from A to E you must pass through C. In this
conﬁguration, C is the only camera location.
Given a map of the city, help Bruno avoid further ﬁnes during your tour by writing
a program to identify where all the cameras are.

Input
The input will consist of an arbitrary number of city maps, where each map begins
with an integer N (2 < N ≤ 100) denoting the total number of locations in the city.
Then follow N different place names, one per line, where each place name will consist
of at least one and at most 30 lowercase letters. A non-negative integer R then follows,
denoting the total number of routes in the city. The next R lines each describe a bidirectional
route represented by the two places that the route connects.
Location names in route descriptions will always be valid, and there will be no route
from one place to itself. You must read until N = 0, which should not be processed.

Output
For each city map you must print the following line:
City map #d: c camera(s) found
where d stands for the city map number (starting from 1) and c stands for the total
number of cameras. Then should follow c lines with the location names of each camera
in alphabetical order. Print a blank line between output sets.

Sample Input
6
sugarloaf
maracana
copacabana
ipanema
corcovado
lapa
7
ipanema copacabana
copacabana sugarloaf
ipanema sugarloaf
maracana lapa
sugarloaf maracana
corcovado sugarloaf
lapa corcovado
5
guanabarabay
downtown
botanicgarden
colombo
sambodromo
4
guanabarabay sambodromo
downtown sambodromo
sambodromo botanicgarden
colombo sambodromo
0

Sample Output
City map #1: 1 camera(s) found
sugarloaf
City map #2: 1 camera(s) found
sambodromo

10.5.7 The Grand Dinner

PC/UVa IDs: 111007/10249, Popularity: C, Success rate: high Level: 4
Each team participating in this year’s ACM World Finals is expected to attend the
grand banquet arranged for after the award ceremony. To maximize the amount of
interaction between members of diﬀerent teams, no two members of the same team will
be allowed to sit at the same table.
Given the number of members on each team (including contestants, coaches, reserves,
and guests) and the seating capacity of each table, determine whether it is possible for
the teams to sit as described. If such an arrangement is possible, output one such seating
assignment. If there are multiple possible arrangements, any one is acceptable.

Input
The input ﬁle may contain multiple test cases. The ﬁrst line of each test case contains
two integers, 1 ≤ M ≤ 70 and 1 ≤ N ≤ 50, denoting the number of teams and tables,
respectively. The second line of each test case contains M integers, where the ith integer
mi indicates the number of members of team i. There are at most 100 members of any
team. The third line contains N integers, where the jth integer nj , 2 ≤ nj ≤ 100,
indicates the seating capacity of table j.
A test case containing two zeros for M and N terminates the input.

Output
For each test case, print a line containing either 1 or 0, denoting whether there exists
a valid seating arrangement of the team members. In case of a successful arrangement,
print M additional lines where the ith line contains a table number (from 1 to N ) for
each of the members of team i.

Sample Input
4 5
4 5 3 5
3 5 2 6 4
4 5
4 5 3 5
3 5 2 6 3
0 0

Sample Output
1
1 2 4 5
1 2 3 4 5
2 4 5
1 2 3 4 5
0

10.5.8 The Problem With the Problem Setter

PC/UVa IDs: 111008/10092, Popularity: C, Success rate: average Level: 3
So many students are interested in participating in this year’s regional programming
contest that we have decided to arrange a screening test to identify the most promising
candidates. This test may include as many as 100 problems drawn from as many as 20
categories. I have been assigned the job of setting problems for this test.
At ﬁrst the job seemed to be very easy, since I was told that I would be given a pool of
about 1,000 problems divided into appropriate categories. After getting the problems,
however, I discovered that the original authors often wrote down multiple category names in the category fields. Since no problem can be used in more than one category and
the number of problems needed for each category is ﬁxed, assigning problems for the
test is not so easy.

Input
The input ﬁle may contain multiple test cases, each of which begins with a line containing two integers, nk and np , where nk is the number of categories and np is the number
of problems in the pool. There will be between 2 and 20 categories and at most 1,000
problems in the pool.
The second line contains nk positive integers, where the ith integer speciﬁes the
number of problems to be included in category i (1 ≤ i ≤ nk ) of the test. You may
assume that the sum of these nk integers will never exceed 100. The jth (1 ≤ j ≤ np )
of the next np lines contains the category information of the jth problem in the pool.
Each such problem category speciﬁcation starts with a positive integer specifying the
number of categories in which this problem can be included, followed by the actual
category numbers.
A test case containing two zeros for nk and np terminates the input.

Output
For each test case, print a line reporting whether problems can be successfully selected
from the pool under the given restrictions, with 1 for success and 0 for failure.
In case of successful selection, print nk additional lines where the ith line contains
the problem numbers that can be included in category i. Problem numbers are positive
integers not greater than np and each two problem numbers must be separated by a
single space. Any successful selection will be accepted.

Sample Input
3 15
3 3 4
2 1 2
1 3
1 3
1 3
1 3
3 1 2 3
2 2 3
2 1 3
1 2
1 2
2 1 2
2 1 3
2 1 2
1 1
3 1 2 3
3 15
7 3 4
2 1 2
1 1
1 2
1 2
1 3
3 1 2 3
2 2 3
2 2 3
1 2
1 2
2 2 3
2 2 3
2 1 2
1 1
3 1 2 3
0 0

Sample Output
1
8 11 12
1 6 7
2 3 4 5
0

10.6 Hints
10.1 Which problem from the chapter is Richie trying to solve?
10.2 Can this problem be modeled as a Hamiltonian or Eulerian cycle problem?
10.3 How can we use shortest-path information to help us position the station?
10.4 How can we model this as a shortest-path problem? What is the start node of our
graph? How can we break ties in favor of trains leaving later in the day?
10.5 Can we propagate some of the implications of an observation through transitive
closure?
10.6 What graph-theoretic concept deﬁnes the camera locations?
10.7 Does a greedy algorithm do the job, or must we use something like network ﬂow?
10.8 Can this be modeled using network ﬂow, or is there a more elementary approach?

11
Dynamic Programming

As algorithm designers and programmers, we are often charged with building a program
to ﬁnd the best solution for all problem instances. It is usually easy to write a program
which gives a decent and correct solution, but ensuring that it always returns the
absolute best solution requires us to think deeply about the problem.
Dynamic programming is a very powerful, general tool for solving optimization
problems on left-right-ordered items such as character strings. Once understood it is
relatively easy to apply, but many people have a diﬃcult time understanding it.
Dynamic programming looks like magic until you have seen enough examples. Start
by reviewing our binomial coeﬃcient function in Chapter 6, as an example of how we
stored partial results to help us compute what we were looking for. Then review Floyd’s
all-pairs shortest-path algorithm in Section 10.3.2. Only then should you study the two
problems in the sections below. The ﬁrst is a classic example of dynamic programming
which appears in every textbook. The second is a more ad hoc example representative
of using dynamic programming to design new algorithms.

11.1 Don’t Be Greedy
Many problems call for ﬁnding the best solution satisfying certain constraints. We
have a few tricks available to tackle such jobs. For example, the backtracking problems
of Chapter 8 often asked for the largest, smallest, or highest-scoring conﬁguration.
Backtracking searches all possible solutions and selects the best one, and hence must
return the correct answer. But this approach is only feasible for small problem instances.
Correct and eﬃcient algorithms are known for many important graph problems, including shortest paths, minimum spanning trees, and matchings, as discussed in Chapter
10. Always be on the lookout for instances of these problems, so you can just plug in
the appropriate solution.
Greedy algorithms focus on making the best local choice at each decision point. For
example, a natural way to compute a shortest path from x to y might be to walk out of
x, repeatedly following the cheapest edge until we get to y. Natural, but wrong! Indeed,
in the absence of a correctness proof such greedy algorithms are very likely to fail.
So what can we do? Dynamic programming gives us a way to design custom algorithms which systematically search all possibilities (thus guaranteeing correctness)
while storing results to avoid recomputing (thus providing eﬃciency).
Dynamic programming algorithms are deﬁned by recursive algorithms/functions that
describe the solution to the entire problem in terms of solutions to smaller problems.
Backtracking is one such recursive procedure we have seen, as is depth-ﬁrst search in
graphs.
Eﬃciency in any such recursive algorithm requires storing enough information to
avoid repeating computations we have done before. Why is depth-ﬁrst search in graphs
eﬃcient? It is because we mark the vertices we have visited so we don’t visit them again.
Why is raw backtracking computationally expensive? Because it searches all possible
paths/solutions instead of just the ones we haven’t seen before.
Dynamic programming is a technique for eﬃciently implementing a recursive algorithm by storing partial results. The trick is to see that the naive recursive algorithm
repeatedly computes the same subproblems over and over and over again. If so, storing
the answers to them in a table instead of recomputing can lead to an eﬃcient algorithm. To understand the examples which follow, it will help ﬁrst to hunt for some kind
of recursive algorithm. Only once you have a correct algorithm can you worry about
speeding it up by using a results matrix.
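The binomial coefficient recurrence mentioned in the chapter introduction makes the point concrete. The sketch below places a naive recursive version beside a table-filling version of the same recurrence. The function names and the bound MAXN are assumptions made for illustration and are not the code from Chapter 6.

```c
#include <assert.h>

#define MAXN 60   /* assumed bound; results up to C(60,30) fit in 64 bits */

/* Naive recursion: correct, but the same (n,k) pairs are recomputed
   many times, so the running time grows exponentially. */
long long binomial_naive(int n, int k)
{
    if (k == 0 || k == n) return 1;
    return binomial_naive(n-1, k-1) + binomial_naive(n-1, k);
}

/* Same recurrence, but every (i,j) entry is computed once and stored. */
long long binomial_table(int n, int k)
{
    static long long bc[MAXN+1][MAXN+1];   /* table of partial results */
    int i, j;
    assert(0 <= k && k <= n && n <= MAXN);
    for (i = 0; i <= n; i++) {
        bc[i][0] = 1;
        bc[i][i] = 1;
        for (j = 1; j < i; j++)
            bc[i][j] = bc[i-1][j-1] + bc[i-1][j];
    }
    return bc[n][k];
}
```

Both functions compute the same values, but the table version does work proportional to n squared, while the naive version's work grows with the size of the answer itself.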

11.2 Edit Distance
The problem of searching for patterns in text strings is of unquestionable importance.
Indeed, we presented algorithms for string search in Chapter 3. However, there we
limited discussion to exact string matching, ﬁnding where the pattern string s was
exactly contained in the text string t. Life is often not that simple. Misspellings in
either the text or pattern rob us of exact similarity. Evolutionary changes in genomic
sequences or language usage imply that we often search with archaic patterns in mind:
“Thou shalt not kill” morphs into “You should not murder.”
If we are to deal with inexact string matching, we must ﬁrst deﬁne a cost function
telling us how far apart two strings are, i.e., a distance measure between pairs of strings.
A reasonable distance measure minimizes the cost of the changes which have to be made
to convert one string to another. There are three natural types of changes:
• Substitution — Change a single character from pattern s to a diﬀerent character
in text t, such as changing “shot” to “spot”.
• Insertion — Insert a single character into pattern s to help it match text t, such
as changing “ago” to “agog”.
• Deletion — Delete a single character from pattern s to help it match text t, such
as changing “hour” to “our”.
Properly posing the question of string similarity requires us to set the cost of each of
these string transform operations. Setting each operation to cost one step deﬁnes the
edit distance between two strings. Other cost values also yield interesting results, as will
be shown in Section 11.4.
But how can we compute the edit distance? We can deﬁne a recursive algorithm
using the observation that the last character in the string must either be matched,
substituted, inserted, or deleted. Chopping oﬀ the characters involved in the last edit
operation leaves a pair of smaller strings. Let i and j be the last character of the
relevant preﬁx of s and t, respectively. There are three pairs of shorter strings after
the last operation, corresponding to the strings after a match/substitution, insertion,
or deletion. If we knew the cost of editing the three pairs of smaller strings, we could
decide which option leads to the best solution and choose that option accordingly. We
can learn this cost, through the magic of recursion:
#define MATCH   0       /* enumerated type symbol for match */
#define INSERT  1       /* enumerated type symbol for insert */
#define DELETE  2       /* enumerated type symbol for delete */


int string_compare(char *s, char *t, int i, int j)
{
int k;                  /* counter */
int opt[3];             /* cost of the three options */
int lowest_cost;        /* lowest cost */
if (i == 0) return(j * indel(' '));
if (j == 0) return(i * indel(' '));
opt[MATCH] = string_compare(s,t,i-1,j-1) + match(s[i],t[j]);
opt[INSERT] = string_compare(s,t,i,j-1) + indel(t[j]);
opt[DELETE] = string_compare(s,t,i-1,j) + indel(s[i]);
lowest_cost = opt[MATCH];
for (k=INSERT; k<=DELETE; k++)
if (opt[k] < lowest_cost) lowest_cost = opt[k];
return( lowest_cost );
}
This program is absolutely correct – convince yourself. It is also impossibly slow.
Running on our computer, it takes several seconds to compare two 11-character strings,
and the computation disappears into never-never land on anything longer.


Why is the algorithm so slow? It takes exponential time because it recomputes values
again and again and again. At every position in the string, the recursion branches three
ways, meaning it grows at a rate of at least 3^n – indeed, even faster since most of the
calls reduce only one of the two indices, not both of them.
So how can we make the algorithm practical? The important observation is that most
of these recursive calls are computing things that have already been computed before.
How do we know? Well, there can only be |s| · |t| possible unique recursive calls, since
there are only that many distinct (i, j) pairs to serve as the parameters of recursive calls.
By storing the values for each of these (i, j) pairs in a table, we can avoid recomputing
them and just look them up as needed.
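The lookup idea can be tried directly on the recursive routine by caching each answer the first time it is computed. The following is only a minimal sketch under stated assumptions: every operation costs one, the strings are ordinary unpadded C strings of at most MAXLEN characters, and the names memo, ed, and edit_distance are hypothetical helpers introduced for this illustration.

```c
#include <assert.h>
#include <string.h>

#define MAXLEN 100                      /* assumed longest string length */

static int memo[MAXLEN+1][MAXLEN+1];    /* memo[i][j] < 0 means not yet computed */

/* cost of editing the first i characters of s into the first j characters of t */
static int ed(const char *s, const char *t, int i, int j)
{
    int best, c;

    if (i == 0) return j;               /* insert the j remaining characters */
    if (j == 0) return i;               /* delete the i remaining characters */
    if (memo[i][j] >= 0) return memo[i][j];     /* already solved: look it up */

    best = ed(s, t, i-1, j-1) + (s[i-1] != t[j-1]);    /* match or substitute */
    c = ed(s, t, i, j-1) + 1;                           /* insertion */
    if (c < best) best = c;
    c = ed(s, t, i-1, j) + 1;                           /* deletion */
    if (c < best) best = c;

    return memo[i][j] = best;           /* store the answer before returning */
}

int edit_distance(const char *s, const char *t)
{
    int n = (int) strlen(s), m = (int) strlen(t);

    assert(n <= MAXLEN && m <= MAXLEN);
    memset(memo, -1, sizeof(memo));     /* all-ones bytes make every int -1 */
    return ed(s, t, n, m);
}
```

Because each (i, j) pair is now evaluated at most once, the work is proportional to |s| · |t| rather than exponential. For instance, edit_distance("shot", "spot") is 1.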
A table-based, dynamic programming implementation of this algorithm is given below. The table is a two-dimensional matrix m where each of the |s| · |t| cells contains the
cost of the optimal solution of this subproblem, as well as a parent pointer explaining
how we got to this location:
typedef struct {
    int cost;               /* cost of reaching this cell */
    int parent;             /* parent cell */
} cell;

cell m[MAXLEN+1][MAXLEN+1]; /* dynamic programming table */

The dynamic programming version has three diﬀerences from the recursive version.
First, it gets its intermediate values using table lookup instead of recursive calls. Second,
it updates the parent field of each cell, which will enable us to reconstruct the edit sequence later. Third, it is instrumented using a more general goal_cell() function
instead of just returning m[|s|][|t|].cost. This will enable us to apply this routine
to a wider class of problems.
Be aware that we adhere to certain unusual string and index conventions in the
following routines. In particular, we assume that each string has been padded with an
initial blank character, so the ﬁrst real character of string s sits in s[1]. This was done
using the following input fragment:
s[0] = t[0] = ' ';
scanf("%s",&(s[1]));
scanf("%s",&(t[1]));
Why did we do this? It enables us to keep the matrix m indices in sync with those of
the strings for clarity. Recall that we must dedicate the zeroth row and column of m to
store the boundary values matching the empty preﬁx. Alternatively, we could have left
the input strings intact and just adjusted the indices accordingly.
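As a sketch of that alternative, the routine below leaves the strings unpadded and shifts the indices instead: cell (i, j) of the table describes the first i characters of s and the first j characters of t, so the characters compared are s[i-1] and t[j-1]. Unit costs are assumed throughout, and the names dist and edit_distance_unpadded are hypothetical.

```c
#include <assert.h>
#include <string.h>

#define MAXLEN 100                      /* assumed longest string length */

static int dist[MAXLEN+1][MAXLEN+1];    /* dist[i][j]: cost for prefixes of length i and j */

int edit_distance_unpadded(const char *s, const char *t)
{
    int n = (int) strlen(s), m = (int) strlen(t);
    int i, j, c;

    assert(n <= MAXLEN && m <= MAXLEN);

    for (i = 0; i <= n; i++) dist[i][0] = i;    /* delete i characters */
    for (j = 0; j <= m; j++) dist[0][j] = j;    /* insert j characters */

    for (i = 1; i <= n; i++)
        for (j = 1; j <= m; j++) {
            /* character i of the prefix lives at s[i-1], not s[i] */
            dist[i][j] = dist[i-1][j-1] + (s[i-1] != t[j-1]);
            c = dist[i][j-1] + 1;               /* insertion */
            if (c < dist[i][j]) dist[i][j] = c;
            c = dist[i-1][j] + 1;               /* deletion */
            if (c < dist[i][j]) dist[i][j] = c;
        }
    return dist[n][m];
}
```

The price of this choice is the off-by-one between table indices and string indices, which is exactly the bookkeeping that the padding convention avoids.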
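int string_compare(char *s, char *t)
{
    int i,j,k;              /* counters */
    int opt[3];             /* cost of the three options */

    for (i=0; i<MAXLEN; i++) {
        row_init(i);
        column_init(i);
    }
    for (i=1; i<strlen(s); i++)
        for (j=1; j<strlen(t); j++) {
            opt[MATCH] = m[i-1][j-1].cost + match(s[i],t[j]);
            opt[INSERT] = m[i][j-1].cost + indel(t[j]);
            opt[DELETE] = m[i-1][j].cost + indel(s[i]);
            m[i][j].cost = opt[MATCH];
            m[i][j].parent = MATCH;
            for (k=INSERT; k<=DELETE; k++)
                if (opt[k] < m[i][j].cost) {
                    m[i][j].cost = opt[k];
                    m[i][j].parent = k;
                }
        }
    goal_cell(s,t,&i,&j);
    return( m[i][j].cost );
}

void row_init(int i)
{
    m[0][i].cost = i;
    if (i>0)
        m[0][i].parent = INSERT;
    else
        m[0][i].parent = -1;
}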

void column_init(int i)
{
    m[i][0].cost = i;
    if (i>0)
        m[i][0].parent = DELETE;
    else
        m[i][0].parent = -1;
}

• Penalty Costs — The functions match(c,d) and indel(c) present the costs for
transforming character c to d and inserting/deleting character c. For standard edit
distance, match should cost nothing if the characters are identical, and 1 otherwise,
while indel returns 1 regardless of what the argument is. But more sensitive cost
functions are certainly possible, perhaps more forgiving of replacements located
near each other on standard keyboard layouts or those which sound or look similar.

int match(char c, char d)
{
    if (c == d) return(0);
    else return(1);
}

int indel(char c)
{
    return(1);
}

• Goal Cell Identification — The function goal_cell returns the indices of the
cell marking the endpoint of the solution. For edit distance, this is deﬁned by the
length of the two input strings. However, other applications we shall see do not
have ﬁxed goal locations.
void goal_cell(char *s, char *t, int *i, int *j)
{
    *i = strlen(s) - 1;
    *j = strlen(t) - 1;
}
• Traceback Actions — The functions match_out, insert_out, and delete_out
perform the appropriate actions for each edit-operation during traceback. For
edit distance, this might mean printing out the name of the operation or character
involved, as determined by the needs of the application.
void insert_out(char *t, int j)
{
    printf("I");
}

void match_out(char *s, char *t, int i, int j)
{
    if (s[i]==t[j]) printf("M");
    else printf("S");
}

void delete_out(char *s, int i)
{
    printf("D");
}
For our edit distance computation all of these functions are quite simple. However,
we must confess that getting the boundary conditions and index
manipulations correct is difficult. Although dynamic programming algorithms are easy to design
once you understand the technique, getting the details right requires careful thinking
and thorough testing.
This is a lot of infrastructure to develop for such a simple algorithm. However, there
are several important problems which can now be solved as special cases of edit distance
using only minor changes to some of these stubs.
• Substring Matching — Suppose that we want to ﬁnd where a short pattern s
best occurs within a long text t, say, searching for “Skiena” in all its misspellings
(Skienna, Skena, Skina, . . . ). Plugging this search into our original edit distance
function will achieve little sensitivity, since the vast majority of any edit cost will
be that of deleting the body of the text.
We want an edit distance search where the cost of starting the match is
independent of the position in the text, so that a match in the middle is not
prejudiced against. Likewise, the goal state is not necessarily at the end of both
strings, but the cheapest place to match the entire pattern somewhere in the text.
Modifying these two functions gives us the correct solution:
void row_init(int i)
{
    m[0][i].cost = 0;       /* note change */
    m[0][i].parent = -1;    /* note change */
}

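void goal_cell(char *s, char *t, int *i, int *j)
{
    int k;                  /* counter */

    *i = strlen(s) - 1;
    *j = 0;
    for (k=1; k<strlen(t); k++)
        if (m[*i][k].cost < m[*i][*j].cost) *j = k;
}

int floors_walked(int previous, int current)
{
    int nsteps=0;           /* total distance traveled */
    int i;                  /* counter */

    for (i=1; i<=nriders; i++)
        if ((stops[i] > previous) && (stops[i] <= current))
            nsteps += min(stops[i]-previous, current-stops[i]);
    return(nsteps);
}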
Once you buy this logic, the implementation of this algorithm becomes straightforward. We set up global matrices to hold the dynamic programming tables, here
separated to store the cost and parent ﬁelds:
#define NFLOORS     110     /* the building height in floors */
#define MAX_RIDERS  50      /* what is the elevator capacity? */

int stops[MAX_RIDERS];      /* what floor does everyone get off? */
int nriders;                /* number of riders */
int nstops;                 /* number of allowable stops */

int m[NFLOORS+1][MAX_RIDERS];   /* dynamic programming cost table */
int p[NFLOORS+1][MAX_RIDERS];   /* dynamic programming parent table */

The optimization function is a direct implementation of the recurrence, with care
taken to order the loops so that all values are ready before they are needed:
int optimize_floors()
{
    int i,j,k;              /* counters */
    int cost;               /* costs placeholder */
    int laststop;           /* the elevator's last stop */

    for (i=0; i<=NFLOORS; i++) {
        m[i][0] = floors_walked(0,MAXINT);
        p[i][0] = -1;
    }

    for (j=1; j<=nstops; j++)
        for (i=0; i<=NFLOORS; i++) {
            m[i][j] = MAXINT;
            for (k=0; k<=i; k++) {
                cost = m[k][j-1] - floors_walked(k,MAXINT) +
                       floors_walked(k,i) + floors_walked(i,MAXINT);
                if (cost < m[i][j]) {
                    m[i][j] = cost;
                    p[i][j] = k;
                }
            }
        }

    laststop = 0;
    for (i=1; i<=NFLOORS; i++)
        if (m[i][nstops] < m[laststop][nstops])
            laststop = i;

    return(laststop);
}
Finally, we need to reconstruct the solution. The logic is exactly the same as the
previous examples: follow the parent pointers and work backward:
void reconstruct_path(int lastfloor, int stops_to_go)
{
    if (stops_to_go > 1)
        reconstruct_path(p[lastfloor][stops_to_go], stops_to_go-1);
    printf("%d\n",lastfloor);
}
Running this program on a ten-story European building (which has the ground ﬂoor
labeled zero) with single passengers seeking to go to every ﬂoor from 1 to 10 informs
us that the best single stop is at ﬂoor 7, for a cost of 18 walked ﬂights (the ﬂoor 1, 2,
and 3 passengers are told to get out and walk up from the ground floor). The best pair
of stops is 3 and 8 for a cost of 11, while the best triple of stops is at 3, 6, and 9 for
a total cost of 7 ﬂights.
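These three costs can be reproduced with a self-contained sketch of the same recurrence. It is an illustration under stated assumptions rather than the program used in the text: riders are stored in dest[0..nriders-1], the building height is passed as a parameter instead of being fixed by NFLOORS, only the cost (not the parent table) is kept, and the names walked, min_walk_cost, and BIG are hypothetical.

```c
#include <assert.h>

#define MAXFLOORS 110
#define MAXSTOPS   50
#define BIG   1000000                   /* stands in for "no later stop" */

static int cost[MAXFLOORS+1][MAXSTOPS+1];

/* flights walked by riders bound for floors in (previous, current], where
   each rider gets out at whichever of the two stops is closer */
static int walked(const int dest[], int nriders, int previous, int current)
{
    int i, up, down, nsteps = 0;

    for (i = 0; i < nriders; i++)
        if (dest[i] > previous && dest[i] <= current) {
            up = dest[i] - previous;
            down = current - dest[i];
            nsteps += (up < down) ? up : down;
        }
    return nsteps;
}

/* minimum total flights walked when the elevator makes at most nstops stops */
int min_walk_cost(const int dest[], int nriders, int nfloors, int nstops)
{
    int i, j, k, c, best;

    assert(nfloors <= MAXFLOORS && nstops <= MAXSTOPS);

    for (i = 0; i <= nfloors; i++)      /* no stops: everyone walks from the ground */
        cost[i][0] = walked(dest, nriders, 0, BIG);

    for (j = 1; j <= nstops; j++)
        for (i = 0; i <= nfloors; i++) {
            cost[i][j] = BIG;
            for (k = 0; k <= i; k++) {  /* k is the stop made just before floor i */
                c = cost[k][j-1] - walked(dest, nriders, k, BIG)
                    + walked(dest, nriders, k, i)
                    + walked(dest, nriders, i, BIG);
                if (c < cost[i][j]) cost[i][j] = c;
            }
        }

    best = cost[0][nstops];             /* cheapest choice of final stop */
    for (i = 1; i <= nfloors; i++)
        if (cost[i][nstops] < best) best = cost[i][nstops];
    return best;
}
```

With ten riders bound for floors 1 through 10 in a ten-story building, min_walk_cost returns 18, 11, and 7 for one, two, and three stops respectively.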

11.6 Problems

11.6.1 Is Bigger Smarter?

PC/UVa IDs: 111101/10131, Popularity: B, Success rate: high Level: 2
Some people think that the bigger an elephant is, the smarter it is. To disprove this,
you want to analyze a collection of elephants and place as large a subset of elephants
as possible into a sequence whose weights are increasing but IQ’s are decreasing.

Input
The input will consist of data for a bunch of elephants, at one elephant per line terminated by the end-of-ﬁle. The data for each particular elephant will consist of a pair of
integers: the ﬁrst representing its size in kilograms and the second representing its IQ
in hundredths of IQ points. Both integers are between 1 and 10,000. The data contains
information on at most 1,000 elephants. Two elephants may have the same weight, the
same IQ, or even the same weight and IQ.

Output
The ﬁrst output line should contain an integer n, the length of elephant sequence
found. The remaining n lines should each contain a single positive integer representing
an elephant. Denote the numbers on the ith data line as W[i] and S[i]. If this sequence
of n elephants is a[1], a[2], ..., a[n], then it must be the case that
W[a[1]] < W[a[2]] < ... < W[a[n]] and S[a[1]] > S[a[2]] > ... > S[a[n]]
In order for the answer to be correct, n must be as large as possible. All inequalities
are strict: weights must be strictly increasing, and IQs must be strictly decreasing.
Your program can report any correct answer for a given input.

Sample Input
6008 1300
6000 2100
500 2000
1000 4000
1100 3000
6000 2000
8000 1400
6000 1200
2000 1900

Sample Output
4
4
5
9
7

11.6.2 Distinct Subsequences

PC/UVa IDs: 111102/10069, Popularity: B, Success rate: average Level: 3
A subsequence of a given sequence S consists of S with zero or more elements deleted.
Formally, a sequence Z = z_1 z_2 ... z_k is a subsequence of X = x_1 x_2 ... x_m if there
exists a strictly increasing sequence <i_1, i_2, ..., i_k> of indices of X such that for all
j = 1, 2, ..., k, we have x_{i_j} = z_j. For example, Z = bcdb is a subsequence of X = abcbdab
with corresponding index sequence <2, 3, 5, 7>.
Your job is to write a program that counts the number of occurrences of Z in X as
a subsequence such that each has a distinct index sequence.

Input
The ﬁrst line of the input contains an integer N indicating the number of test cases
to follow. The ﬁrst line of each test case contains a string X, composed entirely of
lowercase alphabetic characters and having length no greater than 10,000. The second
line contains another string Z having length no greater than 100 and also composed of
only lowercase alphabetic characters. Be assured that neither Z nor any preﬁx or suﬃx
of Z will have more than 10^100 distinct occurrences in X as a subsequence.

Output
For each test case, output the number of distinct occurrences of Z in X as a subsequence.
Output for each input set must be on a separate line.

Sample Input
2
babgbag
bag
rabbbit
rabbit

Sample Output
5
3

11.6.3 Weights and Measures

PC/UVa IDs: 111103/10154, Popularity: C, Success rate: average Level: 3
A turtle named Mack, to avoid being cracked, has enlisted your advice as to the order
in which turtles should be stacked to form Yertle the Turtle’s throne. Each of the 5,607
turtles ordered by Yertle has a diﬀerent weight and strength. Your task is to build the
largest stack of turtles possible.

Input
Standard input consists of several lines, each containing a pair of integers separated by
one or more space characters, specifying the weight and strength of a turtle. The weight
of the turtle is in grams. The strength, also in grams, is the turtle’s overall carrying
capacity, including its own weight. That is, a turtle weighing 300 g with a strength of
1,000 g can carry 700 g of turtles on its back. There are at most 5,607 turtles.

Output
Your output is a single integer indicating the maximum number of turtles that can be
stacked without exceeding the strength of any one.

Sample Input
300 1000
1000 1200
200 600
100 101

Sample Output
3

11.6.4 Unidirectional TSP

PC/UVa IDs: 111104/116, Popularity: A, Success rate: low Level: 3
Given an m×n matrix of integers, you are to write a program that computes a path of
minimal weight from left to right across the matrix. A path starts anywhere in column
1 and consists of a sequence of steps terminating in column n. Each step consists of
traveling from column i to column i + 1 in an adjacent (horizontal or diagonal) row.
The ﬁrst and last rows (rows 1 and m) of a matrix are considered adjacent; i.e., the
matrix “wraps” so that it represents a horizontal cylinder. Legal steps are illustrated
below.

[Figure: the legal steps from a cell into the next column]
The weight of a path is the sum of the integers in each of the n cells of the matrix that
are visited.
The minimum paths through two slightly diﬀerent 5 × 6 matrices are shown below.
The matrix values diﬀer only in the bottom row. The path for the matrix on the right
takes advantage of the adjacency between the ﬁrst and last rows.
[Figure: minimum paths through the two 5 × 6 matrices]

Input
The input consists of a sequence of matrix speciﬁcations. Each matrix consists of the
row and column dimensions on a line, denoted m and n, respectively. This is followed
by m · n integers, appearing in row major order; i.e., the ﬁrst n integers constitute the
ﬁrst row of the matrix, the second n integers constitute the second row, and so on.
The integers on a line will be separated from other integers by one or more spaces.
Note: integers are not restricted to being positive. There will be one or more matrix
speciﬁcations in an input ﬁle. Input is terminated by end-of-ﬁle.
For each speciﬁcation the number of rows will be between 1 and 10 inclusive; the
number of columns will be between 1 and 100 inclusive. No path’s weight will exceed
integer values representable using 30 bits.


Output
Two lines should be output for each matrix speciﬁcation. The ﬁrst line represents a
minimal-weight path, and the second line is the cost of this minimal path. The path
consists of a sequence of n integers (separated by one or more spaces) representing
the rows that constitute the minimal path. If there is more than one path of minimal
weight, the lexicographically smallest path should be output.

Sample Input
5 6
3 4 1 2 8 6
6 1 8 2 7 4
5 9 3 9 9 5
8 4 1 3 2 6
3 7 2 8 6 4
5 6
3 4 1 2 8 6
6 1 8 2 7 4
5 9 3 9 9 5
8 4 1 3 2 6
3 7 2 1 2 3
2 2
9 10 9 10

Sample Output
1 2 3 4 4 5
16
1 2 1 5 4 5
11
1 1
19

11.6.5 Cutting Sticks

PC/UVa IDs: 111105/10003, Popularity: B, Success rate: average Level: 2
You have to cut a wood stick into several pieces. The most aﬀordable company,
Analog Cutting Machinery (ACM), charges money according to the length of the stick
being cut. Their cutting saw allows them to make only one cut at a time.
It is easy to see that diﬀerent cutting orders can lead to diﬀerent prices. For example,
consider a stick of length 10 m that has to be cut at 2, 4, and 7 m from one end. There
are several choices. One can cut ﬁrst at 2, then at 4, then at 7. This leads to a price of
10 + 8 + 6 = 24 because the ﬁrst stick was of 10 m, the resulting stick of 8 m, and the
last one of 6 m. Another choice could cut at 4, then at 2, then at 7. This would lead to
a price of 10 + 4 + 6 = 20, which is better for us.
Your boss demands that you write a program to ﬁnd the minimum possible cutting
cost for any given stick.
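The arithmetic in this example can be checked mechanically. The sketch below prices one given cutting order; it does not search for the cheapest order, which is the problem to be solved. The function name cutting_cost and the MAXCUTS bound are assumptions introduced for this illustration.

```c
#include <assert.h>

#define MAXCUTS 50                      /* assumed upper bound on the number of cuts */

/* price of making the n cuts in exactly the order given, on a stick of the given length */
int cutting_cost(int length, const int order[], int n)
{
    int done[MAXCUTS];                  /* positions that have already been cut */
    int i, k, left, right, total = 0;

    assert(n <= MAXCUTS);
    for (i = 0; i < n; i++) {
        left = 0;                       /* ends of the piece that contains order[i] */
        right = length;
        for (k = 0; k < i; k++) {
            if (done[k] < order[i] && done[k] > left)  left = done[k];
            if (done[k] > order[i] && done[k] < right) right = done[k];
        }
        total += right - left;          /* pay the length of the piece being cut */
        done[i] = order[i];
    }
    return total;
}
```

Cutting the 10 m stick in the order 2, 4, 7 costs 24, while the order 4, 2, 7 costs 20, matching the figures above.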

Input
The input will consist of several input cases. The ﬁrst line of each test case will contain
a positive number l that represents the length of the stick to be cut. You can assume
l < 1,000. The next line will contain the number n (n < 50) of cuts to be made.
The next line consists of n positive numbers c_i (0 < c_i < l) representing the places
where the cuts must be made, given in strictly increasing order.
An input case with l = 0 represents the end of input.

Output
Print the cost of the minimum cost solution to cut each stick in the format shown below.

Sample Input
100
3
25 50 75
10
4
4 5 7 8
0

Sample Output
The minimum cutting is 200.
The minimum cutting is 22.

11.6.6 Ferry Loading

PC/UVa IDs: 111106/10261, Popularity: B, Success rate: low Level: 3
Ferries are used to transport cars across rivers and other bodies of water. Typically,
ferries are wide enough to support two lanes of cars throughout their length. The two
lanes of cars drive onto the ferry from one end, the ferry crosses the water, and the cars
exit from the other end of the ferry.
The cars waiting to board the ferry form a single queue, and the operator directs
each car in turn to drive onto the port (left) or starboard (right) lane of the ferry so as
to balance the load. Each car in the queue has a diﬀerent length, which the operator
estimates by inspecting the queue. Based on this inspection, the operator decides which
side of the ferry each car should board, and boards as many cars as possible from the
queue, subject to the length limit of the ferry. Write a program that will tell the operator
which car to load on which side so as to maximize the number of cars loaded.

Input
The input begins with a single positive integer on a line by itself indicating the number
of test cases, followed by a blank line.
The ﬁrst line of each test case contains a single integer between 1 and 100: the length
of the ferry (in meters). For each car in the queue there is an additional line of input
specifying the length of the car in cm, an integer between 100 and 3,000 inclusive. A
ﬁnal line of input contains the integer 0. The cars must be loaded in order, subject to
the constraint that the total length of cars on either side does not exceed the length of
the ferry. As many cars should be loaded as possible, starting with the ﬁrst car in the
queue and loading cars in order until the next car cannot be loaded.
There is a blank line between each two consecutive inputs.

Output
For each test case, the ﬁrst line of output should give the number of cars that can be
loaded onto the ferry. For each car that can be loaded onto the ferry, in the order the
cars appear in the input, output a line containing “port” if the car is to be directed
to the port side and “starboard” if the car is to be directed to the starboard side. If
several arrangements of cars meet the criteria above, any one will do.
The output of two consecutive cases will be separated by a blank line.

Sample Input
1

50
2500
3000
1000
1000
1500
700
800
0

Sample Output
6
port
starboard
starboard
starboard
port
port

11.6.7 Chopsticks

PC/UVa IDs: 111107/10271, Popularity: B, Success rate: average Level: 3
In China, people use pairs of chopsticks to eat food, but Mr. L is a bit diﬀerent.
He uses a set of three chopsticks, one pair plus an extra; a long chopstick to get large
items by stabbing the food. The length of the two shorter, standard chopsticks should
be as close as possible, but the length of the extra one is not important so long as it is
the longest. For a set of chopsticks with lengths A, B, C (A ≤ B ≤ C), the function
(A − B)^2 defines the “badness” of the set.
Mr. L has invited K people to his birthday party, and he is eager to introduce his way
of using chopsticks. He must prepare K + 8 sets of chopsticks (for himself, his wife, his
little son, little daughter, his mother, father, mother-in-law, father-in-law, and K other
guests). But Mr. L’s chopsticks are of many diﬀerent lengths! Given these lengths, he
must ﬁnd a way of composing the K + 8 sets such that the total badness of the sets is
minimized.

Input
The ﬁrst line in the input contains a single integer T indicating the number of test
cases (1 ≤ T ≤ 20). Each test case begins with two integers K and N (0 ≤ K ≤ 1,000,
3K + 24 ≤ N ≤ 5,000) giving the number of guests and the number of chopsticks.
Then follow N positive integers L_i, in non-decreasing order, indicating the lengths of
the chopsticks (1 ≤ L_i ≤ 32,000).

Output
For each test case in the input, print a line containing the minimal total badness of all
the sets.

Sample Input
1
1 40
1 8 10 16 19 22 27 33 36 40 47 52 56 61 63 71 72 75 81 81 84 88 96 98
103 110 113 118 124 128 129 134 134 139 148 157 157 160 162 164

Sample Output
23
Note: A possible collection of the nine chopstick sets for this sample input is
(8, 10, 16), (19, 22, 27), (61, 63, 75), (71, 72, 88), (81, 81, 84), (96, 98, 103), (128, 129, 148),
(134, 134, 139), and (157, 157, 160).
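The sample answer can be verified from the sets listed in the note. This small sketch, whose function name total_badness is an assumption, sums (A − B)^2 over the two shortest chopsticks of each set and ignores the third.

```c
#include <assert.h>

/* sets[k] holds the lengths A <= B <= C of the kth set of chopsticks */
long total_badness(int sets[][3], int nsets)
{
    long total = 0;
    int k;

    for (k = 0; k < nsets; k++) {
        long diff = sets[k][1] - sets[k][0];    /* B - A */

        assert(diff >= 0 && sets[k][1] <= sets[k][2]);
        total += diff * diff;           /* the long chopstick C adds nothing */
    }
    return total;
}
```

For the nine sets above the squared differences are 4, 9, 4, 1, 0, 4, 1, 0, and 0, which sum to 23.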

11.6.8 Adventures in Moving: Part IV

PC/UVa IDs: 111108/10201, Popularity: A, Success rate: low Level: 3
You are considering renting a moving truck to help you move from Waterloo to the
big city. Gas prices being so high these days, you want to know how much the gas for
this beast will set you back.
The truck consumes a full liter of gas for each kilometer it travels. It has a 200-liter
gas tank. When you rent the truck in Waterloo, the tank is half-full. When you return
it in the big city, the tank must be at least half-full, or you’ll get gouged even more for
gas by the rental company. You would like to spend as little as possible on gas, but you
don’t want to run out along the way.

Input
The input begins with a single positive integer on a line by itself indicating the number
of test cases, followed by a blank line.
Each test case is composed only of integers. The ﬁrst integer is the distance in kilometers from Waterloo to the big city, at most 10,000. Next comes a set of up to 100 gas
station speciﬁcations, describing all the gas stations along your route, in non-decreasing
order by distance. Each speciﬁcation consists of the distance in kilometers of the gas
station from Waterloo, and the price of a liter of gas at the gas station, in tenths of a
cent, at most 2,000.
There is a blank line between each two consecutive inputs.

Output
For each test case, output the minimum amount of money that you can spend on gas
to get from Waterloo to the big city. If it is not possible to get from Waterloo to the
big city subject to the constraints above, print “Impossible”.
The output of each two consecutive cases will be separated by a blank line.

Sample Input
1

500
100 999
150 888
200 777
300 999
400 1009
450 1019
500 1399

Sample Output
450550


11.7 Hints
11.1 Can this be reduced to some form of string matching problem?
11.3 Does the original order of the input have any meaning, or are we free to rearrange
it? If so, what order is most useful?
11.4 What information do we need about shorter tours to be able to select the optimal
last move?
11.5 Can we exploit the fact that each cut leaves two smaller sticks to construct a
recursive algorithm?
11.6 Does always putting the next car on the side with the most remaining room
solve the problem? Why or why not? Can we exploit the fact that the sum of
accumulated car lengths on each lane of the ferry is always an integer?
11.7 How could we solve the problem if we didn’t have to worry about the third
chopstick?
11.8 What information about the costs of reaching certain places with certain amounts
of gas is suﬃcient to select the optimal last move?

11.8 Notes
11.3 More about Yertle the Turtle can be found in [Seu58].

12
Grids

It is not that polar coordinates are complicated, it is just that Cartesian coordinates are simpler than they have a right to be. – Kleppner and Kolenkow,
“An Introduction to Mechanics”
Grids underlie a wide variety of natural structures. Chessboards are grids. City blocks
are typically arranged on a grid; indeed, the most natural grid distance measure is
often called the “Manhattan” distance. The system of longitude and latitude deﬁnes a
grid over the earth, albeit on the surface of a sphere instead of the plane.
Grids are ubiquitous because they are the most natural way to carve space up into
regions so that locations can be identiﬁed. In the limit, these cells can be individual
points, but here we deal with coarser grids whose cells are big enough to have a shape.
In regular grids, each of these shapes is identical, and they occur in a regular pattern.
Rectangular or rectilinear subdivisions are the most common grids, due to their simplicity, but triangle-based hexagonal grids are also important. Indeed, the honey industry
has exploited the eﬃciency of hexagonal grids for literally millions of years.

12.1 Rectilinear Grids
Rectilinear grids are familiar to anyone who has used a piece of graph paper. In such
grids, the cells are typically deﬁned by regularly spaced horizontal and vertical lines.
Non-uniform spacing still yields a regular topology, although the size of each cell may
diﬀer. Three-dimensional grids are formed by connected regularly spaced layers of planar grids with perpendicular lines across the layers. Three-dimensional grids also have
planar faces, deﬁned between any two face-neighboring cubes.


There are three important components of the planar grid: the vertices, the edges,
and the cell interiors. Sometimes we are interested in the interiors of the cells, as in
geometric applications where each cell describes a region in space. Sometimes we are
interested in the vertices of the grid, such as in addressing the pieces on a chessboard.
Sometimes we are interested in the edges of the grid, such as when ﬁnding routes to
travel in a city where buildings occupy the interior of the cells.
Vertices in planar grids each touch four edges and the interiors of four cells, except
for vertices on the boundaries. Vertices in 3D grids touch on six edges and eight cells.
In d dimensions, each vertex touches 2d edges and 2^d cells. Cells in a planar grid each
touch eight other cells, four diagonally through vertices and four through edges. Cells in a
3D grid each touch 26 other cells, sharing a face with 6 of them, an edge with 12 of
them, and just a vertex with the other 8.
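These cell-adjacency counts can be confirmed by brute force. The sketch below, with the hypothetical name count_neighbors, enumerates every offset vector with entries −1, 0, or 1 around a cell and classifies each neighbor by how many coordinates change: one changed coordinate means a shared face in 3D (a shared edge in the plane), two means a shared edge in 3D (a shared vertex in the plane), and three means only a shared vertex in 3D.

```c
#include <assert.h>

#define MAXDIM 8                        /* assumed largest dimension handled */

/* number of cells adjacent to a given cell of a d-dimensional grid whose
   position differs from it in exactly `changed` coordinates */
int count_neighbors(int d, int changed)
{
    int offset[MAXDIM];
    int i, nonzero, count = 0;
    long code, total = 1;

    assert(d >= 1 && d <= MAXDIM);
    for (i = 0; i < d; i++) total *= 3;         /* there are 3^d offset vectors */

    for (code = 0; code < total; code++) {
        long c = code;

        nonzero = 0;
        for (i = 0; i < d; i++) {               /* decode base-3 digits into -1, 0, 1 */
            offset[i] = (int)(c % 3) - 1;
            c /= 3;
            if (offset[i] != 0) nonzero++;
        }
        if (nonzero == changed) count++;
    }
    return count;
}
```

In three dimensions this yields 6, 12, and 8 neighbors for one, two, and three changed coordinates, which together account for all 26 adjacent cells.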

12.1.1 Traversal

It is often necessary to traverse all the cells of an n × m rectilinear grid. Any such
traversal can be thought of as a mapping from each of the nm ordered pairs to a unique
integer from 1 to nm. In certain applications the order matters, such as in dynamic
programming evaluation strategies. The most important traversal methods are —
• Row Major — Here we slice the matrix between rows, so the ﬁrst m elements
visited belong to the ﬁrst row, the second m elements to the second row, and
so forth. Such an ordering is used inside most modern programming language
compilers to represent two-dimensional matrices as a single linear array.
(1,1) (1,2) (1,3) (2,1) (2,2) (2,3) (3,1) (3,2) (3,3)

void row_major(int n, int m)
{
    int i,j;                /* counters */

    for (i=1; i<=n; i++)
        for (j=1; j<=m; j++)
            process(i,j);
}

• Column Major — Here we slice the matrix between columns, so the ﬁrst n
elements belong to the ﬁrst column, the second n elements to the second column,
and so forth. This can be done by interchanging the order of the nested loops from
row-major ordering. Knowing whether your compiler uses row-major or column-major ordering for matrices is important when optimizing for cache performance
and when attempting certain pointer-arithmetic operations.

(1,1) (2,1) (3,1) (1,2) (2,2) (3,2) (1,3) (2,3) (3,3)

void column_major(int n, int m)
{
    int i,j;                /* counters */

    for (j=1; j<=m; j++)
        for (i=1; i<=n; i++)
            process(i,j);
}
• Snake Order — Instead of starting each row from the ﬁrst element, we alternate the order of the directions we travel down the rows. The eﬀect is that of a
typewriter which can type both left-to-right and right-to-left so as to minimize
printing time.
(1,1) (1,2) (1,3) (2,3) (2,2) (2,1) (3,1) (3,2) (3,3)

void snake_order(int n, int m)
{
    int i,j;                /* counters */

    for (i=1; i<=n; i++)
        for (j=1; j<=m; j++)
            process(i, j + (m+1-2*j) * ((i+1) % 2));
}
• Diagonal Order — Here we march up and down diagonals. Note that an n × m
grid has m + n − 1 diagonals, each with a variable number of elements. This is a
trickier task than it appears at ﬁrst glance.
(1,1) (2,1) (1,2) (3,1) (2,2) (1,3) (4,1) (3,2) (2,3) (4,2) (3,3) (4,3)

void diagonal_order(int n, int m)
{
    int d,j;                /* diagonal and point counters */
    int pcount;             /* points on diagonal */
    int height;             /* smallest row index on diagonal */

    for (d=1; d<=(m+n-1); d++) {
        height = 1 + max(0,d-m);
        pcount = min(d,n) - height + 1;
        for (j=0; j<pcount; j++)
            process(min(d,n)-j, d+1-min(d,n)+j);
    }
}

int dense_layers(double w, double h, double r)
{
    double gap;             /* distance between layers */

    if ((2*r) > h) return(0);
    gap = 2.0 * r * (sqrt(3)/2.0);
    return( 1 + floor((h-2.0*r)/gap) );
}
The number of disks which ﬁt in the box is a function of both the number of rows
of plates, and how many plates ﬁt in each row. We always pack the bottom (or zeroth)
row starting from the left-hand side of the box, so it contains as many disks as possible
constrained by the width of the box. The disks in odd-numbered rows are oﬀset by r,
and we might have to remove the last disk from these rows unless there is enough slack
(≥ r) to accommodate it:
int plates_per_row(int row, double w, double r)
{
    int plates_per_full_row;    /* plates in full/even row */

    plates_per_full_row = floor(w/(2*r));
    if ((row % 2) == 0) return(plates_per_full_row);
    if (((w/(2*r))-plates_per_full_row) >= 0.5)     /* odd row full */
        return(plates_per_full_row);
    else
        return(plates_per_full_row - 1);
}
Determining how many plates sit on a given plate can be simpliﬁed through proper
use of our coordinate systems. In an unbounded lattice, two plates in row r + 1 sit on
top of a plate at hexagonal-(r, c), namely, (r+1, c−1) and (r+1, c). In general, i+1 such plates
sit in row r + i. However, we must clip these oﬀ to reﬂect the limits of our region. This
clipping is easier done in array-coordinates, so we convert to determine the number of
plates in our truncated cone:
int plates_on_top(int xh, int yh, double w, double l, double r)
{
    int number_on_top = 0;  /* total plates on top */
    int layers;             /* number of rows in grid */
    int rowlength;          /* number of plates in row */
    int row;                /* counter */
    int xla,yla,xra,yra;    /* array coordinates */

    layers = dense_layers(w,l,r);

    for (row=xh+1; row<layers; row++) {
        rowlength = plates_per_row(row,w,r) - 1;

        hex_to_array(row,yh-(row-xh),&xla,&yla);
        if (yla < 0) yla = 0;                   /* left boundary */

        hex_to_array(row,yh,&xra,&yra);
        if (yra > rowlength) yra = rowlength;   /* right boundary */

        number_on_top += yra-yla+1;
    }
    return(number_on_top);
}

12.4 Circle Packings
There is an interesting and important connection between hexagonal grids and packing
circular disks. The six neighbors of each vertex v in the grid are equidistant from v, so
we can draw a circle centered at v through them, as shown in Figure 12.4(r). Each such
disk touches the six disks of its neighbors, as shown in Figure 12.3.
The plate packing problem asks us to evaluate two diﬀerent ways to pack a collection
of equal-sized disks, one with the disk centers at the vertices of a rectilinear grid, the
other with their centers at the vertices of a hexagonal grid. Which leads to a denser
circle packing? It is easy to evaluate both layouts using the routines we have already
developed:
/* How many radius r plates fit in a hexagonal-lattice packed w*l box? */
int dense_plates(double w, double l, double r)
{
    int layers;             /* number of layers of balls */

    layers = dense_layers(w,l,r);
    return (ceil(layers/2.0) * plates_per_row(0,w,r) +
            floor(layers/2.0) * plates_per_row(1,w,r) );
}

/* How many radius r plates fit in a grid-lattice packed w*h box? */
int grid_plates(double w, double h, double r)
{
    int layers;             /* number of layers of balls */

    layers = floor(h/(2*r));
    return (layers * plates_per_row(0,w,r));
}
For large enough boxes, the hexagonal packing certainly lets us pack more plates
than the square grid layout. Indeed, hexagonal packing is the asymptotically densest
way to pack disks, and its three-dimensional analog is the densest possible way to pack
spheres.
A 4 × 4 box has room for 16 unit-diameter plates under the square layout versus only
14 with the hexagonal layout, due to boundary eﬀects. But a 10 × 10 box ﬁts 105 plates
in the hexagonal layout, 5 more than the square, and it never looks back from there. A
100 × 100 box ﬁts 11,443 hex-plates versus 10,000 in a square layout. Thus we can gain
a signiﬁcant advantage with the proposed packaging technology.

12.5 Longitude and Latitude
A particularly important coordinate grid is the system of longitude and latitude which
uniquely positions every location on the surface of the Earth.
The lines that run east-west, parallel to the equator, are called lines of latitude. The
equator has a latitude of 0°, while the north and south poles have latitudes of 90° North
and 90° South, respectively.
The lines that run north-south are called lines of longitude or meridians. The prime
meridian passes through Greenwich, England, and has longitude 0°, with the entire
range of longitudes spanning from 180° West to 180° East.
Every location on the surface of the Earth is described by the intersection of a latitude
line and a longitude line. For example, the center of the universe (Manhattan) lies at
40° 47′ North and 73° 58′ West.
A common problem concerns ﬁnding the shortest ﬂying distance between two points
on the surface of the Earth. A great circle is a circular cross-section of a sphere which
passes through the center of the sphere. The shortest distance between points x and y
turns out to be the arc length between x and y on the unique great circle which passes
through x and y.
We refrain from working through the spherical geometry in favor of stating the result.
Denote the position of point p by its latitude-longitude coordinates, (p_lat, p_long), where
all angles are measured in radians, and let r be the radius of the sphere. Then the
great-circle distance between points p and q is
\[ d(p, q) = r \cdot \arccos\bigl( \sin(p_{lat}) \sin(q_{lat}) + \cos(p_{lat}) \cos(q_{lat}) \cos(p_{long} - q_{long}) \bigr) \]

12.6 Problems

12.6.1 Ant on a Chessboard

PC/UVa IDs: 111201/10161, Popularity: B, Success rate: high Level: 1
One day, an ant named Alice came upon an M ×M chessboard. She wanted to explore
all the cells of the board. So she began to walk along the board by peeling oﬀ a corner
of the board.
Alice started at square (1, 1). First, she went up for a step, then a step to the right,
and a step downward. After that, she went a step to the right, then two steps upward,
and then two grids to the left. In each round, she added one new row and one new
column to the corner she had explored.
For example, her ﬁrst 25 steps went like this, where the numbers in each square
denote on which step she visited it.
25 24 23 22 21
10 11 12 13 20
 9  8  7 14 19
 2  3  6 15 18
 1  4  5 16 17

Her 8th step put her on square (2, 3), while her 20th step put her on square (5, 4).
Your task is to decide where she was at a given time, assuming the chessboard is large
enough to accept all movements.

Input
The input file will contain several lines, each with an integer N denoting the step
number where 1 ≤ N ≤ 2 × 10^9. The file will terminate with a line that contains the
number 0.

Output
For each input situation, print a line with two numbers (x,y) denoting the column and
the row number, respectively. There must be a single space between them.

Sample Input
8
20
25
0

Sample Output
2 3
5 4
1 5

12.6.2 The Monocycle

PC/UVa IDs: 111202/10047, Popularity: C, Success rate: average Level: 3
A monocycle is a cycle that runs on one wheel. We will be considering a special one
which has a solid wheel colored with ﬁve diﬀerent colors as shown in the ﬁgure:

The colored segments make equal angles (72°) at the center. A monocyclist rides this
cycle on an M × N grid of square tiles. The tiles are of a size such that moving forward
from the center of one tile to that of the next one makes the wheel rotate exactly 72°
around its center. The effect is shown in the above figure. When the wheel is at the
center of square 1, the midpoint of its blue segment is in touch with the ground. But
when the wheel moves forward to the center of the next square (square 2) the midpoint
of its white segment touches the ground.

Some of the squares of the grid are blocked and hence the cyclist cannot move to them.
The cyclist starts from some square and tries to move to a target square in minimum
amount of time. From any square he either moves forward to the next square or he
remains in the same square but turns 90° left or right. Each of these actions requires
exactly 1 second to execute. He always starts his ride facing north and with the midpoint
of the green segment of his wheel touching the ground. In the target square, too, the
green segment must touch the ground but he does not care which direction he will be
facing.

Please help the monocyclist check whether the destination is reachable and if so the
minimum amount of time he will require to reach it.

Input
The input may contain multiple test cases.
The ﬁrst line of each test case contains two integers M and N (1 ≤ M , N ≤ 25)
giving the dimensions of the grid. Then follows the description of the grid in M lines
of N characters each. The character “#” will indicate a blocked square, but all other
squares are free. The starting location of the cyclist is marked by “S” and the target is
marked by “T”.
The input terminates with two zeros for M and N .

Output
For each test case ﬁrst print the test case number on a separate line, as shown in the
sample output. If the target location can be reached by the cyclist, print the minimum
amount of time (in seconds) required to reach it in the format shown below. Otherwise
print “destination not reachable”.
Print a blank line between two successive test cases.

Sample Input
1 3
S#T
10 10
#S.......#
#..#.##.##
#.##.##.##
.#....##.#
##.##..#.#
#..#.##...
#......##.
..##.##...
#.###...#.
#.....###T
0 0

Sample Output
Case #1
destination not reachable
Case #2
minimum time = 49 sec

12.6.3 Star

PC/UVa IDs: 111203/10159, Popularity: C, Success rate: average Level: 2
A board contains 48 triangular cells. In each cell is written a digit in a range from 0
through 9. Every cell belongs to two or three lines. These lines are marked by letters
from A through L. See the ﬁgure below, where the cell containing digit 9 belongs to
lines D, G, and I and the cell containing digit 7 belongs to lines B and I.

For each line, we can measure the largest digit on the line. Here the largest digit for
line A is 5, B is 7, E is 6, H is 0, and J is 8.
Write a program that reads the largest digit for all 12 of the depicted lines and
computes the smallest and the largest possible sums of all digits on the board.

Input
Every line in the input contains 12 digits, each separated by a space. The ﬁrst of these
digits describes the largest digit in line A, the second in line B, and so on, until the last
digit denotes the largest one in line L.

Output
For each input line, print the value of the smallest and largest sums of digits possible
for the given board. These two values should appear on the same line and be separated
by exactly one space. If there does not exist a solution, your program must output “NO
SOLUTION”.

Sample Input
5 7 8 9 6 1 9 0 9 8 4 6

Sample Output
40 172

12.6.4 Bee Maja

PC/UVa IDs: 111204/10182, Popularity: B, Success rate: high Level: 2
Maja is a bee. She lives in a hive of hexagonal honeycombs with thousands of other
bees. But Maja has a problem. Her friend Willi told her where she can meet him, but
Willi (a male drone) and Maja (a female worker) have diﬀerent coordinate systems:
• Maja’s Coordinate System — Maja (on left) ﬂies directly to a special honeycomb
using an advanced two-dimensional grid over the whole hive.
• Willi’s Coordinate System — Willi (on right) is less intelligent, and just walks
around cells in clockwise order starting from 1 in the middle of the hive.

[Figure: Maja’s system (left) and Willi’s system (right).]

Help Maja to convert Willi’s system to hers. Write a program which for a given
honeycomb number returns the coordinates in Maja’s system.

Input
The input ﬁle contains one or more integers each standing on its own line. All honeycomb
numbers are less than 100,000.

Output
Output the corresponding Maja coordinates for Willi’s numbers, with each coordinate
pair on a separate line.

Sample Input
1
2
3
4
5

Sample Output
0 0
0 1
-1 1
-1 0
0 -1

12.6.5 Robbery

PC/UVa IDs: 111205/707, Popularity: B, Success rate: average Level: 3
Inspector Robostop is very angry. Last night, a bank was robbed and the robber
escaped. As quickly as possible, all roads leading out of the city were blocked, making
it impossible for the robber to escape. The inspector then asked everybody in the city
to watch out for the robber, but the only messages he got were “We don’t see him.”
Robostop is determined to discover exactly how the robber escaped. He asks you to
write a program which analyzes all the inspector’s information to ﬁnd out where the
robber was at any given time.
The city in which the bank was robbed has a rectangular shape. All roads leaving
the city were blocked for a certain period of time t, during which several observations
of the form “The robber isn’t in the rectangle Ri at time ti ” were reported. Assuming
that the robber can move at most one unit per time step, try to ﬁnd the exact position
of the robber at each time step.

Input
The input ﬁle describes several robberies. The ﬁrst line of each description consists of
three numbers W , H, and t (1 ≤ W, H, t ≤ 100), where W is the width, H the height
of the city, and t is the length of time during which the city is locked.
The next line contains a single integer n (0 ≤ n ≤ 100), where n is the number of
messages the inspector received. The next n lines each consist of ﬁve integers ti , Li , Ti ,
Ri , Bi , where ti is the time at which the observation has been made (1 ≤ ti ≤ t), and
Li , Ti , Ri , Bi are the left, top, right, and bottom, respectively, of the rectangular area
which has been observed. The point (1, 1) is the upper-left-hand corner, and (W, H) is
the lower-right-hand corner of the city. The messages mean that the robber was not in
the given rectangle at time ti .
The input is terminated by a test case starting with W = H = t = 0. This case
should not be processed.

Output
For each robbery, output the line “Robbery #k:”, where k is the number of the robbery.
Then, there are three possibilities:
If it is impossible that the robber is still in the city, output “The robber has
escaped.”
In all other cases, assume that the robber is still in the city. Output one line of the
form “Time step τ : The robber has been at x,y.” for each time step in which the
exact location can be deduced, and x and y are the column and row, respectively, of
the robber in time step τ . Output these lines ordered by time τ .
If nothing can be deduced, output the line “Nothing known.” and hope that the
inspector does not get even angrier.
Print a blank line after each processed case.

Sample Input
4 4 5
4
1 1 1 4 3
1 1 1 3 4
4 1 1 3 4
4 4 2 4 4
10 10 3
1
2 1 1 10 10
0 0 0

Sample Output
Robbery #1:
Time step 1: The robber has been at 4,4.
Time step 2: The robber has been at 4,3.
Time step 3: The robber has been at 4,2.
Time step 4: The robber has been at 4,1.

Robbery #2:
The robber has escaped.

12.6.6 (2/3/4)-D Sqr/Rects/Cubes/Boxes?

PC/UVa IDs: 111206/10177, Popularity: B, Success rate: high Level: 2
How many squares and rectangles are hidden in the 4 × 4 grid below? Maybe you
can count them by hand for such a small grid, but what about for a 100 × 100 grid or even
larger?
What about higher dimensions? Can you count how many cubes or boxes of diﬀerent
size there are in a 10 × 10 × 10 cube, or how many hypercubes and hyperboxes there
are in a four-dimensional 5 × 5 × 5 × 5 hypercube?
Your program needs to be eﬃcient, so be clever. You should assume that squares are
not rectangles, cubes are not boxes, and hypercubes are not hyperboxes.

[Figure: a 4 × 4 grid (left) and a 4 × 4 × 4 cube (right).]

Input
The input contains one integer N (0 ≤ N ≤ 100) in each line, which is the length of
one side of the grid, cube, or hypercube. In the example above N = 4. There may be
as many as 100 lines of input.

Output
For each line of input, output six integers S2, R2, S3, R3, S4, R4 on a single line, where
S2 denotes the number of squares and R2 the number of rectangles occurring in a
two-dimensional (N × N) grid. The integers S3, R3, S4, R4 denote similar quantities in
higher dimensions.

Sample Input
1
2
3

Sample Output
1 0 1 0 1 0
5 4 9 18 17 64
14 22 36 180 98 1198

12.6.7 Dermuba Triangle

PC/UVa IDs: 111207/10233, Popularity: C, Success rate: high Level: 2
Dermuba Triangle is the universally-famous ﬂat and triangular region in the L-PAX
planet in the Geometria galaxy. The people of Dermuba live in equilateral-triangular
ﬁelds with sides exactly equal to 1 km. Houses are always built at the circumcenters of
the triangular ﬁelds. Their houses are numbered as shown in the ﬁgure below.

When Dermubian people visit each other, they follow the shortest path from their
house to the destination house. This shortest path is obviously the straight-line distance
that connects these two houses. Now comes your task. You have to write a program
which computes the length of the shortest path between two houses given their house
numbers.

Input
The input consists of several lines with two non-negative integer values n and m which
specify the start and destination house numbers, where 0 ≤ n, m ≤ 2,147,483,647.

Output
For each line in the input, print the shortest distance between the given houses in
kilometers rounded oﬀ to three decimal places.

Sample Input
0 7
2 8
9 10
10 11

Sample Output
1.528
1.528
0.577
0.577

12.6.8 Airlines

PC/UVa IDs: 111208/10075, Popularity: C, Success rate: high Level: 3
A leading airline has hired you to write a program that answers the following query:
given lists of city locations and direct ﬂights, what is the minimum distance a passenger
needs to ﬂy to get from one given city to another? The city locations are speciﬁed by
latitude and longitude.
To get from a city to another a passenger may take a direct ﬂight if one exists;
otherwise he must take a sequence of connecting ﬂights.
Assume that if a passenger takes a direct ﬂight from X to Y he never ﬂies more than
the geographical distance between X and Y. The geographical distance between two
locations X and Y is the length of the geodetic line segment connecting X and Y. The
geodetic line segment between two points on a sphere is the shortest connecting curve
lying entirely on the surface of the sphere. Assume that the Earth is a perfect sphere of
radius exactly 6,378 km, and that the value of π is approximately 3.141592653589793.
Round the geographical distance between every pair of cities to the nearest integer.

Input
The input may contain multiple test cases. The ﬁrst line of each test case contains three
integers N ≤ 100, M ≤ 300, and Q ≤ 10,000, where N indicates the number of cities,
M represents the number of direct ﬂights, and Q is the number of queries.
The next N lines contain the list of cities. The ith of these lines contains a string
ci followed by two real numbers lti and lni, representing the city name, latitude, and
longitude, respectively. The city name will be at most 20 characters and will not contain white-space characters. The latitude will be between −90° (South Pole) and +90°
(North Pole). The longitude will be between −180° and +180°, where negative (positive) numbers denote locations west (east) of the meridian passing through Greenwich,
England.
The next M lines contain the direct ﬂight list. The ith of these lines contains two
city names ai and bi , indicating that there exists a direct ﬂight from city ai to city bi .
Both city names will occur in the city list.
The next Q lines contain the query list. The ith of these lines will contain city names
ai and bi asking for the minimum distance a passenger needs to ﬂy to get from ai to bi .
Be assured that ai and bi are not equal and both city names will occur in the city list.
The input will terminate with three zeros for N , M , and Q.

Output
For each test case, ﬁrst output the test case number (starting from 1) as shown in the
sample output. Then for each input query, print a line giving the shortest distance (in
km) a passenger needs to ﬂy to get from the ﬁrst city (ai ) to the second one (bi ). If
there exists no route from ai to bi, just print the line “no route exists”.
Print a blank line between two consecutive test cases.

Sample Input
3 4 2
Dhaka 23.8500 90.4000
Chittagong 22.2500 91.8333
Calcutta 22.5333 88.3667
Dhaka Calcutta
Calcutta Dhaka
Dhaka Chittagong
Chittagong Dhaka
Chittagong Calcutta
Dhaka Chittagong
5 6 3
Baghdad 33.2333 44.3667
Dhaka 23.8500 90.4000
Frankfurt 50.0330 8.5670
Hong_Kong 21.7500 115.0000
Tokyo 35.6833 139.7333
Baghdad Dhaka
Dhaka Frankfurt
Tokyo Hong_Kong
Hong_Kong Dhaka
Baghdad Tokyo
Frankfurt Tokyo
Dhaka Hong_Kong
Frankfurt Baghdad
Baghdad Frankfurt
0 0 0

Sample Output
Case #1
485 km
231 km
Case #2
19654 km
no route exists
12023 km


12.7 Hints
12.1 Do we need to walk oﬀ the entire path explicitly, or can we compute the ﬁnal
square via some formula?
12.2 What is the right underlying graph to capture the color structure?
12.3 Can we compute the upper and lower bounds for each digit in isolation?
12.4 If we cannot ﬁnd a formula to compute locations under Willi’s system, how can
we best simulate his traversal using an explicit data structure?
12.5 What is the right underlying graph to capture both time and space?
12.6 How do the 2-D and 3-D face incidence formulas generalize to 4-D? Is every
hypercube still speciﬁed by two corner points?
12.7 How do we convert between our previous triangular coordinate systems and this
new one?
12.8 Do the distances derived from your longitude/latitude computations make sense,
or is there a bug? What is the underlying graph problem?

13
Geometry

Above the gateway to Plato’s academy appeared the inscription, “Let no one who is
ignorant of geometry enter here.” The organizers of programming competitions feel
much the same way, for you can count on seeing at least one geometric problem at
every contest.
Geometry is an inherently visual discipline, one that mandates drawing pictures and
studying them carefully. Part of the difficulty of geometric programming is that certain
“obvious” operations you do with a pencil, such as finding the intersection of two lines,
require non-trivial programming to do correctly with a computer.
Geometry is a subject which everybody studies in high school but which often turns
rusty with time. In this chapter, we will refresh your knowledge with programming
problems associated with “real” geometry – lines, points, circles, and so forth. After
solving a few of these problems you should feel conﬁdent enough to walk through Plato’s
academy once again.
There is more geometry to come. We defer problems associated with line segments
and polygons to Chapter 14.

13.1 Lines
Straight lines are the shortest distance between any two points. Lines are of inﬁnite
length in both directions, as opposed to line segments, which are ﬁnite. We limit our
discussion here to lines in the plane.
• Representation — Lines can be represented in two different ways, as either pairs
of points or as equations. Every line l is completely represented by any pair of
points (x1, y1) and (x2, y2) which lie on it. Lines are also completely described
by equations such as y = mx + b, where m is the slope of the line and b is the
y-intercept, i.e., the unique point (0, b) where it crosses the y-axis. The line l has
slope m = ∆y/∆x = (y1 − y2)/(x1 − x2) and intercept b = y1 − mx1.
Vertical lines cannot be described by such equations, however, because dividing by ∆x means dividing by zero. The equation x = c denotes a vertical line that
crosses the x-axis at the point (c, 0). This special case, or degeneracy, requires
extra attention when doing geometric programming. We use the more general
formula ax + by + c = 0 as the foundation of our line type because it covers all
possible lines in the plane:
typedef struct {
        double a;               /* x-coefficient */
        double b;               /* y-coefficient */
        double c;               /* constant term */
} line;

Multiplying these coeﬃcients by any non-zero constant yields an alternate representation for any line. We establish a canonical representation by insisting that
the y-coeﬃcient equal 1 if it is non-zero. Otherwise, we set the x-coeﬃcient to 1:
void points_to_line(point p1, point p2, line *l)
{
        if (p1[X] == p2[X]) {
                l->a = 1;
                l->b = 0;
                l->c = -p1[X];
        } else {
                l->b = 1;
                l->a = -(p1[Y]-p2[Y])/(p1[X]-p2[X]);
                l->c = -(l->a * p1[X]) - (l->b * p1[Y]);
        }
}

void point_and_slope_to_line(point p, double m, line *l)
{
        l->a = -m;
        l->b = 1;
        l->c = -((l->a*p[X]) + (l->b*p[Y]));
}
• Intersection — Two distinct lines have one intersection point unless they are
parallel, in which case they have none. Parallel lines share the same slope but
have different intercepts and by definition never cross.
bool parallelQ(line l1, line l2)
{
return ( (fabs(l1.a-l2.a) <= EPSILON) &&
(fabs(l1.b-l2.b) <= EPSILON) );
}
bool same_lineQ(line l1, line l2)
{
return ( parallelQ(l1,l2) && (fabs(l1.c-l2.c) <= EPSILON) );
}
A point (x′, y′) lies on a line l : y = mx + b if plugging x′ into the formula for x
yields y′. The intersection point of lines l1 : y = m1 x + b1 and l2 : y = m2 x + b2
is the point where they are equal, namely,
\[ x = \frac{b_2 - b_1}{m_1 - m_2}, \qquad y = m_1 \frac{b_2 - b_1}{m_1 - m_2} + b_1 \]

void intersection_point(line l1, line l2, point p)
{
        if (same_lineQ(l1,l2)) {
                printf("Warning: Identical lines, all points intersect.\n");
                p[X] = p[Y] = 0.0;
                return;
        }

        if (parallelQ(l1,l2) == TRUE) {
                printf("Error: Distinct parallel lines do not intersect.\n");
                return;
        }

        p[X] = (l2.b*l1.c - l1.b*l2.c) / (l2.a*l1.b - l1.a*l2.b);

        if (fabs(l1.b) > EPSILON)       /* test for vertical line */
                p[Y] = - (l1.a * (p[X]) + l1.c) / l1.b;
        else
                p[Y] = - (l2.a * (p[X]) + l2.c) / l2.b;
}
• Angles — Any two non-parallel lines intersect each other at a given angle. Lines
l1 : a1 x + b1 y + c1 = 0 and l2 : a2 x + b2 y + c2 = 0, written in the general form,
intersect at the angle θ given by:
\[ \tan\theta = \frac{a_1 b_2 - a_2 b_1}{a_1 a_2 + b_1 b_2} \]

For lines in slope-intercept form this reduces to tan θ = (m2 − m1 )/(m1 m2 + 1).

Two lines are perpendicular if they cross at right angles to each other. For example,
the x-axis and y-axis of a rectilinear coordinate system are perpendicular, as are
the lines y = x and y = −x. The line perpendicular to l : y = mx + b (with m ≠ 0) is
y = (−1/m)x + b′, for all values of b′.
• Closest Point — A very useful subproblem is identifying the point on line l which
is closest to a given point p. This closest point lies on the line through p which
is perpendicular to l, and hence can be found using the routines we have already
developed:
void closest_point(point p_in, line l, point p_c)
{
        line perp;              /* perpendicular to l through (x,y) */

        if (fabs(l.b) <= EPSILON) {     /* vertical line */
                p_c[X] = -(l.c);
                p_c[Y] = p_in[Y];
                return;
        }

        if (fabs(l.a) <= EPSILON) {     /* horizontal line */
                p_c[X] = p_in[X];
                p_c[Y] = -(l.c);
                return;
        }

        point_and_slope_to_line(p_in,1/l.a,&perp); /* normal case */
        intersection_point(l,perp,p_c);
}
• Rays — These are half-lines originating from some vertex v, called the origin.
Any ray is completely described by a line equation, origin, and direction or the
origin and another point on the ray.

13.2 Triangles and Trigonometry
An angle is the union of two rays sharing a common endpoint. Trigonometry is the
branch of mathematics dealing with angles and their measurement.
There are two common units used to measure angles, radians and degrees. The entire
range of angles spans from 0 to 2π radians or, equivalently, 0 to 360 degrees. Using
radians is better computationally, because the trigonometric libraries we will see in
Section 13.5 assume angles are measured in radians. However, we confess that we think
more naturally in degrees. Historically, fractional parts of angles measured in degrees
are given in minutes, or 1/60th of a degree. But working in degrees and minutes is
hopeless, which is why radians (or at least decimal degrees) are the preferred measure.
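Converting between these units takes only a few lines. The helpers below are an illustrative sketch with names of our own choosing (deg_min_to_decimal, degrees_to_radians, radians_to_degrees); they assume non-negative degree and minute values, with the hemisphere or sign handled by the caller.

```c
#define PI 3.141592653589793

/* Degrees plus minutes (1/60th of a degree) to decimal degrees. */
double deg_min_to_decimal(double degrees, double minutes)
{
        return degrees + minutes / 60.0;
}

/* Decimal degrees to radians: 180 degrees equals pi radians. */
double degrees_to_radians(double degrees)
{
        return degrees * PI / 180.0;
}

/* Radians back to decimal degrees. */
double radians_to_degrees(double radians)
{
        return radians * 180.0 / PI;
}
```

A latitude of 40 degrees 47 minutes, for example, becomes roughly 40.7833 decimal degrees before being passed to degrees_to_radians.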

Figure 13.1. Defining sine and cosine (l). Labeling the edges of a right triangle (r).

The geometry of triangles (“three angles”) is intimately related to trigonometry, so
we will discuss them together in the sections below.

13.2.1 Right Triangles and the Pythagorean Theorem

A right angle measures 90° or π/2 radians. Right angles are formed at the intersection
of two perpendicular lines, such as rectilinear coordinate axes. Such lines divide the
360° = 2π radian space into four right angles.
Each pair of rays with a common endpoint actually defines two angles, an internal
angle of a radians and an external angle of 2π − a radians. The internal angles are
the ones we usually claim to be interested in. The three internal (smaller) angles of
any triangle add up to 180° = π radians, meaning that the average internal angle
is 60° = π/3 radians. Triangles with three equal angles are called equilateral, as was
discussed in Section 12.2.
A triangle is called a right triangle if it contains a right internal angle. Right triangles
are particularly easy to work with because of the Pythagorean theorem, which enables
us to calculate the length of the third side of any right triangle given the lengths of the
other two. Specifically, |a|^2 + |b|^2 = |c|^2, where a and b are the two shorter sides, and
c is the longest side or hypotenuse.
One can go a long way in analyzing triangles with just the Pythagorean theorem.
But we can go even farther using trigonometry.

13.2.2 Trigonometric Functions

The trigonometric functions cosine and sine are defined as the x- and y-coordinates,
respectively, of points on the unit circle centered at (0, 0), as shown in Figure 13.1(l).
Thus the values of sine and cosine range from −1 to 1. Further, the two functions are
really the same thing, since cos(θ) = sin(θ + π/2).
A third important trigonometric function is the tangent, deﬁned as the ratio of sine
over cosine. Thus tan(θ) = sin(θ)/ cos(θ), which is well-deﬁned except when cos(θ) = 0
at θ = π/2 and θ = 3π/2.

Figure 13.2. Notation for solving triangles (l) and computing their area (r).

These functions are important, because they enable us to relate the lengths of any
two sides of a right triangle T with the non-right angles of T . Recall that the hypotenuse
of a right triangle is the longest edge in T , that edge opposite the right angle. The other
two edges in T can be labeled as opposite or adjacent edges in relation to a given angle
a, as shown in Figure 13.1(r). Then
\[ \cos(a) = \frac{|\text{adjacent}|}{|\text{hypotenuse}|}, \qquad \sin(a) = \frac{|\text{opposite}|}{|\text{hypotenuse}|}, \qquad \tan(a) = \frac{|\text{opposite}|}{|\text{adjacent}|} \]

These relations are worth remembering. As a mnemonic, we use the name of the
famous Indian Chief Soh-Cah-Toa, where each syllable of his name encodes a diﬀerent
relation. “Cah” means Cosine equals Adjacent over Hypotenuse, for example.
Chief Soh-Cah-Toa would not be of much use without inverse functions mapping
cos(θ), sin(θ), and tan(θ) back to the original angles. These inverse functions are called
arccos, arcsin, and arctan, respectively. With them, we can easily compute the remaining
angles of any right triangle given two side lengths.
These trigonometric functions are properly computed using Taylor series expansions,
but don’t worry: the math library of your favorite programming language already includes them. Trigonometric functions tend to be numerically unstable, so be careful. Do
not expect that θ exactly equals arcsin(sin(θ)), particularly for large or small angles.

13.2.3 Solving Triangles

Two powerful trigonometric formulae enable us to compute important properties of
triangles. The Law of Sines provides the relationship between sides and angles in any
triangle. For angles A, B, C, and opposing edges a, b, c (shown in Figure 13.2(l)),
\[ \frac{a}{\sin A} = \frac{b}{\sin B} = \frac{c}{\sin C} \]
The Law of Cosines is a generalization of the Pythagorean theorem beyond right
angles. For any triangle with angles A, B, C, and opposing edges a, b, c,
\[ a^2 = b^2 + c^2 - 2bc \cos A \]

Solving triangles is the art of deriving the unknown angles and edge lengths of a
triangle given a subset of such measurements. Such problems fall into two categories:
• Given two angles and a side, find the rest — Finding the third angle is easy, since
the three angles must sum to 180° = π radians. The Law of Sines then gives us a
way to find the missing edge lengths.
• Given two sides and an angle, ﬁnd the rest — If the angle lies between the two
edges, the Law of Cosines gives us the way to ﬁnd the remaining edge length.
The Law of Sines then enables us to mop up the unknown angles. Otherwise, we
can use the Law of Sines and the angle sum property to determine all angles, and
then the Law of Sines to get the remaining edge length.
The area A(T ) of a triangle T is given by A(T ) = (1/2)ab, where a is the altitude and
b is the base of the triangle. The base is any one of the sides while the altitude is the
distance from the third vertex to this base, as shown in Figure 13.2(r). This altitude
can be easily calculated via trigonometry or the Pythagorean theorem, depending on
what is known about the triangle.
Another approach to computing the area of a triangle is directly from its coordinate
representation. Using linear algebra and determinants, it can be shown that the signed
area A(T ) of triangle T = (a, b, c) is

\[ 2 \cdot A(T) = \begin{vmatrix} a_x & a_y & 1 \\ b_x & b_y & 1 \\ c_x & c_y & 1 \end{vmatrix} = a_x b_y - a_y b_x + a_y c_x - a_x c_y + b_x c_y - c_x b_y \]

This formula generalizes nicely to compute d! times the volume of a simplex in d
dimensions.
Note that the signed areas can be negative, so we must take the absolute value to
compute the actual area. This is a feature, not a bug. We will see how to use the sign
of this area to build important primitives for computational geometry in Section 14.1.
double signed_triangle_area(point a, point b, point c)
{
return( (a[X]*b[Y] - a[Y]*b[X] + a[Y]*c[X]
- a[X]*c[Y] + b[X]*c[Y] - c[X]*b[Y]) / 2.0 );
}
double triangle_area(point a, point b, point c)
{
return( fabs(signed_triangle_area(a,b,c)) );
}
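A small worked example shows how the sign behaves. The fragment below repeats signed_triangle_area so that it stands alone; the point typedef and the X and Y macros are assumptions about declarations made elsewhere in the book, and counterclockwise_order is a hypothetical helper added only for illustration.

```c
typedef double point[2];        /* assumed representation of a point */
#define X 0
#define Y 1

double signed_triangle_area(point a, point b, point c)
{
        return( (a[X]*b[Y] - a[Y]*b[X] + a[Y]*c[X]
                - a[X]*c[Y] + b[X]*c[Y] - c[X]*b[Y]) / 2.0 );
}

/* Positive signed area means a, b, c appear in counterclockwise order. */
int counterclockwise_order(point a, point b, point c)
{
        return signed_triangle_area(a, b, c) > 0.0;
}
```

For a = (0, 0), b = (4, 0), c = (0, 3) the signed area is 6; swapping b and c traverses the same triangle clockwise and gives −6.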

13.3 Circles
A circle is defined as the set of points at a given distance (or radius) r from its center,
(xc, yc). A disk is a circle plus its interior, i.e., the set of points a distance at most r from
its center.
• Representation — A circle can be represented in two basic ways, either as triples of
boundary points or by its center/radius. For most applications, the center/radius
representation is most convenient:
typedef struct {
        point c;                /* center of circle */
        double r;               /* radius of circle */
} circle;

The equation of a circle follows directly from its center/radius representation.
Since the distance between two points is defined by \(\sqrt{(x_1 - x_2)^2 + (y_1 - y_2)^2}\),
the equation of a circle of radius r is \(r = \sqrt{(x - x_c)^2 + (y - y_c)^2}\) or, equivalently,
\(r^2 = (x - x_c)^2 + (y - y_c)^2\) to get rid of the root.
• Circumference and Area — Many important quantities associated with circles
are easy to compute. Both the area A and boundary length (circumference) C of
a circle depend on the magical constant π ≈ 3.1415926. Specifically, A = πr^2 and
C = 2πr. Memorizing π to many more digits is a good way to prove you are a
geek. The diameter, or longest straight-line distance within the circle, is simply
2r.
• Tangents — A line l most likely intersects the boundary of circle c at either
zero or two points; the ﬁrst case meaning it misses c entirely and the second case
meaning it crosses the interior of c. The only remaining case is when the line l
intersects the boundary of c but not its interior. Such lines are called tangent
lines.
The construction of a line l tangent to c through a given point O is illustrated
in Figure 13.3. The point of contact between c and l lies on the line perpendicular
to l through the center of c. Since the triangle with side lengths r, d, and x is a right
triangle, we can compute the unknown tangent length x using the Pythagorean
theorem. From x, we can compute either the tangent point or the angle a. The
distance d from O to the center is computed using the distance formula.
• Interacting Circles — Two circles c1 and c2 of distinct radii r1 and r2 can interact
in several ways. The circles will intersect if and only if the distance between their
centers is at most r1 + r2 . The smaller circle (say, c1 ) will be completely contained
within c2 if and only if the distance between their centers plus r1 is at most
r2 . The remaining case is if c1 and c2 intersect each other’s boundaries at two
points. As shown in Figure 13.4, the points of intersection form triangles with the
two centers whose edge lengths are totally determined (r1 , r2 , and the distance
between the centers), so the angles and coordinates can be computed as needed.


Figure 13.3. Constructing the line tangent to a circle through O.

Figure 13.4. The intersection points of two circles.

13.4 Program Design Example: Faster Than a Speeding Bullet
Superman has at least two powers that normal mortals do not possess, namely, x-ray
vision and the ability to ﬂy faster than a speeding bullet. Some of his other skills are
not so impressive: you or I could probably change clothes in a telephone booth if we
put our minds to it.
Superman seeks to demonstrate his powers between his current position s = (xs , ys )
and a target position t = (xt , yt ). The environment is ﬁlled with circular (or cylindrical)
obstacles. Superman's x-ray vision does not have unlimited range, being bounded by
the amount of material he has to see through. He is eager to compute the total obstacle
intersection length between the two points to know whether to attempt this trick.

Figure 13.5. Superman's flight plan, with associated x-ray thickness.

Failing this, the Man of Steel would like to fly between his current position and the
target. He can see through objects, but not fly through them. His desired path (Figure
13.5) flies straight to the goal, until it bumps into an object. At this point, he flies along
the boundary of the circle until he returns to the straight line linking his
start and end positions. This is not the shortest obstacle-free path, but Superman is
not completely stupid – he always takes the shorter of the two arcs around the circle.
You may assume that none of the circular obstacles intersect each other, and that
both the start and target positions lie outside of obstacles. Circles are speciﬁed by giving
the center coordinates and radius.
————————————
Solution starts below
————————————
Solving this problem requires three basic geometric operations. We need to be able
to (1) test whether a given circle intersects a given line l between the start and target
points, (2) compute the length of the chord intersecting l and circle, and (3) compute
the arc length around the smaller piece of a circle cut by l.
The ﬁrst task is relatively easy. Find the length of the shortest distance from the
center of the circle to l. If this is less than the radius, they intersect; if not, they don’t.
To test whether this intersection occurs between s and t, it suﬃces to check if the point
on l closest to the center of the circle lies within the box deﬁned by s and t.
Measuring the consequences of the intersection appears more diﬃcult. One approach
would be to start by computing the coordinates of the intersection points between the
line and circle. Although this could be done by setting the circle and line equations
equal and solving the resulting quadratic equation, it will be a mess. There is usually
a simpler way to solve a geometric problem than explicitly ﬁnding the coordinates of
points.
Such a simpler way is revealed in Figure 13.6. The length of chord intersection is
equal to 2x on the diagram. We know that d, the shortest distance from l to the center,
lies on a line perpendicular to l. Thus all four angles at the intersection are right
angles, including the two angles incident on the triangles with sides r, d, and x. We can
now obtain x via an application of the Pythagorean theorem.

Figure 13.6. Computing chord and arc lengths for line-circle intersection.

The arc length of the shorter walk around the circle can be obtained from angle a of
this triangle. The arc we are interested in is deﬁned by the angle 2a (in radians), and
so is (2a)/(2π) times the total circumference of the circle, which is just 2πr. The angle
is easily computed from the sides of the triangle using inverse trigonometric functions.
Viewed in this way, and using the subroutines developed earlier, the solution becomes
very simple:
point s;                /* Superman's initial position */
point t;                /* target position */
int ncircles;           /* number of circles */
circle c[MAXN];         /* circles data structure */

superman()
{
    line l;                 /* line from start to target position */
    point close;            /* closest point */
    double d;               /* distance from circle-center */
    double xray = 0.0;      /* length of intersection with circles */
    double around = 0.0;    /* length around circular arcs */
    double angle;           /* angle subtended by arc */
    double travel;          /* total travel distance */
    int i;                  /* counter */
    double acos(), sqrt();
    double distance();

    points_to_line(s,t,&l);

    for (i=1; i<=ncircles; i++) {
        closest_point(c[i].c,l,close);
        d = distance(c[i].c,close);
        if ((d>=0) && (d < c[i].r) && point_in_box(close,s,t)) {
            xray += 2*sqrt(c[i].r*c[i].r - d*d);
            angle = acos(d/c[i].r);
            around += ((2*angle)/(2*PI)) * (2*PI*c[i].r);
        }
    }

    travel = distance(s,t) - xray + around;
    printf("Superman sees thru %7.3lf units, and flies %7.3lf units\n",
           xray, travel);
}

13.5 Trigonometric Function Libraries
The trigonometric libraries of diﬀerent programming languages tend to be very similar.
Make sure you know whether your library works in degrees or radians and what angle
ranges are returned in the inverse trigonometric functions. Also find out which half-periods the inverse sin and cos functions assume. They cannot determine the angle over
the full 360° = 2π radian range but only a 180° = π radian period.

Trigonometric Libraries in C/C++
The standard C/C++ math library math.h has all the standard trigonometric functions.
Be sure to compile it with the math library included for successful operation:
#include <math.h>

double cos(double x);               /* compute the cosine of x radians */
double acos(double x);              /* compute the arc cosine of [-1,1] */

double sin(double x);               /* compute the sine of x radians */
double asin(double x);              /* compute the arc sine of [-1,1] */

double tan(double x);               /* compute the tangent of x radians */
double atan(double x);              /* compute the principal arctan of x */
double atan2(double y, double x);   /* compute the arc tan of y/x */
The primary reason for two diﬀerent arctan functions is correctly identifying which
of the four quadrants the angle is in. This depends on the signs of both x and y.

Trigonometric Libraries in Java
The Java trigonometric functions reside in java.lang.Math, and assume angles are
given in radians. Library functions are provided to convert between degrees and radians.
All functions are static, with functionality very similar to the C library:

double cos(double a)                 Return the trigonometric cosine of angle a.
double acos(double a)                Return the arc cosine of angle a, [0,pi].

double sin(double a)                 Return the trigonometric sine of angle a.
double asin(double a)                Return the arc sine of angle a, [-pi/2,pi/2].

double tan(double a)                 Return the trigonometric tangent of angle a.
double atan(double a)                Return the arc tangent of angle a, [-pi/2,pi/2].
double atan2(double a, double b)     Convert (b, a) to polar (r, theta).

double toDegrees(double angrad)      Convert a radian angle to degrees.
double toRadians(double angdeg)      Convert a degree angle to radians.


13.6 Problems
13.6.1 Dog and Gopher

PC/UVa IDs: 111301/10310, Popularity: A, Success rate: average Level: 1
A large ﬁeld has a dog and a gopher. The dog wants to eat the gopher, while the
gopher wants to run to safety through one of several gopher holes dug in the surface of
the ﬁeld.
Neither the dog nor the gopher is a math major; however, neither is entirely stupid.
The gopher decides on a particular gopher hole and heads for that hole in a straight
line at a ﬁxed speed. The dog, which is very good at reading body language, anticipates
which hole the gopher has chosen. The dog heads at double the speed of the gopher to
the hole. If the dog reaches the hole ﬁrst, the gopher gets gobbled up; otherwise, the
gopher escapes.
You have been retained by the gopher to select a hole through which it can escape,
if such a hole exists.

Input
The input ﬁle contains several sets of input. The ﬁrst line of each set contains one integer
and four ﬂoating point numbers. The integer n denotes how many holes are in the set.
The four ﬂoating point numbers denote the (x, y) coordinates of the gopher followed
by the (x, y) coordinates of the dog. The subsequent n lines of input each contain two
ﬂoating point numbers: the (x, y) coordinates of a gopher hole. All distances are in
meters to the nearest millimeter. The input is terminated by end of ﬁle and there is a
blank line between two consecutive sets.

Output
Print a single line for each set of input. If the gopher can escape, the output line should
read, “The gopher can escape through the hole at (x, y).” while identifying the
appropriate hole to the nearest millimeter. Otherwise, the output line should read, “The
gopher cannot escape.” If the gopher can escape through more than one hole, report
the one that appears ﬁrst in the input. There are at most 1,000 gopher holes in a set
of input and all coordinates range between –10,000 and +10,000.

Sample Input
1 1.000 1.000 2.000 2.000
1.500 1.500

2 2.000 2.000 1.000 1.000
1.500 1.500
2.500 2.500

Sample Output
The gopher cannot escape.
The gopher can escape through the hole at (2.500,2.500).

13.6.2 Rope Crisis in Ropeland!

PC/UVa IDs: 111302/10180, Popularity: B, Success rate: average Level: 2
Rope-pulling (also known as tug of war) is a very popular game in Ropeland, just
like cricket is in Bangladesh. Two groups of players hold diﬀerent ends of a rope and
pull. The group that snatches the rope from the other group is declared winner.
Due to a rope shortage, the king of the country has declared that groups will not be
allowed to buy longer ropes than they require.
Rope-pulling takes place in a large room, which contains a large round pillar of a
certain radius. If two groups are on the opposite side of the pillar, their pulled rope
cannot be a straight line. Given the position of the two groups, ﬁnd out the minimum
length of rope required to start rope-pulling. You can assume that a point represents
the position of each group.

Two groups with the round pillar
between them.

Two groups unaﬀected by the pillar.

Input
The ﬁrst line of the input ﬁle contains an integer N giving the number of input cases.
Then follow N lines, each containing ﬁve numbers X1 , Y1 , X2 , Y2 , and R, where (X1 , Y1 )
and (X2 , Y2 ) are the coordinates of the two groups and R > 0 is the radius of the pillar.
The center of the pillar is always at the origin, and you may assume that neither
team starts in the circle. All input values except for N are ﬂoating point numbers, and
all have absolute value ≤ 10,000.

Output
For each input set, output a ﬂoating point number on a new line rounded to the third
digit after the decimal point denoting the minimum length of rope required.

Sample Input
2
1 1 -1 -1 1
1 1 -1 1 1

Sample Output
3.571
2.000

13.6.3 The Knights of the Round Table

PC/UVa IDs: 111303/10195, Popularity: A, Success rate: average Level: 2
King Arthur is planning to build the round table in a room which has a triangular
window in the ceiling. He wants the sun to shine on his round table. In particular, he
wants the table to be totally in the sunlight when the sun is directly overhead at noon.
Thus the table must be built in a particular triangular region of the room. Of course,
the king wants to build the largest possible table under the circumstances.
As Merlin is out to lunch, write a program which ﬁnds the radius of the largest
circular table that ﬁts in the sunlit area.

Input
There will be an arbitrary number of test cases, each represented by three real numbers
(a, b, and c), which stand for the side lengths of the triangular region. No side length
will be greater than 1,000,000, and you may assume that max(a, b, c) ≤ (a + b + c)/2.
You must read until you reach the end of the ﬁle.

Output
For each room conﬁguration read, you must print the following line:
The radius of the round table is: r
where r is the radius of the largest round table that ﬁts in the sunlit area, rounded to
three decimal digits.

Sample Input
12.0 12.0 8.0

Sample Output
The radius of the round table is: 2.828

13.6.4 Chocolate Chip Cookies

PC/UVa IDs: 111304/10136, Popularity: C, Success rate: average Level: 3
Making chocolate chip cookies involves mixing ﬂour, salt, oil, baking soda, and chocolate chips to form dough, which is then rolled into a plane about 50 cm square. Circles
are cut from this plane, placed on a cookie sheet, and baked in an oven for about 20
minutes. When the cookies are done, they are removed from the oven and allowed to
cool before being eaten.
We are concerned here with the process of cutting the ﬁrst cookie after the dough
has been rolled. Each chip is visible in the planar dough, so we simply need to place
the cutter so as to maximize the number of chocolate chips contained in its perimeter.

Input
The input begins with a single positive integer on a line by itself indicating the number
of test cases. This line is followed by a blank line, and there is also a blank line between
two consecutive test cases.
Each input case consists of a number of lines, each containing two ﬂoating point
numbers indicating the (x, y) coordinates of a chip in the square surface of cookie
dough. Each coordinate is between 0.0 and 50.0 (cm). Each chip may be considered a
point; i.e., these are not President’s Choice cookies. All of the at most 200 chocolate
chips are at diﬀerent positions.

Output
For each test case, the output consists of a single integer: the maximum number of
chocolate chips that can be contained in a single cookie whose diameter is 5 cm. The
cookie need not be fully contained in the 50-cm square dough (i.e., it may have a ﬂat
side).
The output of two consecutive cases must be separated by a blank line.

Sample Input
1

4.0 4.0
4.0 5.0
5.0 6.0
1.0 20.0
1.0 21.0
1.0 22.0
1.0 25.0
1.0 26.0

Sample Output
4

13.6.5 Birthday Cake

PC/UVa IDs: 111305/10167, Popularity: C, Success rate: average Level: 2
Lucy and Lily are twins. Today is their birthday, so Mother buys them a birthday
cake. There are 2N cherries on the cake, where 1 ≤ N ≤ 50. Mother wants to cut the
cake into two halves with a single straight-line cut through the center so each twin gets
both the same amount of cake and the same number of cherries. Can you help her?

The cake has a radius of 100 and its center is located at (0, 0). The coordinates of each
cherry are given by two integers (x, y). You must give the line in the form Ax + By = 0,
where both A and B are integers in [−500, 500]. Cherries are not allowed to lie on the
cutline. There is at least one solution for each data set.

Input
The input ﬁle contains several test cases. The ﬁrst line of each case contains the integer
N . This is followed by 2N lines, where each line contains the (x, y) location of a cherry
with one space between them. The input ends with N = 0.

Output
For each test case, print a line containing A and B with a space between them. If there
are many solutions, any one will suﬃce.

Sample Input
2
-20 20
-30 20
-10 -50
10 -5
0

Sample Output
0 1

13.6.6 The Largest/Smallest Box ...

PC/UVa IDs: 111306/10215, Popularity: A, Success rate: average Level: 2
The following ﬁgure shows a rectangular card of width W , length L, and thickness 0.
Four x × x squares are cut from the four corners of the card shown by the dotted lines.
The card is then folded along the dashed lines to make a box without a cover.

Given the width and length of the card, find the values of x for which the box has
maximum and minimum volume.

Input
The input ﬁle contains several lines of input. Each line contains two positive ﬂoating
point numbers L (0 < L < 10,000) and W (0 < W < 10,000), which indicate the
length and width of the card, respectively.

Output
For each line of input, give one line of output containing two or more ﬂoating point
numbers separated by a single space. Each ﬂoating point number should contain three
digits after the decimal point. The ﬁrst number indicates the value which maximizes
the volume of the box, while the subsequent values (sorted in ascending order) indicate
the cut values which minimize the volume of the box.

Sample Input
1 1
2 2
3 3

Sample Output
0.167 0.000 0.500
0.333 0.000 1.000
0.500 0.000 1.500

13.6.7 Is This Integration?

PC/UVa IDs: 111307/10209, Popularity: A, Success rate: high Level: 3
The image below shows a square ABCD, where AB = BC = CD = DA = a. Four
arcs are drawn taking the four vertexes A, B, C, D as centers and a as the radius.
The arc that is drawn taking A as center starts at neighboring vertex B and ends at
neighboring vertex D. All other arcs are drawn in a similar fashion. Regions of three
diﬀerent shapes are created in this fashion. You must determine the total area of these
diﬀerent shaped regions.

Input
Each line of the input ﬁle contains a ﬂoating-point number a indicating the side length
of the square, where 0 ≤ a ≤ 10,000.0. Input is terminated by end of file.

Output
For each test case, output on a single line the area of the diﬀerent region types in the
image above. Each ﬂoating point number should be printed with three digits after the
decimal point. The ﬁrst number of each case will denote the area of the striped region,
the second number will denote the total area of the dotted regions, and the third number
will denote the rest of the area.

Sample Input
0.1
0.2
0.3

Sample Output
0.003 0.005 0.002
0.013 0.020 0.007
0.028 0.046 0.016

13.6.8 How Big Is It?

PC/UVa IDs: 111308/10012, Popularity: B, Success rate: low Level: 3
Ian is going to California and has to pack his things, including his collection of circles.
Given a set of circles, your program must ﬁnd the smallest rectangular box they ﬁt in.
All circles must touch the bottom of the box. The ﬁgure below shows an acceptable
packing for a set of circles, although it may not be the optimal packing for these
particular circles. In an ideal packing, each circle should touch at least one other circle,
but you probably ﬁgured that out.

Input
The ﬁrst line of input contains a single positive decimal integer n, n ≤ 50. This indicates
the number of test cases to follow. The subsequent n lines each contain a series of
numbers separated by spaces. The ﬁrst number on each of these lines is a positive
integer m, m ≤ 8, which indicates how many other numbers appear on that line. The
next m numbers on the line are the radii of the circles which must be packed in a single
box. These numbers need not be integers.

Output
For each test case, your program must output the size of the smallest rectangle which
can pack the circles. Each case should be output on a separate line by itself, with three
places after the decimal point. Do not output leading zeroes unless the number is less
than 1, e.g., 0.543.

Sample Input
3
3 2.0 1.0 2.0
4 2.0 2.0 2.0 2.0
3 2.0 1.0 4.0

Sample Output
9.657
16.000
12.657


13.7 Hints
13.1 Is the closest hole really the safest spot for the gopher?
13.2 Does it help to compute tangent lines to the pillar?
13.3 How many sides of the triangle must the table touch?
13.4 Can we always move the cutter circle to put some chips on its boundary? If so,
how many? Does this plus the radius define all “interesting” cutter placements?
13.5 There is always a solution to this problem if we remove the constraint that the
cutline have integer coordinates – can you prove it? Is there a more eﬃcient
solution than trying all possible A, B pairs?
13.6 What are the values of x for which the box has zero volume? Is calculus helpful
in maximizing the volume?
13.7 Can we use inclusion-exclusion to give us the area of complicated regions from
easy-to-compute parts?
13.8 Is it better to order the circles from largest to smallest, or to interleave them?
Does the order ever not matter? Will backtracking work for this problem?

14
Computational Geometry

Geometric computing has become increasingly important in applications such as
computer graphics, robotics, and computer-aided design, because shape is an inherent
property of real objects. But most real-world objects are not made of lines which go to
inﬁnity. Instead, most computer programs represent geometry as arrangements of line
segments. Arbitrary closed curves or shapes can be represented by ordered collections
of line segments or polygons.
Computational geometry can be deﬁned (for our purposes) as the geometry of discrete
line segments and polygons. It is a fun and interesting subject, but one not typically
taught in required college courses. This gives the ambitious student who learns a little
computational geometry a leg up on the competition, and a window into a fascinating
area of algorithms still under active research today. Excellent books on computational
geometry are available [O’R00, dBvKOS00], but this chapter should be enough to get
you started.

14.1 Line Segments and Intersection
A line segment s is the portion of a line l which lies between two given points inclusive.
Thus line segments are most naturally represented by pairs of endpoints:
typedef struct {
    point p1,p2;        /* endpoints of line segment */
} segment;

The most important geometric primitive on segments, testing whether a given pair
of them intersect, proves surprisingly complicated because of tricky special cases that
arise. Two segments may lie on parallel lines, meaning they do not intersect at all. One
segment may intersect at another's endpoint, or the two segments may lie on top of
each other so they intersect in a segment instead of a single point.
This problem of geometric special cases, or degeneracy, seriously complicates the
problem of building robust implementations of computational geometry algorithms.
Degeneracy can be a real pain in the neck to deal with. Read any problem speciﬁcation
carefully to see if it promises no parallel lines or overlapping segments. Without such
guarantees, however, you had better program defensively and deal with them.
The right way to deal with degeneracy is to base all computation on a small number
of carefully crafted geometric primitives. In Chapter 13, we implemented a general line
data type that successfully dealt with vertical lines, those of infinite slope. We can reap
the beneﬁts by generalizing our line intersection routines to line segments:
bool segments_intersect(segment s1, segment s2)
{
    line l1,l2;     /* lines containing the input segments */
    point p;        /* intersection point */

    points_to_line(s1.p1,s1.p2,&l1);
    points_to_line(s2.p1,s2.p2,&l2);

    if (same_lineQ(l1,l2))  /* overlapping or disjoint segments */
        return( point_in_box(s1.p1,s2.p1,s2.p2) ||
                point_in_box(s1.p2,s2.p1,s2.p2) ||
                point_in_box(s2.p1,s1.p1,s1.p2) ||
                point_in_box(s2.p2,s1.p1,s1.p2) );

    if (parallelQ(l1,l2)) return(FALSE);

    intersection_point(l1,l2,p);

    return(point_in_box(p,s1.p1,s1.p2) && point_in_box(p,s2.p1,s2.p2));
}
We will use our line intersection routines to ﬁnd an intersection point if one exists.
If so, the remaining question is whether this point lies within the region deﬁned by our
line segments. This is most easily tested by establishing whether the intersection point
lies in the bounding box around each line segment, which is deﬁned by the endpoints
of each segment:
bool point_in_box(point p, point b1, point b2)
{
return( (p[X] >= min(b1[X],b2[X])) && (p[X] <= max(b1[X],b2[X]))
&& (p[Y] >= min(b1[Y],b2[Y])) && (p[Y] <= max(b1[Y],b2[Y])) );
}


Segment intersection can also be cleanly tested using a primitive to check whether
three ordered points turn in a counterclockwise direction. Such a primitive is described
in the next section. However, we ﬁnd the point in box method more intuitive.
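For comparison, here is a rough sketch of the turn-based test mentioned above. It is not the routine used in this chapter: the names `turn()`, `in_box()`, and `segments_cross()` are hypothetical, the `point` typedef is assumed to match the one used earlier, and the comparisons are exact rather than epsilon-based.

```c
#define X 0
#define Y 1
typedef double point[2];    /* assumed to match the chapter's point type */

/* +1 if a,b,c turn counterclockwise, -1 if clockwise, 0 if collinear. */
static int turn(point a, point b, point c)
{
    double cross = (b[X]-a[X])*(c[Y]-a[Y]) - (b[Y]-a[Y])*(c[X]-a[X]);

    return (cross > 0) - (cross < 0);
}

/* Is p inside the bounding box spanned by b1 and b2? */
static int in_box(point p, point b1, point b2)
{
    double xlo = (b1[X] < b2[X]) ? b1[X] : b2[X];
    double xhi = (b1[X] < b2[X]) ? b2[X] : b1[X];
    double ylo = (b1[Y] < b2[Y]) ? b1[Y] : b2[Y];
    double yhi = (b1[Y] < b2[Y]) ? b2[Y] : b1[Y];

    return (p[X] >= xlo && p[X] <= xhi && p[Y] >= ylo && p[Y] <= yhi);
}

/* Do the closed segments (p1,p2) and (q1,q2) share at least one point? */
int segments_cross(point p1, point p2, point q1, point q2)
{
    int d1 = turn(q1, q2, p1);
    int d2 = turn(q1, q2, p2);
    int d3 = turn(p1, p2, q1);
    int d4 = turn(p1, p2, q2);

    /* each segment strictly straddles the line through the other */
    if (d1*d2 < 0 && d3*d4 < 0) return 1;

    /* an endpoint lies on the other segment's line and within its box */
    if (d1 == 0 && in_box(p1, q1, q2)) return 1;
    if (d2 == 0 && in_box(p2, q1, q2)) return 1;
    if (d3 == 0 && in_box(q1, p1, p2)) return 1;
    if (d4 == 0 && in_box(q2, p1, p2)) return 1;

    return 0;
}
```

This version never constructs a line equation or an intersection point, so parallel and vertical segments need no special handling beyond the four collinear checks.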

14.2 Polygons and Angle Computations
Polygons are closed chains of non-intersecting line segments. That they are closed means
the ﬁrst vertex of the chain is the same as the last. That they are non-intersecting means
that pairs of segments meet only at endpoints.
Polygons are the basic structure to describe shapes in the plane. Instead of explicitly
listing the segments (or edges) of polygon, we can implicitly represent them by listing
the n vertices in order around the boundary of the polygon. Thus a segment exists
between the ith and (i + 1)st points in the chain for 0 ≤ i ≤ n − 1. These indices are
taken mod n to ensure there is an edge between the ﬁrst and last point:
typedef struct {
    int n;                  /* number of points in polygon */
    point p[MAXPOLY];       /* array of points in polygon */
} polygon;

A polygon P is convex if any line segment defined by two points within P lies entirely
within P ; i.e., there are no notches or bumps such that the segment can exit and re-enter P . This implies that no internal angle in a convex polygon can be reflex; i.e., each is
at most 180° or π radians.
Actually computing the angle defined between three ordered points is a tricky problem. We can avoid the need to know actual angles in most geometric algorithms by
using the counterclockwise predicate ccw(a,b,c). This routine tests whether point c
lies to the left of the directed line which goes from point a to point b. If so, the points
a, b, and c make a counterclockwise (left) turn, hence
the name of the predicate. If not, the point either lies to the right of the directed line ab or the three
points are collinear.
These predicates can be computed using the signed_triangle_area() formula introduced in Section 13.2.3. Positive area results if point c is to the left of the directed line ab, and negative area results if c is to its right. Zero
area results if all three points are collinear. For robustness in the face of floating point
errors, we compare it to a tiny constant instead of zero. This is an imperfect solution;
building provably robust geometric code with floating point arithmetic is somewhere
between difficult and impossible. However, it is better than nothing.
bool ccw(point a, point b, point c)
{
    double signed_triangle_area();

    return (signed_triangle_area(a,b,c) > EPSILON);
}

bool cw(point a, point b, point c)
{
    double signed_triangle_area();

    return (signed_triangle_area(a,b,c) < -EPSILON);
}

bool collinear(point a, point b, point c)
{
    double signed_triangle_area();

    return (fabs(signed_triangle_area(a,b,c)) <= EPSILON);
}

Figure 14.1. The convex hull of a set of points (l), with the change in hull due to inserting the
rightmost point (r).
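As one possible use of these predicates, the sketch below tests whether a polygon is convex by checking that no three consecutive vertices make a clockwise turn. It assumes the vertices are listed in counterclockwise order; `is_convex()` is a hypothetical name, and the values chosen for `MAXPOLY` and `EPSILON` are assumptions made so the block stands alone.

```c
#define X 0
#define Y 1
#define MAXPOLY 200         /* assumed capacity for this sketch */
#define EPSILON 0.000001    /* assumed tolerance */

typedef double point[2];

typedef struct {
    int n;                  /* number of points in polygon */
    point p[MAXPOLY];       /* array of points in polygon */
} polygon;

static double signed_triangle_area(point a, point b, point c)
{
    return ((a[X]*b[Y] - a[Y]*b[X] + a[Y]*c[X]
           - a[X]*c[Y] + b[X]*c[Y] - c[X]*b[Y]) / 2.0);
}

/* Returns 1 if the counterclockwise vertex chain never turns clockwise. */
int is_convex(polygon *poly)
{
    int i;

    for (i = 0; i < poly->n; i++) {
        double area = signed_triangle_area(poly->p[i],
                                           poly->p[(i+1) % poly->n],
                                           poly->p[(i+2) % poly->n]);
        if (area < -EPSILON) return 0;      /* clockwise turn: a notch */
    }
    return 1;
}
```

The indices are taken mod n so that the triples wrapping around the end of the vertex array are checked as well.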

14.3 Convex Hulls
Convex hull is to computational geometry what sorting is to other algorithmic problems,
a ﬁrst step to apply to unstructured data so we can do more interesting things with it.
The convex hull C(S) of a set of points S is the smallest convex polygon containing S,
as shown in Figure 14.1(l).
There are almost as many diﬀerent algorithms for convex hull as there are for sorting.
The Graham’s scan algorithm for convex hull which we will implement ﬁrst sorts the
points in either angular or left-right order, and then incrementally inserts the points
into the hull in this sorted order. Previous hull points rendered obsolete by the last
insertion are then deleted.
Our implementation is based on the Gries and Stojmenović [GS87] version of Graham
scan, which sorts the vertices by angle around the leftmost-lowest point. Observe that
both the leftmost and lowest points must lie on the hull, because they cannot lie within
some other triangle of points. We use the second criterion to break ties for the first,
since there might be many different but equally leftmost points. Such considerations
are necessary to achieve robustness with degenerate input.
The main loop of the algorithm inserts the points in increasing angular order around
this initial point. Because of this ordering, the newly inserted point must sit on the
hull of the thus-far-inserted points. This new insertion may form a triangle containing
former hull points which now must be deleted. These points-to-be-deleted will sit at
the end of the chain as the most recent surviving insertions. The deletion criterion is
whether the new insertion makes a reflex angle (more than 180°) with the last two points on the chain
– recall that no internal angle of a convex polygon exceeds 180°. If the angle is too large, the
last point on the chain has to go. We repeat until a small enough angle is created or we
run out of points. We can use our ccw() predicate to test whether the angle is too big:
point first_point;      /* first hull point */

convex_hull(point in[], int n, polygon *hull)
{
    int i;              /* input counter */
    int top;            /* current hull size */
    bool smaller_angle();

    if (n <= 3) {       /* all points on hull! */
        for (i=0; i<n; i++) copy_point(in[i],hull->p[i]);
        hull->n = n;
        return;
    }

    sort_and_remove_duplicates(in,&n);
    copy_point(in[0],&first_point);

    qsort(&in[1], n-1, sizeof(point), smaller_angle);

    copy_point(first_point,hull->p[0]);
    copy_point(in[1],hull->p[1]);

    copy_point(first_point,in[n]);  /* sentinel for wrap-around */
    top = 1;
    i = 2;

    while (i <= n) {
        if (!ccw(hull->p[top-1], hull->p[top], in[i]))
            top = top-1;    /* top not on hull */
        else {
            top = top+1;
            copy_point(in[i],hull->p[top]);
            i = i+1;
        }
    }

    hull->n = top;
}
The beauty of this implementation is how naturally it avoids most of the problems
of degeneracy. A particularly insidious problem is when three or more input points are
collinear, particularly when one of these points is the leftmost-lowest hull point which
we started with. If we are not careful, we can include three collinear vertices on a hull
edge, where in fact only the endpoints belong on the hull.
We resolve this by breaking ties in sorting by angle according to the distance from
the initial hull point. By making sure the farthest of these collinear points is inserted
last, we ensure that it remains on the ﬁnal hull instead of its angular brethren:
bool smaller_angle(point *p1, point *p2)
{
if (collinear(first_point,*p1,*p2)) {
if (distance(first_point,*p1) <= distance(first_point,*p2))
return(-1);
else
return(1);
}
if (ccw(first_point,*p1,*p2))
return(-1);
else
return(1);
}
The remaining degenerate case concerns repeated points. What angle is deﬁned between three occurrences of the same point? To eliminate this problem, we remove
duplicate copies of points when we sort to identify the leftmost-lowest hull point:
sort_and_remove_duplicates(point in[], int *n)
{
    int i;                      /* counter */
    int oldn;                   /* number of points before deletion */
    int hole;                   /* index marked for potential deletion */
    bool leftlower();

    qsort(in, *n, sizeof(point), leftlower);

    oldn = *n;
    hole = 1;
    for (i=1; i<oldn; i++) {    /* the last point is tested for duplication too */
        if ((in[hole-1][X]==in[i][X]) && (in[hole-1][Y]==in[i][Y]))
            (*n)--;
        else {
            copy_point(in[i],in[hole]);
            hole = hole + 1;
        }
    }
}

bool leftlower(point *p1, point *p2)
{
    if ((*p1)[X] < (*p2)[X]) return (-1);
    if ((*p1)[X] > (*p2)[X]) return (1);

    if ((*p1)[Y] < (*p2)[Y]) return (-1);
    if ((*p1)[Y] > (*p2)[Y]) return (1);

    return(0);
}
There are a few ﬁnal things to note about convex hull. Observe the beautiful use
of sentinels to simplify the code. We copy the origin point at the end of the insertion
chain to avoid explicitly having to test for the wrap-around condition. We then implicitly
delete this duplicated point by setting the return count appropriately.
Finally, note that we sort the points by angle without ever actually computing angles.
The ccw predicate is enough to do the job.
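
To make the last point concrete, here is a minimal, self-contained sketch of how a signed-area test can order two points by angle around a pivot without calling any trigonometric function. The type pt and the names signed_area2, is_ccw, and angle_before are hypothetical and are not the routines used in the listings above; the sketch also assumes the pivot is the leftmost-lowest point, so that all other points lie within a half-plane as seen from it.

```c
#include <stdbool.h>

typedef struct { double x, y; } pt;   /* hypothetical point type for this sketch */

/* Twice the signed area of triangle (a,b,c): positive when a->b->c makes a
   counterclockwise turn, negative when clockwise, zero when collinear. */
double signed_area2(pt a, pt b, pt c)
{
    return (b.x - a.x) * (c.y - a.y) - (c.x - a.x) * (b.y - a.y);
}

bool is_ccw(pt a, pt b, pt c)
{
    return signed_area2(a, b, c) > 0.0;
}

/* True if p1 comes strictly before p2 in counterclockwise angular order
   around pivot. Assumes pivot is the leftmost-lowest point, so the other
   points span less than a half-turn; collinear pairs return false and
   need a separate tie-break (for example, by distance). */
bool angle_before(pt pivot, pt p1, pt p2)
{
    return is_ccw(pivot, p1, p2);
}
```

With the pivot at (0,0), angle_before reports that (1,0) precedes (1,1), and that (1,1) precedes (0,1), using only multiplications and subtractions.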

14.4 Triangulation: Algorithms and Related Problems
Finding the perimeter of a polygon is easy; just compute the length of each edge using
the Euclidean distance formula and add them together. Computing the area of irregular
blobs is somewhat harder. The most straightforward approach is to divide the polygon
into non-overlapping triangles and then sum the area of each triangle. The operation
of partitioning a polygon into triangles is called triangulation.
Triangulating a convex polygon is easy, for we can just connect a given vertex v to all
n − 1 other vertices like a fan. This doesn’t work for general polygons, however, because
the edges might go outside the polygon. We must carve up a polygon P into triangles
using non-intersecting chords which lie completely within P .
We can represent the triangulation either by listing the chords or, as we do here, with
an explicit list of the vertex indices in each triangle.
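
For the convex case, the fan construction can be sketched in a few lines. This is only an illustration under stated assumptions: fan_triangulate and MAXTRI are hypothetical names, the vertices are assumed to be numbered 0 to n-1 in boundary order, and the output uses the list-of-vertex-indices representation just described.

```c
#define MAXTRI 100   /* hypothetical capacity for this sketch */

/* Triangulate a convex polygon with n vertices (indexed 0..n-1 in boundary
   order) by fanning out from vertex 0. Writes one vertex-index triple per
   triangle into t and returns the number of triangles, which is n-2
   for 3 <= n <= MAXTRI+2 and 0 for n < 3. */
int fan_triangulate(int n, int t[][3])
{
    int i, count = 0;

    for (i = 1; i + 1 < n && count < MAXTRI; i++) {
        t[count][0] = 0;        /* the fan's apex */
        t[count][1] = i;
        t[count][2] = i + 1;
        count++;
    }
    return count;
}
```

A pentagon yields the three triangles (0,1,2), (0,2,3), and (0,3,4). The same loop applied to a non-convex polygon may produce triangles that leave the polygon, which is why the general case needs more care.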


Figure 14.2. Triangulating a polygon via the van Gogh (ear-cutting) algorithm, with triangles
labeled in order of insertion (A − I).

typedef struct {
    int n;                      /* number of triangles in triangulation */
    int t[MAXPOLY][3];          /* indices of vertices in triangulation */
} triangulation;

14.4.1 Van Gogh's Algorithm

Several polygon triangulation algorithms are known, the most eﬃcient of which run in
time linear in the number of vertices. But perhaps the simplest algorithm to program is
based on ear-cutting. An ear of a polygon P is a triangle deﬁned by a vertex v and its
left and right neighbors (l and r), such that the triangle (v, l, r) lies completely within
P.
Since lv and vr are boundary segments of P, the chord defining the ear is rl. Under
what conditions can this chord be in the triangulation? First, rl must lie completely
within the interior of P. To have a chance, the angle lvr must be convex, that is, less than 180°. Second, no
other segment of the polygon can be cut by this chord, for if so a bite will be taken out
of the triangle.
The important fact is that every polygon always contains an ear; in fact at least two
of them for n ≥ 3. This suggests the following algorithm. Test each one of the vertices
until we ﬁnd an ear. Adding the associated chord cuts the ear oﬀ, thus reducing the
number of vertices by one. The remaining polygon must also have an ear, so we can
keep cutting and recurring until only three vertices remain, leaving a triangle.
Testing whether a vertex deﬁnes an ear has two parts. For the angle test, we can
trot out our ccw/cw predicates again. We must take care that our expectations are
consistent with the vertex order of the polygon. We assume the vertices of the polygon
to be labeled in counterclockwise order around the virtual center, as in Figure 14.2.
Reversing the order of the polygon would require ﬂipping the sign on our angle test.
bool ear_Q(int i, int j, int k, polygon *p)
{
    triangle t;                 /* coordinates for points i,j,k */
    int m;                      /* counter */
    bool cw();

    copy_point(p->p[i],t[0]);
    copy_point(p->p[j],t[1]);
    copy_point(p->p[k],t[2]);

    if (cw(t[0],t[1],t[2])) return(FALSE);

    for (m=0; m<p->n; m++) {
        if ((m!=i) && (m!=j) && (m!=k))
            if (point_in_triangle(p->p[m],t)) return(FALSE);
    }

    return(TRUE);
}
For the segment-cutting test, it suﬃces to test whether there exists any vertex which
lies within the induced triangle. If the triangle is empty of points, the polygon must be
empty of segments because P does not self-intersect. Testing whether a given point lies
within a triangle will be discussed in Section 14.4.3.
Our main triangulation routine is thus limited to testing the earness of vertices, and
clipping them oﬀ once we ﬁnd them. A nice property of our array-of-points polygon
representation is that the two immediate neighbors of vertex i are easily found, namely,
in the (i − 1)st and (i + 1)st positions in the array. This data structure does not cleanly
support vertex deletion, however. To solve this problem, we deﬁne auxiliary arrays l
and r that point to the current left and right neighbors of every point remaining in the
polygon:
triangulate(polygon *p, triangulation *t)
{
    int l[MAXPOLY], r[MAXPOLY]; /* left/right neighbor indices */
    int i;                      /* counter */

    for (i=0; i<p->n; i++) {    /* initialization */
        l[i] = ((i-1) + p->n) % p->n;
        r[i] = ((i+1) + p->n) % p->n;
    }

    t->n = 0;
    i = p->n-1;
    while (t->n < (p->n-2)) {
        i = r[i];
        if (ear_Q(l[i],i,r[i],p)) {
            add_triangle(t,l[i],i,r[i],p);
            l[ r[i] ] = l[i];
            r[ l[i] ] = r[i];
        }
    }
}

14.4.2 Area Computations

We can compute the area of any triangulated polygon by summing the area of all
triangles. This is easy to implement using the routines we have already developed.
However, there is an even slicker algorithm based on the notion of signed areas for
triangles, which we used as the basis for our ccw routine. By properly summing the
signed areas of the triangles deﬁned by an arbitrary point p with each segment of
polygon P we get the area of P , because the negatively signed triangles cancel the area
outside the polygon. This computation simpliﬁes to the equation
$$A(P) = \frac{1}{2} \sum_{i=0}^{n-1} (x_i \cdot y_{i+1} - x_{i+1} \cdot y_i)$$

where all indices are taken modulo the number of vertices. Thus we don’t even need to
use our signed area routine! See [O’R00] for an exposition of why this works, but it
certainly leads to a simple solution:
double area(polygon *p)
{
    double total = 0.0;         /* total area so far */
    int i, j;                   /* counters */

    for (i=0; i<p->n; i++) {
        j = (i+1) % p->n;
        total += (p->p[i][X]*p->p[j][Y]) - (p->p[j][X]*p->p[i][Y]);
    }

    return(total / 2.0);
}
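
As a quick check of the formula, here is the sum worked by hand for an illustrative 2 × 2 square with vertices (0,0), (2,0), (2,2), (0,2), listed counterclockwise. The square is our own example, not one from the text.

```latex
\begin{align*}
2A(P) &= (0\cdot 0 - 2\cdot 0) + (2\cdot 2 - 2\cdot 0) + (2\cdot 2 - 0\cdot 2) + (0\cdot 0 - 0\cdot 2) \\
      &= 0 + 4 + 4 + 0 = 8, \qquad\text{so } A(P) = 4.
\end{align*}
```

Listing the same vertices in clockwise order negates every term and gives −4, so the sign of the result reveals the orientation of the polygon.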

14.4.3 Point Location

Our triangulation algorithm deﬁned a vertex as an ear only when the associated triangle
contained no other points. Thus ear testing requires us to test whether a given point p
lies within the interior of a triangle t.


Figure 14.3. The odd/even parity of the number of boundary crossings determines whether a
given point is inside or outside a given polygon.

Triangles are always convex polygons, because three vertices do not give the freedom
to create bumps and notches. A point lies within a convex polygon if it is to the left
of each of the directed lines from p_i to p_{i+1}, where the vertices of the polygon are represented
in counterclockwise order. The ccw predicate enables us to easily make such left-of
decisions:
bool point_in_triangle(point p, triangle t)
{
    int i;                      /* counter */
    bool cw();

    for (i=0; i<3; i++)
        if (cw(t[i],t[(i+1)%3],p)) return(FALSE);

    return(TRUE);
}
This algorithm works to decide point location (in P or out?) for convex polygons. But
it breaks down for general polygons. Imagine the task of deciding whether a point is inside or outside the center of a complex spiral-shaped polygon. There is a straightforward
solution for general polygons using the code we have already developed. Ear-clipping
required us to test whether a given point lies within a given triangle. Thus we can use
triangulate to divide the polygon into triangular cells and then test each of the cells
to see whether they contain the point. If one of them does, the point is in the polygon.
Triangulation is a heavyweight solution for this problem, however, just as it was for
area. There is a much simpler algorithm based on the Jordan curve theorem, which
states that every polygon or other closed ﬁgure has an inside and an outside. You can’t
get from one to the other without crossing the boundary.
This gives the following algorithm, illustrated in Figure 14.3. Suppose we draw a line
l that starts from outside the polygon P and goes through point q. If this line crosses
the polygon boundary an even number of times before reaching q, then q must lie outside
P. Why? If we start outside the polygon, then every pair of boundary crossings leaves
us outside. Thus an odd number of boundary crossings puts us inside P .
Important subtleties occur at degenerate cases. Cutting through a vertex of P crosses
a boundary only if we enter the interior of P, instead of just clipping off the vertex. We
cross a boundary if and only if the two vertices neighboring that vertex lie on different sides of line l.
Crawling along an edge of the polygon does not change the boundary count, although
it raises the application-specific question of whether such a point on the boundary is
considered inside or outside P.
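
The parity argument can be sketched as follows. This is a minimal illustration, not code from the text: the type pt and the function point_in_polygon are hypothetical names, and instead of walking in from outside we cast a horizontal ray from q toward +x, which counts the same crossings. The half-open comparison on the y-coordinates is one common way to handle a ray that passes through a vertex; points lying exactly on the boundary are not classified reliably.

```c
#include <stdbool.h>

typedef struct { double x, y; } pt;   /* hypothetical point type for this sketch */

/* Parity test: count how many polygon edges are crossed by the horizontal
   ray from q toward +x. An odd count means q is inside.
   The test (a.y > q.y) != (b.y > q.y) treats each edge as half-open in y,
   so a vertex on the ray is counted once when the boundary really crosses
   the ray and zero or two times when it only touches it. */
bool point_in_polygon(const pt poly[], int n, pt q)
{
    bool inside = false;
    int i, j;

    for (i = 0, j = n - 1; i < n; j = i++) {
        pt a = poly[i], b = poly[j];
        if ((a.y > q.y) != (b.y > q.y)) {
            /* x-coordinate where edge ab meets the line y = q.y;
               a.y != b.y is guaranteed by the test above */
            double x_cross = a.x + (q.y - a.y) * (b.x - a.x) / (b.y - a.y);
            if (q.x < x_cross)
                inside = !inside;
        }
    }
    return inside;
}
```

The vertex order (clockwise or counterclockwise) does not matter here, and the polygon need not be convex, which is what makes this test attractive compared with triangulating first.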

14.5 Algorithms on Grids
That polygons drawn on rectilinear and hexagonal grid points can be naturally decomposed into individual cells makes it useful to be able to solve certain computational
problems on these cells:
• Area — The formula length × width computes the area of a rectangle. For triangles, it is 1/2 × altitude × base. An equilateral triangle where each side has length
r has area √3 r²/4; so a regular hexagon with radius r has area 3√3 r²/2.
• Perimeter — The formula 2 × (length + width) computes the perimeter of a
rectangle. For triangles, we sum the side lengths, a + b + c, which reduces to 3r
for equilateral triangles. Regular hexagons of radius r have perimeter 6r; observe
how they approach the circumference of a circle 2πr ≈ 6.28r.
• Convex Hulls — Squares, equilateral triangles, and regular hexagons are all
inherently convex, so they are all their own convex hulls.
• Triangulation — Inserting either one of the two diagonals in a square or all
three diagonals radiating from any point in a regular hexagon triangulates it.
This works only because these ﬁgures are convex; notches and bumps make the
process harder.
• Point location — As we have seen, a point lies in an axis-oriented rectangle if
and only if xmax > x > xmin and ymax > y > ymin . Such tests are slightly more
diﬃcult for triangles and hexagons, but surrounding these shapes by a bounding
box usually reduces the need for the complicated case.
We conclude this section with two interesting algorithms for geometric computing on
grids. They are primarily of interest for rectilinear grids but can be adapted to other
lattices if the need arises.

14.5.1 Range Queries

Orthogonal range queries are a common operation in working with n × m rectilinear
grids. We seek a data structure which quickly and easily answers questions of the form:
“What is the sum of the values in a given subrectangle of the matrix?”


Any axis-oriented rectangle can be speciﬁed by two points, the upper-left-hand corner
(xl , yl ) and the lower-right-hand corner (xr , yr ). The simplest algorithm is to run nested
loops adding up all values m[i][j] for xl ≤ i ≤ xr and yr ≤ j ≤ yl . But this is
ineﬃcient, particularly if you must do it repeatedly in seeking the rectangle of largest
or smallest such sum.
Instead, we can construct an alternate rectangular matrix such that element m1[x][y]
represents the sum of all elements m[i][j] where i ≤ x and j ≤ y. This dominance
matrix m1 makes it easy to ﬁnd the sum of the elements in any rectangle, because the
sum S(xl , yl , xr , yr ) of elements in such a box is
S(xl , yl , xr , yr ) = m1 [xr , yl ] − m1 [xl − 1, yl ] − m1 [xr , yr − 1] + m1 [xl − 1, yr − 1]
This is certainly fast, reducing the computation to just four array element lookups. Why
is it correct? The term m1[xr , yl ] contains the sum of all the elements in the desired
rectangle, plus all other dominated items. The next two terms subtract this away, but
remove the lower-left-hand corner twice so it must be added back again. The argument
is that of standard inclusion-exclusion formulas in combinatorics. The array m1 can be
built in O(mn) time by ﬁlling in the cells using row-major ordering and similar ideas.
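
A minimal sketch of the dominance matrix and the four-lookup query follows. The names build_dominance, range_sum, and MAXN are hypothetical, and the sketch assumes 1-based indices with row 0 and column 0 of m1 held at zero, so that the xl − 1 and yr − 1 lookups need no special cases. As in the text, (xl, yl) is the upper-left corner and (xr, yr) the lower-right corner, so yr ≤ yl.

```c
#define MAXN 100   /* hypothetical maximum grid dimension */

/* Fill m1[x][y] with the sum of all m[i][j] having 1 <= i <= x and
   1 <= j <= y. Each cell is computed from three already-filled neighbors,
   so the whole table takes time proportional to nx * ny. */
void build_dominance(int nx, int ny,
                     int m[MAXN+1][MAXN+1], long m1[MAXN+1][MAXN+1])
{
    int x, y;

    for (x = 0; x <= nx; x++) m1[x][0] = 0;   /* zero border */
    for (y = 0; y <= ny; y++) m1[0][y] = 0;

    for (x = 1; x <= nx; x++)
        for (y = 1; y <= ny; y++)
            m1[x][y] = m[x][y] + m1[x-1][y] + m1[x][y-1] - m1[x-1][y-1];
}

/* Sum of m[i][j] over xl <= i <= xr and yr <= j <= yl. */
long range_sum(long m1[MAXN+1][MAXN+1], int xl, int yl, int xr, int yr)
{
    return m1[xr][yl] - m1[xl-1][yl] - m1[xr][yr-1] + m1[xl-1][yr-1];
}
```

The construction uses the same inclusion-exclusion idea as the query: the two neighboring prefix sums overlap in m1[x-1][y-1], which is therefore subtracted once.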

14.5.2 Lattice Polygons and Pick's Theorem

Rectangular grids of unit-spaced points (also called lattice points) are at the heart of
any grid-based coordinate system. In general, there will be about one grid point per
unit-area in the grid, because each grid point can be assigned to be the upper-right-hand corner of a different 1 × 1 empty rectangle. Thus the number of grid points within
a given ﬁgure should give a pretty good approximation to the area of the ﬁgure.
Pick’s theorem gives an exact relation between the area of a lattice polygon P (a
non-intersecting ﬁgure whose vertices all lie on lattice points) and the number of lattice
points on/in the polygon. Suppose there are I(P ) lattice points inside of P and B(P )
lattice points on the boundary of P . Then the area A(P ) of P is given by
A(P ) = I(P ) + B(P )/2 − 1
as illustrated in Figure 14.4.
For example, consider a triangle deﬁned by coordinates (x, 1), (y, 2), and (y+k, 2). No
matter what x, y, and k are there can be no interior points, because the three points
lie on consecutive rows of the lattice. Lattice point (x, 1) serves as the apex of the
triangle, and there are k + 1 lattice points on the boundary of the base. Thus I(P ) = 0,
B(P ) = k + 2, and so the area is k/2, precisely what you get from the triangle area
formula.
As another example, consider a rectangle defined by corners (x1 , y1 ) and (x2 , y2 ), and let ∆x = |x2 − x1 | and ∆y = |y2 − y1 |. The
number of boundary points is
B(P ) = 2(∆y + 1) + 2(∆x + 1) − 4 = 2(∆y + ∆x )
with the −4 term to avoid double-counting the corners. The interior is the total number
of points in or on the rectangle minus the boundary, giving
I(P ) = (∆x + 1)(∆y + 1) − 2(∆y + ∆x )


Figure 14.4. A lattice polygon with ten boundary points and nine internal points, and hence
area 13 by Pick’s theorem.

Pick’s theorem correctly computes the area of the rectangle as ∆x ∆y .
Applying Pick's theorem requires counting lattice points accurately. This can in principle be done by exhaustive testing for small-area polygons using functions that (1) test
whether a point lies on a line segment and (2) test whether a point is inside or outside
a polygon. For efficiency, more clever sweep-line algorithms would eliminate the need to check
any points other than those on the boundary. See [GS93] for an interesting discussion of Pick's
theorem and related subjects.
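
One way to apply the theorem without testing points individually is to run it backwards: compute the area from the vertex coordinates, count the boundary points edge by edge, and solve for I(P). The sketch below is an illustration under assumptions, not the method described above. It relies on the standard fact that the segment between two lattice points (x1, y1) and (x2, y2) contains gcd(|x2 − x1|, |y2 − y1|) lattice points when one endpoint is excluded; the type lpt and all function names are hypothetical.

```c
#include <stdlib.h>

typedef struct { long long x, y; } lpt;   /* hypothetical lattice point type */

static long long gcd_ll(long long a, long long b)
{
    while (b != 0) {
        long long t = a % b;
        a = b;
        b = t;
    }
    return a;
}

/* B(P): each edge contributes gcd(|dx|,|dy|) lattice points when its
   starting vertex is excluded, so summing over edges counts every
   boundary point exactly once. */
long long boundary_points(const lpt p[], int n)
{
    long long b = 0;
    int i, j;

    for (i = 0; i < n; i++) {
        j = (i + 1) % n;
        b += gcd_ll(llabs(p[j].x - p[i].x), llabs(p[j].y - p[i].y));
    }
    return b;
}

/* 2 * A(P) from the signed-area sum; an integer for any lattice polygon. */
long long twice_area(const lpt p[], int n)
{
    long long s = 0;
    int i, j;

    for (i = 0; i < n; i++) {
        j = (i + 1) % n;
        s += p[i].x * p[j].y - p[j].x * p[i].y;
    }
    return llabs(s);
}

/* Pick's theorem, A = I + B/2 - 1, solved for I: I = (2A - B + 2) / 2. */
long long interior_points(const lpt p[], int n)
{
    return (twice_area(p, n) - boundary_points(p, n) + 2) / 2;
}
```

For the rectangle with corners (0,0) and (3,2) this gives B(P) = 10 and I(P) = 2, matching the formulas above with ∆x = 3 and ∆y = 2. Working with twice the area keeps every intermediate value an integer.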

14.6 Geometry Libraries
The Java java.awt.geom package provides the classes for defining and performing operations on objects related to two-dimensional geometry. The Polygon class (which lives in the parent java.awt package) provides
much of the functionality we have developed here, including a contains method for
point location. The more general Area class permits us to union and intersect polygons
with other shapes and curves. The Line2D class provides much of the functionality we
have developed for line segments, including intersection testing and the ccw predicate.


14.7 Problems
14.7.1 Herding Frosh

PC/UVa IDs: 111401/10135, Popularity: C, Success rate: average Level: 2
One day, a lawn in the center of campus became infested with frosh. In an eﬀort
to beautify the campus, one of our illustrious senior classmen decided to round them
up using a length of pink silk. Your job is to compute how much silk was required to
complete the task.
The senior classman tied the silk to a telephone post, and walked around the perimeter
of the area containing the frosh, drawing the silk taut so as to encircle all of them. He
then returned to the telephone post. The senior classman used the minimum amount
of silk necessary to encircle all the frosh plus one extra meter at each end to tie it.
You may assume that the telephone post is at coordinates (0,0), where the ﬁrst
dimension is north/south and the second dimension is east/west. The coordinates of
the frosh are given in meters relative to the post. There are no more than 1,000 frosh.

Input
The input begins with a single positive integer on a line by itself indicating the number
of test cases, followed by a blank line.
Each test case consists of a line specifying the number of frosh, followed by one line
per frosh with two real numbers giving his or her position.
There is a blank line between two consecutive inputs.

Output
For each test case, the output consists of a single number: the length of silk in meters
to two decimal places. The output of two consecutive cases will be separated by a blank
line.

Sample Input
1
4
1.0 1.0
-1.0 1.0
-1.0 -1.0
1.0 -1.0

Sample Output
10.83


14.7.2 The Closest Pair Problem

PC/UVa IDs: 111402/10245, Popularity: A, Success rate: low Level: 2
A particularly ineﬃcient telephone company seeks to claim they provide high-speed
broadband access to customers. It will suﬃce for marketing purposes if they can create
just one such link directly connecting two locations. As the cost for installing such a
connection is proportional to the distance between the sites, they need to know which
pair of locations are the shortest distance apart so as to provide the cheapest possible
implementation of this marketing strategy.
More precisely, given a set of points in the plane, ﬁnd the distance between the closest
pair of points provided this distance is less than some limit. If the closest pair is too far
apart, marketing will have to opt for some less expensive strategy.

Input
The input ﬁle contains several sets of input. Each set of input starts with an integer
N (0 ≤ N ≤ 10,000), which denotes the number of points in this set. The next N
lines contain the coordinates of N two-dimensional points. The two numbers denote
the x- and y-coordinates, respectively. The input is terminated by a set whose N = 0,
which should not be processed. All coordinates will have values less than 40,000 and be
non-negative.

Output
For each input set, produce a single line of output containing a ﬂoating point number
(with four digits after the decimal point) which denotes the distance between the closest
two points. If there do not exist two points whose distance is less than 10,000, print the
line “INFINITY”.

Sample Input
3
0 0
10000 10000
20000 20000
5
0 2
6 67
43 71
39 107
189 140
0

Sample Output
INFINITY
36.2215

14.7.3 Chainsaw Massacre

PC/UVa IDs: 111403/10043, Popularity: B, Success rate: low Level: 3
The Canadian Lumberjack Society has just held its annual woodcutting competition
and the national forests between Montreal and Vancouver are devastated. Now for
the social part! In order to lay out an adequate dance ﬂoor for the evening party
the organizing committee is looking for a large rectangular area without trees. All
lumberjacks are already drunk and nobody wants to take the risk of having any of
them operate a chainsaw.
The organizing committee has asked you to ﬁnd the largest free rectangular region
which can serve as the dance ﬂoor. The area in which you should search is also rectangular and the dance ﬂoor must be entirely located in that area. Its sides should be
parallel to the borders of the area. The dance ﬂoor may be located at the borders of
the area, and trees may grow on the borders of the dance ﬂoor.

Input
The ﬁrst line of the input speciﬁes the number of scenarios. For each scenario, the ﬁrst
line provides the length l and width w of the area in meters (0 < l, w ≤ 10,000, both
integers). Each of the following lines describes either a single tree, or a line of trees
according to one of the following formats:
• 1 x y, where the “1” characterizes a single tree, and x and y provide its coordinates
in meters with respect to the upper-left corner.
• k x y dx dy, where k > 1 provides the number of trees in a line with coordinates
(x, y), (x + dx, y + dy), . . . , (x + (k − 1)dx, y + (k − 1)dy).
• 0 denotes the end of the scenario.
The coordinates x, y, dx, and dy are given as integers. All the trees will be situated in
this area, i.e., have coordinates in [0, l] × [0, w]. There will be at most 1,000 trees.

Output
For each scenario print a line containing the maximum size of the dance ﬂoor measured
in square meters.

Sample Input
2
2 3
0
10 10
2 1 1 8 0
2 1 9 8 0
0

Sample Output
6
80


14.7.4 Hotter Colder

PC/UVa IDs: 111404/10084, Popularity: C, Success rate: low Level: 3
The children’s game Hotter Colder is played as follows. Player A leaves the room
while player B hides an object somewhere within. Player A re-enters at position (0, 0)
and then visits various other locations about the room. When player A visits a new
position, player B announces “Hotter” if this position is closer to the object than the
previous position, “Colder” if it is farther, and “Same” if it is the same distance.

Input
Input consists of up to 50 lines, each containing an (x, y)-coordinate pair followed by
“Hotter”, “Colder”, or “Same”. Each pair represents a position within the room, which
may be assumed to be a square with opposite corners at (0,0) and (10,10).

Output
For each line of input, print a line giving the total area of the region in which the object
may have been placed, to two decimal places. If there is no such region, output “0.00”.

Sample Input
10.0 10.0 Colder
10.0 0.0 Hotter
0.0 0.0 Colder
10.0 10.0 Hotter

Sample Output
50.00
37.50
12.50
0.00

14.7.5 Useless Tile Packers

PC/UVa IDs: 111405/10065, Popularity: C, Success rate: average Level: 3
Useless Tile Packer, Inc., prides itself on eﬃciency. As their name suggests, they aim to
use less space than other companies. Their marketing department has tried to convince
management to change the name, believing that “useless” has other connotations, but
has thus far been unsuccessful.
Tiles to be packed are of uniform thickness and have a simple polygonal shape. For
each tile, a container is custom-built. The ﬂoor of the container is a convex polygon
that has the minimum possible space inside to hold the tile it is built for.

This strategy leads to wasted space inside the container. Your job is to compute the
percentage of wasted space for a given tile.

Input
The input ﬁle consists of several data blocks. Each data block describes one tile. The
ﬁrst line of a data block contains an integer N (3 ≤ N ≤ 100) indicating the number
of corner points of the tile. Each of the next N lines contains two integers giving the
(x, y) coordinates of a corner point (determined using a suitable origin and orientation
of the axes) where 0 ≤ x, y ≤ 1,000. The corner points occur in the same order on
the boundary of the tile as they appear in the input. No three consecutive points are
collinear.
The input ﬁle terminates with a value of 0 for N .

Output
For each tile in the input, print the percentage of wasted space rounded to two digits
after the decimal point. Each output must be on a separate line.
Print a blank line after each output block.


Sample Input
5
0 0
2 0
2 2
1 1
0 2
5
0 0
0 2
1 3
2 2
2 0
0

Sample Output
Tile #1
Wasted Space = 25.00 %
Tile #2
Wasted Space = 0.00 %

14.7.6 Radar Tracking

PC/UVa IDs: 111406/849, Popularity: C, Success rate: low Level: 2
A ground-to-air radar system uses an antenna that rotates clockwise in a horizontal
plane with a period of two seconds. Whenever the antenna faces an object, its distance
from that antenna is measured and displayed on a circular screen as a white dot. The
distance from the dot to the center of the screen is proportional to the horizontal
distance from the antenna to the object, and the angle of the line passing through
the center and the dot represents the direction of the object from the antenna. A dot
directly above the center represents an object that is north of the antenna; an object
to the right of the center represents an object to the east; and so on.
There are a number of objects in the sky. Each is moving at a constant velocity,
and so the dot on the screen appears in a diﬀerent position every time the antenna
observes it. Your task is to determine where the dot will appear on the screen the next
time the antenna observes it, given the previous two observations. If there are several
possibilities, you are to ﬁnd them all.

Input
The input consists of a number of lines, each with four real numbers: a1 , d1 , a2 , d2 . The
ﬁrst pair a1 , d1 are the angle (in degrees) and distance (in arbitrary distance units)
for the ﬁrst observation while the second pair a2 , d2 are the angle and distance for the
second observation.
Note that the antenna rotates clockwise; that is, if it points north at time t = 0.0,
it points east at t = 0.5, south at t = 1.0, west at t = 1.5, north at t = 2, and so on.
If the object is directly on top of the radar antenna, it cannot be observed. Angles are
specified as on a compass, where north is 0° or 360°, east is 90°, south is 180°, and
west is 270°.

Output
The output consists of one line per input case containing all possible solutions. Each
solution consists of two real numbers (with two digits after the decimal place) indicating
the angle a3 and distance d3 for the next observation.

Sample Input
90.0 100.0 90.0 110.0
90.0 100.0 270.0 10.0
90.0 100.0 180.0 50.0

Sample Output
90.00 120.00
270.00 230.00
199.93 64.96 223.39 130.49


14.7.7 Trees on My Island

PC/UVa IDs: 111407/10088, Popularity: C, Success rate: average Level: 3
I have bought an island where I want to plant trees in rows and columns. The trees
will be planted to form a rectangular grid, so each can be thought of as having integer
coordinates by taking a suitable grid point as the origin.

A sample of my island
However, my island is not rectangular. I have identiﬁed a simple polygonal area inside
the island with vertices on the grid points and have decided to plant trees on grid points
lying strictly inside the polygon.
I seek your help in calculating the number of trees that can be planted.

Input
The input file may contain multiple test cases. Each test case begins with a line containing an integer N (3 ≤ N ≤ 1,000) identifying the number of vertices of the polygon.
The next N lines contain the vertices of the polygon in either the clockwise or counterclockwise direction. Each of these N lines contains two integers identifying the x- and
y-coordinates of a vertex. You may assume that the absolute value of all coordinates
will be no larger than 1,000,000.
A test case containing a zero for N in the ﬁrst line terminates the input.

Output
For each test case, print a line containing the number of trees that can be planted inside
the polygon.

Sample Input
12
3 1
6 3
9 2
8 4
9 6
9 9
8 9
6 5
5 8
4 4
3 5
1 3
12
1000 1000
2000 1000
4000 2000
6000 1000
8000 3000
8000 8000
7000 8000
5000 4000
4000 5000
3000 4000
3000 5000
1000 3000
0

Sample Output
21
25990001

14.7.8 Nice Milk

PC/UVa IDs: 111408/10117, Popularity: C, Success rate: low Level: 4
Little Tomy likes to cover his bread with milk. He does this by dipping it so that its
bottom side touches the bottom of the cup, as in the picture below:

Since the amount of milk in the cup is limited, only the area between the surface of
the milk and the bottom side of the bread is covered. Note that the depth of the milk
is always h and remains unchanged with repeated dippings.
Tomy wants to cover this bread with the largest possible area of milk in this way, but
doesn’t want to dip more than k times. Can you help him out? You may assume that
the cup is wider than any side of the bread, so it is possible to cover any side completely.

Input
Each test case begins with a line containing three integers n, k, and h (3 ≤ n ≤ 20,
0 ≤ k ≤ 8, 0 ≤ h ≤ 10). A piece of bread is guaranteed to be a convex polygon of n
vertices. Each of the following n lines contains two integers xi and yi (0 ≤ xi , yi ≤ 1,000)
representing the Cartesian coordinates of the ith vertex. The vertices are numbered in
counterclockwise order. The test case n = 0, k = 0, h = 0 terminates the input.

Output
Output (to two decimal places) the area of the largest possible bread region which can
be covered with milk using k dips. The result for each test case should appear on its own
line.

Sample Input
4 2 1
1 0
3 0
5 2
0 4
0 0 0

Sample Output
7.46


14.8 Hints
14.1 How do we best deal with the telephone pole constraint?
14.2 Comparing each point against each other point might be too slow. Can we use
the fact that we are only interested in a nearby closest pair to reduce the number
of comparisons?
14.3 Is the data better explicitly represented as an l × w matrix or left compressed as
in the input format?
14.4 How can we best represent the region of possible locations? Is it always a convex
polygon?
14.5 Is it easier to compute the difference between the two areas or compute the area of
each external pocket?
14.6 How do multiple solutions possibly arise?
14.7 Is this a candidate for Pick’s theorem or is there a better way?
14.8 Does some form of greedy algorithm provably maximize the milk-covered area, or
must we use exhaustive search?

Appendix

Achieving top performance in a programming contest or any other sporting event is not
purely a function of talent. It is important to know the competition, to train correctly,
and to develop the proper strategies and tactics in order to compete successfully.
In this chapter, we will introduce the three most important programming contests,
namely, the ACM International Collegiate Programming Contest for college students, the International Olympiad in Informatics for high school students, and finally
the TopCoder Challenge for all practicing programmers. We discuss the history, format,
and entry requirements for each contest. Further, we have interviewed top contestants
and coaches so we can pass their training and strategy secrets on to you.

A.1 The ACM International Collegiate Programming Contest
The ACM International Collegiate Programming Contest (ACM ICPC) is the forum
where computer science students show the world they have the right stuﬀ. The ICPC
has grown steadily in numbers, excitement, and prestige since its founding in 1976. The
2002 competition attracted 3,082 three-person teams representing over 1,300 schools
in 67 countries, plus countless other students who participated in associated local and
on-line programming contests.
The format of the contest is as follows. Each team is composed of three students, and
given a collection of ﬁve to ten programming problems. Only one computer is allocated
to each team, so coordination and teamwork are essential.


The winner is the team which correctly solves¹ the most problems within the specified
time limit, typically about ﬁve hours. No partial credit is awarded, so only completely
correct programs count. Ties between teams are broken by comparing the elapsed time
needed to get the solutions accepted. Thus the fastest programmers (as opposed to
the fastest programs) win. No points are given for programming style or eﬃciency,
provided the program completes within the several seconds the judges typically allot
for testing. Time penalties of 20 minutes are administered for every unsatisfactory
program submitted to the judges, thus providing incentive for students to check their
work carefully.
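The ranking rule described above can be sketched as a comparison function. This is only an illustration: the struct and field names are hypothetical, and which rejected submissions count toward the penalty is left to whoever fills in the `rejected` field, since contests differ on that detail.

```c
#include <assert.h>
#include <stdlib.h>

#define PENALTY_MINUTES 20  /* penalty per rejected submission, as stated in the text */

/* One team's result. Field names are illustrative, not taken from any judge. */
typedef struct {
    int solved;          /* number of accepted problems */
    int accept_minutes;  /* sum of elapsed minutes at which each solution was accepted */
    int rejected;        /* rejected submissions that count against the team */
} team_result;

/* Total time used for tie-breaking: acceptance times plus penalties. */
int total_time(const team_result *t)
{
    assert(t != NULL);
    return t->accept_minutes + PENALTY_MINUTES * t->rejected;
}

/* qsort comparator: more problems solved ranks first; ties go to the smaller total time. */
int compare_teams(const void *a, const void *b)
{
    const team_result *x = a;
    const team_result *y = b;

    if (x->solved != y->solved)
        return (x->solved > y->solved) ? -1 : 1;
    if (total_time(x) != total_time(y))
        return (total_time(x) < total_time(y)) ? -1 : 1;
    return 0;
}
```

Sorting an array of results with `qsort(teams, n, sizeof teams[0], compare_teams)` then yields the standings. Note that a team with three problems and one rejected run can still beat a slower team with three clean problems, but never a team with four.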
We asked the members and coaches of top teams from the 2002 ACM ICPC for their
training and competition secrets. Here is what we learned. . . .

A.1.1 Preparation

Team Selection and Coaching
It is the coach’s job to select the members of his/her school’s team. Graduate students
serve as coaches at some schools, including Cornell and Tsinghua University. Distinguished and caring faculty do the job at others, including Duke and Waterloo. The
constant is that good teams require careful selection and eﬀective leadership. Both
types of coaches can be highly successful, as the excellent performance of all these
schools demonstrates.
Many top teams run local contests to select teams for regional competition. Waterloo
Coach Gordon Cormack prefers individual contests to team contests to lower entry
barriers for loners, and then selects the best individuals for his team. By using the
contest hosting services of the Universidad de Valladolid robot judge, such contests can
be relatively easy to organize and administer.
The best teams go through extensive training. For example, Waterloo practices as a
team two or three times a week in an environment similar to the one used during the
ﬁnals.
Resources
ACM ICPC rules permit teams to bring any printed material they wish for use at the
contest but allow no access to online material. Books are carefully checked for CD-ROMs, and network connectivity is typically cut or sniffer programs are employed to make
sure no one gets to the web.
What are the best books to study from, beyond the one you are holding in your
hands? As a general algorithms reference, we recommend Skiena’s The Algorithm Design
Manual [Ski97], and many contestants and coaches without our degree of self-interest
agree. Especially popular are books which contain implementations of algorithms in
a real programming language, such as Sedgewick [Sed01] for graph algorithms and
O’Rourke [O’R00] for computational geometry. However, expect a painful debugging
session after typing someone else’s routine from a book – unless you really understand
it. Warns Cormack, “code is a lot less useful than you think, unless you have actually
composed it and/or typed it in.” Reference manuals for your favorite programming
language and associated libraries are also a must.

[1] At least correctly enough to satisfy the judges. Legal battles have been fought over this distinction.

Well-prepared teams bring printouts of their solutions to old problems just in case
they are given something similar to one they have seen before. Christian Ohler of
the University of Oldenburg (Germany) stresses the importance of canned geometry
routines. “You’re doomed if you don’t have them prepared and pre-tested.”
Such templates are particularly important if you will be using Java. Subroutines
providing exception-less I/O and parsing the most common data types are complicated,
yet essential to get anything working. It may be useful to have such routines typed in
by a team member at the start of the competition while the rest of the team reads the
problems.
Training
The best training resource is the Universidad de Valladolid robot judge. At least
80% of last year’s ﬁnalists trained on the judge. Regular online contests are held at
http://acm.uva.es/contest/, with increasing frequency around the time of the regional
and international competitions. Check the webpage for schedules and other information.
Ural State University (http://acm.timus.ru/) and the Internet Problem Solving Contest (IPSC) (http://ipsc.ksp.sk/) also maintain online judges, running contests with
similar functionality. The USA Olympiad team website http://www.usaco.org contains
lots of interesting problems and material.
Many students like to think through solutions to past contests, even if they don’t go
so far as to implement them. Such problems are available from the oﬃcial ACM ICPC
website http://www.acmicpc.org. Rujia Liu of Tsinghua University notices that diﬀerent
types of problems appear in diﬀerent countries. He ﬁnds Asian problems “very strange
and diﬃcult” and thus good for think-through analysis. North American problems tend
to be better for programming practice under contest conditions but require less deep
algorithmics.

A.1.2 Strategies and Tactics

Teamwork
Teams in the ICPC consist of three people. With only one computer per team, there
is a premium on teamwork. Teams which ﬁght for the terminal go nowhere. Zhou Jian
from Shanghai Jiaotong University (the 2002 world champions) puts it this way: “The
goal of everything you do is to make the team’s result better, not to do your personal
best in the contest.”
Most successful teams assign the individual members diﬀerent roles depending upon
their individual skills. A typical organization identifies one student as the coder, the
jockey who rides the keyboard most of the contest due to his or her superior language
and typing skills. Another student is the algorist, the one best at cracking the problem
and sketching out solutions. The third student is the debugger, one who works off-line
from printouts of the program and output traces to fix things while freeing the coder
and the keyboard for other problems.
Of course, these roles change during the ﬂow of the contest and vary considerably
from team to team. Some teams use a designated leader to decide which member gets
which problems and who has control of the machine at any given time.
Certain teams adopt special strategies for reading the problems. Most eﬃcient seems
to be dividing them up among the team members to read in parallel, since the easiest
problems may be in the back of the package. When someone ﬁnds an easy problem, they
start to go at it, or hand it oﬀ to the most appropriate team member. On some international teams, the best English reader is assigned the task of skimming the problems
and parceling them out among the team.
Contest Tactics
• Know Your Limitations — Since you only get credit for correct solutions, identify
the easiest problems and work on them ﬁrst. Often easy-sounding problems have
some dirty trick or ambiguous speciﬁcation which leads to repeated and frustrating
wrong answers. Shahriar Manzoor of the Bangladesh University of Engineering
and Technology has the following advice: If your solution to the easiest problem
in the contest is rejected for unclear reasons, have another team member redo the
problem to avoid the mind traps you fell into.
• Keep an Eye on the Competition — If possible, try to view the current standings
and ﬁnd out which problems are being solved most frequently. If your team has
not tried this problem, go for it! Odds are it is relatively easy.
• Avoid Wrong Answers — Rujia Liu of Tsinghua thinks correctness is much more
important than speed. “We missed ﬁnishing among the top three teams this year
because of time penalties from wrong answers.” Reducing such penalties requires
adequate testing before submission and discussion among team members to make
sure the problem is properly understood.
• Halting Problems — The message time limit exceeded does not always imply an
algorithmic problem. You could be in an inﬁnite loop because of problems reading
the input [Man01]. Perhaps your program is waiting for input from standard input
when the judge is expecting you to take input from a file. Or maybe you have the
wrong input data format, such as assuming termination with a 0 symbol when
the judge terminates input with end of ﬁle.
• Know Your Compiler — Certain programming environments have options which
can make life easier for you. Flags which limit memory allocation can help test
solutions when the contest enforces certain memory limits. “Eliminate surprises
by anticipating the environment,” says Gordon Cormack.
• Keep the Machine Busy — Cormack urges his team to “always use the keyboard,
even if you are just typing in the shell of a program for reading input.”
• Clean Debugging — How do you debug with so little information? All
the judge will tell you is that your program is wrong. You cannot see the example
you are failing on. Judging problems occasionally occur, but usually you have a
bug if your program is not accepted. A clean printout of your program and a cool
head are your best tools.
Carefully check the test data you are using to validate your program. One of
our teams once lost two hours debugging a correct program because they typed
the test data in wrong. But don’t abandon incorrect solutions too quickly. Jens
Zumbrugel from the University of Oldenburg warns never to “begin a new problem
when you have just 90 minutes left and other problems still to debug.”
• Dirty Debugging — Here is a dirty trick which might help if you really get stuck.
Add a time-out loop or divide by zero at a point where you think your program
is failing. You can get one bit of information in exchange for a 20-minute penalty.
Don’t try this too often or you are certain to incur the wrath of your teammates
the moment your program is accepted.
• Make Exceptions — Daniel Wright of Stanford University recommends the following trick. If your language supports exception handling, use it to return a guess
at the answer instead of crashing. For example, catch any exception and output
that there is no solution or the input is invalid.
• Stay Calm Amidst the Confusion — Try not to get stressed and don’t ﬁght with
your teammates. “Have fun and don’t lose focus,” Gordon Cormack urges his
students. “You can do well by identifying and solving only the straightforward
questions.”
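The input-termination trap from the Halting Problems tip can be avoided by letting the return value of the read call drive the loop. The following is a minimal sketch; the function name `sum_input` and its parameters are hypothetical, and summing the values merely stands in for whatever per-item work a real problem requires.

```c
#include <assert.h>
#include <stdio.h>

/*
 * Sum the integers on `in`, stopping at end of file. If `use_sentinel` is
 * nonzero, also stop at the first value equal to `sentinel`, which is not
 * added. The number of values summed is stored in *count.
 */
long sum_input(FILE *in, int use_sentinel, int sentinel, int *count)
{
    long sum = 0;
    int value;

    assert(in != NULL && count != NULL);
    *count = 0;

    /* fscanf returns the number of items converted, or EOF. Testing for
       == 1 ends the loop both at end of file and on malformed input, so
       the program cannot spin forever waiting for a terminator. */
    while (fscanf(in, "%d", &value) == 1) {
        if (use_sentinel && value == sentinel)
            break;
        sum += value;
        (*count)++;
    }
    return sum;
}
```

Calling `sum_input(stdin, 1, 0, &n)` handles input that ends with a 0, and still terminates cleanly if the judge's data simply stops at end of file instead.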

A.2 International Olympiad in Informatics
The International Olympiad in Informatics (IOI) is an annual competition in computer
science for secondary/high school students. Since its founding in 1989, it has grown to be
the second largest of the ﬁve international science Olympiads, behind only mathematics.
In 2002, 78 countries sent 276 competitors to the ﬁnals in Korea, but these ﬁnalists were
selected from literally hundreds of thousands of students striving to make their national
teams.
The goals of the IOI are somewhat different from those of the ACM ICPC. Participants
typically have not yet selected a career path; the IOI seeks to stimulate their interest in
informatics (computing science). The IOI brings together exceptionally talented pupils
from various countries so they can share scientiﬁc and cultural experiences.

A.2.1 Participation

The IOI is hosted in a diﬀerent country each year: Wisconsin, USA, in 2003; Greece in
2004; and Poland in 2005. Each participating nation sends a delegation of four students
and two accompanying coaches. The students compete individually and try to maximize
their score by solving a set of programming problems during two competition days.
Typically, students have ﬁve hours to do three questions in each day’s session.

Every country has its own procedure to select its national team. Certain countries,
such as China and Iran, give screening exams to literally hundreds of thousands of students to identify the most promising prospects. Most nations run more modest screening
exams to reduce the ﬁeld to 20 or so candidates. These students are given intensive
training under the eyes of the national coach, who then picks the four best students to
represent them.
The USA Computing Olympiad maintains an excellent training program at
http://train.usaco.org and runs a series of Internet programming competitions which
anyone may participate in. To be considered for the United States team, you must compete in the U.S. Open National Championship tournament. This requires proctoring by
a teacher at your local high school. Top ﬁnishers are invited to the USA training camp
for additional training and final team selection. Canada winnows about 1,000
candidates through screening exams to select 22 for a week-long training camp.

A.2.2 Format

Unlike the ACM ICPC, the Olympiad provides for partial credit. Each problem typically
has ten test inputs and you get 10 points for each input you successfully solve. Typically
three problems are given on each of the two days, for a maximum possible contest score
of 600.
All problems (called tasks in IOI lingo) involve computations of an algorithmic nature.
Whenever algorithmic efficiency is important, the task designers aim to provide at least one grading
input where an inefficient program can also score some points. But IOI Scientific Committee
member Ian Munro says, “It is hard to design questions that give some credit to most
competitors yet are still able to distinguish at the top.”
A class of problems unique to the IOI are reactive tasks which involve live input
[HV02]. These interact with your program through function calls instead of data ﬁles.
For example, you may be asked to explore a maze, where a function call tells you whether
your next move hits the wall or not. Or you might be asked to write a game-playing
program which must interact with a real opponent.
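To make the maze example concrete, here is a small sketch of how a reactive task might look from both sides. Everything in it is an assumption made for illustration: the grader function `try_move`, the hard-coded maze, and the bound `SPAN` are hypothetical and do not reproduce any actual IOI interface. The contestant's code learns about the maze only through calls to `try_move`, and uses depth-first search to count the cells it can reach.

```c
#include <assert.h>
#include <string.h>

#define ROWS 5
#define COLS 7

/* ---- Grader side (hidden from the contestant in a real reactive task) ---- */
static const char *maze[ROWS] = {
    "#######",
    "#..#..#",
    "#.##.##",
    "#.....#",
    "#######"
};
static int cur_r = 1, cur_c = 1;   /* current position of the explorer */

/* Hypothetical grader call: try to step by (dr, dc). Returns 1 and moves the
   explorer if the target cell is open, or returns 0 if it hits a wall. */
int try_move(int dr, int dc)
{
    int r = cur_r + dr, c = cur_c + dc;

    if (r < 0 || r >= ROWS || c < 0 || c >= COLS || maze[r][c] == '#')
        return 0;
    cur_r = r;
    cur_c = c;
    return 1;
}

/* ---- Contestant side: sees the maze only through try_move() ---- */
#define SPAN 64                      /* assumed bound on the maze extent */
static char seen[2 * SPAN][2 * SPAN];

static const int DR[4] = { -1, 0, 1, 0 };
static const int DC[4] = { 0, 1, 0, -1 };

/* Depth-first exploration from position (r, c), measured relative to the
   start. Returns the number of newly discovered open cells and leaves the
   explorer on the cell where the call began. */
static int explore(int r, int c)
{
    int d, found = 1;

    assert(r > -SPAN && r < SPAN - 1 && c > -SPAN && c < SPAN - 1);
    seen[r + SPAN][c + SPAN] = 1;
    for (d = 0; d < 4; d++) {
        int nr = r + DR[d], nc = c + DC[d];

        if (seen[nr + SPAN][nc + SPAN])
            continue;
        if (try_move(DR[d], DC[d])) {
            found += explore(nr, nc);
            try_move(-DR[d], -DC[d]);   /* step back to (r, c) */
        }
    }
    return found;
}

/* Count the open cells reachable from the starting position. */
int count_reachable(void)
{
    memset(seen, 0, sizeof seen);
    return explore(0, 0);
}
```

The design point is that the solver never reads a data file: each call to the grader both answers a question and changes the hidden state, so the program must track its own position and undo moves explicitly.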
Students were oﬀered a choice of either Linux or Windows as a programming environment at the 2002 competition. Pascal and C/C++ were possible programming
languages. Students are not allowed access to any printed or online reference material.
Grading of IOI submissions is done after the end of the session, not online as in the
ACM ICPC. As in typical course exams, students do not know their score until the
grades are announced.
The IOI is the least corporate of the major programming contests. This lends it a
more academic character according to Daniel Wright – who has reached the ﬁnals of all
three contests discussed in this book. The diﬀerence shows up in the accommodations.
IOI contestants stay in university dorms while ICPC/TopCoder ﬁnalists are put up in
luxury hotels.

A.2.3 Preparation

United States IOI coach Rob Kolstad encourages all interested students to work hard
at the training website and compete in preliminary contests before the U.S. Open. He
tries to teach his students to use eﬀective time management during the contest. Gordon
Cormack, who doubles as Canada’s IOI coach, encourages his students to break the
“debug-until-it-sorta-works habit” and strive for correct solutions which solve all cases
in the time available. He goes so far as to remove the debugger from his students’ hands
to help them achieve greater focus.
There is agreement that IOI problems are somewhat diﬀerent than the ACM ICPC
problems. According to Kolstad, IOI problems are “totally algorithmic” and are more
clearly stated, avoiding “trickery or clever inputs.” ACM ICPC problems have more
issues with input checking and output formatting.
IOI problems “are about the same level of diﬃculty as the ACM problems” according
to Cormack. They sometimes have shorter solutions than the ACM problems: after all,
it is a single-student contest instead of a team eﬀort. They are designed so that simple,
relatively ineﬃcient programs will solve at most a few of the inputs, but that cleverness
is necessary for full credit. Past problems are available from the oﬃcial IOI website,
http://olympiads.win.tue.nl/ioi/. This site also contains pointers to two oﬃcially recommended books, Kernighan and Pike’s The Practice of Programming [KP99] and, we
are proud to say, Skiena’s The Algorithm Design Manual [Ski97].
A strong background in mathematics seems necessary for top-ﬂight competition. The
2001 IOI Champion, Reid Barton of the United States, also won the International
Mathematics Olympiad and likely did right well when it came time for applying to
college.
Participation in the IOI is good preparation for the ACM ICPC contest. There is
a tremendous overlap between this year’s IOI contestants and next year’s ACM ICPC
ﬁnalists. For example, all three members of Tsinghua’s 2002 ACM ICPC team (which
ﬁnished in 4th place) were drawn from the top 20 Chinese IOI 2001 candidates. Similarly,
about half of the TopCoder “all-stars” were previously top performers in the IOI or
ACM ICPC.

A.3 Topcoder.com
There are many good reasons for participating in programming contests. You can have
fun while you improve both your programming skills and job prospects in the bargain.
The programming challenge problems appearing in this book are suggestive of the
interview “puzzle” problems many advanced companies give all new job applicants.
From somewhere in this vein arises TopCoder, a company which uses programming
contests as a tool to identify promising potential employees and provides this information to its clients. The big draw of TopCoder contests is money. The 2002 TopCoder
Collegiate Challenge was sponsored by Sun Microsystems and awarded $150,000 in
prizes. Daniel Wright of Stanford University walked oﬀ with the $100,000 top prize and
graciously shared his secrets with us.

TopCoder has a slick website (www.topcoder.com) with news articles about recent
competitions which make it look almost like the sports pages. They maintain practice
contests on their website to help you train for the weekly tournaments, each of which
consists of three programming problems. Over 20,000 international programmers have
registered as participants since the weekly tournaments started in 2001. TopCoder has
paid out about $1 million in prizes to date.
The format of TopCoder contests is evolving quickly as they hunt for the most appropriate business model. Preliminary rounds are held over the web, with the ﬁnal rounds
of big tournaments held on site.
Each of these rounds shares the same basic format. The coders are divided into
“rooms” where they compete against the other coders. Each round starts with a coding
phase of 75–80 minutes where the contestants do their main programming. The score for
each problem is a decreasing function of the time from when it was ﬁrst opened to when
it was submitted. There is then a 15-minute challenge phase where coders get to view
the submissions from other contestants in their room and try to ﬁnd bugs. Coders earn
additional points by submitting a test case that breaks another competitor’s program.
There is no partial credit for incorrect solutions.
Most people seem to tackle the problems in increasing order of diﬃculty, although
Wright likes to go for the higher-value problems if he doesn’t think there will be time
to solve all three. TopCoder allows resubmissions, at the cost of a time penalty, so
there is some strategy in deciding when to submit and when to test. Time pressure is a
critical factor in TopCoder competitions. To gain speed, Wright encourages competitors
to learn to use their libraries eﬀectively.
The coding phase of the contest is generally more important than the challenge phase,
because the number of points for a challenge is not enough to make up for a diﬀerence in
the number of problems solved. To plan his challenges, Wright skims through solutions
to see if the algorithm makes sense. He pounces when it is an algorithm he considered
but rejected as incorrect. More often he ﬁnds typos and oﬀ-by-one bugs.

A.4 Go to Graduate School!
If you ﬁnd the programming challenges presented in this book interesting, you are the
kind of person who should think about going to graduate school. Graduate study in
computer science involves courses in advanced topics that build upon what you learned
as an undergraduate; but more importantly you will be doing new and original research
in the area of your choice. All reasonable American doctoral programs will pay tuition
and fees for all accepted Ph.D. students, plus enough of a stipend to live comfortably if
not lavishly.
Making the ﬁnals of the ACM International Collegiate Programming Contest or the
International Olympiad in Informatics, or even being a top finisher in a regional contest, is a tremendous achievement. It clearly suggests that you have the right stuff for
advanced study. I would certainly encourage you to continue your studies, ideally by
coming to work with me (Steven Skiena) at Stony Brook! My group does research in
a variety of interesting topics in algorithms and discrete mathematics. Please check us
out at http://www.algorist.com/gradstudy.
So I hope you will come to join us. By the way, oﬃcial ACM ICPC rules permit each
team to contain one ﬁrst-year graduate student. Maybe you can help take us to the
ﬁnals next year!

A.5 Problem Credits
The ﬁrst name listed is the person who developed, commissioned, or curated the problem in question,
and granted us permission to use it in this book. Subsequent names (if any) refer to others who wrote
or worked on the problem. We thank all for their contributions to this project.

No.   PC ID   UVa    Title                          Sponsor/Authors
1.1   110101  100    The 3n + 1 problem             Owen Astrakan
1.2   110102  10189  Minesweeper                    Pedro Demasi
1.3   110103  10137  The Trip                       Gordon Cormack
1.4   110104  706    LCD Display                    Miguel Revilla, Immanuel Herrman
1.5   110105  10267  Graphical Editor               Alexander Denisjuk
1.6   110106  10033  Interpreter                    Gordon Cormack
1.7   110107  10196  Check the Check                Pedro Demasi
1.8   110108  10142  Australian Voting              Gordon Cormack
2.1   110201  10038  Jolly Jumpers                  Gordon Cormack, Wim Nuij
2.2   110202  10315  Poker Hands                    Gordon Cormack
2.3   110203  10050  Hartals                        Shahriar Manzoor, Rezaul Alam Chowdhury
2.4   110204  843    Crypt Kicker                   Gordon Cormack
2.5   110205  10205  Stack ’em Up                   Gordon Cormack
2.6   110206  10044  Erdös Numbers                  Miguel Revilla, Felix Gaertner
2.7   110207  10258  Contest Scoreboard             Gordon Cormack, Michael Van Biesbrouck
2.8   110208  10149  Yahtzee                        Gordon Cormack
3.1   110301  10082  WERTYU                         Gordon Cormack
3.2   110302  10010  Where’s Waldorf?               Gordon Cormack
3.3   110303  10252  Common Permutation             Shahriar Manzoor
3.4   110304  850    Crypt Kicker II                Gordon Cormack
3.5   110305  10188  Automated Judge Script         Pedro Demasi
3.6   110306  10132  File Fragmentation             Gordon Cormack, Charles Clarke
3.7   110307  10150  Doublets                       Gordon Cormack
3.8   110308  848    Fmt                            Gordon Cormack
4.1   110401  10041  Vito’s Family                  Miguel Revilla, Pablo Puente
4.2   110402  120    Stacks of Flapjacks            Owen Astrakan
4.3   110403  10037  Bridge                         Gordon Cormack
4.4   110404  10191  Longest Nap                    Pedro Demasi
4.5   110405  10026  Shoemaker’s Problem            Alex Gevak, Antonio Sánchez
4.6   110406  10138  CDVII                          Gordon Cormack
4.7   110407  10152  ShellSort                      Gordon Cormack, Charles Clarke
4.8   110408  10194  Football (aka Soccer)          Pedro Demasi
5.1   110501  10035  Primary Arithmetic             Gordon Cormack
5.2   110502  10018  Reverse and Add                Erick Moreno
5.3   110503  701    The Archeologist’s Dilemma     Miguel Revilla
5.4   110504  10127  Ones                           Gordon Cormack, Piotr Rudnicki
5.5   110505  847    A Multiplication Game          Gordon Cormack, Piotr Rudnicki
5.6   110506  10105  Polynomial Coefficients        Alexander Denisjuk
5.7   110507  10077  Stern-Brocot Number System     Shahriar Manzoor, Rezaul Alam Chowdhury
5.8   110508  10202  Pairsumonious Numbers          Gordon Cormack, Piotr Rudnicki
6.1   110601  10183  How Many Fibs?                 Rujia Liu, Walter Guttmann
6.2   110602  10213  How Many Pieces of Land?       Shahriar Manzoor
6.3   110603  10198  Counting                       Pedro Demasi
6.4   110604  10157  Expressions                    Petko Minkov
6.5   110605  10247  Complete Tree Labeling         Shahriar Manzoor
6.6   110606  10254  The Priest Mathematician       Shahriar Manzoor, Miguel Revilla
6.7   110607  10049  Self-describing Sequence       Shahriar Manzoor, Rezaul Alam Chowdhury
6.8   110608  846    Steps                          Gordon Cormack, Piotr Rudnicki
7.1   110701  10110  Light, More Light              Udvranto Patik, Sadi Khan, Suman Mahbub
7.2   110702  10006  Carmichael Numbers             Manuel Carro, César Sánchez
7.3   110703  10104  Euclid Problem                 Alexander Denisjuk
7.4   110704  10139  Factovisors                    Gordon Cormack
7.5   110705  10168  Summation of Four Primes       Shahriar Manzoor
7.6   110706  10042  Smith Numbers                  Miguel Revilla, Felix Gaertner
7.7   110707  10090  Marbles                        Shahriar Manzoor, Rezaul Alam Chowdhury
7.8   110708  10089  Repackaging                    Shahriar Manzoor, Rezaul Alam Chowdhury

No.   PC ID   UVa    Title                          Sponsor/Authors
8.1   110801  861    Little Bishops                 Shahriar Manzoor, Rezaul Alam Chowdhury
8.2   110802  10181  15-Puzzle Problem              Shahriar Manzoor, Rezaul Alam Chowdhury
8.3   110803  10128  Queue                          Marcin Wojciechowski
8.4   110804  10160  Servicing Stations             Petko Minkov
8.5   110805  10032  Tug of War                     Gordon Cormack
8.6   110806  10001  Garden of Eden                 Manuel Carro, Manuel J. Petit de Gabriel
8.7   110807  704    Color Hash                     Miguel Revilla, Pablo Puente
8.8   110808  10270  Bigger Square Please...        Rujia Liu
9.1   110901  10004  Bicoloring                     Manuel Carro, Álvaro Martínez Echevarría
9.2   110902  10067  Playing With Wheels            Shahriar Manzoor, Rezaul Alam Chowdhury
9.3   110903  10099  The Tourist Guide              Shahriar Manzoor, Rezaul Alam Chowdhury
9.4   110904  705    Slash Maze                     Miguel Revilla, Immanuel Herrman
9.5   110905  10029  Edit Step Ladders              Gordon Cormack
9.6   110906  10051  Tower of Cubes                 Shahriar Manzoor, Rezaul Alam Chowdhury
9.7   110907  10187  From Dusk Till Dawn            Rujia Liu, Ralf Engels
9.8   110908  10276  Hanoi Tower Troubles Again!    Rujia Liu
10.1  111001  10034  Freckles                       Gordon Cormack
10.2  111002  10054  The Necklace                   Shahriar Manzoor, Rezaul Alam Chowdhury
10.3  111003  10278  Fire Station                   Gordon Cormack
10.4  111004  10039  Railroads                      Miguel Revilla, Philipp Hahn
10.5  111005  10158  War                            Petko Minkov
10.6  111006  10199  Tourist Guide                  Pedro Demasi
10.7  111007  10249  The Grand Dinner               Shahriar Manzoor, Rezaul Alam Chowdhury
10.8  111008  10092  Problem With Problem Setter    Shahriar Manzoor, Rezaul Alam Chowdhury
11.1  111101  10131  Is Bigger Smarter?             Gordon Cormack, Charles Rackoff
11.2  111102  10069  Distinct Subsequences          Shahriar Manzoor, Rezaul Alam Chowdhury
11.3  111103  10154  Weights and Measures           Gordon Cormack
11.4  111104  116    Unidirectional TSP             Owen Astrakan
11.5  111105  10003  Cutting Sticks                 Manuel Carro, Julio Mario
11.6  111106  10261  Ferry Loading                  Gordon Cormack
11.7  111107  10271  Chopsticks                     Rujia Liu
11.8  111108  10201  Adventures in Moving: Part IV  Gordon Cormack, Ondrej Lhotak
12.1  111201  10161  Ant on a Chessboard            Long Chong
12.2  111202  10047  The Monocycle                  Shahriar Manzoor, Rezaul Alam Chowdhury
12.3  111203  10159  Star                           Petko Minkov
12.4  111204  10182  Bee Maja                       Rujia Liu, Ralf Engels
12.5  111205  707    Robbery                        Miguel Revilla, Immanuel Herrman
12.6  111206  10177  (2/3/4)-D Sqr/Rects/Cubes?     Shahriar Manzoor
12.7  111207  10233  Dermuba Triangle               Arun Kishore
12.8  111208  10075  Airlines                       Shahriar Manzoor, Rezaul Alam Chowdhury
13.1  111301  10310  Dog and Gopher                 Gordon Cormack
13.2  111302  10180  Rope Crisis in Ropeland!       Shahriar Manzoor, Rezaul Alam Chowdhury
13.3  111303  10195  Knights of the Round Table     Pedro Demasi
13.4  111304  10136  Chocolate Chip Cookies         Gordon Cormack
13.5  111305  10167  Birthday Cake                  Long Chong
13.6  111306  10215  The Largest/Smallest Box ...   Shahriar Manzoor
13.7  111307  10209  Is This Integration?           Shahriar Manzoor
13.8  111308  10012  How Big Is It?                 Gordon Cormack
14.1  111401  10135  Herding Frosh                  Gordon Cormack
14.2  111402  10245  The Closest Pair Problem       Shahriar Manzoor
14.3  111403  10043  Chainsaw Massacre              Miguel Revilla, Christoph Mueller
14.4  111404  10084  Hotter Colder                  Gordon Cormack
14.5  111405  10065  Useless Tile Packers           Shahriar Manzoor, Rezaul Alam Chowdhury
14.6  111406  849    Radar Tracking                 Gordon Cormack
14.7  111407  10088  Trees on My Island             Shahriar Manzoor, Rezaul Alam Chowdhury
14.8  111408  10117  Nice Milk                      Rujia Liu

References

[AMO93]

R. Ahuja, T. Magnanti, and J. Orlin. Network Flows. Prentice Hall, Englewood
Cliﬀs NJ, 1993.

[Ber01]

A. Bergeron. A very elementary presentation of the Hannenhalli-Pevzner theory.
In Proc. 12th Symp. Combinatorial Pattern Matching (CPM), volume 2089, pages
106–117. Springer-Verlag Lecture Notes in Computer Science, 2001.

[CC97]

W. Cook and W. Cunningham. Combinatorial Optimization. Wiley, 1997.

[COM94]

COMAP. For All Practical Purposes. W. H. Freeman, New York, third edition,
1994.

[dBvKOS00] M de Berg, M. van Kreveld, M. Overmars, and O. Schwarzkopf. Computational
Geometry: Algorithms and Applications. Springer-Verlag, Berlin, second edition,
2000.
[DGK83]

P. Diaconis, R.L. Graham, and W.M. Kantor. The mathematics of perfect
shuﬄes. Advances in Applied Mathematics, 4:175, 1983.

[Dij76]

E. W. Dijkstra. A discipline of programming. 1976.

[Gal01]

J. Gallian. Graph labeling: A dynamic survey. Electronic Journal of Combinatorics, DS6, www.combinatorics.org, 2001.

[GJ79]

M. R. Garey and D. S. Johnson. Computers and Intractability: A guide to the
theory of NP-completeness. W. H. Freeman, San Francisco, 1979.

[GKP89]

R. Graham, D. Knuth, and O. Patashnik. Concrete Mathematics. AddisonWesley, Reading MA, 1989.

[GP79]

B. Gates and C. Papadimitriou. Bounds for sorting by preﬁx reversals. Discrete
Mathematics, 27:47–57, 1979.

[GS87]

D. Gries and I. Stojmenović. A note on Graham’s convex hull algorithm.
Information Processing Letters, 25(5):323–327, 10 July 1987.

References

351

[GS93]

B. Grunbaum and G. Shephard. Pick’s theorem. Amer. Math. Monthly, 100:150–
161, 1993.

[Gus97]

D. Gusﬁeld. Algorithms on Strings, Trees, and Sequences: Computer Science and
Computational Biology. Cambridge University Press, 1997.

[Hof99]

P. Hoﬀman. The Man Who Loved Only Numbers: The Story of Paul Erdös and
the Search for Mathematical Truth. Little Brown, 1999.

[HV02]

G. Horvath and T. Verhoeﬀ. Finding the median under IOI conditions. Informatics in Education, 1:73–92, also at http://www.vtex.lt/informatics in education/
2002.

[HW79]

G. H. Hardy and E. M. Wright. An Introduction to the Theory of Numbers.
Oxford University Press, ﬁfth edition, 1979.

[Kay00]

R. Kaye. Minesweeper is NP-complete. Mathematical Intelligencer, 22(2):9–15,
2000.

[Knu73a]

D. E. Knuth. The Art of Computer Programming, Volume 1: Fundamental
Algorithms. Addison-Wesley, Reading MA, second edition, 1973.

[Knu73b]

D. E. Knuth. The Art of Computer Programming, Volume 3: Sorting and
Searching. Addison-Wesley, Reading MA, 1973.

[Knu81]

D. E. Knuth. The Art of Computer Programming, Volume 2: Seminumerical
Algorithms. Addison-Wesley, Reading MA, second edition, 1981.

[KP99]

B. Kernighan and R. Pike. The Practice of Programming. Addison Wesley,
Reading MA, 1999.

[Lag85]

J. Lagarias. The 3x + 1 problem and its generalizations. American Mathematical
Monthly, 92:3–23, 1985.

[LR76]

E. Luczak and A. Rosenfeld. Distance on a hexagonal grid. IEEE Transactions
on Computers, 25(5):532–533, 1976.

[Man01]

S. Manzoor. Common mistakes in online and real-time contests. ACM Crossroads
Student Magazine, http://www.acm.org/crossroads/xrds7-5/contests.html, 2001.

[McD87]

W. McDaniel. The existance of inﬁnitely many k-Smith numbers. Fibonacci
Quarterly, 25:76–80, 1987.

[MDS01]

D. Musser, G. Derge, and A. Saini. STL Tutorial and Reference Guide: C++
Programming with the Standard Template Library. Addison-Wesley, Boston MA,
second edition, 2001.

[Mor98]

S. Morris. Magic Tricks, Card Shuﬄing, and Dynamic Computer Memories:
The Mathematics of the Perfect Shuﬄe. Mathematical Association of America,
Washington, D.C., 1998.

[New96]

M. Newborn. Kasparov Versus Deep Blue: Computer Chess Comes of Age.
Springer-Verlag, 1996.

[O’R00]

J. O’Rourke. Computational Geometry in C. Cambridge University Press, New
York, second edition, 2000.

[PS03]

S. Pemmaraju and S. Skiena. Computational Discrete Mathematics: Combinatorics and Graph Theory with Mathematica. Cambridge University Press, New
York, 2003.

[Sch94]

B. Schneier. Applied Cryptography. Wiley, New York, 1994.

352

References

[Sch97]

J. Schaeﬀer. One Jump Ahead: Challenging Human Supremacy in Checkers.
Springer-Verlag, 1997.

[Sch00]

B. Schechter. My Brain Is Open: The Mathematical Journeys of Paul Erdös.
Touchstone Books, 2000.

[Sed01]

R. Sedgewick. Algorithms in C++: Graph Algorithms. Addison-Wesley, third
edition, 2001.

[Seu58]

Dr. Seuss. Yertle the Turtle. Random House, 1958.

[Seu63]

Dr. Seuss. Hop on Pop. Random House, 1963.

[Ski97]

S. Skiena. The Algorithm Design Manual. Springer-Verlag, New York, 1997.

[Sti02]

D. Stinson. Cryptography: Theory and Practice. Chapman and Hall, second
edition, 2002.

[Wes00]

D. West. Introduction to Graph Theory. Prentice-Hall, Englewood Cliﬀs NJ,
second edition, 2000.

[Wil82]

A. Wilansky. Smith numbers. Two-Year College Math. J., 13:21, 1982.

[Wol02]

S. Wolfram. A New Kind of Science. Wolfram Media, 2002.

[ZMNN91]

H. Zuckerman, H. Montgomery, I. Niven, and A. Niven. An Introduction to the
Theory of Numbers. Wiley, New York, fifth edition, 1991.

Index

3n + 1 problem, 15
15-puzzle, 177
8 queens problem, 172
accepted (PE) verdict, 3
accepted verdict, 3
account, robot judge, 2
acute angle, 315
acyclic graph, 190
adding polynomials, 115
addition, 105
addition, congruence, 154
addition, modular, 152
addition, rational, 113
adjacency list, 191
adjacency list in matrix, 192
adjacency matrix, 191
algebra, 115
all-pairs shortest path, 225
American Standard Code for Information Interchange, 56
angle, 293
angle-side-angle, 297
approximate string matching, 246
arc cosine/sine/tangent, 296
area computation, 322, 331
area of circle, 298

area of triangle, 297
arithmetic, modular, 152
arrays, 12
arrays in C, 7
articulation vertex, 218
ASCII code, 56
back edge, 199
backtracking, 167
base conversion, 110
base conversion, logarithms, 117
BFS, 194
biconnected graph, 218
bijection, 130
bin packing, 263
binary heap counting, 141
binary number, 110
binary search, 79, 84, 85
binary tree, 30
binary trees, counting, 134
binomial coefficients, 131, 245
binomial theorem, 132
bipartite graph, 203, 228
bipartite matching, 228, 242
bit vector, 32
bit-mapped images, 19
blackmail graph, 227


bottleneck spanning tree, 223
boundary conditions, dynamic programming, 252
breadth-first search, 194
bridge edge, 218
bsearch(), 84
C language, 5, 6
C language strings, 64
C math library, 117
C sorting library, 83
C++, 5, 33
C++ sorting library, 84
C++ strings, 64
calendar, 32
calendar calculations, 153
call by reference/value, 7
card data structure, 36
card games, 34, 43, 48
card shuffling, 48
carry operation, addition, 119
Catalan numbers, 134
ccw predicate, 315
cellular automata, 182
chaining, 31
character code, 56
character representation, 58
chess, 23
Chief Soh-Cah-Toa, 296
Chinese remainder theorem, 155
circle, 298
circular queue, 29
circumference, 298
closest pair, 328
closest point, 294
coding hints, 9
collating sequence, 57
column-major ordering, 269
combinatorial explosion, 173
combinatorics, 129
comments, 9
committees, counting, 131
common logarithm, 117
comparison, 107
comparison function, 83
compile error verdict, 4
computational geometry, 313
computer simulator, 21
concatenation, string, 73

congruence, 154
connect the dots, 231
connected component, 199, 218
contest scoring, ACM ICPC, 52
continuity of the reals, 112
convex hull, 316
convex polygon, 315
copying a string, 62
cosine, 295
counting theory, 129
crossword puzzle, 67
cycle detection, 199
cycle in graph, 190
DAG, 190, 200
dance partner selection, 82
data movement, 80, 81
data structures, 27
data types, 11
debugger, source level, 40
debugging techniques, 11, 39
decimal number, 110, 114
decimal to rational conversion, 114
decreasing subsequence, 253
decryption, 47, 70
degeneracy, 292, 314
degree, angular, 294
degree, vertex, 217
depth-first search, 198
dequeue, 28
DFS, 198
diagonal ordering, 270
dice games, 53
dictionary, 30, 33
difference sequence, 42
difficulty ratings, 13
Dijkstra’s algorithm, 223
dinner plates, 275
Diophantine equation, 155, 163
directed acyclic graph, 190
directed graph, 189
disk packing, 277
distance formula, 13
dividing polynomials, 116
divisibility, 149
division, 109
division, congruence, 154
division, modular, 153
division, rational, 114

dominating set, 180
double counting, 130
Dr. Seuss, 33
dual graph, 271
duplicate identification, 79
dynamic memory allocation, 104
dynamic programming, 133, 245
dynamic programming traceback, 250
dynamic programming, smelling, 254
ear cutting, 320
Ebola virus, 11
edge, 189
edge connectivity, 218
edge table, 192
edge-weighted rectilinear grid, 271
edit distance, 246
edit step ladder, 210
eight queens problem, 172
elevator optimization, 253
embedded graph, 190
enclosing circle, 307
encryption, 47, 70
encryption, RSA, 153
enqueue, 28
enumerated types, 9
equations, 115
Erdös number, 50
errors by programming language, 6
Euclid’s algorithm, 150, 159
Euler’s formula, 220
Eulerian cycle, 219
Eulerian numbers, 134
Eulerian path, 219
evaluating polynomials, 115
example programs, 6
explicit graph, 191
exponential time, 248
exponentiation, 110
facility location, 234
factorial function, 160
factoring integers, 148
feedback from judge, 3
Fermat test, 158
Fermat’s last theorem, 147
Fibonacci numbers, 133, 137
FIFO queue, 28, 35, 194
floating point numbers, 103

Floyd’s algorithm, 225
Ford-Fulkerson algorithm, 228
formatted input, 37
formatted input/output functions, 8
four-color theorem, 271
fraction, 113
frequency counting, 79
fundamental theorem of arithmetic, 147
games, 53
gcd, 150, 156
geometry, 291
geometry library, Java, 326
Goldbach’s conjecture, 161
Golomb’s sequence, 144
Google, 56
Graham scan, 316
graph, 189
graph data structure, 220
graph theory, 217
graph traversal, 169
graphical user interfaces, 7
great circle route, 278
greatest common divisor, 150
greedy algorithm, 246
grid, 268
grid geometry, 324
ham and cheese sandwich theorem, 308
Hamiltonian cycle, 219
hash function, 36
hash table, 31
heap, 32
heapsort, 80
hexadecimal number, 110
hexagonal coordinates, 271
high-precision integers, 103
highway tolls, 95
honeycomb, 273
Horner’s rule, 115
hypotenuse, 295
IEEE floating point standard, 113
image editor, 19
implicit graph, 191
in-order traversal, 198
inclusion-exclusion formula, 130
increasing subsequence, 253
index manipulation, 252


induction and recursion, 135
inexact string matching, 246
infinite number of primes, 149
insertion sort, 80
integer, 112
integer libraries, 103
integer partition, 181, 263
intersection, 32, 79, 84
intersection point, 292
invariant, 40
inverse trigonometric functions, 296
inversion, 81
irrational number, 112
ISO Latin-1, 56
Java, 5, 34
Java language hints, 8
Java libraries, 156
Java sorting library, 85
Java strings, 65
java.util, 34
java.util.Arrays, 85
jolly jumper, 42
Jordan curve theorem, 323
judging script, 71
keyboard, computer, 66
Kruskal’s algorithm, 220
labeled graph, 191
largest enclosed circle, 306
last digit computation, 153
latitude, 278
lattice polygon, 325
law of cosines, 297
law of sines, 296
LCD displays, 18
least common multiple, 151
libraries, 27
libraries, trigonometric functions, 302
line, 291
line intersection, 314
line segment, 291, 313
line, equation of, 292
linear congruence, 155
linked list, 58
linked structures, advantages of, 104
logarithm, 116
long division, 109

long integer, 102
longest common subsequence, 253
longest decreasing subsequence, 257, 259
longest path, DAG, 200, 210, 211
longitude, 278
lowercase character, 57
machine arithmetic, 102
magnitude, 103
mailer errors, 3
Manhattan, 278
mantissa, 113
matching, 228
maximum spanning tree, 222
median, 88
median element, 79
memory limit exceeded verdict, 4
Minesweeper, 16
minimax game tree, 123
minimum movement sorting, 97
minimum product spanning tree, 222
minimum spanning tree, 220, 231
modular arithmetic, 152
modulus, 152
money flow, 17
monotone subsequence, 253
multi-edge, 190
multidimensional arrays, 12
multiplication, 107
multiplication, congruence, 154
multiplication, modular, 153
multiplication, rational, 114
multiplicative inverse, 154
multiplying polynomials, 115
natural logarithm, 117
natural number, 112
nearest neighbor, 304
negative-cost edge, 225
network flow, 227
non-printable character, 57
number systems, 125
number theory, 147
numerical base, 110
object-oriented programming, 11
octal number, 110
off-by-one errors, 40
one million, 167

open addressing, 31
orthogonal range query, 324
output limit exceeded verdict, 4
palindrome addition, 120
pancake sorting, 89
parallel lines, 292
parameter passing, 7
parenthesizations, balanced, 134
parenthesized formulas, 28
partition, applications of, 82
Pascal, 5, 64
Pascal’s triangle, 132
path reconstruction, 250
paths across a grid, counting, 131
paths in graphs, 196
pattern matching, 61
penalty costs, 251
permutation, 130
permutation maxima and minima, 179
permutation operations, 48
permutations, constructing, 170
permuted subsequence, 69
perpendicular lines, 293
Pick’s theorem, 325
planar graph, 220, 271
Plato’s academy, 291
playing cards, 36
point location, 322
pointers, 7
points, three dimensions, 13
poker, 43
polygon, 313, 315
polygon intersection, 330
polynomial multiplication, 124
polynomials, 115
pop, 28
powers of two, 121
pre-order traversal, 198
precision, 103
presentation error verdict, 3
Prim’s algorithm, 220, 223
primality testing, 148
prime factorization, 148
prime number, 147
printable character, 57
priority queue, 31, 33, 79
problem description, 35
problem ratings, 13


product rule, combinatorics, 129
program flow graph, 190
program testing, 39
programming challenges robot judge, 2
programming examples, 6
programming languages, 5
proof by contradiction, 149
pruning search, 173
push, 28
Pythagorean theorem, 295
qsort(), 83
quadratic equation, 116
queue, 28, 33
quicksort, 81
radian measure, 294
radius, 298
randomized primality testing, 149, 156, 158
range query, 324
ranking function, 36, 57
rational functions, 116
rational number, 112, 113
rational to decimal conversion, 114
ray, 294
reading graphs, 193
reading problem descriptions, 35
real numbers, 112
records, 12
rectilinear grid, 268
recurrence relation, 131
recurrence relation, basis case, 132
recursion, 197, 198
recursion and induction, 135
recursion, backtracking, 169
recursion, speeding up, 246
recursive program calls, 28
reducing fractions, 114
redundant code, 10
repeating decimals, 114
replacing a substring, 62
restricted function verdict, 4
reversing a string, 62
right angle, 295
road network, 190
robot judge account, 2
Roman numbers, 107
root finding algorithms, 116
rooted tree, 198, 218


roundoff error, 113
routing numbers, 113
row-major ordering, 269
runtime error verdict, 4
schedule data structure, 32
scheduling, 200
scheduling problems, 94
search, 30
selection, 79
selection sort, 80
self-describing sequence, 144
sentinels, 12
set data structure, 32
set partitions, 135
Seuss, Dr., 33
shortest path algorithms, 223
shortest path, unweighted graph, 197
shuffling cards, 48
side-angle-side, 297
sieve of Eratosthenes, 45
signed area, 297, 315, 322
simple graph, 190
sine, 295
snake ordering, 270
soccer scoreboard, 99
software tools philosophy, 7
Soh-Cah-Toa, 296
solving congruences, 155
solving recurrence relations, 135
solving triangles, 296
sorting, 30
sorting algorithms, 80
sorting applications, 78
sorting order, 57
spanning tree, 218
sparse polynomial, 116
sphere packing, 277
square root computation, 116
stable sorting algorithms, 83
stack, 28, 33
standard input/output, 8
Standard Template Library, 33, 84
Stern-Brocot number system, 125
Stirling numbers, 134
streams, 37
strikes, 45
string, 56
string comparison, 61

string input, 37
string library functions, 64
string representation, 58
string, combinatorial, 131
strongly connected component, 218
submission error verdict, 4
submissions by programming language, 6
submitting programs, 2
subset, 130
subset sum, 263
subsets, 32
subsets, constructing, 169
substring matching, 252
subtracting polynomials, 115
subtraction, 106
subtraction, congruence, 154
subtraction, modular, 152
subtraction, rational, 113
sum rule, combinatorics, 130
Superman, 299
sweep-line algorithm, 32
symbolic constants, 9
symmetry, 175
tangent line, 298
target pair, 79
templates, C++, 33
testing tips, 39
text formatting, 75
text searching, 61
text string, 56
tiling problem, 186
time limit exceeded verdict, 4
topological graph, 190
topological sorting, 200
tower of Hanoi, 142
traceback, dynamic programming, 250
transitive closure, 227
traveling salesman problem, 260
tree, 218
tree edge, 199
triangles, solving, 296
triangular grid, 271
triangulation, 319
triangulations, counting, 134
trigonometry, 294
tug of war, 181
twos complement numbers, 103
typing error, 66

undirected graph, 189
Unicode, 58
union, 32, 79, 84
uniqueness testing, 78
Universidad de Valladolid robot judge, 1
unlabeled graph, 191
unranking function, 36, 57
unweighted graph, 190
uppercase character, 57
van Gogh’s algorithm, 320
variable documentation, 9
vertex, 189
vertex connectivity, 218
vertex degree, 217
War, card game of, 34
Waring’s problem, 161
weighted graph, 190
wild card character, 61
word grid, 67
word ladder, 74
wrong answer verdict, 4
Yahtzee, 53


