A Practitioner's Guide to Software Test Design by Lee Copeland
ISBN: 1-58053-791-X
Artech House © 2004

This text presents all the important test design techniques in a single place and in a consistent, easy-to-digest format. It enables you to choose the best test case design, find software defects, develop optimal strategies, and more.

Table of Contents
A Practitioner's Guide to Software Test Design
Preface
Chapter 1 - The Testing Process
Chapter 2 - Case Studies
Section I - Black Box Testing Techniques
Chapter 3 - Equivalence Class Testing
Chapter 4 - Boundary Value Testing
Chapter 5 - Decision Table Testing
Chapter 6 - Pairwise Testing
Chapter 7 - State-Transition Testing
Chapter 8 - Domain Analysis Testing
Chapter 9 - Use Case Testing
Section II - White Box Testing Techniques
Chapter 10 - Control Flow Testing
Chapter 11 - Data Flow Testing
Section III - Testing Paradigms
Chapter 12 - Scripted Testing
Chapter 13 - Exploratory Testing
Chapter 14 - Test Planning
Section IV - Supporting Technologies
Chapter 15 - Defect Taxonomies
Chapter 16 - When to Stop Testing
Section V - Some Final Thoughts
Appendix A - Brown & Donaldson Case Study
Appendix B - Stateless University Registration System Case Study
Bibliography
Index
List of Figures
List of Tables
List of Examples

Back Cover
Here’s a comprehensive, up-to-date and practical introduction to software test design. This invaluable book presents all the important test design techniques in a single place and in a consistent, easy-to-digest format. An immediately useful handbook for test engineers, developers, quality assurance professionals, and requirements and systems analysts, it enables you to: choose the best test case design, find software defects in less time and with fewer resources, and develop optimal strategies that help reduce the likelihood of costly errors. It also assists you in estimating the effort, time and cost of good testing. Numerous case studies and examples of software testing techniques are included, helping you to fully understand the practical applications of these techniques. From well-established techniques such as equivalence classes, boundary value analysis, decision tables, and state-transition diagrams, to new techniques like use case testing, pairwise testing, and exploratory testing, the book is an indispensable resource for testing professionals seeking to improve their skills and an excellent reference for college-level courses in software test design.

About the Author
Lee Copeland is an internationally known consultant in software testing, with over 30 years of experience as an information systems professional. He has held a number of technical and managerial positions with commercial and nonprofit organizations in the areas of software development, testing, and process improvement. He has taught seminars and consulted extensively throughout the United States and internationally.

A Practitioner's Guide to Software Test Design
Lee Copeland
Artech House Publishers, Boston • London
Library of Congress and British CIP information available on request
685 Canton Street, Norwood, MA 02062, (781) 769-9750, www.artechhouse.com
46 Gillingham Street, London SW1V 1AH, +44 (0)20 7596-8750
Copyright © 2004 STQE Publishing
All rights reserved. No part of this book shall be reproduced, stored in a retrieval system, or transmitted by any means, electronic, mechanical, photocopying, recording, or otherwise without written permission from the publisher.
International Standard Book Number: 1-58053-791-X Printed in the United States of America First Printing: November 2003 Trademarks All terms mentioned in this book that are known to be trademarks or service marks have been appropriately capitalized. Artech House Publishers and STQE Publishing cannot attest to the accuracy of this information. Use of a term in this book should not be regarded as affecting the validity of any trademark or service mark. Warning and Disclaimer Every effort has been made to make this book as complete and accurate as possible, but no warranty or fitness is implied. The information provided is on an "as is" basis. The authors and the publisher shall have neither liability nor responsibility to any person or entity with respect to any loss or damages arising from the information contained in this book. Dedication To my wife Suzanne, and our wonderful children and grandchildren Shawn and Martha Andrew and Cassandra David Cathleen Katelynn and Kiley Melissa and Jay Ross, Elizabeth, and Miranda Brian and Heather Cassidy and Caden Thomas and Jeni Carrie Sundari Rajan and to Wayne, Jerry, Dani, Ron, and Rayanne for their encouragement over the years. Lee Copeland is an internationally known consultant in software testing, with over 30 years of experience as an information systems professional. He has held a number of technical and managerial positions with commercial and nonprofit organizations in the areas of software development, testing, and process improvement. He has taught seminars and consulted extensively throughout the United States and internationally. As a consultant for Software Quality Engineering, Lee travels the world promoting effective software testing to his clients. In addition, he is the program chair for STAREAST and STARWEST, the world's premier conferences on software testing. Preface A Practitioner's Guide to Software Test Design contains today's important current test design approaches in one unique book. Until now, software testers had to search through a number of books, periodicals, and Web sites to locate this vital information. Importance of Test Design "The act of careful, complete, systematic, test design will catch as many bugs as the act of testing. ... Personally, I believe that it's far more effective." - Boris Beizer The book focuses only on software test design, not related subjects such as test planning, test management, test team development, etc. While those are important in software testing, they have often overshadowed what testers really need—the more practical aspects of testing, specifically test case design. Other excellent books can guide you through the overall process of software testing. One of my favorites is Systematic Software Testing by Rick Craig and Stefan Jaskiel. A Practitioner's Guide to Software Test Design illustrates each test design approach through detailed examples and step-by-step instructions. These lead the reader to a clear understanding of each test design technique. Today's Testing Challenges For any system of interesting size it is impossible to test all the different logic paths and all the different input data combinations. Of the infinite number of choices, each one of which is worthy of some level of testing, testers can only choose a very small subset because of resource constraints. The purpose of this book is to help you analyze, design, and choose such subsets, to implement those tests that are most likely to discover defects. It is vital to choose test cases wisely. 
Missing a defect can result in significant losses to your organization if a defective system is placed into production. A Practitioner's Guide to Software Test Design describes a set of key test design strategies that improve both the efficiency and effectiveness of software testers. Structure and Approach A Practitioner's Guide to Software Test Design explains the most important test design techniques in use today. Some of these techniques are classics and well known throughout the testing community. Some have been around for a while but are not well known among test engineers. Still others are not widely known, but should be because of their effectiveness. This book brings together all these techniques into one volume, helping the test designer become more efficient and effective in testing. Each test design technique is approached from a practical, rather than a theoretical basis. Each test design technique is first introduced through a simple example, then explained in detail. When possible, additional examples of its use are presented. The types of problems on which the approach can be used, along with its limitations, are described. Each test design technique chapter ends with a summary of its key points, along with exercises the reader can use for practice, and references for further reading. Testers can use the techniques presented immediately on their projects. A Note from the Author I love a good double integral sign as much as the next tester, but we're going to concentrate on the practical, not the theoretical. Each test design approach is described in a self-contained chapter. Because the chapters are focused, concise, and independent they can be read "out of order." Testers can read the chapters that are most relevant to their work at the moment. Audience This book was written specifically for: Software test engineers who have the primary responsibility for test case design. This book details the most efficient and effective methods for creating test cases. Software developers who, with the advent of Extreme Programming and other agile development methods, are being asked to do more and better testing of the software they write. Many developers have not been exposed to the design techniques described in this book. Test and development managers who must understand, at least in principle, the work their staff performs. Not only does this book provide an overview of important test design methods, it will assist managers in estimating the effort, time, and cost of good testing. Quality assurance and process improvement engineers who are charged with defining and improving their software testing process. Instructors and professors who are searching for an excellent reference for a course in software test design techniques. Appreciation The following reviewers have provided invaluable assistance in the writing of this book: Anne Meilof, Chuck Allison, Dale Perry, Danny Faught, Dorothy Graham, Geoff Quentin, James Bach, Jon Hagar, Paul Gerrard, Rex Black, Rick Craig, Robert Rose-Coutré, Sid Snook, and Wayne Middleton. My sincere thanks to each of them. Any faults in this book should be attributed directly to them. (Just kidding!) Some Final Comments This book contains a number of references to Web sites. These references were correct when the manuscript was submitted to the publisher. Unfortunately, they may have become broken by the time the book is in the readers' hands. It has become standard practice for authors to include a pithy quotation on the title page of each chapter. 
Unfortunately, the practice has become so prevalent that all the good quotations have been used. Just for fun, I have chosen instead to include on each chapter title page a winning entry from the 2003 Bulwer-Lytton Fiction Contest (http://www.bulwer-lytton.com). Since 1982, the English Department at San Jose State University has sponsored this event, a competition that challenges writers to compose the opening sentence to the worst of all possible novels. It was inspired by Edward George Bulwer-Lytton who began his novel Paul Clifford with: "It was a dark and stormy night; the rain fell in torrents—except at occasional intervals, when it was checked by a violent gust of wind which swept up the streets (for it is in London that our scene lies), rattling along the housetops, and fiercely agitating the scanty flame of the lamps that struggled against the darkness." My appreciation to Dr. Scott Rice of San Jose State University for permission to use these exemplary illustrations of bad writing. Hopefully, nothing in this book will win this prestigious award. Acknowledgements The caricature of Mick Jagger is owned and copyrighted by Martin O'Loughlin and used by permission. Clip Art copyright by Corel Corporation and used under a licensing agreement. References Beizer, Boris (1990). Software Testing Techniques (Second Edition). Van Nostrand Reinhold. Craig, Rick D. and Stefan P. Jaskiel (2002). Systematic Software Testing. Artech House Publishers. Chapter 1: The Testing Process Overview The flock of geese flew overhead in a 'V' formation—not in an old-fashioned-looking Times New Roman kind of a 'V', branched out slightly at the two opposite arms at the top of the 'V', nor in a more modern-looking, straight and crisp, linear Arial sort of 'V' (although since they were flying, Arial might have been appropriate), but in a slightly asymmetric, tilting off-to-one-side sort of italicized Courier New-like 'V'—and LaFonte knew that he was just the type of man to know the difference. [1] — John Dotson [1]If you think this quotation has nothing to do with software testing you are correct. For an explanation please read "Some Final Comments" in the Preface. Testing What is testing? While many definitions have been written, at its core testing is the process of comparing "what is" with "what ought to be." A more formal definition is given in the IEEE Standard 610.12-1990, "IEEE Standard Glossary of Software Engineering Terminology" which defines "testing" as: "The process of operating a system or component under specified conditions, observing or recording the results, and making an evaluation of some aspect of the system or component." The "specified conditions" referred to in this definition are embodied in test cases, the subject of this book. Key Point At its core, testing is the process of comparing "what is" with "what ought to be." Rick Craig and Stefan Jaskiel propose an expanded definition of software testing in their book, Systematic Software Testing. "Testing is a concurrent lifecycle process of engineering, using and maintaining testware in order to measure and improve the quality of the software being tested." This view includes the planning, analysis, and design that leads to the creation of test cases in addition to the IEEE's focus on test execution. Different organizations and different individuals have varied views of the purpose of software testing. Boris Beizer describes five levels of testing maturity. 
(He called them phases but today we know the politically correct term is "levels" and there are always five of them.) Level 0 - "There's no difference between testing and debugging. Other than in support of debugging, testing has no purpose." Defects may be stumbled upon but there is no formalized effort to find them. Level 1 - "The purpose of testing is to show that software works." This approach, which starts with the premise that the software is (basically) correct, may blind us to discovering defects. Glenford Myers wrote that those performing the testing may subconsciously select test cases that should not fail. They will not create the "diabolical" tests needed to find deeply hidden defects. Level 2 - "The purpose of testing is to show that the software doesn't work." This is a very different mindset. It assumes the software doesn't work and challenges the tester to find its defects. With this approach, we will consciously select test cases that evaluate the system in its nooks and crannies, at its boundaries, and near its edges, using diabolically constructed test cases. Level 3 - "The purpose of testing is not to prove anything, but to reduce the perceived risk of not working to an acceptable value." While we can prove a system incorrect with only one test case, it is impossible to ever prove it correct. To do so would require us to test every possible valid combination of input data and every possible invalid combination of input data. Our goals are to understand the quality of the software in terms of its defects, to furnish the programmers with information about the software's deficiencies, and to provide management with an evaluation of the negative impact on our organization if we shipped this system to customers in its present state. Level 4 - "Testing is not an act. It is a mental discipline that results in low-risk software without much testing effort." At this maturity level we focus on making software more testable from its inception. This includes reviews and inspections of its requirements, design, and code. In addition, it means writing code that incorporates facilities the tester can easily use to interrogate it while it is executing. Further, it means writing code that is self-diagnosing, that reports errors rather than requiring testers to discover them. Current Challenges When I ask my students about the challenges they face in testing they typically reply: Not enough time to test properly Too many combinations of inputs to test Not enough time to test well Difficulty in determining the expected results of each test Nonexistent or rapidly changing requirements Not enough time to test thoroughly No training in testing processes No tool support Management that either doesn't understand testing or (apparently) doesn't care about quality Not enough time This book does not contain "magic pixie dust" that you can use to create additional time, better requirements, or more enlightened management. It does, however, contain techniques that will make you more efficient and effective in your testing by helping you choose and construct test cases that will find substantially more defects than you have in the past while using fewer resources. Test Cases To be most effective and efficient, test cases must be designed, not just slapped together. The word "design" has a number of definitions: 1. To conceive or fashion in the mind; invent: design a good reason to attend the STAR testing conference. To formulate a plan for; devise: design a marketing strategy for the new product. 2. 
To plan out in systematic, usually documented form: design a building; design a test case. 3. To create or contrive for a particular purpose or effect: a game designed to appeal to all ages. 4. To have as a goal or purpose; intend. 5. To create or execute in an artistic or highly skilled manner. Key Point To be most effective and efficient, test cases must be designed, not just slapped together. Each of these definitions applies to good test case design. Regarding test case design, Roger Pressman wrote: "The design of tests for software and other engineering products can be as challenging as the initial design of the product itself. Yet ... software engineers often treat testing as an afterthought, developing test cases that 'feel right' but have little assurance of being complete. Recalling the objectives of testing, we must design tests that have the highest likelihood of finding the most errors with a minimum amount of time and effort." Well designed test cases are composed of three parts: Inputs Outputs Order of execution Key Point Test cases consist of inputs, outputs, and order of execution. Inputs Inputs are commonly thought of as data entered at a keyboard. While that is a significant source of system input, data can come from other sources—data from interfacing systems, data from interfacing devices, data read from files or databases, the state the system is in when the data arrives, and the environment within which the system executes. Outputs Outputs have this same variety. Often outputs are thought of as just the data displayed on a computer screen. In addition, data can be sent to interfacing systems and to external devices. Data can be written to files or databases. The state or the environment may be modified by the system's execution. All of these relevant inputs and outputs are important components of a test case. In test case design, determining the expected outputs is the function of an "oracle." An oracle is any program, process, or data that provides the test designer with the expected result of a test. Beizer lists five types of oracles: Kiddie Oracles - Just run the program and see what comes out. If it looks about right, it must be right. Regression Test Suites - Run the program and compare the output to the results of the same tests run against a previous version of the program. Validated Data - Run the program and compare the results against a standard such as a table, formula, or other accepted definition of valid output. Purchased Test Suites - Run the program against a standardized test suite that has been previously created and validated. Programs like compilers, Web browsers, and SQL (Structured Query Language) processors are often tested against such suites. Existing Program - Run the program and compare the output to another version of the program. Order of Execution There are two styles of test case design regarding order of test execution. Cascading test cases - Test cases may build on each other. For example, the first test case exercises a particular feature of the software and then leaves the system in a state such that the second test case can be executed. In testing a database consider these test cases: 1. Create a record 2. Read the record 3. Update the record 4. Read the record 5. Delete the record 6. Read the deleted record Each of these tests could be built on the previous tests. The advantage is that each test case is typically smaller and simpler. The disadvantage is that if one test fails, the subsequent tests may be invalid. 
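To make the cascading style concrete, here is a minimal sketch, not taken from the book, of the six database tests above written as one dependent sequence in C. The toy in-memory record store and its function names are invented for illustration; the point is only that each step relies on the state left behind by the step before it, so a failure early in the chain invalidates everything that follows. The complementary style, independent test cases, is described next.

/* Sketch (not from the book): the six cascading database tests, run against
   a toy in-memory "record store" so the example is self-contained. */
#include <assert.h>
#include <stdio.h>
#include <string.h>

static char record[32];          /* the single stored record */
static int  recordExists = 0;

static void createRecord(const char *value) { strcpy(record, value); recordExists = 1; }
static const char *readRecord(void)         { return recordExists ? record : NULL; }
static void updateRecord(const char *value) { if (recordExists) strcpy(record, value); }
static void deleteRecord(void)              { recordExists = 0; }

int main(void) {
    /* 1. Create a record */
    createRecord("Lee");
    /* 2. Read the record - depends on test 1 having left a record behind */
    assert(readRecord() != NULL && strcmp(readRecord(), "Lee") == 0);
    /* 3. Update the record */
    updateRecord("Copeland");
    /* 4. Read the record - depends on tests 1 and 3 */
    assert(strcmp(readRecord(), "Copeland") == 0);
    /* 5. Delete the record */
    deleteRecord();
    /* 6. Read the deleted record - expect "not found" */
    assert(readRecord() == NULL);

    printf("All cascading tests passed\n");
    return 0;   /* if step 1 had failed, every later assert would be meaningless */
}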
Independent test cases - Each test case is entirely self-contained. Tests do not build on each other or require that other tests have been successfully executed. The advantage is that any number of tests can be executed in any order. The disadvantage is that each test tends to be larger and more complex and thus more difficult to design, create, and maintain.

Types Of Testing
Testing is often divided into black box testing and white box testing. Black box testing is a strategy in which testing is based solely on the requirements and specifications. Unlike its complement, white box testing, black box testing requires no knowledge of the internal paths, structure, or implementation of the software under test. White box testing is a strategy in which testing is based on the internal paths, structure, and implementation of the software under test. Unlike its complement, black box testing, white box testing generally requires detailed programming skills. An additional type of testing is called gray box testing. In this approach we peek into the "box" under test just long enough to understand how it has been implemented. Then we close up the box and use our knowledge to choose more effective black box tests.

Testing Levels
Typically testing, and therefore test case design, is performed at four different levels:

Unit Testing - A unit is the "smallest" piece of software that a developer creates. It is typically the work of one programmer and is stored in a single disk file. Different programming languages have different units: In C++ and Java the unit is the class; in C the unit is the function; in less structured languages like Basic and COBOL the unit may be the entire program.

Key Point
The classical testing levels are unit, integration, system, and acceptance.

Integration Testing - In integration we assemble units together into subsystems and finally into systems. It is possible for units to function perfectly in isolation but to fail when integrated. A classic example is this C program and its subsidiary function:

/* main program */
void oops(int);
int main() {
    oops(42);    /* call the oops function passing an integer */
    return 0;
}

/* function oops (in a separate file) */
#include <stdio.h>
void oops(double x) {       /* expects a double, not an int! */
    printf("%f\n", x);      /* will print garbage (0 is most likely) */
}

If these units were tested individually, each would appear to function correctly. In this case, the defect only appears when the two units are integrated. The main program passes an integer to function oops, but oops expects a double-precision floating-point value, and trouble ensues. It is vital to perform integration testing as the integration process proceeds.

System Testing - A system consists of all of the software (and possibly hardware, user manuals, training materials, etc.) that make up the product delivered to the customer. System testing focuses on defects that arise at this highest level of integration. Typically system testing includes many types of testing: functionality, usability, security, internationalization and localization, reliability and availability, capacity, performance, backup and recovery, portability, and many more. This book deals only with functionality testing. While the other types of testing are important, they are beyond the scope of this volume.

Acceptance Testing - Acceptance testing is defined as that testing, which when completed successfully, will result in the customer accepting the software and giving us their money.
From the customer's point of view, they would generally like the most exhaustive acceptance testing possible (equivalent to the level of system testing). From the vendor's point of view, we would generally like the minimum level of testing possible that would result in money changing hands. Typical strategic questions that should be addressed before acceptance testing are:
Who defines the level of the acceptance testing?
Who creates the test scripts?
Who executes the tests?
What are the pass/fail criteria for the acceptance test?
When and how do we get paid?

Not all systems are amenable to using these levels. These levels assume that there is a significant period of time between developing units and integrating them into subsystems and then into systems. In Web development it is often possible to go from concept to code to production in a matter of hours. In that case, the unit-integration-system levels don't make much sense. Many Web testers use an alternate set of levels:
Code quality
Functionality
Usability
Performance
Security

The Impossibility Of Testing Everything
In his monumental book Testing Object-Oriented Systems, Robert Binder provides an excellent example of the impossibility of testing "everything." Consider the following program:

int blech (int j) {
    j = j - 1;     // should be j = j + 1
    j = j / 30000;
    return j;
}

Note that the second line is incorrect! The function blech accepts an integer j, subtracts one from it, divides it by 30000 (integer division, whole numbers, no remainder) and returns the value just computed. If integers are implemented using 16 bits on this computer executing this software, the lowest possible input value is -32768 and the highest is 32767. Thus there are 65,536 possible inputs into this tiny program. (Your organization's programs are probably larger.) Will you have the time (and the stamina) to create 65,536 test cases? Of course not. So which input values do we choose? Consider the following input values and their ability to detect this defect.

Input (j)   Expected Result   Actual Result
1           0                 0
42          0                 0
40000       1                 1
-64000      -2                -2

Oops! Note that none of the test cases chosen have detected this defect. In fact only four of the possible 65,536 input values will find this defect. What is the chance that you will choose all four? What is the chance you will choose one of the four? What is the chance you will win the Powerball lottery? Is your answer the same to each of these three questions?

Summary
Testing is a concurrent lifecycle process of engineering, using, and maintaining testware in order to measure and improve the quality of the software being tested. (Craig and Jaskiel)
The design of tests for software and other engineering products can be as challenging as the initial design of the product itself. Yet ... software engineers often treat testing as an afterthought, developing test cases that 'feel right' but have little assurance of being complete. Recalling the objectives of testing, we must design tests that have the highest likelihood of finding the most errors with a minimum amount of time and effort. (Pressman)
Black box testing is a strategy in which testing is based solely on the requirements and specifications. White box testing is a strategy in which testing is based on the internal paths, structure, and implementation of the software under test.
Typically testing, and therefore test case design, is performed at four different levels: Unit, Integration, System, and Acceptance.

Practice
1.
Which four inputs to the blech routine will find the hidden defect? How did you determine them? What does this suggest to you as an approach to finding other defects? References Beizer, Boris (1990). Software Testing Techniques (Second Edition). Van Nostrand Reinhold. Binder, Robert V. (2000). Testing Object-Oriented Systems: Models, Patterns, and Tools. Addison-Wesley. Craig, Rick D. and Stefan P. Jaskiel (2002). Systematic Software Testing. Artech House Publishers. IEEE Standard 610.12-1990, IEEE Standard Glossary of Software Engineering Terminology, 1991. Myers, Glenford (1979). The Art of Software Testing. John Wiley & Sons. Pressman, Roger S. (1982). Software Engineering: A Practitioner's Approach (Fourth Edition). McGraw-Hill. Chapter 2: Case Studies They had but one last remaining night together, so they embraced each other as tightly as that two-flavor entwined string cheese that is orange and yellowish-white, the orange probably being a bland Cheddar and the white . . . Mozzarella, although it could possibly be Provolone or just plain American, as it really doesn't taste distinctly dissimilar from the orange, yet they would have you believe it does by coloring it differently. — Mariann Simms Why Case Studies? Two case studies are provided in the appendices of this book. Appendix A describes "Brown & Donaldson," an online brokerage firm. Appendix B describes the "Stateless University Registration System." Examples from these case studies are used to illustrate the test case design techniques described in this book. In addition, some of the book's exercises are based on the case studies. The following sections briefly describe the case studies. Read the detailed information in Appendix A and B when required. Brown & Donaldson Brown & Donaldson (B&D) is a fictitious online brokerage firm that you can use to practice the test design techniques presented in this book. B&D was originally created for Software Quality Engineering's Web/eBusiness Testing course (for more details see http://www.sqe.com). Screen shots of various pages are included in Appendix A. Reference will be made to some of these throughout the book. The actual B&D Web site is found at http://bdonline.sqe.com. Any resemblance to any actual online brokerage Web site is purely coincidental. You can actually try the B&D Web site. First-time users will need to create a BDonline account. This account is not real—any transactions requested or executed via this account will not occur in the real world, only in the fictitious world of B&D. Once you have created an account, you will bypass this step and login with your username and password. While creating a new account you will be asked to supply an authorization code. The authorization code is eight 1s. This Web site also contains a number of downloadable documents from the B&D case study, which can be used to assist you in developing test plans for your own Web projects. Stateless University Registration System Every state has a state university. This case study describes an online student registration system for the fictitious Stateless University. Please do not attempt to cash out your stocks from Brown & Donaldson to enroll at Stateless U. The document in Appendix B describes the planned user interface for the Stateless University Registration System (SURS). It defines the user interface screens in the order in which they are typically used. It starts with the login screen. 
Then it provides the data base set-up fields, the addition/change/deletion of students, the addition/change/deletion of courses, and the addition/change/deletion of class sections. The final data entry screen provides the selection of specific course sections for each student. Additional administrative functions are also defined. Section I: Black Box Testing Techniques Chapter List Chapter 3: Equivalence Class Testing Chapter 4: Boundary Value Testing Chapter 5: Decision Table Testing Chapter 6: Pairwise Testing Chapter 7: State-Transition Testing Chapter 8: Domain Analysis Testing Chapter 9: Use Case Testing Part Overview Definition Black box testing is a strategy in which testing is based solely on the requirements and specifications. Unlike its complement, white box testing, black box testing requires no knowledge of the internal paths, structure, or implementation of the software under test (SUT). The general black box testing process is: The requirements or specifications are analyzed. Valid inputs are chosen based on the specification to determine that the SUT processes them correctly. Invalid inputs must also be chosen to verify that the SUT detects them and handles them properly. Expected outputs for those inputs are determined. Tests are constructed with the selected inputs. The tests are run. Actual outputs are compared with the expected outputs. A determination is made as to the proper functioning of the SUT. Applicability Black box testing can be applied at all levels of system development—unit, integration, system, and acceptance. As we move up in size from module to subsystem to system the box gets larger, with more complex inputs and more complex outputs, but the approach remains the same. Also, as we move up in size, we are forced to the black box approach; there are simply too many paths through the SUT to perform white box testing. Disadvantages When using black box testing, the tester can never be sure of how much of the SUT has been tested. No matter how clever or diligent the tester, some execution paths may never be exercised. For example, what is the probability a tester would select a test case to discover this "feature"? if (name=="Lee" && employeeNumber=="1234" && employmentStatus=="RecentlyTerminatedForCause") { send Lee a check for $1,000,000; } Key Point When using black box testing, the tester can never be sure of how much of the system under test has been tested. To find every defect using black box testing, the tester would have to create every possible combination of input data, both valid and invalid. This exhaustive input testing is almost always impossible. We can only choose a subset (often a very small subset) of the input combinations. In The Art of Software Testing, Glenford Myers provides an excellent example of the futility of exhaustive testing: How would you thoroughly test a compiler? By writing every possible valid and invalid program. The problem is substantially worse for systems that must remember what has happened before (i.e., that remember their state). In those systems, not only must we test every possible input, we must test every possible sequence of every possible input. Key Point Even though we can't test everything, formal black box testing directs the tester to choose subsets of tests that are both efficient and effective in finding defects. Advantages Even though we can't test everything, formal black box testing directs the tester to choose subsets of tests that are both efficient and effective in finding defects. 
As such, these subsets will find more defects than a randomly created equivalent number of tests. Black box testing helps maximize the return on our testing investment.

References
Myers, Glenford J. (1979). The Art of Software Testing. John Wiley & Sons.

Chapter 3: Equivalence Class Testing

On the fourth day of his exploration of the Amazon, Byron climbed out of his inner tube, checked the latest news on his personal digital assistant (hereafter PDA) outfitted with wireless technology, and realized that the gnawing he felt in his stomach was not fear —no, he was not afraid, rather elated—nor was it tension—no, he was actually rather relaxed—so it was in all probability a parasite.
— Chuck Keelan

Introduction
Equivalence class testing is a technique used to reduce the number of test cases to a manageable level while still maintaining reasonable test coverage. This simple technique is used intuitively by almost all testers, even though they may not be aware of it as a formal test design method. Many testers have logically deduced its usefulness, while others have discovered it simply because of lack of time to test more thoroughly.

Consider this situation. We are writing a module for a human resources system that decides how we should process employment applications based on a person's age. Our organization's rules are:

0–16    Don't hire
16–18   Can hire on a part-time basis only
18–55   Can hire as a full-time employee
55–99   Don't hire[*]

[*]Note: If you've spotted a problem with these requirements, don't worry. They are written this way for a purpose and will be repaired in the next chapter.

Observation
With these rules our organization would not have hired Doogie Howser, M.D. or Col. Harland Sanders, one too young, the other too old.

Should we test the module for the following ages: 0, 1, 2, 3, 4, 5, 6, 7, 8, ..., 90, 91, 92, 93, 94, 95, 96, 97, 98, 99? If we had lots of time (and didn't mind the mind-numbing repetition and were being paid by the hour) we certainly could. If the programmer had implemented this module with the following code we should test each age. (If you don't have a programming background don't worry. These examples are simple. Just read the code and it will make sense to you.)

If (applicantAge == 0) hireStatus="NO";
If (applicantAge == 1) hireStatus="NO";
…
If (applicantAge == 14) hireStatus="NO";
If (applicantAge == 15) hireStatus="NO";
If (applicantAge == 16) hireStatus="PART";
If (applicantAge == 17) hireStatus="PART";
If (applicantAge == 18) hireStatus="FULL";
If (applicantAge == 19) hireStatus="FULL";
…
If (applicantAge == 53) hireStatus="FULL";
If (applicantAge == 54) hireStatus="FULL";
If (applicantAge == 55) hireStatus="NO";
If (applicantAge == 56) hireStatus="NO";
…
If (applicantAge == 98) hireStatus="NO";
If (applicantAge == 99) hireStatus="NO";

Given this implementation, the fact that any set of tests passes tells us nothing about the next test we could execute. It may pass; it may fail. Luckily, programmers don't write code like this (at least not very often). A better programmer might write:

If (applicantAge >= 0 && applicantAge <= 16) hireStatus="NO";
If (applicantAge >= 16 && applicantAge <= 18) hireStatus="PART";
If (applicantAge >= 18 && applicantAge <= 55) hireStatus="FULL";
If (applicantAge >= 55 && applicantAge <= 99) hireStatus="NO";

Given this typical implementation, it is clear that for the first requirement we don't have to test 0, 1, 2, ... 14, 15, and 16. Only one value needs to be tested. And which value?
Any one within that range is just as good as any other one. The same is true for each of the other ranges. Ranges such as the ones described here are called equivalence classes. An equivalence class consists of a set of data that is treated the same by the module or that should produce the same result. Any data value within a class is equivalent, in terms of testing, to any other value. Specifically, we would expect that:
If one test case in an equivalence class detects a defect, all other test cases in the same equivalence class are likely to detect the same defect.
If one test case in an equivalence class does not detect a defect, no other test case in the same equivalence class is likely to detect the defect.

Key Point
A group of tests forms an equivalence class if you believe that:
They all test the same thing.
If one test catches a bug, the others probably will too.
If one test doesn't catch a bug, the others probably won't either.
— Cem Kaner, Testing Computer Software

This approach assumes, of course, that a specification exists that defines the various equivalence classes to be tested. It also assumes that the programmer has not done something strange such as:

If (applicantAge >= 0 && applicantAge <= 16) hireStatus="NO";
If (applicantAge >= 16 && applicantAge <= 18) hireStatus="PART";
If (applicantAge >= 18 && applicantAge <= 41) hireStatus="FULL";
// strange statements follow
If (applicantAge == 42 && applicantName == "Lee") hireStatus="HIRE NOW AT HUGE SALARY";
If (applicantAge == 42 && applicantName <> "Lee") hireStatus="FULL";
// end of strange statements
If (applicantAge >= 43 && applicantAge <= 55) hireStatus="FULL";
If (applicantAge >= 55 && applicantAge <= 99) hireStatus="NO";

Using the equivalence class approach, we have reduced the number of test cases from 100 (testing each age) to four (testing one age in each equivalence class)—a significant savings.

Now, are we ready to begin testing? Probably not. What about input values like 969, -42, FRED, and &$#!@? Should we create test cases for invalid input? The answer is, as any good consultant will tell you, "it depends." To understand this answer we need to examine an approach that came out of the object-oriented world called design-by-contract.

Note
According to the Bible, the age of Methuselah when he died was 969 years (Gen. 5:27). Thanks to the Gideons who made this data easily accessible in my hotel room without the need for a high speed Internet connection.

In law, a contract is a legally binding agreement between two (or more) parties that describes what each party promises to do or not do. Each of these promises is of benefit to the other. In the design-by-contract approach, modules (called "methods" in the object-oriented paradigm, but "module" is a more generic term) are defined in terms of pre-conditions and post-conditions. Post-conditions define what a module promises to do (compute a value, open a file, print a report, update a database record, change the state of the system, etc.). Pre-conditions define what that module requires so that it can meet its post-conditions. For example, if we had a module called openFile, what does it promise to do? Open a file. What would legitimate preconditions of openFile be? First, the file must exist; second, we must provide the name (or other identifying information) of the file; third, the file must be "openable," that is, it cannot already be exclusively opened by another process; fourth, we must have access rights to the file; and so on.
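For readers who want to see the contract idea in code, here is a hypothetical sketch, not taken from the book, of an openFile module written in the design-by-contract style in C. The function, its assert-based pre-condition checks, and the demo file name are all invented for illustration; the module states what it requires and promises its post-condition only when those requirements hold.

/* Hypothetical sketch (not from the book) of design-by-contract in C. */
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Pre-conditions:  a non-empty file name is supplied, the file exists,
 *                  and we have read access to it.
 * Post-condition:  returns an open FILE* positioned at the start of the file. */
FILE *openFile(const char *name) {
    assert(name != NULL && strlen(name) > 0);   /* caller must supply a name            */
    FILE *f = fopen(name, "r");
    assert(f != NULL);                          /* file must exist and be readable,
                                                   otherwise the contract was violated  */
    return f;
}

int main(void) {
    /* Satisfy the pre-conditions before calling: create the file first. */
    FILE *tmp = fopen("contract_demo.txt", "w");
    fputs("hello\n", tmp);
    fclose(tmp);

    FILE *f = openFile("contract_demo.txt");    /* called only under its contract */
    printf("opened successfully\n");
    fclose(f);
    remove("contract_demo.txt");
    return 0;
}

Testing-by-contract, described next, would exercise openFile only in situations like the one in main, where its pre-conditions are satisfied.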
Pre-conditions and postconditions establish a contract between a module and others that invoke it. Testing-by-contract is based on the design-by-contract philosophy. Its approach is to create test cases only for the situations in which the pre-conditions are met. For example, we would not test the openFile module when the file did not exist. The reason is simple. If the file does not exist, openFile does not promise to work. If there is no claim that it will work under a specific condition, there is no need to test under that condition.

For More Information
See Bertrand Meyer's book Object-Oriented Software Construction for more on design-by-contract.

At this point testers usually protest. Yes, they agree, the module does not claim to work in that case, but what if the preconditions are violated during production? What does the system do? Do we get a misspelled word on the screen or a smoking crater where our company used to be?

A different approach to design is defensive design. In this case the module is designed to accept any input. If the normal preconditions are met, the module will achieve its normal postconditions. If the normal pre-conditions are not met, the module will notify the caller by returning an error code or throwing an exception (depending on the programming language used). This notification is actually another one of the module's postconditions. Based on this approach we could define defensive testing: an approach that tests under both normal and abnormal pre-conditions.

Insight
A student in one of my classes, let's call him Fred, said he didn't really care which design approach was being used; he was going to always use defensive testing. When I asked why, he replied, "If it doesn't work, who will get the blame - those responsible or the testers?"

How does this apply to equivalence class testing? Do we have to test with inputs like -42, FRED, and &$#!@? If we are using design-by-contract and testing-by-contract the answer is No. If we are using defensive design and thus defensive testing, the answer is Yes. Ask your designers which approach they are using. If they answer either "contract" or "defensive," you know what style of testing to use. If they answer "Huh?" that means they are not thinking about how modules interface. They are not thinking about pre-condition and post-condition contracts. You should expect integration testing to be a prime source of defects that will be more complex and take more time than anticipated.

Technique
The steps for using equivalence class testing are simple. First, identify the equivalence classes. Second, create a test case for each equivalence class. You could create additional test cases for each equivalence class if you have the time and money. Additional test cases may make you feel warm and fuzzy, but they rarely discover defects the first doesn't find.

Insight
A student in one of my classes, let's call her Judy, felt very uncomfortable about having only one test case for each equivalence class. She wanted at least two for that warm and fuzzy feeling. I indicated that if she had the time and money that approach was fine but suggested the additional tests would probably be ineffective. I asked her to keep track of how many times the additional test cases found defects that the first did not and let me know. I never heard from Judy again.

Different types of input require different types of equivalence classes. Let's consider four possibilities. Let's assume a defensive testing philosophy of testing both valid and invalid input.
Testing invalid inputs is often a great source of defects. If an input is a continuous range of values, then there is typically one class of valid values and two classes of invalid values, one below the valid class and one above it. Consider the Goofy Mortgage Company (GMC). They will write mortgages for people with incomes between $1,000/month and $83,333/month. Anything below $1,000/month and you don't qualify. Anything over $83,333/month and you don't need GMC, just pay cash. For a valid input we might choose $1,342/month. For invalids we might choose $123/month and $90,000/month.

Figure 3-1: Continuous equivalence classes

If an input condition takes on discrete values within a range of permissible values, there are typically one valid and two invalid classes. GMC will write a single mortgage for one through five houses. (Remember, it's Goofy.) Zero or fewer houses is not a legitimate input, nor is six or greater. Neither are fractional or decimal values such as 2 1/2 or 3.14159. For a valid input we might choose two houses. Invalids could be -2 and 8.

Figure 3-2: Discrete equivalence classes

GMC will make mortgages only for a person. They will not make mortgages for corporations, trusts, partnerships, or any other type of legal entity.

Figure 3-3: Single selection equivalence classes

For a valid input we must use "person." For an invalid we could choose "corporation" or "trust" or any other random text string. How many invalid cases should we create? We must have at least one; we may choose additional tests for additional warm and fuzzy feelings.

GMC will make mortgages on Condominiums, Townhouses, and Single Family dwellings. They will not make mortgages on Duplexes, Mobile Homes, Treehouses, or any other type of dwelling.

Figure 3-4: Multiple selection equivalence class

For valid input we must choose from "Condominium," "Townhouse," or "Single Family." While the rule says choose one test case from the valid equivalence class, a more comprehensive approach would be to create test cases for each entry in the valid class. That makes sense when the list of valid values is small. But, if this were a list of the fifty states, the District of Columbia, and the various territories of the United States, would you test every one of them? What if the list were every country in the world? The correct answer, of course, depends on the risk to the organization if, as testers, we miss something that is vital.

Now, rarely will we have the time to create individual tests for every separate equivalence class of every input value that enters our system. More often, we will create test cases that test a number of input fields simultaneously. For example, we might create a single test case with the following combination of inputs:

Key Point
Rarely will we have the time to create individual tests for every separate equivalence class of every input value.

Table 3-1: A test case of valid data values.
Monthly Income   Number of Dwellings   Applicant     Dwelling Type   Result
$5,000           2                     Person        Condo           Valid

Each of these data values is in the valid range, so we would expect the system to perform correctly and for the test case to report Pass. It is tempting to use the same approach for invalid values.

Table 3-2: A test case of all invalid data values. This is not a good approach.
Monthly Income   Number of Dwellings   Applicant     Dwelling Type   Result
$100             8                     Partnership   Treehouse       Invalid

If the system accepts this input as valid, clearly the system is not validating the four input fields properly.
If the system rejects this input as invalid, it may do so in such a way that the tester cannot determine which field it rejected. For example:

ERROR: 653X-2.7 INVALID INPUT

In many cases, errors in one input field may cancel out or mask errors in another field so the system accepts the data as valid. A better approach is to test one invalid value at a time to verify the system detects it correctly.

Table 3-3: A set of test cases varying invalid values one by one.
Monthly Income   Number of Dwellings   Applicant     Dwelling Type   Result
$100             1                     Person        SingleFam       Invalid
$1,342           0                     Person        Condo           Invalid
$1,342           1                     Corporation   Townhouse       Invalid
$1,342           1                     Person        Treehouse       Invalid

For additional warm and fuzzy feelings, the inputs (both valid and invalid) could be varied.

Table 3-4: A set of test cases varying invalid values one by one but also varying the valid values.
Monthly Income   Number of Dwellings   Applicant     Dwelling Type   Result
$100             1                     Person        Single Family   Invalid
$1,342           0                     Person        Condominium     Invalid
$5,432           3                     Corporation   Townhouse       Invalid
$10,000          2                     Person        Treehouse       Invalid

Another approach to using equivalence classes is to examine the outputs rather than the inputs. Divide the outputs into equivalence classes, then determine what input values would cause those outputs. This has the advantage of guiding the tester to examine, and thus test, every different kind of output. But this approach can be deceiving. In the previous example, for the human resources system, one of the system outputs was NO, that is, Don't Hire. A cursory view of the inputs that should cause this output would yield {0, 1, ..., 14, 15}. Note that this is not the complete set. In addition {55, 56, ..., 98, 99} should also cause the NO output. It's important to make sure that all potential outputs can be generated, but don't be fooled into choosing equivalence class data that omits important inputs.

Examples
Example 1
Referring to the Trade Web page of the Brown & Donaldson Web site described in Appendix A, consider the Order Type field. The designer has chosen to implement the decision to Buy or Sell through radio buttons. This is a good design choice because it reduces the number of test cases the tester must create. Had this been implemented as a text field in which the user entered "Buy" or "Sell" the tester would have partitioned the valid inputs as {Buy, Sell} and the invalids as {Trade, Punt, ...}. What about "buy", "bUy", "BUY"? Are these valid or invalid entries? The tester would have to refer back to the requirements to determine their status. With the radio button implementation no invalid choices exist, so none need to be tested. Only the valid inputs {Buy, Sell} need to be exercised.

Insight
Let your designers and programmers know when they have helped you. They'll appreciate the thought and may do it again.

Example 2
Again, referring to the Trade Web page, consider the Quantity field. Input to this field can be between one and four numeric characters (0, 1, ..., 8, 9) with a valid value greater than or equal to 1 and less than or equal to 9999. A set of valid inputs is {1, 22, 333, 4444} while invalid inputs are {-42, 0, 12345, SQE, $#@%}.

Insight
Very often your designers and programmers use GUI design tools that can enforce restrictions on the length and content of input fields. Encourage their use. Then your testing can focus on making sure the requirement has been implemented properly with the tool.

Example 3
On the Trade page the user enters a ticker Symbol indicating the stock to buy or sell.
The valid symbols are {A, AA, AABC, AAC, ..., ZOLT, ZOMX, ZONA, ZRAN}. The invalid symbols are any combination of characters not included in the valid list. A set of valid inputs could be {A, AL, ABE, ACES, AKZOY} while a set of invalids could be {C, AF, BOB, CLUBS, AKZAM, 42, @#$%}.

For More Information
Click on the Symbol Lookup button on the B&D Trade page to see the full list of stock symbols.

Example 4
Rarely will we create separate sets of test cases for each input. Generally it is more efficient to test multiple inputs simultaneously within tests. For example, the following tests combine Buy/Sell, Symbol, and Quantity.

Table 3-5: A set of test cases varying invalid values one by one.
Buy/Sell   Symbol   Quantity   Result
Buy        A        10         Valid
Buy        C        20         Invalid
Buy        A        0          Invalid
Sell       ACES     10         Valid
Sell       BOB      33         Invalid
Sell       ABE      -3         Invalid

Applicability and Limitations
Equivalence class testing can significantly reduce the number of test cases that must be created and executed. It is most suited to systems in which much of the input data takes on values within ranges or within sets. It makes the assumption that data in the same equivalence class is, in fact, processed in the same way by the system. The simplest way to validate this assumption is to ask the programmer about their implementation. Equivalence class testing is equally applicable at the unit, integration, system, and acceptance test levels. All it requires are inputs or outputs that can be partitioned based on the system's requirements.

Summary
Equivalence class testing is a technique used to reduce the number of test cases to a manageable size while still maintaining reasonable coverage. This simple technique is used intuitively by almost all testers, even though they may not be aware of it as a formal test design method. An equivalence class consists of a set of data that is treated the same by the module or that should produce the same result. Any data value within a class is equivalent, in terms of testing, to any other value.

Practice
1. The following exercises refer to the Stateless University Registration System Web site described in Appendix B. Define the equivalence classes and suitable test cases for the following:
1. ZIP Code—five numeric digits.
2. State—the standard Post Office two-character abbreviation for the states, districts, territories, etc. of the United States.
3. Last Name—one through fifteen characters (including alphabetic characters, periods, hyphens, apostrophes, spaces, and numbers).
4. User ID—eight characters at least two of which are not alphabetic (numeric, special, nonprinting).
5. Student ID—eight characters. The first two represent the student's home campus while the last six are a unique six-digit number. Valid home campus abbreviations are: AN, Annandale; LC, Las Cruces; RW, Riverside West; SM, San Mateo; TA, Talbot; WE, Weber; and WN, Wenatchee.

References
Beizer, Boris (1990). Software Testing Techniques. Van Nostrand Reinhold.
Kaner, Cem, Jack Falk, and Hung Quoc Nguyen (1999). Testing Computer Software (Second Edition). John Wiley & Sons.
Myers, Glenford J. (1979). The Art of Software Testing. John Wiley & Sons.

Chapter 4: Boundary Value Testing

The Prince looked down at the motionless form of Sleeping Beauty, wondering how her supple lips would feel against his own and contemplating whether or not an Altoid was strong enough to stand up against the kind of morning breath only a hundred years' nap could create.
— Lynne Sella

Introduction
Equivalence class testing is the most basic test design technique. It helps testers choose a small subset of possible test cases while maintaining reasonable coverage. Equivalence class testing has a second benefit. It leads us to the idea of boundary value testing, the second key test design technique to be presented. In the previous chapter the following rules were given that indicate how we should process employment applications based on a person's age. The rules were:

0–16    Don't hire
16–18   Can hire on a part-time basis only
18–55   Can hire as a full-time employee
55–99   Don't hire

Notice the problem at the boundaries—the "edges" of each class. The age "16" is included in two different equivalence classes (as are 18 and 55). The first rule says don't hire a 16-year-old. The second rule says a 16-year-old can be hired on a part-time basis. Boundary value testing focuses on the boundaries simply because that is where so many defects hide. Experienced testers have encountered this situation many times. Inexperienced testers may have an intuitive feel that mistakes will occur most often at the boundaries. These defects can be in the requirements (as shown above) or in the code as shown below:

Key Point
Boundary value testing focuses on the boundaries because that is where so many defects hide.

If (applicantAge >= 0 && applicantAge <= 16) hireStatus="NO";
If (applicantAge >= 16 && applicantAge <= 18) hireStatus="PART";
If (applicantAge >= 18 && applicantAge <= 55) hireStatus="FULL";
If (applicantAge >= 55 && applicantAge <= 99) hireStatus="NO";

Of course, the mistake that programmers make is coding inequality tests improperly. Writing > (greater than) instead of ≥ (greater than or equal) is an example. The most efficient way of finding such defects, either in the requirements or the code, is through inspection. Gilb and Graham's book, Software Inspection, is an excellent guide to this process. However, no matter how effective our inspections, we will want to test the code to verify its correctness. Perhaps this is what our organization meant:

0–15    Don't hire
16–17   Can hire on a part-time basis only
18–54   Can hire as full-time employees
55–99   Don't hire

What about ages -3 and 101? Note that the requirements do not specify how these values should be treated. We could guess but "guessing the requirements" is not an acceptable practice. The code that implements the corrected rules is:

If (applicantAge >= 0 && applicantAge <= 15) hireStatus="NO";
If (applicantAge >= 16 && applicantAge <= 17) hireStatus="PART";
If (applicantAge >= 18 && applicantAge <= 54) hireStatus="FULL";
If (applicantAge >= 55 && applicantAge <= 99) hireStatus="NO";

The interesting values on or near the boundaries in this example are {-1, 0, 1}, {15, 16, 17}, {17, 18, 19}, {54, 55, 56}, and {98, 99, 100}. Other values, such as {-42, 1001, FRED, %$#@} might be included depending on the module's documented preconditions.

Technique
The steps for using boundary value testing are simple. First, identify the equivalence classes. Second, identify the boundaries of each equivalence class. Third, create test cases for each boundary value by choosing one point on the boundary, one point just below the boundary, and one point just above the boundary. "Below" and "above" are relative terms and depend on the data value's units. If the boundary is 16 and the unit is "integer" then the "below" point is 15 and the "above" point is 17.
If the boundary is $5.00 and the unit is "US dollars and cents" then the below point is $4.99 and the above point is $5.01. On the other hand, if the value is $5 and the unit is "US dollars" then the below point is $4 and the above point is $6. Key Point Create test cases for each boundary value by choosing one point on the boundary, one point just below the boundary, and one point just above the boundary. Note that a point just above one boundary may be in another equivalence class. There is no reason to duplicate the test. The same may be true of the point just below the boundary. You could, of course, create additional test cases farther from the boundaries (within equivalence classes) if you have the resources. As discussed in the previous chapter, these additional test cases may make you feel warm and fuzzy, but they rarely discover additional defects. Boundary value testing is most appropriate where the input is a continuous range of values. Returning again to the Goofy Mortgage Company, what are the interesting boundary values? For monthly income the boundaries are $1,000/month and $83,333/month (assuming the units to be US dollars). Figure 4-1: Boundary values for a continuous range of inputs. Test data input of {$999, $1,000, $1,001} on the low end and {$83,332, $83,333, $83,334} on the high end are chosen to test the boundaries. Because GMC will write a mortgage for one through five houses, zero or fewer houses is not a legitimate input nor is six or greater. These identify the boundaries for testing. Figure 4-2: Boundary values for a discrete range of inputs. Rarely will we have the time to create individual tests for every boundary value of every input value that enters our system. More often, we will create test cases that test a number of input fields simultaneously. Table 4-1: A set of test cases containing combinations of valid (on the boundary) values and invalid (off the boundary) points. Monthly Income Number of Dwellings Result Description $1,000 1 Valid Min income, min dwellings $83,333 1 Valid Max income, min dwellings $1,000 5 Valid Min income, max dwellings $83,333 5 Valid Max income, max dwellings $1,000 0 Invalid Min income, below min dwellings $1,000 6 Invalid Min income, above max dwellings $83,333 0 Invalid Max income, below min dwellings $83,333 6 Invalid Max income, above max dwellings $999 1 Invalid Below min income, min dwellings $83,334 1 Invalid Above max income, min dwellings $999 5 Invalid Below min income, max dwellings $83,334 5 Invalid Above max income, max dwellings Plotting "monthly income" on the x-axis and "number of dwellings" on the y-axis shows the "locations" of the test data points. Figure 4-3: Data points on the boundaries and data points just outside the boundaries. Note that four of the input combinations are on the boundaries while eight are just outside. Also note that the points outside always combine one valid value with one invalid value (just one unit lower or one unit higher). Examples Boundary value testing is applicable to the structure (length and character type) of input data as well as its value. Consider the following two examples: Example 1 Referring to the Trade Web page of the Brown & Donaldson Web site described in Appendix A, consider the Quantity field. Input to this field can be between one and four numeric characters (0,1, ..., 8,9). A set of boundary value test cases for the length attribute would be {0, 1, 4, 5} numeric characters. 
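To make the "one point on, one just below, one just above" rule concrete, here is a small Python sketch (an illustration, not code from the book or the case study) that generates those three points for the Goofy Mortgage Company boundaries described earlier. The helper name boundary_points is purely illustrative.

# A minimal sketch of the "on, just below, just above" rule for integer-valued
# boundaries. The boundary values come from the Goofy Mortgage Company example;
# the helper name is an assumption for illustration only.
def boundary_points(boundary, step=1):
    """Return the points just below, on, and just above an integer boundary."""
    return [boundary - step, boundary, boundary + step]

# GMC monthly income boundaries (US dollars)
for b in (1000, 83333):
    print("income tests:", boundary_points(b))    # e.g. [999, 1000, 1001]

# GMC number-of-dwellings boundaries
for b in (1, 5):
    print("dwelling tests:", boundary_points(b))  # e.g. [0, 1, 2]

The same idea applies to the length attribute above: the boundaries 1 and 4 characters yield the test lengths {0, 1, 4, 5}.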
Example 2 Again, on the Trade page, consider the Quantity field, but this time for value rather than structure (length and character type). Whether the transaction is Buy or Sell, the minimum legitimate value is 1 so use {0, 1, 2} for boundary testing. The upper limit on this field's value is more complicated. If the transaction is Sell, what is the maximum number of shares that can be sold? It is the number currently owned. For this boundary use {sharesOwned-1, sharesOwned, sharesOwned+1}. If the transaction is Buy, the maximum value (number of shares to be purchased) is defined as shares = (accountBalance - commission) / sharePrice assuming a fixed commission. Use {shares-1, shares, shares+1} as the boundary value test cases. Applicability and Limitations Boundary value testing can significantly reduce the number of test cases that must be created and executed. It is most suited to systems in which much of the input data takes on values within ranges or within sets. Boundary value testing is equally applicable at the unit, integration, system, and acceptance test levels. All it requires are inputs that can be partitioned and boundaries that can be identified based on the system's requirements. Summary While equivalence class testing is useful, its greatest contribution is to lead us to boundary value testing. Boundary value testing is a technique used to reduce the number of test cases to a manageable size while still maintaining reasonable coverage. Boundary value testing focuses on the boundaries because that is where so many defects hide. Experienced testers have encountered this situation many times. Inexperienced testers may have an intuitive feel that mistakes will occur most often at the boundaries. Create test cases for each boundary value by choosing one point on the boundary, one point just below the boundary, and one point just above the boundary. "Below" and "above" are relative terms and depend on the data value's units. Practice 1. The following exercises refer to the Stateless University Registration System Web site described in Appendix B. Define the boundaries, and suitable boundary value test cases for the following: 1. ZIP Code—five numeric digits. 2. First consider ZIP Code just in terms of digits. Then, determine the lowest and highest legitimate ZIP Codes in the United States. For extra credit [1], determine the format of postal codes for Canada and the lowest and highest valid values. 3. Last Name—one through fifteen characters (including alphabetic characters, periods, hyphens, apostrophes, spaces, and numbers). For extra credit [2] create a few very complex Last Names. Can you determine the "rules" for legitimate Last Names? For additional extra credit [3] use a phonebook from another country—try Finland or Thailand. 4. User ID—eight characters at least two of which are not alphabetic (numeric, special, nonprinting). 5. Course ID—three alpha characters representing the department followed by a six-digit integer which is the unique course identification number. The possible departments are: PHY - Physics EGR - Engineering ENG - English LAN - Foreign languages CHM - Chemistry MAT - Mathematics PED - Physical education SOC - Sociology [1]There actually is no extra credit, so do it for fun. [2]There actually is no extra credit, so do it for fun. [3]There actually is no extra credit, so do it for fun. References Beizer, Boris (1990). Software Testing Techniques. Van Nostrand Reinhold. Gilb, Tom and Dorothy Graham (1993). Software Inspection. Addison-Wesley. 
ISBN 0201-63181-4. Myers, Glenford J. (1979). The Art of Software Testing. John Wiley & Sons. Chapter 5: Decision Table Testing I'd stumbled onto solving my first murder case, having found myself the only eyewitness, yet no matter how frantically I pleaded with John Law that the perp was right in front of them and the very dame they'd been grilling - the sultry but devious Miss Kitwinkle, who played the grieving patsy the way a concert pianist player plays a piano - the cops just kept smiling and stuffing crackers in my beak. — Chris Esco Introduction Decision tables are an excellent tool to capture certain kinds of system requirements and to document internal system design. They are used to record complex business rules that a system must implement. In addition, they can serve as a guide to creating test cases. Decision tables are a vital tool in the tester's personal toolbox. Unfortunately, many analysts, designers, programmers, and testers are not familiar with this technique. Technique Decision tables represent complex business rules based on a set of conditions. The general form is: Table 5-1: The general form of a decision table. Rule 1 Rule 2 … Rule p Conditions Condition-1 Condition-2 … Condition-m Actions Action-1 Action-2 … Action-n Conditions 1 through m represent various input conditions. Actions 1 through n are the actions that should be taken depending on the various combinations of input conditions. Each of the rules defines a unique combination of conditions that result in the execution ("firing") of the actions associated with that rule. Note that the actions do not depend on the order in which the conditions are evaluated, but only on their values. (All values are assumed to be available simultaneously.) Also, actions depend only on the specified conditions, not on any previous input conditions or system state. Perhaps a concrete example will clarify the concepts. An auto insurance company gives discounts to drivers who are married and/or good students. Let's begin with the conditions. The following decision table has two conditions, each one of which takes on the values Yes or No. Table 5-2: A decision table with two binary conditions. Rule 1 Rule 2 Rule 3 Rule 4 Married? Yes Yes No No Good Student? Yes No Yes No Conditions Note that the table contains all combinations of the conditions. Given two binary conditions (Yes or No), the possible combinations are {Yes, Yes}, {Yes, No}, {No, Yes}, and {No, No}. Each rule represents one of these combinations. As a tester we will verify that all combinations of the conditions are defined. Missing a combination may result in developing a system that may not process a particular set of inputs properly. Now for the actions. Each rule causes an action to "fire." Each rule may specify an action unique to that rule, or rules may share actions. Table 5-3: Adding a single action to a decision table. Rule 1 Rule 2 Rule 3 Rule 4 Married? Yes Yes No No Good Student? Yes No Yes No 60 25 50 0 Conditions Actions Discount ($) Decision tables may specify more than one action for each rule. Again, these rules may be unique or may be shared. Table 5-4: A decision table with multiple actions. Rule 1 Rule 2 Rule 3 Rule 4 Condition-1 Yes Yes No No Condition-2 Yes No Yes No Action-1 Do X Do Y Do X Do Z Action-2 Do A Do B Do B Do B Conditions Actions In this situation, choosing test cases is simple—each rule (vertical column) becomes a test case. The Conditions specify the inputs and the Actions specify the expected results. 
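As a quick illustration of the "each rule becomes a test case" idea, the following Python sketch (an illustration of the approach, not code from the book) walks the four rules of the insurance-discount table and checks a stand-in implementation against the expected discounts. The function and variable names are assumptions.

# Each rule (column) of the insurance-discount decision table becomes one test
# case: the conditions are the inputs, the action is the expected result.
rules = [
    # (married, good_student, expected_discount)
    ("Yes", "Yes", 60),
    ("Yes", "No",  25),
    ("No",  "Yes", 50),
    ("No",  "No",   0),
]

def compute_discount(married, good_student):
    """Stand-in for the system under test (assumed here for illustration)."""
    return {("Yes", "Yes"): 60, ("Yes", "No"): 25,
            ("No",  "Yes"): 50, ("No",  "No"):  0}[(married, good_student)]

for i, (married, good_student, expected) in enumerate(rules, start=1):
    actual = compute_discount(married, good_student)
    status = "PASS" if actual == expected else "FAIL"
    print(f"Rule {i}: married={married}, good student={good_student} "
          f"-> expect ${expected}: {status}")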
While the previous example uses simple binary conditions, conditions can be more complex. Table 5-5: A decision table with non-binary conditions. Rule 1 Rule 2 Rule 3 Rule 4 Conditions Condition-1 0–1 1–10 10–100 100–1000 Condition-2 <5 5 6 or 7 >7 Action-1 Do X Do Y Do X Do Z Action-2 Do A Do B Do B Do B Actions In this situation choosing test cases is slightly more complex— each rule (vertical column) becomes a test case but values satisfying the conditions must be chosen. Choosing appropriate values we create the following test cases: Table 5-6: Sample test cases. Test Case ID Condition-1 Condition-2 Expected Result TC1 0 3 Do X / Do A TC2 5 5 Do Y / Do B TC3 50 7 Do X / Do B TC4 500 10 Do Z / Do B If the system under test has complex business rules, and if your business analysts or designers have not documented these rules in this form, testers should gather this information and represent it in decision table form. The reason is simple. Given the system behavior represented in this complete and compact form, test cases can be created directly from the decision table. In testing, create at least one test case for each rule. If the rule's conditions are binary, a single test for each combination is probably sufficient. On the other hand, if a condition is a range of values, consider testing at both the low and high end of the range. In this way we merge the ideas of Boundary Value testing with Decision Table testing. Key Point Create at least one test case for each rule. To create a test case table simply change the row and column headings: Table 5-7: A decision table converted to a test case table. Test Case 1 Test Case 2 Test Case 3 Test Case 4 Yes Yes No No Inputs Condition-1 Condition-2 Yes No Yes No Action-1 Do X Do Y Do X Do Z Action-2 Do A Do B Do B Do B Expected Results Examples Decision Table testing can be used whenever the system must implement complex business rules. Consider the following two examples: Example 1 Referring to the Trade Web page of the Brown & Donaldson Web site described in Appendix A, consider the rules associated with a Buy order. Table 5-8: A decision table for the Brown & Donaldson Buy order. Rule 1 Rule 2 Rule 3 Rule 4 Rule 5 Rule 6 Rule 7 Rule 8 Conditions Valid Symbol No No No No Yes Yes Yes Yes Valid Quantity No No Yes Yes No No Yes Yes Sufficient Funds No Yes No Yes No Yes No Yes No No No No No No No Yes Actions Buy? Admittedly, the outcome is readily apparent. Only when a valid symbol, valid quantity, and sufficient funds are available should the Buy order be placed. This example was chosen to illustrate another concept. Examine the first four columns. If the Symbol is not valid, none of the other conditions matter. Often tables like this are collapsed, rules are combined, and the conditions that do not affect the outcome are marked "DC" for "Don't Care." Rule 1 now indicates that if the Symbol is not valid, ignore the other conditions and do not execute the Buy order. Table 5-9: A collapsed decision table reflecting "Don't Care" conditions. Rule 1 Rule 2 Rule 3 Rule 4 Rule 5 Valid Symbol No Yes Yes Yes Yes Valid Quantity DC No No Yes Yes Sufficient Funds DC No Yes No Yes No No No No Yes Conditions Actions Buy? Note also that Rule 2 and Rule 3 can be combined because whether Sufficient Funds are available does not affect the action. Table 5-10: A further collapsed decision table reflecting "Don't Care" conditions. 
                  Rule 1  Rule 2  Rule 3  Rule 4
Conditions
Valid Symbol        No      Yes     Yes     Yes
Valid Quantity      DC      No      Yes     Yes
Sufficient Funds    DC      DC      No      Yes
Actions
Buy?                No      No      No      Yes

While this is an excellent idea from a development standpoint because less code is written, it is dangerous from a testing standpoint. It is always possible that the table was collapsed incorrectly or the code was written improperly. The un-collapsed table should always be used as the basis for our test case design.

Example 2

The following screen is from the Stateless University Registration System. It is used to enter new students into the system, to modify student information, and to delete students from the system.

Figure 5-1: SURS Student Database Maintenance Screen.

To enter a new student, enter name, address, and telephone information on the upper part of the screen and press Enter. The student is entered into the database and the system returns a new StudentID. To modify or delete a student, enter the StudentID, select the Delete or Modify radio button, and press Enter. The decision table reflecting these rules follows:

Table 5-11: A decision table for the Stateless University Registration System (shown one rule per line).

Rule  Entered student data  Entered StudentID  Selected Modify  Selected Delete  Action
 1            No                   No                No               No         None
 2            No                   No                No               Yes        None
 3            No                   No                Yes              No         None
 4            No                   No                Yes              Yes        None
 5            No                   Yes               No               No         None
 6            No                   Yes               No               Yes        Delete student
 7            No                   Yes               Yes              No         Modify student
 8            No                   Yes               Yes              Yes        None
 9            Yes                  No                No               No         Create new student
10            Yes                  No                No               Yes        Delete student
11            Yes                  No                Yes              No         Modify student
12            Yes                  No                Yes              Yes        None
13            Yes                  Yes               No               No         None
14            Yes                  Yes               No               Yes        None
15            Yes                  Yes               Yes              No         None
16            Yes                  Yes               Yes              Yes        None

Rules 1 through 8 indicate that no data was entered about the student. Rules 1 through 4 indicate that no StudentID was entered for the student, thus no action is possible. Rules 5 through 8 indicate the StudentID was entered. In these cases creating a new Student is not proper. Rule 5 does not request either modification or deletion so neither is done. Rules 6 and 7 request one function and so they are performed. Note that Rule 8 indicates that both modification and deletion are to be performed so no action is taken. Rules 9 through 16 indicate that data was entered about the student. Rules 9 through 12 indicate that no StudentID was entered so these rules refer to a new student. Rule 9 creates a new student. Rule 10 deletes the student. Rule 11 allows modification of the student's data. Rule 12 requests that both modification and deletion are to be performed, so no action is taken. Rules 13 through 16 supply student data indicating a new student but also provide a StudentID indicating an existing student. Because of this contradictory input, no action is taken. Often, error messages are displayed in these situations.

Applicability and Limitations

Decision Table testing can be used whenever the system must implement complex business rules, when these rules can be represented as a combination of conditions, and when these conditions have discrete actions associated with them.

Summary

Decision tables are used to document complex business rules that a system must implement. In addition, they serve as a guide to creating test cases. Conditions represent various input conditions. Actions are the processes that should be executed depending on the various combinations of input conditions.
Each rule defines a unique combination of conditions that result in the execution ("firing") of the actions associated with that rule. Create at least one test case for each rule. If the rule's conditions are binary, a single test for each combination is probably sufficient. On the other hand, if a condition is a range of values, consider testing at both the low and high end of the range. Practice 1. Attending Stateless University is an expensive proposition. After all, they receive no state funding. Like many other students, those planning on attending apply for student aid using FAFSA, the Free Application for Federal Student Aid. The following instructions were taken from that form. Examine them and create a decision table that represents the FAFSA rules. (Note: You can't make up stuff like this.) Step Four: Who is considered a parent in this step? Read these notes to determine who is considered a parent for purposes of this form. Answer all questions in Step Four about them, even if you do not live with them. Are you an orphan, or are you or were you (until age 18) a ward/dependent of the court? If Yes, skip Step Four. If your parents are both living and married to each other, answer the questions about them. If your parent is widowed or single, answer the questions about that parent. If your widowed parent is remarried as of today, answer the questions about that parent and the person whom your parent married (your stepparent). If your parents are divorced or separated, answer the questions about the parent you lived with more during the past 12 months. (If you did not live with one parent more than the other, give answers about the parent who provided more financial support during the last 12 months, or during the most recent year that you actually received support from a parent.) If this parent is remarried as of today, answer the questions on the rest of this form about that parent and the person whom your parent married (your stepparent). References Beizer, Boris (1990). Software Testing Techniques (Second Edition). Van Nostrand Reinhold. Binder, Robert V. (2000). Testing Object-Oriented Systems: Models, Patterns, and Tools. Addison-Wesley. Chapter 6: Pairwise Testing Anton was attracted to Angela like a moth to a flame - not just any moth, but one of the giant silk moths of the genus Hyalophora, perhaps Hyalophora euryalus, whose great red-brown wings with white basal and postmedian lines flap almost languorously until one ignites in the flame, fanning the conflagration to ever greater heights until burning down to the hirsute thorax and abdomen, the fat-laden contents of which provide a satisfying sizzle to end the agony. — Andrew Amlen Introduction As they used to say on Monty Python, "And now for something completely different." Consider these situations: A Web site must operate correctly with different browsers—Internet Explorer 5.0, 5.5, and 6.0, Netscape 6.0, 6.1, and 7.0, Mozilla 1.1, and Opera 7; using different plug-ins —RealPlayer, MediaPlayer, or none; running on different client operating systems— Windows 95, 98, ME, NT, 2000, and XP; receiving pages from different servers—IIS, Apache, and WebLogic; running on different server operating systems—Windows NT, 2000, and Linux. Web Combinations 8 browsers 3 plug-ins 6 client operating systems 3 servers 3 server OS 1,296 combinations. A bank has created a new data processing system that is ready for testing. 
This bank has different kinds of customers—consumers, very important consumers, businesses, and non-profits; different kinds of accounts—checking, savings, mortgages, consumer loans, and commercial loans; and they operate in different states, each with different regulations—California, Nevada, Utah, Idaho, Arizona, and New Mexico.

Bank Combinations: 4 customer types x 5 account types x 6 states = 120 combinations.

In an object-oriented system, an object of class A can pass a message containing a parameter P to an object of class X. Classes B, C, and D inherit from A so they too can send the message. Classes Q, R, S, and T inherit from P so they too can be passed as the parameter. Classes Y and Z inherit from X so they too can receive the message.

OO Combinations: 4 senders x 5 parameters x 3 receivers = 60 combinations.

Insight: Students in my classes often have a very difficult time thinking of bad ways to do things. Cultivate the skill of choosing poorly. It will be invaluable in evaluating others' ideas.

Can You Believe This? A student in one of my classes shared this story: His organization uses a process they call "Post-Installation Test Planning." It sounds impressive until you decipher it. Whatever tests they happen to run that happen to pass are documented as their Test Plan.

What do these very different situations all have in common? Each has a large number of combinations that should be tested. Each has a large number of combinations that may be risky if we do not test. Each has such a large number of combinations that we may not have the resources to construct and run all the tests; there are just too many. We must, somehow, select a reasonably sized subset that we could test given our resource constraints. What are some ways of choosing such a subset? This list starts with the worst schemes but does improve:

- Don't test at all. Simply give up because the number of input combinations, and thus the number of test cases, is just too great.
- Test all combinations [once], but delay the project so it misses its market window, so that everyone quits from stress or the company goes out of business.
- Choose one or two tests and hope for the best.
- Choose the tests that you have already run, perhaps as part of programmer-led testing. Incorporate them into a formal test plan and run them again.
- Choose the tests that are easy to create and run, ignoring whether they provide useful information about the quality of the product.
- Make a list of all the combinations and choose the first few.
- Make a list of all the combinations and choose a random subset.
- By magic, choose a specially selected, fairly small subset that finds a great many defects—more than you would expect from such a subset.

This last scheme sounds like a winner (but it is a little vague). The question is—what is the "magic" that allows us to choose that "specially selected" subset?

Insight: Random selection can be a very good approach to choosing a subset, but most people have a difficult time choosing truly randomly.

The answer is not to attempt to test all the combinations for all the values for all the variables but to test all pairs of variables. This significantly reduces the number of tests that must be created and run. Consider the significant reductions in test effort in these examples: If a system had four different input parameters and each one could take on one of three different values, the number of combinations is 3^4, which is 81. It is possible to cover all the pairwise input combinations in only nine tests.
If a system had thirteen different input parameters and each one could take on one of three different values, the number of combinations is 3^13, which is 1,594,323. It is possible to cover all the pairwise input combinations in only fifteen tests. If a system had twenty different input parameters and each one could take on one of ten different values, the number of combinations is 10^20. It is possible to cover all the pairwise input combinations in only 180 tests.

There is much anecdotal evidence about the benefit of pairwise testing. Unfortunately, there are only a few documented studies: In a case study published by Brownlie of AT&T regarding the testing of a local-area network-based electronic mail system, pairwise testing detected 28 percent more defects than their original plan of developing and executing 1,500 test cases (later reduced to 1,000 because of time constraints) and took 50 percent less effort. A study by the National Institute of Standards and Technology published by Wallace and Kuhn on software defects in recalled medical devices reviewed fifteen years of defect data. They concluded that 98 percent of the reported software flaws could have been detected by testing all pairs of parameter settings. Kuhn and Reilly analyzed defects recorded in the Mozilla Web browser database. They determined that pairwise testing would have detected 76 percent of the reported errors.

Why does pairwise testing work so well? I don't know. There is no underlying "software physics" that requires it. One hypothesis is that most defects are either single-mode defects (the function under test simply does not work and any test of that function would find the defect) or they are double-mode defects (it is the pairing of this function/module with that function/module that fails even though all other pairings perform properly). Pairwise testing defines a minimal subset that guides us to test for all single-mode and double-mode defects. The success of this technique on many projects, both documented and undocumented, is a great motivation for its use.

Note: Pairwise testing may not choose combinations which the developers and testers know are either frequently used or highly risky. If these combinations exist, use the pairwise tests, then add additional test cases to minimize the risk of missing an important combination.

Technique

Two different techniques are used to identify all the pairs for creating test cases—orthogonal arrays and the Allpairs algorithm.

Orthogonal Arrays

What are orthogonal arrays? The origin of orthogonal arrays can be traced back to Euler, the great mathematician, in the guise of Latin Squares. Genichi Taguchi has popularized their use in hardware testing. An excellent reference book is Quality Engineering Using Robust Design by Madhav S. Phadke. Consider the numbers 1 and 2. How many pair combinations (combinations taken two at a time) of '1' and '2' exist? {1,1}, {1,2}, {2,1}, and {2,2}. An orthogonal array is a two-dimensional array of numbers that has this interesting property—choose any two columns in the array; all the pairwise combinations of its values will occur in every pair of columns. Let's examine an L4(2^3) array:

Table 6-1: L4(2^3) Orthogonal Array

      1  2  3
  1   1  1  1
  2   1  2  2
  3   2  1  2
  4   2  2  1

The gray column headings and row numbers are not part of the orthogonal array but are included for convenience in referencing the cells. Examine columns 1 and 2—do the four combinations of 1 and 2 all appear in that column pair? Yes, and in the order listed earlier.
Now examine columns 1 and 3—do the four combinations of 1 and 2 appear in that column pair? Yes, although in a different order. Finally, examine columns 2 and 3—do the four combinations appear in that column pair also? Yes they do. The L4(2^3) array is orthogonal; that is, choose any two columns and all the pairwise combinations will occur in that column pair.

Important Note: As a tester you do not have to create orthogonal arrays; all you must do is locate one of the proper size. Books, Web sites, and automated tools will help you do this.

A note about the curious (but standard) notation: L4 means an orthogonal array with four rows. The (2^3) is not an exponent—it does not mean 2 x 2 x 2. It means that the array has three columns, each with either a 1 or a 2.

Figure 6-1: Orthogonal array notation

Let's consider a larger orthogonal array. Given the numbers 1, 2, and 3, how many pair combinations of 1, 2, and 3 exist? {1,1}, {1,2}, {1,3}, {2,1}, {2,2}, {2,3}, {3,1}, {3,2}, and {3,3}. Below is an L9(3^4) array:

Table 6-2: L9(3^4) Orthogonal Array

      1  2  3  4
  1   1  1  1  1
  2   1  2  2  2
  3   1  3  3  3
  4   2  1  2  3
  5   2  2  3  1
  6   2  3  1  2
  7   3  1  3  2
  8   3  2  1  3
  9   3  3  2  1

Examine columns 1 and 2—do the nine combinations of 1, 2, and 3 all appear in that column pair? Yes. Now examine columns 1 and 3—do the nine combinations of 1, 2, and 3 appear in that column pair? Yes, although in a different order. Examine columns 1 and 4—do the nine combinations appear in that column pair also? Yes they do. Continue on by examining the other pairs of columns—2 and 3, 2 and 4, and finally 3 and 4. The L9(3^4) array is orthogonal; that is, choose any two columns and all the combinations will occur in that column pair.

Tool: The rdExpert tool from Phadke Associates implements the orthogonal array approach. See http://www.phadkeassociates.com

Note that not all combinations of 1s, 2s, and 3s appear in the array. For example, {1,1,2}, {1,2,1}, and {2,2,2} do not appear. Orthogonal arrays only guarantee that all the pair combinations exist in the array. Combinations such as {2,2,2} are triples, not pairs.

The following is an L18(3^5) orthogonal array. It has five columns, each containing a 1, 2, or 3. Examine columns 1 and 2 for the pair {1,1}. Does that pair exist in those two columns? Wait! Don't look at the array. From the definition of an orthogonal array, what is the answer? Yes, that pair exists, along with every other pair of 1, 2, and 3. The pair {1,1} is in row 1. Note that {1,1} also appears in row 6. Returning to the original description of orthogonal arrays—"an orthogonal array is a two-dimensional array of numbers that has this interesting property: choose any two columns in the array; all the pairwise combinations of its values will occur in every column pair"—this definition is not totally complete. Not only will all the pair combinations occur in the array, but if any pair occurs multiple times, all pairs will occur that same number of times. This is because orthogonal arrays are "balanced." Examine columns 3 and 5—look for {3,2}. That combination appears in rows 6 and 17.

Table 6-3: L18(3^5) Orthogonal Array

       1  2  3  4  5
   1   1  1  1  1  1
   2   1  2  3  3  1
   3   1  3  2  3  2
   4   1  2  2  1  3
   5   1  3  1  2  3
   6   1  1  3  2  2
   7   2  2  2  2  2
   8   2  3  1  1  2
   9   2  1  3  1  3
  10   2  3  3  2  1
  11   2  1  2  3  1
  12   2  2  1  3  3
  13   3  3  3  3  3
  14   3  1  2  2  3
  15   3  2  1  2  1
  16   3  1  1  3  2
  17   3  2  3  1  2
  18   3  3  2  1  1

In orthogonal arrays not all of the columns must have the same range of values (1..2, 1..3, 1..5, etc.). Some orthogonal arrays are mixed. The following is an L18(2^1 3^7) orthogonal array.
It has one column of 1s and 2s, and seven columns of 1s, 2s, and 3s. Table 6-4: L18(2137) Orthogonal Array 1 2 3 4 5 6 7 8 1 1 1 1 1 1 1 1 1 2 1 1 2 2 2 2 2 2 3 1 1 3 3 3 3 3 3 4 1 2 1 1 2 2 3 3 5 1 2 2 2 3 3 1 1 6 1 2 3 3 1 1 2 2 7 1 3 1 2 1 3 2 3 8 1 3 2 3 2 1 3 1 9 1 3 3 1 3 2 1 2 10 2 1 1 3 3 2 2 1 11 2 1 2 1 1 3 3 2 12 2 1 3 2 2 1 1 3 13 2 2 1 2 3 1 3 2 14 2 2 2 3 1 2 1 3 15 2 2 3 1 2 3 2 1 16 2 3 1 3 2 3 1 2 17 2 3 2 1 3 1 2 3 18 2 3 3 2 1 2 3 1 Reference Neil J.A. Sloane maintains a very comprehensive catalog of orthogonal arrays at http://www.research.att.com/~njas/oadir/index.html Using Orthogonal Arrays The process of using orthogonal arrays to select pairwise subsets for testing is: 1. Identify the variables. 2. Determine the number of choices for each variable. 3. Locate an orthogonal array which has a column for each variable and values within the columns that correspond to the choices for each variable. 4. Map the test problem onto the orthogonal array. 5. Construct the test cases. If this seems rather vague at this point it's time for an example. Web-based systems such as Brown & Donaldson and the Stateless University Registration System must operate in a number of environments. Let's execute the process step-by-step using an orthogonal array to choose test cases. Consider the first example in the introduction describing the software combinations a Web site must operate with. 1. Identify the variables. The variables are Browser, Plug-in, Client operating system, Server, and Server operating system. 2. Determine the number of choices for each variable. Browser - Internet Explorer 5.0, 5.5, and 6.0, Netscape 6.0, 6.1, and 7.0, Mozilla 1.1, and Opera 7 (8 choices). Plug-in - None, RealPlayer, and MediaPlayer (3 choices). Client operating system - Windows 95, 98, ME, NT, 2000, and XP (6 choices). Server - IIS, Apache, and WebLogic (3 choices). Server operating system - Windows NT, 2000, and Linux (3 choices). Multiplying 8 x 3 x 6 x 3 x 3 we find there are 1,296 combinations. For "complete" test coverage, each of these combinations should be tested. 3. Locate an orthogonal array that has a column for each variable and values within the columns that correspond to the choices of each variable. What size array is needed? First, it must have five columns, one for each variable in this example. The first column must support eight different levels (1 through 8). The second column must support three levels (1 through 3). The third requires six levels. The fourth and the fifth each require three levels. The perfect size orthogonal array would be 816133 (one column of 1 through 8, one column of 1 through 6, and three columns of 1 through 3). Unfortunately, one of this exact size does not exist. When this occurs, we simply pick the next larger array. Important As a tester you do not have to create orthogonal arrays. All you Note must do is locate one of the proper size and then perform the mapping of the test problem onto the array. The following orthogonal array meets our requirements. It's an L64(8243) array. Orthogonal arrays can be found in a number of books and on the Web. A favorite book is Quality Engineering Using Robust Design by Madhav S. Phadke. In addition, an excellent catalog is maintained on the Web by Neil J.A. Sloane of AT&T. See http://www.research.att.com/~njas/oadir/index.html. The requirement of 8161 (one column of 1 through 8 and 1 column of 1 through 6) is met by 82 (two columns of 1 through 8). 
The requirement of 33 (three columns of 1 through 3) is met by 43 (three columns of 1 through 4). The number of combinations of all the values of all the variables is 1,296 and thus 1,296 test cases should be created and run for complete coverage. Using this orthogonal array, all pairs of all the values of all the variables can be covered in only sixty-four tests, a 95 percent reduction in the number of test cases. Table 6-5: L64(8243) Orthogonal Array 1 2 3 4 5 1 1 1 1 1 1 2 1 4 3 4 4 3 1 4 2 4 4 4 1 1 4 1 1 5 1 3 5 3 3 6 1 2 7 2 2 7 1 2 6 2 2 8 1 3 8 3 3 9 3 4 1 3 3 10 3 1 3 2 2 11 3 1 2 2 2 12 3 4 4 3 3 13 3 2 5 1 1 14 3 3 7 4 4 15 3 3 6 4 4 16 3 2 8 1 1 17 2 3 1 2 1 18 2 2 3 3 4 19 2 2 2 3 4 20 2 3 4 2 1 21 2 1 5 4 3 22 2 4 7 1 2 23 2 4 6 1 2 24 2 1 8 4 3 25 4 2 1 4 3 26 4 3 3 1 2 27 4 3 2 1 2 28 4 2 4 4 3 29 4 4 5 2 1 30 4 1 7 3 4 31 4 1 6 3 4 32 4 4 8 2 1 33 5 2 1 4 2 34 5 3 3 1 3 35 5 3 2 1 3 36 5 2 4 4 2 37 5 4 5 2 4 38 5 1 7 3 1 39 5 1 6 3 1 40 5 4 8 2 4 41 7 3 1 2 4 42 7 2 3 3 1 43 7 2 2 3 1 44 7 3 4 2 4 45 7 1 5 4 2 46 7 4 7 1 3 47 7 4 6 1 3 48 7 1 8 4 2 49 6 4 1 3 2 50 6 1 3 2 3 51 6 1 2 2 3 52 6 4 4 3 2 53 6 2 5 1 4 54 6 3 7 4 1 55 6 3 6 4 1 56 6 2 8 1 4 57 8 1 1 1 4 58 8 4 3 4 1 59 8 4 2 4 1 60 8 1 4 1 4 61 8 3 5 3 2 62 8 2 7 2 3 63 8 2 6 2 3 64 8 3 8 3 2 4. Map the test problem onto the orthogonal array. The Browser choices will be mapped onto column 1 of the orthogonal array. Cells containing a 1 will represent IE 5.0; cells with a 2 will represent IE5.5; cells with a 3 will represent IE 6.0; etc. The mapping is: 1 ↔ IE 5.0 2 ↔ IE 5.5 3 ↔ IE 6.0 4 ↔ Netscape 6.0 5 ↔ Netscape 6.1 6 ↔ Netscape 7.0 7 ↔ Mozilla 1.1 8 ↔ Opera 7 Partially filling in the first column gives: Table 6-6: L64 (8243) with a partial mapping of its first column. Browser 2 3 4 5 1 IE 5.0 1 1 1 1 2 1 4 3 4 4 3 1 4 2 4 4 4 1 1 4 1 1 5 1 3 5 3 3 6 1 2 7 2 2 7 1 2 6 2 2 8 1 3 8 3 3 9 IE 6.0 4 1 3 3 10 3 1 3 2 2 11 3 1 2 2 2 12 3 4 4 3 3 13 3 2 5 1 1 14 3 3 7 4 4 15 3 3 6 4 4 16 3 2 8 1 1 17 IE 5.5 3 1 2 1 18 2 2 3 3 4 19 2 2 2 3 4 20 2 3 4 2 1 21 2 1 5 4 3 22 2 4 7 1 2 23 2 4 6 1 2 24 2 1 8 4 3 25 Net 6.0 2 1 4 3 26 4 3 3 1 2 27 4 3 2 1 2 28 4 2 4 4 3 29 4 4 5 2 1 30 4 1 7 3 4 31 4 1 6 3 4 32 4 4 8 2 1 33 Net 6.1 2 1 4 2 34 5 3 3 1 3 35 5 3 2 1 3 36 5 2 4 4 2 37 5 4 5 2 4 38 5 1 7 3 1 39 5 1 6 3 1 40 5 4 8 2 4 41 Moz 1.1 3 1 2 4 42 7 2 3 3 1 43 7 2 2 3 1 44 7 3 4 2 4 45 7 1 5 4 2 46 7 4 7 1 3 47 7 4 6 1 3 48 7 1 8 4 2 49 Net 7.0 4 1 3 2 50 6 1 3 2 3 51 6 1 2 2 3 52 6 4 4 3 2 53 6 2 5 1 4 54 6 3 7 4 1 55 6 3 6 4 1 56 6 2 8 1 4 57 Opera 7 1 1 1 4 58 8 4 3 4 1 59 8 4 2 4 1 60 8 1 4 1 4 61 8 3 5 3 2 62 8 2 7 2 3 63 8 2 6 2 3 64 8 3 8 3 2 Is it clear what is happening? In column 1 (which we have chosen to represent the Browser) every cell containing a 1 is being replaced with "IE 5.0." Every cell containing a 2 is being replaced with "IE 5.5." Every cell containing an 8 is being replaced with "Opera 7," etc. We'll continue by completing the mapping (replacement) of all the cells in column 1. Note that the mapping between the variable values and the 1s, 2s, and 3s is totally arbitrary. There is no logical connection between "1" and IE 5.0 or "7" and Mozilla 1.1. But, although the initial assignment is arbitrary, once chosen, the assignments and use must remain consistent within each column. Table 6-7: L64 (8243) with a full mapping of its first column. 
Browser 2 3 4 5 1 IE 5.0 1 1 1 1 2 IE 5.0 4 3 4 4 3 IE 5.0 4 2 4 4 4 IE 5.0 1 4 1 1 5 IE 5.0 3 5 3 3 6 IE 5.0 2 7 2 2 7 IE 5.0 2 6 2 2 8 IE 5.0 3 8 3 3 9 IE 6.0 4 1 3 3 10 IE 6.0 1 3 2 2 11 IE 6.0 1 2 2 2 12 IE 6.0 4 4 3 3 13 IE 6.0 2 5 1 1 14 IE 6.0 3 7 4 4 15 IE 6.0 3 6 4 4 16 IE 6.0 2 8 1 1 17 IE 5.5 3 1 2 1 18 IE 5.5 2 3 3 4 19 IE 5.5 2 2 3 4 20 IE 5.5 3 4 2 1 21 IE 5.5 1 5 4 3 22 IE 5.5 4 7 1 2 23 IE 5.5 4 6 1 2 24 IE 5.5 1 8 4 3 25 Net 6.0 2 1 4 3 26 Net 6.0 3 3 1 2 27 Net 6.0 3 2 1 2 28 Net 6.0 2 4 4 3 29 Net 6.0 4 5 2 1 30 Net 6.0 1 7 3 4 31 Net 6.0 1 6 3 4 32 Net 6.0 4 8 2 1 33 Net 6.1 2 1 4 2 34 Net 6.1 3 3 1 3 35 Net 6.1 3 2 1 3 36 Net 6.1 2 4 4 2 37 Net 6.1 4 5 2 4 38 Net 6.1 1 7 3 1 39 Net 6.1 1 6 3 1 40 Net 6.1 4 8 2 4 41 Moz 1.1 3 1 2 4 42 Moz 1.1 2 3 3 1 43 Moz 1.1 2 2 3 1 44 Moz 1.1 3 4 2 4 45 Moz 1.1 1 5 4 2 46 Moz 1.1 4 7 1 3 47 Moz 1.1 4 6 1 3 48 Moz 1.1 1 8 4 2 49 Net 7.0 4 1 3 2 50 Net 7.0 1 3 2 3 51 Net 7.0 1 2 2 3 52 Net 7.0 4 4 3 2 53 Net 7.0 2 5 1 4 54 Net 7.0 3 7 4 1 55 Net 7.0 3 6 4 1 56 Net 7.0 2 8 1 4 57 Opera 7 1 1 1 4 58 Opera 7 4 3 4 1 59 Opera 7 4 2 4 1 60 Opera 7 1 4 1 4 61 Opera 7 3 5 3 2 62 Opera 7 2 7 2 3 63 Opera 7 2 6 2 3 64 Opera 7 3 8 3 2 Now that the first column has been mapped, let's proceed to the next one. The Plug-in choices will be mapped onto column 2 of the array. Cells containing a 1 will represent None (No plug-in); cells with a 2 will represent RealPlayer; cells with a 3 will represent MediaPlayer; cells with a 4 will not be mapped at the present time. The mapping is: 1 ↔ None 2 ↔ RealPlayer 3 ↔ MediaPlayer 4 ↔ Not used (at this time) Filling in the second column gives: Table 6-8: L64 (8243) with a full mapping of its first and second columns. Browser Plug-In 3 4 5 1 IE 5.0 None 1 1 1 2 IE 5.0 4 3 4 4 3 IE 5.0 4 2 4 4 4 IE 5.0 None 4 1 1 5 IE 5.0 MediaPlayer 5 3 3 6 IE 5.0 RealPlayer 7 2 2 7 IE 5.0 RealPlayer 6 2 2 8 IE 5.0 MediaPlayer 8 3 3 9 IE 6.0 4 1 3 3 10 IE 6.0 None 3 2 2 11 IE 6.0 None 2 2 2 12 IE 6.0 4 4 3 3 13 IE 6.0 RealPlayer 5 1 1 14 IE 6.0 MediaPlayer 7 4 4 15 IE 6.0 MediaPlayer 6 4 4 16 IE 6.0 RealPlayer 8 1 1 17 IE 5.5 MediaPlayer 1 2 1 18 IE 5.5 RealPlayer 3 3 4 19 IE 5.5 RealPlayer 2 3 4 20 IE 5.5 MediaPlayer 4 2 1 21 IE 5.5 None 5 4 3 22 IE 5.5 4 7 1 2 23 IE 5.5 4 6 1 2 24 IE 5.5 None 8 4 3 25 Net 6.0 RealPlayer 1 4 3 26 Net 6.0 MediaPlayer 3 1 2 27 Net 6.0 MediaPlayer 2 1 2 28 Net 6.0 RealPlayer 4 4 3 29 Net 6.0 4 5 2 1 30 Net 6.0 None 7 3 4 31 Net 6.0 None 6 3 4 32 Net 6.0 4 8 2 1 33 Net 6.1 RealPlayer 1 4 2 34 Net 6.1 MediaPlayer 3 1 3 35 Net 6.1 MediaPlayer 2 1 3 36 Net 6.1 RealPlayer 4 4 2 37 Net 6.1 4 5 2 4 38 Net 6.1 None 7 3 1 39 Net 6.1 None 6 3 1 40 Net 6.1 4 8 2 4 41 Moz 1.1 MediaPlayer 1 2 4 42 Moz 1.1 RealPlayer 3 3 1 43 Moz 1.1 RealPlayer 2 3 1 44 Moz 1.1 MediaPlayer 4 2 4 45 Moz 1.1 None 5 4 2 46 Moz 1.1 4 7 1 3 47 Moz 1.1 4 6 1 3 48 Moz 1.1 None 8 4 2 49 Net 7.0 4 1 3 2 50 Net 7.0 None 3 2 3 51 Net 7.0 None 2 2 3 52 Net 7.0 4 4 3 2 53 Net 7.0 RealPlayer 5 1 4 54 Net 7.0 MediaPlayer 7 4 1 55 Net 7.0 MediaPlayer 6 4 1 56 Net 7.0 RealPlayer 8 1 4 57 Opera 7 None 1 1 4 58 Opera 7 4 3 4 1 59 Opera 7 4 2 4 1 60 Opera 7 None 4 1 4 61 Opera 7 MediaPlayer 5 3 2 62 Opera 7 RealPlayer 7 2 3 63 Opera 7 RealPlayer 6 2 3 64 Opera 7 MediaPlayer 8 3 2 Now that the first and second columns have been mapped, let's proceed to map the next three columns simultaneously. 
The mapping for Client operating system is: 1 ↔Windows 95 2 ↔ Windows 98 3 ↔ Windows ME 4 ↔ Windows NT 5 ↔ Windows 2000 6 ↔ Windows XP 7 ↔ Not used (at this time) 8 ↔ Not used (at this time) The mapping for Servers is: 1 ↔ IIS 2 ↔ Apache 3 ↔ WebLogic 4 ↔ Not used (at this time) The mapping for Server operating system is: 1 ↔ Windows NT 2 ↔ Windows 2000 3 ↔ Linux 4 ↔ Not used (at this time) Filling in the remainder of the columns gives: Table 6-9: L64(8243) with a full mapping of all its columns. Browser Plug-in Client OS Server Server OS 1 IE 5.0 None Win 95 IIS Win NT 2 IE 5.0 4 Win ME 4 4 3 IE 5.0 4 Win 98 4 4 4 IE 5.0 None Win NT IIS Win NT 5 IE 5.0 MediaPlayer Win 2000 WebLogic Linux 6 IE 5.0 RealPlayer 7 Apache Win 2000 7 IE 5.0 RealPlayer Win XP Apache Win 2000 8 IE 5.0 MediaPlayer 8 WebLogic Linux 9 IE 6.0 4 Win 95 WebLogic Linux 10 IE 6.0 None Win ME Apache Win 2000 11 IE 6.0 None Win 98 Apache Win 2000 12 IE 6.0 4 Win NT WebLogic Linux 13 IE 6.0 RealPlayer Win 2000 IIS Win NT 14 IE 6.0 MediaPlayer 7 4 4 15 IE 6.0 MediaPlayer Win XP 4 4 16 IE 6.0 RealPlayer 8 US Win NT 17 IE 5.5 MediaPlayer Win 95 Apache Win NT 18 IE 5.5 RealPlayer Win ME WebLogic 4 19 IE 5.5 RealPlayer Win 98 WebLogic 4 20 IE 5.5 MediaPlayer Win NT Apache Win NT 21 IE 5.5 None Win 2000 4 Linux 22 IE 5.5 4 7 IIS Win 2000 23 IE 5.5 4 Win XP IIS Win 2000 24 IE 5.5 None 8 4 Linux 25 Net 6.0 RealPlayer Win 95 4 Linux 26 Net 6.0 MediaPlayer Win ME IIS Win 2000 27 Net 6.0 MediaPlayer Win 98 IIS Win 2000 28 Net 6.0 RealPlayer Win NT 4 Linux 29 Net 6.0 4 Win 2000 Apache Win NT 30 Net 6.0 None 7 WebLogic 4 31 Net 6.0 None Win XP WebLogic 4 32 Net 6.0 4 8 Apache Win NT 33 Net 6.1 RealPlayer Win 95 4 Win 2000 34 Net 6.1 MediaPlayer Win ME IIS Linux 35 Net 6.1 MediaPlayer Win 98 IIS Linux 36 Net 6.1 RealPlayer Win NT 4 Win 2000 37 Net 6.1 4 Win 2000 Apache 4 38 Net 6.1 None 7 WebLogic Win NT 39 Net 6.1 None Win XP WebLogic 1 Win NT 40 Net 6.1 4 8 Apache 4 41 Moz 1.1 MediaPlayer Win 95 Apache 4 42 Moz 1.1 RealPlayer Win ME WebLogic Win NT 43 Moz 1.1 RealPlayer Win 98 WebLogic Win NT 44 Moz 1.1 MediaPlayer Win NT Apache 4 45 Moz 1.1 None Win 2000 4 Win 2000 46 Moz 1.1 4 7 IIS Linux 47 Moz 1.1 4 Win XP IIS Linux 48 Moz 1.1 None 8 4 Win 2000 49 Net 7.0 4 Win 95 WebLogic Win 2000 50 Net 7.0 None Win ME Apache Linux 51 Net 7.0 None Win 98 Apache Linux 52 Net 7.0 4 Win NT WebLogic Win 2000 53 Net 7.0 RealPlayer Win 2000 IIS 4 54 Net 7.0 MediaPlayer 7 4 Win NT 55 Net 7.0 MediaPlayer Win XP 4 Win NT 56 Net 7.0 RealPlayer 8 IIS 4 57 Opera 7 None Win 95 IIS 4 58 Opera 7 4 Win ME 4 Win NT 59 Opera 7 4 Win 98 4 Win NT 60 Opera 7 None Win NT IIS 4 61 Opera 7 MediaPlayer Win 2000 WebLogic Win 2000 62 Opera 7 RealPlayer 7 Apache Linux 63 Opera 7 RealPlayer Win XP Apache Linux 64 Opera 7 MediaPlayer 8 WebLogic Win 2000 Were it not for the few cells that remain unassigned, the mapping of the orthogonal array, and thus the selection of the test cases, would be completed. What about the unassigned cells—first, why do they exist?; second, what should be done with them? The unassigned cells exist because the orthogonal array chosen was "too big." The perfect size would be an 816133 array; that is, one column that varies from 1 to 8; one column that varies from 1 to 6; and three columns that vary from 1 to 3. Unfortunately, that specific size orthogonal array does not exist. Orthogonal arrays cannot be constructed for any arbitrary size parameters. They come in fixed, "quantum" sizes. 
You can construct one "this big"; you can construct one "that big"; but you cannot necessarily construct one in-between. Famous Software Tester Mick Jagger gives excellent advice regarding this, "You can't always get what you want, But if you try sometimes, You just might find, you get what you need." Famous Software Tester If the perfect size array does not exist, choose one that is slightly bigger and apply these two rules to deal with the "excess." The first rule deals with extra columns. If the orthogonal array chosen has more columns than needed for a particular test scenario, simply delete them. The array will remain orthogonal. The second rule deals with extra values for a variable. In the current example, column 3 runs from 1 to 8 but only 1 through 6 is needed. It is tempting to delete the rows that contain these cells but DON'T. The "orthogonalness" may be lost. Each row in the array exists to provide at least one pair combination that appears nowhere else in the array. If you delete a row, you lose that test case. Instead of deleting them, simply convert the extra cells to valid values. Some automated tools randomly choose from the set of valid values for each cell while others choose one valid value and use it in every cell within a column. Either approach is acceptable. Using this second approach, we'll complete the orthogonal array. Note that it may be difficult to maintain the "balanced" aspect of the array when assigning values to these extra cells. Table 6-10: L64 (8243) with a full mapping of all its columns including the "extra" cells. Browser Plug-in Client OS Server Server OS 1 IE 5.0 None Win 95 IIS Win NT 2 IE 5.0 None Win ME IIS Win NT 3 IE 5.0 None Win 98 IIS Win NT 4 IE 5.0 None Win NT IIS Win NT 5 IE 5.0 MediaPlayer Win 2000 WebLogic Linux 6 IE 5.0 RealPlayer Win 95 Apache Win 2000 7 IE 5.0 RealPlayer Win XP Apache Win 2000 8 IE 5.0 MediaPlayer Win 98 WebLogic Linux 9 IE 6.0 None Win 95 WebLogic Linux 10 IE 6.0 None Win ME Apache Win 2000 11 IE 6.0 None Win 98 Apache Win 2000 12 IE 6.0 None Win NT WebLogic Linux 13 IE 6.0 RealPlayer Win 2000 IIS Win NT 14 IE 6.0 MediaPlayer Win 95 IIS Win NT 15 IE 6.0 MediaPlayer Win XP IIS Win NT 16 IE 6.0 RealPlayer Win 98 IIS Win NT 17 IE 5.5 MediaPlayer Win 95 Apache Win NT 18 IE 5.5 RealPlayer Win ME WebLogic Win NT 19 IE 5.5 RealPlayer Win 98 WebLogic Win NT 20 IE 5.5 MediaPlayer Win NT Apache Win NT 21 IE 5.5 None Win 2000 IIS Linux 22 IE 5.5 None Win 95 IIS Win 2000 23 IE 5.5 None Win XP IIS Win 2000 24 IE 5.5 None Win 98 IIS Linux 25 Net 6.0 RealPlayer Win 95 IIS Linux 26 Net 6.0 MediaPlayer Win ME IIS Win 2000 27 Net 6.0 MediaPlayer Win 98 IIS Win 2000 28 Net 6.0 RealPlayer Win NT IIS Linux 29 Net 6.0 None Win 2000 Apache Win NT 30 Net 6.0 None Win 95 WebLogic Win NT 31 Net 6.0 None Win XP WebLogic Win NT 32 Net 6.0 None Win 98 Apache Win NT 33 Net 6.1 RealPlayer Win 95 IIS Win 2000 34 Net 6.1 MediaPlayer Win ME IIS Linux 35 Net 6.1 MediaPlayer Win 98 IIS Linux 36 Net 6.1 RealPlayer Win NT IIS Win 2000 37 Net 6.1 None Win 2000 Apache Win NT 38 Net 6.1 None Win 95 WebLogic Win NT 39 Net 6.1 None Win XP WebLogic Win NT 40 Net 6.1 None Win 98 Apache Win NT 41 Moz 1.1 MediaPlayer Win 95 Apache Win NT 42 Moz 1.1 RealPlayer Win ME WebLogic Win NT 43 Moz 1.1 RealPlayer Win 98 WebLogic Win NT 44 Moz 1.1 MediaPlayer Win NT Apache Win NT 45 Moz 1.1 None Win 2000 IIS Win 2000 46 Moz 1.1 None Win 95 IIS Linux 47 Moz 1.1 None Win XP IIS Linux 48 Moz 1.1 None Win 98 IIS Win 2000 49 Net 7.0 None Win 95 WebLogic Win 2000 50 Net 7.0 
None Win ME Apache Linux 51 Net 7.0 None Win 98 Apache Linux 52 Net 7.0 None Win NT WebLogic Win 2000 53 Net 7.0 RealPlayer Win 2000 IIS Win NT 54 Net 7.0 MediaPlayer Win 95 IIS Win NT 55 Net 7.0 MediaPlayer Win XP IIS Win NT 56 Net 7.0 RealPlayer Win 98 IIS Win NT 57 Opera 7 None Win 95 IIS Win NT 58 Opera 7 None Win ME IIS Win NT 59 Opera 7 None Win 98 IIS Win NT 60 Opera 7 None Win NT IIS Win NT 61 Opera 7 MediaPlayer Win 2000 WebLogic Win 2000 62 Opera 7 RealPlayer Win 95 Apache Linux 63 Opera 7 RealPlayer Win XP Apache Linux 64 Opera 7 MediaPlayer Win 98 WebLogic Win 2000 5. Construct the test cases. Now, all that remains is to construct a test case for each row in the orthogonal array. Note that the array specifies only the input conditions. An oracle (usually the tester) is required to determine the expected result for each test. Allpairs Algorithm Using orthogonal arrays is one way to identify all the pairs. A second way is to use an algorithm that generates the pairs directly without resorting to an "external" device like an orthogonal array. Reference James Bach provides a tool to generate all pairs combinations at http://www.satisfice.com. Click on Test Methodology and look for Allpairs. Ward Cunningham provides further discussion and the source code for a Java program to generate all pairs combinations at http://fit.c2.com/wiki.cgi?AllPairs. James Bach presents an algorithm to generate all pairs in Lessons Learned in Software Testing. In addition, he provides a program called "Allpairs" that will generate the all pairs combinations. It is available at http://www.satisfice.com. Click on "Test Methodology" and look for Allpairs. Let's apply the Allpairs algorithm to the previous Web site testing problem. After downloading and unzipping, to use Allpairs create a tab-delimited table of the variables and their values. If you are a Windows user, the easiest way is to launch Excel, enter the data into the spreadsheet, and then SaveAs a .txt file. The following table was created and saved as input.txt. Table 6-11: Input to the Allpairs program. Browser Client OS Plug-in Server Server OS IE 5.0 Win 95 None IIS Win NT IE 5.5 Win 98 Real Player Apache Win 2000 IE 6.0 Win ME Media Player WebLogic Linux Netscape 6.0 Win NT Netscape 6.1 Win 2000 Netscape 7.0 Win XP Mozilla 1.1 Opera 7 Then run the Allpairs program by typing: allpairs input.txt > output.txt where output.txt will contain the list of all pairs test cases. The following table was created: Table 6-12: Output from the Allpairs program. 
Browser Client OS Plug-in Server Server OS 1 IE 5.0 Win 95 None IIS Win NT 2 IE 5.0 Win 98 Real Player Apache Win 2000 3 IE 5.0 Win ME Media Player WebLogic Linux 4 IE 5.5 Win 95 Real Player WebLogic Win NT 5 IE 5.5 Win 98 None IIS Linux 6 IE 5.5 Win ME None Apache Win 2000 7 IE 6.0 Win 95 Media Player Apache Linux 8 IE 6.0 Win 98 Real Player IIS Win NT 9 IE 6.0 Win ME None WebLogic Win 2000 10 Netscape 6.0 Win ME Real Player IIS Linux 11 Netscape 6.0 Win NT Media Player IIS Win 2000 12 Netscape 6.0 Win 2000 None Apache Win NT 13 Netscape 6.1 Win NT None WebLogic Linux 14 Netscape 6.1 Win 2000 Media Player IIS Win 2000 15 Netscape 6.1 Win XP Real Player Apache Win NT 16 Netscape 7.0 Win NT Real Player Apache Win NT 17 Netscape 7.0 Win 2000 Media Player WebLogic Linux 18 Netscape 7.0 Win XP Media Player IIS Win 2000 19 Mozilla 1.1 Win XP Media Player WebLogic Win NT 20 Mozilla 1.1 Win 98 Media Player Apache Linux 21 Mozilla1.1 Win 95 Real Player IIS Win 2000 22 Opera 7 Win XP None WebLogic Linux 23 Opera 7 Win 98 Real Player WebLogic Win 2000 24 Opera 7 Win ME Media Player Apache Win NT 25 IE 5.5 Win 2000 Real Player ~WebLogic ~Linux 26 IE 5.5 Win NT Media Player ~IIS ~Win NT 27 Netscape 6.0 Win 95 ~None WebLogic ~Win 2000 28 Netscape 7.0 Win 95 None ~Apache ~Linux 29 Mozilla 1.1 Win ME None ~IIS ~Win NT 30 Opera 7 Win NT ~Real Player IIS ~Linux 31 IE 5.0 Win NT ~None ~Apache ~Win 2000 32 IE 5.0 Win 2000 ~Real Player ~IIS ~Win NT 33 IE 5.0 Win XP ~None ~WebLogic ~Linux 34 IE 5.5 Win XP ~Real Player ~Apache ~Win 2000 35 IE 6.0 Win 2000 ~None ~Apache ~Win 2000 36 IE 6.0 Win NT ~Real Player ~WebLogic ~Win NT 37 IE 6,0 Win XP ~Media Player ~IIS ~Linux 38 Netscape 6.0 Win 98 ~Media Player ~WebLogic ~Win NT 39 Netscape 6.0 Win XP ~Real Player ~Apache ~Linux 40 Netscape 6.1 Win 95 ~Media Player ~Apache ~Win 2000 41 Netscape 6.1 Win 98 ~None ~IIS ~Win NT 42 Netscape 6.1 Win ME ~Real Player ~WebLogic ~Linux 43 Netscape 7.0 Win 98 ~None ~WebLogic ~Win 2000 44 Netscape 7.0 Win ME ~Real Player -US ~Win NT 45 Mozilla 1.1 Win NT ~None ~Apache ~Linux 46 Mozilla 1.1 Win 2000 ~Real Player ~WebLogic ~Win 2000 47 Opera 7 Win 95 ~Media Player ~IIS ~Win NT 48 Opera 7 Win 2000 ~None ~Apache ~Win 2000 When a particular value in the test case doesn't matter, because all of its pairings have already been selected, it is marked with a ~. Bach's algorithm chooses the value that has been paired the fewest times relative to the others in the test case. Any other value could be substituted for one prefixed with a ~ and all pairs coverage would still be maintained. This might be done to test more commonly used or more critical combinations more often. In addition, Bach's program displays information on how the pairings were done. It lists each pair, shows how many times that pair occurs in the table, and indicates each test case that contains that pair. Because of the "balanced" nature of orthogonal arrays, that approach required sixty-four test cases. The "unbalanced" nature of the all pairs selection algorithm requires only forty-eight test cases, a savings of 25 percent. Note that the combinations chosen by the Orthogonal Array method may not be the same as those chosen by Allpairs. It does not matter. What does matter is that all of the pair combinations of parameters are chosen. Those are the combinations we want to test. 
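For readers who want to experiment without downloading a tool, the following Python sketch shows one simple greedy way to build a pair-covering test set for the Web example. It is only a rough stand-in, assuming the parameter lists given earlier; it is neither Bach's Allpairs program nor an orthogonal-array method, and it will not necessarily produce the minimum number of tests.

# A minimal greedy sketch: repeatedly pick the full combination that covers
# the most not-yet-covered value pairs until every pair appears in some test.
from itertools import combinations, product

parameters = {
    "Browser":   ["IE 5.0", "IE 5.5", "IE 6.0", "Netscape 6.0", "Netscape 6.1",
                  "Netscape 7.0", "Mozilla 1.1", "Opera 7"],
    "Plug-in":   ["None", "RealPlayer", "MediaPlayer"],
    "Client OS": ["Win 95", "Win 98", "Win ME", "Win NT", "Win 2000", "Win XP"],
    "Server":    ["IIS", "Apache", "WebLogic"],
    "Server OS": ["Win NT", "Win 2000", "Linux"],
}

names = list(parameters)
# Every value pair that must appear in at least one test case.
uncovered = {((a, va), (b, vb))
             for a, b in combinations(names, 2)
             for va in parameters[a] for vb in parameters[b]}

def pairs_of(case):
    return {((a, case[a]), (b, case[b])) for a, b in combinations(names, 2)}

tests = []
while uncovered:
    best = max(product(*parameters.values()),
               key=lambda vals: len(pairs_of(dict(zip(names, vals))) & uncovered))
    case = dict(zip(names, best))
    tests.append(case)
    uncovered -= pairs_of(case)

print(len(tests), "test cases cover all pairs")

Because each chosen combination always covers at least one remaining pair, the loop terminates; the size of the resulting suite depends on the greedy choices and is typically in the same range as the tool-generated tables above.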
Proponents of the Allpairs algorithm point out that given a problem with 100 parameters, each capable of taking on one of two values, 101 test cases would be required using a (balanced) orthogonal array while the un-balanced all pairs approach requires only ten tests. Since many applications have large numbers of inputs that take on only a few values each, they argue the all pairs approach is superior. Tool The AETG tool from Telcordia implements the all-pairs testing approach. See http://aetgweb.argreenhouse.com. Final Comments In some situations, constraints exist between certain choices of some of the variables. For example, Microsoft's IIS and Apple's MacOS are not compatible. It is certain that the pairwise techniques will choose that combination for test. (Remember, it does select all the pairs.) When creating pairwise subsets by hand, honoring these various constraints can be difficult. Both the rdExpert and AETG tools have this ability. You define the constraints and the tool selects pairs meeting those constraints. Given the two approaches to pairwise testing, orthogonal arrays and the Allpairs algorithm, which is more effective? One expert, who favors orthogonal arrays, believes that the coverage provided by Allpairs is substantially inferior. He notes that the uniform distribution of test points in the domain offers some coverage against faults that are more complex than double-mode faults. Another expert, who favors the Allpairs approach, notes that Allpairs does, in fact, test all the pairs, which is the goal. He claims there is no evidence that the orthogonal array approach detects more defects. He also notes that the Allpairs tool is available free on the Web. What both experts acknowledge is that no documented studies exist comparing the efficacy of one approach over the other. The exciting hope of pairwise testing is that by creating and running between 1 percent to 20 percent of the tests you will find between 70 percent and 85 percent of the total defects. There is no promise here, only a hope. Many others have experienced this significant result. Try this technique. Discover whether it works for you. Cohen reported that in addition to reducing the number of test cases and increasing the defect find rate, test cases created by the Allpairs algorithm also provided better code coverage. A set of 300 randomly selected tests achieved 67 percent statement coverage and 58 percent decision coverage while the 200 all pairs test cases achieved 92 percent block coverage and 85 percent decision coverage, a significant increase in coverage with fewer test cases. One final comment—it is possible that certain important combinations may be missed by both pairwise approaches. The 80:20 rule tells us that combinations are not uniformly important. Use your judgment to determine if certain additional tests should be created for those combinations. In the previous example we can be assured that the distribution of browsers is not identical. It would be truly amazing if 12.5 percent of our users had IE 5.0, 12.5 percent had IE 5.5, 12.5 percent had IE 6.0, etc. Certain combinations occur more frequently than others. In addition, some combinations exist that, while used infrequently, absolutely positively must work properly —"shut down the nuclear reactor" is a good example. In case pairwise misses an important combination, please add that combination to your test cases. 
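If you do add such hand-picked combinations, it is worth confirming that the final suite still covers every pair. The sketch below is an assumption for illustration, not a published tool: it reports any value pairs a test suite misses, where tests is a list of dictionaries such as those built by the previous sketch.

# Report the value pairs that a finished test suite fails to cover.
def missing_pairs(tests, parameters):
    from itertools import combinations
    names = list(parameters)
    required = {((a, va), (b, vb))
                for a, b in combinations(names, 2)
                for va in parameters[a] for vb in parameters[b]}
    covered = set()
    for case in tests:
        covered |= {((a, case[a]), (b, case[b])) for a, b in combinations(names, 2)}
    return required - covered

# Example usage (with the tests and parameters from the previous sketch):
# print(sorted(missing_pairs(tests, parameters)))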
Applicability and Limitations

Like other test design approaches previously presented, pairwise testing can significantly reduce the number of test cases that must be created and executed. It is equally applicable at the unit, integration, system, and acceptance test levels. All it requires is combinations of inputs, each taking on various values, that result in a combinatorial explosion: too many combinations to test. Remember, there is no underlying "software defect physics" that guarantees pairwise testing will be of benefit. There is only one way to know—try it.

Summary

When the number of combinations to test is very large, do not attempt to test all combinations of all the values of all the variables; instead, test all pairs of variables. This significantly reduces the number of tests that must be created and run.

Studies suggest that most defects are either single-mode defects (the function under test simply does not work) or double-mode defects (the pairing of this function/module with that function/module fails). Pairwise testing defines a minimal subset that guides us to test for all single-mode and double-mode defects. The success of this technique on many projects, both documented and undocumented, is a great motivation for its use.

An orthogonal array is a two-dimensional array of numbers with an interesting property: choose any two columns of the array and all of the pairwise combinations of their values will occur.

There is no underlying "software defect physics" that guarantees pairwise testing will be of benefit. There is only one way to know—try it.

Practice

1. Neither the Brown & Donaldson nor the Stateless University Registration System case studies contain huge numbers of combinations suitable for the pairwise testing approach. As exercises, use the orthogonal array and/or all pairs technique on the two examples below. Determine the set of pairwise test cases using the chosen technique.

   a. A bank has created a new data processing system that is ready for testing. This bank has different kinds of customers—consumers, very important consumers, businesses, and non-profits; different kinds of accounts—checking, savings, mortgages, consumer loans, and commercial loans; and it operates in different states, each with different regulations—California, Nevada, Utah, Idaho, Arizona, and New Mexico.

   b. In an object-oriented system, an object of class A can send a message containing a parameter P to an object of class X. Classes B, C, and D inherit from A so they too can send the message. Classes Q, R, S, and T inherit from P so they too can be passed as the parameter. Classes Y and Z inherit from X so they too can receive the message.

References

Brownlie, Robert, et al. "Robust Testing of AT&T PMX/StarMAIL Using OATS," AT&T Technical Journal, Vol. 71, No. 3, May/June 1992, pp. 41–47.

Cohen, D.M., et al. "The AETG System: An Approach to Testing Based on Combinatorial Design." IEEE Transactions on Software Engineering, Vol. 23, No. 7, July 1997.

Kaner, Cem, James Bach, and Bret Pettichord (2002). Lessons Learned in Software Testing: A Context-Driven Approach. John Wiley & Sons.

Kuhn, D. Richard and Michael J. Reilly. "An Investigation of the Applicability of Design of Experiments to Software Testing," 27th NASA/IEEE Software Engineering Workshop, NASA Goddard Space Flight Center, 4–6 December 2002. http://csrc.nist.gov/staff/kuhn/kuhn-reilly-02.pdf

Mandl, Robert. "Orthogonal Latin Squares: An Application of Experiment Design to Compiler Testing," Communications of the ACM, Vol. 28, No. 10, October 1985, pp. 1054–1058.

Phadke, Madhav S. (1989). Quality Engineering Using Robust Design. Prentice-Hall.

Wallace, Delores R. and D. Richard Kuhn. "Failure Modes In Medical Device Software: An Analysis Of 15 Years Of Recall Data," International Journal of Reliability, Quality, and Safety Engineering, Vol. 8, No. 4, 2001.

Chapter 7: State-Transition Testing

Colonel Cleatus Yorbville had been one seriously bored astronaut for the first few months of his diplomatic mission on the third planet of the Frangelicus XIV system, but all that had changed on the day he'd discovered that his tiny, multipedal and infinitely hospitable alien hosts were not only edible but tasted remarkably like that stuff that's left on the pan after you've made cinnamon buns and burned them a little.
— Mark Silcox

Introduction

State-transition diagrams, like decision tables, are another excellent tool to capture certain types of system requirements and to document internal system design. These diagrams document the events that come into and are processed by a system as well as the system's responses. Unlike decision tables, they specify very little in terms of processing rules. When a system must remember something about what has happened before, or when valid and invalid orders of operations exist, state-transition diagrams are excellent tools to record this information. These diagrams are also vital tools in the tester's personal toolbox. Unfortunately, many analysts, designers, programmers, and testers are not familiar with this technique.

Technique

State-Transition Diagrams

It is easier to introduce state-transition diagrams by example rather than by formal definition. Since neither Brown & Donaldson nor the Stateless University Registration System has substantial state-transition based requirements, let's consider a different example. To get to Stateless U, we need an airline reservation. Let's call our favorite carrier (Grace L. Ferguson Airline & Storm Door Company) to make a reservation. We provide some information including departure and destination cities, dates, and times. A reservation agent, acting as our interface to the airline's reservation system, uses that information to make a reservation. At that point, the Reservation is in the Made state. In addition, the system creates and starts a timer. Each reservation has certain rules about when the reservation must be paid for. These rules are based on destination, class of service, dates, etc. If this timer expires before the reservation is paid for, the reservation is cancelled by the system. In state-transition notation this information is recorded as:

Figure 7-1: The Reservation is Made.

The circle represents one state of the Reservation—in this case the Made state. The arrow shows the transition into the Made state. The description on the arrow, giveInfo, is an event that comes into the system from the outside world. The command after the "/" denotes an action of the system; in this case startPayTimer. The black dot indicates the starting point of the diagram. Sometime after the Reservation is made, but (hopefully) before the PayTimer expires, the Reservation is paid for. This is represented by the arrow labeled payMoney. When the Reservation is paid, it transitions from the Made state to the Paid state.

Figure 7-2: The Reservation transitions to the Paid state.
Before we proceed, let's define the terms more formally:

State (represented by a circle)—A state is a condition in which a system is waiting for one or more events. States "remember" inputs the system has received in the past and define how the system should respond to subsequent events when they occur. These events may cause state-transitions and/or initiate actions. The state is generally represented by the values of one or more variables within a system.

Transition (represented by an arrow)—A transition represents a change from one state to another caused by an event.

Event (represented by a label on a transition)—An event is something that causes the system to change state. Generally, it is an event in the outside world that enters the system through its interface. Sometimes it is generated within the system, such as Timer expires or Quantity on Hand goes below Reorder Point. Events are considered to be instantaneous. Events can be independent or causally related (event B cannot take place before event A). When an event occurs, the system can change state or remain in the same state and/or execute an action. Events may have parameters associated with them. For example, Pay Money may indicate Cash, Check, Debit Card, or Credit Card.

Action (represented by a command following a "/")—An action is an operation initiated because of a state change. It could be print a Ticket, display a Screen, turn on a Motor, etc. Often these actions create the outputs of the system. Note that actions occur on transitions between states. The states themselves are passive.

The entry point on the diagram is shown by a black dot while the exit point is shown by a bull's-eye symbol. This notation was created by Mealy. An alternate notation has been defined by Moore but is less frequently used. For a much more in-depth discussion of state-transition diagrams see Fowler and Scott's book, UML Distilled: A Brief Guide to the Standard Object Modeling Language. It discusses more complex issues such as partitioned and nested state-transition diagrams.

Note that the state-transition diagram represents one specific entity (in this case a Reservation). It describes the states of a reservation, the events that affect the reservation, the transitions of the reservation from one state to another, and actions that are initiated by the reservation. A common mistake is to mix different entities into one state-transition diagram. An example might be mixing Reservation and Passenger with events and actions corresponding to each.

From the Paid state the Reservation transitions to the Ticketed state when the print command (an event) is issued. Note that in addition to entering the Ticketed state, a Ticket is output by the system.

Figure 7-3: The Reservation transitions to the Ticketed state.

From the Ticketed state we giveTicket to the gate agent to board the plane.

Figure 7-4: The Reservation transitions to the Used state.

After some other action or period of time, not indicated on this diagram, the state-transition path ends at the bull's-eye symbol.

Figure 7-5: The path ends.

Does this diagram show all the possible states, events, and transitions in the life of a Reservation? No. If the Reservation is not paid for in the time allotted (the PayTimer expires), it is cancelled for non-payment.

Figure 7-6: The PayTimer expires and the Reservation is cancelled for nonpayment.

Finished yet? No. Customers sometimes cancel their reservations.
From the Made state the customer (through the reservation agent) asks to cancel the Reservation. A new state, Cancelled By Customer, is required.

Figure 7-7: Cancel the Reservation from the Made state.

In addition, a Reservation can be cancelled from the Paid state. In this case a Refund should be generated and leave the system. The resulting state again is Cancelled By Customer.

Figure 7-8: Cancellation from the Paid state.

One final addition. From the Ticketed state the customer can cancel the Reservation. In that case a Refund should be generated and the next state should be Cancelled By Customer. But this is not sufficient. The airline will generate a refund, but only when it receives the printed Ticket from the customer. This introduces one new notational element—square brackets [] that contain a conditional that can be evaluated either True or False. This conditional acts as a guard, allowing the transition only if the condition is true.

Figure 7-9: Cancellation from the Ticketed state.

Note that the diagram is still incomplete. No arrows or bull's-eyes emerge from the Cancelled states. Perhaps we could reinstate a reservation from the Cancelled NonPay state. We could continue expanding the diagram to include seat selection, flight cancellation, and other significant events affecting the reservation, but this is sufficient to illustrate the technique.

As described, state-transition diagrams express complex system rules and interactions in a very compact notation. Hopefully, when this complexity exists, analysts and designers will have created state-transition diagrams to document system requirements and to guide their design.

State-Transition Tables

A state-transition diagram is not the only way to document system behavior. The diagrams may be easier to comprehend, but state-transition tables may be easier to use in a complete and systematic manner. State-transition tables consist of four columns—Current State, Event, Action, and Next State.

Table 7-1: State-Transition table for Reservation.

Current State  Event            Action         Next State
null           giveInfo         startPayTimer  Made
null           payMoney         --             null
null           print            --             null
null           giveTicket       --             null
null           cancel           --             null
null           PayTimerExpires  --             null
Made           giveInfo         --             Made
Made           payMoney         --             Paid
Made           print            --             Made
Made           giveTicket       --             Made
Made           cancel           --             Can-Cust
Made           PayTimerExpires  --             Can-NonPay
Paid           giveInfo         --             Paid
Paid           payMoney         --             Paid
Paid           print            Ticket         Ticketed
Paid           giveTicket       --             Paid
Paid           cancel           Refund         Can-Cust
Paid           PayTimerExpires  --             Paid
Ticketed       giveInfo         --             Ticketed
Ticketed       payMoney         --             Ticketed
Ticketed       print            --             Ticketed
Ticketed       giveTicket       --             Used
Ticketed       cancel           Refund         Can-Cust
Ticketed       PayTimerExpires  --             Ticketed
Used           giveInfo         --             Used
Used           payMoney         --             Used
Used           print            --             Used
Used           giveTicket       --             Used
Used           cancel           --             Used
Used           PayTimerExpires  --             Used
Can-NonPay     giveInfo         --             Can-NonPay
Can-NonPay     payMoney         --             Can-NonPay
Can-NonPay     print            --             Can-NonPay
Can-NonPay     giveTicket       --             Can-NonPay
Can-NonPay     cancel           --             Can-NonPay
Can-NonPay     PayTimerExpires  --             Can-NonPay
Can-Cust       giveInfo         --             Can-Cust
Can-Cust       payMoney         --             Can-Cust
Can-Cust       print            --             Can-Cust
Can-Cust       giveTicket       --             Can-Cust
Can-Cust       cancel           --             Can-Cust
Can-Cust       PayTimerExpires  --             Can-Cust

The advantage of a state-transition table is that it lists all possible state-transition combinations, not just the valid ones. When testing critical, high-risk systems such as avionics or medical devices, testing every state-transition pair may be required, including those that are not valid.
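To show how compact this behavior is in code, here is a minimal sketch of the Reservation's states and transitions. The states, events, actions, and the [ticket returned] guard follow the diagram; the class structure, method names, and the omission of the actual pay timer are assumptions made only for illustration, not the book's design.

    // Minimal sketch of the Reservation's states and transitions.
    // The pay timer itself is not modeled; its expiry arrives as the
    // PAY_TIMER_EXPIRES event.
    public class Reservation {

        enum State { MADE, PAID, TICKETED, USED, CAN_NONPAY, CAN_CUST }
        enum Event { GIVE_INFO, PAY_MONEY, PRINT, GIVE_TICKET, CANCEL, PAY_TIMER_EXPIRES }

        private State state = State.MADE;        // entered via giveInfo / startPayTimer in the diagram
        private boolean ticketReturned = false;  // guard on cancel from the Ticketed state

        // Handle one event: perform the action (if any) and move to the next state.
        // State/event pairs not listed here leave the state unchanged, matching the
        // "--" rows of the state-transition table.
        public void handle(Event e) {
            switch (state) {
                case MADE:
                    if (e == Event.PAY_MONEY)               state = State.PAID;
                    else if (e == Event.CANCEL)             state = State.CAN_CUST;
                    else if (e == Event.PAY_TIMER_EXPIRES)  state = State.CAN_NONPAY;
                    break;
                case PAID:
                    if (e == Event.PRINT)       { printTicket(); state = State.TICKETED; }
                    else if (e == Event.CANCEL) { refund();      state = State.CAN_CUST; }
                    break;
                case TICKETED:
                    if (e == Event.GIVE_TICKET) state = State.USED;
                    else if (e == Event.CANCEL && ticketReturned) { refund(); state = State.CAN_CUST; }
                    break;
                default:
                    // Used and the two Cancelled states accept no further events in this diagram.
                    break;
            }
        }

        public void setTicketReturned(boolean returned) { ticketReturned = returned; }
        public State getState() { return state; }

        private void printTicket() { System.out.println("Ticket printed"); }
        private void refund()      { System.out.println("Refund issued"); }
    }

A tester can drive such an implementation straight from Table 7-1: deliver every event in every state and compare the resulting state and actions against the corresponding row.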
Creating a state-transition table also often unearths combinations that were not identified, documented, or dealt with in the requirements. It is highly beneficial to discover these defects before coding begins.

Key Point: The advantage of a state-transition table is that it lists all possible state-transition combinations, not just the valid ones.

Using a state-transition table can help detect defects in implementation that enable invalid paths from one state to another. The disadvantage of such tables is that they become very large very quickly as the number of states and events increases. In addition, the tables are generally sparse; that is, most of the cells are empty.

Creating Test Cases

Information in the state-transition diagrams can easily be used to create test cases. Four different levels of coverage can be defined:

1. Create a set of test cases such that all states are "visited" at least once under test. The set of three test cases shown below meets this requirement. Generally this is a weak level of test coverage.

Figure 7-10: A set of test cases that "visit" each state.

2. Create a set of test cases such that all events are triggered at least once under test. Note that the test cases that cover each event can be the same as those that cover each state. Again, this is a weak level of coverage.

Figure 7-11: A set of test cases that trigger all events at least once.

3. Create a set of test cases such that all paths are executed at least once under test. While this level is the most preferred because of its thoroughness, it may not be feasible. If the state-transition diagram has loops, then the number of possible paths may be infinite. For example, consider a system with two states, A and B, where A transitions to B and B transitions to A. A few of the possible paths are:

A→B
A→B→A
A→B→A→B→A→B
A→B→A→B→A→B→A
A→B→A→B→A→B→A→B→A→B
... and so on forever.

Testing of loops such as this can be important if they may result in accumulating computational errors or resource loss (locks without corresponding releases, memory leaks, etc.).

Key Point: Testing every transition is usually the recommended level of coverage for a state-transition diagram.

4. Create a set of test cases such that all transitions are exercised at least once under test. This level of testing provides a good level of coverage without generating large numbers of tests. This level is generally the one recommended.

Figure 7-12: A set of test cases that trigger all transitions at least once.

Test cases can also be read directly from the state-transition table. The rows marked with an asterisk (*) in the following table show all the valid transitions.

Table 7-2: Testing all valid transitions from a State-Transition table.
Current State  Event            Action         Next State
null           giveInfo         startPayTimer  Made        *
null           payMoney         --             null
null           print            --             null
null           giveTicket       --             null
null           cancel           --             null
null           PayTimerExpires  --             null
Made           giveInfo         --             Made
Made           payMoney         --             Paid        *
Made           print            --             Made
Made           giveTicket       --             Made
Made           cancel           --             Can-Cust    *
Made           PayTimerExpires  --             Can-NonPay  *
Paid           giveInfo         --             Paid
Paid           payMoney         --             Paid
Paid           print            Ticket         Ticketed    *
Paid           giveTicket       --             Paid
Paid           cancel           Refund         Can-Cust    *
Paid           PayTimerExpires  --             Paid
Ticketed       giveInfo         --             Ticketed
Ticketed       payMoney         --             Ticketed
Ticketed       print            --             Ticketed
Ticketed       giveTicket       --             Used        *
Ticketed       cancel           Refund         Can-Cust    *
Ticketed       PayTimerExpires  --             Ticketed
Used           giveInfo         --             Used
Used           payMoney         --             Used
Used           print            --             Used
Used           giveTicket       --             Used
Used           cancel           --             Used
Used           PayTimerExpires  --             Used
Can-NonPay     giveInfo         --             Can-NonPay
Can-NonPay     payMoney         --             Can-NonPay
Can-NonPay     print            --             Can-NonPay
Can-NonPay     giveTicket       --             Can-NonPay
Can-NonPay     cancel           --             Can-NonPay
Can-NonPay     PayTimerExpires  --             Can-NonPay
Can-Cust       giveInfo         --             Can-Cust
Can-Cust       payMoney         --             Can-Cust
Can-Cust       print            --             Can-Cust
Can-Cust       giveTicket       --             Can-Cust
Can-Cust       cancel           --             Can-Cust
Can-Cust       PayTimerExpires  --             Can-Cust

In addition, depending on the system risk, you may want to create test cases for some or all of the invalid state/event pairs to make sure the system has not implemented invalid paths.

Applicability and Limitations

State-transition diagrams are excellent tools to capture certain system requirements, namely those that describe states and their associated transitions. These diagrams then can be used to direct our testing efforts by identifying the states, events, and transitions that should be tested.

State-transition diagrams are not applicable when the system has no state or does not need to respond to real-time events from outside of the system. An example is a payroll program that reads an employee's time record, computes pay, subtracts deductions, saves the record, prints a paycheck, and repeats the process.

Summary

State-transition diagrams direct our testing efforts by identifying the states, events, actions, and transitions that should be tested. Together, these define how a system interacts with the outside world, the events it processes, and the valid and invalid order of these events.

A state-transition diagram is not the only way to document system behavior. The diagrams may be easier to comprehend, but state-transition tables may be easier to use in a complete and systematic manner.

The generally recommended level of testing using state-transition diagrams is to create a set of test cases such that all transitions are exercised at least once under test. In high-risk systems, you may want to create even more test cases, approaching all paths if possible.

Practice

1. This exercise refers to the Stateless University Registration System Web site described in Appendix B. Below is a state-transition diagram for the "enroll in a course" and "drop a course" process. Determine a set of test cases that you feel adequately cover the enroll and drop process. The following terms are used in the diagram:

Events
create - Create a new course.
enroll - Add a student to the course.
drop - Drop a student from the course.

Attributes
ID - The student identification number.
max - The maximum number of students a course can hold.
#enrolled - The number of students currently enrolled in the course.
#waiting - The number of students currently on the Wait List for this course.
Tests
isEnrolled - Answers "is the student enrolled (on the Section List)?"
onWaitList - Answers "is the student on the WaitList?"

Lists
SectionList - A list of students enrolled in the class.
WaitList - A list of students waiting to be enrolled in a full class.

Symbols
++ Increment by 1.
-- Decrement by 1.

Figure 7-13: State-transition diagram for enroll and drop a course at Stateless U.

References

Binder, Robert V. (1999). Testing Object-Oriented Systems: Models, Patterns, and Tools. Addison-Wesley.

Fowler, Martin and Kendall Scott (1999). UML Distilled: A Brief Guide to the Standard Object Modeling Language (2nd Edition). Addison-Wesley.

Harel, David. "Statecharts: a visual formalism for complex systems." Science of Computer Programming 8, 1987, pp. 231–274.

Mealy, G.H. "A method for synthesizing sequential circuits." Bell System Technical Journal, 34(5): 1045–1079, 1955.

Moore, E.F. "Gedanken-experiments on sequential machines," Automata Studies (C.E. Shannon and J. McCarthy, eds.), pp. 129–153, Princeton, New Jersey: Princeton University Press, 1956.

Rumbaugh, James, et al. (1991). Object-Oriented Modeling and Design. Prentice-Hall.

Chapter 8: Domain Analysis Testing

Standing in the concessions car of the Orient Express as it hissed and lurched away from the station, Special Agent Chu could feel enemy eyes watching him from the inky shadows and knew that he was being tested, for although he had never tasted a plug of tobacco in his life, he was impersonating an arms dealer known to be a connoisseur, so he knew that he, the Chosen One, Chow Chu, had no choice but to choose the choicest chew on the choo-choo.
— Loren Haarsma

Introduction

In the chapters on Equivalence Class and Boundary Value testing, we considered the testing of individual variables that took on values within specified ranges. In this chapter we will consider the testing of multiple variables simultaneously. There are two reasons to consider this:

We rarely will have time to create test cases for every variable in our systems. There are simply too many.

Often variables interact. The value of one variable constrains the acceptable values of another. In this case, certain defects cannot be discovered by testing them individually.

Domain analysis is a technique that can be used to identify efficient and effective test cases when multiple variables can or should be tested together. It builds on and generalizes equivalence class and boundary value testing to n simultaneous dimensions. Like those techniques, we are searching for situations where the boundary has been defined or implemented incorrectly.

Key Point: Domain analysis is a technique that can be used to identify efficient and effective test cases when multiple variables should be tested together.

In two dimensions (with two interacting parameters) the following defects can occur:

A shifted boundary, in which the boundary is displaced vertically or horizontally
A tilted boundary, in which the boundary is rotated at an incorrect angle
A missing boundary
An extra boundary

Figure 8-1 is adapted from Binder. It illustrates these four types of defects graphically.

Figure 8-1: Two-dimensional boundary defects.

Certainly there can be interactions between three or more variables, but the diagrams are more difficult to visualize.

Technique

The domain analysis process guides us in choosing efficient and effective test cases. First, a number of definitions:

An on point is a value that lies on a boundary.

An off point is a value that does not lie on a boundary.
An in point is a value that satisfies all the boundary conditions but does not lie on a boundary.

An out point is a value that does not satisfy any boundary condition.

Choosing on and off points is more complicated than it may appear. When the boundary is closed (defined by an operator containing an equality, i.e., ≤, ≥, or =) so that points on the boundary are included in the domain, then an on point lies on the boundary and is included within the domain, while an off point lies outside the domain. When the boundary is open (defined by an inequality operator < or >) so that points on the boundary are not included in the domain, then an on point lies on the boundary but is not included within the domain, while an off point lies inside the domain. Confused? At this point examples are certainly in order.

Figure 8-2: Examples of on, off, in, and out points for both closed and open boundaries.

On the left is an example of a closed boundary. The region defined consists of all the points greater than or equal to 10. The on point has the value 10. The off point is slightly off the boundary and outside the domain. The in point is within the domain. The out point is outside the domain.

On the right is an example of an open boundary. The region defined consists of all the points greater than (but not equal to) 10. Again, the on point has a value of 10. The off point is slightly off the boundary and inside the domain. The in point is within the domain. The out point is outside the domain.

Having defined these points, the 1x1 ("one-by-one") domain analysis technique instructs us to choose these test cases:

For each relational condition (≥, >, ≤, or <), choose one on point and one off point.

For each strict equality condition (=), choose one on point and two off points, one slightly less than the conditional value and one slightly greater than the value.

Note that there is no reason to repeat identical tests for adjacent domains. If an off point for one domain is the in point for another, do not duplicate these tests.

Binder suggests a very useful table for documenting 1x1 domain analysis test cases, called the Domain Test Matrix.

Table 8-1: Example Domain Test Matrix.

Note that test cases 1 through 8 test the on points and off points for each condition of the first variable (X1) while holding the value of the second variable (X2) at a typical in point. Test cases 9 through 16 hold the first variable at a typical in point while testing the on and off points for each condition of the second variable. Additional variables and conditions would follow the same pattern.

Example

Admission to Stateless University is made by considering a combination of high school grades and ACT test scores. The shaded cells in the following table indicate the combinations that would guarantee acceptance. Grade Point Averages (GPAs) are shown across the top while ACT scores are shown down the left side. Stateless University is a fairly exclusive school in terms of its admission policy.

Explanation: The ACT Assessment is an examination designed to assess high school students' general educational development and their ability to complete college-level work. The Grade Point Average is based on converting letter grades to numeric values: A = 4.0 (Best), B = 3.0, C = 2.0 (Average), D = 1.0.

Table 8-2: Stateless University Admissions Matrix.
This table can be represented as the solution set of these three linear equations:

ACT ≤ 36 (the highest score possible)
GPA ≤ 4.0 (the highest value possible)
10*GPA + ACT ≥ 71

(The third equation can be found by using the good old y=mx+b formula from elementary algebra. Use points {ACT=36, GPA=3.5} and {ACT=31, GPA=4.0} and crank—that's math slang for solve the pair of simultaneous equations obtained by substituting each of these two points into the y=mx+b equation.)

Figure 8-3: Stateless University Admissions Matrix in graphical form.

The following test cases cover these three boundaries using the 1x1 domain analysis process.

Table 8-3: 1x1 Domain Analysis test cases for Stateless University admissions.

Test cases 1 and 2 verify the GPA ≤ 4.0 constraint. Case 1 checks on the GPA = 4.0 boundary while case 2 checks just outside the boundary with GPA = 4.1. Both of these cases use typical values for the ACT and GPA/ACT constraints.

Test cases 3 and 4 verify the ACT ≤ 36 constraint. Case 3 checks on the ACT = 36 boundary while case 4 checks just outside the boundary with ACT = 37. Both of these cases use typical values for the GPA and GPA/ACT constraints.

Test cases 5 and 6 verify the 10*GPA + ACT ≥ 71 constraint. Case 5 checks on the boundary with GPA = 3.7 and ACT = 34 while case 6 checks just outside the boundary with GPA = 3.8 and ACT = 32. Both of these cases use typical values for the GPA and ACT constraints.

Applicability and Limitations

Domain analysis is applicable when multiple variables (such as input fields) should be tested together, either for efficiency or because of a logical interaction. While this technique is best suited to numeric values, it can be generalized to Booleans, strings, enumerations, etc.

Summary

Domain analysis facilitates the testing of multiple variables simultaneously. It is useful because we rarely will have time to create test cases for every variable in our systems. There are simply too many. In addition, often variables interact. When the value of one variable constrains the acceptable values of another, certain defects cannot be discovered by testing them individually.

Domain analysis builds on and generalizes equivalence class and boundary value testing to n simultaneous dimensions. Like those techniques, we are searching for situations where the boundary has been implemented incorrectly.

In using the 1x1 domain analysis technique, for each relational condition (≥, >, ≤, or <) we choose one on point and one off point. For each strict equality condition (=) we choose one on point and two off points, one slightly less than the conditional value and one slightly greater than the value.

Practice

1. Stateless University prides itself in preparing not just educated students but good citizens of their nation. (That's what their advertising brochure says.) In addition to their major and minor coursework, Stateless U. requires each student to take (and pass) a number of General Education classes. These are:

College Algebra (the student may either take the course or show competency through testing).
Our Nation's Institutions—a survey course of our nation's history, government, and place in the world.
From four to sixteen hours of Social Science courses (numbers 100–299).
From four to sixteen hours of Physical Science courses (numbers 100–299).
No more than twenty-four combined hours of Social Science and Physical Science courses may be counted toward graduation.
Apply 1x1 domain analysis to these requirements, derive the test cases, and use Binder's Domain Test Matrix to document them.

References

Beizer, Boris (1990). Software Testing Techniques. Van Nostrand Reinhold.

Binder, Robert V. (2000). Testing Object-Oriented Systems: Models, Patterns, and Tools. Addison-Wesley.

Chapter 9: Use Case Testing

The Insect Keeper General, sitting astride his giant hovering aphid, surveyed the battlefield which reeked with the stench of decay and resonated with the low drone of the tattered and dying mutant swarms as their legs kicked forlornly at the sky before turning to his master and saying, 'My Lord, your flies are undone.'
— Andrew Vincent

Introduction

Up until now we have examined test case design techniques for parts of a system—input variables with their ranges and boundaries, business rules as represented in decision tables, and system behaviors as represented in state-transition diagrams. Now it is time to consider test cases that exercise a system's functionalities from start to finish by testing each of its individual transactions.

Defining the transactions that a system processes is a vital part of the requirements definition process. Various approaches to documenting these transactions have been used in the past. Examples include flowcharts, HIPO diagrams, and text. Today, the most popular approach is the use case diagram. Like decision tables and state-transition diagrams, use cases are usually created by developers for developers. But, like these other techniques, use cases hold a wealth of information useful to testers.

Use cases were created by Ivar Jacobson and popularized in his book Object-Oriented Software Engineering: A Use Case Driven Approach. Jacobson defines a "use case" as a scenario that describes the use of a system by an actor to accomplish a specific goal. By "actor" we mean a user, playing a role with respect to the system, seeking to use the system to accomplish something worthwhile within a particular context. Actors are generally people, although other systems may also be actors. A "scenario" is a sequence of steps that describe the interactions between the actor and the system.

Note that the use case is defined from the perspective of the user, not the system. Note also that the internal workings of the system, while vital, are not part of the use case definition. The set of use cases makes up the functional requirements of a system. The Unified Modeling Language notation for use cases is:

Figure 9-1: Some Stateless University use cases.

The stick figures represent the actors, the ellipses represent the use cases, and the arrows show which actors initiate which use cases. It is important to note that while use cases were created in the context of object-oriented systems development, they are equally useful in defining functional requirements in other development paradigms as well.

The value of use cases is that they:

Capture the system's functional requirements from the user's perspective, not from a technical perspective, and irrespective of the development paradigm to be used.

Can be used to actively involve users in the requirements gathering and definition process.

Provide the basis for identifying a system's key internal components, structures, databases, and relationships.

Serve as the foundation for developing test cases at the system and acceptance level.

Technique

Unfortunately, the level of detail specified in the use cases is not sufficient, either for developers or testers.
In his book Writing Effective Use Cases, Alistair Cockburn has proposed a detailed template for describing use cases. The following is adapted from his work:

Table 9-1: Use case template.

Use Case Number or Identifier: A unique identifier for this use case
Use Case Name: The name should be the goal stated as a short active verb phrase
Goal in Context: A more detailed statement of the goal if necessary
Scope: Corporate | System | Subsystem
Level: Summary | Primary task | Subfunction
Primary Actor: Role name or description of the primary actor
Preconditions: The required state of the system before the use case is triggered
Success End Conditions: The state of the system upon successful completion of this use case
Failed End Conditions: The state of the system if the use case cannot execute to completion
Trigger: The action that initiates the execution of the use case
Main Success Scenario: The numbered steps of the scenario, each with its action (1, 2, ...)
Extensions: Conditions under which the main success scenario will vary and a description of those variations
Sub-Variations: Variations that do not affect the main flow but that must be considered
Priority: Criticality
Response Time: Time available to execute this use case
Frequency: How often this use case is executed
Channels to Primary Actor: Interactive | File | Database | ...
Secondary Actors: Other actors needed to accomplish this use case
Channels to Secondary Actors: Interactive | File | Database | ...
Date Due: Schedule information
Completeness Level: Use Case identified (0.1) | Main scenario defined (0.5) | All extensions defined (0.8) | All fields complete (1.0)
Open Issues: Unresolved issues awaiting decisions

Example

Consider the following example from the Stateless University Registration System. A student wants to register for a course using SU's online registration system, SURS.

Table 9-2: Example use case.

Use Case Number or Identifier: SURS1138
Use Case Name: Register for a course (a class taught by a faculty member)
Goal in Context:
Scope: System
Level: Primary task
Primary Actor: Student
Preconditions: None
Success End Conditions: The student is registered for the course—the course has been added to the student's course list
Failed End Conditions: The student's course list is unchanged
Trigger: Student selects a course and "Registers"
Main Success Scenario (A: Actor, S: System):
  1 A: Selects "Register for a course"
  2 A: Selects course (e.g. Math 1060)
  3 S: Displays course description
  4 A: Selects section (Mon & Wed 9:00am)
  5 S: Displays section days and times
  6 A: Accepts
  7 S: Adds course/section to student's course list
Extensions:
  2a Course does not exist: S: Display message and exit
  4a Section does not exist: S: Display message and exit
  4b Section is full: S: Display message and exit
  6a Student does not accept: S: Display message and exit
Sub-Variations: Student may use Web or Phone
Priority: Critical
Response Time: 10 seconds or less
Frequency: Approximately 5 courses x 10,000 students over a 4-week period
Channels to Primary Actor: Interactive
Secondary Actors: None
Channels to Secondary Actors: N/A
Date Due: 1 Feb
Completeness Level: 0.5
Open Issues: None

Hopefully each use case has been through an inspection process before it was implemented. To test the implementation, the basic rule is to create at least one test case for the main success scenario and at least one test case for each extension. Because use cases do not specify input data, the tester must select it.
Typically we use the equivalence class and boundary value techniques described earlier. Also a Domain Test Matrix (see the Domain Analysis Testing chapter for an example) may be a useful way of documenting the test cases. It is important to consider the risk of the transaction and its variants under test. Less risky transactions merit less testing. More risky transactions should receive more testing. For them consider the following approach. Key Point Always remember to evaluate the risk of each use case and extension and create test cases accordingly. To create test cases, start with normal data for the most often used transactions. Then move to boundary values and invalid data. Next, choose transactions that, while not used often, are vital to the success of the system (i.e., Shut Down The Nuclear Reactor). Make sure you have at least one test case for every Extension in the use case. Try transactions in strange orders. Violate the preconditions (if that can happen in actual use). If a transaction has loops, don't just loop through once or twice—be diabolical. Look for the longest, most convoluted path through the transaction and try it. If transactions should be executed in some logical order, try a different order. Instead of entering data top-down, try bottom-up. Create "goofy" test cases. If you don't try strange things, you know the users will. Free Stuff Download Holodeck from http://www.sisecure.com/holodeck/holodecktrial.aspx. Most paths through a transaction are easy to create. They correspond to valid and invalid data being entered. More difficult are those paths due to some kind of exceptional condition—low memory, disk full, connection lost, driver not loaded, etc. It can be very time consuming for the tester to create or simulate these conditions. Fortunately, a tool is available to help the tester simulate these problems—Holodeck, created by James Whittaker and his associates at Florida Institute of Technology. Holodeck monitors the interactions between an application and its operating system. It logs each system call and enables the tester to simulate a failure of any call at will. In this way, the disk can be "made full," network connections can "become disconnected," data transmission can "be garbled," and a host of other problems can be simulated. A major component of transaction testing is test data. Boris Beizer suggests that 30 percent to 40 percent of the effort in transaction testing is generating, capturing, or extracting test data. Don't forget to include resources (time and people) for this work in your project's budget. Note One testing group designates a "data czar" whose sole responsibility is to provide test data. Applicability and Limitations Transaction testing is generally the cornerstone of system and acceptance testing. It should be used whenever system transactions are well defined. If system transactions are not well defined, you might consider polishing up your resume or C.V. While creating at least one test case for the main success scenario and at least one for each extension provides some level of test coverage, it is clear that, no matter how much we try, most input combinations will remain untested. Do not be overconfident about the quality of the system at this point. Summary A use case is a scenario that describes the use of a system by an actor to accomplish a specific goal. An "actor" is a user, playing a role with respect to the system, seeking to use the system to accomplish something worthwhile within a particular context. 
A scenario is a sequence of steps that describe the interactions between the actor and the system.

A major component of transaction testing is test data. Boris Beizer suggests that 30 percent to 40 percent of the effort in transaction testing is generating, capturing, or extracting test data. Don't forget to include resources (time and people) for this work in your project's budget.

While creating at least one test case for the main success scenario and at least one for each extension provides some level of test coverage, it is clear that, no matter how much we try, most input combinations will remain untested. Do not be overconfident about the quality of the system at this point.

Practice

1. Given the "Register For A Course" use case for the Stateless University Registration System described previously, create a set of test cases so that the main success scenario and each of the extensions are tested at least once. Choose "interesting" test data using the equivalence class and boundary value techniques.

References

Beizer, Boris (1990). Software Testing Techniques (Second Edition). Van Nostrand Reinhold.

Beizer, Boris (1995). Black-Box Testing: Techniques for Functional Testing of Software and Systems. John Wiley & Sons.

Cockburn, Alistair (2000). Writing Effective Use Cases. Addison-Wesley.

Fowler, Martin and Kendall Scott (1999). UML Distilled: A Brief Guide to the Standard Object Modeling Language (2nd Edition). Addison-Wesley.

Jacobson, Ivar, et al. (1992). Object-Oriented Software Engineering: A Use Case Driven Approach. Addison-Wesley.

Section II: White Box Testing Techniques

Chapter List
Chapter 10: Control Flow Testing
Chapter 11: Data Flow Testing

Part Overview

Definition

White box testing is a strategy in which testing is based on the internal paths, structure, and implementation of the software under test (SUT). Unlike its complement, black box testing, white box testing generally requires detailed programming skills.

The general white box testing process is:

The SUT's implementation is analyzed.
Paths through the SUT are identified.
Inputs are chosen to cause the SUT to execute selected paths. This is called path sensitization.
Expected results for those inputs are determined.
The tests are run.
Actual outputs are compared with the expected outputs.
A determination is made as to the proper functioning of the SUT.

Applicability

White box testing can be applied at all levels of system development—unit, integration, and system. Generally white box testing is equated with unit testing performed by developers. While this is correct, it is a narrow view of white box testing. White box testing is more than code testing—it is path testing. Generally, the paths that are tested are within a module (unit testing). But we can apply the same techniques to test paths between modules within subsystems, between subsystems within systems, and even between entire systems.

Disadvantages

White box testing has four distinct disadvantages. First, the number of execution paths may be so large that they cannot all be tested. Attempting to test all execution paths through white box testing is generally as infeasible as testing all input data combinations through black box testing.

Second, the test cases chosen may not detect data sensitivity errors. For example:

    p = q/r;

may execute correctly except when r=0.

    y = 2*x;   // should read y = x²

will pass for the test cases x=0, y=0 and x=2, y=4.

Third, white box testing assumes the control flow is correct (or very close to correct).
Since the tests are based on the existing paths, nonexistent paths cannot be discovered through white box testing.

Fourth, the tester must have the programming skills to understand and evaluate the software under test. Unfortunately, many testers today do not have this background.

Advantages

When using white box testing, the tester can be sure that every path through the software under test has been identified and tested.

Chapter 10: Control Flow Testing

It was from the primeval wellspring of an antediluvian passion that my story arises which, like the round earth flattened on a map, is but a linear projection of an otherwise periphrastic and polyphiloprogenitive, non-planar, non-didactic, self-inverting construction whose obscurantist geotropic liminality is beyond reasonable doubt.
— Milinda Banerjee

Introduction

Control flow testing is one of two white box testing techniques. This testing approach identifies the execution paths through a module of program code and then creates and executes test cases to cover those paths. The second technique, discussed in the next chapter, focuses on data flow.

Key Point: Path: A sequence of statement execution that begins at an entry and ends at an exit.

Unfortunately, in any reasonably interesting module, attempting exhaustive testing of all control flow paths has a number of significant drawbacks.

The number of paths could be huge and thus untestable within a reasonable amount of time. Every decision doubles the number of paths and every loop multiplies the paths by the number of iterations through the loop. For example:

    for (i=1; i<=1000; i++)
        for (j=1; j<=1000; j++)
            for (k=1; k<=1000; k++)
                doSomethingWith(i,j,k);

executes doSomethingWith() one billion times (1000 x 1000 x 1000). Each unique path deserves to be tested.

Paths called for in the specification may simply be missing from the module. Any testing approach based on implemented paths will never find paths that were not implemented.

    if (a>0) doIsGreater();
    if (a==0) doIsEqual();
    // missing statement - if (a<0) doIsLess();

Defects may exist in processing statements within the module even though the control flow itself is correct.

    // actual (but incorrect) code
    a=a+1;
    // correct code
    a=a-1;

The module may execute correctly for almost all data values but fail for a few.

    int blech (int a, int b) {
        return a/b;
    }

fails if b has the value 0 but executes correctly if b is not 0.

Even though control flow testing has a number of drawbacks, it is still a vital tool in the tester's toolbox.

Technique

Control Flow Graphs

Control flow graphs are the foundation of control flow testing. These graphs document the module's control structure. Modules of code are converted to graphs, the paths through the graphs are analyzed, and test cases are created from that analysis. Control flow graphs consist of a number of elements:

Key Point: Control flow graphs are the foundation of control flow testing.

Process Blocks

A process block is a sequence of program statements that execute sequentially from beginning to end. No entry into the block is permitted except at the beginning. No exit from the block is permitted except at the end. Once the block is initiated, every statement within it will be executed sequentially. Process blocks are represented in control flow graphs by a bubble with one entry and one exit.

Decision Point

A decision point is a point in the module at which the control flow can change. Most decision points are binary and are implemented by if-then-else statements.
Multi-way decision points are implemented by case statements. They are represented by a bubble with one entry and multiple exits. Junction Point A junction point is a point at which control flows join together. The following code example is represented by its associated flow graph: Figure 10-1: Flow graph equivalent of program code. Levels of Coverage In control flow testing, different levels of test coverage are defined. By "coverage" we mean the percentage of the code that has been tested vs. that which is there to test. In control flow testing we define coverage at a number of different levels. (Note that these coverage levels are not presented in order. This is because, in some cases, it is easier to define a higher coverage level and then define a lower coverage level in terms of the higher.) Level 1 The lowest coverage level is "100% statement coverage" (sometimes the "100%" is dropped and is referred to as "statement coverage"). This means that every statement within the module is executed, under test, at least once. While this may seem like a reasonable goal, many defects may be missed with this level of coverage. Consider the following code snippet: if (a>0) {x=x+1;} if (b==3) {y=0;} This code can be represented in graphical form as: Figure 10-2: Graphical representation of the two-line code snippet. These two lines of code implement four different paths of execution: Figure 10-3: Four execution paths. While a single test case is sufficient to test every line of code in this module (for example, use a=6 and b=3 as input), it is apparent that this level of coverage will miss testing many paths. Thus, statement coverage, while a beginning, is generally not an acceptable level of testing. Even though statement coverage is the lowest level of coverage, even that may be difficult to achieve in practice. Often modules have code that is executed only in exceptional circumstances—low memory, full disk, unreadable files, lost connections, etc. Testers may find it difficult or even impossible to simulate these circumstances and thus code that deals with these problems will remain untested. Holodeck is a tool that can simulate many of these exceptional situations. According to Holodeck's specification it "will allow you, the tester, to test software by observing the system calls that it makes and create test cases that you may use during software execution to modify the behavior of the application. Modifications might include manipulating the parameters sent to functions or changing the return values of functions within your software. In addition, you may also set error-codes and other system events. This set of possibilities allows you to emulate environments that your software might encounter - hence the name 'Holodeck.' Instead of needing to unplug your network connection, create a disk with bad sectors, corrupt packets on the network, or perform any outside or special manipulation of your machine, you can use Holodeck to emulate these problems. Faults can easily be placed into any software testing project that you are using with Holodeck." Holodeck To download Holodeck visit http://www.sisecure.com/holodeck/holodeck-trial.aspx. Level 0 Actually, there is a level of coverage below "100% statement coverage." That level is defined as "test whatever you test; let the users test the rest." The corporate landscape is strewn with the sun-bleached bones of organizations who have used this testing approach. 
Regarding this level of coverage, Boris Beizer wrote "testing less than this [100% statement coverage] for new software is unconscionable and should be criminalized. ... In case I haven't made myself clear, ... untested code in a system is stupid, shortsighted, and irresponsible." Level 2 The next level of control flow coverage is "100% decision coverage." This is also called "branch coverage." At this level enough test cases are written so that each decision that has a TRUE and FALSE outcome is evaluated at least once. In the previous example this can be achieved with two test cases (a=2, b=2 and a=4, b=3). Figure 10-4: Two test cases that yield 100% decision coverage. Case statements with multiple exits would have tests for each exit. Note that decision coverage does not necessarily guarantee path coverage but it does guarantee statement coverage. Level 3 Not all conditional statements are as simple as the ones previously shown. Consider these more complicated statements: if (a>0 && c==1) {x=x+1;} if (b==3 || d<0) {y=0;} To be TRUE, the first statement requires a greater than 0 and c equal 1. The second requires b equal 3 or d less than 0. In the first statement if the value of a were set to 0 for testing purposes then the c==1 part of the condition would not be tested. (In most programming languages the second expression would not even be evaluated.) The next level of control flow coverage is "100% condition coverage." At this level enough test cases are written so that each condition that has a TRUE and FALSE outcome that makes up a decision is evaluated at least once. This level of coverage can be achieved with two test cases (a>0, c=1, b=3, d<0 and a≤0, c≠1, b≠3, d≥0). Condition coverage is usually better than decision coverage because every individual condition is tested at least once while decision coverage can be achieved without testing every condition. Level 4 Consider this situation: if(x&&y) {conditionedStatement;} // note: && indicates logical AND We can achieve condition coverage with two test cases (x=TRUE, y=FALSE and x=FALSE, y=TRUE) but note that with these choices of data values the conditionedStatement will never be executed. Given the possible combination of conditions such as these, to be more complete "100% decision/condition" coverage can be selected. At this level test cases are created for every condition and every decision. Level 5 To be even more thorough, consider how the programming language compiler actually evaluates the multiple conditions in a decision. Use that knowledge to create test cases yielding "100% multiple condition coverage." if (a>0 && c==1) {x=x+1;} if (b==3 || d<0) {y=0;} // note: || means logical OR will be evaluated as: Figure 10-5: Compiler evaluation of complex conditions. This level of coverage can be achieved with four test cases: a>0, c=1, b=3, d<0 a≤0, c=1, b=3, d≥0 a>0, c≠1, b≠3, d<0 a≤0, c≠1, b≠3, d≥0 Achieving 100% multiple condition coverage also achieves decision coverage, condition coverage, and decision/condition coverage. Note that multiple condition coverage does not guarantee path coverage. Level 7 Finally we reach the highest level, which is "100% path coverage." For code modules without loops the number of paths is generally small enough that a test case can actually be constructed for each path. For modules with loops, the number of paths can be enormous and thus pose an intractable testing problem. Figure 10-6: An interesting flow diagram with many, many paths. 
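Before moving on to loops, the four Level 5 test cases listed above can be exercised with a small driver against the two-decision snippet. The class and method names are invented for illustration; only the data values come from the text.

    // Minimal driver for the Level 5 example: the four test cases exercise every
    // combination of outcomes for (a>0, c==1) and for (b==3, d<0).
    // Class and method names are illustrative, not from the book.
    public class MultipleConditionDemo {

        // The snippet under test, with x and y made explicit.
        static int[] snippet(int a, int c, int b, int d, int x, int y) {
            if (a > 0 && c == 1) { x = x + 1; }
            if (b == 3 || d < 0) { y = 0; }
            return new int[] { x, y };
        }

        public static void main(String[] args) {
            int[][] cases = {
                //  a, c, b,  d     a>0,c==1: T T   b==3,d<0: T T
                {   1, 1, 3, -1 },
                //                  a>0,c==1: F T   b==3,d<0: T F
                {   0, 1, 3,  1 },
                //                  a>0,c==1: T F   b==3,d<0: F T
                {   1, 0, 5, -1 },
                //                  a>0,c==1: F F   b==3,d<0: F F
                {   0, 0, 5,  1 }
            };
            for (int[] tc : cases) {
                int[] result = snippet(tc[0], tc[1], tc[2], tc[3], 10, 10);
                System.out.printf("a=%d c=%d b=%d d=%d -> x=%d y=%d%n",
                        tc[0], tc[1], tc[2], tc[3], result[0], result[1]);
            }
        }
    }

Each row drives one combination of outcomes for the individual conditions, which is exactly what 100% multiple condition coverage requires.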
Level 6

When a module has loops in the code paths such that the number of paths is infinite, a significant but meaningful reduction can be made by limiting loop execution to a small number of cases. The first case is to execute the loop zero times; the second is to execute the loop one time; the third is to execute the loop n times, where n is a small number representing a typical loop value; the fourth is to execute the loop its maximum number of times m. In addition you might try m-1 and m+1.

Before beginning control flow testing, an appropriate level of coverage should be chosen.

Structured Testing / Basis Path Testing

No discussion on control flow testing would be complete without a presentation of structured testing, also known as basis path testing. Structured testing is based on the pioneering work of Tom McCabe. It uses an analysis of the topology of the control flow graph to identify test cases.

The structured testing process consists of the following steps:

Derive the control flow graph from the software module.
Compute the graph's Cyclomatic Complexity (C).
Select a set of C basis paths.
Create a test case for each basis path.
Execute these tests.

Consider the following control flow graph:

Figure 10-7: An example control flow graph.

McCabe defines the Cyclomatic Complexity (C) of a graph as

    C = edges - nodes + 2

Edges are the arrows, and nodes are the bubbles on the graph. The preceding graph has 24 edges and 19 nodes, for a Cyclomatic Complexity of 24-19+2 = 7.

In some cases this computation can be simplified. If all decisions in the graph are binary (they have exactly two edges flowing out), and there are p binary decisions, then

    C = p+1

Cyclomatic Complexity is exactly the minimum number of independent, nonlooping paths (called basis paths) that can, in linear combination, generate all possible paths through the module. In terms of a flow graph, each basis path traverses at least one edge that no other path does. McCabe's structured testing technique calls for creating C test cases, one for each basis path.

IMPORTANT: Creating and executing C test cases, based on the basis paths, guarantees both branch and statement coverage.

Because the set of basis paths covers all the edges and nodes of the control flow graph, satisfying this structured testing criterion automatically guarantees both branch and statement coverage.

A process for creating a set of basis paths is given by McCabe:

1. Pick a "baseline" path. This path should be a reasonably "typical" path of execution rather than an exception processing path. The best choice would be the most important path from the tester's view.

Figure 10-8: The chosen baseline basis path ABDEGKMQS.

2. To choose the next path, change the outcome of the first decision along the baseline path while keeping the maximum number of other decisions the same as the baseline path.

Figure 10-9: The second basis path ACDEGKMQS.

3. To generate the third path, begin again with the baseline but vary the second decision rather than the first.

Figure 10-10: The third basis path ABDFILORS.

4. To generate the fourth path, begin again with the baseline but vary the third decision rather than the second. Continue varying each decision, one by one, until the bottom of the graph is reached.

Figure 10-11: The fourth basis path ABDEHKMQS.

Figure 10-12: The fifth basis path ABDEGKNQS.

5. Now that all decisions along the baseline path have been flipped, we proceed to the second path, flipping its decisions, one by one.
This pattern is continued until the basis path set is complete.
Figure 10-13: The sixth basis path ACDFJLORS
Figure 10-14: The seventh basis path ACDFILPRS
Thus, a set of basis paths for this graph is:

    ABDEGKMQS
    ACDEGKMQS
    ABDFILORS
    ABDEHKMQS
    ABDEGKNQS
    ACDFJLORS
    ACDFILPRS

Structured testing calls for the creation of a test case for each of these paths. This set of test cases will guarantee both statement and branch coverage. Note that basis path sets are not unique; multiple different sets can be created for the same graph. Each set, however, has the property that a set of test cases based on it will execute every statement and every branch.
Example
Consider the following example from Brown & Donaldson. It is the code that determines whether B&D should buy or sell a particular stock. Unfortunately, the inner workings are a highly classified trade secret, so the actual processing code has been removed and generic statements like s1; s2; etc. have been substituted for it. The control flow statements have been left intact, but their actual conditions have been removed and generic conditions like c1 and c2 have been put in their place. (You didn't think we'd really show you how to know whether to buy or sell stocks, did you?)
Note: s1, s2, ... represent Java statements, while c1, c2, ... represent conditions.

    boolean evaluateBuySell (TickerSymbol ts) {
        s1; s2; s3;
        if (c1)
            {s4; s5; s6;}
        else
            {s7; s8;}
        while (c2) {
            s9; s10;
            switch (c3) {
                case-A: s20; s21; s22;
                        break;   // End of Case-A
                case-B: s30; s31;
                        if (c4)
                            { s32; s33; s34; }
                        else
                            { s35; }
                        break;   // End of Case-B
                case-C: s40; s41;
                        break;   // End of Case-C
                case-D: s50;
                        break;   // End of Case-D
            } // End Switch
            s60; s61; s62;
            if (c5)
                {s70; s71; }
            s80; s81;
        } // End While
        s90; s91; s92;
        return result;
    }

Figure 10-15: Java code for Brown & Donaldson's evaluateBuySell module.
The following flow diagram corresponds to this Java code:
Figure 10-16: Control flow graph for Brown & Donaldson's evaluateBuySell module.
The cyclomatic complexity of this diagram is computed by edges - nodes + 2, or 22-16+2 = 8. Let's remove the code and label each node for simplicity in describing the paths.
Figure 10-17: Control flow graph for Brown & Donaldson's evaluateBuySell module.
A set of eight basis paths is:
    1. ABDP
    2. ACDP
    3. ABDEFGMODP
    4. ABDEFHKMODP
    5. ABDEFIMODP
    6. ABDEFJMODP
    7. ABDEFHLMODP
    8. ABDEFIMNODP
Remember that basis path sets are not unique; there can be multiple sets of basis paths for a graph. This basis path set is now implemented as test cases. Choose values for the conditions that would sensitize each path and execute the tests.

Table 10-1: Data values to sensitize the different control flow paths.

    Test Case   C1      C2      C3    C4      C5
    1           False   False   N/A   N/A     N/A
    2           True    False   N/A   N/A     N/A
    3           False   True    A     N/A     False
    4           False   True    B     False   False
    5           False   True    C     N/A     False
    6           False   True    D     N/A     False
    7           False   True    B     True    False
    8           False   True    C     N/A     True
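Because B&D's real conditions are secret, the structured testing steps are easier to see end-to-end on a small, fully visible routine. The following sketch is hypothetical (maxOfThree and its test values are invented for illustration): the routine has two binary decisions, so C = p + 1 = 3, and each call in main() sensitizes one basis path chosen with the baseline-and-flip procedure described above.

    public class BasisPathDemo {
        // Hypothetical routine with two binary decisions, so C = p + 1 = 3.
        static int maxOfThree(int a, int b, int c) {
            int max = a;
            if (b > max) { max = b; }   // decision 1
            if (c > max) { max = c; }   // decision 2
            return max;
        }

        public static void main(String[] args) {
            // One test per basis path, with values chosen to sensitize it:
            // Path 1 (baseline): decision 1 TRUE, decision 2 FALSE -> (1, 5, 2)
            // Path 2: flip decision 1 -> FALSE, keep decision 2 FALSE -> (5, 1, 2)
            // Path 3: baseline with decision 2 flipped -> TRUE -> (1, 2, 5)
            check(maxOfThree(1, 5, 2), 5);
            check(maxOfThree(5, 1, 2), 5);
            check(maxOfThree(1, 2, 5), 5);
            System.out.println("All basis path tests passed.");
        }

        static void check(int actual, int expected) {
            if (actual != expected) {
                throw new AssertionError("expected " + expected + " but got " + actual);
            }
        }
    }

Running the three tests executes every statement and every branch of maxOfThree, which is exactly the guarantee the structured testing criterion provides.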
Applicability and Limitations
Control flow testing is the cornerstone of unit testing. It should be used for all modules of code that cannot be tested sufficiently through reviews and inspections. Its limitations are that the tester must have sufficient programming skill to understand the code and its control flow. In addition, control flow testing can be very time consuming because of all the modules and basis paths that comprise a system.
Summary
Control flow testing identifies the execution paths through a module of program code and then creates and executes test cases to cover those paths. Control flow graphs are the foundation of control flow testing. Modules of code are converted to graphs, the paths through the graphs are analyzed, and test cases are created from that analysis. Cyclomatic Complexity is exactly the minimum number of independent, nonlooping paths (called basis paths) that can, in linear combination, generate all possible paths through the module. Because the set of basis paths covers all the edges and nodes of the control flow graph, satisfying this structured testing criterion automatically guarantees both branch and statement coverage.
Practice
1. Below is a brief program listing. Create the control flow diagram, determine its Cyclomatic Complexity, choose a set of basis paths, and determine the necessary values for the conditions to sensitize each path.

    if (c1) {
        while (c2) {
            if (c3) {
                s1;
                s2;
                if (c5)
                    s5;
                else
                    s6;
                break;   // Skip to end of while
            }
            else if (c4) {
            }
            else {
                s3;
                s4;
                break;
            }
        } // End of while
    } // End of if
    s7;
    if (c6)
        s8;
    s9;
    s10;

References
Beizer, Boris (1990). Software Testing Techniques (Second Edition). Van Nostrand Reinhold.
Myers, Glenford (1979). The Art of Software Testing. John Wiley & Sons.
Pressman, Roger S. (1982). Software Engineering: A Practitioner's Approach (Fourth Edition). McGraw-Hill.
Watson, Arthur H. and Thomas J. McCabe. Structured Testing: A Testing Methodology Using the Cyclomatic Complexity Metric. NIST Special Publication 500-235, available at http://www.mccabe.com/nist/nist_pub.php
Chapter 11: Data Flow Testing
Holly had reached the age and level of maturity to comprehend the emotional nuances of Thomas Wolfe's assertion "you can't go home again," but in her case it was even more poignant because there was no home to return to: her parents had separated, sold the house, euthanized Bowser, and disowned Holly for dropping out of high school to marry that 43-year-old manager of Trailer Town in Idaho—and even their trailer wasn't a place she could call home because it was only a summer sublet.
— Eileen Ostrow Feldman
Introduction
Almost every programmer has made this type of mistake:

    main() {
        int x;
        if (x==42){ ...}
    }

The mistake is referencing the value of a variable without first assigning a value to it. Naive developers unconsciously assume that the language compiler or run-time system will set all variables to zero, blanks, TRUE, 42, or whatever they require later in the program. A simple C program that illustrates this defect is:

    #include <stdio.h>
    main() {
        int x;
        printf ("%d",x);
    }

The value printed will be whatever value was "left over" in the memory location to which x has been assigned, not necessarily what the programmer wanted or expected. Data flow testing is a powerful tool to detect just such errors. Rapps and Weyuker, popularizers of this approach, wrote, "It is our belief that, just as one would not feel confident about a program without executing every statement in it as part of some test, one should not feel confident about a program without having seen the effect of using the value produced by each and every computation."
Key Point: Data flow testing is a powerful tool to detect improper use of data values due to coding errors.
Technique
Variables that contain data values have a defined life cycle. They are created, they are used, and they are killed (destroyed). In some programming languages (FORTRAN and BASIC, for example) creation and destruction are automatic. A variable is created the first time it is assigned a value and destroyed when the program exits. In other languages (like C, C++, and Java) the creation is formal.
Variables are declared by statements such as:

    int x;     // x is created as an integer
    string y;  // y is created as a string

These declarations generally occur within a block of code beginning with an opening brace { and ending with a closing brace }. Variables defined within a block are created when their definitions are executed and are automatically destroyed at the end of a block. This is called the "scope" of the variable. For example:

    {                // begin outer block
        int x;       // x is defined as an integer within this outer block
        ...;         // x can be accessed here
        {            // begin inner block
            int y;   // y is defined within this inner block
            ...;     // both x and y can be accessed here
        }            // y is automatically destroyed at the end of this block
        ...;         // x can still be accessed, but y is gone
    }                // x is automatically destroyed

Variables can be used in computation (a=b+1). They can also be used in conditionals (if (a>42)). In both uses it is equally important that the variable has been assigned a value before it is used. Three possibilities exist for the first occurrence of a variable through a program path:
    1. ~d   the variable does not exist (indicated by the ~), then it is defined (d)
    2. ~u   the variable does not exist, then it is used (u)
    3. ~k   the variable does not exist, then it is killed or destroyed (k)
The first is correct. The variable does not exist and then it is defined. The second is incorrect. A variable must not be used before it is defined. The third is probably incorrect. Destroying a variable before it is created is indicative of a programming error. Now consider the following time-sequenced pairs of defined (d), used (u), and killed (k):
    dd   Defined and defined again—not invalid but suspicious. Probably a programming error.
    du   Defined and used—perfectly correct. The normal case.
    dk   Defined and then killed—not invalid but probably a programming error.
    ud   Used and defined—acceptable.
    uu   Used and used again—acceptable.
    uk   Used and killed—acceptable.
    kd   Killed and defined—acceptable. A variable is killed and then redefined.
    ku   Killed and used—a serious defect. Using a variable that does not exist or is undefined is always an error.
    kk   Killed and killed—probably a programming error.
Key Point: Examine time-sequenced pairs of defined, used, and killed variable references.
A data flow graph is similar to a control flow graph in that it shows the processing flow through a module. In addition, it details the definition, use, and destruction of each of the module's variables. We will construct these diagrams and verify that the define-use-kill patterns are appropriate. First, we will perform a static test of the diagram. By "static" we mean we examine the diagram (formally through inspections or informally through look-sees). Second, we perform dynamic tests on the module. By "dynamic" we mean we construct and execute test cases. Let's begin with the static testing.
Static Data Flow Testing
The following control flow diagram has been annotated with define-use-kill information for each of the variables used in the module.
Figure 11-1: The control flow diagram annotated with define-use-kill information for each of the module's variables.
For each variable within the module we will examine define-use-kill patterns along the control flow paths. Consider variable x as we traverse the left and then the right path:
Figure 11-2: The control flow diagram annotated with define-use-kill information for the x variable.
The define-use-kill patterns for x (taken in pairs as we follow the paths) are:
    ~define          correct, the normal case
    define-define    suspicious, perhaps a programming error
    define-use       correct, the normal case
Now for variable y. Note that the first branch in the module has no impact on the y variable.
Figure 11-3: The control flow diagram annotated with define-use-kill information for the y variable.
The define-use-kill patterns for y (taken in pairs as we follow the paths) are:
    ~use             major blunder
    use-define       acceptable
    define-use       correct, the normal case
    use-kill         acceptable
    define-kill      probable programming error
Now for variable z.
Figure 11-4: The control flow diagram annotated with define-use-kill information for the z variable.
The define-use-kill patterns (taken in pairs as we follow the paths) are:
    ~kill            programming error
    kill-use         major blunder
    use-use          correct, the normal case
    use-define       acceptable
    kill-kill        probably a programming error
    kill-define      acceptable
    define-use       correct, the normal case
In performing a static analysis on this data flow model the following problems have been discovered:
    x: define-define
    y: ~use
    y: define-kill
    z: ~kill
    z: kill-use
    z: kill-kill
Unfortunately, while static testing can detect many data flow errors, it cannot find them all. Consider the following situations:
1. Arrays are collections of data elements that share the same name and type. For example, int stuff[100]; defines an array named stuff consisting of 100 integer elements. In C, C++, and Java the individual elements are named stuff[0], stuff[1], stuff[2], etc. Arrays are defined and destroyed as a unit, but specific elements of the array are used individually. Often programmers refer to stuff[j] where j changes dynamically as the program executes. In the general case, static analysis cannot determine whether the define-use-kill rules have been followed properly unless each element is considered individually.
2. In complex control flows it is possible that a certain path can never be executed. In this case an improper define-use-kill combination might exist but will never be executed and so is not truly improper.
3. In systems that process interrupts, some of the define-use-kill actions may occur at the interrupt level while other define-use-kill actions occur at the main processing level. In addition, if the system uses multiple levels of execution priorities, static analysis of the myriad of possible interactions is simply too difficult to perform manually.
For this reason, we now turn to dynamic data flow testing.
Dynamic Data Flow Testing
Because data flow testing is based on a module's control flow, it assumes that the control flow is basically correct. The data flow testing process is to choose enough test cases so that:
    Every "define" is traced to each of its "uses"
    Every "use" is traced from its corresponding "define"
To do this, enumerate the paths through the module. This is done using the same approach as in control flow testing: begin at the module's entry point and take the leftmost path through the module to its exit. Return to the beginning and vary the first branching condition. Follow that path to the exit. Return to the beginning and vary the second branching condition, then the third, and so on until all the paths are listed. Then, for every variable, create at least one test case to cover every define-use pair.
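As a small illustration of define-use pairs and the tests that cover them, consider the following sketch. The routine, its variable names, and the test values are hypothetical, invented only to show the bookkeeping; they are not part of the chapter's annotated example.

    public class DataFlowDemo {
        // Hypothetical routine: compute a shipping charge from a package weight.
        static int charge(int weight) {
            int fee = 10;                  // fee: defined (d1)
            if (weight > 50) {             // weight: used
                fee = fee + 25;            // fee: used (value from d1), then redefined (d2)
            }
            return fee * 2;                // fee: used
        }

        public static void main(String[] args) {
            // Define-use pairs for fee and the tests that cover them:
            //   d1 (fee = 10)       -> use in fee + 25  covered when weight > 50  (weight = 60)
            //   d1 (fee = 10)       -> use in fee * 2   covered when weight <= 50 (weight = 40)
            //   d2 (fee = fee + 25) -> use in fee * 2   covered when weight > 50  (weight = 60)
            System.out.println(charge(60));   // expect 70: (10 + 25) * 2
            System.out.println(charge(40));   // expect 20: 10 * 2
        }
    }

Notice that the single test weight = 60 already executes every statement, yet it misses the pair from the initial definition of fee to the final use; that pair is exercised only on the path that skips the redefinition, which is why weight = 40 is also needed.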
Applicability and Limitations
Data flow testing builds on and expands control flow testing techniques. As with control flow testing, it should be used for all modules of code that cannot be tested sufficiently through reviews and inspections. Its limitations are that the tester must have sufficient programming skill to understand the code, its control flow, and its variables. Like control flow testing, data flow testing can be very time consuming because of all the modules, paths, and variables that comprise a system.
Summary
A common programming mistake is referencing the value of a variable without first assigning a value to it. A data flow graph is similar to a control flow graph in that it shows the processing flow through a module. In addition, it details the definition, use, and destruction of each of the module's variables. We will use these diagrams to verify that the define-use-kill patterns are appropriate. Enumerate the paths through the module. Then, for every variable, create at least one test case to cover every define-use pair.
Practice
1. The following module of code computes n! (n factorial) given a value for n. Create data flow test cases covering each variable in this module. Remember, a single test case can cover a number of variables.

    int factorial (int n) {
        int answer, counter;
        answer = 1;
        counter = 1;
    loop:
        if (counter > n) return answer;
        answer = answer * counter;
        counter = counter + 1;
        goto loop;
    }

2. Diagram the control flow paths and derive the data flow test cases for the following module:

    int module( int selector) {
        int foo, bar;
        switch selector {
            case SELECT-1: foo = calc_foo_method_1(); break;
            case SELECT-2: foo = calc_foo_method_2(); break;
            case SELECT-3: foo = calc_foo_method_3(); break;
        }
        switch foo {
            case FOO-1: bar = calc_bar_method_1(); break;
            case FOO-2: bar = calc_bar_method_2(); break;
        }
        return foo/bar;
    }

Do you have any concerns with this code? How would you deal with them?
References
Beizer, Boris (1990). Software Testing Techniques. Van Nostrand Reinhold.
Binder, Robert V. (2000). Testing Object-Oriented Systems: Models, Patterns, and Tools. Addison-Wesley.
Marick, Brian (1995). The Craft of Software Testing: Subsystem Testing Including Object-Based and Object-Oriented Testing. Prentice-Hall.
Rapps, Sandra and Elaine J. Weyuker. "Data Flow Analysis Techniques for Test Data Selection." Sixth International Conference on Software Engineering, Tokyo, Japan, September 13–16, 1982.
Section III: Testing Paradigms
Chapter List
Chapter 12: Scripted Testing
Chapter 13: Exploratory Testing
Chapter 14: Test Planning
Part Overview
Paradigms
In his book, Paradigms: The Business of Discovering the Future, Joel Barker defines a paradigm as "a set of rules and regulations (written or unwritten) that does two things: (1) it establishes or defines boundaries, and (2) it tells you how to behave inside the boundaries in order to be successful." Futurist Marilyn Ferguson sees a paradigm as "a framework of thought ... a scheme for understanding and explaining certain aspects of reality." Paradigms are useful because they help us make sense of the complexities of the world around us. In this way, paradigms sharpen our vision. But paradigms can blind us to realities. Paradigms act as psychological filters. Data that does not match our paradigms is blocked. In this way, paradigms cloud our vision. In software testing today, two very different paradigms are battling for adherents—scripted testing and exploratory testing.
Scripted testing is based on the sequential examination of requirements, followed by the design and documentation of test cases, followed by the execution of those test cases. The scripted tester's motto is, "Plan your work, work your plan." Exploratory testing is a very different paradigm. Rather than a sequential approach, exploratory testing emphasizes simultaneous learning, test design, and test execution. The tester designs and executes tests while exploring the product. Word Of Warning ! In the following chapters the scripted and exploratory paradigms are defined at the extreme ends of the spectrum. Rarely will either be used as inflexibly as described. The next two chapters describe these paradigms. A word of warning though—each paradigm is described at the extreme end of the process spectrum. Rarely will either paradigm be used as inflexibly as described. More often, scripted testing may be somewhat exploratory and exploratory testing may be somewhat scripted. Test Planning Planning has been defined as simply "figuring out what to do next." To be most effective and efficient, planning is important. But when and how should that planning be done? Scripted testing emphasizes the value of early test design as a method of detecting requirements and design defects before the code is written and the system put into production. Its focus is on accountability and repeatability. Exploratory testing challenges the idea that tests must be designed so very early in the project, when our knowledge is typically at its minimum. Its focus is on learning and adaptability. References Barker, Joel Arthur (1992). Paradigms: The Business of Discovering the Future. HarperCollins. Ferguson, Marilyn (1980). The Aquarian Conspiracy: Personal and Social Transformation in Our Time. Putnam Publishing Group. Chapter 12: Scripted Testing Jane was toast, and not the light buttery kind, nay, she was the kind that's been charred and blackened in the bottom of the toaster and has to be thrown away because no matter how much of the burnt part you scrape off with a knife, there's always more blackened toast beneath, the kind that not even starving birds in winter will eat, that kind of toast. — Beth Knutson Introduction For scripted testing to be understood, it must be understood in its historical context. Scripted testing emerged as one of the component parts of the Waterfall model of software development. The Waterfall model defines a number of sequential development phases with specific entry and exit criteria, tasks to be performed, and deliverables (tangible work products) to be created. It is a classic example of the "plan your work, work your plan" philosophy. Typical Waterfall phases include: 1. System Requirements - Gathering the requirements for the system. 2. Software Requirements - Gathering the requirements for the software portion of the system. 3. Requirements Analysis - Analyzing, categorizing, and refining the software requirements. 4. Program Design - Choosing architectures, modules, and interfaces that define the system. 5. Coding - Writing the programming code that implements the design. 6. Testing - Evaluating whether the requirements were properly understood (Validation) and the design properly implemented by the code (Verification). 7. Operations - Put the system into production. 
Interesting Trivia
A Google search for "plan your work" and "work your plan" found 3,570 matches including:
    Football recruiting
    Business planning
    Building with concrete blocks
    Online marketing
    Industrial distribution
    Princeton University's Women's Water Polo Team
    And thousands more
This model was first described in 1970 in a paper entitled "Managing the Development of Large Software Systems" by Dr. Winston W. Royce. Royce drew the following diagram showing the relationships between development phases:
Figure 12-1: The Waterfall life cycle model.
What process was used before Waterfall? It was a process known as "Code & Fix." Programmers simply coded. Slogans like "Requirements? Requirements? We don't need no stinkin' Requirements!" hung on the walls of programmers' offices. Development was like the scene in the movie Raiders of the Lost Ark. Our hero, Indiana Jones, is hiding from the bad guys. Indy says, "I'm going to get that truck." Marion, our heroine, turns to him and asks, "How are you going to get that truck?" Indy replies, "I don't know. I'm making this up as I go." If we substituted "build that system" for "get that truck" we'd have the way real men and real women built software systems in the good old days.
Curious Historical Note
Today, Winston Royce is known as the father of the Waterfall model of software development. In fact, in his paper he was actually proposing an iterative and incremental process that included early prototyping - something many organizations are just now discovering.
Today we take a different view of scripted testing. Any development methodology along the spectrum from Waterfall to Rapid Application Development (RAD) may use scripted testing. Whenever repeatability, objectivity, and auditability are important, scripted testing can be used. Repeatability means that there is a definition of a test (from design through to detailed procedure) at a level of detail sufficient for someone other than the author to execute it in an identical way. Objectivity means that the test creation does not depend on the extraordinary (near magical) skill of the person creating the test but is based on well-understood test design principles. Auditability includes traceability from requirements, design, and code to the test cases and back again. This enables formal measures of testing coverage.
"Plan your work, work your plan." No phrase so epitomizes the scripted testing approach as does this one, and no document so epitomizes the scripted testing approach as does IEEE Std 829-1998, the "IEEE Standard for Software Test Documentation." This standard defines eight documents that can be used in software testing. These documents are:
    Test plan
    Test design specification
    Test case specification
    Test procedure specification
    Test item transmittal report
    Test log
    Test incident report
    Test summary report
Figure 12-2 shows the relationships between these documents. Note that the first four documents that define the test plan, test designs, and test cases are all created before the product is developed and the actual testing is begun. This is a key idea in scripted testing—plan the tests based on the formal system requirements.
Figure 12-2: The IEEE 829 Test Documents
Curiously, the IEEE 829 standard states, "This standard specifies the form and content of individual test documents. It does not specify the required set of test documents." In other words, the standard does not require you to create any of the documents described.
That choice is left to you as a tester, or to your organization. But, the standard requires that if you choose to write a test plan, test case specification, etc., that document must follow the IEEE 829 standard. The IEEE 829 standard lists these advantages for its use: "A standardized test document can facilitate communication by providing a common frame of reference. The content definition of a standardized test document can serve as a completeness checklist for the associated testing process. A standardized set can also provide a baseline for the evaluation of current test documentation practices. The use of these documents significantly increases the manageability of testing. Increased manageability results from the greatly increased visibility of each phase of the testing process." IEEE 829 Document Descriptions The IEEE 829 standard defines eight different documents. Each document is composed of a number of sections. Test Plan The purpose of the test plan is to describe the scope, approach, resources, and schedule of the testing activities. It describes the items (components) and features (functionality, performance, security, usability, etc.) to be tested, tasks to be performed, deliverables (tangible work products) to be created, testing responsibilities, schedules, and approvals required. Test plans can be created at the project level (master test plan) or at subsidiary levels (unit, integration, system, acceptance, etc.). The test plan is composed of the following sections: 1. Test plan identifier - A unique identifier so that this document can be distinguished from all other documents. 2. Introduction - A summary of the software to be tested. A brief description and history may be included to set the context. References to other relevant documents useful for understanding the test plan are appropriate. Definitions of unfamiliar terms may be included. 3. Test items - Identifies the software items that are to be tested. The word "item" is purposely vague. It is a "chunk" of software that is the object of testing. 4. Features to be tested - Identifies the characteristics of the items to be tested. These include functionality, performance, security, portability, usability, etc. 5. Features not to be tested - Identifies characteristics of the items that will not be tested and the reasons why. 6. Approach - The overall approach to testing that will ensure that all items and their features will be adequately tested. 7. Item pass/fail criteria - The criteria used to determine whether each test item has passed or failed testing. 8. Suspension criteria and resumption requirements - The conditions under which testing will be suspended and the subsequent conditions under which testing will be resumed. 9. Test deliverables - Identifies the documents that will be created as a part of the testing process. 10. Testing tasks - Identifies the tasks necessary to perform the testing. 11. Environmental needs - Specifies the environment required to perform the testing including hardware, software, communications, facilities, tools, people, etc. 12. Responsibilities - Identifies the people/groups responsible for executing the testing tasks. 13. Staffing and training needs - Specifies the number and types of people required to perform the testing, including the skills needed. 14. Schedule - Defines the important key milestones and dates in the testing process. 15. Risks and contingencies - Identifies high-risk assumptions of the testing plan. Specifies prevention and mitigation plans for each. 16. 
Approvals - Specifies the names and titles of each person who must approve the plan. Test Design Specification The purpose of the test design specification is to identify a set of features to be tested and to describe a group of test cases that will adequately test those features. In addition, refinements to the approach listed in the test plan may be specified. The test design specification is composed of the following sections: 1. Test design specification identifier - A unique identifier so that this document can be distinguished from all other documents. 2. Features to be tested - Identifies the test items and the features that are the object of this test design specification. 3. Approach refinements - Specifies the test techniques to be used for this test design. 4. Test identification - Lists the test cases associated with this test design. Provides a unique identifier and a short description for each test case. 5. Feature pass/fail criteria - The criteria used to determine whether each feature has passed or failed testing. Test Case Specification The purpose of the test case specification is to specify in detail each test case listed in the test design specification. The test case specification is composed of the following sections: 1. Test case specification identifier - A unique identifier so that this document can be distinguished from all other documents. 2. Test items - Identifies the items and features to be tested by this test case. 3. Input specifications - Specifies each input required by this test case. 4. Output specifications - Specifies each output expected after executing this test case. 5. Environmental needs - Any special hardware, software, facilities, etc. required for the execution of this test case that were not listed in its associated test design specification. 6. Special procedural requirements - Defines any special setup, execution, or cleanup procedures unique to this test case. 7. Intercase dependencies - Lists any test cases that must be executed prior to this test case. Test Procedure Specification The purpose of the test procedure specification is to specify the steps for executing a test case and the process for determining whether the software passed or failed the test. The test procedure specification is composed of the following sections: 1. Test procedure specification identifier - A unique identifier so that this document can be distinguished from all other documents. 2. Purpose - Describes the purpose of the test procedure and its corresponding test cases. 3. Special requirements - Lists any special requirements for the execution of this test procedure. 4. Procedure steps - Lists the steps of the procedure. Possible steps include: Set up, Start, Proceed, Measure, Shut Down, Restart, Stop, and Wrap Up. Test Item Transmittal Report (a.k.a. Release Notes) The purpose of the test item transmittal report is to specify the test items being provided for testing. The test item transmittal report is composed of the following sections: 1. Transmittal report identifier - A unique identifier so that this document can be distinguished from all other documents. 2. Transmitted items - Lists the items being transmitted for testing including their version or revision level. 3. Location - Identifies the location of the transmitted items. 4. Status - Describes the status of the items being transmitted. Include any deviations from the item's specifications. 5. Approvals - Specifies the names and titles of all persons who must approve this transmittal. 
Test Log The purpose of the test log is to provide a chronological record about relevant details observed during the test execution. The test log is composed of the following sections: 1. Test log identifier - A unique identifier so that this document can be distinguished from all other documents. 2. Description - Identifies the items being tested and the environment under which the test was performed. 3. Activity and event entries - For each event, lists the beginning and ending date and time, a brief description of the test execution, the results of the test, and unique environmental information, anomalous events observed, and the incident report identifier if an incident was logged. Test Incident Report (a.k.a. Bug Report) The purpose of the test incident report is to document any event observed during testing that requires further investigation. The test incident report is composed of the following sections: 1. Test incident report identifier - A unique identifier so that this document can be distinguished from all other documents. 2. Summary - Summarizes the incident. 3. Incident description - Describes the incident in terms of inputs, expected results, actual results, environment, attempts to repeat, etc. 4. Impact - Describes the impact this incident will have on other test plans, test design specifications, test procedures, and test case specifications. Also describes, if known, the impact this incident will have on further testing. Test Summary Report The purpose of the test summary report is to summarize the results of the testing activities and to provide an evaluation based on these results. The test summary report is composed of the following sections: 1. Test summary report identifier - A unique identifier (imagine that!) so that this document can be distinguished from all other documents. 2. Summary - Summarizes the evaluation of the test items. 3. Variance - Reports any variances from the expected results. 4. Comprehensive assessment - Evaluates the overall comprehensiveness of the testing process itself against criteria specified in the test plan. 5. Summary of results - Summarizes the results of the testing. Identifies all unresolved incidents. 6. Evaluation - Provides an overall evaluation of each test item including its limitations. 7. Summary of activities - Summarizes the major testing activities by task and resource usage. 8. Approvals - Specifies the names and titles of each person who must approve the report. Advantages of Scripted Testing 1. Scripted testing provides a division of labor—planning, test case design, test case implementation, and test case execution that can be performed by people with specific skills and at different times during the development process. 2. Test design techniques such as equivalence class partitioning, boundary value testing, control flow testing, pairwise testing, etc. can be integrated into a formal testing process description that not only guides our testing but that could also be used to audit for process compliance. 3. Because scripted tests are created from requirements, design, and code, all important attributes of the system will be covered by tests and this coverage can be demonstrated. 4. Because the test cases can be traced back to their respective requirements, design, and code, coverage can be clearly defined and measured. 5. Because the tests are documented, they can be easily understood and repeated when necessary without additional test analysis or design effort. 6. 
Because the tests are defined in detail, they are more easily automated. 7. Because the tests are created early in the development process, this may free up additional time during the critical test execution period. 8. In situations where a good requirements specification is lacking, the test cases, at the end of the project, become the de facto requirements specification, including the results that demonstrate which requirements were actually fulfilled and which were not. 9. Scripted tests, when written to the appropriate level of detail, can be run by people who would otherwise not be able to test the system because of lack of domain knowledge or lack of testing knowledge. 10. You may have special contractual requirements that can only be met by scripted testing. 11. There may be certain tests that must be executed in just the same way, every time, in order to serve as a kind of benchmark. 12. By creating the tests early in the project we can discover what we don't know. 13. By creating the tests early we can focus on the "big picture." In his book Software System Testing and Quality Assurance, Boris Beizer summarizes in this way: "Testing is like playing pool. There's real pool and kiddie pool. In kiddie pool, you hit the balls and whatever pocket they happen to fall into, you claim as the intended pocket. It's not much of a game and although suitable to ten-year-olds it's hardly a challenge. The object of real pool is to specify the pocket in advance. Similarly for testing. There's real testing and kiddie testing. In kiddie testing, the tester says, after the fact, that the observed outcome was the intended outcome. In real testing the outcome is predicted and documented before the test is run." Disadvantages of Scripted Testing 1. Scripted testing is very dependent on the quality of the system's requirements. Will the requirements really be complete, consistent, unambiguous, and stable enough as the foundation for scripted testing? Perhaps not. 2. Scripted testing is, by definition, inflexible. It follows the script. If, while testing, we see something curious, we note it in a Test Incident Report but we do not pursue it. Why not? Because it is not in the script to do so. Many interesting defects could be missed with this approach. 3. Scripted testing is often used to "de-skill" the job of testing. The approach seems to be, "Teach a tester a skill or two and send them off to document mountains of tests. The sheer bulk of the tests will probably find most of the defects." Summary "Plan your work, work your plan." Like the Waterfall model, no phrase so epitomizes the scripted testing approach as does this one, and no document so epitomizes the scripted testing approach as does IEEE Std 829-1998, the "IEEE Standard for Software Test Documentation." The IEEE Standard 829 defines eight documents that can be used in software testing. These documents are: test plan, test design specification, test case specification, test procedure specification, test item transmittal report, test log, test incident report, and test summary report. The advantages of scripted testing include formal documentation, coverage, and traceability. References Beizer, Boris (1984). Software System Testing and Quality Assurance. Van Nostrand Reinhold. "IEEE Standard for Software Test Documentation," IEEE Std 829-1998. The Institute of Electrical and Electronics Engineers, Inc. ISBN 0-7381-1443-X Royce, Winston W. 
"Managing the Development of Large Software Systems," Proceedings of the 9th International Conference on Software Engineering, Monterey, CA, IEEE Computer Society Press, Los Alamitos, CA, 1987. http://www.ipd.bth.se/uodds/pd&agile/royce.pdf Chapter 13: Exploratory Testing As she contemplated the setting sun, its dying rays casting the last of their brilliant purple light on the red-gold waters of the lake, Debbie realized that she should never again buy her sunglasses from a guy parked by the side of the road. — Malinda Lingwall Introduction The term "exploratory testing," coined by Cem Kaner in his book Testing Computer Software, refers to an approach to testing that is very different from scripted testing. Rather than a sequential examination of requirements, followed by the design and documentation of test cases, followed by the execution of those test cases, exploratory testing, as defined by James Bach, is "simultaneous learning, test design, and test execution." The tester designs and executes tests while exploring the product. In an article for StickyMinds.com entitled "Exploratory Testing and the Planning Myth," Bach wrote, "Exploratory Testing, as I practice it, usually proceeds according to a conscious plan. But not a rigorous plan ... it's not scripted in detail." James adds, "Rigor requires certainty and implies completeness, but I perform exploratory testing precisely because there's so much I don't know about the product and I know my testing can never be fully complete." James continues, "To the extent that the next test we do is influenced by the result of the last test we did, we are doing exploratory testing. We become more exploratory when we can't tell what tests should be run, in advance of the test cycle." Exploratory Testing To the extent that the next test we do is influenced by the result of the last test we did, we are doing exploratory testing. We become more exploratory when we can't tell what tests should be run, in advance of the test cycle. In exploratory testing, the tester controls the design of test cases as they are performed rather than days, weeks, or even months before. In addition, the information the tester gains from executing a set of tests then guides the tester in designing and executing the next set of tests. Note this process is called exploratory testing to distinguish it from ad hoc testing which (by my definition, although others may disagree) often denotes sloppy, careless, unfocused, random, and unskilled testing. Anyone, no matter what their experience or skill level, can do ad hoc testing. That kind of testing is ineffective against all but the most defect-ridden systems, and even then may not find a substantial portion of the defects. Bach suggests that in today's topsy-turvy world of incomplete, rapidly changing requirements and minimal time for testing, the classical sequential approach of Test Analysis followed by Test Design followed by Test Creation followed by Test Execution is like playing the game of "Twenty Questions" by writing out all the questions in advance. Consider the following discussion from a testing seminar discussing exploratory testing: Instructor: Let's play a game called "Twenty Questions." I am thinking about something in the universe. I'm giving you, the class, twenty questions to identify what I'm thinking about. Each question must be phrased in a way that it can be answered "Yes" or "No." 
(If I let you phrase the question in any form you could ask "What are you thinking about" and we would then call this game "One Question.") Ready? Brian, let's begin with you. Twenty Questions: The Game A game in which one person thinks of something and others ask up to 20 questions to determine what has been selected. The questions must be answerable "Yes" or "No." When played well, each question is based on the previous questions and their answers. Writing the questions out in advance prevents using the knowledge acquired from each answer. Brian: Does it have anything to do with software testing? Instructor: No, that would be too easy. Michael: Is it large? Instructor: No, it's not large. Rebecca: Is it an animal? Instructor: No. Rayanne: Is it a plant? Instructor: Yes, it is a plant. Henry: Is it a tree? Instructor: No, it is not a tree. Sree: Is it big? Instructor: No, I've already said it is not large. Eric: Is it green? Instructor: Yes, it is green. Cheryl: Does it have leaves? Instructor: Yes, it has leaves. Galina: Is it an outdoor plant? Instructor: Yes, it generally grows outdoors. Jae: Is it a flowering plant? Instructor: No, I don't believe so but I'm not a botanist. Melanie: Is it a shrub? Instructor: No. Patrick: Is it a cactus? Instructor: No, it is not a cactus. Angel: Is it a cucumber? Instructor: No, perhaps rather than guessing individual plants it would be more effective to identify categories. Sundari: Is it a weed? Instructor: No, good try though. Lynn: Is it a perennial? Instructor: No, I don't believe so. I think it must be replanted each year. Julie: Does it grow from bulbs? Instructor: No. Michelle: Is it in everyone's yard? Instructor: No, at least it's not in mine. Kristie: Is it illegal? (Laughter in the class) Instructor: No, it's quite legal. Well, we've gone through the class once. Brian, let's go back to you. Brian: Is it poisonous? Instructor: No, although my children think so. Michael: Is it eaten? Instructor: Yes, it is eaten. Rebecca: Is it lettuce? Instructor: No, not lettuce. Rayanne: Is it spinach? Instructor: Yes, it is spinach. Very good. How successful would we be at this game if we had to write out all the questions in advance? When we play this game well, each question depends on the previous questions and their answers. So it is in exploratory testing. Each test provides us with information about the product. We may see evidence of the product's correctness; we may see evidence of its defects. We may see things that are curious; we're not sure what they mean, things that we wonder about and want to explore further. So, as we practice exploratory testing, we concurrently learn the product, design the tests, and execute these tests. Description In his classic time management book, How to Get Control of Your Time and Your Life, Alan Lakein suggests we should constantly ask ourselves: What is the most important thing I can do with my time right now? Exploratory testers ask an equivalent question: What is the most important test I can perform right now? Key Question What is the most important test I can perform right now? 
A possible exploratory testing process is: Creating a conjecture (a mental model) of the proper functioning of the system Designing one or more tests that would disprove the conjecture Executing these tests and observing the outcomes Evaluating the outcomes against the conjecture Repeating this process until the conjecture is proved or disproved Another process might be simply to explore and learn before forming conjectures of proper behavior. Exploratory testing can be done within a "timebox," an uninterrupted block of time devoted to testing. These are typically between sixty and 120 minutes in length. This is long enough to perform solid testing but short enough so that the tester does not mentally wander. In addition, a timebox of this length is typically easier to schedule, easier to control, and easier to report. When performing "chartered exploratory testing," a charter is first created to guide the tester within the timebox. This charter defines a clear mission for the testing session. The charter may define: What to test What documents (requirements, design, user manual, etc.) are available to the tester What tactics to use What kinds of defects to look for What risks are involved This charter is a guideline to be used, not a script to be followed. Because of this approach, exploratory testing makes full use of the skills of testers. Bach writes, "The more we can make testing intellectually rich and fluid, the more likely we will hit upon the right tests at the right time." Key Point The charter is a guideline to be used, not a script to be followed. Charters focus the exploratory tester's efforts within the timebox. Possible charters include: Thoroughly investigate a specific system function Define and then examine the system's workflows Identify and verify all the claims made in the user manual Understand the performance characteristics of the software Ensure that all input fields are properly validated Force all error conditions to verify each error message Check the design against user interface standards It is possible to perform exploratory testing without a charter. This is called "freestyle exploratory testing." In this process testers use their skills to the utmost as they concurrently learn the product and design and execute tests. Exploratory testers are skilled testers. (Of course, we want testers to be skilled no matter what testing process we are using!) The exploratory testing approach respects those skills and, in fact, depends on them. Good exploratory testers are: Good modelers, able to create mental models of the system and its proper behavior. Careful observers, able to see, hear, read, and comprehend. Skilled test designers, able to choose appropriate test design techniques in each situation. Bach emphasizes, "An exploratory tester is first and foremost a test designer." Able to evaluate risk and let it guide their testing. Critical thinkers, able to generate diverse ideas, integrate their observations, skills, and experiences to concurrently explore the product, design the tests, and execute the tests. Careful reporters, able to rigorously and effectively report to others what they have observed. Self managed, able to take the lead in testing rather than execute a plan devised by others. Not distracted by trivial matters. Testers without these skills can still perform useful exploratory testing if they are properly supervised and coached. In general, processes that have weak, slow, or nonexistent feedback mechanisms often do not perform well. 
Scripted testing is a prime example of a slow feedback loop. Exploratory testing provides a tight feedback loop between both test design and test execution. In addition, it provides tight feedback between testers and developers regarding the quality of the product being tested. Advantages of Exploratory Testing 1. Exploratory testing is valuable in situations where choosing the next test case to be run cannot be determined in advance, but should be based on previous tests and their results. 2. Exploratory testing is useful when you are asked to provide rapid feedback on a product's quality on short notice, with little time, off the top of your head, when requirements are vague or even nonexistent, or early in the development process when the system may be unstable. 3. Exploratory testing is useful when, once a defect is detected, we want to explore the size, scope, and variations of that defect to provide better feedback to our developers. 4. Exploratory testing is a useful addition to scripted testing when the scripted tests become "tired," that is, they are not detecting many errors. Disadvantages of Exploratory Testing 1. Exploratory testing has no ability to prevent defects. Because the design of scripted test cases begins during the requirements gathering and design phases, defects can be identified and corrected earlier. 2. If you are already sure exactly which tests must be executed, and in which order, there is no need to explore. Write and then execute scripted tests. 3. If you are required by contract, rule, or regulation to use scripted testing then do so. Consider adding exploratory tests as a complementary technique. Summary Exploratory testing is defined as "simultaneous learning, test design, and test execution." The tester designs and executes tests while exploring the product. In exploratory testing, the tester controls the design of test cases as they are performed rather than days, weeks, or even months before. In addition, the information the tester gains from executing a set of tests then guides the tester in designing and executing the next set of tests. Exploratory testing is vital whenever choosing the next test case to be run cannot be determined in advance but should be chosen based on previous tests and their results. References Bach, James. "Exploratory Testing and the Planning Myth." http://www.stickyminds.com/r.asp?F=DART_2359, 19 March 2001. Bach, James. "Exploratory Testing Explained." v.1.3 16 April 2003. http://www.satisfice.com/articles/et-article.pdf Kaner, Cem,Jack Falk, and Hung Q. Nguyen (1999). Testing Computer Software. John Wiley & Sons. Kaner, Cem,James Bach, and Bret Pettichord (2002). Lessons Learned in Software Testing: A Context-Driven Approach. John Wiley & Sons. Weinberg, Gerald M. (1975). An Introduction to General Systems Thinking. John Wiley & Sons. Chapter 14: Test Planning John Stevenson lives in Vancouver with his wife Cindy and their two kids Shawn and Cassie, who are the second cousins of Mary Shaw, who is married to Richard Shaw, whose grandmother was Stewart Werthington's housekeeper, whose kids Damien and Charlie went to the Mansfield Christian School for Boys with Danny Robinson, whose sister Berta Robinson ran off with Chris Tanner, who rides a motorcycle and greases his hair and their kid Christa used to go out with my pal Tom Slipper, who is the main character of this story, but not the narrator 'cause I am (Tommy couldn't write to save his life). 
— Emma Dolan
Introduction
Mort Sahl, the brilliant social commentator of the 1960s, often began his act by dividing the world into the "right wing," the "left wing," and the "social democrats." The previous two chapters have described the right and left wings. Now it's time for the social democrats. Scripted testing is based on the sequential examination of requirements, followed by the design and documentation of test cases, followed by the execution of those test cases. The scripted tester's motto is, "Plan your work, work your plan." Exploratory testing is a very different paradigm. Rather than a sequential approach, exploratory testing emphasizes concurrent product learning, test design, and test execution. The tester designs and executes tests while exploring the product.
Technique
Planning has been defined simply as "figuring out what to do next." Most of us would admit that to be effective and efficient, planning is important. But when and how should that planning be done? Scripted testing emphasizes the value of early test planning and design as a method of detecting requirements and design defects before the code is written and the system put into production. Exploratory testing challenges the idea that tests must be designed so very early in the project, when our knowledge is typically at its minimum. In his article, "Exploratory Testing and the Planning Myth," published on StickyMinds.com, James Bach discusses the planning of plays that are run in a football game. He examines when the plays can or should be planned. Let's consider this sport to learn more about planning. But first, an apology or explanation. In this chapter the term "football" refers to the game of the same name as played in the United States and Canada and exported, with only marginal success, to the rest of the world. "Football" does not refer to that marvelous game played world-wide that North Americans call "soccer."
For More Information: To learn more about the game of football as played in North America see ww2.nfl.com/basics/history_basics.html
When are football plays planned? Our first thought might be in the huddle just before the play begins, but the following list shows more possibilities:
Planned Football Play
    Before the game begins - the first n plays are chosen and executed without regard to their success or failure to evaluate both teams' abilities
    Before each play - in the huddle, based on an overall game plan, field position, teams' strengths and weaknesses, and player skills and experience
    At the line of scrimmage - depending on the defensive lineup
    At the start of a play - play action - run or pass depending on the defense
    During the play - run for your life when all else has failed
Adaptive Planning
Adaptive planning is not an industry standard term. Other possible terms are:
    Dynamic
    Flexible
    Just-In-Time
    Responsive
    Pliable
    Progressive
    Purposeful planning
We could define the terms "classical planning" and "adaptive planning" to indicate these different approaches. The relationship between classical planning and adaptive planning in football is:

Table 14-1: Classical planning vs. Adaptive planning.
    Classical Planning
        Before the game begins (the first ten plays are scripted)
        Before each play (in the huddle)
    Adaptive Planning
        At the line of scrimmage (depending on the defensive setup)
        At the start of a play (play action - run or pass)
        During the play (scramble when all else has failed)

Let's now leave football and consider software test planning.
(While we'd rather stay and watch the game, we've got software to test.)

Table 14-2: Classical test planning vs. Exploratory test planning.
    Classical Test Planning
        As requirements, analysis, design, and coding are being done—long before the system is built and the testing can begin
        Choose a strategy (depending on our current knowledge)
    Adaptive Test Planning
        Before each screen / function / flow is to be tested
        At the start of an individual test (choose different strategies)
        During the test (as we observe things we don't expect or understand)

A reasonable planning heuristic would be:
    We plan as much as we can (based on the knowledge available),
    When we can (based on the time and resources available),
    But not before.
Aside from these new labels, haven't good planners always done this? Is this concept really new? A remarkable little book simply titled Planning, published by the United States Marine Corps in 1997, describes the concepts of adaptive planning in detail. The Marine Corps defines planning as encompassing two basic functions—"envisioning a desired future and arranging a configuration of potential actions in time and space that will allow us to realize that future." But, to the Marines, planning is not something done early which then becomes cast in concrete. "We should think of planning as a learning process—as mental preparation which improves our understanding of a situation." Plans are not concrete either. "Since planning is an ongoing process, it is better to think of a plan as an interim product based on the information and understanding known at the moment and always subject to revision as new information and understanding emerge."
The authors of Planning list these planning pitfalls to avoid:
    Attempting to forecast events too far into the future. By planning we may fool ourselves into thinking we are controlling. There is a difference.
    Trying to plan in too much detail. Helmuth von Moltke, the nineteenth-century chief of the Prussian General Staff, said, "No plan survives contact with the enemy." In exactly that same way, no test plan survives contact with the defects in the system under test.
    Institutionalizing planning methods that lead us to inflexible or lockstep thinking in which both planning and plans become rigid. Rather than "Plan your work and work your plan" as our mantra, we should constantly "Plan our work, work our plan, re-evaluate our work, re-evaluate our plan."
    Thinking of a plan as an unalterable solution to a problem. Rather, it should be viewed as an open architecture that allows us to pursue many alternatives. "We will rarely, if ever, conduct an evolution exactly the way it was originally developed."
    Ignoring the need for a feedback mechanism to identify shortcomings in the plan and make necessary adjustments. This is a component of planning which often does not receive adequate emphasis. "Many plans stop short of identifying the signals, conditions, and feedback mechanisms that will indicate successful or dysfunctional execution."
Adaptive planning, as described above, acknowledges and deals with these pitfalls. The following excerpt from Planning summarizes these concepts well: "Planning is a continuous process involving the ongoing adjustment of means and ends. We should also view planning as an evolutionary process involving continuous adjustment and improvement. We can think of planning as solution-by-evolution rather than solution-by-engineering. We should generally not view planning as trying to solve a problem in one iteration because most ...
problems are too complex to be solved that way. In many cases, it is more advisable to find a workable solution quickly and improve the solution as time permits. What matters most is not generating the best possible plan but achieving the best possible result. Likewise, we should see each plan as an evolving rather than a static document. Like planning, plans should be dynamic; a static plan is of no value to an adaptive organization in a fluid situation." Summary James Bach asks, "What if it [the plan] comes into existence only moments before the testing?" Why must the plan be created so very early in the project, when our knowledge is typically at its minimum? In adaptive planning we plan as much as we can (based on the knowledge available), when we can (based on the time and resources available), but not before. Key Point The use of scripted testing does not preclude the use of exploratory testing. The use of exploratory testing does not preclude the use of scripted testing. Smart testers use whatever tool in their toolbox is required. Since planning is an ongoing process, it is better to think of a plan as an interim product based on the information and understanding known at the moment and always subject to revision as new information and understanding emerge. The use of scripted testing does not preclude the use of exploratory testing. The use of exploratory testing does not preclude the use of scripted testing. As Rex Black wrote, "Smart testers use whatever tool in their toolbox is required. No paradigms here. No worldviews here. No screwdrivers vs. hammers. Let's do whatever makes sense given the problem at hand." Practice 1. In what areas could you use adaptive planning where you now use classical planning? With what benefit? What would the challenges be? Who would support you in this new process? Who would oppose your efforts? Why? 2. In what movies about the Marine Corps were the process of planning and the value of plans emphasized over action? Can you explain why? References Bach, James. "Exploratory Testing and the Planning Myth." 19 March 2001. http://www.stickyminds.com/r.asp?F=DART_2359 Copeland, Lee. "Exploratory Planning." 3 September 2001. http://www.stickyminds.com/r.asp?F=DART_2805 "IEEE Standard for Software Test Documentation," IEEE Std 829-1998. The Institute of Electrical and Electronics Engineers, Inc. ISBN 0-7381-1443-X Planning. MCDP 5. United States Marine Corps. https://www.doctrine.usmc.mil/mcdp/view/mpdpub5.pdf Section IV: Supporting Technologies Chapter List Chapter 15: Defect Taxonomies Chapter 16: When to Stop Testing Part Overview The Bookends Two questions, like bookends, frame our software testing: Where do we start? When do we stop? Where do we start testing? Of all the places to look for defects, where should we begin? One answer is with a defect taxonomy. A taxonomy is a classification of things into ordered groups or categories that indicate natural, hierarchical relationships. Taxonomies help identify the kinds of defects that often occur in systems, guide your testing by generating ideas, and audit your test plans to determine the coverage you are obtaining with your test cases. In time, they can help you improve your development process. And stopping. How do we logically decide when we have tested enough and the software is ready for delivery and installation? Boris Beizer has written, "There is no single, valid, rational criterion for stopping." If he is correct, how do we make that decision? 
The next two chapters address these important issues.

Chapter 15: Defect Taxonomies

'Failure' was simply not a word that would ever cross the lips of Miss Evelyn Duberry, mainly because Evelyn, a haughty socialite with fire-red hair and a coltish gait, could pronounce neither the letters 'f' nor 'r' as a result of an unfortunate kissing gesture made many years earlier toward her beloved childhood parrot, Snippy.
— David Kenyon

Introduction

What is a taxonomy? A taxonomy is a classification of things into ordered groups or categories that indicate natural, hierarchical relationships. The word taxonomy is derived from two Greek roots: "taxis" meaning arrangement and "onoma" meaning name. Taxonomies not only facilitate the orderly storage of information, they facilitate its retrieval and the discovery of new ideas.

Taxonomies help you:
- Guide your testing by generating ideas for test design
- Audit your test plans to determine the coverage your test cases are providing
- Understand your defects, their types and severities
- Understand the process you currently use to produce those defects (Always remember, your current process is finely tuned to create the defects you're creating)
- Improve your development process
- Improve your testing process
- Train new testers regarding important areas that deserve testing
- Explain to management the complexities of software testing

Key Point: A taxonomy is a classification of things into ordered groups or categories that indicate natural, hierarchical relationships.

In his book Testing Object-Oriented Systems, Robert Binder describes a "fault model" as a list of typical defects that occur in systems. Another phrase to describe such a list is a defect taxonomy. Binder then describes two approaches to testing. The first uses a "non-specific fault model." In other words, no defect taxonomy is used. Using this approach, the requirements and specifications guide the creation of all of our test cases. The second approach uses a "specific fault model." In this approach, a taxonomy of defects guides the creation of test cases. In other words, we create test cases to discover faults like the ones we have experienced before.

We will consider two levels of taxonomies—project level and software defect level. Of most importance in test design are the software defect taxonomies. But it would be foolish to begin test design before evaluating the risks associated with both the product and its development process. Note that none of the taxonomies presented below are complete. Each could be expanded. Each is subjective, based on the experience of those who created the taxonomies.

Project Level Taxonomies

SEI Risk Identification Taxonomy

The Software Engineering Institute has published a "Taxonomy-Based Risk Identification" that can be used to identify, classify, and evaluate different risk factors found in the development of software systems.

Table 15-1: The SEI Taxonomy-Based Risk Identification taxonomy.
Product Engineering
- Requirements: Stability, Completeness, Clarity, Validity, Feasibility, Precedent, Scale
- Design: Functionality, Difficulty, Interfaces, Performance, Testability
- Code and Unit Test: Feasibility, Testing, Coding/Implementation
- Integration and Test: Environment, Product, System
- Engineering Specialties: Maintainability, Reliability, Safety, Security, Human Factors, Specifications

Development Environment
- Development Process: Formality, Suitability, Process Control, Familiarity, Product Control
- Development System: Capacity, Suitability, Usability, Familiarity, Reliability, System Support, Deliverability
- Management Process: Planning, Project Organization, Management Experience, Program Interfaces
- Management Methods: Monitoring, Personnel Management, Quality Assurance, Configuration Management
- Work Environment: Quality Attitude, Cooperation, Communication, Morale

Program Constraints
- Resources: Schedule, Staff, Budget, Facilities
- Contract: Type of Contract, Restrictions, Dependencies
- Program Interfaces: Customer, Associate Contractors, Subcontractors, Prime Contractor, Corporate Management, Vendors, Politics

If, as a tester, you had concerns with some of these elements and attributes, you would want to stress certain types of testing. For example:

If you are concerned about:              You might want to emphasize:
The stability of the requirements        Formal traceability
Incomplete requirements                   Exploratory testing
Imprecisely written requirements          Decision tables and/or state-transition diagrams
Difficulty in realizing the design        Control flow testing
System performance                        Performance testing
Lack of unit testing                      Additional testing resources
Usability problems                        Usability testing

ISO 9126 Quality Characteristics Taxonomy

The ISO 9126 Standard "Software Product Evaluation—Quality Characteristics and Guidelines" focuses on measuring the quality of software systems. This international standard defines software product quality in terms of six major characteristics and twenty-one subcharacteristics and defines a process to evaluate each of these. This taxonomy of quality attributes is:

Table 15-2: The ISO 9126 Quality Characteristics taxonomy.

- Functionality (Are the required functions available in the software?): Suitability, Accuracy, Interoperability, Security
- Reliability (How reliable is the software?): Maturity, Fault tolerance, Recoverability
- Usability (Is the software easy to use?): Understandability, Learnability, Operability, Attractiveness
- Efficiency (How efficient is the software?): Time behavior, Resource behavior
- Maintainability (How easy is it to modify the software?): Analyzability, Changeability, Stability, Testability
- Portability (How easy is it to transfer the software to another operating environment?): Adaptability, Installability, Coexistence, Replaceability

Each of these characteristics and subcharacteristics suggests areas of risk and thus areas for which tests might be created. An evaluation of the importance of these characteristics should be undertaken first so that the appropriate level of testing is performed. A similar "if you are concerned about / you might want to emphasize" process could be used based on the ISO 9126 taxonomy.

These project level taxonomies can be used to guide our testing at a strategic level. For help in software test design we use software defect taxonomies.

Software Defect Taxonomies

In software test design we are primarily concerned with taxonomies of defects, ordered lists of common defects we expect to encounter in our testing.
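Before turning to specific taxonomies, it is worth noticing how mechanically simple the "specific fault model" use of a taxonomy can be. The sketch below (in Java, purely illustrative) treats a taxonomy as a plain category-to-subcategory map and uses it in two of the ways listed in the Introduction: summarizing a defect log by category, and flagging categories that no planned test targets. The category names are abbreviated stand-ins, and the defect-log and planned-coverage structures are assumptions made for this sketch, not something the taxonomies that follow prescribe.

import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Illustrative only: a tiny, hypothetical slice of a defect taxonomy used
// (1) to summarize a defect log and (2) to audit planned test coverage.
public class TaxonomyAudit {
    public static void main(String[] args) {
        // Taxonomy category -> example subcategories (abbreviated).
        Map<String, List<String>> taxonomy = new LinkedHashMap<>();
        taxonomy.put("Requirements", List.of("Incorrect", "Logic", "Completeness"));
        taxonomy.put("Data", List.of("Definition and structure", "Access and handling"));
        taxonomy.put("Integration", List.of("Internal interfaces", "External interfaces"));

        // A defect log in which each defect is tagged with a taxonomy category.
        List<String> defectLog = List.of("Requirements", "Data", "Data", "Integration", "Data");

        // Use 1: understand the defects you actually create (raw counts and percentages).
        for (String category : taxonomy.keySet()) {
            long n = defectLog.stream().filter(category::equals).count();
            System.out.printf("%-15s %2d (%3.0f%%)%n", category, n, 100.0 * n / defectLog.size());
        }

        // Use 2: audit the test plan -- flag categories no planned test case targets.
        List<String> plannedCoverage = List.of("Requirements", "Data");
        for (String category : taxonomy.keySet()) {
            if (!plannedCoverage.contains(category)) {
                System.out.println("No planned tests aimed at: " + category);
            }
        }
    }
}

Run against a real defect log, the same few lines produce the raw numbers and percentages whose value is discussed under Beizer's taxonomy below.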
Beizer's Taxonomy

One of the first defect taxonomies was defined by Boris Beizer in Software Testing Techniques. It defines a four-level classification of software defects. The top two levels are shown here.

Table 15-3: A portion of Beizer's Bug Taxonomy.

1xxx Requirements
  11xx Requirements incorrect
  12xx Requirements logic
  13xx Requirements, completeness
  14xx Verifiability
  15xx Presentation, documentation
  16xx Requirements changes
2xxx Features And Functionality
  21xx Feature/function correctness
  22xx Feature completeness
  23xx Functional case completeness
  24xx Domain bugs
  25xx User messages and diagnostics
  26xx Exception conditions mishandled
3xxx Structural Bugs
  31xx Control flow and sequencing
  32xx Processing
4xxx Data
  41xx Data definition and structure
  42xx Data access and handling
5xxx Implementation And Coding
  51xx Coding and typographical
  52xx Style and standards violations
  53xx Documentation
6xxx Integration
  61xx Internal interfaces
  62xx External interfaces, timing, throughput
7xxx System And Software Architecture
  71xx O/S call and use
  72xx Software architecture
  73xx Recovery and accountability
  74xx Performance
  75xx Incorrect diagnostics, exceptions
  76xx Partitions, overlays
  77xx Sysgen, environment
8xxx Test Definition And Execution
  81xx Test design bugs
  82xx Test execution bugs
  83xx Test documentation
  84xx Test case completeness

Even considering only the top two levels, it is quite extensive. All four levels of the taxonomy constitute a fine-grained framework with which to categorize defects. At the outset, a defect taxonomy acts as a checklist, reminding the tester so that no defect types are forgotten. Later, the taxonomy can be used as a framework to record defect data. Subsequent analysis of this data can help an organization understand the types of defects it creates, how many (in terms of raw numbers and percentages), and how and why these defects occur. Then, when faced with too many things to test and not enough time, you will have data that enables you to make risk-based, rather than random, test design decisions.

In addition to taxonomies that suggest the types of defects that may occur, always evaluate the impact on the customer and ultimately on your organization if they do occur. Defects that have low impact may not be worth tracking down and repairing.

Kaner, Falk, and Nguyen's Taxonomy

The book Testing Computer Software contains a detailed taxonomy consisting of over 400 types of defects. Only a few excerpts from this taxonomy are listed here.

Table 15-4: A portion of the defect taxonomy from Testing Computer Software.
User Interface Errors
  Functionality
  Communication
  Command structure
  Missing commands
  Performance
  Output
Error Handling
  Error prevention
  Error detection
  Error recovery
Boundary-Related Errors
  Numeric boundaries
  Boundaries in space, time
  Boundaries in loops
Calculation Errors
  Outdated constants
  Calculation errors
  Wrong operation order
  Overflow and underflow
Initial And Later States
  Failure to set a data item to 0
  Failure to initialize a loop control variable
  Failure to clear a string
  Failure to reinitialize
Control Flow Errors
  Program runs amok
  Program stops
  Loops
  IF, THEN, ELSE or maybe not
Errors In Handling Or Interpreting Data
  Data type errors
  Parameter list variables out of order or missing
  Outdated copies of data
  Wrong value from a table
  Wrong mask in bit field
Race Conditions
  Assuming one event always finishes before another
  Assuming that input will not occur in a specific interval
  Task starts before its prerequisites are met
Load Conditions
  Required resource not available
  Doesn't return unused memory
Hardware
  Device unavailable
  Unexpected end of file
Source And Version Control
  Old bugs mysteriously reappear
  Source doesn't match binary
Documentation
  None
Testing Errors
  Failure to notice a problem
  Failure to execute a planned test
  Failure to use the most promising test cases
  Failure to file a defect report

Binder's Object-Oriented Taxonomy

Robert Binder notes that many defects in the object-oriented (OO) paradigm are problems using encapsulation, inheritance, polymorphism, message sequencing, and state-transitions. This is to be expected for two reasons. First, these are cornerstone concepts in OO. They form the basis of the paradigm and thus will be used extensively. Second, these basic concepts are very different from the procedural paradigm. Designers and programmers new to OO would be expected to find them foreign ideas. A small portion of Binder's OO taxonomy is given here to give you a sense of its contents:

Table 15-5: A portion of Binder's Method Scope Fault Taxonomy.

Requirements
  Requirement omission
Design: Abstraction / Refinement
  Low Cohesion
  Feature override missing
  Feature delete missing
Design: Encapsulation
  Naked access
  Overuse of friend
Design: Responsibilities
  Incorrect algorithm
  Invariant violation
Design: Exceptions
  Exception not caught

Table 15-6: A portion of Binder's Class Scope Fault Taxonomy.

Design: Abstraction / Refinement
  Association missing or incorrect
  Inheritance loops
  Wrong feature inherited
  Incorrect multiple inheritance
Design: Encapsulation
  Public interface not via class methods
  Implicit class-to-class communication
Design: Modularity
  Object not used
  Excessively large number of methods
Implementation
  Incorrect constructor

Note how this taxonomy could be used to guide both inspections and test case design. Binder also references specific defect taxonomies for C++, Java, and Smalltalk.

Whittaker's "How to Break Software" Taxonomy

James Whittaker's book How to Break Software is a tester's delight. Proponents of exploratory testing exhort us to "explore." Whittaker tells us specifically "where to explore." Not only does he identify areas in which faults tend to occur, he defines specific testing attacks to locate these faults. Only a small portion of his taxonomy is presented:

Table 15-7: A portion of Whittaker's Fault Taxonomy.
Inputs and outputs
  Force all error messages to occur
  Force the establishing of default values
  Overflow input buffers
Data and computation
  Force the data structure to store too few or too many values
  Force computation results to be too large or too small
File system interface
  Fill the file system to its capacity
  Damage the media
Software interfaces
  Cause all error handling code to execute
  Cause all exceptions to fire

Vijayaraghavan's eCommerce Taxonomy

Beizer's, Kaner's, and Whittaker's taxonomies catalog defects that can occur in any system. Binder's focuses on common defects in object-oriented systems. Giri Vijayaraghavan has chosen a much narrower focus—the eCommerce shopping cart. Using this familiar metaphor, an eCommerce Web site keeps track of the state of a user while shopping. Vijayaraghavan has investigated the many ways shopping carts can fail. He writes, "We developed the list of shopping cart failures to study the use of the outline as a test idea generator." This is one of the prime uses of any defect taxonomy. His taxonomy lists over sixty high-level defect categories, some of which are listed here: Performance, Reliability, Software upgrades, User interface usability, Maintainability, Conformance, Stability, Operability, Fault tolerance, Accuracy, Internationalization, Recoverability, Capacity planning, Third-party software failure, Memory leaks, Browser problems, System security, and Client privacy.

After generating the list he concludes, "We think the list is a sufficiently broad and well-researched collection that it can be used as a starting point for testing other applications." His assertion is certainly correct.

A Final Observation

Note that each of these taxonomies is a list of possible defects without any guidance regarding the probability that these will occur in your systems and without any suggestion of the loss your organization would incur if these defects did occur. Taxonomies are useful starting points for our testing but they are certainly not a complete answer to the question of where to start testing.

Your Taxonomy

Now that we have examined a number of different defect taxonomies, the question arises—which is the correct one for you? The taxonomy that is most useful is your taxonomy, the one you create from your experience within your organization. Often the place to start is with an existing taxonomy. Then modify it to more accurately reflect your particular situation in terms of defects, their frequency of occurrence, and the loss you would incur if these defects were not detected and repaired.

Key Point: The taxonomy that is most useful is your taxonomy, the one you create.

Just as there is no one single right way to categorize in other disciplines like biology, psychology, and medicine, there is no one right software defect taxonomy. Categories may be fuzzy and overlap. Defects may not correspond to just one category. Our list may not be complete, correct, or consistent. That matters very little. What matters is that we are collecting, analyzing, and categorizing our past experience and feeding it forward to improve our ability to detect defects. Taxonomies are merely models and, as George Box, the famous statistician, reminds us, "All models are wrong; some models are useful."

To create your own taxonomy, first start with a list of key concepts. Don't worry if your list becomes long. That may be just fine. Make sure the items in your taxonomy are short, descriptive phrases. Keep your users (that's you and other testers in your organization) in mind.
Use terms that are common for them. Later, look for natural hierarchical relationships between items in the taxonomy. Combine these into a major category with subcategories underneath. Try not to duplicate or overlap categories and subcategories. Continue to add new categories as they are discovered. Revise the categories and subcategories when new items don't seem to fit well. Share your taxonomy with others and solicit their feedback. You are on your way to a taxonomy that will contribute to your testing success. Summary Taxonomies help you: Guide your testing by generating ideas for test case design Audit your test plans to determine the coverage your test cases are providing Understand your defects, their types and severities Understand the process you currently use to produce those defects (Always remember, your current process is finely tuned to create the defects you're creating) Improve your development process Improve your testing process Train new testers regarding important areas that deserve testing Explain to management the complexities of software testing Testing can be done without the use of taxonomies (nonspecific fault model) or with a taxonomy (specific fault model) to guide the design of test cases. Taxonomies can be created at a number of levels: generic software system, development paradigm, type of application, and user interface metaphor. References Beizer, Boris (1990). Software Testing Techniques (Second Edition). Van Nostrand Reinhold. Binder, Robert V. (2000). Testing Object-Oriented Systems: Models, Patterns, and Tools. Addison-Wesley. Carr, Marvin J., et al. (1993) "Taxonomy-Based Risk Identification." Technical Report CMU/SEI-93-TR-6, ESC-TR-93-183, June 1993. http://www.sei.cmu.edu/pub/documents/93.reports/pdf/tr06.93.pdf ISO (1991). ISO/IEC Standard 9126-1. Software Engineering - Product Quality - Part 1: Quality Model, ISO Copyright Office, Geneva, June 2001. Kaner, Cem,Jack Falk and Hung Quoc Nguyen (1999). Testing Computer Software (Second Edition). John Wiley & Sons. Whittaker, James A. (2003). How to Break Software: A Practical Guide to Testing. Addison Wesley. Vijayaraghavan, Giri and Cem Kaner. "Bugs in your shopping cart: A Taxonomy." http://www.testingeducation.org/articles/BISC_Final.pdf Chapter 16: When to Stop Testing The ballerina stood on point, her toes curled like shrimp, not deep-fried shrimp because, as brittle as they are, they would have cracked under the pressure, but tender ebi-kind-ofshrimp, pink and luscious as a Tokyo sunset, wondering if her lover was in the Ginza, wooing the geisha with eyes reminiscent of roe, which she liked better than ebi anyway. — Brian Tacang The Banana Principle In his classic book An Introduction to General Systems Thinking, Gerald Weinberg introduces us to the "Banana Principle." A little boy comes home from school and his mother asks, "What did you learn in school today?" The boy responds, "Today we learned how to spell 'banana' but we didn't learn when to stop." In this book we have learned how to design effective and efficient test cases, but how do we know when to stop? How do we know we have done enough testing? When to Stop In The Complete Guide to Software Testing, Bill Hetzel wrote regarding system testing, "Testing ends when we have measured system capabilities and corrected enough of the problems to have confidence that we are ready to run the acceptance test." The phrases "corrected enough" and "have confidence," while certainly correct, are vague. 
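The rest of this chapter surveys criteria teams use to make that judgment concrete. As a preview, the sketch below (in Java, purely illustrative) shows how mechanically such criteria can be checked once they are actually written down. The particular numbers simply echo the examples used later in this chapter (100% statement coverage, 90% use case scenario coverage, fewer than three defects found per week); the variable names, the numbers plugged in, and the idea of combining the two criteria in one check are assumptions of this sketch, not recommendations from the book.

// Illustrative only: one way a team might make its exit criteria checkable.
// Threshold values and the combination of criteria are assumptions for this sketch.
public class StoppingCheck {
    public static void main(String[] args) {
        double statementCoverage = 0.97;   // fraction of statements executed so far
        double scenarioCoverage = 0.92;    // fraction of use case scenarios tested
        int defectsFoundThisWeek = 2;      // most recent week's discovery count

        boolean coverageGoalsMet =
                statementCoverage >= 1.00 && scenarioCoverage >= 0.90;
        boolean discoveryRateLow = defectsFoundThisWeek < 3;

        if (coverageGoalsMet && discoveryRateLow) {
            System.out.println("Exit criteria met -- candidate for release review.");
        } else {
            System.out.println("Keep testing: coverageGoalsMet=" + coverageGoalsMet
                    + ", discoveryRateLow=" + discoveryRateLow);
        }
    }
}

The point is not the arithmetic but that writing exit criteria in checkable form removes much of the vagueness that phrases like "corrected enough" leave open.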
Regarding stopping, Boris Beizer has written, "There is no single, valid, rational criterion for stopping. Furthermore, given any set of applicable criteria, how exactly each is weighted depends very much upon the product, the environment, the culture and the attitude to risk." Again, not much help in knowing when to stop testing.

Even though Beizer says there is no single criterion for stopping, many organizations have chosen one anyway. The five basic criteria often used to decide when to stop testing are:
- You have met previously defined coverage goals
- The defect discovery rate has dropped below a previously defined threshold
- The marginal cost of finding the "next" defect exceeds the expected loss from that defect
- The project team reaches consensus that it is appropriate to release the product
- The boss says, "Ship it!"

Coverage Goals

Coverage is a measure of how much has been tested compared with how much is available to test. Coverage can be defined at the code level with metrics such as statement coverage, branch coverage, and path coverage. At the integration level, coverage can be defined in terms of APIs tested or API/parameter combinations tested. At the system level, coverage can be defined in terms of functions tested, use cases (or user stories) tested, or use case scenarios (main path plus all the exception paths) tested. Once enough test cases have been executed to meet the previously defined coverage goals, we are, by definition, finished testing. For example, we could define a project's stopping criteria as:
- 100% statement coverage
- 90% use case scenario coverage
When these coverage goals have been met, we are finished testing. (Of course, there are many other combinations of factors that could be used as stopping criteria.)

Not all testers approve of this approach. Glenford Myers believes that this method is highly counterproductive. He believes that because human beings are very goal-oriented, this criterion could subconsciously encourage testers to write test cases that have a low probability of detecting defects but do meet the coverage criteria. He believes that more specific criteria, such as a set of tests that cover all boundary values, state-transition events, decision table rules, etc., are superior.

Defect Discovery Rate

Another approach is to use the defect discovery rate as the criterion for stopping. Each week (or other short period of time) we count the number of defects discovered. Typically, the number of defects found each week resembles the curve in Figure 16-1. Once the discovery rate is less than a certain previously selected threshold, we are finished testing. For example, if we had set the threshold at three defects/week, we would stop testing after week 18.

Figure 16-1: Defect Discovery Rate

While this approach appeals to our intuition, we should consider what other situations would produce a curve like this—creation of additional, but less effective, test cases; testers on vacation; "killer" defects that still exist in the software but that hide very well. This is one reason why Beizer suggests not depending on only one stopping criterion.

Marginal Cost

In manufacturing, we define "marginal cost" as the cost associated with one additional unit of production. If we're making 1,000 donuts, what is the additional cost of making one more? Not very much. In manufacturing, the marginal cost typically decreases as the number of units made increases. In software testing, however, just the opposite occurs. Finding the first few defects is relatively simple and inexpensive.
Finding each additional defect becomes more and more time consuming and costly because these defects are very adept at hiding from our test cases. Thus the cost of finding the "next" defect increases. At some point the cost of finding that defect exceeds the loss our organization would incur if we shipped the product with that defect. Clearly, it is (past) time to stop testing. Not every system should use this criterion. Systems that require high reliability such as weapons systems, medical devices, industrial controls, and other safety-critical systems may require additional testing because of their risk and subsequent loss should a failure occur. Team Consensus Based on various factors including technical, financial, political, and just "gut feelings," the project team (managers, developers, testers, marketing, sales, quality assurance, etc.) decide that the benefits of delivering the software now outweigh the potential liabilities and reach consensus that the product should be released. Ship It! For many of us, this will be the only strategy we will ever personally experience. It's often very disheartening for testers, especially after many arduous hours of testing, and with a sure knowledge that many defects are still hiding in the software, to be told "Ship it!" What testers must remember is that there may be very reasonable and logical reasons for shipping the product before we, as testers, think it is ready. In today's fast-paced market economy, often the "first to market" wins a substantial market share. Even if the product is less than perfect, it may still satisfy the needs of many users and bring significant profits to our organization; profits that might be lost if we delayed shipment. Some of the criteria that should be considered in making this decision are the complexity of the product itself, the complexity of the technologies used to implement it and our skills and experience in using those technologies, the organization's culture and the importance of risk aversion in our organization, and the environment within which the system will operate including the financial and legal exposure we have if the system fails. As a tester, you may be frustrated by the "Ship It" decision. Remember, our role as testers is to inform management of the risks of shipping the product. The role of your organization's marketing and sales groups should be to inform management of the benefits of shipping the product. With this information, both positive and negative, project managers can make informed, rational decisions. Some Concluding Advice Lesson 185 in Lessons Learned in Software Testing states: "Because testing is an information gathering process, you can stop when you've gathered enough information. You could stop after you've found every bug, but it would take infinite testing to know that you've found every bug, so that won't work. Instead, you should stop when you reasonably believe that the probability is low that the product still has important undiscovered problems. "Several factors are involved in deciding that testing is good enough (low enough chance of undiscovered significant bugs): You are aware of the kinds of problems that would be important to find, if they existed. You are aware of how different parts of the product could exhibit important problems. You have examined the product to a degree and in a manner commensurate with these risks. Your test strategy was reasonably diversified to guard against tunnel vision. You used every resource available for testing. 
You met every testing process standard that your clients would expect you to meet. You expressed your test strategy, test results, and quality assessments as clearly as you could." Summary Regarding stopping, Boris Beizer has written, "There is no single, valid, rational criterion for stopping. Furthermore, given any set of applicable criteria, how exactly each is weighted depends very much upon the product, the environment, the culture and the attitude to risk." The five basic criteria often used to decide when to stop testing are: You have met previously defined coverage goals The defect discovery rate has dropped below a previously defined threshold The marginal cost of finding the "next" defect exceeds the expected loss from that defect The project team reaches consensus that it is appropriate to release the product The boss says, "Ship it!" References Hetzel, Bill (1998). The Complete Guide to Software Testing (Second Edition). John Wiley & Sons. Kaner, Cem,James Bach, and Bret Pettichord (2002). Lessons Learned in Software Testing: A Context-Driven Approach. John Wiley & Sons. Myers, Glenford (1979). The Art of Software Testing. John Wiley & Sons. Weinberg, Gerald M. (1975). An Introduction to General Systems Thinking. John Wiley & Sons. Section V: Some Final Thoughts Appendix List Appendix A: Brown & Donaldson Case Study Appendix B: Stateless University Registration System Case Study Part Overview Your Testing Toolbox My oldest son Shawn is a glazier—he installs glass, mirrors, shower doors, etc. He is an artist in glass. As a father, I decided it would be good to know what my son does for a living, so I rode with him in his truck for a few hours watching him work. At the first job site he pulled out a clipboard with a work order that told him what was needed. He hopped out and walked around to the back of the truck. There, he grabbed his tool bucket (an old five-gallon paint bucket) and rooted around through it. He pulled out some tools, walked up to the house, did his magic, came back to the truck, put the tools in the bucket, and away we went. At the second job site he repeated the process. Once again, he pulled out the clipboard, hopped out, walked around to the back of the truck, grabbed his tool bucket, and rooted around through it. He pulled out some tools, but different tools this time, walked up to the house, did his magic, came back to the truck, put the tools in the bucket, and away we went. As we went from job to job it occurred to me that all good craftspeople, including software testers, need a bucket of tools. In addition, good craftspeople know which tool to use in which situation. My intent in writing this book was to help put more tools in your personal testing tool bucket and to help you know which tool to use in which situation. Remember, not every tool needs to be used every time. Now, it's up to you. The next level of skill comes with practice. Famous educator Benjamin Bloom created a taxonomy for categorizing levels of competency in school settings. The first three levels are: Knowledge Comprehension Application This book has focused on knowledge and comprehension. The "application" is up to you. Best wishes in your testing ... References Bloom, Benjamin S. (1969). Taxonomy of Educational Objectives: The Classification of Educational Goals. Longman Group. Appendix A: Brown & Donaldson Case Study Introduction Brown & Donaldson (B&D) is a fictitious online brokerage firm that you can use to practice the test design techniques presented in this book. 
B&D was originally created for Software Quality Engineering's Web/eBusiness Testing course (see http://www.sqe.com). The actual B&D Web site is found at http://bdonline.sqe.com. Any resemblance to any actual online brokerage Web site is purely coincidental.

Login
The Login page is the gateway into the B&D site. It requires a legitimate username and password.

Market News
The Market News page is the main page of the B&D site. It contains navigation buttons on the left side of the page, stock performance charts at the top, and news stories of interest to B&D's investors.

Trade
The Trade page allows a B&D client to buy and sell stocks. It contains a buy/sell button, a text box for the stock ticker symbol, a text box for the number of shares to be bought or sold (quantity), and boxes indicating the type of trade.

Symbol Lookup
The Symbol Lookup page is reached from the Trade page. It is used when the B&D client is unsure of the stock ticker symbol and must look it up. It contains one field where the first few characters of the organization's name are entered.

Lookup Results
The Lookup Results page is the result of the previous Symbol Lookup page. It displays the stock symbols that matched the previous search.

Holdings
Perhaps the most important page on the B&D site, the Holdings page displays the stocks currently owned by this client.

Glossary
The Glossary page can be used to look up terms that are unfamiliar to the B&D client.

Appendix B: Stateless University Registration System Case Study

System Documentation
Stateless University Registration System (SURS)
User Interface Specification
May 1, 2002
Version 2.3
Prepared By: Otis Kribblekoblis
Super Duper Software Company (SDSC)
422 South 5th Avenue
Anytown, USA

Introduction
The purpose of this document is to describe the planned user interface for the Stateless University Registration System. It will be revised to reflect the "as-built" software after system testing has begun. It is a customized version of the registration system delivered to Universal Online University (UOU) last year. Stateless U has requested some major modifications to the UOU version, so that it is essentially a rewrite of the software. Some of the modules for database creation and backup have been reused, but that is not apparent from the user interface, which is all new.

This manual has the user interface screens defined in the order in which they are customarily used. It starts with the login screen. Then it provides the database set-up fields: the addition / change / deletion of students, the addition / change / deletion of courses, and the addition / change / deletion of class sections. The final data entry screen provides the selection of specific course sections for each student. There is also an administrative function that is accessible to only the supervisor. It provides access to the administrative functions of backup and restore of the databases.

Each screen is defined in a separate section providing the following information:
- Functionality supported
- Formatting requirements for each data entry field
- A sample screen layout (the final implemented software may differ)

The figure below summarizes the screens and their navigation options.

2.1 Log-in and Select Function Screen

2.1.1 Functions
Each user is required to enter a User ID and a Password. The identification of the status of the user (supervisor: yes or no) is mandatory at the time of log-in. Only Yes or No may be selected by clicking on the appropriate box (not both).
After a successful log-in has been completed, then the next function to be executed can be selected. Only a supervisor may access the Administrative screen. The Exit button is active at all times. 2.1.2 Data Entry Formats The formats for the fields on this screen are: User ID: eight characters at least two of which are not alphabetic (can be numeric or special characters). Password: eight characters at least two of which are not alphabetic (can be numeric or special characters). 2.1.3 Screen Format 3.1 Student Database Maintenance Screen 3.1.1 Functions This screen allows the entry of the identifying information for a new student and the assignment of his/her student ID number. All fields are required to be filled in before the Enter button is selected. The fields may be entered in any order. The backspace key will work at any time in any field, but the Reset button will clear all of the fields when it is pressed. If the Student ID is entered first, then the Delete (allows a student to be removed from the database) and Modify (allows the modification of the student's contact information—the data currently in the database will be displayed) buttons become active. The Enter button will cause the Delete or the Modify to be executed and the fields on the screen to be cleared. 3.1.2 Data Entry Formats The formats for the fields (all mandatory) on this screen are: First name: one to ten characters Middle name: one to ten characters or NMN for no middle name Last name: one to fifteen characters (alpha, period, hyphen, apostrophe, space, numbers) Street address: four to twenty alphanumeric characters City: three to ten alpha characters State: two alpha characters Zip: the standard five numerics hyphen four numerics Phone: telephone number in the following format 703.555.1212 Student ID: two characters representing the home campus and a six-digit number which is unique for each student. The home campus designations are: AN for Annandale LO for Loudoun MA for Manassas WO for Woodbridge AR for Arlington The six-digit number is generated by the system when the Enter button is selected. It remains displayed until the Reset button is depressed. At that time, all fields are cleared for the next set of entries. 3.1.3 Screen Layout 3.2 Course Database Maintenance Screen 3.2.1 Functions This screen allows the entry of the identifying information for a new course and the assignment of the course ID number. All fields are required to be filled in before the Enter button is pressed. The fields may be entered in any order. The backspace key will work at any time in any field, but the Reset button will clear all of the fields when it is entered. The Back button causes a return to the previous screen. The Exit button causes an exit from this application. If the Course ID is entered first, then the Delete (allows a course to be removed from the database) and Modify (allows the modification of an existing course's information—the data currently in the database will be displayed) buttons become active. The Enter button will cause the Delete or the Modify to be executed and the fields on the screen to be cleared. 3.2.2 Data Entry Formats The formats for the fields (all are mandatory) on this screen are: Course ID: three alpha characters representing the department followed by a six-digit integer which is the unique course identification number. 
The possible departments are: PHY - Physics EGR - Engineering ENG - English LAN - Foreign languages CHM - Chemistry MAT - Mathematics PED - Physical education SOC - Sociology LIB - Library science HEC - Home economics Course name: a free format alphanumeric field of up to forty characters Course description: a free format alphanumeric field of up to 250 characters 3.2.3 Screen Layout 3.3 Class Section Database Maintenance Screen 3.3.1 Functions This screen allows the entry of the identifying information for a new course section. All fields are required to be filled in before the Enter button is pressed. The fields may be entered in any order. The backspace key will work at any time in any field, but the Reset button will clear all of the fields when it is entered. The Back button causes a return to the previous screen. The Exit button causes an exit from this application. The Course ID is required to be entered first (all existing sections will be displayed as soon as it is entered), followed by the new Section #, Dates and Time fields. The Delete (allows a section to be removed from the database) and Modify (allows the modification of an existing section's information) buttons become active after the Section # is entered. If the section is already in the database, the current information will be displayed as soon as the Section # field is filled in. The Enter button will cause the Delete or the Modify to be executed and the fields on the screen to be cleared. 3.3.2 Data Entry Formats The formats for the fields (all mandatory) on this screen are: Course ID: three alpha characters representing the department followed by a six-digit integer Section #: a three-digit integer (leading zeros are required) assigned by the user Dates: the days of the week the class meets (up to three with hyphens in between); the weekday designations are: Sun Mon Tue Wed Thr Fri Sat Time: the starting and ending times of the section (using military time) with a hyphen in between, e.g., 12:00–13:30. 3.3.3 Screen Layout 3.4 Section Selection Entry Screen 3.4.1 Functions This screen allows the entry of the selection of specific course sections for an individual student. All fields are required to be filled in before the Enter button is pressed. The fields may be entered in any order. The backspace key will work at any time in any field, but the Reset button will clear all of the fields when it is entered. The Back button causes a return to the previous screen. The Exit button causes an exit from this application. The Student ID is required to be entered first, followed by the Course ID (all available sections will be displayed as soon as it is entered). Sections are selected by clicking on the section to be assigned. The Enter button will cause the student to be added to the selected section. Entering a new Course ID will cause a new list of available sections to be displayed, allowing another course section to be selected for the same student. 3.4.2 Data Entry Formats The formats for the fields (all mandatory) on this screen are: Course ID: three alpha characters representing the department followed by a six-digit integer Student ID: two characters representing the home campus and a six-digit number that is unique for each student Available sections: a list of all of the sections that are not full 3.4.3 Screen Layout 3.5 Administrative Screen 3.5.1 Functions Only the supervisor may access the administrative screen. 
It permits one of the following three activities at a time:
- Creation of a backup of any or all of the databases
- Restore of a backup of any or all of the databases
- Printing of a report of any or all of the databases

After the activity (create or restore) and the databases have been selected, the name of the backup is to be entered. The Back and Exit buttons are active at all times.

3.5.2 Data Entry Formats
The formats for the fields on this screen are:
- Backup ID: aannnn (required only for backups, not reports)
- Commentary: a free format character field 200 characters in length (required only for backups, not reports)

3.5.3 Screen Layout

Bibliography

Works Cited

Bach, James. "Exploratory Testing and the Planning Myth." 19 March 2001. http://www.stickyminds.com/r.asp?F=DART_2359
Bach, James. "Exploratory Testing Explained." v.1.3, 16 April 2003. http://www.satisfice.com/articles/et-article.pdf
Beizer, Boris (1990). Software Testing Techniques (Second Edition). Van Nostrand Reinhold. ISBN 0-442-20672-0.
Beizer, Boris (1995). Black-Box Testing: Techniques for Functional Testing of Software and Systems. John Wiley & Sons. ISBN 0-471-12094-4.
Binder, Robert V. (2000). Testing Object-Oriented Systems: Models, Patterns, and Tools. Addison-Wesley. ISBN 0-201-80938-9.
Brownlie, Robert, et al. "Robust Testing of AT&T PMX/StarMAIL Using OATS," AT&T Technical Journal, Vol. 71, No. 3, May/June 1992.
Carr, Marvin J., et al. (1993). "Taxonomy-Based Risk Identification." Technical Report CMU/SEI-93-TR-6, ESC-TR-93-183, June 1993. http://www.sei.cmu.edu/pub/documents/93.reports/pdf/tr06.93.pdf
Cockburn, Alistair (2000). Writing Effective Use Cases. Addison-Wesley. ISBN 0-201-70225-8.
Cohen, D.M., et al. "The AETG System: An Approach to Testing Based on Combinatorial Design." IEEE Transactions on Software Engineering, Vol. 23, No. 7, July 1997.
Copeland, Lee. "Exploratory Planning." 3 September 2001. http://www.stickyminds.com/r.asp?F=DART_2805
Craig, Rick D. and Stefan P. Jaskiel (2002). Systematic Software Testing. Artech House Publishers. ISBN 1-58053-508-9.
Fowler, Martin and Kendall Scott (1999). UML Distilled: A Brief Guide to the Standard Object Modeling Language (2nd Edition). Addison-Wesley. ISBN 0-201-65783-X.
Gilb, Tom and Dorothy Graham (1993). Software Inspection. Addison-Wesley. ISBN 0-201-63181-4.
Harel, David. "Statecharts: a visual formalism for complex systems." Science of Computer Programming 8, 1987.
Hetzel, Bill (1998). The Complete Guide to Software Testing (Second Edition). John Wiley & Sons. ISBN 0-471-56567-9.
IEEE Standard for Software Test Documentation: IEEE Standard 829-1998. ISBN 0-7381-1443-X.
IEEE Standard Glossary of Software Engineering Terminology: IEEE Standard 610.12-1990. ISBN 1-55937-067-X.
ISO (1991). ISO/IEC Standard 9126-1. Software Engineering - Product Quality - Part 1: Quality Model. ISO Copyright Office, Geneva, June 2001.
Jacobsen, Ivar, et al. (1992). Object-Oriented Software Engineering: A Use Case Driven Approach. Addison-Wesley. ISBN 0-201-54435-0.
Kaner, Cem, Jack Falk, and Hung Quoc Nguyen (1999). Testing Computer Software (Second Edition). John Wiley & Sons. ISBN 0-471-35846-0.
Kaner, Cem, James Bach, and Bret Pettichord (2002). Lessons Learned in Software Testing: A Context-Driven Approach. John Wiley & Sons. ISBN 0-471-08112-4.
Kuhn, D. Richard and Michael J. Reilly. "An Investigation of the Applicability of Design of Experiments to Software Testing," 27th NASA/IEEE Software Engineering Workshop, NASA Goddard Space Flight Center, 4–6 December 2002.
http://csrc.nist.gov/staff/kuhn/kuhnreilly-02.pdf
Marick, Brian (1995). The Craft of Software Testing: Subsystem Testing Including Object-Based and Object-Oriented Testing. Prentice-Hall. ISBN 0-131-77411-5.
Mandl, Robert. "Orthogonal Latin Squares: An Application of Experiment Design to Compiler Testing," Communications of the ACM, Vol. 28, No. 10, October 1985.
Mealy, G.H. "A method for synthesizing sequential circuits." Bell System Technical Journal, 34(5): 1955.
Myers, Glenford (1979). The Art of Software Testing. John Wiley & Sons. ISBN 0-471-04328-1.
Moore, E.F. "Gedanken-experiments on sequential machines," Automata Studies (C. E. Shannon and J. McCarthy, eds.), Princeton, New Jersey: Princeton University Press, 1956.
Phadke, Madhav S. (1989). Quality Engineering Using Robust Design. Prentice-Hall. ISBN 0-13-745167-9.
Planning, MCDP 5. United States Marine Corps. https://www.doctrine.usmc.mil/mcdp/view/mpdpub5.pdf
Pressman, Roger S. (1982). Software Engineering: A Practitioner's Approach (Fourth Edition). McGraw-Hill. ISBN 0-07-052182-4.
Rapps, Sandra and Elaine J. Weyuker. "Data Flow Analysis Techniques For Test Data Selection." Sixth International Conference on Software Engineering, Tokyo, Japan, September 13–16, 1982.
Rumbaugh, James, et al. (1991). Object-Oriented Modeling and Design. Prentice-Hall. ISBN 0-13-629841-9.
Watson, Arthur H. and Thomas J. McCabe. Structured Testing: A Testing Methodology Using the Cyclomatic Complexity Metric. NIST Special Publication 500-235. http://www.mccabe.com/nist/nist_pub.php
Wallace, Delores R. and D. Richard Kuhn. "Failure Modes in Medical Device Software: An Analysis of 15 Years of Recall Data," International Journal of Reliability, Quality, and Safety Engineering, Vol. 8, No. 4, 2001. http://csrc.nist.gov/staff/kuhn/final-rqse.pdf
Weinberg, Gerald M. (1975). An Introduction to General Systems Thinking. John Wiley & Sons. ISBN 0-471-92563-2.
Whittaker, James A. (2003). How to Break Software: A Practical Guide to Testing. Addison-Wesley. ISBN 0-201-79619-8.

Other Useful Publications

Beizer, Boris (1984). Software System Testing and Quality Assurance. Van Nostrand Reinhold. ISBN 0-442-21306-9.
Black, Rex (1999). Managing the Testing Process. Microsoft Press. ISBN 0-7356-0584-X.
British Computer Society. Standard on Software Component Testing. BS 7925-2. http://www.testingstandards.co.uk http://www.testingstandards.com/BS7925_3_4.zip
Kit, Edward (1995). Software Testing in the Real World: Improving the Process. Addison-Wesley. ISBN 0-201-87756-2.
McGregor, John D. and David A. Sykes (2001). A Practical Guide to Testing Object-Oriented Software. Addison-Wesley. ISBN 0-201-32564-0.
Meyer, Bertrand (2000). Object-Oriented Software Construction (2nd Edition). Prentice-Hall. ISBN 0-136-29155-4.
Roper, Marc (1994). Software Testing. McGraw-Hill. ISBN 0-07-707466-1.
Tamres, Louise (2002). Introducing Software Testing. Addison-Wesley. ISBN 0-201-71974-6.
Index A acceptance testing, 10, 135 ACT test scores, 121 action, 51–52, 95, 97 actor, 128 ad hoc testing, 202 adaptive planning, 213–214, 216 all events coverage, 106 all pairs testing, 64 all pairs tools AETG, 88 Allpairs, 85 rdExpert, 68 all paths coverage, 107 all states coverage, 105 all transitions coverage, 108 Allison, Chuck, xvi Allpairs algorithm, 66, 85–88 comparison with orthogonal arrays, 88–89 unbalanced, 87–88 An Introduction to General Systems Thinking, 236 automatic variables, 169 Index B Bach, James, xvi, 85, 202, 206, 212 banana principle, 236 Barker, Joel, 182 baseline path, 155 basis path sets, 164 basis path testing, 154–159 baseline path, 155 basis path sets, 164 example, 160–164 multiple sets, 159 path creation, 155 basis paths, 154–155 Beizer, Boris, 2, 6, 135, 198, 220, 226, 236 Beizer's taxonomy, 226 binary decisions, 155 Binder, Robert, 11, 222, 229 Binder's object-oriented taxonomy, 229 black box testing, 8, 20–22, 140 black dot symbol, 95, 97 Black, Rex, xvi Bloom, Benjamin, 246 bookends, 220 boundaries, 40, 134 boundary value testing, 40, 197 boundaries, 40, 134 examples, 45 boundary values, 134 Box, George, 232 Brown & Donaldson, 16, 54, 71, 91, 94, 160, 250 authorization code, 17 bulls-eye symbol, 97 Bulwer-Lytton Fiction Contest, xvi Bulwer-Lytton, Edward George, xvi business rules, 50, 58 Index C case studies Brown & Donaldson, 16, 54, 71, 91, 94, 160, 250 Stateless University Registration System, 17, 36, 47, 56, 71, 91, 94, 111, 131, 136, 260 chartered exploratory testing, 206 choosing off points, 118 choosing on points, 118 choosing test cases, 119 classical planning, 213 closed boundary, 118 Cockburn, Alistair, 130 code quality testing, 11 collapsed rules, 55 combination testing, See pairwise testing combinations, 63 combined rules, 55 competency, 246 complex business rules, 50, 53–54, 58 condition coverage, 151 conditional, 102 conditions, 50, 51–53, 55, 58 connection lost, 134 contract, 27 control flow, 141 control flow graphs, 145–147, 154–159, 171–176 control flow paths, 172 control flow testing condition coverage, 151 control flow graphs, 145–147, 154–159, 171–176 decision point, 146 decision/condition coverage, 151 execution paths, 144 exhaustive testing, 144 junction point, 146 levels of test coverage, 147–153 limiting loop execution, 153 missing paths, 144 multiple condition coverage, 152 path coverage, 153 process block, 146 statement coverage, 147, 150 test coverage, 147–153 coverage, 147, 153 Craig, Rick, xiii, xvi, 2 created variables, 169 creating test cases, 109–110, 134 cyclomatic complexity, 154–155 binary decisions, 155 edges, 154–155 nodes, 154–155 Index D data flow graphs, 171–174 data flow testing, 168 ~define, 170 ~kill, 170 ~use, 170 created variables, 169 data flow graph, 171–174 define-define, 170 define-kill, 170 define-use, 170 define-use-kill, 172–174 destroyed variables, 169 dynamic data flow testing, 176 kill-define, 170 killed variables, 169 kill-kill, 170 kill-use, 170 life cycle, 169 processing flow, 171 static, 171–176 used variables, 169 use-define, 170 use-kill, 170 use-use, 170 variables in computation, 169 variables in conditionals, 169 data sensitivity errors, 141 debugging, 3 decision coverage, 150 decision point, 146 decision tables actions, 51–52 collapsed rules, 55 combined rules, 55 conditions, 50–53, 55, 58 deriving test cases, 52–53 examples, 54–58 expected results, 54 "firing", 51 decision table testing, 50–54 decision/condition coverage, 151 defect taxonomies, 222–232 defensive-design, 28 defensive-testing, 
28 define-define, 170 define-kill, 170 define-use, 170 define-use-kill, 172–174 design-by-contract, 27 destroyed variables, 169 development managers, xv development paradigm, 129 disk full, 134 documenting transactions flowcharts, 128 HIPO diagrams, 128 text, 128 use cases, 128 domain analysis, 116 domain analysis testing choosing off points, 118 choosing on points, 118 choosing test cases, 119 closed boundary, 118 Domain Test Matrix, 120 example, 121 extra boundary, 116–117 in point, 118, 120 interactions between variables, 116 missing boundary, 116–117 off point, 118, 120 on point, 118, 120 open boundary, 118 out point, 118 shifted boundary, 116–117 tilted boundary, 116–117 Domain Test Matrix, 120, 134 double-mode defects, 65 driver not loaded, 134 dynamic data flow testing, 176 Index E edges, 154–155 equivalence class partitioning, 197 equivalence class testing, 24–33 examples, 33–34 input equivalence classes, 29 output equivalent classes, 33 equivalence class types continuous range of values, 29 discrete values within a range, 29–30 multiple selection, 30–31 single selection, 30 equivalence classes, 25, 28 events, 94–96 execution paths, 141, 144 exhaustive testing, 21, 144 existing paths, 141 exploratory planning rigorous plan, 202 exploratory testing, 182, 202–208, 212 chartered, 206 charters, 206 conscious plan, 202 freestyle, 207 process, 205 extra boundary, 116–117 Index F FAFSA, 59 Faught, Danny, xvi fault model, 222 non-specific, 222 specific, 222 Ferguson, Marilyn, 182 figures decision coverage, 150 annotated control flow diagram, 171–173 B&D control flow graph, 162–163 B&D Java code, 160 baseline basis path, 156 boundary values, 43 cancel the Reservation, 101 cancellation from Paid state, 102 cancellation from Ticketed state, 103 continuous equivalence classes, 29 data on the boundaries, 44 defect discovery rate, 238 discrete equivalence classes, 29 enroll and drop a course, 113 evaluation of complex conditions, 152 example control flow graph, 154 fifth basis path, 158 flow graph equivalent, 147 four execution paths, 148 fourth basis path, 157 graphical representation, 148 IEEE 829 test documentation, 188 interesting flow diagram, 153 multiple selection equivalence classes, 30 on, off, in, and out points, 119 path terminates, 99 PayTimer expires, 100 Reservation in Made state, 95 Reservation in Paid state, 95 Reservation in Ticketed state, 97 Reservation in Used state, 98 second basis path, 156 seventh basis path, 159 single selection equivalence classes, 30 sixth basis path, 158 Stateless University Admissions Matrix, 121 Stateless University use cases, 129 SURS maintenance screen, 56 test cases that trigger all events, 106 test cases that visit each state, 106 third basis path, 157 two dimensional boundary defects, 117 Waterfall model, 187 fire, 51 firing, 51 Florida Institute of Technology, 135 flowcharts, 128 football, 212–213 Free Application for Federal Student Aid, 59 freestyle exploratory testing, 207 functionality testing, 11 Index G Gerrard, Paul, xvi Grace L. 
Grade Point Average, 121
Graham, Dorothy, xvi
gray box testing, 8
guard, 102

Index H
Hagar, Jon, xvi
Hetzel, Bill, 236
high school grades, 121
HIPO diagrams, 128
Holodeck, 135, 149
How To Break Software, 230
How to Get Control of Your Time and Your Life, 205

Index I
IEEE 829 advantages
    completeness checklist, 190
    evaluation of test practices, 190
    facilitate communication, 190
    increased manageability, 190
IEEE 829 standard, 189–190
    bug report, 195
    release notes, 194
    test case specification, 188, 193
    test design specification, 188, 192
    test incident report, 188, 195
    test item transmittal report, 188, 194
    test log, 188, 194
    test plan, 188, 190
    test procedure specification, 188, 193
    test summary report, 188, 196
IEEE 829 test case specification
    environmental needs, 193
    input specifications, 193
    intercase dependencies, 193
    output specifications, 193
    special procedural requirements, 193
    test case specification identifier, 193
    test items, 193
IEEE 829 test design specification
    approach refinements, 192
    feature pass/fail criteria, 193
    features to be tested, 192
    test design specification identifier, 192
    test identification, 192
IEEE 829 test incident report
    impact, 195
    incident description, 195
    summary, 195
    test incident report identifier, 195
IEEE 829 test item transmittal report
    approvals, 194
    location, 194
    status, 194
    transmittal report identifier, 194
    transmitted items, 194
IEEE 829 test log
    activity and event entries, 195
    description, 195
    test log identifier, 195
IEEE 829 test plan
    approach, 191
    approvals, 192
    environmental needs, 192
    features not to be tested, 191
    features to be tested, 191
    introduction, 191
    item pass/fail criteria, 191
    responsibilities, 192
    risks and contingencies, 192
    schedule, 192
    staffing and training needs, 192
    suspension and resumption criteria, 191
    test deliverables, 191
    test items, 191
    test plan identifier, 191
    testing tasks, 191
IEEE 829 test procedure specification
    procedure steps, 194
    purpose, 194
    special requirements, 194
    test procedure specification identifier, 194
IEEE 829 test summary report
    approvals, 196
    comprehensive assessment, 196
    evaluation, 196
    summary, 196
    summary of activities, 196
    summary of results, 196
    test summary report identifier, 196
IEEE Standard 610.12–1990, 2
IEEE Standard for Software Test Documentation, 188–196
IEEE Standard Glossary of Software Engineering Terminology, 2
IEEE Std 829–1998, 188–196
in point, 118, 120
inspection, 4, 41, 134, 164, 171, 176, 230
integration, 140
integration testing, 9
interactions between variables, 116
ISO 9126 Standard Software Product Evaluation—Quality Characteristics and Guidelines, 225

Index J
Jacobsen, Ivar, 128
Jagger, Mick, 83
Jaskiel, Stefan, xiii, 2
junction point, 146

Index K
Kaner, Cem, 202
Kaner, Falk, and Nguyen's taxonomy, 228
key points
    adaptive planning, 213
    advantage of state-transition table, 105
    bank combinations, 62
    black box testing helps efficiency and effectiveness, 21
    boundaries are where defects hide, 40
    choosing combinations, 65
    comparing what is with what ought to be, 2
    control flow graphs, 145
    create test cases, 53
    cultivate the skill of choosing poorly, 63
    data czar, 135
    data flow testing, 168
    domain analysis, 116
    double integral sign, xiv
    equivalence classes, 26
    evaluate the risk, 134
    executing C test cases, 155
    exploratory charter, 206
    exploratory testing, 202
    express your appreciation, 33
    Holodeck, 149
    importance of test design, xiii
    locate orthogonal arrays, 67
    Methuselah, 27
    most important test, 205
    object oriented combinations, 62
    path definition, 144
    post installation test planning, 63
    random selection, 64
    rarely will we have time, 31
    rdExpert tool, 68
    scripted and exploratory testing, 217
    scripted and exploratory paradigms, 182
    taxonomy, 222
    test cases are inputs, outputs, and order, 6
    test cases at each boundary, 42
    test cases must be designed, 5
    testing every transition, 108
    testing levels - unit, integration, system, acceptance, 9
    time-sequenced pairs, 170
    trigger all transitions, 108
    Twenty Questions, 203
    use GUI design tools, 34
    warm, fuzzy feelings, 29
    we can never be sure of coverage, 21
    web combinations, 62
    who gets the blame, 28
    your taxonomy, 232
kiddie pool, 198
kill-define, 170
killed variables, 169
kill-kill, 170
kill-use, 170

Index L
L18(2¹3⁷) orthogonal array, 69
L18(3⁵) orthogonal array, 68
L4(2³) orthogonal array, 66
L64(8²4³) orthogonal array, 73
L9(3⁴) orthogonal array, 67
Lakein, Alan, 205
large number of combinations, 63
Lessons Learned in Software Testing, 85, 241
levels of competency, 246
life cycle, 169
limiting loop execution, 153
low memory, 134

Index M
McCabe, Tom, 154
Mealy, G.H., 97
Meilof, Anne, xvi
Myers, Glenford, 3, 21, 237
Middleton, Wayne, xvi
missing boundary, 116–117
missing paths, 144
Moore, E.F., 97
multiple condition coverage, 152
multiple basis path sets, 159

Index N
nested state-transition diagrams, 97
nodes, 154–155
nonexistent paths, 141
non-specific fault model, 222

Index O
O'Loughlin, Martin, xvii
Object-Oriented Software Engineering: A Use Case Driven Approach, 128
off points, 118, 120
on points, 118, 120
open boundary, 118
oracles
    existing programs, 7
    kiddie, 6
    purchased test suites, 7
    regression test suites, 7
    validated data, 7
order of execution, 6, 7
orthogonal arrays, 66–70
    balanced, 87
    definition, 68
    L18(2¹3⁷), 69
    L18(3⁵), 68
    L4(2³), 66
    L64(8²4³), 72
    L9(3⁴), 67
    mapping onto, 74–76
    comparison with Allpairs algorithm, 88–89
    dealing with extra columns, 83
    dealing with extra values in a row, 83
    notation, 67
    unassigned cells, 82
    using, 70–85
out point, 118

Index P
pairwise testing, 64–65, 197
    additional tests, 89
    all pairs testing, 64
    Allpairs algorithm, 66, 85, 88–89
    constraints, 88
    documented studies, 64–65
    effectiveness, 64–65, 89
paradigms, 182
    cloud our vision, 182
    sharpen our vision, 182
Paradigms: The Business of Discovering the Future, 182
partitioned state-transition diagram, 97
path coverage, 153
path creation, 155
path testing, 140
performance testing, 11
Perry, Dale, xvi
Phadke, Madhav S., 66, 72
planning, 183, 212, 216
    adaptive, 213
    classical, 213
Planning, 215, 216
planning functions, 215
planning heuristic, 214
planning pitfalls, 215
post-conditions, 27–28
pre-conditions, 27–28
Pressman, Roger, 5
process block, 146
processing flow, 171
project level taxonomies, 223
purchased test suites, 7

Index Q
quality assurance engineers, xv
Quality Engineering Using Robust Design, 66, 72
Quentin, Geoff, xvi

Index R
RAD, 187
rapid application development, 187
Rapps, Sandra, 168
regression test suites, 7
reviews, 4
Rice, Dr. Scott, xvii
risk, 3, 134
Rose-Coutré, Robert, xvi
Royce, Winston W., 186

Index S
scenario, 128
scripted testing, 182, 186, 208, 212
    auditability, 187
    objectivity, 187
    repeatability, 187
security testing, 11
shifted boundary, 116–117
single-mode defects, 65
Sloane, Neil J.A., 72
Snook, Sid, xvi
software defect taxonomies, 223–232
    Beizer's taxonomy, 226
    Binder's object-oriented taxonomy, 229
    Kaner, Falk, and Nguyen's taxonomy, 228
    Vijayaraghavan's eCommerce taxonomy, 231
    Whittaker's How To Break Software taxonomy, 230
software developers, xv
Software Engineering Institute, 223
software inspection, 4, 41, 134, 164, 171, 176, 230
Software Inspection, 41
Software Quality Engineering, 16
Software System Testing and Quality Assurance, 198
software test engineers, xv
Software Testing Techniques, 226
specific fault model, 222
state, 51, 95–96
Stateless University, 59, 94, 121, 124
    admissions, 121
Stateless University Registration System, 17, 36, 47, 56, 71, 91, 94, 111, 131, 136, 260
statement coverage, 147
state-transition diagrams, 94–95, 128
    action, 95, 97
    all events coverage, 106
    all paths coverage, 107
    all states coverage, 105
    all transitions coverage, 108
    black dot symbol, 95, 97
    bulls-eye symbol, 97
    conditional, 102
    creating test cases, 105–110
    event, 94–96
    guard, 102
    mixing different entities, 97
    nested state-transition diagrams, 97
    partitioned state-transition diagrams, 97
    state, 95–96
    transition, 95–96
state-transition tables
    action, 104
    advantage, 105
    creating test cases, 105–110
    current state, 104
    disadvantage, 105
    event, 104
    next state, 104
static data flow testing, 171
StickyMinds.com, 202, 212
stopping criteria, 236–240
    defect discovery rate, 238
    marginal cost, 239
    met coverage goals, 237
    team consensus, 239
    the boss says "Ship it!", 240
structured testing, 154–159
system, 140
system state, 51
system testing, 10, 135
system transactions, 135
Systematic Software Testing, xiii, 2

Index T
tables
    Allpairs program input, 86
    Allpairs program output, 86
    Beizer's Bug taxonomy, 226
    Binder's Class Scope taxonomy, 230
    Binder's Method Scope taxonomy, 229
    classical vs. adaptive planning, 214
    classical vs. exploratory planning, 214
    collapsed decision table, 55
    decision table, 51–52, 54
    Domain Analysis test cases, 122
    Domain Test Matrix, 120
    example use case, 132
    further collapsed decision table, 55
    generic decision table, 50
    invalid data values, 32
    ISO 9126 Quality Characteristics, 225
    Kaner's taxonomy, 228
    L18(2¹3⁷) orthogonal array, 70
    L18(3⁵) orthogonal array, 69
    L4(2³) orthogonal array, 66
    L64(8²4³) orthogonal array, 73, 75, 77, 79, 81, 84
    L9(3⁴) orthogonal array, 67
    sample test cases, 53
    SEI taxonomy, 223
    sensitizing control flow paths, 164
    set of test cases, 44
    Stateless University Admissions Matrix, 121
    State-Transition table, 104
    SURS decision table, 57
    test case table, 54
    testing all valid transitions, 109
    use case template, 130
    valid data values, 31
    varying valid and invalid values, 32, 34
    Whittaker's Fault taxonomy, 230
taxonomies, 222–223
    creating your own, 232
    project level, 223–226
    software defect, 226–232
taxonomy-based risk identification, 223
template for use cases, 130
test analysis, 2
test case components
    inputs, 6–7
    order of execution, 7–8
    outputs, 6–7
test case creation, 134
test case definition, 5
test case design, 2
test case design styles
    cascading, 7
    independent, 8
test case specification, See IEEE 829 test case specification
test case subsets, 22
test coverage, 147–153
test design, See test case design
test design specification, See IEEE 829 test design specification
test incident report, See IEEE 829 test incident report
test item transmittal report, See IEEE 829 test item transmittal report
test log, See IEEE 829 test log
test managers, xv
test multiple variables simultaneously, 116
test oracles, 6–7
test plan, See IEEE 829 test plan
test planning, 2, 212–217
test procedure specification, See IEEE 829 test procedure specification
test suites
    purchased, 7
    regression, 7
test summary report, See IEEE 829 test summary report
tester skills
    careful observers, 207
    careful reporters, 207
    critical thinkers, 207
    evaluate risk, 207
    good modelers, 207
    not distracted, 207
    self managed, 207
    test designers, 207
testing maturity levels, 2–4
black box, 8, 20–21
exhaustive, 11, 21
gray box, 8
white box, 8, 20
testing challenges, 4
Testing Computer Software, 202, 228
testing interacting variables, 116
testing levels
    acceptance testing, 10
    integration testing, 9
    system testing, 10
    unit testing, 9
testing maturity, 2–4
testing n simultaneous dimensions, 116
Testing Object-Oriented Systems, 222
testing techniques
    Allpairs algorithm, 85
    all pairs testing, 64
    boundary value testing, 40
    data flow testing, 168
    decision tables, 50
    domain analysis, 116
    equivalence class testing, 24
    pairwise testing, 64
testing toolbox, 246
testing, definition
    a concurrent lifecycle process, 2
    comparison, 2
testing-by-contract, 27
The Complete Guide To Software Testing, 236
tilted boundary, 116–117
timebox, 206
tools
    AETG, 88
    Allpairs, 85
    rdExpert, 68
transaction testing, 135
transactions, 128
transitions, 95–96
Twenty Questions, 203
types of testing
    defensive-testing, 28
    testing-by-contract, 27

Index U
UML Distilled: A Brief Guide To The Standard Object Modeling Language, 97
Unified Modeling Language, 129
unit, 140
unit testing, 9, 140
United States Marine Corps, 215
usability testing, 11
use cases, 128–129
    actor, 128
    example, 131–133
    functional requirements, 128–129
    scenario, 128
    template, 130–131
    value, 129
use case template
    actors, 130–131
    channels to primary actor, 131
    channels to secondary actors, 131
    completeness level, 131
    date due, 131
    extensions, 131
    failed end conditions, 130
    frequency, 131
    goal in context, 130
    level, 130
    main success scenario, 130
    name, 130
    number or identifier, 130
    open issues, 131
    preconditions, 130
    primary actor, 130
    priority, 131
    response time, 131
    scope, 130
    secondary actors, 131
    sub-variations, 131
    success end condition, 130
    trigger, 130
used variables, 169
use-define, 170
use-kill, 170
use-use, 170

Index V
variables in computation, 169
variables in conditionals, 169
validation, 186
verification, 186
Vijayaraghavan's eCommerce taxonomy, 231
von Moltke, Helmuth, 215

Index W
waterfall development model, 186–187
waterfall development model phases
    coding, 186
    operations, 186
    program design, 186
    requirements analysis, 186
    software requirements, 186
    system requirements, 186
waterfall development model testing, 186
Web testing levels, 11
    code quality, 11
    functionality, 11
    performance, 11
    security, 11
    usability, 11
Weinberg, Gerald, 236
Weyuker, Elaine, 168
white box testing, 8, 20, 140–142
    existing paths, 141
    nonexistent paths, 141
white box testing techniques
    control flow testing, 144–159
    data flow testing, 168–176
Whittaker, James, 135, 230
Whittaker's How To Break Software taxonomy, 230
Writing Effective Use Cases, 130

Index Y
y=mx+b, 121

List of Figures

Chapter 3: Equivalence Class Testing
Figure 3-1: Continuous equivalence classes
Figure 3-2: Discrete equivalence classes
Figure 3-3: Single selection equivalence classes
Figure 3-4: Multiple selection equivalence class

Chapter 4: Boundary Value Testing
Figure 4-1: Boundary values for a continuous range of inputs.
Figure 4-2: Boundary values for a discrete range of inputs.
Figure 4-3: Data points on the boundaries and data points just outside the boundaries.

Chapter 5: Decision Table Testing
Figure 5-1: SURS Student Database Maintenance Screen.

Chapter 6: Pairwise Testing
Figure 6-1: Orthogonal array notation

Chapter 7: State-Transition Testing
Figure 7-1: The Reservation is Made.
Figure 7-2: The Reservation transitions to the Paid state.
Figure 7-3: The Reservation transitions to the Ticketed state.
Figure 7-4: The Reservation transitions to the Used state.
Figure 7-5: The path ends.
Figure 7-6: The PayTimer expires and the Reservation is cancelled for nonpayment.
Figure 7-7: Cancel the Reservation from the Made state.
Figure 7-8: Cancellation from the Paid state.
Figure 7-9: Cancellation from the Ticketed state.
Figure 7-10: A set of test cases that "visit" each state.
Figure 7-11: A set of test cases that trigger all events at least once.
Figure 7-12: A set of test cases that trigger all transitions at least once.
Figure 7-13: State-transition diagram for enroll and drop a course at Stateless U.

Chapter 8: Domain Analysis Testing
Figure 8-1: Two dimensional boundary defects.
Figure 8-2: Examples of on, off, in, and out points for both closed and open boundaries.
Figure 8-3: Stateless University Admissions Matrix in graphical form.

Chapter 9: Use Case Testing
Figure 9-1: Some Stateless University use cases.

Chapter 10: Control Flow Testing
Figure 10-1: Flow graph equivalent of program code.
Figure 10-2: Graphical representation of the two-line code snippet.
Figure 10-3: Four execution paths.
Figure 10-4: Two test cases that yield 100% decision coverage.
Figure 10-5: Compiler evaluation of complex conditions.
Figure 10-6: An interesting flow diagram with many, many paths.
Figure 10-7: An example control flow graph.
Figure 10-8: The chosen baseline basis path ABDEGKMQS
Figure 10-9: The second basis path ACDEGKMQS
Figure 10-10: The third basis path ABDFILORS
Figure 10-11: The fourth basis path ABDEHKMQS
Figure 10-12: The fifth basis path ABDEGKNQS
Figure 10-13: The sixth basis path ACDFJLORS
Figure 10-14: The seventh basis path ACDFILPRS
Figure 10-15: Java code for Brown & Donaldson's evaluateBuySell module.
Figure 10-16: Control flow graph for Brown & Donaldson's evaluateBuySell module.
Figure 10-17: Control flow graph for Brown & Donaldson's evaluateBuySell module.

Chapter 11: Data Flow Testing
Figure 11-1: The control flow diagram annotated with define-use-kill information for each of the module's variables.
Figure 11-2: The control flow diagram annotated with define-use-kill information for the x variable.
Figure 11-3: The control flow diagram annotated with define-use-kill information for the y variable.
Figure 11-4: The control flow diagram annotated with define-use-kill information for the z variable.

Chapter 12: Scripted Testing
Figure 12-1: The Waterfall life cycle model.
Figure 12-2: The IEEE 829 Test Documents

Chapter 16: When to Stop Testing
Figure 16-1: Defect Discovery Rate

List of Tables

Chapter 3: Equivalence Class Testing
Table 3-1: A test case of valid data values.
Table 3-2: A test case of all invalid data values. This is not a good approach.
Table 3-3: A set of test cases varying invalid values one by one.
Table 3-4: A set of test cases varying invalid values one by one but also varying the valid values.
Table 3-5: A set of test cases varying invalid values one by one.

Chapter 4: Boundary Value Testing
Table 4-1: A set of test cases containing combinations of valid (on the boundary) values and invalid (off the boundary) points.

Chapter 5: Decision Table Testing
Table 5-1: The general form of a decision table.
Table 5-2: A decision table with two binary conditions.
Table 5-3: Adding a single action to a decision table.
Table 5-4: A decision table with multiple actions.
Table 5-5: A decision table with non-binary conditions.
Table 5-6: Sample test cases.
Table 5-7: A decision table converted to a test case table.
Table 5-8: A decision table for the Brown & Donaldson Buy order.
Table 5-9: A collapsed decision table reflecting "Don't Care" conditions.
Table 5-10: A further collapsed decision table reflecting "Don't Care" conditions.
Table 5-11: A decision table for Stateless University Registration System.

Chapter 6: Pairwise Testing
Table 6-1: L4(2³) Orthogonal Array
Table 6-2: L9(3⁴) Orthogonal Array
Table 6-3: L18(3⁵) Orthogonal Array
Table 6-4: L18(2¹3⁷) Orthogonal Array
Table 6-5: L64(8²4³) Orthogonal Array
Table 6-6: L64(8²4³) with a partial mapping of its first column.
Table 6-7: L64(8²4³) with a full mapping of its first column.
Table 6-8: L64(8²4³) with a full mapping of its first and second columns.
Table 6-9: L64(8²4³) with a full mapping of all its columns.
Table 6-10: L64(8²4³) with a full mapping of all its columns including the "extra" cells.
Table 6-11: Input to the Allpairs program.
Table 6-12: Output from the Allpairs program.

Chapter 7: State-Transition Testing
Table 7-1: State-Transition table for Reservation.
Table 7-2: Testing all valid transitions from a State-transition table.

Chapter 8: Domain Analysis Testing
Table 8-1: Example Domain Test Matrix.
Table 8-2: Stateless University Admissions Matrix.
Table 8-3: 1x1 Domain Analysis test cases for Stateless University admissions.

Chapter 9: Use Case Testing
Table 9-1: Use case template.
Table 9-2: Example use case.

Chapter 10: Control Flow Testing
Table 10-1: Data values to sensitize the different control flow paths.

Chapter 14: Test Planning
Table 14-1: Classical planning vs. Adaptive planning.
Table 14-2: Classical test planning vs. Exploratory test planning.

Chapter 15: Defect Taxonomies
Table 15-1: The SEI Taxonomy-Based Risk Identification taxonomy.
Table 15-2: The ISO 9126 Quality Characteristics taxonomy.
Table 15-3: A portion of Beizer's Bug Taxonomy.
Table 15-4: A portion of the defect taxonomy from Testing Computer Software.
Table 15-5: A portion of Binder's Method Scope Fault Taxonomy.
Table 15-6: A portion of Binder's Class Scope Fault Taxonomy.
Table 15-7: A portion of Whittaker's Fault Taxonomy.

List of Examples

Chapter 3: Equivalence Class Testing
Example 1
Example 2
Example 3
Example 4

Chapter 4: Boundary Value Testing
Example 1
Example 2

Chapter 5: Decision Table Testing
Example 1
Example 2