1971 11_#39 11 #39

1971-11_#39 1971-11_%2339

User Manual: 1971-11_#39

Open the PDF directly: View PDF PDF.
Page Count: 697

Download1971-11_#39 1971-11 #39
Open PDF In BrowserView PDF
AFIPS
CONFERENCE
PROCEEDINGS
VOLUME 39

1971
FALL JOINT
COMPUTER
CONFERENCE
November 16- 18. 1971
Las Vegas. Nevada

The ideas and opinions expressed herein are solely those of the authors and are not
necessarily representative of or endorsed by the 1971 Fall Joint Computer Conference Committee or the American Federation of Information Processing Societies,
Inc.

Library of Congress Catalog Card Number 55-44701
AFIPS PRESS
210 Summit Avenue
Montvale, New Jersey 07645

@1971 by the American Federation of Information Processing Societies, Inc.,
Montvale,New Jersey 07645. All rights reserved. This book, or parts thereof, may
not be reproduced in any form without permission of the publisher.

Printed in the United States of America

CONTENTS
DATA COMMUNICATIONS
A universal cyclic division circuit .................................... .

1

Cyclic redundancy checking by program .............................. .

9

A.
R.
P.
R.

W. Maholick
B. Freeman
E. Boudreau
F. Steen

APPLICATIONS OF COMPUTERS IN EMERGING NATIONS
Development of computer applications in emerging nations ............. .
Notions about installing and maintaining a population register in Brazil .. .

17

27

A. B. Kamman
A. L. Mesquita

OPERATING SYSTEMS-SOME MODELS AND MEASURES
The neutron monitor system ........................................ .

31

R. Aschenbrenner
L. Amiot
N. K. N atarajan

A simple thruput and response model of EXEC 8 under swapping
saturation ...................................................... .
Throughput measurement using a synthetic job stream .............. '" ..

39
51

A feedback queueing model for an interactive computer system ......... .

57

J. C. Strauss
D. C. Wood
E. H. Forman
G. Nakamura

TERMINALS
Alcoa picturephone remote information system (APRIS) ............... .

65

M. L. Coleman
K. W. Hinkelman
W. J. Kolechta

Computer support for an experimental PICTUREPHONE® computer
system at Bell Telephone Laboratories Incorporated ................. .
Proposed braille computer terminal offers expanded world to the blind ... .

71
79

E. J. Rodriguez
N. C. Loeber

89

B. L. Bateman
D. D. Drew
P. B. Crawford

97

W. E. Langlois
L. W. Ross
D. J. Olsen

SIMULATION OF ENVIRONMENTAL DYNAMICS
Numerical simulation of subsurface environment ......... '............. .

Digital simulation of the general atmospheric circulation using a very dense
grid ............................................................ .
Simulation of the dynamics of air and water pollution .................. .
Programming the war against water pollution ......................... .
Application of a large scale nonlinear programming problem to pollution
control ......................................................... .

105
115
123

G. Graves
D. Pingry
A. Whinston

IMAGES AND PATTERNS

135
145
153

A. J. Frank
L. D. Menninga
D. S. Prerau

163
171
177

C. K. Tang
T. S. Jen
K. J. Thurber
R. O. Berg

195
195
196

B. G. Lamson
C. T. Post, Jr.
E. E. Van Brundt

197

J. L. Bennett

.
.
.
.
.

199
199
200
201
201

C. Levinthal
N. E. Morton
R. Nathan
W. F. Raub
W. S. Yamamato

Introduction to training simulator programming ....................... .
The handling qualities simulation program for the augmentor wing jet
STOL research aircraft ........................................... .
Software validation·of the Titan IIlC digital flight control system utilizing
a hybrid computer ............................................... .

203

D. G. O'Connor

213

W. B. Cleveland

225

Multivariable function generation for simulations ...................... .

233

R. S. Jackson
S. A. Bravdica
P. Chew
J. E. Sanford
E. Z. Asman

243
253
263

J. E. Sammet
B. Wegbreit
D. D. Chamberlin

271

D. J. Mishelevich

Parametric font and image definition and generation ................... .
A syntax-directed approach to pattern recognition and description ....... .
Computer pattern recognition of printed music ........................ .
LARGE SCALE INTEGRATION (LSI)
A storage cell reduction technique for ROS design ..................... .
A new approach to implementing high-density shift registers ............ .
Universal logic modules implemented using LSI memory techniques ..... .

COMPUTERS IN MEDICINE-PROBLEMS AND PERSPECTIVES
(PANEL SESSION)
Position paper .................................................... .
Position paper .................................................... .
Position paper .................................................... .
THE USER INTERFACE FOR INTERACTIVE SEARCH
(PANEL SESSION)
Chairman's Note .... " ............................................ .
STATE OF THE COMPUTER ART IN BIOLOGY (PANEL SESSION)
Position
Position
Position
Position
Position

paper ....................................................
paper ....................................................
paper ...................................: .................
paper ....................................................
paper ....................................................

SIMULATION OF AEROSPACE SYSTEMS

r

PROGRAMMING LANGUAGES AND LANGUAGE PROCESSORS
Problems in, and a pragmatic approach to, programming language measurement ........................................................... .
The ECL programming system ...................................... .
The "single-assignment" approach to parallel processing ................ .
MEANINGEX-A computer-based semantic parse approach to the analysis
of meaning ................................................ , .... .

APPLICATION OF COMPUTERS TO LAW ENFORCEMENT AND
CRIMINAL JUSTICE
Law enforcement communications and inquiry systems ................. .
The Long Beach public safety information subsystem .................. .

281
295

State criminal justice information system ............................. .
Automated court system ........................................... .

303
309

J. D. Hodges, Jr.
G. Medak
P. Whisenand
G. Gack
R. Gallati
R. Baca
M. Chambers
W. Pringle
S. Boehm

EXPERIMENTS IN ON-LINE DELPHI RESEARCH

317
327
337

M. Turoff
T. B. Sheridan
S. Umpleby

INSIGHT-An interactive graphic instructional aid for systems analysis ..

351

M. J. Merritt
R. Sinclair

An interactive class-oriented dynamic graphic display system using a hybrid
computer ..................................................... .
Hybrid terminal system for simulation in science education ............. .
BIOMOD-An interactive computer graphics system for modeling ...... .

357
361
369

The future of on-line continuous-system simulation .................... .

379

A. Frank
D. C. Martin
G. F. Groner
R. L. Clark
R. A. Bermam
E. C. DeLand
H. M. Aus
G. A. Korn

Delphi and its potential impact on information systems ................ .
Technology for group dialogue and social choice ....................... .
Structuring information for a computer-based communications medium ... .
INTERACTIVE CONTINUOUS-SYSTEM SIMULATION IN
RESEARCH AND EDUCATION

COMPUTER STRUCTURES-PAST PRESENT AND FUTURE
(PANEL SESSION)
Position paper .................................................... .

387

Position paper .................................................... .
Position paper .................................................... .
Position paper. . .................................................. .

395
395
395

C. G. Bell
A. Newell
F. P. Brooks, Jr.
D. B. G. Edwards
A. Kay

COMPUTERS IN SPORTS (PANEL SESSION)
Position
Position
Position
Position
Position
Position

paper .................................................... .
paper. . .................................................. .
paper .................................................... .
paper .................................................... .
paper .................................................... .
paper ............................... ' ..................... .

TWENTY YEARS IN PASSING (PANEL SESSION)
(No papers in this volume)

397
397
398
399
399
400

G. Brandt
L. Eppele
A. Lalchandani
K. Mitchell
K. G. Purdy
F. B. Ryan

DATA SECURITY IN DATA BASE SYSTEMS
Multi-dimensional security programs for a generalized information retrieval
system ......................................................... .

for statistical purposes ........................................... .
The formulary model for flexible privacy and access controls ............ .
Insuring confidentiality of individual records in data storage and retrieval

571

579
587

J. M. Carroll
R. Martin
L. McHardy
H. Moravec
M. H. Hansen
L. J. Hoffman

THE APPLICATION OF COMPUTERS TO URBAN PLANNING
AND DEVELOPMENT
Integrated municipal information systems: Benefits for cities--Requirements for vendors ............................................... .
Geocoding techniques developed by the census use study ............... .

603
609

Urban COGO-A geographic-based land information system ................ .

619

S. E. Gottlieb
C. C. Smith
lV1. S. White, Jr.
B. Schumaker

SELECTED PAPERS IN DISCRETE SIMULATION
Understanding urban dynamics. . . . ................................. .
Bankmod-An interactive decision aid for banks ...................... .

631
639

G. O. Barney
W. P. Hoenhenwarter
K. E. Reich

Simulation of large asynchronous logic circuits using an ambiguous gate
model .......................................................... .
Adaptive memory trackers .......................................... .

651
663

S. G. Chappell
G. Epstein

669
670
671
671
672

B.
N.
B.
N.
E.

675
675
676
677
677

P. Kamnitzer
N. F. Kristy
J. McLeod
E. W. Paxson
R. Weinberg

PLANNING COJVIMUNITY INFORMATION UTILITIES
(PANEL SESSION)
Position
Position
Position
Position
Position

paper .................................................... .
paper .................................................... .
paper .................................................... .
paper .................................................... .
paper .................................................... .

W. Boehm
D. Cohen
Nanus
R. Nielsen
B. Parker

COMPUTERS AND THE PROBLEMS OF SOCIETY
(PANEL SESSION)
Position paper ..................................................... .
Position paper .................................................... .
Position paper .................................................... .
Position paper .................................................... .
Position paper .................................................. .

NUMERICAL METHODS
On the hybrid computer solution of partial differential equations with two
spatial dimensions ............................................. .

401

Numerical solution of partial differential equations by associative processing
Consistency tests for elementary functions ............................ .

411
419

G. A. Bekey
M. T. Ung
P. A. Gilmore
A. C. R. Newbery

LABORATORY AUTOMATION
Laboratory automation at General Electric corporate research and development..........................................................

423

1\1ulticomputer processing in laboratory automation. . . . . . . . . . . . . . . . . . . .

435

Enhancement of chemical measurement techniques by real-time computer.
interaction. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
The television/computer system-Acquisition and processing of cardiac
catherization data u~ing a small computer. . . . . . . . . . . . . . . . . . . . . . .. . . .

Cost benefits analysis in the design and evaluation of information systems. .
Factors to be considered in computerizing a clinical chemistry department
of a large city hospital. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .

P. R. Kennicott
V. P. Scavullo
J. S. Sicko
E. F. Lifshin
C. E. Klopfenstein
C. L. Wilkins

441

S. P. Perone

455

H. J. Covvey
A. G. Adelman
C. H. Felderhof
P. Mendler
E. D. Wigle
K. W. Taylor
I. Learnman

469
477

R. L. Morey
M. C. Adams
E. Laga

491

J. C. Pendleton

501
515

I. Hirschsohn
A. C. Patterson IV

523
533

W. J. Hansen
W. D. Elliott
A. Van Dam
W. A. Potas

Planning computer services for a complex environment ................. .
A high performance computing system for time critical applications ...... .

541
549

Effective corporate networking, organization, and standardization ....... .

561

J. E. Austin
T. J. Gracon
R. A. Nolby
F. J. Sansom
P. L. Peck

DATA BASE SYSTEMS DESIGN
Integrated information system ...................................... .
A machine independent FORTRAN data management software system for
scientific and engineering applications .............................. .
Requirements for a generalized data base management system .......... .
INTERACTIVE TEXT EDITING SYSTEMS
User engineering principles for interactive systems ..................... .
Computer assisted tracing of text evolution ........................... .

PLANNING AND DESIGNING OF HIGH PERFORMANCE SYSTEMS

A universal cyclic division circuit
by ANDREW W. MAHOLICK and RICHARD B. FREEMAN
IBM Corporation
Research Triangle Park, North Carolina

communication line multiplexers the logic is sometimes
shared, but it is limited to those communication lines
using the same CRC checking polynomial.
We shall describe a generalized method for updating
cyclic redundancy checking logic at the character level
which is capable of operating upon any data character
size in conjunction with any checking polynomial of a
given length.

INTRODUCTION
Recent innovations in circuit technology have allowed
design alternatives that previously would have been
economically unsound. LSI technology permits the use
of generalized systems containing more logic than the
specialized systems used in the past, implemented in
unit logic, and at even lower cost. Five years ago an
engineer would not have even considered using a cyclic
redundancy checking circuit in the manner described
here.
Cyclic Redundancy Checking (CRC) is a relatively
old technique for use in error detection. W. W. Peterson
and D. T. Brown1 wrote a fundamental paper pointing
out the great potentialities for cyclic codes in error
detection and the requirements for implementing such
error detection systems. The specialized serial case,
i.e., with one input channel and one output channel,
has been extensively studied and is contained in the
Peterson2 text. Many related papers, including pioneering efforts on this subject, are contained in a book
edited by Kautz. 3 Hsiao and Sih,4 Hsiao,5 and Patel6
have concentrated on the generalized case of CRC circuits with parallel multiple-channel inputs and outputs.
The above articles emphasize the use of fixed wiring
patterns to implement the error-detection capabilities
of cyclic redundancy codes. The hardware would require a complete rewiring to change the polynomial for
the cyclic redundancy check. This, in turn, would
mean that the circuitry itself would have a limited
usefulness because only one type of polynomial could
be used within a system at anyone time.
Conventional CRC circuits for a given polynomial
and data character size consist of a serial-by-bit shift
register with EXCLUSIVE OR feedback circuits in
those bit positions which represent a term in the CRC
polynomial. Figure 1 shows an implementation for the
polynomial, X16+X15+X2+ 1. In a digital data communication system, this bit-synchronous scheme must
usually be duplicated for each communication line. In

RECEIVED OR TRANSMITTED CHARACTER BITS
TO BE INCLUDED. IN

eRe ACCUMULATION

2 ),EEDBACK DATA

FIGURE 1 -

SIMPLIFIED SERIAllMPlEMfNTATION OF THE POLYNOMIAL, Xf6 •

xl5 + x2 +

I

Figure 1-Simplified serial implementation of the polynomial

X16 +X15 +X2 +1

We shall describe the device from the point of view
of its application in a digital data communication line
multiplexer where a variety of CRC polynomials might
be employed to service multiple communication lines.
However, it should be noted that this device can be used
by future terminals as well as by the increasing number
of I/O devices (tapes, discs, et al.) which are employing
CRC checking.
THE EVOLUTION FROM SERIAL TO
PARALLEL
In this section we shall trace the evolution of the
polynomial, X 8 +X5+X3+X+l, from its serial-by-bit
implementation as shown in Figure 2 to its parallel-bycharacter implementation as shown in Figure 7. This
will provide the background necessary to understand
1

2

Fall Joint Computer Conference, 1971

INPUT DATA SHIFT REGISTER

Figure 2-8implified serial implementation of the polynomial
X8+X5+X3+X +1

Figure 4-Equivalent implementation #2 for the polynomial
X8+X5+X3+X +1

the generalized method described later that can process
to generate the CRC character or the syndrome.
From Patel,6 we have the required theoretical relationship between the serial-by-bit and parallel-bycharacter cases. A brief section is reproduced for the
convenience of the reader in Appendix A.
Figure 2 shows the conventional serial-by-bit implementation for the polynomial, X8+X5+X3+X+1, to
be used in conjunction with a four-bit data character.
Figure 3 shows an equivalent circuit in which redundant EXCLUSIVE-OR circuits have been added ,
such that there is one at the input of each stage of the
shift register. Those EXCLUSIVE-ORS not required
to implement the polynomial have one input connected
to a "logical 0" voltage level. Thus, the input coming
from a previous shift register stage will pass through as
if the EXCLUSIVE-OR circuit were not there, i.e.,
O+X=X.
Figure 4 shows another functionally equivalent circuit. Some flexibility is obtained by the addition of an
AND circuit, which controls one input of the EXCLUSIVE-OR circuit. One input of each AND circuit
is connected in common to the feedback path. The other
input of each AND circuit may be connected to either a
logical 1 or 0 voltage level according to whether or not
the corresponding term exists in the polynomial.
Figure 5 shows another step in the evolutionary
process. The data is entered parallel-by-character rather

than serial-by-bit. This is accomplished by EXCLUSIVE-ORing the data character with the corresponding
low order bits of the shift register prior to shifting. This
may be done since the contents of the shift register will
be the same after f-bit shifts on a serial-by-bit basis or
if the f-bits are EXCLUSIVE-ORed with the low order
stages of the shift register and then allowing the f-bit
shifts to occur.
Figure 6 shows the next step in the evolutionary
process. Here one row of shift registers has been ~dded
for each data bit. For all rows except the first, the input
of an EXCLUSIVE-OR circuit in cell position Cn,m is
connected to the output of cell Cn - i , m-i, where n is the
row number and m is the column number. It is connected
there rather than to the cell on its immediate left,
Cn,m-i. We shall shift each row from 1 t04 on a mutually
exclusive basis. When the last row has been shifted, the
output of row 4 will be identical to the serial-by-bit
implementation after four bit shifts.
In the final equivalent circuit every shift register
state is deleted such that all that remains is the combinational logic as shown in Figure 7. In this version,
the only time delay that will be encountered is the
propagation delay of the logic elements.
We still need some memory elements, however. We
require an OLD CRC REGISTER, a NEW CHARACTER REGISTER, and a NEW CRC REGISTER.
The NEW CHARACTER REGISTER and the low

Figure 3-Equivalent implementation # 1 for the polynomial
X8+X5+X 3 +X +1

Figure 5-Equivalent implementation # 3 for the polynomial
X8+X5+X3+X +1

f bits in parallel for any arbitrary checking polynomial

Universal Cyclic Division Circuit

3

Figure 6-Equivalent implementation #4 for the polynomial
X8+X5+X3+X +1

order position of the OLD CRC REGISTER are EXCLUSIVE-ORed together. The outputs of the exclusive
or circuits plus the high order positions of the OLD CRC
REGISTER are connected to appropriate positions in
the first row of the array. The updated CRC remainder
will appear at the output of the bottom row and will be
set in the NEW CRC REGISTER. The contents of
the NEW CRC REGISTER can then be transferred
to the OLD CRC REGISTER in preparation for the
next iteration.
A POLYNOMIAL REGISTER set to the required
bit configuration is used in lieu of fixed wiring to select
the polynomial. It offers more than is required for the
single polynomial implemented since it will provide for
any polynomial of the eighth degree.
THE UNIVERSAL CRC REGISTER
For a practical implementation in a communication
multiplexer, the system (Figure 8) uses a memory de-

Figure 7-Equivalent implementation # 5 for the polynomial
X8+X5+X3+X +1

Figure 8-Functional block diagram of universal CRC logic

vice which is addressable via a communication line
scanner. The location accessed in the memory for a
particular line contains unique line control information
including the current CRC value, data character
length, and a binary representation of the polynomial
associated with that line. Subsequent to the receipt of
the transmission line address, the memory will be accessed to obtain specific parameters associated with
that address to set the CODE LENGTH SELECTOR
(6, 7, or 8 bits), POLYNOMIAL, and OLD CRC
registers. The old CRC is the cyclic redundancy check
remainder calculated for the previous data characters
received or transmitted during the current transmission
on the line.
At the same time that the transmission line address
is made available to memory, the new data character
to be serviced from this line is stored in the NEW
CHARACTER register.
When all the parameters associated with the transmission line address have been set, the CODE
LENGTH SELECTOR, POLYNOMIAL, OLD CRC
and NEW CHARACTER registers are gated to the
inputs of an array calculator.
The array calculator is an asynchronous device
which will continually calculate a cyclic redundancy
check upon the data contained within the POLYNOMIAL register, the OLD CRC register, the NEW
CHARACTER register, and the CODE LENGTH
SELECTOR register. The output of the array calculator, after a sufficient amount of propagation delay
time within the array calculator, is the new CRC value
and it is stored in the NEW CRC register. The new
CRC contained in the NEW CRC register is then
stored in memory at the same location as the old CRC

4

Fall Joint Computer Conference, 1971

was previously stored. On the next iteration, this data
will be the old CRC remainder.
CRC calculation continues in a multiplexed fashion.
Each communication adapter invokes the CRC parameters associated with it by presenting a unique memory
address to access the memory. This insures that the
proper old CRC, code length, and polynomial are combined with the new data character to generate the new
cyclic redundancy check remainder.
An alternate approach will be to transmit the cyclic
redundancy check following the stop character and
allow the cyclic redundancy check and the stop character to pass through the universal cyclic redundancy
check generator. The result of this operation would be
a data word as an output from the array calculator
which represents the syndrome. A non-zero syndrome
indicates an error in the received data. However, if the
syndrome is zero we know only that the received bit
stream is one of the allowable set of transmitted bit
streams. It may not be the actual transmitted bit
stream. That is, an undetectable error may have occurred.
OPERATIONAL CHARACTERISTICS
Figure 9 is a detailed presentation of the input logic
of the rectangular array calculator which provides for
6-, 7-, or 8-bit data characters and polynomials of order
16 or less. The POLYNOMIAL and OLD CRC registers
are 16-bit registers labeled from 0 to 15 and the NEW
CHARACTER register is an 8-bit register labeled 0 to
7. Only the first two rows of the rectangular array are
shown in Figure 9.
Each of the registers is always filled from data busses
entering these registers such that the right-most binarybit positions in each of these registers represent data
corresponding to the particular polynomial terms, the

Figure 9-Input logic for the rectangular array

old CRC value, and the new data character which is
required to update the CRC value. In cases where this
data does not fill the entire register, the higher order or
left-most bit positions are forced to a binary 0 condition
as is shown above each of the registers in Figure 9.
While the logic for the array shown in Figure 9 looks
very extensive, it should be noted that this is deceptive,
since the logic has intentionally been designed as an
iterative structure to make it attractive for large scale
integration (LSI). The array could be packaged on one
chip, making the large amount of logic involved of
little consequence.
The logic shown in Figure 9 performs a relatively
complicated mathematical function upon the various
register inputs to the array, initially, a modulo-two
addition (half summing) occurs between the old CRC
and new data character. The result of that addition is
then applied to the array calculator. The array calculator operates in a manner so as to duplicate mathematically the results which might be obtained by serial
feedback approaches to CRC generation as previously
shown.
The circuitry within the array has its various analogies to serial feedback shift register implementation.
For example, the vertical lines such as line 1 in Figure 9
represents a single feedback point in an analogous
serial feedback approach to CRC generation (Line 1 in
Figure 1). The vertical line represents the presence
("I") or absence ("0") of a term in the chosen cyclic
check polynomial. For instance, the polynomial
X16+X15+X2+1 used in Binary Synchronous Communication would be represented with "I's" in positions 15,
2, and 0 (1 =Xo) of the POLYNOMIAL register. In
effect, there is always a high order term (16 in this
case) which necessitates the initial modulo-two
addition.
To determine the right-justified positions in the
POL YNOMIAL register for polynomials of degree less
than 16, it is necessary to multiply the polynomial by
x raised to the power (16-(degree of polynomial)). For
instance, the polynomial X6+X5+ 1 would be implemented as though it were X(16-6) (X6+X5+ 1) =
X16+X15+XlO and "ones" would be placed in positions 10
and 15 with position 16 implied. LRC for 8 b t codes
(x 8 +1) would te implemented as X (16-8) (x 8 +1) =
X 16 +X8 with a "one" in position 8.
The horizontal line or intermediate feedback line,
such as line 2 of Figure 9, represents for each bit shift
the state of the feedback network in the serial feedback
approach to ORC generation (Line 2 of Figure 1). The
horizontal line is always the output of the right-most
position in the row above in the rectangular array. The
concurrence of a feedback path and the proper data
bit in the feedback path would cause a change in the

Universal Cyclic Division Circuit

data within the serial shifting network. A similar
changing of data occurs in the transmission between one
cell element and another if the data on the intermediate
feedback line and the line from the polynomial register
are of the proper values. The output ("I" or "0") of a
given cell is equal to the output of the cell diagonally
above and to the left of it unless it is reversed by the
coincidence of a logical "one" on both the vertical line,
and on the horizontal line associated with the position.
THE ARRAY AND ITS OPERATION
Certain machines might interface the array in a different fashion than shmvll in Figure 9, i.e., the CRC
polynomial select register might be replaced by permanent wiring in a terminal application, or the output
assembler (described later) might be reduced or eliminated if only one data character length exists.
The array consists of replicas of the simple circuit
shown in Figure 10. The cell is shown enclosed in the
dotted line labeled cell Cn,m on Figure 9. Each cell element has three inputs. The first, 1, is connected to a
line 4 carrying signals representing the binary value for
the intermediate feedback within the array calculator.
The second input, 2, is connected to a line 5 which has

5 POLYNOMIAL POSITION
4 INTERMEDIATE FEEDBACK

,------1
I.
CELL
I
n, m

11

I
I

AND

I

IITO CELL

FROM CELL

C

C n + 1, m+l

EX OR

n-l, m-I

I
L

______

I

~

Figure H}-Logic diagram for a standard cell of the array

5

binary information representing the binary value of a
given single bit position within the polynomial, and is
connected directly to the POLYNOMIAL REGISTER.
A third input, 3, is a connection to a cell which is diagonally upward to the left of the array. Specifically,
Cell Cn,m has its third input connected to the output of
cell Cn-I,m-I, where n designates the row number and
m the column number.
For the cells in the leftmost column, there are no
positions diagonally upward to the left. Therefore, the
third input to the cell is wired permanently to a voltage
source having a binary value of O.
The cell elements along the first row of the array
have a slightly different characteristic than the other
cells of the array because the third input to each cell
cannot be connected to the cell element diagonally upward to the left within the array since no such element
exists for those in the first row. For cell element Co,o,
cell element row 0 and column 0, the third input is
wired to a binary 0 voltage level. For cell element CO,I,
the cell element in row 0 and column 1, the third input
is connected to bit position 0 of the OLD CRC register.
Subsequent cells in row 0 have their third input connected directly to the OLD CRC register up to and
including cell CO,7.
For cells CO,g to cell CO,IS, the third input to each cell
is wired in a different manner than for the other cells
within the row. Cell CO,IS provides a good example.
The third input to this cell is connected to EXCLUSIVE OR circuit 6. The inputs to EXCLUSIVE OR 6
are connected to bit position 6 of the NEW CHARACTER register and to bit position 14 of the OLD
CRC register. Similar wiring exists for the other array
elements CO,8 through CO,I4.
The intermediate feedback signal from EXCLUSIVE OR 7 is connected to the first input to each of
the cell elements in row o. The intermediate feedback
signal is generated by EXCLUSIVE OR circuit 7. The
inputs to EXCLUSIVE OR circuit 7 are connected to
bit position 7 of the NEW CHARACTER register and
tobit position 15 of the OLD CRC register.
PROVISION FOR VARIOUS CODE LENGTHS
Although the concept is general enough to accommodate other code lengths, it is assumed that 6-, 7-,
and 8-bit codes may be used and that the polynomials
associated with these code lengths can be of degree 6 or
12, 7 or 14, and 8 and 16, respectively. The same array
may be used to accomplish this by "right justifying" it.
For example, a 7-bit code of polynomial degree 14,
would extend from 2 to 15 in the POLYNOMIAL
register. Positions 0 and 1 would be set to zero. A 7-bit

6

Fall Joint Computer Conference, 1971

POLY POSO

POLY POS I

POLY POS 7

POLY POS 8

POLY POS IS

POLY POS 14

10

II

12

13

14

IS

+-- NEW eRe REGISTER

6 Ie - 0000 XXXX XXXX XXXX
7 Ie - OOXX XXXX XXXX XXXX
8 Ie - XXXX XXXX XXXX XXXX

Figure ll-Output logic for the array

code of degree 7 would extend from positions 9 to 15 in
the POLYNOMIAL register. Positions 0 to 8 would be
set to zero. This method appropriately tr~cates the
"width" of the array.
However, the "depth" of the array must also be
truncated in accordance with the character length.
Each row of the array represents a serial shift of one
bit. Thus, for a 6-bit code, using an array designed for
eight bits, the desired answer is present and properly
aligned at the outputs of the sixth row. However, because of chip layout limitations, it is not practical to
bring out indeplmdent outputs from each of several
rows.
Thus, a compromise is struck by degating the intermediate feedback path to lower rows, which results in
a single right shift of the answer for each such row.
Note that the right-most bits wrap around the bottom
right side, appearing as the outputs of the right-hand
positions of the second row up (for a 7-bit code) or
subsequent rows for shorter codes. Thus for multilength systems it will be necessary to assemble for
proper alignment some time before the NEW CRC becomes the next OLD CRC. See Figure 11. The resultant

alignment is as shown at the bottom of Figure 11 for
6-, 7-, and 8-bit code lengths and 12-, 14-, and 16·degree
polynomials, respectively.
The output of the array calculator must be taken
from the proper cells within the array and this is dependent upon the particular bit length of the character
for which the cyclic redundancy check is being calculated. For example, should the character upon which
the CRC is being calculated be of a length of only
six bit positions, the output should be taken from
the output of row number 5, (the first row being identified by a 0). This is accomplished through various circuit elements within the array calculator as shown in
Figure 11. Specifically, OR circuits 1 and 3 are activated
by a signal indicating that the new character is of a
6-bit code type. The outputs of the OR circuits are
inverted and then propagated along the intermediate
feedback signal paths to disable the AND circuits in
each of the cell elements in rows 6 and 7. As a consequence, the cell elements in rows 6 and 7 will not
modify the data received from the outputs of the cell
elements within row number 5, and they can be used to
propagate the output from the cell elements in row 5.

Universal Cyclic Division Circuit

FROM C. 0

l

POLY 0

~

FROM C. 1

C5,0

~

0

L
~

C 6 ,0

0

-

L

l

POLY 1

~

--

L

---

~

L
-

C 7,0

0

FROM C. 2

L

POLY 2

C 5 ,1

I

POLY 3

~

f---

C 5 ,2

'- I - -

C 6 ,1

7

-

L

~

-

..-

L

-

C 7 ,1

C 6 ,2

-

-

I---

'-

-

l

-

..--

C 5 ,3

l

4,15

POLY 5

FROM C 5,15

C 6,3

L

-

~

C 6 ,4

r-

~

-

C 7 ,2

FROM C
POLY.

FROM C

C 7,3

L

t'---4~

-

C7 ,4

L

'---

t-

6,15

C 7 ,5

~

6 B1T CODE
7 B1T COQE
8 B1T CODE

A

I

QQI I
OR

+

A

oa
OR

I I

+

TO NEW

TO NEW

CRC 0

CRe 1

A

OQ
OR

A

1I

+

TO NEW
CRC 2

QQI MMo
I
I
OR

+

TO NEW

CRe 3

OR

+

TO NEW
CRC 4

Figure 12-Alternate output logic for the array

The output of Cell GS,IS is propagated to AND circuit 5.
When a 6-bit code is selected, a positive voltage will
appear on the second input to AND circuit 5. The
data appearing on the output of cell Gus would then
be transmitted via AND circuit 5 or OR circuit 8 and
on to bit position 15 of the NEW CRC register.
Bit position 14 of the output is gated from cell element G6 ,IS to AND circuit 9 when a 6-bit code is indicated. AND circuit 9 has an output connected to OR
circuit 12 whose output is connected to bit position 14
of the NEW CRC register.
Cell element Gus provides the output for bit position 13 of the NEW CRC register when a 6-bit code is
being operated upon. This is accomplished by gating
circuitry not shown. The other bit positions of the
NEW CRC REGISTER would be filled from data from
cell elements in row 7 of the array in a similar manner
to that described for bit position 13 in the NEW CRC
REGISTER when a 6-bit code was being transmitted.
Inthe case where the new character contains eight data
bits, each of the outputs of the eighth row of the array
calculator would be connected directly to the NEW
CRC register via appropriate switching circuits and no
compensation for the shift in the array network would
be necessary.
The gating circuitry above-mentioned in connection

with Figure 11, is particularly adapted to LSI circuitry because the output gating occurs from elements
of the network which are on the peripheries of the rectangular array calculator. With the above scheme, the
array could easily be placed in a single chip and all
wiring connections can be made to points within the
array without crossing any internal connections.
The advantage to the above-shown output gating is
that additional wires from the outside of the array are
not necessary to connect to interior points within the
array. Where such wiring problems do not exist, a
simpler approach to the outputting is shown in Figure
12. This logic is a simple AND-OR assembler where
row 5,6, or 7 is selected for gating into the NEW CRC
register, depending on whether the code length is 6, 7,
or 8 bits, respectively. This assembly function can be
considered as part of the array element and can therefore be extended throughout the array to provide for
any code length.
ACKNOWLEDGMENT
The authors wish to acknowledge the hardware implementation contribution of M. T. Kawalec and S. R.
Stager, III.

8

Fall Joint Computer Conference, 1971

given by:

REFERENCES

1

0

1 W W PETERSON D T BROWN
Cyclic codes .for error detection

1

0

Proceedings of the IRE pp 228-235 January 1961
2 W W PETERSON
Error correcting codes

T=

MIT Press 1961
3 W H KAUTZ

1

0

Linear sequential switching circuits

Holden-Day Inc 1965
4 M Y HSIAO K Y SIH

Go

Serial-to-parallel transformations of feedback shift
reg1:.'1ter circuits

IEEE Transactions on Electronic Computers
VOL EC-13 pp 738-740 December 1964
5 M Y HSIAO
Theories and applications of parallel linear feedback shift
register

IBM TR 1708 SDn Poughkeepsie March 1968
6 A M PATEL

(3)

1

G1 G2

Gr- 1

Suppose that Zt, Zt+l, ... Zt+f-l are the f data bits
(a byte) entering successively into the serial eRe
register during the f consecutive shifting operations.
The contents of the eRe register at the end of 1 shifts
is denoted by the vector X t+f . Using Equation 2 iteratively, 1 times, one can obtain:
Xt+!=XtTfEBZtGTf-lEBZt+1GTf-2EB . . .Zt+f-lG

A multi-channel eRC register

AFIPS Conference Proceedings Vol 38 pp 11-14 Spring
1971

(4)

Here Ti is the jth power of the matrix T. Let Z t denote
the input data sequence, as follows:
Zt= (Zt+!-l, Zt+f-2, ... , zt+1, Zt)

APPENDIX A

Let D denote the following partitioned matrix:

The following is reproduced from Pate16 :
In this section, we develop the mathematics for obtaining a multi-channel CRe register that can process
1 bits in parallel to generate the eRe character or the
syndrome. One shift in the parallel circuit is equivalent
to 1 shifts in the corresponding serial eRe register.
The number 1 is a positive integer, smaller than the
degree r of the checking polynomial.
G (x) denotes the checking polynomial, often called
the generator polynomial. We use the following notation:

The state vector Xt=(xo, Xl, ... Xr-l)t denotes the contents of the eRe register at time t. T denotes the
companion matrix of.the polynomial G(x), corresponding to the serial eRC register connections. Let Zt denote
the data bit entering the serial eRe register at time t.
Then the shifting operation of the serial eRe register
is given by the (mod-2)_ matrix equation
Xt+1=XtTEBztG

(2)

where G is the vector (Go, GI , G2 . . . Gr- 1), and T is

G
GT
D=

GT2

(5)

GTf-1

Note that the vectors G, GT, GT2, ... GTf-1 represent the contents of the serial eRe register a.s the
vector G is shifted 1-1 times.
Then, Equation 4 can be rewritten as:
Xt+!=XtTfEBZtD

(6)

The sequential circuit realizing Equation 6 has the
property that with the input byte Zt (1 bits in parallel),
it changes from state X t toX t+! in a single shift. This
is the equivalent operation to 1 shifts of the corresponding serial eRe register with the same input data
entered serially.

Cyclic redundancy checking by program
by P. E. BOUDREAU and R. F. STEEN
IBM Corporation
Research Triangle Park, N.C.

One part of the problem is addressed in this paper.

INTRODUCTION

It is the problem of encoding or generating check bits.
The solution, however, also applies to the decoding
problem for error detection codes of this type. A similar
approach, based on the properties of the companion
matrix, has been used for parallel hardware devices. 8,9
With this approach, efficient and attractive programs
can be developed for software or firmware. Subroutines
developed here require as few as six instructions with
sequential instruction execution to update a 16-bit
remainder for eight new information bits. A program
directly simulating a shift register would require at
least three instructions (EXCLUSIVE OR, SHIFT,
and BRANCH) per bit, or 24 instructions for an eightbit update.

Recent advances in the use of mini-computers as
control elements of a computer complex and as intelligent terminals! are indicative of a trend toward
relocation of certain hardware functions to microprogram or machine level program. One such function
which is a particularly good candidate, for various
reasons, has already been moved into program in
several machines (e.g., IBM System 360/25 Integrated
Communication Adapter2 and the IBM 11303). This
function is error control using an error detection Cyclic
Redundancy Check (CRC). A CRC is a variable
length shortened cyclic code in which a message is a
code word if, and only if, the message polynomial
M(x) is divisible by the generator polynomial G(x).
Error detection and correction codes have been
studied extensively for more than 15 years. The most
comprehensive references,4,5 as well as the majority of
papers written in the area, measure the encoding and
decoding complexity in terms of the cost of hardware
and the time for decoding. With some notable exceptions,6,7 very little attention is given to the problem of
encoding and decoding using machine level or microinstructions. However, in some cases such as the
Berlekamp algorithm3 for BCH codes, it may very
possibly be easier to write a program for certain steps
of the decoding procedure than to design hardware.
Programmed error correction is especially· appealing for
use with high rate codes when error probabilities are
low, since, in this case, a major portion of the correction
process need only be performed when errors actually
occur. Allocation of a significant amount of hardware for
these relatively infrequent events is expensive. Furthermore, rapidly advancing memory technology helps to
make program-controlled devices not only economically
feasible but attractive.

MATRIX APPROACH TO CYCLIC CODES
In this section, we review the relationship between
multiplication by the companion matrix and polynomial division used to generate a code word. We then
generalize the operation to an m-bit character-bycharacter operation developing a matrix equation to
update the calculated redundancy m bits at a time. The
appendix will be helpful to those familiar with the shift
register in order to further justify the connection between the shift register operation and the matrix
multiplication.
Generally, the check bit generation process is one of
determining R(x) =xhl(x) mod G(x) where lex) is the
polynomial whose coefficients are the information bits
and h is the number of check bits. We can next let the
coefficients of R(x) be an h bit vector, R, and let G be
the h by h companion matrix shown below. The binary
digits, gi, i = 1,2,3 ... h-l, are the coefficients of the
generator polynomial.
9

10

Fall Joint Computer Conference, 1971

o

1

°

o

o

0

1

°

o

0

°

If indeed we are operating with m bits per character and
A (t) is the remainder after some character has been
sent, then A(t+m), given by Equation (4), is the
remainder after the next character has been sent and
b(t+m), b(t+m-l), ... , b(t+l) is the bit string of
length m representing that next character, where
b(t+l) is the first bit sent.
Since we will be using this from now on, it is convenient to make a slight change of notation. We define

G=
1

Then, if we let b(l) =ik- I be the first information bit
(the k-lth coefficient of l(x» and b(k) =io be the
last information bit, it is clear (see the appendix or
Reference 7) that the remainder R can be calculated
iteratively using the following formula:

A (t+ 1) ={A (t) +[0,0, ... ,0, b(t+ 1)]}·G

(1)

and setting R=A(k). It should be noted that A(t)
represents the remainder of xhlt(x) divided by G(x)
which is the calculated redundancy after the first t
information bits, It(x), have been taken into account.
We now define B(t+ 1) = [0, 0, ... , 0, b(t+ 1)] and
rewrite Equation (lor A2) as

A(t+l) =[A(t)+B(t+l)]G.

(2)

Equation (2) is the basic matrix description of the
polynomial division process (circuit function) on a
bit-by-bit basis. The advantage of the matrix approach
is realized when one extends it to a multibit or character level. We can do this for m bits-per-character as
follows, assuming m ~ h. Repeated use of Equation
(2) yields:

A(t+m) =[A(t+m-l)+B(t+m)J·G
= {[A(t+m-2)+B(t+m-l)}G
+B(t+m)} ·G

as the remainder after the jth character, and

as an h component vector where

Co.h CI.h C2.i· .. Cm-l.i
is the bit string of length m representing the jth character and Cm-I.i is the first bit of the character transmitted. That is
for

i=O, 1, ... , h-l

and

Co.i=b(t+m)
CI.i=b(t+m-l)
Cm-I.i=b(t+ 1).
With this notation Equation (5) becomes the character-by-character version of Equation (2)
(5)

This equation expresses the remainder after j + 1
characters as a function of the remainder after j characters and the j + 1st character for m ~ h bits per
character. It is the fundamental result which we apply
below.
MATRIX IMPLEMENTATION OF CYCLIC
CODES

m

=A (t) ·Gm+

:E B(t+j) •Gm-i+l.

(3)

i=1

Equation (3) expresses the remainder at time t+m in
terms of the remainder at time t and the next m input
bits b(t+l), b(t+2), ... , b(t+m). This equation can
be put into a better form by using the "shifting"
property of the companion matrix G.

A (t+m) =A (t) ·Gm
+[0,0, ... ,0, b(t+m), b(t+m-l), ... , b(t+l)}Gm.
(4)

This matrix description of cyclic checking leads
directly and intuitively to several different programmed
checking implementations. It is this feature which
makes the approach valuable. Since instruction sets,
core availability, and instruction execution times vary
widely, three approaches will be described.
It is very convenient to describe these subroutines in
APLIO with a single line of APL representing a single
machine language instruction. For those interested in
the exact operation of the simulated machine language
instruction, a knowledge of basic APL is required;
otherwise, the marginal machine instructions and

Cyclic Redundancy Checking by Program

comments should clearly indicate the general nature of
the operation on each line of code. It is assumed that
there are four 16-bit registers which are available to
the programmer. These are represented by the APL
vector variables RA, RB, and RC with the fourth being
the base register which is used for the return branch to
the main program. In APL, RA[1 ;J represents the high
order byte of register RA and RA[2;J represents the
low-order byte of the same register. The storage area
for tables is represented by the matrix SA which is as
large as necessary.
Although we have assumed a 16-bit data path for the
three examples, it is easy to write similar subroutines
for an eight-bit ALU by partitioning the G8 matrix
in a different manner. We will use the terms, "byte"
and "halfword" to mean eight and 16 bits respectively.
In general, our methods below are iterative schemes
for finding the remainder using the recurrence relationship
AHI = [Ai+Ci+lJGm.

We note again for emphasis that C7,j is the first bit of
the jth character while the j = 1st character is the first
character transmitted or received.
One-256-halfword-table look-up method

This is a simple one-table look-up method which
requires a significant amount of storage and frequently
will be impractical for codes with more than eight
bits-per-character. However, it embodies most of the
basic ideas of the matrix approach and is a good starting
place. In an instruction set with the logical EXCLUSIVE
OR operation, the forming of WH1 is trivial. The next
stepjs to find Ai+! which can be found by multiplying
W j +1 by G8. This can be done very rapidly by table
look-up. Rather than blindly storing all 216 halfwords
which can result from this operation, we notice that G8
has the form

0'=

For simplicity we define what we call a "working
remainder" W i+l,

[OiT

Thus W H1 G8 can be written
Wi+l(L)XEBWH1(H)[0

= [aO,h •.. , ah-m-l,h (ah-m,iEBCO,i+l) ,
••• , (ah-l,i EB Cm-l,Hl)

Basically, our problem is to find
Gmusing

J

AHI

given

WJ+l

and

A H1 = Wi+1Gm.

Since WH1 is a binary vector of length h, it can take
no more than 2h values. The following methods, called
the "one-256-halfword-table look-up," the "two-32halfword-table look-up," and the "binary summation"
method, are various ways to perform this job.
Purely for ease of notation, we now fix the values of
hand m. We will let the number of parity bits be
16(h = 16) and the number of bits per character be
eight (m=8). Substitution in (5) gives us the fundamental equation
(6)

where
Wi+! = Ai+Ci+l
Ao= [0,0,0,

11

I I]

where Wi+1(H) is an eight-bit vector comprising the
high-order eight bits of W H1 and Wi+l(L) represents the
low-order eight bits of WH1 . If byte operations are
available, the product WH1(H). [0 I I] is simply moving
the byte from the high-order half of a 16-bit register to
the low-order half. The second instruction in Table I
performs this operation. The second product above
requires a table look-up for one of 256 halfwords
representing all possible values of WH1(L) ·X. This is
done in instruction four after the program has shifted
the address left one bit in order to force the address to
a halfword boundary. The table is assumed to be
located on a 512 byte boundary. Its address is stored in
the seven low-order bits of the high-order byte of the
RB register. The two results are EXCLUSIVE ORed
together in the fifth instruction and the table address is
restored in the last instruction before the return branch.
Table I shows the program which will update the CRC
for a full eight-bit character.
This is called the one-table, one-step look-up method.
It is very fast but may be impractical because of the
quantity of core required.

... ,0, OJ
Two-32-halfword-table look-up method

Ci=[O,

0, ... ,0,CO,hCl,h

••• ,C7,iJ

are all 16-bit vectors, and
G8 = the

companion matrix raised to the 8th power.

A more practical subroutine for CRC character
update relative to core storage requirements is the twotable method. In. this method, we further partition the
matrix X above into two matrices Y and Z. Thus we

12

Fall Joint Computer Conference, 1971

TABLE I-Subroutine Using One-256-Halfword Look-up
Initial conditions for all subroutines:
Register RA contains the old CRC, Ai
Register RB2 contains the new character, Ci +1•
Final conditions for all subroutines:
Register RA contains the new CRC~ A i +1'
V

[1]
[2]
[3]
[4]

EXCLUSIVE OR RB2, RA2
MOVE RB2, RA1
SHIFT LEFT RB, 1
LOAD RA, RB
EXCLUSIVE OR RA, RC
ROTATE LEFT RB, 15
BRANCH RETURN

CRC1

[5]
[6]
V

write G8 as
G8=

called PTYRC, the even parity of register RC. Looking
back to the defining equation

[~]
YIZ

Aj+l= [Cj+l+A j ]-G8= Wj+l-G8.

Here, the Y and Z matrices are four by 16 binary
matrices and Wj+l(L) is broken into two four-bit vectors
Wj+l(LL)
and Wj+l(LH). Thus, the new calculation
becomes
Aj+l= Wj+1(LH) -

Form W i +1 (L)
Form W i +1 (H)[0 I I]
Form address
Load Wi+1(L)X
Form Ai+l
Reset address
Return

RB[2;] ~ RB[2;] ~ RA[2;]
RC[2;] ~ RA[l;]
RB ~ ((15p1), 0) /\ 1cf>(l6pRB)
RA ~ 28 p(16p2) T SA[2 J.. RB]
RA[2;] ~ RA[2;] ~ RC[2;]
RB ~ 2 8 p(15cf>RB)

Dk = [dO,k' d1,k, ••• , d l5 ,k] be the kth column of
Then the high-order position of the new remainder
Aj+l is given by

Let
G8.

15

YEe Wj+l(LL) -ZEe Wj+1(H) - [0 I J].

Each of the products is a 16-bit row vector. The program
now requires two look-up operations for the first two
terms and a byte move for the last term. All three
terms must then be EXCLUSIVE ORed together. The
program is shown in Table II.
Binary summation method

aO,j+l =

L:

d i ,l-Wi,j+l

i=O

which is operationally the same as ANDing the first
column of the matrix G8 with the working remainder
W j +! and finding the even parity of the result. This
parity is the value of aO,j+l. Similarly, we can find
the remaining bits by ANDing Wj+l with each
column D k +1 and find the even parity to determine
ak,j+l 0 ~ k ~ 15.
15

Finally, it is possible to perform this whole operation
without tables. This is done by performing the matrix
multiplication by program rather than by table look-up.
This requires a parity test as a condition on the branch
instruction, however. This branching condition will be

ak,j+l

=

L:

d i ,k+l- W i,j+l

i=O

This operation can be carried out in a program as
illustrated in Table III.
The program shown here requires more than 80 words

TABLE II-Subroutine Using Two-32-Halfword Look-up
V

EXCLUSIVE OR RB2, RA2
MOVE RA2, RB2
AND RB2, H'FO'
ROTATE LEFT RB2
LOAD RC, RB
EXCLUSIVE OR RC2, RA1
MOVE RB2, RA2
AND RB2, H'OF'
EXCLUSIVE OR RB2,H'1O'
ROTATE LEFT RB, 1
LOAD RA, RB
EXCLUSIVE OR RA, RC
BRANCH RETURN

CRC2
RB[2 ;]~ RB[2;] ~RA(2;]
RA[2 ;]~ RB[2;]
RB[2;]~RB[2;]/\ 1 1 1 1 0 0 0 0
RB[2 ;]~ cf>RB[2;]
RC~2 8 p(16p2) TSA[2+2J.. 16pRB]
RC[2 ;]~ RC[2;] ~RA[l;]
RB[2;]~ RA[2;]
RB[2;]~RB[2;]/\O 0 0 0 1 1 1 1
RB[2;]~RB[2;]~0 0 0 1 0 0 0 0
RB[2 ;]~ 1cf>RB[2 i]
RA~2 8 p(16p2) TSA[2+2J.. 16pRB]

[1]
[2]
[3]
[4]
[5]
[6]
[7]
[8]
[9]
[10]
[11]

RA~RA~RC

[12]
V

Form W i +1 (L)
Save W i +1 (L)
Mask address
Form address
Load Wi+1(LH)Y
W i + 1 (H)[O I]E9 Wi+1(LH)Y
Get W i +1(L)
Form address
Form address
Form address
Load W i +1(LL)Z
Form Ai+l
Return

Cyclic Redundancy Checking by Program

13

TABLE III-Subroutine for Binary Summation Method
V CRC3

EXCLUSIVE OR RB, RA
LOAD RA, ZERO
LOAD RC, Dl
AND RC, RB
BRANCH [7], PTRC
EXCLUSIVE OR RA, H'8000'
LOAD RC, D2
ANDRC, RB
BRANCH Ill], PTRC
EXCLUSIVE OR RA, H'4000'

[1]
[2]
[3]
[4]
[5]
[6]
[7]
[8]
[9]
[10]

RB~2
RA~2

8 p(16pRB) ~(16pRA)
8pO

RC~SA[1 i]

RC~2

8 pRC/\(16pRB)
16pRC»/SECONDBIT
RA[li]~RA[li]~1 000 0 0 0 0
~(~/(1,

SECONDBIT:RC~SA[2;]
RC~2

8 pRC/\ (16pRB)
16pRC»/THIRDBIT
RA[li]~RA[li]~O 1 0 0 0 0 0 0
~(~/(1,

Form W i +1
Set A i +1 to zero
Load Dl
Calculate D1Wi+l
Branch if aO.i+l =0
Set aO.i+l = 1
Load D2
Calculate D 2W i +1
Branch if al.i+l =0
Set al.i+l = 1

And so on for the third through the 15th bits.
LOAD RC, D16
AND RC, RB
BRANCH [16], PTRC
EXCLUSIVE OR RA, H'OOOI'
BRANCH RETURN

[12]
[13]
[14]
[15]
[16]

SIXTEENTHBIT:RC~SA[16i]
RC~2

8 pRC/\(16pRB)
~(~/(1, 16pRC»/OUT
RA[2;]~RA[2i]~0 0 000 0 0 1
OUT:~O

of storage. However, a reduction in the storage requirement is possible by forming a loop to calculate the 16
binary sums. Further reduction is also possible when a
specific polynomial is chosen and a combination of this
and other schemes is used. For example, using G(x) =
X16+X15,+X2+ 1, the number of instructions can be
reduced to less than 20, making this method competitive with the other two given here. The key to this
method is the branch instruction which tests the
condition of the parity of the 16 bits in the accumulator.
This is the last of the three matrix-oriented methods to
be discussed and generally requires less core storage and
more execution time than the previous two.
Other methods which partition the G8 matrix in
other ways are possible and may be better in specific
cases.

SUMMARY

Using a matrix description of the operations required to generate the check bits in a cyclic redundancy error-detection scheme leads to new approaches to the software implementation problem. Certain variations are in use today and have proven to be superior to direct shift register simulation programs in most cases. With an apparent increase in programmable terminals and multiplexers, such approaches are likely to become even more important in the future.

REFERENCES

1 W L SCHILLER  A VAN DAM  R L ABRAHAM  R M FOX
A microprogrammed intelligent graphics terminal
IEEE Transactions on Computers Vol C-20 No 7 1971
2 A W MAHOLICK  H H SCHWARZELL
Integrated microprogrammed communications control
Computer Design November 1969
3 IBM 1130 synchronous communications adapter subroutine
SRL File 1130-30 Form C26-3706-4 IBM Corporation White Plains New York
4 W W PETERSON
Error-correcting codes
The M.I.T. Press Cambridge Mass 1961
5 E R BERLEKAMP
Algebraic coding theory
McGraw-Hill Book Company New York 1968
6 I B OLDHAM  R T CHIEN  D T TANG
Error detection and correction in a photo-digital storage system
IBM Journal of Research and Development Vol 12 No 6 1968
7 R T CHIEN
Burst-correcting codes with high-speed decoding
IEEE Transactions on Information Theory Vol IT-15 No 1 January 1969
8 M Y HSIAO  K Y SIH
Serial to parallel transformation of feedback shift register circuits
IEEE Transactions on Electronic Computers Vol EC-13 pp 738-740 December 1964
9 A M PATEL
A multi-channel CRC register
AFIPS Conference Proceedings Vol 38 pp 11-14 Spring 1971
10 K E IVERSON
A Programming Language
Wiley New York 1962

APPENDIX

Here, we will show how a shift register is used to perform the functions required to generate or verify a code word (calculate the proper h bits of redundancy). Then it can be shown that the operation of a shift register on a bit-by-bit basis can be written in terms of matrix operations on vectors. Using this approach, it is possible to justify the several table look-up software schemes which are developed in the main text.

Figure A1-An elementary shift register

A feedback shift register is a device which stores
bits in a serial string and is capable of shifting the string
one bit at a time. There may be EXCLUSIVE OR and
AND gates associated with the shift register which will
operate when a shift takes place. The structure of a
shift register is shown in Figure A1. The bit storage positions are indicated by a box (□) and the EXCLUSIVE OR gates are indicated by the "⊕." If the storage positions are denoted as shown, we can illustrate the operation by assuming that bit positions 1, 2, and 3 contain zero and that a one bit is placed on the "IN" lead. A single shift of the register by a clock pulse (not shown) will cause the "IN" to be EXCLUSIVE ORed with the feedback from position 3 and placed in position 1. Thus position 1 = 1 (1 ⊕ 0 = 1). Now, let us assume that "IN" is set to zero and then another clock pulse occurs. Position 3 ⊕ "IN" = 0 is placed in position 1. Position 1 ⊕ position 3 (1 ⊕ 0 = 1) is placed in position 2.
Figure A2-A general division shift register

A general shift register which performs division by G(x) is shown schematically in Figure A2. The AND gates are represented by the "○." Although the output does represent the quotient, of major interest to us is the contents of the shift register, which is the h bit remainder

R(x) = r_0 + r_1 x + ... + r_{h-1} x^{h-1}

of the bits shifted in at any time. Thus, if we shift information bits

I(x) = i_0 + i_1 x + ... + i_{k-1} x^{k-1}
into the shift register, highest degree coefficient first, we will have the remainder of I(x) after all k bits have been entered. However, we would prefer to have the remainder of x^h I(x) rather than the remainder of I(x) so that we may append the remainder bits directly to the information. One way to do this would be to shift the shift register h times after I(x) has been entered. However, this represents wasted time since we can wire the shift register differently in order to cause it to "pre-multiply" by x^h. This shift register is shown in Figure A3, and the remainder at time t will be denoted by the polynomial A(x, t). After shifting I(x) into this circuit, the remainder R(x) of x^h I(x) divided by G(x) will be contained without further shifts; that is, A(x, k) = R(x). If R(x) is appended to x^h I(x), a code word will be formed (R(x) + x^h I(x)). At the receiver, exactly the same circuit or program may be used to determine whether the received block is a code word.
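To make the pre-multiplication idea concrete, here is a minimal Python sketch (not from the paper) that keeps the register in a single integer; the polynomial G(x) = x^16 + x^15 + x^2 + 1 from the main text and an MSB-first bit ordering are assumed purely for illustration.

H = 16
POLY = 0x8005          # x^15, x^2 and 1 terms of the assumed G(x); x^16 is implicit

def remainder(bits):
    # Shift bits in, highest degree coefficient first; the register then
    # holds the remainder of x^H times the shifted-in polynomial.
    reg = 0
    for b in bits:
        feedback = b ^ ((reg >> (H - 1)) & 1)
        reg = (reg << 1) & ((1 << H) - 1)
        if feedback:
            reg ^= POLY
    return reg

def encode(info_bits):
    # Append R(x) to x^H I(x) to form the code word.
    r = remainder(info_bits)
    return list(info_bits) + [(r >> i) & 1 for i in range(H - 1, -1, -1)]

def is_code_word(block_bits):
    # The receiver reuses the same routine: the register ends at zero
    # exactly when the received block is a code word.
    return remainder(block_bits) == 0

For example, is_code_word(encode([1, 0, 1, 1, 0, 0, 1, 0])) evaluates to True, while flipping any single bit of the encoded block makes it False.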
In order to further illustrate the operation of the
shift register, it is possible to develop a set of functional
relationships between the bits that have entered the
shift register and the contents of the register. These
are the circuit equations for the shift register.
Figure A3-A shift register for pre-multiplication by x^h and division by G(x)

Let the bits in the shift register (Figure A3) at time t be represented by

a_0(t), a_1(t), a_2(t), ..., a_{h-1}(t)

where a_0(t) is the leftmost bit in the shift register. We will also denote the bits which are shifted into the shift register as b(t). That is, the contents of the shift register at time T include the effects of all b(t) for 1 < t ≤ T. Since the bits come at discrete times, both t and T are integers. Figure A4 may help the reader visualize this operation. From the figure, we can write the circuit equations directly.

Figure A4-Development of circuit equations from the pre-multiply shift register

a_0(t+1) = b(t+1) ⊕ a_{h-1}(t)
a_1(t+1) = a_0(t) ⊕ g_1[b(t+1) ⊕ a_{h-1}(t)]
a_2(t+1) = a_1(t) ⊕ g_2[b(t+1) ⊕ a_{h-1}(t)]
. . .
a_{h-2}(t+1) = a_{h-3}(t) ⊕ g_{h-2}[b(t+1) ⊕ a_{h-1}(t)]
a_{h-1}(t+1) = a_{h-2}(t) ⊕ g_{h-1}[b(t+1) ⊕ a_{h-1}(t)]          (A1)

Since we set the register to zero before beginning to calculate the remainder, we have the initial conditions a_i(0) = 0, i = 0, 1, ..., h-1. With these we can calculate any a_i(T) given the b(t) (0 < t ≤ T) and the generator polynomial

G(x) = 1 + g_1 x + g_2 x^2 + ... + g_{h-1} x^{h-1} + x^h.

These circuit equations will be used in the development of the matrix equations which are the subject of the main section.

In order to develop a matrix approach to the generation of a set of parity or check bits, we define a vector which consists of h binary components and represents the bits in the shift register at time t as defined above:

A(t) = [a_0(t), a_1(t), ..., a_{h-1}(t)].

Next, we define G to be the companion matrix of the polynomial G(x) as shown in the main text. From the circuit equations (A1), it is apparent that

A(t+1) = [a_0(t+1), a_1(t+1), a_2(t+1), ..., a_{h-1}(t+1)]
       = [0, a_0(t), a_1(t), ..., a_{h-2}(t)] + [0, 0, ..., 0, b(t+1) ⊕ a_{h-1}(t)]·G.

Equation (A2) below follows immediately if one merely observes that

[a_0(t), a_1(t), ..., a_{h-2}(t), 0]·G = [0, a_0(t), a_1(t), ..., a_{h-2}(t)].

A(t+1) = {A(t) + [0, 0, ..., 0, b(t+1)]}·G.          (A2)

This is equation (1) of the main text.
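Read as a program, the circuit equations (A1) need only a few lines. The sketch below is in Python; the list representation of the register and the function names are mine, not the paper's. It steps the Figure A3 register one bit at a time for an arbitrary G(x) given by its middle coefficients g_1, ..., g_{h-1}.

# Register kept as the list [a0, a1, ..., a(h-1)]; g = [g1, ..., g(h-1)].
def shift_once(a, b_next, g):
    h = len(a)
    feedback = b_next ^ a[h - 1]                        # b(t+1) XOR a(h-1)(t)
    nxt = [feedback]                                    # a0(t+1), first line of (A1)
    for i in range(1, h):
        nxt.append(a[i - 1] ^ (g[i - 1] & feedback))    # ai(t+1) = a(i-1)(t) XOR gi*feedback
    return nxt

def remainder_register(info_bits, g):
    # Start from the all-zero initial conditions and shift I(x) in,
    # highest degree coefficient first; A(x, k) is then R(x).
    a = [0] * (len(g) + 1)
    for b in info_bits:
        a = shift_once(a, b, g)
    return a

Because each step is exactly (A2), the same loop can also be written as a vector-matrix product with the companion matrix G, which is the bridge to the table look-up methods of the main text.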

Development of computer applications in emerging nations
by ALAN B. KAMMAN
Arthur D. Little, Inc.
Cambridge, Massachusetts


INTRODUCTION
The purpose of this paper is to explore a series of
guidelines which will help identify attractive computer
applications in emerging countries. We will examine
areas of development from the standpoints of natural
resources, labor intensive industries, public and private
services. Then we shall explore levels of development in
communications, education, high technology and the
financial commitment of a nation. We shall examine the
computer applications from a cost versus benefits viewpoint, and finally discuss a feasibility planning study.
A basic assumption throughout this paper is that there is no such thing as a "model" or "average" developing country. Nor is there such a thing as an "average" computer application. Each emerging nation
and each application must be explored and related as a
separate case.
AREAS OF DEVELOPMENT

Natural resources
Developing countries can often be placed into one of
two major categories; those that primarily depend on
natural resources for their economy, and those which
depend on labor intensive industries. From the standpoint of natural resources, the subcategories include
applications such as agriculture, fuels, minerals, water
and power. South American copper and coffee countries, West Indian fruit companies, Near East oil
producing countries provide relevant examples.
Computers often enter first into natural resource
countries with high technology industries (to be discussed in more detail later) such as in oil producing
lands. Aruba, whose main economy is based on the
Lago oil refinery, an affiliate of Standard Oil of New
Jersey, shows such an application. The oil company
purchased the island's first computer in 1961, and used it for both accounting and engineering applications such as the critical path method, plant loading and blending, and standard accounting functions. The hardware served until late 1967, when it was replaced by a third generation machine. Antigua also saw its first computer installed at the West Indies Oil Company doing applications on profit/output relationships and linear programming.

The need for computers in agriculture is often overlooked by industrialists who are not familiar with the economic needs of countries depending on this natural resource. Greece, a predominantly agricultural country, provides an example. Almost half of the population derives a living from farming or farm-related activities. The farm population is about 4 million people distributed among some 1.1 million farm holdings. Farm sizes are very small, and often not enough to support a family adequately. About 60 percent of the farms range from one to ten acres in size, and most of these are not irrigated. The large-scale farm does not really exist in Greece.

Major farm management decisions in planning by the Government for the effective use of a country's agricultural resources depend on determining the optimum use of its scarce agricultural resources, primarily land, labor and capital. Traditionally, by this is meant planning the land use or cropping systems; planning the livestock enterprises compatible with that cropping system; and third, adjusting other resources in order to realize optimum returns for the total bundle of resources. Proper solutions to the resource allocation problems are vital to realizing optimum returns from the farming operation, the planned development of a country's agricultural sector, and the overall development of the country. The use of operations research techniques in analyzing the optimum combinations of the scarce agricultural resources for the individual farm, firm or government planner is feasible, practical, and valuable as a decision-making aid. The computer greatly facilitates this research.


In 1964, the computer presented to the Indian
Agricultural Research Institute in New Delhi helped
scientists to develop new seeds of wheat and sorghum,
and a formula for the best conditions of sowing,
fertilizing and irrigating.
These scientists had to go through thousands of
"crossings" before hitting upon the correct genetic
combination. The combination had to have a high yield,
while being resistant to pests, diseases and climatic
variations. This time-consuming process would have
been practically impossible without a computer. 1
Other computer applications for the natural resource
countries include the projection of fuel and mineral
reserves, including geophysical exploration and development. The distribution of power and water by
computer has already been undertaken by Russia, and
is in need of implementation in such water-scarce
countries as East Pakistan and parts of India.

Labor intensive industries

For countries with few natural resources to export,
the key to their success is labor intensive industries.
Examples include manufacturing companies and works
programs.
Computers in labor intensive countries have generally
followed or replaced conventional punch card equipment, with government taking the lead in their introduction. It is quite noticeable that the private sector
has been slow to take the plunge, and most frequently
it is those corporations with international connections
that have installed the first computing equipment.
Initial applications include inventory control, disbursement and revenue accounting applications. More
creative programs in existence in developing countries
include the pert-charting of major construction jobs,
and the programming of construction logistics such as
stress analysis, cut/fill balances and operations research.
The most sophisticated step in these manufacturing
operations is the use of small computers for numerical
control applications. Here there is a danger of increasing unemployment unless the country itself is in a
rising economy where displaced persons can find jobs.
In Romania, EDP was introduced in the early
sixties, directly as a result of the fact that large-scale
automation was introduced in production processes.
Since that time, automation has increased by a factor
of two, and a 450 percent expansion is envisioned during
the 1971-1975 period.
Applications for this labor-intensive country include
centralized control equipment for supervising the
extraction and transportation of gas and oil, and the
automation of hydroelectric stations. In Romania's iron and steel industry, more than 90 percent of 1970 products were turned out by automated systems. The machine-building sector also has witnessed computerized control, with the truck works of Brasov and the Heavy Machine Works of Bucharest (UMGB) leading the way.

A priority control program exists for placing numerical control applications into this machine tool industry. Cement works, glass factories, weaving mills, footwear factories and the food industry (bakeries, breweries, sugar refineries, slaughterhouses) all provide relevant examples in Romania.2

In Finland, also, process control computers are expected to be in high demand from the rapidly expanding metal, engineering and chemical industries. EDP imports, estimated at $9,000,000 in 1969, are expected to reach a level of $21,600,000 annually by 1974.3

Public and private services

While countries fall easily into one of two categories
(natural resources or labor intensive) for categorization
of their export possibilities, a third major sector in each
type of country can make use of computers to good
advantage. Examples of the public and private services
sectors include administrative government, military
applications, transportation, communications, trade
and commerce applications, financing and banking,
libraries and education, health, social welfare and law
enforcement, and finally, computer service bureaus.
Computers will choose people to fill 35,000 job
vacancies in Ceylon, Colombo's public service this year.
Furthermore, the hardest working computer in the
country belongs to the Department of Census and
Statistics. It processes data for the Registrar General,
Police, Department of Education and the Customs
Department. The Inland Revenue Department has
undertaken a study to see how to utilize a computer in
the optimum manner when its new "pay-as-you-earn" tax reform goes into service.4
Needless to say, computer applications in these areas
of priority development run a large gamut, including the
standard disbursement and revenue accounting applications. One must remember, however, that in many
developing countries in their early stages, the entire
payroll is done in cash since a creditless society exists.
Therefore, basic applications such as payroll, check
servicing, etc., are not applicable. Banking applications, however, excluding check processing, still
represent a formidable way of beginning. First, the
statistical and numerical applications must be done
accurately and in large volume. Furthermore, communications between a main bank and its branches can
help in its own way to develop a communications system
within the country.
The use of computers for a theoretical, rather than a
mass production application is not common in developing countries but could prove to be immensely
helpful. For example, many nations in their formative
stages depend heavily on a series of strategies including
alternatives based on a several-year plan. Pakistan has
gone through four five-year plans and Algeria is in the
midst of a four-year plan. The need for this is heightened
by the fact that in a developing country, internal
currency is usually worthless outside that nation's
boundaries. Therefore, the nation must husband its
foreign currency (such as dollars, pounds and francs)
so that the exchange is spent on the most essential items
for each nation.
It is common knowledge to people who have worked
in developing countries that often the purchase of an
automobile from outside is considered extremely
wasteful and in some cases illegal. Penalties are placed
on luxuries such as imported foods or alcohol, and in
one country that I visited it was impossible to get a
battery for my dictation unit because the batteries
were considered to be luxury items.
To these countries, the strategies and alternatives to
combine all the vast financial and human requirements
of the entire nation require a synthesis, combination
and analysis almost impossible by manual means. One
of the greatest contributions to a developing country that is basing its entire economy on a plan would be to
set up and train the government officials in the use of
simulation models directed at these planning applications.
To be practical, one must also recognize that several
countries, far from a state of complete development,
are using their government computers for defense
applications. The use of EDP for military inventory
and for war gaming is not limited to the developed
nations of this world.
Of more practical use is Nigeria's application where
computers help them quickly recognize trends for
epidemics that might be starting in geographical sectors
of the nation. Furthermore, in 1969 approximately 14
computers existed in that country. Thirteen were being
used in normal, industrial and commercial operations
like payroll, billing and research connected with the oil
industry. The fourteenth served the West African
Examinations Council.
As its name implies, the Council's main function is
to provide and administer examinations all over
Nigeria. It is West African because the Lagos office is
only a branch of the international organization whose
headquarters are located in Accra, Ghana, and which


was set up jointly by the governments of Gambia,
Ghana, Nigeria and Sierra Leone to conduct examinations in their countries. Besides conducting examinations, the Council was also a pioneer in the field of
educational development.
To carry out its functions the Council makes use of
two computers, one based in Accra and one in Lagos.
The second computer would probably not be necessary
were it not for the great distances and loss of time that
would be involved in shuttling data from one country to
the other. Source data comes chiefly in the form of
candidates' entry forms. From these documents are
produced lists which are required before an examination
can be conducted, such as a packing list to enable
officials to determine what quantity of materials and
examination forms in each subject must be sent to the
Center, a candidate list which can be used as an
attendance sheet, individual timetables, and admission
notices to enable candidates to know where they should
report for the examination.
After the examination, source data comprise marked
scripts and marked sheets from which marks for each
candidate are punched. From these, mark distributions
are made to determine the level of overall performance.
The grades of each candidate and each subject are
computed and the final test rate is determined by the
machine. Eventually the results are listed and the
certificates are printed by the computer from summary
cards.
The number of computers in South Korea has risen from 20
to 30 in the past year. Back in 1967, the first two computers were imported by the Productivity Center and
the Economic Planning Board. Furthermore, the
Ministry of Science and Technology, set up in the same
year, organized the National Computer Center to help
facilitate usage. Currently, 32 percent of the computers
are in operation in Government offices, 32 percent by
the Universities, and 25 percent by special agencies.
Industry and banks shared the remainder.
As of the end of 1970, Taiwan had installed 28
computers, of which 11 are used by Universities. The
first was installed in 1964 at National Chiao Tung
University. An additional seven machines are used by
the Government, including the Army, Navy and Air
Force Logistics Commands. 4
LEVELS OF DEVELOPMENT
Communications

Now that we have discussed priority areas of development, we should come down one step to talk about
general levels of development within the nation under


study. Four major guidelines should be observed and
the first of these is the field of communications. This
includes the state of development of the telegraph and
telephone system and the post office.
Any country, developed or undeveloped, will tell you
quite quickly the state of their telephone or telegraph
lines. As soon as the first computer system utilizing
communications links is placed in service, the review
will become much more critical. In general, the government or an airline becomes the primary group to
experience problems. As discussed in the Nigeria case,
two computers were necessary only because communication between two principal cities was virtually
impossible within a reasonable length of time.
Not only, therefore, is instantaneous communication
a problem for real-time systems, but the post office (or
other means of carrying documents) often determines
the application. It becomes almost useless to save time
through using a computer in the central branch of a
bank, if it takes three days to get documents to that
center from one of the branches and another three days
to return them.
In Algeria there is a major problem in communicating
from the southern part of the country across the Sahara
to the northern industrial cities. Many of the oil
installations sit in the Sahara with very few links outward. Furthermore, the PTT has no immediate plans to
extend major relief to the southern sector because of the
problems involved and the lack of major usage. Here, a
developing country finds itself in a chicken/egg relationship. Would the usage increase if the facilities were
there? The answer in one case was a resounding yes!
The Telephone Department of the Government of
Pakistan installed a direct dial cable between Lahore
and Karachi to relieve the operator circuits between
those two cities. They designed the size of the cable
based on what they felt was necessary for traffic relief.
They never anticipated that so many people just did
not make calls because they found the service impossible.
When the new link was opened, circuits were completely busy from four to six hours each day, and the
public, if anything, became more frustrated with the
addition of those new facilities because they were not
able to get through on them.
Waiting time for telephone installations ranges from
three to five years in some countries, and businesses are
often required to buy their own switchboards and resell
them to the Government Telephone Department at
partial cost, then pay a monthly maintenance charge to
keep them in service. Such conditions do not facilitate
any type of computer usage where either source documents or output must travel great distances.
On the brighter side, reports from Taipei indicate

that a U.S. Air Force satellite program is being installed
at the Linkou Air Station, where 18,000 punched cards
record 6,000 supply items needed by Air Force personnel. A remote-batch computer at Linkou will use
private line circuits to talk with the main computer at
Ching Chuan Kang Air Base in Taichung. If an item is
out of stock at Linkou, it will interrogate the larger
supply base at Taichung. If the latter cannot supply it,
the computer automatically orders it from the United
States. 4
Education

The need for education and a proper level of development during the initial introduction of computers
within developing countries is essential. First, the vast
majority of computer failures in developing countries
occur because of a "love 'em and leave 'em" attitude.
It would be embarrassing to tell you how much computer consulting work is done in developing nations
because vendors have raced through the land selling
systems, and then left the users with support ranging
from inadequate to nonexistent. A program to provide
computer hardware without provisions for training the
necessary software and maintenance personnel creates
more problems than it solves.
Furthermore, it almost appears necessary for a
person who wants to keep up with the data processing
profession to be able to read English, French or German.
Since U.S. business executives primarily deal with heads
of industry and top-level managers in foreign lands who
have this capability, they neglect to realize that a vast
majority of the technical workers or lower-level
management employees cannot read technical literature in any of these three languages with facility.
The Brazilian growth rate for EDP will probably be
20 percent from 1970 through 1974. (1970 base was
approximately $14,000,000.) Medium- and small-scale
computers are in greatest demand. Computer room
peripheral equipment is also forecast at the same high
growth. These include printers, MICR equipment,
memory systems and, outside the computer center, a
wide variety of terminals.5
In Brazil, the Society of Users of Electronic Computers has predicted that 200 additional computers would be installed and an additional 1200 qualified computer programmers and systems analysts would be
necessary within a year. The government has attempted
to deal with the situation by providing Fortran lessons,
and classes in general concepts of data processing at the
Universities, without charge, to members of the
Mathematics, Engineering, Social Science, and Science


Departments. Also, a post graduate course in computer
science, leading to a master's degree, is available from
the federal university.
Singapore has a different method of education. The
Singapore Computer Society, with over one hundred
members, has been extremely active in promoting DP
activities. The computer firms support this society
activity in large measure by encouraging their own
employees to lead discussions and classes in systems
analysis and programming.
Of course, in any country with computer potential,
the vendors give a wide array of courses. Although only
30 computers are installed in Taiwan, one vendor offers
training in basic concepts, computer systems, programming, operations and special applications programming; courses range up to three months in duration.4
The United States has recently tried some low-key
training during an EDP mission to Taipei, Djakarta and
Singapore. Anticipating that questions would come
from businessmen who were only beginning to learn
about EDP, the Commerce Department, acting through
the U.S. Trade Center in Bangkok, arranged for local
speakers to participate along with the mission members.
The synergistic reaction between the two groups proved
immensely successful. 6
Conversely, an eastern nation with which I have worked has training facilities only through an international agency and has made no attempt whatsoever
to implement computer training in the universities or
technical trade schools. First, of course, schools must
exist in reasonable quantity. For example, a recent
census showed the population of another country at
roughly 94 million. Education statistics indicated that
approximately 3,800 students were enrolled in engineering courses at the universities, while approximately
200 additional students were attending technical or
trade schools. In other words, only .004 percent of the
population was involved in higher technical training.
There is no doubt that basic computer courses can
be introduced early in a student's career to expose him
to concepts and give him interest in the subject.
Detailed courses could be set up on the same basis as
the ITU/UN communications schools so successful in
emerging nations. Once again, however, the chicken/egg
relationship must be observed. If the training produces
a large group of people who, upon graduation, have
absolutely no possibility to use their talents because of
the lack of hardware developments, the courses almost
become senseless.
Finally, there has to be a technological flair on the
part of the young people growing up, or no courses of
this type will be popular. Surprisingly, a government
official in one developing nation stated that students are


growing up with neither the desire nor the qualifications
to use their hands. Although technical and trade
schools exist in that country, enrollments are dropping,
even as the population increases.
Existence of high technology industries

Often a small number of high technology industries
within a nation can provide the nucleus around which
computer development can grow. This is most obvious,
although not limited to, oil producing countries. For
example, many of the smaller airlines have justified
reservation system computers on a combination
cost/country-training basis. To speak to a previous
point, in many of these cases it was necessary to advise
the airlines to wait until the communications facilities
within their own country could support a reservations
system.
Banks often serve as the computer nucleus of a
nation. Many Managing Directors of these institutions,
however, will tell you the sad stories of training programmers, only to have them leave for higher paying
positions as that nation's manufacturing group started
to install machines.
Although not falling strictly within this category,
computers depend on high technology industries for
their daily operation. For example, the power requirements of computers are quite stringent. Voltage variations due to inadequate federal power service will
cause malfunctions if they exceed allowable limits.
It's clear that anything that stops the supply of power
(a prevalent malady in developing countries) will also halt computer production completely.7
Financial commitment of a nation

Most important to the growth of computers is the
financial climate under which they can be installed.
Once again, the worthlessness of currency outside the
nation's boundaries is critical. In one country, computer equipment was first priced internationally, then
subject to a doubling factor as a penalty for using
foreign exchange (since the vendor would not accept
local currency) and finally, subject to a 100 percent tax
on the original amount. Users who wanted hardware
paid triple price.
On top of that, a vendor was maintaining a certain
number of machine models in that country. When asked
to supply a higher model in the series, the company
requested the equivalent of several hundred thousand
dollars extra to staff a special maintenance force and
carry spare parts.


Computers need a complete stock of parts close by.
Often they cannot be flown in from another country
because the ensuing red tape caused by "purchasing"
in foreign exchange rears its head. The charge by the
previously named vendor was probably justified
because the company knew that a week-to-month delay
in getting the machine "up" would not be tolerated by
the customer even though it was the customer's own
governmental regulations which caused the bottleneck.
Conversely, that same vendor is well known for
selling its obsolete models to emerging nations for their
local currency. This policy has a great many advantages. First, the hardware has proven itself over the
years and maintenance problems have been identified
and categorized. Second, a great many standard software packages exist for such models and with adequate
planning, an emerging nation can get much more than
its money's worth by buying such a computer with
available utility and application software. Third, of
course, those nations can get equipment by spending
their unrecognized currency, thus saving their desperately insufficient dollars, pounds or francs.
While Singapore has had a mixed series of reactions
with computers, one factor strongly contributes to
their growth in the country; it's a free port and no duty
is charged on EDP devices. However, until recently a
duty was levied on carbon paper according to the area
of the paper, rather than by the sheet. Until this was
changed, printer-paper was a first class luxury item,
rather than a negligible cost supply as it is in the U.S.4
COSTS VERSUS BENEFITS
Items to consider

Such discussions lead to the next category; cost
versus benefits. First, a number of United States
industries lose money on computer applications and a
developing country can't afford to do that. One reason
for the loss is that these U.S. corporations spread their
applications thinly. Any country can concentrate on
one, two or perhaps up to 5 percent of standard applications and do an excellent job in terms of technical
and economic measurements. The United States proved
this when they used the first generation of computer
equipment. The problem comes when countries try to
expand too rapidly, or add too much at one time.
Therefore, "limitation" is the initial secret to having
benefits exceed the liabilities.
Next, it is important to recognize exactly why computers are introduced in various locations. If one wants
to lose money he should recognize in advance that he is

going to do so. For example, the use of a computer as a
status symbol in an emerging nation is quite common,
although its justification is hidden under the guise of "competitive necessity." We have seen this particularly
in the case of airlines, where the only thing they have to
sell is service, and while the computer won't reduce
costs in a country where wages are low and unemployment is high, the appearance to the public of a mechanized reservations system is important to them. No
doubt, status might be a very good reason in a very few
cases for installing a computer. The key is to recognize
it and admit it.
In early development stages, as discussed previously,
usually no credit system exists. Payrolls are constantly
paid in the cash of the land, and checks are virtually
unknown. Therefore, the most basic system installed in
the United States becomes useless for a number of years
in this emerging "checkless society."
Inventory control is another basic United States
package, which begins to become valuable in developing
countries. Most particularly, the control of high unit-priced items will tend to save dispersion of foreign
funds. Inventory control isn't necessary if the items are
relatively inexpensive, and a large unemployed labor
force exists. If, however, the items are high volume and
expensive, such as drugs, inventory control could
pay off in making sure that proper utilization limits the
need for foreign currency to purchase additional
amounts until they are really needed.
One of the largest engineering organizations in India,
the Tata Engineering and Locomotive Company in
Jamshedpur, stated that as early as 1969 it was able to
save $8,000,000 in its inventory control by using computerized programs. In addition, in a different type of
inventory control computers helped the Indian Railways
locate hundreds of "lost" freight cars and coaches.1
Earlier we discussed the obvious benefits where a
vendor will sell his older equipment in local currency,
and probably accept that same local currency for
maintenance. Along the same line, we do not underestimate the impact of the mini-computer market. For
initial applications, no longer is the $100,000-on-up
computer necessary. For reasons of staffing and finances,
very few mini-computer manufacturers have entered
emerging nations. We feel that the loss is primarily
theirs, and that intelligent marketing combined with
support would yield them a much higher profit than that
accruing to major manufacturers who maintain large
facilities in these remote locations.
Conversely, the mini forces must not follow in the
footsteps of the "love 'em and leave 'em" salesmen.
That trend is now well recognized, and new companies
entering the field will be questioned in great detail


concerning their method of supporting the software as
well as the hardware, and means of training the national
personnel. The problems will be greater because emerging nations are becoming smarter, but the profits for
the vendor and savings for the customer exist in this
low-priced computer field.
One of the great danger areas where costs can exceed
benefits is, of course, where people are displaced in an
economy which already has a high unemployment rate.
One must be careful not to overstate the situation
because the masses of unemployed in many nations
might be incapable of performing even the clerical
functions which are considered replaceable by a computer. Therefore, the "replaceable" labor force may
be a small percentage of those masses.
In an emerging country, especially one that is now
considering computers, it is quite possible that the types
of jobs handled by the "replaceable" clerks are multiplying in other sectors. While it may not appear so at
first glance, a detailed study may indeed show that for
the clerical level and above, there is a rising economy
and jobs can be found.
However, trade unions in India have complained of
the use of computers, contending that with the vast
manpower available, there is no justification for
automation. The Government of India has been
responsive to these arguments, and has adopted a policy
of a "gradual switch to automation."
Conversely, India's computer growth (there are
about 150 machines in the country in 1971) has opened
a new line of jobs. IBM states they have trained nearly
125,000 Indian technicians in programming and other
computer disciplines. These graduates have found
jobs in India, Canada, Australia, the U.S. and other
locations.1
Furthermore, a salary inflation can happen which
throws wages askew. This occurred in the United States
with engineers, then with high technology experts and
finally with computer programmers. The crafts developed so quickly that people to perform needed
technical functions were scarce. Fairly soon the salaries
necessary to attract such employees placed them in
much higher salary brackets than their peers in the
same company. Dissatisfaction was the minimum
condition which resulted.
Conversely, the wage structures in many government
institutions have been guided by long-standing financial
instructions and general orders. These were drawn up
before computers came on the scene, and before new
skills and techniques connected with automatic data
processing were developed. It isn't surprising that such
a wage structure becomes unrealistic, since such
regulations regard programmers and machine operators


as just another set of clerical staff. Therefore, where the
choice exists, these people migrate to private industry.
Case after case has occurred where government and
banks lose programmers to the airlines, oil companies
and private institutions.
One must determine the true social costs of computer
personnel, ranging from the ones who are paid higher
than normal to the unemployed. Most generally in an
emerging nation it is important to get people working,
and this objective is diametrically opposed to the
United States commercial computer installation where
its primary objective is to reduce high labor costs.
One cannot stress enough the advantages from a
cost/benefit basis of standardized software applications.
An emerging country rarely has enough personnel to
maintain and run their system; no less to design it.
In one case, we had a difficult time advising an airline
to use either the Univac or IBM reservations system,
since these were the only industry standards. U.S.
airlines can testify to the horrors of developing such a
complicated process with a vendor who has never done
it before.
To give another example, so many common inventory
control packages exist that it is sheer foolishness to
invoke the Not Invented Here (NIH) factor and design
a new one. The first concern of an implementation
manager in a developing nation should be to collect all
the available packages capable of being run on the
computer for the application that he wishes to place.
These should be stated when performing a feasibility
planning study prior to either approving the application
or ordering the hardware. As a matter of fact, in some
cases the available packages might even dictate the
hardware to be acquired.
Finally, we stress again that the use of computers for
planning purposes is almost always considered a cost
and rarely a benefit. This just simply isn't true if it can
be utilized properly to develop strategies and alternatives for projects as important as national five-year
plans. Operations research, simulation and modeling
techniques have a definite purpose on a mechanized
basis for a developing country. It would be well worth
the investment to train the ministerial levels and their
subordinates in the acceptance and use of such techniques.
Finally, the NIH factor often comes into play to
prevent cooperation in developing EDP expertise.
Perhaps because of the administrative functions involved there is very little coordination between users.
In Nigeria, for example, when 14 computers existed
( 1969) there was no coordination or any centralized
services in any way. Basic information was scanty and
everyone thought they were developing their own


technique first. Hence redundancy in this labor-critical
area occurred with all of its resulting waste.
The use of service bureaus gives an opportunity to
many government departments and industries to trade
valuable information and to "cut their teeth" on data
processing techniques. It also provides an excellent
training ground and a transition system while developing one's own in-house equipment. Sometimes the
first industry to get a computer will make it available to
others. ICL has been one of the leaders in convincing its
customers to do this in emerging nations.
However, service bureaus on an independent basis
could provide one of the best ways to start a country on
the EDP path. Expertise in terms of both software and
hardware would be centralized, and a systems design
force could exist to be made available to all companies
using the services. In general, the service bureau would
operate on a batch basis, but it could be available to aid
the telephone department in establishing initial installations of data processing lines.
Even developing countries must be allowed the use of
a computer for pure pleasure. A computerized totalizor
("Tote Board") has been placed in operation at
Djakarta, Indonesia's racetrack. It accepts data from
up to 64 ticket issuing machines, and instantly calculates the odds. The "core" consists of three minicomputers and several multiplexors handling the
input lines.8
The feasibility planning study

Since it is virtually impossible to develop one formula
to relate areas of development, levels of development
and costs versus benefits, the use of a feasibility planning study before hardware or software commitments
are made is mandatory. At Arthur D. Little, Inc., a
procedure has been developed by Thorpe E. Wright,
and used quite successfully in emerging nations by this
author. The process involves the formulation of an
initial framework, then systematic expansion and
recalibration to produce a finished document.
It is necessary to make sets of assumptions, see what
results they yield, modify the original assumptions
when appropriate, then see how this affects the results.
Each of the six basic sections of the study may be
developed independently, based on the sections which
precede it. In a real sense, therefore, each section
provides the foundation upon which the subsequent
section is built, and therefore major additions or
changes to any one section may affect any of the other
sections.
The basic document, which is, at different stages of

its development, both a working document and a
finished systems plan, comprises six basic parts.
Introduction

The primary. importance of this section is that it
establishes the need for the system. It should contain
information concerning the history or background
leading to and stating the need for the system. It may
also include definitions of any terms used throughout
the document.
Objectives

This section specifies the objectives to be achieved
by the new system. It may also specify the manner or
style of operation to be achieved or preserved by the
new system. These objectives play a vital role in
systems design since they provide the context within
which various systems alternatives may be evaluated.
Without them it is often not possible to resolve systems
dilemmas.
Functional descriptions

This section contains statements of what the system
is to do and the services to be provided to various classes
of users. It also indicates in a general way the general
response time requirements to be met, such as on-line
response, daily processing cycle, etc. This section is
wholly "what" oriented, with little or no consideration
of how this is to be accomplished, and it is stated in
non-technical terms.
Performance specifications

This section contains statements of the amount of
work the proposed system must do. It includes such
things as estimated numbers of key items to be processed, response time requirements for processing each
of these key items, and the required reliability and
operating performance for the system. Where the system
has on-line terminals, it also includes estimates of the
numbers of such terminals.
Design specifications

This section contains a proposed system of hardware
and software capable of meeting the requirements
stated in the previous three sections. The primary


purposes of this section are to assure that the system is
technically feasible, and to design a realistic configuration to derive cost estimates for the system. Specifically,
it contains rough estimates of overall system cost and
time to complete the system. It is the most technically oriented section of the six.

Feasibility analysis

This section contains statements of the four types of
feasibility of a proposed system: technical, economic,
acceptability to users, and legal acceptability. Technical
feasibility is primarily comprised of statements concerned with whether the proposed system is workable
and capable of meeting the specified performance
requirements within the required time frame. It is also
concerned with aspects of continuity of system performance where this is implied or stated in the performance specifications.
Economic feasibility is primarily comprised of
various analyses and statements concerning the net
savings (revenues or gross savings minus start-up and
operating costs) and other tangible and intangible net
benefits (advantages minus disadvantages) associated
with the proposed system.
The third part is concerned with the acceptability of
the proposed system services to users at various levels
including the management level and the operator level.
Where system users are outside the company, this might
also involve marketing research studies. It also should
involve a comparison of the final system design back to
its objectives (Section 2) to assure that the objectives
have been adequately satisfied.
This final category is primarily concerned with
possible legal implications of providing the proposed
system services. The government regulations of developing countries are often so rigid that system changes
must be implemented to conform to them.
The feasibility planning process requires the close
collaboration of two groups: user-management and the
project study team. The user management is responsible
for the content of certain parts of the study (specifically,
Sections 2 and 3), while the project study team is
responsible for the others, and may assist in the preparation of Sections 2 and 3.
The basic functions of the project study team are to
make suggestions to the user management group, to do
the staff work required to develop certain basic data
about the system, and to determine the implications of
various assumptions about what the system might do.
The basic function of the user management group is
to assume the responsibility for the content of Sections


2 and 3 (Objectives and Functional Descriptions),
and to make decisions concerning various system
alternatives based on information presented to them by
the project study team. Each group requires the other.
The user management group is not normally capable of
doing the required technical staff work whereas the
project study team must carefully avoid making the
required top management decisions or approving its
own study results.
A key feature of the feasibility planning process is
that anyone who is not a data processing specialist need understand only Sections 1, 2, 3, and 6
in order to understand fully what the system is and
what its implications are. Since these key sections are
written in non-technical language, it is easy for a person
who is not trained in the EDP-related technologies to
understand the system at any point in its development.

CONCLUSIONS
The purpose of this report has been to set out a series of
guidelines on which to judge the most attractive computer applications in developing countries. It viewed
the problem from three areas; priority development,
levels of overall development and cost versus benefits.
It discussed the primary division of countries into
either natural resources or labor intensive categories,
and added in a general sense the public and private
services sector.
Under levels of development, it discussed four major
categories: communications, education, high technology
industries, and the financial commitment of the nation.
Cost versus benefits were reviewed by setting forth a
number of items to take into consideration, stressing
that their applicability depended entirely on the nation
and on the process under consideration. Finally, this
paper gave a suggested method for performing a
feasibility study before an emerging nation commits
itself to hardware or other costs.
Key to the entire application of these ideas is the
need to get away from generalities when discussing
each problem. It has been the author's experience that
no such thing as an average nation at an average state
of development exists. It is extremely difficult to judge
development on an overall basis since a country might
be extremely well developed in one or two areas and
backward in others.
Finally, and most emphatically, the paper stresses
that computer applications should each be considered
on their own merits; that standardized equipment and
software must be used in the early stages and that
maintenance of both the hardware and software must


be assured by the vendor before implementation begins.
To this end, a feasibility study is essential both from a
standpoint of justifying dollars and applications and
from the standpoint of forcing its originators to set
down in specific terms, the objectives, the approach,
and ways of measuring the accomplishments.

REFERENCES
1 The New York Times
July 3 1970

2 Journal of Commerce
April 26 1971
3 Iron Age
October 1 1970
4 Far Eastern Economic Review
January 16 1971
5 Computerworld
November 4 1970
6 International Commerce
July 13 1970
7 Finance and Development
March 1970
8 Computer Digest
August 18 1970

Notions about installing and maintaining
a population register in Brazil
by ANTONIO LUIZ DE MESQUITA
SERPRO, R. Eduardo Guinle 61
Rio, Brazil

INTRODUCTION


The problem of implementing and maintaining a centralized population register in a country as large as Brazil
is a complex undertaking. This paper presents some
facts and ideas underlying the work under way for the
automation of the clerical and bureaucratic tasks of our
government. A reliable Population Register system
will undoubtedly be one of its cornerstones.

ENVIRONMENT
Status of data processing in government

Brazil is a Federative Republic of twenty two states
comprising more than 4000 municipalities. At the federal level, most of the data processing is performed by
SERPRO, a public company owned by the Treasury
Department. It was founded in December 1964 by recommendation of the Administrative Reform Commission, in such a way as to encompass all of the data
processing equipment and know-how then existing at
that Department, including two 1401s and two
UNIVAC 1004s, and employed approximately 40 professionals. SERPRO today has offices throughout
Brazil and employs some 3000 people, dedicated to
data processing. Its data processing equipment monthly
bill is now of the order of 300,000 US dollars.
At the state level, only six out of twenty two state
governments run their own data processing agencies,
most of them organized as public companies, sometimes
with the participation of private owners. Most of the
state governments do not use data processing at all. The
same is true for all but ten of the municipalities.
There is little use of data processing for defense.
Computers are used by the military mostly for clerical
purposes.

For federal and state government the two main applications developed are tax collection and payroll. The biggest success is in the area of income tax collection, controlled at the federal level.

From 1965 to 1970 the number of income tax payers grew more than twentyfold. In 1969, for the first time, the Treasury Department mailed back income tax refund checks, about 400,000 of them. This increased to 900,000 in 1970. SERPRO has had prime responsibility for this breakthrough. There has also been a continuing effort toward the improvement of the quality of information by convincing public officials of its value.

No major advances have been introduced in the areas of data storage and data utilization. In the field of data processing, tape oriented and straightforward report generation techniques are still used. The database concept, as well as some of the most recent tools of data utilization for management and planning, are now under study, on an experimental basis only.

Communications network development

In this area our federal government set up a special
fund in 1967 to finance the improvement of the country's long distance communication network. A new
company has been formed and it supervises the installation of about five million usable voice channel-kilometers in microwave linkages.
The program, costing some fifty billion dollars, is
planned to become operational in mid-1972. The first
benefits are however already here. A new breeze is blowing over our local telephone companies. New regulations
have allowed them to be funded directly from the subscriber. Also the quality of long distance calls has improved and is exerting pressure on local services. The
telephone system is thus becoming a reality in Brazil.

The mail system however has not kept pace. Only
now are the first automatic letter dispatchers going
into operation. The Post Office has been turned into a
public company, and this will certainly help to improve
future services. One third of the city of S. Paulo, the huge Brazilian industrial metropolis, still lacks the services of a postman.
Use of identification cards

Identification cards are sometimes not used in connection with a Population Register. In Holland the Population Register does not inform the individual of his
identification number.
In Brazil, despite the non-existence of a centralized
register system, identification cards are used and citizens are required to display them often. Partly for this
reason, there exists a reasonable number of public
agencies legally empowered to deliver identification
documents, though each has a well defined and distinctive prime objective in mind. Driver's licenses, labor cards, and income tax cards duplicate in many aspects the regular identification cards. Even the latter
are not issued by a single agency.
If this number were unique and were carried on an
identification card together with other identification
information, it would assure local auditing of the number's use and a permanent feedback system through
the individual's reporting to the government. The use
of magnetic character printing for the identification
number in the card would also avoid its misuse.
Our basic problem in this respect is therefore to unify
all of the existing identification documents into a single,
multipurpose standard identification card. The legal
support for this matter has already been established
and requires as of now just small changes.
The Population Register under study

A Population Register is, in my understanding, a
system which basically allows substitution of a non-unique alphabetic person identifier (i.e., person names) by an unambiguous code number. The principal properties of this number are uniqueness, ease of use and
universality. Uniqueness is its most important property and requires the sustaining services of sophisticated computer systems.
Computer hardware is so powerful nowadays that it no longer requires code numbers to carry particular meaning, such as a digit for sex, two for birth date, etc. This may have been a must for systems of some

years ago where the code had to play the role of addresses and retrieval keys. Today, sequential coding
provides the necessary flexibility to management, and
a measure of protection against privacy disclosures.
The ultimate goals of a Population Register are:
(i) the simplification of administrative routines;
(ii) the control of population on an individual basis
and the full use of social legislation; and
(iii) the reduction of the burden of government on society.
To meet these objectives the Population Register has
to assure:
(i) uniqueness;
(ii) universality;
(iii) minimum delay between the actual event that
generates data and its incorporation into the
files; and
(iv) suitable response time for a broad class of users.
Two types of information have to be maintained in
the Population Register files:
(i) the identification information;
(ii) general purpose information, i.e., non-identification information of interest to a large number of users (such as address, education level,
etc.).
There is a large spectrum of identification data about
individuals. These data differ in their frequency of updating and change, ease of collection, and number
of possible values. The selection of the identification
data is a vital point in the design of the control system.
For each selected identification set there is a measurable
probability of identifying a unique person. This probability has to be as high as possible, provided:
(i) the update and response times are not substantially degraded; and
(ii) the system's cost and complexity are kept under
reasonable limits.
The solution proposed for the Brazilian Population
Register Control System incorporates the use of Master
and Complementary Files. The latter serves the purpose of resolving indeterminacy questions which may
occur when searching the Master File. Taking advantage of modern hardware (specifically direct access
storage) the system is being conceived with the Master
File permanently on-line, and the Complementary File
scheduled on-line.
The objective of the Master File, with entries by name and identification number, is to maintain permanently a cross-reference between these two person identifiers. Sex, date and place of birth are the other identification data items maintained on-line. Compression techniques for data compaction are necessary because the premium on secondary storage space is
greater than in processing time. Names have to be
normalized and sometimes shortened, in which case a
name's complement is recorded in the Complementary
File.
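To make the Master/Complementary File idea concrete, the following sketch (in present-day Python, with illustrative field widths and record layouts that are assumptions rather than part of the SERPRO design) shows one way a name can be normalized, shortened for the on-line Master File, and cross-referenced against the identification number, with the name's complement held in a Complementary record:

import unicodedata

MAX_NAME = 20  # hypothetical on-line field width, chosen only for illustration

def normalize(name: str) -> str:
    """Upper-case, strip accents and extra blanks -- one plausible normalization."""
    decomposed = unicodedata.normalize("NFKD", name)
    stripped = "".join(c for c in decomposed if not unicodedata.combining(c))
    return " ".join(stripped.upper().split())

master = {}         # id number -> (shortened name, sex, birth date, birthplace)
by_name = {}        # shortened name -> list of id numbers (may be ambiguous)
complementary = {}  # id number -> full normalized name, when shortening occurred

def register(id_number: int, name: str, sex: str, birth: str, place: str) -> None:
    full = normalize(name)
    short = full[:MAX_NAME]
    if short != full:
        complementary[id_number] = full   # name complement kept off the main file
    master[id_number] = (short, sex, birth, place)
    by_name.setdefault(short, []).append(id_number)

def find_by_name(name: str) -> list:
    """Master-file search; an ambiguous result would be resolved with the
    complementary file (or further identification data) in a real system."""
    return by_name.get(normalize(name)[:MAX_NAME], [])

register(1001, "Jose da Silva", "M", "1940-03-12", "Sao Paulo")
print(find_by_name("Jose  da Silva"))   # -> [1001]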
Database/Data Communication philosophy is employed throughout the system's design, as we will enforce
the use of file handlers and transaction oriented software. However, the system is planned to start operating in batch. Future transition to an integrated data
processing system, requiring an on-line communication
environment, will probably be attained without too
large an effort.
The role of the Population Register Control System
at that time will be to exercise control over decentralized population databases, both functionally and geographically. Among these databases we may count on having the Social Security Pension Plan, Medical Care, the Income Tax, Labor Funds, and the Popular
Savings Bank, to mention just a few. The Central Control System will avoid redundant and contradictory
data collection and storage, having it centrally controlled and maintained. Local databases will supply
detailed information where needed. Centralized processing will consolidate data into higher levels of aggregation. The exchange of information among the local
databases will also be assured and disciplined by the
Central Control System.
SERPRO, being the largest data processing agency
for the federal government, is the natural vehicle to
pursue these plans and to turn them into reality in the
next 4 or 5 years. Other government data processing
facilities will make use of SERPRO's services via
terminals. The system will become a nation-wide government information system.
IMPLEMENTATION
The usefulness of a Population Register is related to
the frequency with which people need to report their identification.
The quality of information, viz., its level of updating,
accuracy, etc., depends on the pressures exerted over
the system by its users. The larger the universe of
users, the better we think the system will work and in
general, the higher the quality of the information it will
provide.
To implement the system, a new identification has to
be provided for every citizen. This must be done as
much as possible in accordance with the existing body
of laws and regulations.
A practical way of issuing to a significant number of citizens their new identification in a relatively short
period of time is to make use of a simple sequential
coding system. A census like campaign can achieve this
goal. Pre-numbered identification forms will be distributed to the population thus tying the data collected
about a person with the code number which has been
assigned to him. During this phase it is not recommended to centralize geographically the assignment of
identification numbers to persons, for this would slow
down the impetus of the campaign.
Checks for code duplicates and for data validation
would have to be carried out at file creation time. Also,
provisions have to be taken to convert files containing
the existing and varied identification information. Correspondence files will have to be established and maintained between the new and the old identification system throughout the duration of this phase.
At its end we will switch to a centralized code assignment operation. From this moment all data validation
will have to be performed before a person is admitted
to the Register. At this time the system may be fully
operational, although expansions and adjustments will
have to be expected.
Databases are fragile. Checkpoint and recovery procedures must be carefully thought out. Tape copies of
the disk files have to be produced periodically. This is
better justified if it takes place when exhaustive file
searches become necessary to satisfy new requests.
There must also exist a single responsible institution
for reporting updated information about each data
item in the Population Register files. This is a key
point in the updating process, where, once more, turnaround time has to be very short.
CONCLUSION
Emerging nations must take advantage of their late
start in many technological areas. In this regard, data
processing in Brazil has to follow the steps we took in
developing our long distance communication network.
We started late, from scratch, and are making use of
the most recent technology.
This must happen too in the field of data processing,
mostly inside the government. In data processing the
principal problem which must be solved is that of
manning the new technology in order to reconcile the
requirements for software and hardware. This was not
necessary in communications, and in this respect implementation of the two technologies differs. One approach to solving the data processing problem is
through local computer vendors. They must begin to

rely on local talents not only for marketing and manufacturing, but also for product development. This
change in the rules of investment policy followed today
will guarantee the catalytic element that will bring
forth a locally developed technological society in this
branch of activity.

"

The Neurotron monitor system*
by RICHARD A. ASCHENBRENNER, LAWRENCE AMIOT and N. K. NATARAJAN
Argonne National Laboratory
Argonne, Illinois

INTRODUCTION

HARDWARE MONITORING

The subject of performance monitoring and measurement has grown from infancy to childhood, and with
this growth came substantial performance improvements even with superficial monitoring analysis. The
recent increased interest in applying measurement
techniques by manufacturers and users of large systems
stems mainly from the high cost of development, purchase, and use of such systems. This cost obligates
each to obtain quantitative information on the dynamic
behavior of proposed or purchased equipment and software. This quantitative information is necessary when
a determination is to be made of the difference between
potential and actual performance of hardware and
software.1,2,3
In addition, the particular areas of interest at
Argonne National Laboratory which have benefited
from the development of hardware and software
monitoring techniques are: (1) configuration analysis
and optimization; (2) "large" program profile analysis;
(3) simulation analysis; and (4) computer architecture
studies. Each area requires a parametric description of
the system for solution. Most important, and even
more elementary, is selection and quantification of the
independent parameters or variables, rather than just of
the measures which indicate performance for a particular
situation. The methods used to evaluate and predict
system performance must provide insight into how
complex systems function; they must provide insight
into the most important parameters and measures of
the system; and they must also provide quantitative
information on the sensitivity of these measures.
System software monitors, program analyzers, and
hardware monitors have each made their contributions
in performance evaluation and prediction.

Hardware monitoring offers the ability to obtain
information on system performance by directly attaching probes on a host system. Microevent analysis
which would take inordinate time by means of timer
trace simulation or gross statistics gathering can best be
obtained by hardware monitors where event measurements can easily be selected and varied to minimize
voluminous amounts of data recording.
Hardware monitoring has the obvious advantage of
measuring systems which have no easily implemented
means of software monitoring, or where the introduction
of such artifact would cause system degradation. This is
true especially in the computer-automated experiments
and real-time or communications-oriented systems.
Similarly, it is advantageous when the software artifact
introduced affects the measurement statistics.
Previous studies using available hardware monitors
demonstrated the need to obtain information directly
at a more primitive level, greater in quantity, at higher
bandwidths, and with more convenient and accessible
output facilities. To this end, a hardware monitor
project was initiated at Argonne National Laboratory
to achieve a more creditable means of system measurement and evaluation.
This monitor has demonstrated its ability to interact
with the monitoring process and to provide analysis
and display concurrent with measurements. This
monitor development has also aided in solving the
problem of information loss due to sampling while
reducing the data collection rates and raw data storage
and processing requirements.

"NeUTQtron" monitor
The design of the Argonne "Neurotron" monitor
overcomes the previously stated deficiencies. in many
hardware monitors, and in addition provides inter-

* Work performed under the auspices of the U.S. Atomic Energy
Commission.

action by operator or program with the data accumulation, analysis, and display. This interactive (rather than passive) monitor is based around a minicomputer, storage display, tape unit, and specialized computer-controlled logic and data accumulation hardware. This interactive ability has also been the basis of design for the future communications between the hardware and software monitor processes.

The goals of the monitor development were as follows:

1. Capability of high bandwidth in logic and data accumulation facilities.
2. Program-controlled logic for selecting or filtering events based on current experiment or recently collected data.
3. Capability of obtaining data at subinstruction or instruction level as well as gross operating statistics.
4. Capability of response to events or interrupts of interest within a reasonable interval or to "automatically" select monitoring periods during events of interest.
5. Capability of recording and analyzing events or sequences with short (millisec) perturbations as occurs in many "real-time" systems as well as presenting statistics on long-term variations.
6. Graphic output for providing messages and snapshots of the monitoring process to operators or experimenters.
7. Capability of obtaining information felt necessary for examining the highest performance processor locally available (360/MOD 75).
8. Inexpensive and portable as possible in order that remotely located computer systems could utilize the equipment.
9. Relative ease in adapting hardware to new experiments or expanding the equipment as monitoring experience evolves.
10. Form a basis for a combination hardware-software monitoring process when further understanding of this process is available.

Figure 1-Photograph of the 'Neurotron' monitor

Figure 1 is a photograph of the equipment, and Figure 2 is a functional description.
The mini-CPU is used as the monitor control element.
The computer coordinates the operation of the I/O,
logic selection, algorithm selection for the Random
Access Memory and arithmetic unit, programmed
logic, counters and sequencers. This coordination is
performed under a multiprogramming-priority system.
Display programs, data acquisition programs, and
analysis programs individually have priorities attached
in addition to the priorities normally associated with
I/O and probe interrupts.
Monitor elements

The basis for monitor data acquisition is the programmed selection and control of the elements in the
monitor. In addition to the computer program selection
of elements and paths, a 36 X 24 patchboard programmer
is included to aid in the I/O selection.
Control elements

Programmable logic and registers, which can be set and read by either the CPU or the external environment, form the main communication link. These

Figure 2-Functional description of the 'Neurotron' system

registers control the selection of input probes, logic
selection, start/stop control, sequence configuration,
transmission paths, and other control functions depending on the particular experiment in progress.
Logical elements

More typical of conventional monitors are combinatorial logic elements, decoders, comparators, and
pulse generators. These devices are used in performing
logical processing on the input signals. Thirty-two 40 MHz counters are available for event counting or
timing. Each counter or group of counters can be
selected by program for start/stop, read, or read and
clear. In this manner, sampling intervals are completely at program discretion.
Sequencers

Sequencers are logical devices used in the determination of event occurrences relative to previous or
subsequent events. Events may be addresses, instructions, device movements, encoded signals, etc. Each
sequencer is designed to accept pulses representing an
event and to track subsequent events. If a break in the
defined series of events occurs, an output is enabled
which can control other sequences, logical changes, or
be used for counting or timing. Each physical device can
be used for the sequencing of three events; however,
sequencers can be chained to much longer lengths.
Sequence detection experiments typically use from
2 to 16 events.
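The following is a minimal software sketch of a three-event sequencer of the kind just described; the event names and the action taken on a break in the series are illustrative assumptions, not details of the Neurotron hardware:

class Sequencer:
    def __init__(self, expected, on_break):
        self.expected = list(expected)   # the defined series, e.g. three events
        self.position = 0
        self.on_break = on_break         # could drive a counter, timer, or another sequencer

    def pulse(self, event):
        """Feed one event pulse into the sequencer."""
        if event == self.expected[self.position]:
            self.position = (self.position + 1) % len(self.expected)
        else:
            self.on_break(event)         # break in the defined series: enable output
            self.position = 0

breaks = []
seq = Sequencer(["FETCH", "DECODE", "EXECUTE"], on_break=breaks.append)
for e in ["FETCH", "DECODE", "EXECUTE", "FETCH", "EXECUTE"]:
    seq.pulse(e)
print(breaks)   # -> ['EXECUTE']: the series was broken after the second FETCH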
Random access memory (RAM)

A key element in the acquisition of data is a Random
Access Memory and associated arithmetic unit. The
RAM consists of 60 nanosecond access monolithic chips
organized in a basic configuration of 256 words by
16 bits. Depending on the selection of the experiment,
however, it can be used in configurations of 512X8,
256X16, or 128X32 bits. The CPU and external
system can access this memory in several modes; e.g.,
write, read only, increment, and read/clear. In addition,
control over external access is maintained by the CPU.

Data processing
Data processing is performed by combining and
controlling the physical elements in the "Neurotron"
monitor, accessing the contents of these elements by

Figure 3-Text display of an activity interval

the CPU, and analyzing, recording, and displaying the
effects of the monitored events.
Text displays

A typical output display is shown in Figure 3. The
interval for sampling and the information displayed is
selected by the operator or experimenter, constrained,
obviously, by the monitored entities. Information can
be presented in bar graph form if desired rather than in
the text form illustrated.
Similar display and recording can be accomplished
by encoding of events such as interrupts, device accesses, and channel use, limited generally only by the
ingenuity of the experimenter.
Instruction analysis

In gathering statistics on instruction distributions
correlated with I/O activity, time, program keys, or
other events, the RAM is used as a 256X 16 bit accumulator. That is, the instruction format (up to 8
bits) is used as the address field in accessing the
memory. On each instruction execution (or instruction
issuance depending on the host system architecture or
statistic of interest) the RAM is accessed at the location
specified by the address, updated by a count of 1, and
restored to memory. The unit was designed to perform
the update in less than 200 ns (sufficient for current
equipment at the Laboratory). The sampling interval
can be controlled by a CPU data collection program
driven by a programmable clock, or can be determined
by externally triggered or internally calculated events.
By sampling interval is meant the time during which
every instruction is monitored, counted, and totals are
read into the CPU memory. Normally, at the end of
each interval the data collection program can read or
read/clear all locations of interest in the RAM as well
as counters, communication registers, and other devices
in preparation for the next interval while event counting

Figure 4-Distribution of instructions during sample interval

Figure 5-Memory activity display-interval 1

continues. The program may start and stop the collection of data during this interval, depending on the
allowable skew between the reading of all data and
continued accumulation (the degree of correlation).
Multiple samples can be retrieved, accumulated,
recorded, and displayed. An example of an instruction
distribution display for a sampled interval is shown in

Figure 4. Selected portions or a condensation into
major categories can be displayed if desired. A condensation for a time period including the displayed
sample interval is shown in Table I.

TABLE I-Instruction Type Distribution During Successive Intervals

TYPE                        INTERVAL I   INTERVAL II   INTERVAL III
SPECIAL: DECIMAL, EDIT          4.13%        2.82%         2.26%
CONTROL, IO                      .37%         .38%          .45%
LOAD-STORE, INTEGER            37.13%       41.13%        45.36%
INTEGER ARITH.                 10.24%        9.72%         7.76%
LOAD-STORE FLOAT. PT.           2.42%        1.94%         1.45%
FLOAT. ARITH.                    .83%         .82%          .09%
BRANCH                         27.86%       26.04%        24.69%
LOGICAL, TEST, COMPARE         17.02%       17.15%        17.94%
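As an illustration of the instruction-analysis mode, the sketch below plays the role of the 256 x 16 bit RAM accumulator in software: the 8-bit instruction code indexes a 256-entry table that is updated on every instruction and read/cleared at the end of a sampling interval. The opcode stream and the chosen opcodes are synthetic assumptions used only to exercise the sketch:

import random

ram = [0] * 256          # stands in for the 256 x 16 bit RAM accumulator

def record(opcode: int) -> None:
    ram[opcode & 0xFF] += 1      # update-in-place on every instruction issue

def end_of_interval():
    """Read/clear all locations, as the data collection program would."""
    snapshot = ram[:]
    for i in range(256):
        ram[i] = 0
    return snapshot

random.seed(1)
for _ in range(10_000):                     # synthetic instruction stream
    record(random.choice([0x58, 0x50, 0x47, 0x1A, 0x5A, 0xD2]))

counts = end_of_interval()
total = sum(counts)
for op in sorted(range(256), key=counts.__getitem__, reverse=True)[:3]:
    print(f"opcode {op:02X}: {100.0 * counts[op] / total:.2f}%")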

Memory utilization

A similar approach is taken in monitoring address
streams. Since the current size of the RAM is 256
words, only 8 bits of an address stream are utilized. In
a one-million byte system, the host memory is therefore partitioned into 4K byte blocks. Any memory
access of the host system increments the corresponding
location in the RAM. Concurrently this absolute
memory activity during a sampling interval can be
retrieved, recorded, or displayed. Examples of a
memory activity display for two consecutive sampled
intervals are shown in Figures 5 and 6. Each division on
the horizontal axis represents a 4K block of core, and
the vertical displacement (full scale = 10^6 accesses)

Figure 6-Memory activity display-interval 2

represents the absolute number of accesses made to
that block in the sampled interval. The build-up and
decay of utilization is quite obvious from these successive displays. Since the displays are under program
control, scale changes and interval selections are available to the operator while absolute counts and correlated information from counters or sequencers are also
displayed and recorded.

Memory accessing

In analyzing memory accessing in various systems, it
may be more important to know the read/write characteristics and the relative magnitude of accesses than
the absolute memory utilization. In this type of experiment the RAM is used as a 512 X 8 bit memory with
even locations used for 'read' counting and odd locations
used for 'write' counting. Thus, 255 counts may be
accumulated in any sampling interval with counting
inhibited after 255 is reached. While these counts are
being retrieved and recorded, an operator display is
generated, as shown in Figure 7, in which each block
corresponds to a 4K byte segment of a one-million byte
memory. This display indicates those regions active for
read, write, or both, during the preceding interval and
also indicates those regions with no memory activity.
These latter regions may be allocated but are not
active. Similar displays can illustrate channel and/or
CPU access in each region.
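A software sketch of this read/write accounting follows; the 512-entry table stands in for the 512 x 8 bit RAM organization, with even locations counting reads and odd locations counting writes for each 4K byte block, and counting inhibited after 255. The sample addresses are illustrative:

RAM = [0] * 512                 # 512 x 8 bit organization: 256 blocks x (read, write)

def access(address: int, is_write: bool) -> None:
    block = (address >> 12) & 0xFF          # 4K byte blocks of a 1M byte memory
    slot = 2 * block + (1 if is_write else 0)
    if RAM[slot] < 255:                     # counting inhibited after 255
        RAM[slot] += 1

def region_status(block: int) -> str:
    r, w = RAM[2 * block], RAM[2 * block + 1]
    return {(False, False): " ", (True, False): "R",
            (False, True): "W", (True, True): "RW"}[(r > 0, w > 0)]

access(0x0042A0, is_write=False)
access(0x0042A8, is_write=True)
access(0x01F000, is_write=False)
print(region_status(4), region_status(0x1F))   # -> RW R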
Activity graphs

While statistics on memory or instructions are being
collected, other information can be recorded in high-speed counters during the corresponding interval.

Figure 7-Memory access display-read/write activity


Figure 8-Device activity history

Updated activity graphs can be generated by the
data acquisition programs indicating the most recent
history of the device or event of interest. Displays
similar to Figure 8 (indicating device activity in this
case, for the last 32 seconds) can be most interesting
and useful to operators and experimenters alike,
especially during periods of high activity.
Buffer analysis

Since the RAM can be loaded from either the CPU
or an external system, the memory and associated
arithmetic unit may be used as a large quantity of
comparators.
An example of this type of operation is in the analysis
of buffer type memories and their appropriate
algorithms.4,5 It is anticipated that hierarchy memories
will be usefully implemented in a variety of design
situations and it is necessary to determine their effects
on a range of applications and environments. One useful
technique currently implemented is called "congruence
mapping." This technique has advantages both in ease
of implementation and access time relative to mapping
techniques necessitating a full associative search.
Buffer configurations are N X M blocks of B bytes
capacity each. The "Neurotron" allows the simulation
of buffers up to 4 X 128 blocks. The selection of the
address bits monitored determines the block capacity.
The Random Access Memory is organized into a
128X32 bit memory, each word divided into four fields
of up to 8 bits each. Obviously, any submultiple of each
dimension can be used also. The number of successes
(buffer hits), the class level of the match (or replacement) and other information is available for recording
for the currently implemented algorithms of least
recently used (LRU), first in-first out (FIFO), and
random replacement. Again, the sampling interval can

TABLE II-Typical Results Obtained by Address Stream Monitoring, with 'Neurotron' Used as a Simulated Buffer Memory

N x M x B        REPLACEMENT ALGORITHM    TYPICAL SUCCESS RATIO
4 x 128 x 64     LRU                      97%-99%
2 x 128 x 64     LRU                      94%-98%
4 x 128 x 32     LRU                      97%-99%
2 x 128 x 32     LRU                      91%-98%
4 x 128 x 64     FIFO                     95%-99%
2 x 128 x 64     FIFO                     94%-97%
4 x 128 x 32     FIFO                     92%-96%
2 x 128 x 32     FIFO                     89%-95%

be determined by program, clock, interrupts, events,
etc.
By monitoring the various address-generating
mechanisms in a system without buffer capabilities,
the effects of buffer size and replacement algorithm on
such equipment can be determined. Table II indicates
results of a few randomly sampled intervals on a
S/360/MOD 75 with 1M byte of fast core. Similar
experiments have been performed on computer control
equipment, communications concentrators, and timesharing systems. Variations of this experiment are used
in analyzing use of distributed and read-only memory.
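The sketch below imitates the congruence-mapping experiment in software: a 128-class, 4-way buffer of 64-byte blocks with LRU replacement, fed by a synthetic address stream. The parameters and the address stream are assumptions chosen for illustration; the monitor itself performs this with its 128 x 32 bit RAM against live address traffic:

import random
from collections import OrderedDict

N_WAYS, N_CLASSES, BLOCK = 4, 128, 64

classes = [OrderedDict() for _ in range(N_CLASSES)]   # per-class LRU order

def reference(address: int) -> bool:
    """Return True on a buffer hit, False on a miss (with LRU replacement)."""
    block_no = address // BLOCK
    cls = classes[block_no % N_CLASSES]               # congruence class selection
    if block_no in cls:
        cls.move_to_end(block_no)                     # refresh LRU position
        return True
    if len(cls) >= N_WAYS:
        cls.popitem(last=False)                       # evict least recently used
    cls[block_no] = True
    return False

random.seed(7)
hits = total = 0
base = 0
for _ in range(100_000):                              # synthetic, mostly-local stream
    base = (base + random.choice([0, 4, 8, 64, -64, 4096])) % (1 << 20)
    hits += reference(base)
    total += 1
print(f"success ratio: {100.0 * hits / total:.1f}%")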
Data recording and display

Once data has been retrieved from counters, RAM or
communication registers, information buffers may be
updated, recorded, or displays generated. By use of
rotating buffers, double buffering and other techniques,
statistics with a resolution of a few milliseconds can be
recorded, or significant filtering and compression of
data can be performed before display or recording.
The interactive display has provided a means of
pre-acquisition probe adjustment, judging the reasonableness of the on-line data acquisition and reduction
programs, and snapshots of the performance statistics
during data collection. It also provides information to
inquiries by means of messages, histograms, bar charts,
time plots, etc. This interactive capability has effected

a time savings not only during monitoring, but is also useful in observations during post-monitoring analysis, which may be performed in the monitor itself.
As anyone familiar with hardware monitoring techniques can verify, the probing of a large number of
unfamiliar systems can be difficult. A display of the
information currently being processed has proved
useful in determining that probes have been properly
placed and are in working order. Similarly, since the
data acquisition programs are usually time or interrupt
dependent, the display can provide a means of judging
the necessary sampling intervals or buffering techniques
for the proper display and recording of data before final
monitor results are obtained.
Snapshot displays of the statistics are useful to
experimenters and operators interested in more immediate information before postmonitoring analysis. This
information usually relates to device utilization,
interrupt activity and associated core utilization, and
channel activity.
This immediate data reduction can also potentially
provide feedback to software monitors executing in the
host system. Currently being designed is a channel
interface to IBM 360 equipment. It is felt that this
interface between our existing monitors will provide a
means for a more optimum collection of information for
both system and user programs. Our experience thus
far, however, has demonstrated the usefulness of interactive hardware monitors in several environments in
which no software monitors are available, or in which
the event bandwidth is outside the capability of those
monitors.
Software development

An on-line operating system for the "Neurotron"
was developed6 to service the diverse applications
anticipated. This operating system establishes an
environment for allowing a versatile priority structure
to be defined by the user programs. Although physical
interrupts and devices within the system are assigned
separate priorities, each program or module may have
separately assigned execution priorities. This allows
users to have dynamically varying priorities for various
modules based on current data rates, event occurrences,
program type, etc. The actual scheduling of programs
and interrupt connect/disconnect can occur by means
of keyboard input, interrupt occurrence, or from another
routine. A rigid modular structure allowed the operating
system to be highly interruptable and greatly decreased
the development time. With the operating system is
provided a set of routines to aid users in the develop-
ment of their applications. Programs for display, dump,
trace, interrupt connect, breakpoint, etc., are available
for debug and on-line use.

CONCLUSION
The development of the Neurotron has provided the
engineers and system programmers at Argonne with a
convenient means of monitoring the operation of a
variety of computing facilities. The monitor organization
and acquisition hardware have allowed the recording
of data whose collection heretofore was prohibitive,
expensive, or time-consuming. The interactive and
display capabilities of the system have provided the
user with the necessary facility for immediate interrogation and presentation of data. It is estimated that
similar monitors could be made commercially available
for $35,000 to $65,000 depending on software provided,
logical features, etc.
Data of the type previously shown is continually
available to the user to provide the necessary operating
picture of the system. The monitoring work that has
been accomplished to date suggests the usefulness of a
real-time interaction between system monitoring (hardware and software) and system programs such that
dynamic system adjustment is both possible and
useful.

REFERENCES
1 G ESTRIN et al
SNUPER computer
AFIPS Spring Joint Computer Conference 1967
2 P CALINGAERT
System performance evaluation: survey and appraisal
Comm ACM Vol 10 No 1 January 1967
3 D HOPKINS G ESTRIN
An interfering instrumentation computer
UCLA 10P14 57
4 L A BELADY
A study of replacement algorithms for a virtual storage
computer
IBM Systems Journal Vol 5 No 2 1966
5 R L MATTSON et al
Evaluation techniques for storage hierarchies
IBM Systems Journal Vol 9 No 2 1970
6 L AMIOT R ASCHENBRENNER
The 'Neurotron' operating system
Argonne National Laboratory Applied Mathematics
Division Technical Memorandum No 223 unpublished

A simple thruput and response model
of EXEC 8 under swapping saturation
by J. C. STRAUSS
Washington University
Saint Louis, Missouri

INTRODUCTION

EXEC 8 is the multiprogramming, time sharing operating system for the Univac 1100 computer systems. EXEC 8 attempts to provide satisfactory concurrent batch, demand (interactive), and real time processing through complicated priority scheduling schemes for both real memory and CPU time allocation. Basically, the scheduling schemes allow real time service to have whatever resources it requires and demand and batch service requests share the remainder. The sharing algorithm is quite complicated; in essence, however, it dynamically limits the time average impact of demand service on the system performance to an installation set limit function of the number of active demand users. Within the demand and batch type categories, time and core are allocated by exponential scheduling algorithms biased to favor small jobs, but constrained to service all jobs eventually. In addition, EXEC 8 provides all the I/O control, file handling, diagnostic error testing, user support systems, etc., normally associated with third generation operating systems.

This paper presents a simple deterministic steady-state model developed to help understand the gross scheduling and resource allocation problems in the operation and (particularly) the performance tuning of EXEC 8. The model is concerned solely with the long-term balance of demand and batch services; as such, it does not concern itself with real time services and need not concern itself with the standard operating system user services.

This is but one of a number of attempts1-7 to model significant behavioral aspects of very complex computing systems by simple models. Most of the referenced papers present justification for the philosophy. To avoid repetition here, suffice it to say that the problem is basically no different than any complex system modeling problem; i.e., groupings of interesting and significant behavioral aspects are isolated to preserve, in some sense, a homomorphic mapping between the original system and the reduced model. The verification and interpretation problem is also similar to other modeling and simulation situations; i.e., the model is verified for measured behavior and employed to predict unmeasured and/or unmeasurable behavior. Here, too, the significant problem is to limit the aspirations of the study and not attempt to employ the model in situations that do not preserve the original homomorphic mapping.

The performance model presented here developed out of a larger system performance evaluation and timing study concerning the 1108 operated by SINTEF (a non-profit engineering research foundation) for the Technical University of Norway (NTH), Trondheim, Norway. This study was prompted by the circumstance of upgrading from a very satisfactory Univac 1107 to an 1108 and experiencing an increase in the installation cost/performance ratio. This was subsequently explained by a number of factors such as lower discount percentage on the 1108, minimum EXEC 8 configuration, workload tuned to the 1107 EXEC 2, etc. However, this post facto analysis did little to soften the blow. Also, in trying to analyze the behavior of EXEC 8 with a view to tuning the system control parameters for "optimum" performance with the local workload and configuration, it was determined that Univac (at least in Europe) did not understand EXEC 8 very well. Thus before initiating more ambitious measurement, analysis, and simulation projects aimed at performance tuning, it was necessary to obtain simple conceptual and analytic models of significant behavioral aspects of the system.

The model developed is based on the not unreasonable assumption that under heavy pressure for demand service the single channel swapping device of the NTH 1108 configuration will saturate and thereby become the limiting resource to system performance.

Most other simple models are based on limiting resource assumptions of one sort or another. For example,
in Reference 7 Kimbleton and Moore develop a simple
model of IBM 360/67 performance based on the
assumption that the CPU is the limiting resource.
While limiting resource models are certainly conceptually and often analytically simple, their usefulness
is very much dependent on the validity of the original
limiting resource assumption. The validity of the
swapping saturation assumption underlying the current
model is investigated here in concept, by measurement,
and finally directly in terms of the model parameters.
The extent to which the model is a success has to be
measured against its initial goals; i.e.: (1) to follow and
predict steady-state EXEC 8 performance as a function
of configuration and workload, and (2) to serve as a
focus for detailed study of EXEC 8 design, construction, and behavior. Both these points are discussed in
the sequel in light of presented results.
This paper is organized as follows: the next section
describes the manner in which the load is characterized
and presents those features of the EXEC 8 core and
CPU time scheduling algorithms that significantly
affect the average behavior of the system. The BASIC
MODEL section develops the basic model and the
VERIFICATION section attempts to verify this model
against behavior observed at NTH. The AUGMENTED MODEL section analyzes the. shortcomings
of the basic model and proposes and verifies an augmented model designed to correct problems due to
limited core space. The final section analyzes the region
of significance of the swapping saturation assumption
underlying both basic and augmented models. In
addition, some simple extensions involving queueing
theory are indicated.
LOAD AND SYSTEM CHARACTERISTICS
The manner in which the system load is characterized
is described and important features of EXEC 8 operation and behavior are presented.

Load
The model is intended to describe the steady-state
performance of the system under average loading
conditions. Such performance is almost assuredly not
the same as the performance under a uniform average
load characterized by the means of various distributions
describing the average load. However, for sake of
simplicity, a uniform average load is employed in the
subsequent model development. This is a place to start
and perhaps as pointed out in References 2 and 4 some

interesting gross performance statistics can be obtained.
(While very interesting, it would be extremely expensive
to investigate the effects of this assumption experimentally in a meaningful way. It certainly should be
pointed out, however, that the resulting model predictions will be optimistic at best. At NTH, it is hoped
to look at this question in detail with the aid of a
complete simulation model of EXEC 8 now under
development.)
Table I presents the average characteristics notation
employed to describe the load:

Core quanta
The core-time impact of a task in EXEC 8 is measured by its core quantum, Ψ, which is related to the core quantum time, Q, of a task requiring C blocks of core as follows:

    Ψ = Q·C = Ψs·2^(Pc - 1)    (1)

where:
Ψs is an installation specified value. (The NTH EXEC 8 employs Ψs = 512 block·ms.)

Pc is the core priority level of the task. (In EXEC 8, the core priority level of a batch task is fixed at its run priority level [typically 6] while the core priority level of a demand task starts at level 2 and increases with each successive level of the exponential CPU time scheduling algorithm that the task experiences [actual core and CPU priority is in inverse order of level number; typically, the priority level of an interactive task remains at level 2].)
Unfortunately for ease of analysis, the core quantum
time, Q, of a task is not elapsed time in core, but rather

TABLE I-Load Characteristics Notation

                                                               Batch    Demand
Average core requirements including non-resident
  executive functions that must be loaded
  (in units of 512 word blocks)                                  Cb       Cd
Average CPU time per task core residence (i.e., per swap)       tb       td
Number of open/active jobs (Nb is an installation
  parameter and constant under heavy load while nd
  describes demand load with the installation fixing
  an upper bound)                                                Nb       nd
Average total CPU time per job                                   Tb       Td

is measured in terms of CPU time and channel time charged to the task (concurrent usage is only charged once).
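A small numerical check of the core-quantum relation, using the form of (1) reconstructed above (that form is an inference from the surrounding definitions, not a quotation of the paper's own equation):

PSI_S = 512                       # installation value, block.ms

def core_quantum(p_c: int) -> int:
    # reconstructed (assumed) form of (1): core quantum from core priority level
    return PSI_S * 2 ** (p_c - 1)

def quantum_time(p_c: int, c_blocks: int) -> float:
    # core quantum time Q for a task occupying c_blocks blocks of core
    return core_quantum(p_c) / c_blocks

print(core_quantum(2), core_quantum(6))          # -> 1024 16384, close to the 1000 and
                                                 #    16000 block.ms figures used later
print(quantum_time(6, 50), quantum_time(2, 25))  # -> roughly 328 ms and 41 ms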
Demand impact control philosophy

EXEC 8 attempts to limit the impact of demand
service on system performance by dynamically adjusting system behavior to maintain a statistic referred
to here as the demand service ratio, DSR, at an installation specified limit function of the number of
active demand users, nd. The value of this statistic is
computed at six second intervals and averaged over the
last several minutes of operation. If necessary, EXEC 8
temporarily raises the core priority of the next batch
task to insure loading and increases its core quantum,
Ψb, to correct the measured DSR to the desired DSR
during the next six second interval. The DSR is defined
as:
DSR = (Core quanta charged to demand + core-time product of total swap activity) / (Core quanta charged to both batch and demand + core-time product of total swap activity)    (2)
To further quantify this relation, it is necessary to
develop more precise terminology.
EXEC 8 core scheduling algorithm

In terms of core quanta, the EXEC 8 core scheduling
algorithm exhibits the following behavior:
(1) When core conditions change, the highest core
priority task ready for load-in is checked for
ability to fit in the available space. If possible,
space is reserved and a swap-in is initiated. If
not, the task type is checked; if a batch task, the
next lower priority ready task is checked, etc.;
if a demand task, the core scheduler is disengaged until the next core status change. In
order to prevent excessive delays, wait times are
accumulated on each of the waiting tasks and
after too long a wait, core entry is forced by
temporarily raising the core priority.
(2) Once in core, a task is guaranteed of remaining
for its full core quantum or until it voluntarily
relinquishes core control by entering a terminal
I/O or long wait state or unless exceptional
system conditions such as I/O buffer full state
occur.
(3) Upon completing its core quantum, a task is
considered swappable by a higher core priority

41

demand task. In the absence of demand task
pressure, the system does relatively little swapping. The main cause of swapping in a pure
batch environment is dynamic facilities conflicts caused generally by two core resident jobs
employing the same system processor (e.g.,
FORTRAN) resulting in a long wait state which
may lead to swapping.
From a uniform load, steady-state modeling viewpoint, the problem is to abstract the important aspects
of the fairly complicated core scheduling algorithm and
ignore fine structure details such as the controls to deal
with excessive waits by tasks with large core requirements.
CPU time scheduling algorithm

Once a task has received core, it is subject to a complex multilevel exponential CPU scheduling algorithm.
Fortunately, from a steady-state modeling viewpoint
for a limited core configuration such as that of NTH,
CPU scheduling has smaller impact on system performance and therefore need not be given as much
attention here as core scheduling. For sake of completeness though, the essence of the algorithm is as follows:
(1) A queue of queues is maintained for both batch
and demand type tasks. Each successive queue
is a higher level and has associated with it a lower
CPU priority and a larger CPU time quantum.
(2) Both batch and demand tasks start their core
quanta at level 2 CPU priority (their interrupt
activities are processed at level 1 and they are
forced if necessary at level 0). If a task at level 2
relinquishes control of the CPU prior to completion of its CPU time quantum, it is queued
for service at level 2 upon completion of whatever I/O action caused it to relinquish control.
So long as the task remains at level 2, it receives
a new level 2 CPU time quantum each time it
receives CPU service. If, however, a task does
not relinquish control and runs to the end of its
CPU time quantum, it is queued for service at
the end of the queue at the next level with a
CPU time quantum that is twice as large as it
had previously.
(3) So long as the task remains compute bound
(i.e., it does not voluntarily release the CPU),
it moves up the priority levels each move
resulting in a doubling of CPU time quantum
until it reaches level 7. As soon, however, as it
voluntarily relinquishes control it is requeued for

service at the end of the level 2 queue with the
original level 2 CPU time quantum.
All this of course is subject to the constraints of the
core quanta scheduling scheme described previously.
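A compact sketch of this exponential scheduling behavior follows; the level-2 quantum value is an assumption chosen only for illustration:

LEVEL2_QUANTUM = 20.0     # ms, assumed for illustration
TOP_LEVEL = 7

class Task:
    def __init__(self, name):
        self.name = name
        self.level = 2

    def quantum(self) -> float:
        # quantum doubles at each successive level above level 2
        return LEVEL2_QUANTUM * 2 ** (self.level - 2)

    def ran_full_quantum(self):
        """Compute-bound behavior: requeue one level up with a doubled quantum."""
        self.level = min(self.level + 1, TOP_LEVEL)

    def released_voluntarily(self):
        """I/O or terminal wait: back to the end of the level-2 queue."""
        self.level = 2

t = Task("batch task")
for _ in range(3):
    t.ran_full_quantum()
print(t.level, t.quantum())     # -> 5 160.0  (quantum doubled at each level)
t.released_voluntarily()
print(t.level, t.quantum())     # -> 2 20.0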

Demand cycle

The definition of the demand cycle is needed to quantify the relationships just described. The demand cycle is artificial; i.e., it has no physical counterpart in the system. However, it provides a way of introducing a cyclic time frame into an otherwise steady-state model. If there are nd average active demand tasks competing for system resources, all other things equal, they will be serviced in cyclic order. The sequential execution of these nd tasks in combination with sufficient batch tasks to maintain the DSR is termed a demand cycle. The number of batch tasks that are swapped into core and executed during a demand cycle is denoted as nbd.

Demand batch ratio

It will be convenient to parameterize the relationship between nbd and nd by the concept of a demand batch ratio denoted by DBR. The DBR like nbd is an artificial quantity; these quantities do not appear physically in EXEC 8, but appear implicitly as a direct function of DSR. The DBR is the average ratio of CPU time allocated to demand tasks to that allocated to batch tasks. In view of the definitions, the DBR can be computed over a demand cycle as:

    DBR = (nd td)/(nbd tb)    (3)

In terms of defined quantities, DSR defined in (2) can be quantified as:

    DSR = (nd Ψd + 2(nd Cd Sd + nbd Cb Sb)) / (nd Ψd + nbd Ψb SF + 2(nd Cd Sd + nbd Cb Sb))    (4)

where:

SF is an installation-set scale factor. (The value employed by Univac in the NTH EXEC 8 is %.)

Sb and Sd denote the times to swap the average batch and demand tasks into (or out of) core and are respectively:

    Sb = TA + Cb Tp
    Sd = TA + Cd Tp    (5)

where:

TA is the average access time to locate a swap file and/or system processor on the swapping drum (for the FH 432, TA = 4.3 ms).

Tp is the flow time per 512 word block of information from (or to) the swapping drum (for the FH 432, Tp = 2.13 ms).

Substituting relations from (1) and (3) into (4) and solving for DBR yields:

    DBR = (td/tb) [(DSR/(1 - DSR)) Ψb SF - 2 Cb Sb] / (Ψd + 2 Cd Sd)    (6)

which corroborates the previous assertion that DBR is a direct function of DSR. For simplicity, DBR is employed in the following formulation to represent the effect of the system control actions. These actions are mechanized in EXEC 8 in terms of DSR, but (6) establishes that the effect can be described in terms of DBR.
BASIC MODEL
The underlying assumptions are justified and a basic
model is developed.
Simplifying arguments
In Reference 5, Hellerman and Smith present a
simple, but elegant, model for throughput analysis of
record processing EDP applications for various physical
and logical overlap configurations. Their simplifying
assumptions exclude a number of important EDP
applications, but interestingly enough cover the case
of full swap, buffered time sharing systems. Now
EXEC 8 introduces additional complexity through its
batch multiprogramming features, but Reference 5
provides interesting insights that serve as the basis of
the current model.
In particular, computing the batch and demand swap
times from (5) for the NTH average core requirements
of: Cb=50 blocks, Cd=25 blocks; yields: Sb=111ms,
Sd=58ms. The NTH observed compute time per swap
for batch and demand of: tb= 160ms, td=20ms, plus the
observation that swapped in tasks must also be swapped
out, suggests that under heavy demand load the single
swapping channel on an 1108 configuration such as
that of NTH will be very busy. With the exception of
some cycle stealing conflicts, the tasks' compute
activity can be overlapped completely by swap activity
and with sufficient core buffer space available, the

swapping channel might saturate under heavy demand
load. Under swapping saturation, the difference
between compute times and swap times would appear to
provide more than half of the total possible CPU time
per demand cycle to handle EXEC 8 functions.
These simplifying arguments lead to the following
assumptions for the basic model:

Basic assumptions

(1) The swapping channel is saturated.
(2) All compute time on both batch and demand is
completely overlapped by swap time.
(3) All system overhead is also completely overlapped by the swap time.
(4) There is sufficient core available for buffering so
that the above assumptions are reasonable.

Performance calculations

Assumptions 1, 2, and 3 above allow the total elapsed time for the demand cycle, Tdc, to be quantified as the sum of the total demand and batch swap time:

    Tdc = 2(nd Sd + nbd Sb)    (7)

Tdc provides an upper bound to Rd, the system response time to demand users, assuming cyclic service to the nd users; i.e., Rd < Tdc.

Tdc can also be employed to compute the elapsed processing time for a batch job, ETb, which serves as a lower bound on the expected turnaround time, Rb, for the average batch job. Rb also includes the time spent waiting in the system backlog queue and, depending on definition, may also include waiting time in a system input queue prior to the backlog queue, a printer queue, and one or more output handling queues prior to delivery back to the user. If there are Nb open batch jobs receiving equal service, the amount of CPU time allocated to a single batch job during a demand cycle is (nbd/Nb) tb. If the requisite CPU time for batch job execution is Tb, the total elapsed time for an average batch job is:

    ETb = (Nb Tb/(nbd tb)) Tdc    (8)

The total CPU time used during the demand cycle is:

    CPUdc = nd td + nbd tb    (9)

The average CPU utilization can be computed over the demand cycle as:

    CPU percent = (CPUdc/Tdc)·100 = 100 td (1 + 1/DBR) / [2(Sd + (td/(tb DBR)) Sb)]    (10)

Interestingly, (7) and (8) predict that after swapping saturation, demand response time and batch turnaround time will increase linearly with increasing number of demand users nd. Also, (10) indicates that with swapping saturation in a system controlled to a fixed demand service ratio, the CPU utilization is independent of nd.

VERIFICATION

The basic model is adjusted to the NTH load and system characteristics and an attempt is made to verify the model against observed system behavior.

System conditions

The load and system parameters presented in Table II have been directly observed in the NTH environment as a part of a detailed measurement study.

TABLE II-NTH Load and System Parameters

Cb = 50 blocks            Cd = 25 blocks
Nb = 5                    nd = 6
Tb = 25 sec               td = 20 ms
Ψb = 16000 block·ms       Ψd = 1000 block·ms
DSR = .35

The average batch load characteristics of Table II also agree with those observed by the University of Wisconsin in a recently reported measurement study.8

From the values of Table II and (1) and (5), it is possible to compute the intermediate model parameters of Table III:

TABLE III-Intermediate Model Parameters

Sb = 111 ms               Sd = 58 ms
Qb = 320 ms               Qd = 40 ms


TABLE IV-DBR Versus DSR

DBR    .01    .05    .15
DSR    .30    .35    .40

In order to verify the basic model, it only remains to
obtain a realistic value for tb, the CPU time per swap-in
(or equivalently per core quantum) for a batch task.
The td of 20ms reported in Table II is measured directly.
There are two pieces of experimental data that permit
estimation of tb:
(1) The observed ratio of Qd/td is 2, and
(2) Wisconsin reports in Reference 8, that the
average total channel time per batch job is
twice the average CPU time. No overlap of
individual channel time or CPU time would, as
described in the LOAD AND SYSTEM CHARACTERISTICS section, yield a Qb/tb of 3 and
complete overlap of two channels and the CPU
will yield a Qb/tb of 1.
For a university environment with its heavy use of
system processors that make good use of overlap
possibilities it is not unreasonable to expect that the
Qb/tb ratio will be the same as the experimentally
observed Qd/td ratio of 2.
On the basis of this argument, a tb of 160ms is employed in the sequel.
Performance predictions

Solution of (6) for the given parameter values yields
a DBR of .05 for a DSR of .35. In experimental studies
at NTH, DBR values lower than .01 have been observed
with an nd of 6, but, as is later developed, appreciably
larger DBRs are necessary to support the swapping
saturation assumption. Thus in the sequel, calculations
are performed for DBR values given in Table IV with
corresponding DSR values.
Solution of Equations (7), (8), and (10) as a function
of DBR for the parameter values of Tables II and III
yields Table V:
TABLE V-Performance of Basic Model (nd = 10)

DBR    Tdc (sec)    ETb (sec)    CPU Percent
.01       29           181            70
.05        7           210            63
.15        3           282            51
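The basic-model arithmetic can be checked directly. The sketch below evaluates (7) through (10), with (8), (9), and (10) in the reconstructed forms given above (those forms are inferred from the surrounding text and tables), for the Table II and Table III parameters:

C_b, C_d = 50, 25            # blocks
t_b, t_d = 0.160, 0.020      # sec of CPU per swap-in
N_b, T_b = 5, 25.0           # open batch jobs, sec of CPU per batch job
T_A, T_P = 0.0043, 0.00213   # FH 432 access and per-block flow time, sec

S_b = T_A + C_b * T_P        # about 0.111 sec
S_d = T_A + C_d * T_P        # about 0.058 sec

def basic_model(dbr: float, n_d: int):
    n_bd = n_d * t_d / (t_b * dbr)                    # from DBR = nd td / (nbd tb)
    t_dc = 2 * (n_d * S_d + n_bd * S_b)               # (7)
    et_b = (N_b * T_b / (n_bd * t_b)) * t_dc          # (8), reconstructed form
    cpu_pct = 100 * (n_d * t_d + n_bd * t_b) / t_dc   # (9) and (10), reconstructed
    return t_dc, et_b, cpu_pct

for dbr in (0.01, 0.05, 0.15):
    t_dc, et_b, cpu = basic_model(dbr, n_d=10)
    print(f"DBR={dbr:.2f}: Tdc={t_dc:5.1f}s  ETb={et_b:5.0f}s  CPU={cpu:4.1f}%")
# Prints roughly 29 s / 181 s / 70%, 7 s / 210 s / 63%, and 3 s / 282 s / 51%,
# in line with Table V.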

Measurements at NTH for the average load characteristics of Table II and with a DBR of less than .01
at an nd of 6 indicated an average CPU utilization of
50 percent, an average batch turnaround of 5 min.,
and an average demand response time of 38 seconds.
The basic model predictions of Table V indicate an
appreciably lower Tdc which causes a lower ETb and a
higher CPU percent than that measured. Moreover
other measurements indicated that the swapping
channel was approximately 60 percent busy rather than
the 100 percent assumed by the model under similar
loading conditions. This lack of agreement prompts
investigation of the validity of the assumptions supporting this basic model.
AUGMENTED MODEL
The swapping saturation assumption itself is subject
to question, but analysis of this is presented in the next
section. This section explores the question of available
core, modifies the basic model, and predicts system
performance from the resulting augmented model.
Required core

One approach to understanding the effect of available
core on model performance is to analyze the amount of
core necessary to maintain swapping saturation. This is
done by first considering the integrated core-time
demand over a demand cycle, CTDdc. If EQd and EQb
are respectively the effective elapsed times demand and
batch tasks remain in core when swapped in for their
core quanta, then average demand and batch jobs tie
up Cd and Cb blocks of core for 2Sd + EQd and 2Sb + EQb ms respectively. Thus in one demand cycle the integrated core-time demand is:

    CTDdc = nd Cd(2Sd + EQd) + nbd Cb(2Sb + EQb)    (11)

block-ms, and the minimum average core requirement to maintain a demand cycle of Tdc is:

    Cmin = CTDdc/Tdc = [Cd(2Sd + EQd) + (td/(tb DBR)) Cb(2Sb + EQb)] / [2(Sd + (td/(tb DBR)) Sb)]    (12)

The problem in analyzing (12) lies in the determination of EQd and EQb. As explained previously,

TABLE VI-Cmin for Varying DBR and EQ/Q (Units of 512 Word Blocks)

DBR \ (EQd/Qd, EQb/Qb)    (1, 1)    (3/2, 2)    (2, 3)
.01                          119       188         257
.05                          107       167         227
.15                           87       134         180

Qd and Qb are well defined quantities for average demand
and batch jobs. Unfortunately, it is non-trivial to relate
the elapsed core time, EQ, to the core quantum time,
Q, which is counted only when CPU and/or channel
activity is occurring for that task. The following arguments are pertinent:
(1) For all batch task swap-ins, but the swap-in
where the task terminates, EQb>Qb. The elapsed
residence time of the last swap-in may be less
than Qb because the task will voluntarily give up
core control upon terminating. Inasmuch as
Tb/tb, the number of swap-ins for the average
batch job, is greater than 100 and this job will
involve approximately three sequential tasks
(compilation, collection, and execution), these
end effects will have little effect on average
behavior. Hence EQb>Qb.
(2) For the average demand task swap-in, there are
end effects more frequently because the interactive tasks will voluntarily give up core control
upon entering terminal wait state. In the EXEC
8 environment however, the average user
employs batch oriented processors (compilers
and the collector) for a large proportion of the
swap-ins. Also because the demand CPU time
quantum is more closely related to the demand
core quantum time than for the case of batch,
the relation in (13) is almost certainly true on
the average:
(13)
(3) In order to estimate the magnitude of the ratios
in (13), it should be noted that each time the
task is eligible to receive CPU or channel
attention it will on the average have to wait
1/2 (in the case of one competitor) or more of the
length of time it needs to employ the resource.
For these reasons EQd/Qd is probably greater than 3/2 and EQb/Qb, because of the increased
employment of channel resources by batch tasks,
is almost assuredly greater than that.

Since these ratios appear to be very much a function
of what is happening in the model, they are carried as
parameters in the augmented model development and
verification.
Equation (12) is solved to yield the values in Table
VI as a function of DBR and the EQ/Q ratios.
For certain parameter combinations, the Cmin required to sustain saturated swapping is greater than
CA, the actual number of core blocks available for user
tasks at a particular installation. In practice, Cmin must
be in multiples of Cd (or Cb) blocks. Thus the numbers in Table VI must be increased at least to CA or the next higher multiple of Cd (whichever is larger). This
modified Cmin is denoted as C'min.
There is one other usage of core that must be taken
into account. For each possible active demand user,
space must be reserved for line buffers, switch list
information, etc., in the executive buffer pool area
(EXPOOL). Each possible user requires the reservation
of approximately 0.8 of a 512 word block of core store.
Thus the CA employed to represent the number of
available user blocks is really a function of nd:

    CA = CA0 - 0.8 nd    (14)
(In the NTH configuration, there are a total of 256
blocks of which CA0 = 130 and typically the system has been generated for a maximum nd of 6 yielding an
effective CA of 125 blocks for user tasks.) In the following model performance predictions, the NTH CA0 of
130 is used.
As with Cmin, when postulating an average model,
CA only has meaning as a multiple of Cd (or Cb) blocks.
Thus all use of CA must be in terms of C'A where C'A is the CA defined in (14) reduced to the next lower multiple of Cd.

Model development
The minimum effect of having too little core to buffer
the tasks under swapping saturation will be to increase
T ae by an amount proportional to the ratio between
C'min and C' A. Such an increase in Tae will cause a
reduction in CPU percent by the inverse ratio.
Thus for the case of swapping saturation limited by
available core, the basic model Equations (7), (8), and
(10) are augmented as follows:
T'ae= (C'min/C'A)Tdc

(15)

ET'b= (C'min/C'A)ETb

(16)

CPU' percent = (C' A/ C' min) CPU percent

(17)

46

Fall Joint Computer Conference, 1971

TABLE VIII-T'de (for Two EQ/Q Couplets)*
Varying DBR and nd

/'

(Units of Seconds)
DBR\nd

6

20

30

.01
.05
.15

(28, 38)
(6, 8)
(2, 3)

(116, 159)
(23, 34)
( 9, 12)

(173,239)
(35, 50)
(14, 18)

present plots of T'de and CPU' percent respectively for
an EQIQ couplet of L%,2} and a DBR of .10 as a
function of nd.
20

10

30

40

Model verification

Figure I-Response Time Upper Bound (T'de)
vs.
Number of Demand Users (nd)
DBR=.lO, EQd/Qd=3/2, EQb/Qb=2

Model performance predictions

The ratio of C' AI C' min can also be employed to
predict the percentage of time that the swapping
channel will be busy for the core limited case.
Solutions of (15) and (17) for varying DBR, nd,
and EQIQ yields the results in Tables VIII and IX.
The nd values of 6, 20, and 30 are selected because NTH
currently has an nd of 6, plans to expand to an nd of 20
before acquiring more core, and is interested in the
impact of an even greater number of terminals. DBR
values in the range .01 to .15 are employed to investigate the effect of a DSR in the range .30 to .40
as in Table IV.
For ease of sensitivity analysis, Figures 1 and 2

TABLE

VII-C'A/C'~in

At first analysis the performance predictions in
Tables VIII and IX and Figures 1 and 2 appear to
agree reasonably well with NTH experience. For
example on December 16, 1970, in approximately 10
hours of EXEC 8 operation there were 725 batch jobs
involving 3325 batch tasks; six demand terminals were
available and a total of 58 demand user sessions were
conducted involving 1437 demand service requests. For
the average job characteristics reported in Table II,
the average CPU utilization was 50 percent, average

(for Two EQ/Q Couplets)*

Swapping Channel Busy Percentage Varying DBR and nil
DBR\nd

.01
.05
.15

(63,46)
(72, 50)
(84, 63)

6.t

t~O,

computer
system

tdeparture
-ill

L------ _____________

I

"
1

I
J

Figure 1-The queueing model with feedback

Feedback Queueing Model

Takacs' model is more general than ours. However,
immediate returns are required when jobs join the
queue again, while some delay is involved in our model.
The model is well described by introducing a virtual
thinking system in which infinitely many servers are
furnished. Let the whole system consist of the computer
system and the thinking system, as is shown in Figure 1.
Jobs that arrive at the whole system enter the computer
system and join the queue. The jobs are served by a
single processor in order of arrival. After being served,
each job either immediately enters the thinking system
with probability I' or departs from the whole system
with probability I-I'. Since infinitely many servers are
furnished in the thinking system, there is no queue in it.
Therefore, the service in the thinking system is immediately commenced when a job enters the thinking
system. The distribution function of service times in the
thinking system is given by G (t). After being served,
each job immediately enters the computer system and
joins the queue. The cycles from computer system to
thinking system are repeated until the job departs from
the whole system. Then, the probability with which a
job has n returns is given by

rn = 'Y n ( 1 - I' ) ,

Multiplying (2), (3), (4), and (5) by 1, yi, Xi, and
xiyi respectively and summing them, it is derived that

v(y-x)Py(x, y) +[X(I-x)
+JL{ 1- (1-1') /x-'YY/x} JP(x, y)
= JL {l- (1- 'Y) / x - 'YY/ x} P (0, y) ,

(6)

where we put

Py(x, y) =dP(x, y)/dy.
Let L2 be the mean number of jobs in the thinking
system. Then,

L2 is easily found in the following way. Putting X= 1 in
(6), it is obtained that

v(y-I)Py(I, y) +JL'Y(I-y)P(I, y)
=JL'Y(I-y)P(O,y).

(8)

Then, by differentiating both sides of (8) with respect
to y and then putting y= 1, it is obtained that

vPy(l, 1) =JL'Y{P(I, 1) -P(O, I)}.

(1)

n=O, 1,2, ....

59

(9)

Substituting (7) and P(I, 1) = 1 in (9), it is obtained
that

ANALYSIS

(10)

The mean number of jobs in the system

Here,P(O, 1) isfoundasfollows.Puty=xin (6). Then,

Assume that the whole system is in statistical
equilibrium. Let ~ be the random variable representing
the number of jobs in the computer system, and let 1]
be the random variable representing the number of jobs
in the thinking system. Denote by P (i, j) the probability with which ~=i and 1]=j. Then, it is obtained that
(2)

(X+jv) P (0, j) = JL (1-'Y)p (1, j) +JL'YP(I, j-I),
j>O

i>O,

= (I-x)JL(I-'Y)P(O, x).

(4)

(12)
Hence, by substituting (12) in (10), it is obtained that
(13)

Next, let Ll be the mean number of jobs in the
computer system. Then,

(X+JL+jv )p(i, j) = Xp (i-I, j) +JL(1-'Y )p(i+I, j)

(14)

+ JL'YP (i + 1, j - 1) + ( j + 1) vp (i-I, j + 1) ,
i>O,}>O.
The probability generating function of
defined by

P(x, y) =E[x~y'lJ=

00

00

i=O

i=O

~

and

(11)

By differentiating both sides of (11) with respect to x
and then putting X= 1, it is obtained that

(3)

(X+JL)p(i, 0) =Xp(i-I, O)+JL(I-'Y)p(i+I, 0)
+vp(i-I,I),

(I-x) {JL(1-'Y) - Xx}P(x, x)

where we put
(5)
1]

is

L: L: p(i,j)Xiyi,

where E represents the mathematical expectation.

Px(I, 1) =Px(x, y) !x=l,y=l =dP(x, y) /dx !X=I,y=l.
Since

dP(x, x)/dx=Px(x, x) +Py(x, x),
it is derived that
(15)

60

Fall Joint Computer Conference, 1971

Differentiating both sides of (11) two times with
respect to x and then putting X= 1, it is obtained that

probability generating function of the number of jobs in
the system MjlVI/1 is given by (1- p) I (1- px) ,4 we put
P (x, 1) = (1- p) I ( 1- px) .

Here, it is noted that
dP(O, x)ldx=Py(O, x).

Thus, (16) yields
L 1 +L2 = pi (l-p) +Py(O, 1) I (l-p),

(17)

where we put

(20)

Next, denote by MIMloo the infinitely many-server
queueing system with a Poisson arrival and exponential
service times. 5 It is also known that the mean number of
jobs in the system MIMloo is given by t..' I v, where t..'
is the arrival rate and 1/v is the mean service time. Now,
under the assumption of the stochastic independence,
the second term in the right hand side of (17) is
written as

p is the utilization factor of the computer system,

where

because

t..' = t..')'I (1-')') = t.. ('Y +')'2 +')'3 + ... )
4

is the arrival rate and II}! is the mean service time.
The second term in the right hand side of (17) is the
mean number of jobs in the thinking system, under the
condition that there is no job in the computer system.
Then, if the state in the thinking system is stochastically independent of that in the computer system, L2
may be given by
L 2 =Py(1, 1) =Py(O, l)/(1-p).

is the arrival rate of the thinking system. Thus, it is
suspected that the thinking system forms the system
MIMloo from the queueing point of view. Since the
probability generating function of the number of jobs in
the system MIMI 00 is given by exp{ - (t..'lv) (1-x)},4
we put
P(l, y) =e-(3(l-y)
(21)
where

Under the assumption of this stochastic independence,
L1 may be given by
Ll =p/ (l-p).

(18)

In the following section we will prove the stochastic
independence between the states in the computer
system and in the thinking system.

{J=t..')'/(1-'Y)v.

Substituting (20) and (21) in (19), it is obtained that
P(x, y) =P(x, l)P(I, y) =

(1- p) e-(3(l-y)
1-px

(22)

To show that P(x, y) given by (22) is the solution of
(6), put

Method of finding P(x, y)
Al = v(y-x)Py(x, y),

Assume that the states in the computer system and in
the thinking system are stochastically independent.
Then, tl\e probability generating function P(x, y) is
factorized as
P(x, y) =P(x, l)P(l, y).

A 2 = [t..(I-x) +}!{ 1- (1-,),) IX-'Yylx} ]P(x, y),
A 3 =}!{ 1- (1-,),) Ix-')'ylx}P(O, y).

Then, it is sufficient to prove that

(19)

We will show that P(x, y) given by (19) becomes the
solution of (6) if P(x, 1) and P(l, y) are suitably
chosen.
Denote by MIMll the single-server queueing system
with a Poisson arrival and exponential service times. 5
It is known that the mean number of jobs in the system
MIMll is given by pi (1- pL where p is the utilization
factor of the system. This formula coincides with the
first term in the right hand side of (17). Thus, it is
suspected that the computer system forms the system
MIMII from the queueing point of view. Since the

A 1 +A 2 -A 3 =0.

Since Al and A3 are calculated as
Al = v{J(y-x)P(x, y),
A 3 = }! { 1- (1- ')' ) I x - 'YY I x} (1- px) P (x, y) ,

it is derived that
(A 1 +A 2 -A 3 )IP(x, y) =t..,),(y-x)/(I-'Y)+t..(I-x)

+ t.. {x =0.

(1- ')') - ')'y } / ( 1-')' )

Feedback Queueing Model

Thus, (22) becomes the solution of (6) and our conjecture that the states in the computer system and in
the thinking system are stochastically independent is
justified.
In the above analysis, it is shown that the computer
system and the thinking system form the systems
M/M/1 and MIMloo respectively. This fact can be
intuitively explained in the following way. Assume that
the computer system forms the system M/l\1:/1 from
the queueing point of view. It is known that the output
process of the system M/M/1 is a Poisson process. 6
It is also known that the probabilistic selection of jobs
from a Poisson process results in a Poisson process. 7
Hence, the input process of the thinking system becomes
a Poisson process and then the thinking system forms
the system MIMloo from the queueing point of view.
Since the output process of the system MIMloo is a
Poisson process,6 and the aggregation of several independent Poisson processes results in a Poisson
process,7 the input process of the computer system
becomes a Poisson process. Thus, the computer system
forms the system M/M/1 from the queueing point of
view, which is our first assumption. Therefore, no
contradiction is derived, and our assumption is justified.
This intuitive argument will be used later.

thinking
system

"'

\ A¥/(1 -1)

arriva I
--"

A.I«(- 'I)

A

61

.L

'I)

Ar/(1 -

computer
system

de parture
A/(1

-~)

{.

Figure 2-The aspect of flows in the system

condition. Then, it is evident that
n

LI(n)

=

L

L1(n, k),

(25)

L 2(n, k).

(26)

k=O
n-l

L2(n)

=

L

k=O

In statistical equilibrium, the input rate in any system
is equal to the output rate in the same system. Now,
define
UI(n, k) =L1(n, k)IL 1,
U2(n, k) =L2(n, k)IL2.

The mean turnaround time

The turnaround time is defined as the time interval
between the generation of the first request and the
reception of the final service from the computer system.
In other words, the turnaround time is the time interval
during which a job stays in the whole system. The mean
turnaround time is one of the most important characteristics for users.
Denote by T(n), (n= 0, 1,2, ... ), the mean turnaround time for jobs with n returns. Since the probability with which a job has n returns is given by (1),
the arrival rate for jobs with n returns is written as
n= 0, 1, 2, . . ..

=

{L1(n) +L2(n) }jX(n),

n=O, 1,2, . . ..

X'Y n (1-'Y)

(24)

We will find L1(n) and L2(n).
Let L1(n, k) be the mean number of jobs with n
returns in the computer system, under the condition
that those jobs already have k(k~n) returns. And let
L2 (n, k) be that in the thinking system under the same

= {X/(1-'Y)

(27)

}uI(n, n),

{A'Y/(I-'Y) }u2(n, k-l)
n~k~

1.

(28)

Here, it is noted that the input rates in the computer
system and in the thinking system are AI (1- "I) and
Al'l (1- "I) respectively, as is shown in Figure 2. From
(27) and (28), it is derived that
UI(n, n) ='Y n (1-'Y)2,

(23)

Let Ll(n) be the mean number of jobs with n returns
in the computer system, and let L 2 (n) be that in the
thinking system. Then, by applying Little's theorem to
the whole system,8 it is obtained that
T(n)

Then, by equating the input rate of jobs with n returns
in the computer system and the output rate, it is
obtained that

(29)
(30)

'Yu2(n, k-1) =uI(n, k),

Similarly, by equating the input rate of jobs with n
returns in the thinking system and the output rate, it
is obtained that
{XI (1-"1)

}UI (n,

k) = {XI'I (1-"1) }u2(n, k),

n> k ~ O.

(31)
From (31), it is derived that
UI (n,

k) = 'YU2 (n, k) ,

n>k~O.

(32)

Using (29), (30), and (32), uI(n, k) and u2(n, k) are

62

Fall Joint Computer Conference, 1971

system,

5

L2
K= A'Y/(l-'Y) =l/v.

4

(39)

Hence, (37) is written as

3

2

o

(40)

T(n) =R+n(R+K).

R

0·2

0·4

0·8

0·6

1·0

p
Figure 3-The mean response time R

Here, (R+K) is the mean interaction time for the
interactive computer system. Thus, the mean turnaround time for jobs with n returns is the sum of the
mean response time and n times the mean interaction
time. R, K, and T(n) are shown in Figures 3,4, and 5
respectively. In the case that a job has· no return, of
course, the mean turnaround time coincides with the
mean response time.
Finally, the mean turnaround time for any job,
regardless of the number of its returns, is given by

given by

T=
U1 (n,

k) = 'Y n (1- 'Y) 2,

u2(n, k) ='Y n - 1 (1-'Y)2,

n~k~O,

~

n=O

n=l

= R + {'Y / ( 1- 'Y) } (R +K) ,

n>k~O.

(41)

which coincides with (L1 + L 2) /A.

Then, by the definitions of u1(n, k) and u2(n, k), it is
obtained that
(33)
L 1(n, k) ='Y n(1-'Y)2L 1,
L 2(n, k) = 'Y n- I (1-'Y)2L 2.

~

:E rnT(n) =R+ (1-'Y) (R+K) :E n'Y n

(34)

Here, it is noted that LI(n, k) and L 2(n, k) do not
depend on k. By substituting (33), (34) in (25), (26)
respectively, it is obtained that
LI(n) = (n+l)'Y n (1-'Y)2L I,

(35)

L2(n) =n'Y n- 1 (1-'Y)2L 2.

(36)

Extension to the interactive computer system
with multiple processors
I t is easy to extend the previous analysis to the
interactive computer system with multiple processors.
Suppose that there are s processors in the computer
system. Then, the computer system forms the system

5
Then, by using (23), (24), (35), and (36), the mean
turnaround time T(n) for jobs with nreturns is given by

4
(37)
Now, we will consider the meaning of (37). Let R be
the mean response time of the interactive computer
system, and let K be the mean think time. It is shown
that the input rate in the computer system is A/(l-'Y),
and the mean number of jobs in that system is denoted
by L 1 • Then, by applying Little's theorem to the computer system,8 it is obtained that
R=

L1
A/(l-'Y)

K3

2

(38)

Similarly, the input rate in the thinking system is
A'Y/(l-'Y) and the mean number of jobs in that system
is L 2 , then, by applying Little's theorem to the thinking

o

2

3
l/V

Figure 4-The mean think time K

4

5

Feedback Queueing Model

63

number of jobs in the computer system is given by4
ps+l

L 1 = -------8--~------------(8-1)!
(pklkf) {(S-k)2_k}

L:

5

(44)

k=O

Then, the previous results (37)-(41) still hold for the
interactive computer system with s processors.

4

T{n)

DISCUSSION
3

2

: (R i+ K)
: 1
1

-

- - - - - -1- - - - - - - 1

t

R
.J.

o

I
1

2

3

4

5

n
Figure 5-The mean turnaround time T(n)

M/MI8 from the queueing point of view, where M/MI8
means the 8 servers queueing system with a Poisson
arrival and exponential service times. 5 It is known that
the output process of the system M/MI8 is a Poisson
process. 6 Then, the intuitive argument given for the
analysis of the computer system with a single processor
is entirely applied to the analysis of the computer
system with 8 processors. The probability generating
function of the number of jobs in the system MIMI s
is given by4
po

C~ (px)kjk!+

t:

(PX)'jS!s"-') ,

where po is the probability with which there is no job
in the system M/Mls. po is calculated by
Po= 1 /

C~ p'jk!+p'j(s-l) !(S-p)).

Then, if we use

P(X, 1) =Po

C~ (pX)'jk!+

t:

(PX)'jS!S!>-')

(42)

(43)

instead of (20), the probability generating function
P(x, y) is factorized as

P(x, y) =P(x, l)P(l, y)
where pel, y) is given by (21). In this case, the mean

As an analytical model for the interactive computer
system, the paper has proposed a feedback queueing
model in which some delay is required before jobs join
the queue again. From the analysis of the model, the
mean turnaround time is related to the mean response
time and the mean think time in a very simple way.
Although the validity of this simple relation largely
depends on the exponential distribution assumptions
for the service times and the think times, it is considered that the result obtained is a good approximation
of actual behavior. In fact, Equation (40) which gives
the relation can be intuitively justified and can be
empirically recognized.
The exponential distribution assumption for the
service times is frequently adopted in various queueing
models. This is due to the tractability of models as well
as to the reasonability of the assumption. Sometimes
we may be interested in queueing models with nonexponential distribution assumption. These models
usually become hardly tractable in a theoretical way,
when they are slightly complicated. However, it is well
known that the adoption of the exponential distribution
assumption causes results to be on the safe side.
SUMMARY
The paper has proposed a simple mathematical model
of the operation for an interactive computer system with
a single processor, and has presented some characteristics of the system, such as the mean turnaround time,
the mean interaction time, etc. From the queueing
point of view, the proposed model is a kind of singleserver queueing model with feedback. But, unlike the
usual queueing models with feedback, the proposed
model requires some delay when a job returns the
queueing system. This delay represents the user's think
time in the interactive computer system.
The model is well described by introducing a virtual
thinking system in which infinitely many servers are
furnished. Thus, the whole system consists of the
computer system and the thinking system. Here, the
user's thinking is represented by the service ill the
thinking system. The analysis is first made for the

64

Fall Joint Computer Conference, 1971

mean number of jobs in the thinking system and for
that in the computer system, under the assumptions
of a Poisson arrival and exponential service times. Then,
the probability generating function of the number of
jobs is derived by solving a differential equation. It is
shown that the states in the computer system and in
the thinking system are stochastically independent, and
the computer system and the thinking system form the
systems M/M/1, MIMloo respectively.

ACKNOWLEDGMENTS
The author would like to thank Dr. N. Ikeno, Mr. K.
Naemura, and Mr. Y. Yoshida for numerous discussions and suggestions which aided in the preparation
of this paper.

REFERENCES
1 J M MC KINNEY
A survey of analytical time-sharing models
Computing Surveys Vol 1 No 2 1969

2 A L SCHERR
An analysis of time-shared computer systems
Research Monograph No 36 MIT Cambridge
Massach usetts
3 L TAKACS
A single-server queue with feedback
Bell System Technical Journal Vol 42 No 2 1963
4 T L SAATY
Elements of queueing theory
McGraw-Hill New York Toronto London 1961
5 D G KENDALL
Stochastic processes occurring in the theory of queues and
their analysis by the method of the imbedded Markov chain
Annals of Mathematical Statistics Vol 24 No 3 1953
6 P J BURKE
The output of a queuing system
Operations Research Vol 4 No 6 1956
7 R W CONWAY W L MAXWELL L W MILLER
Theory of scheduling Chap 8
Addison-Wesley Reading Massachusetts Palo Alto
London Don Mills Ontario 1967
8 J D C LITTLE
A proof for the queuing formula: L =A W
Operations Research Vol 9 No 3 1961

Alcoa Picturephone Remote Information System (APRIS)
by M. L. COLEMAN, K. W. HINKELMAN, and W. J. KOLECHTA
Aluminum Company of America
Pittsburgh, Pennsylvania

OBJECTIVES AND DESIGN PHILOSOPHY
The objective of the Alcoa Picturephone* Remote
Information System (APRIS) is to give to Alcoa executives the capability of using their Picturephones to
retrieve information from the corporate computer
data base.
The primary design criterion was ease of use. Other
management information systems, in an effort to be as
powerful as possible, sacrificed simplicity and thus made
themselves unsuitable for the personal use of the
executive. Experience with these systems has shown
that it is unreasonable to expect a busy executive to
learn the complex procedures necessary to operate
them. In fact it is undesirable, since the job of an
executive is to make decisions; anything which interferes with this process, no matter how technologically
intriguing, cannot be tolerated.
APRIS's solution to the conflict between ease of use
and power was to provide an information center to
interpret and respond to the executive's requests for
information. Rather than provide just a tool, the goal
was to provide a service: the service of better access to
information.
APRIS does not require, or even allow, the executive
to make retrievals based on complex boolean functions.
Rather, by having him press buttons on his Touch-Tone
phone, it lets him step through pages of display, one at
a time, displaying an index whenever it is necessary to
choose between several alternatives. (The complete
user guide for the system is shown in Figure 1.) The
information center has the responsibility for creating
these display pages in response to the executive's
demands for information. They can use any techniques
available to gather information: existing management
information systems (with their complex and powerful
logic), independent programs to extract and format the
data from the data base used in the daily data processing applications, or manual entry using hardcopy
sources.

User Guide
Alcoa Picturephone Remote Information System (APR IS)
User Guide for _ _ _ _ _ _ _ _ _ _ _ _ __
Your password is

.

Please do not disclose

it to anyone else.

To use the system call #xxx-xxxx on your Picturephone.
Push the Touch- Tone buttons to go from page to page.
Normally, button 1 will display the next page in a series
of displays and button 0 will display the previous page.
To graph numerical data that is being displayed pres s the
three buttons: '~l ~'.
If you have any questions, call #xxx-xxxx.

Figure I-User guide for APRIS

In addition, the information center has a monitor
which displays the pages that the executive is seeing and
provides audio contact so that the executive can make
requests of the information center pertaining to the
current data base and the information center can
manipulate the display if the executive so desires.
HARDWARE
The current hardware configuration necessary to
support Picturephone access at Alcoa is shown in Figure
2. Two lines are presently installed. One, an "intercom"
line, allows access from the information center. The
other line is connected to the general exchange to allow
access from any Picturephonein the calling area.
Each of the lines is connected through a Bell 305 Data
Display Set to a 2701 attached to an IBM 360/65
computer. The 305 data set converts Touch-Tone
signals from the user to digital codes which are interpretable by the computer and also converts digital
codes produced by the computer into video scan lines
which are displayable on the Picturephone. The total
cost for this configuration, including the 2701 is approximately $1600 per month. Each additional Picturephone display station costs $189 a month with exchange
service or $70 a month on the intercom line. A break-

* Picturephone and Touch-Tone are registered trademarks of the
Bell System.
65

66

Fall Joint Computer Conference, 1971

tained on disk storage in an encrypted form and is not
decrypted for display unless all security access requirements are met.
SYSTEM PROGRAMMING

down of these costs is contained in Figure 3. (These
rates are based upon Bell of Pennsylvania tariffs and
will vary in other states.)

In designing the system it was desired that it be
flexible and easy to code. This required the use of a high
level language. However, it was also necessary that the
system occupy as small an amount of core as possible
since it would be resident in the computer the entire
day. This required the use of assembly language coding.
Both objectives were satisfied by writing and debugging
the system in PL/I and. then, when the program logic
was correct, recoding it in BAL using the PL/I coding
as a guide.
Both systems are still in use, the BAL for general
use and the PL/I to develop and check out modifications and expansions to the system. Both make use of
reentrant code and support multiple, simultaneous
Picture phone access.

SECURITY

DATA STRUCTURE

A tight, multi-level security system is integral to
APRIS. To gain access, the proper password must be
entered. Each display page is tagged as being public,
private, or semi-private giving the capability to restrict
the dissemination of confidential data. Data is main-

Each display page is stored on the disk as a 534 byte
record consisting of a 50 byte header followed by the
484 character display page, 22 lines of 22 characters
each.
The format of the header is shown in Figure 4.

Exchange Line

Figure 2-Hardware to support Picturephone access

GRAPHICS

Basic System

QTY

ITEM
2701 with 2 Type III adapters

MONTHLY

COSTS~'

A limited graphic capability has been provided. By
pressing a three button code, an executive can have

$ 550.

305 Data Display Sets

550.

Picturephone Intercom Circuit

35.
Bytes

2

Picture phones

210.

Business lines with Picturephone service

238.

Key service control unit
Total

12.50
$1595.50

PGNUM

is the number of the page.

PRGST

is a code which tells the system how to graph the data
appearing on that page.

CHGFLG

is used a

RDAUTH

is the read authorization field. The first four bits
indicate whether the page is public, private, serniprivate. Or inforrrlation center. The remainder is
a code specifying the access number of the owner
if the page is private, Or a pOinter to a list of access
numbers if the page is semi-private.

WTAUTH

is the write authorization field.

Additional Terminals

Picture phone Display Set
Business line with Picturephone service
Total

$

5

a flag during alteration of the page.

70.

119.

$ 189.

':'Costs are based on Bell of Pennsylvania tariffs and will vary from

BUTTON 0 - BUTTON 9

are 10 fields each of which contain the page number
of the page which will be displayed after the cOrresponding Touch-Tone button is pressed. A 1 in a
field means that the user's initial page (as defined
in a table) is displayed when that button is pressed.
A zerO in a field means that that button is undefined.

state to state.

Figure 3-Monthly costs

Figure 4-Header format

APRIS

numerical data displayed as a bar graph. Both positive
and negative values can be graphed. The system labels
the x-axis but there is no room on the screen to indicate
values for the y-axis. A push of a button, however,
returns the corresponding numerical display. While
austere, the graphics serve to effectively highlight
trends and thus significantly improve the usefulness of
the system.
PICTUREPHONE vs. CRTs
The display capabilities of the Picture phone are 22
lines of 22 characters with the first and last lines nonuseable. Many system analysts feel this is too small to
display useful information and thus would prefer to
design systems which use CRTs with their larger
screens. The problems of 20 X 22 character display are
those of scale. The limited display size restricts the
analyst in his design of system output formats. On the
Picture phone it may take a bit more effort to produce
useful output and may possibly require the division of
related information onto several display pages but the
data can be displayed and the executive can read and
use it quickly. We feel that the problem of limited
display size is more than offset by the fact that the
Picturephone may be used both as a face-to-face
communication device and as a remote terminal.
Thus, its cost is essentially shared over both capabilities. In addition, the executive's desk is not cluttered
with an additional screen and keyboard.

marized the data by year, quarter, and month and
formatted it into approximately 4000 pages suitable for
display. Another program created a series of index
pages which allow any desired item to be located. An
example of a typical inquiry, the yearly production of
aluminum vans and the net change by year, with their
associated graphs, is shown in the Appendix, Figure 5
through Figure 17.
As an example of the second type of data a daily
report called the Forward Load Report was chosen.
This consists of order information of various aluminum
products produced by Alcoa plants. A program transforms the report into a displayable format and builds
the indices necessary to access it. No example of this
report is given here due to the confidential nature of
the data.
CONCLUSIONS
The Picturephone has proven to be an effective and an
efficient means of allowing executives to directly access
a computer data base. However, Picturephone access,
as an isolated capability, is of little use to a busy
executive. Only when it is made one arm of an efficient
information center does it serve to provide the executive
with useful information for his decision making process.
APPENDIX
Example of a typical inquiry

DATA BASE
For the initial presentation of APRIS to the top
executives of Alcoa a sample of the type of information
that could be efficiently and effectively displayed on the
Picturephone was needed. The data had to be real and
useful to the executives. It 'was felt that it would be a
mistake to provide data that was either "dummied up"
for the presentation, was of no use in the decision
making process, or could be better presented by having
it typed on a piece of paper.
There are two types of data which meet these criteria.
The first is massive historical data which in hardcopy
form is too bulky to allow convenient access. The second
is data which changes more rapidly than can be routinely handled with current reporting methods.
For an example of the first, an existing consumer
research data file was used. This file consisted of 40,000
data entries recording monthly shipment and production figures for aluminum and various consumer
products ranging from vacuum cleaners to automobiles.
The file was passed against a program which sum-

67

Figure 5-Welcome message

68

Fall Joint Computer Conference, 1971

Figure 6-Mter entering password

Figure 8-After pressing button 2

Figure 7-After pressing button 1

Figure 9-Mter pressing button 6

APRIS

Figure lO-After pressing button 4

Figure ll-After pressing button 1

Figure l2-Mter pressing button 6

Figure l3-After pressing button 4

69

70

Fall Joint Computer Conference, 1971

Figure 14-After pressing the code for graph: *1*

Figure 16-Mter pressing button 9

Figure 15-Mter pressing button 0

Figure 17-Mter pressing the code for graph: *1*

Computer support for an experimental
PICTUREPHONE®/computer system at
Bell Telephone Laboratories, Incorporated
by ERNESTO J. RODRIGUEZ
Bell Telephone Laboratories, Incorporated
Holmdel, New Jersey

INTRODUCTION

TONE signals into ASCII characters for the computer;
it stores ASCII information from the computer and
translates the information to an appropriate video signal for refreshing the display on the PICTUREPHONE
screen. Thus, the computer is relieved of the task of
repetitively transmitting the message to be viewed.

This paper describes the computer support of an experimental PICTUREPHONEjComputer system implemented at Bell Laboratories. The system provides
Bell Laboratories and AT&T executives with the capability of using their PICTUREPHONE station sets to
display information retrieved from a computer. Its
primary purpose is to demonstrate the technical feasibility of accessing a computer from standard PICTUREPHONE stations and to help in the evaluation
of the service.
Methods of operation in PICTUREPHONEjComputer systems can vary; they depend on the system
objectives, types of users, and information to be retrieved. The information provided in this paper may
serve as a general guide for those faced with the task of
providing software andj or hardware to support PICTUREPHONEjComputer systems. The hardware and
software used in the Bell Laboratories system are functions of the type of operation chosen and the particular
computer facilities which were available. However, some
of the concepts employed and techniques of overcoming
implementation problems should be applicable to any
PICTUREPHONEjComputer system and to a smaller
degree, to the implementation of other systems which
include terminals not supported by readily available
hardware and software.
In the Bell Laboratories system, users gain access to
the computer by dialing a PICTUREPHONE number
associated with the computer. Thereafter, the user
communicates with the computer using his station's
TOUCH-TONE® dial. The computer's responses are
displayed on the PICTUREPHONE station screen. A
Display Data Set is used to interface the PICTUREPHONE network and station with the computer (see
Figure 1). The Display Data Set translates TOUCH-

OPTIONAL VOICEBAND
PRIV ATE LINE
COMPUTER

STANDARD
PICTUREPHONE
LOOP
PICTUREPHONE
SWITCH

PICTUREPHONE
STATIlN

Figure I-Computer access for PICTUREPHONE Service

Further information on the Display Data Set can be
found in its Technical Reference* and in the February,
1971 issue of the Bell System Technical Journal. This
paper is concerned with the software implementation
and general operational characteristics of the Bell
Laboratories experimental PICTUREPHONEjComputer system.

* Technical Reference Information for the Display Data Set
Used to Provide Computer Access Service for PICTUREPHONE
Stations-available from Engineering Director, Data Communications, American Telephone and Telegraph Company.
71

72

Fall Joint Computer Conference, 1971

SYSTEM OBJECTIVES AND PROGRAMMING
CONSTRAINTS

INPUT/OUTPUT HARDWARE AND
SOFTWARE SUPPORT

Man/ computer dialogue in the Bell Laboratories experimental PICTUREPHONE/Computer system is
constrained by certain physical considerations and
human factors. The following physical considerations
imposed general constraints on the system and consequently on the software implementation:

General

• The system is designed to be accessible from a
standard PICTUREPHONE station set without
requiring auxiliary input devices, such as keyboards, light pens, etc.; that is, only the TOUCHTONE dial is required for input.
• Display size is limited to 440 characters, 22 characters per line, 20 lines per display.
Human factor considerations are particularly important when implementing this type of system. Basic
assumptions in the design of the experimental system,
which also imposed some constraints on the software
implementation were:
• Users of the system would not, in general, have
any knowledge of computer programming nor
would they be willing to use a reference manual of
interaction codes.
• User inputs should be as short as possible, without
any intercharacter input time constraint and without the need for an "end of message" character.
• Each display should contain sufficient instructions
to enable the user to select the next display, one
of a few possible new displays, or return to a
familiar point in the program.
In the experimental system, displays always contain
enough instructions so that even the most inexperienced user may proceed from display to display with
confidence. However, the more experienced users can
use the system more efficiently, since more options are
allowed at a particular point in the interaction than are
enumerated on the screen. Once a user becomes familiar
with the codes for selecting the system's various abilities, he often can choose them directly rather than
being led step-by-step through a sequence of decisions.
If a user makes a selection which is not allowed at a
particular point in the interaction, he is presented an
appropriate error display. This display tells the user the
particular error he has made and gives him the option
of returning to the point at which he made the error,
reviewing a particular set of instructions, or using
another code.

The computer used in the Bell Laboratories experimental PICTUREPHONE/Computer system is an
IBM 360/50 computer operated with the System/360
Operating System and under a multiprogrammed environment with a variable number of tasks (MVT) . The
computer is connected to the experimental PICTUREPHONE network via two Display Data Sets (DDSs).
The DDSs are connected to the computer through an
IBJ\1 2701 Telecommunications Control Unit (hereafter referred to as the control unit) with two Terminal
Adapters Type III. Since, in this system, the DDSs are
remotely located from the computer, voiceband facilities equipped with 202D-series data sets are required
on the transmission facility between the DDSs and the
computer. Transmission is half-duplex at 1200 bauds.
Because the operational characteristics of the DDS
are different than those of existing terminals and since
the Bell Laboratories system represented the first use
of PICTUREPHONE stations for computer access, it
was not expected that standard computer hardware and
software would support the system operation. Minor
modifications were required in the input/output hardware and software to support the experimental system.
The terminal Adapter Type III (hereafter referred
to as the adapter) normally permits the attachment to
the computer of remotely located IBM 2260/2848 display complexes.
Operation with these devices involves polling and
framing of messages, both of which require recognition
of control characters by the adapter. However, the
experimental PICTUREPHONE/Computer system
uses simple Read only/Write only operations, without
hardware recognition of line control characters, so that
minor hardware and software modifications were
necessary.
Hardware modifications

Two modifications to the adapter hardware were
made to provide for the experimental PICTUREPHONE/Computer access system operation. These
were to delete the two-second timeout period for the
Read command and to delete the line-control character
(EOT, STX, ETX, SOH, CAN, ACK, NAK) recognition. The first modification was made by grounding
pin 01A-B2-C3-B09 and the second was made by
grounding the "search latch" pin 01A-B2-G3-D06 in
the adapter.

Computer Support for an Experimental Picturephone/Computer System

The two-second timeout was deleted in order to avoid
the termination of the Read command if a character is
not received within a two-second period. Although not
an essential modification, the deletion of the two-second
timeout would avoid the need to continually reestablish
the Read command and thereby permit a more efficient
computer operation.
The line-control character recognition was deleted
because the associated polling mode of operation and
associated message framing are not compatible with
the PICTUREPHONE/Computer access system. Since
only one PICTUREPHONE station is connected
through a Display Data Set to the computer at one
time, polling is not appropriate. Message framing
would require the dedication of one of the 12 TOUCHTONE input characters as an "end of message" character. This would restrict the user input capabilities
and require users to remember to terminate inputs
with a special character. This latter requirement was
judged undesirable since the users are primarily members of upper management. The absence of message
framing implies that the experimental software knows
at all times how many characters it should expect and
requires CPU intervention for each character received.
Once a character is received, several validity checks
are made and then a new Read command is issued;
this process takes approximately 9 ms of CPU time and
it repeats until the experimental software count is
completed. Since the system was implemented a new
"Read Clear" command has been made available which
disables the "search latch" function.
Software modifications

The control unit operations are supported by two
IBM data management access methods. One of these
access methods, Basic Telecommunication Access
Method (BTAM), which controls data transmission,
was used to support the teleprocessing operations in the
experimental PICTUREPHONE/Computer system.
BTAM is most helpful for implementing programs
for telecommunications applications. It presently supports the following· terminal devices: IBM 1030, 1050,
1060, 2260 and 2740 terminals, Bell System 83B3 and
TWX stations and Western Union 115A stations. However, the use of BTAM to control transmission of computer messages to and from an unsupported terminal
device not in the above list, such as the Display Data
Set, requires special attention particularly when considering the use of the BTAM provided device input/
output (I/O) modules.
A device I/O module contains the control information for the generation of channel programs for a given

73

terminal device. The terminal device supported by the
adapter is represented by the device I/O module
IGG019l\13. However, since the experimental PICTUREPHONE/Computer system requires simple Read
only/Write only operations, without line-control character recognition, and the module IGG019M3 did not
provide for this type of operation, it was necessary for
it to be expanded.
The following actions were taken to incorporate into
the device I/O module the ability to support the experimental system:
• Two unassigned operation types representing Read
and Write options in the 32-byte table of offsets
were selected.
• Two entries in the channel program offsets for the
operations were added. Each of these entries have
a count of one for either a Read or Write operation
and a pointer to a channel command word (CCW).
A count greater than one is not necessary since
there is no need for polling or acknowledgment of
responses from the Display Data Set.
• Two channel command words for the Read and
Write operations were added. They are:
01 04 00 00 20 11 04 00 for Write operations
02 04 00 00 20 11 04 00 for Read operations
In the above CCWs, the area address and count
fields are obtained from the data event control
block (DECB) associated with the Write or Read
macro instruction.

PICTUREPHONE SOFTWARE
General

Software for the Bell Laboratories experimental
PICTUREPHONE/Computer system operates in its
own region of the computer. The PICTUREPHONE
software handles several computer ports simultaneously, operates under an overlay structure, and consists
of three self-contained but interrelated modules:
• Input/Output Telecommunications
• Executive Module
• Interactive Abilities
A region size of about 22,000 bytes is required by the
Input/Output Telecommunications Module, which resides in core at all times and occupies the highest
priority task. Because it occupies the highest priority
task, jobs running in lower tasks are interrupted whenever the Input/Output Telecommunications Module

74

Fall Joint Computer Conference, 1971

requires CPU attention. Whenever a call is received by
the computer, a lower priority task is created by issuing
an ATTACH macro instruction. This new task, having
a maximum size of 18,000 bytes, contains the Executive
Module and Interactive Abilities, which reside on
disk. ** When the call is completed, this task is removed
from the system and its main storage area is released.
The following sections will discuss some characteristics
of the modules used in the experimental PICTUREPHONE/Computer system.
I nput/ output telecommunications module

This module is written in Basic Assembler Language
(BAL) and in BTAM. It provides the program interface between the control unit, the Operating System,
and the Executive Module for the PICTUREPHONE/
Computer system. It controls the transmission and reception of messages between the computer and the
PICTUREPHONE user by having the Operating
System instruct the control unit to pass data to the
Display Data Set and to receive data from the Display
Data Set.
The Input/Output Telecommunications Module performs the following functions:

TABLE I-Typical Abilities Provided in the Bell
Laboratories PICTUREPHONE@ /Computer System
1. AT&T Stock Report

2.
3.
4.
5.

Stock Market Report
Personnel Information
Bell System News
System Description (describes the system configuration and
operation)
6. Calculator Routine (addition, subtraction, multiplication,
division, square root functions are available)
7. Keyboard Routine (allows user to input information from an
alphanumeric keyboard by means of the functions described
in the Display Data Set Technical Reference)
8. Personal Files (files with personal information for a single
user or a group of users)

• Creates new task by means of ATTACH macro
instruction.
• Writes (i.e., Transmits) characters to Display
Data Set via the control unit.
• Pending the receipt of an incoming input, or during the transmission of messages, causes the software package to enter a "wait state," i.e., it gives
control to the Operating System to service programs in lower priority tasks.

• Reads (i.e., Receives) characters from the Display Data Set via the control unit.
• Checks for the following ASCII 'control characters
sent by the DDS:
• EN Q-start of call
• EOT-end of call
• DC1-start of keyboard mode. In this mode, an
optional adjunct alphanumeric keyboard can be
used at the PICTUREPHONE station for
input.
• DC3-end of keyboard mode
• SUB-start of edit mode. In this mode the user
can modify the contents of the DDS buffer
without interaction with the computer.
• DC2-end of edit mode. This signals the computer that the user is finished modifying the
DDS buffer contents. At this point the computer
stores the modified display on disk or takes
other appropriate action.
• Translates ASCII characters into numeric format.
• Stores numeric characters in common area accessed by Executive Module.

** If two simultaneous calls are placed to the computer, two new
tasks of 18,000 bytes each are created.

Figure 2-Hello message display

Computer Support for an Experimental Picturephone/Computer SysteIlfl

75

Executive module

The Executive Module is written in FORTRAN
and it provides the logic for the selection of the abilities
(see Table I). Basically, the Executive Module keeps
an account of which display is being shown to the user
and what course should be taken on the basis of his
input. This module will handle the response itself
when the response is a simple information retrieval
application, e.g., AT&T Stock Report, or give control
to one of the Interactive Abilities when the response
requires additional processing, e.g., Calculator Ability.
The Executive module scans every input from the user
before handling any responses. The Executive's primary
function for the user is the provision of general guidance to what information retrieval and other services
the computer can provide.
Abilities

There are eight separate abilities (all written in
FORTRAN) implemented in the experimental system.

Figure 4-INDEX list

Figure 3-Thank you message display

They are briefly described in Table 1. An ability, when
selected, will either handle subsequent dialogue itself
until the user decides to leave it, or return control to a
special section of the Executive which will interact
with the user. This interaction is determined by certain
choices, set by the ability, which are allowed to the
user. Most of the abilities provide information retrieval functions and some of them require some degree
of interaction. All information retrieval displays are
stored on disk in fixed 440-character records. The abilities fit into a wide range of programming sophistication,
from extremely simple to quite complicated. For example, the Calculator Ability allows the user to use his
PICTUREPHONE station set as a desk calculator. Addition, subtraction, multiplication, division and square
root operations, as well as memory, recall, start, and
cancel features are provided. Obviously, the number of
operations available is a function of the software design
and is not limited by the DDS operational requirements. As in most of the other abilities, user-computer
interaction is accomplished with the use of the TOUCHTONE dial. A keyboard ability which allows the usp

76

Fall Joint Computer Conference, 1971

inputted, the Executive Module will make the fact
known and request the extension number again. After
three illegal extension numbers, the user is instructed
to hang up and get some help. At this point, the computer will not accept any more inputs from the user
until a new call is made.
After a user gains access to the system successfully,
he may request an INDEX of available abilities or he
may go directly to any ability for which he knows the
code. An INDEX display is simply a list of available
abilities with their selection code numbers. From this
INDEX any ability may be reached. The inexperienced
user would naturally make use of these lists frequently.
Figure 4 illustrates the INDEX list.
Most of the selection codes consist of two characters, with the INDEX pages and various abilities having permanently assigned codes. For example, the
INDEX has code *1, Stock Market Report has code
12, etc. Figure 5 shows the format of the Stock Market
Report display. Whenever the system is expecting a
two character input, the user may select any ability,
even in the middle of another ability. However, because of display size limitation, many of these choices

Figure 5-Stock market report display

of an adjunct alphanumeric keyboard to update information stored in the computer and which permits a
greater input repertoire is also available in the experimental system. This ability allows the user to retrieve
displays, modify them, and store the modified display
on disk.
METHOD OF PROGRAM OPERATION
To use the experimental system, a user dials the
PICTUREPHONE number associated ,vith the computer and a connection is established. The Input/
Output Telecommunications module recognizes the incoming call, gives control to the Executive Module, and
provides a "Hello" message which is displayed with a
request for the user's extension number (Figure 2).
When the user inputs his four-digit extension number,
the Executive Routine compares the extension number
inputted with a list 0 C valid user numbers. If a match
is found, the name of the user associated with the extension number is displayed in a "Thank You" display
(Figure 3). If an invalid extension number has been

Figure 6-0peration codes in calculator ability

Computer Support for an Experimental Picturephone/Computer System

are not enumerated for him. Even in special routines,
where an input of more than two characters is expected,
the user may always opt for any of the *-plus-a-digit
codes, which are legal at any time in the system. Thus,
the user can always leave an ability at any point if he
is somewhat familiar with the system. In any event,
each display of an ability includes the code to return to
the INDEX page. Within abilities, operation codes
directed to the specific ability itself, are always O-plus-adigit. For example, an input of 01 may display the next
page of a list.
Employing the various codes which the program displays, the user can command the computer's calculating
ability. This is accomplished through the use of twocharacter operation codes. Figure 6 illustrates the set
of TOUCH-TONE codes in the Calculator Ability.
CONCLUSION
General characteristics of an experimental PICTUREPHONE/Computer system at Bell Laboratories have
been described.
The system has been operational on the Bell Laboratories corporate PICTUREPHONE network for about
three years. It has resided in the same computer that
serves other time-sharing applications, such as Con-

77

versational Programming System (CPS) and Administrative Terminal System (ATS). Although the experimental system is not at present part of a vital
corporate information system, it is demonstrative of
potentially useful PICTUREPHONE/Computer capabilities. In this system, the typical call holding time is
about five minutes, the average number of retrieved
displays is ten, and the approximate processing time is
one second per call.
The method of operation described in this paper is
simply one of several methods that could be used in
implementing PICTUREPHONE/Computer systems.
However, the information on the hardware and software used in the Bell Laboratories system may help
those implementing systems in a similar environment.
In addition, the techniques used to provide computer
access to PICTUREPHONE terminals and the method
of interaction employed at Bell Laboratories may be
useful in designing PICTUREPHONE/Computer systems in different environments.
ACKNOWLEDGMENT
Mr. J. J. Mansell was initially responsible for the implementation of the experimental system. His guidance
and technical advice are very much appreciated.

Proposed Braille computer terminal offers
expanded world to the blind
by N. C. LOEBER
IBM Corporation
San Jose, California

some means of communication for his students. It
employs a system of embossed dots on the surface of the
paper, which are felt and read with the fingertips.
Braille is read from left to right, top to bottom,
exactly as a sighted person reads conventional printing.
The average speed of reading is about 100 words per
minute. Figure 1 shows the basic cell configuration for a
Braille symbol. Up to six dots (two vertical columns of
three dots each) are used. The dots of the cell are
numbered as shown. Sixty-three dot patterns or Braille
characters can be formed by arranging the dots in
different positions and combinations. One other
configuration, that of no dots, is used for spacing
between words.

INTRODUCTION
As professional people-engineers, scientists, programmers and managers-most of us make frequent use
of the library. We keep abreast of recent developments
by reading books or technical journals. If we have
developed our reading skills, we can zip through a
document at the rate of 1000 words per minute. But
what would happen to us and our interests if we no
longer had access to the library or if our supply of
reading material was suddenly cut off because we
became blind?
The end of the world you say! No, not quite, but it
might seem so if this fate befell us. Unfortunately, more
than 30,000 individuals lose their sight each year. The
world hastily closes in on them. Books and magazines
are practically out of reach. What alternative do they
have to keep informed?
Thanks to Louis Braille and others, the Braille
system of raised dots on paper provides an opportunity
for written communication. Transcribing and embossing
of Braille is difficult, so there is a limited amount of
literary works available. This situation need not remain
static. It can be improved by applying computers and
programs to help translate Braille and to develop
equipment for embossing Braille.
A brief tutorial on the raised-dot language of Braille
is presented to illustrate some of its complexities. This
is followed by an explanation of the methods used in
producing Braille material. Finally, the proposed
Braille computer terminal will be described and some
experimental results from a feasibility model will be
discussed.

1 •• 4

2 •• 5
3 •• 6

Figure I-Braille cell dot identification

Figure 2 shows representative cell dimensions. The
distance between the center of each dot is approximately ~o inch. There are 4 cells per inch horizontally,
and 272 lines per inch vertically. Dot height is about
.020 inch.
Braille as officially approved in the United States
includes several levels or grades, each level increasing in
complexity with a corresponding reduction in the
number of cells required. Grade I Braille provides full
spelling of words and consists of the letters of the
alphabet, punctuation, and a number of composition
signs which are special to Braille. Figure 3 shows the
basic Braille alphabet. Grade II Braille consists of·
Grade I plus 189 contractions in short form words and
is officially known as English Braille. Grade II Braille is
often compared to shorthand.

THE BRAILLE SYSTEM
Braille was developed more than a century ago by
Louis Braille, a French teacher of the blind, to provide
79

80

Fall Joint Computer Conference, 1971

Figure 4-Word samples written in Braille (Grade I and Grade II)

Figure 2-Braille cell dimensions

In between Grades I and II is another level of Braille
which employs only 44 one-cell contractions. It is
known as Grade 1½. Grade III Braille is an extension
of Grade II, using additional contractions and short
form words and by the use of outlining (the omission
of vowels). Grade III contains more than 500 contracted forms and is used mainly by individuals for
their personal convenience. Several other Braille codes
exist for special applications such as the writing of
music and mathematics.
The majority of experienced blind readers use Grade
II Braille. This is also used for most text printing
because of the advantage of space saving (up to
30 percent), faster reading, and faster writing.


Grade I· Braille utilizes a character-to-cell relationship. That is, each letter of a word would be reproduced
as a Braille cell. This is slow reading and results in a
bulky transcript. However, it is necessary to use this
level of Braille for writing programs and statistics.
As a point of information, there are now approximately
400 individuals who have been trained as programmers.
Although handicapped, these programmers are active
and productive workers in our society.
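As an illustration of this character-to-cell relationship (a sketch, not drawn from the paper), a Grade I encoder can be little more than a table from printed characters to sets of raised dot numbers. Only a few letters are shown here; the dot assignments follow the standard alphabet of Figure 3.

    # Illustrative Grade I (character-to-cell) encoding: each printed character
    # maps to one cell, given as the set of raised dot numbers from Figure 1.
    GRADE1_DOTS = {
        "a": {1},
        "b": {1, 2},
        "c": {1, 4},
        "l": {1, 2, 3},
        " ": set(),        # blank cell used for spacing
    }

    def encode_grade1(text):
        """Return one dot-set per character, i.e. one Braille cell per letter."""
        return [GRADE1_DOTS[ch] for ch in text.lower()]

    print(encode_grade1("call"))   # [{1, 4}, {1}, {1, 2, 3}, {1, 2, 3}]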
Conventional spelling is perfectly feasible in Braille
and is used in some applications such as computer
programming, but the more frequently used contractions are assigned their own dot configurations. Some
cell combinations are used either as a whole word or
part of a word or possible letter groups. In English
Braille, for example, the letter "f" when alone (or
adjacent to a punctuation mark, the capital sign or the
italic indicator) stands for the word "from." The "ch"
sign under these same conditions means "child." This
method of writing is known as contracted Braille, and
the characters used in this way are called Braille contractions. The method differs from regular shorthand
in that by assigning actual letter group values to most

Figure 3-The English Braille alphabet

Figure 5-Use of number sign


of the contractions, conventional spelling is significantly
retained in spite of the contractions.
Figure 4 gives a comparison of Grade I and Grade II
Braille. Because of the many contractions used and the
particular rules that apply to the hyphenation of words,
the capitalizing of letters, and the displaying of numbers,
it is best to consider Braille as a foreign language. It
requires training, skill and practice to be a good transcriber or to write and read Braille.
Numbers are represented by using the first 10 letters
of the alphabet. Figure 5 shows the "number sign"
preceding the letters; this is the scheme for converting them
to numbers. Figure 6 shows how the "capital sign" is
used. A single capital sign tells the reader that the
following letter is capitalized. Two consecutive capital
signs indicate the entire word is capitalized. Thus, we
see that interpretation of Braille is based on adjacent
cells as well as the cell being read.
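Because interpretation depends on the adjacent cells, a decoder is naturally a small state machine. The sketch below is an illustration only, not the authors' software: it assumes the cells have already been recognized as symbols, with '#' standing for the number sign and '^' for the capital sign, and it applies just the two rules described above.

    # Minimal sketch of context-dependent decoding: the number sign makes the
    # following letters a-j read as digits 1-0, one capital sign capitalizes the
    # next letter, and two capital signs capitalize the whole word.
    DIGITS = dict(zip("abcdefghij", "1234567890"))

    def decode(cells):
        out, number, cap_next, cap_word = [], False, False, False
        for c in cells:
            if c == "#":
                number = True
            elif c == "^":
                cap_word = cap_next          # a second '^' in a row -> whole word
                cap_next = True
            elif c == " ":
                number = cap_next = cap_word = False
                out.append(" ")
            else:
                ch = DIGITS[c] if number else c
                if cap_next or cap_word:
                    ch = ch.upper()
                cap_next = False
                out.append(ch)
        return "".join(out)

    print(decode(list("#abc ^jack ^^usa")))   # -> "123 Jack USA"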
BRAILLE PRODUCTION
Braille documents have been produced by using a
variety of devices. Some of these are reviewed here.

Braille slates
Figure 7-Braille slate

Figure 7 shows a typical Braille slate, guide and
stylus. Individual cells or dot patterns are manually
and singly embossed with a stylus which is guided
across the writing line by a metal strip or guide. Braille
embossed on a slate must be the mirrored image of the
actual embossing desired, because the dots are formed
on the back side of the paper. Reading the dots necessitates reversal of the paper. This is essential since the
depressions cannot be felt. To make a correction, the
paper is removed from the slate, and the dots are
flattened with a blunt instrument or correcting tool.

Braillewriters
These are similar to small portable typewriters.
Figure 8 shows a Perkins Braillewriter made by the
Howe Press Company. There is a key for each of the six
possible dots. From 1 to 6 keys must be depressed

Figure 6-Use of capital sign

Figure 8-Perkins Braillewriter


feasible to produce textbooks in "press" Braille. Most
textbooks, school materials and the like are produced
by volunteers utilizing hand or manual embossing
devices. There are several Braille printing houses which
attempt to fulfill the need of general-interest Braille
material. The cost of producing this Braille is partially
defrayed by Government agencies. Some books,
magazines, and other publications are generally available from the Library of Congress.

Computer braille

Figure 9-IBM Braille electric typewriter

simultaneously to emboss a Braille cell. The embossing
usually occurs from the back of the paper so that the
normal left-to-right reading is possible and the transcriber may check his work as he goes along.

Braille typewriter

Figure 9 shows an IBM Electric Braille Typewriter
with a full alphabetic keyboard. The usual type faces
have been replaced with dot configurations. Embossing
is from the front into a rubber platen. This typewriter
is easy to use because only one key is depressed at a
time for any particular cell combination rather than
multiple keys as on the Braillewriter.

Some progress has been made in the production of
Braille on high-speed printers coupled to computers.
Generally, a single copy is produced and has limited
life, depending on the paper used and the method of
embossing. (Properly embossed Braille on special paper
usually lasts for 50 readings.) Embossing by impacting
against a rubber platen does not produce as well-defined
a dot as when a metal die is used. The rubber
platen tends to mushroom the dot base and limit dot
height.
During the past few years, several organizations
have written programs for various computers. Some of
these are used regularly to produce Braille output for
the blind. Printouts may be computer translated
Braille from English input or conversion of computer
output to simple Braille as might be used by a blind
programmer.

Press braille

Where large quantities of the same material are
needed, metal master plates are made on a stereotype
machine such as the one shown in Figure 10, built by
American Printing House for the Blind (APH). These
master plates are then used on various types of presses
for embossing Braille. An example of this might be the
Braille edition of various magazines, periodicals, or
religious books; Figure 11 shows a typical book of
interpointed press Braille.
Unfortunately, due to the wide variety of textbooks
used not only throughout the United States but even
within one state or within a school district, it is not

Figure 10-A Braille stereotype machine built by American
Printing House for the Blind


Figure 11-Braille book

PROPOSED ON-LINE BRAILLE TERMINAL
SYSTEM
Description

Figure 12 shows a typical on-line terminal. Such
terminals are attached by communications lines to a
system to provide real-time response to various inquiries
and computer assistance with mathematical problems.
This capability is available now. It's a matter of using
existing technologies to develop a system that will
greatly improve communications and facilitate the
availability of information in Braille.
The key to the proposed on-line Braille terminal
system is the embossing terminal printer. This system,
supported by some programming, would open
"Pandora's Box" to the blind.
The terminal system should be versatile enough to
operate in two modes, first as a local typewriter unit
and second on-line to a computer. In the local mode
the input terminal keyboard would be modified to
provide the Braille function keys such as the number
sign, capital sign, etc. A possible keyboard layout is
shown in Figure 13. This keyboard includes all the
Braille function keys, as well as the English Braille

contractions. The configuration shown is used by the
Lutheran Braille Workers, Inc., on their modified IBM
keypunches. These automatic Braille transcribing keyboards have been used since 1956 and have proven very
satisfactory.

Figure 12-On-line terminal


Figure 13-Automatic Braille keyboard as used on IBM keypunch

Figure 14 shows a diagram for a possible terminal
system in the local mode, with the ink-print in/out
terminal, encoder and a second unit for embossing
Braille. Ink-print copy can be made available for the
sighted and embossed copy for the blind. The embossing
mechanism, having been carefully designed from a
human factors standpoint, allows reading of the dots
immediately after embossing without need of moving
the paper. This is especially helpful to the blind person
if he is interrupted while typing. It means that he can

read the embossed copy to review what he has written.
A metal die is used to ensure a good quality dot.
Figure 15 shows the on-line operation. When the
embossing terminal is used on-line to a computer,

Figure 14-Proposed Braille terminal system with embosser (local mode): ink-print terminal, encoder, and embossed Braille output (simple Grade I Braille)

Figure 15-Proposed Braille terminal system with embosser (on-line mode): computer and embossed Braille output (all levels of Braille, depending on computer program)


various conversion or translating programs resident in
the computer would assist the individual. These
programs would convert information in various reference banks to the Braille codes.
Examples of possible system operation

Many sighted programmers use on-line terminals to
write their programs and communicate directly with
the computer. This speeds up the process of writing
and debugging programs. The sightless programmer
does not have this advantage. He does not have two-way
communication with the computer. He must depend on
Braille output from a high-speed printer which is
usually run only once a day because certain modifications and special setups are required to print Braille.
If we could provide the sightless programmer with his
own embossing terminal, it would greatly increase his
communication ability. To bring this about, it is
necessary to take the programs that are used to provide
Braille output on a high-speed printer and modify them
for use on a Braille printout terminal. Providing this
capability would mean many additional job opportunities for the blind.
Let us consider another example. Many school
districts are now training sightless children in the classroom along with the sighted. This is an excellent
arrangement in that it does not separate the handicapped child, but rather places him in society where he
can participate and learn to get along with others.
However, it is not an easy arrangement for the teachers,
the school, or the students. Text material, handouts,
and test papers must be provided in Braille for the
child. The production of these various documents can
at times be very awkward, inconvenient, and almost
impossible.
At present it is necessary for the teacher either to
know Braille and transcribe and emboss a document into
Braille for that student or to enlist the aid of a volunteer
to do this for,him. Often there is a lack of time or availability of a trained transcriber to do this. A school
district utilizing a central information bank could have
much of this information available on tapes or on-line
storage. When the teacher needed a copy of a particular
examination, he could request it by phone and it would
be provided to him, perhaps hundreds of miles away,
by means of the on-line embossing terminal.
In cases where a document is not on file, the teacher
could enter the text by typing it into the computer.
The computer would then translate this and provide the
embossed Braille on a real-time quick turnaround basis.
If a terminal was used to prepare the ink-print master
copy of the test, it could also be transmitted to the


computer for translating. The Braille copy would then
be available simultaneously with the master copy of the
test for the sighted individuals. Such an arrangement
would greatly aid the educational process and give the
handicapped child many of the advantages and opportunities provided to the sighted.
A third example is the blind child who is limited by
the number of reference books that are available for him
to do his homework. He really can't afford to own a
personal copy of some books. Besides being expensive,
the books are voluminous, requiring much storage
space. For example, a 30-volume encyclopedia used by
a sighted person would equal 145 volumes of 4- to 5-inch
books in Braille.
The use of a remote terminal and various data banks
or information systems could solve the problem nicely.
It is not hard to realize or project that a blind student
could have a terminal in his home and proceed to do his
homework by dialing into a data bank and inquiring
about the particular subject of interest to him. Imagine
the benefits to this individual if he could dial into a
dictionary or an encyclopedia. What a tremendous
boon to him to be able to inquire on a particular subject
and have the computer respond by embossing on his
remote terminal the information he is seeking.
Projecting even further we can see where a low-cost
embossing terminal could be installed in the home of a
sightless person for daily communication. Major
newspaper items and magazine articles could all be
made available from the central information bank.
Individuals could receive these by means of their
telephone and the Braille terminal. They could have
access to many of the same articles that you and I are
privileged to read.
Our last example covers handicapped individuals who
have other problems in addition to blindness. They too
could benefit by having a terminal in the home. Specifically, this arrangement could provide an opportunity
for a new productive life. The individual, although
afflicted with immobility or other difficulties, could
work in his home and contribute to society. He could be
employed as a programmer, communicating with the
computer, developing programs, receiving his response
from the computer, and enjoying two-way communication.
RESULTS OF EXPERIMENTAL MODEL
An experimental model of the proposed system was
built and used for various feasibility tests. Figure 16
shows the model which consists of an ink-print terminal
with an attached Braille embossing unit. Special attention was given to the human factors requirements


Figure 16-Feasibility model of on-line terminal with Braille embosser


during the design stage. On the basis of this study, the
unit was designed and built to emboss from the rear,
with the data appearing on the front side of the paper.
A metal die was used to mate with the selected pins to
provide positive control in forming the raised dots.
Results of the feasibility tests have been favorable.
Quality of the Braille dots is good, making the symbols
easy to read. Front embossing offers added convenience
to the blind operator in that fingertip reading and
checking are possible while the paper is in the terminal.
Braille printout for the blind and conventional printout
for the sighted, both from the same terminal system,
allow improved speed in communicating with each
other.
Our investigation and experimentation with the
Braille terminal will continue. We hope to define a practical, easy-to-use, general-purpose embossing terminal.
To accomplish this, we are continuing to discuss and


explore the actual use of such a terminal with additional
blind individuals.
BIBLIOGRAPHY
1 R B STEWART JR
Suggestions for curriculum and ancillary services in training the blind to program computers
System Development Corporation Publication No PB-176-841 1968
2 N C LOEBER
Automatic Braille keyboard
IBM Corporation San Jose California Publication No TM 02 138 1960
3 AMERICAN FOUNDATION FOR THE BLIND
Understanding Braille
New York New York
4 AMERICAN PRINTING HOUSE FOR THE BLIND
General catalog of Braille publications
Louisville Kentucky 1969

Numerical simulation of subsurface environment
by BARRY L. BATEMAN
University of Southwestern Louisiana
Lafayette, Louisiana

and
PAUL B. CRAWFORD and DAN D. DREW
Texas A&M University
College Station, Texas

INTRODUCTION

One of the basic prerequisites in performing a
simulation is the availability of easily attainable random
numbers. In this study, uniformly distributed random
numbers between zero and one were generated by a
modified version of IBM's RANDU routine. 5 Several
methods are available for generating various statistical
distributions from uniform random numbers.6 The use
of the cumulative distribution form of the desired distribution is among these methods. This method utilizes
the fact that the range of both the uniform distribution
and the cumulative function is between zero and one.
Using this fact and the limiting parameters on the
second distribution, it is convenient to randomly
generate numbers from the desired distribution.
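As a minimal sketch of this cumulative-distribution method (not taken from the study itself, and using Python's standard generator rather than the RANDU routine), the mapping of a uniform random number through a tabulated cumulative distribution might look as follows; the tabulated values are illustrative only.

    import random
    import bisect

    def sample_from_cdf(xs, cdf, u=None):
        """Inverse-CDF sampling: map a uniform number in (0, 1) through a
        tabulated cumulative distribution given at the points xs."""
        u = random.random() if u is None else u       # uniform on (0, 1)
        i = bisect.bisect_left(cdf, u)                # first index with cdf >= u
        i = min(i, len(xs) - 1)
        return xs[i]

    # Example: a discrete distribution over three permeability classes.
    xs  = [10.0, 100.0, 500.0]      # millidarcys (illustrative values)
    cdf = [0.2, 0.7, 1.0]           # cumulative probabilities
    print(sample_from_cdf(xs, cdf))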

Subsurface environments are heterogeneous. The
permeability and porosity vary from point to point and
permeability is only approximately related to porosity.
To construct a numerical model that is useful in the
analysis of the flow of oil, gas or water in the stratum
one must simulate the porous media.
When permeability and porosity values have been
assigned for each sector of the strata, they may be used
to calculate the fluid saturation at geological equilibrium.
In this paper emphasis has been placed on simulating
realistic rock properties throughout the reservoir.
Advances in reservoir engineering were being accomplished at a rate comparable to those in numerical
techniques. Most early work assumed a homogeneous
rock matrix. It was generally accepted that the reservoir
had one permeability, porosity, initial water saturation,
and one capillary pressure curve that was constant
through the reservoir .1,2 The error of this concept was
well-known, but solutions to more realistic approaches
were too difficult. Instead of a completely homogeneous
reservoir, heterogeneous reservoirs consisting of two
homogeneous layers have been studied. 3,4 This corresponds to stratified or layered systems. Porous rock
that is normally considered homogeneous will have
small to large variations in its porosity and permeability
when sampled at different areas. Although these
variations exist, these properties will have a maximum,
minimum and an average value. Analysis of field data
indicates that these values are not predictable, but can
be represented by a distribution about some mean
value.

PERMEABILITY
Measurements of permeability are given in darcys.
A rock of one darcy permeability is one in which a
fluid of one centipoise viscosity will move at a rate of
one cubic centimeter per second under a pressure
gradient of one atmosphere per centimeter and a crosssection of one square centimeter.7 Since this is a fairly
large unit for most producing rocks, permeabilities are
commonly expressed in units one thousandth as large,
the millidarcy.
The permeability of an oil-bearing and commercially
productive sandstone formation is normally between ten
and five hundred millidarcys. The actual permeability
range, as well as the distribution of permeability values,
is determined from core data of the formation to be
simulated. The distribution of permeability values is
ordinarily not uniform. In a typical rock formation
After the calculations for the first four steps have been
completed, steps 5, 6, and 7 are performed repeatedly
until a permeability value has been calculated for each
computational module. The sequence in which these
values are assigned to the modules is not important.
The 1225 permeability values were generated in this
manner. The permeability range was 10 to 500 millidarcys. In this instance n = 10 and


X0 = 10, X1 = 59, X2 = 108, X3 = 157, X4 = 206, X5 = 255, X6 = 304, X7 = 353, X8 = 402, X9 = 451, X10 = 500


Figure 1-Porosities and permeabilities of 2200 sandstone specimens

there will be a number of values within the range that
will be points of concentration for the measured values.
These values are called cluster points. The cluster
points need not be well defined and the distribution of
values around the points can be irregular, but this
concept is quite useful in representation of geological
strata. Figure 1 is an illustration of this distribution.8
It gives a plot of the porosity versus permeability values
of 2200 sandstone specimens of two types: intermediate
porous media and intergranular porous media. We are
concerned with the area between the two curves which
represents the intergranular porous media. Considering
only the permeability values and not the porosity
values, one can detect several poorly defined cluster
points.
This clustering effect can be generated using the
following procedure:
1. Assume a permeability range X0 to Xn.
2. Choose values of Xi such that X0 < X1 < ... < Xn.

I-~

+

wi
0
0::

+

+

:t

+

+
+

.t

+

+

t-

++

++

+++

-+

+
+

+

+
+t+

++

+

~

t

+- +-

+. +

t

+
+-+
++ .f-

+

+

+-

+

+

+

+

~

+

+

'I-

+

+

a..~ +

+

+

..t+- ;-

+
t-

1-

-\-

++

fl...

0

t +

+

+

+

!e

+
0

q
0

-10.00

6. Find m such that Am-1 < R ≤ Am, where R is the generated uniform random number.

Figure 4-Typical reservoir porosity by blocks (10% to 35%)

left corner of Figure 4 represents a porosity value of
fifteen percent.
The algorithms for porosity and permeability values
were programmed for a digital computer. This program
was used to generate 1225 points. The distribution of
these points reasonably duplicates the actual sample
values presented in Figure 1. Having A, B, and Sigma
as input parameters to the algorithms, as well as the
ranges for permeability and porosity, enables one to
easily duplicate results obtained from actual measurements of samples for a given formation. It should be
noted here that particular measurements made in a
rock formation are not duplicated. The method duplicates the general rock properties of the strata.
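The paper's own steps 3 through 7, with their A, B, and Sigma input parameters, are not fully legible here, so the following is only a generic sketch of the cluster-point idea: choose one of the interior cluster points X1 through X9, then scatter a value about it, clipped to the 10 to 500 millidarcy range. The spread value is an assumption chosen for illustration, not the paper's.

    import random

    # Generic sketch of cluster-point sampling (not the paper's exact algorithm).
    X = [10, 59, 108, 157, 206, 255, 304, 353, 402, 451, 500]   # X0..X10 from the text
    SPREAD = 15.0                                               # assumed, millidarcys

    def clustered_permeability(rng=random):
        center = rng.choice(X[1:-1])                 # interior cluster points
        k = rng.gauss(center, SPREAD)                # scatter about the cluster point
        return min(max(k, X[0]), X[-1])              # keep within 10-500 md

    values = [clustered_permeability() for _ in range(1225)]    # 1225 modules, as in the text
    print(min(values), max(values))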
CAPILLARY PRESSURE
Because oil-bearing reservoirs universally contain
more than one fluid phase, interfacial forces and
pressures are continually influencing both static and
dynamical states of equilibrium. 8 The pressure difference across an interface between two fluid phases is the
capillary pressure in dynes per square centimeter. When
this is expressed in oil-field terms, the equilibrium
capillary pressure in pounds per square inch can be
stated as:

Pc = (h/144)(ρ1 - ρ2) *
Since h is the height above the water table it can be seen
that the initial values of capillary pressure vary vertically but not horizontally.
The relationship between capillary pressure and
water saturation has been experimentally determined by
Figure 3-Typical reservoir permeability by blocks
(10 to 500 md)

* Symbols are defined in the appendix


is held constant. These curves compare favorably with
those of Rose and Bruce. 10

RELATIVE PERMEABILITY

Figure 5-Effect of permeability on the J-function

Rose and Bruce.10 This relationship is given by the equation proposed by Leverett:11

J(Sw) = (Pc/σ)(K/φ)^1/2

where σ is the interfacial tension and φ is porosity; the quantity (K/φ)^1/2 makes J(Sw) dimensionless.
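A direct evaluation of this relation, with purely illustrative numbers in consistent c.g.s. units, might read:

    import math

    def leverett_j(pc, sigma, k, phi):
        """J(Sw) = (Pc / sigma) * sqrt(K / phi); consistent units assumed."""
        return (pc / sigma) * math.sqrt(k / phi)

    # Illustrative numbers only (not from the paper): Pc in dyn/cm^2,
    # sigma in dyn/cm, K in cm^2, phi dimensionless.
    print(leverett_j(pc=5.0e4, sigma=30.0, k=1.0e-9, phi=0.25))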

Rose and Bruce illustrated the nature of the capillary
retention curves, referred to as J (Sw) functions. It is
evident that there is no universal curve, but even though
the curves vary with rock type, they each have the
general shape of a hyperbola. The coefficients Ai,j,
Bi,j, and Ci,j control the shape of the hyperbola. These
coefficients are determined from porosity, permeability
and the type of rock formation. Since there are significant differences in the correlation of the J-function with
water saturation from formation to formation no
universal curve can be obtained. Correlation of the
J-function with water saturation for a number of
different materials is shown in Figures 5 and 6. The
effects of permeability when all other parameters are
held constant is sho"TI in Figure 5. Figure 6 illustrates
the effects of the other parameters when permeability

Figure 6-Effect of J-function asymptote on capillary pressure curves


show there is a decrease of eighty-five percent down to
fifty-five percent.
At seventeen percent oil saturation, the relative permeability to oil is essentially zero. This value of oil
saturation, seventeen percent in this case, is called the
critical saturation. This is the saturation at which oil
will first begin to flow as the oil saturation increases. It
is also called the residual saturation, the value below
which the saturation cannot be reduced in an oil-water
system. This is why all of the oil in a formation cannot
be recovered. When the water saturation in the production module is increased above its residual water
saturation, water begins to flow.


ANALYSIS OF RESERVOIRS
No matter how complete the coring and precise the
data, one is still limited to an examination and study of
rock samples which can constitute at the most an
extremely small fraction of the total reservoir volume.
This sample may be of the order of one ten-thousandth
of one percent of the total reservoir.8 This small areal
sampling of a reservoir could suffice for a description of
the gross average properties of the producing formation.
On the other hand, the ultimate limitation imposed
thereby on the quantitative applicability of coreanalysis data cannot be ignored. The basic fact is that
all the features of the rock which are measured in core
analyses are often so variable in passing from sample to
sample along the well bore that the exact numerical
data for a single sample are of little importance. What
are significant are the average values for a set of
neighboring samples or the large differences between

Figure 7-Typical oil and water relative permeability curves

Figure 8-Minimum, average, and maximum water saturations
from 70 wells-Ten feet of sand

adjacent groups of samples, which may indicate changes
in type of strata or transition zones with respect to
fluid content.
An appreciation of the concepts of porosity, permeability, and fluid saturations may be crystallized in
terms of the numerical values associated with these
terms. Unfortunately, however, no well-defined set of
"typical" magnitudes can be given for these quantities.
These quantities not only vary from formation to
formation, but also from well to well in the same
geologic stratum. Even in a single well, while penetrating a particular zone, the variations in the actual
core-analysis data from sample to sample may be so
large that simple averaging over the whole section may
be unjustified and the supposedly single stratum must
be considered as a composite of several distinct rock
layers. 8 Indeed, it is much easier to exhibit the variability in core analysis data than to provide average
results of any significance.
This paper takes cognizance of the variability in rock
properties from point to point. This is accomplished by
combining rock simulation, capillary pressure, fluid
properties, and physical laws to yield a representation
of geologic strata which illustrates a typical reservoir.
Three of the properties which have been analyzed
are water saturation, permeability and porosity. These
properties were generated from samples on seventy
consecutive wells using the procedures described
previously. The samples were analyzed in one-foot
increments for a sand thickness of ten feet located fifty
feet above the water table.
Data pertaining to the water saturation is shown in
Figure 8. This figure shows the minimum, average, and
maximum water saturations for the odd numbered


Figure 9-Minimum, average, and maximum permeabilities from 70 wells-Ten feet of sand

Figure 10-Minimum, average, and maximum porosities from 70 wells-Ten feet of sand

wells. The minimum water saturations are distributed
between thirty-five and forty-five percent while the
maximum water saturation observed in at least one
sample from a single well was one hundred percent for
more than eighty-five percent of the wells. It should be
noted, however, that one well has a minimum value of
approximately thirty-eight percent water saturation
while its maximum value is only fifty-three percent.
This fact, combined with the realization that the average
water saturation is between forty-eight and eighty-two
percent, reasserts the futility of utilizing an average
core sample to represent an entire reservoir.
Figure 9 illustrates the variability of permeability in
a sample of the previous seventy wells. As shown, the
minimum permeability in a single well can range from
a minimum of ten millidarcys to one hundred and ten
millidarcys while the maximum values vary between
three hundred and sixty and five hundred millidarcys.
More interesting, perhaps, is the range of the average
permeability. It varies from one hundred and fifty to
three hundred and fifty millidarcys. Again, the fallacy
of using average values from core analysis of one well is
graphically illustrated.
Concluding this study of the core analysis of seventy
wells is the display of fraction porosity values shown in
Figure 10. This figure seems to display a tighter distribution of values than Figures 8 and 9, but it should
be remembered that the range of the porosity is between
0.1 and 0.35. Since the range is smaller, minimum values
between 0.14 and 0.202 contrasted with maximum
values between 0.25 and 0.32 are not as clearly defined
as one might expect. The average values seem to fill in

the gap as they range from 0.21 to 0.31. These values
seem especially appropriate to illustrate that an
average value in one well may be a maximum or a
minimum for another well in the same reservoir.
CONCLUSIONS
This paper demonstrates the feasibility of simulating
heterogeneous permeable strata for numerical study on
high speed computers. The method uses the actual core
data for permeability and porosity. The porosity is
related to the permeability by a distribution curve
utilizing a random number generator. The resulting
fluid saturations for each foot of rock may then be
computed by using a relation between capillary pressure, rock properties and fluid saturations when
drilling and coring permeable strata.
REFERENCES
1 J E BRIGGS T N DIXON
Some practical considerations in the numerical solution of
two-dimensional reservoir problems
Soc of Pet Engrs Jour June 1968
2 J DOUGLAS JR D W PEACEMAN
H H RACHFORD
A method for calculating multi-dimensional immiscible
displacement
Trans AIME 1959 Vol 216 297
3 J BJORDAMMEN K H COATS
Comparison of alternating direction and successive
overrelaxation techniques in simulation of reservoir
fluid flow
Soc of Pet Engrs Jour March 1969


4 J BJORDAMMEN
Comparison of three methods for simulating two- and
three-dimensional flow in reservoirs
Master Thesis The University of Texas at Austin January
1968
5 IBM
System/360 scientific subroutine package
Version III Programmers Manual No H20-0205-3
New York 1968
6 K D TOCHER
The art of simulation
The English Universities Press LTD London 1963
7 B C CRAFT M F HAWKINS
Applied petroleum reservoir engineering
Prentice Hall Inc New Jersey 1959
8 MORRIS MUSKAT
Physical principles of oil production
McGraw Hill Book Co Inc New York 1948
9 J W AMYX D M BASS JR R L WHITING
Petroleum reservoir engineering
McGraw-Hill Book Company New York New York 1960
10 W ROSE W A BRUCE
Evaluation of capillary character in petroleum reservoir rock
Trans AIME 1949 Vol 186 127
11 M C LEVERETT
Capillary behavior in porous solids
Trans AIME 1941 Vol 142 152-169
12 C E JOHNSON JR
Graphical determination of the constants in the Corey
equation for gas-oil relative permeability ratio
Journal of Petroleum Technology October 1968


APPENDIX
Nomenclature
Symbol     Definition
Pc         Capillary pressure, psi
S          Saturation
h          Vertical position measured positively downward, feet
K          Absolute permeability, darcy
Kr         Relative permeability
Sigma      Standard deviation for porosity
ρ          Density, psi/ft
σ          Interfacial tension, dynes/cm
φ          Porosity

Subscripts
c          Capillary
i          Index for numbering blocks in the X-direction
j          Index for numbering blocks in the Z-direction
LR         Minimum water saturation
rw         Relative to wetting phase
rn         Relative to non-wetting phase
n          Non-wetting phase
w          Wetting phase

Digital simulation of the general atmospheric
circulation using a very dense grid
by W. E. LANGLOIS*
Notre Dame University
Notre Dame, Indiana

grid"). Nevertheless the 1960s produced significant
advances in understanding the general circulation. As
the decade progressed, the various models (summarily
described by Kolsky1) began to simulate the main
features of the circulation rather well. Certain important details, to be discussed below, do require high
resolution, and of course these are the focus of current
interest, but the principal permanent and semi-permanent circulation systems can be realistically modeled
with a 5°X4° grid.
The above discussion pertains only to research models
of the general circulation, not to forecast models-for
which the resolution problem is quite different. This
may seem paradoxical since, after all, there is only one
general circulation. Presumably it is governed by the
same dynamical system, whether we are trying to understand its behavior or to forecast its evolution from
an observed initial state. If an infinitely fast computer
were available (actually a million MIPS might be
enough) there would in fact be no distinction between
a research model and a forecast model. Realistically,
however, each type of model must leave off certain
features of the other type in order to retain those
features essential to its intended purpose. A research
model requires global (or at least hemispherical) coverage, representation of the non-adiabatic atmospheric
processes, and very-long-term simulation. A forecast
model requires dense coverage, initialization from observed data, and near-real-time operation. The incompatibility of these requirements is illustrated by the fact
that the 2½° X 2° simulation reported here requires
2½ hours of CPU time (about 6 hours of real time) per
simulated day with the program running in a 1000
kilobyte partition of an IBM 360/91, even though the
vertical resolution is quite coarse (2 vertical levels).

INTRODUCTION
The dynamics of the weather represents, in its full
generality, a computational problem which far exceeds
the capability of any computer presently foreseeable.
Fortunately, however, specific aspects of the weather
problem can be profitably attacked with computers already in existence.
The large-scale motion of the atmosphere, usually
termed the general circulation, is one such aspect. Computationally, it is a digital simulation problem based
on a spatial finite-difference grid which, from an anthropocentric point of view, is rather coarse. A general
circulation research model with a grid-spacing of 1° of
longitude by 1° of latitude (110 kilometers by 110
kilometers at the equator) would be regarded as having
extremely high resolution-unrealistically high for
present-day computers.
One must bear in mind, however, that even the 2½°
by 2° coverage used in the present investigation distributes 12816 grid points over the surface of the earth.
Thus the large-scale wind systems, which have horizontal length scales of 1000 kilometers or more, are not
grossly underresolved. Sub-grid-scale effects are important, to be sure, but their details are only weakly
coupled to the large-scale motion. For example, the
grid is far too coarse to resolve the dynamics of a single
cumulus cloud, but the net effect of cumulus activity
in a grid cell can be reasonably well parameterized in
terms of the grid-scale quantities.
Because of the complexity of general circulation calculations, 272° by 2° global coverage is feasible only
with computers in the class of the IBM 360/91 or the
CDC 7600. During the 1960s, general circulation research was carried out with much coarser resolution,
5° X 4° being typically considered a "fine grid" (hence
we have termed 272° by 2° resolution the "hyperfine

TWO SPECIAL ACKNOWLEDGMENTS
The general circulation model used in the present
study is a version of that developed at UCLA by A.

* Visiting Professor of Mathematics

Arakawa and Y. Mintz, with the collaboration of A.
Katayama. It was at Professor Arakawa's suggestion,
and under his guidance, that the hyperfine grid simulation was undertaken. Working from the UCLA listings,
H. C. W. Kwok and the author reprogrammed the
model to run efficiently on a "pipeline" computer. A
few minor thermodynamic modifications were incorporated but most of the model's physics remains as outlined by Arakawa, Mintz, and Katayama in their
Tokyo paper.2 A detailed description of the physics,
and of our final code, is available in the series of reports
by Langlois and Kwok.3
Since preparation of the paper in which we used the
5° X 4° version of the model to study air contaminant
transport,4 Kwok has transferred to projects not concerned with general circulation research. Fortunately
for the present author, however, he had already solved
most of the formidable data-management problems associated with hyperfine grid simulation.
DESCRIPTION OF THE MODEL
What follows is a reasonably complete, but entirely
verbal, description of the general circulation model. The
mathematical details are available in our reports,3 except for certain aspects of the radiation model and
cumulus parameterization which are described in the
appendices of the paper by Arakawa, Mintz and
Katayama. 2
The model troposphere is divided into two quasihorizontal layers of equal mass. Specifically "(J coordinates" are used, i.e., the vertical coordinate is

where P8 is the surface pressure and Pt is the tropopause
pressure which is taken to be a constant 200 millibars.
Thus the upper tropospheric layer corresponds to
o~ (J ~ Y2 and the lower layer to Y2 ~ (J ~ 1. The earth's
surface, which follows the elevation of the large-scale
mountain systems, corresponds to (J = 1. Vertical differencing is carried out in a way which con'3erves the
first and second moments of potential temperature, as
well as other physical quantities obeying integral conservation laws.
Horizontal differencing is carried out in the longitudelatitude plane. In this plane the image of the earth's
surface is a rectangle of height 7r and width 27r. Except
in the immediate vicinity of the poles, the finite-difference grid is constructed by subdividing this rectangle
into a network of congruent rectangular cells measuring
2Y2° of longitude by 2° of latitude. Near each pole, one
row of grid points is skipped. Thus the grid cells along
the 90° N or S latitude lines correspond, on the surface

of the earth, to 272° wedges extending from the pole to
87° latitude. The motivation for skipping points near
the poles is linear stability: At 89° latitude the 272°
longitudinal spacing corresponds to only 5 kilometers,
which would require far too short a time step. The
convergence of meridians near the poles is further mitigated by an averaging procedure which tends to damp
out short waves moving in the longitudinal direction.
Except for these details, the space-differencing is based
on Arakawa's conservative differencing scheme, derived from the dynamical equations written in flux form.
A time step of 2¼ simulated minutes is used. The time differencing scheme is a variation of the Matsuno two-stage scheme, which approximates backward differencing (a schematic sketch of the two-stage step follows the two points below). It differs from Matsuno's original scheme in two particulars:
(1) The non-adiabatic processes are not calculated
at every time step, since they are relatively
slowly varying (time scale about one hour) ; to
over-resolve them would greatly slow up the
simulation.
(2) The fluxes are not estimated the same way at all
time steps, nor at both stages of the same time
step. "Checkerboard instability" is avoided by
alternating between centered and uncentered
estimates.
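For orientation only, the basic Matsuno (Euler-backward) two-stage step is sketched below on a scalar test equation; the model's actual alternation of centered and uncentered flux estimates is not reproduced.

    def matsuno_step(y, t, dt, f):
        """One Matsuno (Euler-backward) step: predict with forward Euler,
        then correct by evaluating the tendency at the predicted state."""
        y_star = y + dt * f(y, t)             # first stage (predictor)
        return y + dt * f(y_star, t + dt)     # second stage (corrector)

    # Test on dy/dt = -y, whose exact solution is exp(-t).
    f = lambda y, t: -y
    y, t, dt = 1.0, 0.0, 0.1
    for _ in range(10):
        y, t = matsuno_step(y, t, dt, f), t + dt
    print(y)    # about 0.389; exact exp(-1) is about 0.368 (scheme is first-order)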
The surface underlying each grid cell is specified as
being ice-free ocean, sea ice, ice-free land, glacier, or
snow-covered land. Grid cells corresponding to ice-free
ocean are assigned surface temperatures appropriate
for the season; since the present paper describes a simulation of northern hemisphere winter, the observed
January values are used. Land surface is regarded as a
thermal insulator with no capacity to store heat; its
temperature is determined by balancing incoming and
outgoing thermal fluxes. If the land is ice or snow
covered, this temperature is constrained not to exceed
the melting point of ice. Sea ice is treated like icecovered land, except that there is some heat conduction
thru the ice. The ice distribution is specified for northern
hemisphere winter; the snow distribution is estimated
as a function of calendar date.
The dependent variables are the surface pressure, the
temperatures and horizontal wind velocities of the two
layers (the vertical differencing scheme represents
these as carried at σ = ¼ and σ = ¾), and the mixing
ratio of the lower layer. Since the moisture-carrying
capacity of the air diminishes rapidly with decreasing
temperature, and hence with increasing altitude, the
mixing ratio is carried rather low, viz., at σ = ¾. The
moisture content of the relatively cold upper layer is
neglected. However, in the final section we present some


evidence that this approximation should really be modified at hyperfine grid resolution.
The model accounts for four contributions to the
non-adiabatic heating and cooling: incoming solar
radiation, mostly visible and near infra-red, which depends on latitude, season, and local time of day; longwave infra-red radiation (usually a heat sink) ; sensible
heat transfer between the lower layer and the underlying surface; release of latent heat during precipitation.
Radiative heating and cooling of the air in a grid cell
depend on the mixing ratio and on the nature and extent of the cloud cover. Three types of clouds are distinguished: stratus deck, which results from large-scale
condensation occurring when the mixing ratio exceeds
its saturation value; cumulus towers, which result
either from convection in the middle portion of the
troposphere or from penetrating convection originating
in the planetary boundary layer; low-level cumulus
clouds, which result from boundary layer convection
that is too weak to penetrate into the middle troposphere, and which produce no rain.
The moisture source in the model is evaporation
from the open sea, from ice or snow covered surfaces,
and from ice-free land that has previously been moistened by rain. The evaporation rate depends on the surface wind speed and on the vapor pressure difference
between the air and the surface. For ocean, ice, and
snow, the surface vapor pressure is the saturation value
for the surface temperature; for ice-free land it depends
on the history of precipitation, evaporation, and runoff.
MOTIVATION FOR HYPERFINE GRID SIMULATION
As indicated in the introduction, the main features
of the general circulation can be simulated without resort to hyperfine grid resolution. This can be seen, for
example, from the pressure maps in the GARP study,5
which employed a 5° X 4° version of the UCLA model
(the "fine grid" version described in our reports. 3 However, certain aspects of weather development are not
well simulated with a 5°X4° grid. For example:
1. The west to east progression of cyclonic storms
is too slow-about 5° per day, whereas 10°, or
slightly more, is typical of the real atmosphere.
2. The development of storms is likewise too slow.
3. The subtropical highs are 5 to 10 millibars too
weak. This is related to the previous two items:
cyclonic eddies form the principal mechanism
for removing heat from the subtropics and, with
this process slowed down, thermal lows tend to
weaken the highs.


Manabe, Smagorinsky, Holloway, and Stone 6 described a high-resolution simulation using the hemispherical general circulation model developed at the
Geophysical Fluid Dynamics Laboratory of the Environmental Sciences Services Administration. Their
horizontal differencing uses a uniformly spaced grid on
a stereographic projection centered at the pole, with 40
grid points between pole and equator. On the average,
this yields horizontal resolution roughly comparable
with that of the present study.
The GFDL group compared the results of their highresolution simulation with those of a previous study in
which they used 20 grid points between pole and equator (roughly comparable to our 5°X4° grid). They
reported that the system of fronts and the associated
cyclone families in the high resolution model are much
more realistic than those of the low-resolution model.
Moreover, they analyzed the energetics of the simulation in some detail, finding that the general magnitude
and the spectral distribution of kinetic energy are in
better agreement with the actual atmosphere as a result
of the improvement in resolution. These findings offer
further incentives for hyperfine grid resolution with the
UCLA model. It must be borne in mind that the two
models accept computer limitations in entirely different
ways. For example, the GFDL model assumes a featureless earth, but uses nine levels of vertical resolution. However, the improvements reported by the
GFDL group appear to result from better horizontal
resolution of the major transport mechanisms essentially common to both models.

THE SIMULATION
As implied in the introduction, research models of
the general circulation are not usually initialized from
real data. Rather, the model generates its own data
which, in present day models, is statistically reasonable
but which does not correspond in detail to the weather
on any actual day.
When a model is run for the first time, its "cold
start" is taken from an artificial initial state. For example, we use a neutrally-stable and completely dry
atmosphere at rest with a uniform sea-level air temperature of 0° centigrade. This choice is easily programmed even for a model with mountains, and it
offers the additional advantage that realistic patterns
evolve after only two simulated weeks (cold start from
an isothermal atmosphere requires about three times
as long to achieve realism because of the extreme
stability). Subsequent runs are initialized from a history tape whose records are generated at specified update intervals (usually 6 simulated hours). There is


Figure 1A-Zonal mean westerlies after 30 days of fine-grid
simulation

also provision for generating a history tape record at
the end of a run which is interrupted between regular
updates.
To save computer time, we did not run the hyperfine
grid simulation from cold start. Instead, we generated
a history tape by interpolating the final state of a 20
day simulation with the 5°X4° grid. This preparatory
simulation began from cold start with climatic data
appropriate to Nov. 1.
The first 10 days of hyperfine grid simulation were
carried out with the non-adiabatic processes calculated
every twenty time steps (45 simulated minutes). At
this point we compared the static features of the simulation with those of a 30 day run with the 5°X4° grid.
The distributions of the mean zonal temperatures
for the two cases were substantially the same. The
permanent and semi-permanent features of the sealevel pressure maps differed primarily in that the Azores
and Siberian highs were about 10 millibars more intense with the hyperfine grid. Changes in the wind pattern are more pronounced; Figure 1 compares the mean
zonal westerlies for the two cases. The vertical structure
was obtained by linear interpolation in σ from the
computed values at σ = ¼ and σ = ¾. The most notice-

able features of the hyperfine grid result are the intensification of the northern hemisphere jet stream and
the appearance of a core of weak westerlies in the intertropical convergence zone.
These first ten days of hyperfine grid simulation revealed that carrying all the moisture in the lower layer
is not fully acceptable at 2½° X 2° resolution. The
problem arises when a grid cell undergoes low-level
convergence and high-level divergence in a region of
high relative humidity. In nature this produces dynamically forced convective rain over the grid cell,
with the released latent heat being divided between the
u-layers, and with some outward water vapor transport in the upper layer. In the model, however, the
upper layer is incapable of carrying moisture. Hence
the model sees large amounts of water vapor advected
into the lower layer of the grid cell, but none carried
upward in spite of the pronounced lifting. This leaves
the lower layer grossly supersaturated. Intense largescale condensation then takes place and the concomitant release of latent heat-all in the lower layerunrealistically destabilizes the atmosphere.
This effect was discovered because of an oversight in
setting up the hyperfine-grid run. Passing to higher
resolution involves changing the continental outlines.
As illustrated in Figure 2, certain areas which repre-


Figure 1B-Zonal mean westerlies after 20 days of fine-grid
simulation followed by 10 days of hyperfine grid simulation

Figure 2-Resolution of the Coral Sea shorelines: dotted line,
fine grid; solid line, hyperfine grid


Figure 3-The simulated sea-level pressure
A. Day 29


C. Day 31

sented coastal regions of Australia or New Guinea with
5° X 4° resolution represent parts of the Coral Sea when
the grid is refined to 27'2° X 2°. The hyperfine-grid history tape was generated by interpolating data from the
fine-grid tape at midnite Gl\1T of "November 20", i.e.,
at 10 A.l\1. Brisbane time during southern hemisphere
summer. Consequently the surface temperatures of
New Guinea and Queensland were rather high (35 to
45 degrees centigrade). Thus some of the Coral Sea
hyperfine-grid cells were assigned surface temperatures
far too high for ocean. Prodigious evaporation immediately took place. A center of convergence just off the
Queensland coast completed the requirements for
runaway precipitation, causing a local hot spot. The
simulation was continued with the ocean temperatures
cooled off to a maximum of 304° Kelvin, which relieved
the Coral Sea problem. However, unrealistically high
supersaturation at a few grid points continued to result
from the model's inability to simulate dynamically
forced convection. Since the algorithm for computing
the large-scale condensation is a second-order Newton-Raphson procedure, it appeared advisable to avoid
trouble by artificially removing moisture in excess of
140 percent relative humidity. Since this grossly unrealistic supersaturation typically occurs at only 1 to 7
grid points out of nearly 13 thousand, the artificial disruption of the model's moisture balance is insignificant.
During the final 10 days of the simulation, the interval for calculating the non-adiabatic processes was reduced to ten time steps (227'2 simulated minutes). The
objective of this was to eliminate more smoothly the
unrealistic results of the sea-temperature error. In
retrospect, however, this was probably unnecessary:
short comparison runs reveal that changing this interval from 20 to 10 time steps produces hardly any discernible effect.
The sequence of simulated events during the last
twelve days of the simulation is depicted in Figures 3A
thru 3L. These are maps of the surface pressure reduced
to sea level. The contour interval is 7.5 millibars and

B. Day 30

D. Day 32


E. Day 33
H. Day 36

F. Day 34

G. Day 35

I. Day 37

J. Day 38


the numerical data indicate pressure excess over 1000
millibars. In each case the simulated time is 00:00
GMT of the indicated day. The thermal low over the
Coral Sea should be ignored because it resulted from
the runaway precipitation described above.
The success of hyperfine grid resolution in simulating
the motion and development of cyclonic storms is best
seen from this sequence by examining the activity over
North America and the eastern North Atlantic. The
history of four separate storms can easily be followed
from the twelve maps in Figure 3.
On Day 29 a broad 1006 mb low was situated off the
New England coast. A day later it had intensified to
989 mb and moved 15° east. By Day 31 it had moved
10° more and merged with the Icelandic low.
Also on Day 31, a 992 mb low appeared over Hudson
Bay. It rapidly intensified to 984 mb and one day later it
was 16° to the east over northern Quebec. Its rate of
progression then slowed to 9° per day and on Day 34
its merger with the Icelandic low was in progress.
On Day 35 a 995 mb low appeared over the Mackenzie district. This one intensified only slightly (to 993
mb), then weakened as it passed across Hudson Bay
and over the Hudson Strait. During its three-day life
span it moved eastward 29°.
The last three maps depict rather complicated cyclonic activity over the United States. The movement
and development seem to be heavily influenced by the

K. Day 39


L. Day 40

blocking effect of a pair of high pressure regions over
southern Canada.
REFERENCES
1 H G KOLSKY
Some computer aspects of meteorology
IBM Journal of Research and Development 11 584 1967
2 A ARAKAWA Y MINTZ A KATAYAMA
Numerical simulation of the general circulation of the
atmosphere
Proc of the WMO/IUGG Symposium on Numerical
Weather Prediction Tokyo Sect IV p 7 1968
3 W E LANGLOIS H C W KWOK
Numerical simulation of weather and climate
A series of reports of the Large-Scale Scientific
Computations Department IBM Research Laboratory
San Jose California
I. Physical description of the model 1969
II. Computational aspects 1969
III. Hyperfine grid with improved hydrological cycle 1970
4 H C W KWOK W E LANGLOIS R A ELLEFSEN
Digital simulation of the global transport of carbon monoxide
IBM Journal of Research and Development 15 p 2 1971
5 R JASTROW M HALEM
Simulation studies related to GARP
Bulletin American Meteorological Society 51 p 490 1970
6 S MANABE J SMAGORINSKY
J L HOLLOWAY JR H M STONE
Simulated climatology of a general circulation model with a
hydrologic cycle III. Effects of increased horizontal
computational resolution
Monthly Weather Review 98 p 175 1970

Simulation of the dynamics of air and water pollution
by LAURENCE W. ROSS
University of Denver
Denver, Colorado

INTRODUCTION

AIR POLLUTION

Simulation of the dynamics of air and water pollution
rests firmly on the diffusion equation, which in simplest
form is known as Fick's second law. The problem of dispersion of solutes and suspensoids is much older than
the pollution crisis, and in fact the development of the
first useful solutions of the diffusion equation came in
response to a need for predicting the spread of poison
gas.
Between the World Wars, a small group of English
investigators pressed forward steadily with this development. The first really useful result came in 1923, and
in 1932 Sutton developed the three-parameter formula
that is still the most widely used in the regulations of
several states, and in the gas dispersion correlations of
the U.S. Army Chemical Corps. The state of development up till about 1950 is admirably summarized in
Sutton's text. 1
The decade of the 1960s witnessed a very strong
upsurge in the development of dynamic models for dispersion of air and water pollution. The opening of the
1970s finds the federal government committed to the
rational management of all our natural resources, and
thus we see a strong upsurge in development of air
pollution models for urban environments, especially.
In the realm of water pollution, thermal pollution poses
an immediate threat that has defied realistic simulation,
so far, and here we also observe a great upsurge in
activity. The problem of solid waste pollution is a
problem in systems engineering, not simulation, and
we must neglect it here even though it has strong
potential for environmental crisis in the 1970s.
Despite the massive efforts, we observe very few
fundamental advances. There are good reasons for the
paucity of really useful, successful simulation models of
air and water transport, and that is the subject of
this paper.

AIR POLLUTION

General theory

The transport of particulates and gases in the lower
atmosphere is influenced by all the natural mechanisms
that give rise to motion. In general, therefore, the
correct mathematical description of the atmospheric
environment must include the rate expressions that
govern the three conserved quantities of the physical
world: momentum, energy, and mass. These general
rate equations are shown in Table 1.
In air pollution, the equations of energy do not enter
the mathematical description except at definite points
(e.g., stacks) where energy is added to the system.
Therefore, the energy equation of Table I may reasonably be neglected on the large scale; it will reappear
when we consider behavior of smoke plumes.
Therefore, the usual set of equations for description
of air pollution dynamics is the following:
Equation of continuity
Equations of motion
Equation of diffusion.
The diffusion equation is obviously coupled to the
equations of motion by the velocity terms. The set is
usually further simplified by assigning
u = u(z),   v = 0,   or   w = 0.

It will be observed that if v = w = 0, then u =
constant. Therefore, one other velocity besides the
horizontal must be retained if the altitude dependence
of u is to be retained.
On a large scale, it is often convenient to retain v and
permit the wind to possess two components parallel to
the earth. This is the procedure adopted by Hino.2

TABLE I-The Equations of Transport

Momentum

x-direction:
ρ(∂u/∂t + u ∂u/∂x + v ∂u/∂y + w ∂u/∂z) = -∂P/∂x - (∂τxx/∂x + ∂τyx/∂y + ∂τzx/∂z)

y-direction:
ρ(∂v/∂t + u ∂v/∂x + v ∂v/∂y + w ∂v/∂z) = -∂P/∂y - (∂τxy/∂x + ∂τyy/∂y + ∂τzy/∂z)

z-direction:
ρ(∂w/∂t + u ∂w/∂x + v ∂w/∂y + w ∂w/∂z) = -∂P/∂z - (∂τxz/∂x + ∂τyz/∂y + ∂τzz/∂z) + ρgz

Energy:
ρCp(∂T/∂t + u ∂T/∂x + v ∂T/∂y + w ∂T/∂z) = k(∂²T/∂x² + ∂²T/∂y² + ∂²T/∂z²)

Mass:
∂C/∂t + u ∂C/∂x + v ∂C/∂y + w ∂C/∂z = Dx ∂²C/∂x² + Dy ∂²C/∂y² + Dz ∂²C/∂z²

Continuity:
∂u/∂x + ∂v/∂y + ∂w/∂z = 0

Assumptions: Constant density (ρ), thermal diffusivity (k/ρCp), and mass
diffusivity (Di), and absence of viscous dissipation contributions to energy.

On an intermediate scale (~ miles), the coordinate
system is often defined such that w = v = 0, and a mean
horizontal wind velocity u is used. On the very smallest
scale (~ 1000 yards), a velocity profile is assumed
a priori; the theory of turbulent flow usually leads to
adoption of the Prandtl mixing-length model (see for
example Randerson3):

κu/u* = ln(z/z0) + const.

Alternatively, a power-law approximation is sometimes
used for convenience in computation, or in similarity
solutions.
The diffusion coefficients present a different sort of
theoretical problem, because they must always be
modeled. Quite often Dx is neglected in view of the
assumption

u(∂C/∂x) >> Dx(∂²C/∂x²).

The lateral and vertical diffusion coefficients, Dy and
Dz, have therefore received principal attention, especially
Dz. However, the best results are still quite
empirical (see Pasquill4).
At this point, it is convenient to mention the
Richardson number, which is defined as

Ri = (g/T0)[(dT/dz) + Γ] / (du/dz)².

The Richardson number may be regarded as the ratio
of the thermal driving force for vertical air motion to the
vertical force arising from turbulent shear. It is the
basic parameter for micrometeorological motion, in the
same sense that the Reynolds number is basic to
confined fluid motion. For example, the vertical diffusion
coefficient Dz is often expressed as a function of
Ri as follows:

Dz = κ²z²(du/dz)(1 - σRi)^β,

where β is usually taken as ½. On a small or intermediate
scale, where the diffusion coefficient cannot be
averaged, the Richardson number may have to be
known.*
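As a present-day illustration only (not part of the original paper), the following Python sketch evaluates the Richardson number and the stability-corrected vertical diffusion coefficient defined above; the values used for Γ, κ, σ, β and the sample gradients are placeholders chosen for the example.

```python
def richardson_number(dT_dz, du_dz, T0, gamma=0.0098, g=9.81):
    """Ri = (g/T0) * [(dT/dz) + gamma] / (du/dz)**2, as defined in the text."""
    return (g / T0) * (dT_dz + gamma) / du_dz ** 2

def vertical_diffusivity(z, du_dz, Ri, kappa=0.4, sigma=1.0, beta=0.5):
    """Dz = kappa^2 z^2 (du/dz) (1 - sigma*Ri)^beta; valid only while sigma*Ri < 1."""
    return kappa ** 2 * z ** 2 * du_dz * (1.0 - sigma * Ri) ** beta

# Slightly stable layer at z = 10 m (illustrative numbers only)
ri = richardson_number(dT_dz=0.01, du_dz=0.1, T0=288.0)
print(ri, vertical_diffusivity(z=10.0, du_dz=0.1, Ri=ri))
```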
At this point, we have reduced the diffusion equation
to the form:

∂C/∂t = ∂/∂y(Dy ∂C/∂y) + ∂/∂z(Dz ∂C/∂z) - u(∂C/∂x) - R.      (1)

The transient equation is practically never of interest,
and R is rarely considered (of this, more below), and
the remaining equation has the following general
solution (for a point source of emission):

C = Q/(π ū σy σz) exp[-½(y²/σy² + z²/σz²)]      (2)

Particular solutions, based upon particular choices of
the form of σy and σz, are shown in Table II. The
Bosanquet-Pearson solution, for example, carries the
implicit assumption that5

Dy ∝ ux,   Dz ∝ uz.

* Note that, in general, Ri=Ri(z).
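As a present-day illustration only, the sketch below evaluates the Gaussian point-source solution, Equation (2), with Sutton-type spread functions of the form given in Table II; the emission rate, wind speed and the Cy, Cz, n values are arbitrary placeholders, not data from any of the cited studies.

```python
import math

def plume_concentration(Q, u, x, y, z, sigma_y, sigma_z):
    """Equation (2): C = Q/(pi*u*sy*sz) * exp(-0.5*((y/sy)**2 + (z/sz)**2)),
    with sy = sigma_y(x) and sz = sigma_z(x) supplied as functions of downwind distance."""
    sy, sz = sigma_y(x), sigma_z(x)
    return Q / (math.pi * u * sy * sz) * math.exp(-0.5 * ((y / sy) ** 2 + (z / sz) ** 2))

# Sutton-type spreads (Table II, entry 1) with hypothetical Cy, Cz, n
Cy, Cz, n = 0.2, 0.1, 0.25
sigma_y = lambda x: Cy * x ** (1.0 - n / 2.0) / math.sqrt(2.0)
sigma_z = lambda x: Cz * x ** (1.0 - n / 2.0) / math.sqrt(2.0)

print(plume_concentration(Q=1000.0, u=4.0, x=500.0, y=0.0, z=0.0,
                          sigma_y=sigma_y, sigma_z=sigma_z))
```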


Modeling urban air pollution

Equation (2) is the basis of most current air pollution
models for urban areas. There have been numerous
applications (see Tikvart6 and Stern7 ), differing mainly
in the treatment of the source field, assignment of σy
and σz, and whether or not short-range wind structure
is analyzed. It should be noted that the scale of air
pollution dispersion suggests the use of the asymptotic
form of equation (2), viz.,
(3)

Miller and Holzworth8 are the principal exponents of
this approach. Bowne9 has reported a digital simulation
of air pollution patterns over the State of Connecticut,
based on Equation (2). Hino 2 solved a coupled system
involving the two-dimensional diffusion equation and
the equations of motion (Navier-Stokes) in two dimensions to handle the problem of topographical variations.
Interestingly, the grid square dimensions are usually
either 1 mile or 1 km in all investigations.
The basis of the urban modeling method, when using
Equations (2) or (3), is to employ experimental data as a
means of assigning each grid square as a point source of
given emission strength (Figure 1). It is wasteful of
computer storage to maintain more than a few levels of
emission strength in the model (e.g., three levels in
Figure 1). Then the mean wind speed at a given direction
is applied, together with assignments of σy and σz
based upon the model chosen, and concentration in a
given cell is computed as the resultant of the contributions
from other cells, above some predetermined
lower limit of concentration. Figure 2 illustrates the
principle.

[Figure 1-Typical pattern of pollution emission strengths (shown at
three levels of emission strength, lb/day/sq mi). Typical grid
dimension is 1 mile square]

[Figure 2-Pollution isopleths resulting from applying the dispersion
model to measured emission patterns]

TABLE II-Practical Solutions of the Diffusion Equation for Air Pollution
(Case of Continuous Point Source in a Wind)

1. Sutton equation
σy/x = (1/√2) Cy x^(-n/2),   σz/x = (1/√2) Cz x^(-n/2)
C = (2Q/π Cy Cz ū x^(2-n)) exp{-x^(n-2)[(y²/Cy²) + (z²/Cz²)]}      (II.1)
Note: n = ¼ for neutral atmosphere, ⅕ for strong lapse, ½ for strong inversion

2. Calder equation
σy/x = √2 (a k u*)/ū,   σz/x = √2 (k u*)/ū
C = (Q ū / 2k² a u*² x²) exp{-(ū/k u* x)[(y/a) + z]}      (II.2)
Note: Not Gaussian!

3. Bosanquet-Pearson equation
σy/x = q,   σz/x = √2 p
C = (Q / 2^(3/2) π ū p q x²) exp[-(y²/2q²x²) - (h/px)]      (II.3)

4. Modified Sutton equation
σy/x = √2 Cy x^(ny-1),   σz/x = √2 Cz x^(nz-1)
C = (Q / 2π Cy Cz ū x^(ny+nz)) exp[-½(y²/Cy² x^(2ny)) - ½(z²/Cz² x^(2nz))]      (II.4)
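The grid-square procedure just described amounts to superposing Equation (2) contributions from every emitting cell that lies upwind of a receptor cell. The Python sketch below is a minimal, hypothetical rendering of that bookkeeping (ground-level sources and receptor, wind along +x); the cutoff argument plays the role of the predetermined lower limit of concentration mentioned in the text, and all numerical values are placeholders.

```python
import math

def cell_concentration(receptor, sources, u, sigma_y, sigma_z, cutoff=1e-9):
    """Sum ground-level Equation (2) contributions from upwind grid-cell sources.
    receptor: (x, y); sources: list of (x, y, Q) cell-centre point sources."""
    xr, yr = receptor
    total = 0.0
    for xs, ys, Q in sources:
        dx, dy = xr - xs, yr - ys
        if dx <= 0.0:                     # only cells upwind of the receptor contribute
            continue
        sy, sz = sigma_y(dx), sigma_z(dx)
        c = Q / (math.pi * u * sy * sz) * math.exp(-0.5 * (dy / sy) ** 2)
        if c >= cutoff:                   # drop negligible contributions
            total += c
    return total

# Three emitting cells on a 1-mile grid (hypothetical strengths), receptor 5000 m downwind
print(cell_concentration((5000.0, 0.0),
                         [(0.0, 0.0, 500.0), (1600.0, 800.0, 100.0), (3200.0, 0.0, 50.0)],
                         u=4.0, sigma_y=lambda d: 0.1 * d, sigma_z=lambda d: 0.05 * d))
```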
Day-to-day 'pollution control requires a somewhat
different approach. Here the problem is to produce the
pollution pattern (as in Figure 1) from moment-tomoment measured data, then to apply (2) or other
predictor relation to obtain estimates of pollution
levels. Figure 3 shows the situation with respect to data
monitoring stations. With distance and position of
emission sources established with respect to wind
direction, it is feasible to estimate the concentration of
a given pollutant at any position, if its stoichiometry is
known. For example, the use of fuel of known sulfur
content will produce a given amount of SO2. This is the
model described by Takamatsu et al., 10 for air pollution
control in Osaka. Furthermore, Takamatsu's model
considers a smaller scale than that of Bowne9 or
Randerson, 3 and recognizes that the region near ground
level is subject to different meteorological patterns
than higher regions. This leads to consideration of a
separate layer, the "complete mixing zone," where the
diffusion equation applies, the zone above being assumed
a perfect sink for pollutants. This also resembles the
"box model" of Lettau.11
The so-called complete mixing zone has not been
identified formally with the famous "inversion layer,"
nor is the APCO definition of "mixing depth" the same
as the depth of this zone, necessarily. Indeed, the failure
to simulate inversion effects is one of the principal
embarrassments of simulation efforts to date. By
whatever definition, it is clear that a criterion for a
layer of finite depth is required. The only theoretical
basis for such a finite depth is the stability height of
Monin and Obukhov,I2 formalized in 1954 as
(4)


This is essentially equivalent to a normalization of
height against a Richardson-number criterion. It
requires a knowledge of friction velocity u* and heat
flux q, but this is not excessively demanding and the
present author believes that the stability height
deserves more use.
[Figure 3-Features of an emission monitoring system for air
pollution control. It should be noted that individual sources are
identified]

Special cases

Although "special," these may be the cases of more
intense public interest. For example, the simulation of
smog dynamics is obviously unsatisfactory, for otherwise the means of smog control would have found their
way into legislation instantly. The reaction kinetics of
smog processes has been deciphered with reasonable
confidence, but the physical influences-diurnal temperatures and winds, air-water interactions, etc.-are
still mysteries. For example: What is the ultimate fate
of smog? No one knows.
The dynamics of smoke plumes has received intensive
study, and seems to be well understood (see for example
References 13, 14). This case is especially interesting
because it requires simultaneous consideration of the
equations of momentum, energy, and mass (Table I).
Practically no simplifications are available in the
general case, and this becomes a demanding exercise in
computer simulation. Another feature of the smoke
plume problem is that it is a natural convection problem,
which fact invites the application of two-dimensional
analysis by combination of variables and computation
of stream functions; this approach has not been applied,
to date.
Future directions for air pollution simulation

The basic missing ingredient in air pollution simulation is meteorological measurement. Emission measurements, on the other hand, are fairly well advanced,
except that we still have not had the political courage
to pinpoint individual sources of strong emissions. *
This lack of measurement is surprising, because it
would be simple, and a series of brilliant experiments by
English scientists have provided ample verification of
the theory. The quantities that require measurement
are not in doubt.
The current programs in various cities provide masses
of data for regression analysis. However, these obviously
have application only locally, and only then for limited
periods of time. Thus, they cannot be used for prediction except in a statistical sense, and they are useless
for control. To rectify this situation, it is merely
necessary to obtain data on the same basis that the
theoretical models are constructed. This calls for wind
speed and direction (at other points than just the local
airport!), and the wind and temperature profiles in
vertical direction. The "stability depth" must also be
established, but the other measurements will probably
make this automatic.
Chemical change in the atmosphere has received very
little consideration up to the present. Most simulation
experiments have been based on sulfur dioxide, overlooking the fact that about three-fourths of the SO2
emitted to the atmosphere is converted to H2SO4, which
is rapidly removed by condensation. Thus (for example),
a downwind variation will probably yield inaccurate
conclusions about the diffusion coefficients, because
reaction is also a significant mechanism of pollutant
elimination. Smog pollution has prompted some important studies of reaction kinetics, but the complexity of
smog reactions and the physical influences on smog
(moisture, sunlight, mountain barriers, etc.) make this
a very difficult subject. Thus, we are in the unfortunate
position of lumping chemical changes into diffusion
coefficients, which leads us to conclude that we must
obtain diffusion coefficients for each separate polluting
species. Most important of all, mechanistic models of
diffusion coefficients become meaningless.

* The Japanese, to their credit, have done this.


Reaction in the atmosphere should logically be
simulated by supplying a functional form for R, the
reaction rate. In the case of SO2 and CO, this form may
be satisfied by a first-order assumption, i.e.,

R = -kR·C.

The literature contains a few values for kSO2, but they
vary over two orders of magnitude, which probably
points to the influence of moisture, associated fly ash,
sunlight, and possibly ozone. There is practically
nothing available for other gases. Alternatively, the
disappearance of gaseous species may be simulated by a
sink term (constant) in the diffusion equation, but no
investigators have reported this method.
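For a steady plume, a first-order loss R = -kR·C can be folded into any of the foregoing solutions as a depletion factor exp(-kR x/u) applied over the travel time x/u. The short sketch below shows only this factor; the value of kR is purely illustrative, since, as noted above, the published values span roughly two orders of magnitude.

```python
import math

def depletion_factor(k_r, x, u):
    """Fraction of emitted material surviving first-order chemical loss
    (rate R = -kR*C) after travelling a distance x at mean wind speed u."""
    return math.exp(-k_r * x / u)

# kR = 1e-4 1/s (illustrative), receptor 10 km downwind, 5 m/s wind
print(depletion_factor(1e-4, 10000.0, 5.0))   # about 0.82
```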
In the case of particulates, there is loss by deposition.
This generally calls for inclusion of v(∂C/∂z) in the
diffusion equation, and a suitable boundary condition
at the earth's surface, e.g.,

lim C(x, z) = (Q/u) δ(z)

as suggested by Calder.15
Smoke plume simulation is seldom the subject of
computer simulation. Nevertheless, plumes have been
studied extensively, because of their importance in
prediction of pollution from stacks. The situation is
very complex, combining the influences of fluid motion,
energy, and diffusion, so that analytical solutions are
not reasonable to seek. Csanady13,14 is the outstanding
investigator in the field; there is a useful review of the
subject by Brummage. 16 A typical mathematical
formulation of this situation for the case of vertical
plumes is given by

∂(ρrw)/∂z + ∂(ρru)/∂r = 0

∂(ρrw²)/∂z + ∂(ρruw)/∂r = r(θ'/θe)ρg + ∂(rτ)/∂r          (5)

∂(ρrwθ)/∂z + ∂(ρruθ)/∂r = -(1/Cp) ∂(rF)/∂r

When written in cylindrical coordinates, which is
natural for vertical plumes, the use of analog computer
techniques is possible (see for example Reference 17).
SIMULATION OF WATER POLLUTION
DYNAMICS
General theory

The dispersion of pollutants in water is identical to
dispersion in air, at least in principle. The general
equations of transport (Table I) still apply.


However, measurements in waterways are somewhat
more difficult to obtain than measurements in air, and
we find that the diffusion equation is universally written
as

∂C/∂t = D(∂²C/∂x²) - u(∂C/∂x).      (6)

O'Connor18 seems to be the only investigator who has
applied two-dimensional modeling, although numerous
authors have recognized that the general model must
be multi-dimensional. The usual one-dimensional
expression obviously lumps lateral and vertical dispersion
effects into the longitudinal parameters D and
u. Furthermore, the velocity u is usually assigned as
the overall average,

u = Q/A      (7)

so that the burden on D is all the greater.
Despite the theoretical reservations that must
surround such a simplified model, it has been remarkably
successful. The advantages of the model (two
parameters plus Gaussian form) are considerable,
because waterways do exhibit this form of behavior
(Figure 4).19

[Figure 4-Dispersion of pollutant in the Potomac River as a
function of time, 2.3 miles downstream of injection point17
(measured data compared with a correlation using u = 0.8, D = 5.0,
and with the same correlation corrected for daily advective variation)]

The model according to Equation (6) may describe
either pollutant, expressed as BOD or COD, or dissolved
oxygen (DO). In the case of DO, a suitable
boundary condition is required, usually

{dC/dz = kL(C* - C)}z=0.

Sometimes, the right side of this expression is added to
Equation (6), implying that aeration of the waterway
is a homogeneous process; this is incorrect.
Equation (6) often requires addition of a reaction
term R. This may describe either reactive decay of the
dissolved pollutant or "dead zones" describing imperfect
mixing, as defined by Krenkel.20
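For orientation, the one-dimensional model of Equation (6) has a simple closed-form response to an instantaneous slug release, and it is this Gaussian behavior in (x - ut) that Figure 4 exhibits. The Python sketch below evaluates that textbook solution; the values u = 0.8 and D = 5.0 are simply the numbers quoted in the figure legend (units left unspecified here), and M/A is set to unity for illustration.

```python
import math

def slug_concentration(M_over_A, D, u, x, t):
    """C(x, t) = (M/A)/sqrt(4*pi*D*t) * exp(-(x - u*t)**2 / (4*D*t)):
    response of dC/dt = D*d2C/dx2 - u*dC/dx to an instantaneous slug at x = 0."""
    return (M_over_A / math.sqrt(4.0 * math.pi * D * t)
            * math.exp(-(x - u * t) ** 2 / (4.0 * D * t)))

# Concentration history at a station 2.3 miles downstream of the release point
for day in range(1, 7):
    print(day, slug_concentration(M_over_A=1.0, D=5.0, u=0.8, x=2.3, t=float(day)))
```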
Lumped-parameter simulation

The simulation of real streams presents several
problems, especially those of tributary influx and the
mainstream velocity variation (meander). This has
led several investigators to consider that simulation by
lumped-parameter formulations is required, in order to
absorb all the distorting influences. The method has
been used for many years by individual industries to
describe dispersion of their pollutant discharges, but the
definitive formulation is given by Thomann21 in the
following form:

Vi(∂Ci/∂t) = Qi[ξiCi-1 + (1 - ξi)Ci] - Qi+1[ξi+1Ci + (1 - ξi+1)Ci+1]
             + Ei(Ci-1 - Ci) + Ei+1(Ci - Ci+1) + kViCi + Pi.

A good example of the application of this method is
reported by Hetling.22 The method is obviously suitable
for analog simulation if the parameters are available,
or if the data are sufficient to permit extraction of the
parameters by potentiometer twiddling or formal
methods.23
The lumped-parameter method of simulation can
always be made to succeed if (and only if) the data
supply is sufficient to permit evaluation of the parameters.
This is the great virtue of the lumped-parameter
method. On the other hand, the method contributes
nothing to theory, and the parameters cannot be
extended to other waterways or even to different
situations on the same waterway.
Simulation of thermal pollution

Thermal pollution has emerged as a major problem
because our waterways will soon be saturated with heat,
if the present rate of growth is maintained. Nuclear
power plants are especially serious offenders in terms
of waste heat.
The dispersion of waste heat in waterways requires
consideration of the energy equation (Table I). In lakes
or ponds, or in well-behaved waterways, the simulation
may be based on straightforward application of the
energy equation, viz.,

∂T/∂t = ∇·(D∇T) - u(∂T/∂x).

However, the boundary conditions of thermal pollution
are difficult, because they must include the effects of
radiation, conduction, and evaporation at the waterway
surface:

{-D(∂T/∂z) = hr(T0⁴ - T⁴) + hc(T0 - T) + ke(p* - p)}z=0.      (8)
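As a small illustration of evaluating the surface condition in Equation (8), the function below computes the bracketed exchange term; the film coefficients hr, hc, ke and the temperatures and vapour pressures supplied are placeholders, since in practice they must come from field correlations.

```python
def surface_exchange(T, T0, p, p_star, hr, hc, ke):
    """Right-hand side of the surface condition in Equation (8):
    hr*(T0**4 - T**4) + hc*(T0 - T) + ke*(p_star - p)."""
    return hr * (T0 ** 4 - T ** 4) + hc * (T0 - T) + ke * (p_star - p)

# Warm effluent (303 K) under a 288 K sky; coefficient values are illustrative only
print(surface_exchange(T=303.0, T0=288.0, p=1.7e3, p_star=4.2e3,
                       hr=5.5e-8, hc=10.0, ke=0.05))
```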


The difficulty of defining waterway velocity, in all but
the simplest situations, has discouraged the use of this
formulation. Edinger24 has used a linearized version,
lumping all boundary effects into a single coefficient of
exchange, for the case of a cooling pond. The nonlinear
radiation term may be avoided by simply specifying
the radiative flux, which may be taken as constant over
a given period.
Simulation is usually achieved by lumped parameters.
Jaske 25 has the principal body of work here, but
Yearsley's work26 best represents the current thinking
of the federal water establishment. The federal government has published a manual that suggests modeling of
thermal pollution by a simple first-order decay law.
Modeling of thermal pollution by two-layer representations had some early success, but has been
neglected in recent years. The principle, obviously, is
based on considering a hot layer atop a cold layer;
this is a very reasonable model, as we have observed in
the laboratory. The only basic difficulty with the concept is the necessity to describe the shear stress at the
two-layer interface, but this should be capable of
extraction by well-known methods if sufficient data are
available.

Estuarine salinity models

The dynamics of salinity exchange in estuaries is not
exactly water pollution dynamics, but is interesting
because (1) salinity exchange is a problem of differential
densities, similar to thermal pollution, and (2)
some very fine work has been performed in this field
probably foreshadowing future developments in water
pollution.
For example, the development that Rattray27 uses in
describing steady-state circulation in fjords is as
follows:

u(∂u/∂x) + w(∂u/∂z) = -(1/ρ) ∂(P + ½ρu0²)/∂x + ∂/∂z(A ∂u/∂z)

∂(ub)/∂x + ∂(wb)/∂z = 0                                        (9)

u(∂S/∂x) + w(∂S/∂z) = ∂/∂z(Dz ∂S/∂z)

Still required is a relation between density and salinity,
ρ(S), usually taken as a linear function. Then introduction
of stream functions and combined variables
yields (as usual) a nonlinear set that can readily be
resolved by computer methods. The number of parameters
is formidable, but the results are promising.
Future directions for water pollution simulation
Techniques of water pollution simulation are essentially at a standstill. This fact reflects the rapidly
improving water pollution situation, and the difficulty
of improving upon existing two- and three-parameter
models.
Thermal pollution is the one major area where the
problems are growing rapidly, and this is where simulation will probably be needed soonest. The available
correlations are not adequate, and recourse to methods
based upon the energy equation seems certain, sooner
or later.
We should probably expect no improvement in the
theory, in the foreseeable future. In contrast to the
atmosphere, which is vast enough to be described by
more or less general theory, waterways are highly
individualistic. The problem of meander is fundamental,
and it will resist simulation for some time to come, until
the need for understanding of our crowded waterways
on a short-range scale provides the stimulus.

CONCLUSIONS

Simulation of air and water pollution dynamics has
developed rapidly, but will probably experience no
outstanding developments in the 1970s. The theoretical
basis is satisfactory, and the current generation of
computers is entirely adequate for the task of simulation.
The chief restrictions on the development of this field
are those of measurement. Neither air pollution nor
water pollution measurement programs, as presently
implemented in the U.S., provide sufficient data for
control or for generalizations that may be applied to
the nation's urban areas.
In air pollution, the simulation of urban pollution
patterns (especially smog patterns) is moving steadily
forward. However, the present author is convinced that
several basic considerations are being neglected, which
could easily be repaired by attention to the theory so
painstakingly developed by foreign investigators. It is
difficult to resist the conclusion that political considerations
outweigh the scientific considerations.
In water pollution, the theory ends with two-parameter
Gaussian dispersion models. All efforts past
this point resort to linear, lumped-parameter
models. There is room for breakthroughs here, but there
is no particular impetus for them, so we should not
expect them in the 1970s. The one possible exception is
thermal pollution, which may become a crisis item in
the 1970s.
Since Japanese cities have long since resorted to fuel
consumption regulation based upon modeling of air
pollution dynamics, it seems very logical to expect
similar considerations to be applied in the U.S. in the
foreseeable future, in both air and water pollution.

REFERENCES

1 O G SUTTON
Micrometeorology
McGraw-Hill New York 1953
2 M HINO
Atmos Environment 2 541 1968
3 D RANDERSON
Atmos Environment 4 615 1970
4 F PASQUILL
Atmospheric diffusion
Van Nostrand New York 1962
5 A I DENISOV
Izvest Akad Nauk SSSR Ser Geofiz 6834 1957
6 J A TIKVART
Computer simulation and air quality control
Paper published by NAPCA 1970
7 A C STERN Ed
Proceedings of Symposium on Multiple-Source Urban
Diffusion Models US Environmental Protection
Agency APCO Publ No AP-86 1970
8 M E MILLER G C HOLZWORTH
Journal Air Pollution Control Assoc 17 46 1967; ibid 232
9 N E BOWNE
Journal Air Pollution Control Assoc 19 570 1969
10 T TAKAMATSU et al
Computer control system for air pollution
Published by Kyoto University and the Osaka
Prefectural Govt 1967
11 H H LETTAU
Physical and meteorological basis for mathematical models
of urban diffusion processes
In Stern A C ed Proceedings of Symposium on
Multiple-Source Urban Diffusion Models US EPA APCO
Publ No AP-86 pp 2-1 through 2-26 1970
12 A S MONIN A M OBUKHOV
Trudy Geofiz In-ta AN SSSR No 24 p 151 1954
13 G T CSANADY
Journal Applied Meteorology 10 36 1971
14 P R SLAWSON G T CSANADY
Journal Fluid Mech 47 33 1971
15 K L CALDER
Journal Meteorology 18 413 1961
16 K G BRUMMAGE
Atmos Environment 2 197 1968
17 M P MURGAI H W EMMONS
Journal Fluid Mech 8 611 1960
18 D J O'CONNOR
Journal San Eng Div ASCE 91 23 1965
19 L W ROSS
Simulation 14 95 1970
T SAVILLE
Bulletin No 125 Florida Eng and Ind Exp Sta August 1966
20 J R HAYS P A KRENKEL
Advances in water quality improvement
Vol 1 p 111 Univ of Texas Press Austin 1968
21 R V THOMANN
Journal San Eng Div ASCE 89 No SA5 1 1963
22 L J HETLING R L O'CONNELL
Water Resources Research 2 825 1966
23 E S LEE I WANG
Journal Water Pollution Control Fed 43 306 1971
24 J E EDINGER J C GEYER
Journal San Eng Div ASCE 94 611 1968
25 R T JASKE J L SPURGEON
Water Research 2 777 1968
26 J YEARSLEY
A mathematical model for predicting temperatures in rivers
and river-run reservoirs
Working Paper No 65 US Dept of Interior FWPCA
March 1969

APPENDIX
SYMBOLS

a          Parameter of Calder's equation (Table II)
A          Area of waterway cross section; vertical coefficient of turbulent velocity
b          Width of waterway
C          Concentration of pollutant species
Ci         Concentration of pollutant species in segment i (Equation (7))
C*         Oxygen saturation concentration in water
Cp         Heat capacity
Cy, Cz     Diffusion-related coefficients of Sutton's equation (Table II)
D          Diffusion coefficient of energy or mass in water (x-direction)
Dx, Dy, Dz Diffusion coefficients in respective directions
Ei         Turbulent exchange coefficient (Equation (7))
F          Heat flux function
g          Gravitational acceleration
hc         Film coefficient of conduction
hr         Film coefficient of radiation
k          Rate constant (Equation (7) only)
ke         Evaporation film coefficient
kL         Oxygenation film coefficient
kR         Reaction rate constant
L          Monin-Obukhov stability height
n          Parameter (Table II)
ny, nz     Parameters (Table II)
p          Partial pressure of water vapor; static hydraulic pressure; parameter of Bosanquet-Pearson equation (Table II)
p*         Vapor pressure of water
Pi         Source term, segment i (Equation (7))
q          Heat flux (Equation (4)); parameter of Bosanquet-Pearson equation (Table II)
Q          Rate of pollutant emission
Qi         Pollutant flux in segment i (Equation (7))
r          Radial dimension
R          Reaction rate
Ri         Richardson number
S          Salinity of waterway
t          Time
T          Temperature
T0         Surface temperature
u          Velocity in horizontal (x) direction
u0         Mean velocity
u*         Friction velocity, √(τ0/ρ)
ū          Mean velocity in horizontal direction
v          Velocity in lateral (y) direction
w          Velocity in vertical (z) direction
Vi         Volume of segment i (Equation (7))
x, y, z    Cartesian coordinates
z0         Roughness height
β          Parameter
Γ          Adiabatic lapse rate
δ          Dirac delta function
κ          Karman constant
θ          Potential temperature
θ0         Reference potential temperature
θe         Equilibrium potential temperature
ξi         Correction for flow in segment i (Equation (7))
ρ          Density
σ          Parameter
σy, σz     Standard deviations in the respective directions
τ          Shear stress
τ0         Shear stress at ground level

Programming the war against water pollution
by DEXTER J. OLSEN
International Business Machines Corporation
Kingston, New York


INTRODUCTION
In every communications medium, today, including
newspapers and magazines, television and radio, and
even in personal conversations, we are constantly
alerted to the problems of pollution. More often than
not, we are asked to do something about these problems.
Most of us are learning of little things we can do
individually, in our personal lives, to help reduce
pollution.
As for the bigger things, we tend to sit back and say
"'they' should do something about them," and in
saying "they," we are attempting to shift the burden to
either our legislators or the proverbial "George." It is
the intent of this paper to explore what we in the computer programming profession can do to assist in the
alleviation of our water pollution problems.
Many of those who have entered the programming
profession during the last few years, have frequently
come directly from the college environment and
increasingly are the products of specialized computer-science curricula. Our older colleagues, on the other
hand, usually entered into programming as an outgrowth of, or even as an alternative to, other vocational
backgrounds. They represent various degrees of experience in mathematics, chemistry, finance, business,
physics, biology, engineering-the list is almost endless.
It is this group of people that I especially want to
address, because many of these disciplines that are
represented among us can be applied in some manner to
programming the war on water pollution.
It is not my intent to go very deeply into particular
phases of the problem at hand, but rather, to cover the
topic in a general way, talking about some of the things
that have been done to date, and pointing out some of
the things that have yet to be done, both in the way of
technical items, and in bringing the computer closer to
the engineer. It is hoped that in pointing out some of the
problems involved, that a few sparks may ignite in some
of your minds as to how your particular expertise, even
though seemingly not associated with water pollution,
might be put to use, or at least combined with the
expertise of others, to assist in this battle.

BACKGROUND
For years a few dedicated souls and a very few
professions have been waging an uphill battle to try to
interest the legislators and the general public in the
need for action on this front. One of these professions
has been civil engineering and, more particularly, the
little known field of sanitary engineering. It is these
engineers who are responsible, through their training,
for the research and design of the facilities and structures which make the water we use fit for human needs,
and to cleanse our waste waters to an acceptable
tolerance before discharging them into a nearby river,
stream or lake.
With all the importance being attached to this subject
today and, after taking a look at the immense job that
lies ahead, it is discouraging to find how little the
electronic computer is being used in this battle.
Civil engineers were among the early users of computers for the more mathematized disciplines, such as
surveying and the design of highways, bridges, and
buildings. Only recently, however, have they begun to
use computers in the areas requiring engineering
judgment and decision-making. Even at that, many
civil (and, especially, sanitary) engineers have been
slow in utilizing computers. To a large extent this may
be attributed to the fact that, unlike all too many
federal projects, where large amounts of taxpayers'
money always seem to be available, the sanitary
engineer, until recently, has had to work with the
rather limited funds from the local municipality, and,
therefore, has felt that he must stay with the tried and
proven methods of construction, treatment techniques
and methods of design, rather than take chances with
unproven innovations.


TABLE I-Volume of Construction Contracts (in millions of dollars)*

                          ACTUAL                           ESTIMATED
ITEM              1967    1968    1969    1970      1971    1972    1975    1980
Sewerage         1,179   1,386   1,415   2,050     2,500   3,250   3,950   6,125
Waterworks         970     887     980   1,090     1,200   1,350   2,125   2,750
TOTALS           2,149   2,273   2,395   3,140     3,700   4,600   6,075   8,875

Another feeling still prevalent today among many of
these engineers is that by turning their design problems
over to a computer they will lose control of their designs
and they will no longer bear their individualistic style.
In spite of our daily exposure to it, the computer is
something that many others still cannot understand,
and many are afraid of. Fortunately, these fears are
being overcome slowly as the more adventurous
engineering firms begin to use computers.
In this country there are approximately 70,000 civil
engineers and only about 12,000 sanitary engineers.
Many of these people work within very small consulting engineering firms that cannot afford the cost of
any but the smallest of computers, or must resort to
some form of time-shared computer use. Thus, the
economic circumstances have also contributed to this
slow acceptance of the computer. The advent of time-sharing systems is helping to alleviate this problem.

TABLE II-Backlog of New Construction Planning
(in millions of dollars)* as of December 31

ITEM               1967      1968      1969      1970
Sewage            7,635     8,029     9,487    12,399
Waterworks        4,152     3,983     4,077     4,019
TOTALS           11,787    12,012    13,564    16,418

* As compiled by Engineering News-Record, see bibliography.

SIZING THE JOB AHEAD
We are all aware by now that one of the principal
enemies of our country is pollution, and, according to
some, we have barely begun to scratch the surface as to
cleaning it up. But just how big is the job? The ultimate
costs are difficult to determine and at best are educated
guesses. It is a fact, however, that the volume of construction for public water and sewerage facilities,
combined, in 1967 amounted to approximately $2
billion and by 1970 had reached about $3 billion. By
1980, this figure is expected to triple. At the beginning
of 1971, the backlog of new construction planning for
water and sewerage facilities was over $16 billion (see
Tables I and II). This latter figure represents in the
neighborhood of $400 million worth of actual design
work, of which 50-60 percent could be done on computers if the proper programs were available.
This year the federal government, as well as many of
the states, has taken action to make even more money
available to local municipalities and sanitary districts.
The construction trade journal Engineering News-Record*
states that "for the next two years, sewerage
construction will set the pace for gains in water use and
control. The projected volume for 1972 is more than
double 1969's dollar volume, and three and one-half
times 1966's volume, yet it won't be nearly enough to
bring river and stream pollution under control. Annual
volume will have to redouble later in this decade to
accomplish that goal."

* "ENR's 96th Annual Survey," Engineering News-Record,
vol. 186, no. 3, page 23, January 21, 1971.

One segment of the economy that has not been
affected adversely during the recent recession has been
the design and planning of anti-pollution facilities. All
indications are that the volume of construction and the
necessary design and planning in this area will be maintained at even higher levels over the foreseeable future.
DISSECTING THE JOB
Before we look at what the unprogrammed needs of
the sanitary engineer are, we might do well to look over
his shoulder to see what he is doing now. The range of
problems he encounters might be categorized as follows:
1. Collection of data
2. Analyzation of the collected data
3. Population studies
4. Hydraulic studies
5. Analyzation of existing facilities
6. Biological and chemical studies
7. Modeling of existing and proposed facilities
8. Hydrologic and water basin monitoring
9. Modeling of the body of water that will receive the treatment plant effluent
10. Implementing of design concepts
11. Structural design implementation
12. Selection of equipment to meet the various design criteria
13. Determination of operating costs
14. Process control
15. Planning and monitoring construction progress
16. Development of specifications
17. Estimating of construction costs
18. Development of detailed construction plans
19. Researching new treatment techniques.

Collecting and analyzing data

The data available in many instances today consists
primarily of operating records of existing treatment
plants. This data is adequate to meet the requirements
of the state health departments, but is not suitable for
providing meaningful information for use in designing
new facilities or adding to existing plants. More frequent
sampling and metering of incoming sewage and of
effluent discharges, as well as more adequate sampling
and metering of the receiving waters, are only some of
the additional data required. Of course, with more data
to collect and more data to analyze, adequate metering
and telemetering systems must be made available, as
well as sample-analysis systems. Means for adequately
analyzing this increased volume of data must also be
developed.
Population studies

Before analytical or design studies for any particular
community can be started, population studies must be
made. With all the census information available today,
the problem of population studies is still quite a headache. In any given community, population figures must
be broken down into quite a variety of districts for
numerous functions, such as: school districts, voting
districts, electric power districts, water districts, sewer
districts, park districts, tax districts, and land zoning
districts. In many instances, the boundaries of each of
these types of districts do not coincide with any of the
others. A population model or analyzation system would
be much appreciated by the planners concerned with
each of these endeavors.

Armed with the proper tools and information to
establish the existing population within a given type of
district, the analyst may then merge information pertaining
to projected land use and other demographic
data to determine the population totals for which he
must design.

Modeling and simulation

A bare beginning has been made in the way of
modeling existing and proposed treatment facilities. A
look at some of the modeling attempts may suggest
extensions that need to be done. Perhaps some of the
modeling and simulation work that has been done in
some other fields, such as chemical processes, might be
adapted for use with water and waste treatment
facilities.

Modeling of waterways

Models of flow in open channels, such as rivers,
irrigation waterways, and flood control channels have
principally concentrated on the flow, or even the dispersion of a pollutant, being distributed evenly over the
cross section of the channel. Two-dimensional or even
three-dimensional models need to be developed to help
understand the mixing action of pollutants. What
happens when these same pollutants are discharged into
a river emptying into a tidal water or directly into salt
water? There is known to be a "salinity wedge" that
moves in and out of the mouth of a river with the tides
and with the density of the constituents of the river
water. What effects will this salinity wedge have on the
dispersion of river water and its pollutants?
Little, if any, modeling or analysis has been done on
the so-called "conservative" pollutants (chlorides, nonbiodegradables, etc.) and their reactions, both biologically and chemically, and their mixing characteristics
in a receiving body of water. Models of the biological
and chemical reactions coupled in some manner with
hydraulic models need to be developed. In addition,
models may tell us what effect thermal pollution has on
the biological, chemical and mixing characteristics of
the river, lake or ocean. To make these models work,
more and better data must be collected, which emphasizes once more the need for adequate data acquisition systems.
Another area in which models can be useful is as a
guide in determining what data to collect, as well as
where and how much to collect. In this way the value of
the data may be determined prior to the collection
effort and detailed modeling studies.
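As one concrete (and entirely hypothetical) example of the kind of two-dimensional mixing model called for above, the sketch below advances a depth-averaged advection-diffusion equation one explicit finite-difference step at a time; the grid, coefficients and time step are placeholders, and the scheme is only a demonstration, not a production method.

```python
def step_2d(C, u, v, Dx, Dy, dx, dy, dt):
    """One explicit step of dC/dt = Dx*Cxx + Dy*Cyy - u*Cx - v*Cy on a rectangular
    grid (C is a list of rows); boundary cells are held fixed."""
    nx, ny = len(C), len(C[0])
    new = [row[:] for row in C]
    for i in range(1, nx - 1):
        for j in range(1, ny - 1):
            cxx = (C[i + 1][j] - 2.0 * C[i][j] + C[i - 1][j]) / dx ** 2
            cyy = (C[i][j + 1] - 2.0 * C[i][j] + C[i][j - 1]) / dy ** 2
            cx = (C[i + 1][j] - C[i - 1][j]) / (2.0 * dx)
            cy = (C[i][j + 1] - C[i][j - 1]) / (2.0 * dy)
            new[i][j] = C[i][j] + dt * (Dx * cxx + Dy * cyy - u * cx - v * cy)
    return new

# A slug released in mid-channel drifts downstream and spreads (all values hypothetical)
grid = [[0.0] * 20 for _ in range(40)]
grid[5][10] = 100.0
for _ in range(50):
    grid = step_2d(grid, u=0.2, v=0.0, Dx=1.0, Dy=0.5, dx=10.0, dy=10.0, dt=5.0)
print(max(max(row) for row in grid))
```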


Modeling of treatment facilities

A start has been made in the modeling of waste
treatment facilities which will help a consultant in
determining the size and cost of certain treatment units.
Much more needs to be done to cover the spectrum of
possible treatment methods. Models to allow a consultant to determine the best and least costly type of
treatment facilities for a particular municipality need
to be developed.
A .comprehensive computer model can assist the
engineer in determining not only the best type of
treatment facilities for a given situation, but can also
help determine the environmental costs and benefits to
the affected geographical region. For instance, many
rivers serve as both a water supply source and a
dumping ground for waste disposal for a series of
communities. A model can assist engineers in studying
the effects the water usage has on the various communities the river services, e.g., the costs, effects, and
recreational benefits on farmland, lakes, streams, air,
parks and general land use.
Models of the operating characteristics of a particular
plant would be helpful not only during the design stages,
but in helping to keep the plant operating efficiently
during later years. Such model studies could be tied in
to process control systems to run the plants. One of the
first computer-controlled municipal sewage treatment
plants is scheduled to go into operation in Nassau
County, New York, during the fourth quarter of this
year. Another is scheduled later for San Jose, California.
However, as with most "firsts," these probably will
be rudimentary compared with those that will be
constructed in the future. But two out of the thousands
of possibilities across the country make it apparent that
many more, with many more variations, must be
devised.
Many communities have either no treatment facilities
or have what is called "primary treatment" (the gravitational removal of only the readily settleable solids).
As the various states establish more stringent antipollution requirements, more and more communities
will be forced to go to more complex secondary or even
tertiary treatment (processes providing different degrees
of oxidation of the effluent resulting from the prior
treatment), which becomes much more critical in their
operating procedures. Computer control of these plants
will soon become a must, and some say it is already a
must.
Modeling of pipelines

With the ever increasing demands for more water,
many of the larger cities are having to go greater dis-

tances than ever before to obtain adequate supplies.
This means that long water transmission lines must be
built to transport the water. Such a pipeline may
require a number of pumping stations placed at intervals
along its length. The designer must determine the
number, frequency, and location of these pumping
stations along with determining the size (diameter) of
the pipeline. A computer model could combine these
variables with the possible variation in the number of
pumps on the line, the variable hourly demand for
water, and the storage facilities at the city to determine
the best arrangement of pumping stations and sizes of
pipeline. Optimizing the design could save the taxpayers a lot of money. After completion of construction,
computer control of the pumping stations could be
employed to maintain operating efficiencies. I have been
informed that very little programming has been done
in this area.
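A minimal sketch of the kind of enumeration such a pipeline model might perform is given below: for each candidate diameter it sizes the number of pumping stations from the friction head and compares capital plus pumping-energy cost. Every cost coefficient, the friction factor and the head limit per station are invented placeholders; a real study would of course use the engineer's own data and a finer treatment of demand and storage.

```python
import math

def pipeline_design(length_m, flow_m3s, diameters, station_cost, pipe_cost_per_m3,
                    energy_cost_per_joule, horizon_s, head_per_station_m=60.0,
                    rho=1000.0, g=9.81, friction=0.02):
    """Enumerate candidate diameters and return (cost, diameter, stations) for the cheapest."""
    best = None
    for d in diameters:
        area = math.pi * d ** 2 / 4.0
        velocity = flow_m3s / area
        head_loss = friction * (length_m / d) * velocity ** 2 / (2.0 * g)   # Darcy-Weisbach, m
        stations = max(1, math.ceil(head_loss / head_per_station_m))
        capital = stations * station_cost + area * length_m * pipe_cost_per_m3
        energy = rho * g * head_loss * flow_m3s * horizon_s                  # pumping energy, J
        cost = capital + energy * energy_cost_per_joule
        if best is None or cost < best[0]:
            best = (cost, d, stations)
    return best

print(pipeline_design(length_m=80000.0, flow_m3s=1.5, diameters=[0.6, 0.8, 1.0, 1.2],
                      station_cost=2.0e6, pipe_cost_per_m3=4.0e3,
                      energy_cost_per_joule=2.0e-8, horizon_s=20 * 365 * 86400))
```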
The problem of "water hammer," or hydraulic
transients, is of immense concern on large pipelines. A
sudden surge of pressure is created by the sudden closing
of a valve in the pipeline. In large pipes this surge of
pressure against the valve and the pipe walls can reach
amazing proportions. Being of a fluid nature, this
pressure wave bounces from end to end of the pipe until
it dampens out. The analysis of the water hammer
problem is a highly complex and specialized field of
study. As a result, "rule of thumb" designs often prevail
and the pipeline is overdesigned. A computerized
simulation of this phenomenon that could be used by
the design engineer without a high degree of specialized
training would be of great benefit. This could lead to
better designs of pipelines, and less cost to the taxpayer.
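For orientation only, the classical Joukowsky relation gives the upper-bound surge pressure from an instantaneous valve closure; it is a screening estimate, not the full transient (wave-reflection) analysis the text refers to, and the wave speed used below is a typical placeholder value.

```python
def joukowsky_surge(rho, wave_speed, delta_v):
    """Pressure rise (Pa) from a sudden valve closure: delta_p = rho * a * delta_v."""
    return rho * wave_speed * delta_v

# Water at 1000 kg/m3, ~1000 m/s wave speed in a large steel pipe, 2 m/s flow stopped
print(joukowsky_surge(1000.0, 1000.0, 2.0) / 1.0e5, "bar")   # about 20 bar
```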
Hydrologic monitoring

The monitoring of hydrologic and stream data for
entire water basins is still in its infancy. The Tennessee
Valley Authority and the Chicago Sanitary District
have both established programs of telemetering this
information in their respective watersheds. These data,
when combined with meteorological data will enable us to
design better and more adequate flood control facilities
and may provide the basis for automatic control of the
facilities themselves.
Problem-oriented languages-Structural and
hydraulic design

Many programs have been written for structural
analysis and design of bridges and buildings using
structural steel shapes, but only a limited amount of
programming has been done for underground structures


built of reinforced concrete, which is necessary for water
and sewage treatment facilities. The designers need a
problem-oriented language suitable for the non-computer-oriented engineer to design beams, columns,
walls, cantilevered walls, troughs and a myriad of other
shapes that are possible with poured concrete. All these
must be coupled with application of various types of
loadings, such as: water pressure, earth pressure, weight
of equipment, etc. Ideally, design of reinforced concrete
structures by computer should be done in an interactive
mode on remote terminals. This would give the designer
the most flexibility in combining the many shapes to
fit the design criteria. Better yet, would be to allow the
designer to do his work on graphical display devices.
Civil engineers at the Carnegie-Mellon University
have recognized a need for a problem-oriented language
in the field of water resources technology. A pilot
language, HYDRO, has been made operational for a
segment of this field. A language, such as this, should be
developed more extensively and made available to the
engineering community.
Equipment selection and cost estimating

Today the designer of treatment plants makes a
selection of possible equipment to suit his given criteria
from a handful of possibilities that he is familiar with,
possibly neglecting other equipment more suited to his
conditions. A data bank of performance characteristics
and other information pertinent to the full range of
equipment used in treatment plants and pertaining to a
large number of manufacturers could assist these
designers. This information could also be used in
determining projected operating costs under a variety
of conditions.
Some progress has been made in assisting the engineer
in making construction cost estimates, but much more
could be done. One problem is the quantifying of the
numerous items that go into the construction of these
facilities. Another major problem is that of making
reasonable estimates of unit costs for each of these
items. With the costs varying from contractor to
contractor, with the fluctuations in the cost of labor,
with the general inflationary trend, and all of these
varying in the different sections of the country, a
consultant has a difficult time in keeping track of the
latest and most reasonable unit prices. This is especially
true when one considers that a consultant rarely has the
same type of job in the same section of the country with
a frequency such that the prices hold true from one
job to another. A national data base could be established
and be updated on a regular basis with latest prices as
bid by contractors on various types of jobs throughout


the country. This data base could be accessed by
consultants through the use of remote terminals.
Drafting systems

A number of papers have already been written on
establishing automated drafting systems and some of
the leading large companies are now working with the
first of such systems. Suffice it to say here that such
systems are also needed in the fields of civil and sanitary
engineering. However, due to the size of the companies
involved, these systems must be available at much
lower prices, and in a time-sharing mode, before these
consultants can make use of them.
Making application programs useful

Simply automating these problems for the engineer is
not enough. In the development of programs the
usability factor must always be kept in mind. A program is almost worthless if the user finds the input
requirements too cumbersome or the output strange and
illogically presented. Under such conditions the
engineer will probably, and has been known to, revert
to his familiar manual methods.
In any printed output which the engineer must
utilize in his design work, the results must be in terms
that he will be able to readily use and understand. This
may seem elemental, but non-computer-oriented friends
have told me that more than a few times, when people
outside the profession (such as programmers who are
more oriented to other disciplines, mathematicians, etc.)
have produced analytical programs, the output is
presented in an academic or even unfamiliar manner.
Often the output does not reflect the way things are
normally done within the profession, and it causes
problems in understanding, together with creating a
distrust in the program as well as the programmer.
Similarly, I know that when a mathematician was
outlining the requirements of a study of the Ohio River,
he insisted that data concerning certain flow characteristics of the river had to exist or the program could
not be written. In actuality, records which had been
kept did not contain such data because of the impracticality of obtaining them. The practicing engineer had to
either do without it, or analyze around it. The mathematician was so involved with theory that he could not
accept the impure practicalities of the situation.
In other instances, certain proprietary programs have
been obtained that output only part of the data required
for fully understanding the printed results.
What I am trying to say is this: learn to speak the
language and fully understand the problems and


idiosyncrasies of the job to be done. Simply automating
the procedure is not enough. This has been painfully
clear and exasperating to many housewives when they
receive their monthly billing from many department
stores. All they see on the bill is an amount of a purchase
and a department number. This information may be
fully adequate for the store, but the housewife (and
many of us husbands, for that matter) could care less
what the department number is. What is needed is the
date and a recognizable name of the purchased item,
as well as the price. While the programmer or systems
analyst planned the output for the accounting department, he did not plan sufficiently for all the expected
end-users of that output. Again, simply automating the
procedure is not enough. A presentation meaningful to
the end-user must always be kept in mind. Merely
understanding the problem from a data processing
standpoint is not sufficient; we must strive to understand more fully the language and the ramifications of
the problem at hand.
I have mentioned the difficulty that is generated
when inadequate or insufficient data is printed out or
displayed for the end-user. In many technical problems
there is much data that could be included on the output
forms and often there is not enough space to show it all
conveniently. The programmer may select the data he
thinks is most important for the output. However, an
engineer may desire different data or a different output
format. Other engineers may desire still other data or
formats. A means should be developed to enable a user
of a program to tailor the data and formats to his own
requirements without having to reprogram the output
routines. This is especially true where the user is not a
programmer, or does not own the source code of a
leased program.

SUMMARY

I have discussed a number of specific areas, along with
some more general areas in which programming work
needs to be done to help get some of our pollution
problems solved. All of them have one overriding
premise, however. And that is that the use of computers
be made more practical for the non-computer-oriented
professional engineer, and that programs be
available at a reasonable cost. This means that terminal-oriented
time-sharing systems must be made easier to
use by the layman, and that libraries of these specialized
programs be made available, probably on a royalty-per-use
basis. Also, a carefully planned method of making
such programs available, as well as the means of
obtaining them, should be familiar to all concerned.
With some $400 million worth of design effort ($16
billion worth of water and sewerage treatment projects)
already backlogged and much more to come, the need
is now here for concerned people in the programming
profession to lend assistance to the battle against water
pollution in this country. A great many of us have
come to programming from other previous vocations,
and many of these vocations can have a bearing on
some facet of the problems facing the sanitary engineer.
It behooves us all to give serious thought to how our
various backgrounds can be applied to these problems.
By joining forces, we may be able to hasten the day
when we can hear that the water pollution problems are
under control.

ACKNOWLEDGMENTS

The author would like to express his gratitude to the
following persons for their valuable ideas and suggestions,
many of which have been incorporated in this
paper: Mr. Frank Perkins, Professor of Civil Engineering
at the Massachusetts Institute of Technology; Mr.
Richard Foerster, and Mr. Paul E. Langdon, Jr., both
of Greeley and Hansen, Engineers, in Chicago, Illinois;
and Mr. Rodney Dabe, of Consoer, Townsend &
Associates, Consulting Engineers in Chicago, Illinois.

BIBLIOGRAPHY

1 ENR's 93rd annual report and forecast
Engineering News-Record Vol 180 no 4 pp 60 & 74 Jan 25 1968
2 ENR's 94th annual report and forecast
Engineering News-Record Vol 182 no 4 p 64 Jan 23 1969
3 Construction scoreboard
Engineering News-Record Vol 182 no 4 p 146 Jan 23 1969
4 95th annual report and forecast
Engineering News-Record Vol 184 no 4 p 56 Jan 22 1970
5 Construction scoreboard
Engineering News-Record Vol 184 no 4 p 130 Jan 22 1970
6 96th annual report and forecast
Engineering News-Record Vol 186 no 3 p 23 Jan 21 1971
7 Construction scoreboard
Engineering News-Record Vol 186 no 3 p 91 Jan 21 1971
8 Display helps fight pollution
Battelle Memorial Institute
Electro-Technology p 18 June 1969
9 New York uses computer to monitor pollution
Datamation pp 63-64 Sept 1970
10 E S MUSKIE US Senate
Computers, environmental planning, and the quality of life
Proceedings of IBM Scientific Computing Symposium on
Water and Air Resource Management May 1968
11 V L HOBERECHT
Computer aided design and drafting
IBM Technical Report 21.244 March 1967
12 D P LOUCKS C S REVELLE W R LYNN
Linear programming models for water pollution
Management Science p B166-B181 Dec 1967

Programming War Against Water Pollution

13 A GOTTLIEB
The computer and the job undone
Computers and Automation pp 16-23 Nov 1970
14 Integrated civil engineering system (ICES) for programming
Computers and Automation pp 10-11 April 1968

15 G BUGLIARELLO
Programming needs in the water resources field and the
role of a problem oriented language (Hydro)
IBM Scientific Computing Symposium on Environmental
Sciences pp 165-184 Sept 1967

121

16 P R DECICCO H F SOEHNGEN J TAKAGI
Use of computers in design of sanitary f!ewer systems
Journal of Water Pollution Control Federation
Vol 40 no 2 part 1 pp 269-284 Feb 1968
17 Proceedings of IBM Scientific Computing Symposium on
Water and Air Resource Management 392 pp May 1968
18 R SMITH
Preliminary design and simulation of conventional
wastewater renovation systems using the digital computer
U S Environmental Protection Agency Water Quality
Office 3-1968 WP-2Q-9

Application of a large scale nonlinear programming
problem to pollution control*
by GLEN W. GRAVES
University of California
Los Angeles, California

and
DAVID E. PINGRY and ANDREW WHINSTON
Purdue University
Lafayette, Indiana

* This research has been sponsored by the Office of Water Resources Research under Contract 14-31-0001-3080. The authors are responsible for all possible errors.

INTRODUCTION

In recent years it has been recognized by several observers that the techniques of mathematical programming can be used to select a least-cost solution to the problem of river quality maintenance. In general these models have used the solution technique of linear programming and considered one treatment alternative, such as on-site treatment or by-pass piping. Examples of these models can be seen in Deininger,1 Loucks, ReVelle and Lynn,2 and Graves, Hatfield and Whinston.3 The use of these techniques has allowed, in a theoretical context, large reductions in the total treatment costs in a river basin.

The procedure used in constructing river basin models has been to divide the river into small sections and place constraints on the water quality at the end of these sections. In all cases the water quality criterion used is the level of dissolved oxygen concentration. The level of dissolved oxygen at the end of the river sections is calculated using the Streeter-Phelps equations or some later variation. Cost functions are then estimated and a mathematical programming problem of the following form is solved:

Minimize: the total cost of pollution abatement structures
Subject to: water quality in each section of the river better than some given set of quality goals

It is the purpose of this paper to present a model of a river basin of the form suggested. We will consider simultaneously the following treatment methods:

(1) Flow augmentation
(2) By-pass piping
(3) Treatment plants (regional and at the polluter)

We also show that linear constraints, consistent with the linear programming technique, are not appropriate when these treatment alternatives are considered. Therefore, a nonlinear programming algorithm must be used. Some details of this algorithm are discussed, and the model is applied to the White River Basin in Indiana.

WATER QUALITY MEASURES

Water quality can be measured in a variety of ways. The appropriate parameter or set of parameters depends on the intended use of the water. Traditionally, water quality in rivers has been measured by the level of dissolved oxygen concentration. This parameter has been used because of its direct relationship with the type and quantity of living organisms in a body of water. If the level of dissolved oxygen should drop to zero, the river is said to be "septic." In this condition only anaerobic organisms can exist. These organisms rely on oxygen which is bound in compounds rather than free oxygen which, in the case of a septic river, is not available. In the process of freeing the oxygen from compounds the

anaerobic organisms produce by-products which often
cause the obnoxious odors and colors which appear in
polluted waters.
When effluent, such as common sewage, is dumped
into a river it creates additional demand for oxygen
over and above the demands of the existing living organisms. If this additional load is moderate then the
river can recover using the oxygen entering at the surface. If the additional load is low enough it is possible
that the reduction in the dissolved oxygen concentration level will be acceptable. However, if the additional
load is high the river may become septic before it can
recover, and if additional heavy loads are dumped into
the river as it proceeds downstream, it may never recover, and in fact remain septic for its entire length.
The oxygen required for the oxidation of organic matter is called the biological oxygen demand (BOD).
The dissolved oxygen concentration level is often
measured relative to the dissolved oxygen saturation
level of the water and is called the dissolved oxygen
deficit (DOD).
The level of the dissolved oxygen concentration in a
body of water is a function of the amount of oxygen
being absorbed at the surface from the air and the
amount being consumed in the water by the biochemical oxidation of organic material. Since both the consumption of the oxygen by organic material and the
absorption at the surface are not instantaneous reactions, a sophisticated method of predicting the effect
of an organic material on the level of DOD after a
given period of time is necessary. The first successful
attempt to mathematically describe this relationship
was by Streeter and Phelps.4 Their work has only been
slightly modified to this date. The model used was a set
of differential equations, given in (1) and (2).
Assume:

db_k/dt = -K_{1k} b_k      (1)

dd_k/dt = K_{1k} b_k - K_{2k} d_k      (2)

The terms used in (1) and (2) are defined as follows:

t = time of reaction (days)
K_{1k} = rate of the oxidation reaction (days^-1) (deoxygenation rate)
K_{2k} = rate of absorption of oxygen (days^-1) (reaeration rate)
b_k = BOD concentration (mg/l)
d_k = DOD concentration (mg/l)
Equations (1) and (2) are integrated to yield equations (3) and (4):

b_k = b_k^B C_{1k}      (3)

d_k = K_k b_k^B [C_{1k} - C_{2k}] + d_k^B C_{2k}      (4)

The terms K_k, C_{1k} and C_{2k} are defined in (5)-(7):

K_k = K_{1k} / (K_{2k} - K_{1k})      (5)

C_{1k} = exp(-K_{1k} t)      (6)

C_{2k} = exp(-K_{2k} t)      (7)

b_k^B and d_k^B are the values of BOD and DOD when t = 0.
This now enables one to predict the value of DOD at some point in time after the introduction of an extra load of BOD. If the body of water with which we are concerned is a river, and if over a small segment of the river the velocity of flow is assumed constant, then the length of time that the reaeration and deoxygenation reactions take place in that segment is a linear transformation of the length of the segment. This is expressed in Equation (8):

t = X_k / V_k      (8)

X_k is the length of the river segment and V_k is the velocity of flow assumed in that section.
Using the assumption of constant velocity, the values of b_k and d_k can be interpreted as the values of BOD and DOD at the end of river segment k. In Equations (3) and (4) they are written as a function of the initial values of BOD and DOD, b_k^B and d_k^B, given values of the parameters K_{1k}, K_{2k}, V_k and X_k for that section.
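As an aid to readers who want to experiment with the section-by-section calculation just described, the following short Python sketch evaluates Equations (3) through (7), using the travel time of Equation (8). The function and argument names are ours, not the authors', and the sample values are purely illustrative.

import math

def streeter_phelps(b_head, d_head, k1, k2, x_len, velocity):
    """Sketch of Equations (3)-(8): BOD and DOD at the end of a river
    section, given the concentrations at its head.  k1, k2 in 1/day;
    x_len in miles; velocity in miles/day; concentrations in mg/l.
    Assumes k1 != k2."""
    t = x_len / velocity                          # Equation (8): travel time through the section
    c1 = math.exp(-k1 * t)                        # Equation (6)
    c2 = math.exp(-k2 * t)                        # Equation (7)
    kk = k1 / (k2 - k1)                           # Equation (5)
    b_end = b_head * c1                           # Equation (3)
    d_end = kk * b_head * (c1 - c2) + d_head * c2 # Equation (4)
    return b_end, d_end

# Example: a 5-mile section traversed at 10 miles/day (invented numbers)
print(streeter_phelps(b_head=20.0, d_head=2.0, k1=0.3, k2=0.6, x_len=5.0, velocity=10.0))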
The use of the dissolved oxygen theory in the context
of river basin programming models has been influenced
by the desire of the researchers to maintain a set of
linear quality constraints. This linearity has been
maintained by two procedures. The first is to assume
the parameters K_{1k}, K_{2k} and V_k constant for a given segment of the river.2
A second procedure used to maintain linearity is the method first proposed by Thomann.5 This approach views the river as a black box where effluent is dumped and dissolved oxygen levels are changed in some manner unknown to the researcher. A matrix of so-called transfer coefficients is generated, in which each element a_{ij} is the marginal effect of a change in the BOD level in section j on the DOD in section i. This concept is used in the programming models constructed by Graves, Hatfield and Whinston3 and Schaumburg.6
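Computationally, the transfer-coefficient representation reduces to a matrix-vector product. The sketch below is our illustration of the idea, with made-up coefficients; it is not taken from any of the cited models.

def dod_change(a, bod_change):
    # a[i][j]: marginal effect of BOD added in section j on DOD in section i;
    # the predicted DOD changes are just the matrix-vector product a * bod_change.
    return [sum(a_ij * db_j for a_ij, db_j in zip(row, bod_change)) for row in a]

a = [[0.10, 0.00, 0.00],
     [0.06, 0.12, 0.00],
     [0.03, 0.07, 0.11]]          # illustrative values only
print(dod_change(a, [5.0, 0.0, 2.0]))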
Both of these linear representations are fairly accurate as long as the flow is not allowed to vary to a
significant degree. The reason for this restriction is
that it has been found that the level of the flow affects
the reaeration rate and the velocity of the flow in a
particular river segment. If this is indeed the case, and
the level of effluent flow is large relative to the river
flow, then the effect of the flow on the reaeration coefficient is not negligible, as is assumed in most programming models. If we assume that the values of K_{2k} and V_k are functions of flow, then from Equations (3) and (4) we see that the values of d_k and b_k can no longer be represented as a linear combination of b_k^B and d_k^B. This in turn implies that any quality constraint formed for use in a programming model will be nonlinear in nature.
We also note that if flow augmentation is to be used
as a treatment alternative, the level of flow will not
only affect the BOD and DOD concentrations by dilution, but also by altering the value of K_{2k}. This points to the necessity of having a programming algorithm which can properly handle nonlinear constraints.

Equations (9) and (10) give the relationships between flow and velocity, and between flow and the reaeration rate, assumed in our model.

(9)

(10)

The parameters g_k, h_k, y_k and z_k must be estimated for each river section for a particular application of the Streeter-Phelps equations.
It is important to note that the selection of dissolved
oxygen as the measure of water quality for the application of our model does not imply that this is the only
standard which could easily be used.
RIVER BASIN MODEL
We will now formulate a simulation model of a river
basin which can be used in combination with the quality model discussed in the previous section to predict
the level of DOD at a finite number of points along a
river.
In order to construct the model the river is divided
into n sections. A new section begins where one of the
following occurs:
1. Effluent flow enters the river.
2. Incremental flow enters the river. (Ground
water, tributary flow, etc.)
3. The flow in the main channel is augmented or
diverted.
4. The parameters describing the particular river
change.
Assume that there are s polluters and m treatment plants in the river basin. Each polluter is able to pipe effluent either directly into any segment of the river or to any of the m treatment plants. Each treatment plant can in turn pipe to any of the n sections.
The system of pipes described allows for the possibility of by-pass piping and regional treatment plants.
The importance of by-pass piping as a treatment alternative was demonstrated in the study of the Delaware River done by Graves, Hatfield and Whinston. 3

Fig. 1

The purpose of this method of treatment is to transport
waste from densely populated or industrialized regions
to low use areas to take advantage of the natural
treatment capabilities of the river. Regional plants
would be constructed to combine the wastes of two or
more polluters to take advantage of economies of scale
which exist in the production of treated wastes.
The water quality model, as discussed above, can
now be used in the form of Equations (3) and (4) to
calculate the value of DOD at the end of each of the
n river segments.
In order to use Equations (3) and (4) we must know
the values of BOD and DOD at the head of each section. The concentrations b_k^B and d_k^B are a weighted average of the concentrations of the flow from section k-1 and of all the effluent, incremental and augmentation flows entering section k. These relationships are expressed in Equations (11) and (12).

b_k^B = [b_{k-1} F_{k-1} + b_k^A F_k^A + b_k^E F_k^E + b_k^I F_k^I] / F_k      (11)

d_k^B = [d_{k-1} F_{k-1} + d_k^A F_k^A + d_k^E F_k^E + d_k^I F_k^I] / F_k      (12)

The terms F_{k-1}, F_k^A, F_k^E and F_k^I represent, respectively, the flow from section k-1, the augmentation flow in section k, the effluent flow in section k and the incremental flow in section k. The b terms represent the associated BOD concentrations and the d terms the associated DOD concentrations. See Figure 1 for an illustration of a typical section.
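A minimal Python sketch of the mixing relations (11) and (12) follows. It assumes, as the weighted average implies, that F_k is the sum of the four contributing flows; the names are ours.

def head_concentrations(flows, bods, dods):
    """Sketch of Equations (11) and (12): flow-weighted BOD and DOD at the
    head of section k.  'flows', 'bods' and 'dods' each hold the four
    contributions in the order (from section k-1, augmentation, effluent,
    incremental)."""
    f_total = sum(flows)
    b_head = sum(b * f for b, f in zip(bods, flows)) / f_total
    d_head = sum(d * f for d, f in zip(dods, flows)) / f_total
    return b_head, d_head

# Example with invented numbers: upstream, augmentation, effluent, incremental
print(head_concentrations([120.0, 20.0, 13.2, 4.0], [3.0, 1.0, 20.5, 12.4], [2.0, 0.5, 2.16, 2.06]))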


The effluent flow entering each river section will be
the sum of the flows coming from polluters directly
and from treatment plants. The BOD and DOD concentrations of these flows will be the weighted average
of the concentrations of all the flows. In turn, the BOD
and DOD concentrations of the flows from treatment
plants will be a weighted average of all the flows entering that plant times the treatment levels.
Given the values of the BOD and DOD concentrations at the polluters, the percentage of BOD removal
at each treatment plant and the values of the various
pipe flows, the values of b_k^B and d_k^B can be calculated for any k. If k > 1, the Streeter-Phelps equations must be applied sequentially to all sections i for which i ≤ k.

The vector of the primal slack variables is

z^T = [z_1, ..., z_{m-1}]

φ_p is the value of the primal objective function and φ_d is the value of the dual objective function. The labels (VBV)_p and (VBV)_d are, respectively, the values of the current basic primal variables and the basic dual variables. The labels (BV)_p and (BV)_d are the basic variables associated with the given values.
From the duality theorem of linear programming there are three possible termination conditions for the local linear programming problem:

(A) There exists an optimal feasible solution to the primal and dual problems.
(B) The constraints of the primal problem are infeasible and the dual problem is unbounded, or the constraints of the dual problem are inconsistent.
(C) The primal problem is unbounded and the dual problem is infeasible.

The initial solution in the domain of the functions g_i(y), i = 1, ..., m, is not required to be a feasible solution to the nonlinear problem stated in (14). As was discussed above, if y^j is not a feasible solution to (14), then SUPG > 0. If y^j is a feasible solution to (14), then SUPG = 0. The goal of each nonlinear iteration is either to reduce the value of the objective function g_m(y^j) or to move closer to feasibility, which is interpreted to mean reducing the value of SUPG.
The case of nonlinear infeasibility will usually imply that the local linear problem (16) will be infeasible for any Δy^j in the ε region around y^j. In this case ∇g(y^j)^T Δy^j is chosen as the objective function and a linear programming problem such as (18) is constructed:

Minimize: ∇g(y^j)^T Δy^j
Subject to: ∇g_p(y^j)^T Δy^j ≤ -g_p(y^j) - k r_p^j,   p ∈ H      (18)

where

H = {i | ∇g_i(y^j)^T Δy^j ≤ -g_i(y^j) - k r_i^j for some Δy^j}

Since this problem is consistent and bounded, it can only terminate in condition (A). However, it is still possible that the entire linear problem (16) is infeasible. If a gain has been made in SUPG, however, the algorithm proceeds through the nonlinear iteration with the determination of k. Of course, if the gain is large enough, feasibility may be reached and the local linear problem would terminate in condition (A).

If the local linear programming problem (16) terminates with condition (B) holding and no gain has been made in SUPG, then it is assumed that the nonlinear problem is inconsistent and the algorithm terminates, unless k can be adjusted as will be discussed later.

The other possible termination condition, (A), implies that a feasible solution to the entire linear problem (16) has been obtained and that, ignoring errors, y^{j+1} is a feasible solution to the nonlinear problem. The algorithm at this point will check for a gain in g_m(y). If at the new y^{j+1} = y^j + Δy^j there is no gain in g_m(y), then we assume that the local minimum has been reached. If there is a gain in g_m(y), then another nonlinear step is taken.

At this point it is necessary to explain in some detail the role that the parameter k plays in the final determination of Δy^j. In order to see the role k plays more clearly, it is necessary to write mathematical expressions for the statements "gain in g_m(y)" and "gain in SUPG." If the local linear problem terminates in condition (A), then

∇g_m(y^j)^T Δy^j = Σ_{i=1}^{m-1} (-x_i) g_i(y^j) + k Σ_{i=1}^{m-1} (-x_i) r_i^j      (20)

In order for a gain to be made in the nonlinear objective function, the following inequality must hold:

∇g_m(y^j)^T Δy^j < -ε      (21)

Using Equations (20) and (21), the condition for a gain in g_m(y^j) is

Σ_{i=1}^{m-1} (-x_i) g_i(y^j) + k Σ_{i=1}^{m-1} (-x_i) r_i^j < -ε      (22)

At this point the dual variables are less than or equal to zero. The r_i^j are assumed greater than or equal to zero, and since the nonlinear problem is feasible, g_i(y^j) > 0. This information implies that

Σ_{i=1}^{m-1} (-x_i) g_i(y^j) < 0      (23)

and

k Σ_{i=1}^{m-1} (-x_i) r_i^j > 0      (24)

From (22) and (24) it is clear that as k approaches zero, the gain in g_m(y^j) would be greater. Therefore, if the linear problem terminates in condition (A) and there is no gain in g_m(y), then k can be adjusted downwards, which effectively relaxes the linear constraints. As k goes to zero, condition (22) becomes

Σ_{i=1}^{m-1} (-x_i) g_i(y^j) < -ε      (25)

The same sort of condition can be derived in the case when the linear programming problem terminates in condition (B). Condition (26) is for a gain in SUPG:

Σ_{p∈H} (-x_p) g_p(y^j) < -ε      (26)

Using these results, the criteria that the algorithm uses for nonlinear optimality and nonlinear infeasibility can be written as follows:

Optimality: If k is adjusted as low as possible, the linear program terminates in condition (A), and

Σ_{i=1}^{m-1} (-x_i) g_i(y^j) ≥ -ε

then y^j is assumed to be the optimal solution to the nonlinear problem.

Infeasibility: If k is adjusted as low as possible, the linear program terminates in condition (B), and

Σ_{p∈H} (-x_p) g_p(y^j) ≥ -ε

then the constraints of the nonlinear problem are assumed to be inconsistent.
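Taking the inequalities (22) and (25) at face value, the gain and optimality tests can be coded along the following lines. This is our sketch of the bookkeeping only; the dual variables x_i would come from the local linear program, which is not shown here.

def gain_in_objective(x_dual, g_vals, r_vals, k, eps):
    """Condition (22): with dual variables x_i <= 0 from the local linear
    program, a gain in g_m is available when
    sum_i (-x_i) g_i(y) + k * sum_i (-x_i) r_i < -eps."""
    lhs = sum(-x * g for x, g in zip(x_dual, g_vals)) + k * sum(-x * r for x, r in zip(x_dual, r_vals))
    return lhs < -eps

def declare_optimal(x_dual, g_vals, eps):
    """The optimality test quoted above, with k already driven to zero
    (the negation of condition (25))."""
    return sum(-x * g for x, g in zip(x_dual, g_vals)) >= -eps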
Before the steps of the actual program are discussed, one additional feature of this algorithm must be mentioned. It is possible to divide the n variables in the nonlinear problem into IPRN priority classes. For example, assume that IPRN = 2. This implies that every variable is either in priority class one or priority class two. All of the variables in priority class one would be used to try to obtain a gain in SUPG or g_m(y). The second priority class variables would not be considered unless no gain could be made using the priority one variables with k adjusted to zero. The number of priority classes is unlimited.

Using the criteria described above for optimality and infeasibility, the steps actually taken in the computer program are described in sequence:
1. The variables Δy^j which are currently not in
the basis of the linear programming problem, are
scanned for possible entry. All of the variables
will be out of the basis at the outset of each
nonlinear iteration. The scanning is accomplished by updating the element associated with
the current linear objective function. If priority
classes are used, then only those variables which
have a priority level less than or equal to the
current level, IPRC, are checked.
2. The variable associated with the updated element of highest absolute value is selected to
enter the linear tableau. This criterion is used because this variable locally affects the objective
function more than the other variables.
3. The element with the largest absolute value is
tested to see if it is significantly different from
zero. If it is, the algorithm proceeds to step 4
and the solution of the linear programming
problem. If not, it is assumed that the addition
of the variable associated with the largest element would not affect the objective function
significantly, since the appropriate coefficient is
so small. In this case, k is adjusted downwards,
or if k=O, the number of priority classes considered is expanded. If the current priority class
is the last one available, then the algorithm will
terminate. The termination will mean one of two
things: the nonlinear problem is infeasible, since no gain can be made in SUPG = g_w(y) > 0, or the
local minimum to the nonlinear problem has
been attained and no gain can be made in gm(y).
4. After selecting the variable column to be added
to the linear programming problem, the linear
programming tableau is augmented by the new
column and if a gain was made in the linear
objective function on the previous linear iteration, the columns associated with the variables
rejected from the basis are removed from the
linear tableau.
5. The linear programming algorithm now takes
over and solves the local problem set up in the
previous steps. The linear programming problem
will terminate in one of the three terminal conditions discussed above. If terminal condition
(A) is reached, linear feasibility, then the objective function gm(y) is tested to see if a gain
greater than some tolerance level was made. If
terminal condition (B) is attained, then the
algorithm tests for a sufficient gain in SUPG. If either of these gains is successfully made, the
algorithm leaves the linear programming subproblem. If the gains are not made, then the
algorithm returns to selecting variables to enter
the linear tableau. If terminal condition (C) is
reached, the algorithm again returns to selecting
variables in order to bound the primal problem.
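The five steps above amount to an outer loop around the local linear program. The skeleton below is our schematic rendering of that loop; every callable passed in stands for machinery (pricing of the non-basic variables, the simplex solver, the gain tests, the adjustment of k and of the priority level) that the paper describes only in prose.

def nonlinear_iteration(candidates, solve_local_lp, gain_in_gm, gain_in_supg,
                        lower_k, expand_priority, tol):
    """Very schematic sketch of steps 1-5; all callables are stand-ins."""
    while True:
        # Steps 1-2: scan the non-basic variables, pick the largest updated element
        var, coeff = max(candidates(), key=lambda vc: abs(vc[1]))
        if abs(coeff) <= tol:
            # Step 3: coefficient negligible -> relax k or widen the priority classes
            if not (lower_k() or expand_priority()):
                return "terminate"          # infeasible, or local minimum reached
            continue
        # Steps 4-5: augment the tableau and solve the local linear program
        status = solve_local_lp(var)
        if status == "A" and gain_in_gm():
            return "feasible gain"
        if status == "B" and gain_in_supg():
            return "supg gain"
        # condition (C), or an insufficient gain: keep selecting variables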

The first step after the successful conclusion of the local linear problem is to select the "best" value of k. This calculation is called the post-optimal adjustment and proceeds in two different manners, depending on the value of SUPG. If the value of SUPG is zero, or we have nonlinear feasibility, then the value of the objective function is written as a function of k, and this function is solved for the value of k which will minimize g_m(y). In the case where SUPG > 0, or we have nonlinear infeasibility, k is approximated by choosing a trial value such that G is zero, where G is defined to be

G = (Δy^{j-1})^T (Δy^{j-1}) - (Δy^j)^T (Δy^j)      (27)

In either case, SUPG = 0 or SUPG > 0, the value of k must be tested to see that it does not violate bounds which are implied by the bounds on y. If the value of k determined in the post-optimal adjustment violates the greatest lower bound on k, or results, in the case of SUPG = 0, in no gain in g_m(y), the value of k is adjusted downwards. The control of the problem is then passed back to the linear programming part of the algorithm.

It is at this point that the new estimates of the error terms r_i^j are calculated. This is done by evaluating the functions g_i(y^j + Δy^j), i = 1, ..., m-1, and using Equation (17). The new values of r_i^j are used for the rest of the (j+1)-th nonlinear iteration, and the absolute values are used for the local linear problems in nonlinear iteration j+2.

The next step in the algorithm is to calculate the range of values of k which will maintain a feasible solution or give the best gain in SUPG. After a range of values is determined, the optimal value of k is chosen from that range. The range is calculated by using the following equation:

g_i(y) = g_i(y^j) + ∇g_i(y^j)^T Δy^j k + k² r_i^j ≤ D(SUPG)      (28)

If SUPG = 0, then Equation (28) just says that the new values of g_i(y), i = 1, ..., m-1, must be feasible. If SUPG > 0, then D is initially set equal to zero. The quadratic functions in k are then solved:

g_i(y^j) - D·SUPG + ∇g_i(y^j)^T Δy^j k + k² r_i^j = 0,   i = 1, ..., m-1      (29)

This will give the value or values of k where the constraints g_i(y) will go infeasible. If the lower bound on k is greater than the upper bound, then the value of D is increased and the quadratic problem is again solved. If, as D approaches one, the upper bound continues to be lower than the lower bound, then we say that the interval determination failed and no k can be found which will improve SUPG. In this case the algorithm terminates.

If SUPG = 0, the algorithm determines the interval which will maintain the feasibility of y. After the interval is determined, the best value of k is found by evaluating the function g_m(y^j + k Δy^j) for different k's in the range given. The gain in the objective function is checked to see if it exceeds some preset tolerance level. If not, then it is assumed that we are within the length of that tolerance level of a local optimal solution, and the algorithm terminates.

If a gain in SUPG is made, or a reduction in g_m(y), the algorithm takes another nonlinear iteration. This procedure is repeated until either a local minimum or infeasibility is encountered.
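For one constraint, the quadratic (29) can be solved directly for the values of k at which that constraint becomes binding; intersecting the resulting intervals over i gives the range referred to above. The following sketch, with names of our choosing, shows the per-constraint step only.

import math

def k_roots(g_i, grad_step, r_i, d_frac, supg):
    """Real roots of g_i - D*SUPG + (grad g_i . dy) k + r_i k^2 = 0,
    i.e., the k values at which this constraint leaves the region
    allowed by (28)."""
    a, b, c = r_i, grad_step, g_i - d_frac * supg
    if abs(a) < 1e-12:
        return [] if abs(b) < 1e-12 else [-c / b]
    disc = b * b - 4.0 * a * c
    if disc < 0.0:
        return []                       # constraint never becomes binding in k
    s = math.sqrt(disc)
    return sorted([(-b - s) / (2.0 * a), (-b + s) / (2.0 * a)])

# Example with made-up numbers
print(k_roots(g_i=-0.4, grad_step=1.2, r_i=0.05, d_frac=0.0, supg=0.0))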

APPLICATION OF RIVER BASIN MODEL
The model as proposed has been applied to the West
Fork White River in Indiana. The West Fork White
River has its source near the Indiana-Ohio border. The
general direction of flow is southwesterly for 371 miles
through the State of Indiana. The major city on the
river is Indianapolis, which is 234 miles from the
mouth. Two minor cities, Anderson and Muncie, are
upstream from Indianapolis. The concentration of population and industry around these three cities causes the
major portion of the pollution problem in the West
Fork.
For the purpose of this paper we have chosen a
length of the West Fork White River, which runs from
the headwaters above Muncie to just south of Indianapolis. The portion described is 133.2 miles long. It
has been divided into 46 sections based on information
about polluters, incremental flow, and river parameters.
The sections range in length from .1 mile to 12.2
miles, with most sections in the 2.0 to 5.0 range.
In the implementation of the model, some alterations were made which were not explicit in the exposition of the model in the previous sections of the paper.
The first of these alterations is that the number of
river sections, number of polluters and the number of
treatment plants are assumed to be equal, and a potential polluter and treatment plant are located at the
beginning of each river section. In terms of the model

presented above, this assumption can be written as
n=m=s. In any given problem the flow from some of
the polluters will be zero (i.e., no polluter exists currently at that point on the river). The treatment plants
in those sections are potential regional treatment
plants. In the section where polluters do exist, the
treatment plants are the on-site treatment plants for
the associated polluter, and can also act as potential
regional treatment plant sites. Regional treatment
plants can be located anywhere along the river by
creating a new river section. It is not at all necessary
to make this alteration, but it does allow for the easy
addition of polluters at a later date, and permits easy
calculation of the incremental cost of these additional
polluters.
The second of these alterations makes it possible to
reduce the number of variables in the system. The desirability of this can be seen by calculating the number
of variables in the entire model, as described for 46
sections. If we assume that each polluter can pipe its
effluent to every potential treatment plant and to every
river section, and each treatment plant can pipe to
every river section, the number of piping variables
alone would be 3(46)² = 6,348 variables. Since it seems
reasonable from knowledge of the nature of the problem
that certain of these pipes would not enter the solution,
it is desirable that the nonlinear algorithm does not
have to consider the flow variables associated with
these pipes.
The last alteration is that our particular implementation of the model allows for tributaries. They are limited
to a length of one section. This allows for polluters
dumping into small tributaries some distance from the
main stream. The model could easily be expanded to
encompass tributaries of any number of sections in
length, and even tributaries with tributaries. However,
since it was not necessary for our application we limited
the length of the tributaries to one section.
For the purpose of illustration of the river basin
model described, we have applied the model to the
West Fork White River Basin with the following
restrictions:
1. All effluent must be treated at a level of at least
85 percent removal. This corresponds to required secondary treatment, which is required as
a matter of policy in the West Fork White River
Basin. The implication of this assumption is
that no effluent can be dumped directly into the
river without treatment.
2. All treatment plants must dump their effluent
into the nearest river section. Since each polluter
is allowed to pipe to any treatment plant, it

seems unnecessary to allow treatment plants to pipe to other sections.
3. An individual polluter can pipe his effluent no more than 25 river sections up or down the river. This allows for the reduction of piping variables as explained above, and since it is unlikely that piping costs will be less than gains from economies of scale over a long distance, the best possible solution will not be eliminated from consideration.
4. One potential reservoir exists at the headwaters of the river. This is not the only potential site, but was chosen for this application.
5. The quality requirement in all sections is 5 mg/l of dissolved oxygen. This is a State of Indiana policy and is, for that reason, appropriate for our problem.

TABLE I-Effluent Flow

River        Effluent        BOD      DOD*
Section      Flow c.f.s.     mg/l     mg/l
4            .2700           40.0     4.66
6            20.0829         322.0    5.66
11           .300            298.0    8.66
16           24.400          200.0    6.66
23           .7800           270.0    6.66
31           13.2000         20.5     2.16
32           .9300           19.0     .36
33           13.0000         20.0     6.36
36           185.0000        450.0    6.06
37           10.0000         20.0     6.66
41           185.0000        450.0    4.76

* The DOD is calculated using a saturation level of dissolved oxygen of 8.66 mg/l, determined for a temperature of 21°C.
The sub-model described by this set of assumptions
has 1880 variables and 138 constraints. The large number of variables indicates that the appropriate use of
the priority classes described in the last section of the
paper is necessary. The smaller the number of variables
the algorithm must examine for possible change, the
faster a nonlinear iteration can take place. Therefore,
the priority classes can be used to great advantage by
selecting in advance the variables which appear to be
the most important. This is, of course, very tricky, but
the priority classes can be altered as information is obtained from the iterations of the nonlinear algorithm.
In our application of the model, we selected around 250
first priority variables from our knowledge of the nature
of the river problem.
The necessary data to apply the programming model
to the West Fork White River is in Tables I through

IV. This is information about effluent flows, incremental flows, tributary flows and other required river parameters.

The cost functions used for the treatment plants and pipelines were obtained from the literature. The total cost function for treatment plants was obtained from Frankel10 and has the following form for the kth treatment plant:

C_k^TP = 49.22 (Σ_i P_ki)^{3/4} [8.0(r_k - .5)' + 1]      (30)

The value of P_ki is the flow from polluter i to treatment plant k, and the value of r_k is the level of BOD removal at treatment plant k.

The total cost function for piping was obtained from Linaweaver and Clark11 and, for a particular section of pipe, has the form:

C_ij^P = 1.865 d_ij (q_ij)^.598      (31)

The term d_ij is the length of the pipe segment and q_ij is the flow through that segment. If we were considering a section of pipe from polluter j to treatment plant i, the q_ij term would be replaced by P_ij.

TABLE II-Incremental Flow

River        Incremental     BOD      DOD
Section      Flow c.f.s.     mg/l     mg/l
1            1.0             4.0      2.36
4            -21.2
5            1.0             214.0    7.66
9            4.0             23.2     3.06
10           6.4             13.2     6.26
15           72.0            7.7      3.26
18           -35.0
19           23.0            12.1     3.06
25           13.0            5.0      2.16
26           14.0            5.0      2.76
28           -214.0
29           28.0            5.0      .76
31           5.8             12.4     2.06
35           5.0             9.3      1.26
40           8.0             13.9     2.66
41           8.0             5.0      4.76
42           9.0             5.0      2.66
43           10.0            23.2     7.66
44           21.0            16.7     7.06

TABLE III-Tributary Flow

River        Effluent        BOD      DOD*
Section      Flow c.f.s.     mg/l     mg/l
1            51.0            2.92     .7993
2            6.1             2.63     1.9073
7            30.3            9.80     4.9465
14           3.0             5.50     3.4587
17           44.0            7.18     2.7736
24           16.10           6.91     2.2540
30           21.00           10.62    2.4544
38           61.00           30.44    4.5948
45           33.7            8.00     .3462

TABLE IV-River Parameters
Section

gk

hk

1
2
3
4
5
6
7
8
9
10
11
12
13

6.579
6.579
6.579
6.579
6.579
6.579
6.579
6.579
6.579
6.579
6.579
.040596
2.8152
6.579
2.8152
2.9152
6.579
2.8152
2.8152
.037944
3.3354
3.3558
3.3558
6.579
3.3558
3.3558
.14382
.14382
.003468
6.579
.0034
.0034
.0034
.0034
.0034
.0034
.0034
6.579
.3374
3.621
3.621
3.621
3.621
3.621
3.621
3.621

-.249
- .249
-.249
-.249
-.249
-.249
-.249
-.249
-.249
-.249
-.249
.538
-.117
- 249
-.117
-.117
- .249
-.117
-.117
.403
-.14
-.14
-.14
-.249
- .14
-.14
.183
.183
.645
- .249
.645
.645
.645
.619
.619
.619
.619
-.249
.619
- .197
- .197
- .197
- .197
- .197
- .197
- .197

14

15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46

Yk

.0445
.0445
.0445
.0445
.0445
.0445
.0445
.0445
.0445
.0445
.0445
.0125
.065
.0445
.065
.065
.0445
.065
.065
.0045
.064
.014
.014
.0445
.014
.014
.0056
.0056
.0023
.0445
.0023
.0023
.0023
.0007
.0007
.0007
.0007
.0445
.(007
.0050
.0050
.0050
.0050
.0050
.0050
.0050

Zk

Klk

.55
.55
.55
.55
.55
.55
.55
.55
.55
.55
.55
.728
.471
.55
.471
.471
.55
.471
.471
.715
.448
.685
.685
.55
.685
.685
.715
.715
.78
.55
.78
.78
.78
.935
.935
.935
.935
.55
.935
.765
.765
.765
.765
.765
.765
.765

.115
.1
.103
.1
.1
.6
.115
.. 63
.63
.63
.623
.308
.308
.102
.304
.805
.104
.600
.100
.100
.100
.~75

.300
.103
.300
.100
.100
.095
.091
.096
.093
.093
.093
.092
.092
.201
.201
.094
.201
.213
.225
.308
.308
.308
.1
.310

Xk

5.6
.1
6.2
1.9
3.6
3.7
.1
3.2
2.8
3.0
4.3
1.9
1.0
.1
2.4
12.4
.1
5.5
.5
3.3
5.1
1.2
.8
.1
8.7
5.0
4.8
5.4
5.0
.1
1.8
.5
.3
1.8
.7
.7
.2
.1
.8
5.7
3.9
7.3
8.4
2.6
.1
1.2

Both equations, (30) and (31), are in terms of $1000 per year and include both construction cost and operation and maintenance cost.
The reservoir costs were obtained for the particular
site along with the expected flow augmentation yield.
Any flow less than the expected yield is assumed to
cost a percentage of the total cost equal to the percentage of total flow. The annual total cost for the construction and maintenance of a dam at the chosen site
is $807,000. The amount of expected flow in cubic feet
per second is 100.
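Under the proportional-cost assumption just stated, a draft of, say, 40 cfs from the reservoir (a figure we choose only for illustration) would be charged (40/100) × $807,000 = $322,800 per year, while the full expected yield of 100 cfs carries the full $807,000.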
The sum of all of these cost functions for each potential structure in the river basin is the objective
function of the programming model with one adjustment. The cost of every polluter treating on-site at a
85 percent removal level is subtracted from the aforementioned sum. This implies that the cost given by the
objective function is the cost over and above the cost
of required treatment at the· 85 percent level at the
polluters. We note that the required treatment level of
85 percent does not maintain the water quality at the
required level of 5 mg/ t.
The solution obtained to the West Fork White River
problem described has the following features:
1. The effluent of the polluters in sections 4 and 6 is combined and dumped into section 6.
2. The effluent of the polluters in sections 31, 32 and 33 is combined and dumped in section 31.
3. Part of the effluent of section 36, 30 cfs, is combined with the effluent of section 37 and dumped in section 39. There is a regional plant constructed at section 39 to handle the combined effluent.
4. The reservoir at the headwaters site is constructed and provides 100 cfs of augmentation.
The rest of the polluters dump their effluent into the
nearest sections after treating at the level given in
Table V. The location of the polluters in the West
Fork White River basin can be seen in Figure 2.
TABLE V-Treatment Levels of Treatment Plants in the Typical Solution

Section      Treatment Level
6*           .85
11           .85
16           .85
23           .85
31*          .85
36           .986
39*          .85
41           .910

* The plants in sections 6, 31 and 39 are regional plants.

Fig. 2

The total cost of the solution obtained is $3,571,799.
This cost is over and above the cost of required 85 percent removal at all polluters. The cost of required uniform treatment of 98 percent removal over and above
the cost of 85 percent removal is $4,063,074. The required uniform treatment of 98 percent does not give a
feasible solution. By combining certain effluents and
using flow augmentation, the cost is reduced by about half a million dollars and the river meets the water quality
standards.
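For reference, the saving implied by the two totals quoted above is $4,063,074 - $3,571,799 = $491,275 per year, which is the "half a million dollars" of the text.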
The solution given above to the West Fork White
River pollution problem appears to be reasonable.
Effluent from polluters which are close together is combined to take advantage of the economies of scale in the
production of clean water. In the case of the polluter
in section 36, it is necessary to transport some of the
effluent downstream in order to gain feasibility. This
points out a heretofore undiscussed role that piping
effluent can play in solving river basin pollution problems. In the paper by Graves, Hatfield and Whinston3
on by-pass piping we see that piping replaces the treatment plant to take advantage of the natural treatment
ability of the river. In our example, it is mandatory to
pipe in order to meet the required quality goals, unless
one can treat at a level of removal of almost 100 percent. This is, practically speaking, almost impossible.
ACKNOWLEDGMENTS
The development of the river simulation model in this
paper owes much to the DOCAL computer program of

the Environmental Protection Agency. This program
was applied on the West Fork White River by the
Evansville Office of E.P.A. We are especially grateful to
Max Noecker and Stanley Smith of the Evansville
office who kindly made their data and experience available for our use. Discussions with our colleague J.
Hamelink of the Forestry department were helpful.
The authors are responsible for all possible errors.
REFERENCES
1 R A DEININGER
Water quality management economically optimal
pollution control system
Unpublished PhD Dissertation Northwestern University
Evanston Illinois 1961
2 D P LOUCKS C S REVELLE W R LYNN
Linear programming models for water pollution control
Management Science Vol 14 No 4 Dec 1967
3 G W GRAVES G B HATFIELD
A WHINSTON
Water pollution control using by pass piping
Water Resources Research Vol 5 No 1 Feb 1969
4 H W STREETER E B PHELPS
A study of the pollution and natural purification of the
Ohio River
US Public Health Bulletin No 146 Feb 1925

5 R V THOMANN
Mathematical model for dissolved oxygen
Proc Amer Soc Civil Engr 89 No SA5 Oct 1963
6 G W SCHAUMBURG
Water pollution control in the Delaware estuary
Harvard University Water Program Harvard University
Cambridge Massachusetts May 1967
7 G W GRAVES G B HATFIELD
A WHINSTON
Water pollution control with regional treatment
Technical Report Federal Water Pollution Control
Administration Forthcoming
8 G GRAVES
Development and testing of a nonlinear programming
algorithm
Aerospace Corporation June 1964
9 G W GRAVES A B WHINSTON
The application of a nonlinear algorithm to a second
order representation of the problem
Centre d' Etudes de Recherche Operationnelle Volume 11
No 2 1969
10 R J FRANKEL
Economic evaluation of water quality-an engineering
economic model for water quality management
SERL Report No 65-3 University of California Berkeley
Calif Jan 1965
11 F P LINAWEAVER C S CLARK
Cost of water transmission
Journal of the American Water Works Association pp 1549-1560
December 1964

Parametric font and image definition and generation
by AMALIE J. FRANK
Bell Telephone Laboratories

Murray Hill, New Jersey

INTRODUCTION

The demand for high graphic arts quality publications, particularly in the areas of education and technology, increases steadily. The associated use of computer-controlled photocomposition systems can likewise be expected to accelerate. Considering the high cost of currently available systems, we were motivated to investigate alternative approaches, both from a hardware and a software point of view. We conducted experiments with a high resolution electron beam recorder controlled by a small computer containing 8K words of 16 bits each, and running at about 500K cycles per second. The recorder can address a raster grid 16,384 square, and can draw both horizontal and vertical vectors, but the vectors must be no greater than four rasters apart to obtain proper shading. This paper concerns itself primarily with the software implementation, as constrained by the given hardware configuration.

Focusing on the software implementation, the means of defining and generating repeated images and fonts is a prime factor in determining the economics of a computer photocomposition system. For high quality publications, a variety of symbols and fonts with multiple sizes in each font is a necessary feature. There is, of course, a philosophic question concerning the design of new fonts oriented specifically for computer generation as opposed to the use of traditional styles, originally designed for manual stylus, woodblock, or hot metal techniques. Reserving aesthetic judgment for the time being, we concentrated on developing an efficient process for defining and generating any font or set of images, following the given lines as exactly as possible. In designing a system for this purpose, three factors must be considered: the manual operations required to arrive at the image definitions, the computer storage required to hold the definitions, and the processing time required to draw the images. For the given hardware, the computer storage condition assumes first priority. Previous software systems are found to require excessive storage and fairly extensive manual labor to define the images. Described herein is a new method for defining and generating images in parametric form. This method decreases storage requirements and simplifies the manual operations considerably.
PREVIOUS METHODS
In most existing systems, the font definition consists
of either a list of coordinates of the endpoints of the
strokes comprising the images, including any internal
shading strokes, or a list of the coordinates of points
defining the contours of the characters. Such lists occupy a considerable amount of storage. This is further
aggravated as technological advances drive the minimum line width down, thus requiring many more
strokes to shade the same areas. This, of course, can be
offset to some extent by including hardware to defocus
the beam for larger size characters, and thus increase
the beam width, but at the expense of a sharp, clear
image. Another important disadvantage of existing systems is that different sizes of the same font require
separate and distinct lists, themselves varying in size.
For each size in a desired font, the manual operations to
define the font must be repeated, and the resulting
definitions stored individually.
For an example, consider a simple sans-serif font in
8 point size, and assume an average of 75 strokes per
character. If the strokes all are vertical and are specified
in series, then the simplest storage scheme requires for
each stroke 8 bits for the starting position of the stroke
relative to a local origin, and 8 bits for the length of
the stroke. On this basis, a font of 128 characters requires 9,600 words of storage. This increases proportionately for larger-sized characters, and also for more
complex fonts.
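To spell out the arithmetic behind the 9,600-word figure: 128 characters × 75 strokes = 9,600 strokes, and at 8 bits of starting position plus 8 bits of length, each stroke occupies exactly one 16-bit word, so the font occupies 9,600 words.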
Some improvement to the simple encoding scheme
indicated above may be made by incorporating variable
length fields, or by encoding differences between successive values. Other techniques arising principally from research in pattern recognition may also be considered. In one of these schemes, the boundaries of the image are "chain" encoded. Here, the position of a point is given relative to the previous point in the list, usually as a value from 0 to 7 which indicates one of the 8 raster positions immediately surrounding the previous point. Another approach introduces the concept of the "skeleton" of an image. This scheme is useful in applications such as chromosome analysis where reduction of the image to a skeletal graph is an objective. However, for photocomposition, it offers no particular advantage over the other encoding schemes mentioned.

In fact, none of these schemes reduces the storage requirements to a feasible level for the experimental equipment indicated previously. A new scheme which proves to be highly effective is described below.

Figure 1-Patch configurations

Figure 2-Types of patches

NEW IMAGE DEFINITION
This scheme uses the concept of breaking up an
image into patches. The contours of each patch are
then described by a set of parameters. The parameters
are chosen such that a simple manipulation yields varying sizes of an image.
There are various ways in which an image can be
divided up into patches, and correspondingly various
ways of defining the patch parameters. One such construction was proposed by Mathews and Miller in
1965. 1 Their construction, which was designed for hardware implementation, assumed that curved portions of
an image would be broken up into patches, each one of
which had two curved sides and two other sides which
are straight parallel lines. Each patch requires eight

parameters: the initial width, the height, the coordinates of one corner, and the curvature and slope of
each curved side. Rectangles and trapezoids are treated
as special cases.
The design described herein also uses the three
types of patches: rectangles, trapezoids, and patches
with curved boundaries. The curved patch definition is
somewhat freer than that used in the Mathews and
Miller construction, in the sense that it is not constrained to have two sides which are straight parallel
lines. Figure 1 shows some sample patch configurations.
These are described in detail further below. Of greater
significance is the difference in methods used to arrive
at the parameters for a particular image or set of
images. Mathews and Miller used trial and error,
which results in considerable effort to divide an image
into patches and to determine the necessary parameters. It is particularly difficult to choose the slopes and
curvatures in such a way that successive patches match
well at their points of juncture, i.e., to insure against
the appearance of cusps at these points. Clearly, a
faster, more analytical method is necessary for producing image definitions in any quantity. This paper describes both a curved patch construction and an associated algorithm for arriving at the necessary parameters
directly. The three types of patches are illustrated in
Figure 2. Rectangles and trapezoids are handled in a
straightforward manner. For a rectangle, the stored
parameters are the width or the height, whichever is
larger, and the number of strokes to be drawn. Vertical
strokes are drawn if the height is greater than the
width, and horizontal strokes otherwise. For a trapezoid, four parameters are stored: the bottom length,
the number of strokes to be drawn, the change in
abscissa of the left hand end of each successive stroke,
and the change in line length for each successive stroke.
These last two parameters are related, of course, to the
angles A and B made by the sides of the trapezoid as
shown in Figure 2.
A curved patch is a somewhat more complicated
matter. We consider a curved patch to consist of two
curved members. Vertical strokes are drawn where the
curved members are considered as functions of X, and
horizontal strokes are drawn where the curved members
are considered as functions of Y. The two curved members mayor may not meet each other at their end
points. Where they do not meet, straight lines are assumed to connect the ends. Each curved member is
described by at least one mathematical function. In
some cases, a curved member is broken up into a number of segments, and each segment is described by a
separate function. First, we must decide what kind of
function is to be used. Second, for a given image or set
of images such as the characters of a particular font,

Figure 3-Steps in defining an image

we find a suitable method for determining the parameters of the function for each segment.
Two factors govern the type of function to be used:
storage and execution time. A more complicated function spans a larger segment and carries through more
inflections of the curve, thus requiring a smaller number of segments, and consequently less storage than a
simple function. However, it takes longer to compute.
Since our experiment concentrated on font production
on a rather small computer, we opted to minimize execution time. For this reason, we ruled out superellipses,
as suggested by Mergler and Vargo in their experiments
in font design. 2

We decided initially to experiment with parabolas
of the type: y=ax 2+bx+c or, x= ay2+by+c. The
principal axes of these parabolas are parallel to the Y
and X axes respectively. Roughly speaking, we use the
form y=f(x) for parts of a curve that are U or ∩
shaped, and the form x=f(y) for parts that are ( or )
shaped. To determine optimally which form to use
where, we follow the steps illustrated in Figure 3. The
upper left corner of this figure shows a donut shaped
image we wish to define. We actually start the process
with the donut shown in the upper right corner. Here
we mark all the points where 45° and 135° lines are
exactly tangent to the donut. There are eight of them.


Figure 4-Equalizing the angle error

We then connect corresponding points on the inside
and outside curves with dotted lines. For example,
point 1 is connected to point 11, point 3 is connected
to point 13, etc. This carves up the donut into four
patches. For the top and bottom patches we will fit
parabolas of the form y=f(x), and fill the patches with
vertical strokes. For the left and right patches we will
fit parabolas of the form x=f(y) and fill the patches
with horizontal strokes.
Next we go to the lower left corner of the figure.
Here we add 9 more points where 0° and 90° lines are
exactly tangent to the donut. Finally, we find the
parabolas which fit between successive points. We fit
one parabola between points 1 and 2, another parabola
between points 2 and 3, etc. The donut in the lower
right corner of the figure shows two of the parabolas
that have been found. Between points 14 and 15 we
fit a parabola of the form y=f(x), and between points
6 and 7 a parabola of the form x=f(y).
GETTING A GOOD FIT
In our experiment we discovered that the process of
finding the actual parameters for the parabolas to be

fit to each particular patch is not a trivial matter. We
took the capital letters of the Univers font, and actually programmed a number of algorithms before we
arrived at one which yields consistently good results
with a minimum of manual labor. First we tried entering the coordinates of various points along the given
curve. Taking the first four points, we forced the parabola to pass through the two middle points, and be a
least squares fit to the two surrounding points. The resulting parabola was then used for the segment between the two middle points. We then deleted the first
point, added the fifth point in sequence, and repeated
the fitting process. This continued until we ran out of
points. The resulting segments were fair fits, but they
did not include either end segment. These could be
generated by increasing the initial list by two points
lying someplace on an extrapolation of the given curve
member. Pinpointing that someplace proved often to
be like finding a needle in a haystack.
In successive trials, we loosened up the input format
by specifying for each segment separately two "fixed"
points through which the parabola had to pass, and two
"floating" points for the least square fit. This enabled
us to place the two floating points inside or outside
the two fixed points. With sufficient fishing this approach resulted in much better fits, but we were often
bothered by noticeable cusps at the junction of segments. This led us to concentrate on the tangent lines
of a fitted parabola at the end points. From this evolved
our final algorithm, which "equalizes the angle error,"
i.e., makes the angle between the tangent line of the
fitted parabola at one end point and the tangent line
of the given curve at that point equal to the correspondingly defined angle at the other end point. This is
illustrated in Figure 4 and is stated more formally as
follows. Given points A and B, and their derivatives A' and B', to fit a parabola:

1. Compute the parabola P_A which passes through A and B and has the derivative A' at A. Compute the derivative B_A' of P_A at B.
2. Compute the parabola P_B which passes through A and B and has the derivative B' at B. Compute the derivative A_B' of P_B at A.
3. The tangent lines corresponding to the two derivatives A' and A_B' at A form some angle α. Similarly, the tangent lines corresponding to the two derivatives B' and B_A' at B form some angle β. There exists a family of parabolas which pass through A and B and which have derivatives A_F' and B_F' whose respective tangent lines lie somewhere within the angles α and β, respectively. Within this family, choose that parabola whose tangent lines at A and B "equalize the angle error," i.e., where the angle (θ) made by the tangent lines corresponding to the derivatives A' and A_F' is equal to the angle (θ) made by the tangent lines corresponding to the derivatives B' and B_F'.
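The first two steps are a small linear solve; the third can be realized in several ways. The Python sketch below is our reading of the construction: it fits the anchored parabolas of steps 1 and 2 and then bisects on the slope assumed at A until the two tangent-angle errors agree. It assumes x_A ≠ x_B, and it is not the authors' implementation.

import math

def parabola_through(xa, ya, xb, yb, slope_at_a):
    """Parabola y = a*x**2 + b*x + c through (xa, ya) and (xb, yb) with
    derivative slope_at_a at xa; also returns the derivative it induces at xb."""
    h = xb - xa
    a = (yb - ya - slope_at_a * h) / (h * h)
    b = slope_at_a - 2.0 * a * xa
    c = ya - a * xa * xa - b * xa
    return (a, b, c), slope_at_a + 2.0 * a * h

def equalize_angle_error(xa, ya, xb, yb, da, db, iters=80):
    """Bisect on the slope assumed at A until the tangent-angle error at A
    equals the tangent-angle error at B (step 3 above).  da, db are the
    given derivatives A' and B'; the bisection is our choice of search."""
    def err(slope_a):
        _, slope_b = parabola_through(xa, ya, xb, yb, slope_a)
        return abs(math.atan(slope_a) - math.atan(da)) - abs(math.atan(slope_b) - math.atan(db))
    lo = da                                   # zero error at A, all error at B
    hi = 2.0 * (yb - ya) / (xb - xa) - db     # zero error at B, all error at A
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if err(mid) <= 0.0:
            lo = mid
        else:
            hi = mid
    return parabola_through(xa, ya, xb, yb, 0.5 * (lo + hi))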
This algorithm proved highly successful. We did note,
however, that points on the curve where the tangent
line is exactly horizontal or exactly vertical were particularly sensitive. Accordingly, all such points are
made end points of curve segments. In this case alone,
we do not equalize the angle error, but force the derivative of the curve at such a point to be exactly zero.
In most cases this procedure gives the visual effect of
continuous curvature at the juncture of adjacent segments, even though the angle made by the tangents of
the two segments at the juncture differs by as much as
10 degrees. Beyond this threshold however, the contour appears disjoint. Where this occurs, the addition
of one more segment normally removes this condition.
The parameters stored for a curved patch consist
essentially of the coordinates of the starting and ending
points of the two curved members relative to a local
origin of the entire letter or image, and a set of three
parameters required for each curve segment in the

patch. Various encoding economies are made, as, for
example, where a patch is preceded by a contiguous
patch, the end points of the previous patch are used as
the starting points of the current patch. We have also
incorporated a mirroring procedure so that patches
which are mirror images of each other share a common
set of parameters.
The curve fitting algorithm to equalize the angle
error computes the coefficients a and b of the parabola
as functions of the coordinates of the end points of the
segment and of the tangent of the given curve at those
points. The actual generation of an image, however, is
not done by evaluating the function y=ax2+bx+c or
X= ay2+by+c successively. Since the distance between
strokes is constant, the computation of the functional
values lends itself readily to implementation with difference equations. Classically, this approach consists
of starting with an initial value of the independent
variable (xo for vertical and Yo for horizontal curved
patches) and an initial functional value (Yo for vertical
and Xo for horizontal curved patches). We then construct each successive functional value from the previous functional value by applying the difference equation coefficients do and k, as shown below for a vertical
curved patch:
Yl=yo+do

d1=do+k

Y2=Yl+d1

d2 =d1+k

YN=YN-l+dN- 1

The difference equation constants are easily derived
from the coefficients a and b, the initial functional
value Yo, and the inter-stroke distance h:
do=ah2 +2ahYo+bh
k=2ah2
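A short Python sketch of the stroke generation for a vertical curved patch follows, using the difference-equation constants just derived; the coefficient values in the example are invented for illustration only.

def vertical_patch_strokes(a, b, c, x0, n_strokes, h=4):
    """Generate the stroke ordinates of a vertical curved patch by forward
    differences: y advances by d, d advances by k.  The inter-stroke
    distance h = 4 rasters matches the hardware constraint noted earlier."""
    y = a * x0 * x0 + b * x0 + c          # initial functional value y0
    d = a * h * h + 2.0 * a * h * x0 + b * h
    k = 2.0 * a * h * h
    ys = [y]
    for _ in range(n_strokes - 1):
        y += d                            # y_{n+1} = y_n + d_n  (two additions per stroke,
        d += k                            #  d_{n+1} = d_n + k    no multiplications)
        ys.append(y)
    return ys

print(vertical_patch_strokes(a=-0.002, b=3.0, c=-500.0, x0=700, n_strokes=5))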

Use of difference equations results in less execution
time since additions replace multiplications, and it results in less storage required since only two difference
equation constants are required rather than the three
coefficients a, b, and c. An auxiliary advantage is that
the range of the difference equation constants is considerably smaller than that of the coefficients. The
difference equation constants each fit within one word
as fixed-point numbers whereas the coefficients would
probably require a floating point representation or
double precision storage.
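The following short sketch, written in present-day Python purely for illustration (it is not part of the implementation described here, and the function and variable names are ours), shows the forward-difference generation of the stroke values of a vertical curved patch and checks it against direct evaluation of y = ax² + bx + c.

def stroke_values(a, b, c, x0, h, n):
    """Generate n+1 functional values of y = a*x**2 + b*x + c at
    x0, x0+h, ..., x0+n*h using only additions inside the loop."""
    y = a * x0 * x0 + b * x0 + c             # initial functional value y0
    d = a * h * h + 2 * a * h * x0 + b * h   # first-difference constant d0
    k = 2 * a * h * h                        # second-difference constant k
    values = [y]
    for _ in range(n):
        y += d                               # y(i+1) = y(i) + d(i)
        d += k                               # d(i+1) = d(i) + k
        values.append(y)
    return values

# The difference-equation values agree with direct evaluation of the parabola.
a, b, c, x0, h = 0.002, 1.5, -100.0, 10.0, 4.0
direct = [a * (x0 + i * h) ** 2 + b * (x0 + i * h) + c for i in range(11)]
assert all(abs(u - v) < 1e-9
           for u, v in zip(stroke_values(a, b, c, x0, h, 10), direct))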
In summary, the parameters required for each curved
segment are the coordinates of the starting point, the
two difference equation constants, and the number of
strokes spanning the segment. Figure 5 and Tables I
and II contain a complete example of a parametric


TABLE I-Input Data for Image in Figure 5

(Columns: POINT, X, Y, and ANGLE OF TANGENT for the 24 defining points of the image in Figure 5, measured relative to the local origin.)

definition. In Figure 5 the types of patches are coded
as follows: R = rectangle, T = trapezoid, V = vertical
curve patch, H = horizontal curve patch, M = mirror
image of other patches. Figure 6 shows an exploded
view of the image generated from this definition. This
same definition may be used to generate the image in
any other size, larger or smaller. Excessive size changes
result in some degradation to the image. See Figure 7


Figure 5-Example of parametric definition

for a few examples of sizing. In these cases, the sizing
was done in a strictly proportional manner. The patches
may also be individually sized by varying algorithms
to obtain thinning or thickening of different parts of
an image.
The various images in Figures 1, 2, 3, 5, 6, and 7 are

TABLE II-Computed Parameters for Image in Figure 5

End points      Parabolic Coefficients                    Difference Equation Constants
of segment      a          b            c                 k          d0
1-2             .00256     3.80424      -615.17606        .08181     -0.04091
2-3             .00240     2.62036       273.29839        .07694      3.79026
3-4             .00078     0.68705       848.84937        .02498      0.96187
4-5             .00136     1.19329       737.47536        .04339     -0.02170
5-6             .00169    -2.06229       644.06991        .05411     -3.46570
6-7             .00074    -0.74405       186.01190        .02381     -0.98810
8-9             .00043    -0.38025        83.65432        .01383      0.00691
14-15           .00446     6.63375     -1815.75331        .14266     -0.07133
15-16           .00414     4.21743      -188.60866        .13243      2.92872
16-17           .00156     1.37188       594.18601        .04989      1.02268
17-18           .00208     1.83257       492.83464        .06664     -0.03332
18-19           .00218    -2.63984       943.88245        .06989     -3.66281
19-20           .00049    -0.49310       263.27415        .01578     -0.60750
21-22           .00064    -0.56122       227.46939        .02041      0.01020


Figure 6-Exploded view of an image defined and generated by the patch method

actual computer output and were all made using this
new method. The output device used for this purpose
was a Stromberg-Carlson 4060 microfilm recorder.
SUMMARY AND EVALUATION
We may now evaluate the method described with respect to the manual operations, computer storage, and
processing time.
The curve fitting implementation for this study was

initially done on a non-interactive basis. In this case,
the manual operations consist of measuring and keypunching the coordinates of the significant points, i.e.,
all terminal points, inflection points, and points of
tangency with 0, 45, 90, and 135 degree lines, and any
additional points required for smoothing, along with
the angle of the tangent line at these points. Almost all
of these points are chosen to be points of tangency
with 0, 45, 90, and 135 degree lines, so that it is not
actually required to measure the angle in these cases.
Even these manual operations are a decided improve-

Figure 7-Exploded views of various sized images derived from image shown in Figure 5


ment over previous methods, where many more points
must be specified, and redone for each separate size.
This method has also been implemented in an interactive program designed by Miss Joan E. Miller of Bell
Telephone Laboratories. This interactive procedure
saves substantial time by replacing the measuring and
keyboarding of the various coordinates and tangents
by the faster physical and visual processes of drawing
and manipulating various knobs, pushbuttons and
switches and also by providing a means of immediate
feedback and correction for the curve fitting processes.
In this procedure, we draw the boundaries of the image
by hand on a RAND tablet. This results in a stack of
coordinates in storage, which are then subjected to a
smoothing algorithm, and displayed on a scope. The
next step is to find the significant points of tangency.
This is done by means of a novel feature of this program, which is a tracking dot under knob control. This
dot is not permitted to course the entire scope, but is
constrained to follow the path of the input curve. As it
does so, the tangent of the curve is continuously computed and displayed numerically on the scope. The user
may thus position the tracking dot to any desired point
and register it by pushing a button. After all these
significant points are captured the curve fitting algorithms described previously are applied and the fitted
parabolas immediately displayed. Additional segments
are added as needed. Finally, the parameters describing
the entire image are organized and outputted for use
in the photocomposition phase.
The image definitions for this experiment were stored
and executed as procedures in an interpretive language.
While this adds some overhead to both the storage and
processing, it affords a great degree of flexibility. For
this implementation, the storage required for the 26
capital letters is approximately 700 words. Extending
this to 128 characters, adjusted for smaller characters,
we arrive at a conservative estimate of 3,200 words,
representing a two-thirds reduction over the 9,600 indicated
previously. Moreover, we note that font definitions in
parametric form are easily sized. This can be done in
the definition process by transforming the coordinates
of the defining points prior to computing the parameters, or it can be done dynamically at drawing time
by maintaining a "master" set of definitions in one
size and deriving other sizes as required from the
master set. All sized versions of the same image require
the same amount of storage, a valuable feature for
efficient storage management.
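As an illustration of deriving other sizes from a master definition, the sketch below (present-day Python, illustrative only; the function and parameter names are ours) scales a master parabola segment proportionally about the local origin and then forms the difference-equation constants for the scaled segment, assuming the inter-stroke distance h is fixed by the output device.

def scaled_difference_constants(a, b, c, x0, s, h):
    """Scale the master curve y = a*x**2 + b*x + c by the factor s about the
    local origin (both axes), then derive the stroke-generation constants.
    Scaling maps (x, y) to (s*x, s*y), so the scaled coefficients are
    a/s, b, s*c and the starting abscissa becomes s*x0."""
    a_s, b_s, c_s, x0_s = a / s, b, s * c, s * x0
    y0 = a_s * x0_s * x0_s + b_s * x0_s + c_s   # starting functional value
    d0 = a_s * h * h + 2 * a_s * h * x0_s + b_s * h
    k = 2 * a_s * h * h
    return y0, d0, k

Because h stays fixed while the width of the segment scales with s, the stroke count stored for the segment grows in roughly the same proportion.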

Volume timing tests are currently being constructed.
Initial results indicate a through-put rate in the
neighborhood of 100 characters per second. This is satisfactory for many applications where graphic arts
quality output is required using a general purpose computer system, and where economy of operation is a
prime consideration. The speed may be increased, of
course, by microprogramming or hard-wiring or by
using computation equipment with a higher cycling
rate.
We conclude that the method described of defining
images in parametric form has a number of distinct advantages, and while it is particularly desirable in a
minimum equipment configuration, it is generally applicable to a variety of hardware configurations.

REFERENCES
1 M V MATHEWS J E MILLER
Computer editing, typesetting and image generation
Proceedings Fall Joint Computer Conference 1965
pp 389-398
2 H W MERGLER P M VARGO
One approach to computer assisted letter design
The Journal of Typographic Research Volume II
Number 4 October 1968 pp 299-322
3 J S WHOLEY
The coding of pictorial data
IRE Transactions on Information Theory Volume IT-7
Number 2 April 1961 pp 99-104
4 H FREEMAN
On the encoding of arbitrary geometric configurations
IRE Transactions on Electronic Computers Volume EC-10
Number 2 June 1961 pp 260-268
5 C T ZAHN
A formal description for two-dimensional patterns
Proceedings of the International Joint Conference on
Artificial Intelligence May 7-9 1969 Washington DC
Gordon and Breach NY 1969
6 U MONTANARI
A note on minimal length polygonal approximation to a
digitized contour
Communications of the ACM Volume 13 Number 1
January 1970 pp 41-47
7 J L PFALTZ A ROSENFELD
Computer representation of planar regions by their skeletons
Communications of the ACM Volume 10 Number 2
February 1967 pp 119-125
8 U MONTANARI
Continuous skeletons from digitized images
Journal of the ACM Volume 16 Number 4 October 1969
pp 534-549

A syntax-directed approach to
pattern recognition and description
by LARRY D. MENNINGA
Western Washington State College
Bellingham, Washington


INTRODUCTION
Pattern recognition has held the attention of researchers
for quite some time. Early efforts were in optical
character recognition and were intended to provide
easier and more rapid communication from man to
machine. In more recent years, this research has expanded to include the processing of pictorial information,
such as that from high energy physics or medical
research.
Until recently, most of the work and associated
theory has treated pattern recognition as a classification
or categorization problem. The methods used can be
broadly characterized as follows: Each sample pattern
is represented by an n-dimensional vector whose components are the values of the individual features, or
properties, which have been selected for measurement.
The classification is done by partitioning n-space,
referred to as the feature space, into subspaces. Each
subspace represents a class, and a sample pattern is
considered to be a member of the class corresponding
to the subspace of which the pattern vector is a member.
The members of each class must be clustered in the
feature space to achieve successful recognition. The use
of weight vectors, proper selection of features, adaptive
systems, and other techniques for improvement have
been investigated. A survey of this work is given
by Nagy.1
The most significant shortcoming of the above
method is that the result of such a recognition process
yields only a classification, a class name or number.
What is often desired is a structural description of the
pattern or an analysis of the relationships existing
between certain substructures within the pattern. This
is especially true of more complex patterns. Of course,
it is possible to divide the feature space into classes,
with each one corresponding to a different structural
description. However, complex patterns would necessitate an extremely large number of classes and would
result in unmanageable problems.
Such classification methods are also essentially one-level. The features selected for measurement must be able to detect any structural relationships which are significant. As the patterns become more complex, the selection of a suitable set of features becomes more difficult.
Finally, the vector representation of a pattern is not the most natural for people. To make use of the potential for powerful recognition systems using interactive computing, the human being must be able to communicate with the computer using representations which he can grasp rapidly.

The linguistic approach

In recent years, research has been done using linguistic methods to overcome the failures of the classification methods. The linguistic methods are aimed
specifically at producing structural descriptions of
patterns. Miller and Shaw2 give a survey of much of
this work.
In the linguistic approach, the pattern, or picture, is
considered to be a sentence in a language generated
by a given grammar. This grammar is defined either
explicitly or implicitly, and it is used to analyze each
sample pattern. In such a grammar, the nonterminal
symbols are phrases descriptive of a subpattern and
the terminal symbols (called primitives) are the basic
elements which are given a priori, and they are recognized by some method outside of the linguistic model.
A given sample pattern is then described, or recognized, by using the grammar to analyze it. A derivation
tree gives a description of the sample in terms of substructures and the relationships which they satisfy, insofar as these are included in the grammar. Thus, in
addition to being the desired result of an analysis, the
description, as expressed in the rules for the grammar,
can also direct the recognition process by defining the
sequence of algorithms to be used.
145

146

Fall Joint Computer Conference, 1971

(Figure 1 content: the BNF syntax for pattern rules. A pattern rule names an object type and its object labels on the left side and, on the right side, a list of basic objects, relators, and an optional predicate list; the predicates listed include below, above, leftof, rightof, parallel, perp, skew, equal, tangent, intersect, far, and equallength.)

Figure 1-Syntax for the pattern rules

A SYNTAX-DIRECTED SYSTEM
A system called PARSE, which uses the linguistic
approach to pattern description, will be described here.
PARSE is an acronym for Pattern Analysis and
Recognition by Syntax Evaluation. It is a system in
which the user must supply the metalanguage to be
used to analyze and describe patterns.

Structure of grammar for PARSE

The formal syntax for a user-supplied grammar rule is shown in Figure 1. Each rule in the grammar is called a pattern rule and gives the definition of an object. A set of pattern rules is a pattern grammar. The left side of each rule begins with an identifier. This identifier is called an object type and is a nonterminal symbol in the vocabulary of the pattern grammar. In addition, the left side of each rule includes a list of labels.
The right side of each rule consists of a list of object types, with associated labels, and predicates. There are two special object types, POINT and LINE, which are terminal symbols in the vocabulary. These are the primitives, that is, they are object types which have been defined a priori. They are the basic elements in the language (or in the patterns), and they are recognized outside of the PARSE system. The list of predicates in Figure 1 is not intended to be exhaustive, but rather to represent a typical set.
More than one pattern rule is allowed for any object type. This makes it possible to give alternate definitions for an object type. By using more than one pattern rule, it is possible to construct recursive definitions. Each object type must be defined by a pattern rule, or be a primitive, if it is to appear on the right side of a pattern rule.
As an example, consider the grammar rules given in Figure 2. In the example, the object type HOUSE is defined in terms of the object types TRIANGLE and RECTANGLE and the relationships are specified by the predicates above, parallel, perp, and skew. The labels are used to identify object types, and they may be used to specify a correspondence between substructures. The relationship of identity is given implicitly by repeating an object label more than once in a pattern rule. In the rule for HOUSE, the object label X is associated with both TRIANGLE and RECTANGLE to indicate that the same instance of the line X must be a part of each of these objects.

HOUSE(H:T,S) ::= TRIANGLE(T:X,Y,Z) above RECTANGLE(S:X,U,V,W)
TRIANGLE(T:X,Y,Z) ::= LINE(X:P,Q) LINE(Y:Q,R) LINE(Z:R,P) skew[X,Y]
RECTANGLE(R:X,Y,Z,W) ::= LINE(X:P,Q) parallel LINE(Z:U,V) LINE(Y:Q,U) parallel LINE(W:V,P) perp[X,Y]

Figure 2-Sample pattern grammar

Although the user may specify the grammar which he wants to be used, this specification is subject to certain restrictions imposed by the syntax rules of Figure 1. In each pattern rule, the labels and predicates specify semantic information for the rule, and so these elements can be treated separately. Ignoring this semantic information, the "underlying" grammar can


easily be seen to be a context-free phrase structure grammar, as defined by Chomsky,3 since only one nonterminal symbol is allowed to appear on the left side of a pattern rule. Note that the empty string cannot be generated by any of the allowable rules and that all rules are length preserving. For such a context-free grammar, there is a derivation for every sentence in the language which is generated by that grammar.4 Thus it is possible to define an algorithm to determine a description (structure) for a given pattern (string) in the language.
Semantics used by PARSE
Knowing the structure of a pattern is not enough.
Semantics are also involved in the descriptions of
patterns. The semantics give meaning (principally
spatial relationships) to a particular structure. Semantic
information is given primarily by the predicates and the
object labels in the pattern rules. The meaning supplied
by these elements of the language will be formalized
in a manner similar to that suggested by Knuth. 5
The semantics will be supplied by the values of certain
attributes, or properties, that the symbols of the
language will have. The attributes selected can be
divided into two areas: geometric properties and labels.
The value of each attribute is assigned by evaluating a
function. The same name will be used for an attribute
and its corresponding function.
Each pattern rule has semantics associated with it.
Thus it is necessary to have a function for each attribute
of each symbol in the rule. The value assigned to an
attribute of a given symbol depends on the values of
some of the attributes of the other symbols in the rule
and on some of the other attributes of the same symbol.
Consider the underlying grammar given by a set of
pattern rules. Rules in the underlying grammar will be
called syntax rules. Let G be such a context-free grammar, G = (V_T, V_N, P, S), where V_T = {terminal symbols}, V_N = {nonterminal symbols}, P = {syntax rules}, and S = the start symbol. For each symbol X in V_T ∪ V_N, there is a finite set of attributes or properties, A(X). Let Y_a be the set of values that can be assumed by a given attribute a in A(X).
If the rth syntax rule is

X_0 → X_1 X_2 ... X_n

where X_0 ∈ V_N and X_j ∈ V_T ∪ V_N for 1 ≤ j ≤ n, then the semantics can be defined as follows: For each attribute a of a symbol X_j, there is a function f_ja which maps the values of certain attributes of X_0, X_1, ..., X_n into a value of a. That is, f_ja: Y_a1 × Y_a2 × ... × Y_a_t(j,a) → Y_a, where a_i = a_i(j,a) is an attribute of X_ki, for 0 ≤ k_i = k_i(j,a) ≤ n and 1 ≤ i ≤ t(j,a).


Ag(X) = {angle, length, x min, x max, y min, y max} for every symbol X which is not a primitive.
Ag(POINT) = {x min, y min}
Ag(LINE) = {angle, length, x min, x max, y min, y max}.

Figure 3-Attributes of the object types

For each symbol X, partition A (X) into two disjoint
subsets Ag(X), the geometric attributes, and AL(X),
the attributes dealing with the labels. The sets of
geometric attributes are given in Figure 3. The label
attributes are handled in a similar fashion.
Each attribute is assigned a value by a function of the
other attributes. Let ((x1, y1), (x2, y2)) be a pair of points giving the cartesian coordinates of the primitive LINE and (x, y) be the coordinates for POINT. Then the functions which give the semantic rules are defined as follows:

angle(LINE) = arctan[(y1 - y2)/(x1 - x2)] for x1 ≠ x2; π/2 for x1 = x2
length(LINE) = √((x1 - x2)² + (y1 - y2)²)
x min(POINT) = x
y min(POINT) = y
x min(LINE) = minimum {x1, x2}
x max(LINE) = maximum {x1, x2}
y min(LINE) = minimum {y1, y2}
y max(LINE) = maximum {y1, y2}
For the rth rule X_0 → X_1 X_2 ... X_n the semantic rules for the attributes in Ag are:

angle(X_0) = Σ angle(X_i)·length(X_i) / Σ length(X_i)
length(X_0) = Σ length(X_i)
x min(X_0) = minimum {x min(X_i) | i = 1, 2, ..., n}
x max(X_0) = maximum {x max(X_i) | i = 1, 2, ..., n}
y min(X_0) = minimum {y min(X_i) | i = 1, 2, ..., n}
y max(X_0) = maximum {y max(X_i) | i = 1, 2, ..., n}
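A small sketch in present-day Python (illustrative only; the actual system was written in Burroughs Extended ALGOL, and the function names here are ours) shows how these semantic rules assign attribute values, using the sample TRIANGLE of Figure 4.

import math

def line_attributes(p1, p2):
    """Geometric attributes of the primitive LINE with end points p1 and p2."""
    (x1, y1), (x2, y2) = p1, p2
    angle = math.pi / 2 if x1 == x2 else math.atan((y1 - y2) / (x1 - x2))
    return {"angle": angle,
            "length": math.hypot(x1 - x2, y1 - y2),
            "x min": min(x1, x2), "x max": max(x1, x2),
            "y min": min(y1, y2), "y max": max(y1, y2)}

def composite_attributes(parts):
    """Attributes of X_0 for a rule X_0 -> X_1 ... X_n, following the semantic
    rules above: length-weighted mean angle, summed length, enclosing box."""
    total = sum(p["length"] for p in parts)
    return {"angle": sum(p["angle"] * p["length"] for p in parts) / total,
            "length": total,
            "x min": min(p["x min"] for p in parts),
            "x max": max(p["x max"] for p in parts),
            "y min": min(p["y min"] for p in parts),
            "y max": max(p["y max"] for p in parts)}

# The sample TRIANGLE of Figure 4, with vertices (0,0), (2,0), and (1,1).
lines = [line_attributes((0, 0), (2, 0)),
         line_attributes((2, 0), (1, 1)),
         line_attributes((1, 1), (0, 0))]
triangle = composite_attributes(lines)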
A "meaning" is assigned to each sentence in the
language by the semantics. A derivation of the sentence
is carried out in the usual way, using the syntax rules.
Starting with the terminal symbols, the attributes are
evaluated for each symbol in the derivation. It is easy
to see that the semantic rules define the attributes of a
symbol X as a function of the attributes of those symbols which appear on the right side of a production


(Figure 4 content: the derivation of a sample TRIANGLE from LINE1 = ((0,0),(2,0)), LINE2 = ((2,0),(1,1)), and LINE3 = ((1,1),(0,0)), together with a table of the attribute values angle, length, x min, x max, y min, and y max for each LINE and for the TRIANGLE.)

Figure 4-Semantic evaluation of TRIANGLE

defining X. Thus, by first evaluating the attributes of
the terminal symbols, and working backward through
the derivation, the attributes of each symbol can be
defined. When all the attributes can be evaluated the
semantic rules are said to be well-defined. The meaning
of the sentence is the value of the attributes of the
start symbol.
In addition, any symbol can be considered to have
meaning, determined by the values of its attributes.
The PARSE system allows restrictions to be placed
upon the meaning of certain symbols or sub-derivations.
These restrictions are used to limit membership in a
given pattern language. Thus, if PG is a pattern grammar with underlying grammar G, then L(PG), the language generated by PG, is just those sentences with
structure generated by G which satisfy the semantic
restrictions imposed by the predicates and labels.
The specifications of the syntax for the pattern rules
allow predicates to be used in two ways. When a
predicate is used as an instance of (relator) it is a two
argument predicate, with the first argument being the
basic object preceding it, and the second argument the
basic object that follows the predicate. When a predicate is used as an instance of the metalinguistic variable
(predicate list), its arguments are the objects corresponding to the labels appearing as formal parameters
in the rule. In both cases the predicate is evaluated
using the geometric attributes of the object types which
make up its arguments. The predicates are defined to
allow sets of symbols as arguments because instances
of (basic object) can include more than one symbol.
To illustrate how the semantic information is evaluated and used, consider the grammar rules given in Figure 2. For each symbol in the grammar, the set of geometric attributes is the same. Thus Ag(HOUSE) = Ag(TRIANGLE) = Ag(RECTANGLE) = {angle, length, x min, x max, y min, y max}. Figure 4 shows the derivation of a sample TRIANGLE and a table of attribute values using the given pattern rule. The underlying syntax rule for TRIANGLE is TRIANGLE → LINE1 LINE2 LINE3.

Recognition

The analysis uses the techniques of syntax-directed
compiling6 to recognize patterns. A "top down" analysis
of each sample pattern is done to determine if it is a
sentence in the language generated by the grammar.
A sketch of the procedure to use is as follows: Consider
the input pattern to be a set, Q, rather than a string.
This set is assumed to be finite. Now, beginning with
the start symbol, S, or global object type, generate a
sentence, or pattern, by first replacing the start symbol
by its definition as supplied by a production with that
symbol as the left side: S → X_1 X_2 ... X_n.
(1) If X_1, X_2, ..., X_n are all terminal symbols, then map the set {X_1, X_2, ..., X_n} into the input set, Q. Evaluate the semantics for the production using the values of the attributes of the images of the X_j's. If the semantic restrictions are met, then Q is an instance of the object type S, and it is a member of the language generated by S. If the semantic restrictions are not satisfied a new mapping must be tried. This continues until either all the possible mappings are tried or the semantics are satisfied.
(2) If the X_j's are not all terminal symbols, then choose the first nonterminal symbol, say X_j. Apply the first production for X_j (if there is more than one production): X_j → X_j1 X_j2 ... X_jm. Now proceed as in step (1) above, mapping {X_j1, X_j2, ..., X_jm} into Q.
(3) Continue, by repeating steps (1) and (2) until all terminals are reached. If at any point in the procedure all the mappings have been tried and the semantic restrictions have not been met, then go back to the previous performance of step (2) and apply the next alternate definition. If all the alternate definitions for a given nonterminal symbol have been tried for a particular performance of step (2), then go to the step (2)

Syntax-Directed Approach to Pattern Recognition and Description

two levels back. If all the alternate definitions
for the start symbol have been tried and the
semantics are still not satisfied, then the input
set, Q, is not in the language generated by S.
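The following fragment (present-day Python, a much-simplified sketch rather than the actual program; all names are ours) illustrates step (1) only, the mapping of the right-side symbols onto members of the input set and the evaluation of the semantic restrictions; the full procedure adds the recursive expansion of nonterminals and the backtracking of steps (2) and (3).

from itertools import permutations

def match_rule(rhs, input_set, is_instance, semantics_ok):
    """Try to map the symbols X_1 ... X_n of a right side onto distinct members
    of the input set Q.  is_instance(sym, q) says whether input element q can
    serve as an image of symbol sym; semantics_ok(mapping) evaluates the
    semantic restrictions of the rule.  Returns the first mapping that
    satisfies both, or None."""
    for candidates in permutations(input_set, len(rhs)):
        mapping = list(zip(rhs, candidates))
        if all(is_instance(sym, q) for sym, q in mapping) and semantics_ok(mapping):
            return mapping
    return None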



Description

Since the primary motivation for the linguistic
approach to pattern recognition is to produce a description of the patterns processed, the result of an analysis
must allow this possibility. The result of a parse, as
presented in the above, will be a yes or no answer, as
to whether the input pattern is an instance of the object
type which is the start symbol of the pattern grammar
being used.
In addition, the derivation of the sentence should be
kept, so that the user can have this derivation printed
out. This will then be a description of the pattern in
terms of the subobjects of which it is composed and the
spatial relationships which they satisfy, as specified in
the pattern rules.
This is all very good and useful, but it does not give
any information about patterns which are not sentences
in the pattern language. A description of such a pattern,
in terms of object types within the grammar and the
spatial predicates, should also be produced.
The object types, used as nonterminal symbols in the
pattern grammar, form a natural hierarchy. The start
symbol is the highest level object type in that hierarchy.
The object types are ordered by assigning X a lower
level than Y if the first production defining X occurs
previous to the first production defining Y. Because the
object types in the right side of a pattern rule must all
be defined, this will result in having X lower level than
Y if there is a production Y → sXt and no production X → uYv, where s, t, u, and v are strings of symbols (possibly empty). Also, X is lower level than Y if Z is lower level than Y and Z → sXt occurs before any production of the form X → uZv.
In cases in which the input pattern is not a sentence
in the language, the desired result is a description of the
input in terms of the highest level object types. This
description will not be unique, in general, but should be
minimal in some sense.
Minimization will be achieved as follows: First the
highest level object will be found such that the input
pattern is a candidate for inclusion in the language
generated starting with that object type. In other words,
the highest level object type, such that the input
includes an instance of it, is found first. Each input line,
used to compose the object found, is marked. Now an
instance of the highest level object type in the hierarchy,
which is composed of the maximum number of un-

TRIANGLE(T:X,Y,Z) rightof RECTANGLE(S:U,X,V,W)

Figure 5-Maximal description of a pattern

marked lines, is sought. If none is found with at least
one unmarked line, the object type which is the second
highest in the hierarchy must be tried. If an instance is
found, the unmarked lines are marked, and another
instance of that type is sought. This process continues
until all lines are marked or all object types have been
tried unsuccessfully.
A list of the instances of object types found in this
manner is constructed. The predicates defined within
the PARSE system are evaluated using all pairwise
combinations of objects found as arguments. Those
predicates which are true are added to the list of
object types, and this then becomes the description of
the input pattern. Figure 5 is an example of such a
description.
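A compact sketch of this marking procedure (present-day Python, illustrative only; it accepts any instance using at least one unmarked line rather than enforcing the maximum-unmarked-lines criterion, and all names are ours) is given below.

def maximal_description(lines, object_types, find_instance, predicates):
    """object_types is ordered from highest to lowest level; find_instance(t, u)
    returns a set of lines forming an instance of type t that uses at least one
    line of u, or None; predicates is a list of (name, test) pairs over two
    found objects."""
    marked, found = set(), []
    for obj_type in object_types:
        while True:
            instance = find_instance(obj_type, set(lines) - marked)
            if instance is None:
                break                      # try the next-highest object type
            found.append((obj_type, instance))
            marked |= set(instance)        # mark every line used by the instance
        if marked == set(lines):
            break                          # all input lines accounted for
    # Add every true pairwise spatial predicate to the description.
    relations = [(name, a, b) for name, test in predicates
                 for i, a in enumerate(found) for b in found[i + 1:] if test(a, b)]
    return found, relations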

COMPUTER IMPLEMENTATION
Computer and language used

PARSE has been programmed on the Burroughs
B5500 computer system at the University of Washington Computer Center. Interaction is provided by using
a remote teletype as an input and output device. Consideration was given to using a list processing language
for the implementation, since much of the data is best
handled by using linked-list data structures. However,
for reasons of availability, as well as ease of programming, Burroughs Extended ALGOL was used.


(Figure 6 content: the doubly-linked list built for one pattern rule, with object and predicate nodes, forward and backward pointers, and the argument label numbers for each node.)

Label table: X=1, Y=2, Z=3, W=4, P=5, Q=6, S=7, T=8

Rule:
RECT(R:X,Y,Z,W)::=LINE(X:P,Q) PARALLEL LINE(Z:S,T)
LINE(Y:Q,S) PARALLEL LINE(W:T,P) PERP[X,Y]

Figure 6-Internal representation of a pattern rule

Program structure

The PARSE program can be divided into three
major functional areas:
1. Preprocessor for pattern grammar rules.
2. Building of the data structure.
3. Analysis of the data according to the pattern
grammar.
The pattern rules are input from the teletype according to the syntax of Figure 1. The pattern rules are
checked for syntactical correctness by a top-down
analysis. Each pattern rule is made into a doubly-linked
list structure. An example of such a structure is shown
in Figure 6.
Each node in the list contains two pointers. The
forward pointer indicates the next node in the definition,
while the backward pointer specifies the node to go back
to in case backtracking is necessary during the analysis
of a pattern. Each node has a flag to indicate whether
it is a predicate or an object type, the name of the
predicate or object type, and a list of integers specifying
the labels used by that object type or predicate.
The data structure

The pattern is also represented by using a linked-list
structure. This representation arises naturally from the hierarchical treatment of the data during analysis. Each node represents an instance of an object type and consists of several fields. These are a label, a type designation, a pointer to the next node of the same type object, and pointers to several lists. There is a subobject list made up of objects which compose the given object and also a superobject list which contains those objects of which it is a part. There is also an attribute list which contains the values of the geometric attributes the object has.
Before any processing of the pattern is done, the data structure consists of only lines and points, the primitives. During the analysis, if a given object type is being sought, the data structure is examined to see if an instance of that type object is present. If not, the rule defining that object is invoked. Each node of the definition is to be satisfied by searching the data or invoking a definition if the node specifies an object type, or by evaluating the predicate if the node specifies one. As each object type is found it is inserted into the data structure. Thus, once a given instance of an object has been found, by satisfying the pattern rule defining it, the work done in finding it will not need to be repeated even though it may not be used at that point in the analysis.
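A present-day sketch of one such node (Python, illustrative only; the actual program builds these as linked lists in Burroughs Extended ALGOL, and the field names here are ours) is given below.

from dataclasses import dataclass, field

@dataclass
class ObjectNode:
    """One node of the pattern data structure: an instance of an object type
    with its label, a link to the next node of the same type, and its
    subobject, superobject, and attribute lists."""
    label: int
    obj_type: str                      # e.g. "LINE", "TRIANGLE", "HOUSE"
    next_of_type: "ObjectNode" = None  # next instance of the same object type
    subobjects: list = field(default_factory=list)   # objects composing this one
    superobjects: list = field(default_factory=list) # objects this one is part of
    attributes: dict = field(default_factory=dict)   # geometric attribute values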

CONCLUSION
PARSE is most similar to the system described by Evans.7 Evans' grammars are somewhat more restrictive in their specification of semantics. The Picture Description Language of Shaw8,9,10 can be modeled in PARSE
by associating a label for head and a label for tail with
each object type. Similarly, the system of Ledley11,12
is less powerful than PARSE. The only spatial relationship that his system allows is concatenation, which can
be handled with the labels alone in PARSE.
While some simple patterns have been processed
using PARSE,13 it has not been tested on any complex
pictures, and thus performance figures are not available.
In order to handle any automatic picture processing, a
device for recognizing the primitives would be a necessary addition to the system.
The PARSE system produces a description of an
input pattern in terms of a meta-language supplied by
the user. Although "natural" is a subjective judgment,
the description must be termed natural, in that it is
symbolic, using a familiar vocabulary with its usual
meaning. The PARSE system allows interaction
between man and machine. Success at recognizing any
instance of a pattern defined by the grammar is guaranteed by the restriction that the grammar be context-free. Further, a description of any picture will always


be produced, although not a unique one, or necessarily
the same description that a human being would give.
REFERENCES
1 G NAGY
State-of-the-art in pattern recognition
Proceedings of the IEEE 56-5 836-62 May 1968
2 W F MILLER A C SHAW
Linguistic methods in picture processing-A survey
Proceedings AFIPS 1968 Fall Joint Computer Conference
Thompson Book Co Washington 1968
3 N CHOMSKY
Formal properties of grammars
In Handbook of Mathematical Psychology Vol II
Luce Bush Galanter Eds Wiley New York 1963
4 J E HOPCROFT J D ULLMAN
Formal languages and their relation to automata
Addison Wesley Menlo Park California 1969
5 D E KNUTH
Semantics of context free languages
Mathematical Systems Theory 2-2 127-145 June 1968
6 T E CHEATHAM JR K SATTLEY
Syntax-directed compiling
Proceedings of the AFIPS Spring Joint Computer
Conference Spartan Books Inc Washington 1964


7 T G EVANS
A grammar controlled pattern analyzer
Proceedings of the IFIP Congress Edinburgh 1968
8 A C SHAW
The formal description and parsing of pictures
Report No 84 Stanford Linear Accelerator Center
Stanford California 1968
9 A C SHAW
A formal picture description scheme as a basis for picture
processing systems
Information and Control 14 9-52 January 1969
10 A C SHAW
Parsing of graph representable pictures
JACM 17-3 453-81 July 1970
11 R S LEDLEY ET AL
FIDAC Film input to digital automatic computer and
associated syntax directed pattern programming system
In Optical and Electro Optical Information Processing
J Tippett et al Eds MIT Press Cambridge Mass 1965
12 R S LEDLEY
High speed automatic analysis of biomedical pictures
Science 146 October 1964
13 L D MENNINGA
A syntax-directed approach to the recognition and
description of visual images
Tech Rep TR 70-10-06 University of Washington
Seattle 1970

Computer pattern recognition of printed music*
by DAVID S. PRERAU
Department of Transportation/Transportation Systems Center
Cambridge, Massachusetts

INTRODUCTION
A major area of concentration of pattern recognition
research has been the design of computer programs to
recognize two-dimensional visual patterns. These
patterns may be divided into two classes: 1 patterns
representing objects in the real world (e.g., landscapes,
blood cells) and patterns representing conventionalized
symbols (e.g., printed text, maps). The standard
notation used to specify most instrumental and vocal
music forms a conventionalized, two-dimensional,
visual pattern class.
This paper will discuss computer recognition of the
music information specified by a sample of this standard
notation. An engraving process is generally used to
produce printed music, so the problem can be termed
one of computer pattern recognition of standard
engraved music (though the recognition procedure will,
of course, be effective for music printed by any method).
The overall process is illustrated in Figure 1. A sample
of printed music notation is scanned optically, and a
digitized version of the music sample is fed into the
computer. The digitized sample may be considered the
data-set sensed by the computer. The computer performs the recognition and then produces an output in
the Ford-Columbia music representation. Ford-Columbia is an alphanumeric language isomorphic to
standard music notation. It is therefore capable of
representing the music information specified by the
original sample. (An example is shown in Figure 2.)
In the form of a Ford-Columbia alphanumeric string,
the output of the program can be used as input to music
analysis programs, 2 music-playing programs, composer
aids, Braille-music printers, music displays, commercial music printers, etc.

Figure 1-The overall process

THE PROGRAM
The product of this research is a computer program
which recognizes standard engraved music notation.
The program is called the Digital Optical-Recognizer of
Engraved-Music Information, or DO-RE-MI.
DO-RE-MI is written using two computer languages,
MAD and SLIP. MAD is a FORTRAN-like language,
and SLIP is a list-processing language. The version of
SLIP used in this study is embedded in MAD, and the
combination of the two languages may be considered an
extended MAD language with list-processing capability.
DO-RE-MI was programmed on the MIT Compatible
Time-Sharing System (CTSS), utilizing the IBM 7094
computer.
The set of symbols used in standard music notation is
large. Therefore, it was decided that DO-RE-MI
would be designed to recognize only a subset of these
symbols; but a subset that would include all the more
important symbols of music notation. This allows the
program output to have practical utility, since many
applications do not require full recognition. (For
example, an analysis of the statistical frequency of
occurrence of different pitch intervals would require
only recognition of notes, clefs, accidentals, and key-signatures.) Moreover, the program is designed to be

* This paper is based on a thesis submitted in partial fulfillment
of the requirements for the degree of Doctor of Philosophy at the
Massachusetts Institute of Technology, Department of Electrical
Engineering, September 1970; the work was supported in part by
the National Institutes of Health.


IG IK1- tM2:4 (27E Z6E) 291Q / RE 21Q.

Figure 2-An example of the Ford-Columbia music representation

modular. Thus, it should not prove too difficult to add
to DO-RE-MI, if desired, the capability of recognizing
any of the secondary notational parameters not now
recognized. This is in contrast to a previous work on
computer recognition of printed music by Pruslin3
which dealt with only a small subset of music notation
and was not readily extendable to the remaining notational symbols.
DO-RE-MI is divided into three sections: Input,
Isolation, and Recognition. In brief, the Input Section
inputs a sample of standard engraved music notation
to the computer, the Isolation Section isolates the notational symbols, and the Recognition Section performs the recognition. A flow-chart of the program is shown in Figure 3.


INPUT
The Input Section takes a sample of music notation
and, using a flying-spot scanner, digitizes the sample.
Several samples of source music were selected, and a
positive transparency made for each. Each sample was
chosen to contain two to three measures of duet music.
Such a sample will be called a "picture". The scanner
used is SCAD,4 which was developed by Dr. Oleh
Tretiak at M.I.T. SCAD will scan a transparency,
measure the light transmission at a large number of
points on the transparency, and convert these measurements to digital form.
For each picture, SCAD scans a raster of 512 rows by
512 columns, an inch in the original sample corresponding to about 225 points. SCAD finds a 3-bit
number, T, corresponding to the transmittance of
light through each raster point. The original picture
contained, at least ideally, only black regions and white
regions. It is therefore reasonable to compress the data

(Figure 3 content: the DO-RE-MI flowchart, showing the gray-level matrix, the packed black-white matrix, fragments, components with possibility lists, and ordered components.)

Figure 3-DO-RE-MI flowchart

Figure 4-Printout of a picture

Computer Pattern Recognition of Printed Music

to one bit per point. This is done by choosing a threshold, θ, and considering all points with transmittance T > θ to be "White", and all points with transmittance T < θ to be "Black". Since the transmittance of most points in the picture will not be near the threshold, the choice of θ is not crucial.
The result is a matrix whose entries correspond to the digitized points of the music sample. (One such matrix is shown in Figure 4, representing "Black" by a point
and "White" by a blank for display purposes.) Thresholding is done by a routine which is called PACK
since it also packs the matrix for storage economy.
This packed matrix is the data-base for the program.
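The thresholding and packing step can be pictured with the short sketch below (present-day Python, illustrative only; the original PACK routine belongs to the MAD/SLIP implementation, and the names used here are ours).

def pack_picture(transmittance, theta):
    """Threshold a matrix of 3-bit transmittance values against theta and pack
    each row of Black/White decisions into an integer bit string
    (1 = Black, 0 = White)."""
    packed = []
    for row in transmittance:
        bits = 0
        for t in row:
            bits = (bits << 1) | (1 if t < theta else 0)   # T < theta -> Black
        packed.append(bits)
    return packed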

ISOLATION
As shown in Figure 3, the packed Black-White
matrix produced by the Input Section is passed to the
Isolation Section. The function of this Section is to
isolate each of the music notational symbols represented
in the matrix, so that the program can then attempt to
recognize individual symbols (rather than having to
deal with arbitrary groups of symbols or with parts of
symbols). The symbols must be isolated from the staff-lines upon which they are superimposed, and from each
other. The staff-lines can be considered as graphical
interference; they connect symbols that would normally
be disconnected, they camouflage the contours of
symbols, and they fill in symbol areas that would
normally be blank. Thus, recognition is greatly simplified if the symbols are extracted from the staff-lines.
This is a problem of pattern recognition in the presence
of qualitatively defined interference, as the graphical
positions of the staff-lines are qualitatively defined
(i.e., five lines, nominally horizontal, straight, and
equally spaced) .
The process of symbol isolation must not destroy or
significantly distort the original picture, for such
destruction or distortion will make recognition difficult
if not impossible. A method of isolation with minimal
picture distortion has been developed. This method,
called Fragmentation and Assemblage, is illustrated in
Figure 5. First, the symbol fragments falling between
staff-lines are picked out from the staff-line background
by a relatively complex Fragmentation procedure. Each
fragment is obtained by finding a list of the points that
make up its contour (considering its intersections with
staff-lines as part of its contour). Then, these fragments
are assembled together again by associating the fragments into sets called picture components, each picture
component corresponding to exactly one of the music
notational symbols of the original sample. In essence,
this procedure reforms the symbols, but without the

155

Figure 5-Fragmentation and Assemblage (small dots indicate connections)

staff-line interference. The reformed symbols are
isolated from each other, as desired.
For each fragment, a SLIP-list is produced containing ROWMAX, ROWMIN, COLMAX and
COLMIN-the extreme rows and columns occupied
by the fragment. This information is readily found from
the fragment's list of contour points, and defines a
rectangle bounding the fragment. The SLIP-list also
contains fragment interconnection data.
In addition, a SLIP-list is produced for each component, containing a listing (by number) of the fragments that make up the component. It is then easy to
find the overall ROWMAX, ROWMIN, COLMAX
and COLMIN of the component from the data in the
fragment SLIP-lists. (For example, the COLMIN of
the component is just the minimum of the COLMINs
of the fragments.) In this way, a bounding rectangle is
found for each component.
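The calculation is as simple as the following sketch suggests (present-day Python, illustrative only; the field names follow the text and the function name is ours).

def component_bounds(fragment_ids, fragments):
    """The bounding rectangle of a component is found directly from the
    ROWMAX/ROWMIN/COLMAX/COLMIN already stored for each of its fragments."""
    return {"ROWMIN": min(fragments[f]["ROWMIN"] for f in fragment_ids),
            "ROWMAX": max(fragments[f]["ROWMAX"] for f in fragment_ids),
            "COLMIN": min(fragments[f]["COLMIN"] for f in fragment_ids),
            "COLMAX": max(fragments[f]["COLMAX"] for f in fragment_ids)}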
The Fragmentation and Assemblage method is somewhat intricate and will be discussed in more detail in a
future paper. However, it is important here to note
the data reduction that it accomplishes. As illustrated
in Figure 6, a picture is originally stored as a huge
(250,000 point) matrix. The fragment data for the
picture is stored as a fairly long table of contour points


Figure 6-A picture, its fragments, and its bounding rectangles

(about 5000 to 10,000 contour points per picture).
The corresponding SLIP-lists contain only the connection and bounding rectangle data (about 1000 list-entries per picture). In each step, the amount of data
has been significantly reduced.
As will be seen, the small amount of easily-found
information in the fragment and component SLIP-lists
is enough, in most cases, to enable complete recognition
of the components.
It was originally thought that each symbol would
have to be fully reconstructed after Fragmentation by
combining its fragments graphically, and filling in the
spaces left by the staff-lines. This process was not
needed. Surprisingly, the information in the Black-White picture matrix was never needed after Fragmentation. Even more surprisingly, the information in
the fragment contour-point listings was needed only in
five very special cases. Except for these special cases, all
recognition can be done just from the ROWMAX,
ROWMIN, COLMAX, COLMIN, and connection
information in the SLIP-lists! This result points out
another benefit of the Fragmentation-Assemblage
method of symbol isolation: almost all the recognition
tests can be performed on the relatively small data
base of the SLIP-lists, rather than on the larger data
base of the Fragment point-listings, or the even larger
data base of the picture matrix.

PRELIMINARY FILTERING
The symbols of the picture having been isolated, the actual recognition procedure can begin. First, it is interesting to note that certain familiar techniques cannot be used. In standard music notation there are some symbols that are not characters, in that they have one or more graphical parameters whose value may be different for each occurrence of the symbol. Consider Figure 7. On the top staff there are two occurrences of
the same symbol. However, the two are not geometrically identical, since the spacing is affected by the notes
on the bottom staff. Thus, these symbols are not
characters. Many of the techniques used for recognition
of characters, such as template-matching, cannot be
used to recognize non-character symbols. Such techniques are therefore not very useful in the recognition
of music notation. Alternative techniques must be
employed.
Music notation has many strong syntactic properties.
In combination with small lists of possibilities for each
unknown symbol, these syntactic properties can be used
with good results to recognize each component. Thus, it
was decided to use a preliminary filter to reduce the set
of possible symbols corresponding to the unknown to a
small subset, and then to use the syntactic properties
for unique classification.
Upon consideration of the form of the data after
Fragmentation and Assemblage (i.e., the SLIP-lists)
and of the graphical properties of music notational
symbols, it was decided that a simple but powerful preliminary filtering could be obtained by examination
of the normalized overall size of the component. A close
look at the symbols of standard music notation reveals
that each symbol type is significantly different in overall
height or overall width from almost all others.
In order to find the size range of each music symbol, a
survey of standard music notation was made. Many
samples of each type of notational symbol were measured, and the overall height and width of each symbol
tabulated. These measurements must be normalized
since different samples of music will in general be of
different scale. The average height of a staff-space
(which shall be denoted as SPHGT) appears to be a
good normalization factor, so that by dividing the
physical dimensions by SPHGT, the measurements can

Figure 7-Non-character symbols in music notation


be converted into ratios that are independent of
absolute dimensions.
Each normalized pair of measurements is plotted in a
normalized-height vs. normalized-width property-space.
A region is then delineated in the property-space for
each type of music symbol. This region is chosen to be
the smallest rectangular region enclosing all the plotted
points for that symbol. The symbol regions in the
property-space are shown in Figure 8. The figure
shows, for example, that the measured natural signs
were always approximately 0.6 to 0.8 SPHGTs in
width, and 3.2 to 3.9 SPHGTs in height. Note from the
figure that there is very little overlap of the regions.
In finding the list of possible assignments for an
unknown component, it appears far better to include a
few extra possibilities than to leave out the correct one.
Therefore, it seems reasonable to enlarge the rectangular
regions of the property-space to allow additional

(Figures 8 and 9 content: plots of normalized height (Height/SPHGT) against normalized width (Width/SPHGT); dotted lines are included for clarity only.)

Figure 8-Normalized-height vs. normalized-width property-space

Figure 9-The H-W space (The normalized-height vs. normalized-width space with enlarged regions)

tolerance. This allows for processing effects due to the
scanner and quantization, and for such characteristics
of printed music as the widening of a symbol-line when
it crosses a staff-line, etc. The normalized Height-Width property-space with the enlarged regions is called "the H-W space". This is shown in Figure 9.
Note that the number of overlapping regions at a
point is usually between three and five, with a maximum
of eight. The points with the higher numbers of overlaps
usually occur where the corners of many regions meet.
Since most plotted points for unknown components will
be found near the middle of the regions of their corresponding symbols, the areas of the property-space
corresponding to the larger numbers of overlaps are
rarely encountered in practice.
For every tested picture, the point in the H-W space
plotted for each component fell in the region representing the symbol actually corresponding to that


component. Since the H-W space contains enlarged
regions, and since it uses normalized values, it is
reasonable to use the same H-W space for recognizing
any sample of music notation. However, if for certain
printed music it is found that the regions in the H-W
space are not properly delineated, it is only necessary to
find an H-W space based on this new set of symbols.
Changing the H-W space in DO-RE-MI requires
nothing more than changing a few parameter values.
Or, one might store several H-W spaces, and then call
the one corresponding to the music style in which the
sample has been printed. This ability to change the
H-W space so easily is an attribute of the modularity
of the program.
The Get-Possibilities Routine (GETPOS) performs
the preliminary filtering. Given the H-W space, a short
list can be generated of the symbols that can possibly
correspond to a given unknown component. This is
done by finding the point in the property-space representing the normalized-width and the normalizedheight of the component. The regions in which this
point falls specify the music symbols which can correspond to the component. When the Possibility List has
been found by GETPOS, it is added to the SLIP-list of
the component.
It is interesting to note the power of the preliminary
filter. It is based on a simple overall property of the
component, normalized size, and is completely independent of the internal features of the component.
Yet, it is able to reduce the number of possible symbols
corresponding to the component usually to about three
to five with a maximum of eight. In addition, use of the
data found in Fragmentation and Assemblage makes the
determination of the normalized overall size a trivial
calculation.
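A sketch of this filtering step (present-day Python, illustrative only; the region bounds would come from a survey such as the one behind Figure 9, and all names here are ours) is given below.

def get_possibilities(component, hw_space, sphgt):
    """Normalize the component's overall size by the staff-space height SPHGT
    and return every symbol whose (enlarged) rectangular region contains the
    plotted point.  hw_space is a list of (symbol, w_lo, w_hi, h_lo, h_hi)
    region bounds in SPHGT units."""
    width = (component["COLMAX"] - component["COLMIN"]) / sphgt
    height = (component["ROWMAX"] - component["ROWMIN"]) / sphgt
    return [sym for sym, w_lo, w_hi, h_lo, h_hi in hw_space
            if w_lo <= width <= w_hi and h_lo <= height <= h_hi]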
ORDERING
The components have been found by Fragmentation
and Assemblage independent of their positions on the
staves. It is therefore necessary to associate each
component with the staff from which it was extracted,
and to find the left-to-right order of the components
within each staff. Knowledge of this ordering is needed
to produce the final output sequence. In addition, it is
useful because two important graphical features of
symbols in music notation can be ascertained from the
order information: immediate symbol context and
symbol nestedness.
The Order Routine (ORDER) finds the left-to-right
ordering of the components in each staff, forming an
order list for each staff. Due to the two-dimensionality
of the placement of music symbols, the ordering process
is not simple. (This is in direct contrast to printed text

ORDERING BY LEFTMOST POINT:
COMP(1), COMP(4), COMP(2), COMP(3).
ORDERING BY CENTER OF MASS:
COMP(4), COMP(1), COMP(2), COMP(3).
ORDERING BY THE "ORDER" ROUTINE:
COMP(1)-L, COMP(4)-L, COMP(4)-R,
COMP(2)-L, COMP(2)-R, COMP(1)-R,
COMP(3)-L, COMP(3)-R.

Figure 10-An example of three different ordering methods

where the characters are in a one-dimensional string,
and where the ordering process is trivial.) Consider
the music sample of Figure 10. Note, for example, that
Component 2 is neither to the left nor to the right of
Component 1, but that a more complex two-dimensional
relationship exists between them. Thus, ordering the
components by one-dimensional techniques will not be
satisfactory. An ordering by leftmost point would give the sequence: COMP(1), COMP(4), COMP(2), COMP(3). An ordering by center of mass would give: COMP(4), COMP(1), COMP(2), COMP(3). Neither is a good representation of the situation.
DO-RE-MI orders the components by both their leftmost and rightmost points in the same list. For Figure 10, this method gives the sequence: COMP(1)-Left, COMP(4)-Left, COMP(4)-Right, COMP(2)-Left, COMP(2)-Right, COMP(1)-Right, COMP(3)-Left, COMP(3)-Right. This ordering more accurately represents the overlapping of COMP(4) and COMP(2) by COMP(1).
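The ordering step can be sketched as follows (present-day Python, illustrative only; the function and field names are ours).

def order_components(components):
    """Each component contributes one event at its leftmost column and one at
    its rightmost column, and the events are sorted left to right.
    components maps a component name to its (COLMIN, COLMAX) pair."""
    events = []
    for name, (colmin, colmax) in components.items():
        events.append((colmin, name + "-L"))
        events.append((colmax, name + "-R"))
    return [label for _, label in sorted(events)]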
SYNTACTIC TESTS
Most of the symbols of standard music notation (as
opposed to the alphabetic symbols of printed text)


have a strong set of symbol-to-symbol contextual
syntactic properties. For example, in the standard
music considered by this study, the "flat" sign may
appear in only one of two contexts (as shown in Figure
11):
(1) In a key signature to the right of a clef or
bar-line.
(2) As an accidental to the left of a note.
Any other appearance of a flat is syntactically incorrect.
Also illustrated in Figure 11 is the rule that the first
symbol of a line of music must be a clef. This type of
syntactic property can be used to great advantage in
recognition.
Music notation contains many redundancies and
these too can be used to aid recognition. There are two
types of redundancies: syntactic redundancy and
graphical redundancy. Figure 11 shows an example of
syntactic redundancy. Since the line starts with a
G-clef, the first flat of the key-signature (if any) must
be on the middle line, i.e., a B-flat. Conversely, if the
first symbol of the key-signature is on the middle line,
it must be a flat. Continuing in the same manner, if
the first key-signature symbol is a flat, the second key-signature symbol (if any) must be on the top space,
and must be a flat. An example of graphical redundancy
would be the F-clef, where a component cannot be



recognized as an F-clef, even if it passes all other tests,
unless two other components are recognized as dots and
are in the correct position to be the F-clef dots. (If so,
all three components are considered to form the F-clef
symbol.)
Another type of syntactic property exhibited by
music notational symbols is positional syntax. Many
symbols can occur only in a fixed position on the staff,
or only in a certain region (e.g., above the staff). For
example, the two dots of a repeat sign will always be
found in the two middle spaces of the staff. Also, each
rest has a standard vertical position and thus will
always occupy the same position on the staff (except
for a few special cases.)
The Syntax Routine (SYNTAX) uses syntactic
properties to perform the final recognition of the components. It examines all the components in sequence as
they appear on each staff's order list. For each component, SYNTAX performs tests on every entry on the
Possibility List, and eliminates from the list all those
possibilities which fail any of these tests. Contextual
syntactic properties are tested. Some feature tests
must be employed, but only when the other three types
of tests still yield ambiguities. This occurs, for example,
when a sharp, a flat or a natural appears as an accidental
(i.e., to the left of a note). In this usage, the three
symbols can be syntactically equivalent. Since they are
approximately the same size and would occupy approximately the same vertical position, they often cannot be
differentiated by the H-W space or by position tests.
In this case, feature tests must be used. As will be seen,
a simple feature test is used to differentiate among
the three.
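The elimination step can be pictured schematically as follows; this is only a sketch with invented names, not the program's actual code (which operates on SLIP-lists):

    # Schematic sketch of pruning a component's Possibility List: every
    # applicable test removes the symbol possibilities that fail it.
    def prune(possibility_list, tests, component):
        return [sym for sym in possibility_list
                if all(test(sym, component) for test in tests)]

    # e.g. a positional-syntax test: repeat-sign dots must lie in the two
    # middle spaces of the staff (spaces 2 and 3 in this invented encoding).
    tests = [lambda sym, comp: sym != "repeat-dot" or comp["space"] in (2, 3)]
    print(prune(["flat", "repeat-dot"], tests, {"space": 1}))   # ['flat']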
As shown in Figure 3, SYNTAX calls nine subroutines, each testing a different class of symbol-possibility, e.g., notes, rests, clefs, time-signatures, etc.
SYNTX2 is Part 2 of the Syntax tests, and is called
when one of the five special cases requiring information
from the Fragment contour-point lists is found. In all
other cases, recognition is completed by SYNTAX and
its nine subroutines, and the only information used is
that in the SLIP-lists, i.e., the bounding rectangles on
each fragment and component plus the interconnection
data.

Figure 11-Syntax and redundancy in standard music notation

A REPRESENTATIVE 'SYNTAX' SUBROUTINE

As an illustration of the procedures used in SYNTAX
and SYNTX2, a representative SYNTAX subroutine,
the Sofon Test, will be discussed in some detail.
("Sofon", pronounced so-fon, is a term that is coined
here from the initials "Sharp Or Flat Or Natural" to
denote a symbol which is any of these three, inde-

pendent of whether the symbol appears in a key-signature or as an accidental. No such general term for
this set of symbols exists in the music vocabulary, and
such a term is needed since these symbols can often be
treated together during symbol recognition.) The
Sofon Test (SOFON) tests all components that have
either "sharp" or "flat" or "natural" on their Possibility List.
There are strong contextual syntactic tests, redundancy tests and positional syntactic tests that can be applied to sofons. Thus, a sofon must be either:

1. To the left of a note or a beamed-together note-group, and close to it (i.e., within 1 SPHGT horizontally), and at least partially overlapping it vertically.
2. Overlapped horizontally by a beamed-together note-group, and at least partially overlapping it vertically.
3. To the right of a clef or bar-line and on the correct pitch-space. (The "pitch-space" of a sofon is defined as the pitch of the line or space which the sofon is "on." This is determined by the vertical position of the sofon and the clef currently in effect.) In this case, the correct pitch-spaces are:
   For a sharp: F
   For a flat: B
   For a natural: F or B
4. To the right of another of the same sofon type, and close to it, and on the correct pitch-space. The sequence of correct pitch-spaces for key-signature sofons is as follows:
   For sharps: F, C, G, D, A, E, B
   For flats: B, E, A, D, G, C, F
   For naturals: Either of the above sequences.
   The initial sharp or flat of a key-signature can also be to the right of a natural.

In the first two cases the sofon is used as an accidental; in the latter two cases it is used in a key-signature. Typical situations are illustrated in Figure 12.

Figure 12-Situations where sofons occur
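A small sketch of how the key-signature sequence rule (case 4 above) can be checked; the names and encoding are invented, and the actual test also uses the clef in effect and the 1-SPHGT proximity conditions:

    # Illustrative check of rule 4: successive key-signature sofons must fall
    # on a fixed sequence of pitch-spaces.
    SHARP_ORDER = ["F", "C", "G", "D", "A", "E", "B"]
    FLAT_ORDER = ["B", "E", "A", "D", "G", "C", "F"]

    def valid_key_signature(sofon_type, pitch_spaces):
        order = SHARP_ORDER if sofon_type == "sharp" else FLAT_ORDER
        return list(pitch_spaces) == order[:len(pitch_spaces)]

    print(valid_key_signature("flat", ["B", "E", "A"]))   # True: three flats
    print(valid_key_signature("flat", ["B", "A"]))        # False: E is skipped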
Since accidental sofons are syntactically equivalent,
the three types of sofons often cannot be separated
from each other by the above tests (though SYNTAX
will eliminate all other possibilities from the Possibility
List). In this case, a feature test must be used.
After consideration of many other separation

algorithms, it was noted that the three sofons could be
separated by examining only their top and bottom
points, as follows (see Figure 13) :
(1) Sharp vs. Flat; Sharp vs. Natural:
-For a sharp, the topmost point occurs in the
rightmost two-thirds of the component width.
-For a flat or natural, the topmost point occurs
in the leftmost one-third of the component
width.
(2) Flat vs. Natural
-For a flat, the bottommost point occurs in the
left half of the component width.
-For a natural, the bottommost point occurs in
the right half of the component width.

Figure 13-Sofon separation

Thus, examination of the topmost point of the component separates sharps from flats and naturals, and
examination of the bottommost point separates flats
from naturals.
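In outline, the test amounts to the following sketch (hypothetical column coordinates; the program takes them from the bounding rectangles and, in the special cases, from the fragment contour-point lists):

    # Illustrative sofon separation: classify by where the topmost and
    # bottommost points fall across the component's width.
    def classify_sofon(left, right, top_col, bottom_col):
        width = right - left
        if top_col - left > width / 3.0:       # topmost point in rightmost two-thirds
            return "sharp"
        if bottom_col - left < width / 2.0:    # bottommost point in left half
            return "flat"
        return "natural"                       # bottommost point in right half

    print(classify_sofon(0, 12, top_col=8, bottom_col=3))   # 'sharp'
    print(classify_sofon(0, 12, top_col=2, bottom_col=3))   # 'flat'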
The sofon separation test requires finding the topmost and bottommost points of the component, and
these can be found easily from the contour-point lists.
This is one of the cases where it may be necessary to
employ SYNTX2 and the fragment contour-point lists.
Point lists for only two fragments must be examined:
the fragments containing the ROWMIN and the
ROWMAX of the component. In fact, often these
fragments fall completely on one side or the other of the
two-thirds or one-half points of the component's width.
When this is so, the sofon separation can be made solely
from the information in the SLIP-lists. Then, there is
no need to look at the contour-point lists, and SYNTX2
does not have to be called at all.
OUTPUT
The Output Routine (OUTPUT) takes the music
which was recognized by the Syntax Routine (SYNTAX
and SYNTX2) and produces the final DO-RE-MI
output according to the Ford-Columbia Music Representation. As has been mentioned, Ford-Columbia is an
alphanumeric language which is substantially isomorphic to standard music notation. It was developed by a
musician, Stefan Bauer-Mengelberg.5 OUTPUT stores
the Ford-Columbia representation of the recognized
music on a SLIP list. For example, a printout of the
final output for the picture of Figure 4 is shown in
Figure 14. (Output restrictions on SLIP-lists in CTSS do not allow accurate printing of all the symbols of the Ford-Columbia character set; therefore, a modified Ford-Columbia is printed out with commas representing blanks and apostrophes representing exclamation points.) In this printout, "101" indicates that the first instrument's line begins, "26E" indicates an Eighth note on space 26 (the third space on the top staff), parentheses indicate beams, "I" indicates a bar-line, etc.

Figure 14-Output for the picture of Figure 4

RESULTS OF TEST RUNS

DO-RE-MI was tested on some representative
pictures, including the one shown in Figure 4. In all
cases, the program produced the desired Ford-Columbia
representation of the input picture with complete
accuracy. All symbols in the subset of music notation
considered by DO-RE-MI were correctly recognized.
In addition, all symbols not in that subset were correctly
recognized as such.
Some overall statistics on the test-runs: DO-RE-MI
correctly isolated all the music notational symbols into
a total of 137 components. These components were
formed by 527 fragments. All 137 components were
correctly recognized by DO-RE-MI (including thirteen
components which were each a group of three to four
notes beamed together). An average test-run took
about four minutes, from packed Black-White matrix to
Ford-Columbia output, for a test picture of two to three
measures of duet music (i.e., four to six single measures
of music). This time figure could be significantly
reduced by various means; however, minimization of
run-time was not a major goal of this work.
CONCLUSION
This work is an investigation into computer recognition
of a class of conventionalized, two-dimensional, visual
patterns: standard engraved music notation. Important
aspects of the problem involve pattern recognition in
qualitatively-defined interference, recognition of positionally two-dimensional patterns, use of syntax and
redundancy properties in recognition, and recognition
of non-character symbols. A simple preliminary filter

proved very effective. Bounding rectangles on fragments and components unexpectedly provided all
necessary data for recognition in almost all cases, and
thus examination of detailed feature properties was
rarely required. The program produced should be able
to be expanded to the recognition of all printed music.
In addition, the pattern recognition techniques developed can possibly be applied to computer recognition
of such things as maps, graphs, organic chemistry
symbols, circuit diagrams, blueprints, aerial photographs, etc.
ACKNOWLEDGMENT
I would like to gratefully acknowledge the encouragement, support, and counsel of Professor Murray Eden of M.I.T., who guided me throughout the course of this research, and of Professor Allen Forte of Yale, Dr. Oleh Tretiak of M.I.T. and Professor Francis Lee of M.I.T. The interest and assistance of the members of the Cognitive Information Processing Group of the M.I.T. Research Laboratory of Electronics is also greatly appreciated.
REFERENCES
1 M EDEN
Recognizing patterns
P Kolers and M Eden editors Chapter 8
The MIT Press Cambridge MA 1968
2 H B LINCOLN
The current state of music research and the computer
Computers and the Humanities September 1970
3 D H PRUSLIN
Automatic recognition of sheet music
Doctoral Thesis MIT Cambridge MA January 1967
4 O J TRETIAK
Scanner display (SCAD)
Quarterly Progress Report, No. 83 MIT Research
Laboratory of Electronics October 1966
5 S BAUER-MENGELBERG
(of the IBM Systems Research Institute New York City)
Representing music to a computer: A primer of the
Ford-Columbia music representation (DARMS)
Manuscript in preparation

A storage cell reduction technique for ROS design
by Dr. C. K. TANG
International Business Machines Corporation
Endicott, New York

INTRODUCTION

A Read Only Store (ROS) of n inputs (n-bit address) and m outputs (m-bit words) stores 2^n × m bits of information, and is abbreviated as a (2^n × m)-bit ROS. The use of monolithic arrays as a ROS depends upon an effective store (write once) procedure (or personalization of the ROS). Two procedures are currently being considered: the personalization by final metallization pattern (masking),1 and the personalization by post-metallization connection elimination technique (etching or zapping).1,2 This study describes a method of designing a monolithic ROS which will reduce the number of storage cells required in the former method of personalization.

The ROS personalization pattern, which is usually described by a truth table of 2^n rows (where n is the number of variables of the function to be implemented by the ROS), is first expressed as a function completely expanded with respect to all of its n variables, i.e., a function in the standard sum form. The incomplete expansion of the same function with respect to less than n variables is used to show that the number of storage cells in the ROS can be reduced from 2^n to either 2^{n-1}, or 2^{n-2}, or 2^{n-3}, etc.; this is made possible by using a few additional logic gates in the ROS to implement a set of selected simple functions (called functional sets). As an extension of the previous method of simple expansion, a double expansion method is also presented; this enables the expansion of the function with respect to a smaller number of variables to be practical.

THE USE OF A FUNCTION SET

A ROS can be represented in block diagram form as shown in Figure 1. The inputs cause only one of the 2^n decode outputs to be energized; the energized decode output selects the m storage bits to the output through the sense amplifier.

Figure 1-(2^n × m)-bit ROS (block diagram: n inputs (addressing bits), decode, 2^n × m storage cells, sense amplifiers, m outputs (word read-out))

If one takes a logic viewpoint of the ROS, then a (2^n × m)-bit ROS can be expressed as m logic functions of n input variables, and each of the m logic functions corresponds to one of the m outputs of the ROS. Let i_1, ..., i_n be the n inputs to the ROS; then any one of the m outputs can be expressed as a logic function F of the n inputs, and F can always be expanded with respect to all the input variables as follows:

F = F(i_1, ..., i_n)
  = \bar{i}_1 \bar{i}_2 ... \bar{i}_n F(0, 0, ..., 0) + \bar{i}_1 \bar{i}_2 ... i_n F(0, 0, ..., 1) + ... + i_1 i_2 ... i_n F(1, 1, ..., 1)    (1)

where \bar{i}_1 denotes the negation of i_1.

Each of the 2^n products of the form i_1* i_2* ... i_n* in Equation (1) (where i* represents either i or \bar{i}) is one of the decode outputs, and hence the corresponding residue function denotes the stored bit (0 or 1) associated with that decode output. If personalization by masking is used, the storage of the information bit 1 or 0 is usually done by the connection or disconnection of a cell in the ROS (where the cell is usually a transistor or the emitter of a multiple-emitter transistor). Hence, there are 2^n cells required for each output bit of the ROS. These cells are OR gated to provide the desired function F.

Now, if the function F is expanded with respect to n-1 of its n inputs, then there are only 2^{n-1} product terms; i.e.,

F = F(i_1, ..., i_n)
  = \bar{i}_1 \bar{i}_2 ... \bar{i}_{n-1} F(0, 0, ..., 0, i_n) + \bar{i}_1 \bar{i}_2 ... i_{n-1} F(0, 0, ..., 1, i_n) + ... + i_1 i_2 ... i_{n-1} F(1, 1, ..., 1, i_n)    (2)

Let V_1(i_n) be the set of all functions of the single variable i_n. The outputs of V_1(i_n) will be (0, 1, i_n, \bar{i}_n), and V_1(i_n) will be called the function set of the single variable i_n. Each of the residue functions in Equation (2) is a
function of the single variable i_n, and hence is included in the function set V_1(i_n). Assume the function set V_1(i_n) and the decode with i_1, ..., i_{n-1} as its inputs are available; then one can easily see from Equation (2) that if there exist 2^{n-1} cells and each is capable of performing a two-way AND function, then by OR gating these 2^{n-1} cells, the arbitrary function F(i_1, ..., i_n) is obtained.
The above statement suggests the construction of a (2^n × 1)-bit ROS. The selection of a particular function from the four functions of V_1(i_n) for each cell is done by making a connection in the final masking from each cell to an appropriate line that carries the desired function (note there are four lines carrying 0, 1, i_n, \bar{i}_n). By sharing the four lines carrying V_1(i_n) and the (n-1)-input decode, a (2^n × m)-bit ROS can be constructed by using 2^{n-1} × m storage cells. This arrangement is
shown in the block diagram of Figure 2. Note that the associated decode circuits should also be simpler, since an (n-1)-input decode is required instead of an n-input decode. Also, note that the function set V_1(i_n) does not take any extra circuits to build, because 0 and 1 stand for the ground and the power supply voltage, respectively, and both i_n and \bar{i}_n are necessary in building the n-input decode in a conventional ROS. In some conventional ROS designs the decoding of the n input variables may not be as explicit as that indicated by Figure 1, but the idea of using the V_1 function set can still be applied with a resultant saving of storage cells. This will be illustrated for the design of a current switch emitter follower ROS and a T2L ROS in Appendix A and Appendix B, respectively. The existence of the simple cell that can perform a two-way AND is also illustrated in the appendices.
The use of the function set of a single variable can
be extended to the function of more than one variable.
For example, let V_2(i_{n-1}, i_n) be the function set of two variables i_{n-1} and i_n, defined as the set of all functions of the two variables i_{n-1} and i_n. There are 16 functions of two variables. The expansion of the function F with respect to the first n-2 inputs is given as follows:

F = F(i_1, ..., i_n)
  = \bar{i}_1 \bar{i}_2 ... \bar{i}_{n-2} F(0, 0, ..., 0, i_{n-1}, i_n) + ... + i_1 i_2 ... i_{n-2} F(1, 1, ..., 1, i_{n-1}, i_n)    (3)
Each of the residue functions is a function of two
variables, i_{n-1} and i_n, and is obtainable from the function set V_2(i_{n-1}, i_n). There are only 2^{n-2} products in Equation (3) and so now only 2^{n-2} storage cells are required. Each cell should also be able to perform the
two-way AND function and can be OR gated.
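The saving can be made concrete with a small sketch of my own (not the paper's circuit): personalize 2^(n-1) cells, each holding a member of V_1(i_n), and OR the selected outputs.

    # Illustrative only: realize an arbitrary n-input function with 2^(n-1)
    # "cells", each personalized to one member of V1(i_n) = {0, 1, i_n, not i_n},
    # instead of 2^n single-bit cells.
    from itertools import product

    def personalize(truth, n):
        """truth maps each n-bit input tuple to 0/1; returns one V1 choice per
        (n-1)-bit decode line."""
        table = {(0, 0): "0", (1, 1): "1", (0, 1): "i_n", (1, 0): "not i_n"}
        return {prefix: table[(truth[prefix + (0,)], truth[prefix + (1,)])]
                for prefix in product((0, 1), repeat=n - 1)}

    def read(cells, bits):
        """Output for a full n-bit address: the energized decode line's cell,
        evaluated on the last input bit."""
        prefix, i_n = tuple(bits[:-1]), bits[-1]
        return {"0": 0, "1": 1, "i_n": i_n, "not i_n": 1 - i_n}[cells[prefix]]

    n = 3
    truth = {bits: sum(bits) % 2 for bits in product((0, 1), repeat=n)}  # odd parity
    cells = personalize(truth, n)          # 4 cells rather than 8
    assert all(read(cells, bits) == truth[bits] for bits in truth)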
Obviously the use of the function set can be extended
to any number of variables k, where k < n.

(M > 2) an N-digit shift register needs approximately NM/(M-1) register circuits (N >> M), as compared with 2N circuits
in the conventional ones. The extreme of this approach
is that N + 1 registers and controls are needed to shift N
digits of information. A slightly different approach but
with the same limit has been used in the design of MOS
dynamic random-access memories. 7,10 In terms of circuit
count, the number of register circuits is reduced by
(M-2)/2(M-1) × 100 percent. The controls, on the
other hand, increased from two to M. To obtain a net
benefit from this approach, the hardware cost of the
added controls has to be small compared with the saving
on registers. This is justified in most cases if a large
quantity of shift registers is used in a system.
In LSI technology the gain of circuit reduction by
using more controls is again related to two factors: (1)
the area reduction of shift register chips, and (2) the
cost for more inputs/outputs on the chip for the added
control lines. The area reduction of the chip is always
smaller than the (M-2)/2(M-1) × 100 percent (reduction of the circuit count). Two factors contribute to this: (1) The added controls always cost some chip
area. (2) Shift registers occupy a major portion of a

chip but not the complete chip. Some of the area is used
by control circuits and pads for inputs/outputs. Only
the array area occupied by the shift registers will be
considered in this paper.
The chip area needed to accommodate the added
control lines is a function of the register circuit configurations, the ground rules of the chip layout, and the
device technology. To obtain meaningful general results,
the following assumptions are used as the model of
analysis:

1. The direction of data flow on the chip is perpendicular to the physical lines of the control signals.
2. The area taken by a single register circuit is increased δ times if an additional control line passes over it.
3. No crossovers are permitted among control lines.
All these assumptions are more or less based on
single-layer metal interconnections being used on the
chip.
Figure 3 shows an example of the model. Controls A
and C are commonly used by both the upper and lower
row of registers and thus no additional area is needed
for these two lines. Control lines Band D, however, do
cost some additional area. When M is even, a maximum
of two controls do not take additional area. When M is
odd, only one control does not cost extra area. The total
array area reduction thus depends on whether M is
even or odd:
Percentage Area Reduction ≈ [M - 2 - (M - 1)δ] / [2(M - 1)] × 100 percent,  M = odd number > 2    (1)

Percentage Area Reduction ≈ [(M - 2)(1 - δ)] / [2(M - 1)] × 100 percent,  M = even number ≥ 2    (2)

These two equations are plotted in Figure 4. At δ = 0, the area reduction ranges from 25 percent at M = 3 to its asymptotic value 50 percent as M increases. The most significant point shown in this figure is that the reduction of array area is rather insensitive to δ, the fractional area increase of an individual register circuit. This is especially true when M is even. For example, by reorganizing a two-control into a four-control shift register, 20 percent array area reduction can be achieved even if two more controls cause 40 percent area increase for half of the register circuits.

Figure 3-A possible topological arrangement of integrated shift registers where the information flow is perpendicular to the control lines

Figure 4-Array area reduction vs circuit area increase for different number of controls

No distinction has been made so far on whether "M controls" stand for M clock signals or M sets of clock signals. Applications of the new approach on circuits given in Refs. 3-8 all support one result: the approach works best for register circuits that have only one control clock per gated register circuit. Among the circuits given in the references, the bucket-brigade shift register6 and circuits similar to the two-phase capacitor-pullup circuit4 obtain the most area improvement. The exact percentage of improvement depends on the ground rules of the device process. As an example, the bucket-brigade shift register is organized into a two-clock version and a four-clock version, as shown in Figures 5a and 5b. Conventional MOS technology is used.8 In the four-clock version, half the circuits have 30.8 percent area increase because of the two additional clocks, but the array area is reduced 23 percent with respect to the two-clock version.
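A quick numerical check of Equations (1) and (2); the function below is a sketch of mine, with values chosen to reproduce the figures quoted in the text:

    def area_reduction_percent(M, delta):
        """Array area reduction for an M-control shift register; delta is the
        fractional area increase of a register circuit crossed by an added
        control line."""
        if M % 2 == 0:                                            # Equation (2)
            return (M - 2) * (1 - delta) / (2 * (M - 1)) * 100
        return (M - 2 - (M - 1) * delta) / (2 * (M - 1)) * 100    # Equation (1)

    print(area_reduction_percent(3, 0.0))   # 25.0 percent, the M = 3 figure at delta = 0
    print(area_reduction_percent(4, 0.4))   # 20.0 percent, the four-control example above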

Figure 5a-Two-clock version of bucket-brigade shift register (area/bit = 1.2 mil × 5.2 mil = 6.24 square mils)

Figure 5b-Four-control version of bucket-brigade shift register

Figure 6-Parallel operation with sampled input/output

PARALLEL OPERATION WITH SAMPLED
INPUT/OUTPUT
As mentioned in the previous section, the maximum
data rate is reduced as a result of the multicontrol
approach. To overcome this shortcoming, parallel
operation with sampled input/output can be used.
Figure 6 illustrates a 4-control shift register system
organized in such a way that the inputs and outputs are
sampled by the controls of the shift register (and no additional gating signals are needed). The inputs of the two parallel shift registers are directly dotted together.
This is allowed because the registers are gated by
different controls. The outputs are also dotted directly
together. This is allowed for most FET dynamic shift
registers. Physically, these two shift registers can be

implemented side by side on the chip, or as two separate
ones, either on the same chip or on different chips with
no area penalty. The internal shifting speed is not
changed by using this scheme, but the overall maximum
data rate is doubled in this example. This, in effect,
recovers the maximum data rate of conventional shift
registers, provided the register speed determines the
maximum data rate. In most practical designs, the
drivers of the control clocks determine the maximum
data rate.11 Since the driver for an M-control shift register
drives considerably less load, higher maximum data
rate is actually achieved by using this approach. In
general M /2 M -control shift registers in parallel
operation will have the same or higher data rate than
that of the conventional one, provided M is an even
number. There is no simple way to gain back the maximum data rate if M is an odd number.
The minimum data rate in FET dynamic shift
registers is determined mostly by the leakage from the
storage capacitors. A conventional shift register and
its M-control version have the same minimum data
rate if no parallel operation is used. For parallel operated
M-control FET shift registers, the minimum data rate
is approximately M /2 times higher. At a· given data
rate within the operating range, however, the parallel
operated M-control shift registers consume approximately (M-1) times less power than the conventional
ones.
IMPLICATIONS ON DEVICE FABRICATION
Because of the higher minimum data rate, the present
schemes require lower leakage if both the maximum and
the minimum data rate are required. On the other hand,
since the new schemes offer smaller chip area, a higher
device process-yield is expected. Assuming that Y_0 is
the yield of a conventional shift register array, then
based on pure random defects, its M-control version

should give the following yield:12

Y ≈ Y_0^{[M + (M - 2)δ] / [2(M - 1)]}    (3)

A very interesting situation exists when the M-control
shift register is operated in parallel with sampled
input/output. Assume that two 64-digit 4-control shift
registers are operated in parallel to form a 128-digit shift
register. One of the shift registers has a defect and is
stuck to a fixed state. In the conventional shift register,
one bad register ruins the whole shift register. In the
present case, there is a good chance that a 64-digit shift
register is still usable, assuming proper controls are
used. For instance, by using discretionary wiring to
connect the parallel operation, a good 64-bit shift
register may have a good chance to be recovered. For
M =4, the upper bound of the probability of recovering
one shift register of half length is approximately

Y = 2 Y_0^{(2 + δ)/6} (1 - Y_0^{(2 + δ)/6})    (4)

Figure 7 shows the plot of both Eqs. (3) and (4) at
M = 4 for different δ values as a parameter. Based on
this model, better than 50 percent yield improvement
can be achieved for a 20 percent original yield. In
addition, there is up to 50 percent probability that a bad

shift register will have a recoverable good half-length
shift register. For a more realistic yield model, the
absolute value may be lower, but the general trend is
not expected to change much.
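The two yield expressions can be evaluated directly; the sketch below (not from the paper) reproduces the 20-percent-base-yield example:

    def full_yield(y0, M, delta):
        """Equation (3): yield of the M-control version of an array whose
        conventional yield is y0, assuming purely random defects."""
        return y0 ** ((M + (M - 2) * delta) / (2 * (M - 1)))

    def half_recovery_upper_bound(y0, delta):
        """Equation (4), M = 4: chance that exactly one half-length register
        of a bad chip is still good."""
        p = y0 ** ((2 + delta) / 6)
        return 2 * p * (1 - p)

    print(round(full_yield(0.20, 4, 0.0), 2))               # ~0.34 versus 0.20 conventional
    print(round(half_recovery_upper_bound(0.20, 0.0), 2))   # ~0.49, the "up to 50 percent" figure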
CONCLUSIONS
This paper has described a new approach to improve the
integrated shift registers by using different logic
organization. By increasing the number of controls from
2 (conventional ones) to M, new shift registers can be
organized which potentially offer many advantages.
Higher density, higher yield, fewer number of circuits,
and lower power consumption are among the most
important potential improvements over using conventional shift registers. The study also shows that the
high recovery of partial length M-control shift registers
can be expected as a byproduct of this approach.
In practical applications of the new approaches,
trade-offs are to be considered. The gain in density, in
yield, and possibly in data rate have to be weighed
against the increased number of controls, the number of
I/O interconnections, and the device implications. The
basic idea presented in this paper, however, is quite
general. The reduction of circuit count is always true
and is independent of the type of shift register. It is
therefore expected that this work can be applied to
designs of all shift registers.
The author thanks N. G. Vogl, Jr. for motivating
this study, W. D. Pricer and I. T. Ho for many valuable
discussions, and R. A. Henle and W. Hoffman.
Figure 7-Yields of full and half shift registers for M = 4

REFERENCES

1 R W BOWER H G DILL K G AUBUCHON
S A THOMPSON
MOS field-effect transistors formed by gate masked ion
implantations
IEEE Transactions on Electronic Devices Vol ED-15
No 10 1968
2 L L VADASZ A S GROVE T A ROWE
G E MOORE
Silicon gate technology
IEEE Spectrum Vol 6 No 10 1969
3 R L PETRITZ
Current status of large-scale integration technology
IEEE J Solid State Circuits Vol SC-2 No 4 1967
4 B G WATKINS
A low-power multiphase circuit technique
IEEE J Solid State Circuits Vol SC-2 No 4 1967
5 Fairchild Semiconductor 3320 Product Description
January 1968
6 F L J SANGSTER
Integrated MOS and bipolar analog delay lines using
bucket-brigade capacitor storage
Digest of Technical Paper 1970 ISSCC February 1970

176

Fall Joint Computer Conference, 1971

7 L BOYSEL W CHAN J FAITH
Random-access MOS memory
Electronics Vol 43 No 4 1970
8 F FAGGIN T KLEIN
A faster generation of MOS devices with low thresholds
Electronics Vol 42 No 20 1969
9 G A MALEY M F HEILWEIL
Introduction to digital computers
Prentice-Hall 1968

10 W M REGITZ J KARP
A three-transistor-cell 1024-bit 500 ns MOS RAM
1970 ISSCC Digest of Technical Papers Vol 8
11 M E HOFF S MAZOR
Operation and application of MOS shift register
Computer Design Vol 10 No 2 1971
12 E TAMMARU J B ANGELL
Redundancy for LSI yield enhancement
IEEE J Solid-State Circuits Vol SC-2 No 4 1967

Universal logic modules implemented using LSI
memory techniques
by KENNETH JAMES THURBER and ROBERT ORVAL BERG
Honeywell Systems and Research Center
St. Paul, Minnesota

INTRODUCTION
Large arrays of read only storage (ROS, ROM)
are currently available from semiconductor vendors;1-4,10,11,14,15 however, there is a lack of material
describing potential uses of these blocks of memory.
This is due in part to the haste with which the semiconductor manufacturers place these devices on the
market and in part to the fact that engineers have not
yet realized the full potential presented by these devices
for use as logic.
Previous researchers5,6,17-19 have considered how to
use ROMs as logic devices; however, some of these
approaches have not been fully pursued. 5,16,17 Other
approaches do not take full advantage of the capabilities
offered by the application of such devices. 6,16,17
In Reference 7 Graham points out that one possible
partial solution to the large costs and long development
times associated with custom LSI chips is to replace
random logic designs with ROM; however, this is
probably only part of the total solution. Texas Instruments20 has developed a Programmable Logic Array
(PLA) which could possibly provide one economical
solution to the random logic problem. Semiconductor
manufacturers have described properties of ROM and
some simple applications, but have never attempted to
really investigate the power that is available through
the use of ROM. Several variations of the ROM array
have been introduced, notably, the ROAM and Solid
Logic Technology (SLT) array.8 The SLT array may
be modelled as a special case of the ROAM in which a
variable can appear in either its complemented or
uncomplemented form (but not both) and it appears
in the same form in every minterm. Currently, 4096 bit
MOS single chip ROMs are available. 1, 10, 14,15 A single
chip 8192 bit ROM is currently available.11 In the
next one to two years, it is conceivable that ROM
with two to three times as many bits will be available.
In fact, it is anticipated that every year for the next five years, chip complexity will probably double.7 By 1973, 16,384-bit single chip ROMs should be readily available. Prices are also anticipated to reflect this improved capability, and will probably be less than a penny a bit.4 This should enable the clever logic designer to economically perform extensive logic functions with memory! One obvious use of such devices is table lookup arithmetic; however, this is only one of the many ways they can be used. Four possible uses of read-only devices are given in this paper. These four uses were selected because they illustrate the power of read-only devices, show that "pie-in-the-sky" concepts such as Universal Logic Modules (ULM) may actually be possible, illustrate the future potential of read only devices, illustrate the failings and tradeoffs involved with several read only devices, and propose relatively simple solutions to several problems.

In the next section, use of read only devices in the construction of a universal combinational logic module is given. Next, use of read only devices in the construction of a universal sequential logic module is discussed. Then the use of read only devices in the realization of arbitrary sequential machines is presented. In all cases several different implementations of the devices under consideration are given. Finally, a new type of read only array is introduced which is a hybrid combination of ROM and ROAM. It is shown that for a multiple output function this hybrid implementation is most economical.

A PROGRAMMABLE UNIVERSAL
COMBINATIONAL LOGIC MODULE
Figure 1 (a) shows a section of ROM.* By choosing
V E = 1 and G = 0, it can readily be seen that if line A

* The ROM and ROAMs have been implemented using diodes
in the examples presented in this paper; however, they could have
been just as easily implemented using transistors.
Figure 20-Sizing guidelines for sequential machines made out of ROM

ROAM1 does not need decoders and therefore has more
area for bits. Using these bounds, Figure 19 shows the
number of bits of ROM necessary to construct an
arbitrary sequential chip with the designated values of
n, m, p. The values are only calculated until 16,384 bits
are reached. In interpreting the figure, n is the number
of external input variables, m is the number of output
lines, and p is the number of feedback variables (therefore, a 2p state machine can be realized). The number in
row p of column m of the table for a specific value of n
is the number of equivalent bits of ROM required.
In realizing the machine using only ROM, the cost
was computed to be (2^{n+p})(m+p). Neglecting the
decoder (this is done by allowing only 8192 bits), the
output drive circuitry and flip flops, the cost of realizing
one function is 2^{n+p} and therefore since m+p functions must be realized, a resultant cost of (2^{n+p})(m+p) is
obtained. This can be compared to the cost of the
ROAM1 solution since the same factors were neglected
in both cases. Comparing Figure 20 to Figure 19 it can
readily be seen that ROM is better suited for the
realization of an arbitrary sequential machine.
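That cost bound is easy to restate as a one-liner; this is a sketch of mine, with parameter names following the text:

    def rom_bits(n, m, p):
        """Equivalent ROM bits for an arbitrary sequential machine with n
        external inputs, m outputs and p feedback variables (2**p states):
        (2**(n + p)) * (m + p), neglecting decoder, drivers and flip-flops."""
        return (2 ** (n + p)) * (m + p)

    print(rom_bits(6, 2, 1))   # 384 equivalent bits for n = 6, m = 2, p = 1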
A GENERALIZED LOGIC ARRAY
In the preceding sections ROMs and ROAMs were
used to solve three different problems. As can be seen
from the solutions given, there seems to be no universal
array applicable to all types of logic problems. In
addition for any given problem either ROM or ROAM
seems to fit the problem, but not necessarily both. It
was the author's intent to give a detailed illustration of
the power of memory (particularly read only semiconductor memory) used as logic and to determine
design guidelines for their use; however, there is an
additional array that can be made from read only
memories. Since this memory is an extension of both the
ROM and ROAM, it is felt that this array should be
introduced and an example given which illustrates that
it too has a use. In the preceding sections ROAM was
sometimes used for decoding purposes. The GLA is an
extension of this concept.
Figure 21 (a) shows the Generalized Logic Array
(GLA). The GLA consists of a combination of ROM
and ROAM. From the block diagram of the array
shown in Figure 21 (a), it can be seen that this array can
be specialized to be either a ROM or a ROAM as
previously examined. In order to obtain a ROM
(ROAM), the ROAM (ROM) is not used in the array.
This array, as shown, combines the best properties of
both types of memories; i.e., the table look up (associative property) of the ROAM and the random
access properties of the ROM. The combination of
associative and random access memories has been used

to advantage previously,12 but for a different purpose and in a different manner.

Figure 21(a)-Block diagram of the GLA (the ROAM outputs are assumed to be inverted before they are fed to the ROM, since the ROAM uses negative logic and the ROM uses positive logic)
This array could be applied to the design of a universal
combinational logic module as follows:
• for a ROAM1 implementation delete all ROM and
make the ROAM from ROAM1.
• for a ROAM2 implementation delete all ROM and
make the ROAM from ROAM2.
• for a ROM implementation use the ROAM as a
decoder and the ROM to produce functions that
are most economically implemented in ROM.
This array encompasses both ROM and ROAM and
has as its main benefits the following:
• Combines the best properties of ROM and ROAM.
• Allows the implementation of functions which
require both associative and random access
addressing.
• Is more universal than either ROM or ROAM.
• Limits random wiring on the chip to be external to
the memory areas, thus simplifying the layout task.
• Retains a structure in the sense that the ROAM
and ROM sections can be considered as macro
cells.
The GLA does have a drawback and it is that the
logic designer is going to be required to be more ingenious than before.
The judicious combination of random access and
associative memories seems to be quite promising for
certain types of problems. One of these is illustrated
below.
The problem to be solved here is the implementation
of the three functions f_1, f_2, and f_3. This problem is
taken from page 159 of McCluskey's book. 9 The functions are defined as follows:

f_1(W, X, Y, Z) = Σ(2, 3, 5, 7, 8, 9, 10, 11, 13, 15)
f_2(W, X, Y, Z) = Σ(2, 3, 5, 6, 7, 10, 11, 14, 15)
f_3(W, X, Y, Z) = Σ(6, 7, 8, 9, 13, 14, 15)

The minimal sums for these three functions are as follows (page 164 of Reference 9):

f_1 = \bar{X}Y + W\bar{X}\bar{Y} + \bar{W}XZ + WXZ
f_2 = \bar{W}XZ + Y
f_3 = W\bar{X}\bar{Y} + XY + WXZ

Figure 21(b)-Realization of multiple output function in the GLA
Figure 21(c)-The use of ROM in the GLA to select the appropriate inputs to be ORed together

Realizing the three functions in ROM would require
three 16-bit ROMs plus decoding. The total estimate
(in ROM bits) for a realization of this function is 96
bits (allowing the equivalent of 48 ROM bits for
decoding). A ROAM1 implementation would require
the implementation of nine implicants (you cannot wire OR the WXZ (\bar{W}XZ) term into both f_1 and f_3 (f_1 and f_2), thereby causing you to have to realize all
nine implicants and wire OR them only in their specific
function) at a cost of eight ROM bits per implicant for
a total cost of 72 ROM bits. ROAM2 has a slightly
cheaper realization because \bar{Z} does not appear anywhere. Therefore if a seven variable ROAM2 is available and the variables are assigned as X, \bar{X}, Y, \bar{Y}, Z, W, and \bar{W}, then ROAM2 requires 9 × 7 or 63 ROM bits
for the realization. In this problem it appears that
ROAM2 is the cheapest solution with a cost of 63 and
ROM is the most expensive with a cost of 96; however,
a GLA solution using ROAM2 (ROAM1) can be given
that requires only 60 (66) equivalent ROM bits.
This implementation is shown in Figure 21 (b). This
implementation is based upon the following observations:

• A = A + 0, 0 = 0·X
• there are six distinct multiple output prime implicants: \bar{X}Y, W\bar{X}\bar{Y}, \bar{W}XZ, WXZ, Y, and XY.
• f_1 = \bar{X}Y + W\bar{X}\bar{Y} + \bar{W}XZ + WXZ + 0·Y + 0·XY = \bar{X}Y + W\bar{X}\bar{Y} + \bar{W}XZ + WXZ
• f_2 = \bar{W}XZ + Y + 0·\bar{X}Y + 0·W\bar{X}\bar{Y} + 0·WXZ + 0·XY = \bar{W}XZ + Y
• f_3 = W\bar{X}\bar{Y} + XY + WXZ + 0·\bar{W}XZ + 0·\bar{X}Y + 0·Y = W\bar{X}\bar{Y} + XY + WXZ (see the check below)
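The cover claimed in the observations above can be checked mechanically; the following sketch (not from the paper) evaluates the six multiple output prime implicants against the minterm lists of f_1, f_2 and f_3:

    # A mechanical check that the six multiple output prime implicants
    # realize f1, f2 and f3 as defined by their minterms (W is the high bit).
    def minterms(products, n=4):
        """products: list of dicts var -> required value; returns covered minterms."""
        covered = set()
        for idx in range(2 ** n):
            w, x, y, z = (idx >> 3) & 1, (idx >> 2) & 1, (idx >> 1) & 1, idx & 1
            val = {"W": w, "X": x, "Y": y, "Z": z}
            if any(all(val[v] == req for v, req in term.items()) for term in products):
                covered.add(idx)
        return covered

    Xb_Y    = {"X": 0, "Y": 1}              # X'Y
    W_Xb_Yb = {"W": 1, "X": 0, "Y": 0}      # WX'Y'
    Wb_X_Z  = {"W": 0, "X": 1, "Z": 1}      # W'XZ
    W_X_Z   = {"W": 1, "X": 1, "Z": 1}      # WXZ
    Y       = {"Y": 1}
    X_Y     = {"X": 1, "Y": 1}

    assert minterms([Xb_Y, W_Xb_Yb, Wb_X_Z, W_X_Z]) == {2, 3, 5, 7, 8, 9, 10, 11, 13, 15}
    assert minterms([Wb_X_Z, Y]) == {2, 3, 5, 6, 7, 10, 11, 14, 15}
    assert minterms([W_Xb_Yb, X_Y, W_X_Z]) == {6, 7, 8, 9, 13, 14, 15}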
In Figure 21(b) each of the six multiple output prime implicants has been realized using ROAM2 (ROAM1) at a cost of 42 (48) equivalent ROM bits.
Decoding has been eliminated (the ROAM really
decodes into the multiple output prime implicants) and
the three ROMs used cost six bits each for the total
cost of 60 (66) using ROAM2 (ROAM1). The three
ROMs are 6 by 1 ROMs and contain a connected
diode to select the proper terms of the function. The
seventh line into each ROM is always kept at logicall,
thus allowing the output of the ROAM to select the
output value. For f_1 the diodes driven by the Y and XY outputs are just not connected, thereby yielding the correct realization for f_1. The other two functions are
similarly realized. In this example the ROM just serves
to mask certain values and to "OR" the proper implicants together to realize the functions. This operation
may be illustrated by the three bit ROM shown in
Figure 21(c) which produces A+B. The multiple
output prime implicants would drive the ROM at
points equivalent to points A, B, and C of Figure 21 (c).
Using the ROM in this manner, the ROM effectively
selects the proper outputs of the ROAM and ORs
them together if the diode is connected to both lines.

SPEED CONSIDERATIONS
Since the modules have not been built, there is no
way to accurately measure the modules' speed. The
speed projections given here are based upon speed
projections for ROMs.2,3,4
It is important to note that all of the modules are
just read only memory, so that as far as the module itself is concerned there are no real internal gate delays. For MOS the
general sequential machine should be able to run with
data changing every millisecond or less. There are some
MOS read only memories2that have access times on the
order of 50 nanoseconds. Although these· access times
are not necessarily for a MOS memory of 8000 bits, the
general machine running asynchronously could possibly
run significantly faster than 1 MHz. If it is running
synchronously, it would probably have to be slower
(on the order of 1,250 nanoseconds15). Of all the modules,
the general sequential machine is probably the slowest,
however, it is anticipated that this module will probably
furnish the fastest overall system that could be put
together. The main speed advantage the general
sequential machine has is that its speed is dependent
only on the clock frequency and does not depend upon
the number of gate delays that the signals must
propagate through as in a random logic implementation.
It is anticipated that the sequential logic module will
be able to operate at a 600 nanosecond access time in
MOS10 and a 35 to 70 nanosecond access time13 in
bipolar implementations. These speeds seem reasonable
and compare favorably with logic that is available
today; however, it should be anticipated that these
speeds will be heavily dependent upon the use and
design of the clocked flip flops. Again, the device is
really only a read only memory and can be operated
as fast as a read only memory. There are available
ROMs which can run a lot faster than those assumed
here 13,16 and ROM speeds are projected to improve
significantly.2
Evaluation of the speed of the universal combinational
logic module is fairly straightforward since it does not
contain any internal feedback loops with flip flops in
them. In a MOS implementation, one would expect
currently to be able to operate the module around
2 MHz, regardless of whether you are changing the
module's function. A bipolar implementation should be
able to operate at 20 MHz or above. Both of these

192

Fall Joint Computer Conference, 1971

anticipated speeds are below the projected speeds for
read only memory because there is some internal
feedback on the flip-flop functions. It would be better
to operate the module slower than capacity to assure
that the output values are correct. The MOS speed
(0.5 microsecond) is slower than most comparable MOS
logic (typical speeds for the Electronic Arrays 1800
Series range from 250 nanoseconds to 150 nanoseconds).
The logic building blocks most comparable to the
universal combinational building block are the EA1800
(250 nanoseconds) and the EA1806 (250 nanoseconds).
However, ROM speeds are expected to increase and
are increasing on smaller ROMs.16
CONCLUSION
The viability of read only memories for use as logic
devices was investigated in this paper. Three different
versions of read only memory were applied to three
different problems. The three problems that were
considered are: (1) use of read only memories to build a
universal combinational logic module to replace a large
number of ICs and which could be used to generate
random logic functions, (2) use of read only memories
to build a sequential logic module to replace a number
of ICs that could be used to generate subsystems, and
(3) design guidelines for constructing a sequential subsystem on a single chip. A new type of read only
memory was introduced which incorporates the best
features of ROM and ROAM.
The three types of read only memories that were
applied to the problems are ROM, ROAMl, and
ROAM2. Each different type of read only memory has
its own particular properties that tend to make it quite
unique when compared to the other devices. ROAM2,
in general, does not have the ability to perform double-rail logic. This makes ROAM2 particularly efficient at
realizing simple logic functions such as AND. ROAM1
has double-rail logic and can therefore implement very
complex functions; however, it requires a large amount
of memory.

ground (G) and R2 >> R1. When a four-bit address
is placed on I1, I2, I3 and I4 exactly one of the lines
A, B, C, and D becomes V E and the rest are G. Simultaneously, exactly one of lines one, two, three and four
becomes V E and the other lines are G. Assume lines C
and two are selected and become V E. The transistor Tc
then forms a path for current to flow through RI to
ground. This current is supplied by one of two sources.
If a diode exists at the intersection of lines C and two
[as it does in Figure 1 (a) ] then the current is supplied
through the diode at the intersection of lines C and two, i.e., point OC is at VE, which forces the output to be near VE.
In order to see what happens when the diode is not
connected, assume a new address is placed on the input
lines and lines two and B go to V E. The diode at the
intersection of lines two and B is not connected; therefore, point OB tends to go to G and current is supplied
by the power supply through R2 and DB to T B which is
a path to ground. Since R2 was made much larger than
R1, the output is forced to G. The diodes DA, DB, DC, and DD prevent unwanted currents from flowing. In
the case under consideration, lines two and B are at
V E, but since line two is at V E, the diode at the inter-

193

section of lines two and C wants to form a current path.
Since Tc is not turned on (line C is at G), current would
tend to flow to point Oc, then to point 0B (raising the
output to V E) finally to ground through T B if diode Dc
were removed. Therefore, diode Dc is necessary to stop
this unwanted flow of current and keep the output at
ground (G) which is what was desired.
DESCRIPTION OF HOW ROAM WORKS
Figure 3 (a) shows a three-input, one-output, read
only associative memory (ROAM). This device is
operated by placing a three-bit address onto the three
input lines A, B,and C. If this address matches an
address that has been previously stored in the ROAM,
then a one appears on the output line of the ROAM.
If the address doesn't match any of the addresses stored
in the ROAM a zero appears at the output of the
ROAM. This network can be used to generate logic
functions by associating a minterm with a minterm
that has been previously stored in the ROAM.
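Functionally, the matching behavior just described amounts to the following sketch (invented patterns, not the diode-level circuit of Figure 3(a)):

    # Each stored row is a pattern over the inputs; None marks a "don't care"
    # (an unconnected diode). The ROAM output is 1 if any row matches.
    def roam_output(rows, address):
        return int(any(all(p is None or p == a for p, a in zip(row, address))
                       for row in rows))

    rows = [(1, 1, 1), (1, None, 1), (0, 1, 0)]    # hypothetical stored patterns
    print(roam_output(rows, (1, 0, 1)))            # 1: matches the second row
    print(roam_output(rows, (0, 0, 0)))            # 0: no row matches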
To see how the circuit actually works, assume that -VE < ground (G) and that R2 >> R1. If negative logic is assumed (-VE = 1 and G = 0) then the circuit in Figure 3(a) can be used to perform the logic function f = ABC + AC + ABC + ABC. Assume that A = 1, B = 1, and C = 1 is put onto the input leads; then in row 3
of the ROAM in Figure 3 (a), - V E appears at the
vertical connection point of all diodes in row 3. This
leaves point 03 near - V E and current flows through
R2, 0 3, and RI to the terminal at - V E. Since R2»R I ,
the output is about - V E and a logical 1 appears at the
output. All other rows are mismatched and each of the
lines 01, O2, and 04 were held to ground because (considering row 1) in each row at least one diode's anode
was tied to ground, causing the horizontal line to be at
ground, therefore, not biasing the diodes D I , D2 and D4
ron. If the address was a mismatch in all rows, 01, O2 , 03,
and 04 would all be near G and the output would be G.
In summary, a mismatch on any bit causes the row to
leave the output at ground. Any row that has all
matches (- V E on the anode of all connected diodes in
that row) will allow the output diode of the row to
conduct pulling the output negative (logical 1). One
advantage of ROAM is that "don't care" conditions
can be programmed in by not connecting diodes. If
positive logic were used and G = V E and - V E = 0 then
logic functions compatible with ROM could be realized
because the ROAM would then produce OR-AND
logic. In positive logic, the ROAM shown in Figure
3 (a) produces the function

f= (A+B+C) (A+C) (A+B+C) (A+B+C)

REFERENCES
1 MOS firm puts 4096-bit memory on a single chip
Electronic News p 1 January 19 1970
2 Designer's guide: Semiconductor memories
EEE pp 53-67 November 1969
3 Semiconductor memories Part III, Bipolar RAMs and
ROMs
Electronic Products p 23-25 March 1970
4 R F GRAHAM M E HOFF
Why semiconductor memories
Electronic Products pp 28-34 January 1970
5 J L NICHOLS
A logical next step for ROM
Electronics pp 111-113 June 12 1967
6 J C LEININGER
The use of read-only storage modules to perform complex
logic functions
International Computer Group Conference Washington
DC June 1970
7 Microcircuits, IC complexity due to double yearly
Electronic Design pp U93-U97 March 15 1970
8 R A HENLE et al
Structured logic
AFIPS Conference Proceedings Fall Joint Computer
Conference 1969 pp 61-68 November 1969
9 E J McCLUSKEY
Introduction to the theory of switching circuits
McGraw-Hill New York

10 4096-bit MOS/LSI
Electronic Products Magazine p 57 June 21 1971
11 Most complex ROM shrinks CPU
EDN pp 16-17 June 15 1970
12 K J THURBER
An associative processor for air traffic control
1971 SJCC Proceedings AFIPS Press Volume 38 pp 49-59
May 1971
13 Bipolar 1024-bit ROM accesses in 30 ns
Electronic Design p 64 March 4 1971
14 High-density 5120-bit ROMs include on-chip decoding
Electronic Design p 95 November 8 1970
15 Specification sheet for the EA 3307 ASCII/EBCDIC code
converter ROM
Electronic Arrays Inc
16 2560-bit memory has 500-ns access
Electronic Design p 95 November 8 1970
17 W I FLETCHER A M DESPAIN
Simplify combinational logic circuits
Electronic Design pp 72-73 June 24 1971
18 J WUNNER R COLINO
Applying the versatile MOS ROM
Electronic Products pp 35-40 January 1970
19 H SCHMID D BUSCH
Generate functions from discrete data
Electronic Design pp 42-47 September 27 1970
20 Programmable logic arrays
Texas Instruments Bulletin CB-126 pp 152-166 October
1970

A panel session-Computers in medicine-Problems
and perspectives
Medical Information Systems: A Reformulation of the Problems as Perceived by a Hospital Administrator

by BALDWIN G. LAMSON

UCLA Hospitals and Clinics
Los Angeles, California

Much has been said in recent years of the inefficiencies
of hospital management and patient care, and of the
opportunities for automating the recording of nurses'
notes, scheduling of drug administration, recording of
doctors' orders, dietary menu planning, diagnosis by
computer, and the like. The hospital business office and
revenue accounting have often been singled out as the
only areas in hospitals where modern data processing
equipment has been at all effectively used.
After a decade of effort to produce total hospital
management and communication systems, few, if any,
completely successful and self-sustaining systems are in
full operation. Many of the major problems still remaining which to date have defied complete solution were
noted ten years ago but are now recognized as being
vastly more complex and difficult of solution than
originally perceived.
The medical record remains the heart of the problem.
Hard data, such as clinical laboratory reports, have
been successfully processed by computer, but doctors'
observations and physical examination data still largely
defy satisfactory input solutions. Patient self-query
systems have proven practical and are anticipated to
come into wider use.
The organization of the medical record for most efficient computer storage and retrieval is still a major
challenge. Much of the medical record as it accumulates
is obsolete within forty-eight hours, except for medical/legal and medical research purposes. Very possibly,
promptly dictated and typed uncoded records of day-today observations of the physician, followed by a single
well-structured summary and analysis upon the conclusion of each episode of medical care, will suffice.
Systems for scheduling of patient appointments for
patient convenience and proper utilization of facilities
remain a very high priority. This problem is solvable
and awaits only adequate capital and a dedicated effort.

Despite past successes, the fiscal system is still desperately in need of further assistance because of the complexities of federal and state health care legislation. A national effort is needed to make available to participating hospitals up-to-date information with respect to each patient's prior utilization of health facilities, status of deductibles and coinsurance, as well as location and content of prior medical records. The challenge here is more political than technical.

The goal of a completely integrated hospital computer-based information storage and communication system is lower in priority. Partial stand-alone applications appear more attainable and are urgently needed. Problems related to patient identification, patient scheduling, medical record location control, and patient eligibility are most urgent.

Technology already appears to be adequate to do the things that are needed most. Slow progress is more related to the absence of research and development capital in the hospital industry. The short term opportunities appear greater for software development than for equipment, although techniques for the rapid retrieval and transmittal of full page facsimiles by video screen with optional hard copy output would appear to have a large market in the health service industry.

New Technologies in Medicine
by C. T. POST, JR.

Department of Health, Education, and Welfare
Rockville, Maryland

We are at the threshold of a technological era in
medicine. The Government is interested in harnessing
technology to alleviate manpower, cost, access and
distribution problems existing in the health care system
today. There exist, on the one hand, several examples
of technology in use today which clearly improve the
quality of medical care but which do so at considerable
cost. On the other hand the techniques which have the
potential for reducing the cost and increasing the availability of health services can only do so when they are

deployed in situations where sufficiently large populations can be aggregated to take advantage of the
economies of scale that are implicit in these techniques.
There is a pressing need to demonstrate that these potentialities for economy and improved access can be
realized. This can only be achieved by mounting fairly
extensive experiments that will make evident the economies of scale that are inherent in these technologies.
This type of demonstration is made feasible by readily
available communication capabilities which now exist.
The fact that it is now possible for a large number of
medical institutions to share a common automated
medical service has very clear and immediate implications not only for the reduction of unit costs but also
for the dissemination of high quality medical information, advice and services. In addition, the sharing
of this service will exert pressures that will move communities of hospitals toward the sharing of other services and toward standardization of their operations.
Both within the hospital, and to a greater extent within
doctors' offices, the necessity for development of modular, evolutionary, user-oriented reliable systems is becoming increasingly evident. This latter area, the physician's office, where the majority of medical care is in
fact delivered in this country today, remains largely
untouched by technological aids. Given the persistence
to a large extent of the present organizational scheme
of primary care delivery, successful entrants into this
marketplace will in all likelihood primarily operate
within the physician's environment to tap his thinking
for purposes of problem definition, and then reconfigure
existing technological components into a purely problem
solving user-oriented system.

The Role of Computers and Information Systems in Medicine
by E. E. VAN BRUNT

The Permanente Medical Group
Oakland, California

An information system can be defined as a system intended to provide information needed by the user in
the conduct of his business. The primary 'business' of
medicine is the care (management) of people who require
various levels of medical investigation, counseling and
treatment. While many specialized medical and medical
administrative data processing capabilities have been
developed, the current systems of medical information
management and communication are inadequate for
present day patient care needs.
Medical care is comprised of both hospital and clinic
activities; the larger component exists in the outpatient care areas. In this medical center environment,
the heart of any information system is an integrated,
or continuous, lifetime record-for each of the patients
receiving medical services, and the data systemmanual and/or electronic, which supports its growth,
maintenance and utilization. Concomitant with the
progressive development of group or 'regional' medical
care programs is the need for more effective large volume
medical information management. The objective is the
timely supply of relevant information to appropriate
users, for patient care services, and medically-oriented
research and ed uca tion.
The supply of information supporting patient care
implies extensive and highly reliable communication
capabilities: the communication of patient data from
the professional providers of care to the medical record,
and to other professionals and service bureaus; communication, on demand, of relevant summary information from the patient's record to medical professionals and service bureaus; communication between
services. The conduct of medically-oriented research
implies existence of a medical data base that can support clinical, epidemiological and health services research. This same data base should support limited,
patient care oriented, educational services to medical
care professionals.
The role of computer-supported information systems
in the medical care environment is clear but outstanding
'problems exist in both the medical and computeroriented disciplines.

The Permanente Medical Group
Oakland, California

An information system can be defined as a system
intended to provide information needed by the user in

*Medical data system studies have been supported in part by
a National Center for Health Services Research and Development
(NCHSR&D) Grant (HS-00288) and by the Kaiser Foundation
Research Institute.

A panel session-The user interface for interactive search

The User Interface for Interactive Search

by JOHN L. BENNETT

IBM Research Laboratory
San Jose, California

In January 1971, the AFIPS Information Systems Committee sponsored a workshop on "The User Interface for Interactive Search of Bibliographic Data Bases." This narrower topic was chosen intentionally to give the Workshop a focus suitable for intensive discussion within a small group. Now that the Proceedings are available it is appropriate to highlight those user interface characteristics shared by a less restricted range of applications. The keynote for the panel will be "what goes on in front of the terminal"-what facilities the user requires, what services the computer can provide, and how the user responds to data display. Understanding the exchange of data between the user and the computer at the interface during search will enable designers to implement systems truly responsive to search needs.
Each member of the panel will relate his own experiences to the keynote subject of interactive search. Bennett will draw on work with the Negotiated Search Facility which was used to study search behavior given the data of bibliographic files. Walker, as editor of the Proceedings, will comment on Workshop accomplishments and make observations based on the SHOEBOX personal file system developed at MITRE. Engelbart, an innovator in the use of terminals, will describe what has been learned from experiments at SRI which place the full power of the computer at the service of the user. Katter has examined a variety of interactive applications at SDC, and has developed a model of user behavior at the search interface which illuminates problems and suggests a direction for their solution. Hugo will comment on the service to be provided for a large class of noncaptive users-Senators, Congressmen, and their administrative staffs-for whom interactive access to a data base would represent just one of a set of information-gathering tools. Morton will draw on studies made at MIT and Westinghouse to tell us what interface facilities are needed to establish the conditions under which business decision makers would conduct their own searches rather than delegating terminal interaction to others.
In our discussion, we will identify those findings currently available which help us design the link between the user who understands the search results he wants and the system designer who can provide the means for achieving those results. Though the emerging interface technology is only roughly defined, we can begin to outline now the research and development issues to be resolved if interactive search is to achieve widespread user acceptance.

A panel session-State of the computer art in biology

Computer Applications in Cellular Research-Three-Dimensional Brain Reconstructions

by CYRUS LEVINTHAL

Columbia University
New York, New York

Computer applications in cellular research range from relatively straightforward data reduction to complex modeling. Programs exist for interpreting Coulter counter data, for assisting microscopic cytological assessments when fluorescent antibodies are used, and for relating observed uptake of labels to cell-cycle parameters. Models of cellular populations range from simple cycling systems supportive to data reduction (e.g., for label uptake) to complex models incorporating feedbacks and cellular differentiation, which are being used to explore better strategies for treating leukemia and cancer. The greatest need is for better experimental information. Programs enabling reliable automatic analysis of bone-marrow sections or smears probably will be difficult to develop but of great value to experimental hematology. Programs to reconstruct three-dimensional anatomies of cellular systems should provide valuable insights and, as indicated in the following report of current research, are feasible now.
The brains of small organisms can be cut into thin serial sections, each of which can then be photographed in an electron microscope. All information as to the nerve branching patterns and connectivity is contained in this set of photographs, but even for a very simple organism the number of photographs required will be several hundred. Thus, reconstructing the three-dimensional (3-D) information in a usable form is a formidable problem. We have developed a method of combining the photographs in a motion picture film strip in such a way that each section is aligned with the one before it. When the movie is projected, the observer has the illusion that he is traveling through the brain.
Recording of the nerve net is done by superimposing the projected image from the movie with the image on a computer-driven oscilloscope display. During recording, a cursor on the scope is controlled by a hand-held device (a "mouse") with two potentiometers to determine x and y. The movie projector is controlled by the computer, and the frame number is proportional to the z coordinate. Thus, the observer can use the system as a three-dimensional notebook. The cell bodies and branching pattern of fibres, as well as synapse locations, are recorded for each nerve. After all nerves have been recorded a BRAIN file can be constructed by a program which matches the synapses which were separately recorded in each NERVE file.
Any combination of nerves can be displayed in 3-D by rotation of the projected image. Similarity of branching patterns in two different organisms can be determined by matching them analytically with a simple graph-theoretic algorithm or by simultaneous display of nerves from different organisms. In the same way, bilateral symmetry can be identified in individual organisms.
In simple invertebrates having brains with only a few hundred cells but a very complicated set of fibres connecting them, no differences have been observed between two genetically identical organisms. Within an individual organism there is a remarkable degree of bilateral symmetry. A four-dimensional description of a simple brain is now being developed by carrying out the 3-D mapping at various stages during embryologic development.

Computer Applications in Population Studies

by NEWTON E. MORTON
University of Hawaii
Honolulu, Hawaii

Population genetics is the study of forces that change
or maintain genotypic frequencies. With support from


computers, research emphasis recently has shifted from
the conceptual synthesis of simple models to issues
related to the maintenance of genetic variability in
actual populations. That man himself has become the
preferred organism for many types of population research results not only from motivations for human
benefit but also from the fact that the large number of
recognizable single-gene differences in man far exceeds
the known polymorphisms in Drosophila and other
classical genetic material. With this research emphasis
on human populations, computers become indispensable
both for data analysis and modeling.
Human data rarely can be acquired under the well-defined, controlled conditions possible for the laboratory geneticist. Efficient data management is essential
in the study of large populations for whom interrelated
pedigrees and migratory traces must be maintained.
Complex data analyses are required. Improvement of
these approaches is an important area for computer
implemented biostatistical research. Human data tend
to be expensive to acquire and often cannot be replicated; all possible information must be extracted.
Theoretical population models become complex as
they are applied to the description of realistic systems.
Computer implementation is essential. Efficient modeling techniques must be developed, as well as more
effective displays or interactive approaches for model
exploration.
Models have been proposed as alternatives to the
single-locus Mendelian models for the familial clustering
of some congenital malformations and many of our
more common diseases, e.g. club foot, pyloric stenosis
and diabetes. The major contribution of these models
was realized only after they were extended and programmed for the computer. This additional complexity
permitted the direct computation of genetic risk figures.
In the near future, it will be practical to define upper
and lower boundaries on the risk under a wide variety
of situations, permitting much more direct genetic
counseling. Using the computer to extend these models
and to apply them to a large amount of data is continuing to stimulate the development of new approaches
and modifications in the underlying theory.
Another important area for stimulating interaction
between computer advances and population research is
the problem of record linkage. Testing through application of population models often requires tremendous
data sets available only by accessing multiple large
files such as of birth, marriage, and death certificates.
With the development of more efficient and accurate
management of these files, important aspects of hypothesis testing may be pursued and continuing theory
and model development is stimulated.

Molecular Biology as it Relates to Digital
Image Processing

by ROBERT NATHAN

California Institute of Technology
Los Angeles, California

Biological research has been placing increased attention upon the determination of the atomic structures
of large molecules. These have generally been proteins
and especially enzymes whose molecular weights range
up to 50,000. Tremendous efforts to infer functional
configurations of active enzyme sites have been expended by organic chemists. But total atomic configuration has been performed exclusively by the digital
computer using the methods of x-ray crystallography.
The sequences of the amino acids in these enzymes
have been determined by organic chemists. The information is then used by the crystallographer to infer
initial estimates about the actual geometrical configuration, which the crystallographic data eventually confirm. Several of these molecules and their functions will
be described.
There are many other large biological molecules and
molecular aggregates whose structures should be determined if we are to further unravel the mysteries of
cell function and apply these solutions to the problems
of disease. What are the structures of t-RNA, ribosomes,
mitochondria, membranes, antibodies, and even whole
viruses?
In our laboratory, several computer techniques have
been developed to manipulate continuous-tone digitized
images. These methods were originally applied to space
photography. They are now being applied to medical
x-ray images, and to light and electron micrographs.
(A brief description of automated karyotyping, the
light-microscope analysis of chromosomes, is included
as an example of light-microscope computer automation.)
An example of micrograph enhancement of the normal
electron microscopy of the enzyme catalase is shown to
illustrate present microscope limitations.
The final discussion is centered around a description
of a method for computer manipulation of dark-field
electron micrographs which should eventually reveal
atomic structure without the assistance of chemical
inference. A crystal of an organic dye, indanthrene olive
(molecular weight 750) is chosen to illustrate the computer method of obtaining high resolution by means of
a system called synthetic aperture. A preliminary
atomic resolution model is presented.


Automated Information-Handling in Pharmacology Research

by WILLIAM F. RAUB

National Institutes of Health
Bethesda, Maryland

Pharmacology involves the multitude of interrelationships between chemical substances and the function of
living systems. Since these interrelationships manifest
themselves at all levels of physiological organization
from the individual enzyme to the intact mammal,
research in this area involves concepts and techniques
from almost every biomedical discipline. Thus, pharmacology entails a class of information-handling problems
as formidable and enticing as any that can be found in
the medical area. In recognition of this, the National
Institutes of Health (NIH), through its Chemical/
Biological Information-Handling (CBIH) Program, is
attempting to accelerate the acquisition of new pharmacological knowledge by designing and developing special
computer-based research tools. Working through a
tightly interconnected set of contracts with universities,
research institutes, profit-making organizations, and
government agencies, the CBIH Program seeks to blend
the most advanced information science methods into
a computer system which can be an almost indispensable
logistical and cognitive aid to these investigators.
In the absence of all-encompassing theories of drug
action, pharmacologists rely primarily on empiric observations communicated in a plethora of literature with
which currently available information-handling systems
are unable to cope. There must be an increased emphasis
on effective data retrieval, as opposed to document
retrieval. Past work on encoding molecular topology
has produced two systems, connectivity tables and
linear ciphers. The former demand considerable storage
and processing time, but the latter exclude the possibility of some important substructure queries. New or
combined approaches are required which relate effectively to the task to be performed. A sophisticated man-machine interface is of high priority to effect the interchange of data, procedures, and models among geographically and disciplinarily disjoint scientists whose
work is relevant to the understanding of drug action.
There exist promising interactive systems for tablet
input of two-dimensional chemical graphs and for the
graphical display and manipulation of three-dimensional
molecular models.
Two projects currently pursued under the CBIH


Program will be discussed:
The PROPHET system is a medium through which
the latest pharmacologically relevant informationhandling methods can be developed, integrated, and
made widely available via a time-shared PDP-10 to
practicing scientists whose disciplines range from molecular biology to human clinical investigation. It includes a powerful interactive command language for
handling empirical data, a coextensive simple procedural
language modeled on PL/I, provisions for easy access
to complex computational processes, a rich substrate
of devices for handling different kinds of pharmacological data structures, and facilities for communication
among users.
Another project is exploring the use of automated
inference methods as cognitive aids to pharmacological
investigators. At present, model-handling tools are being
developed to enable researchers to express and assess
the validity of their concepts about mechanisms of drug
action. Finite-state automata models have been found
especially useful.

Computers in Physiological Modeling

by WILLIAM S. YAMAMOTO

University of California
Los Angeles, California

The principal reasons for the construction of computer models of systemic physiological systems are no different from those for theoretical computation
in any other field. Systemic physiology, by which I
mean the written literature of the subject as distinct
from the process of acquiring such information, is
voluminous, undoubtedly redundant, and probably contains many contradictions. Moreover, it is difficult to
separate opinion from observation, correlation from
imputations of causality. Nevertheless, this is the milieu
in which the intelligent physiologist functions. He pushes the frontier forward in ever decreasing salients
because the weight of past experience becomes more
and more unmanageable.
The following are major reasons for mathematical
modeling of physiological systems:
1. Models serve an heuristic role, and can guide
laboratory research into unillumined corners, or
applications.


2. Models codify unambiguously extensive systems
of postulate or conjecture for purposes of retrieval or ratiocination.
A critique of modeling activities in the light of these
purposes finds exact parallels in the laboratory study of
physiology. The perspectives of investigators in the
laboratory are shrinking. There is specialization because
the number of details and procedures as well as the
number of plausible alternatives increases with the
increase in apparent information. Problems of information synthesis become monumental, and the review
article occupies a significant place in scientific literature.
Modeling, which I claim is the only testable process of
scientific synthesis (review), suffers from similar
problems. Namely, models are not unique, and they
become rapidly insensitive to both parameter and
structure when a certain level of complexity is reached.
The latter implies further that there is for any level of
development above some minimum the possibility of
alternative but compatible model structures, just as
there are alternative but compatible explanations in
theories of the same phenomenon.
There have been several models whose scale is sufficient to demonstrate the important problems. These
are the models of glucose metabolism, adrenocortical
function, the cardiovascular system, and of external
respiration. I am sure, although such matters are usually not presented in print, that all of these investigators
have encountered common problems of the following
sort. First, one must make a choice between lumping
and tabulating the components of parameters. If one
lumps, one loses the heuristic value; physiologists can
no longer assign biologically meaningful notions to
magnitudes. Parameters with common elements become
confused. If one does not lump parameters, the glossary
keeping and degradation of computation efficiency become progressively severe. In fact, it becomes impractical for the modeller to set aside programs for even
periods as brief as two weeks. Second, model sensitivity
decreases so that not only parameters but whole sections
may be altered in structure and if the test behavior is
a limited set, there is no discriminatory function in
exercising the model because methods for fitting models
to data are not well developed. There is constant need
to map model behavior upon the experimental domain,

i.e., just as in the real case, conclusions need to be
tested to see if the permutations which constitute
laboratory experience can be replicated.
To examine the problems that arise in the translation
of physiological ideas into formal models, we chose to
add to our model of external respiration in mammals a
subroutine which generates the drive for movement of
the chest. In the original model this was represented by
three statements which produced a hybrid modulated
wave in which both the amplitude envelope and the
frequency envelope were functions of carbon dioxide
concentration in brain cells. The plan is to replace this
short subroutine with one which encompasses a substantial part of the recorded observations upon neural
mechanisms below the midcollicular level. There are
three classes of neurophysiological experiment: ablation,
stimulation, and microelectrode recording. Each of
these is given an unambiguous symbolic definition. The
respiratory complex is then constructed as a net of
differential equations in which the principal criteria
are location, firing pattern, and chest movement. Fifteen
nonlinear first order d.e.'s are used, and synthesis proceeds in reverse of the ablation literature. To produce
the phenomenon of gasping merging into apneusis, we
are dealing basically with eight of the centers. Apart
from parameter size, each center may be connected to
any other center in any of 3 ways and the directed graph
has potentially 3 × 7 × 8 = 168 patterns, not all of which
give behavior which is readily discarded. Since one type
of connection is the non-connection, there are 112 parameters possible of which at anyone time only a few are
testable. The computational problem is severe. In the
absence of search formalisms the most reliable tool
turns out to be a thorough reading of the physiological
literature. And the model can be related almost relation
by relation (in the code) with statements in the research
literature.
For both the educational process and for the improvement of models beyond the easy stage, tools are necessary and they are probably some form of computational
tool to handle and demonstrate the structure of other
programs, perhaps in graphical form of their relations
and their implications. Automatic flow charting is the
most primitive form of such a program. If a program's
content is a tree or a forest, a topologically sorted list
might be a second valuable direction to go.

Introduction to training simulator programming
by D. G. O'CONNOR
Singer-General Precision, Inc.
Binghamton, New York

INTRODUCTION-GENERAL REMARKS
ABOUT SIMULATION

Several years ago, while we were having a compiler1 developed for our use in simulator computer programming, one of the senior management people at our subcontractor's expressed surprise at our dogged insistence on object code "efficiency." Now, in fact, he had not been on this project very long and did not realize that a simulator for our purposes is a product, not a program run on an in-house computer. As I shall interpret the word simulator in the remainder of this discussion, it will mean a product used for training.
What are the implications of this statement? Well, first it implies someone is being trained, and consequently, there is a man in the system someplace. In fact, he interacts quite significantly with the operation of the system. System outputs are cues, and feedback to this man and his actions are inputs to the computer. To be more specific, let us limit our discussion to aircraft2,3 and space simulation.
In Figure 1 is shown an artist's sketch of the Lem Mission Simulator (LMS). An appreciation of the size of a complete simulator is derived from this sketch. A sketch is shown rather than a photograph because it is impossible to get a photograph since the physical size of the equipment would constrain a camera to be so far away that it could not be in the same room. In fact, you will note, this equipment is itself in several rooms.

Figure 1-Lem mission simulator

Such a simulator consists of several major parts (as seen in Figure 2):
(a) a training station (cockpit)
(b) instructor's station
(c) a computer
(d) interface equipment
(e) special effects-e.g., visual and motion equipment.

The training station is a replica of the operating environment, i.e., cockpit, for which the training is being conducted. Frequently, production equipment is used in the simulator, or at least the panels and controls are used. It is fair to say that every effort is made to make the training station environment so close to the real environment that being brought into the environment blindfolded would leave a human with a difficult judgment were he asked whether he was in the real operating location or not. In particular, since we deal primarily with flight simulators, I mean by the foregoing statement that if an experienced pilot were put in the seat of a simulator, he would have difficulty determining whether he were in an airplane or not (except for certain obvious clues).
The instructor's station consists of a console with display of (almost) all the instruments, signal lights, etc., available to the pilot. In addition, other displays may be incorporated, such as a CRT display used for instructor assistance and other indicators, lights, and meters of importance to the instructor in performing his job. Controls for FREEZE, MALFUNCTIONS, etc., are located here for the operation of the simulator training problem.
The computer used in such a simulator must be capable of performing "many" calculations per second. The calculational load is determined by the aircraft being simulated and the number of systems aboard this aircraft whose operation must be simulated accurately, as well as by psychological and physiological factors.4 In recent years, more and more systems have been simulated and have been simulated to a higher degree of fidelity. This growth has led to requirements for more efficient programs and additional computer capacity. Consequently, significant requirements for improved programming have arisen. In spite of this, increasing requirements have been placed on the computer.
The interface equipment required includes both hybrid

(analog to digital and digital to analog conversion)
and purely digital inputs and outputs.
The special effects include systems such as a visual
system which presents moving scenes to the pilot over
a portion of his flight path. Obviously, the scenes presented to the pilot must be consistent with such things
as simulated aircraft velocity and attitude as well as
simulated position relative to the scenes being portrayed. In some simulators, a device for moving5 the
cockpit to simulate accelerations is incorporated.
A practical side effect is the almost continuous use
to which this equipment is put. The implications of
this type of usage are that the computer has the ability
to be reloaded and restarted with a minimum of operator activity. Before addressing my topic directly, let
me indicate the size of the problem at least broadly.
In Figure 3 is shown a .little more detail of the
peripheral equipment which may be employed in a
flight simulator. Since this equipment is required primarily during program development, there are many
simulators delivered with a sharply reduced amount of
this equipment.
Also shown in this figure is a list of ranges of some
of the parameters pertinent to the computer hardware.
The range shown for Processing Speed and Memory is
representative of single or dual processor simulators.
Note also the reliability figure. While this figure is
remarkable, it has been achieved on many simulators
used twenty hours per day, six days per week.

Figure 3-Overall computer requirements
(Processing speed: 300,000 to 1,000,000 inst/sec. Memory: 50K to 150K of program and data memory plus 2M bytes of bulk storage. I/O channels: 100,000 plus words/sec. Reliability: 100 hour MTBF, 98 percent plus availability. Peripherals shown: teletypewriter, MTU, card reader, line printer, and a display normally at the instructor's station.)

WHAT DOES THE COMPUTER DO?

The computer provides the model that represents the behavior of the aircraft systems.

Figure 2-General flight simulator
(Block diagram: flight compartment with controls and switches, an interface, the computer, and the instructor station.)

THE SIMULATION PROCESS

The intent of a training simulator is to convince the trainee that he is operating the equipment in reality. Clearly, the simulation of certain equipment, such as aircraft, should incorporate physiological cues such as the trainee obtains in real performance through his senses. These are simulated to a degree by providing sounds appropriate to certain of the activities, appropriate force requirements to actuate controls, a motion system which jostles the man more or less appropriately to the maneuvers of his aircraft, and visual presentation through the medium of motion pictures or television.
The simulation which is carried on to a very high degree of fidelity is that required to "convert" pilot actions into realistic dial readings and other cues. Fidelity in anomalous operation, such as malfunctions, is emphasized in view of its importance in training pilots to take prompt and proper action under such circumstances.
Meanwhile, the simulator produces radio sounds for navigational purposes, drives the compass, etc. In places where noises appropriate to aircraft operation are likely to be encountered, the simulator activates the devices which produce these sounds. Examples of this are the brake squeal on landing or rumble as the aircraft is navigating across the airfield terrain.
Simulators have several advantages over real aircraft in the training of pilots. First-and incidentally most compelling-is their reduced cost. In addition, certain technical features give them flexibility. Most noteworthy is the ability to introduce failures of various parts of the aircraft to train the pilot to react properly and safely under such circumstances. Thus, for example, engine failures may be simulated without danger. Similarly, tire blowouts, control system breakdowns to some degree, failure of part of the navigation system, may also be introduced to the


simulator to train the pilot. Not to be overlooked is
the ability to simulate flight in controlled wind, controlled icing, etc., which is impossible to control when
training in a real aircraft.
Modern simulators include other features which enhance the utility of the device in training. In particular,
it is simple for the instructor to freeze simulation at a
particular point where he intends to instruct the trainee
in some phase of operation without completely losing
the continuity. Other features such as being able to
start the exercise at any point is valuable in situations
where the trainee needs reinforcement in certain procedures at certain flight conditions. Recording of performance with subsequent playback permits the student
to be a witness to his own activities, and consequently
assists in his training procedure.
In the following sections we will mention some of
these features again in terms of their impact on the programming of a flight simulator.

Figure 4-747 simulator with visual and motion

In Figure 4 is shown a 747 simulator cockpit mounted on a motion system and having a VAMP* Visual System mounted on it.
Figure 5 is a representation of such a simulator with a cut-away showing the electronics used to compute the model. This is typical of modern jet simulator installations.

* A trademark of Singer-General Precision, Inc.

Figure 5-Typical 747 installation

SIMULATION PROGRAMS

Figure 6 exhibits, in outline form, the programs which we shall discuss for the remainder of the talk. It is noted that the programs may be divided into two general categories-training and program preparation.

PROGRAM PREPARATION

Program preparation includes the programming and debugging of the individual programs prior to their integration in the simulator. In this context the programming process is essentially identical to that carried on in a laboratory computer. As seen, the main constituents are a language translator, a debugging package, testing programs, a data base management program, and the usual computer monitor features. In particular, the programmer will want to use the monitor's I/O and relocatability features as well as the set of utility programs during this phase of activity.



Debugging and testing is similar to normal functions
with these names. The presence of a large data base
and control of an executive is not unusual. A problem
of integrating programs written by several people arises.
Perhaps the problem of mathematical stability in the
face of simplified routines is a bit more severe than
normal. The problem of observing real time is to some
extent postponed until debugging with the full set of
simulator equipment. However, the trend is to force
this problem earlier in the programming project.
Data base problems arise in that there are generally
several tabular constraints imposed with sometimes
contradictory requirements. Since duplicate storage requires extra memory in the delivered product, much
effort is expended to avoid this "easy" solution.
The monitor and utility functions available with the
purchased computer are exploited (sometimes after in-house modification) during test prior to testing with
simulator hardware.

Figure 6-Simulator program tree
(Programs divide into program preparation-language translator, debug and testing, data base, monitor-and training, under a real-time executive with three branches: "framing" (simulation routines, interruptible and non-interruptible; subroutines; major routines such as functions and radio stations), interrupts ('real' time clock, input/output devices, other), and simulator programs (freeze and variants, malfunctions, fast time, reset/initialize, record/playback, instructor functions, synchronization, diagnostics).)

For many of our simulators we have used Assembly
Language. This choice has been dictated by the very
severe requirements of the real time environment in
which we function. Recently, however, we specified
and had written for us a compiler6 which produces fixed
point programs with a degree of efficiency approximating that of a good assembly language programmer.
Consequently, it is likely to be as good or better than
an average programmer.
This language, FORTRAN-like in its syntax and
semantics, is intended to be suitable for the documentation of the programs produced. Thus, it adheres very
closely to normal mathematical rules. We imposed an
additional constraint on the philosophy of describing
this language in that we took a point of view contrary to that taken by most higher order language developers,
namely we restricted generality as much as possible.
Our intent has been to provide this language with
only those features which we could foresee as being
necessary. This is an outgrowth of our anticipation
that most programs written in this language would be
written by people who are not programmers. Consequently, the tighter the language, the less likely they
were to make subtle errors.

In dealing with the training programs,7 we will find
some to be unusual with respect to what is normally
encountered in computer programming work. This is
not to say that there is some magically different set of
techniques, but rather that the intent and requirements
of these differ from what we normally encounter in
laboratory computer programming. The training program group is dependent heavily on a real-time executive. This executive is in many respects a scheduler as
well as being a program which recognizes and reacts to
interrupts and which guides the execution of simulator
related programs as well as simulation related programs.
The division of the programs, for our purposes, follows the three major headings under "Executive," namely: "Framing," "Interrupts," and "Simulator Programs."
Framing

The word "framing" in this context connotes the
fact that the simulator programs are divided into programs which are operated at different iteration rates.
Time is divided into intervals-frequently 50 millisec
in length-called "Frames." Since programs for different parts of the simulation operate at different rates,
the programs which are executed in a given frame
differ. The executive CALLS the proper programs in
each frame. Simulation routines include all of the programs which provide the mathematical representation
of the systems being simulated. They include all of the
programs which compute the aerodynamics and engines,
those which compute the various systems aboard, in-


cluding even such things as de-icer and air-conditioning
systems. These systems must be simulated primarily to
be sure that the pilot understands the indications of
both correct and, perhaps more importantly, incorrect
operation so that he is capable of taking prompt action
when prompt action is required.
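As a minimal sketch of this framing discipline (not the production executive; the routine names and the choice of rates below are hypothetical), a dispatcher entered once per 50-millisecond frame might look like this in C:

```c
/* Illustrative frame dispatcher: routines run at different iteration
   rates by being CALLed only in selected frames.  Names and rates are
   hypothetical, chosen only to show the mechanism. */
#include <stdio.h>

static void aerodynamics(void) { puts("aerodynamics"); }  /* every frame     */
static void engines(void)      { puts("engines"); }       /* every frame     */
static void radios(void)       { puts("radios"); }        /* every 2nd frame */
static void de_icer(void)      { puts("de-icer"); }       /* every 8th frame */

static void run_frame(int frame)          /* entered once per 50-ms frame */
{
    aerodynamics();
    engines();
    if (frame % 2 == 0) radios();
    if (frame % 8 == 0) de_icer();
}

int main(void)
{
    for (int frame = 0; frame < 16; frame++) {   /* stand-in for the clock */
        printf("frame %d:\n", frame);
        run_frame(frame);
    }
    return 0;
}
```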
The programs dealing with applications like simulation of flight involve primarily solutions of the normal
types of mathematical problems found in the field of
mechanics. Consequently, most of these programs are
heavily mathematical in orientation. A crucial issue in
our solving these problems is that we keep the approximations as simple as possible in order to conserve both
time and memory. We will always be guided by the
recognition of the fact that the accuracy required for
such solutions is frequently much less than it might be
in an open loop engineering simulation.
One of the more widely discussed mathematical routines is that for integration. Frequently, expensive and
fairly complicated routines are used for this purpose in
mathematical computation. In our case we have the
conflicting requirement that most of these more accurate routines require extensive computer time. Furthermore, since our step size is generally small, we find
we can operate with a formula of the modified Adams type.
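As an illustration only, one common low-order member of this family, assuming a constant step h equal to the frame time, is the two-step Adams-Bashforth update; the modified form actually used in a given simulator may differ:

\[ y_{n+1} \;=\; y_n \;+\; \frac{h}{2}\,\bigl[\, 3\, f(x_n,\, y_n) \;-\; f(x_{n-1},\, y_{n-1}) \,\bigr] \]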

Figure 7-Interpolation
(A function of two variables, f(x, y), stored as a table of values; points A and B are obtained by linear interpolation "on x", and point C by a further linear interpolation "on y" between A and B.)

In Figure 6 the words Interruptible and Non-Interruptible have been used as sub-headings under
Simulation Routines. This choice of words emphasizes
the fact that some of the programs involved in simulation are heavily time dependent in the sense that the
time variable is an explicit independent variable of the
function. Typically, these routines are integrations with
respect to time. As a consequence of this dependence,
these programs must be solved at fixed regular intervals
of time-that is to say, in particular frames of simulation. The interruptible programs are those which do
not exhibit this tight time dependency and as a consequence are flexible as to the frame in which they are
solved, at least to a slight degree; i.e., can be completed
in the next frame.
The subroutines as the next heading under Framing
are simply the normal set of mathematical subroutines,
including such routines as trigonometric and logarithmic functions. One of the problems encountered in the distant past was that of reentrant subroutines, arising from interrupts of some programs conceivably
in the middle of the use of a subroutine. Many of our
subroutines must be reentrant.
The major routines, while similar to routines used elsewhere, are of sufficient importance to the

field of simulation to warrant their being identified
separately.
The Function Generation routines are used for
computation of over 300 functions of one and two
variables in a large four-engine jet transport. Thus,
much core is required as well as a noticeable fraction
of the available computer time in a general purpose
computer.
Interpolation, as we implement it, is represented in
Figure 7. At the top is shown a function of two variables,
f(x, y). This function is represented in memory as a
table of values, where the arguments may be explicit
(in some manner similar to that shown) or normalized
and implicit. The small x's plotted on the curves represent the values entered in the table.
Normally, interpolation proceeds as two interpolations "on x" as in Equation A to obtain, for instance,
points A and B. These are followed by an interpolation
"on y" to obtain point C, using form B. Examination
shows this method to represent the first four terms of
the Taylor Series expansion of a function of two variables. The method of interpolation has been selected
due to the ease of changing data points 'at the last
minute' when required (and it is frequently required).


Polynomial approximations have the characteristic that
a change of a single data point affects all coefficients.
This frequently proves to be an awkward constraint.
The process of interpolation as described above involves a search for the "adjacent" tabular values, given
an argument (x, y). This search, as well as the calculation, is minimized in order to minimize the computing
time used in the process.
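A minimal sketch of this two-step table interpolation, assuming a simple linear search for the adjacent breakpoints (the production routine remembers and minimizes this search) and hypothetical function and array names:

```c
/* Two-variable table interpolation as described above: two linear
   interpolations "on x" (points A and B), then one "on y" (point C).
   Names are hypothetical; breakpoint vectors must be increasing. */
static int bracket(const double *v, int n, double t)
{
    int i;
    for (i = 0; i < n - 2; i++)      /* linear search; clamps to end intervals */
        if (t < v[i + 1])
            break;
    return i;                        /* v[i] .. v[i+1] brackets (or clamps) t  */
}

double interp2(const double *xs, int nx, const double *ys, int ny,
               const double *f,      /* f[i*ny + j] = f(xs[i], ys[j]) */
               double x, double y)
{
    int i = bracket(xs, nx, x);
    int j = bracket(ys, ny, y);
    double tx = (x - xs[i]) / (xs[i + 1] - xs[i]);
    double ty = (y - ys[j]) / (ys[j + 1] - ys[j]);

    /* interpolate "on x" at ys[j] and ys[j+1] ... */
    double a = f[i * ny + j]     + tx * (f[(i + 1) * ny + j]     - f[i * ny + j]);
    double b = f[i * ny + j + 1] + tx * (f[(i + 1) * ny + j + 1] - f[i * ny + j + 1]);

    /* ... then "on y" between them */
    return a + ty * (b - a);
}
```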
The problem of simulating radio navigation facilities
in a flight simulator is in some sense similar to the
previously mentioned problem of "function generation."
In another sense, the problem is different.
In simulating the radio facilities, it must be observed
that the trainee has at his disposal all of the types of radio navigation equipment normally
available in the aircraft being simulated. This includes
such things as UHF, VHF, omni-range, and Instrument
Landing Systems (ILS). The trainee may tune any of
these sets at any time that he pleases. It may be
parenthetically remarked that he does not do these
things capriciously or at random. However, it is difficult,
if not impossible, to take advantage of any prediction
in terms of his use of this equipment to reduce program
requirements except for the slight advantage which we
may gain by noting that he normally cannot be tuning
all of the radio facilities simultaneously.
In selecting a radio station the objective is to determine whether the receiver is tuned close enough to
the frequency radiated by the transmitter such that
were it within range it would be within the band pass
of the receiver. In tuning a radio station the program
must locate any station which satisfies simultaneous
constraints on frequency and range. Furthermore, the
implication of the transmitter being within range is
clear. Thus, the initial problem is one of table look-up
in which a determination is made on the following two
points:
a. Is the frequency to which the receiver is tuned
close enough to the transmitter frequency such
that the transmitter signal can be passed through
the receiver?
b. Is the location of the aircraft close enough to the
location of the transmitter for the given transmitter power such that a signal of sufficient
strength would have been received by the real
aircraft radio equipment were the real receiver
tuned to that station at the same range? This is
typically simplified to an x and a y calculation
rather than a range calculation.
Several factors influence the approach taken in programming this activity. Since the various types of sta-

tions are generally received on separate receivers, it is
clear that the first step in proceeding is to detect the
type of receiver being tuned and search only the table
containing simulated transmitters acceptable to that
receiver. A second factor is the necessity for rapid
response. It is required that, as the trainee tunes the
receiver, he hears the little "blips" that he would
normally hear as he tuned through stations. If the
trainee is tuning from, for instance, channel 12 through
channel 32, and channels 20 and 24 are within range,
as he passes through these during his tuning, he should
hear a response from the radio.
There are occasionally duplications so that two radio
stations having the same frequency but at different
ranges may exist. It is necessary, if the primary search
is on frequency, to have a secondary search on range to
eliminate stations which are not within the prescribed range.
While it is possible to arrange the stations within each.
category to be monotonic, it is not generally true that
the frequencies will be equally spaced. In fact, as mentioned before, there may be repetitions of some frequencies. A search algorithm8 which has produced
excellent results for us has been developed. This algorithm may be characterized as being "real-time" in
nature. It remembers where it was the last time it was
used and consequently in view of the continuous nature
of most tuning it reduces the amount of search to find
the station which is being sought.
After the radio station has been identified the variety
of information germane to this radio station is located
in a block, "unpacked" and presented in another block
to the computation programs related to radio navigation.
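A sketch of the kind of station search described, with hypothetical structures, units, and tolerances; it resumes from the previous hit and applies the frequency test and the simplified x-and-y range test:

```c
/* Sketch of radio-station selection: resume the scan where the last search
   stopped, and accept a station only if it passes both the frequency test
   and the simplified x/y range test.  Structure, names, units, and the
   band-pass tolerance are hypothetical. */
#include <math.h>

struct station { double freq, x, y, range; };    /* one simulated transmitter */

int find_station(const struct station *tab, int n, int *last,
                 double tuned_freq, double band,
                 double aircraft_x, double aircraft_y)
{
    for (int k = 0; k < n; k++) {
        int i = (*last + k) % n;                 /* continuity of tuning */
        if (fabs(tab[i].freq - tuned_freq) <= band &&
            fabs(aircraft_x - tab[i].x) <= tab[i].range &&
            fabs(aircraft_y - tab[i].y) <= tab[i].range) {
            *last = i;                           /* remember for next time */
            return i;                            /* station is usable      */
        }
    }
    return -1;                                   /* nothing in band and in range */
}
```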
Interrupts

The simulator computer programs are dependent
in significant ways upon interrupts. An interrupt is a
signal, generally from an external source, which causes
a discontinuance of the normal sequence of calculation
accompanied by storage of sufficient information to
resume this calculation upon command, and by a
transfer in control (jump) to a prescribed location in
memory corresponding to the interrupt which has
occurred.
One of the primary uses of interrupts is the insertion
of a relatively precise time signal at stipulated intervals.
This signal occurs, in most cases, every fifty milliseconds. This signal is used by the programmers as
representative of real-time input. In a sense it is the
means of synchronizing the operation of the simulator
computer with the external world.
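In outline (hypothetical names, and only a stand-in for the actual hardware interface), the clock interrupt does little more than mark that a new frame may begin, and the executive waits on that mark:

```c
/* The real-time clock interrupt only marks frame boundaries; the executive
   waits on that mark before dispatching the next frame's programs.
   This is an outline, not a real interrupt interface. */
static volatile int frame_due = 0;

void clock_interrupt(void)        /* entered by hardware every 50 ms */
{
    frame_due = 1;                /* synchronize computation with real time */
}

void executive_loop(void)
{
    for (;;) {
        while (!frame_due)
            ;                     /* idle until the real-time mark arrives */
        frame_due = 0;
        /* dispatch this frame's interruptible and non-interruptible programs */
    }
}
```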


As previously noted when discussing "framing" the
computer executive must be aware of real-time, or at
least that which both the computer and its immediate
environment believe to be real time. In this context,
of course, the trainee is construed to be part of the
"immediate environment." Using these time signals the
computer may conveniently count the frames. In most
cases sixteen frames are grouped together as a "Cycle."
The choice of sixteen is more or less obvious in that the
rate of solution of programs requiring different rates
may be conveniently halved as the time requirements
become less stringent, permitting rates with ratios of 16:8:4:2:1.

This framing has other implications. The most important of these is perhaps the situation which arises when two computers are operated together in a large simulator. Under these conditions, it is necessary that the two computers not only be starting each frame together, but, perhaps more importantly, that they start each Cycle together. This synchronization requirement is a direct result of the requirement that information transferred in either direction between computers be properly phased (if I may use that word) in order to observe stability requirements in the solution of differential equations.
Input/output interrupts are fairly standard. In our applications we have used multiplexed channels for the obvious reasons. The channels have been allocated to both the normal computer type peripheral equipment, such as card readers and line printers, as well as the real time equipment such as D/A converters and digital word channels.
On some simulators involving multiple processors and on some earlier simulators using somewhat slower computer equipment with somewhat different I/O facilities, several additional interrupts were required. In one case an interrupt was required between computers to indicate the fact that a transfer of data between the computers in a given direction had been completed. This was used by the computers to set up for transfer in the other direction. In fact, in an earlier simulator using three separate computers without any shared memory, there were actually six data transfers to be made. This inter-computer interrupt was essential in this situation. It was in this same project that a separate I/O control box was developed and provided with its own interrupt. In this case a computer not involved in inter-computer transfers was being serviced by the external real time input/output equipment and required that interrupt for proper operation.

Simulator programs

Specific to training itself as a general problem are some programs, which we have referred to as simulator programs, used in the educational process. These programs are not involved in the simulation of any of the equipment used by the trainee. Nor are they involved with the operation of the computer as a device containing a model of the thing simulated. In general these programs provide functions which could either not be done in the live equipment or could be done only at unnecessarily high risk (in an airplane).

Malfunctions

The first of these is called "malfunctions." The term,
I believe, is reasonably explicit. The intent of these programs is to provide a simulation of some system of the
aircraft when operating incorrectly and to exhibit appropriate indications of this faulty operation. Typically,
in the programming of a simulator, this might be
represented as a coefficient on an equation representing
some performance. These failures show up in several
ways. Some failures are complete; thus, an output
might be either a "normal" value or a constant (usually
zero), depending upon whether the system being simulated is intended to be operating or "failed." In other
cases the failure might be partial. In such cases the
math model would have a coefficient which might be
varied from unity (for normal operation) to some other
number to indicate the failure. The control of these
malfunctions is generally at the discretion of the instructor. He has means of introducing into the computer a number representing the malfunction and a
switch indicating that the malfunction shall become
effective. In terms of a programming of such a situation
the equation to be solved representing the system is
normally preceded by a check of the malfunction.
Should the malfunction be total, a complete branch
around the calculation may be feasible with the insertion of a constant as the pseudo result for the malfunctioned system.
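Reduced to a sketch (the flag, coefficient, and the equation itself are invented for illustration), the pattern is a check preceding the system equation:

```c
/* Total and partial malfunctions applied to one system equation.
   The flag, coefficient, and the pressure equation are illustrative only. */
double hydraulic_pressure(double pump_speed, int malf_total, double malf_coeff)
{
    if (malf_total)          /* complete failure: branch around the calculation */
        return 0.0;          /* and insert a constant as the pseudo result      */

    /* partial failure: coefficient is 1.0 for normal operation, something
       else when the instructor has entered and armed the malfunction */
    return malf_coeff * 3000.0 * (pump_speed / 100.0);
}
```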
The instructor who has carefully planned his training
session will have a script indicating which malfunctions
shall be introduced and under what conditions. Clearly,
this indicates that the control of the introduction of
malfunctions could be a computed operation. In fact,
this is sometimes done. The price of doing this is not
only the price of the additional program to continually
check these conditions (time or other conditions), but
the additional program which permits the instructor to


enter the malfunctions that he wishes and the conditions
under which they are to occur. These data will change
from training session to training session. But the price
is not completely paid yet. On top of this, it is almost
always required that the instructor have the prerogative
of overriding even those malfunctions that he has
scheduled. Thus, additional programs must be introduced to check his "override" control so that if he
chooses to eliminate the malfunction for some reason
such as the training mission not going as planned, he
can do so.
Freeze

Freezing or halting of the simulation is another
feature almost always included in a training simulator.
The instructor can, upon command, stop all computation. He does this in some cases in order to converse
with his student and instruct him on the correct action
or procedure. This feature is used normally only when
there is a gross mistake which the instructor feels is so
important that it needed to be corrected before the student is influenced by any repetition of this mistake.
Typically, this feature is accomplished in the program
in an indirect way. By this I mean the typical operation
is to set any variables representing increments of time
to zero. Thus, time stands still. On the other hand, it is
generally of supreme importance that the output be
continued in order to keep all of the displays at their
readings just prior to freeze. It may be essential to the
instructor's purposes to refer to these readings in his
explanation to the student as to what should have
prompted him to do something other than what he has
done. Inputs are also typically permitted during Freeze
-not so much for the function during Freeze only, but
because of their necessity during Reset or Initialization.
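A schematic version of this indirect mechanism, with invented names: the time increment is forced to zero while the input and output passes continue to run:

```c
/* Freeze: time stands still, but outputs keep refreshing the displays and
   inputs continue to be accepted.  All names are illustrative. */
static int    freeze_on = 0;       /* set and cleared from the instructor console */
static double frame_dt  = 0.050;   /* normal 50-ms frame time, in seconds         */

static double effective_dt(void)
{
    return freeze_on ? 0.0 : frame_dt;   /* dt = 0 halts all integrations */
}

void frame_pass(void)
{
    double dt = effective_dt();
    (void)dt;
    /* read_cockpit_inputs();          still accepted during freeze          */
    /* integrate_aircraft_state(dt);   state is unchanged when dt == 0       */
    /* drive_displays();               readings held at their frozen values  */
}
```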
Parameter freeze

Another form of "freeze" is the so-called parameter
freeze. This term refers to the fixing of a given variable
in the flight simulator. For example, it may be desirable
to relieve a pilot of the necessity of controlling a variable
in order to reinforce his learning the control of some
other variable. To this end, parameter freeze is used.
Thus, for example, the altitude may be maintained at
some fixed number if the primary intent is to train the
pilot to use radio equipment for navigation. If, for
instance, it is desired to emphasize the training of
keeping an aircraft on the localizer, the required rate
of descent might be frozen so that the pilot is relieved
of the necessity of keeping the aircraft on the glide

slope. The implications of this on the programming are,
of course, clear. The computation of the number may
be inhibited, or more likely, it is allowed to continue,
but the program simply inhibits the storage of the computed answer, preferring to insert a constant in its place.
Fast time

In some simulators, notably space simulators, 'fast
time' has been a feature of the simulator. In this situation, the training is suspended temporarily and the
time increment is increased by a factor simulating
passage of real time faster than the actual time elapsed.
This may be useful in some circumstances to exhibit
to the trainee the long term effects of an action he has
taken. This feature is intended to show the trainee the
effects of severe errors. Typically, fast time has an
attendant inhibition of any action by the student
affecting the simulation.
Reset, initialize

Reset and initialize are two words for the function
which permits the simulation to be established at an
a priori condition for training. Thus, one condition
might be the aircraft at the beginning of the runway
prepared to take off. Another condition might be the
aircraft in full flight at a given velocity, altitude and
attitude preparing to make an approach to landing.
These conditions are stored in bulk memory of some
type, usually magnetic tape. They could, however, be
introduced through a medium of punched cards or
punched paper tape if a severely reduced amount of
equipment is necessary for economic reasons. It is to be
remarked that when the flight simulator is being reset
or initialized, it is in a freeze condition. Thus, the
input/output equipment is working so that when the
new conditions are in memory the meters, lights and
other indicators are all driven to the proper condition
while the aircraft is still in freeze. The controls are set
manually and accepted as input. This is necessary in
order that the trainee may adjust himself to these
values prior to his having to actually "fly" the airplane.
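A sketch of the reset sequence implied above, with invented structure and field names; the stored condition would come from tape, cards, or paper tape as described:

```c
/* Reset/initialize: copy a stored flight condition into the live simulation
   state while the simulator is held in freeze; the regular output pass then
   drives meters and lights to the new readings before freeze is released.
   Structure and field names are invented for illustration. */
struct condition { double lat, lon, alt, heading, airspeed; };
struct sim_state { struct condition c; int frozen; };

void reset_to_condition(struct sim_state *s, const struct condition *stored)
{
    s->frozen = 1;        /* reset always takes place in freeze */
    s->c = *stored;       /* stored conditions come from bulk storage */
    /* the normal output programs now run, so the cockpit indications settle
       at the new condition; the instructor releases the freeze afterward */
}
```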
Record-playback

A relatively recent requirement introduced into the
simulator business has been the "Record-Playback"
feature. The objective of this operation is to record
significant variables during the course of training exer-


cise. After the exercise has been completed, the recording is played back driving the required instruments to
review for the trainee what his performance was during
the training exercise. Properly used, this feature is a
great training aid in that the instructor, knowing what
is coming up, can indicate to the trainee the effects
which are about to occur in response to the trainee's
action. The major problem in dealing with this feature
from a programming standpoint is the control of the
number of variables to be recorded. The programmer
looks for those variables which might be construed as,
in some sense, basic. This may be interpreted as the
smallest set of variables which, with the aid of normal
simulation computations, can cause the proper instrument readings to be reproduced reflecting the pilot's
performance.
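A sketch of the record-playback idea, with an invented choice of basic variables and buffer size:

```c
/* Record a small "basic" set of variables each frame; on playback the normal
   simulation output programs recompute the instrument readings from them.
   The choice of variables and the buffer size are invented. */
#define MAX_FRAMES 24000                       /* 20 minutes at 50 ms per frame */

struct basic_state { double lat, lon, alt, pitch, roll, heading, airspeed; };

static struct basic_state tape[MAX_FRAMES];
static int recorded = 0;

void record_frame(const struct basic_state *s)
{
    if (recorded < MAX_FRAMES)
        tape[recorded++] = *s;
}

const struct basic_state *playback_frame(int n)   /* n-th recorded frame, or 0 */
{
    return (n >= 0 && n < recorded) ? &tape[n] : 0;
}
```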
Instructor functions

I refer to the various controls which may be exercised
by the instructor at his console. Some of these have been
mentioned previously in our discussion of the instructor's control of Malfunctions. Implicitly he also has
controls for Freeze, Parameter Freeze, Reset and
Initialize, Record and Playback. Programs to examine
the inputs from these various instructor stations are
clearly required.
In addition, certain other conditions may be required
for simulation and ordered by the instructor. Winds
and turbulence in varying amounts may be introduced
by the instructor for training of the pilot under these
conditions. Icing on the aircraft or snow, slush or ice
on runways may also be introduced for the obvious
training purposes. These inputs and their attendant
effects require additional programs to be introduced.
Typically the outputs of these programs are used to
modify some variable of the computation. The method
of programming is similar to that used in the introduction of malfunctions (malfunction of the weather?).
Synchronization

In those simulators using more than one central
processing unit, additional programs dealing strictly
with computer operation, specifically and most importantly synchronization, are also required. Although
the real time interrupt can be relied upon to keep the
two simulator computers in synchronization once
started that way, it is necessary to insure that they
start simultaneously. Once this has been arranged, it is
a simple matter to make the program check from cycle
to cycle that they have remained in synchronization.


Thus, any transient failure of one of the real time
interrupts may be overcome in some cases.

Diagnostics

Diagnostics are, of course, required. In addition to
the normal mainframe and central processor unit
peripheral diagnostics, additional diagnostics for the
real time input/output equipment and for the cockpit
equipment are generally required. These diagnostics
are made automatic to the greatest extent possible.
However, much of the cockpit equipment involves
human interaction in either reading or observing the
results of a computer output or in controlling something
which results in a computer input. In view of the size
of the complete system, it is clear that a reasonable set
of diagnostics is mandatory. The problem involved in
reaching this "reasonable" set is important since the
objective of keeping the equipment on the air and the
objective of reducing the length of time it takes to
run the diagnostics conflict. Delicate compromise is
required.

SUMMARY
Programming of training simulators involves a combination of scientific, real-time and multiprogramming
problems. During program development the activity is
similar to laboratory scientific programming. Throughout the process emphasis is on efficient core and time
utilization characteristic of all real-time programs. The final program, being a combination of many programs operating under a real time executive, resembles a multiprogram operation. Much of the program is devoted to simulator-related as contrasted with simulation-related programs.


The handling qualities simulation program for
the augmentor wing jet STOL research aircraft
by WILLIAM B. CLEVELAND
NASA-Ames Research Center
Moffett Field, California

INTRODUCTION

Aircraft have been simulated on computers for a variety of reasons. The training of pilots and crews on operational flight trainers, for example, is a common use of simulation. Subsystems of aircraft are often simulated to firm up the design of the hardware, and frequently the whole aircraft must be simulated to help in the design of the subsystems. This is the case in the simulation of the Augmentor Wing Jet STOL Research Aircraft, a modified de Havilland C8-A Buffalo, in which the total aircraft was simulated to determine final design values for control systems and devices which augment control of the aircraft. For research and development simulations such as this one, simulation software and hardware must have general application, while in piloted "man-in-the-loop" simulations speed of computation is the overriding concern. Thus the aircraft model and computer software and hardware must be merged to provide an accurate simulation which meets the needs of the research objectives.

The STOL problem

Short Take-Off and Landing (STOL) aircraft are designed to use 1500 ft. runways as opposed to the 10,000 ft. runways commonly used by commercial jet transport aircraft. To meet this requirement it is necessary to fly at a slower speed with a resulting steeper flight path angle. Due to the slow speed requirement all aerodynamic control of the aircraft is reduced, as aerodynamic control power is proportional to the square of velocity. For example:

Lift ∝ V²CL

The coefficient of lift CL is a function of the shape of fixed parts such as the fuselage and wings, but it is varied by movable surfaces such as flaps and spoilers. As the aircraft velocity decreases or increases, CL must be increased or decreased to maintain the required value of aerodynamic lift. When conventional means cannot produce sufficient lift, special devices are a necessity. For example, if insufficient lift is obtained from aerodynamic properties, direct thrust lift from the jet engines may be employed. Similarly, lateral-directional (roll-yaw) control is reduced in the same manner as lift at the lower speeds, making the STOL class of aircraft in general dependent on special aids in roll-yaw control as well as lift. For the C8-A a special aerodynamically high lift flap is used in conjunction with vectorable jet engine thrust to provide the necessary lift at low landing approach speeds.

Handling qualities

Along with the loss in control effectiveness the stability of flight maneuvers is also reduced, and improvements in stability as well as control must be introduced to bring the aircraft "handling qualities" up to an acceptable level. "Handling qualities" is a general term in which the aircraft characteristics are rated by pilots on a scale ranging anywhere from "uncontrollable" to "optimum." The pilot must be asked to perform a specific task. For instance, he may rate the handling quality of an aircraft in a normal landing approach as "optimum." However, the same approach may be difficult or nearly impossible with an engine failed, and his rating would be consequently lower. The ratings are nearly purely subjective, as the rating is the pilot's opinion. Normally several pilots are used to avoid personal biases and obtain a crude statistical sampling for a handling quality rating. Since one of the primary outputs of a simulation of this type is the subjective pilot evaluation, systematic investigations require exacting simulation models and the ability to introduce or repeat any combination of initial conditions, failure modes, control effectiveness, air roughness, etc., desired to obtain reliable pilot ratings.

Figure 1-The augmentor wing jet STOL research aircraft

STATEMENT OF PROBLEM
The augmentor wing aircraft

The Augmentor Wing Jet STOL Research Aircraft
is sponsored jointly by the National Aeronautics and
Space Administration and the Department of Industry,
Trade and Commerce of Canada. Major modifications
of a de Havilland C8-A Buffalo are presently under way
to provide the research vehicle, Figure 1.
To meet the short field requirements set by the
Federal Aviation Agency for STOL aircraft the airplane
contains several novel pieces of hardware. The augmented jet flap is a high lift device which gets its name
from the blowing of a flat jet of air down the slotted
flap as seen in the wing section diagram, Figure 2.

Cold air from the fan-jet engines is ducted to ejector
nozzles to provide the air jet. The ailerons also contribute lift by drooping in conjunction with the flaps
but only up to one half the total flap deflection. The
aileron is a boundary layer control device in which air
is blown over the surface to improve the aerodynamic
force characteristics. Normal roll control by the ailerons
is provided by differential aileron deflections about the
common flap-aileron angle operating point.
The effect of air blowing on the aileron may be seen
in Figure 3, which gives the rolling coefficient Clδa as a function of aileron deflection δa and blowing coefficient CJa. The coefficient CJa is a measure of the cold air thrust, Tc, non-dimensionalized through division by the product of dynamic pressure and wing area (CJa = Tc/q̄S). As CJa increases for any down-going aileron angle (δa > 0), so does the rolling moment on the aircraft. It is apparent that boundary layer control adds great effectiveness over the non-blown aileron. Since the roll coefficient curves represent an individual aileron, the total rolling coefficient for both ailerons is Clδa(port) − Clδa(starboard).
The engines themselves are remarkable in that they
provide the air for the flap blowing and aileron boundary
layer control and more so since the engine thrust is
deflectable. The thrust angle, under pilot control, is
normally directed aft for cruising but it may be directed
straight down to provide direct engine lift at slow
speeds.

Figure 2-Cross section of the augmented jet flap wing

Figure 3-Rolling coefficient Clδa, a function of aileron deflection δa and blowing coefficient CJa for one aileron; down-going is positive


Objectives of the program

An early piloted simulation of the modified Buffalo
was conducted to determine just how the pilot might
best control the aircraft in takeoff, transition to cruise
and back to landing, and landing itself. The results
showed that the aircraft overall had acceptable handling
qualities, but the pilot's workload was higher than
desirable for a commercial STOL aircraft. However, it
was felt that the use of Stability Augmentation Systems
(SAS) in lateral (roll) and directional (yaw) would
reduce the work load to a satisfactory level. Systems
that were designed to help the lateral characteristics
provided turn coordination and dutch roll damping. In
order to evaluate these control aids the aircraft was
simulated on a moving base simulator in as much detail
as possible. The test pilots had not only the normal
flight instrumentation but good visual and motion cues
to make their evaluation of the proposed stability
systems. Various effects such as control system hydraulic failures, SAS servo failures, and engine failures
were needed in addition to the usual disturbances such
as air turbulence to evaluate the handling qualities of
the aircraft in both normal operating and failure modes.

Computer requirements of the handling qualities simulation

The problems discussed in these previous sections
place certain requirements on the simulation and at
the same time allow some few concessions. To be
specific the simulation of this vehicle must provide
the following:
1. Six-degree-of-freedom equations of motion completely rigorous and no approximations; a flat
earth is acceptable in a landing study.
2. Accurate and detailed aerodynamic derivatives
including all coupling effects. This is required to
assess handling qualities.
3. Engine performance model.
4. Stability augmentation system to help make the
aircraft easier to fly in the lateral-directional
modes.
5. Air turbulence model, wind shear, gust upset,
and/or steady state winds.
6. System failures, engine failures, SAS failures of
several types.
7. Non-steady aerodynamic effects; for example, the effects of a time delay from when the air
flows over the wing until it reaches the horizontal
stabilizer.
8. Landing gear model for landing and roll out.


Other simulations, especially of high speed and very
large aircraft, would require a representation of body
bending modes and aero-elastic effects.
The amount of computation required to accomplish
these items listed above made it impractical, if not near
impossible, to do with analog computers (at least with
the desired accuracy). Thus this simulation was done
using a digital computer.
SIMULATION HARDWARE AND SOFTWARE
Simulation computing system

The Ames Research Center simulation computing
system is made up of both analog and digital computers.
The principal component of the system is the EAI 8400
Digital Simulation Computer. This computer is a 32 bit,
32,000 word machine with a real-time interval timer
and floating point hardware. Its input/output to the
analog domain consists of 32 bits "in" and 32 bits "out"
of discrete on-off signals as well as 64 channels of multiplexed analog-to-digital converters (ADC) and 64
DACs. Peripheral equipment includes four magnetic
tapes, line printer, card reader, disk file, and typewriter.
Of this equipment 29,500 cells, 64 DACs, 16 ADCs,
17 discrete bits "out" and 12 bits "in" were used in
addition to the computer peripherals. The memory
was allocated as follows: 6,500 for the simulation
monitor system, 21,000 for the simulation program, and
2,000 for the simulation software (user provided).
Because most of the cockpit instrumentation was
developed in the past for all-analog simulations, data
communication with the simulator cab is entirely
through the ADC-DAC linkage and the discretes. No
digital instruments were used.
In addition to the digital computer an analog computer (EAI 231-R) was used. Whereas the digital
computer computes the aircraft model as its primary
work, the analog computer is used as a buffering device
between the digital computer and the analog recorders,
motion and visual simulators, and the various cockpit
instruments and controls.
Theoretically the analog computer was not needed
since no part of the aircraft was simulated on it but with
the myriad of devices requiring data transfer to and
from the digital computer it is impractical not to have
some sort of analog computer for a data trunking center.
No claim is made to call this a hybrid computing
arrangement due to the nature of the workloads on the
two systems; however, it is interesting that the requirement exists for analog components to be available for
the "digital" simulation.


Figure 4-Flight simulator for advanced aircraft located at
NASA-Ames Research Center, Moffett Field, California

Simulator system

To obtain the most valid handling quality evaluations
the pilot must be subjected to as many motion, visual
and aural cues as he would experience in flight. While
this is impossible to achieve on a ground based simulator
the most important cues of STOL aircraft can be faithfully duplicated on the large motion generator at Ames
called the Flight Simulator for Advanced Aircraft
(FSAA), Figure 4. This motion simulator has a fully instrumented cockpit and six-degree-of-freedom travel capabilities of ±50 feet in lateral, ±4 feet in vertical and longitudinal, and at least ±22.5 degrees in roll, pitch, and yaw. The FSAA was chosen for its large lateral travel, which proved most useful in the roll-yaw control handling qualities study and also in the engine-out maneuvers. In the simulator the pilot has the capability of flying on instruments or by visual contact. The visual scene is an out-the-window pictorial representation of a landing field and surrounding countryside. The scene is a television representation in which the model of the landing field is scanned by a color television camera mounted in gimbals so that, in addition to three translations, the angular motions of roll, pitch, and yaw of the airplane are displayed to the pilot.
Ames simulation software

The simulation software system at Ames has evolved
from manned simulation requirements. With a "man-in-the-loop" all work must be performed in real time. Historically, manned simulations have been done on
analog computers with their fast parallel computing

capability and only recently have digital computers
been used widely for the real-time problem. Execution
time is of prime importance, thus the software used in
simulations must meet rigorous execution time requirements. At present the Ames simulation software is of
two types: hardware support and program support.
The program support software is a group of programs
under the label FAMILY I.1 The main features of
FAMILY I include the basic Ames simulation monitor,
integration packages, real-time magnetic tape data
dump and two special systems-MOTHER (Monitor
Time Handling Executive Routine) and CASPRE
(Comprehensive Aid to Simulation Programmers and
Engineers). The hardware support software serves to
make analog type operations tractable from the digital
computer. This software is critically important to the
operational efficiency of Ames' simulations.

Program support software

A real time scheduler called MOTHER was developed
in response to the problem arising from the speed
limitations of the EAI 8400 digital computer. It was
recognized that programs that contained high frequency
systems or that sampled high frequency analog inputs
must sample and solve the system equations at high
rates to produce sufficient accuracy. However, the size of
our simulations indicated that one big program loop
solving all the systems would normally produce integration and sampling step sizes too large to accurately
reproduce the high frequency portion of the problem.
The easiest solution to this problem would be to get a
computer fast enough, but when the problem is not too
large and the frequencies are not too high, there are
less drastic solutions. The approach used in the Ames'
simulations is to do the high frequency operations more
often than the low frequency ones using a special piece
of software called MOTHER.
MOTHER is a scheduling and executive routine
which schedules subroutines to operate at synchronized
time intervals or to operate within given time constraints. Subroutines are defined to MOTHER to
operate within specific time constraints. By organizing
all the high frequency computations into one list and
the remainder into a second list one may define to
MOTHER what the computation rate is to be on each
list. Both lists are called constrained processes because
they must be completed within time constraints, as opposed to synchronized processes such as input/output
of analog data which must be performed at specific
time intervals.


In Figure 5 a two loop program is shown as it might
be run under a MOTHER schedule. Assume that the
I/O for loops A and B must be executed every 10 and
20 milliseconds respectively and that the computation
blocks, A and B, must execute within the same 10
and 20 milliseconds.
Since the I/O processes are synchronized, they are
executed first. After the I/O the constrained processes
begin; process A has been scheduled to execute first
since it must execute within the shorter time constraint.
At the completion of A, process B begins; at the time
of 10 milliseconds the I/O of A interrupts the constrained process and proceeds to execute. Note that A
executes again before process B resumes its execution
and finally is completed. At this point all the synchronized and constrained processes are completed for
this period. The period as used in this context is the
shortest time into which all the various process times
must divide integrally. The ability of MOTHER to do
this scheduling is a great aid to the simulation programmer since all he must do is to provide the simulation
subroutines and specify the calculation rates. The
scheduling burden has been removed.
Other items that have proved useful are MOTHER's
executive features such as the servicing of simulation
mode control and discrete signal inputs. In the typical
operating environment a request for a mode change
may come from any of three areas: The analog domain,
the program itself, or from the digital control console.
MOTHER receives the mode requests and services
them by first setting a user-provided mode word and
then executing the process list defined for this mode.
Discrete bit signals from the analog domain are serviced

by either setting a corresponding Fortran word high or low according to the condition of the input bit or by executing special subroutines not in the lists of defined constrained processes. Figure 5 shows the mode and discrete bit servicing periods which follow execution of the defined processes.

Figure 5-MOTHER schedule for two computation loops
The digital simulation computers are completely
dedicated to simulations as Ames' batch work is performed at a separate central facility. Hence the emphasis
in the simulation laboratory is not on computer throughput but on computer/simulator uptime. In this context
computer uptime means not only that the hardware be
operational but also that the program be operationally
useful. Experienced simulation engineers will certainly
agree that changes to programs occur at a high rate and
in a seemingly never ending stream. In this state of flux
a means of quickly updating and changing data and
equations is a necessity. Since the simulation programs
are written in Fortran, recompilation is the only practical means of making lengthy changes. However, small
changes may not warrant delaying an operational
simulation to make a time consuming compilation. The
software used to make minor changes is called CASPRE.
Changes of data constants, for example, are very
simple, e.g., if it was desired to set the weight of the
aircraft to 100,000 pounds the computer operator need
only type + WEIGHT = 100000$. The elements of this
statement fall into several categories. The +, =, and
$ are commands or operatives defined to perform
specific tasks on the character strings WEIGHT and
100000. The plus sign indicated to CASPRE that a
Fortran name was to follow, the equal sign is a command
to set the "weight" cell to the following data concluded
by the dollar sign. This flexibility and ease of change is
essential to the operation in research and development
simulations.
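As a toy model of the data-change directive just described (CASPRE itself patched cells of the running EAI 8400 program), a sketch in present-day Python; the dictionary standing in for the program's data cells is an assumption:

    def caspre_set(directive, data_cells):
        """Parse a '+NAME=VALUE$' directive and store VALUE in the named cell."""
        directive = directive.strip()
        if not (directive.startswith("+") and directive.endswith("$")):
            raise ValueError("expected a directive of the form +NAME=VALUE$")
        name, value = directive[1:-1].split("=", 1)
        data_cells[name.strip()] = float(value)
        return data_cells

    cells = {}
    caspre_set("+WEIGHT=100000$", cells)   # cells["WEIGHT"] is now 100000.0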
Some of the many directive codes in the CASPRE
system include typewriter display and modification of
a data cell's contents in floating point, octal, or integer
formats and even in Binary Coded Input (BCI) if the
cell happens to contain such data. The display is also
available on the line printer making it possible to
provide short data print-outs. In practice this print
capability is rarely used for more than program checkout.
Small program changes are made with a combination
of machine language instructions and symbolic addressing. Making these changes requires some knowledge of the machine language codes but the nature of
program patches normally requires only arithmetic
plus a few conditional branch instructions. For example,
a very common request is for a sign change in an
equation. This is easily done by a simple operation

such as changing an add to a subtract instruction. Large changes such as the insertion of an equation into the program are made by a jump from the compiled program to an area of memory set aside as a patching area, where the equation is patched in machine language and then a jump back to the equation stream.

Figure 6-Calibration of cockpit instruments using INSCAL

Hardware support software

One of the operational problems associated with
flight simulators is the calibration of the cockpit
instruments. The instrument readings are, for the most
part, linear with voltage input but the zeroes and
scaling vary with each instrument. It is a daily process
to calibrate and check each instrument used. Calibration
consists of both a bias and a gain on the problem
variable. It was the practice at Ames to provide a
DAC channel scaled in some convenient manner and
with an analog computer do the necessary voltage
scaling and biasing before sending the signal to the
instrument. This straightforward approach proved to
have a major fault albeit a human one. Due to the
general purpose nature of the computers and the
simulators there is a large turnover of programs on the
equipment. The analog equipment requires a high
amount of human attention to keep up with changes
which frequently resulted in improperly scaled instruments and control inputs thereby plaguing the uptime
record of the facilities. The solution to this problem was
to scale and bias the instrument drive signals in the
digital program before sending them out via the DACs.
INSCAL was devised to ascertain just what values of
gain and bias were required. Figure 6 shows two switches
and an instrument located in the simulator cabin. A
computer operator selects the instrument to be scaled
by typing the DAC channel number into the INSCAL
routine. From this point one of the simulator personnel
in the cockpit performs bias and scaling. When the
operator sets switch 1 to Bias, the variable to that
DAC is set to zero and the DAC output is only a bias
voltage. If the instrument is not in its null position,
switch 2 is used to increase or decrease the value of the

bias until it does null. Having determined the voltage
bias of this instrument, switch 1 is set to Scale position.
Internal to the INSCAL program a static test value has
been previously assigned to the variable to aid in the
determination of its scale value for the instrument.
With switch 1 in the Scale position the variable is set to
its static value. The operator again uses switch 2 to
increase or decrease the scale factor until the meter
reads the prescribed test value.
For most simulations this process is repeated for 10
to 30 instruments. However, once the initial cockpit
calibration has been done the gains and biases, having
been saved in the program, are available for subsequent
setup of the program. Normally, subsequent calibration
checks require no changes to the stored gain and bias
values, thus dramatically reducing the setup and turnaround time for simulations.
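The net effect of an INSCAL calibration is simply a stored gain and bias applied to the problem variable before it is sent out through the DAC; a minimal sketch in present-day Python, with an assumed ±1.0 DAC range:

    def drive_instrument(value, gain, bias, dac_range=1.0):
        """Scale and bias a problem variable in software before DAC output."""
        volts = gain * value + bias
        # clamp to the DAC's usable range (the range value is an assumption)
        return max(-dac_range, min(dac_range, volts))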
A requirement for many channels of recorded data on
analog strip chart recorders prompted the multiplexing
of 2 variables onto one recorder channel. Multiplexing
can be done by mechanically switching between contacts carrying the appropriate signals. This approach
will require many DACs in the digital simulation. By
multiplexing the variables in the digital computer the
same effect is produced, but only one DAC is used to
display two variables. Figure 7 shows an example of
what can be done with multiplexing. The input and
output signals of a control system have been multiplexed onto one data channel for comparison. Normally
high frequency and/or discontinuous signals are not
multiplexed since in that form they would be very hard
to read. Most variables can be multiplexed in aircraft
simulations and the use of this routine has proved to
be quite successful.
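A sketch of the digital multiplexing idea in present-day Python: two problem variables alternate on a single DAC channel, frame by frame (the alternation rate used at Ames is not stated in the text, so the even/odd split is an assumption):

    def multiplexed_sample(x, y, frame):
        """Return x on even frames and y on odd frames for one DAC channel,
        so a single strip-chart pen traces both variables for comparison."""
        return x if frame % 2 == 0 else y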

Figure 7-Strip chart recording of X and Y multiplexed for comparison

AWFTV PROGRAM MECHANIZATION
The organization of the digital program is illustrated
in Figure 8. The middle block contains MOTHER, the
executive program for the aircraft and support subroutines. The upper block in the figure represents the
main program of the simulation which defines to
MOTHER the timing requirements for the execution
of programs and input/output data transfer schedules.
The three lower blocks constitute the simulation
which consists of two real-time calculation loops and a
set of supporting subroutines. Because MOTHER can
schedule the computations of fast variables more
frequently and slow variables less frequently, MOTHER
makes it possible to run complex programs with
adequate dynamic fidelity that otherwise would not be possible using a straightforward serial calculation of all variables. Two calculation loops proved satisfactory in this
simulation. The high frequency loop contained the
rotational kinematics, part of the rotational aerodynamics, the control system and the landing gear
model. The low frequency loop contained the translational kinematics, the remainder of the aerodynamics,
the engine model and the simulator drive calculations.
The terms "high" and "low" frequencies are relative, of
course, and the split of the workload is somewhat
subjective. Usually the rotational behavior of an aircraft in flight contains higher frequencies than the
translational. As a natural grouping of rotationally
oriented work the control system and rotational
aerodynamics were put into the fast loop with the
rotational kinematics. The landing gear equations must
be solved relatively fast due to the high transient
frequencies present upon touchdown. The lower frequency work is essentially the remainder of the workload. One improvement that most probably will be
made in the future is to solve for altitude in the fast

loop for improvement in the simulated landing gear response.

Figure 8-Organization of the digital simulation program: schedules and definitions for MOTHER; MOTHER, the Monitor Time Handling and Executive Routine; the high frequency list (rotational dynamics, landing gears, aerodynamics, control system); the low frequency list (translational dynamics, aerodynamics, engines, simulator variables); and support capabilities (trimming routine, print routines, dynamic checks, library functions)

Figure 9-Translational dynamics block diagram (blocks labeled aerodynamics, engines, landing gears, body velocity)
The digital simulation depends upon several subroutines, some of which do not run in real time. Data printout and aircraft trim calculations cannot run in real time. However, the three dynamic check routines, which provide modal response checks, must run in true time. These routines are simply called by pressing a
computer console push button. Other support software
such as wind turbulence models, random number
generators, and arbitrary functions of one, two, or three
variables require a Fortran call in either the high or
low frequency simulation loops.
Equations of motion

The equations of motion chosen for this simulation
are a six-degree-of-freedom rigid body set. The set
assumes a flat non-rotating earth in which the linear
accelerations are integrated in a local horizontal Euler
axis system and the angular accelerations are integrated
in the vehicle's conventional body axis. This set of
equations is sufficient for landing studies in which the
range traversed is only four or five miles or less and the
maximum velocities are very low, 60 to 150 knots.
The translational equations are interesting in that all
forces, not including gravity, are summed in the body
axis frame then transformed to the local horizontal
axis where gravity is easily added in before integrating
to obtain ground velocities. Wind velocities may be
easily inserted in this axis system in the north, east,
and down directions. No resolution of winds or gravity
from Euler axis to body axis is necessary in this formulation. This is illustrated in Figure 9. One advantage of integrating the forces in the inertial frame is that the cross-product (ω×) terms present in the body axis acceleration equations are omitted, along with the corresponding higher frequency content in the angular velocity terms. In digital simulations this is to be desired, as reduction of frequency content usually improves solution accuracy.
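A minimal sketch, in present-day Python, of one integration step of the translational formulation just described: body-axis forces (excluding gravity) rotated into the local horizontal frame by a direction cosine matrix, gravity added along the down axis, and the result integrated to north/east/down velocities. The simple Euler integration step and the north-east-down sign convention are assumptions, not the program's actual integration package.

    def translational_step(F_body, mass, C_b2l, V_ned, dt, g=9.81):
        """Advance north/east/down velocities one step.
        F_body: body-axis force sum excluding gravity; C_b2l: 3x3 body-to-local
        direction cosine matrix; V_ned: current [north, east, down] velocities."""
        a_local = [sum(C_b2l[i][j] * F_body[j] for j in range(3)) / mass
                   for i in range(3)]
        a_local[2] += g                        # gravity added in the local frame
        return [v + a * dt for v, a in zip(V_ned, a_local)]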
Aerodynamics

The aerodynamic equation set formulates all the
effects of velocity, control deflections, etc. and produces
three forces and three moments. The stability derivatives were formulated from wind tunnel data taken with
respect to the stability axis; from which a rotation
through the angle of attack, α, about the y axis produces
derivatives in the body axis frame. The equations and
data are representative of many simulations with just a
few interesting exceptions. The effects due to angle of
attack of the horizontal tail take into account the
downwash of air flow over the wings onto the tail and
the variable time delay associated with the distance
from wing to tail as a function of speed. The most
interesting facet of the aerodynamics simulation is the
separation of the effects of right and left ailerons, due
to simulated engine failures which stop the air blowing
on an aileron with the resultant loss of aerodynamic
control force.
Two types of random disturbances were included in
the simulation. The first type was a "wing drop" in
which a roll acceleration was inserted for a specific
period of time to produce a resultant roll angle. The
second was a wind gust model which produced noise in
the three rotational and three translational velocities
using a Dryden model.

Control systems and stability augmentation systems

Figure 10-Typical control system and stability augmentation system

Figure 10 is a block diagram of a longitudinal control system. This figure shows "force feel" modification under computer control as a function of dynamic pressure and control column movement (as shown in the figure) or by surface deflection. It also illustrates how the control actuator, in the forward control path, and the SAS, in the feedback control loop, serve to affect the amount of control surface deflection obtained by column movements. In each control mode, lateral, longitudinal, and directional, there is a similar control system to provide correct pilot work loads.
In the actual aircraft the longitudinal control system
is a purely mechanical system while both the lateral
and directional systems have hydraulic power assist
actuators. Provisions are made in the simulation to
fail the hydraulic systems with resulting losses in
aileron, spoiler and rudder effectiveness. In this vehicle
stability augmentation is necessary in lateral and
directional modes only. The lateral SAS system uses
sideslip angle, roll rate and yaw rate feedbacks for roll
stabilization while the directional SAS uses roll rate and
sideslip rate feedbacks for stabilization in yaw. Wherever
possible the linear filters are mechanized as simple
difference equations using state space transform
methods.2
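As an illustration of mechanizing a linear filter as a difference equation, a sketch in present-day Python of a first-order lag 1/(τs + 1) discretized with a zero-order hold; the time constant and sample interval shown are hypothetical, and the actual SAS filters were derived by the state-space methods of reference 2.

    from math import exp

    def make_first_order_lag(tau, dt):
        """Return a step function implementing y[n] = a*y[n-1] + (1-a)*u[n],
        the zero-order-hold discretization of 1/(tau*s + 1)."""
        a = exp(-dt / tau)
        state = {"y": 0.0}
        def step(u):
            state["y"] = a * state["y"] + (1.0 - a) * u
            return state["y"]
        return step

    yaw_rate_filter = make_first_order_lag(tau=0.5, dt=0.02)   # hypothetical values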
Engines

The engine simulation is primarily a thrust computation in which the port and starboard engine thrusts
are calculated separately from their individual throttle
and diverter control levers. The engine thrusts contribute to body axis forces and moments for the aerodynamic computations. The jet engines produce hot
thrust for propulsion and cold thrust for the flaps and
ailerons.
Figure 11-Fan jet engine thrust block diagram

The engine diagram, Figure 11, shows the basic ideas of the thrust simulation. The operation of this circuit is based on non-linear rate limiting of a first order
system. When a demand for increased thrust is made, the thrust rate Ṫ is either Ṫ1 or Ṫ2. Initially Ṫ1 is greater than Ṫ2, so no limiting occurs. Thrust, T, is a positive exponential function of time, but as thrust builds up and the rate becomes limited, the thrust will become linear with time until Ṫ1 ≤ Ṫ2. At this time the limiter acts to make Ṫ = Ṫ1, providing a first order exponential tail-off of the process. The rate limiting used when thrust is decreasing is much simpler; the maximum thrust rate bears simply a square relationship to thrust. This scheme provides high thrust rates at high thrust levels and quite low thrust rates at low thrust levels.
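A sketch of this asymmetric rate limiting in present-day Python; the particular rate constants and the exact limiter shape are illustrative assumptions, not the flight-measured engine data.

    def thrust_step(T, T_cmd, dt, k_rise=2.0, rise_cap=5000.0, k_fall=1e-4):
        """Advance thrust one step: a capped first-order-like rise toward the
        command, and a fall rate proportional to the square of the thrust."""
        if T_cmd >= T:
            rate = min(k_rise * (T_cmd - T), rise_cap)      # limited rise
        else:
            rate = max(-k_fall * T * T, (T_cmd - T) / dt)   # square-law fall, no overshoot
        return T + rate * dt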
The thrust diverter dynamics are modelled with
rate limiting and hysteresis between the control and
diverter angle. For determination of pilot handling
qualities, system failures were implemented providing
thrust loss in either of the two engines and a "hard
over" diverter lock to a specific angle.
Landing gear

The landing gear model is composed of equations for
tire friction forces and oleo reaction forces and the
proper resolution of the forces into body axis forces and
moments. The friction forces are resolved into two
components, one in line with the tire and the other
perpendicular to it. Coefficients of friction for both
components are functions of gear velocities. The equations on gear compression and compression rate are
rigorous but assume no tire deflection, so all reaction
forces are due to the oleo. In the case of the C8-A three
individual gears were simulated with the forces and
torques on each summed to provide the total landing
gear force and moment components.
Subroutine ICTRIM

ICTRIM is a subroutine which performs two
separate functions. "IC" refers to the calculation of Initial Conditions (ICs) for the velocity terms in the local horizontal or Euler frame based on inputs of airspeed, Va, sideslip angle, β, and angle of attack, α.
"TRIM" refers to the capability of this subroutine to
trim the aircraft longitudinally.
In the operation of simulations at Ames it has been
found that while use of the Euler frame to integrate
accelerations improved calculation accuracies it did
pose operational problems. In the use of the simulations
it was apparent that research people using the program
were more used to thinking in terms of total airspeed,
for instance, than its components in body axes much


less those in Euler axes. Consequently a simple routine
was written to accept Va, α, β to calculate initial conditions for the north, UE, east, VE, and down, WE, velocities. Later it was modified to allow the flight path angle, γ, to be input along with α to calculate the initial condition of pitch angle, θ.
Starting the aircraft simulation run with a trimmed
aircraft avoids requiring the pilot to waste time trimming the aircraft before starting his task. For a large
class of simulation problems the trimming need only
consist of nulling pitch acceleration and the aircraft
longitudinal and vertical accelerations. When aileron,
rudder and sideslip angles and the roll and yaw rotational velocities are zero, the remaining three accelerations are zero.
The trimming algorithm is an iterative scheme. In
the routine's most conventional form elevator control,
δe, is used to null pitch acceleration, q; thrust, T, is used to null longitudinal acceleration, Ax; and angle of attack, α, is used to null vertical acceleration, Az.
As an example of the iterative process the current value
of q is used to modify the current value of δe according to the equation δe(i+1) = δe(i) + k·q(i), where k is an appropriate gain predetermined from the aerodynamics of the aircraft. After modifying δe the routine commands a cycle of calculation through all the aircraft equations and data, using the new value of δe to obtain a new value of q. Of course α and thrust are being changed concurrently
in the same manner. After sufficient iterations and for
reasonable initial conditions the routine determines the
control inputs for which the accelerations are sufficiently
close to zero for the airplane to be considered trimmed.
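A minimal sketch of this basic relaxation in present-day Python; aircraft_cycle is a hypothetical stand-in for one pass through the aircraft equations and data, and the gains and tolerance shown are assumptions.

    def trim(aircraft_cycle, de, thrust, alpha,
             k_de=-0.5, k_t=-100.0, k_a=-0.01, tol=1e-3, max_iter=200):
        """Iterate until pitch, longitudinal and vertical accelerations are nulled.
        aircraft_cycle(de, thrust, alpha) -> (q, Ax, Az) stands in for a cycle
        of calculation through the aircraft equations."""
        for _ in range(max_iter):
            q, Ax, Az = aircraft_cycle(de, thrust, alpha)
            if max(abs(q), abs(Ax), abs(Az)) < tol:
                break
            de     += k_de * q      # elevator nulls pitch acceleration
            thrust += k_t * Ax      # thrust nulls longitudinal acceleration
            alpha  += k_a * Az      # angle of attack nulls vertical acceleration
        return de, thrust, alpha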
This basic scheme has proved to be sufficient for
transport aircraft. However the C8-A has the capability
of directing its thrust from 18° to 116° from the aft horizontal, and when using this thrust vectoring to obtain trim the throttle may be held constant so that the diverter angle then becomes the trim parameter, affecting both Ax and Az strongly. A more general scheme was devised for the iteration algorithm since the thrust diverter angle influences both vertical and longitudinal accelerations. Basically the chain rule of differential calculus was employed to introduce the effects of α and thrust diverter angle, ν, on Ax and Az.
We say that the required changes in acceleration are ΔAx and ΔAz, but since the desired Ax and Az are zero, ΔAx = −Ax and ΔAz = −Az. By the chain rule

ΔAx = (∂Ax/∂α)·Δα + (∂Ax/∂ν)·Δν

and in like manner

ΔAz = (∂Az/∂α)·Δα + (∂Az/∂ν)·Δν

This is a set of two equations in two unknowns if we assume the partial derivatives can be found from the aero and engine data. Now we say that

α(i+1) = α(i) + Δα and ν(i+1) = ν(i) + Δν

where Δα and Δν are the solution of the two chain equations. The partials of Ax and Az with respect to ν are merely thrust modified by the sines and cosines of the diverter angle, but the derivatives with respect to α are unknown analytically and difficult to determine exactly.

The approximation ∂Ax/∂α ≈ ΔAx/Δα was implemented by having the trim routine determine Ax and Az for some α(i) and again for α(i) plus one degree, then using those results to calculate the approximations. Since the approximation to the partial derivative was made rather arbitrarily, the calculated Δα and Δν were multiplied by a constant less than 1 to assure convergence. Most generally 0.8 was used, for instance, α(i+1) = α(i) + 0.8·Δα.

This scheme in all other respects acts as the original
more crude iterative scheme for conventional aircraft
trim, iterating until the three accelerations are nulled.
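A sketch of one iteration of this more general scheme in present-day Python: finite-difference partials with respect to α from a one-degree perturbation, assumed-known partials with respect to ν, a 2×2 solve by Cramer's rule, and the 0.8 relaxation factor; accels is a hypothetical stand-in for a cycle through the aircraft equations.

    def chain_rule_trim_step(accels, alpha, nu, dAx_dnu, dAz_dnu,
                             relax=0.8, d_alpha=1.0):
        """One iteration: solve (dAx/da)*da + (dAx/dnu)*dn = -Ax and
        (dAz/da)*da + (dAz/dnu)*dn = -Az for the corrections da, dn."""
        Ax0, Az0 = accels(alpha, nu)
        Ax1, Az1 = accels(alpha + d_alpha, nu)
        dAx_da = (Ax1 - Ax0) / d_alpha            # finite-difference partials
        dAz_da = (Az1 - Az0) / d_alpha
        det = dAx_da * dAz_dnu - dAx_dnu * dAz_da
        da = (-Ax0 * dAz_dnu + Az0 * dAx_dnu) / det    # Cramer's rule
        dn = (-Az0 * dAx_da + Ax0 * dAz_da) / det
        return alpha + relax * da, nu + relax * dn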
Dynamic check routines

The subroutine DYNCHK is used to perform
dynamic checks of the aircraft. The routine provides
doublets and pulses in roll control and rudder control.
In elevator control the optional disturbances are steps
and pulses. The disturbances are input as pilot control
variations providing a check on the control system as
well as the aerodynamics. By recording the response
data on eight channel strip chart recorders the user may
view the results and determine the parameters in which
he is interested.
Print routines

Two separate print routines are included in the
simulation. One is a general purpose print routine for
determining the status of a list of variables and is
useful for printing initial conditions and trim conditions
for documentation as well as a trouble shooting aid. The
second routine is an attempt to determine the pilot's
and aircraft's performance. It collects data in two ways
as functions of altitude. Variables such as airspeed and

Support subroutines

Three subroutines from the computer library should
be mentioned in the context of supporting simulations.
They are WIND, MLTPLX and INSCAL. The latter
two were described as hardware support routines.
Probably the most important is WIND since it provides
the atmospheric turbulence needed in simulations for
aircraft handling qualities work. Turbulence is used
primarily to assess the effects that real-life turbulence
has on controllability, flying qualities, and ride qualities
of an aircraft. The disturbance effects on the design of
controls and stability augmentation systems are very
important to insure that the airplane has sufficient
control effectiveness to be manageable during flight in
turbulence.
The turbulence model is the Dryden model.3,4 In
essence white noise is passed through filters to provide
noise in the three translational and three rotational
velocities which have good representations of the power
spectral densities present in actual air turbulence.
EPILOGUE
The simulation program and simulator hardware provided test pilots a realistic representation of the
modified C8-A Buffalo with the result that various
control and SAS representations were evaluated using
pilot handling qualities ratings.
Final design parameters were found for the aircraft
which is scheduled for flight test in early 1972. The
digital program has subsequently been used as a base for
additional simulations of navigation and guidance of
STOL craft in the air traffic control situation near
airports and for studies of control and SAS in longitudinal motion for this class of airplane. Of current popular
interest are the noise reductions made possible by the
high angle landing approaches to the runway, which in
turn are made possible by the slow approach speeds. At
any ground point the STOL aircraft will be at a higher
altitude than conventional aircraft, thereby reducing
the noise.
Due to the diverse uses of the program it is not con-


sidered to be fixed as present effort is aimed at determining better simulation techniques and models and at
making the system software execute faster and be more
responsive to the needs of the user.

REFERENCES
1 E A JACOBY J S RABY D E ROBINSON
FAMILY I: Software for NASA-Ames simulation
systems
AFIPS Conference Proceedings Vol 33 Part 1 1968


2 J V WAIT
State-space methods for designing digital simulations of
continuous fixed linear systems
Transactions of IEEE/PGEC Vol EC-16 No 3 1967
3 F NEWMAN J D FOSTER
Investigation of a digital automatic aircraft landing
system in turbulence
NASA TND-6066 1970
4 C R CHALK
Background information and user guide for MIL-F-8785B
(ASG), "Military specification-flying qualities of piloted
airplanes"
AFFDL-TR-69-72 1970

Software validation of the Titan IIIC digital flight
control system utilizing a hybrid computer
by R. S. JACKSON and S. A. BRAVDICA
Martin Marietta Corporation
Denver, Colorado

INTRODUCTION


In April 1966 work was initiated by Martin Marietta
Corporation (MMC) , Denver Division, to extend the
role of an airborne computer to include flight controls
as well as guidance and navigation computations. This
project is one of several significant improvements for
the Titan IIIC space booster which was funded by the
Space and Missile Systems Organization of the Air
Force. The new Digital Flight Control System (DFCS)
has been successfully tested in four (4) Titan IIIC
missions.
The purpose of this paper is to describe how a large
hybrid computer simulation was used as an aid to
design and develop the DFCS and then used to validate
the resulting DFCS airborne software.
The simulation was programmed in six degrees of freedom and included an airborne Univac 1824M Missile
Guidance Computer (MGC) in the closed loop. Additional computing equipment used in the simulation
included three (3) EAI 8800 analog computers, an
EAI 8400 digital computer and an SDS 930 digital
computer. Flight control hardware components such as
rate gyros, body mounted accelerometers, and hydraulic actuators were also used in the simulation.
DESCRIPTION OF THE VEHICLE AND A
TYPICAL SIMULATED MISSION
The Titan IIIC is one of the Titan models being
built to carry Air Force and Defense Department payloads. The Titan IIIC is a four-stage vehicle with solid-propellant five-segment rocket motors on each side of
a liquid propellant core. Stage 0 is powered by dual
solid-propellant engines; Stages I and II are powered
by gimbaled liquid propellant engines; and Stage III
(transtage) is powered by a pair of restartable gimbaled
liquid propellant engines. The transtage also has 12 mono-propellant attitude control engines, which are used for control purposes during the non-powered flight (coast) portions of a mission.

The Titan IIIC is required to handle single and multiple payloads that range from 1800 pounds to 30,000 pounds. There is also a variety of missions, one of which is described below, to match the spectrum of payloads. A wide range of payloads, in turn, significantly impacts the upper stage total vehicle mass, inertia, and dynamic characteristics, which results in a requirement for flexibility in control system compensation. While in coast, the Digital Attitude Control System (DACS) must provide several levels of pointing accuracy as well as the capability to optimize time response and propellant utilization in a variety of inertia conditions. Furthermore, the powered flight Titan IIIC booster characteristics present to the control system designer a multi-plant problem with significant structural bending and propellant slosh dynamics. One of the DFCS design goals was to provide the capability to fly all of the missions and payloads with one basic airborne software package. Therefore, when the mission or payload changes, only modifications to the program parameters will be required.

A diagram of the particular mission that was simulated is shown in Figure 1. A 92-by-109 mile parking orbit is achieved by the boost portion of the flight, which includes a 17 second 1st burn of transtage. After a one hour coast period, transtage will then fire for the 2nd burn, lasting 298 seconds, to obtain an elliptical transfer orbit of 107-by-22,300 miles. After about five hours in this path, the transtage will perform the 3rd burn to inject the satellite into a near-circular synchronous orbit measuring 22,221 miles at perigee and 22,318 miles at apogee. This typical synch-eq mission requires about 6¾ hours from liftoff to satellite eject. (One of the software validation runs is a 6¾ hour continuous run which duplicates this mission.)

Figure 1-Mission diagram

During the coast portions of transtage flight, the simulation of the digital attitude control system (DACS) is required to perform a series of maneuvers to maintain proper thermal balance of the payload and to position the transtage for transmission of telemetry signals. In addition, the DACS is required to operate for several seconds in one or more velocity vernier modes wherein the eight aft-pointing DACS engines are turned on to make fine adjustments in the trajectory and to bottom the propellants in preparation for a main engine burn. This mode requires that the attitude control logic on these eight DACS engines be reversed, because attitude control is maintained by turning a jet off in this mode rather than on, as in normal operation.

A diagram of the Titan IIIC is shown in Figure 2, and the pertinent flight control hardware that was used in the closed loop simulation is labeled.

DESCRIPTION OF THE HYBRID SIMULATION

Since flight hardware was included as part of the simulation effort, the simulated airframe had to be computed in real time. The real time computational requirement and the size of the simulated airframe dictated the need for a hybrid simulation. The simulation hardware, as shown in Figure 3, is located in two separate facilities: the hybrid computation facility and the controls mockup facility. The distance between the two facilities is approximately 300 ft.

Facilities description

The hybrid facility contains 3 EAI 8800 analog computers, 1 EAI 8400 digital computer with a 32 bit
per word 32K memory, and a linkage system containing
32 analog to digital converters and 32 digital to analog
converters.
The Controls Mockup Facility (CMU) contains the
Univac 1824M missile guidance computer, an SDS 930
digital computer, flight actuation devices for all stages,
flight hardware sensor devices, and interface equipment
for buffering the signals received from the hybrid facility.

Figure 2-Titan IIIC configuration

Figure 3-Simulation facilities (hybrid facility: airframe equations with seven structural bending modes, thrust vector control dynamics, and attitude control thrust shaping; controls mockup facility: Univac 1824M missile guidance computer, SDS 930 guidance computer simulator, missile actuation devices, rate gyro sensing system, lateral acceleration sensing system, flight controls and guidance functions)

Figure 4-Controls mockup Titan IIIC engine bells

Figure 4 is a photograph of the inverted engine bells, which are driven by hydraulic actuators, with Stage II in the foreground, Stage I in the middle, and transtage in the background. Engine commands generated in the MGC were output to the actuation devices.
Engine displacements measured by telemetry potentiometers on the actuation devices were used to provide
feedback into the airframe simulation.
The Univac 1824M missile guidance computer is a
binary machine employing fixed point, two's complement arithmetic with single address capability. The
MGC employs a non-destructive readout thin-film
memory which can store 12,096 16-bit words.
The SDS 930 digital computer contains a 16K memory with 24-bits per word. Prior to using the MGC in
the closed loop simulation, the digital flight control
system, while still in the development stage, was programmed on the SDS 930 computer. The SDS 930
computer was then utilized to aid in checkout of the
airframe simulation and also to provide a means of
easy access for investigating design considerations for
the DFCS. The final MGC software was translated for
use in the SDS 930 computer, therefore allowing the
SDS 930 to be used as a missile guidance computer
simulator (MGCS). The primary purpose of the MGCS
was to stand backup for the MGC during the critical
DFCS software validation period.
The hybrid simulation

The airframe simulation required the use of 3 EAI
8800 analog computers utilized to 95 percent of their
operational amplifier capability and a digital program
requiring 16K of core in the EAI 8400 digital computer.
The assignment of computational tasks to the analog
and digital computers was handled in such a manner
as to make use of the best computational aspects of


both computers. The parts of the simulated airframe
mechanized on the analog computer were as follows:
The body acceleration and body rate equations; seven
system modes representing structural bending and fuel
slosh for all four stages; vehicle sensor station equations;
a Thrust Vector Control system (TVC) for the solid
rocket motors of Stage 0; attitude control system thrust
shaping, and an analog autopilot to aid in the checkout
of the simulated airframe. Two-bit Gray coders were
implemented on the logic panels for the Gray coding of
the simulated inertial platform accelerations and vehicle attitudes. Functions used in the. thrust vector
control system were generated on card programmed
diode function generators. All other functions required
in the simulation were generated in the 8400 digital
computer.
Resolution on the body rate and body acceleration
equations was maintained by releveling the input functions originating in the digital computer at vehicle
staging events. The simulation of the seven system
modes required the generation of 6 functions per mode
per stage; therefore, a total of 168 separate functions
were needed in the simulation of the seven system
modes for a complete mission. Again the digital computer provided a means of generating the required
functions and also supplied a method of rescaling at
staging times, which made it possible to conserve on the
amount of analog equipment required to simulate the
seven system modes.
On the actual vehicle, platform acceleration and
vehicle attitude are generated by means of optisyns
which produce a two-bit Gray code as output to the
missile guidance computer. Both platform acceleration
and vehicle attitude in the simulated airframe were
computed in the EAI 8400 digital computer. Vehicle
attitude was updated every 10 milliseconds. The attitude was then quantized and compared with the previous pass. If a change of one quantum in attitude was
observed, the information was sent to the analog patch
panel for Gray coding and then output to the missile
guidance computer. Gray coding of platform accelerations was handled in the same manner as vehicle attitude except that they were updated and output every
20 milliseconds.
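The Gray coders themselves were built on the analog logic panels; purely as an illustration of the coding, a sketch in present-day Python of quantizing a simulated attitude and producing the two-bit Gray code (the repeating sequence 00, 01, 11, 10) that an optisyn-style output would cycle through. The quantum size and the None convention for "no change" are assumptions.

    def two_bit_gray(count):
        """Two-bit Gray code of a quantized count: 0,1,2,3 -> 00,01,11,10."""
        q = count & 0b11
        return q ^ (q >> 1)

    def attitude_output(prev_quanta, attitude, quantum):
        """Quantize the attitude; if it changed by a quantum since the last pass,
        return the new two-bit Gray code to send out, else None."""
        quanta = int(round(attitude / quantum))
        return quanta, (two_bit_gray(quanta) if quanta != prev_quanta else None)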
The digital portion of the simulated airframe consisted of two routines. Routine 1 was updated every
10 milliseconds in Stages 0, I, and II and every 5 milliseconds during transtage flight; routine 2 was updated
every 20 milliseconds throughout the mission. Using
the external interrupts of the EAI 8400 digital computer, routine 1 was given higher priority than routine
2. The timing for the interrupts was generated on the
analog logic panel via a binary coded decimal down
counter. Routine 1 sampled the body rates generated
on the analog computer and utilized the body rates in
the generation of the direction cosine matrix. Vehicle
attitude was generated as a function of direction cosine
terms, and then quantized and output to the analog
logic panel for Gray coding.
Routine 2 sampled the body accelerations and, using
the direction cosine matrix generated in routine 1,
transformed the body acceleration into inertial acceleration. Subtracting the gravity component from the
inertial acceleration and integrating produced inertial
velocity. Another integration produced inertial position. A transformation on the inertial velocity using
the inverse direction cosine matrix produced the body
velocities from which the aerodynamic terms were
formed. The aerodynamic terms were then output to
the analog computers for input to the body acceleration and body rate equations. Platform acceleration
was computed, quantized and output to the analog
logic panel for Gray coding. All functions generated in
the digital computer were computed and output in
routine 2. An Automatic Data Channel Processor
(ADCP) was used to store digital variables on magnetic
tape at 1-second intervals. The tape was then post-processed to retrieve the real time information.
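The computational flow of routine 2 can be summarized in a short sketch (an illustrative present-day reconstruction; the rectangular integration, variable names, and 20-millisecond step are assumptions standing in for the EAI 8400 program):

    import numpy as np

    DT2 = 0.020  # routine 2 update interval, 20 milliseconds

    def routine_2(a_body, C_body_to_inertial, g_inertial, v_inertial, r_inertial):
        """One pass: body accelerations in, inertial velocity/position and body velocity out."""
        # Transform the sampled body accelerations into the inertial frame.
        a_inertial = C_body_to_inertial @ a_body
        # Subtract gravity and integrate to inertial velocity, then to position.
        v_inertial = v_inertial + (a_inertial - g_inertial) * DT2
        r_inertial = r_inertial + v_inertial * DT2
        # The inverse (transpose) direction cosine matrix gives the body
        # velocities from which the aerodynamic terms are formed.
        v_body = C_body_to_inertial.T @ v_inertial
        return v_inertial, r_inertial, v_body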
The flight computer generates timing and sequencing
discretes which control staging, engine start and engine
shut-down commands. These discretes were sensed in
routine 2 and the necessary action was initiated on the
simulated airframe.

Simulation checkout

The general purpose hybrid facility was scheduled in 6-hour shifts; therefore, setup of the airframe simulation was a daily occurrence. Potentiometer settings and static tests were performed with each setup of the simulation, using the Hytran Operations Interpreter, an EAI processor designed to perform checkout functions on the analog equipment. To further insure that the airframe simulation was operating properly, a trajectory was flown using an analog autopilot. This trajectory run was made daily before closing the loop around the missile guidance computer and all the associated flight hardware.

In addition to the static checkout method, a thorough dynamic checkout procedure was developed to ensure that the airframe simulation represented the physical plant. Part of this procedure required that the entire closed loop simulation be in a perturbate mode. This is a mode wherein the trajectory is fixed in time; thus all vehicle parameters are held constant.

The missile guidance computer has a perturbate mode built into the software. This mode allows the autopilot gains and digital filter parameters to be held constant. Also the guidance commands into the autopilot are held constant and all guidance computations are bypassed. When the missile guidance computer is put into the perturbate mode, a discrete is issued to the hybrid simulation. The perturbate discrete actuates logic incorporated in the EAI 8400 digital program which holds all parameters that are functions of time and also holds vehicle altitude. The discrete also sets the longitudinal body acceleration in the analog computer to zero.

The dynamic checkout procedure consisted of two main parts. The first part included frequency response testing with the simulation in the perturbate mode. Frequency responses of the closed loop simulation were generated by forcing engine deflections with a sinusoidal signal. In this way, stability margins at several time points in each stage were verified against previously generated stability analysis results. The second part of the dynamic check involved comparing the hybrid simulation trajectory results with the results of an all-digital trajectory program.

DESIGN AND DEVELOPMENT OF THE DFCS

A new set of analytic, software, and simulation problems was generated by incorporation of a DFCS into the Titan IIIC. This section of the paper describes how some of these problems were resolved with the aid of the hybrid simulation.

Digital filter accuracy
Figure 5 is a block diagram of the Stage 0 pitch plane autopilot, showing the four feedback loops used. The attitude and two angular rate loops are used only for stabilization purposes, while the lateral acceleration feedback loop is employed as a means of achieving active load relief, and is operative only during the max buffet-max dynamic pressure portion of Stage 0 flight.

Figure 5-System block diagram (attitude command, actuation device, airframe; feedback loops: attitude, rate 1, rate 2, lateral acceleration)
Each of the feedback loops is compensated with a gain
and a digital filter.
The digital filters are susceptible to accuracy problems1 when mechanized on a fixed point digital computer
such as the Univac 1824M. Two main sources of the
accuracy problem are (a) difference equation coefficient accuracy, and (b) fixed point digital computer
truncation accuracy. Furthermore, the requirement
that the software design will be able to fly the wide
range of payloads and missions without the reprogramming of equations makes the accuracy problem all
the more acute. For example, filter computations must
be scaled in such a way as to insure no overflow for
large inputs for all possible combinations of mission/
payload. Then for low level inputs (such as those expected for nominal operation) computations are scaled
in a very sub-optimal manner, thereby exaggerating
even further a computational truncation inaccuracy
problem.
To determine the effect on system operation in regard to digital filter computational accuracy, a proposed DFCS was programmed on the SDS 930 digital
computer. System operation in this case means the
effect of interaction between filter accuracy and other
vehicle characteristics such as quantization of MGC
input and output variables, actuator nonlinearities,
sensor nonlinearities, and propellant slosh. This possible
interaction could cause excessive limit cycle amplitudes and even instability. Therefore, to verify the
DFCS design, the DFCS was programmed on the
MGCS and the operational word length was varied in
the digital filters for each stage. This was done by
"masking" bits in the word that are furthermost to the
right, thereby effectively varying the scaling (binary
point location).
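The kind of experiment run on the MGCS can be pictured with a small sketch (assumed first-order filter, scaling, and word sizes; the actual DFCS filters are not reproduced here): masking the rightmost bits of a fixed point recursive filter shows how truncation error grows as the effective word length shrinks.

    def run_filter(x, a=0.9, b=0.1, word_bits=16, kept_bits=16):
        """First-order recursive filter y[n] = a*y[n-1] + b*x[n] in fixed point.

        Values carry 'word_bits' fractional bits; masking the rightmost
        (word_bits - kept_bits) bits mimics shortening the operational word length.
        """
        scale = 1 << word_bits
        mask = ~((1 << (word_bits - kept_bits)) - 1)
        y = 0
        out = []
        for sample in x:
            xi = int(sample * scale)
            y = (int(a * y) + int(b * xi)) & mask   # truncate, then mask low bits
            out.append(y / scale)
        return out

    # For a low-level input the short-word filter dead-bands and never responds.
    step = [0.01] * 50
    print(run_filter(step, kept_bits=16)[-1], run_filter(step, kept_bits=8)[-1])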
Noise susceptibility

Digital autopilot noise susceptibility is caused by
the sampled data folding phenomenon. Relatively high
frequency environmental noise brought into the system
as corruption on the sensors will, upon being sampled
by the DFCS, be "folded" to a lower frequency (i.e.,
any signal whose frequency, ω, is greater than half the sampling frequency, ωs/2, will, on being sampled, be "folded" about nωs/2, n = 1, 2, 3, ...). Hence, energy
that was filtered out by an analog autopilot now appears in the form of low frequency signal content in the
control system. This low frequency energy can cause
excitation of the structural bending modes of the
vehicle, thereby inducing significant structural loads.
The severity of the loads problem is a function of the
amplitude and frequency content of the noise coming
in on the sensors and of the sampling frequency of the
DFCS.
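The folding rule is easy to check numerically (a sketch; the 60-cps disturbance is an arbitrary example, not a measured Titan noise line): at 25 samples per second a 60-cps component folds to |60 - 2(25)| = 10 cps, well inside the control bandwidth.

    import numpy as np

    fs = 25.0          # DFCS sampling frequency, samples per second
    f_noise = 60.0     # assumed high-frequency sensor noise component, cps

    # Predicted alias: fold f_noise about the nearest multiple of fs.
    alias = abs(f_noise - round(f_noise / fs) * fs)

    # Confirm by sampling: the 60-cps sine and the 10-cps sine give the same samples.
    n = np.arange(50)
    print(alias, np.allclose(np.sin(2 * np.pi * f_noise * n / fs),
                             np.sin(2 * np.pi * alias * n / fs)))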
To determine the severity of this problem on the Titan IIIC DFCS (sampling frequency of 25 samples per second), the mean and 3σ environmental noise
spectrum existing during flight was measured. This was
done by statistically reducing telemetry data on sensor
outputs from several past Titan IIIC flights which
used analog autopilots. Next, a noise generator on an
EAI 8800 console plus shaping networks were set up
to duplicate the in-flight environmental noise. When
the 3σ noise was superimposed on sensor outputs, and
therefore sent directly into the autopilot, excessive excitation of the first structural bending mode was seen
and intolerable vehicle loads were generated. This
problem was solved by using analog prefilters (by filtering the sensor input to the MGC) with a break frequency of 10 cps, 0.5 damped, on the rate channels and a 5 cps break frequency, 0.5 damped, on the lateral
accelerometer channels. These prefilters were specified
to be built into the MGC and are common to all flights.
Recursive and non-recursive digital prefilters were
derived and tested by using the MGCS. However,
significant noise energy proved to be present at frequencies above ωs/2 which, due to the folding phenomenon, rendered the digital prefilters much less
effective than analog prefilters.
An extremely useful application of the EAI 8400
digital computer was made in conjunction with the
noise investigation, wherein a program was written
which would sample the noise-shaping network outputs and calculate the power spectral density and rms
level of the noise. Thus the validity of the noise simulation was easily and rapidly checked against desired results. This one application saved many hours that
would have otherwise been spent waiting for the same
results to be calculated on an "off-site" digital
computer.
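The on-line check amounted to estimating a power spectral density and an rms level from sampled noise; a minimal modern equivalent (assumed record length and sample rate, and a simple windowed periodogram rather than whatever estimator the 8400 program used) is:

    import numpy as np

    def psd_and_rms(samples, fs):
        """Return a one-sided periodogram estimate (freqs, psd) and the rms level."""
        n = len(samples)
        window = np.hanning(n)
        spectrum = np.fft.rfft(samples * window)
        psd = (np.abs(spectrum) ** 2) / (fs * np.sum(window ** 2))
        psd[1:-1] *= 2.0                      # fold negative frequencies in
        freqs = np.fft.rfftfreq(n, d=1.0 / fs)
        rms = np.sqrt(np.mean(np.square(samples)))
        return freqs, psd, rms

    fs = 1000.0                               # assumed checker sampling rate
    noise = np.random.default_rng(0).normal(scale=0.5, size=int(2 * fs))
    freqs, psd, rms = psd_and_rms(noise, fs)
    print(f"rms = {rms:.3f}")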
Digital autopilot malfunction detection logic

Experience with airborne computers indicates that
the prevalent failure mode is a transient one. In this
failure mode an electrical transient can cause either
read-write memory or one of the central processor
registers to pick up or drop one or more bits of information. The result is that the program will transfer
incorrectly, perform an incorrect calculation, or store
bad data.
As part of the DFCS development, MMC designed
malfunction logic that will detect these types of errors
and reset the MGC to a safe-point condition.

Figure 6-Validation tasks: development of detail validation procedures (MGC, MGCS, hybrid); closed loop tests (MGC, hybrid): 1. trajectory verifications; 2. stability margin verification; 3. transient response/malfunction logic test; 4. noise susceptibility/malfunction logic test; 5. forward loop gain reduction test; 6. forward loop gain increase test; 7. digital attitude control system checkout; 8. test No. 9 pretest; 9. complete mission; 10. real time malfunction logic tests; 11. vernier back-up shutdown verification

The safe-point condition means that all possible flight control parameters are initialized from nondestructive
readout memory. As implemented, 98 percent of the
DFCS parameters are initialized. The key variables
(stage indicators, time, and certain other key flags)
which cannot be initialized are specially protected, decreasing the chance of their being altered. The overall
malfunction logic requires approximately 8 percent of
the total DFCS memory requirements.
The MGCS and the hybrid simulation were used to
verify the operation of this system. Possible types of
malfunction were simulated in the MGCS and the response of the flight was monitored in the hybrid lab.
Only 40 milliseconds are required to complete the
initialization process. The resultant transient to the
vehicle, even in the maximum dynamic pressure region
of flight with worst case winds, is almost undetectable.
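The recovery idea can be pictured with a brief sketch (purely illustrative; the parameter names, the protected-variable list, and the dictionary representation are stand-ins for MMC's actual logic): on detection of a malfunction, every re-initializable parameter is reloaded from the protected non-destructive readout image, while the few key variables are left untouched.

    # Hypothetical safe-point restore: parameters reloaded from an NDRO image,
    # key variables (stage indicator, time, critical flags) deliberately excluded.
    NDRO_IMAGE = {"gain_pitch": 0.42, "filter_a1": -1.2, "filter_b0": 0.08}
    PROTECTED = {"stage", "mission_time", "engine_on_flag"}

    def restore_safe_point(state: dict) -> dict:
        """Reset all flight control parameters except the protected key variables."""
        restored = dict(state)
        for name, value in NDRO_IMAGE.items():
            if name not in PROTECTED:
                restored[name] = value
        return restored

    corrupted = {"gain_pitch": 7.0e9, "filter_a1": -1.2, "filter_b0": 0.08,
                 "stage": 2, "mission_time": 364.2, "engine_on_flag": True}
    print(restore_safe_point(corrupted)["gain_pitch"])   # back to 0.42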
SOFTWARE VALIDATION
The software is in the form of a flight tape-a
punched paper tape that was coded to MMC specification by Univac and delivered to MMC and other
agencies for validation. Program validation has two
main objectives. The first is to insure the correctness
of equations, logic, timing, memory utilization, module
interaction, etc., for the flight tape programming when
compared to the software specification. The second objective is to insure that the program on the punched
tape satisfies all of the mission and vehicle performance
requirements. The validation itself consisted of a series
of open and closed loop tests executed per a detail

procedure document. The open loop tests were an important part of the tape validation and consisted of
phasing tests, open loop digital filter response, etc.
However, these tests did not include the hybrid simulation and they will not be discussed further. The closed
loop testing basically consisted of determining if the
flight tape could "fly" the entire mission and meet all
requirements.
Figure 6 illustrates the tasks that were necessary
for software validation. One of these tasks required
that a validation procedures document be written.
This document was developed into a 537 page volume
that detailed the test set up, the test procedure, and
the expected results for each validation test. The expected results are called success criteria. Tolerances
were placed on the success criteria in such a way that
if validation test results should exceed these tolerances
then the test is considered to be invalid. The reason for
the test being invalid must then be traced and explained. The MGCS, MGC, and the hybrid simulation
were used to develop and check out the tests, and in
some cases, to generate success criteria. Development
of the validation procedures document required approximately five man-months of effort, but this document proved to be invaluable during the flight tape
validation period.
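The pass/fail rule reduces to a simple band check (a sketch with hypothetical numbers; the real criteria are the tabulated ones):

    def within_tolerance(measured, nominal, tol):
        """A validation result is acceptable only if it lies inside nominal +/- tol."""
        return abs(measured - nominal) <= tol

    # Hypothetical engine deflection check against a criterion of -0.027 +/- 0.019 deg.
    print(within_tolerance(-0.0336, -0.027, 0.019))   # True: within tolerance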
The closed loop tests that were performed are listed
in Figure 6. To illustrate how these tests were run,
test 1 is discussed briefly. The following wording is
taken from the validation procedures document.
Validation Test 1, Trajectory Verifications:
1. Test Objective-This will be a baseline trajec-

tory run through Stage III first burn and approximately 5 minutes into coast flight. The
overall objective of this test is to uncover quickly
any major trouble areas that may exist. The
specific purposes of this test are to verify proper
preaiming filter initialization and c.g. tracking,
and to verify dynamic stability during staging.
(The preaiming filter is a forward loop filter in
the control system that tracks the vehicle center
of gravity. Also this filter is "initialized" just
before a main engine is fired which causes the
thrust vector to be pointed through the center of
gravity.)
2. Configuration-The test configuration is the
Hybrid Computation Lab (HCL) /CMU Closed
Loop 6-Degree of Freedom Trajectory Simulation Configuration as illustrated in Figure 3.
For this test, the Stage III thrust differential is
removed for Stage III start, but is included for
Stage III shutdown.


TABLE I-Success Criteria and Test Results for Preaiming Filter Operation

Flight Condition                                    Success Criteria      Test Results    Test Results
                                                    Engine Deflection     (Test 1)        (Test 9)
                                                    Pitch (deg)           Pitch (deg)     Pitch (deg)

Stage II Start Initialization Value                 -.027 ± .019          -.0336          -.019
Stage II Burnout                                     .018 ± .095          -.0573          -.055
Stage III Start 1st Burn Initialization Value        .24 ± .05            -.254           -.23
Stage III End 3rd Burn                               .162 ± .28           N.A.            -.419

3. Test Sequence
a. Set up HCL and CMU to use the trajectory run procedure with the telemetry (TM) ground station.
b. Label HCL strip charts and the x-y plotters and enable the Automatic Data Channel Processor. Plot the aerodynamic variables α and qαT on the x-y plotters.
c. Null to zero any biases on the TVC valves,
TM monitor pots, and upper stage actuator
TM pots. This is done in Stage 0 with the
hybrid simulation in the I.C. mode. Stage I
actuators are biased to zero with the simulation in "perturbate" at approximately T = 110
sec. Stage II and Stage III actuators are
biased to zero with the simulation in "perturbate" in Stage I.
d. Success Criteria.
Success criteria for this test are given in Table I. Table I is taken from the validation document for this test to illustrate how the success criteria are typically used. The
test results shown in Table I were obtained from telemetry data output from the MGC. (This is the only
way that real time data could be continuously extracted
from the MGC during validation tests.)
The most significant table of success criteria for this
test is not included in this paper because of space
limitations; however, it will be discussed. This table was
designed to measure trajectory variables against success
criteria that were generated by an all-digital trajectory
program. In this table, the following hybrid computer
trajectory variables for ten flight times are measured
against the success criteria: total velocity, velocity components, altitude, angle of attack, side-slip angle, aerodynamic pressure, and the product of the aerodynamic
pressure and the total angle of attack. All of these
variables must be within the success criteria tolerances
for the results to be acceptable. Tolerances for this
test vary from less than 1 percent upward depending
on the flight time and the particular variable that is
being examined. One other table of success criteria (not
shown) was generated for test 1, which consists of the
maximum vehicle attitude rates that should never be
exceeded during staging sequences.
Test 7, which is a complex checkout of the digital
attitude control system, involved a special application
of the EAI 8400 digital computer. The DACS has
several different logic "stages" in each of the three
autopilot channels. Consequently, to check every logic
path in the DACS, 132 separate sub-tests were required.
As in some of the other tests, support of the TM
ground station was necessary for test 7. This support
involved recording DACS information on magnetic
tape. Hence, it was important that test 7 be executed
as rapidly and efficiently as possible. To achieve this
goal, the 8400 digital computer was programmed to
accept data for the 132 sub-tests from punched cards
and execute the entire sequence of test 7. More specifically, the digital program simulated vehicle attitude
motion which was then Gray coded on an 8800 analog
computer. The resulting Gray code was sent to the
MGC to exercise the DACS logic. The time needed to
complete test 7 for one channel was one hour of continuous running. The other tests listed in Figure 6 were
conducted in a manner similar to test 1; therefore, they
will not be discussed.
During the validation period, configuration control
was carefully maintained in both the CMU and in the
HCL. For example, whenever reprogramming of the
hybrid computer was required, or a pot setting changed,
this information was entered into a configuration control log book along with the reason for the change. In
addition, the analog patch boards and the EAI 8400
program were kept locked in a special cabinet when
not in use. Responsibility for maintaining configuration
control and security during validation tests was assigned to the lead engineers who were running the
tests.


FLIGHT RESULTS

Figure 7-Simulation and flight result comparison (pitch engine deflection command versus trajectory time in seconds, flight results and simulation results)

Figure 7 is a typical comparison of simulation results with results obtained from the flight. Pitch engine deflection commands are compared over a ten second time period during Stage II flight; trajectory time corresponds to 360 seconds at the beginning of the plots. Each quanta of engine command is equivalent to .02 degrees of engine deflection. Figure 7 illustrates that the expected in-flight limit cycle results were substantiated quite well.

CONCLUSIONS

The hybrid simulation described in this paper proved
to be a valuable aid in the development and design
validation of a new digital flight control system for the
Titan IIIC.
The simulation was instrumental in resolving problems associated with digital filter accuracy and digital
flight control system noise susceptibility. Also malfunction detection logic for the airborne missile guidance
computer software was developed. This developmental
work contributed significantly to achieving an operational digital flight control system which is capable of
flying the broad range of Titan IIIC missions without
the necessity of reprogramming software.
The simulation was successfully used for validation
of the final airborne software by executing a series of
carefully planned validation tests. These tests were designed to verify performance of the entire digital flight
control system.

REFERENCES
1 R S JACKSON
Minimization of computer word length and storage
requirements in recursive digital filter design
IEEE Proceedings of the Fourteenth Symposium on
Circuit Theory May 6-7 1971

Multivariable function generation for simulations
by S. P. CHEW, J. E. SANFORD and E. Z. ASMAN
Boeing Computer Services
Seattle, Washington

INTRODUCTION

The purpose of this paper is to describe the mechanization of a technique for generating continuous functions of up to six variables, given a discrete set of experimental data points. This technique is currently being applied in a real-time simulation of a high performance missile. The characteristics which make this method of function generation unique are the high speed with which many multivariable functions are produced and the large size of the data base from which the function values are computed.

The need for this capability emerged at a time when an existing hybrid simulation of the missile was to be improved and expanded to support missile flight tests. It was determined that in order to refine the simulation's predictive and post-flight analytic capability, a substantial improvement over the existing all analog method of generating the functions which define the aerodynamic model was required. With the all analog method, only a small portion of available data could be applied to model the missile aerodynamics. The objective was to develop a technique for direct utilization of wind tunnel data in order to provide a more accurate model.

The following discussion defines the problem and its solution constraints, considers alternative solutions and describes the mechanization of the selected approach. A summary of significant results achieved by the application of this technique concludes the discussion.

PROBLEM DEFINITION

The problem of generating aerodynamic functions in a real-time* simulation of a high performance missile is more difficult than in simulations of slower flight systems, primarily because of the higher system frequencies involved. The difficulties, however, become particularly acute when the number, shape and size of the missile control surfaces are highly restricted by carrier-missile interface design. The stringent performance requirements for maneuverability and the control surface design constraints impose demanding requirements on the function generation task. Some important reasons for this are:

1. During certain maneuvers, a control surface (fin) may be flying in the shadow of the missile. The effectiveness of this control surface may be very non-linear with respect to the deflection angle.
2. The missile control system is only stable within a narrow region defined by the gain margin. Gain margin errors of a few db introduced by the function generation process may quickly invalidate simulation results.
3. The frequency response of the system is exceptionally fast. Any computational delay introduced has a pronounced effect on the gain margin.

In addition to all of these general constraints, the following specific simulation requirements were identified:

1. The number of functions to be generated in real-time included:
   6 functions of 6 variables
   5 functions of 3 variables
   3 functions of 4 variables
2. A provision for a data base containing approximately 20 million discrete, wind-tunnel derived data points was required.
3. All function outputs were to be continuous to permit integration with the existing simulation and to avoid degradation in gain margins. Unacceptable degradation was empirically determined to occur when function output delays exceeded 0.3 milliseconds;
4. The selected function generation technique was to provide the capability for rapidly altering functional data, as further wind-tunnel or actual flight test data became available.

* Real-time simulation is required in order to evaluate flight hardware in the loop and to collect sufficient data in a reasonable amount of time.
A literature search of material related to multivariable function generation and consultations with
individuals and equipment manufacturers (See References 1, 2 and 3) yielded useful ideas which later
influenced the design concept. Most notably, contacts
with Mr. A. I. Rubin, author of Reference 3, provided
the base from which the ultimate design evolved.

ALTERNATIVES

Several alternatives were evaluated for satisfying the above problem definition. Improvement to the existing all analog aerodynamic function generation system was rejected because lengthy and complex curve fitting techniques are required whenever changes to aerodynamic data points are made. Also, the system did not accurately represent the functions, due to inherent noise and inability to include sufficient data points. Evaluation of function generation using a general purpose digital computer revealed that unacceptable time delays would be created. It was determined that a Hybrid Function Generation System (HFGS) would best meet the stringent requirements for accuracy, flexibility of modifying function data points, and frequency response. Two mechanization alternatives, a multiplexed and a parallel, were evaluated. The multiplexed HFGS, which generated several functions with the same set of circuits, could greatly reduce the number of computing components. Although a reduction in components could theoretically be achieved, the component specifications would be extremely rigid and the resultant system would have limited general purpose applications. The parallel HFGS, which employed separate circuits for each function, could be designed with commercially available computing components and configured to meet the required multivariable function generation task as well as other simulation needs. Based on these evaluations, the parallel alternative was chosen for implementation.

The parallel HFGS (see Figure 1) was envisioned to consist of a large general purpose digital computer, interfaced with a special hybrid computing unit. The digital computer would be assigned the task of data storage, management and input/output. The hybrid computing unit would perform the interpolation calculations in parallel, to produce continuous functions simultaneously.

Figure 1-The Boeing hybrid function generation system (missile simulation on EASE 2100 consoles and an XDS 9300 computer, ADI hybrid consoles, flight hardware and an operational mockup, and an aero data management program providing general control, address calculation, and data storage and retrieval for 6 functions of 6 variables, 3 functions of 4 variables, and 5 functions of 3 variables over analog, digital, and interrupt links)

HFGS DESIGN REQUIREMENTS

In order to meet the simulation requirements discussed earlier, certain internal design criteria had to be met:

1. The HFGS was required to produce updated function values in less than 5 msecs. following a breakpoint* crossing of any one of the independent variables. Consequently, the internal update time was not to exceed 2.5 msecs.
2. A correct set of 442 data points had to be retrieved from storage and made available for interpolation within the update time.
3. The data requirements had to be met within the storage constraints of the available digital computer.
4. The interpolated functions had to match within 1 percent the theoretical values obtained from linear interpolation of wind-tunnel data.

The stringent 5 msec. requirement was derived from considerations of missile frequency characteristics, minimum expected spacing between breakpoints and time delay constraints. A longer update time was expected to compromise the 1 percent accuracy specification considered necessary to provide realistic results.

* Breakpoint is a predefined discrete value, Xi, of an independent variable X for which a function data point f(Xi) exists.

In order to provide for the occurrence of two breakpoint crossings in rapid succession, the internal update
time had to be 2.5 msecs. This would ensure that
sequential processing of two closely spaced breakpoints
would not exceed the allotted 5 msecs. to accomplish a
function update.
DATA REDUCTION TECHNIQUES
The HFGS was required to produce functions from data matrices of 17×13×13×13×13×7 (3,398,759 points), 17×13×13×7 (20,111 points) and 6×3×25 (450 points) for functions of 6, 4 and 3 variables respectively. To represent the specified number of multivariable functions, the resulting data base would be 20.5 million points. Although required to accurately describe the highly non-linear aerodynamic functions, the data base was impractical to obtain from direct wind tunnel measurements or to fit into the random access memory of the available digital computer. Fortunately, not all of the 20.5 million points were needed to reproduce an EFFECTIVE data base of the specified size.
The aerodynamic coefficients produced by the HFGS are functions of six independent variables: angle of attack, α, side slip angle, β, velocity, M, and the control surface deflection angles δ1, δ2 and δ3. Although a six dimensional data matrix is needed to describe a function of six variables, in practice a minimum set of single fin effective data is obtained from wind tunnel measurements. The effect of each control surface is incrementally combined to produce an equivalent function of 6 variables. This superposition of the fin effects reduces a function of 6 variables to the sum of 3 and 4 variable functions. The superposition algorithm for this reduction is:

    f(α, β, M, δ1, δ2, δ3) = f0(α, β, M)
        + f1(α, β, M, δ1, 0, 0) - f0(α, β, M)
        + f2(α, β, M, 0, δ2, 0) - f0(α, β, M)
        + f3(α, β, M, 0, 0, δ3) - f0(α, β, M)

which simplifies to:

    f(α, β, M, δ1, δ2, δ3) = f1(α, β, M, δ1) + f2(α, β, M, δ2) + f3(α, β, M, δ3) - 2 f0(α, β, M)
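Written out as code, the reduction is a one-line combination of interpolators over the reduced matrices (a sketch; f0, f1, f2 and f3 are assumed to be callables over the 3- and 4-variable wind tunnel tables):

    def coefficient(alpha, beta, M, d1, d2, d3, f0, f1, f2, f3):
        """Equivalent 6-variable function built from single-fin data by superposition."""
        return (f1(alpha, beta, M, d1)
                + f2(alpha, beta, M, d2)
                + f3(alpha, beta, M, d3)
                - 2.0 * f0(alpha, beta, M))

Provided each fi reduces to the baseline f0 at zero deflection, the expression collapses to f0(α, β, M) when δ1 = δ2 = δ3 = 0, as it should.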

The 17×13×13×13×13×7 data matrix required for a function of 6 variables was reduced to one matrix of 17×13×7, and 3 matrices of 17×13×7×13 for functions of 3 and 4 variables respectively. This reduces 3,398,759 data points to 61,880. Since redundant data at δ1 = δ2 = δ3 = 0 in these matrices can be eliminated, the
number of data points would be 57,239 for each of the
6 functions of 6 variables.
In addition to superposition, vehicle symmetry
presented the opportunity to further reduce the data
base. Function values corresponding to negative angles
of an independent variable can be derived from data
measured at positive angles. The HFGS is programmed
to compute data points derived from symmetry during
each update cycle.
The above techniques reduced the specified 20.5 million point data base to approximately 435,000 points.
The available digital computer was an IBM 360/75
equipped with a 750K byte core memory. The data
base of 435K, 15-bit data points would require 870K
bytes. Since the total database could not reside in core,
the total function data was segmented into several
pages and stored on drum. Each page contained data
corresponding to a breakpoint of the slowest changing
variable. For linear interpolation data from two adjacent pages are required. Thus only four pages (two
immediate plus one ahead and one behind) need to be
in random access memory at any time. Since the paging
variable is relatively slow, a new page can be transferred to core before the paging variable exceeds the
boundaries described by the resident data.
The data base required to produce the functions could
now reside in core and be manipulated by the software.
The application of these data management techniques
did not compromise original system specifications.
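The paging discipline can be sketched as a four-page window keyed to the slowest changing (paging) variable (an illustration; the page size, the prefetch rule, and the drum read routine are assumptions):

    class PageWindow:
        """Keep the two pages bracketing the paging variable plus one on each side."""

        def __init__(self, read_page, n_pages):
            self.read_page = read_page          # e.g., a drum read routine
            self.n_pages = n_pages
            self.resident = {}                  # page index -> function data block

        def update(self, interval):
            """Ensure pages interval-1 .. interval+2 are in core for the current interval."""
            wanted = {k for k in range(interval - 1, interval + 3)
                      if 0 <= k < self.n_pages}
            for k in wanted - self.resident.keys():
                self.resident[k] = self.read_page(k)    # transfer ahead of need
            for k in set(self.resident) - wanted:
                del self.resident[k]                    # release pages left behind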
THE INTERPOLATION ALGORITHM
The linear interpolation algorithm was chosen to:
1. Provide direct correspondence between equation
and computing elements for easy hardware
implementation and fault isolation.
2. Allow systematic expansion from functions of
one to n variables.
3. Eliminate operational restrictions (such as
maximum slope and data spacing) because of
hardware limitations.
The algorithm can be derived directly from the
principle of superposition as follows:
Figure 2 shows a typical function of one variable.
Figure 3 shows the region of interest. For linear interpolation, regardless of how the rest of the data
points are distributed, the value f(X) at any point Xi ≤ X ≤ Xi+1 is determined by f(Xi) and f(Xi+1). The principle of superposition states that the value of f(X) can be computed as the sum of the individual con-

tributions from f(Xi) and f(Xi+1). The contribution of f(Xi) is f1 in Figure 2. From similar triangles:

    f1 = f(Xi) [(Xi+1 - X) / (Xi+1 - Xi)]

Similarly the contribution of f(Xi+1) is:

    f2 = f(Xi+1) [(X - Xi) / (Xi+1 - Xi)]

And

    f(X) = f1 + f2 = f(Xi) [(Xi+1 - X) / (Xi+1 - Xi)] + f(Xi+1) [(X - Xi) / (Xi+1 - Xi)]     (2)

Although the spacing between data points is not equal, simplification of the equation is accomplished by normalizing each interval between Xi and Xi+1 to unity. Thus, in Figure 2,

    Xi+1 - Xi = 1;
    (X - Xi) / (Xi+1 - Xi) = X - Xi = ΔX

and

    (Xi+1 - X) / (Xi+1 - Xi) = [Xi+1 - Xi] - [X - Xi] = 1 - ΔX

Equation (2) becomes

    f(X) = f(Xi)[1 - ΔX] + f(Xi+1)[ΔX]     (3)

Figure 2-A function of one variable

Figure 3-Generation of f(X) within two known points by superposition

The same principle can be applied to obtain the interpolation algorithm for a function of two variables. From Figure 4, f(X, Y) is determined from the magnitudes of f21 and f12, which are in turn determined by f(Xi, Yi), f(Xi+1, Yi), f(Xi, Yi+1) and f(Xi+1, Yi+1). The magnitudes are

    f21 = f(Xi, Yi)[1 - ΔX] + f(Xi+1, Yi)[ΔX]
    f12 = f(Xi, Yi+1)[1 - ΔX] + f(Xi+1, Yi+1)[ΔX]
    f(X, Y) = f21[1 - ΔY] + f12[ΔY]

Substituting for f12 and f21,

    f(X, Y) = f(Xi, Yi)[1 - ΔX][1 - ΔY]
            + f(Xi+1, Yi)[ΔX][1 - ΔY]
            + f(Xi, Yi+1)[1 - ΔX][ΔY]
            + f(Xi+1, Yi+1)[ΔX][ΔY]     (4)

Figure 4-Generation of a function of two variables by superposition
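Equation (4) is ordinary bilinear interpolation, and the same weighting scheme extends to any number of variables; a compact sketch of the arithmetic (not of the hybrid mechanization, which forms these products with MDACs and analog multipliers) is:

    from itertools import product

    def multilinear(frac, corners):
        """Interpolate with normalized distances frac = (dX, dY, ...) in [0, 1].

        corners maps a corner index tuple, e.g. (0, 1) for (Xi, Yj+1),
        to the tabulated function value at that breakpoint.
        """
        total = 0.0
        for corner in product((0, 1), repeat=len(frac)):
            weight = 1.0
            for d, bit in zip(frac, corner):
                weight *= d if bit else (1.0 - d)
            total += weight * corners[corner]
        return total

    # Two-variable check against equation (4): f known at the four surrounding points.
    corners = {(0, 0): 1.0, (1, 0): 3.0, (0, 1): 2.0, (1, 1): 6.0}
    print(multilinear((0.5, 0.5), corners))   # 3.0, the average of the four corners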

Figure 5A-A typical normalization circuit (digital and analog inputs, an MDAC, a high gain analog amplifier and an analog inverting amplifier forming ΔX and -ΔX)

Figure 5B-Generation of weighting coefficients for a function of three variables

Figure 5C-Implementation of a function of three variables, where (1) MDAC is a multiplying digital to analog converter, (2) f(Xi, Yj, Zk), ..., f(Xi+1, Yj+1, Zk+1) are the functional values at the breakpoints (digital), and (3) (1-ΔX)(1-ΔY)(1-ΔZ), ..., ΔXΔYΔZ are the weighting coefficients (analog voltages)

The same procedure can be applied to derive the interpolation algorithm for a function of three variables. But we also have observed from the physical picture that the value of the function f(X, Y) at any point (X, Y) is the sum of the contributions of the data points immediately surrounding the point. Furthermore the magnitude of each contribution is proportional to the magnitude of the data point multiplied by the normalized distance from the point (X, Y) to the reference point. The normalized distances are actually weighting coefficients of the data points since their expanded sum is equal to unity.

With the experience gained from the procedure of deriving the interpolation algorithm for a function of two variables we can systematically write down the interpolation algorithm for a function of n variables. As an example, for a function of four variables there are 2^n (n = 4) data points surrounding any point of interest. Therefore, there will be 2^4 = 16 terms contributing to the value of the function f(X, Y, Z, W). Each term will
be the value of a data point multiplied by a weighting coefficient or the normalized distance between the reference point and the point of interest. Thus

    f(X, Y, Z, W)
      = f(Xi, Yi, Zi, Wi)(1-ΔX)(1-ΔY)(1-ΔZ)(1-ΔW)
      + f(Xi+1, Yi, Zi, Wi)(ΔX)(1-ΔY)(1-ΔZ)(1-ΔW)
      + f(Xi, Yi+1, Zi, Wi)(1-ΔX)(ΔY)(1-ΔZ)(1-ΔW)
      + f(Xi+1, Yi+1, Zi, Wi)(ΔX)(ΔY)(1-ΔZ)(1-ΔW)
      + f(Xi, Yi, Zi+1, Wi)(1-ΔX)(1-ΔY)(ΔZ)(1-ΔW)
      + f(Xi+1, Yi, Zi+1, Wi)(ΔX)(1-ΔY)(ΔZ)(1-ΔW)
      + f(Xi, Yi+1, Zi+1, Wi)(1-ΔX)(ΔY)(ΔZ)(1-ΔW)
      + f(Xi+1, Yi+1, Zi+1, Wi)(ΔX)(ΔY)(ΔZ)(1-ΔW)
      + f(Xi, Yi, Zi, Wi+1)(1-ΔX)(1-ΔY)(1-ΔZ)(ΔW)
      + f(Xi+1, Yi, Zi, Wi+1)(ΔX)(1-ΔY)(1-ΔZ)(ΔW)
      + f(Xi, Yi+1, Zi, Wi+1)(1-ΔX)(ΔY)(1-ΔZ)(ΔW)
      + f(Xi+1, Yi+1, Zi, Wi+1)(ΔX)(ΔY)(1-ΔZ)(ΔW)
      + f(Xi, Yi, Zi+1, Wi+1)(1-ΔX)(1-ΔY)(ΔZ)(ΔW)
      + f(Xi+1, Yi, Zi+1, Wi+1)(ΔX)(1-ΔY)(ΔZ)(ΔW)
      + f(Xi, Yi+1, Zi+1, Wi+1)(1-ΔX)(ΔY)(ΔZ)(ΔW)
      + f(Xi+1, Yi+1, Zi+1, Wi+1)(ΔX)(ΔY)(ΔZ)(ΔW)

Figure 6-Implementation for the generation of a function of six variables

SYSTEM IMPLEMENTATION
In the interpolation algorithm the independent variables X, Y, etc., and their normalized values ΔX, ΔY, etc., are analog signals; the breakpoints Xi, Yi, etc., and the function data points f(Xi ...), etc., are digital values. Each function is the sum of several products. Each product is the multiplication of a digital value, f(Xi ...), by an analog signal, the product of the Δ's and (1-Δ)'s. This form of the algorithm is ideally suited for implementation using hybrid computing circuits. The basic computing elements are (1) the multiplying digital to analog converter (MDAC), (2) the analog multiplier, and (3) the operational amplifier. The implementation of a typical normalization circuit is shown in Figure 5A. Figure 5B shows how the products of the Δ's and (1-Δ)'s can be formed systematically for functions of two and three variables. Figure 5C shows the concise implementation of a function of three variables using MDAC's. Figure 6 shows the implementation of equation (1), a function of six variables.

Figure 7-HFGS implementation (IBM 2860 selector channels and 2701 adapter units interfacing to the analog consoles; digital data and analog signal paths)
In a typical flight simulation all aerodynamic
coefficients are functions of the same independent
variables. Under such conditions one set of normaliza-

tion circuits is sufficient for the generation of all
functions of the same variables. This simplifies the total
system implementation requirement considerably.
The HFGS is implemented with an IBM 360/75
digital computer interfaced to four AD/4 analog computers that house hybrid computing elements. A block
diagram showing the interconnections of the major
subsystems of the HFGS is in Figure 7. Two-way
communication is provided between the digital computer and the interpolation circuits. The DACs and
the MDACs in the normalization circuits receive break
point values and break point spacing information
respectively from the digital computer. The remaining
MDACs receive functional values of corresponding
break points from the digital computer. The dual data
path provides an effective transmission rate of one
million words per second. An analog-to-digital converter (ADC) supplies the values of the analog independent variables to the digital computer. Analog
comparators are used for monitoring the outputs of the
normalization circuits. The outputs of the comparators
are connected through an OR circuit to an interrupt
line in the digital computer. Servicing of the interrupt
is controlled by the function generation program.
SYSTEM OPERATION
The interpolation circuits for generating the multivariable functions are programmed on the analog
patchboards. The digital computer stores in memory
and bulk storage the digitally recorded data representing the functions to be generated. During a simulation, the analog computing elements monitor the
independent variables and generate interrupt signals to
the digital computer when the value of any variable
crosses a breakpoint. This condition occurs when the
normalized value of any independent variable goes
below zero or above unity. The ADC under the control
of the digital computer converts and supplies the values
of the independent variables to the digital computer.
Based on these values the digital computer calculates
data addresses, retrieves, orders and outputs the
necessary data to the MDACs. If necessary, it brings in
a new page of data from the drum concurrently with the
other operations. The computing elements in the analog
consoles perform linear interpolation simultaneously
and continuously based on the instantaneous value of
the independent variables. Negligible phase delay is
introduced as the functions are generated. As the value
of any variable crosses a breakpoint, the digital computer again updates the MDACs dynamically within
2.5 msec. with new data.
During the 2.5 msec. after an independent variable
crosses into a new region and before the MDACs can
be updated with the new data, the process of interpolation temporarily becomes extrapolation since the
independent variable is outside the defined region. The
error of extrapolation depends on the rate of change of
data values between adjacent data points. This amplitude error appears as high frequency noise in the
simulation.
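The digital side of each update can be sketched as follows (an illustration only; the interrupt servicing, address calculation, and 2.5-msec budget are described in the text but not reproduced here):

    def service_breakpoint_interrupt(x, grids, fetch_block, load_mdacs):
        """Locate the breakpoint interval of every independent variable and send
        the surrounding function values and breakpoint data to the MDACs.

        x           -- current independent variable values (from the ADC)
        grids       -- per-variable breakpoint lists
        fetch_block -- returns the corner data for a tuple of interval indices
        load_mdacs  -- writes breakpoints, spacings and corner values to the MDACs
        """
        intervals = []
        for value, grid in zip(x, grids):
            intervals.append(_interval_index(value, grid))
        load_mdacs(intervals, fetch_block(tuple(intervals)))

    def _interval_index(value, grid):
        """Index i such that grid[i] <= value < grid[i + 1], clamped to the table."""
        for i in range(len(grid) - 1):
            if value < grid[i + 1]:
                return i
        return len(grid) - 2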

HFGS CAPACITY

Equipment capable of generating functions of six variables is obviously good for functions of 5, 4, 3, 2 or 1 variables. Table I shows the maximum capacity of the Boeing HFGS. The number of functions that can be generated increases as the number of variables decreases. A mix in the number of functions and the number of variables is also possible. The total maximum number of independent variables at present is limited to 9.

TABLE I-Capacity of the HFGS

No. of variables      1      2      3      4      5      6
No. of functions    212    106     53     26     13      6

SYSTEM ACCURACY

One interesting point worth mentioning is the determination of the dynamic accuracy of the HFGS. If a standard sine wave of 100 Hz at 100 volts peak is used to represent the independent variables, the product term of the normalized variables in the interpolation algorithm can become very complex. For instance, let

    ΔX = ΔY = ΔZ = ΔW = sin ωt,

then

    (ΔX)(ΔY)(ΔZ)(ΔW) = sin^4 ωt = 3/8 - 1/2 cos 2ωt + 1/8 cos 4ωt

The product terms of the weighting coefficients all
contain harmonics of the original sine wave. The output
of the function, in general, will not be a simple sine
wave. However, with the understanding of the characteristics of the weighting coefficients, the equation
can be simplified for test purposes. The equation for
linear interpolation can be rearranged such that all
terms containing one normalized variable, such as ΔW, are collected in one group and the remaining terms containing (1-ΔW) in another group. Within each group the common terms ΔW and (1-ΔW) are
factored out giving:
    f(X, Y, Z, W)
    = (1-ΔW)[ f(Xi, Yi, Zi, Wi)(1-ΔX)(1-ΔY)(1-ΔZ)
            + f(Xi+1, Yi, Zi, Wi)(ΔX)(1-ΔY)(1-ΔZ)
            + f(Xi, Yi+1, Zi, Wi)(1-ΔX)(ΔY)(1-ΔZ)
            + f(Xi+1, Yi+1, Zi, Wi)(ΔX)(ΔY)(1-ΔZ)
            + f(Xi, Yi, Zi+1, Wi)(1-ΔX)(1-ΔY)(ΔZ)
            + f(Xi+1, Yi, Zi+1, Wi)(ΔX)(1-ΔY)(ΔZ)
            + f(Xi, Yi+1, Zi+1, Wi)(1-ΔX)(ΔY)(ΔZ)
            + f(Xi+1, Yi+1, Zi+1, Wi)(ΔX)(ΔY)(ΔZ) ]
    + (ΔW)[ f(Xi, Yi, Zi, Wi+1)(1-ΔX)(1-ΔY)(1-ΔZ)
            + f(Xi+1, Yi, Zi, Wi+1)(ΔX)(1-ΔY)(1-ΔZ)
            + f(Xi, Yi+1, Zi, Wi+1)(1-ΔX)(ΔY)(1-ΔZ)
            + f(Xi+1, Yi+1, Zi, Wi+1)(ΔX)(ΔY)(1-ΔZ)
            + f(Xi, Yi, Zi+1, Wi+1)(1-ΔX)(1-ΔY)(ΔZ)
            + f(Xi+1, Yi, Zi+1, Wi+1)(ΔX)(1-ΔY)(ΔZ)
            + f(Xi, Yi+1, Zi+1, Wi+1)(1-ΔX)(ΔY)(ΔZ)
            + f(Xi+1, Yi+1, Zi+1, Wi+1)(ΔX)(ΔY)(ΔZ) ]     (6)

The remaining coefficients inside the brackets are exactly the weighting coefficients of a function with one less variable. If we set the digital values of the data points in the two groups to E1 and E2 respectively, and factor them out of each group, we have

    f(X, Y, Z, W) = (1-ΔW)E1[(1-ΔX)(1-ΔY)(1-ΔZ) + ... + (ΔX)(ΔY)(ΔZ)]
                  + (ΔW)E2[(1-ΔX)(1-ΔY)(1-ΔZ) + ... + (ΔX)(ΔY)(ΔZ)]

Since the weighting coefficients inside the brackets reduce to unity, the equation becomes

    f(X, Y, Z, W) = (1-ΔW)E1[1] + (ΔW)E2[1]     (7)

Equation (7) indicates that whatever signals are used for ΔX, ΔY, and ΔZ, they should sum up to unity for properly chosen values of data points. Equation (7) shows the simple relationship between the output of the function and the input signal ΔW. The use of (7) enables the measurement of small dynamic errors by a simple comparison of output to input while all function generation circuits are being exercised.

The maximum total dynamic error including phase shift for a function of six variables measured at an output signal of 100 sin 2π(100)t volts is 1 percent. The measured error is very close to the calculated error based on the individual component specifications.

A COMPARISON WITH OTHER METHODS

Before the hybrid equipment was available for function generation, the aerodynamic coefficients were simulated in an analog aero model. An example of the interpolation polynomial for the coefficient Cl is given below. The mechanization of this polynomial on an analog computer introduces very little phase delay at the output. But, the basic drawbacks are:

1. Many man-months of effort are spent in the derivation of the interpolation polynomial from wind tunnel data.
2. The equation does not produce correct outputs throughout the entire range of interest.
3. The many multiplications and gains in a chain produce unacceptable noise amplitudes.
4. The method requires a major effort to update when new data are obtained from wind tunnel tests.

A sampled-data hybrid system employing a large scale digital computer with simple zero-order hold reconstruction was tested for function generation by the sample, compute, output and hold scheme. The complete cycle using digital interpolation is about 7 milliseconds. The output waveform of this sampled-data system is compared to the waveform of the HFGS in Figure 8. The significant difference is that the HFGS introduces negligible phase delay at the output. With the phase delay created by the sampled-data system the simulation was forced to run at 10 times slower than real-time. Even on a 10 times slower time scale, the method introduces into the simulation a loss in gain margin twice as much as the HFGS running in real time.

Figure 8-Comparison of output signals (a sampled-data system: sample, hold while compute, output; the HFGS: sample, compute, update)


A Sample Equation for an Aerodynamic Coefficient Cl

[The exhibit is a lengthy interpolation polynomial for Cl: a sum of terms in β, δ1, δ2, δ3 and α, including absolute-value and higher-order products such as |β|β, |δ1|δ1, β²δ, α²δ and β³ terms, with notes that several groups of derivatives depend on the sign of α, β, or δ, and that some derivatives are zero for negative α.]

SUMMARY OF SIGNIFICANT
IMPROVEMENTS USING THE HFGS
The use of the HFGS has enabled the refinement of
real-time flight simulations of high performance systems
to a degree never before achieved at Boeing. Some
important benefits resulting from the application of the
HFGS are summarized below.
1. The use of recorded data eliminates the tedious

process of curve fitting which often fails because
the functions are not analytic.
2. The flexibility of a digital computer allows fine
adjustments of the aeromodel unattainable by
analog computing methods.
3. The parallel, continuous outputs of the HFGS
eliminate the phase delay of sampled-data
systems. The illustration in Figure 8 shows that
the delay in updating the function values in the
MDAC's creates amplitude errors (due to
extrapolation) rather than a phase shift. The
superior signal characteristics of the HFGS
have enabled the simulation to run in real time
with a high degree of confidence in the accuracy
of the simulation results. The increase in

simulation speed by a factor of ten over the
sampled-data system represents a significant
improvement in computing efficiency.
The ease of function changes with the HFGS has enabled refinement of the aeromodel until simulation data and missile flight test telemetry data closely matched. Observed anomalies were exactly reproduced
by the simulation during the post-flight analysis phase.
Based on these factors this hybrid technique of multivariable function generation is considered to be a
success.

REFERENCES
1 R W HAMMING
Numerical methods for scientists and engineers
1962
2 J A PUSTAVER JR.
A multivariable interpolation formula
Air Force Cambridge Research Laboratories Physical
Sciences Research Papers No 358 May 1968
3 A I RUBIN
Hybrid techniques for generation of arbitrary functions
SIMULATION Volume 7 number 6 December 1966

Problems in, and a pragmatic approach to,
programming language measurement
by JEAN E. SAMMET
IBM Corporation
Cambridge, Massachusetts

INTRODUCTION


Although considerable attention has been given to the
measurement of compilers (e.g., size of compiler,
amount of memory needed for compilation, speed of
compilation and object program, size of object code),
virtually nothing has been done about measurement of
languages. This is not an empty issue, because there are
a number of relevant and significant questions pertaining to programming languages for which we would
like to have (quantitative) answers. For example,
given three languages which two are most alike? By
what criteria? Which of them is most like some fourth
language? How could we develop a general ranking or
hierarchy for a set of languages according to features
so as to handle subsets and extensions? Probably the
most important practical question is "For a given
application or set of applications, which language
is best?"
Among the most frequently used phrases involving
languages are the ones indicating one language is a
"dialect" of another, or one language is "like" another,
i.e., an "L-like language." There is no concrete meaning
for these terms. Furthermore, given two "dialects,"
how do we determine which is closer to the base
language?
Finally, as an illustration of a different kind of question, consider the problem of having N syntactic forms for accomplishing the same specific task; we need some
specific numerical method of comparing them. For
example, suppose one language has a key word XYZ,
and one "dialect" uses XYZABC while another
"dialect" uses RST; which is closer to the original?
The entire field of programming languages is quite
subjective; opinions are used more often than facts.
Part of the reason for this is the lack of numerical
values which can be associated with languages, and
hence the lack of concrete data. Development of
methods for quantification should help improve objectivity. Eventually such measurements should help improve the selection of a language, the design of new languages, the modification of existing ones, and even in implementation.

The second section establishes the problem by discussing currently used terms, the practical need for comparison, types of measurements, and elements not included in the current approach to the problem. The third and fourth sections discuss approaches to measurement of non-syntactic and syntactic characteristics, respectively, including relevant features, the numerical approach, and examples. A brief summary is given at the end.

The only related published work seems to be that of Goodenough.1 His paper concentrates on syntax and semantics from a linguistic (not a numerical) viewpoint. This paper takes a very pragmatic approach and emphasizes many of the intangible aspects of languages.
ESTABLISHMENT OF MEASUREMENT
PROBLEM
Currently used terms

There are a number of terms which are currently used
in discussing programming languages which imply some
type of measurement. Unfortunately this measurement
is very likely to be subjective. As the simplest illustration, consider the term "dialect" which is generally
used for a language purportedly very similar to some
other language; very often a dialect is merely a particular implementation of a well known language with some
"trivial" changes made. Well defined dialects usually
arise by making "minor" syntactic changes, e.g.,
restricting the number of characters in a data name,
eliminating certain options in a particular command, and/or adding some particular feature or removing a
restriction which is in the original language. As indicated
in the previous section, one of the problems that constantly faces us is the situation in which we have two
dialects of a given language and no way of measuring
them relative to each other or to the base language.
Another popular term which is frequently heard is
the phrase "L-like language." This usually refers to a
language which is similar in spirit and notation to
language L, but differs from it markedly enough not to
be considered merely a dialect. Thus we hear of
"ALGOL-like" languages, or "PL/I-like" languages
(e.g., REDUCE and MAD/1 respectively). Not only do
we have the same problem of measuring the "likeness"
that exists with dialects, but in addition we have no
way of indicating when a language stops being a dialect
and when it starts becoming an L-like language.
In both these cases the primary issue is one of syntax;
semantics generally plays only a minor role.
This paper does not provide firm definitions of the
terms "dialect" and "L-like language"; however, the
types of measurements discussed do provide an approach which will help in defining these terms. As a
first approximation, we might (arbitrarily) say that a
language which has a syntactic deviation of 20 percent
from another language is a dialect, whereas a deviation
between 20 percent and 50 percent would be considered
"language-L like." Beyond 50 percent it might be truly
considered a different language.
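One crude way to make such a percentage concrete (purely illustrative; it is not the measurement scheme developed later in this paper) is to compare the sets of syntactic forms of the two languages and take the fraction of the base language's forms that differ:

    def syntactic_deviation(base_forms, other_forms):
        """Fraction of the base language's syntactic forms not shared by the other."""
        base, other = set(base_forms), set(other_forms)
        return len(base - other) / len(base)

    base = {"IF", "GOTO", "DO", "ASSIGN", "READ", "WRITE", "FORMAT", "CALL", "RETURN", "STOP"}
    dialect = (base - {"ASSIGN"}) | {"READLN"}          # hypothetical minor changes
    print(f"{100 * syntactic_deviation(base, dialect):.0f}% deviation")   # 10%: a dialect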
The questions of subsets and extensions are somewhat
more easily dealt with. This author has already stated
in Reference 3 the following definition of subset: A
language S is considered a proper subset of a language L
if (1) there are some programs which can be legally
written in L which cannot be legally written in S; (2)
all legal S programs are legal L programs; and (3) the
results from a program written in S when executed with
an S compiler are the same as the results obtained from
an L compiler on the same machine, except for those
aspects which are implementation dependent.
From that definition of subset, we can then easily say
that a language E is an extension of a language L if L
is a subset of E.
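As a rough illustration (not part of the original discussion), one might model a language as nothing more than a set of named features, measure deviation as the fraction of features not shared, and apply the arbitrary 20 percent and 50 percent thresholds suggested above together with the subset and extension definitions just given. The sketch below is only an assumed simplification; the feature names are hypothetical.

def deviation(lang_a, lang_b):
    # Fraction of the combined feature set on which the two languages differ.
    union = lang_a | lang_b
    return len(lang_a ^ lang_b) / len(union) if union else 0.0

def classify(base, other):
    # Apply the (arbitrary) 20 percent / 50 percent thresholds suggested above.
    if other < base:
        return "proper subset"
    if other > base:
        return "extension"
    d = deviation(base, other)
    if d <= 0.20:
        return "dialect"
    if d <= 0.50:
        return "L-like language"
    return "different language"

BASE = {"LET", "GOTO", "IF", "PRINT", "READ", "DATA", "END", "FOR", "NEXT", "DIM"}
DIALECT = (BASE - {"DIM"}) | {"MAT"}        # one feature removed, one added
SUBSET = BASE - {"DATA", "DIM"}             # features removed only
print(classify(BASE, DIALECT), classify(BASE, SUBSET))   # dialect, proper subset

Of course, real languages are not mere feature sets; the sketch only shows how a purely numerical criterion could be stated once a suitable inventory of features is agreed upon.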
If we look at only subsets or extensions which are
arranged in a hierarchical fashion then there is very
little problem. Thus we could have a language L with
extensions E(1), E(2), ..., E(n) where E(i) is a proper
subset of E(i+1) for all i. A similar concept can apply
to subsets. However, in actual practice the situation is
seldom that simple and what happens far more frequently is that there are two languages, both of which
are extensions (or subsets) of the same base language
but neither of which is properly contained in the other.
We need to have some way of measuring the size of the
extension (or subset) when there are non-nested

extensions (or subsets). For example, if one extension
of a language contains a new data type, and another
one contains a new command, which is really a "larger"
extension of the base language? Similarly, if one subset
removes an input/output command and another one
eliminates a double precision facility, which is the
smaller subset?
Finally, as the worst situation we very frequently
have what can be called an "L-like extended subset."
This is a situation in which a language L has a subset S
with some minor deviations (called S') but some
features are added to the subset which are not in the
language L (say S'+). The result is surely an L-like
language (or might even be considered merely a dialect)
but we cannot say anything more quantitative than
that. If we have two such situations (say S1'+ and
S2'+) we have no way of measuring the amount of
differences involved. A good example of this is CPS
and RUSH, both of which are PL/I-like extended
subsets.

Practical need for comparison
There are at least two broad classes of people who
are concerned in a very practical way with these
measurements and comparisons, completely aside from
any theoretical interest. One is the user and the other
is the implementor.
The user in general is concerned with the problems
of relevance and compatibility. In the case of relevance,
he is very concerned with the usefulness of a particular
language for a particular problem or a broad application
area. However, any resulting decisions on what to
actually use are generally based on intuition. The user
badly needs a way of measuring languages in terms of
their relevance to his needs. He is also greatly concerned
with all of the issues pertaining to compatibility. For
example he would like some way of knowing which of
two languages is most like some other language, because
use of the "closer one" will ease his training problem
and/or improve compatibility and hence ease his
potential conversion. On the other hand, given two
languages both of which seem to meet his needs it
would be very desirable for him to have some way of
measuring which of these two was "closer" to his needs.
A nonnumerical approach to this for one case is described in Reference 2.
The implementor is slightly less concerned in a
practical way and slightly more interested from the
theoretical viewpoint. As part of the practical considerations, the implementor is likely to want to measure languages because he may want to consider which
techniques that have been previously used in a compiler

for a "similar" language are applicable; if there was
some reasonable measure of the deviations of the language he might be able to tell. In another instance, the
implementor might have a compiler for a given language
and be attempting to determine how wide a deviation
in the language could still be handled by the same
compiler. (In many cases the measurement may be
irrelevant because one language might be much closer
to another one numerically but the deviations would
cause far more drastic changes in compiling techniques
than from a language which had a bigger numerical
deviation.)
Types of measurements

There are many types of measurements that can be
applied, but not all are meaningful in all cases. One
important type of measurement is that of a single
language against some fixed numeric scale for a particular characteristic. Thus we might conceivably have an
absolute scale for generality which is certainly one
characteristic of a language; a numerical value could be
given (providing we could establish the scale in the
first place, which is extremely difficult).
Another very important type of measurement is of a
single language with respect to a given application.
Thus while we might measure a particular language for
generality considered across the scope of all desired
computations, in more realistic circumstances we would
be likely to consider a particular language against a
specific application or a broad area of applications.
Finally, we are frequently interested in measurements
between two languages. This tends to be simpler in
many cases because it is easier to indicate that language
A is more general than language B without actually
worrying about a numeric scale. If we introduce a third
language and we wish to rank them, we can still do this
on a pair-wise basis; however, it will become increasingly
harder to compare more than two languages unless we
use some type of numeric scale.
It is important to realize that there is a significant
difference between measurements of a language and
measurements of a program written in that language.
The former can be developed once, from the given
specifications, whereas the latter will literally depend
upon who does the programming.
Elements not included in the current approach
to the problem

In this first introduction to the overall problem there
are several elements which will not be considered. First
and foremost, no attempt will be made to include any


type of formal semantics. We will be dealing primarily
with syntax, and will implicitly use semantics only in
an intuitive fashion, i.e., we know intuitively what a
particular syntactic construction is meant to do.
Furthermore, syntactic measurements will be shown in
this paper only through a limited example i.e., specific
methods of providing detailed measurements of syntactic elements are not included. Finally, no attempt
will be made to establish an absolute scale for individual
features.
It should be emphasized that the problem of measuring languages specifically excludes measurement or
numerical comparisons of implementations. Thus this
aspect does not need to be included in any approach to
the problem.

APPROACH TO MEASUREMENTS BASED ON
NON-SYNTACTIC CHARACTERISTICS
Relevant features and viewpoints

As the first step in measuring programming languages
with respect to their non-syntactic characteristics, it is
necessary to list the relevant features or elements or
characteristics (e.g., consistency, ease of reading)
that are to be measured, where subfeatures will be used
as appropriate. Figure 1 provides this list. However, it
is not necessarily meaningful to measure all features in
all ways. The characteristics of the language can be
measured with respect to the following viewpoints:
absolute scale (if one exists)
user (U)*
implementor (I)*
one other language (OOL)*
two or more other languages (TML)*
specific application (SA)*
application area (AA)*

A particular characteristic may be irrelevant or of
minor importance from some viewpoints (e.g., relevance
to an application is not significant to the implementor).
In a few cases the relevance is questionable.
Numerical approach

To the extent that numbers can be supplied, the
following simple techniques will be used. Absolute
scales should be established in the range 0 to 1,
with 1 representing the maximum of the characteristic.

* Abbreviation used in Figure 1.


Figure 1-Measurement of Non-Syntactic Characteristics

Viewpoints: U = User; I = Implementor; OOL = One Other Language; TML = Two or More Other Languages; SA = Specific Application; AA = Application Area.
Entries: g = of great importance; s = of small importance; ? = relevance is questionable; NA = not applicable.
[The individual g/s/?/NA ratings assigned to each characteristic under each viewpoint could not be recovered from this copy.]

GENERALITY
This is essentially equivalent to the phrase "general purpose". A completely general purpose language could be used effectively for all applications and problems. There is no such language today.

NATURALNESS
This deals with the intuitive resemblance of the programming language itself to the way in which an individual would describe the problem or give instructions about how to solve it to another person.

NON-PROCEDURAL
It is this author's contention that non-procedural is a relative term which changes as the state of the art changes. At a given point in time one can talk about the amount of non-procedurality in a particular language.

CONSISTENCY
This deals with internal contradictions in rules, or exceptions to rules. For example a language might say that blanks are irrelevant except in particular cases. This language would then have a certain amount of inconsistency within it.

EASE OF
READING
WRITING
DEBUGGING (from the language viewpoint only)
MAINTENANCE
LEARNING
CONVERSION
IMPLEMENTATION
Each of these "ease" features calls for a different measurement since what is easy to read is not necessarily easy to write, etc.

ENVIRONMENT INDEPENDENCE
MACHINE INDEPENDENCE
OPERATING SYSTEM INDEPENDENCE
ON-LINE VERSUS BATCH INDEPENDENCE
While all programming languages are fairly machine independent, some have some implicit hardware dependencies which could be simulated but only with great inefficiency (e.g., read tape backward). Furthermore there are cases of language dependencies on the operating system (e.g., handling of STOP command or its equivalent). Some languages can only be implemented effectively in on-line or batch modes.

RELEVANCE TO APPLICATION AREA
This includes the whole gamut of notation, features, etc. which would be used for a particular application area.

RELEVANCE TO SPECIFIC APPLICATION
In contrast with the above, a particular application may impose different demands on the language than a broader area and therefore different measurements will be needed.

SIMPLICITY
This provides some measure of the complexity of the rules in the language but also must bear some relationship to the amount of generality. A very narrow language can be very simple because not many rules are needed; a very general language may need more rules. Thus simplicity really should be measured both in absolute terms and also as a ratio of the generality. To simplify matters only the absolute scale will be considered.

SUCCINCTNESS
This contrasts with verbosity, i.e., how many pencil strokes are needed to convey a particular concept.

USE AS A HARDWARE LANGUAGE
Some languages are definitely defined using a character set which is not readily available on normal equipment. This affects the direct and immediate use of the language as input to a computer.

USE AS A PUBLICATION LANGUAGE
Some languages are particularly well designed for use in normal publication media and this can be used as a measurement.

* While these two columns (OOL and TML) are identical in this formulation, it seems advisable to keep both for potential changes as this table is revised and refined.


TABLE I-Measurement of Languages from Viewpoint of A User Writing A Payroll Program*

Columns: Feature; Weighting Factor*; Normalized Weighting Factor; Raw Scores* for COBOL and PL/I; Normalized Weighted Scores for COBOL and PL/I.

Features rated: Consistency; Ease of reading, writing, debugging, maintenance, learning, conversion, and implementation; Environment independence (machine independence, operating system independence, on-line vs. batch independence); Generality; Naturalness; Non-procedural; Relevance to application area; Relevance to specific application; Simplicity; Succinctness; Use as hardware language; Use as publication language.

Totals: weighting factors 10.0; normalized weighting factors 1.00; normalized weighted scores .933 for COBOL and .644 for PL/I.

[The individual entries in the numeric columns could not be recovered reliably from this copy; only the totals are shown.]

* These values are based on the author's personal judgment and are somewhat arbitrary. The numbers are meant primarily for illustrative purposes.

It will be assumed that the relationships are
linear, i.e., if one language has a measurement of .9
on an absolute scale (with the maximum of 1) and
another one has a measurement of .3 it will be assumed
that the first had three times more of the characteristic
than the second. Since the types of elements being
measured are not in the same units, each measurement
must be normalized. This is accomplished by doing the
following: Where comparisons of two or more languages
are made, the one having the highest value will be
assigned the value 1 and all others assigned the
appropriate ratio value. Using this technique also
permits us to obtain quantitative results without
assigning an absolute measurement to any language
elements.
A fundamental assumption in attempting to measure
programming languages from any point of view is that
the criteria to be used will vary depending on both the
individual and the viewpoint from which he makes the
measurement. What is important to one person is
unimportant to another; what is vital in one type of
measurement is insignificant in another. For example,
from the viewpoint of relevance to an application area
the actual syntactic form of a loop control statement

may not make much difference, whereas rules on
formation of data names may be very significant. The
views of two individuals will differ even if they make the
same comparisons from the same viewpoints (e.g., use,
implementation). In order to accommodate these
individual differences, each person attempting to do a
measurement will assign his own weighting factors
each time he makes a measurement. These will be
normalized so that the total is 1 and then a numerical
score can be obtained by multiplying each weighting
factor against the numerical value (already normalized)
for the related characteristic and adding them. This
then produces a number, albeit a very crude one, which represents a measurement of a particular language using the viewpoint (i.e., the criteria) of the person
making the measurement.
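The arithmetic just described is simple enough to state as a short program. The following is only a minimal sketch of the procedure, with hypothetical weighting factors and raw scores (they are not the values of Table I); it assumes the raw scores have already been normalized as described.

def weighted_measure(weights, raw_scores):
    # Normalize the weighting factors to sum to 1, then form the weighted
    # sum of the (already normalized) raw scores for one language.
    total = sum(weights.values())
    return sum((weights[f] / total) * raw_scores[f] for f in weights)

# Hypothetical weights for one viewpoint and raw scores for two languages.
weights = {"naturalness": 0.9, "simplicity": 0.5, "succinctness": 0.3}
lang_a  = {"naturalness": 1.0, "simplicity": 0.8, "succinctness": 0.6}
lang_b  = {"naturalness": 0.7, "simplicity": 1.0, "succinctness": 1.0}

print(weighted_measure(weights, lang_a))   # about .87
print(weighted_measure(weights, lang_b))   # about .84

As in Table I, the resulting numbers have no absolute meaning; only their comparison under one person's weighting factors is significant.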
TABLE II-Types of Changes to be Made to a Base Language
None (i.e., same as base language)
Deletion (i.e., subset)
Addition (i.e., extension)
Substitution (of one word or character for another)
Optional (in new version instead of required in base)
Required (in new version instead of optional in base)
Other change (i.e., not shown above)


TABLE III-Features in Base Language and Two Other Versions*

Feature | Base Language | Version 1 | Version 2
Character Set | Uses A-Z = ** + - * / < > . , | As shown, plus use of ... for NOT | ... instead of **
Literals | Single quotes required | Single or double quotes required | Double quotes required
Multiple Assignment Statements | Not allowed | Allowed | -
Keyword LET before assignment statements | Optional | Required | Optional use of keywords COMPUTE or LET
Built in functions | 11 specific functions | - | 11 as specified plus 2 more functions
Computed GOTO | Not defined | - | Allowed
DATA statement | Number or character strings | Numbers or character strings or expressions | Numbers or character strings or expressions
MATRIX inversion statement | Present | Not allowed | -
END | Required as last statement in program | Optional | -
Statement numbers | Required | - | Optional

* The entry - indicates that the feature has the same specification as the base language.

Specific example

Table I shows the author's evaluation of the non-syntactic features of COBOL and PL/I made from the
viewpoint of a user wishing to write a payroll program.
Since this author believes intuitively that COBOL is
better suited for this application, it is not surprising
that the numerical results confirm that; however the
figures were not manipulated, i.e., weighting factors and
raw scores were assigned without any fudging of the
figures except to increase two weighting factors by .1
each to make the total 10 instead of the original 9.8
which had come about naturally. Readers are invited
to replicate this experiment for themselves.

APPROACH TO MEASUREMENT OF SYNTAX

Relevant features

In considering the syntactic elements of the language, the following are some of the parameters which must be considered as elements in the measuring system:

reserved words (existence, number, exact words themselves)
punctuation, including handling of blanks and literals
length of user-defined identifiers (i.e., data names, statement names)
mandatory versus optional usage of words or features
program structure (e.g., blocks, procedures, sequencing rules)
data types
commands
declarations

There are many different ways of measuring, and not all the items in the above list can be measured the same way. For example, it is easy to compare lists of reserved words, but hard to compare program structures. Then again, which is more important in comparing reserved words-length, or similarity of letters? i.e., is the word XYZ closer to the word XYZABC or to the word ABC? The answer again depends on the viewpoint. The primary viewpoints are user and implementor, each of whom may use specific criteria (e.g., compatibility, generality) and/or many of the non-syntactic characteristics shown in Figure 1.

TABLE IV (a and b)-Scores Assigned to Types of Changes*
Change Type | (a) User Viewpoint of Compatibility of Program in the Base Language to New Language | (b) User Viewpoint of Generality of New Language with Respect to Base Language
None | +1 | 0
Deletion | -1 | -1
Addition | 0 | +1
Substitution | -.5 | 0
Optional | -.1 | +.8
Required | -.8 | -.5
Other change | -.7 | 0

* These scores are based on the author's value judgments and are
somewhat arbitrary; they are meant primarily for illustrative
purposes. The scores are not normalized because that seems to
be unnecessary. However, maximum values of +1 and -1 are
used.


TABLE V-Measurement of Features From Viewpoint of Compatibility

Feature | Weighting* Factor | Normalized Weighting Factor | Raw Score** Version 1 | Raw Score** Version 2 | Normalized Weighted Score Version 1 | Normalized Weighted Score Version 2
Character Set | .8 | .13 | 0 | -.5 | 0 | -.068
Literals | .8 | .13 | 0 | -.5 | 0 | -.068
Multiple Assignment Statements | .4 | .07 | 0 | +1 | 0 | +.07
Keyword LET before assignment statements | .9 | .15 | -.8 | -.7 | -.12 | -.105
Built in Functions | .6 | .10 | +1 | 0 | +.10 | 0
Computed GOTO | .5 | .08 | +1 | 0 | +.08 | 0
DATA statement | .8 | .13 | 0 | 0 | 0 | 0
MATRIX inversion statement | .2 | .03 | -1 | +1 | -.03 | +.03
END | .1 | .02 | -.1 | +1 | -.002 | +.02
Statement numbers | .9 | .15 | +1 | -.1 | +.15 | -.015
Totals | 6.0 | .99 | | | +.178 | -.136

* These values are based on the author's personal judgment of the importance of the feature with regard to compatibility and the other
parts of the language. The values are meant primarily for illustrative purposes.
** Based on Tables III and IV(a).
Note: The significance of the specific numbers +.178 and -.136 cannot be stated quantitatively. What can be concluded is that Version 1
is more compatible with the base language than Version 2.


Numerical approach

The approach here is similar to that used for the non-syntactic characteristics. The major types of changes
to the syntax of a language are shown in Table II.
(Note that the common term "restriction" actually
becomes one of the listed items for each specific case.)

TABLE VI-Measurement of Features From Viewpoint of Generality

Feature | Weighting* Factor | Normalized Weighting Factor | Raw Score** Version 1 | Raw Score** Version 2 | Normalized Weighted Score Version 1 | Normalized Weighted Score Version 2
Character Set | .5 | .10 | +1 | 0 | +.10 | 0
Literals | .5 | .10 | +1 | 0 | +.10 | 0
Multiple Assignment Statements | .3 | .06 | +1 | 0 | +.06 | 0
Keyword LET before assignment statements | .8 | .16 | -.5 | 0 | -.08 | 0
Built in functions | .6 | .12 | 0 | +1 | 0 | +.12
Computed GOTO | .5 | .10 | 0 | +1 | 0 | +.10
DATA Statement | .7 | .14 | +1 | +1 | +.14 | +.14
MATRIX inversion statement | .3 | .06 | -1 | 0 | -.06 | 0
END | .1 | .02 | +.8 | 0 | +.016 | 0
Statement numbers | .7 | .14 | 0 | +.8 | 0 | +.112
Totals | 5.0 | 1.00 | | | +.276 | +.472

* These values are based on the author's personal judgment of the importance of the feature with regard to generality and the other
parts of the language. The values are meant primarily for illustrative purposes.
** Based on Tables III and IV(b).
Note: The significance of the specific numbers +.276 and +.472 cannot be stated quantitatively. What can be concluded is that Versions
1 and 2 are both more general than the base language and Version 2 is more general than Version 1.


Depending on the viewpoint from which the measurement is to be made, the individual assigns a value from
+1 to -1 with the "best" and "worst" changes at the
extremes. A change which is irrelevant to the particular
measurement being made is assigned the value 0.
Each syntactic feature under consideration is
assigned a weighting factor between 0 and 1 to
represent the individual's judgment of its importance
from the particular viewpoint involved. These are
then normalized.
Raw scores for each syntactic feature are obtained
by determining the type of each syntactic deviation and
assigning the appropriate "measure of change" score.
Multiplication of normalized weighting factors by the
raw scores, followed by addition, yields a number
representing the syntactic deviation from the base
language according to the specified viewpoint.
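The procedure can again be stated as a few lines of code. The sketch below is only an illustration; the feature names and weighting factors are hypothetical, and the change-type scores simply echo the compatibility column of Table IV(a).

CHANGE_SCORE = {            # compatibility viewpoint, as in Table IV(a)
    "none": +1.0, "deletion": -1.0, "addition": 0.0,
    "substitution": -0.5, "optional": -0.1, "required": -0.8, "other": -0.7,
}

def syntactic_deviation(weights, changes):
    # weights: feature -> importance in [0, 1]; changes: feature -> change type.
    total = sum(weights.values())
    return sum((w / total) * CHANGE_SCORE[changes[f]] for f, w in weights.items())

# Hypothetical example: two features of a dialect judged for compatibility.
weights = {"character set": 0.8, "END statement": 0.1}
changes = {"character set": "substitution", "END statement": "optional"}
print(round(syntactic_deviation(weights, changes), 3))   # about -.456

As with the non-syntactic measurement, the resulting number is meaningful only when compared with the number obtained for another version under the same weights.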
Specific examples

In order to illustrate the types of measuring on syntax
that can be done, some hypothetical cases are taken. In
Table III, some syntactic features in a (hypothetical)
base language are specified, and then two versions of
the base language are defined with respect to those
same characteristics. (The language is intuitively
BASIC, but that is not significant to the discussion.)
Tables IV(a) and IV(b) assign a score to each of the
change types, considered from two different points of
view-compatibility of a program in the base language
to a new language, and generality with respect to the
base language. Tables V and VI show the raw and
weighted scores for each version, from the viewpoints of
compatibility and generality, respectively. The results
show (a) Version 1 is more compatible with the base
language than Version 2, and (b) Versions 1 and 2 are

more general than the base language, with Version 2
more general than Version 1.
SUMMARY
This paper has attempted to provide an introduction
to the need for, and a pragmatic approach to, measuring
programming languages. Commonly used terms such as
"dialect" were shown to have only an intuitive meaning
although numerical measures could and should be
defined. Two very simple numerical approaches to
obtaining some quantitative results for syntactic and
non-syntactic characteristics were outlined. A set of
non-syntactic characteristics was described, and the
major syntactic parameters involved in measurement
were shown. Three specific examples illustrate the
techniques involved.
The approach shown here is not yet ready for practical
usage except in very simple cases, and the numerical
techniques have deliberately been kept simple to make
further explorations of this problem easy to do. The
actual numbers used were primarily for illustrative
purposes and should not be considered as absolute
values to be used in all similar cases.
REFERENCES
1 J B GOODENOUGH
The comparison of programming languages:
a linguistic approach
Proceedings ACM 23rd National Conference 1968
2 H HESS C MARTIN
TACPOL-a tactical C & C subset of PL/I
Datamation Vol 16 No 4 April 1970
3 J SAMMET
Programming languages: History and fundamentals
Prentice-Hall Englewood Cliffs N J 1969

The ECL programming system*
by BEN WEGBREIT
Harvard University
Cambridge, Massachusetts

INTRODUCTION

tion with automatic storage reclamation, record handling, and algorithm-independent data description.
Further, it provides facilities which allow the programmer to define extensions to the language to tailor
it to each particular problem area. New data types,
new operators, new syntax and new control structures
can be added to the language enabling the program to
model directly the objects, unit operations, relations,
and control behavior of each problem domain. For example, list processing, matrix arithmetic, string manipulation by pattern matching and replacement, and
discrete simulation can all be carried out in EL1 by
appropriate extensions.
To aid program construction and debugging, the
ECL system has been designed for use in an interactive
on-line fashion. t Programs can be composed at the
console using a text editor and run interpretively with
appropriate levels of error checking, tracing, and conditional suspension. With execution suspended, the
programmer can examine data or program, modify
either, and resume. Any variable may be declared
"sensitive"; changes to its value are monitored and an
interrupt generated whenever a programmer-specified
predicate associated with the variable becomes true.
Several system facilities contribute to the construction of efficient programs. One is the compiler. Variables can be data typed so that the compiler can perform type checking, compile in type conversion, and
choose among alternative procedure bodies on the basis
of argument data types. The compiler can be called at
any time, so it is possible to write procedures which
compile themselves or other procedures. To allow economical use of storage, the language allows packed

ECL is a programming language system currently
being implemented as a research project at Harvard
University.** Its goal is an environment which will
significantly facilitate the production of programs. In
this paper, we describe the motivation for this project,
present the approach taken in its design, and sketch
the resulting ECL system. Detailed treatment of specific aspects of the system are found elsewhere.1,2
Programmers, whether professionals or casual users,
manufacture a unique product, programs: objects,
often large, which must be coded, modified, debugged,
verified, made efficient, and run on data. In providing
an environment for this manufacturing, four goals
were considered primary:
1. To allow problem-oriented description of algorithm, data, and control over a wide range of
application areas.
2. To facilitate program construction and debugging.
3. To allow and assist in the development of highly
efficient programs.
4. To facilitate smooth progression between initial
program construction and the final realization of
an efficient product.

ECL consists of a programming language and a system
built around this language to meet these goals.
The language component, called EL1, includes most
of the concepts of ALGOL 60, LISP 1.5, and COBOL.
It provides standard arithmetic capability on scalars
and multidimensional arrays, dynamic storage alloca-

t This is not to the neglect of batch processing. Any interactive
language can be used in batch mode if the job control commands
that would normally come from the console are taken from a file
and results which would normally appear on the console are
written to a second file. ECL allows such switching of command
streams, so that batch processing falls out as a subcase of its
normal mode of operation.

* This work was supported in part by the U.S. Air Force,
Electronics System Division, under Contract No. F19628-68-C-0101 and by the Advanced Research Projects Agency under
Contract No. F19628-68-C-0379.
** The current implementation is on a PDP-10 running under the
10/50 monitor. Versions for other machines are contemplated.


data (e.g., bits, bit strings, bytes, and byte strings) and
operations on such data objects. This is carried out in
a machine-independent notation and representation so
that programs using this are not tied to a particular
machine. To allow the construction of efficient programs which include asynchronous components, ECL
includes multiprogramming and a programmer-controllable interrupt system.
Efficiency, in any metric, is seldom gained at one
fell blow; programs are only relatively stable. Even
after code is checked out with the interpreter and compiled, it is usually changed and frequently requires debugging. Further, it is sometimes necessary to compile
part of a program in order to get sufficient speed to
test an algorithm against a large data base. Since the
road is filled with relapses, it is important to allow
smooth progression and regression between initial construction and final product. It should scarcely need
saying that the languages acceptable to the interpreter
and compiler are identical and that compiled and interpreted code may be freely intermixed with no restrictions. For example, the result of compiled code may be
used as an argument to interpreted code; a goto in
interpreted code may lead back into compiled code;
variables local to compiled code may be accessed by
interpreted code, etc. A less familiar concept, but
equally fundamental, is the notion that compilation is
not all or nothing. In ECL, compilation can be carried
out to any level depending on the amount of information supplied to the compiler: specifically, the number
of program components that the programmer is willing
to accept as being invariant. The more invariants, the
better the compiled code. As with interpreted code, the
execution of compiled code may be broken (either by
an internal condition or an external interrupt) to allow
intervention by the programmer, e.g., for debugging
purposes.
The primary motivation for, and the intended use of,
the ECL programming system is "difficult" programming efforts. That is, projects which could otherwise be
carried out only with considerable waste of human or
machine resources. It is our intention that ECL be
usable for production programming. Hence the emphasis on machine efficiency. This is not to say that the
requirements of interactive usage have been slighted in
system design. Quite the contrary, we view good interaction capability and a well-engineered debugging
facility as significant tools in tackling a difficult programming project. The utility of on-line debugging
should be clear. Equally important is the use of an
interactive capability in developing and refining algorithms. Still more important is the use of interaction
in allowing measurement of program behavior and the

attendant optimization based on knowledge of this
behavior.
SYSTEM ORGANIZATION AND DESIGN
PHILOSOPHY
Before discussing ECL in detail, it will be useful to
outline its internal organization and discuss the philosophy which underlies its design.
Normally, one uses ECL on-line, communicating with
the system via a console. As seen by the programmer,
ECL is an executor of input commands. Syntactically,
commands correspond roughly to statements of an algebraic language; semantically, commands embrace all
actions expressible in the system. Hence, commands
include: conventional algebraic statements, definitions
used to construct new procedures and operators, and
the "job control" statements of a batch processing
system such as instructions to compile procedures,
transact with data sets, create and destroy processes,
etc.
As seen by ECL, the programmer is a source of input
commands. We will take the system's point of view. It
reads and parses each command, interprets it, and
turns to the next command. Since commands include

calls on procedures which may be programmer-defined,
the interpretation portion of the cycle may set off the
running of a compiled program.

Figure 1-Primary system modules
At the heart of ECL is the command handler-the
routine which controls the above command loop. It has
two main components: the parser and the interpreter
(c.f. Figure 1). The parser calls on a lexical analyzer to
decompose the input stream into lexemes. The parser
then analyzes the lexeme stream as directed by parse
tables previously derived from a syntactic specification
of the language. Both the input source and the parse
tables may be changed by commands, so that the
source of commands and the language in which commands are expressed are subject to change by the programmer. The output of the parser is a representation
of the command as a linked list. Constituent syntactic
units are represented by sublists, recursively. The
command handler calls on the interpreter to execute the
command. When this is completed, control returns to
the command handler which outputs the result and
then calls on the parser for the next command.
The list structured representation has two uses. On
the one hand, it can be executed directly by the interpreter; on the other, it is a convenient form of input to
the compiler. This achieves several economies. A program need be parsed only once, on input. Hence the
interpreter does not reparse a line each time it is encountered during execution, e.g., in a loop. Also, the
compiler is considerably simplified since it is not at all
concerned with parsing.
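The paper does not show the concrete list forms used internally, so the following is only an assumed illustration of the general idea: a command is parsed once into a nested prefix list, which an interpreter can walk directly and a compiler can consume without re-parsing.

# Assumed illustration only: a command such as  X <- A + B * C  might be
# represented as a nested prefix list after one parse.
command = ["<-", "X", ["+", "A", ["*", "B", "C"]]]

def interpret(form, env):
    # Tiny evaluator over the nested-list form (identifiers and two operators).
    if isinstance(form, str):                      # identifier
        return env[form]
    head, *args = form
    if head == "<-":                               # assignment
        env[args[0]] = interpret(args[1], env)
        return env[args[0]]
    ops = {"+": lambda a, b: a + b, "*": lambda a, b: a * b}
    return ops[head](*(interpret(a, env) for a in args))

env = {"A": 2, "B": 3, "C": 4}
print(interpret(command, env), env["X"])           # 14 14

The same nested list could just as well be handed to a code generator, which is the economy the text describes.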
Most commands will be function calls, i.e., the application of a routine (procedure or operator) to a set of
arguments. Routines initially available in ECL include:
1. The conventional arithmetic, relational, and
trigonometric routines.
2. A set of I/O routines.
3. A routine for defining new procedures and
operators.
4. The compiler.
5. Routines to define new data types.
6. Routines to change the parse tables, thereby
changing the syntax of the language.
7. Routines to allocate storage, and a garbage collector to reclaim storage no longer in use.
8. Routines to create, run and destroy processes.
The first three sets require no explanation; the others
will be discussed individually in subsequent sections.
It should be clear that ECL is an unusually eclectic
system. This is unavoidable; a complete programming
environment necessarily includes many components,


each fairly complex. There is a certain danger in this.
Such a system can easily become very large, hence
prohibitively expensive to implement and maintain. No
less dangerous is the possibility that a system may be
unwieldy for the casual users. Finally, there is the
danger that the system may impose too much or the
wrong kind of structure on the programmer. With each
decision made incorrectly, a language system inconveniences some class of users. With many decisions to
make, a system is certain to inconvenience all programmers some of the time.
In ECL, these very real dangers of an eclectic system
have been avoided by judicious application of four concepts: (1) extension mechanisms, (2) sustained variability, (3) bootstrapping, and (4) system uniformity.
The first of these has been mentioned earlier. The
idea is to construct a small initial system consisting
mostly of powerful definition facilities for self-extension. Only the initial system-the nucleus-need be
implemented and maintained by the system's creators.
The rest is built on this by the programmer or programming group to suit its needs and taste. The ECL
provides definition mechanisms for extension along
three axes: syntax, data types, and control structures.
A second key concept, distinct from language extension, is systematic variability. That is, the deliberate provision for access by the programmer to key
points at which he can control system behavior. All
well-designed systems have key points of control;
usually, however, these points are deeply embedded in
the system either on grounds of supposed efficiency or
because actions to be taken were believed to be incapable of sustaining intelligent variation. Seldom is
the burial justified. Allowing programmer control over
such issues provides a surprising amount of power. In
ECL, three points have been singled out for attention:
error and interrupt handling, input/output stream
direction, and data type conversion on binding formal
parameters of routines to their arguments.
Bootstrapping, i.e., using the system to define parts
of itself, provides system variability at another level.
In ECL, bootstrapping has been a fundamental implementation technique. The data type extension facility
was used to create the system data types needed by the
interpreter itself. Further, large parts of the system are
coded in the language, most notably the compiler. Such
system modules can be run either interpreted or compiled: the compiler, of course, is compiled by itself
using the interpreter. For the system implementor, this
technique avoids a large amount of machine language
coding with the attendant benefits of rapid production,
better system organization, and ease of change. For
sophisticated system users, this bootstrapping provides


an additional point of variability: those portions of the
system coded in the language are accessible to change.
The fourth concept in the ECL system is uniformity.
Insofar as possible, the entire environment of the programmer is treated as a single homogeneous space without special times, cases, or preferred objects. Correspondingly, the implementor has to deal with a system
notable for its lack of special cases and "funny"
situations.
All data types (called modes in ECL) are treated
equally. Each class of objects in the system has a
mode; for each mode there are values of that mode;
declarations are used to create variables which can be
assigned values of that mode. A procedure is like any
other object in this respect. It is a value, it has a mode,
and may be assigned to be the value of a procedure-valued variable. Programs can be treated as data and
data as programs. Programs which generate other programs are straightforward. Files (somewhat generalized) are another mode in the system, so that programs
can compute the source of or sink for input/output and
can arrange for arbitrary transformation of the data
during transmission. Finally, there is no preferred
status for the data type mode. A mode (e.g., integer) is
just as legitimate a value as, say, 3.1. Hence, mode
values may be computed, assigned to variables of data
type mode, passed as arguments to routines, etc. There
are a number of system-defined routines which take
modes as arguments and produce new modes. Additional routines for computing modes may be defined by
the programmer from these. Hence, a programming
project might include all of the following (c.f. Figure 2):
1. Defining a set of routines which compute modes.
2. Writing a program which uses variables whose
modes are of the class generated by 1.

3. Running the program defined in step 2 interpretively, halting, modifying and debugging it.
4. Running the routines of step 1 on input data to
compute a set of modes.
5. Compiling the program of step 2 to get object
code tailored to the data types computed in
step 4.
6. Running the object program of step 5 on a data
set.
Conceivably, this could be done in a single console
session. Alternatively, these steps might be carried out
over the course of several months as a large programming effort goes through the process of defining its
data formats, coding and checking out its routines,
metering the input profile, compiling and tuning code,
and finally running. The key point is that all these

Figure 2-Program development in ECL

steps can be carried out in a single system using a common language to describe their actions.
SYSTEM FACILITIES
In this section we discuss the key facilities seen by
the programmer using ECL. In the interest of brevity,
we concentrate on innovative features and treat lightly
those which are straightforward. In discussing the
language component, we will ignore all but its extension mechanisms; in particular, we do not give its
syntax or programming examples in this paper. Suffice it
to say that the language is ALGOL-like in syntax,
ALGOL/LISP-like in semantics and that a formal
description of both syntax and semantics exists.3
Builtin data types of the language include characters,
integers, reals, and Booleans; builtin operations include
the usual operations on these types. A system-provided
extension package adds to this the data types symbol,
list and arrays of reals, integers, and Booleans along
with appropriate operations.
Syntax extension

A number of proposals for syntax extension have
appeared during the past few years, proposals ranging
from simple macro extension schemes requiring prefix
macro name triggers, to recognition of arbitrary context-free languages with complex parse-tree manipulation facilities. The technique used in ECL has two key
properties: (1) it is very efficient in both parse speed


and storage required, (2) it includes specific provision
for simple common additions as well as complex comprehensive changes.
The parser is a deterministic pushdown store analyzer. It scans the input stream from left to right, recording the progress of the parse in state information.
At each step, the parser either reads the next lexeme
and adds it to the pushdown store or it reduces the
top elements of the pushdown store. In either case, it
goes into a new state. In the case of a reduction, employed whenever a complete syntactic phrase has been
found, a semantic action associated with the phrase
class is executed. The choice of read or reduce, the
reduction to be made, and the next state to be entered
are recorded in a syntax table as a function of the current state, next lexeme, and top elements of the pushdown store. This table is computed by a parse table
generator using a technique developed by F. DeRemer,4
from a syntax specification in BNF. Semantic actions
augmented to each syntax rule specify the desired
mapping from the parse tree into the intermediary list
structure representation-IL. Each syntactic form of
the source text is therefore represented by some IL list.
The interpreter and compiler treat certain IL lists
(e.g., those representing a (block)) specially; all others
are taken as procedure or operator calls where the head
of the list is the function name and the rest of the list
is the set of arguments. Therefore, most augments
simply map the syntactic construction into prefix form.
The final element of the language specification is the
definition of the function names used as prefix operators in IL.
The language may be extended by (1) adding to the
syntax specification new syntax rules with augments,
(2) defining the function names used as prefix operators in the new IL forms, thereby defining the semantic
specification, (3) calling the parse table generator on the
new syntax specification, and (4) switching the parser
to be driven by the resulting new parse tables. In subsequent input any command, in particular any program,
containing the new constructs will be analyzed employing the new syntax rules, mapped by the augments into
prefix form, and executed by the associated function
in the semantic specification. Compiling the program
and the semantic specification functions will yield acceptable although not specially optimized code for the
new construct.
The most common additions to the language will
surely be new operators. For example, much of APL5
can be obtained simply by defining the appropriate
array operators. While new operators could be added
by using the above technique, this is needlessly complex for such a simple addition. Hence, ECL provides
a special facility to handle this, making the definition


of a new operator no more difficult than the definition
of a new procedure. An identifier in the language can
be written either like a PL/I identifier (e.g., X, TEMP,
FOO, COEFFICIENT) or as a sequence of special
symbols (e.g., +, -, **, +←, = # >). Any identifier
can be declared to be a prefix operator, an infix operator, or both. (E.g., the minus sign denotes negation as
a prefix operator and subtraction as an infix operator.)
An infix operator can be given an integer index from
1 to 7 specifying its binding strength.
The mechanism used to implement this facility is a
simple extension of the basic analyzer; hence, operator
and other extensions mesh together smoothly. The initial syntax specification includes the syntactic categories (prefix operator) and (infix operator i) for
i = 1, ..., 7. All operators are recognized as (identifiers)
by the lexical analyzer and are handed to the parser
with syntactic category (identifier). The parser changes
the syntactic category to (prefix operator) or (infix
operatori) under "appropriate conditions" (e.g., for
the second identifier in X**I). The parser recognizes
the possibility of such an appropriate condition by
means of a second set of parse tables (actually part of
the symbol table) which specifies which identifiers may
be used as operators and in what roles (i.e., prefix,
infixi, or prefix and infixi). The tricky point here is
distinguishing between different uses of an identifier
symbol; e.g., if #@ has been declared to be both a
prefix and infix operator then it may appear in:

#@ B        as a prefix operator acting on B,
A #@ B      as an infix operator acting on A and B,
#@ ← ...    as an identifier being assigned a new (operator) value.
The parser distinguishes between these three uses in
the same way as the human reader-by local context.
The read routine of the parser examines each (identifier)
that can be used as an operator, checks its local context and decides how it is being used in the context,
and possibly changes its syntactic type to (prefix-operator) or (infix-operatori). The rest of the parser,
in particular the part that performs reductions, is oblivious to this local transformation; it sees either an
(identifier), a (prefix-operator), or an (infix-operatori)
and regards these as disjoint terminal categories.
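A rough feel for this local-context rule can be given in a few lines of code. This is only a toy model, not the ECL parse tables: the operator table, the "<-" assignment token, and the decision rules below are all assumptions made for the sake of the example.

# Toy model of reclassifying a declared operator identifier by local context.
OPERATORS = {"#@": {"prefix", "infix"}, "-": {"prefix", "infix"}, "**": {"infix"}}

def classify_tokens(tokens):
    kinds = []
    for i, tok in enumerate(tokens):
        if tok == "<-":                              # assignment arrow
            kinds.append("assignment")
            continue
        roles = OPERATORS.get(tok)
        if not roles:
            kinds.append("identifier")
            continue
        nxt = tokens[i + 1] if i + 1 < len(tokens) else None
        prev = kinds[-1] if kinds else None
        if nxt == "<-":                              # about to receive a new value
            kinds.append("identifier")
        elif prev == "identifier" and "infix" in roles:
            kinds.append("infix-operator")
        else:
            kinds.append("prefix-operator")
    return list(zip(tokens, kinds))

print(classify_tokens(["#@", "B"]))          # prefix use
print(classify_tokens(["A", "#@", "B"]))     # infix use
print(classify_tokens(["#@", "<-", "F"]))    # identifier being assigned

The point is only that a single preceding token (and one of lookahead) suffices to settle the three uses shown above, which is why the rest of the parser can remain oblivious to the transformation.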
Storage management

There are two classes of storage provided by the
ECL system: (1) storage automatically allocated and
freed at block entry and exit (on the stack) and (2)
storage dynamically allocated by the program (in the

258

Fall Joint Computer Conference, 1971

heap, using Algol 686 terminology). The former is
handled by well-known stack implementation techniques and requires little discussion. In providing dynamic storage allocation, however, there is a critical
design decision-whether to provide automatic storage
reclamation or whether to require explicit return of unused storage, e.g., by a free command.
A common characteristic of allocated storage is that
the programmer does not, in general, know when it is
becoming unused. Typically, a block is pointed to
from many places, most of which are in other allocated
blocks. Deciding when the last reachable pointer ceases
to reference a block is therefore no simple matter. Keeping track of this at all times places a burden upon the
programmer, one that may significantly complicate a
program. Hence, ECL provides automatic reclamation. *
Garbage collection was chosen as the implementation
technique since this requires the least housekeeping
storage and is guaranteed to find all unused storage.
The programmer sees only a system-provided allocation function-ALLOC. Specifically, ALLOC (M) allocates an object of mode M and returns a pointer to this
object. When available storage is exhausted, the allocator invokes a garbage collection.
The garbage collector is basically straightforward. A
few subtle points are, however, worth mentioning. The
trace phase traces all storage blocks referenced and
marks all machine words in use using a bit map. By
marking machine words, not objects, it is possible to
mark only part of a block in a compound object. Garbage collection leaves untouched these parts actually
referenced and reclaims the rest. The difficult point in
the trace phase is the possibility, indeed almost certainty, of tracing through objects having programmer-defined mode. Given an object, the trace routine must
be able to determine how big it is (so as to mark all of
its words), whether or not it has pointers within it
and, if so, where they are (so they can be traced).
This information is calculated by internal system
routines whenever a new mode is defined and is entered
into tables associated with the mode. Once marking is
complete, the garbage collector sweeps linearly
through storage, collects all unmarked words into maximal contiguous blocks, and sorts these blocks by size
into a set of linked lists forming the free storage pool.
Keeping different lists for various sized blocks (currently, one list for each power of 2) speeds up subsequent allocations.
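The sweep phase just described can be pictured schematically as follows. This is only a model (a mark bit per word of a flat store, with each free block bucketed by the largest power of two that fits); none of the names correspond to actual ECL routines.

def sweep(marks):
    # marks[i] is True if word i is still in use. Walk the bit map, coalesce
    # runs of unmarked words into maximal blocks, and bucket each block by
    # the largest power of two not exceeding its size.
    free_lists = {}
    i, n = 0, len(marks)
    while i < n:
        if marks[i]:
            i += 1
            continue
        start = i
        while i < n and not marks[i]:
            i += 1
        size = i - start
        bucket = 1 << (size.bit_length() - 1)      # largest power of 2 <= size
        free_lists.setdefault(bucket, []).append((start, size))
    return free_lists

# A 16-word store: words 0-3 and 9 are still referenced, the rest are free.
marks = [True] * 4 + [False] * 5 + [True] + [False] * 6
print(sweep(marks))    # {4: [(4, 5), (10, 6)]} - both free blocks in the 4-word bucket

An allocator can then satisfy a request by searching only the buckets at least as large as the request, which is the speed-up the text refers to.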
Clearly, it is best to avoid garbage collection entirely
if possible. We therefore stress that ECL also provides

* Dynamic storage management in ECL therefore differs from
that of PL/I.7 The latter provides dynamic storage allocation but
no automatic reclamation.

automatic, block-structured storage. This behaves like
a normal ALGOL 60 stack, holding variables declared
to reside on the stack as well as arguments to routines,
and temporary results. Hence, all computation concerned with ALGOL-like objects (e.g., scalars and arrays of fixed-point and floating-point numbers) can be
carried out on the stack and requires no use of the free
storage mechanism.

Data type extensions

Perhaps the chief requirement of a programming
language intended to serve a wide range of application
areas is an equally wide range of data types or modes.
Clearly, a language must include integers and reals for
numerical computation, Booleans as the result of relational operations, and characters for headings and
labels. List processing implies data objects which reference other objects, i.e., pointers. However, compiled
code can be made considerably more efficient if a
pointer variable may be declared as restricted in what
types of objects it can point to; this introduces integer
pointers, real pointers, character pointers, Boolean
pointers, etc. Packed objects such as bit vectors are
sometimes essential in saving core storage. A list of interesting data types could go on indefinitely.
In the face of so many diverse claimants for inclusion
in a language, the only sensible solution is an extension
facility: here, a mechanism for defining new modes.
The language provides a few basic modes and five
primitive routines for defining new modes in terms of
these. The primitive mode constructors are ARRAY,
PTR, STRUCT, PROC, and ONEOF; these create
arrays, pointers, heterogeneous structures, procedures,
and mode unions, respectively. These mode constructors are callable routines. They evaluate their arguments, perform some computation, and deliver a result
having data type mode. The resulting modes are just as
legitimate as the builtin modes. Objects of these types
may be assigned values, passed as arguments to routines, returned as the value of routines, etc.
The key point of this facility is that the mode constructors compile modes in the same sense that a traditional compiler compiles procedures. That is, they calculate once, at the time a mode is created, all information about the mode that the system will subsequently
need. One such computation is the storage layout for
compound objects-how to represent objects of the
constructed mode in the fewest possible machine words.
The current algorithm produces optimal packing in almost all cases; e.g., a structure consisting of one 18-bit
pointer, four 7-bit characters, three 5-bit fields, one
3-bit field and four 1-bit fields will be packed into two


36-bit words. * The result of the calculation is a structure table giving the location and mode of each component in a compound object, to be used by subsequent
phases of mode compilation and by the runtime routines. Another computation is preparing the tables for
the garbage collector, in particular, deciding whether
an object of this mode contains a pointer to be traced.
The most important computation, however, is the generation of three blocks of machine code: (1) to construct objects of this mode, (2) to perform assignments
to objects of this mode and (3) (for compound objects
only) to select the individual components of objects of
this mode. To effect construction, assignment, and selection, the interpreter executes these code bodies so
that these operations are partly compiled, even from
interpreted code. The compiler may either use these
bodies or compile corresponding code in-line depending
on whether it is optimizing space or time.

* This is, of course, entirely machine-dependent. However, the programmer never sees this packing. He deals only with objects of the language which have the right properties-e.g., access to the second 1-bit field gets the desired value. This differs from the approach taken in LISP 2⁸ where the programmer deals explicitly with the bit packing himself.
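The packing figure quoted above (68 bits of fields in two 36-bit words) can be checked with a toy first-fit packer. This is merely a model of the idea for verifying the arithmetic of the example; it is not the actual ECL layout algorithm.

def pack(fields, word_size=36):
    # Place each (name, width) field in the first word with room for it.
    words = []                                   # remaining free bits per word
    layout = []                                  # (name, word index, bit offset)
    for name, width in fields:
        for w, free in enumerate(words):
            if free >= width:
                layout.append((name, w, word_size - free))
                words[w] -= width
                break
        else:
            layout.append((name, len(words), 0))
            words.append(word_size - width)
    return len(words), layout

example = [("ptr", 18)] + [(f"char{i}", 7) for i in range(4)] \
        + [(f"f5_{i}", 5) for i in range(3)] + [("f3", 3)] \
        + [(f"bit{i}", 1) for i in range(4)]

nwords, _ = pack(example)
print(sum(w for _, w in example), "bits in", nwords, "words")   # 68 bits in 2 words

Computing such a layout once, when the mode is defined, is precisely the "mode compilation" the text describes.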
The programmer can use these mode compilation
routines to define the types he needs. For example, bit
vectors are defined as ARRAYs of Booleans, multidimensional arrays of any sort are defined by composing
the function ARRAY, data processing records are
STRUCTs of characters and integers, and a list of reals
is constructed from blocks of identical STRUCTs each
containing an integer and a pointer to the next block.
Further, the programmer can define new mode-valued
procedures (i.e., mode generators) in terms of the
primitive routines. We anticipate a library of modes
and mode-valued procedures analogous to a library of
numerical algorithms.
One additional facet of the mode extension facility
requires discussion. When a mode is defined using the
system primitives, certain behaviors are automatically
assumed. For example, if BYTE names the mode
ARRAY of 8 Booleans (represented as an 8-bit object),
it will be assumed that an object X of mode BYTE has
8 components which may be accessed as Boolean values
by X[I] for I = 1, ..., 8, that assignment of one BYTE
to another copies all 8 bits, and that if X is to be passed
as an argument to a routine then that routine must
have a corresponding formal parameter of mode BYTE.
If the programmer wishes, he can override these assumptions and specify the behavior he wants. He can,
for example, declare that an object X of mode BYTE
is to have the following behavior:
1. X can be assigned an integer value (e.g., X←73).
If the value can be represented in 8 bits 2's

complement notation, an 8-bit assignment is
made; otherwise, an error procedure P is to be
called with the integer value as an argument.
2. X can be used as an argument to a routine taking
an integer formal parameter, in which case sign
extension is used to get a full-word value to be
treated as signed integer.
3. X is to be treated as if it had an additional 9th
component recording the number of leading 0's
in its bit configuration. X[9] is always interpreted as an integer count of the number of
leading 0's in X[1] ... X[8] at that point in the
computation.
Using this facility, the programmer can specify exactly the properties of his data objects. Encoded representation for values, variables which monitor their
values, objects with "protected" fields, and the ability
to represent sparse compound objects fall out as simple
applications.
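To make the flavor of the three overridden behaviors concrete, here is a rough analogy in ordinary code. It is an analogy only; the actual ECL declarations are not shown in the paper, and the class and method names below are invented for the example. The error procedure P is the one named in the text.

class Byte:
    # Rough analogy (not ECL) of the overridden BYTE behavior described above.
    def __init__(self):
        self.bits = [False] * 8                  # components X[1]..X[8]

    def assign_int(self, value, error_proc):
        # Behavior 1: 8-bit 2's complement assignment, else call the error procedure P.
        if -128 <= value <= 127:
            u = value & 0xFF
            self.bits = [bool((u >> (7 - i)) & 1) for i in range(8)]
        else:
            error_proc(value)

    def as_int(self):
        # Behavior 2: sign extension when the value is used as an integer.
        u = sum(b << (7 - i) for i, b in enumerate(self.bits))
        return u - 256 if self.bits[0] else u

    def leading_zeros(self):
        # Behavior 3: the virtual ninth component X[9].
        for i, b in enumerate(self.bits):
            if b:
                return i
        return 8

x = Byte()
x.assign_int(73, error_proc=lambda v: print("error procedure P called with", v))
print(x.as_int(), x.leading_zeros())             # 73 1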
Compilation

A compiler can be viewed in two distinct ways. It
can be taken as a device for translating programs from
source representation to one which can be executed
directly by some computing machine. Alternatively, it
can be seen as a means for factoring a computation into
two parts: that which is invariant with respect to input
data and can be performed once at compile time, and
that which depends on the data and is therefore postponed until run time. The second view subsumes the
first and is surely the more fundamental. Translation
is only one of many computations that can be factored
out. Others include: evaluation of expressions at compile time, data type checking, and generic selection. The
interesting problems in compilation can be best addressed by pushing the notion of factoring to take
advantage of additional invariants. It is this line of
approach that characterizes the ECL compiler.
A program consists essentially of a large number of
variables, a few constants, and some punctuation to
paste this all together. ECL carries the notion of variable somewhat farther than most languages. For example, a program may declare X to be an object of
mode TRIPLE when TRIPLE is a mode-valued variable or may apply FOO to a set of arguments where
FOO is a procedure-valued variable. This allows the
programmer great flexibility, but presents the compiler
with the problem of dealing with an unknown value of
the variable. There are three possible routes it might
take:
1. Attempt to deduce the value by examining the
structure of the program, e.g., look for an initialization of or assignment to TRIPLE and verify
that the value will not change.
2. Obtain explicit assistance from the programmer.
3. Wait until run time when the value will surely
be known.
From a theoretical point of view, the first route has
certain appeal. However, the inevitable undecidability
results are assurance that in general one can deduce
nothing; discovering subcases in which interesting deductions can be made is a significant research problem.
Further, making such deductions is often a pointless
task: the programmer usually knows far more about a
program than could ever be deduced from examining
it; he alone knows its intended function and the environment in which it is to run.
Hence, the second route is the mainstay of the compiler. In compiling a procedure P, the compiler is
called with two arguments: P and a list L of all variables in P whose value is to be "frozen." P is then
compiled with each variable on L replaced by the value
of that variable at the point where the compiler is
called. (It will be recalled that this point might be
while executing another procedure or P itself.) For example, if X is declared in P to be a TRIPLE and
TRIPLE is on the frozen list L, then the value of
TRIPLE must be a mode and this mode is taken as
the data type of X. Similarly, if FOO appears on L,
then an appearance of FOO(arg1, ..., argn) can generate code specific to the value of FOO, e.g., by in-line
expansion. To treat a related case, it may be that FOO
does not appear on L, but FOO is declared in P to have
mode FOOMODE and FOOMODE is a variable on L.
The compiler then does not have access to the value of
FOO, but it does know its data type, i.e., the modes of
its arguments and the mode of its result. Hence, the
compiler can perform type-checking of arguments in
calls on FOO and type-check the usage of its result in
a larger context (e.g., A+FOO(arg1, ..., argn)).
Any set of variables may appear on the freeze-list L.
If an operator and all its arguments are frozen (e.g., by
appearance on L), then the entire function application
is frozen. * By recursive application of this rule, it is
possible for arbitrarily complex expressions to be frozen.
These can and will be evaluated during compilation.
For example, if X, Y, FOO, and FUM are all on L,
then
FUM(Y, FOO(Y), FOO(FOO(X)))
will be evaluated, the result replacing that expression
in the code generated.
* Assuming that the operator definition contains no free variables.

For those variables not in L, the third route remains open: wait until run time to obtain its value. This includes "ordinary" variables as well as mode identifiers
and procedure names. For example, if TRIPLE is not
on L, then in a procedure with formal parameter declared to be a TRIPLE the data type is left open until
the procedure is called. The compiler is governed by a
consistent rule: it will compile the best code it can with
the amount of information (i.e., set of invariants)
given to it. This code can be anything from a single call
on the interpreter (in those cases where nothing useful
is frozen) to the value of the program (in those cases
where everything is frozen). The interesting cases fall
somewhere in between.
It is possible to compile a procedure dynamically
during the course of some computation as values are
calculated and frozen. Hence, a computation may involve reading part of the input data, compiling a program specific to that data, and running the compiled
routine on the remainder of the data. Programs which
periodically recompile themselves based on statistics
gathered during the course of a run are an obvious
application.
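As a rough illustration of the freezing rule (a Python sketch, not the ECL compiler, under the assumption that frozen bindings are held in a simple table; fold and frozen are hypothetical names), an application is evaluated at compile time only when its operator and all of its arguments are frozen, and otherwise is left as residual code for run time.

# Sketch of compile-time folding under a freeze list.

def fold(expr, frozen):
    # expr is a variable name (str) or an application (op_name, arg, ...).
    if isinstance(expr, str):
        return frozen[expr] if expr in frozen else expr
    op, *args = expr
    folded = [fold(a, frozen) for a in args]
    if op in frozen and all(not isinstance(a, (str, tuple)) for a in folded):
        return frozen[op](*folded)          # everything frozen: evaluate now
    return (op,) + tuple(folded)            # partially frozen: residual code

frozen = {"X": 2, "Y": 5, "FOO": lambda v: v * v,
          "FUM": lambda a, b, c: a + b + c}
print(fold(("FUM", "Y", ("FOO", "Y"), ("FOO", ("FOO", "X"))), frozen))   # 46
print(fold(("FUM", "Z", ("FOO", "Y"), "Y"), frozen))   # residual, Z not frozen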
Errors and interrupt handling
It should go without saying that a modern programming language needs a facility for handling errors and
interrupts. That is, a means for accepting asynchronous
external interrupts and dealing with internal error conditions. ECL takes care of both by means of the procedure call mechanism. Every error or interrupt may
be treated as if the program had explicitly called an
error handling routine of its choice from the point
where the error or interrupt occurred. Associated with
each* error or interrupt is a procedure name (e.g.,
ENDOFILE, FLOATOVF, FIXOVF, etc.). When an
error occurs or an interrupt comes in, the normal computation sequence is suspended at that point and a
system routine ERR is entered. ERR finds the symbolic name associated with that error/interrupt condition and then checks whether there is a variable of type
procedure valid at that point in the suspended computation. If no such variable exists, ERR types out an
error message and goes into a break routine which preserves the state of the computation and accepts further
commands from the console. If, however, there is an
appropriate variable, then the associated procedure is
called; so far as ECL is concerned, that is the end of
the matter-any further action is the responsibility of
the called routine.

* These include: end of file, fixed point overflow, floating point
overflow, taking the value of a null pointer, the completion of
certain I/O transactions, subscript index out of range, and timer
interrupt.


In the case of an external interrupt, it may be possible to handle the interrupt without regard to the
suspended environment. Such an interrupt processor
may perform some computation on the interrupt message, change global flags, variables, and queues, then
continue with the suspended computation. However,
to handle most errors and internal interrupts, it is
necessary to access the environment in which the condition occurred. For example, it will frequently be useful
to examine the call structure (the sequence of function
calls that lead to this point) and to examine and change
the values of variables in the suspended environment.
ECL allows access to this information, not as a special
feature offered to the error handling routine, but rather
as a system facility available at all times. A stack of
return points is used by ECL so as to allow recursive
procedures; it is a simple matter to also stack the symbolic name of the called routine. Hence, any procedure,
whether called to process an interrupt or otherwise, can
obtain the symbolic name of the Ith dynamically preceding routine (CALLER(I)) and can access the value of any variable in that environment (DYB((variable name), I)).
An error or interrupt routine can exit in a number of
ways, depending on the cause of the interruption.
GOTO L transfers control to the nearest enclosing
label L; this, however, may be arbitrarily far back in the
chain of calls. Since the argument to GOTO is evaluated, it is possible to use DYB to get to an arbitrary
level, even one "masked" by another label of the same
name; e.g., GOTO DYB (L, I) transfers control to the
label L defined in the Ith enclosing environment. Two
other routines allow returning a computed value. For
errors, CONTWITH((expression)) continues computation with the value of (expression) used in place of the expression which caused the error. RETURN((expression), I) acts as if the Ith routine back on the call
chain had suddenly returned to its caller with the
value of (expression).
In summary, this scheme provides a powerful, inexpensive mechanism giving the programmer fine control
over errors and interrupts. The program is armed for a
specific error or interrupt in any scope where a procedure-valued variable of the appropriate name is defined. Errors or interrupts for which the program is so
armed are handled by the specific routine. Control and
environmental inquiry facilities of the system provide
the linguistic power needed by the routine to handle
such conditions intelligently.
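To make the environment-inquiry operations concrete, here is a small Python sketch (not ECL) of a call stack that records symbolic routine names alongside local bindings; CALLER, DYB, and the overflow handler below are hypothetical stand-ins for the facilities just described.

# Sketch: a stack of (routine name, local bindings) records. CALLER(I)
# returns the name of the Ith dynamically preceding routine; DYB(name, I)
# returns the value of a variable in that environment.

call_stack = []                       # most recent call at the end

def CALLER(i):
    return call_stack[-1 - i][0]

def DYB(name, i):
    return call_stack[-1 - i][1][name]

def handler_fixovf(bad_value):
    # An error handler can inspect and use the suspended environment.
    print("overflow in", CALLER(1), "with N =", DYB("N", 1))
    return 0                          # e.g., continue with a substitute value

def worker(n):
    call_stack.append(("WORKER", {"N": n}))
    call_stack.append(("FIXOVF-HANDLER", {}))
    result = handler_fixovf(n * n)    # pretend n * n overflowed
    call_stack.pop(); call_stack.pop()
    return result

worker(32768)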
Control structures: paths and multiprogramming

The error/interrupt facility allows the mainline of
computation to be suspended so that a subsidiary computation can be performed to process the cause of
interruption. However, this is strictly a priority situation: the interrupt routine must complete and exit
before the main computation can continue. It is frequently useful to deal with subsidiary computations
going on whenever there is any work to be performed,
in parallel with the main computation.
ECL provides such parallel computation. In general,
a job consists of some dynamically varying number of
independent processes (called paths in ECL). What has
been described thus far is the behavior of one such path.
Indeed, when ECL is started, there is but one path.
However, that path may create new paths and start
computations on these paths, computations which in
general proceed asynchronously with respect to computation on the starting path. Each path is an independent computational entity consisting of an environment
(the call structure and variables created during this
call sequence) and an activation record which, among
other things, records the state of the path. States include suspended, waiting for some resource (e.g., I/O),
and runnable. All runnable paths are parallel processes.
The state of a path may be changed by a number of
commands; these include SUSPEND some path,
WAIT some period of time, and the Dijkstra P and V
semaphores9 for synchronization among paths. All
paths have a certain portion of their environment in
common-potentially, any allocated storage. Hence, it
is possible for two or more paths to reference common
data, e.g., a buffer, a set of flags, or a message queue.
This, coupled with the P and V semaphores, allows the
conventional sort of cooperating sequential processes
to be established.
The really interesting aspects of the ECL path
facility lie, however, in its ability to host nonconventional multiprogramming, in particular, control regimes
not explicitly anticipated by its designers. That is, like
many other facilities in ECL, the multiprogramming
mechanism is extensible. As with other extension facilities, that for multiprogramming consists of a set of
primitives and a framework for combining them. Primitive operations include creating a path, setting up a
function to be executed in a created path, running a
path, deleting a path, accessing and changing the value
of a variable in some other path, and making a copy of
a path. The basic framework is provided by a distinguished path-the control interpreter. This is unique in
two respects: (1) timer interrupts pass directly to it;
(2) there is a control primitive-CIA-by which other
paths can call for the execution of an arbitrary procedure in the environment of the control interpreter and
wait for the result.
There is a program which runs in the control interpreter path and acts as the central control of ECL.


Basically, its functions are to handle I/O requests, arrange for running the other paths, and handle coordination between paths. This program is written in the
language using the primitives mentioned above. For
example, to perform path scheduling, a queue of runnable paths is maintained; when the timer interrupt
comes into the control interpreter, the path that was
running is put at the end of the queue, a new path is
chosen from the runnable queue by the scheduler, and
the start-path primitive is executed to run the new
path. The scheduler is also a routine written in the
language. Currently, it simply chooses paths in FIFO
order. However, the programmer may redefine the
scheduler by substituting his own routine for the system-provided one. Hence, such refinements as a priority
system, either simple or with dynamically changing
priorities, can be readily added.
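A rough Python sketch of such a replaceable scheduler (not the ECL control interpreter itself; runnable, fifo_scheduler, and timer_tick are hypothetical names):

# Sketch: a FIFO path scheduler. On each timer tick the preempted path goes
# to the rear of the runnable queue and the scheduler, itself an ordinary
# and replaceable routine, chooses the next path to run.

from collections import deque

runnable = deque()                      # queue of runnable path names

def fifo_scheduler(queue):
    return queue.popleft()              # default policy: first in, first out

scheduler = fifo_scheduler              # the user may rebind this variable

def timer_tick(running):
    runnable.append(running)
    nxt = scheduler(runnable)
    print("switching from", running, "to", nxt)
    return nxt                          # the start-path primitive would run it

runnable.extend(["PATH-2", "PATH-3"])
current = "PATH-1"
for _ in range(3):
    current = timer_tick(current)

Rebinding scheduler to a routine that orders the queue by priority would give the priority regimes mentioned above.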
Other control activities are equally easy to program.
For example, a Dijkstra semaphore is a language-defined data structure consisting of an integer count
and a queue of paths (also a defined data type) waiting
on this semaphore. The P and V operations are implemented by using the CIA primitive to transfer into the
environment of the control interpreter where the necessary queues can be safely modified.
With the framework provided, it is straightforward to
implement most of the known control structures, e.g.,
coroutines, multiple parallel returns, cooperating sequential processes and fork/join structures. Further,
since ECL leaves its control structures open to change,
it will be possible to develop, as needed, a variety of
other control regimes.
SUMMARY
The ECL programming system has been designed to
provide an environment conducive to effective programming. To this end, it contains a language with
comprehensive data types, operators, control structures,
and storage management facilities. It allows interactive
program composition and debugging with smooth transition to efficient compiled code. Most important, it allows the programmer to tailor this environment to
suit his needs.
ACKNOWLEDGMENTS
It is a pleasure to acknowledge the help of B. Brosgol,
B. Byer, T. Cheatham, B. Holloway, and C. Prenner.

REFERENCES
1 B WEGBREIT
The treatment of data types in EL1
Technical Report 4-71
Center for Research in Computing Technology
Harvard University Cambridge Massachusetts May 1971
2 B WEGBREIT
Compactifying garbage collection in the heap
Technical Report 5-71
Center for Research in Computing Technology
Harvard University Cambridge Massachusetts June 1971
3 B WEGBREIT
Studies in extensible programming languages
ESD-TR-70-297
Harvard University Cambridge Massachusetts May 1970
4 F L DE REMER
Practical translators for LR(k) languages
Ph.D. thesis
Electrical Engineering Department MIT Cambridge
Massachusetts October 1969
5 IBM
APL/360 user's manual
GH 20-0683-1
6 A VAN WIJNGAARDEN et al
Report on the algorithmic language ALGOL 68
Mathematisch Centrum Amsterdam MR 101 February
1969
7 G RADIN H P ROGOWAY
Highlights of a new programming language
Communications of the ACM Vol 8 January 1965
8 P S ABRAHAMS et al
The LISP 2 programming language and system
FJCC Vol 29 1966
9 E W DIJKSTRA
Co-operating sequential processes
In Programming Languages edited by Genuys Academic
Press New York 1968

The "single-assignment" approach to parallel processing*
by DONALD D. CHAMBERLIN
IBM Thomas J. Watson Research Center
Yorktown Heights, New York

INTRODUCTION
Parallel processing systems (computer systems in which more than one processor is active simultaneously) offer potential advantages over uniprocessor
systems in terms of speed, flexibility, reliability, and
economies of scale. However, they pose the problem of
how multiple processors can be organized to cooperate
on a given problem without interference. Various
solutions have been proposed to this problem. 1,2,3,4
Some systems require a programmer to assign units of
work to the various processors, while other systems
perform this assignment automatically; in some systems, the processors are linked closely together, while
in others the processors are nearly independent. The
system to be described here automatically detects
opportunities for parallel processing in programs written
in a specific, high-level language. Parallelism is detected
on a very low level-even within a single algebraic
expression. The system consists of many independent,
asynchronous processors, all active at once in processing
a single program.
In a paper presented at the 1968 Spring Joint Computer Conference,5 Larry Tesler and Horace Enea
proposed the concept of "single-assignment" programming languages. In a single-assignment program,
statements do not necessarily execute in the order in
which they appear; rather, each statement executes as
soon as all the variables it needs are defined. In order
that each statement be triggered at a well-defined
time, it is required that each variable be assigned a
value only once during the execution of a program. In
such a program, there is no "flow of control" in the
conventional sense; rather, the sequencing of statements is determined by the data flow, as some statements assign values to variables which are needed by
other statements. If many statements simultaneously
have all their needed variables defined, all the statements may be executed in parallel.
This paper describes a single-assignment language
called SAMPLE (for Single-Assignment Mathematical
Programming Language), and a parallel processing
system to implement the language. SAMPLE was
originally inspired by Tesler's language COMPEL;
however, it includes quite different facilities for iteration, input/output, and other features not found in
COMPEL. All considerations of implementation are
original to this paper.
THE LANGUAGE
SAMPLE resembles as closely as possible a conventional high-level language such as ALGOL. It
employs the left arrow (←) as an assignment operator,
and allows use of the conventional arithmetic and
logical operators and parentheses in the construction of
expressions. However, all SAMPLE programs must
obey the single-assignment constraint: No variable may
be assigned a value more than once during the execution
of a program. SAMPLE recognizes two data types:
Real numbers and tuples, which are ordered sets of
numbers or of other tuples. By nesting tuples inside
each other, the programmer can implement arrays,
trees, or "structures" as in PL/I. Tuples are denoted by
lists of their elements, enclosed between ( and ). For
example, the 3 by 3 array with rows (1, 2, 3), (4, 5, 6), and (7, 8, 9) might be represented by the nested tuple

A = ((1, 2, 3), (4, 5, 6), (7, 8, 9)).

* This work was done while the author was with Digital Systems Laboratory, Electrical Engineering Dept., Stanford University, and was supported by a National Science Foundation Graduate Fellowship.

Elements are accessed by means of the subscripting arrow ↓, which is grouped from the left if it occurs multiple times. The first element of a tuple has subscript one unless otherwise specified. In the above example A ↓ 2 ↓ 3 is the number six. The reserved word TO generates a tuple containing successive integers; for example, (11 TO 14) is the same as (11, 12, 13, 14).
The reserved words FIRST and LAST yield the
subscript number of the first and last element of a
tuple, respectively. For example, LAST (11, 12, 13, 14)
has the value four.
Arithmetic and logical operators may be used
between two tuples or between a number and a tuple,
in which case they operate element-by-element.
Examples:
(1, 2, 3) + (2, 3, 4) = (3, 5, 7)
(1, 2, 3) + 2 = (3, 4, 5).
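A small Python sketch of this element-by-element rule, under the assumption that it applies recursively to nested tuples (elementwise_add is a hypothetical name, not part of SAMPLE):

# Sketch: element-by-element addition between two tuples, or between a
# number and a tuple, applied recursively to nested tuples.

def elementwise_add(a, b):
    if isinstance(a, tuple) and isinstance(b, tuple):
        return tuple(elementwise_add(x, y) for x, y in zip(a, b))
    if isinstance(a, tuple):
        return tuple(elementwise_add(x, b) for x in a)
    if isinstance(b, tuple):
        return tuple(elementwise_add(a, y) for y in b)
    return a + b

print(elementwise_add((1, 2, 3), (2, 3, 4)))   # (3, 5, 7)
print(elementwise_add((1, 2, 3), 2))           # (3, 4, 5)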

Assignments may be made to individual elements
within a tuple, provided that the single-assignment
constraint is observed. When a tuple variable is to
have its elements assigned one at a time, an additional
"bounding" statement must be included to inform the
system that the variable is a tuple, and giving its first
and last subscript numbers (which may be expressions). In the following example, A is defined to be a
tuple having subscripts ranging from one to three,
and its elements are assigned values:
A IS TUPLE (1,3);
A ↓ 1 ← 5;
A ↓ 2 ← 13.5;
A ↓ 3 ← (X+Y)/Z;

Rather than writing a separate bounding statement,
the programmer may choose to state the bounds of a
tuple's subscripts in the same statements which assign
values to its elements, as in
A ↓ 1 OF (1,3) ← 5;
If the lower subscript bound is omitted, it is taken to be
one. The "OF" clause may be used more than once in
a statement, once for each subscript. Thus,

A ↓ I OF L ↓ J OF L ← 50;

means:
1. A is a tuple whose subscripts range from one to L
2. A ↓ I is a tuple whose subscripts range from one to L
3. A ↓ I ↓ J is assigned the value 50.
SAMPLE has no conditional or unconditional

branches, because it has no flow of control. However, it
has a conditional expression. The following statement
assigns to SWITCH the value A if X= Y, otherwise
the value B:
SWITCH ← IF X = Y THEN A ELSE B;
SAMPLE has block structure, which enables the
programmer to declare and use a name in an inner
block without danger of duplicating a name used elsewhere. However, no storage allocation occurs on block
entry; in fact, "block entry" is undefined since statements execute in an unpredictable order.
The unpredictable order of statement execution also
requires some special provisions for input and output,
since the programmer cannot know the order in which
quantities will be read or written. Conceptually, an
I/O medium is provided in which all inputs are simultaneously available, each associated with a "tag"
(each tag is a unique integer). The statement READ
(A, 1) means "read into variable A the input quantity
associated with tag one." WRITE (B, 2) means "write
the value of B into the I/O medium and associate it
with the tag two." Post-processing can be done on the
I/O medium to produce the desired output document.
SAMPLE allows the programmer to define functions,
which are pieces of code which may be called by name
from various places in the program. Functions may have
parameters and may be recursive; however, a function
must obey the single-assignment constraint internally,
and must return exactly one value and have no side
effects.
The most difficult facility to provide in a single-assignment language is iteration, which, by its nature,
tends to assign values to the same variables repeatedly.
Since SAMPLE is intended for parallel processing, we
distinguish two types of iteration: (1) a set of actions
which may be taken simultaneously, and (2) a set of
necessarily sequential actions. For simultaneous iteration, we provide a convention which obeys the singleassignment constraint. A statement containing a tuple
name enclosed in single quote marks behaves exactly as
though it were many statements, one with each element
of the tuple substituted for the quoted name. If more
than one different quoted name appears, a copy of
the statement is generated for each possible way of
substituting a tuple element for each quoted name. The
order of execution of the copies is determined by the
readiness of their respective input values. For example,
if the programmer writes
I ← (1, 2);
J ← (1, 2);
A ↓ 'I' ↓ 'J' ← B ↓ 'J' ↓ 'I';

"Single-Assignment" Approach to Parallel Processing

the last statement behaves exactly as though it were
A ↓ 1 ↓ 1 ← B ↓ 1 ↓ 1;
A ↓ 1 ↓ 2 ← B ↓ 2 ↓ 1;
A ↓ 2 ↓ 1 ← B ↓ 1 ↓ 2;
A ↓ 2 ↓ 2 ← B ↓ 2 ↓ 2;
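The expansion convention can be sketched as follows (Python, an illustration only; expand and the format-string template are hypothetical devices, not part of the SAMPLE system):

# Sketch: expanding a statement containing quoted tuple names into one copy
# per combination of element values, as in the example above.

from itertools import product

def expand(template, quoted):
    # template is a format string; quoted maps each quoted name to its tuple.
    names = sorted(quoted)
    copies = []
    for values in product(*(quoted[n] for n in names)):
        copies.append(template.format(**dict(zip(names, values))))
    return copies

stmt = "A ↓ {I} ↓ {J} ← B ↓ {J} ↓ {I};"
for line in expand(stmt, {"I": (1, 2), "J": (1, 2)}):
    print(line)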

The SAMPLE facilities for sequential iteration
abridge the single-assignment property; they look
almost exactly like ALGOL FOR and WHILE loops.
Examples:
FOR I ← 1 STEP 1 UNTIL 10 DO
(loop body)
END
WHILE X = Y DO
(loop body)
END
For the purposes of the external program, the entire
loop with all its iterations looks like a single statement,
and any variables assigned values in the loop are not
considered to be "ready" until the final iteration is
complete. Within the loop body, the order of execution
of statements (and nested loops) is governed by the
readiness of their data, according to the single-assignment property. Each iteration of the loop is not begun
until the previous iteration is complete. Within the
loop, any variable name X means "the value which is
being computed for X during this iteration of the loop"
whereas OLD X means "the value which was computed
for X in the previous iteration." Each loop may have an
initialization section, which assigns the value to be used
for OLD X (or any other OLD variable) during the
first iteration, as follows:
INITIAL X ← 0;

Shown below is a SAMPLE program which reads a
matrix A (of arbitrary size and shape) and reduces it
to upper diagonal form by a process of Gaussian
elimination. Rows and columns are assumed to begin
with subscript one.
BEGIN
READ (A, 1);
COMMENT: A IS A TUPLE OF TUPLES REPRESENTING A MATRIX. EACH ELEMENT TUPLE IS A ROW;
L ← LAST A;
COMMENT: L IS THE NUMBER OF THE LAST ROW;
FOR I ← 1 UNTIL L-1 DO
COMMENT: DO THE LOOP BODY FOR EACH ROW EXCEPT THE LAST;
INITIAL B ← A;
J ← (1 TO I);
K ← (I+1 TO L);
FACTOR IS TUPLE (I+1, L);
FACTOR ↓ 'K' ← OLD B ↓ 'K' ↓ I / OLD B ↓ I ↓ I;
COMMENT: FOR EACH ROW LOWER THAN THE PRESENT ONE, WE HAVE COMPUTED THE NECESSARY FACTOR. WE NOW CONSTRUCT THE NEW B FROM THE OLD B BY ROW OPERATIONS;
B IS TUPLE (1, L);
B ↓ 'J' ← OLD B ↓ 'J';
COMMENT: THE NEXT STATEMENT DENOTES ARITHMETIC OPERATIONS BETWEEN ROWS;
B ↓ 'K' ← OLD B ↓ 'K' - OLD B ↓ I * FACTOR ↓ 'K';
END
WRITE (B, 2);
COMMENT: ONLY THE FINAL VALUE OF B (THE FINISHED MATRIX) IS WRITTEN;
END
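For readers who prefer a sequential rendering, the following Python sketch performs the same forward elimination on a list-of-rows matrix; it illustrates the arithmetic only, not SAMPLE's data-driven execution (eliminate is a hypothetical name):

# Sketch: Gaussian forward elimination. Each row below the pivot row is
# reduced by a multiplier ("factor"), leaving an upper triangular matrix.

def eliminate(a):
    a = [row[:] for row in a]               # work on a copy
    n = len(a)
    for i in range(n - 1):                  # each row except the last
        for k in range(i + 1, n):           # rows lower than the present one
            factor = a[k][i] / a[i][i]
            a[k] = [x - y * factor for x, y in zip(a[k], a[i])]
    return a

for row in eliminate([[2.0, 1.0, 1.0], [4.0, 3.0, 3.0], [8.0, 7.0, 9.0]]):
    print(row)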
IMPLEMENTATION
Implementation of SAMPLE consists of two steps:
Compilation and execution.
Compilation is done by a conventional method: The
program is parsed according to a phrase structure
grammar, and appropriate machine-language instructions are emitted during the parsing process. The
compiler generates "temporary" variables as needed to
ensure that the single-assignment property is preserved
in the emitted code. For example, in compiling the
expression
X ← (A * B) + (C * D);

the compiler would generate temporary variables T1 and T2 and emit the following instructions:
1. T1 ← A * B
2. T2 ← C * D
3. X ← T1 + T2

During the execution phase, instructions (1) and (2)
might execute simultaneously, defining the values of
T1 and T2, which in turn would release instruction (3)
for execution. The compilation process, which could be
implemented on a conventional machine, is described
in more detail in a Stanford Ph.D. thesis. 6 SAMPLE
could not conveniently be used as the language in which


its own compiler is written, because it lacks facilities
for manipulating character strings.
It has been proposed that the SAMPLE compiler
might generate names, relieving the programmer of the
necessity to invent a different name for every different
assignment of a variable. However, this would require
the compiler to make assumptions about the order in
which statements are to be processed, and so would be
contrary to the principle of single-assignment programming.
A hardware organization is proposed for executing
compiled SAMPLE programs. The system has three
passive storage units:
1. The Instruction Store

This unit contains the machine instructions
emitted by the compiler. Each instruction has an
opcode, an output operand, up to three input
operands, and certain link fields as described
below. Each input operand has a "ready" bit;
the instruction cannot be executed until all the
ready bits are on.
2. The Data Store

This unit contains data used during execution of
the program. Each cell contains a space for the
value of a variable, and a pointer to some
instruction which is waiting for that variable as
an input operand. If there are many such
instructions, they are organized into a linked
list, each instruction pointing to the next by
means of special link fields in the instructions.
Thus, when a given variable becomes ready, all
instructions waiting for it can be notified by
following the linked list. The linked list is
created by the compiler (or by the action of
certain instructions at run time).
A number can be stored in the "value" field of a
single data cell. If the variable to be stored is a
tuple, the cell contains a notation of the size of
the tuple, and a pointer to where the first
element is stored. The elements are stored in a
set of consecutive cells, one element to a cell.
Each element may itself be a tuple which points
to a set of elements of its own.
3. The Ready List

This unit contains copies of all instructions which
are known to be ready for execution, in the
sense that all their input operands are defined.
Execution of the program is carried out in parallel by
many independent processors. Each processor repeatedly executes the following Basic Instruction Routine:
1. The processor fetches from the Ready List an
instruction which is ready to be executed.
2. The processor fetches from the Data Store the
input operands of the instruction, and performs
the indicated operation on them.
3. The processor writes the resulting output operand
into the Data Store. In the same storage access
cycle, it obtains the pointer to an instruction
which is waiting for the newly-ready cell, if any.
4. The processor follows the linked list of instructions which are waiting for the newly-ready data
cell. For each such instruction, it does the
following:
a. It turns on the ready bit of the newly-ready
operand.
b. If all ready bits are now on, it copies the
instruction into the Ready List.
c. It obtains the link to the next instruction on
the waiting list.
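A compact, single-threaded Python model of the Basic Instruction Routine may help fix the idea; Cell, Instruction, and run are hypothetical structures, and the actual proposal runs many such processors in parallel against interleaved storage banks.

# Sketch: one processor's loop over a ready list, a data store of cells,
# and instructions carrying ready bits and a per-cell list of waiters
# (standing in for the linked waiting list).

class Cell:
    def __init__(self):
        self.value = None
        self.waiters = []                    # instructions waiting on this cell

class Instruction:
    def __init__(self, op, out, ins):
        self.op, self.out, self.ins = op, out, ins
        self.ready = [False] * len(ins)      # one ready bit per input operand

def run(ready_list, data_store):
    while ready_list:
        instr = ready_list.pop()                                  # step 1
        args = [data_store[name].value for name in instr.ins]     # step 2
        out_cell = data_store[instr.out]
        out_cell.value = instr.op(*args)                          # step 3
        for waiter in out_cell.waiters:                           # step 4
            waiter.ready[waiter.ins.index(instr.out)] = True
            if all(waiter.ready):
                ready_list.append(waiter)

# X ← (A * B) + (C * D), compiled into three single-assignment instructions.
store = {name: Cell() for name in ["A", "B", "C", "D", "T1", "T2", "X"]}
for name, v in zip("ABCD", (2, 3, 4, 5)):
    store[name].value = v
i3 = Instruction(lambda a, b: a + b, "X", ["T1", "T2"])
store["T1"].waiters.append(i3)
store["T2"].waiters.append(i3)
i1 = Instruction(lambda a, b: a * b, "T1", ["A", "B"])
i2 = Instruction(lambda a, b: a * b, "T2", ["C", "D"])
run([i1, i2], store)
print(store["X"].value)                      # 26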

Because many processors are simultaneously making
access requests to the three storage units, each is
organized into many banks, and the cell addresses
within the unit are interleaved among the banks. In a
given storage cycle, each bank can satisfy only one
access request; however, two processors making simultaneous requests of two different banks may both be
satisfied.
Certain features of SAMPLE make it necessary that
additional machine instructions be generated at run
time. One such feature is the ability to define and call
functions. The compiler produces, from the function
definition, a template of machine instructions, with
certain operands left "blank." Then, when the function
is called at run time, a special CALL instruction makes
a new physical copy of the template, filling in the
blanks with the real parameters of the function call,
and releases the newly copied instructions for execution.
The original template is preserved and used for other
calls. Because of the single-assignment constraint, each
new copy of the function template must have a completely new set of memory cells allocated for its temporary variables; these new cells are allocated by the
action of the CALL instruction.
Another feature requiring run-time generation of
instructions is the parallel iteration feature, in which a
statement containing a quoted tuple name behaves
like many statements, one for each element of the
tuple. In general, the size of the quoted tuple is not
known at compile time, and so it is not known how
many copies of the statement are to be made. Again,

"Single-Assignment" Approach to Parallel Processing

the solution is to make a template of all the instructions
compiled from the statement. At run time, when the
quoted tuple is defined, a special EXPAND instruction
is triggered, which expands the template into the required number of copies and fills in the addresses of the
tuple elements in the appropriate places.
A third language feature requiring special implementation is the loop. Once again, the compiler generates
a template of instructions corresponding to one copy of
the loop. At run time, a physical copy of the template is
made and released for execution. At the same time, a
special REPEAT instruction is generated, whose
function is to sense when the most recent copy of the
loop is completely executed, then test the continuation
condition and, if it passes, generate a new copy of the
loop template (complete with its own REPEAT
instruction). In order for the REPEAT instruction to
sense when all the loop instructions have executed, it
must have a dummy input operand which becomes
ready only when the output operands of all the loop
instructions are ready. This is accomplished by means
of a tree of NOP instructions, whose only function is
to make the readiness of the dummy REPEAT operand
dependent on the readiness of all variables defined in
the loop.
Loops also require introduction of the concept of
"levels of readiness." A variable defined in a loop may
be "ready" to instructions which are implementing the
current copy of the loop, but "not ready" to instructions outside the loop, which must wait for all loop
iterations to be complete before they can use the
variable. Therefore, all instructions in the instruction
store have a "level" field, which describes the level of
loop nesting at which the instruction is found, and
each cell in the data store has a field describing its
level of readiness. The ready bit of an instruction is not
turned on unless the corresponding operand is ready
on the level of the instruction (or on an outer level
of nesting).
EXPECTED PERFORMANCE
We expect that, for some class of problems, the
SAMPLE type of organization has a speed advantage
over a conventional uniprocessor, despite its wastage of
storage accesses on overhead functions such as updating
ready bits. The speed advantage arises from the ability
to overlap multiple memory accesses into the same
memory cycle. However, the SAMPLE system requires
a costly replication of processors and memory banks,
and its total storage requirements for a given unit of
work are expected to exceed those of a conventional
processor by a large factor. The increased storage


requirements are due to the following causes:
1. Storage is required for such "overhead" items as
pointers, links, ready bits, and the Ready List.
2. Because the processors communicate only
through memory, they cannot store temporary
results in internal registers, but must use memory
cells for this purpose.
3. The single-assignment property dictates that
each storage cell is used only once in a program.
Therefore, although SAMPLE processing is
compacted in time, it is correspondingly "spread
out" in memory space. This trading off of time
for memory space may be an inevitable consequence of parallel processing.
Because of its wasteful use of storage, single-assignment
processing is not considered to be a cost-effective
method of computing at the present time. Its feasibility
would be improved by a large reduction in the cost of
random-access storage, or by development of a means of
reusing storage locations during processing of a
program. The problem of deciding when to free a
storage cell for reuse is complicated by the fact that,
in processing a SAMPLE program, new instructions are
generated at run-time. Thus, although all instructions
referencing a particular cell may have been executed,
there is no assurance that more such instructions will
not be generated at a future time, and so the cell cannot
be released.
EXAMPLE
As an example of the behavior of the proposed
system, programs were written to multiply together two
square matrices, in SAMPLE (see Appendix A) and in
IBM System/360 Assembler Language7 (see Appendix
B). The 360 program was written in such a way as to
minimize memory accesses by storing temporary
results in registers. Behavior of the two programs was
simulated in detail, on a memory-cycle level, for 2X2
and 3X3 matrices. The results were compared on two
bases: (1) total memory cycles required to execute the
program, and (2) total bits of storage required for
program, data, Ready List, and all working areas. Each
360 memory cycle resulted in exactly one memory
access for fetching instructions or data or storing results.
In the SAMPLE system, memory accesses were used
not only for these purposes but also for "overhead"
functions such as updating ready bits. However, in the
SAMPLE system, one memory cycle might result in
many memory accesses made by different processors to
different storage banks. Three parameters limited the


ability of the SAMPLE system to overlap storage
accesses in this way: (1) the number of processors,
(2) the number of storage banks, and (3) the number
of ports through which a processor may make an access
request (effectively, the number of access requests
which may be made by a single processor in the same
cycle). Two SAMPLE systems were investigated:
(1) an unlimited system, in which there were indefinitely many processors, each with an unlimited
number of ports, and each individual storage address in
the instruction store or data store is considered to be
its own "bank"; (2) a limited system having ten
processors, each with four ports, and in which the
instruction and data stores each have their addresses
interleaved among ten banks. The simulation results are
shown in Figures 1 and 2. The results of the example are
consistent with the expected performance described
above.

Figure 1-Execution time comparison (total memory cycles), 360 vs. SAMPLE, for 2x2 and 3x3 matrices

Figure 2-Storage usage comparison (total bits of storage), 360 vs. SAMPLE, for 2x2 and 3x3 matrices

REFERENCES

1 J P ANDERSON
Program structure for parallel processing
Communications of the ACM Vol 8 No 12 December 1965
2 G H BARNES R M BROWN M KATO D J KUCK D L SLOTNICK R A STOKES
The Illiac IV computer
IEEE Transactions on Computers Vol C-17 No 8 August 1968
3 H W BINGHAM D A FISHER E W REIGEL
Automatic detection of parallelism in computer programs
Burroughs Corporation Technical Report TR-67-4 November 1967
4 H S STONE
One-pass compilation of arithmetic expressions for a parallel processor
Communications of the ACM Vol 10 No 4 April 1967
5 L G TESLER H J ENEA
A language design for concurrent processes
Proceedings of the 1968 Spring Joint Computer Conference
6 D D CHAMBERLIN
Parallel implementation of a single assignment language
PhD Thesis Electrical Engineering Department Stanford University Stanford California 1971
7 IBM
System/360 principles of operation
IBM Publication No A22-6821 February 1966

APPENDIX A
SAMPLE MATRIX MULTIPLICATION
PROGRAM
This program multiplies square matrices A and B
(which may be of any size) to form the product C:**
BEGIN
L ← LAST A;
I ← (1 TO L);
J ← (1 TO L);
K ← (1 TO L);
COMMENT: DO ALL MULTIPLICATIONS IN A SINGLE STEP;
T ↓ 'I' OF L ↓ 'J' OF L ↓ 'K' OF L ← A ↓ 'I' ↓ 'K' * B ↓ 'K' ↓ 'J';
COMMENT: NOW ADD UP THE PRODUCT ELEMENTS;
C ↓ 'I' OF L ↓ 'J' OF L ← + T ↓ 'I' ↓ 'J';
END

APPENDIX B
IBM SYSTEM/360 MATRIX MULTIPLICATION
PROGRAM
This program accepts 2X2 matrices in row-major
order in areas A and B, and leaves their product in
row-major order in area C. To convert to multiply N XN
matrices, simply enlarge areas A, B, and C to N2 words
each, and change the constant N4 to contain 4 * N.

** The unary operator + in the last statement yields the sum of the elements of the tuple T ↓ I ↓ J.

* REG. 0 WILL CONTAIN 4 * N
* REG. 1 WILL CONTAIN 4 * I
* REG. 2 WILL CONTAIN 4 * J
* REG. 3 WILL CONTAIN 4 * K
A      DS  4F
B      DS  4F
C      DS  4F
N4     DC  F'8'
       LA  12,4          PUT 4 IN R12
       L   0,N4          PUT 4*N IN R0
       LR  1,12          SET I=1 (4*I=4)
ILOOP  LR  2,12          SET J=1 (4*J=4)
JLOOP  LR  5,1
       MR  4,0           R5 NOW CONTAINS
       AR  5,2           DISPLACEMENT (I,J)
       SR  10,10         ZERO R10
       LR  3,12          SET K=1 (4*K=4)
KLOOP  LR  7,1
       MR  6,0           R7 NOW CONTAINS
       AR  7,3           DISPLACEMENT (I,K)
       LR  9,3
       MR  8,0           R9 NOW CONTAINS
       AR  9,2           DISPLACEMENT (K,J)
       LE  11,A(7)       LOAD A(I,K) INTO R11
       ME  11,B(9)       MULTIPLY A(I,K) * B(K,J)
       AER 10,11         ADD PRODUCT TO R10
       AR  3,12          INCREMENT K
       CR  3,0           IF K <= N,
       BC  12,KLOOP      GO TO KLOOP
       ST  10,C(5)       STORE R10 INTO C(I,J)
       AR  2,12          INCREMENT J
       CR  2,0           IF J <= N,
       BC  12,JLOOP      GO TO JLOOP
       AR  1,12          INCREMENT I
       CR  1,0           IF I <= N,
       BC  12,ILOOP      GO TO ILOOP

MEANINGEX-A computer-based semantic parse
approach to the analysis of meaning*
by DAVID J. MISHELEVICH**
The Johns Hopkins University School of Medicine
Baltimore, Maryland

INTRODUCTION

It is the purpose of this paper to look at the semantic analysis of a subset of natural English text, namely the simple noun phrase, and present the theoretical basis for and the implementation of a semantic analyzer called MEANINGEX. A "simple" noun phrase is defined as a noun modified by adjectives and/or prepositional phrases. Throughout the paper, examples will be given from medical record text because of my own orientation and because the desirability of semantic analysis of specific types of phrases provided the motivation for my study of meaning. I take the basic operational question to be as follows: "How can statements with the same meaning, but which are said in different words, be transformed to an identical form?" Thus the basic object of the process is to make similar things fall together.

Looking at the set of medical records of a hospital or physician as a whole, the set forms a costly and hard-won body of medical knowledge from which information for both individual patient care and for research purposes can be retrieved.

As in any document which contains many, many words, organization is a requirement for the purpose of rapid retrieval. One formal scheme for producing a well organized patient record results in the "PROBLEM-ORIENTED MEDICAL RECORD" of Weed.1,2 The record consists basically of a numbered list of problems (which may or may not be diagnoses) followed by progress notes and flow sheets of laboratory values or other measurements keyed to the problems. An important feature for the generalization of the approach to semantic analysis is that the individual problems are essentially the same in format and content as statements of diagnoses, symptoms or abnormal laboratory findings in medical and pathological records which are not in the form of the "Problem-Oriented Medical Record."

The striking syntactic feature about this highly important problem list is that, in the vast majority of instances, the statements are noun phrases. The availability of a valuable corpus of medical record text as the subset of natural language to be analyzed first, before completely free text, justified the limitation of this study in semantic analysis to noun phrases.
THEORETICAL BASIS FOR SEMANTIC
ANALYSIS
Semantics has been primarily studied by philosophers.
Rather than the development of a unified, applicable
analysis scheme, the approach to semantics has been
largely descriptive with example and counter-example.
The major thrust of recent philosophical work in
semantics has been due to Dr. Jerrold Katz and his
coworkers. Katz points out that there are three components to the linguistic description of a natural
language: syntactic, semantic and phonological.3 They
define the linguistic description of a natural language
as "attempt to reveal the nature of a fluent speaker's
mastery of that language" (Ibid, p. 8). Basically, then,
we are concerned with the act of communication.
Specifically, we are interested in the means by which
one physician communicates with other physicians
through the medium of the medical record. We desire to
make relatively minimal restrictions on his communication and avoid such practices as the enforced use of handbooks of standard terms or numerical codes.

* This investigation was supported in part by National Institute of General Medical Sciences Special Fellowship No. 5-F03-GM-42,816-02. The computing was done in the computing center of the Johns Hopkins University School of Medicine supported in part by a research grant of the Control Data Corporation. This paper is based on a dissertation submitted in partial fulfillment of the requirements for the degree of Doctor of Philosophy at the Johns Hopkins University.
** Present address: National Educational Consultants, 711 St. Paul St., Baltimore, Maryland 21202.
Katz and Fodor4 and Katz and Postal3 outlined a
model for a semantic theory. A "projective device"
was described which consisted of two components: a
dictionary and set of projection rules which assign a
semantic interpretation to the output produced by the
syntactic process. It is required that every sense that
a term can take on in any sentence be covered. Katz and
Postal3 require that the entries in the dictionary have
"normal form" with the following components:
1. Syntactic Markers
2. Semantic Markers
3. Distinguishers (optional)
4. Selection Restrictions (optional)

MEDICAL INFORMATION RETRIEVAL
SYSTEMS
Retrieval of information in the form of diagnoses has
been a very important objective in medicine. The major
systems currently used are those involving numerical
codes (SNDO = Standard Nomenclature of Diseases
and Operations,5 IC = International Classification of
Diseases and Operations, 6 ICDA = International Classification of Diseases, Adapted, 7 and SNOP = Standard
Nomenclature of Pathology8) although there is a move
(e.g., Reference 9) toward standard terms uncoded
into numbers (Reference 10), such as the Current
Medical Terminology.11
Other medical systems not relying on numerical
codes coded by the user can be broken down into two
classes. The first is the synonym approach12- 14 and
the second is the syntax-oriented approach (the
ACORN system, see References 15 and 16).
COMPUTER-ORIENTED RELATED AREAS
Four computer-oriented types of programs depend
very heavily on the general field of semantic analysis.
They are: Machine Translation, Question and Answer
Systems (part of artificial intelligence), Content
Analysis and Bibliographic Retrieval.

Machine translation

The 1950s brought forth the hope for and attempts to realize machine (or mechanical) natural language translation from one language to another. While the results were largely disillusioning even with postediting, some questions are related to the present problem. With copious postediting, machine translation cost more than human translation.

Question and answer systems

Question answering systems must have semantic
talents. Simmons has reviewed the area twice17- 19
and indicated that there have been essentially two
generations of such systems with a fuzzy dividing line.
One difference has been that most first generation
systems have had fixed data bases from which information can be retrieved while second generation
systems are more likely to be able to accept revisions to
the information structure by the individual asking the
question. With rare exceptions (e.g., see Reference 20)
the systems have been quite limited in the English
subset involved. Bibliographic retrieval is a related area,
since the search request is actually a formally stated
question. Giuliano,21 commenting on the first Simmons
article, felt that there were many reasons for being
pessimistic since the semantic analysis in general was
restricted to "almost trivial subject areas" such as
kinship relationships,22 baseball23 and uncomplicated
geometric figures. 24 Some improvements were made by
the time of writing of the second review article as we
shall see below.
It is beyond the scope of this paper to cover such a
rich field exhaustively. The area can be broken down
into list-structured data base systems (BASEBALL,23 SAD SAM22), text-oriented systems (PROTOSYNTHEX20,25), logical inference systems (SIR26,27 and STUDENT28,29), belief system simulation30-34 and semantic memory.35-37

Content analysis

The major thrust in content analysis has been using
a tool called "The General Inquirer,"38-39 a computer-oriented system for which a user's manual has been supplied.40

Bibliographic retrieval

Bibliographic retrieval or automatic indexing is a question answering system in that the question is, "Please retrieve all references which are relevant to my search request in a given field." The most ambitious test of retrieval techniques has been with regard to the SMART system developed by Salton and his coworkers.41-46

REQUIREMENTS FOR A SEMANTIC
ANALYZER
What, then, are the requirements which a semantic
analyzer should be able to meet? They are in summary:

1. Application of a "semantic transformation" so
that statements which are given in different
words but are recognized as meaning the same
thing in a given contextual communication are
reduced to an identical form.
2. Effective selection of sense of meaning
3. Resolution of ambiguity
4. Deletion of contradiction
5. Ability to handle the problem of specificity

The approach to MEANINGEX will be stated, the
implementation described and results on typical medical
record text statements exhibited. We then shall have
the opportunity to compare the output of the
MEANINGEX analyzer to the above criteria and
evaluate its performance.

THE SEMANTIC PARSE APPROACH TO
MEANING ANALYSIS
As opposed to the selector scheme of Katz et al.
with the subsequent combination of appropriately
selected lexical paths, I have chosen the more efficient
representation of the modification of a head term in
which both the selection and combination of meaning
are integrated processes. I call this a semantic parse or
composite selector-modifier system. Not just the
attributes of a word [e.g., (Human), (Animate), etc.
of the Katz, Fodor and Postal model] are displayed.
The head term is the noun in the noun phrase statement
which forms the root in the tree-structured semantic
analysis of that statement. Any arbitrary modification of a head term by semantic markers is permitted. The
appropriate markers for the text being analyzed are
automatically selected. The use of a tree structure
means that the modifier of a term can be in turn modified to any level. Thus we have schematically:
HEAD TERM
    MODIFIER A
        MODIFIER A1
        MODIFIER A2
    MODIFIER B
    Etc.
Note that typographical indentation is and will be
employed as the means by which hierarchical structure
is displayed. The modifiers are general properties such
as anatomical or functional considerations.
I call the tree structure a semantic parse or "sparse"
because the nodes are composed of semantic markers as
opposed to syntactic ones. Note that Raphael27 has used
the term "semantic parse" to denote the phase of
extracting relational information from English text in
his SIR program (see above). Since the MEANINGEX
modifications are at least informal relations, the present
use is certainly related. Quillian35-36 feels that his
dictionary view of encoding definitions using the
concept of modifications to a word represents at least
in part a "parse." This use is also consistent, perhaps
more so. The non-hierarchical, linear presentation of
the node contents in descriptor form after the tree
structure has been constructed is called the "end-sparse."
An important point is that we are dealing with a
functionally oriented analyzer. It has generative
properties as well since the input of a single head term
will generate an entire skeletal structure. The analysis
is performed, however, with regard to the rest of the
text given. These concepts will become clearer through
the development of an example. Let us analyze the
noun phrase problem statement:
MODERATELY SEVERE, ACUTE,
PNEUMOCOCCAL ARTHRITIS OF THE
LEFT KNEE
which is a typical problem statement.

The lexical phase

Of course, the only information about a word that
an analyzer has is given to it by the system user. We
must therefore relate each input word or compound
term we wish the analyzer to recognize to a standard


term (which may be the term itself) and a part of
speech. In some cases the part of speech may be changed
later when context dictates such a move. For example,
a noun might be effectively changed to an adjective
role as in the case of liver becoming an adjective in
liver biopsy. The form of such a lexicon might be as
follows:
TERM OR COMPOUND TERM        STANDARD TERM    PART OF SPEECH
CEREBRAL VASCULAR ACCIDENT   CVA              NOUN
CVA                          CVA              NOUN
DIABETES                     DIABETESM        NOUN
DIABETES MELLITUS            DIABETESM        NOUN
DIABETES INSIPIDUS           DIABETESI        NOUN
MODERATELY SEVERE            SEVERE           ADJ
STROKE                       CVA              NOUN
LIVER BIOPSY                 LIVERBX          NOUN

For our problem statement, we would also use the
information that "ACUTE" was substituted for
"ACUTE" as an adjective, "PNEUMOCOCCUS"
was substituted for "PNEUMOCOCCAL" as an
adjective and "ARTHRITIS" was substituted for
"ARTHRITIS" as a noun. Common incorrect spellings
could be placed in the lexicon as well.
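As a rough illustration of this lexical phase, the following Python sketch treats the lexicon as a longest-match lookup table; the table entries are drawn from the examples above and are hypothetical, and this is not the COMPASS implementation.

# Sketch: the lexical phase as a lookup from terms and compound terms to
# (standard term, part of speech) pairs, trying the longest compound first.

LEXICON = {
    "CEREBRAL VASCULAR ACCIDENT": ("CVA", "NOUN"),
    "MODERATELY SEVERE": ("SEVERE", "ADJ"),
    "PNEUMOCOCCAL": ("PNEUMOCOCCUS", "ADJ"),
    "ACUTE": ("ACUTE", "ADJ"),
    "ARTHRITIS": ("ARTHRITIS", "NOUN"),
    "LEFT": ("LEFT", "ADJ"),
    "KNEE": ("KNEE", "NOUN"),
}

def lexical_phase(text):
    words = text.upper().replace(",", "").split()
    out, i = [], 0
    while i < len(words):
        for j in range(len(words), i, -1):   # longest compound term first
            candidate = " ".join(words[i:j])
            if candidate in LEXICON:
                out.append(LEXICON[candidate])
                i = j
                break
        else:
            i += 1                           # unknown words are passed over here
    return out

print(lexical_phase("MODERATELY SEVERE, ACUTE, PNEUMOCOCCAL ARTHRITIS OF THE LEFT KNEE"))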

Syntactical phase
In this phase, the head term is separated out and the
rest of the terms become adjectives. For example, the
prepositional phrase is dealt with. In our sample
problem "of the left knee" is such that "left knee"
tells the location of the arthritis so "of the left knee"
is converted to the role of an adjective. Once this is
recognized, then "of the" is no longer required.
The extent of the adjectives modifying the noun
within the prepositional phrase can be marked with a special symbol, say an ampersand. Some nouns used as
adjectives will be taken care of in the lexical compressions (e.g., "liver" in "liver biopsy"). Others can
be handled by assuming that the final noun in a string
of nouns is really a noun and that the rest are adjectives
(e.g., "disease" in "kidney disease"). The ambiguity
which might arise because of confusion with regard to
whether a term is a noun or a verb (e.g., "biopsy")
does not arise since we are dealing exclusively with
"simple" noun phrases.

Normalized text
The output of the lexical and syntactical phases
combined is called "normalized text." It forms the input
to the semantic parser. We are using as an example for
analysis the problem statement:
MODERATELY SEVERE, ACUTE,
PNEUMOCOCCAL ARTHRITIS OF THE
LEFT KNEE
Our example would have the normalized text:
SEVERE,ACUTE,PNEUMOCOCCUS,
&LEFT,KNEE,
ARTHRITIS

Tree directory
While it is possible to get some reduction to identical
meaning by use of the lexicon alone, the power of
modification and selection resides in the semantic
parse itself. The elements of the semantic parse are the
entries in the tree directory. Starting with the head
term and its terms on the next node, the tree can be
constructed. Two types of nodes are available. Both
represent "term or terms on the next node." The first
is an inclusive node in which all the terms are used.
The second is a selector node in which only one of the
possible choices is selected. One modifier being chosen
is equivalent to a subset being chosen since the one
selected can be defined to generate the others required.
For display purposes, an asterisk in front of the members of a set of "terms on the next node" will denote that only one of those is to be selected. If one of the
selections in turn has no entry in the tree directory,
it is assumed to be terminal and "null" as the following
step is automatically supplied. The form of the tree
directory is as follows:
TREE DIRECTORY

TERMS           TERMS ON NEXT NODE

ARTHRITIS       JOINT
                INFLAMMATION

BACTERIAL       *GONOCOCCUS
                *MENINGOCOCCUS
                *PNEUMOCOCCUS

DEGREE          *MILD
                *MODERATE
                *SEVERE

DURATION        *ACUTE
                *SUBACUTE
                *CHRONIC

ETIOLOGY        *INFECTIOUS
                *OSTEO
                *RHEUMATOID

INFECTIOUS      *BACTERIAL
                *VIRAL

INFLAMMATION    DEGREE
                DURATION
                ETIOLOGY
                LOCATION

JOINT           JOINTNAME
                SIDE

JOINTNAME       *SHOULDER
                *FINGER
                *HIP
                *KNEE
                *TOE
                *POLY

LOCATION        *JOINT
                *CAVITY
                *ORGAN

SIDE            *LEFT
                *RIGHT
                *BOTH

The sparse

The skeletal form of the sparse is solely determined
by the head term. Aside from that head term, the only
role played by the normalized text is to supply selector
node decisions. The tree produced by the sparse from
the sample problem statement with reference to the
above tree directory is as follows:
ARTHRITIS =
    JOINT =
        JOINTNAME =
            KNEE =
                NULL =
        SIDE =
            LEFT =
                NULL =
    INFLAMMATION =
        DEGREE =
            SEVERE =
                NULL =
        DURATION =
            ACUTE =
                NULL =
        ETIOLOGY =
            INFECTIOUS =
                BACTERIAL =
                    PNEUMOCOCCUS =
                        NULL =
        LOCATION =
            JOINT =
                JOINTNAME =
                    KNEE =
                        NULL =
                SIDE =
                    LEFT =
                        NULL =
"JOINT" and "KNEE" are duplicated because if the
problem had been stated "MODERATELY SEVERE,
ACUTE PNEUMOCOCCAL INFLAMMATION OF
THE LEFT KNEE," we would need some way to
indicate location (except see section on IMPLIED
RELATIONS, below). The only result not thus far
explained is how the terms "INFECTIOUS" and
"BACTERIAL" were produced considering that they
were not present in the original problem statement.
This facility is taken up in the following section.
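The construction just described can be sketched in a few lines of Python (an illustration only, not the COMPASS implementation; TREE and sparse are hypothetical names, and the selected set is assumed to hold the normalized text together with the implied terms discussed in the next section).

# Sketch: building the sparse. Inclusive nodes expand all of their children;
# selector nodes (children marked "*") keep only the choice that appears in
# the selected-term set.

TREE = {
    "ARTHRITIS": ["JOINT", "INFLAMMATION"],
    "INFLAMMATION": ["DEGREE", "DURATION", "ETIOLOGY", "LOCATION"],
    "JOINT": ["JOINTNAME", "SIDE"],
    "JOINTNAME": ["*SHOULDER", "*FINGER", "*HIP", "*KNEE", "*TOE", "*POLY"],
    "SIDE": ["*LEFT", "*RIGHT", "*BOTH"],
    "DEGREE": ["*MILD", "*MODERATE", "*SEVERE"],
    "DURATION": ["*ACUTE", "*SUBACUTE", "*CHRONIC"],
    "ETIOLOGY": ["*INFECTIOUS", "*OSTEO", "*RHEUMATOID"],
    "INFECTIOUS": ["*BACTERIAL", "*VIRAL"],
    "BACTERIAL": ["*GONOCOCCUS", "*MENINGOCOCCUS", "*PNEUMOCOCCUS"],
    "LOCATION": ["*JOINT", "*CAVITY", "*ORGAN"],
}

def sparse(term, selected, depth=0):
    print("    " * depth + term + " =")
    children = TREE.get(term)
    if children is None:                      # terminal: NULL is supplied
        print("    " * (depth + 1) + "NULL =")
        return
    if children[0].startswith("*"):           # selector node
        children = [c[1:] for c in children if c[1:] in selected]
    for child in children:
        sparse(child, selected, depth + 1)

selected = {"SEVERE", "ACUTE", "PNEUMOCOCCUS", "LEFT", "KNEE",
            "BACTERIAL", "INFECTIOUS", "JOINT"}
sparse("ARTHRITIS", selected)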
Implied relations

Within a given context, more information is present
in terms of meaning than is given by the terms actually
comprising a given statement. In a medical context, for
example, the term "pneumococcal" calls to mind that
the pneumococcus is a bacterial agent and therefore the
process involved is an infectious one. Thus the fact
that pneumococcal implies bacterial and bacterial
implies infectious can be and was used to good advantage in producing the above sparse.
The logical relations, of course, are not always as
simple as pure implication and for some cases one can
see the practical extension of these operations to cover
other cases. For example, (Diabetes AND Pancreas)
might imply "Endocrine." The present system deals
with pure implication, and the binary logical operators,
AND and OR. The OR is an inclusive OR.
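As described under MEANINGEX IMPLEMENTATION below, implications are written in Polish postfix notation, for example "PNEUMOCOCCUS,BACTERIAL,=" or "JOINT,INFLAMMATION,&,ARTHRITIS,=". The following sketch, in Python rather than the actual implementation, merely illustrates how such rules could be applied repeatedly to a term set until nothing new is implied; the function names are ours.

def truth(operand, terms):
    # A bare term is true when it is already present in the term set.
    return operand if isinstance(operand, bool) else (operand in terms)

def apply_implications(terms, rules):
    """Add every implied term whose postfix condition holds, repeating until a
    fixed point is reached (so PNEUMOCOCCUS -> BACTERIAL -> INFECTIOUS chains)."""
    terms, changed = set(terms), True
    while changed:
        changed = False
        for rule in rules:
            stack = []
            for token in [t for t in rule.split(",") if t]:
                if token == "&":
                    b, a = stack.pop(), stack.pop()
                    stack.append(truth(a, terms) and truth(b, terms))
                elif token == "*":                     # inclusive OR
                    b, a = stack.pop(), stack.pop()
                    stack.append(truth(a, terms) or truth(b, terms))
                elif token == "=":                     # condition implies consequent
                    consequent, condition = stack.pop(), stack.pop()
                    if truth(condition, terms) and consequent not in terms:
                        terms.add(consequent)
                        changed = True
                else:
                    stack.append(token)                # push the term itself
    return terms

rules = ["PNEUMOCOCCUS,BACTERIAL,=",
         "BACTERIAL,INFECTIOUS,=",
         "JOINT,INFLAMMATION,&,ARTHRITIS,=",
         "SUBACUTE,ACUTE,*,SEVERE,&,SERIOUS,="]
# Adds BACTERIAL, INFECTIOUS and SERIOUS to the five terms of the normalized text.
expanded = apply_implications({"SEVERE", "ACUTE", "PNEUMOCOCCUS", "LEFT", "KNEE"}, rules)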
We note parenthetically that there is redundancy
present in the implied relations. This really is not a bad
situation except in the information theoretical transmission sense. We have two choices, first to put in
implied members of the tree or second to cull out all
such members which are otherwise implied. I have


chosen the first course of action even though it occupies
more space, since it gives us the most consistent body of
information to use for the similarity determination.
This is a desirable approach to allow the greatest
degree of symmetry since it exhibits linkages which
might otherwise be lost. This "expansive" approach
might well be criticized on the basis that somehow the
"essence" of meaning should be the smallest set of words
possible, but this is not one of my present goals. It
appears, in fact, that the only method by which one can
demonstrate the meaning of an item is to attach all the
relevant sememic tags to it. Since words are often quite
rich in meaning, we would expect to have a number of
such tags occur and in fact this is part of the measure
of "goodness."
The end-sparse

Once the sparse has been constructed, we have
utilized the power of the hierarchical structure and
now can transform the sparse to a form that is convenient for the purpose of information retrieval. A very
convenient format is that of the linear descriptor string.
Each data item is enclosed between virgules with the
descriptor for that item to its right. This is an "attribute-value" approach. Such a record can be logically searched
with the SEARCH program47 or similar systems. The
sparse placed in the linear descriptor form is called an
"end-sparse." Note that one item's descriptor can be
another descriptor's item. For our sample problem we
obtain:
/NULL/LEFT/SIDE/NULL/KNEE/JOINTNAME/JOINT/LOCATION/NULL/PNEUMOCOCCUS/BACTERIAL/INFECTIOUS/ETIOLOGY/NULL/ACUTE/DURATION/NULL/SEVERE/DEGREE/INFLAMMATION/NULL/LEFT/SIDE/NULL/KNEE/JOINTNAME/JOINT/ARTHRITIS/
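The end-sparse can be produced by a simple walk of the sparse in which the terms below a node are written out before the node itself, so that every item is immediately followed by its descriptor. The sketch below, again illustrative Python rather than the actual program, reproduces the string above from the nested form of the sparse; the reversed order of the subtrees simply matches the published example.

def end_sparse(term, children):
    """Post-order walk: the items below a term are written out first, then the term,
    so each item is immediately followed by its descriptor."""
    parts = []
    for child, grandchildren in reversed(list(children.items())):
        parts.extend(end_sparse(child, grandchildren))
    parts.append(term)
    return parts

sparse = {"ARTHRITIS": {
    "JOINT": {"JOINTNAME": {"KNEE": {"NULL": {}}},
              "SIDE": {"LEFT": {"NULL": {}}}},
    "INFLAMMATION": {"DEGREE": {"SEVERE": {"NULL": {}}},
                     "DURATION": {"ACUTE": {"NULL": {}}},
                     "ETIOLOGY": {"INFECTIOUS": {"BACTERIAL": {"PNEUMOCOCCUS": {"NULL": {}}}}},
                     "LOCATION": {"JOINT": {"JOINTNAME": {"KNEE": {"NULL": {}}},
                                            "SIDE": {"LEFT": {"NULL": {}}}}}}}}

head, children = next(iter(sparse.items()))
print("/" + "/".join(end_sparse(head, children)) + "/")
# Prints the end-sparse shown above, from /NULL/LEFT/SIDE/ ... through /JOINT/ARTHRITIS/.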
The end-sparse for the problem statement
"MODERATELY SEVERE, ACUTE, PNEUMOCOCCAL INFLAMMATION OF THE LEFT KNEE"
would be the same except that "/ARTHRITIS/"
(which is redundant information anyhow) would not
occur. If one states in the implied relations that "joint
AND inflammation IMPLIES arthritis," the head
term "inflammation" will be replaced by the higher
level term "arthritis." In any case, since the basic
elements (those resulting from "joint" and "inflammation") do occur, we have transformed a statement
said in different words than the previous one but with
similar meaning into a "similar" form. It would be

possible to remove redundancies and thus shorten the
end-sparses.
MEANINGEX IMPLEMENTATION
MEANINGEX is a language for "extracting
meaning" from medical text which consists of problem
statements. The language MEANINGEX runs interpretively on the CDC 3300 computer and has been
implemented in the assembly language COMPASS.
The current version is batch process only. The instructions are *INPUT LEXICON, *IMPLICATIONS,
*TREE DIRECTORY, *DUMP LEXICON, *DUMP
IMPLY LIST, *DUMP TREE DIRECTORY, and
*SPARSE. Implications are described in Polish postfix
notation. An on-line interactive system would be
valuable since spelling, format and implication problems
could be resolved in a conversational manner. Many of
the items could be internally coded so the storage
requirements for the sparses could be pared down
dramatically. In the present version where space was
not a major factor, a design decision was made to maintain everything in actual text.
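The batch interpreter can be pictured as a loop that reads control cards and dispatches on the instruction names listed above, treating the cards that follow each instruction as its data. The outline below is purely illustrative (Python rather than COMPASS), and the handler functions are invented stand-ins, not MEANINGEX routines.

def run_batch(cards, handlers):
    """Dispatch each *-instruction card to its handler; the cards that follow an
    instruction (lexicon entries, postfix implications, tree-directory entries,
    problem statements) are passed to it as data."""
    current, data = None, []
    def flush():
        if current is not None:
            handlers[current](data)
    for card in cards:
        card = card.rstrip()
        if card.startswith("*"):          # a new instruction card
            flush()
            current, data = card, []
        else:
            data.append(card)
    flush()

handlers = {
    "*INPUT LEXICON":       lambda data: None,   # hypothetical stand-ins for the real routines
    "*IMPLICATIONS":        lambda data: None,
    "*TREE DIRECTORY":      lambda data: None,
    "*DUMP LEXICON":        lambda data: None,
    "*DUMP IMPLY LIST":     lambda data: None,
    "*DUMP TREE DIRECTORY": lambda data: None,
    "*SPARSE":              lambda data: None,   # would parse each problem statement into a sparse
}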
EXAMPLE OF MEANINGEX ANALYSIS
Two sparses from a typical run using the
MEANINGEX interpreter appear in Figure 1.
Comparison of the first and second problem statements illustrates the use of the implication facility to
displace a head term. Implication is stated in Polish
postfix notation with = standing for implication, * for
an inclusive OR, and & for AND. In the second sparse, the implication "JOINT,INFLAMMATION,&,ARTHRITIS,=" is used to displace the head term "INFLAMMATION" by "ARTHRITIS." Thus the
sparses:
MODERATELY SEVERE, ACUTE,
PNEUMOCOCCAL ARTHRITIS OF THE
LEFT KNEE
and
SEVERE, ACUTE PNEUMOCOCCAL
INFLAMMATION OF THE LEFT KNEE
are transformed to an identical form which was the
object of the semantic analysis. Thus "LOCATION"
as a tree directory next node for inflammation is redundant in this case. Note that in the first two sparses, the
information that the condition is serious (from
the implication "SUBACUTE,ACUTE,*,SEVERE,
&,SERIOUS, =") and that a bacterial and there-


Figure 1-Two sparses from a typical MEANINGEX run: the sparse and end-sparse for "MODERATELY SEVERE, ACUTE, PNEUMOCOCCAL ARTHRITIS OF THE LEFT KNEE" (including the implied STATUS = SERIOUS node) and the sparse for "SEVERE, ACUTE PNEUMOCOCCAL INFLAMMATION OF THE LEFT KNEE," whose head term is displaced to ARTHRITIS

Figure 3-Segment detail of a people data base

Long Beach Public Safety Information Subsystem

Safety between Fire, Police and Civil Defense and at
the component level within the Fire Function between
Dispatch, Suppression, Prevention and Investigation.
The People data base (Figure 3) acts as a data
sharing mechanism at the subsystem level between all
four subsystems, and at the component level within the
Police Function between in-Custody, Case Reporting,
Investigation Support, Calls for Service and Traffic
Reporting.
PERSPECTIVE
During the remainder of this project, Long Beach
will proceed with the phased implementation of selected


applications to substantiate the hypothesis proposed by
the concepts developed. The conceptualization described
above is expected to evolve into a multi-year implementation plan for the City of Long Beach and become
the basis for planning in other municipalities.
If the Long Beach Public Safety Subsystem is successfully transferred to another jurisdiction, the USAC
objective of transferability will be validated. This can
occur only if the recipient municipality openly approaches change to its existing policies and operations.
Such change appears to be desirable in view of anticipated benefits to be derived from the implementation of
an integrated municipal information system or subsystem.

State criminal justice information systems
by ROBERT R. J. GALLATI
New York State Identification and Intelligence System
Albany, New York

INTRODUCTION

There has finally been wide recognition of the need to improve the systemic relationships of the various functions and processes of what has been euphemistically referred to as our criminal justice system. With this recognition has come an understanding of the central role of criminal identification bureaus in computerized criminal justice information systems, which, in turn, serve as foundations for the development of true criminal justice systems.

As a practical matter the state is the most logical governmental level at which computerized criminal identification bureaus could be housed. Local communities, regardless of size, necessarily have less complete files than those at the state level. Criminal law, both in terms of enactment and enforcement, is state-level based. State identification files are records of violations of the criminal laws of the particular state involved. As an operational matter, it is exceedingly difficult for a national agency to handle the fantastic workload involved in an attempt to process fingerprints and perform other identification functions for the entire nation.

An additional factor which commends the maintenance of computerized criminal justice information systems and the identification bureaus they are structured around at the state level is the increasing public concern about possible invasions of privacy involved in computer data banks, particularly those at the federal level which might be interfaced with others to form a single giant National Data Bank. Criminal justice information system data banks necessarily contain derogatory records, so it has been strongly recommended by civil libertarians and many criminologists that comprehensive criminal justice information files be kept at state level.

Obviously, if we are to retain our computerized criminal justice information systems at the state level, there must be some method for the interstate exchange of criminal history records. This was fully recognized during the development of Project SEARCH (System for the Electronic Analysis and Retrieval of Criminal Histories). The FBI/NCIC (National Crime Information Center) has assumed responsibility for maintaining the central index of the SEARCH-type system which becomes operational this November.

The critical role to be played by state identification bureaus in the future of the NCIC Criminal History Record Exchange System is apparent from a policy statement approved on March 31, 1971, at a meeting of the National Crime Information Center (NCIC) Policy Board. The Board determined that in order for the NCIC system to evolve into a truly national system ... "each state must create a fully operational computerized state criminal history capability within the state...."

It is submitted that state criminal justice information systems, and the identification bureaus which form the nuclei of their files, are destined to play an ever-increasing role in the area of public systems dedicated to law enforcement and criminal justice. One of the noteworthy examples of the development of such systems is the New York State Identification and Intelligence System (NYSIIS).

I propose to present the NYSIIS story as an analytic case study of a particular model for a state criminal justice information system. NYSIIS did not evolve from some other agency or function. It was created "de novo" as a conscious effort to produce both an agency and a function that had never before existed in New York State, or elsewhere. It is unique, and may well continue to be the only one of its kind in the nation. However, all 50 states will follow the model in one way or another, even though each may develop an indigenous system, which, on the surface, appears to differ considerably.


NYSIIS CASE STUDY
Origins

NYSIIS was created as an agency in 1965. The
concept of NYSIIS rests upon the following basic
principles of the unitary nature of criminal justice: all
criminal justice agencies need to participate in and
share a joint data bank; the submission of information
thereto should be primarily voluntary; NYSIIS is to
be a service agency only, with no powers, duties or
facilities to arrest, prosecute, confine or supervise;
security and privacy considerations must permeate the
system and involve central and remote NYSIIS operations; new dimensions of science and computer technology can be applied to provide greater effectiveness
in filing methodology and the utility of processed data;
and criminological research will be supported by
a vast resource of computerized empirical data available
for variable searching to test theses, hypotheses, theories
and pilot projects, thereby enabling criminal justice
administration to evaluate its own procedures, practices
and operations.
Development

It was very evident from the start that if NYSIIS
was to function effectively as a criminal justice information system serving all functional areas of criminal
justice, there were some obvious conditions that had
to be met:
1. NYSIIS had to be created and maintained as

an independent agency so that it could serve all
functions without fear or favor. This has been
a public administration and computer sciences
problem (opportunity);
2. NYSIIS had to have a vast computer capability
and engage in massive historical and ongoing
data conversion of the millions of criminal history
records contained in its manual identification
files. This has been a systems and computer
sciences problem (opportunity);
3. NYSIIS had to advance the state-of-the-art of
computer-related techniques for the further automation of the fingerprint identification process.
This has been a research and development and
computer science problem (opportunity);
4. NYSIIS had to create state-of-the-art computer-related technology for the development of new
and improved analytical techniques for the
identification and intelligence functions. This has
been a planning, research and computer sciences
problem (opportunity);

5. NYSIIS had to provide computer-related communications systems for remote access to the
system data bank and for computer interface
with interstate information exchange systems.
This has been a systems, communication and
computer sciences problem (opportunity);
6. NYSIIS had to be able to survive a period of
several years during which it produced only a
minimum tangible product in the service of the
criminal justice community. This has been a
public relations and computer sciences problem
(opportunity);
7. NYSIIS had to embrace a sophisticated security
and privacy program in order to allay the fears
of those who perceived computerized data banks
of derogatory data about individuals as a threat
to civil liberty. This has been a political and
computer-sciences problem (opportunity).
Computer opportunities

It is fair to say that NYSIIS would be as nothing
but for its computer capability. In every phase and at
every stage of its origins, development and continued
growth computer science problems (or opportunities)
presented themselves and then became the most constantly viable elements of continued survival. Indeed,
the future of NYSIIS and criminal justice information
systems is wholly dependent upon computer capabilities
and related technology. Perhaps it is more accurate,
therefore, to refer to the role of the computer as computer-science opportunities rather than computer-science problems.
Agency independence

A cardinal tenet of the founders of NYSIIS was that
it should be an independent agency with its own dedicated computer system. Maintaining bureaucratic independence has been a torturous trail to blaze. As a small
agency among the giants (State Police and Department
of Correctional Services), it is not surprising that influential legislators would each year call for NYSIIS'
elimination and the consolidation of its services, either
with the State Police or the Department of Correctional
Services. In fact, within a single week, a prominent
legislator recommended that NYSIIS be taken over by
the State Police on one occasion, and then recommended that it be absorbed by the Department of
Correctional Services on another occasion, just a few
days later.
Another facet of the war for independence has been
misguided attempts to consolidate state agency com-


puters on a statewide basis. To date these have been
frustrated, largely because of the fact that NYSIIS
seized the opportunity to obtain its own very large
computer system as soon as possible. Had NYSIIS
yielded to the temptation to "get started" with some
attractive police modules such as stolen motor vehicles
and stolen property, it may have found itself a ready
candidate for absorption by a larger agency or have
been required to share a general service computer with
a number of other state agencies. Since NYSIIS, from
the beginning, went for the "big apple"-a vast computerized criminal history file-very early in the game,
it became not so readily digestible and it managed to
stand alone and independent-saved by the Burroughs
6500!
Data conversion

Data conversion for the computer is invariably considered a simple problem by people who have never
experienced its impact. Those who are veterans of conversion battles know better. They also know it is particularly difficult to convert a very large manual file
while that file is being used on a day-to-day basis in
essential operations.
In the NYSIIS conversion of criminal history records
as is true to a greater or lesser extent in all criminal
identification bureau conversions, there have been certain added dimensions that further increased the difficulty attached to such an operation, such as:
1. Different time periods-due to the type of documents involved (fingerprint cards, court dispositions, institution cards, etc.), and primarily
for control of data conversion it was necessary
to set four time periods from 1927 to the present;
2. Multiplicity of agencies-there are more than
1000 relatively autonomous agencies which have
submitted source documents to NYSIIS;
3. Document differences-documents received from
varying sources differ in format;
4. Information location-the positioning of information on submitted forms varied among forms from the same and from different sources.
At present more than 750,000 criminal history records
are on tape and fully edited and purified records are
being added to the computer data base at the rate of
over 10,000 per month. Obviously, this amount of
storage of extensive criminal histories requires tremendous computer capacity and maximum speed and
multiprocessing capability. Computer science has provided NYSIIS with these capacities and capabilities


and the opportunity to provide a two-hour response
time for fingerprint submissions-as opposed to the
10-14 day response time of the old manual system.
This 12,000 percent improvement in response time is
completely dependent upon the extensive computerization of the identification function at NYSIIS.
Automated identification

The fingerprint identification process which is the
basic function of all criminal identification bureaus
virtually demands computerization and complete automation by its very nature. It is fortunate that NYSIIS
planners recognized this from the very beginning of the
agency. Today, NYSIIS is the most fully automated
identification bureau in the world.
The need for systematic improvement in the operation of these bureaus is accentuated by three recent
developments:
1. New legal procedures such as preventive detention, release on own recognizance, forthwith
sentencing, and mandatory rapid arraignment
and bail setting require swift criminal history
record responses;
2. The overwhelming increases in the volume of
fingerprint submissions have resulted in larger and
less readily accessed files;
3. Facsimile transmission and other telecommunications devices have eliminated time delays connected with delivery and focused attention upon
the lag-time at the point of processing.
The fundamental steps in the fingerprint identification process are as follows:
1. Name search of main file and wanted file
2. Classification of fingerprints
3. Fingerprint search
4. Fingerprint comparison
5. Criminal history preparation

NYSIIS has computerized the name search process,
both for wanteds and the main file. To date, no agency
in the world has succeeded in automating the classification of fingerprints; however, extensive research has
been conducted by the FBI, NYSIIS and others to
achieve this objective. NYSIIS, however, has developed
a computerized fingerprint search technique with the
remarkable capability of searching incoming fingerprints against a base file of 2.5 million fingerprint
classifications in less than 30 seconds. Fingerprint comparison through microfilm image retrieval techniques is


within the state-of-the-art and NYSIIS intends to obtain such a capability as soon as funds become available.
Finally, the hard copy criminal history record is retrieved from the computer data bank and printed out
in NYSIIS, or at a remote access facility. Here again,
we see that the opportunity to utilize computer sciences
and related technology is the key to entirely new dimensions of service which serve to protect civil liberties
and to deal more effectively with suspects and apprehended criminals.
Analytical identification

The basic function of an identification bureau (which
is, in turn, at the heart of any criminal justice information system) is to receive hard copy sets of fingerprints containing the friction ridges of all ten fingers of
the subject; compare these sets with those in the base
file and produce verified criminal history records (or
"no record responses", as the case may be). However,
there are many analytic (investigative) needs of criminal justice agencies which can be satisfied as byproducts of the computerized criminal identification
bureau. These bonus-type modules of the system have
a high pay-off in terms of public support and increased
credibility in the criminal justice community. Examples
of some of these computerized modules which NYSIIS
has developed to date are as follows:
1. Latent fingerprint identification
2. Fraudulent check
3. Personal appearance
4. Warrant/Wanted
5. Organized Crime Intelligence
6. Stolen motor vehicles (Automatic License Plate Scanning)
7. Modus Operandi
8. Criminalistic data analysis

All of these analytical modules are viable theoretically
for investigative purposes by all branches of the criminal
justice process. However, as a practical matter, they
are most often within the police domain and their
availability pleases the law enforcement segment of
the total spectrum of criminal justice administration.
Since 70 percent of the total resources of criminal
justice are concentrated in the police function, special
attention to law enforcement requirements was definitely in order for this fledgling agency.
However, it was not sufficient merely to take the
very primitive files that currently existed and computerize them. This would have outrageously sub-

optimized the capabilities of the computer and the
developing criminal justice system. Ergo, NYSIIS found
itself in the position of having to create state-of-the-art
computer-related technology in order to justify its
efforts to meet these law enforcement needs. Once
again, great opportunities were presented to provide
orders of magnitude improvement in this vital area of
government.
For example, there has never before been an effective
latent (crime scene) fingerprint identification system.
Under the conditions of existing fingerprint classification
systems, it is not possible to search the fingerprints of
unknown suspects left at the scene of a crime through
the millions of sets of prints in the main file. As a result,
special files of recidivists in those types of crimes where
there are likely to be prints left at the scene and the
perpetrators are not otherwise identifiable (i.e., burglary, auto theft, etc.) have been created. The largest
file of this type in the United States contains the
prints of less than 30,000 persons-compared with many
millions in the base file! NYSIIS' studies indicate that
at least five percent of all burglaries could be solved
through latent print identification if a sufficient number
of crime scene fingerprints were lifted and processed
in a large base file. (It must be recognized that less
than 20 percent of current burglaries are cleared by all
other investigative methods.) NYSIIS is developing
an improved system with the capability of matching
lifted crime scene fingerprints with prints in the main
criminal fingerprint file, utilizing a combination of computer search and microfilm retrieval technology.
Likewise, computer searching to identify perpetrators
by personal appearance, modus operandi, trace data
analysis, fraudulent check characteristics, etc., opens up
an entire new spectrum of aids to criminal justice administration. In automatic license plate scanning for
the apprehension of wanted motor vehicles a combination of optical and computer technologies has provided a viable solution to the epidemic stolen car
problem. Most recently, through NYSIIS planning and
research, it has been recognized that computerized
organized crime intelligence systems hold the key to
new opportunities for dealing with this nagging problem
which, up to now, has been largely unresolved despite
vast allocations of manpower resources.
Computer communications

The fantastic opportunities for criminal justice offered
by computer technology depend to a very large extent
upon compatible communications resources. Speedy
computer processing of arrest fingerprint submissions


is pretty much in vain if the fingerprints require two
or three days to arrive by mail and it takes two or
three more days thereafter to receive the printout.
The "magic" of computer identification of wanted vehicles passing on the highway is meaningless unless the
"hit" message can be retrieved within a few seconds.
The intra- and interstate exchange of identification and
intelligence data must be facilitated by an entire array
of computer compatible communications. Likewise, remote access and computer-to-computer interface depend
upon appropriate telecommunications systems.
NYSIIS has seized the opportunities available to
provide computer-related communications systems,
both within New York State and on an interstate basis
through its participation in SEARCH. A very significant development in this regard was the establishment of the first statewide facsimile network for the
photo transmission of fingerprints from any point in
the state to NYSIIS for computer processing and appropriate responses thereto by message facsimile. At
the present time it still takes 14 minutes to transmit
each set of fingerprints and an average of 4 minutes to
respond with a criminal history record. These elapsed
times are, of course, unsatisfactory and we are urgently
pressing vendors to escalate their efforts to improve
the technology.
In this connection, NYSIIS has been participating
in the satellite transmission project of SEARCH, experimenting with the possible transmission of fingerprint card images via microwave and satellite rather
than facsimile ground systems. Likewise, NYSIIS participated in the Project SEARCH interstate transmission of criminal history records which involved an
advanced telecommunication network with remote access and computer-to-computer interface. NYSIIS and
its many counterparts throughout the country are interfaced with the National Crime Information Center
(NCIC) computer at the FBI in Washington, D.C., for
purposes of stolen property and wanted identifications
and most recently for the transmission of alphanumeric
criminal history record data. Despite the sophisticated
hardware presently available, we still need a number
of breakthroughs in the area of communication technology in order to optimize the impact of the computer
upon the criminal justice system.
Barren survival

Government agencies must justify their continued
existence and make requests for their share of scarce resources each year. In the case of NYSIIS, it was necessary to convince the Legislature and its scrupulous


fiscal committee of the merits of this computerized
criminal justice information system long before it produced any tangible benefits to anyone. Here again, the
mystique of the computer provided opportunities to
sustain interest in the glorious promise of a criminal
justice information system.
A public information program which took full advantage of the public's fascination with computers and
their incredible capabilities was mounted with significant success. Professional associations of police, district
attorneys, correction, probation and parole officials provided loyal support during the barren years. They, too,
were intrigued by the promised potential of computerized information sharing.
The continuous announcement of technological breakthroughs during development and the constant reiteration and reinforcement of the ultimate promise kept
NYSIIS alive on the one hand; and, on the other hand,
prevented the abortive development of unnecessary
and redundant computer systems at the state and local
levels. Many millions of dollars were saved by inhibiting the creation of systems which would duplicate
what was already being planned on a more comprehensive basis and would perforce supplant any such
truncated endeavors. The dazzle and the promise of
the computer served NYSIIS well during the "lean"
years of little production and much planning, research
and development.
Security and privacy

From the very inception of NYSIIS it was evident
that matters of security and privacy should be given
prime attention. The same public awe of the computer
which served NYSIIS so well in buying time for system
analysis and development, could readily be changed to
fear and turned against us. The computer compelled
us to critically examine every facet of the planned
structure to be certain that the system provided dynamic security and privacy, in order to equate in productive equilibrium the right of privacy and the need
to share information.
NYSIIS recognized that we need to protect private
personality as zealously as we protect private property.
Long before the issue of the computer vs. privacy became a subject of national debate, NYSIIS had committed itself to a program of computer security which
earned the praises of Congressman Gallagher, Oscar
Ruebhausen, Orville Brim, Senator Ervin, Alan Westin,
the New York Civil Liberties Union, the Vera Institute
of Criminal Justice and many other persons and organizations who champion civil liberties.


As I testified recently before the Senate Subcommittee on Constitutional Rights:
"I firmly believe that computerized criminal justice
information systems are essential for the effective administration of criminal justice and that such systems
can be developed and operated with adequate security
against unreasonable invasions of individual privacy-indeed, I believe that they can be so developed and
operated as to provide new dimensions of personal
freedom and protection for civil liberties and constitutional rights."
The concerns of NYSIIS were also the concerns of
Project SEARCH and with their respective plans for
protecting privacy and their voluntary adoption of
stringent codes of ethics, we may rest assured that the
computer sciences will remain alive and well in the
criminal justice community.
SUMMARY
The future of criminal justice information systems,
particularly at the state level, seems very bright indeed.
The difficult years of development, which I have indicated by reference to the NYSIIS experience, are pretty
much behind us. The miracle of mounting infusions

of money through welcome federal funding of computerized criminal identification bureaus and related technologies assures the fiscal support of such systems.
The very rapid and meaningful responses that computerized criminal justice information systems are providing for the felt needs of all functional branches of
criminal justice administration are engendering professional support and commitment. The challenge of
those who fear 1984 has been met head-on. We are
leading the march for individual freedom and civil
liberties, for the computer, properly controlled, is a
willing slave to serve humanity, not a master of our
fate.
Ultimately, computerized criminal justice information systems will be vindicated by two consummations:
1. Emergence of a coherent and coordinated system
of criminal justice;
2. Reduction in the incidence of crime and effective
apprehension, prosecution, adjudication and rehabilitation of offenders.
I am so bemused myself with the mystique of the
computer that I sincerely believe the computer can accomplish this.

Automated court systems*
by RONALD L. BACA, MICHAEL G. CHAMBERS and WALTER L. PRINGLE
Symbiotics International Incorporated
Houston, Texas

and
STAYTON C. ROEHM
Harris County
Houston, Texas

* The development of the system described in this paper was financed in part by the Law Enforcement Assistance Administration with a grant awarded to Harris County, Texas, and administered by the Texas Criminal Justice Council.

INTRODUCTION

Why does our Judiciary continue to use antiquated methods in the courts instead of taking advantage of business automation techniques which have been so successfully utilized by private industry?

This paper answers this question and discusses some of the reasons why the courts, especially those in the larger cities, need such automation techniques.

The paper also describes what has been done in Houston, Texas, to solve this problem. The authors have worked closely with Harris County criminal justice officials for several years and have designed a completely automated criminal records system.

This system, called the Harris County Subject-in-Process Records System,1 maintains pertinent information about criminal cases. This information is made available to the courts, law enforcement agencies, the District Attorney, and other agencies and departments involved with the judicial process.

JUDICIARY REQUIREMENTS

The court officials, especially in the larger cities, including judges, clerks of the courts and prosecuting attorneys, know what computers can do for them. Their conferences and professional publications constantly emphasize the importance of automation. They know also that somehow the processing of cases must be speeded up or the wheels of justice will soon come to a grinding halt.

Chief Justice Warren E. Burger in his first state of the judiciary message in 1970 said: "In the supermarket age, we are like a merchant trying to operate a cracker barrel corner grocery store with the methods and equipment of 1900."

Litigants in criminal cases are experiencing delays of up to two years and more before their cases can even come to trial. This is particularly true in our larger metropolitan centers. After such a long period of time it is not unusual to find that witnesses involved in a case have moved away or even died. The standard solution to the delay problems is simply to add more courts. More courts mean more judges, more clerical support and more docketing problems.

Our courts are bogged down with manual bookkeeping procedures. In many of the metropolitan areas it is not unusual to read about how someone was denied his freedom due to a simple clerical error or a breakdown in communications between the various departments that comprise the criminal justice system. Citizens often win judgments against law enforcement agencies in resulting litigation.

The problem of crimes committed by persons out on bond is a major one. Many states are implementing procedures to speed up the processing of cases involving dangerous persons who are free on bond. Such preferential treatment can result in more delays for innocent people who cannot post bond and must remain in jail.

It is suspected that a prime cause for much of the backlog and delay is due to a lack of coordination in docketing cases. Attorneys are often involved in a large

number of cases and therefore are frequently unavailable. It is also suspected that attorneys often ask for a
postponement of one case in order to get a better setting
for another case they are representing. These things are
suspected, but without automation it is a formidable
task to sift through the mountain of paperwork to
determine bottlenecks in the judicial process and to
formulate action to remove them.
Former Chief Justice Earl Warren, in a speech
delivered at the annual meeting of the American Law
Institute in 1966, said: "It seems to me there is a
definite need for thorough analysis and study of the
mechanics-in its physical aspects-of carrying on the
business of the courts. I am led to this belief by the
accomplishments of new data processing methods
employed in other fields-medicine, for example."
Governmental agencies, especially on a local level,
are quite inflexible in comparison to commercial businesses. Seemingly simple changes such as using an
available computer facility to print an index of criminal
defendants instead of manually entering each name in a
"well-bound" journal often require amendments to
state constitutions; or, at a minimum, require an
interpretation by the State's Attorney General.
Of course, we are all too familiar with the problems
posed by budgetary considerations and of officials who
are not close enough to the problem and who find it
difficult to approve expenditures for data processing.
Government often fails to use modern data processing procedures simply due to organizational
restrictions. There is usually no one person or department to tie the various criminal justice departments
and agencies together to organize and support the
implementation of such a system.
What, then, is being done to relieve our congested courts? The use of computers to streamline court
procedures can presently be found in several large
cities. Many of these systems, however, were implemented quickly to solve some immediate problems.
What is desperately needed is a thorough analysis of the
entire court system and the development of long range
plans to solve the problems.

SYSTEM OBJECTIVES

The criminal justice officials in Harris County have long been aware of the administrative problems and have recently taken positive steps toward a solution by working together to develop what is now called the Harris County Subject-in-Process Records System. This computer system maintains all pertinent information about criminal cases and the defendants involved. The system information is available, via printed reports and remote terminals, to the District Clerk, District Attorney, Sheriff, Probation Department, and the Courts.

The primary objectives in the design of the Harris County Subject-in-Process Records System were to produce a system which would provide an efficient means of monitoring the progress of criminal cases and to define methods of using such information to reduce the total time and effort required to process a case. The system is designed in a manner to be mutually beneficial to the various County agencies and departments concerned with the criminal process. It is, whenever feasible and allowable under the statutes, designed to eliminate unnecessary duplication of records and effort amongst these agencies and departments.

Harris County records show that in 1966 the average time from indictment to trial was 18 months. Today the average is down to six months due to the diligent efforts of the County officials. U. S. Chief Justice Warren E. Burger, however, has urged that all criminal cases be brought to trial within 60 days of arrest.

Figure 1-System interface (the Subject-in-Process computer system connected by terminals to the Auditor, the Justice of the Peace Courts, the District Clerk, the Sheriff, and the District Attorney)

ORGANIZATION AND DESIGN

The Harris County Subject-in-Process Records System was designed to eliminate the necessity of looking for information manually. Naturally, there are many manually processed legal documents. The computer system may, however, maintain copies of per-


tinent facts from each document and thereby provide
instant response to many questions concerning criminal
cases.
As a subject progresses from one step in the judicial
process to the next, information regarding this progress
is recorded in the computer system.
Figure 1 is a graphical representation showing which
County departments interface with the System. The
number depicted in each box indicates the number of
terminals assigned to each department.
The Subject-in-Process Records System consists of
teleprocessing and batch processing functions built
around a nucleus of files serving as the System's data
base. The System Organization flowchart shown in
Figure 2 illustrates the system.2 The various files and
queues are shown in the center with the teleprocessing
functions to the left and the batch processing functions
to the right.
The three basic data files are the Case History File,
Name and Identification Number File, and the Calendar
File. These files are similar to those of the Basic Courts
System3 (BCS) files, but several additions and
modifications have been incorporated. The basic files
are separated into active and inactive files to augment
the on-line and batch oriented functions.


The remote terminal user has available to him nine
basic teleprocessing functions. These consist of Remote
Batch Input (RBI), Batch Output Reporting (REP)
and seven on-line functions (CAS, NAM, NUM,
ANM, PER, JAC and CAL) which aid the user in the
interrogating, retrieving and updating of the basic data
files via the remote terminals.
RBI allows for the input of batch data via the remote
terminals by placing the input in a queue to be processed
by the Batch Input Subsystem. REP allows the user
to request batch output from the remote terminals by
placing the requests on a queue to be processed by the
Batch Output Subsystem. The seven on-line functions
yield terminal displays to the terminal inquiries and are
briefly described below:
CAS: allows the user to search, retrieve and update
the Case History File, and to display all
associated transaction records at the terminal
NAM: allows the user to search, retrieve and update
the Name File and to display the desired
records at the terminal
NUM: allows the user to search, retrieve and update
the Identification Number File and to
display the desired records at the terminal
ANM: allows the user to display all available
identification numbers associated with a
defendant to a case
PER: allows the user to display all available
personal descriptor information associated
with a defendant
JAC: allows the user to display the arrest/conviction history of a defendant
CAL: allows the user to search, retrieve and update
the Calendar File and to display the docket
of a court
These teleprocessing functions are written in
FASTER-LC4 and are incorporated into the system to
augment the facility available to the user.
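The on-line functions are invoked from a terminal by short keyword messages (the figures later in this paper show inputs such as "NAM, Smith, John, B." and "CAL,08-24-70,176"). The following sketch, in Python rather than FASTER-LC, only illustrates that style of keyword dispatch and the queueing of RBI and REP requests; the parsing details and handler bodies are invented and do not reproduce the actual programs.

# Illustrative only: route a terminal inquiry such as "CAL,08-24-70,176" or
# "NAM, Smith, John, B." to the matching on-line function.  RBI and REP requests
# are simply queued for the batch subsystems.
from collections import deque

batch_input_queue, batch_output_queue = deque(), deque()

def handle_inquiry(message, handlers):
    code, *args = [field.strip() for field in message.split(",")]
    if code == "RBI":                       # remote batch input: queue for the Batch Input Subsystem
        batch_input_queue.append(args)
        return "QUEUED FOR BATCH INPUT"
    if code == "REP":                       # batch output request: queue for the Batch Output Subsystem
        batch_output_queue.append(args)
        return "QUEUED FOR BATCH OUTPUT"
    return handlers[code](*args)            # CAS, NAM, NUM, ANM, PER, JAC and CAL

handlers = {
    "NAM": lambda last, first="", middle="": f"NAME INDEX DISPLAY FOR {last} {first} {middle}".strip(),
    "CAL": lambda date, court: f"DOCKET FOR COURT {court} ON {date}",
    # CAS, NUM, ANM, PER and JAC would be registered the same way
}

print(handle_inquiry("NAM, Smith, John, B.", handlers))
print(handle_inquiry("CAL,08-24-70,176", handlers))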
All terminal inquiries are logged on the Log File to
provide system backup. In the event of a system failure,
all transactions can be reconstructed and the integrity
of the basic data files insured. The Log File also provides a data base for the analysis of user requests and
overall terminal usage.
Figure 2-System organization

Batch Processing

The batch processing functions are divided into the
Batch Input and Batch Output Subsystems. These
subsystems are designed to interact with the queues
built in the on-line mode and the basic data files. These


Figure 3-Complaint index (sample month-to-date report, run date 01-23-70, listing complaint number, defendant's name, offense code and offense description)

Figure 5-Grand jury index (sample Cases Pending the Grand Jury Index for the week ending 12-18-70, listing complaint number, date filed, defendant's name, Justice of the Peace Court, offense description, and totals of cases pending and cases pending over 90 days)

subsystems provide for the input of data to the files and
the output of pre-defined system reports.
The Batch Input Monitor is a subsystem consisting
of ANS COBOL programs which take the batch and
remote batch input data and update the basic data
files. This subsystem performs the necessary editing
and formatting of the various data records and supplies
diagnostic messages when appropriate.
The Batch Output Monitor is a subsystem consisting of ANS COBOL programs which queue the
system requests for generating reports on pre-established frequencies. This subsystem also analyzes all
system generated and user generated requests for batch
output, eliminates duplication, establishes priorities
and invokes the various batch output programs which
produce the system reports.
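The behavior just described (collecting system-generated and user-generated requests, eliminating duplicates, establishing priorities, and invoking the report programs) can be sketched as follows. The sketch is illustrative Python, not the ANS COBOL subsystem, and the report names, priorities and function names shown are hypothetical.

# Illustration only: merge system-generated and user-generated report requests,
# drop duplicates, and run them in priority order.
def run_batch_output(system_requests, user_requests, report_programs, priority):
    seen, ordered = set(), []
    for request in system_requests + user_requests:
        if request not in seen:                           # eliminate duplicate requests
            seen.add(request)
            ordered.append(request)
    ordered.sort(key=lambda r: priority.get(r, 99))       # establish priorities
    for request in ordered:
        report_programs[request]()                        # invoke the batch output program

# Hypothetical usage: nightly indexes first, registers afterwards.
programs = {"COMPLAINT INDEX": lambda: print("printing complaint index"),
            "FELONY INDEX": lambda: print("printing felony index"),
            "CASE HISTORY REGISTER": lambda: print("printing case history register")}
run_batch_output(["COMPLAINT INDEX", "FELONY INDEX"],
                 ["FELONY INDEX", "CASE HISTORY REGISTER"],
                 programs,
                 priority={"COMPLAINT INDEX": 1, "FELONY INDEX": 1, "CASE HISTORY REGISTER": 2})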
The capabilities of the System include the ability to
produce numerous printed reports at predetermined
intervals or upon request. These reports include indexes,
case histories, and summary reports.
The Complaint Index shown in Figure 3 is a list of
all felony complaints which have been submitted to the
Grand Jury. The index contains the defendant's name,
a unique sequence number, the co-defendant suffix (a
two-digit number used to identify defendants when
there are more than one to a case), the offense code and the offense description. The Complaint Index is sorted and printed by defendant name and by the sequence number.

The Felony Index and the Misdemeanor Index are similar and contain the case number, co-defendant suffix, defendant's name, the volume and page of the judgment records and the case disposition. The indexes are printed in defendant name order. A sample is shown in Figure 4.

Figure 4-Felony index (sample month-to-date report listing case number, defendant's name, judgment-records volume and page, and case disposition)
The Cases Pending the Grand Jury Index, see Figure
5, is an alphabetical list of all defendants of felony
cases which have been bound over to the Grand Jury
but have not been indicted or no-billed. The index
contains the defendant's name, the complaint number
with co-defendant suffix, the Justice of the Peace
Court, and the offense. In addition, cases which have
been pending the Grand Jury for 90 days or more are
flagged by listing the date the case was filed in the
Justice of the Peace Court.
The Cases Pending the District Courts Index and
the Cases Pending the County Courts at Law Index
consist of a list of cases which have been assigned to a
County Court at Law or District Court, but are
pending final disposition. The indexes are sorted in two
major ways: by case number, and by ready status. A
sample is shown in Figure 6.
The Case History Registers consist of chronological
listings of all transactions concerning each case from
the time the case number is issued until the case has
been disposed of. A sample is shown in Figure 7.

Figure 6-District courts index (sample Felony Cases Pending Status Index for the week ending 12-04-70, listing case dispositions and the total felony cases pending)


Figure 7-Case history register (sample weekly felony case history register listing, for each case, transactions such as indictment, capias issued and returned, bond made, hearing, pleading, jury request and disposition, with dates and minutes volume and page)

Teleprocessing

The System has the ability to display system information on remote terminal CRTs. As indicated earlier in Figure 2, which describes the organization of the data files, the system provides for the following terminal displays:

• NAME INDEX INQUIRY-NAM
• PERSONAL DESCRIPTOR INQUIRY-PER
• JAIL ARREST CONVICTION INQUIRY-JAC
• COURT CALENDAR INDEX INQUIRY-CAL
• IDENTIFYING NUMBER INDEX INQUIRY-NUM
• ASSOCIATED NUMBER INDEX INQUIRY-ANM
• CASE HISTORY REGISTER INQUIRY-CAS

The Name Index Inquiry, shown in detail in Figure 8, is a display which allows the user to identify information pertaining to persons involved in the judicial process. By supplying the System with the name of a person, the System responds by displaying all cases involving the person.

Figure 8-Name index inquiry (sample input "NAM, Smith, John, B." and the resulting name index display of the cases involving that person)

This inquiry capability is used to find the case number when only the name is known. The Name Index contains the names of all persons associated with complaints, misdemeanors, and felonies. That is, it contains the names of defendants, defense attorneys, prosecuting attorneys, witnesses, bondsmen, etc.

The Personal Descriptor Inquiry is used to answer requests for more identifying information about a defendant. Upon entering the defendant's name, the following information is provided:

Place of Birth
Date of Birth
Height
Weight
Hair Color
Ethnic Features
Sex

The Jail Arrest Conviction Inquiry provides a display of all known arrest and conviction information concerning a particular defendant. This information may be used by the District Attorney's Office to prepare the prosecution and by the Sheriff's Office for criminal investigation purposes.

The Court Calendar Index Inquiry, shown in Figure 9, allows the user to display the cases scheduled for a particular day in a particular court.

Figure 9-Court calendar index inquiry (sample input "CAL,08-24-70,176" and the resulting docket display for that court and date)

The Identifying Number Index Inquiry allows the user to identify a person and the cases that person is associated with by entering any one of several identifying numbers. The following identification numbers may be used:

Complaint Number
Sheriff's Number
Texas Department of Public Safety Number
FBI Number
Social Security Number
Operator License Number
Arresting Agency Number
Law Enforcement Number
Grand Jury Records Section Number

This index allows the various agencies of the criminal justice process to communicate with the system by using their own identification numbers.

The Associated Number Index Inquiry allows the user to display all identification numbers associated with a person involved in the judicial process by supplying the system with a case number.

The Case History Register Inquiry shown in Figure 10 allows the user to display all transactions regarding a particular case.

Figure 10-Case history register inquiry (sample input "CAS, 3,176-0028346-01" and the resulting display of all transactions for that case)

SECURITY AND PRIVACY

In every computer system incorporating a large data base, security and privacy of information are important considerations. This is especially true in the case of a criminal records system. Two basic problems exist, errors and unauthorized access.
Errors are a result of mistakes occurring during the
manual preparation of the input data. Errors on source
documents, typographical or keypunching errors, and
inadvertent omission of pertinent data are examples of
the types of errors which can occur. The input routines
detect invalid input data (numeric value out of range,
alphabetic character in a numeric field, unknown code,
etc.) and all data input via cards is verified by being
displayed and matched with the source document. As
data are input, routines also check for inconsistencies in
data (e.g., a warrant for arrest is shown to be executed
prior to being issued).
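The kinds of edit checks just mentioned (values out of range, alphabetic characters in numeric fields, unknown codes, and cross-field consistency such as a warrant executed before it was issued) might be sketched as follows. The record layout, code table and function name are invented for illustration and are not the System's actual edit routines.

# Illustrative edit checks only; the record layout and the code table are hypothetical.
from datetime import date

VALID_OFFENSE_CODES = {"2270", "2300", "2501", "3562"}   # hypothetical code table

def edit_record(record):
    """Return a list of diagnostic messages for one input record."""
    diagnostics = []
    if not record.get("case_number", "").isdigit():
        diagnostics.append("ALPHABETIC CHARACTER IN NUMERIC FIELD: CASE NUMBER")
    if record.get("offense_code") not in VALID_OFFENSE_CODES:
        diagnostics.append("UNKNOWN OFFENSE CODE")
    bond = record.get("bond_amount", 0)
    if not 0 <= bond <= 1_000_000:                       # numeric value out of range
        diagnostics.append("BOND AMOUNT OUT OF RANGE")
    issued, executed = record.get("warrant_issued"), record.get("warrant_executed")
    if issued and executed and executed < issued:        # cross-field consistency check
        diagnostics.append("WARRANT EXECUTED PRIOR TO BEING ISSUED")
    return diagnostics

print(edit_record({"case_number": "28346A", "offense_code": "9999", "bond_amount": 10000,
                   "warrant_issued": date(1970, 8, 28), "warrant_executed": date(1970, 8, 26)}))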
The second problem of unauthorized access to the
data files is particularly critical. The criminal records
system deals with highly sensitive information. Destruction or modification of this information would severely
cripple the effective performance of criminal justice.
Therefore, a considerable amount of effort has been
made to ensure the integrity of the information contained in the criminal records system.
The criminal records system allows for the updating
of records from remote terminals. This provides up-to-the-minute information in the files but can be a source of problems if unauthorized personnel have access to the terminals. Several steps have been taken to alleviate this problem; a sketch of such checks follows the list below.
• Each person authorized to update the files is
assigned an access code which is changed periodically. Without the code, modification of or additions to the files cannot occur. Furthermore, the
access codes are valid only for a particular terminal.
• Certain terminals are designated as display
terminals only and allow no modifications or
additions to occur. In addition, those terminals
which are allowed to make modifications or
additions may be restricted to use only during
those periods of the day when authorized persons
are on duty.
• The system also provides file protection by
terminals. Thus a particular terminal may be able
to modify or add a record in the Name File but not
in the Case or Calendar File.

Automated Court Systems

• The system also has the ability to restrict the transactions allowed on a given terminal. Thus a particular terminal may be able to make an inquiry that
is not allowed by some other terminal. This allows
controls, via software, to be placed on the use of
any terminal.
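A sketch of how the terminal restrictions listed above might be checked in software follows. The terminal profiles, access codes and function name are hypothetical and do not reproduce the System's actual controls.

# Illustration only; the terminal profiles and access codes here are invented.
from datetime import time

TERMINALS = {
    "DA-01": {"display_only": False, "files": {"NAME", "CASE"}, "transactions": {"NAM", "CAS", "RBI"},
              "hours": (time(8, 0), time(17, 0)), "access_codes": {"4471"}},
    "JP-03": {"display_only": True,  "files": set(), "transactions": {"NAM", "CAL"},
              "hours": (time(0, 0), time(23, 59)), "access_codes": set()},
}

def authorized(terminal_id, transaction, target_file, updating, access_code, now):
    """Check a request against the per-terminal restrictions described above."""
    profile = TERMINALS.get(terminal_id)
    if profile is None or transaction not in profile["transactions"]:
        return False                                    # transaction not allowed on this terminal
    start, end = profile["hours"]
    if not (start <= now <= end):
        return False                                    # outside the terminal's permitted hours
    if updating:
        if profile["display_only"] or target_file not in profile["files"]:
            return False                                # display-only terminal or protected file
        if access_code not in profile["access_codes"]:
            return False                                # access code invalid for this terminal
    return True

print(authorized("JP-03", "NAM", "NAME", updating=False, access_code=None, now=time(10, 30)))   # True
print(authorized("JP-03", "NAM", "NAME", updating=True, access_code="4471", now=time(10, 30)))  # False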
Periodically the information is transferred from disk
storage to magnetic tape. Two copies of the files are
made. One is stored locally and is used to recreate the
files in the event of inadvertent (hardware malfunction)
or deliberate destruction of the files currently recorded
on disk storage. The other copy is kept at a remote
location as protection against the destruction of both
the files on disk storage and the magnetic tape copy.
While the above mentioned capabilities provide a
means of protection, the ultimate success depends on the
people involved and the extent to which the operating
procedures are followed.
CONCLUSION
It should be noted that while this System was tailored
specifically for Harris County, Texas, the concepts and
design, if not some of the programs themselves, could
be successfully applied to many other counties in Texas
and throughout the country.
The System was designed with several important
growth features in mind. Some of the possible additional
capabilities being considered are simulation models
which take advantage of the statistical information
now available, a complete bookkeeping system for the
Adult Probation Department to keep track of fines,
supervisory fees and restitution payments, a complete
jail record system from keeping track of personal
effects and making cell assignments to computerized
search capability of fingerprints and mugshots, and
automated recording procedures for the Juvenile
Probation Office.
The benefits of the Harris County Subject-in-Process


Records System are numerous. One of the primary
benefits, however, is the ability to obtain instantaneous
response to a variety of questions concerning a case or
a defendant. In the past, an inquirer was often transferred from one office to another as each office searched
but failed to find the requested information.
Another benefit of primary importance is the System's
ability to monitor the progress of each case and periodically report required actions. These action reports
include lists of persons being held for no apparent reason, cases that are ready for trial but have not been calendared, and persons whose probation periods have
elapsed but have not been officially terminated.
In addition to providing answers to questions and
monitoring case progress, the System also provides
numerous written reports which assist the criminal
justice officials in preparing a case for trial, scheduling
each event of the trial, and preparing local and state
statistical reports.
Another result of the computerized system is the
ability to use the information to produce various
statistical reports to aid in evaluating administrative
procedures and to test hypothetical changes in these
procedures. Additionally, quick access to accurate case
load information is extremely useful for budget planning
and evaluating future manpower and facility requirements.
All of these benefits aid significantly in reducing the
time it takes to process a case.

Delphi and its potential impact on information systems
by MURRAY TUROFF
Office of Emergency Preparedness, Executive Offices of the President

Washington, D. C.

THE DELPHI METHOD1,2
The Delphi method is basically defined as a method
for the systematic solicitation and collation of informed
judgments on a particular topic. The concept of "informed" here could mean poor people, if the subject
were poverty, as well as the usual interpretation of
"experts." The method has two important characteristics which distinguish it considerably from a polling
procedure. The first is feedback, where the judgments
of the individuals are collected, possibly formulated as
a group response and fed back. Thus, each individual
may view the results and consider whether he wishes
to contribute more to the information and/or reconsider
his earlier views. This round or phase structure may
go through three to five iterations in the usual paper
and pencil exercise. The second characteristic is that
all responses are anonymous. The reasons for anonymity
are much discussed in the literature and will not be
reviewed here. However, there are circumstances where
complete anonymity could be relaxed. In some cases it
may be useful for the respondents to know who is
participating in order to insure awareness that a peer
group is involved in the discussion. Also, when a highly
specialized subtopic enters the discussion it may be
appropriate to permit an expert to endorse an item.
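A caricature of the round-and-feedback structure, in Python, may make the mechanics concrete; the choice of median and interquartile range as the fed-back group statistics, and the simple revision rule, are common conventions assumed for the sketch rather than anything prescribed here.

import statistics

def delphi_rounds(initial_estimates, rounds=3):
    """Iterate feedback rounds: the group median and interquartile range
    are fed back anonymously, and each respondent may then revise."""
    estimates = list(initial_estimates)
    for r in range(rounds):
        med = statistics.median(estimates)
        q = statistics.quantiles(estimates, n=4)
        q1, q3 = q[0], q[2]
        # Revision step: respondents outside the interquartile range nudge
        # toward the group view, or hold firm -- a stand-in for the human
        # judgment actually exercised between rounds.
        estimates = [e if q1 <= e <= q3 else e + 0.5 * (med - e) for e in estimates]
        print(f"Round {r + 1}: median={med:.1f}, interquartile range=({q1:.1f}, {q3:.1f})")
    return estimates

delphi_rounds([3, 5, 8, 12, 20, 4, 7])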
The primary objective of the Delphi process, as set
forth in this paper, is the establishment of a "meaningful" group communication structure. If this view is accepted as correct, then the question of whether or not
a Delphi exercise will produce "truth" is not a relevant
one. The real issue, given the context of a particular
problem, is what communication process or combination
of processes will be most effective in terms of the
resources available to examine the problem.
There appear to be five situations where the Delphi
method clearly has an advantage over other alternatives:

• Where the individuals needed to contribute knowledge to the examination of a complex problem have no history of adequate communication and the communication process must be structured to insure understanding;
• Where the problem is so broad that more individuals are needed than can meaningfully interact in a face-to-face exchange;
• Where disagreements among individuals are so severe that the communication process must be refereed;
• Where time is scarce for the individuals involved and/or geographical distances are large, thereby inhibiting frequent group meetings;
• Where a supplemental group communication process would be conducive to increasing the efficiency of the face-to-face meeting.

In order to emphasize the view that the Delphi is a
communication process, Table I directly compares the
properties of normal group communication modes and
the non-automated and automated Delphi processes.
The major differences lie in such areas as the ability of
participants in a Delphi to interact with the group at
their own convenience (i.e., random as opposed to
co-incident), the ability to handle large groups, and the
ability to structure the communication. With respect to
time considerations, there is a certain degree of similarity between a Committee and a Delphi exercise since
delays between meetings and rounds are unavoidable.
Also, the Delphi Conference3-5 may be viewed conceptually as a random (occurring) conference call with a written record automatically produced. It is interesting to observe that within the context of the normal operation of these communication modes in the typical organization, governmental or industrial, the Delphi process appears to provide the individual with the greatest degree of individuality or freedom from restrictions on his expressions.
While the Table breaks down these systems separately, there is no reason why the examination of a particular problem would not be best served by a combination of these techniques.

TABLE I-Group Communication Techniques

Effective Group Size
  Conference Telephone Call: Small
  Committee Meeting: Small to Medium
  Formal Conference or Seminar: Small to Large
  Delphi Exercise: Small to Large
  Delphi Conference: Small to Large

Occurrence of Interaction by Individual
  Conference Telephone Call: Coincident with group
  Committee Meeting: Coincident with group
  Formal Conference or Seminar: Coincident with group
  Delphi Exercise: Random
  Delphi Conference: Random

Length of Interaction
  Conference Telephone Call: Short
  Committee Meeting: Medium to long
  Formal Conference or Seminar: Long
  Delphi Exercise: Short to medium
  Delphi Conference: Short

Number of Interactions
  Conference Telephone Call: Multiple, as required by group
  Committee Meeting: Multiple, with necessary time delays between
  Formal Conference or Seminar: Single
  Delphi Exercise: Multiple, with necessary time delays between
  Delphi Conference: Multiple, as required by individual

Normal Mode Range
  Conference Telephone Call: Equality to chairman control (flexible)
  Committee Meeting: Equality to chairman control (flexible)
  Formal Conference or Seminar: Presentation (directed)
  Delphi Exercise: Equality to monitor control (structured)
  Delphi Conference: Equality to monitor control, or group control and no monitor (structured)

Principal Costs
  Conference Telephone Call: Communications
  Committee Meeting: Travel; individuals' time
  Formal Conference or Seminar: Travel; individuals' time; fees
  Delphi Exercise: Monitor time; clerical; secretarial
  Delphi Conference: Communications; computer usage

Time-Urgent Considerations
  Conference Telephone Call: Can address time-urgent considerations
  Committee Meeting: Forced delays
  Delphi Exercise: Forced delays
  Delphi Conference: Can address time-urgent considerations

Other Characteristics
  Committee Meeting: Equal flow of information to and from all; can maximize psychological effects
  Formal Conference or Seminar: Efficient flow of information from few to many
  Delphi Exercise and Delphi Conference: Equal flow of information to and from all; can minimize psychological effects; can minimize time demanded of respondents or conferees

For example, a Delphi Conference may be used between committee meetings
to arrive at an agenda and expose the areas of agreement
and disagreement. This, in turn, would improve the
efficiency of time spent in the actual committee meeting
by focusing the discussion on those areas requiring
review. In some instances this would also improve the
efficiency of staff work before the meeting.
Usually a Delphi communication process, whether it
be an exercise or a conference, undergoes four distinct
phases. The first phase is usually characterized by
exploration of the subject under discussion wherein
each individual contributes additional information he
feels is pertinent to the issue. The second phase usually
involves the process of reaching an understanding of
how the group views the issue (i.e., where they agree
or disagree and what they mean by relative terms such
as importance, desirability or feasibility). If there is
significant disagreement, then that disagreement is explored in the next phase to bring out underlying reasons
for the differences and possibly to evaluate them. The
last phase, a final evaluation, occurs when all previously
gathered information has been initially evaluated and
evaluations have been fed back for consideration.
The Delphi technique may be considered to have
roots in the jury system and is, perhaps unfortunately,
a rather simple idea. Because of this, many individuals
have conducted one Delphi and only a few have gone
on to do more than one. The process of designing a
workable communication structure for a particular
problem currently appears to be more an art than a
science. However, a number of general reasons for
failures have come to light from these less successful
attempts:
• Utilizing a blank sheet of paper on the first round
or phase and thereby implying that the respondents
should waste their time in educating the design
and monitor team;
• Poor techniques of summarizing and presenting
the group response and insuring common interpretations of the evaluation scales utilized in the
exercise;
• Ignoring and not exploring disagreements so that
discouraged dissenters drop out and an artificial
consensus is generated;
• Ignoring the fact that respondents to a Delphi are
acting in a consultant mode in what may be a
demanding exercise and should therefore be involved as a part of their normal job function or
should receive normal consulting fees for participation.
The use of the Delphi process appears to have increased at an exponential rate over the past five years
and on the surface seems incompatible with the limited
amount of controlled experimentation that has taken
place on the methodology itself. It is, however, meeting
a demand for improved communications among larger
and/or geographically dispersed groups which cannot
be satisfied by other available techniques. It also serves
the decision maker who wishes to seek out the potential
secondary effects of a decision or policy which may
involve a more diverse group of experts than is normally
available. Also, technologists have become increasingly
concerned that attempts to evaluate cost-benefit aspects
through mathematical models often eliminate significant technical factors which they may feel are crucial
criteria for the making of a decision. The Delphi
process can, in this context, be viewed as an attempt
to put human judgment, in terms of a group judgment
by experts, on a par with a page of computer output.
This is an unfortunate justification for the Delphi
process, but from a pragmatic point of view it is a
valid one in terms of decision processes in some organizations.
It can be expected that the use of Delphi will continue
to grow. From this one can observe that a body of
knowledge is developing on how to structure the human
communication process for particular types or classes
of problems. The abuse, as well as the use, of the
technique is contributing to the development of this
design methodology. It would seem obvious that any
communication structure that employs pencil, paper,
and the mails can, in principle, be duplicated in a real
time mode on an interactive terminal-oriented computer-communication system. When this is done the
resulting product is a continuous group communication
process which eliminates some of the disadvantages in
the paper and pencil type Delphi while retaining most
advantages. It is the contention of this author that
those in the computer field should begin to actively
plagiarize the techniques of the Delphi design area for

319

building on-line conferencing systems tailored to various
problem applications. The remainder of this paper attempts to support this assertion.
EXAMPLES*

* See Reference 1 for explicit references to the examples mentioned.
In examining applications of the Delphi, one observes
that the vast majority deal with forecasting the future.
Because of this, many individuals associate the Delphi
process solely with forecasting. However, in examining
other Delphi exercises, one finds that they span a
surprising diversity of applications:
• Examining the significance of historical events
• Gathering current and historical data
• Putting together the structure of a model
• Delineating the pros and cons associated with potential decision or policy options
• Developing causal relationships in complex economic or social phenomena
• Clarifying human interactions through role playing concepts.
If one adopts the view of Delphi as a communication
tool, then this exhibited diversity of application is not
surprising. A group communication process can, in
theory, be applied to any problem area. The following
will discuss some of these previous applications and
indicate where they may lead in the future.
Dr. Williams of Johns Hopkins University has utilized
the Delphi to obtain estimates of current rates of
disease incidence and the success rate of various alternative treatments. Since hospital reports may reflect
local reporting standards, there is considerable uncertainty associated with the data that is available. This
phenomenon also occurs in other areas such as crime
statistics. In applications of this sort, individuals are
asked to supply low and high values as well as an
explicit estimate. This type of exercise then proceeds in
very much the form of a forecasting Delphi, although it
deals with current data.
There are a surprising number of Delphi designers in the medical research and health care areas; some of these are Dr. A. Sheldon at Harvard, Dr. A. Bender at Smith Kline & French, Dr. D. Gustafson at the University of Wisconsin, and Dr. G. Sideris, American Surgical Association.
A recent Delphi on the Steel Industry7 by the
National Materials Advisory Board of the National
Research Council also attempted to gather estimates on the quantity of material flowing in and out of various
processing segments of the industry. In such a case,
even when a parameter is published it may only represent a percent of the industry. This percent factor may
be only approximately known.
A proprietary Delphi was done which dealt only
with historical events affecting the subject of the "Limitation or Elimination of Internal-Combustion Vehicles."
Some eighty-two events were compiled by the respondent group and evaluated for explicit significance and
"factors to watch" as a result of the events. The events
were technological, economic, social, and political.
The resulting summary arranging the events chronologically represented an excellent review and condensation for management. This same concept could easily
be applied to a professional area and the computer
field is perhaps overdue for a careful review of the
literature. For example, it is doubtful that anyone in
the field can claim he has read all that has been written
on Management Information Systems. Probably all
would agree, however, that the signal to noise ratio is
small. It would be interesting to see the list of significant
papers drawn up by a group of experts, and to discover
how they would identify papers representing follow-on
work to earlier papers and further developments that
may occur as indicated by a particular paper. One
added benefit of the Delphi is that an expert need not
feel embarrassed to propose or argue for his own papers
as significant. It is not clear, of course, that the group
would always vote to include a suggested paper.
The concept of utilizing Delphi to examine history is
a simple but powerful concept. Most organizations do
not really do a good job on evaluating past performance
and this often defeats the purpose of their planning
efforts. The author hopes more applications of this
type will be forthcoming.
Mr. S. Scheele of SET, Inc. designed and executed a fascinating Delphi on the Role of the Mentally Retarded in Society. Since he was dealing with a non-quantitatively oriented group, he relied very heavily on pictorial models which the individuals could fill in in order to represent human and societal interactions. Also inherent in the design were role playing concepts and a requirement for the respondents, in answering different questions, to assume different roles. This same concept applies to obtaining answers from individuals in political or public positions, where one would wish to ask for the individual's true view on an issue and the view he would espouse if required to take a public position.
The role playing concept in the Delphi has implications for an organization in the sense that most
budget allocation procedures may be viewed as a form
of polling where each manager submits his requests to
a central source. When budget cuts must be made,

there is a great deal of competition among the divisional
groups, often resulting in antagonism and a complete
breakdown of lateral cooperation and communication.
The budget process could be "carefully" recast in a
Delphi mode and each manager asked to assume the
roles of other managers and to attempt justification of
budget segments other than his own. This could lead
to more understanding of the final allocation for all
concerned and correspondingly less antagonism. The
validity of the above general suggestion is, however,
extremely dependent upon the particular organization
and details of the environment, operation, and makeup.
Norman Dalkey's "Quality of Life" Delphi is a
classic simple example of utilizing a Delphi to obtain
subjective evaluations which could not be gained by
any analytic method. Here the respondents were asked
to itemize and define a set of variables which comprised
the Quality of Life and were measurable in at least an
empirical sense. The feedback mechanism was necessary
to arrive at mutually understandable definitions and
the anonymity was desirable to avoid the embarrassment of individuals who might rate factors such as
"aggression" higher. than the group as a whole. The
same type of Delphi was conducted on a group of
corporate executives to determine if their ranking of
the Quality of Life variables corresponded to the
corporation executive benefit program.
Many individuals have a mistaken impression that
consensus is a goal of all Delphi Exercises. When exploring policy or decision issues, the goal may be to
develop the strongest set of pros and cons concerning
a given issue. In a sense then, some policy Delphis seek
to at least explore disagreement if not to directly foster
it through the makeup of the respondent group. Even
if a decision maker has reached a view on an issue, it
may be of interest to him to seek out the opposing
view to be forewarned of difficulties he may encounter
when his decision is made public. The discovery of a
consensus among opposing advocates on underlying
issues or compromise positions may make the exercise
doubly useful but may not be the primary goal.
In the steel Delphi mentioned earlier, the respondents
were given a flow model diagram of steel processing
which was intended to collect data on the flow of
material in each path. The initial model was put together by an expert. However, many of the respondents
to the exercise decided that the diagram was not sufficient to express what they felt were significant connections. As a result of the uninvited modification of
the model, the diagram obtained after two Delphi
rounds was considerably more detailed and realistic.
This leads to the proposition that Delphi can be utilized
to build model structures for complex processes. The
difficulty with some of the plans for designing computer
graphic systems for group engineering design efforts is
that the computer people often forget that the concept
can be first tried with pencil and paper on a real-life
problem to see if a workable communication structure
would result. If it succeeds in a Delphi Exercise mode
then there is a higher probability of success in the
automated version. In many ways, the Delphi activity
as it occurs today is conducting a significant experimentation program for the field of computer sciences. This
fact appears to have, thus far, escaped the notice of
most computer personnel. The general concept of pretesting an information system design by paper and
pencil exercise before it is frozen in the concrete of our
"flexible" computer system deserves more attention
than it has received.
One very significant aspect of the Delphi area has
been the design of attempts to discover views on causal
relationships underlying complex physical, social, and/or
economic systems. While many design techniques have
been tested, one in particular has gained wide use
because of the ease with which even non-quantitatively
oriented individuals can supply answers. This communication format is generally referred to as "Cross
Impact"8 and involves a matrix formulation of causal
effects where the user is asked to supply either probabilities, odds, or weights depending on the particulars
of the formalism. While the approach is easy to use,
the analysis of the results is less clear because one is
asking only for a small, but feasible portion of the
information required to rigorously specify the problem
and therefore consistency checks can only be approximations. At least four different methods of analysis are
currently being used. An important difference for some
of these approaches is the ease with which the method
can be incorporated into an interactive mode on a
computer system. In experiments the author has conducted with a method of treatment suited for a computer, one finds that a non-programming user, by
supplying answers to a cross impact form, can in effect
build his own model of the future which he can then
subject to perturbations to see the effects of alternative
decisions or policy. This becomes very useful as an aid
to the thinking through of a complex situation. The
interactive feature is extremely important in allowing
an individual to modify his initial estimates until he
feels he has obtained consistency between these and the
inferences provided by the analytic treatment. Once a
user is satisfied with the estimates obtained in this one-person game mode, they may be applied automatically
to the formation of a group estimate and may allow
individuals to see the differences in judgment that may
occur for both the magnitude and the direction of the
causal effects. This process quickly focuses the group's
attention on areas of either disagreement or uncertainty
which then may be discussed in a committee process or
a general discussion-oriented Delphi.
The particular utility of the cross impact formalism
in a planning environment will become evident in the
next section.
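To make the formalism concrete, the sketch below treats a respondent's answers as a matrix of impact weights and shows how forcing one event to occur perturbs the probabilities of the others. The events, weights, and the simple log-odds update rule are invented for the illustration; they are not the particular analysis method referred to above.

import math

EVENTS = ["steel imports rise", "new emission law", "plant automation"]

# Respondent-supplied prior probabilities and cross-impact weights:
# impact[i][j] > 0 means occurrence of event i makes event j more likely.
prior = [0.5, 0.4, 0.6]
impact = [[0.0,  0.3, -0.2],
          [0.2,  0.0,  0.4],
          [-0.1, 0.1,  0.0]]

def perturb(prior, impact, which, occurs=True):
    """Recompute the other probabilities given that one event is forced to
    occur (or not); the log-odds adjustment stands in for the more elaborate
    treatments discussed in the text."""
    new = []
    for j, p in enumerate(prior):
        if j == which:
            new.append(1.0 if occurs else 0.0)
            continue
        w = impact[which][j] if occurs else -impact[which][j]
        odds = p / (1 - p) * math.exp(w)
        new.append(odds / (1 + odds))
    return new

print(perturb(prior, impact, which=1))   # force "new emission law" to occur

Used interactively, a respondent could repeat such perturbations until the inferred probabilities feel consistent with his own judgment, which is the one-person game mode described above.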
In terms of the author's knowledge alone, there are
at least thirty distinctive Delphi designs which have
been successfully applied to particular problem areas.
Each one of these is a potential candidate for automation on a terminal-oriented computer system in order
to implement a real time conference system. While many
of these require graphical input, a sizable number can
be implemented utilizing the common teletype terminal.
When the computer is introduced we also introduce the
ability to provide for the Delphi respondent both analytical tools and selective data bases which he may
utilize to sharpen his judgments before they are contributed to the group response.
A significant observable effect of a computerized conference system is the group pressure to restrict discussion to the meat of the issue. Verbose statements always
tend to receive low acceptance votes and individuals
quickly learn, because of this, to sharpen their position
if they wish to make a point.
Putting all these factors together with the real time
nature of such a system, we can begin to visualize the results as approaching something that might be termed a "collective human intelligence" capability. In
terms of the current state of the art in the computer
field there may be a great deal more pay-off in easing
the ability of humans to contribute the intelligence to
the computer than in attempting to get the computer
to simulate intelligence.
INFORMATION SYSTEMS
In most organizations today, the individuals or groups
involved in forecasting and/or planning* usually exhibit
the greatest desire to foster lateral communication.
This often comes from a realization that uncertainties
and ancillary considerations must be carefully explored
if the organization is to avoid problems in the future.
The desire to seek out the specialists in the organization
regardless of where they sit, combined with the requirement to minimize the time they must give up from their
normal functions, has led to an increasing use of the
Delphi by the forecasting groups.

* The exception to this generality occurs when there is a belief
that planning or forecasting can be reduced to only the consideration of dollars and alternative dollar equivalents or investments.
Perhaps more organizations take this view than is warranted by
their situation.


Figure 1-An adaptive lateral management system (diagram labels: people; terminals; standard procedures, models, data management systems; controllable events; uncontrollable events; evaluated option consequences)

Due to the increasingly complex environment that
most organizations face today, a similar circumstance
has developed with respect to day-to-day management
functions. The need for committee participation is beginning to make heavy demands upon the time of many
managers. While the paper and pencil Delphi process
has been introduced in some cases to alleviate the situation, it does not always meet the time-urgent requirements associated with some management activities.
There seems, therefore, to be a rapidly increasing
interest in implementing more efficient communication
techniques to deal with complex management problems.
The automated Delphi or Delphi Conferencing may
very well be the answer to this problem. In fact, one
can conceptually lay out a highly adaptive Management
Information System based upon this view.
Given that the organization has a problem to be
examined and resolved, the first step is to pinpoint
the individuals who can contribute to the process independent of their organizational or geographical location.
They, as individuals, may contribute via terminals at
their convenience to a general discussion conference
(see Figure 1).
Requests for information on the potential environment (i.e., those factors not under control of the organization) will emerge from this discussion. These requests
are shifted to a specialized conference structure which
may involve only a subset of the general conference
group and other specialists as needed. The communication structure for this forecasting conference is probably typical of many of the forecasting Delphis already
in existence.

Potential program options would also evolve from
the general discussion conference. These options would
be shifted to another secondary or specialized conference
to evaluate questions of resource allocation within the
organization. This type of conference would possibly
have various analytical support routines involving
optimal allocation of resources among combinations of
program options.
In both the resource allocation and forecasting conferences one would expect uncertainties or disagreements to occur which should be fed back to the general
discussion conference for resolution. The results of these
efforts would be a set of program options and potential
environments which may now be played off, one against
the other, in a conference structured along the lines of
a 'Cross Impact' exercise.
In this third conference, additional uncertainties and
disagreements may arise to be fed back to the general
discussion conference. It is also possible, if not likely,
that the results of the cross impact may trigger the
requirement to introduce new program options or to
examine a newly introduced aspect of the environment.
One then views the interaction of these four conference
structures as a continuous communication and feedback
structure.
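One crude way to picture the four linked conferences is as a routing structure over a shared set of transcripts; the sketch below is an assumption-laden illustration in which the conference names, message kinds, and routing rules are invented to mirror the description above, not a specification of the system.

from collections import defaultdict

# One problem gets four linked conference transcripts.
CONFERENCES = ("general", "forecasting", "resource_allocation", "cross_impact")

class LateralProblem:
    """Routes contributions among the four conference structures described
    in the text; uncertainties and disagreements flow back to 'general'."""
    def __init__(self, title):
        self.title = title
        self.transcripts = defaultdict(list)

    def contribute(self, conference, author, item, kind="comment"):
        self.transcripts[conference].append((author, kind, item))
        # Forecast requests and program options raised in the general
        # discussion are shifted to the specialized conferences.
        if conference == "general" and kind == "forecast_request":
            self.transcripts["forecasting"].append((author, "request", item))
        if conference == "general" and kind == "program_option":
            self.transcripts["resource_allocation"].append((author, "option", item))
        # Disagreements in any specialized conference are fed back.
        if conference != "general" and kind == "disagreement":
            self.transcripts["general"].append((author, "feedback", item))

p = LateralProblem("plant siting")
p.contribute("general", "A", "need demand forecast for 1975", kind="forecast_request")
p.contribute("forecasting", "B", "range too wide on energy costs", kind="disagreement")
print({k: len(v) for k, v in p.transcripts.items()})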
This basic set of four conferences may be replicated
for each problem the organization wishes to put through
this process. Therefore any one individual may be involved in a number of different problems. Also, a particular problem may be perpetual in nature so that the
activity never stops but different individuals may enter
and leave the discussion as a need for their particular
speciality arises and is satisfied. The result of this is a
highly adaptive and flexible structure for problem solving which the author feels exhibits all the characteristics
of a Management Information System or what MIS
should be.
The underlying premise behind the adoption of such
a system is that while the organization may have
elaborate data management systems and simulations
or models, there are no algorithms allowing the data
flowing through the normal organization procedure to
automatically be transformed into a form directly suitable for addressing management problems as they occur.
The view here is that individuals provide the best
available mechanism for discriminating, reorganizing
and presenting the portion of the data needed for
problem consideration.
The problem that appears to exist in many current
MIS efforts is the view that it is possible to introduce
automation to the point where the person at the top
can press a few keys on the terminal and all the pertinent data in a form appropriate to his problem will be
retrieved. This is true only to the extent one believes

Delphi

that all the problems that will be considered can be
predefined.
Most of the current MIS efforts are based upon
what the philosophy of science people would term a
'Leibnitzian Inquiring System'9 where the approach is
to believe that one can construct a model of a physical
process independent of the data inputs. This view
underlies the mathematical and physical sciences, and
the attempt of the soft sciences, including the management sciences, to emulate this philosophy has perhaps
created some of the problems in applying work in this
area to real problems.
It is of interest to note that any particular Delphi
design, or communication structure can be characterized
in terms of one of the Inquiring Systems specified in
Churchman's writings. However, very few Delphis fall
into the category of being Leibnitzian since there is
usually a basic recognition in Delphi structures that the
problem and data are inseparable. The policy type
Delphi can, for example, be characterized as a "Hegelian
Inquirer" which in its extreme assumes that any particular data, through its representation, can be used to
support contrary positions. Information is considered
fundamentally, in this view, as a property of the conflict
between contesting points of view.
The lateral management system that this paper has
attempted to describe can be viewed as what has been
termed a "Singer-Churchmanian Inquirer" where it
is assumed that "there are a multiplicity of models,
theories, and inquirers for looking at the world, no
one of which has absolute priority over the other".
With this view, one is quickly forced to the position
that each new problem must be examined within the
context of the available information and potential
analysis tools in order to arrive at a treatment. Therefore, we must utilize the only information processor which can evaluate* among alternative approaches-humans.

* In the sense of applying value to the alternative approaches.
When one realizes that a majority of the efforts
associated with trying to apply computer systems to
the problems facing organizations are based upon a
Leibnitzian Inquiring philosophy, then one of the little
known, but crucial dangers associated with computer
systems becomes clear. Organizations are forced, by
both accounting requirements and command (i.e., focusing of responsibility) requirements, into adopting a
hierarchical structure. Because the environment confronting these organizations has become increasingly
complex, the resulting structure does not often match
the problems that arise. The usual Leibnitzian reaction
to this situation is to reorganize so that a new structure
fitting the problem emerges. Because structures, especially if they retain the hierarchical property, are fairly rigid, and the situation today is characterized by
numerous problems wherein no one structure is common
to all, these attempts to reorganize do not usually
accomplish the desired goal.
Since organizations are made up of at least a subset
of intelligent human beings, the inadequacies of the
organizational structure and the resulting established
communication channels are at least obvious to some.
The result is a growing lack in many organizations of
effective communications about various problems. The
individual perceiving the situation faces a choice of
either establishing informal communication channels
and perhaps suffering consequences for bypassing the
established modes or suffering in silence and adopting
a game-playing attitude toward the communication
process available to him. When this latter attitude is
characteristic of a large segment of the organization,
there is no longer an effective human communication
process and individuals become extremely unresponsive
to attempts to effectively deal with problems. This is
further complicated in times of tight budgets where
there is competition for resources among different segments of the organization.
Given the above situation in an organization, what
happens when a computerized Management Information System designed along Leibnitzian lines is
introduced? A well designed system of this sort gives
the illusion of intelligence by being very responsive to
the individuals communicating with it. Since it is data
independent it can translate any input data provided
by the humans into an apparently original output or
consequence. Psychologists would possibly agree that
given the alternative of an unresponsive human communication process or a responsive man-machine communication process most individuals will shift their
efforts at communication to the machine. We then find
the computer becoming a surrogate for a repressed
individual desire for effective communication. In addition, since the Leibnitzian view of the world appears
invalid for the type of problems confronting most
organizations, then the introduction of a Management
Information System based on this concept is a form of
deception. The result is that the human is still playing
a game, although with the computer he may be less
aware of the existence of the game than in the process of
dealing with humans.
What the author therefore believes to be a real
danger of computer usage over the long term is the
ability of these systems to subjugate the desire to
treat problems associated with human communications
in organizations by providing an image that an effective
communication process exists. There is the possibility
of a world ten to twenty years hence where a majority
of the professional populace believes it is performing a
useful function, but is, in fact, engaged in a game from
which no tangible benefits result.
USER REQUIREMENTS FOR CONFERENCING10
The first and paramount requirement is that the
designer of a conference structure have available a user
oriented language (i.e., BASIC, JOSS, APL, TINT,
etc.) in which the conference can be programmed. The
general rule about conferencing systems is that any
group of users will, through experience, have a considerable number of modifications to make. Also, it can
be expected that a new type of problem may dictate a
new communication structure. The role of the computer
specialist should be to provide those features in a user
language and machine executive which will allow the
designer the flexibility of programming communication
structures which may be intertwined with simple or
complex analytical expressions (i.e., from vote averaging to optimization models). The basic system requirements are twofold and very similar to those required for on-line simulations involving a group of
humans:
(1) Simultaneous attempts by two or more individuals at separate terminals to write in the
same file should not cause garbling of the file.
(2) Errors in input or noise on the line should not
confuse the user by throwing him out of the
program and into the compiler or executive
program.
There are several ways to meet the requirements.
Summarized here are the particular features available
in XBASIC* on the UNIVAC 1108 that allow the
writing of conferencing systems.

* Proprietary processor developed by Language and Systems Development Inc.
An XBASIC program can execute a subset of the executive level commands on the 1108. This capability is helpful in allowing the program to assign the common file exclusively (using the executive command) for a short time (less than a second) to the conferee who is inputting data at that moment. The program then frees
the file for any other conferees desiring to write in it.
This simple feature solves the first requirement.
In order for the interaction program to do all the
error checking, a full capability for decoding strings is
required. Everything (even a number) entered via the
terminal or via noise on the communication line is read
as a string and checked for allowed choices. Therefore, the ability to accomplish string manipulation and provide storage of string variables is required.
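Neither XBASIC nor the 1108 executive is reproduced here, but the two requirements can be caricatured in present-day terms: a brief exclusive lock around each write to the common transcript, and every terminal input read as a string and validated before use. The lock file, transcript name, and allowed choices below are invented for the sketch.

import os, time

LOCK = "conference.lock"
TRANSCRIPT = "conference.txt"

def append_exclusively(line):
    """Assign the common file exclusively for the brief moment of the write,
    then release it -- the analogue of the exclusive file assignment above."""
    while True:
        try:
            fd = os.open(LOCK, os.O_CREAT | os.O_EXCL)   # acquire
            break
        except FileExistsError:
            time.sleep(0.05)                             # another conferee is writing
    try:
        with open(TRANSCRIPT, "a") as f:
            f.write(line + "\n")
    finally:
        os.close(fd)
        os.remove(LOCK)                                  # release

def read_choice(raw, allowed=("Y", "N", "A", "?")):
    """Everything from the terminal (even a number) is treated as a string
    and checked against the allowed choices, so noise on the line cannot
    throw the conferee out of the program."""
    token = raw.strip().upper()[:1]
    return token if token in allowed else None

append_exclusively("respondent 3 votes: agree")
print(read_choice(" y\n"), read_choice("%$#!"))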
Output, especially for non-programmers, must be
neat. Therefore, format or form control, such as is
provided in FORTRAN or JOSS, for example, must
be part of the language.
A good test of the sufficiency of the string handling
capabilities in a user language is provided by examining
the difficulty of writing one of the standard interactive
text editors in the user language.
Although the above items are sufficient, a number of
other features will make things easier or more efficient.
Many of these are covered in Hall's paper in the 1971
SJCC proceedings.4
Many computer professionals appear to have believed
until now that any user can be placed in one of five
categories. The user does calculations, or he looks at
data, or he manipulates strings, or he edits text, or he
files things away; but he always does just one of these
things, and the system capabilities are slanted accordingly. All the users I have ever encountered seem
to do all the above in their daily non-computer chores.
It is time for user languages to reflect a more realistic
picture of users and their requirements.
If computer conferencing is to be a successful operation, the design and modification of conference structures with respect to the dictates of the problem being
examined must be largely carried out by the users.
Once a successful structure has evolved, a good systems
programmer will have a role in making the overall
operation efficient, provided the system is to receive
long term use.

REFERENCES
1 A more detailed discussion of the Delphi and a comprehensive bibliography of this area may be found in The Design
of a Policy Delphi by Murray Turoff, Journal of Technological Forecasting and Social Change, Vol. 2, No.2, 1970.
2 A comparison of Delphi as a planning tool with other
planning tools may be found in Technological Forecasting
and Engineering Materials by the Committee on Technological Forecasting of the National Materials Advisory
Board of the National Research Council, NMAB-279,
December 1970.
3 The history of a particular Delphi Conference application
may be found in Delphi Conferencing (i.e., Computer Based
Conferencing with Anonymity) by Murray Turoff, Journal
of Technological Forecasting and Social Change, Vol. 3,
No. 2, 1971 (publisher: American Elsevier). This paper
contains the complete design of the user interaction.
4 Details on implementing the above conference system on a
computer may be found in Implementation of an Interactive
Conference System by Thomas W. Hall, Proceedings of the
1971 Spring Joint Computer Conference.

Delphi

5 An abbreviated report on this topic may be found in
Industrial Applications of Technological Forecasting and its
use in R&D Management, Wiley 1971, edited by M. Cetron
and C. Ralph.
6 An explanation of the Delphi Conferencing concept for the
layman is available in the April 1971 issue of the Futurist
(magazine of the World Future Society, Washington,
D. C.).
7 Two other large recent Delphis (involving 40 to 100 experts) reviewing the potential future of an industrial sector were one on computers by IBM and one on the Housing Industry by Selwyn Enzer of the Institute for the Future. See Some Prospects for Residential Housing by 1985, IFF Report R-13, January 1971. The "Delphi Exploration of the Ferroalloy and Steel Industry" should be available from NMAB in late 1971 and contains a detailed
history of the effort involved in carrying out a large scale
Delphi exercise.
8 For a review of this literature see An Alternative Approach
to Cross Impact Analyses by Murray Turoff, Submitted to
the Journal of Technological Forecasting for publication
early 1972. This paper also illustrates the use of Cross
Impact in an information system context.
9 See What is Information? A Philosophical Analysis by
Ian I. Mitroff, Interdisciplinary Program in Information
Sciences, University of Pittsburgh (to be published). Also
the writings on Inquiring Systems by C. West Churchman
(Internal working papers 28, 29, 45, 46, 49 on Inquiring
Systems, Space Sciences Lab., University of California
Berkeley, to be compiled in a book).
10 The author has discussed some of these issues earlier:
Immediate Access and the User, Datamation, August 1966
and Immediate Access and the User Revisited, Datamation,
May 1969.

OTHER REFERENCE MATERIAL
Most of the current literature on Delphi appears in
the Journal of Technological Forecasting and Social
Change, Futures, or the Futurist (the magazine of the
World Future Society). The World Future Society also
runs a supplemental bulletin in which on-going work is
reported usually long before publication.
The Institute for the Future is doing continuing
work on applying the Delphi to fairly complex problems.
Their reports and working papers should be of considerable interest to anyone planning to utilize the
technique on a large scale problem.
Norman Dalkey at RAND has been carrying on a
continuing series of experiments on the methodology
and many of his recent papers are mandatory reading
for potential practitioners.
The following items are meant to augment the extensive bibliography already available in the paper:
The Design of a Policy Delphi.
A recent OEO report discusses the role of Delphi
Conferencing as a component of an "Executive Infor-

325

mati on System", for the Governor of Wisconsin:
• GENIE (Government Executives Normative Information Expediter) by D. Sam Scheele*, Vincent De Sante and Edward Glasser, March 1971.

* Mr. Scheele (Social Engineering Technology Inc., L.A.) presented this concept at a session of the 1971 SJCC; it is not, however, available in those proceedings.

A version of the Delphi Conferencing System has
been implemented in TRAC on the PDP-10 by Claude
Kagan of Western Electric Research, Princeton, N.J ..
The proceedings of the First General Assembly of
the World Future Society (held in May of 1971 in
Washington, D.C., and to be published late 1971 or
early 1972) contain two papers of interest:
• On the Design of Inquiring Systems-A Guide to
Information Systems of the Future by Ian I. Mitroff.
• Three-Hundred and Seventy-Third Meeting of the
Council on Social and Economic Cybernetic Stability
in the Year 2011 by Murray Turoff.

The first provides a review and literature guide to
the concept of "Inquiring Systems" and the second is
a forecasting scenario which carries some current tendencies in the computer field to their dangerous, but
perhaps logical, extreme.
Prof. Mitroff also has a paper in Vol. 17, No. 10,
June 1971 issue of Management Science which deals
with a particular application of a Hegelian Inquirer:
• A Communication Model of Dialectical Inquiring
Systems-A Strategy for Strategic Planning.

The 1971 IFORS (International Federation of Operations Research Societies) meeting on Cost Effectiveness held in May of 1971 in Washington, D.C., had a
working session on Delphi. (The proceedings should be
published in 1972 by Wiley.) The report on the working
session provides a synopsis on the Delphi method and
also reports on an experiment held in the session where
the audience voted (as to agreement or disagreement)
with respect to twenty-one conclusions contained in a
technical presentation. The vote was taken both before
and after the presentation. This modified form of Delphi
provided a clear measure of the effectiveness of the
presentation and its utility as an educational experience
for the audience. One cannot help but conjecture that
extensive use of this technique at professional meetings
might either significantly decrease the number of papers
submitted or significantly improve the quality of the
presentations.
The IFORS proceedings also contain a review article
on Multidimensional Scaling and its potential use in
Delphis by J. Douglas Carroll of Bell Telephone Laboratories. Work of this sort in the field of psychology is
pertinent to using the Delphi for obtaining value
judgments.
Those interested in the use of Delphi in social indicators should examine:
• Experimental Assessment of Delphi Procedures with Group Value Judgments by Norman Dalkey and Daniel Rourke, RAND Report R-612-ARPA, February 1971.
Two recent attempts to validate Delphi exercises
with respect to "real" applications were carried out
by Dr. John W. Williamson of the School of Hygiene
and Public Health, Johns Hopkins University. These
were:
• Prognostic Epidemiology of Breast Cancer and Prognostic Epidemiology of Absenteeism

A recent Delphi of interest to the computer field is:
• A Delphi Inquiry into the Future Economic Risks of Computer-Based Systems, Institute for the Future, Middletown, Connecticut.
There is considerable activity in the use or potential
use of Delphi in the area of regional or urban planning.
An example of this may be found in:
• Sea Grant Delphi Exercises: Techniques for Utilizing Informed Judgments of a Multi-Disciplinary Team of Researchers, by John D. Ludlow, Bureau of Business Research, University of Michigan, Working Paper 22, Jan. 1971.
A number of county governments are utilizing the
Delphi technique internally.
The Delphi literature has become quite rich in recent
years with respect to the diversity of applications. One
can quite easily be amazed at the number of information systems being designed and utilized without the
use of a computer. This is especially significant if one
concludes, as I have, that many of these designs are
closer to meeting MIS requirements than other activities designed on computers and labeled as MIS.

Technology for group dialogue and social choice*
by THOMAS B. SHERIDAN
The Massachusetts Institute of Technology
Cambridge, Massachusetts

INTRODUCTION

Usually the best way to discuss and resolve the choices that arise within groups of people is face-to-face and personally. For this reason, city planners and educators alike are calling for new kinds of communities for working, living, and learning, based more on familial relationships between people than on contractual relationships. When people get to know one another, conflicts have a way of being accommodated.

Beyond the circle of intimacy the problem of communication is obviously much greater; and while social issues can still be resolved more or less arbitrarily, it is more difficult to resolve them satisfactorily.

The "circle of intimacy" is constrained in its radius. One analyst has estimated that the average person in his lifetime can get to know, on a personal, face-to-face basis, only about 700 people-and surely one can know well only a much smaller number. The precise number is not important: the point is that it is dictated by the limitations of human behavior and is not greatly affected by urban population growth, by speed of transportation and communication, by affluence, or by any other technologically induced change in the human condition.

Indeed, these changes underlie the problem as we know it. Although the number of people with whom we have intimate face-to-face communication during a lifetime remains constant, we are in close proximity to more and more people.

We are, moreover, a great deal more dependent on one another than we used to be when American society was largely agrarian. We are all committed together in planning and paying for highways and welfare. We pollute each other's water and air. We share the risks and the costs of our military-industrial complex and the foreign policy which it serves. Technology, while aggravating the selfishly independent consumption of common resources, has made communications beyond the circle of intimacy both more awkward and more urgent.

Beyond the circle of intimacy, what kind of communications make sense? Surely most of us do not demand personal interactions with "all those other people." Yet in order to participate realistically in the decisions of industry and commerce, and in government programs to aid and regulate the processes which affect us intimately, we as citizens need to communicate with and understand the whole cross-section of other citizens.

Does technology help us in this? Can it help us do it better? We may now dial on the telephone practically anywhere in the world, to hear and be heard with relatively high fidelity and convenience. We may watch on our television sets news as it breaks around the world and observe our President as though he were in our living room. We can communicate individually with great flexibility; and at our own convenience we can be spectators en masse to important events.

But effective governance in a democracy requires more than this. It requires that citizens, in various ways and with respect to various public issues, can make their preferences known quickly and conveniently to those in power. We now have available two obvious channels for such "citizen feedback." First, we go to the polls roughly once a year and vote for a slate of candidates; second, we write letters to our elected representatives. There are other channels by which we make our feelings known, of course-by purchasing power, by protest, etc. But the average citizen wields relatively little influence on his government in these latter ways. In terms of effective information transmitted per unit time, none of the presently available channels of citizen feedback rivals the flow from the centers of power outward to the citizens via television and the press.

* The research at M.I.T. described herein is supported under
National Science Foundation Grant GT-16, "Citizen Feedback
and Opinion Formulation" and a project "Citizen Involvement in
Setting Goals for Education in Massachusetts" with the
Massachusetts Department of Education.


What is it that stands in the way of using technology
for greater public participation in the important compromise decisions of government, such as whether we
build a certain weapon, or an S.S.T., or what taxes we
should pay to fund what federal program, or where the
law should draw the line which may limit one person's
freedom in order to maintain that of others?
Somehow in an earlier day decisions were simpler
and could involve fewer people-especially when it
came to the use of technology. If the problem was to
span a river and if materials and the skills were available, you went ahead and built the bridge. It would be
good for everyone. Thus with other blessings of technology. There seemed little question that higher
capacity machines of production or more sophisticated
weapons were inherently better. There seemed to be an
infinite supply of air, water, land, minerals, and energy.
Today, by contrast, every modern government policy
decision is in effect a compromise-and the advantages
and disadvantages have to be weighed not only in
terms of their benefits and costs for the present clientele,
but also for future generations. We are interdependent
not only in space but in time.
Such complex resource allocation and benefit-cost
problems have been attacked by the whole gamut of
mathematical and simulation tools of operations
research. But these "objective" techniques ultimately
depend upon subjective value criteria-which are valid
only so far as there are effective communication
procedures by which people can specify their values in
useful form.
THE FORMAL SOCIAL CHOICE PROBLEM
The long-run prospects are bright, I think, that new
technology can play a major role in bringing the
citizenry together; individually or in small groups,
communicating and participating in decisions, not only
to help the decision makers but also for the purpose of
educating themselves and each other. Hardware in
itself is not the principal hurdle. No new breakthroughs
are required. What is needed, rather, is a concerted
effort in applying present technology to a very classical
problem of economics and politics called "social
choice"-the problem of how two or more people can
communicate, compare values or preferences on a
common scale, and come to a common judgment or
preference ordering.
Even when we are brought together in a meeting
room it is often very awkward to carry on meaningful
communication due to lack of shared assumptions, fear
of losing anonymity or fear of seeming inarticulate,
etc. Therefore, a few excitable or most articulate

persons may have the floor to themselves while others,
who have equally intense feelings or depth of knowledge
on the subject, may go away from the meeting having
had little or no influence.
It is when we consider the electronic digital computer
that the major contributions of technology to social
choice and citizen feedback are foreseen. Given the
computer, with a relatively simple independent data
channel to each participant, one can collect individual
responses from all participants and show anyone the
important features of the aggregate-and do this, for
practical purposes, instantaneously.
Much of the technology for such a system exists today.
What is needed is thoughtful design-with emphasis on
how the machine and the people interact: the way
questions are posed to the group participants; the
design of response languages which are flexible enough
so that each participant can "say" (encode) his reaction
to a given question in that language, yet simple enough
for the computer to read and analyze; and the design of
displays which show the "interesting features" or
"pertinent statistics" of the response data aggregate.
This task will require an admixture of experimental
psychology and systems engineering. It will be highly
empirical, in the same way that the related field of
computer-aided learning is highly empirical.
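A toy version of the collect-and-display loop is easy to write down; the response scale, the confidence flag, and the particular statistics reported back are arbitrary choices made for the sketch, not a proposal for the response language itself.

from collections import Counter
import statistics

# Each participant encodes a reaction on a fixed scale plus a confidence flag;
# the scale and flags here are invented for the illustration.
SCALE = {-2: "strongly oppose", -1: "oppose", 0: "neutral", 1: "favor", 2: "strongly favor"}

def aggregate(responses):
    """Summarize the response aggregate for display back to the group."""
    votes = [v for v, _ in responses]
    return {
        "n": len(votes),
        "mean": statistics.mean(votes),
        "median": statistics.median(votes),
        "spread": statistics.pstdev(votes),
        "histogram": Counter(SCALE[v] for v in votes),
        "low_confidence": sum(1 for _, c in responses if c == "unsure"),
    }

responses = [(2, "sure"), (1, "sure"), (-1, "unsure"), (0, "sure"), (2, "unsure")]
print(aggregate(responses))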
The central question is, how can we establish scales
of value which are mutually commensurable among
different people? Many of the ancient philosophers
wrote about this problem. The Englishmen Jeremy
Bentham and John Stuart Mill first developed the idea
of "utility" as a yardstick which could compare different
kinds of things and events for the same person. More
recently the American mathematician Von Neumann
added the idea that not only is the worth of an event
proportional to its utility, but that of an unanticipated
event is proportional also to the probability that it will
happen.1 This simple idea created a giant step in
mathematically evaluating combinations of events
with differing utilities and differing probabilities-but
again for a single person.
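Stated as a formula (in modern notation, not notation taken from this paper), the expected utility of a prospect whose outcomes x_i have utilities u(x_i) and probabilities p_i is

E[U] = \sum_i p_i \, u(x_i)

so, for example, an outcome worth u = 10 that has only a 0.3 chance of occurring contributes 3 to the prospect's worth.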
The recent history of comparing values for different
people has been a discouraging one-primarily because
of a landmark contribution by economist Kenneth
Arrow.2 He showed that, if you know how each of a set
of individuals orders his preferences among alternatives,
there is no procedure which is fair and will always work
by which, from this data, the group as a whole may
order its preferences (i.e., determine a "social choice").
In essence he made four seemingly fair and reasonable
assumptions: (1) the social ordering of preferences is
to be based on the individual orderings; (2) there is no
"dictator" whom everyone imitates; (3) if every
individual prefers alternative A to alternative B, the

Technology for Group Dialogue and Social Choice

society will also prefer A to B; and (4) if A and B are
on the list of alternatives to be ordered, it is irrelevant
how people feel about some alternative C, which is not
on the list, relative to A and B. Starting from these
assumptions, he showed (mathematically) that there is
no single consistent procedure for ordering alternatives
for the group which will always satisfy the assumptions.
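The flavor of the difficulty can be seen without the full theorem: under pairwise majority rule (one natural procedure for building the social ordering from the individual orderings), three perfectly consistent voters can already produce an inconsistent group preference. The three-voter example below is the classic textbook cycle, not material from Arrow's proof.

from itertools import combinations

# Three voters, each with a transitive individual ordering (best to worst).
ballots = [("A", "B", "C"),
           ("B", "C", "A"),
           ("C", "A", "B")]

def majority_prefers(x, y):
    """True if a majority of voters rank x above y."""
    wins = sum(1 for b in ballots if b.index(x) < b.index(y))
    return wins > len(ballots) / 2

for x, y in combinations("ABC", 2):
    if majority_prefers(x, y):
        print(f"group prefers {x} to {y}")
    else:
        print(f"group prefers {y} to {x}")
# The pairwise results form a cycle (A over B, B over C, C over A), so no
# consistent group ordering exists even though every individual ordering is.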
A number of other theoreticians in the area have
challenged Arrow's theorem in various ways, particularly through challenging the "independence of irrelevant alternatives" assumption. The point here is that
things are never evaluated in a vacuum but clearly are
evaluated in the context of circumstance. A further
charge is a pragmatic one: while Arrow proves inconsistencies can occur, in the great majority of cases
likely to be encountered in the real world they would
not occur, and if they did they probably would be of
minor significance.
There are many other complicating factors in social
choice, most of which have not been, and perhaps
cannot be, dealt with in the systematic manner of
Arrow's "impossibility theorem."2 For example, there
is the very fundamental question of whether the
individual parties involved in a group choice exercise
will communicate their true feelings and indicate their
uncertainties, or whether they will falsify their feelings
so as to gain the best advantage for themselves.
Further difficulties arise when we try to include in the
treatment the effects of differences among the participants along the lines of intensity-of-feelings vs.
apathy, or knowledge vs. ignorance, or "extended-sympathy" vs. selfishness, or partial vs. complete
truthfulness; yet these are just the features of the social
choice problem as we find it in practice.
To take as an ultimate goal the precise statement of
social welfare in mathematical terms is, of course,
nonsense. The differing experiences of individuals (and
consequently differing assumptions) ensure that commensurability of values will never be complete. But this
difficulty by no means relieves us of the obligation to
seek value-commensurability and to see how far we can
go in the quantitative assessment of utility. By making
our values more explicit to one another we also make
them more explicit to ourselves.
POTENTIAL CONTRIBUTIONS OF
ELECTRONICS
Electronic media notwithstanding, none of the newer
means of communication yet does what a direct face-to-face group meeting (town meeting, class bull session)
does-that is, permit each participant to observe the
feelings and gestures, the verbal expressions of approval
or disapproval, or the apathetic silence-which may
accompany any proposal or statement.

Figure 1-General paradigm for citizen feedback (right) added to top-down communication (left)

As a group
meeting gets larger, observation of how others feel
becomes more and more difficult; and no generally
available technology helps much. Telephone conference
calls, for example, while permitting a number of people
to speak and be heard by all, are painfully awkward and
slow and permit no observation of others' reaction to
any given speaker. The new Picture-Phone will eventually permit the participants in a teleconference to see
one another; but experiments with an automatic system
which switches everyone's screen to the person who is
talking reveal that this is precisely what is not
wanted-teleconferees would like most to observe the
facial expressions of the various conferees who are not
talking!
One can imagine a computer-aided feedback-and-participation system taking a variety of forms, all of
which are more or less characterized by Figure 1.
For example:
(1) A radio talk show or a television "issue" program
may wish to enhance its audience participation
by listener or viewer votes, collected from each
participant and fed to a computer. Voters may be
in the studio with electronic voting boxes or at
home where they render their vote by calling a
special telephone number. The NET "Advocates" program has demonstrated both.
(2) Public hearings or town meetings may wish to
find out how the citizenry feel about proposed
new legislation-who have intense feelings, who
are apathetic, who are educated to the facts
and who are ignorant-and correlate these
responses with each other and with demographic
data which participants may be asked to volunteer. Such a meeting could be held in the town
assembly hall, with a simple pushbutton console
wired to each seat.

(3) Several P.T.A.s or alternatively several eighth
grades in the town may wish to sponsor a feedback meeting on sex education, drugs, or some
other subject where truthfulness is highly in
order but anonymity may be desired. Classrooms
at several different schools could be tied together
by rented "dedicated" telephone lines for the
duration of the session.
(4) A committee chairman or manager or salesman
wishes to present some propositions and poll his
committee members, sales representatives,
etc., who may be stationed at telephone consoles
in widely separated locations, or may be seated
before special intercom consoles in their own
offices (which could operate entirely independently of the telephone system).
(5) A group of technical experts might be called
upon to render probability estimates about
some scientific diagnosis or future event which is
amenable to before-the-fact analysis. This process may be repeated, where with each repetition
the distribution of estimates is revealed to all
participants and possibly the participants may
challenge one another. This process has been
called the "Delphi Technique" after the oracle,
and has been the subject of experiments by the
Rand Corporation and the Institute for the
Future,3 and by the University of Illinois.4
Their experience suggests that on successive
interactions even experts tend to change their
estimates on the basis of what others believe
(and possible new evidence presented during
the challenge period).
(6) A duly elected representative in the local, state
or national government could ask his constituency questions and receive their responses.
This could be done through radio or television
or alternatively could utilize a special van,
equipped with a loudspeaker system, a rear-lighted projection/display device, and a number
of chairs or benches which could be set up
rapidly at street corners prewired with voter-response boxes and a small computer.
These examples point up one very important aspect
of such citizen feedback or response-aggregation
systems: that is that they can educate and involve
the participants without the necessity that the responses formally determine a decision. Indeed the
teaching-learning function may be the most important.
It demands careful attention to how questions are
posed and presented, what operations are performed by
the computer on the aggregated votes and what
operations are left out, how the results are displayed,

and what opportunity there is for further voting and
recycling on the same and related questions.
Some skeptics feel that further technocratic invasion
of participatory democracy should be prevented rather
than facilitated-that the whole idea of the "computerized referendum" is anathema, and that the forces
of repression will eventually gain control of any such
system. They could be correct, for the system clearly
presupposes competence and fairness in phrasing the
questions and designing the alternative responses.
But my own fear is different. It is that, propelled by
the increasing availability of glamorous technology and
spurred on by hardware hucksters and panacea pushers,
the community will be caught with its pilot experiments
incomplete or never done.
THE STEPS IN A GROUP FEEDBACK
SESSION
Seven formal steps are involved in a technologically
aided interchange of views on a social-choice question:
(1) The leader states the problem, specifies the
question, and describes the response alternatives
from which respondents are to choose.
(2) The leader (or automated components of the
system) explains what respondents must do in
order to communicate their responses (including,
perhaps, their degree of understanding of the
question, strength of feeling, and subjective
assessment of probabilities).
(3) The respondents set into their voting boxes
their coded responses to the questions.
(4) The computer interrogates the voting boxes and
aggregates the response data.
(5) Preselected features of this response-aggregate
are displayed to all parties.
(6) The leader or respondents may request display of
additional features of the response aggregate, or
may volunteer corrections or additional information.
(7) Based upon an a priori program, on previous
results and/or on requests from respondents, the
leader poses a new problem or question, restarting the cycle from Step 1.
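Computationally the seven steps form a simple loop; the sketch below runs one pass of that loop with invented stand-ins (randomly simulated six-switch consoles and a text histogram) in place of the real hardware, leader, and display equipment described in this paper.

    import random
    from collections import Counter

    # Hypothetical stand-in for one pass through the seven-step cycle.
    # Consoles are simulated by random choices; a real session would poll
    # hardware voting boxes and leave question framing to a human leader.

    def read_switch(console_id, n_switches=6):
        # Steps 3-4 stand-in: pretend each respondent throws exactly one switch.
        return random.randrange(n_switches)

    def tally(responses, n_switches=6):
        # Step 4: aggregate the response data into per-switch counts.
        counts = Counter(responses)
        return [counts.get(i, 0) for i in range(n_switches)]

    def display(aggregate):
        # Step 5: one preselected feature of the aggregate -- a text histogram.
        for i, n in enumerate(aggregate, start=1):
            print(f"switch {i}: {'#' * n} ({n})")

    def run_cycle(question, n_consoles=32):
        print(question)                                 # Step 1 (Step 2 is left to the leader)
        responses = [read_switch(c) for c in range(n_consoles)]
        display(tally(responses))
        # Steps 6-7: further displays, corrections, and the choice of the
        # next question are left to the leader and the respondents.

    run_cycle("Student attendance should be: 1) compulsory ... 6) object")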
The first step is easily the most important-and also
the most difficult. Clearly the participant must understand at the outset something of the background to any
specific question he is asked, he must understand the
question itself in nonambiguous terms, and he must
understand the meaning of the answers or response
alternatives he is offered. This step is essentially the
same as is faced by the designer of any multiple-choice
test or poll, except that there is the possibility that a
much richer language of response can be made available
than is usually the case in machine-graded tests.
Allowed responses may include not only the selection of
an alternative answer, but also an indication of intensity
of feeling, estimates of the relative probability or
importance of some event in comparison with a standard, specification of numbers (e.g. allowable cost)
over a large range, and simple expressions of approval
("yea!") or disapproval ("boo!").
The leader may have to explain certain subtleties of
voting, such as whether participants will be assumed to
be voting altruistically (what I think is best for everyone) or selfishly (what I think is best for me alone, me
and my family, etc.). Further, he may wish respondents
to play roles other than themselves (if you were a
person under certain specified circumstances, how would
you vote?).
He may also wish to correlate the answers with
informedness. He may do this by requesting those who
do not know the answer to some test question to refrain
from voting, or he can pose the knowledge test question
before or after the issue question and let the computer
make the correlation for him.

Figure 2-Sample categories of response for a six-switch console.
The categories shown include: identification of self (Republican,
Democrat, Independent; Protestant, Catholic, Jew; with unswitched
positions read as "unregistered or other party" or "other or none");
expressions of feeling and experience (intensely interested, mildly
interested, uninterested; daily, occasional, or no experience); four
numerical categories plus two administrative categories (less than 10%,
10 to 30%, 30 to 60%, greater than 60%, don't know, don't understand);
three alternatives plus three administrative categories (I want plan 1,
I want plan 2, I want plan 3, undecided as to plans, object to available
plans, confused by procedure); rank ordering of three alternatives
A, B, C; response to interpersonal communication of actors (agree,
disagree, am bored); and selection of one of eight answers on each of
two questions, with dots under each answer indicating the switches to
be thrown.
Insuring the participants "play fair," own up to their
uncertainties, vote as they really feel, vote altruistically
if asked, and so on, is extremely difficult. Some may
always regard their participation in such social interaction as an advocacy game, where the purpose is to
"win for their side."
The next two steps raise the question of what equipment the voter will have for communicating his responses. At the extreme of simplicity a single on-off
switch generates a response code which is easily interpreted by the computer, but limiting to the user. At the
other extreme, if responses were to consist of natural
English sentences typed on a conventional teletypewriter-which would certainly allow great flexibility
and variety in response-the computer would have no
basis for aggregating and analyzing responses on a
commensurate basis (other than such procedures as
counting key words). Clearly something in between is
called for; for example, a voting box might consist of
ten on-off switches to use in various combinations, plus
one to indicate "ready," plus one "intensity" knob.
An unresolved question concerns how complex a
single question can be. If the question is too simple, the
responses will not be worth collecting and will provide
little useful feedback. If too complex, encoding the
responses will be too difficult. The ten switches of the voting box suggested above would have the potential (considering all combinations of on and off) for 2^10 = 1024 alternatives, but that is clearly too many for the useful answers to any one question.
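The arithmetic of that response space is easy to make concrete; the sketch below (a hypothetical encoding, not any particular voting box actually built) packs the ten switch positions into a single integer and counts the 2^10 combinations.

    # Illustrative encoding for a hypothetical ten-switch voting box: the
    # set of switches thrown is packed into one integer for transmission.

    N_SWITCHES = 10

    def encode(switches_thrown):
        """switches_thrown: iterable of switch numbers 0-9 that are ON."""
        word = 0
        for s in switches_thrown:
            word |= 1 << s
        return word

    def decode(word):
        """Return the list of switch numbers that are ON in a packed response."""
        return [s for s in range(N_SWITCHES) if word & (1 << s)]

    print(2 ** N_SWITCHES)       # 1024 possible on/off combinations
    print(encode([0, 3, 9]))     # 521
    print(decode(521))           # [0, 3, 9]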
It is probably a good idea, for most questions, to
have some response categories to indicate "understand
question but am undecided among alternatives" or
"understand question and protest available alternatives" or simply "don't understand the question or
procedures," three quite different responses. If a
respondent is being pressured by a time constraint,
which may be a practical necessity to keep the process
functioning smoothly, he may want to be able to say,
"I don't have time to reach a decision"; this could
easily be indicated if he simply fails to set the "done"
switch. Some arrangement for "I object to the questions
and therefore won't answer" would also be useful as a
guide to subsequent operations and may also subsume
some of the above "don't understand" categories.
Figure 2 indicates various categories of response for a
six-switch console.
The fourth step, in which the computer samples the
voting boxes and stores the data, is straightforward as
regards tallying the number of votes in each category
and computing simple statistics. But extracting meaning from the data requires that someone should have
laid down criteria for what is interesting; this might be
done either prior to or during the session by a trained
analyst.
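One prespecified criterion of the kind such an analyst might lay down is the cross-tabulation of issue votes against a knowledge-test vote, as suggested earlier; the sketch below (with invented category labels and data) shows how little computation that particular feature requires.

    from collections import Counter

    # Hypothetical cross-tabulation of an issue vote against a knowledge-test
    # vote. Each respondent's record is (issue_choice, answered_test_correctly).

    records = [
        ("plan 1", True), ("plan 2", False), ("plan 1", True),
        ("plan 3", True), ("plan 2", True),  ("plan 1", False),
    ]

    counts = Counter(records)
    for choice in sorted({issue for issue, _ in records}):
        informed = counts[(choice, True)]
        uninformed = counts[(choice, False)]
        print(f"{choice}: {informed} informed, {uninformed} uninformed")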
It is at this point that certain perils of citizen feedback
systems arise, for the analyst could (either unwittingly
or deliberately) distort the interpretation of the voting
data by the criteria he selects for computer analysis and
display. Though there has been much research on
voting behavior and on methods of analyzing voting
statistics, instantaneous feedback and recycling poses
many new research challenges.
That each man's vote is equally important on each
question is a bit of lore that both political scientists
and politicians have long since discounted-at least in
the sense that voters naturally feel more intensely
about some issues than about others. One would, therefore, like to permit voters to weight their votes according to the intensity of their feeling. Can fair means
be provided?
There are at least two methods. One long-respected
procedure in government is bargaining for votes-"I'll vote with you on this issue if you vote with me on
that one." But in the citizen-feedback context, negotiating such bargains does not look easy. A second
procedure would be to allocate to each voter, over a
set of questions, a fixed number of influence points, say
100; he would indicate the number of points he wished
the computer to assign to his vote on each question,
until he had used up his quota of 100 points, after which
the computer would not accept his vote. (Otherwise,
were votes simply weighted by an unconstrained
"intensity of feeling" knob, a voter would be rather
likely to set the "intensity of feeling" to a maximum
and leave it there.)
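The bookkeeping for such a quota is minimal; the sketch below (the 100-point figure comes from the text, the rest is invented for illustration) has the computer refuse any weighted vote that would overrun a voter's remaining points.

    # Sketch of an influence-point quota: each voter starts with 100 points
    # and the computer refuses a weighted vote once the quota is spent.

    QUOTA = 100

    class Voter:
        def __init__(self):
            self.points_left = QUOTA

        def weighted_vote(self, question, choice, points, tallies):
            if points > self.points_left:
                return False                     # not accepted; quota exhausted
            self.points_left -= points
            tallies.setdefault(question, {}).setdefault(choice, 0)
            tallies[question][choice] += points
            return True

    tallies = {}
    v = Voter()
    print(v.weighted_vote("school budget", "yes", 60, tallies))   # True
    print(v.weighted_vote("new bylaw", "no", 50, tallies))        # False: only 40 points remain
    print(tallies)                                                # {'school budget': {'yes': 60}}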
A variant on the latter is a procedure developed at
the University of Arizona5 wherein a voter may assign
his 100 points either among the ultimate choices or
among the other voters. Provided each voter assigns
some weight to at least one ultimate alternative an
eventual alternative is selected, in some cases by a
rather complex influence of trust and proxy.
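The SPAN procedure itself is specified in Reference 5; purely to illustrate the kind of bookkeeping such trust-and-proxy schemes involve, the sketch below resolves a small set of hypothetical allocations by repeatedly forwarding points held by voters until the points come to rest on an alternative.

    # Illustrative resolution of proxy allocations (not the SPAN algorithm
    # itself). Each voter splits 100 points among alternatives and/or other
    # voters; points assigned to a voter are forwarded in proportion to
    # that voter's own allocation.

    allocations = {
        "voter1": {"A": 100},
        "voter2": {"voter1": 60, "B": 40},
        "voter3": {"voter2": 100},
    }
    alternatives = {"A", "B"}

    totals = {alt: 0.0 for alt in alternatives}
    pending = [(target, float(points))
               for alloc in allocations.values()
               for target, points in alloc.items()]

    for _ in range(50):                   # enough passes for this small example
        forwarded = []
        for target, points in pending:
            if target in alternatives:
                totals[target] += points
            else:
                share = sum(allocations[target].values())
                for nxt, pts in allocations[target].items():
                    forwarded.append((nxt, points * pts / share))
        pending = forwarded
        if not pending:
            break

    print(sorted(totals.items()))         # [('A', 220.0), ('B', 80.0)] -- A is selected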
Step five, the display of significant features of the
voting data, poses interesting challenges concerning
how to convey distributional or statistical ideas to an
unsophisticated participant, quickly and unambiguously.
The sixth step provides an opportunity for non-planned feedback-informal exposition, challenges to
the question, challenges to each other's votes, and
verbal rebuttal-in other words a chance to break free
of the formal constraints for a short time. This is a time
when participants can seek to influence the future
behavior of the leader-the questions he will ask, the

response alternatives he will include, and the way he
manages the session.
EXPERIMENTS IN PROGRESS
Experiments to date have been designed to learn as
much as possible as quickly as possible from "real"
situations. Because the mode of group dialogue discussed above introduces so many new variables, it was
believed not expedient to start with controlled laboratory experiments, though gradually we plan to make
controlled comparisons on selected experimental conditions. But the initial emphasis has been on plunging
into the "real world" and finding out "what works."
Experiments in a semi-laboratory setting
within the university

In one set of experiments in the Man Machine
Systems Laboratory at M.I.T. the group feedback
system consists of fourteen hand-held consoles, each
with ten on-off switches, a continuous "adjust" knob
and a "done" switch. The consoles are connected by
wire to a PDP-8 computer with a scope display output.
Closed circuit television permits simulation of a meeting
where questions are being posed and results aggregated
at some distant point (e.g., a television station in
another city) and where respondents may sit together
in a single meeting room or may be located all at
different places. Various aggregation display programs
are available to the discussion leader, the simplest of
which is a histogram display indicating how many
people have thrown each switch. Other data reduction
programs are also available, such as the one described
above permitting voters to give a percentage of their
votes to another voter. A variety of small group
meetings, seminars and discussions have been held
utilizing this equipment.
Two kinds of leadership roles have been tried. The
first is where a single leader makes statements and poses
questions. Here, among other things, we were concerned
with whether respondents, if constrained to express
themselves only in terms of the switches, can "stay with
it" without too much frustration and can feel that they
are part of a conversation. Thus far, for this type of
meeting, we have learned the following:
(1) Questions must be stated unambiguously. We
learned to appreciate the subtle ways in which
natural language feedback permits clarification
of questions or propositions. Often the questioner doesn't understand an ambiguity in his
statement-where a natural language response
from one or two persons chosen at random only
for the purpose of clarifying the question is often
well worth the time of others, though this by no
means obviates the need to have some "I don't
understand" or "I object" categories.
(2) The leader should somehow respond to the
responses of the voters. If he can predicate his
next question or proposition on the audience
response to the last one, so much the better.
Otherwise he can simply show the audience that
indeed he knows how the vote on the last
question turned out and freely express his
surprise or other reaction. In cases where the leader seemed not to be interested in the responses and simply ground through a
programmed series of questions, the audience
quickly lost interest.
(3) Anonymity can be very important, and, if safeguarded, permits open "discussion" in areas
which otherwise would be taboo. For example, we
have conducted sessions on drug use, in which
students, faculty, and some total strangers quite
freely indicated how often they use certain drugs
and where they get them. Such discussions, led
unabashedly by students (who knew what and
how to phrase the key questions!) resulted in a
surprising freedom of response. (We made the
rule that voters had to keep their eyes on the
display, not on each other's boxes, though a small
voting box can easily be held close to the chest
to obstruct others' view of which switches are
being thrown.) It was found especially important, for this kind of topic, not to display any
results until all were in.
In the same semi-laboratory setting described above
we have experimented with a second kind of leadership
role. Here two or more people "discuss" or "act" and
the audience continuously votes with "yea," "boo,"
"slow down and explain," "speed up and go on to
another topic" type response alternatives. Voters were
happy to play this less direct role but perhaps for a
shorter time than in the direct response role described
above. Again it proved of great importance that the
central actors indicate that they saw and were interested in how the voters voted.
Experiments with citizen group meetings
using portable equipment
As of this writing five group meetings have been
conducted in the Massachusetts towns of Stoneham,
Natick, Manchester, Malden and Lowell to assist the
Massachusetts Department of Education in a program
of setting educational goals. In each case cross sections
of interested citizens were brought together by invitation of persons in each community to "discuss educational goals." Four similar meetings were conducted
with students and teachers at a high school in Newton,
Massachusetts. (A similar meeting was also held in a
church parlor in Newton to help the members of that
church resolve an internal political crisis.) All groups
ranged in size from twenty to forty, though at any one time only thirty-two could vote, since only that number of voting boxes had been built.
The portable equipment used for these meetings,
held variously in church assembly halls, school classrooms and television studios, features small hand-held
voting boxes, each with six toggle switches, connected
by wire to substations (eight boxes to a substation,
each of the latter containing digital counting logic)
which in turn are series connected in random order to
central logic and display hardware. The display
regularly used to count votes is a "nixie tube" type
display of the six totals (number of persons activating
each of the six switches). The meeting moderator,
through a three position switch, can hold the numbers
displayed at zero, set it in a free counting mode, or lock
the count so that it cannot be altered. A second display,
little used as yet, is a motorized bar graph to be used
either to display histogram statistics or to provide a
running indication of affective judgments such as
agree with speaker, disagree, too fast, too slow, etc.
The typical format for these meetings was as follows.
After a very brief introduction to the purpose of the
meeting and the voting procedure itself, several questions were asked to introduce members of the group to
each other (beyond what is obvious from physical
appearance) such as education, political affiliation,
marital status, etc. An overhead projector has been
used in most cases to ask the questions and record the
answers and comments (on the gelatin transparency)
since, unlike a blackboard, it need not be erased before
making a permanent record. Following the introduction,
the meetings proceeded through the questions, such as
those illustrated by Figure 3, and those posed by the
participants themselves. The categories of "object to
question" or "other" were used frequently to solicit
difficulties or concerns people had with the question
itself-its ambiguity, whether it was fair, etc. Asking
persons who voted in prepared categories to identify
themselves and state, after the fact, why they voted as
they did, was part of the standard procedure. Roughly
twenty questions, with discussion, can be handled in
1½ hours.
After the meeting, evaluations by the participants
themselves have suggested that the procedure does
indeed serve to open up issues, to draw out those who would otherwise
not say much, and generally to provide an enjoyable experience-in some
cases for three hours' duration.

How are preschool children best prepared for school?
                                        (as school      (as school
                                        now exists)     should be)
1) lots of parental love                     9              11
2) early exposure to books                   2               1
3) interaction with other kids              14               8
4) by having natural wonders and
   esthetic delights pointed out             5               3
5) unsure
6) object                                    1

Salient comments after vote ("as school now exists" and "as school
should be" not part of the question then): One man objected to "pointed
out" in 4), as it emphasized "instruction" rather than "learning."
Discussion on this point. Someone else wanted to get at "encouraging
curiosity." Another claimed, "That's what question says," and another,
"discover natural wonders." Consensus: "leave wording as is." Then a
lady violently objected that the vote would be different depending on
whether the voter was thinking of school as it now existed or as it
should be. Others agreed. Two categories were added. Above is the final
vote.

Student attendance should be:
1) compulsory with firm excuse policy                          11
2) compulsory with lenient excuse policy                        0
3) voluntary, with students responsible for material missed    12
4) voluntary, with teachers providing all reasonable
   assistance to pupils who miss class                          2
5) unsure                                                       3
6) object                                                       0

Comments centered on the feeling that some subjects require attendance
more than others do. (Note the 0 vote on category 2), which is
inappropriately self-contradictory.)

Figure 3-Typical questions and responses from the citizen meetings on
educational goals
EXTENDING THE MEETING IN SPACE
AND TIME
The employment of such feedback techniques in
conjunction with television and radio media appears
quite attractive, but there are some problems.
A major problem concerns the use of telephone networks for feedback. Unfortunately telephone switching
systems, as they presently work, do not easily permit
some of the functions one would like. For example, one
would like a telephone central computer to be able to
interrogate, in rapid sequence, a large number of
memory buffers (shift registers) attached to individual
telephones, using only enough time for a burst of ten or
so tone combinations (like touchtone dial signalling),
say about ½ second. Alternatively one might like to be
able to call a certain number, and, in spite of a temporary busy signal, in a few seconds have the memory
buffer interrogated and read over the telephone line.
However, with a little investigation one finds that
telephones were designed for random caller to called-

party connections, with a busy signal rejecting the
calling party from any further consideration and
providing no easily employed mechanism for retrieving
that calling party once the line is freed.
For this reason, at least for the immediate future, it
appears that for a large number (much more than
1,000) to be sampled on a single telephone line in less
than 15 minutes, even for simple count of busy signals,
is not practical.
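The arithmetic behind that judgment can be set out explicitly; the sketch below uses the half-second burst suggested above and a 15-minute window, ignoring dialing, busy signals, and retries.

    # Rough throughput estimate for interrogating memory buffers over a
    # single telephone line.

    seconds_per_interrogation = 0.5      # one burst of ten or so tone combinations
    available_seconds = 15 * 60          # a 15-minute window

    print(available_seconds / seconds_per_interrogation)   # 1800.0 responses at best

    # Even under these generous assumptions a single line tops out below
    # two thousand respondents in 15 minutes, which is why sampling "much
    # more than 1,000" this way is judged impractical above.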
One tractable approach for the immediate future is
to have groups of persons, 100 to 1,000, assembled at
various locations watching television screens. Within
each meeting room participants vote using hand-held
consoles connected by wire to a computer, which itself
communicates by telephone to the originating television
studio. Figure 4 illustrates this scheme.
Ten or more groups scattered around a city or a
nation can create something approaching a valid
statistical sample, if statistical validity is important,
and within themselves can represent characteristic
citizen groups (e.g. Berkeley students, Detroit hardhats,
Iowa farmers, etc.). Such an arrangement would easily
permit recycling over the national network every few
minutes and within anyone local meeting room some
further feedback and recycling could occur which is not
shared with the national network.
Cable television, because of its much higher bandwidth, has the capability for rapid feedback from smaller
groups or individuals from their individual homes. For
example, even part of the 0-54 MHz band (considered as the best prospect for return signals6) is more than
adequate theoretically for all the cable subscribers in a
large community, especially in view of time sharing
possibilities.
The above considerations are for extensions in space.

Figure 4-Multi-group arrangement for television audience response

One may also consider extensions in time, where a single
"program" extends over hours or days and where each
problem or question, once presented on television, may
wait until slow telephone feedback or even mail returns of the IBM card or newspaper "issue ballot"7 variety come in.
Development of such systems, fraught with at least
as many psychological, sociological, political and
ethical problems as technological ones, will surely have
to evolve on the basis of varied experiments and hard
experience.
REFERENCES
1 J VON NEUMANN O MORGENSTERN
Theory of games and economic behavior
Princeton University Press 2nd edition 1947
2 K ARROW
Social choice and individual values
John Wiley New York 1951
3 N DALKEY O HELMER
An experimental application of the Delphi Method to the use of experts
Management Science No 9 1963
4 C E OSGOOD S UMPLEBY
A computer-based system for exploration of possible systems for Mankind 2000
Mankind 2000 pp 346-359 Allen and Unwin London
5 W J MACKINNON M K MACKINNON
The decisional design and cyclic cooperation of SPAN
Behavioral Science Vol 14 No 3 pp 244-247 May 1969
6 The third wire: cable communication enters the city
Report by Foundation 70 Newton Massachusetts March 1971
7 C H STEVENS
Citizen feedback, the need and the response
MIT Technology Review pp 39-45 Cambridge Massachusetts

Structuring information for a computer-based
communications medium*
by STUART UMPLEBY
University of Illinois
Urbana, Illinois

Several years ago Prof. Charles E. Osgood suggested
that it might be possible to develop a program for a
computer-based education system which would eventually allow the public, possibly at a world's fair, to
"explore the future."1 Such an "exploration" would
be useful both for education and for social science
research. This paper is a progress report on the continuing development of that "exploration of alternative
futures" using the PLATO system (see Figure 1).
The educational function of the exploration is accomplished by exposing the "explorer" to four types
of information:

1. A list of developments possible in the future.
2. The model of "reality interaction" used in the computer program.
3. The decision-making procedure, including making investments to
change probabilities.
4. The operation of a teaching computer.

The data from these exercises can reveal which developments people
consider desirable, which developments they are most familiar with, and
which developments they are most interested in. Data from several
demonstrations of early versions of the exploration will be discussed
later in this paper.
In short this work originated neither as an attempt to develop software
for a new communications medium nor from an interest in conducting
on-line Delphi studies as a decision-making aid. The direction in which
the research moved resulted in part from the nature of the medium and
in part from our concern with increasing public participation in
decision-making processes.

* The research described here was conducted using the PLATO system at
the Computer-based Education Research Laboratory of the University of
Illinois at Urbana-Champaign. The laboratory is supported in part by
the National Science Foundation under grants NSF GJ81 and GJ 974; in
part by the Advanced Research Projects Agency under grant ONR Nonr
3985 (08); in part by Project Grant NPG-188 under the Nurse Training
Act of 1964, Division of Nursing, Public Health Service, U.S. Dept. of
Health, Education and Welfare; and in part by the State of Illinois.

THE EVOLUTION OF THE PROJECT

After the project had been under way for a year or two, it appeared
that this work could have applications beyond simply education and
social science research. This belief resulted from the projected growth
of the PLATO system and the ease with which the computer program for
the exploration could be modified to deal with problems other than the
general future of mankind.

A glance at the past

The PLATO III system, which we have been using, is capable of operating
20 terminals simultaneously. However, the PLATO IV system, scheduled
for completion in 1974 or 1975, is being designed to operate 4000
terminals simultaneously (see Figure 2).
When the first wave of interest in computer-aided instruction began in
the early 1960s there were basically two questions which had to be
answered. First, could students learn educational material as rapidly
and retain it as well if they were taught using a computer terminal
rather than by sitting in a classroom? Second, was computer-aided
instruction economically competitive with classroom instruction? After
a decade of educational experiments at the University of Illinois and
elsewhere, the answer to the first question is emphatically yes.2 But
computer-aided instruction is not economically competitive using the
technology available in 1960. It was this realization-that the crucial
problem lay in the cost of the equipment and its operation-that led to
the invention at the University of Illinois of the plasma display panel
and the development of the PLATO IV system.3
A computer-based education system which is both educationally effective
and economically competitive will in all probability be adopted
throughout the United States and around the world in due course. With
the prospect of teaching computer terminals in most classrooms, if not
most homes, within a few decades, we began to think of this equipment
as a new kind of mass communications medium.

Figure 1-The PLATO III system provides each student with an electronic
keyset as a means of communicating with the computer and a television
display for viewing information selected or generated by the computer

Figure 2-Using terminals such as the one pictured above, the PLATO IV
system, scheduled for completion in 1974, will provide a high quality
color display at low cost. The terminals will be connected to the
computer over standard voice-grade telephone lines

Generations of communications media

If radio and television are thought of as first and second generation
mass communications systems, then perhaps the PLATO system could be
thought of as a forerunner of a third generation mass communications
system. Originally, we felt that built-in feedback would be the
characteristic distinguishing computer-based communications systems
from radio and television. However, due to the feedback possibilities
of cable television, it now seems more appropriate to list four
"generations" of electronic communications media now existing in at
least prototype form.

1. Radio transmits audio messages from the center to the periphery.
2. Television transmits audio and visual messages from the center to
the periphery.
3. Cable television provides a great increase in the number of
available channels and the possibility of both passive feedback
(monitoring what people watch) and active feedback (for example, voting
by pressing a button on the television set).
4. Computer-based communications systems have several new
characteristics.

a. Less simultaneity: Although many people may be using the program,
each may be in a different part of the program. Thus everyone on one
"channel" does not see the same thing at the same time.
b. Less evanescence: With radio and television a listener or viewer
cannot go back if he misses a word or sentence (unless he has a tape
recorder). With PLATO each individual progresses at his own rate. The
display does not change until he wants it to, and he can go back to
review previous displays.
c. Viewer-designed programs: With PLATO the viewer can ask for
additional information or can jump ahead if he becomes bored, thus to
some extent designing his own program.

For the sake of clarity it should be noted that the displays for
computer-based communications systems are generally static, like color
slides, rather than dynamic, like movies or television. Messages are
conveyed primarily by the use of words, frequently supplemented by
tables or graphs and occasionally by drawings and pictures. However,
the PLATO IV equipment, as contrasted with the PLATO III equipment,
will use audio as well as visual messages and will be better suited to
simple moving displays such as dancing stick figures or plotting out a
graph.

Applying the exploration to specific issues

Even before we began thinking of the PLATO system
as a new kind of mass communications medium and
not simply as a teaching device, we had thought of
writing explorations on a variety of different topics
such as disarmament, the future of education, and
urban planning. But with the probable widespread
acceptance of computer-based education in the next
few decades, it seemed that our work suggested the
possibility of using this equipment as a medium between
planners and the public for exchanging information and
opinions regarding community goals.4
The adaptability of the exploration resulted from the
fact that the decision-making framework could remain
the same for any problem area and that only the information units with the matrix giving the relationships
between them would need to be changed.
Thus the projected expansion of the PLATO system
and the ease with which the exploration could be
modified to deal with specific issues resulted in our
thinking of the teaching computer as a new kind of
mass communications medium particularly suited to
discussions among different interest groups about the
long range goals of a community. But how does one
present information on this new medium?
HOW DOES ONE DESCRIBE THE FUTURE?
Programming an exploration of the future required
suggesting possible future developments in a way which
emphasized their probabilistic nature and in a way
which could be easily manipulated by the explorer. We
needed a method of presenting possible future developments so that people could request additional information, make "investments" which would alter the
initial probabilities, and see the possible secondary
effects of their actions. Consequently we assumed that
the future, and also the present and the past, can be
described using "information units."
Features of information units

The features or components of an "information
unit," as described in an earlier report, were (1) a short
descriptive statement, (2) a background paragraph,
and (3) an associated probability or other measure.5
It now seems necessary to expand and revise this list of
features.
The background information (2) can involve graphs,
charts, pictures, drawings, and tape recordings in addition to written information.
Complete measurement (3) of an historical occurrence
requires the measurement itself, the date the measurement was made, and some indication of measurement
error. The measurement of a forecast requires the forecasted value of a parameter, the date at which that
value of the parameter is expected to occur, some indication of certainty about the forecast, and also the
date at which the forecast is made.
With respect to graphs of information units, the
horizontal axis will always be time. The vertical axis
which we have used so far for developments has been
probability, ranging from 0 to 100 percent. This kind
of scale requires a specific, identifiable event such as
50 percent of the nation's schools having computer-based education equipment or 50 percent of the population favoring the legalization of marihuana. A superior method of forecasting would lend itself more
easily to validation and would make possible the measurement of progress toward a goal as the years pass by.
Such a scale is suggested by the previous format.
Rather than measuring the probability of a particular
level of distribution of a technology, one can simply
measure the distribution itself. Similarly one can estimate the percentage of the population which will favor
a social development rather than the probability of a
particular degree of acceptance. Regarding the development of a technological capability, as opposed to its
diffusion once it is developed, one could list the stages
of development ranging from the original concept
through experimentation and prototype construction to
the first production model. However, this kind of
measure would use an ordinal rather than an interval
scale.
An important new feature which should be noted
explicitly when developing lists of information units is
the group or person suggesting a particular idea as important (4) and worthy of attention. This information
can usually be deduced either from the name and
affiliation of the person writing the paper or from the
list of people who took part in a Delphi exercise.
The person or group originating an idea is frequently
recorded. However, the thought behind recording this
data is usually either to aid in locating additional background material or to give credit where it is due. But
such information is also politically relevant. It is needed
so that other forecasters, public officials, and especially
the general public, will know whose ideas about the
future of society are represented in the total set of
forecasts and social indicators now being generated.
People in different walks of life, in different socioeconomic groups, will be subjected to different stresses
in their daily lives. Consequently they will define
different "problems" as being important for mankind
to solve. The intervention of politics into forecasting
cannot be avoided; we can only try to be aware of
possible sources of bias so that the interests of all
groups will be as fairly represented as possible.6

Information units can be divided into four categories:

1. Developments, including both social and technological developments,
refer to new characteristics of the social system.
2. Initiatives are actions taken by a group or an individual.
3. Events are sudden or unanticipated occurrences.
4. System variables, now more commonly called social indicators, are
measures of a system which fluctuate in time.
Two criteria are used to distinguish among these four
kinds of information units: the shape of the graph over
time and the extent of human control.

Categories of information units

Type of information unit    Shape of graph     Human control
Development                 S-curve            Many small decisions
Initiative                  Step function      Single large decision
Event                       Spike function     Very little control
Social indicator            Fluctuation        Regular adjustments

The earlier report suggested that the four categories
of information units could be thought of either as
"change-producing factors" (developments, events, and
initiatives) or as "system variables" (social indicators).
The distinction between these two larger categories lies
in the period of time during which the information unit
can usefully be of interest. System variables or social
indicators are of interest over an indefinite period of
time and so are used to monitor the behavior of social
systems. Change-producing factors can occur in a period
of hours or years but are of little interest outside of the
period of time during which they are producing change
in the social system.
Most mathematical models deal with the relationships among several "system variables," such as population, per capita income, gross national product, and
capital investment in agriculture. Delphi studies usually
concern themselves only with "change producing factors." An ideal exploration of the future would use
both system variables and change-producing factors.
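Pulling together the features (1)-(4) and the four categories above, an information unit could be represented by a record along the following lines; this is a hypothetical rendering, and the field names are ours rather than the project's.

    from dataclasses import dataclass, field
    from enum import Enum
    from typing import List, Optional

    class Category(Enum):
        DEVELOPMENT = "development"            # S-curve; many small decisions
        INITIATIVE = "initiative"              # step function; single large decision
        EVENT = "event"                        # spike function; very little control
        SOCIAL_INDICATOR = "social indicator"  # fluctuation; regular adjustments

    @dataclass
    class Measurement:
        value: float                    # measured or forecasted value of the parameter
        date_of_value: int              # year the value refers to
        date_made: int                  # year the measurement or forecast was made
        uncertainty: Optional[float] = None   # measurement error or forecast certainty

    @dataclass
    class InformationUnit:
        statement: str                  # (1) short descriptive statement
        background: str                 # (2) background paragraph, graphs, recordings, etc.
        measure: Measurement            # (3) associated probability or other measure
        originator: str                 # (4) group or person suggesting the idea as important
        category: Category
        affected_units: List[str] = field(default_factory=list)

    unit = InformationUnit(
        statement="50 percent of the nation's schools have computer-based education equipment",
        background="See the discussion of the PLATO IV system above.",
        measure=Measurement(value=50.0, date_of_value=2000, date_made=1971, uncertainty=0.3),
        originator="project staff (hypothetical)",
        category=Category.DEVELOPMENT,
    )
    print(unit.category.value, unit.measure.date_of_value)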
A new literary form

Every new communications medium seems to generate its own distinctive forms for structuring information. The printing press made possible newspapers,
journals, and novels. Films greatly extended the use of
animated cartoons and led to zoom and pan shots,
parallel editing and special visual effects. Radio and
television produced the talk show, 15 minute news,
commercials, and spot announcements. The mimeograph machine was best suited to the leaflet, the working
paper, and the "underground press." The Xerox machine promoted letter writing to multiple recipients
and extended the readership of journal and magazine
articles. It is not surprising that computer-based communications media also seem to be developing their
own literary form.
Branching sequences and mathematical algorithms,
so useful in "individualized instruction," create a demand for literature in which statements and paragraphs
can be rearranged, dropped or added. Scripts or programs which follow the single logical sequence of the
essay are criticized by the managers of the medium as
"not taking full advantage of the capabilities of this
kind of system." In such cases there is pressure to either
rewrite the material or present it using a different
communications medium.
Rather than an articulate text with an interest-arousing introduction and a good summarizing conclusion, the material written for a computer-based
communications medium, particularly when it deals
with public issues, emphasizes alternatives and their
consequences and concisely stated, measurable events.
With this medium a person can describe his ideal future
without having to give a speech or write an essay or
book. Furthermore, his views can be easily compared
or combined with the "ideal futures" of other people,
thereby informing the explorer, the programmer, and
the general public what visions are dancing in the heads
of their fellow citizens.
We have long needed a literary style which, rather
than imposing a particular idea, tends to draw out
new ideas, and which tends simply by its form to make
normally implicit assumptions explicit so that they can
be challenged. When people see that they disagree
about the relationships between developments or events
they may discover that their disagreements are not
about basic values or goals as much as they are about
factual questions such as what does in fact lead to
what.
The future-oriented literary form of the Delphi
Method is different in several important respects from
the present and past-oriented literary forms more commonly used today.7 The essay form, whether a newspaper report, a magazine article or a book, is most
useful for developing a single idea to a certain degree
of detail. A story related in this way has only one plot
and all the subplots are related in the same way for
all readers. Thus for the reader it is not a very personalized artistic form, no matter how weird one's powers of
interpretation. It is little wonder that one criterion of
quality in a short story or novel has been the range of
interpretations or meanings which can be drawn out of
it. This practice might be thought of as bestowing
kudos for the ability to transcend the limitations of
the medium. Imagine the artistry possible if a literary
form could be designed which had nearly all of the
strong points of the essay but reduced or eliminated
some of the limitations!
The essay requires only the passive involvement of
the reader. Fantasy and relationships with previous
knowledge or experience can be brought into play, but
the new ideas which are generated cannot be tested out
in the story itself. Literary essays, reports, stories,
even films, plays, and melodramas are closed ended
and are characterized by high certainty. Events do or
do not happen. The closest thing to the hypothetical
or probabilistic is the scientific report with its margins
of error and the assumption that refutation is possible.
But the scientific report states facts, not possibilities.
Even science fiction stories while beginning from a
hypothetical situation follow a fixed course to a unique
conclusion.
Robert Theobald's Teg's 1994-a mimeographed, alterable account of a small girl's possible future world-is one example of rumblings of a demand for new literary forms which are flexible, probabilistic, open-ended and user-controlled, thereby permitting active involvement of the reader.8 A more conditional and manipulatable
style of literature will not be very satisfying and may
be downright disconcerting to some people. It will
probably be most satisfying for people who have a high
tolerance for uncertainty and ambiguity and who appreciate being asked for their judgments as well as
being given someone else's. A plot not subject to influence other than interpretation is suitable for a past
not subject to influence other than interpretation. A
future susceptible to action and open to invention
requires a medium which invites action and encourages
invention.
THE EXPLORATION AND SOME RESULTS
With the preceding background on how our thinking
about the project has evolved and the refinement of
what is meant by information units, we shall now pause
in our speculations for a look at the present version of
the exploration and the data which has been collected
so far.
An outline of the exploration

The decision-making procedures in one cycle of the
40 information unit exploration were as follows:
1. From a list of the 35 social and technological
developments programmed into the computer
the explorer chooses a development whose probability he would like to change.9 The object is to
make more probable those developments which
the explorer considers desirable and less probable
those developments which the explorer considers
undesirable. However, desirable developments
may have undesirable secondary effects, and
undesirable developments may have some desirable secondary effects.
2. The explorer makes an "investment" (an indication of desirability) between -100 and +100,
where -100 would mean that the development
is maximally undesirable, 0 would mean that
the development is neither desirable nor undesirable, and +100 would mean that the development is maximally desirable. An investment such as +50 would mean that the development is moderately desirable. In the present
version, no limit is placed on the total amount
which can be invested in an exploration. 100
units could be invested during each cycle.
3. The computer shows in table form the secondary
effects of the explorer's immediately preceding
investment according to the estimates of secondary effects put into the computer by the
programmer. For each development listed as a secondary effect the
computer displays the old probability (before the investment), the new
probability (after the investment), and the change in probability (the
difference between the two).**
4. An oracle message is displayed. Oracle is a verbal message telling
which developments are likely to happen in the year 2000 and which are
not likely to happen, on the basis of the current probability of each
development in the exploration.
5. At the end of each cycle the computer performs several random
calculations to determine whether an "event" occurs. If an event does
occur, a background paragraph about it is presented and then its
effects on the probabilities of the social and technological
developments are shown by a table of secondary effects.

** For a discussion of the mathematical model used in the exploration
and the decisions which have to be made by the programmer, see
Reference 5.

TABLE I-Demonstrations with Recorded Data
Date, number of people, group, and notes (always the same as the
preceding demonstration except as noted):

 1. 3/9/68, ?, ?
    Notes: 15 information units; uses GENERAL language; only comment data.
 2. 3/13/68, ?, ?
    Notes: uses TUTOR language; no comment mode; primary development (PD)
    selected randomly and not recorded in data; shows background paragraph
    for PD; investment is only +, 0, -; asks for relationship of PD to 4
    predetermined secondary developments, relationship must be given as
    +, 0, -; select 5 to follow, option to see 3 other secondary effects;
    4 stage oracle; main calculation sequence is part of special version
    of TUTOR; secondary effects matrix read in by paper tape (+, 0, -).
 3. 5/12/68, 17, Social Science Undergraduates
    Notes: magnitude of investment can go to ±99; asks for numbers of 3
    developments which might affect PD and how they will affect PD (+, -).
 4. 7/10/68, 10, Social Science Faculty & Graduate Students
    Notes: indirect investment possible in up to 5 developments; no
    question on what developments affect PD; magnitude of investment can
    go to a total of 100 in one cycle.
 5. 10/9/68, 11, Undergraduates from several disciplines
    Notes: built-in comment mode.
 6. 2/1/69, 6, local press
 7. 2/17/69, 7, Undergraduates from several disciplines
 8. 3/3/69, 15, Political Science Graduate Students
    Notes: data printout at end of each cycle, gives PD, cycle number, and
    probabilities for all 15 developments; calculation sequence and
    secondary effects matrix built into Delphi program.
 9. 3/4/69, 9, Social Science Undergraduates
10. 3/17/69, 13, Political Science Graduate Students
11. 5/12/69, 7, Education Professors
12. 2/14/71, 9, Landscape Architecture Graduate Students
    Notes: 40 information units, 35 developments and 5 events; PD selected
    by explorer; background paragraph for PD not automatically displayed,
    must be requested; no indirect investment; does not ask for estimates
    of 4 secondary effects relationships; automatic selection of secondary
    effects, only those whose probabilities are changed by the investment
    in PD; secondary effects matrix with relationships up to ±3.
13. 2/20/71, 5, WBBM reporter and friends
14. 2/27/71, 8, Urban planning Graduate Students
15. 2/28/71, 13, Graduate Students in secondary education
16. 3/6/71, 6, Graduate Students in religion
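The mathematical model actually used is described in Reference 5; the sketch below is only a schematic stand-in that mimics the flow of one cycle as outlined in steps 1-5 above (choose a development, invest, propagate changes from a small secondary-effects matrix with entries in the range -3 to +3, report an oracle). The probabilities, scaling factors, and development names are chosen arbitrarily for the illustration.

    # Schematic stand-in for one decision-making cycle; the real model is
    # described in Reference 5. Probabilities are in percent and the
    # secondary-effects relationships run from -3 to +3, as in
    # demonstrations 12-16; the scaling factors are invented.

    probabilities = {"Pollution Control": 40.0, "Ocean Farming": 25.0,
                     "All-out Nuclear War": 10.0}
    secondary_effects = {   # effect of a change in the keyed development on others
        "Pollution Control": {"Ocean Farming": +2, "All-out Nuclear War": -1},
    }

    def clamp(p):
        return max(0.0, min(100.0, p))

    def invest(development, amount):
        """amount: desirability of the development, from -100 to +100."""
        old = dict(probabilities)
        probabilities[development] = clamp(old[development] + amount * 0.2)
        delta = probabilities[development] - old[development]
        for other, relation in secondary_effects.get(development, {}).items():
            probabilities[other] = clamp(probabilities[other] + delta * relation / 3)
        return {d: (old[d], probabilities[d])
                for d in probabilities if probabilities[d] != old[d]}

    def oracle(threshold=50.0):
        likely = [d for d, p in probabilities.items() if p >= threshold]
        return "Likely by the year 2000: " + (", ".join(likely) or "nothing")

    print(invest("Pollution Control", +50))   # old and new probabilities of affected developments
    print(oracle())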
Tradeoffs due to hardware and software limitations

The primary limitation on the complexity of these
explorations has been the amount of computer memory
allocated to what is called student bank, the number

of words accessed by only one terminal. The number of variables can be
increased somewhat by packing, but of course this procedure involves a
limit as well.

TABLE II-Rank Order of Developments by Mean Investment
(Data from demonstrations 12-16; developments are listed from highest
to lowest mean investment on a scale of -100 to +100)

 1. Pollution Control
 2. Racial Barriers Eliminated
 3. Abortion Legalized
 4. Population Planning
 5. World University
 6. World Currency
 7. Treaty Banning CBW R&D
 8. Complete Nuclear Disarmament
 9. World Aid Program
10. Ocean Farming
11. Synthetic Food
12. Staggered Work Week
13. Group Marriage Legalized
14. International GNP Tax
15. Marijuana Legalized
16. Citizen Sampling Simulations
17. "Sexicare"
18. Time Travel by Deepfreezing
19. Global 3-D Color TV
20. Air Cushion Vehicles
21. Manned Lunar Base
22. Weather Modification
23. Animal Donors
24. National Data Bank
25. Nationless Corporations
26. Credit Card Economy
27. Teaching Computers
28. Legal Decisions by Computers
29. Passing of Religion
30. Intelligence Drugs
31. Genetic Manipulation
32. U. S. in Limited War
33. Continued Urbanization
34. Cloning of Humans
35. All-out Nuclear War

TABLE III-Rank Order by Frequency with which the Development was
Selected for Consideration*

    Development                        Number of times chosen for investment
 1. Pollution Control                  31
 2. Racial Barriers Eliminated         25
 3. Complete Nuclear Disarmament       23
 4. Abortion Legalized                 21
 5. All-out Nuclear War                19
 6. Population Planning                18
 7. Continued Urbanization             15
 8. Genetic Manipulation               15
 9. Citizen Sampling Simulations       13
10. Cloning of Humans                  13
11. Group Marriage Legalized           13
12. International GNP Tax              13
13. Treaty Banning CBW R&D             12
14. World Aid Program                  12
15. World University                   12
16. Marijuana Legalized                11
17. Passing of Religion                11
18. U. S. in Limited War               11
19. Air Cushion Vehicles               10
20. Credit Card Economy                10
21. Legal Decisions by Computers       10
22. Manned Lunar Base                  10
23. "Sexicare"                         10
24. Intelligence Drugs                  9
25. National Data Bank                  9
26. Animal Donors                       8
27. Ocean Farming                       8
28. Synthetic Food                      8
29. Global 3-D Color TV                 7
30. Staggered Work Week                 7
31. Weather Modification                7
32. Nationless Corporations             6
33. Teaching Computers                  6
34. World Currency                      4
35. Time Travel by Deepfreezing         3

* Data from demonstrations 12-16.

The 15 information unit explorations, particularly
the later ones, involved a larger number of decision-making
operations in each cycle, such as indirect investment (what other developments are likely to affect
the occurrence of the development under consideration)
and asking for estimates by the explorer of some of the
probable secondary effects of the development being
considered.
When the number of information units was expanded
to 40, the number of decision-making operations performed by an explorer in each cycle was decreased.
This reduction resulted both from the demand for more


variables caused by more information units and from
the addition of some more sophisticated computer
operations.

TABLE IV-Rank Order by Frequency with which Background Information was Requested*

Development                              Number of times background information was requested
 1. Cloning of Humans                    22
 2. "Sexicare"                           21
 3. Citizen Sampling Simulations         17
 4. Pollution Control                    15
 5. Racial Barriers Eliminated           15
 6. Treaty Banning CBW R&D               15
 7. Complete Nuclear Disarmament         14
 8. Group Marriage Legalized             14
 9. Passing of Religion                  14
10. Population Planning                  14
11. Abortion Legalized                   13
12. Animal Donors                        13
13. Genetic Manipulation                 11
14. All-out Nuclear War                  10
15. Air Cushion Vehicles                  9
16. Continued Urbanization                9
17. International GNP Tax                 9
18. Ocean Farming                         8
19. World University                      8
20. Intelligence Drugs                    7
21. Legal Decisions by Computer           7
22. National Data Bank                    7
23. Staggered Work Week                   7
24. Nationless Corporations               6
25. U. S. in Limited War                  6
26. World Aid Program                     6
27. Marijuana Legalized                   5
28. Synthetic Food                        5
29. Credit Card Economy                   4
30. Manned Lunar Base                     3
31. Teaching Computers                    3
32. Time Travel by Deepfreezing           3
33. Global 3-D Color TV                   2
34. World Currency                        2
35. Weather Modification                  1

* Data from demonstrations 12-16.

Discussion of the data
Since work on developing the computer program
began in the fall of 1966, sixteen demonstrations of an
exploration of the future have been given during which
data was recorded. Numerous demonstrations were
given during which data was not collected. Table I
lists by dates the demonstrations during which data
was collected. The number of people participating in
the demonstration and the background of the group
are given in the second and third column. The right-

hand column contains notes about the nature of the
program, such as the number and categories of information units and the decision-making operations which
were added or dropped since the previous demonstration.
The first 11 demonstrations were of an exploration
having only 15 information units, all of which were
either social or technological developments. Demonstrations 12-16 were of an exploration having 40 information units-35 social and technological developments
and 5 events. Table II lists the 35 social and technological developments according to the mean investment
in each development.
Table III lists the developments in order according
to the number of times each development was chosen
as an object of investment. The greater the number of
people who choose to invest in a development, the less
influence each person has in determining its mean
investment.
TABLE V-Answers to Questionnaire at End of Exploration*

1. Is the outcome close to or far away from the future you had hoped to achieve?
   a. very close       2
   b. close           14
   c. slightly close  19
   d. slightly far     2
   e. far              1
   f. very far         1
2. If you had it to do all over again, would you change any of your investments?
   a. yes             25
   b. no              15
3. I found the information in the background paragraphs
   a. helpful         33
   b. not helpful      4
   c. wrong            2
4. I found the instructions on what to do next
   a. sufficient      34
   b. insufficient     2
   c. repetitious      1
   d. badly written    2
5. All in all I found the Delphi exploration to be
   a. loads of fun    10
   b. fun             25
   c. a bore           0
   d. a complete waste of time   1
6. Sex
   a. F               12
   b. M               24
7. Year in school
   a. freshman         2
   b. sophomore        2
   c. junior           0
   d. senior           4
   e. graduate student 20
   f. professor       15

* Data from demonstrations 12-16.


The number of times that background information
was requested for each development is shown in Table
IV. There are no doubt a variety of reasons for
requesting background information. The more prominent items in the list seem to be those with which
people are least familiar. Background paragraphs are
also frequently requested on subjects which the explorer
is familiar with, probably in order to test the expertise
of the people writing the program. Similarly, controversial subjects seem to be called up in order to divine
the political opinions of the programmer.
Ten minutes before the end of the hour each person
is asked to type a code word in order to jump to a
series of questions at the end of the exploration. Seven
of these questions are listed in Table V. Question 2
asks, "If you had it to do all over again, would you

change any of your investments?" If the explorer answers, "yes," he is then asked to list the numbers of
the developments in which he would make a different
investment. Table VI lists the 35 developments in
order from the most to the least frequently mentioned
in responses to this question.

TABLE VI-Developments in which People Would have Changed Their Investment*

Development                              Number of times listed
 1. Continued Urbanization                5
 2. Pollution Control                     4
 3. U. S. in Limited War                  4
 4. Complete Nuclear Disarmament          3
 5. Legal Decisions by Computers          3
 6. Ocean Farming                         3
 7. Passing of Religion                   3
 8. Population Planning                   3
 9. Racial Barriers Eliminated            3
10. Treaty Banning CBW R&D                3
11. World Aid Program                     3
12. Cloning of Humans                     2
13. International GNP Tax                 2
14. National Data Bank                    2
15. Synthetic Food                        2
16. World Currency                        2
17. Abortion Legalized                    1
18. All-out Nuclear War                   1
19. Genetic Manipulation                  1
20. Group Marriage Legalized              1
21. Intelligence Drugs                    1
22. Nationless Corporations               1
23. "Sexicare"                            1
24. Staggered Work Week                   1
25. Teaching Computers                    1
26. Weather Modification                  1
27. Air Cushion Vehicles                  0
28. Animal Donors                         0
29. Citizen Sampling Simulations          0
30. Credit Card Economy                   0
31. Global 3-D Color TV                   0
32. Manned Lunar Base                     0
33. Marijuana Legalized                   0
34. Time Travel by Deepfreezing           0
35. World University                      0

* Data from demonstrations 12-16.

Not all of the people who worked through at least
part of the exploration completed the questionnaire at
the end of the exploration. Fifty-four people participated in 6 demonstrations. Of that 54, 39 completed
the questionnaire.
A non-random sample of people

It should be stressed that the people who took part in
these demonstrations were not randomly selected from
the population at large. They were not even randomly
selected from the university community. The disciplines
represented are suggested by the groups listed in Table I. The data in Table V, questions 6 and 7, shows the
distribution of people according to sex and year in
school. An open-ended question on political viewpoints
stimulated frequently extended critiques of American
society. My own interpretation of their answers indicates that there were two radical liberals, 10 liberals,
and 2 people between middle and right wing.
Furthermore, the people were not randomly selected
in terms of their interest in the exploration. With the
exception of a few students in political science classes,
all of the explorers to date have asked to work through
the exploration or have responded to the encouragement
of a friend to do so. The most frequent pattern is for
an interested faculty member to bring along either a
group of faculty members or a class of graduate students. We have not yet systematically sought representative samples since we are still primarily concerned
with the development of a more interesting program
from the viewpoints of both education and research.
Consequently this data is presented only as a very
preliminary indication of the kinds of responses that
can be obtained when using a computer-based communications medium to discuss an area of public policy.
The responses should not be interpreted as representative of how the American people or even university
people would rate the desirability of the developments
listed. The data is useful in indicating how many
possible future developments can be considered by an
educated person in a given period of time.
Measures of performance

Despite extensive instructions at the beginning of the
exploration, a few people have great difficulty figuring


Figure 3-Total number of cycles completed
(Data from demonstrations 12, 14, 15, 16)

out what they are supposed to do. This is shown by the
fact that in Figure 3 several people were able to complete only a very few cycles.
The time allotted for the exploration was in most
cases one hour. The demonstration on 2/20/71 extended
to about an hour and twenty minutes. For the five
demonstrations of the 40 information unit game the
mean number of cycles completed was 10.3. For the
four demonstrations in which only one hour was available, a mean of 8.9 cycles were completed with the
mode being 8 and the median 10.
The number of people requesting a particular number
of background paragraphs is shown in Figure 4. The
mean number of background paragraphs requested was
8.8. The mode was 6 and the median 11.5.
The number of people having a particular number of
random events occur in their exploration is given in
Figure 5. The mean number of events in an exploration
was 1.7. The mode was 1 and the median 2.5.
Figure 6 shows the number of people who made a
certain number of comments. The mean number of
comments was 1.6. The mode was 1 and the median 2.
Everyone was explicitly asked for comments, suggestions, and criticisms as one part of the questionnaire
at the end of the exploration. Consequently there were
very few people who made no comments, or a response
such as "none" in answer to that question. Those

Figure 5-Total number of events which occurred
(Data from demonstrations 12-16)

people who made only one comment did so in reply to
the specific request and did not interrupt the exploration
in order to go into the "comment mode."
Comments by explorers

The comments made by the participants during the
demonstrations reflected a variety of criticisms, suggestions, questions, and general reactions. These can be
grouped in the following categories:
1. Technical errors. Debugging is an activity familiar
to all computer programmers. Debugging a program on
a teaching computer involves calling in some friends
who either have a knack for making things go wrong
or who find a malicious glee in outwitting a computer.
Examples of technical errors would be that the computer does not accept a negative number when it
should, or it accepts a letter when it should accept only
numbers. A few participants were quick to point out
such errors: "Ha! You made a mistake."

Figure 4-Total number of background paragraphs requested
(Data from demonstrations 12-16)

Figure 6-Total number of comments made
(Data from demonstrations 11-16)

2. Instructions. Nearly every exploration produces
suggestions for clarifying the instructions. For example,
"I did not understand how to invest and make a
meaningful contribution toward my goal." "It would
be helpful if we were given a reminder to enter our
choices and outcomes on the data sheet." "Need more
explanation of what you mean by investment."
3. The purpose of the exploration. In a game in which
no score is kept, where the object is to make the desirable probable and the undesirable improbable, expressions of confusion are to be expected. "The final results
of the game or the goals were not clear to me." Some
people seem to have difficulty conceptualizing complex
systems. "I had difficulty realizing the issues. Consequently, I'm not sure I was making intelligent decisions." On the other hand, some people experience
"an awakening of the sense of the future of man."
4. Important items not included. Some of the most
thoughtful and helpful comments deal with what is not
included in the exploration. "I did not think that there
were enough policy factors involved in the issues contained here. For example I would have liked to see the
question of manipulation of the individual considered
more explicitly-in other words questions about attitudinal changes." "Complete disarmament rather than
nuclear disarmament should be one of the futures
included."
5. The original probabilities. Explorers sometimes
question the actual estimates. "All out nuclear is not
possible." And sometimes they question who made the
estimates. "I am curious as to how the original probabilities for the given issues were determined."
6. Background paragraphs. The brief background
information was challenged usually for not giving sufficient attention to consequences, regulation, or alternative solutions. "Nothing was mentioned about the effects of pressure on the surface the air cushion vehicle
is going to be passing over. Until the ecological effects
of such an apparatus over a natural surface are determined, I would seriously question such an apparatus." "In such items as cloning of humans, crucial
matters would be the regulations that go along with
the process. It is difficult to decide if it is good or bad
without at the same time fixing some of the regulations."
"Legislation or executive action is not needed to stabilize population growth rate-education is the answer
to this problem."
7. Secondary effects. Questions about secondary effects ranged from who determined them and the logic
used, to amazement that there were so many. For
example, "How does a 100 percent investment in world
aid program reduce probabilities of ocean farming and
synthetic food?" "I am unclear as to how the associ-


ations were decided upon after I had made my investment or an event had happened. Some of them did not
make sense to me. Like some of the considerations were
left out." "I do not see what happened at this point
which caused the probability of limited war to increase." "I would have been interested in how various
choices affected other events specifically, e.g., a little
more about why one probability caused a particular
change in another variable." "Frightening how one decision influences so many others you never took into
consideration when making your initial decision. On
some do not see off hand how the influence came
about."
8. What makes the exploration enjoyable. "I like when
you add the event to make it more exciting." "I was
angry at being told what to do in such a demanding
way by a machine. My hands felt slapped each time I
made a mistake." "I would have liked to have finished
in order to see the full picture of the world. Occasionally
became bored, however, perhaps due to each step being
handled in the exact same manner as the previous."
9. The dangers or opportunities implied by this technology. Only a very few people remarked on the potential of this kind of equipment and this kind of
program for radically altering the process of citizen
participation in planning. The people who do comment
on this possibility have usually been students who were
told in advance what this work might lead to. More
than we would like, people have concentrated on the
information in the program rather than thought about
the possibilities of this kind of information exchange.
"Concerning this computer, I hope we never go into
teaching children by this means even though I can see
where it may be more efficient. Learning how to deal
with people seems much more important to me." "The
greatest difficulty here is that as well explained as the
process is, it is still very confusing. I am not sure you
could ever get the general public to use them as the
general public strikes me as being very lazy and therefore would not consider this worth the effort." "I think
the idea is very interesting and has many possible
applications especially in finding out what the common
level of awareness is and where work is needed to bring
what area up." "A great start at showing the interrelationship of various particular choices."
FURTHER NOTES ON SOCIAL IMPLICATIONS
Not all of the analysis which can be performed on the
data from these demonstrations has yet been carried
out. But perhaps the preceding discussion is adequate
evidence that the earlier and following speculations and


philosophical musings are based on experiences with
operational though still elementary prototypes. ***
The practicality of public discussions
The data presented here is useful in estimating the
feasibility of widespread use of "citizen sampling simulations" to involve the public in the planning process.
It is to be expected that some people will be intimidated
by the thought of using a computer and will be overwhelmed by this advanced communications technology.
As was mentioned earlier, there is some evidence that a
few people do have great difficulty with the program
or at least proceed very slowly. Nevertheless, learning
how to use a new technology and also exploring a list
of 35 unusual social and technological developments is
a demanding task for a one hour period. The success of
these people in performing that task leads us to believe
that community issues can be discussed by the public
with this kind of technology, particularly as people
gain practice in using it.10
The power of suggestion
The method of describing the future using information
units seems to have been successful. Most people find
the experience educational, and the basic structure of
the program-the way the future is described-has
not been criticized by most people who play the game.
However, a few perceptive observers have expressed
skepticism about the exploration, and rightly so. Simply
listing a set of possible developments structures the
thinking of explorers and thereby severely limits the
range of responses. This is confirmed by the fact that
people rarely suggest new developments or criticize the
list given.
Those observers who are equally concerned but less
theoretically inclined question the assumptions about
the world which led to the selection of that particular
set of forty information units. Furthermore there is a
danger that some individuals may assume the changes
in probabilities are "real" or indicate something more
than merely the aggregated judgment of a group of
individuals.
In future explorations we intend to make even more
explicit the fact that the original probabilities and the
changes in probabilities of other developments which
we call "secondary effects" merely reflect the judgment

*** An earlier consideration of possible social, political, and
psychological implications of citizen participation in planning
using computer-based communications media is contained in
Reference 4.

of the programmer and the people he has consulted
and that the consequences indicated are not determined
by a computer model based upon verified theories of
the operation of social systems.
A program now being developed on a specific problem area, the Future of the University, uses probability, desirability, and importance ratings and both
occurrence and non-occurrence matrices for three separate groups-students, faculty members, and administrators. In the Delphi program the computer is serving
as a communications medium between the programmer
and the people at the terminals. In the Future of the
University program the fact that the computer is simply
operating as a mediator among groups of people with
different patterns of concern and perceptions of the
world is made much more explicit.

Who is communicating with whom?
Even though the responses of each individual are not
automatically seen by the other participants and the
program itself may be changed only at intervals involving weeks or months, communication can still be
taking place. In order to clarify the differences among
Delphi-like computer programs it is useful to keep in
mind whether individuals or groups are communicating
with each other, the number of times a single individual
will sit down at a computer terminal in order to work
on one particular problem, and how much monitoring
or editing of responses is done by the programming
staff.
1. In Turoff's Delphi Conferencing, as I understand
it, the responses of each individual are seen by
all the other participants in the exercise. Items
are added or dropped on the basis of a vote taken
among participants. No monitoring or editing of
the exercise is done by anyone except the participants themselves. Each time a person responds to a question or types a statement his
response is not only recorded for viewing by the
programmer but actually alters the program
which each participant will view thereafter. Each
person works on the given problem for a few
minutes every day for several days.
2. The idea behind the Delphi Exploration, which
can be thought of as a prototype "citizen sampling simulation," was to have communication
take place not among individuals but rather
between the planning group and the public. The
responses of the explorers are recorded and
viewed by the monitoring group but do not
automatically change the program itself. This


pattern is similar to the normal process of instruction where communication takes place primarily between the teacher and the students.
Each explorer works on a particular problem
probably only once, at a sitting lasting from one
to two hours. If the issue is a recurring one, the
program is changed only every few weeks or
months when the programming staff has an
opportunity to reconsider the issue and to change
the program on the basis of the responses obtained since the last modification of the program.
The purpose of this kind of exercise is not to
generate a forecast or a set of policy alternatives
but rather to reduce the amount of time spent
by planners in presenting background information to interested citizens and to generate data
from the public on the desirability of particular
alternatives, the completeness of the set of alternatives considered, and the way in which
"the problem" is defined.
3. For a "computer-based mediator," such as the
program on the Future of the University, communication takes place neither among individuals
nor between planners and the public with the
planners also acting as monitors, but rather between conflicting interest groups with either no
monitor or a neutral party acting as monitor /
arbitrator. The responses of individuals are not
seen by the other participants except in the mean
responses of a group. The position of each group
is not arrived at by negotiation or compromise
within the group but rather results from averaging the views of individuals in the group. In
the present program the responses of an individual alter his group's estimates of probabilities,
desirabilities, and causal relationships, but the
list of information units can only be changed by
the programmer.

TABLE VII-Group Communication Techniques

                         Delphi               Citizen Sampling      Computer-based
                         Conference           Simulation            Mediator

Participants             individuals          planning group        2 or more
                                              and the public        interest groups

Length of Interaction    minutes              1 to 2 hours          1 to 2 hours

Number of Interactions   several, usually     usually only one      usually only one
                         1 per day

Normal Mode              usually group        at present            at present only
                         control and no       completely            list of items
                         monitor              monitored             not modifiable
The practice in Delphi Conferencing of allowing the
participants themselves to add and drop information
units is a very important capability which can be incorporated into both citizen sampling simulations and
computer-based mediators.
Exchanging views vs. simulating complex systems

There is a tendency to confuse this project with the
presently growing number of attempts to model complex social systems. If that were our purpose, we would
be greatly hampered by the technology we are using.
The PLATO system was designed to operate a large
number of computer terminals simultaneously with
each terminal being allocated a small amount of computer memory space. Our efforts are sufficiently similar
to the modeling of complex systems that we have attempted to keep up with developments in that area in
hopes that the form of the models would be applicable
to our programs.
However, given the limitations of our equipment for
doing that kind of work and given its uniqueness for
doing the task for which it was designed, we believe
that it would be most productive to spend our time
developing the computer as a communications medium
between people and as a device for helping less skilled
people to articulate their mental models of how the
world works. Since computer models of social systems
inevitably embody the assumptions of the programmer
about what the important variables are, the ability of
less technically skilled groups to express their assumptions about important variables could be helpful in
trying to achieve a balance of political influence.
REFERENCES
1 C E OSGOOD S UMPLEBY
A computer-based exploration of alternative futures for
mankind 2000
Mankind 2000 pp 346-359
Edited by Robert Jungk and John Galtung
London Allen & Unwin 1969
2 D ALPERT D L BITZER
Advances in computer-based education
Science Vol 167 pp 1582-1590 Mar 20 1970
3 D L BITZER D SKAPERDAS
PLATO IV: An economically viable large-scale
computer-based education system
National Electronics Conference Chicago 1968


4 S UMPLEBY
Citizen sampling simulations: a method for involving the
public in social planning
Policy Sciences Vol 1 No 3 pp 361-375 Fall 1970
5 S UMPLEBY
The delphi exploration: a computer-based system for
obtaining subjective judgments on alternative futures
Social Implications of Science and Technology Report F-1
pp 34-51 University of Illinois Urbana-Champaign
August 1969
6 G MYRDAL
Objectivity in social research
New York Pantheon Books 1969
7 A bibliography of Delphi studies is included in
M TUROFF
The design of a policy delphi
Technological Forecasting and Social Change Vol 2 No 2
1970

8 R THEOBALD J M SCOTT
Teg's 1994
mimeographed 5045 North 12th Street Phoenix Arizona
1969
9 An explanation of how the 40 information units were
selected is given in the third progress report
V LAMONT S UMPLEBY
Forty information units for use in a computer-based
exploration of the future
Social Implications of Science and Technology Report F-2
University of Illinois at Urbana-Champaign March 1970
10 An experiment using the PLATO system to discuss a
local environmental issue is reported in
V LAMONT
New communications technologies and citizen participation
in community planning
Computer-based Education Research Laboratory
University of Illinois Urbana May 1971

INSIGHT-An interactive graphic instructional
aid for systems analysis*
by M. J. MERRITT and R. SINCLAIR
University of Southern California
Los Angeles, California

INSIGHT (INstructional Systems Investigation
GrapHic Tool) is an interactive graphics program
which illustrates the basic concepts of systems analysis.
The transfer functions, inputs, and parameters of a
single loop control system are drawn on the graphic
display. Numeric data entry, system modifications and
control of the time domain and frequency domain
analysis is performed using the graphics terminal's
light pen. All communications are problem oriented
and no previous computer experience is required.
INSIGHT provides a common instructional tool for
all disciplines touching upon systems analysis, control
theory, and differential equations. Such disciplines
might include engineering, physics, mathematics, geology, and chemistry, to name but a few. The common
link between all of these fields is the use of Laplace
Transforms to describe linear components of possibly
non-linear systems, represented as serially connected
block elements, with and without feedback. Additional
common factors fall into two classes:

1. Common Instructional Goals
(a) encourage use of analysis techniques to develop insight into the process under study
(b) relate frequency domain analysis to time domain performance
(c) provide feel and intuition for theoretical concepts
2. Common Analysis Techniques
(a) time domain measurements of step, ramp, sinusoidal, etc., responses
(b) phase plane trajectories
(c) frequency domain analysis: root locus, Bode, Nyquist, and describing functions

The instructional device which provides all of these features should also be a universal educational solvent. It should be suitable for classroom instruction, laboratory exercises and homework assignments (see Reference 2 for a more complete discussion).
The work of Melkanoff,4 Moe,5 and Calahan1 clearly demonstrates the advantages of computer aided instruction and computer graphics. The INSIGHT program is a continuation of these efforts which utilizes interactive graphics to meet all of the goals listed above.
All communications between the program and the user take place at the display console. Selection of system components, specification of parameters, and selection of computational algorithms are all accomplished with the light pen. The computational services provided by INSIGHT include both time domain analysis (numeric solution of the system equations) and frequency domain analysis (root locus).
Computational results are displayed as they are computed. All graphs are scaled and rescaled automatically to fit within the plot area of the display.
INSIGHT offers a number of unique advantages:

1. Learning time is short, usually less than five minutes.
2. The program is autoinstructional and completely protected against inappropriate user commands.
3. Both setup and solution times are extremely short.
4. No previous computer experience is needed.
5. No programming of any kind is needed.

The Hardware

All computer programs are constrained by the computer hardware available. INSIGHT is no exception. The USC School of Engineering's System Simulation

* Supported by the National Institutes of Health under Grant
No. GM 16197-03.


Laboratory contains (in part) the following equipment:
1. IBM 360 Model 44 with 64K bytes (going to
128K bytes as this is written), high speed line
printer, and four disk files.
2. Adage AGT-10 Graphic Display System with
8K words (30 bits) of memory, light pen, function switches, joy stick, teletype and magnetic
tape.
All of the facilities of the Adage Graphic System are
available to FORTRAN programmers through the
AGNOS language. 3 The AGNOS language contains
simple FORTRAN callable subroutines for image generation and manipulation, and light pen hit processing.

Using INSIGHT
INSIGHT is used in much the same manner as one
would use a pencil and paper, only much more conveniently. The pages of paper are represented by a sequence of graphic "pages" appearing at the graphic
console. Each graphic page elicits new information
from the user relating to the design and analysis of the
system.
INSIGHT provides the skeleton framework-a
single loop control system with an arbitrary input. The
user fills out this framework by selecting items from a
menu and inserting them in the control systems block
diagram. The first graphic page presented to the user
is shown in Figure 1.
The block diagram is the process, and vice versa. In
order to reinforce this feeling, the block diagram is
always retained at the top of the display. All other
material is entered into the lower half of the screen.

Figure 1-First graphic page of INSIGHT showing the
block diagram and element menu

TABLE I

Linear Transfer Functions      Non-linear Functions       Forcing Functions

A                              SATURATION                 A SIN (Bt)
A/s                            RELAY                      A U(t)
A/(Bs+1)                       BANG-BANG                  At
(As+1)/(s²+Bs+C)               GEAR TRAIN BACKLASH        ½At²
As                             CUBIC                      SQUARE WAVE
A(Bs+1)                        SAMPLE AND HOLD            GAUSSIAN RANDOM NOISE

The D parameter is the initial condition for all linear transfer function elements.

The contents of the element menu are summarized
in Table I. All of the photographs seen in this article
were taken from an early version of INSIGHT. Whenever a discrepancy exists between the text and the
photographs, the text is correct.
Menu elements are placed in desired blocks in two
steps:
1. Selection-when a menu item is touched with
the light pen, INSIGHT affirms the touch by
moving the square shown enclosing the item
A/S in Figure 1, to enclose the touched item.
This item is the "current selection."
2. Placement-when the contents of one of the
control systems blocks is touched with the light
pen, the current selection is copied into it, replacing its previous contents.
Initially, all of the control system blocks are filled
with asterisks. An asterisk filled block will be treated
as a unity gain element by all of the computational
algorithms.
INSIGHT recognizes obvious errors, for example
placing a forcing function, A SIN (BT), in a transfer
function block, or the reverse, placing a transfer function element, A/S, in the INPUT block. When either
of these placements is detected, the words "ILLEGAL
OPERATION" are written at the top of the display.
The contents of the block involved are not changed.
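The two-step placement rule and the error check can be pictured in a few lines of code. The following Python sketch is hypothetical; the forcing-function names come from Table I and the reference number 7 for the INPUT block is stated later in the text, while the function names and the way block contents are stored are assumptions made for the illustration.

# Illustrative sketch of menu selection, placement, and the "ILLEGAL OPERATION" check.
FORCING_FUNCTIONS = {"A SIN (Bt)", "A U(t)", "At", "1/2At2", "SQUARE WAVE", "GAUSSIAN RANDOM NOISE"}

INPUT_BLOCK = 7                              # the INPUT block carries reference number 7
blocks = {n: "*" for n in range(1, 8)}       # asterisk-filled blocks act as unity gain elements
current_selection = None

def touch_menu_item(item):
    """Touching a menu item with the light pen makes it the current selection."""
    global current_selection
    current_selection = item

def touch_block(number):
    """Touching a block copies the current selection into it, if the placement is legal."""
    placing_forcing = current_selection in FORCING_FUNCTIONS
    if placing_forcing != (number == INPUT_BLOCK):
        return "ILLEGAL OPERATION"           # e.g. a forcing function in a transfer function block
    blocks[number] = current_selection       # replace the previous contents
    return "OK"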
When the block diagram is complete, the user
touches the word FINISHED with the light pen. The

element menu is removed, and the parameter specification page written in its place, see Figure 2.

Figure 2-INSIGHT's specification page, parameter entry for the diagram shown is complete

Each block in the control system is assigned a reference number. The reference number is associated with a block's contents, as well as its output. The number 3 alone refers to the output of block 3. The INPUT block is assigned a reference number of 7. A given block may contain an element with one, two, three, or four parameters. These parameters are denoted by the letters A, B, C, and D (see Table I).
The parameter values are arranged in seven rows of four columns each, as shown in Figure 2. All parameter values are set to zero when the INSIGHT program is entered. Thereafter parameter values are retained until they are changed by the user.
Parameter entry is a three step process: (1) preparation, (2) selection, and (3) placement, as follows:

1. Preparation-a 13 item menu containing the numbers 0 through 9, decimal point, minus sign and RUB (backspace) is written below the block diagram. The underlined value seen just below the numbers 2, 3, and 4 in Figure 2 is called the temporary line. As menu items are touched, the corresponding character or backspace is written on the temporary line.
2. Selection-when a parameter value is touched by the light pen, the box shown enclosing the A parameter in Figure 2 is moved to enclose the touched parameter. This parameter becomes the "current selection."
3. Placement-when the word PUT is touched with the light pen, the current contents of the temporary line are copied into the box enclosing the current selection. The temporary line remains unchanged, allowing the same value to be stored in a number of places.

Integer values may be entered with and without a decimal point. For convenience in entering complementary numeric constants (for relays, saturation, etc.), the minus sign may precede or follow the numeric characters; thus -1 and 1- are both stored as -1.0.
When all numeric values have been entered, the words "SPECS COMPLETE" are touched with the light pen. The parameter specification page is replaced with the mode specification page, shown in Figure 3.
The mode specification page allows the user to select between the two computational algorithms:

1. Time domain solution
2. Root locus

(NOTE: These options are not shown in Figure 3, which was taken from a preliminary version of INSIGHT.)
If the root locus option is selected, INSIGHT adds an auxiliary gain K to the control system's open loop transfer function. Pole and zero positions are marked on the root locus plot by Xs and Os respectively. Root positions are plotted for values of K in the range zero to ten. The root positions at K=1 are the roots of the control system's characteristic equation and are denoted by small squares instead of the usual asterisk, see Figure 7. The loci are drawn as they are computed and are automatically rescaled to fit into the plot area of the display. Four options are available while the loci are being drawn: STOP, START, RESTART, DONE.
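As a concrete illustration of the sweep just described, the short Python sketch below computes closed-loop root positions for the auxiliary gain K from 0 to 10. It is not the INSIGHT code; the use of NumPy and the example transfer function are assumptions made for the illustration.

import numpy as np

# Hypothetical sketch of the gain sweep described above.
# With open loop G(s) = num(s)/den(s) and auxiliary gain K, the closed loop poles
# are the roots of den(s) + K*num(s) = 0.

def root_locus(num, den, k_values):
    """Return a dictionary mapping each gain K to the array of closed-loop roots."""
    num = np.asarray(num, dtype=float)
    den = np.asarray(den, dtype=float)
    num = np.concatenate([np.zeros(len(den) - len(num)), num])   # pad to equal length
    return {k: np.roots(den + k * num) for k in k_values}

# Example (assumed, not from the paper): G(s) = 1 / (s(s + 2)), swept for K = 0..10.
locus = root_locus(num=[1.0], den=[1.0, 2.0, 0.0], k_values=range(11))
characteristic_roots = locus[1]   # roots of the characteristic equation at K = 1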

Figure 3-Mode specification page of INSIGHT-Showing
default parameter values

The STOP and START options are normally used to freeze the display for purposes of discussion or reproduction. If, at the end of a locus, i.e., K=10, START is touched, then the locus is continued to K=20, 30, etc. The decision to STOP a root locus cannot be made effectively until the entire locus has been viewed. The RESTART option allows the user to return to the beginning of the sketch and STOP it at selected intermediate positions. Touching DONE with the light pen returns control to the first graphic page.
The root locus algorithm treats all non-linearities as unity gain elements. If describing function gains are known, they must be inserted in place of the non-linearities before requesting the root locus analysis.
If time domain solutions are requested, then the five parameter values, step size, total time, Y1, Y2 and X, are examined. If the step size is too small, i.e., if

total time / step size > 100,000

then the step size is set to

step size = total time / 100,000.

Time domain solutions may be displayed in either of two formats, with and without a KEEP option. In all cases, the reference numbers placed next to Y1 and Y2 specify the two block outputs to be plotted. The reference number placed next to the X symbol in the mode specification page determines the format of the graph, as shown in Table II.

TABLE II-Graph Formats

X    FORMAT                                      KEEP
0    phase plane Y1 versus Y2                    NO
1    same as X = 0                               YES
7    time histories Y1 and Y2 versus time        NO
     (see Fig. 4)
8    same as X = 7                               YES

If the KEEP option is selected, the current solution is drawn on top of the previous solution or solutions at the same scale factors. The KEEP option is useful in modeling and optimization studies and demonstration of the properties of phase plane singularities.
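The step size rule and the time domain solution can be sketched briefly. The fragment below uses a forward Euler integrator and one particular single loop (a saturation element followed by A/(Bs+1) with unity feedback); both choices, and all names, are assumptions made for illustration rather than INSIGHT's actual algorithm.

def saturation(x, limit=1.0, slope=1.0):
    """Saturation non-linearity: linear segment of the given slope, clipped at +/- limit."""
    return max(-limit, min(limit, slope * x))

def simulate(total_time, step_size, forcing, a=1.0, b=1.0):
    """Time domain solution of: forcing -> error -> saturation -> A/(Bs+1), unity feedback."""
    if total_time / step_size > 100_000:        # the rule quoted above
        step_size = total_time / 100_000
    t, y, history = 0.0, 0.0, []
    while t <= total_time:
        error = forcing(t) - y                  # unity feedback around the single loop
        u = saturation(error)                   # non-linear element
        y += step_size * (a * u - y) / b        # forward Euler step for B dy/dt + y = A u
        history.append((t, y))
        t += step_size
    return history

# Example: step response, one of the time domain measurements listed earlier.
trace = simulate(total_time=10.0, step_size=0.01, forcing=lambda t: 1.0)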

Figure 5-Non-linear system with relay element in place of
saturation element

Values are added to the graphs as they are computed.
Scale factors are adjusted automatically to fit the
graph in the plot area (unless the KEEP option was
selected). The STOP, START, and DONE options
halt and resume computations and terminate the solution, returning control to the first graphic page.
Application of INSIGHT

Figure 4-Non-linear control system with saturation and a
square wave forcing function

An instructor wishes to demonstrate the effect of nonlinearities on system performance and to illustrate, the
concepts of limit cycling and stability. He begins by
constructing the control system shown in Figure 4.
This system contains three poles and one zero in its
open loop transfer function. Numeric values must be

specified for the gains, time constants, slope of the linear segment of the saturation element, and for the saturation levels. After specifying the integration step size and total solution time, the instructor selected the time domain solution, plotting the output of blocks 3 and 7 versus time.
In Figure 5, the instructor has replaced the saturation element with a relay and plotted blocks 3 and 1 versus time.
In Figure 6, the instructor has switched to a sine wave forcing function and has plotted the input to, and output from, the relay element versus time. This could be used to introduce describing function analysis.

Figure 6-Non-linear system with sinusoidal forcing function

The ease of use, fast solution time and rapid interaction of INSIGHT are illustrated by the fact that less than five minutes was required to generate the three examples just described.
INSIGHT may be used for its root locus facilities alone. Another instructor, desiring to discuss properties of the root locus sketch, selected elements to form an open loop transfer function of the form:

G(s) = 60(0.0833s + 1)(0.2s + 1) / [s(s² + 4s + 20)]

The resultant INSIGHT root locus diagram (again from an early version without axis labels, magnitudes or K=1 markers) is shown in Figure 7.

Figure 7-Root locus diagram for a three pole two zero open transfer function

FUTURE PLANS FOR INSIGHT

The single loop structure and restriction to seven blocks are major disadvantages. They are, however, removable through additional programming. The high cost of the dedicated IBM 360/44 and the Adage display system is not so easily treated. Calahan1 and others have discussed the cost factors relating to the use of computer aided instruction. A number of satisfactory graphics systems are available for less than $10,000 (the Tektronix T4002 and the Adage ARDS System, to name only two). The low cost of these devices, combined with increased demand for computer time, will encourage administrators to restructure the financial foundations of educational computer centers. Conversion of INSIGHT to operate within an existing Conversational Programming System (CPS) utilizing a large screen video device is under study. The decrease in display capacity and increased response times will be offset by the decreased cost and increased availability.

SUMMARY AND CONCLUSIONS

INSIGHT provides a variety of Systems Analysis functions to a wide range of users in a convenient, easy to use package. The goals set forth at the beginning of this article were met and exceeded. Extension of INSIGHT's capacities to encompass more complex structures and additional analysis tools is in progress.

ACKNOWLEDGMENTS

The authors are indebted to Rick Klement, Bill Liles, and Al Vreeland for the AGNOS language, and to Donald Miller for his many valuable suggestions in the development of the INSIGHT program.
REFERENCES
1 D A CALAHAN
Circuit design application of the Michigan terminal system
IEEE Transactions on Education Vol E-12 No 3
September 1969
2 M L DERTOUZOS
Educational uses of on-line circuit design
IEEE Transactions on Education Vol E-12 No 3
September 1969

3 R KLEMENT W LILES M MERRITT
A VREELAND
The AGNOS language
University of Southern California Technical Report
No 71-19 April 1971
4 M A MELKANOFF
The use of on-line graphical computer systems for student
research
IEEE Transactions on Education Vol E-12 No 3
September 1969
5 M L MOE
The NASAP computer-aided circuit design program and
its use in undergraduate education
IEEE Transactions on Education Vol E-12 No 4
December 1969

An interactive class oriented dynamic graphic
display system using a hybrid computer
by A. A. FRANK
University of Wisconsin
Madison, Wisconsin

INTRODUCTION

The current state of engineering education is such that it is often difficult for students to gain insight into the many subjects they must master. To aid the professor or teacher in "getting the point" across, the following system is being tried.
In every field of engineering, theory is developed using mathematical and graphical techniques. These theories are manifested in problems which illustrate different aspects of the theory. These problems in general are rather tedious to work out by hand and when they are worked out they illustrate one single aspect of the theory. For example, a beam in a strength of materials class has a given load; then how does the shear, moment and displacement change for different kinds of loading? What effect does the cross-sectional inertia have? How does cross-sectional inertia change as the shape is changed? What are the natural frequencies of this beam? How would it bend if a natural frequency were excited, etc.? This kind of problem could easily be solved by a hybrid computer system and the solution displayed on an oscilloscope screen. The answers to all these questions and many more can be easily seen in a continuous fashion on command of the professor.
It must be emphasized that this system is not intended to be a student involved teaching aid.1,2 It differs from other elaborate systems used for specific applications in that the machine and language remain general and only the "skilled" operates the machine.3,4 Further, it is possible to display "static" as well as "dynamic" problems as illustrated by the example.
To make the system worthwhile and utilize the hybrid system more effectively, it is necessary to consider multiple numbers of terminals placed into different classrooms. It is then necessary to solve each classroom's problem on a shared basis. A hybrid system with a digital computer which has a real time monitor and a disc or drum storage can be used in this mode. Then since the digital computer can control fully the analog computer, dynamic displays in real time can easily be generated.
To make this system work in a reasonable manner in a university environment it is an absolute necessity that the professor not be burdened with the additional responsibility of writing his own programs. To solve this dilemma a full time staff member whose sole duty is the programming of class problems is provided. This staff member is the key to the success or failure of the system. He must be versatile enough to handle all fields of engineering.
It must be emphasized that the professor is not to be replaced by the system, but rather the system is to provide him a more effective manner in which to teach, or in other words to enable courses to contain more material for a given amount of time, and to provide the student a means to comprehend this material more effectively. It is thus the responsibility of the professor to learn about this new teaching tool.

DISCUSSION
1. System Concept

The hybrid computer is the heart of the display
system. A normal hybrid computer system has the elements shown in Figure 1.
The computer system has all the elements of a display computer. The objective is to design a system
which can maintain a display system without greatly
jeopardizing normal hybrid computer usage.
In a normal environment the hybrid computer's
digital machine experiences only about 5-10 percent
CPU usage. The graphical display will essentially steal
from the unused time. It has plenty of time from which
to pick. Besides, there is a full time operator so communication between operators is possible.

Figure 1 (block diagram: hybrid computer system connected to the classroom displays)

The full time operator's duties include both machine
set up and operation, and programming advice.
HARDWARE
The hybrid computer for such a system must consist
of at least the following complement of equipment:
Digital Computer

1. General purpose CPU.
2. 16K core memory.
3. Disc or drum mass storage.
4. Real time monitor system.
5. Priority interrupt structure.
6. External communications: A/D, D/A, logic in and out, and hardware interrupts.

Analog Computer

1. General purpose computer with patchboards.
2. 16-64 integrators.
3. 4-40 multipliers.
4. 30-60 summers.
5. 100-200 digital computer controllable potentiometers.
6. Lines to various display centers.
7. Hard copy capability.

Each of the displays in the system consists of the following:

1. 872X11 memo scope or scope with memory.
2. Keyboard.
3. 8 Coefficient potentiometers.

SYSTEM OPERATION

The organization, hardware and software, is designed
to provide an aid to the teaching of engineering principles for the professor. Thus the display of a problem
solution must be available immediately (within a second) after the professor's request is put into the system.
To do this, the hybrid computer is time shared with
the various terminals and the user of the hybrid
computer.
The operation of a terminal is done by a professor
simply pushing a request button. On initiation by the
professor, his program will be presented with the set of
parameters he has specified on the display screen in
less than one second. The professor can then change the
coefficients or parameters and push the request button
again and the students can see within a second the next
solution to the problem. These solutions can be stored
for comparisons or the screen can be wiped clean each
time. This is the feature of the memory screens. If a
"hard copy" is desired the professor need only to call
the operator and make the request. The operator then
will use the hard copier and provide the professor and
class with hard copies. These hard copies could also
have been made in advance in which low cost reproduction methods would be employed.
The professor may have a number of problems programmed into the computer system. He can simply call
for the problem by name through the keyboard. For
example, the bending beam problem can be broken
into at least three separate parts; (1) shear-moment
diagrams, (2) deflection-stress diagrams, (3) vibration
mode diagrams, etc.
The development of the software for each problem is
the responsibility of the display systems personnel. The
only responsibility the professor has is to specify the
problem and the parameters he would like to see varied,
and the time and date the problem will be requested.
His problem will be stored in the computer. Of course,
it will be easy for a professor to specify a problem which
is beyond the computer's capability. This will either be
due to the problem magnitude and the computer capacity or due to the fact that the problem is not suitable for this kind of computation. It is not likely that
such problems, at the present level of undergraduate
education, will be encountered; however, to provide for


such eventuality the terminals will be outfitted with
time share capabilities to other larger facilities, such as
a large IBM 360 system or an 1108 Univac system via
telephone connections.
The program developed by the systems analyst will be put onto the digital computer's mass storage device (either a disc or a drum). In this fashion, when the professor makes a particular problem request the digital computer fetches the program from the disc or drum and executes it and sends the output to the display.
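The request path just described can be pictured with a short sketch. The Python fragment below is purely illustrative; the queue discipline, the function names, and the dictionary standing in for the disc or drum library are assumptions, not a description of the actual real time monitor.

import queue

# Illustrative sketch of the display request path described above.
problem_library = {}        # problem name -> callable(parameters) returning a display list
requests = queue.Queue()    # requests arriving from the classroom terminals

def request_button(terminal_id, problem_name, parameters):
    """Called when a professor pushes the request button at a classroom terminal."""
    requests.put((terminal_id, problem_name, parameters))

def service_requests(send_to_display):
    """Run in the digital computer's spare time: fetch the stored program, execute, display."""
    while not requests.empty():
        terminal_id, name, parameters = requests.get()
        program = problem_library[name]        # fetch the program from mass storage
        output = program(**parameters)         # execute it (it may also direct the analog computer)
        send_to_display(terminal_id, output)   # present the solution at the requesting terminal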
In problems involving the analog computer, or real
time dynamics, the digital computer provides the control and direction of the analog computer and its signals. Problems involving the analog computer will require a patchboard as well as the digital program. It is
the responsibility of the system staff member to see
that such a patchboard and the digital program be on
the machine and ready to run when the request button
is pushed. The professor may use this display for only
ten minutes during his lecture and may only use it
three or four times during the semester. Even with 30
professors throughout a college using such a system, it
will have a relatively low demand if a little care is
taken in scheduling.
SOFTWARE SUPPORT
The staff must be capable of providing this service
to any professor and any department in the engineering
college. For example a professor in the engineering
graphics area wishes to present a problem in descriptive
geometry, such as a cone cutting a cylinder, on the display system. Another example may come from the
chemical engineering department which has a problem
involving chemical kinetics and process control. Still a
third example may come from civil engineering in the
design of earthquake proof structures. Obviously every


department of engineering has courses which can find
uses for such a system. The important aspect of the
total system is that the staff must be able to comprehend all areas of engineering so as to aid professors
from all disciplines. The program can be called upon
once a semester or whenever such a class is run.
As the system becomes used by the College of Engineering the facility may have to be expanded to a
separately dedicated computer system. However, a
way to begin such a program is to initially start with
an existing hybrid computer system as an overload.
Then when the system has proven itself and grows beyond the hybrid labs capability it will be easy to justify
further expenditures and argue for its own system and
staff.
Most important is that the professor's present teaching techniques need not be modified by any great extent
and thus this system lends itself to a natural "phase in"
period. It is but a stepping stone to more elaborate
systems.
REFERENCES
1 J LENAHAN
Synthesis of an interactive human-machine system
PhD Thesis University of Wisconsin 1969
2 F KOENIG
Formal analysis for a general system of interactive automata
Presented at the 3rd Hawaii International Conference on
System Sciences January 1970
3 D CALAHAN
Circuit design application of the Michigan terminal system
IEEE Transaction on Education Vol E-12 No 3
September 1969
4 M MELKANOFF
The use of on-line graphical computer systems for student
research
IEEE Transactions on Education Vol E-12 No 3
September 1969

Hybrid terminal system for simulation in science education
by DONALD C. MARTIN
North Carolina State University
Raleigh, North Carolina

ONCE UPON A TIME - - -

Once upon a time ... there was an old woman who lived in a shoe. She had so many children, she didn't know what to do ....
... Precisely the problem which faces the professor who wishes to use computer simulation in education today. Like the old woman, he has no serious difficulty with the older children who are helping around the house or in graduate school, but what can be accomplished with the thousands of youngsters now lodged in or entering that shoe we call the university. This paper is a progress report on a new approach taken by one school to introduce continuous system simulation concepts to all four thousand of its children.
Once upon a time ... we attempted to teach analog computer programming in various engineering and science oriented courses. The majority of students objected, and well they should. Ten years ago, they objected by writing short references to the professor on the desks or bathroom walls. Today, they simply ask in class: where will I use the analog computer after I enter the real world? The answer is, of course, that ninety-five percent or more will never see a general purpose analog computer after they graduate. They all need to know about analog and digital simulation, operational amplifiers, basic electronics and signal conditioning, but the majority will never require the ability to patch a six degree of freedom or nuclear reactor simulation. Wouldn't it be nice if they could all use such simulations to study complex systems response without the patching exercise?
Once upon a time ... it seemed that digital simulation would be the answer to the old patching and scaling problem. Once students grasped the idea of implicit or bootstrap solution of differential equations, they could easily learn a digital simulation language structure in several hours. The scaling problem largely disappears and the output can be displayed on an oscilloscope or plotted. The difficulty with most digital simulation languages is speed and availability. It is far too expensive for most universities to provide sufficient remote terminals to effectively service an engineering school of several thousand students. One terminal for every fifty students is not unrealistic and even this number will lead to long hours waiting for access during prime time. The time shared terminals currently being developed by Granino Korn 1 in project DARE at the University of Arizona are certainly encouraging and may change this picture in the future. At any rate, it is fair to say that the high cost of interactive digital terminals for continuous system simulation severely restricts their extensive use in the undergraduate science education program today.
Once upon a time ... students were asked-no, required-to submit lengthy laboratory reports to be read and graded by the graduate teaching assistant. The hours of hand calculations were gradually superseded by reams of computer output. It would appear that much of the undergraduate laboratory could be improved by using simulation techniques and programmed instructional material. The student would be required to answer specific questions about the physical system, either real or simulated, to demonstrate his acquisition of the information presented in the laboratory.
Now ... It is because of the nature of interactive simulation and its use as an aid to the students' understanding of physical processes that the hybrid terminal system described in the remainder of this report was developed. An ideal system includes the high speed, interactive, graphic display capability of the analog computer along with the memory, program storage and terminal capability of the digital computer. The design parameters for such a terminal were recently described. 2 These hybrid terminals evolved from earlier work with student evaluation of simple and inexpensive analog computer terminals. 3 The final terminal design was based on the premise that the student need not learn analog patching to use the hybrid terminal and that programmed instruction type laboratory handouts should replace the traditional concept of submitting hard copy computer results of a simulation.

Figure 1-Hybrid terminal system


clined panel. The control functions available to the
student include power on-off, store, non-store, and erase.
Indicator lights are also provided for terminal identification, error and terminal ready status.
The primary output display device is a Tektronix
type 601 storage screen oscilloscope. Output scaling is
automatically provided for in the digital software. A
six digit display is available to return parameter values
and other numbers from the digital control computer.
The student controls the analog or digital simulation
from the keyboard. There is provision for selecting
from two X and four Y channels for the display of
analog signals from any of four pre-patched analog
problems. The student sets any of eight function
switches and enters values for up to eight parameters
from this keyboard. The current value of any parameter
can be displayed in the six digit window on command.
Either E or F format can be selected by setting a two
position switch. Hybrid problems are all addressed as
problem five, i.e., selecting problem 52 is for digital
simulation, problem 51 for least squares fit of data,
etc. Provision is made on the terminal for automatically
incrementing a parameter through a range of values and
for control of a cursor to locate points on the analog
display.

THE HYBRID TERMINAL SYSTEM
Instructor's control terminal

The hybrid terminal system developed at North
Carolina State University is the largest system in the
world devoted to undergraduate system simulation
utilizing both analog and digital computers. This
system, funded by the National Science Foundation,
was designed and installed in less than twelve months.
The hardware was supplied by Electronic Associates
and the software was developed by student programmers at the university.
A flow sheet for the classroom hybrid terminal
system is given in Figure 1. The system consists of the
following components:

The instructor has access to a terminal which is
similar to the students' as shown in Figure 3. The major
difference lies in the output display. This terminal uses a
Tektronix type 4501 storage oscilloscope with television

a. Sixteen student simulation terminals.
b. An instructor's control terminal.
c. A digital mini-computer with teletype and cassette
tape I/O.
d. Control interface to the analog computer.
e. Channel communication link with an IBM 1130
for hybrid problems.
The student simulation terminals

The basic terminal configuration is shown in Figure 2.
All display functions are located on the upper display
panel. Control and data input are provided on the in-

Figure 2-Student simulation terminal


output. The instructor can display his solution to the
entire class on a large screen television monitor and mix
visuals with the computer output. By selecting the
appropriate terminal number, he can also display any
student's solution on the monitor. He also has the
capability of obtaining a hard copy of his or a student's
solution of any given problem in ten seconds with a
Tektronix type 4601 hard copy unit.
Digital control computer

The heart of the hybrid terminal system is the digital
control computer. This is a PDP-8 with 4K core and
cassette tape drive for program storage. This computer
collects and stores data from the simulation terminals
until a solution is requested by a student. When a
solution request is received for an analog problem,
appropriate problem number, outputs and function
switch settings are transferred to the multiplexer. All
parameters are normalized, the digital to analog converter set, and the analog is placed in the operate
mode. Only the oscilloscope of the terminal which
initiated the solution request is unblanked. The basic
cycle time for this process is forty milliseconds although
solution time can be extended by the instructor if
desired.
If a hybrid problem is selected by a student, the
PDP-8 computer interrupts the IBM 1130 and initiates
the transfer of the appropriate program from disk to
core. All further entries from this particular terminal
are then transferred to the IBM 1130 until the hybrid
program has been executed or terminated by the


student. For a hybrid or digital simulation program the
IBM 1130 stacks interrupts and stores input on the disk
for sequential execution. After execution, the IBM 1130
interrupts the PDP-8 control computer which steers
the output back to the proper terminal.
The instructor uses a special conversational language
developed by our programmers for setting up student
application programs. This language makes use of the
cassette tape recorder to store maximum and minimum
values of parameters, analog operate and reset time,
and output display scaling information. The only
expertise required by the instructor who wishes to
employ this system in his course is basic analog computer programming and a knowledge of FORTRAN
for digital applications.
Control interface

The interface couples the PDP-8 control computer
with the analog computer multiplexer, and terminals.
If the student wishes to enter a parameter value, this
information is transferred digitally from a 32 bit shift
register in his terminal to his core area in the PDP-8
through this interface. If a parameter display is requested, values are transferred through the interface
back to the display window of the terminal. A solution
request directs the interface logic to transfer all information collected from a given terminal to the analog
computer. The interface clock determines basic cycle
time for analog problems. This interface also contains
the logic for interrupt processing between the PDP-8
and IBM 1130 computers for hybrid or digital problems.
Channel communication link

Since the PDP-8 control computer only has a 4K
core, hybrid and digital problems were extremely
limited on the basic system. To implement digital
simulation and more extensive hybrid problems, the
PDP-8 was coupled to an IBM 1130 located in a nearby
process control laboratory. Each computer treats the
other as an additional device operating under interrupt
control. A basic monitor was written for the IBM 1130
to transfer terminal information to disk and call stored
digital programs.
TERMINAL CLASSROOM

Figure 3-Instructor control terminal

The sixteen terminals are installed in a classroom
near the computer room as shown in Figure 4. Three
terminals are located on a large conference table near the
instructor's console. These are used for small groups of


basis, the analog solution response time for other terminal users would not be degraded. It should be noted
that only one hybrid or digital simulation problem can
be entered and executed from an individual terminal at
any given time. Any attempt to enter a second hybrid
problem before completion or deletion of the first results
in an error message on the screen. This error message
directs the new user to a terminal which is not busy.
This feature is particularly important when using the
digital simulation program which can result in individual terminal response times of 20-30 minutes for
multiple users. In such instances, the user can leave the
terminal and return at a later time to request his
solution. Again, analog solution time is not affected on
the remaining terminals.

Figure 4-Terminal classroom

students in a seminar-discussion mode to provide a high
degree of interaction with the instructor. The remaining
terminals are located in restaurant-type booths designed
for use by two students. The classroom can handle
thirty students comfortably. The terminals are series
connected with the last terminal of the string located
approximately 350 feet from the computer room.
SYSTEM RESPONSE TIME
The basic system response time for analog programs
is 40 milliseconds. This time is limited by the old but
reliable relay mode control analog computer used in our
system. When a student requests a solution after
selecting a problem and setting parameters, the next
twenty milliseconds are used to set the multiplexer, set
any or all of the eight DA converters to his parameter
values and reset initial conditions on the analog computer. During the next twenty millisecond time span,
his display oscilloscope is unblanked to store or view
the problem solution curves. In this mode of operation,
the worst case response time when all sixteen students
request a solution simultaneously is 640 milliseconds.
The response time for hybrid problems and digital
simulation is naturally dependent on the specific
application. A typical illustration is the least squares
data program described in the next section. This program requires from thirty seconds to one minute
execution time on the IBM 1130, the time dependent
on the order of polynomial requested. Thus, if two or
three terminals are using this program, an individual
would have to wait several minutes for a solution. Since
the communication channel operates on an interrupt

TYPICAL PROBLEMS IMPLEMENTED ON
THE SYSTEM
The types of problems which can be implemented
on the hybrid terminal system are illustrated with three
specific examples. The first is an analog water pollution
study used at the freshman and sophomore level and
the second is a digital data reduction problem used by
juniors and seniors to analyze laboratory data. The third
example illustrates control features of the digital
simulation language.
A n analog problem

This problem demonstrates the effect of dumping
pollutants in a stream at two different points. The

Figure 5-Typical output for an analog problem


primary display presented to the student is the typical
oxygen sag curve, but he can also look at the waste
decay function. The model consists of four first order
differential equations, two for each town on the stream.
The student can control the decay rates characteristic
of the waste, the rate of dumping waste material, and
the reaeration coefficient as a function of stream velocity
and turbulence. He is given a programmed instruction
type handout, and asked to determine from the terminal, waste characteristics, rates, and stream aeration
coefficients so that the oxygen level will remain above
habitable levels for fish population. This problem is
designed for a two hour laboratory session. Since this
is an all analog simulation, response time is excellent.
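The four equations are not listed in the paper; the following Fortran sketch shows one plausible Streeter-Phelps style formulation (a waste load and an oxygen deficit for each of the two towns) integrated by simple Euler steps. The rate constants, waste loads, the dumping point of the second town, and the saturation level are assumed values chosen only for illustration.

C     ILLUSTRATIVE SKETCH ONLY -- NOT THE PROGRAM PATCHED ON THE
C     ANALOG COMPUTER.  TWO-TOWN OXYGEN SAG MODEL: WASTE LOAD XL
C     AND OXYGEN DEFICIT D FOR EACH TOWN, EULER INTEGRATION.
C     ALL RATE CONSTANTS, LOADS AND TIMES ARE ASSUMED VALUES.
      XKD = 0.3
      XKR = 0.5
      XL1 = 8.0
      D1  = 0.5
      XL2 = 0.0
      D2  = 0.0
      DT  = 0.01
      T   = 0.0
   10 CONTINUE
C        SECOND TOWN DUMPS ITS WASTE DOWNSTREAM (HERE AT T = 2.)
         W2 = 0.0
         IF (T .GE. 2.0) W2 = 4.0
         XL1 = XL1 + DT*(-XKD*XL1)
         D1  = D1  + DT*(XKD*XL1 - XKR*D1)
         XL2 = XL2 + DT*(-XKD*XL2 + W2)
         D2  = D2  + DT*(XKD*XL2 - XKR*D2)
         T   = T + DT
C        OXYGEN SAG CURVE = ASSUMED SATURATION MINUS TOTAL DEFICIT
         WRITE (6,*) T, 8.0 - (D1 + D2)
      IF (T .LT. 10.0) GO TO 10
      STOP
      END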
A typical illustration of the parameter increment
features of the terminal is shown in Figure 5. In this
case the student has incremented the quantity of waste
dumped at the second town from its minimum to
maximum value. For single student operation, this plot
is obtained in 400 milliseconds. If all were requesting
ten solutions of this or another analog problem at the
same time, the plot would have been obtained in about
seven seconds. Student feedback from this problem
has been very favorable, partly because of the current
nature of the problem.
A digital problem

A polynomial fit of experimental data serves to
illustrate the terminal system capability in the digital
mode. When the student selects any problem number
beginning with the numeral five, his terminal is coupled


Figure 7-Output of digital curve fitting problem

to the IBM 1130 through the communication link.
Figure 6 shows the response received at the terminal
for the least squares program, number 51. The student
can then either select card input for data or enter the
data on the terminal keyboard in XY pairs. He then
specifies the order of the polynomial and requests a
solution. The data, curve and coefficients are returned
as shown in Figure 7 in approximately one minute. For
this particular plot, the student requested a first, second,
and fourth order polynomial and then requested the
coefficients for the fourth order case. The user scales
the output by entering the percentage of the vertical
screen he wishes the input data to occupy. Naturally
the response time would be on the order of sixteen
minutes if all terminals requested a solution simultaneously, but such use is unrealistic with this type
of program.
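The least squares program itself is not reproduced in the paper. As a rough sketch of the underlying calculation only, the Fortran subroutine below fits a first order (straight line) polynomial Y = A + B*X to N data pairs by the normal equations; the terminal program additionally handles higher order polynomials, card input, and display scaling, none of which is shown here.

C     ILLUSTRATIVE SKETCH ONLY -- FIRST ORDER LEAST SQUARES FIT
C     OF N DATA PAIRS BY THE NORMAL EQUATIONS.
      SUBROUTINE LSQ1 (X, Y, N, A, B)
      DIMENSION X(N), Y(N)
      SX  = 0.0
      SY  = 0.0
      SXX = 0.0
      SXY = 0.0
      DO 10 I = 1, N
         SX  = SX  + X(I)
         SY  = SY  + Y(I)
         SXX = SXX + X(I)*X(I)
         SXY = SXY + X(I)*Y(I)
   10 CONTINUE
      XN = FLOAT(N)
C     SLOPE B AND INTERCEPT A OF THE FITTED LINE Y = A + B*X
      B = (XN*SXY - SX*SY)/(XN*SXX - SX*SX)
      A = (SY - B*SX)/XN
      RETURN
      END

A call such as CALL LSQ1 (X, Y, 25, A, B) would return the intercept and slope for twenty-five keyed-in pairs; the subroutine name and argument order are, of course, hypothetical.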
A digital simulation problem

Figure 6-Initialization of the least squares program

The digital simulation program is accessed in the
same manner as other digital programs, i.e., selection of
problem 52 provides the terminal response shown in
Figure 8. The student can enter configuration statements, parameters, and control statements as indicated.
Incidentally, the graphics package written by our
student programmers requires about 400 milliseconds
to fill the screen. Programs which exceed the line
capability of the display are paged. Since the IBM 1130
simulation language is slow, between two and eight
minutes are required for digital execution time. The
user's program is tagged and he can return later to


obtain his output if desired. Another user can solve an analog problem from this terminal, but if he attempts to enter a digital or hybrid program, a message on the screen directs him to another terminal which is not busy. In this case, a busy terminal is one which has any hybrid solution pending. This problem could be avoided with a larger digital computer with additional storage capability.
CLASSROOM EXPERIENCE
At the time of this writing, the system has been installed and used for one semester. Approximately one thousand students have used the system during this initial phase of operation. These students were from the Chemical Engineering, Engineering Operations, Civil Engineering, and Computer Science Departments. Many others have expressed interest in using the system and will be developing application programs during the summer. One interesting use has been by the Freshman Engineering Division. These students take an introductory engineering orientation course and have no background in differential equations or system simulations. They seemed to have little or no difficulty in understanding the concepts presented in the study of a simple stream pollution simulation.
The time required for a student to learn the operation of these terminals is about two hours. Most of this time is spent in overcoming a natural reluctance to press the buttons without detailed instructions concerning their function. Once they decide for themselves, for instance, that a value entered on the keyboard will not alter a parameter value unless the ENTER key is used, they

Figure 8-Digital simulation control

seldom have further hesitation. We try to ensure that the first two problems used by a given group are of the programmed instruction type with very detailed instructions on terminal operation. Subsequent handouts generally include detailed description of the physical system being studied but not much information on terminal operation.
Typical problems used in an introductory chemical engineering systems analysis course include:

Session one: An introductory session used to teach
terminal operation. The example used is the filling of
tanks of various sizes with different fluid flow rates.
Session two: A perfectly mixed tank is forced with a
step function and sine wave to illustrate superposition
for linear systems. The student controls the tank
volume, inflow rate, input frequency, and initial concentration. He uses the simulation to reinforce the
concept of time constants and determines phase lags and
amplitude ratios for sinusoidal forcing of a linear system (a short sketch of these relations appears after the session descriptions).
Session three: The student studies the response of
first order systems in series from the oxygen sag curves
of a two town river pollution model.
Session four: A mercury manometer is forced with a
step function and the student observes the actual
response in terms of frequency and damping. He then
uses the terminal with a second order model to see how
closely calculated values of the parameters in the
simulation fit the experimental data.
Session five: A heated tank model is studied to
introduce the idea of proportional, integral and derivative control. The student varies control parameters to
demonstrate the idea of stability of such control
systems.
Additional laboratory sessions introduce frequency
response, sampled data systems, etc. The laboratory
report consists of answering questions related to system
response as determined from the simulation.
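The handouts themselves are not reproduced here. As one concrete illustration of the quantities extracted in session two, the amplitude ratio and phase lag of a first order mixed tank under sinusoidal forcing can be computed in a few Fortran statements; the time constant and forcing frequency below are arbitrary illustrative values.

C     ILLUSTRATIVE SKETCH FOR SESSION TWO -- FIRST ORDER MIXED TANK.
C     TAU = V/Q IS THE TIME CONSTANT, W THE FORCING FREQUENCY.
C     BOTH VALUES ARE ARBITRARY ILLUSTRATIONS.
      TAU = 5.0
      W   = 0.5
C     AMPLITUDE RATIO AND PHASE LAG FOR SINUSOIDAL FORCING
      AR  = 1.0/SQRT(1.0 + (W*TAU)**2)
      PHI = ATAN(W*TAU)
      WRITE (6,*) AR, PHI
      STOP
      END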
STUDENT FEEDBACK
Initial feedback from the students using this system
has been excellent. They are asked to comment on each
experiment and hand in their comment sheet without
identifying themselves. Approximately 80 percent are
ecstatic and make such comments as:
"The most interesting lab in the university."
"I never understood stability until I actually saw the
system response on the terminal."
"The labs improve with time because as you go on
you become more accustomed to the machine. It
was also easier to follow instructions for this third


experiment. They made more sense. Too bad we
have to stop when we are barely started."
"Lab so far has been very interesting, and more
important, useful. My only concern (sad but true)
is how these reports will be graded."
As expected, a few adverse comments were received, i.e.,
"This lab can be instructive but not in two hours!
The time to fully understand the computer and the
system is much closer to 5 or 6 hours."
, "I feel lost but I'm learning."
"Too much material to cover in the time allotted."
"Very interesting but I think I need more background
in differential equations to fully appreciate the
problems."
In the particular course being evaluated, some
students had completed a course in differential equations, but most were taking differential equations
concurrently. In talking to students who felt they were
not receiving much benefit from the simulation terminals, it turns out that they were a semester behind in
their mathematics sequence and would not take differential equations until next semester. Since the terminals
were successfully used in the freshman orientation
course, it is apparent that we used the wrong handout
material for these students.
CONCLUSIONS
Student response to the use of this hybrid simulation
laboratory has been very encouraging. It is apparent
that we can indeed introduce the concepts of simulation


in the study of dynamic systems to all of our children.
The key to success is the development of the student
handout material which must be oriented toward both
the specific course material and the background of the
student. The cost of the system described is approximately $80,000, not including the IBM 1130. The
system can easily accommodate several thousand
students per year. If the cost were amortized over a
five-year period, the cost would be something on the
order of eight to ten dollars per student per year for
essentially unlimited use. Quite a bargain for simulation
terminals when connect time, line charges and CPU
time are considered for conventional digital terminals.
While these terminals are more limited in scope, we are
convinced that we will see significant improvement in
the students' understanding of dynamic systems as the
terminals are used in additional curricula.
ACKNOWLEDGMENT
The financial support of the National Science Foundation to develop and evaluate these hybrid terminals is
gratefully acknowledged.
REFERENCES
1 G A KORN
The handwriting on the CRT
SIMULATION p 319 June 1969
2 D C MARTIN
A different approach to classroom computer use
ACEUG TRANSACTIONS Vol 1 No 1 January 1969
3 D C MARTIN
Development of analog/hybrid terminals for teaching system
dynamics
Fall Joint Computer Conference 1970

BIOMOD-An interactive computer
graphics system for modeling*
by G. F. GRONER, R. L. CLARK, R. A. BERMAN, and E. C. DeLAND
The Rand Corporation
Santa Monica, California

To accomplish this, the investigators must have
available to them a computer system that can simulate
large models and produce accurate, repeatable results.
In addition, the computer system should provide aids
for describing, analyzing, and verifying the model,
mechanisms for changing the model and rerunning the
simulation, and facilities for saving the model description, simulation run results, and data about the system
being modeled.
An investigator can be most effective if he, the person
who understands the real system, can directly and
easily develop and operate the computer simulation of
that system. The computer system should allow him to
describe his model using terminology that is meaningful
to him, to develop, modify and run his model by taking
actions natural to him, and to examine his model's
behavior in those ways that are most understandable
to him. Because continuous systems are usually described by combinations of block diagrams, mathematical statements, logical operations, ordinary and
partial differential equations, chemical equations,
transfer functions, graphs, and tables of data, the
computer system should be able to interpret these modes
of representation. Modelers communicate by sketching
diagrams, writing equations, drawing curves, and
typing text, so the computer system should allow for
these natural forms of input. Because the behavior of
real systems is usually presented through graphs or
through tables of numbers, the computer system should
present the behavior of simulated models in these
forms.
A succession of digital computer programs for
simulating continuous systems 1-3 provides analog-computer-like elements, mathematical statements, logical operations, printed graphs, and the ability to
automatically modify parameters and rerun. These
programs are usually run in the batch mode, however,
so an investigator can neither directly develop his
model nor control its operation. Some recent computer

INTRODUCTION
Many of those involved in improving the quality of life
often model and simulate continuous systems as part
of their work. For example, one group of investigators
may model an oil refinery to learn how to produce a
new fuel efficiently, while another may simulate a
global weather model to determine the effect of burning
large quantities of the fuel at high altitudes. An urban
planning team may simulate the water flow in an
estuary to discover the best location for a new sewage
treatment plant or the effect of a proposed breakwater.
A medical team may simulate the bodily distribution of
drugs to determine optimal dosage amounts and intervals, while another team might simulate the blood-volume control system to design a more efficient
artificial kidney.
All of these investigators may take a common approach to solving their problem. This approach involves
the following steps:
• Develop a mathematical model based on data and
experience.
• Represent the model in terms suitable for an
analog, hybrid, or digital computer.
• Run a computer simulation for a number of
situations where the real system behavior is known.
• Adjust the model structure and parameters until
it behaves the same as the real system.
• Run the simulation to predict the system behavior
in new situations.
• Continue to collect data and run the simulation to
verify the model and learn more about the real
system.

* This research was supported by Public Health Service Grant No. 1-R01-GM-15896. Any views or conclusions contained in this paper should not be interpreted as representing the official opinion or policy of The National Institutes of Health or The Rand Corporation.


A user may represent a model by a block diagram,
each component of which, in turn, may be defined by
another block diagram. This hierarchical structuring
enables him to organize his model into meaningful
substructures. At the most detailed level of the structure, he defines blocks by analog-computer-like elements, algebraic, differential, or chemical equations,
and/or Fortran statements. A modeler may thus define
his model in whatever terminology is meaningful to him.
Displayed curves are continuously and automatically
updated during model simulation. At any time, a user
may stop the simulation, display curves for different
variables, change scales, alter simulation parameters,
or modify the integration method, and then continue
the simulation or revise the description of his model.
The next section demonstrates BIOMOD by presenting a scenario of how a user might describe and
simulate a simple, but important, model. The third
section describes BIOMOD's features more fully. The
paper concludes with a brief description of the
BIOMOD system implementation and a discussion of
experience with users.
Figure 1-A BIOMOD user at a console

systems4-7 employ graphic consoles attached to computers to allow investigators to interactively construct
and operate models. These systems are, however, still
lacking in man-machine communication techniques.
The BIOMOD system has been developed specifically
to make modeling continuous systems more convenient
for investigators who are unsophisticated in the use of
computers. It accomplishes this by providing a high
degree of interaction, graphical displays, user-oriented
model-definition languages, and flexible, in-depth
model structuring.
A modeler uses BIOMOD via an interactive graphics
console comprising a television screen, data tablet, and
keyboard (Figure 1). The system takes full advantage
of the screen/tablet two-dimensionality by allowing the
user to shift his attention from one place to another
and take any appropriate action at will. He may communicate with the system by typing with the keyboard,
and by handprinting or pointing with the tablet pen.
BIOMOD responds immediately with its interpretation
of user actions. This interaction is present in the system
at several levels. For example, when the user handprints a character, the system displays its interpretation
as a stylized character in the respective position on the
CRT; when he completes a statement, the system either
lists the variables appearing in the statement, or
provides an error diagnostic. When he completes the
description of his model, the system generates an
executable program and runs the simulation.

A MODEL FOR EVALUATING DRUG
ADMINISTRATION POLICIES
Medications and their prescribed dosages are designed
to maintain a critical amount of drug in the blood for a
specified period of time. The conventional method for
determining optimal dosages involves numerous laboratory experiments. If the drug effects can be modeled,
however, a more efficient method is to experiment by
running computer simulations.
One technique for maintaining the prescribed
amount of drug is to use a capsule containing a large
number of differently coated pellets. The pellets dissolve at different times, so that, as drug leaves the
blood, it is replaced by drug released by newly dissolved pellets. Garrett and Lambert8 have proposed
the following model to describe this situation. A capsule
comprises a number of pellet populations with different
mean times of release. The rate of drug release for each
population is assumed to be normally distributed (with
the same standard deviation for each population) about
the mean time of release for that population. The rate
of adding drug to the body is therefore specified by a
sum of normal distributions. The transfer of drug
through the body is described by
drug --> GI --kGI,B--> B --kB,U--> U
where GI refers to the gastrointestinal tract, B to the


blood, and U to the urine and other excretory parts of
the body; this means that drug flows from GI to B at
rate kGI,B and from B to U at rate kB,U. Thus, for
example,
dDB/dt = kGI,B DGI - kB,U DB
where DB is the amount of drug in B, and DGI the
amount of drug in GI.
Given this description together with the requisite
parameter values, BIOMOD can be used to simulate
the model. Note that the following dialogue describes
only one of many possible ways of reaching the same
goal, and that BIOMOD does not force the user to take
actions in any particular order.
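Before following the interactive dialogue, it may help to see the same model in conventional batch form. The Fortran sketch below is ours, not something generated by BIOMOD: it integrates the two compartments and the four pellet populations by simple Euler steps. The rate constants KGIB and KBU and the standard deviation SIGMA are assumed values; the dosage, mean release times, and initial gastrointestinal amount follow values used later in the example.

C     ILLUSTRATIVE SKETCH ONLY -- GARRETT-LAMBERT CAPSULE MODEL
C     WITH FOUR PELLET POPULATIONS, EULER INTEGRATION.
C     KGIB, KBU AND SIGMA ARE ASSUMED; DOSAGE, MEANS AND THE
C     INITIAL GI AMOUNT FOLLOW VALUES USED IN THE TEXT.
      REAL KGIB, KBU, XM(4)
      DATA XM / 1.0, 2.0, 3.0, 4.0 /
      PI     = 3.141593
      SIGMA  = 0.4
      DOSAGE = 3.0
      KGIB   = 1.5
      KBU    = 1.2
      DGI    = 5.0
      DB     = 0.0
      DU     = 0.0
      DT     = 0.001
      T      = 0.0
   10 CONTINUE
C        TOTAL RELEASE RATE: SUM OF FOUR NORMAL DENSITIES
         REL = 0.0
         DO 20 I = 1, 4
            REL = REL + DOSAGE/(SIGMA*SQRT(2.0*PI))
     1                * EXP(-(T - XM(I))**2/(2.0*SIGMA**2))
   20    CONTINUE
C        COMPARTMENT RATES, E.G. DDB/DT = KGIB*DGI - KBU*DB
         RGI = REL - KGIB*DGI
         RB  = KGIB*DGI - KBU*DB
         RU  = KBU*DB
         DGI = DGI + DT*RGI
         DB  = DB  + DT*RB
         DU  = DU  + DT*RU
         T   = T + DT
      IF (T .LT. 7.0) GO TO 10
      WRITE (6,*) DGI, DB, DU
      STOP
      END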
When using BIOMOD, we communicate via a data
tablet pen and a keyboard. The pen's location on the
tablet is always indicated by a dot displayed in the
corresponding location on the television screen.
BIOMOD's interpretation of user pen actions depends
on where the pen is placed and on what is currently
displayed on the screen. We may handprint characters
in most areas. As we write, a displayed "ink" track
appears to flow from the pen; each time we complete a
character, its track is replaced by a stylized character.
We can change a character by writing another over it.
Some symbols are used for editing; for example, we
may use a caret to insert text, or we may scrub with the
pen to delete text. Some areas displayed on the screen
act as pushbuttons; if we "push" one of these (by
touching the pen down), the system performs the
indicated action. If we push a displayed arrow, a continuous action takes place, such as the rescaling of a
set of curves. Some figures can be "dragged"; if we
"touch" one of these and move the pen, the displayed
figure follows the pen's motion. We may type (with the
keyboard) in any area where writing is possible. The
keyboard cursor may be positioned either with the pen
or with keyboard control keys.
To create our model, we enter our identification,
name our model DRUGS, then begin constructing the
model. Because it has two major components, we first
draw two rectangles; these are replaced by stylized
function boxes. We write CAPSULE in one box and
names of parts of the body in the other box, and then
draw a flowline to connect them. This diagram (Figure
2) provides not only a picture of our model, but also a
means of defining the two components of the model
separately.
To define the capsule component, we first push the
DEFN button on its box. The system replaces the block
diagram with a list of languages that we may choose
from to define the component. The languages are: block
diagrams, mathematical equations, chemical equations,
and Fortran statements. We choose block diagrams so

tll"'ll. ITC

Figure 2-The DRUGS model block diagram

that we can define the capsule as a set of boxes, each
representing a pellet population. This ability to define
a box by another block diagram enables us to organize
a model as a hierarchical collection of a number of
components at different levels. We draw four boxes and
write PILL on the top line of each to represent four
pellet populations. We name a function box in this way
whenever we anticipate using the same function
repeatedly.
Because each population is defined by a normal
distribution, we indicate that we want to define the
PILL function with mathematical equations. BIOMOD
responds by displaying a form for writing algebraic and differential equations. We assume that the probability of a pellet dissolving in an interval about time t is given by the probability density function

p = [1/(σ√(2π))] exp[-(t - m)²/(2σ²)]
Using this function to approximate the drug release
rate by a deterministic variable, we write
P = 1/(SIGMA*SQRT(2*PI))*EXP(-(TIME-MEAN)**2/(2*SIGMA**2)

The system analyzes this statement and immediately
responds with the message
UNBALANCED PARENTHESES


for us to provide the names of the output and mean of
this particular population. We write P1 next to P ←, to name this output P1, and write M1 next to MEAN ←, to name this mean time of release M1. We
similarly establish the correspondence between the
names of variables in the other three pellet populations
and the names (P and MEAN) used when defining the
function PILL.
We can describe the flow of the drug through the
body by chemical equations because these are mathematically equivalent to mass transport equations. When
we push the DEFN button on the box that describes
the body, and select chemical equations, BIOMOD
presents an appropriate form. According to our original
model description, we would like to write

DGI --kGI,B--> DB --kB,U--> DU

or

Figure 3-The definition of a pellet population

We then add a closing parenthesis to correct the statement. We also realize that we should parameterize the
amount of drug released by each pellet population, so
we insert DOSAGE after the equals sign, and scrub the
1. The display now appears as in Figure 3.
BIOMOD has generated separate lists of the defined
and undefined variables; TIME does not appear
because it is always the simulation independent
variable. These lists enable us to indicate which
variables have different meanings or values each time
we use the function. We indicate that the names (and
therefore the values) of PI, SIGMA, and DOSAGE
are the same each time we use the PILL function. This
is because PI is a constant, and because we assume
that the standard deviation and dosage amount are the
same for each population. On the other hand, we
indicate that MEAN may have a different value for
each pellet population.
Now that we have defined the PILL function, we are
ready to use it to define the individual populations. We
push a button to get back to our diagram of the four
PILL boxes, then push the DEFN button on one of
these. Because PILL is now defined, BIOMOD displays

P ←
MEAN ←
PI ← PI
SIGMA ← SIGMA
DOSAGE ← DOSAGE

where 0. indicates that there is no backward flow.
BIOMOD requires that we linearize each equation,
and write the rate coefficients and equation in the
provided columns as
S   KGIB   0.   DGI = DB
S   KBU    0.   DB = DU

Here S (for slow reaction) means that BIOMOD
should derive integral equations from our equation.
Since the pellets release drug into the gastrointestinal
tract, we also write
G   DGI   P1+P2+P3+P4

This statement (with G for gain) indicates that the
gain of DGI, i.e., the increased rate of change of DGI
due to drug entering the body from outside, is equal to
the sum of the rates of drug release from the four pellet
populations.
The model is now defined except for parameter
values. When we indicate that we are ready to provide
these values, BIOMOD displays the names of model
variables and parameters in two separate lists (Figure
4). Names such as DGI0 indicate initial values; they
are derived from the chemical equations by BIOMOD.
We enter the values given or implied by Garrett and
Lambert. We assume that some drug is immediately
released into the gastrointestinal tract and therefore
set DGI0 to 5; the other drug amounts are initially


Figure 4-The model variables and parameters

zero. In order to minimize storage requirements,
BIOMOD limits the number of variables whose values
are saved during simulation and the number of parameters whose values can be modified during simulation.
Since this model is small, we indicate that we want to
save values of (and possibly plot) all the variables, and
that we might want to modify values of all parameters
except PI.
It has taken us less than half an hour to completely describe our model. We now indicate that we would like to simulate it. BIOMOD first produces a CSMP/360 2 program that describes our model and
provides for graphic display of the results. CSMP, in
turn, generates a Fortran program, which is compiled
and linked with other programs required to run the
simulation. If an error is detected at one of these steps,
program listings and error messages are displayed on
the screen; otherwise, our only awareness of the intermediate steps is via displayed messages. The time
required for the translations depends on the load on the
(multiprogrammed) computer; generally it is about
three minutes.
Our DRUGS model translates successfully so the
form shown in Figure 5 is displayed on the screen. As
in the other forms, software pushbuttons appear across
the top. The central area is for selecting numerical
integration methods, modifying parameter values,
examining variable names and values, or plotting


graphs. The areas to the left and below the central area
are for specifying the y and x axes of the graphs.
We expect the values of our model variables to change
smoothly and over several units of TIME, so we
choose a simple integration method-Simpson's method
with step-size = 0.1. This is a fixed step-size method,
so the information regarding variable step-sizes disappears. Before studying how to use a multi-pellet
capsule, we want to ensure proper model behavior when
there is initially some drug in the gastrointestinal
tract, but no capsule. To eliminate the capsule drug we
push the PARAMETERS button to display the list of
modifiable parameters in the central area, then overwrite the value of DOSAGE, changing it to 0. Next we
display the list of plottable variables. Because we are
most interested in the amount of drug in the gastrointestinal tract, blood, and urine, we drag the names
DGI, DB, and DU to the y axis. We want to watch
the model for several simulated hours, so we change the
upper range of TIME (in the small box at the lower
right of the central area) from 1. to 7. We push PLOT;
now we are ready to plot DGI, DB, and DU from 0. to 1. against TIME from 0. to 7. hours.
We push RESTART and the simulation begins
running. We see (from the curves) that DB, and later
DU, are being generated; the "NOW X =" number
changes continuously to indicate the current value of
simulated TIME. Because DGI is plotted off scale


Figure 5-The simulation control form with integration methods



Figure 6-The simulation run with no capsule drug

along the upper boundary, we touch the pen down to
stop the simulation. In order to determine the range of
DGI, we return to the display that lists the variables
along with their current, minimum, and maximum
values. The maximum value of DGI is 5. (its initial
value), so we write 5 over the 1 that specifies the
upper y-axis value, then redisplay the curves. The
curves are now nicely scaled. We continue the simulation, then stop it when we see that nearly all the drug
has entered the urine, and values are changing slowly.
We assume, from the curve's reasonable appearance
(Figure 6), that we described at least the body component of the model correctly, and our choice of
integration method is adequate. We see from the curve
labeled DB that, as expected, the drug remains in the
blood for only a short time.
We reintroduce the capsule drug by changing
DOSAGE back to 3., then restart the simulation and
watch the curves being continuously updated as it runs.
Once it becomes apparent that the capsule is not
effective, i.e., that the value of DB drops too low, we
stop the simulation. Apparently (Figure 7) the drug is
not released from the pellets in time to replace the drug
that leaves the blood. To correct this, we change the
mean times of release from 2., 4., 6., and 8. to 1., 2.,
3., and 4. We then rerun the simulation and get much
better results. The amount of drug in the blood should

Figure 7-The simulation run with the parameter values shown
in Figure 4


Figure 8-The simulation run with means 1., 2., 3., and 4.


be greater than 2.5. In order to determine if this is
achieved, we place the pen down in the central area to
establish an x-y meter, then drag this meter to a place
where Y (corresponding to DB as well as DGI and
DU) is equal to 2.5 (Figure 8). Our choice of means
was good, but they need to be adjusted to maximize the
total duration of capsule effectiveness.
While changing the means for further trials we
realize that rather than controlling four means, we
would prefer to deal only with the first mean and the
interval between mean times of release. To reformulate
the model in this way, we return to its description and
add another box to the definition of the capsule component. In this box we write

M2 = M1 + INTVAL

and similar equations for M3 and M4. This replaces the parameters M2, M3, and M4 with the single parameter
INTVAL, which we set to 1. and mark modifiable. Once
that is accomplished, we retranslate the model, then
continue to resimulate it and change parameters until
we have established a satisfactory drug formulation
and administration policy.

SOME ADDITIONAL BIOMOD FEATURES
The DRUGS example illustrates many, but not all,
of BIOMOD's features. One facility that was not
described is file management. Every new model is saved
on secondary storage, and is filed according to name and
user identification until it is intentionally destroyed.
One may copy, and then modify a model description to
build a family of related but different models.
When drawing a block diagram or writing a set of
equations, a user may run out of space on a displayed
"page." In either case, a blank continuation page may
be obtained by pushing a displayed button. A few lines
of text may similarly be moved off the display to
create writing space.
Text can be edited by overwriting, deleting, closing,
inserting between characters, and inserting between
lines. Block diagrams can also be conveniently edited.
A box or a flowline may be deleted by scrubbing. The
appearance of a block diagram may be improved by
dragging a box to another position or by "stretching"
it (from its lower right corner) to change its size and
shape.
BIOMOD makes available most of the functions
provided by CSMP/360. These include mathematical
functions, logical functions, and signal sources. A box
given the name of one of these functions has its definition provided by the system. Such a box is used in the same way as a user-defined function (e.g., PILL), except that its definition cannot be viewed. Equations may also refer to these functions; BIOMOD checks to see that such a reference includes the proper number of arguments.
Mathematical equations may be differential equations as well as algebraic equations. Derivatives with respect to time are indicated by up to nine prime signs (') or by a prime sign and a digit. The variable defined by an equation need not appear alone to the left of an equals sign. Thus, the equations

M1 d²x1/dt² + (K1 + K2) x1 - K2 x2 = 0
-K2 x1 + M2 d²x2/dt² + B dx2/dt + K2 x2 = 0

may be entered as

M1*X1'' + (K1+K2)*X1 - K2*X2 = 0.
-K2*X1 + M2*X2'' + B*X2' + K2*X2 = 0.

If the user indicates that the first equation defines X1, BIOMOD manipulates it to place X1'' alone at the left, generates integral equations for X1' and X1, and requests initial values for X1 and X1'. The second
equation is handled similarly. The major restrictions
are that the highest derivative of the defined variable
may appear only once in an equation, and may not
appear as a function argument.
Chemical equations may be written as

S   KF   KB   2H2 + O2 = 2H2O

or as

F   KEQ   2H2 + O2 = 2H2O

In the former case, S indicates that BIOMOD is to
generate, and numerically solve, integral equations that
model the (slow) chemical reaction; rate coefficients
(KF and KB in this example) indicate the rate at
which the reaction proceeds in each direction. In
the latter case, F indicates that the (fast) reaction is
to be forced to equilibrium initially and at each successive time step; an equilibrium coefficient (e.g., KEQ)
is used in specifying the equilibrium condition. In
either case, BIOMOD requests initial values of the
chemicals. Rate and equilibrium coefficients may be
defined by either algebraic expressions or numerical
values on a chemical equations form. Each box defined
by chemical equations is treated as a self-contained
compartment separated from the others. Mass flow
between compartments is specified by gain terms as in
the DRUGS example. A non-reacting chemical may be


included in a compartment to affect the concentrations
of the other chemicals.
A function that involves the conditional evaluation of
variables may be specified entirely in terms of Fortran
statements. The allowable statement types are assignment, arithmetic IF, GO TO, and CONTINUE.
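As a hypothetical illustration (not taken from the BIOMOD documentation), a limiter defined with only these statement types might be written as follows; it is wrapped as a function here so that the fragment is self-contained, whereas in BIOMOD the statements would simply define a box.

C     HYPOTHETICAL EXAMPLE OF A FORTRAN-DEFINED BOX: A LIMITER
C     USING ONLY ASSIGNMENT, ARITHMETIC IF, GO TO AND CONTINUE.
      FUNCTION XLIM (X, XMIN, XMAX)
      XLIM = X
      IF (X - XMAX) 20, 20, 10
   10 XLIM = XMAX
      GO TO 40
   20 IF (X - XMIN) 30, 40, 40
   30 XLIM = XMIN
   40 CONTINUE
      RETURN
      END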
While running the simulation, any set of up to five
variables may be plotted against TIME or any other
variable. The plots may be linear, logarithmic, or
semi-logarithmic. The ranges of the axes may be
changed by overwriting the numbers that specify them,
dragging variable values over these numbers, pushing
displayed buttons to gradually magnify or contract
the curves, or pushing buttons to shift the curves.
When the curves are rescaled or shifted, they are redisplayed so fast that they appear to magnify or move
continuously. As the simulation proceeds, the user may
control the intervals at which points along the curves
are plotted, and those at which data values are saved.
Once he begins simulating a model, a user may like
to define a new variable for some purpose, e.g., to
scale a variable so that it has approximately the same
range as others or to plot a boundary value. BIOMOD
allows a user to define such a new variable as a simple
combination of constants and other variables without
requiring retranslation of the model. Such a new
variable may be plotted just as any other. It is not
incorporated into the model, however, and so it cannot
define a parameter such as M2 in the DRUGS model.
If a user wishes to reexamine his model description
during the simulation process, but not modify the model
structure, he indicates this by pushing the TEMPORARY COPY button. BIOMOD saves a translated
copy of the most recently simulated model so that,
after examining the model, a user may simulate it
again without retranslation.
A user may push a displayed button to request
hardcopy of what is currently displayed. Hardcopy is
produced off-line on a Stromberg Datagraphix 4060
film and hardcopy unit. The figures in this paper
(except Figure 1) were generated this way.
IMPLEMENTATION
The BIOMOD system operates on an IBM
System/360, Model 40 or larger, utilizing a partition
of approximately 228,000 bytes. The operating system
may be either the MFT II or MVT version of OS/360,
augmented by Rand's Video Operating System, which
serves as a link to the Rand Video Graphic System. 9
BIOMOD is used from a Video Graphic console comprising a television screen, a data tablet, and a keyboard.

Parts of the model description portion of BIOMOD
were derived from GRAIL, 10-12 a Rand system that
enabled users to draw and execute program flowcharts.
Both BIOMOD and GRAIL were tailored to provide
good response times for operations on complex problem
descriptions while minimizing demands on computer
resources. However, since they are designed for different
applications, they differ in their user-oriented languages.
When a user requests simulation of his model, the
BIOMOD translator interrogates the data structure
that contains the model description, and produces a
CSMP program. Dummy names are generated for
variable names that include primes (') or initial value
symbols (0). Algebraic and differential equations are
rearranged so that a single variable is assigned a value
specified by an expression. Chemical equations are
handled by calls to specially designed subroutines.
Boxes defined by Fortran statements are implemented
as CSMP procedures. User-defined functions are
implemented as CSMP macros. In addition to this
description of a model, the BIOMOD-generated CSMP
program also includes calls to graphics subroutines that
let the user control the simulation and observe results.
The program also includes a subroutine that communicates the names and storage addresses of variables
and parameters to the graphics subroutines.
The CSMP program is passed to the standard
CSMP/360 processor. CSMP sorts statements and
expands macros to produce a Fortran program that
describes the model, and then calls the Fortran compiler and the Linkage Editor to generate an executable
program. The resulting program runs under CSMP
control. The simulation graphics programs are Fortran
subroutines that call assembly language routines to
extend the capabilities of Fortran and to communicate
with the graphics hardware. They extract variable
values and change parameter values directly; they
return codes to control the simulation.
Since a user's model is represented at one point in
the translation as a CSMP program, a user with a batch-mode CSMP program may modify it slightly, then load
it into the BIOMOD system to take advantage of the
simulation graphics facilities. Similarly, a user may
modify a CSMP program produced by BIOMOD, then
run it at another facility in batch mode.
CONCLUSIONS
Users are very enthusiastic about BIOMOD. The
combination of graphics, highly interactive facilities,
and user-oriented languages enables them to get their
work done quickly, and occasionally provides insights


that would have been missed using other techniques.
The graphics not only help in visualizing the model and
the simulation results, but also provide for operating
freely on a two-dimensional surface.
For the most part, the interactive techniques have
been well received. The methods for controlling the
simulation and manipulating graphs are particularly
effective. The pen and tablet are intended to be used
like pencil and paper; however, users tend to use the
tablet pen for printing labels, changing values, editing
text, and dragging, but use the keyboard for entering
equations and anything else that is extensive, because
typing is faster than printing.
Thus far, most BIOMOD users have had previous
experience writing and running batch-mode programs,
and this has influenced the way they develop models.
They tend to be too experienced as programmers to
require all the aids provided by BIOMOD, yet too
inexperienced as modelers to take advantage of all the
power provided. For example, those of our users who
have previously written Fortran and CSMP programs
usually state their models in terms of these formal
languages, rather than write differential and chemical
equations. On the other hand, users with little or no
computer experience find it convenient to describe
models in their own terminology. Since it is often
interesting to study models in the literature, it is
particularly convenient to be able to directly transcribe
their description with, perhaps, minor notational
changes.
Block diagrams are used more to organize the model
into component parts and to make different languages
available for defining some boxes than they are for
using system functions, defining new functions, or
visualizing the flow of signals or mass. Our users have
not learned to take advantage of BIOMOD's hierarchical capability, but rather, construct models at
only one or two levels. This is probably because they
have not yet begun to develop very complex models, and
because hierarchical model structuring has previously
been difficult to accomplish. Another reason is that the
techniques for moving from one part of a model to
another are presently too complicated, particularly
when user-defined functions are involved.
BIOMOD has several inadequacies that we plan to
correct. It is often convenient to specify the initial
state of a model by a set of equations that are evaluated
only once; BIOMOD does not provide for this. Users
would like to draw data curves to define functions or to
compare with simulation results. Although the tablet is
an excellent device for drawing, this feature is not yet
implemented. Users would like to save simulation
results in order to compare various runs, but they can-


not. Users spend a great deal of time adjusting parameter values and rerunning the simulation until they get
the desired results; it would be very helpful to use
parameter identification schemes to automate this
process. Most of these features were taken into consideration when BIOMOD was initially designed and
can be added without any major difficulty.
Other problems are hard to remedy. Users with large
models run into size limitations imposed both by
BIOMOD and by CSMP. These limitations can undoubtedly be relaxed, but it will require more experience
to evaluate tradeoffs, and may require abandoning the
standard version of CSMP. Many CSMP and execution-time error messages are vague. In the present implementation it is difficult, if not impossible, to automatically relate these to a specific part of the user's
model. BIOMOD operates only at The Rand Corporation because of its display-hardware dependencies; we
are presently rewriting large portions of the system to
make it exportable.
Further details about the BIOMOD system are
reported in a user's manual 13 and in a description of its
implementation. 14
REFERENCES
1 J J CLANCY M S FINEBERG
Digital simulation languages: A critique and a guide
AFIPS Conference Proceedings 1965 FJCC Vol 27 pp 23-36
Spartan Books Washington D C 1965
2 System/360 continuous system modeling program
(360A -CX -16 X) application description
IBM Corporation Form No H20-0240-2 August 1968
3 SCi SIMULATION SOFTWARE COMMITTEE
The SCi continuous system simulation language (CSSL)
Simulation Vol 9 No 6 pp 281-303 December 1967
4 H B BASKIN S P MORSE
A multilevel modeling structure for interactive graphic design
IBM Systems Journal Vol 7 Nos 3 and 4 pp 218-229 1968
5 R G RENAUD R F WALTERS
The interactive creation, execution and analysis of
biological simulation using MIMIC on a graphic terminal
Proceedings of the Conference on Applications of
Continuous System Simulation Languages San Francisco
California pp 185-191 1969
6 G A KORN
Project DARE: Differential analyzer replacement
by on-line digital simulation
AFIPS Conference Proceedings 1969 FJCC Vol 35
pp 247-254
AFIPS Press Montvale New Jersey 1969
7 M J MERRITT D S MILLER
MOBSSL-UAF-An augmented block structure
continuous system simulation language for digital and
hybrid computers
AFIPS Conference Proceedings 1969 FJCC Vol 35
pp 255-274
AFIPS Press Montvale New Jersey 1969


8 E R GARRETT H J LAMBERT
Analog computer in drug dosage and formulation design
Journal of Pharmaceutical Sciences Vol 55 No 6
pp 626-634 June 1966
9 K W UNCAPHER
The Rand video graphic system-An approach to a general
user-computer graphic communication system
The Rand Corporation R-753-ARPA April 1971
10 T 0 ELLIS J F HEAFNER W L SIBLEY
The GRAIL project: An experiment in man-machine
communication
Proceedings of the Society for Information Display
Vol 11 No 3 pp 121-129 Third Quarter 1970

Also The Rand Corporation RM-5999-ARPA September 1969
11 T O ELLIS J F HEAFNER W L SIBLEY
The GRAIL language and operations
The Rand Corporation RM-6001-ARPA September 1969
12 T O ELLIS J F HEAFNER W L SIBLEY
The GRAIL system implementation
The Rand Corporation RM-6002-ARPA September 1969
13 R L CLARK G F GRONER R A BERMAN
The BIOMOD user's reference manual
The Rand Corporation R-746-NIH July 1971
14 R L CLARK G F GRONER
The BIOMOD system implementation
The Rand Corporation R-747-NIH July 1971

The future of on-line continuous-system simulation
by HANS M. AUS and GRANINO A. KORN
The University of Arizona
Tucson, Arizona

INTRODUCTION AND REVIEW
The DARE I and DARE II simulation systems each
added a simulation console with graphic and alphanumeric displays to a PDP-9 minicomputer with 16K
of memory and a small disk (Figure 1) and employed a
continuous-system simulation language for simplified
programming. System equations or block statements,
text, and comments, are typed and edited on a CRT
typewriter. Solutions appear on a second CRT and can
be plotted or listed for report preparation; they are
automatically labeled and scaled without any need for
special FORMAT statements. Iterative and statistical
simulation studies involving repeated differential-equation-solving runs are possible.
A DARE system is loaded onto the small PDP-9
disk from a reel of magnetic tape. The TYPE EQUATIONS console light lights, and a "communication
line" at the bottom of the alphanumeric CRT says
DERIVATIVE BLOCK NO. 1-INPUT MODE

Figure 1-PDP-9 and DARE console at the University of Arizona

The operator types first-order differential equations, say

X' = XDOT
XDOT' = ALFA*(1. - X*X)*XDOT - X

and equations introducing "defined variables," such as

E = X - BETA*SIN(X)

in any order. He can intersperse this material with titles, comments, and other report material; each such "comment line" begins with a star to prevent compilation. Program and text can be edited at will on a CRT typewriter, which permits one to move words or lines, to substitute symbols, and also to find specified symbol strings in long programs. Any or all of this material can also be printed out as a hard-copy report at the touch of a console button.
Table look-up functions of one or two variables are simply entered as one- or two-dimensional tables called by function names. An initial display, showing one or two variables against the independent variable T, or a phase-plane plot, is specified with a display statement, say

DISPLAY X, XDOT, T

Note, however, that all state variables and defined variables are also stored for later display or listing in any combination.
The integration routine to be used is selected with a 12-position console switch (DARE I and II), or by typing the method number on the CRT screen (DARE III). There is a choice of predictor/corrector and both fixed- and variable-step Runge-Kutta methods, plus an "implicit" method for stiff-equation systems (DARE I, DARE II, and DARE III have two derivative blocks, permitting simultaneous use of two different integration methods and step sizes).
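As an illustration of the table look-up functions just mentioned, the sketch below shows, in modern Fortran rather than in DARE itself, how a one-dimensional breakpoint table such as the RHO density table of the pilot-ejection listing (Figure 5) might be evaluated by linear interpolation. The program, the function name tlu1, and its calling convention are illustrative assumptions, not part of DARE.

program table_demo
  implicit none
  ! Excerpt of the RHO(H) breakpoint table from Figure 5 (density vs. altitude)
  real :: hs(4) = (/ 0.0, 1.0e3, 2.0e3, 4.0e3 /)
  real :: rs(4) = (/ 2.377e-3, 2.303e-3, 2.241e-3, 2.117e-3 /)
  print *, 'RHO at 1500 ft is approximately', tlu1(1.5e3, hs, rs, 4)
contains
  real function tlu1(x, xs, ys, n)
    ! One-dimensional table look-up with linear interpolation between
    ! breakpoints; arguments outside the table are clamped to the end points.
    integer, intent(in) :: n
    real, intent(in)    :: x, xs(n), ys(n)
    integer :: i
    tlu1 = ys(n)
    if (x <= xs(1)) then
      tlu1 = ys(1)
    else
      do i = 2, n
        if (x <= xs(i)) then
          tlu1 = ys(i-1) + (ys(i) - ys(i-1))*(x - xs(i-1))/(xs(i) - xs(i-1))
          exit
        end if
      end do
    end if
  end function tlu1
end program table_demo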

If we want to run a complete simulation study requiring multiple differential-equation-solving runs, iterative adjustment of parameters or initial values, and/or crossplotting or statistical evaluation of results obtained in successive runs, we type OPEN LOGIC to call for a DARE logic block and proceed to type a FORTRAN IV program such as

CALL RUN
ALFA = 2.5
CALL RUN
ALFA = 3.7
etc.

This logic block will then take control of the computation and call for successive equation-solving runs with suitable parameter changes, which could also depend on results from past solutions.
Our program is now complete except for initial-value and parameter settings (corresponding to potentiometer settings on an analog computer). We push the COMPILE button on the console.
WHAT THE SYSTEM DOES:
STAND-ALONE-COMPUTER SYSTEMS
The edited DARE I, DARE II, or DARE III
program (source-language program) is partially in core
and partially on the small local (PDP-9) disk. To see
what each DARE system must do, let us first look at a
stand-alone-minicomputer system, say DARE I, which
can handle 20 first-order state equations. This will make
it easier to understand the larger time-sharing systems.
The COMPILE button causes compilation and
loading in several overlays from the minicomputer
disk. A precompiler (translator) first sorts the system
equations into a FORTRAN program so that no statement can call for as yet uncomputed quantities.
Undefined parameters (BETA, and the first-run value
of ALFA in our example) are left over in this sorting
process and will be presented automatically to the
operator, who must supply numerical values. The
FORTRAN compiler is loaded next and compiles the
FORTRAN program, including the logic block. Finally
a linking loader loads the resulting binary program
together with any library routines needed (such as sine
or square-root functions). The alphanumeric CRT
screen now displays the names of all initial-value
settings and as yet undefined parameters, say,
X =
XDOT =
ALFA =
BETA =

together with the simulation parameters DT (integration step), TMAX (total computation time) and, in variable-step integration routines, also EMAX, the maximum allowable local truncation error.

Figure 2-Closeup of DARE control panel, showing method switch, sense switches, and lighted control buttons
The operator then simply enters the desired values
on the CRT typewriter. Note that he did not have to
remember which quantities needed to be specified;
the CRT screen told him.
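To make the sorting step concrete, here is a minimal modern-Fortran sketch of the kind of program the precompiler's output amounts to for the Van der Pol example above. It is an illustration under simplifying assumptions (plain Euler integration, made-up parameter and initial values), not the code DARE actually generates.

program vdp_sketch
  implicit none
  real :: x, xdot, dx, dxdot, alfa, beta, e, t, dt, tmax
  alfa = 2.5                   ! parameter values the operator would type in
  beta = 0.5
  x    = 1.0                   ! initial values
  xdot = 0.0
  dt   = 0.01                  ! DT, the integration step
  tmax = 20.0                  ! TMAX, the total computation time
  t    = 0.0
  do while (t < tmax)
     ! derivative block, already sorted so nothing is used before it is computed
     e     = x - beta*sin(x)                  ! "defined variable"
     dx    = xdot                             ! X'
     dxdot = alfa*(1.0 - x*x)*xdot - x        ! XDOT'
     x     = x + dt*dx                        ! Euler step here for brevity; DARE
     xdot  = xdot + dt*dxdot                  ! would use the selected routine
     t     = t + dt
  end do
  print *, 'X =', x, '  XDOT =', xdot, '  E =', e
end program vdp_sketch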
We are now ready to solve the differential equations.
Pushing the COMPUTE button on the console starts
the computation; the initial solution display will
appear on the graphic-display CRT. If there is a logic
block, solutions will automatically proceed through the
desired iteration sequence.
After a solution is complete, the operator can push a
RESET button on the console to display the parameter
values again, type new ones, and restart the solution
at once by pushing the COMPUTE button (this
corresponds to resetting and restarting an analog-computer solution). It is also possible to change the
integration routine, DT, TMAX, and EMAX without


recompiling, and to use sense switches on the console.
If one wishes to change the differential equations in a
more radical way, one pushes the RESTART button
on the console; this will again display the differential
equations, which can now be changed and recompiled.
After a set of differential equations has been solved,
the SELECT DISPLAY button on the console loads
another PDP-9 overlay which permits one to recall
time histories of all state variables and/or defined
variables, and also crossplots from different solutions,
from the disk. The following display options are
obtained by simply typing codes on the alphanumeric
CRT typewriter.

1. Plot up to 4 variables against T on the same or different reference axes. Curves can be in 4 colors.
2. Plot any variable against any other (phase-plane plots).
3. Tabulate up to 4 variables on CRT or teletypewriter.
4. Obtain hard-copy plots on the 4-channel strip-chart recorder or XY recorder.

The possibility of plotting any set of variables against time or any other variable, including variables saved from preceding computer runs by a special option code, is not only very convenient for evaluation of results and report preparation, but also constitutes a useful debugging aid.
TIME-SHARING SYSTEMS
Large simulation problems require more powerful
computers, and such computers are too expensive to
wait idly while an engineer engaged in on-line simulation
thinks or interprets results. We cannot tie up a large
computer for interactive simulation without some type
of time-sharing. A medium-sized computer (of the
order of an XDS SIGMA 5) with a sufficiently large
sector-protected memory could be time-shared between
one interactive simulation and a batch-processed background program, which would be interrupted by the
simulation runs.
Perhaps the main attraction of such a system is that
the simulation laboratory would have its own computer.
For more cost-effective time-sharing of a larger digital
computer, and for multiple interactive simulations,
simulation programs will have to be swapped in and
out from a system disk. It would also seem expedient to
arrange priorities and time slots so that each simulation run or each iterative sequence of simulation runs is considered as a job which ordinarily cannot be interrupted by other users. Typical runs might take seconds,

but large iterative sequences could take much longer. Quite frequently, interactive simulation would be employed mainly to debug some initial runs, with the rest of the study deferred for batch processing at night.

Figure 3-Time sequential flow chart of DARE III
In developing DARE III, we were confronted with
the ponderous organization of a university computing
center with a CDC-6400 in an iron-clad operating
system (CDC SCOPE), which we did not want to
change. The closest approximation to time-sharing was
the CDC INTERCOM system used at the University
mainly for remote batch-processing job entry and
printing. Since the 6000-series INTERCOM combination is widely used in Universities, it was a worthwhile
challenge to develop a suitable simulation system.
DARE III OPERATION
Most of the user interaction required in the DARE
III system is the same as in DARE I. The main difference between the interactive portions of the two systems
is that DARE III does not use the simulation control
panel, and that the user must dial, connect and disconnect the telephone as directed from the CRT
screen. The DARE III user can exercise system control
functions exactly like those provided by the control


panel in DARE I by typing the appropriate command
on the last line of the CRT screen, for example COMPILE, PRINT REPORT, READ PROBLEM TAPE,
etc.
The simulation-problem text is entered and modified
using the alphanumeric CRT editor. Each problem-definition block can contain up to 600 full 40-character
lines. DARE III also offers sophisticated users the
ability to replace any run-time system routines with
their own routines, such as new integration routines,
transfer-function operators, etc. These may be written
in either CDC FORTRAN IV or 6000 series assembly
language (COMPASS).
After the user has entered and modified his problem,
he types COMPILE on the last line on the CRT
screen. The screen next flashes
DIAL COMPUTER 3243
Using the Data Phone adjacent to the keyboard the
user dials the university extension 3243 and presses
the DATA button after the initial answer-back tone is
finished. Several Control Data INTERCOM messages
flash up on the screen for user information. The user
does not take action in response to any message except
HANG UP THE PHONE
When the telephone has been disconnected, the screen
will say:
PLEASE SELECT INTEGRATION METHOD
After the user has selected a valid integration-routine number, the alphanumeric screen will display
the names of the variables which need initial conditions
and parameter values. Numerical values are entered by
typing (NAME) = (VALUE) on the last line of the
screen. Numerical values and the integration method
may be changed as often as desired.
When finished entering data, the user simply types
RUN on the command line. The screen next flashes
PLEASE SELECT NUMBER OF
OUTPUT POINTS. MAX 512
then
PLEASE SELECT DISPLAY
and finally
DIAL COMPUTER 3243
The first two requests are used to minimize the length of
time the user has to wait for output data to be transmitted to the local terminal. The first request requires a
number between 10 and 512. The user responds to the

second request by typing, say:
DISPLAY (TIME) X1, X2, BETA, T
where X1 and X2 are state variables, BETA is an output variable and T is time. The user may select up to
5 variables in any output request. The file name, in
this case TIME, is required in all output requests in
order to distinguish between the four output storage
files available to the DARE III user. All output data
stored during the simulation study will always be
available on the 6400 for later retrieval, regardless of
the data returned to the local terminal.
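The request for the number of output points (at most 512) amounts to a decimation of the solution stored on the 6400, so that less data has to come back over the slow telephone line. A minimal sketch of the idea, with assumed array sizes and variable names (this is an illustration, not the DARE III code):

program decimate_output
  implicit none
  integer, parameter :: nstore = 4000   ! points stored during the run (assumed)
  integer, parameter :: nmax   = 512    ! limit quoted on the screen
  real    :: t(nstore), x(nstore)
  integer :: i, stride, nkept
  do i = 1, nstore                      ! fabricate a stored solution for the demo
     t(i) = 0.01*real(i)
     x(i) = sin(t(i))
  end do
  stride = (nstore + nmax - 1)/nmax     ! ceiling(nstore/nmax)
  nkept  = 0
  do i = 1, nstore, stride
     nkept = nkept + 1                  ! here (t(i), x(i)) would be sent to the terminal
  end do
  print *, 'stride =', stride, '  points returned =', nkept
end program decimate_output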
After the simulation study is finished the message
HANG UP THE PHONE
will once again appear on the screen. The display
.selected above will flash on the XY display screen when
the telephone has been disconnected. Scale factors, etc.,
will appear on the alphanumeric screen. A tabulation
of the current graphic data precise to only three places
can be obtained by typing:
QLIST X1, X2, BETA
or
QPRINT X1, X2, BETA

More precise tabulations require an additional DARE III/6400 access.
As in DARE I, additional output requests can be
made at any time. The output requests may require an
additional DARE III/6400 access, in which case the
screen will once again flash
DIAL COMPUTER 3243
The local terminal will, however, always try to complete
the output request without an additional DARE
III/6400 access. The graphic data returned from the
additional 6400 accesses will destroy the data currently
available on the local disk. The destroyed data will,
however, still be available on the 6400 storage files.
The use of the 6400 line printer for long tabular
listings is encouraged, especially since the local teletypewriter is extremely slow. The line-printer listings
can be picked up at the computer center under the job
name DARE3. The user's ability to create long output
listings without any input deck will continuously amaze
the I/O clerks.
DISCUSSION-THE DARE IV SYSTEM.
The 2,000 bits/sec data rate of the inexpensive,
unconditioned, dial-up telephone line is just sufficient to


return oscilloscope displays at low audio frequencies.
Transmitting a reasonably large program, say 600 character lines, takes about 4 minutes, which is still
tolerable. The main delay is in getting access to a
CDC 6400 control point via the user queue of the
system, which is, at heart, still a batch-processing
system. When the 6400 was busy, delays up to 15
minutes were experienced, and such delays in an interactive simulation are somewhat hard on the nerves.
With DARE III, the user might need, moreover,
three such accesses for a single computer run: once to
submit the program, once to enter parameters and
execute the simulation study, and once to get extra
solution displays, if any. This problem can be greatly
relieved by acquiring a high input-queue priority, say
by crossing the computer-center administration's palms
with money. Since we had no money, ours proved to
be incorruptible.
A viable alternative is to change the apportionment
of tasks between the large central processor and the
local minicomputer. If the precompiling (translation)
of the CSSL programs can be done in the local minicomputer, the latter will find the undefined parameters
in the sorting process, flash them on the CRT screen,
and accept the operator's parameter entries. The
translated program and parameter values can then go


to the central computer in a single access. The central
computer still compiles the program and proceeds to
solve the differential equations. In most applications,
we will then require only one access to the remote
central processor, a very great advantage.
COMPUTING SPEEDS
For a typical medium-sized aerospace simulation
problem involving second-order Runge-Kutta integration of 12 state-variable derivatives, 100 sums, 140
products, 10 sine-cosine evaluations, and 8 table-lookup
functions of one variable, the floating-point DARE I
system can accommodate sinusoidal oscillations up to
about 0.1 Hz, while DARE II (fixed point) goes to
4 Hz, and DARE III (floating-point) will admit 7 Hz.1
The CSSL benchmark problem (pilot ejection problem) 6
takes 64 sec for 815 runs with DARE III. Digital computing times will be proportionately longer in larger
simulation problems. While on-line digital simulation
does not match the bandwidth of the latest analog
computers, all-digital real-time simulation is possible
for many practical problems, and problem setup and
checkout is incomparably simpler for digital simulation.
Analog/hybrid computation will hold its own mainly in
problems requiring a very large number of simulation
runs on large systems, and in certain high-speed Monte
Carlo, optimization, and partial-differential-equation
studies.1
DISPLAY AND CONSOLE REQUIREMENTS
FOR ON-LINE SIMULATION

Figure 4-Time sequential flow chart of DARE IV (PLOT on CALCOMP plotter; PRINT report and tables on teletype)

At the present time, Project DARE employs a
television-raster alphanumeric display with internal
memory, plus an 11-inch electrostatic-CRT graphic
display refreshed by memory interlace from the PDP-9
almost without programmed instruction. Packed 18-bit
computer words simultaneously transfer the X and Y
coordinates of each displayed point to save refresh time
and memory.7 There is also a separate simple color
display.8
It is clearly desirable to minimize the equipment
committed to each local time-sharing station. The
minimal display facility would consist of a simple
storage-tube display for both alphanumeric and graphic
output. Such a display has excellent resolution and
saves local computer time, but will not permit quick
on-line editing of alphanumeric text. The DARE CRT
editor program is so very convenient that it is well
worth the extra cost of a television-raster alphanumeric
display (CRT typewriter). Such units incorporate
simple refresher memories (usually MOS shift registers)


and their prices have recently come down into the $2,000 region.

Figure 5-Problem listing for pilot ejection problem

* DERIVATIVE BLOCK:
* SAMPLE PROBLEM-PILOT EJECTION
*
* VE = SEAT EXIT VEL.
* THE = SEAT EXIT ANGLE
* Y1 = HEIGHT OF RAILS
*
X' = V*COS(TH) - VA
Y' = V*SIN(TH)
PROCED YGEY1 = Y, Y1
YGEY1 = 1.
IF(Y.LT.Y1)YGEY1 = 0.
ENDPRO
V' = -YGEY1*(D/AM + G*SIN(TH))
TH' = -YGEY1*(G*COS(TH))/V
D = RHOP*V**2/2.
YF = Y
TERMINATE X+30.
AM = 7.
G = 32.2
DISPLAY Y, X
* LOGIC BLOCK:
* VA = PLANE VEL.
*
H = 0.
S = 10.
CD = 1.
INPUT VA, VE, THE
OUTPUT H
* CONVERT THE TO RADIANS
THE = THE/57.2957795
*
* CALCULATE INITIAL PILOT VEL.
*
2 V = ((VA - VE*SIN(THE))**2 + (VE*COS(THE))**2)**0.5
*
* CALCULATE INITIAL PILOT ANGLE
*
TH = ATAN(VE*COS(THE)/(VA - VE*SIN(THE)))
*
3 RHOP = RHO(H)*CD*S
VS = V
THS = TH
VAS = VA
CALL SHOW (H, VAS, VS, THS)
CALL RUN
IF (YF.GT.20.) GO TO 4
H = H + 500.
GO TO 3
4 RUNNO = VA
CALL STORE
VA = VA + 50.
IF (VA.LE.1000.) GO TO 2
* OUTPUT BLOCK:
DISPLAY H
* TABLE BLOCK NO. 1:
RHO, 12
0., 2.377E-3
1E3, 2.303E-3
2E3, 2.241E-3
4E3, 2.117E-3
6E3, 1.937E-3
10E3, 1.755E-3
15E3, 1.497E-3
20E3, 1.267E-3
30E3, 0.891E-3
40E3, 0.587E-3
50E3, 0.364E-3
60E3, 0.2238E-3
* DATA:
DT = 1.0E-01
TMAX = 2.0E+00
X =
Y =
V =
TH =
RHOP =
VE = 40
VA = 100
Y1 = 4
THE = 15
The graphic display could still use a storage tube.
Since storage tubes permit comparison of current and
past displays stored on the screen, one could dispense
with the moving repetitive solution displays possible
with DARE II. For a larger display presentation, we
are also considering a storage-tube/scan-converter
system, which would combine the refresher-memory
output of a television-scan alphanumeric-display generator with scan-converter pickup from a small storage
tube on one large television screen, possibly in color.
The simulation console will also need a local minicomputer for editing, communication control, and
display operation. With the storage-tube graphic
display, 9- or 10-bit display-point X and Y coordinates
could be stored separately, so that the local minicomputer need only be one of the new inexpensive
12-bit types costing under $5,000; for the central processor, we would anticipate a need for at least 8K of
local memory instead of the minimal 4K.
For hard-copy preparation, we have employed a
teletypewriter, a handheld Polaroid oscilloscope camera
capable of photographing both the alphanumeric and
graphic displays, a four-channel strip-chart recorder,
and an XY servo recorder. A small line printer with full
140 character lines would be faster and more reliable
than our KSR 35 teletypewriter. Unlike the latter, the
line printer could accept the full width of the 6400
output for debugging; the line printer could also be used
for plotting solutions. Very extensive line-printer tables
and CALCOMP graphical plots could also be prepared
at the computer center.


DIGITAL-COMPUTER ARCHITECTURE FOR
SIMULATION
Most modern 24- to 36-bit intermediate-sized digital
computers with floating-point arithmetic are well
suited for simulation applications. Since no such
machine is available at the University of Arizona, the
DARE IF project will investigate the augmentation of
an existing 18-bit minicomputer (PDP-9 or PDP-15)
with a newly designed floating-point arithmetic unit
plus some high-speed storage. The result will be a new
small general-purpose computer, but we are, of course,
especially interested in those features of digital-computer architecture which might favor continuoussystem simulation.
The very fast MECL-II emitter-coupled logic (2 to
3 n sec gate delay) was chosen for the new processor to
permit us to trade this speed for a relatively simple
arithmetic design. The fast processor can communicate
with the PDP-9 memory through the direct-memory
access channel, which requires 1 μsec for the transmission of an 18-bit instruction, or 3 μsec for the transmission of a 54-bit floating-point data word (consisting of three 18-bit PDP-9 words). These word-transfer rates are fairly well matched to the anticipated 10 μsec to 15 μsec floating-point addition and multiplication times in the fast arithmetic unit. We will,
nevertheless, investigate instruction lookahead and the
use of some fast-storage (scratchpad memory) consisting of MECL-II memory chips to buffer some data
and/ or instruction transfers in an effort to match the
arithmetic processor's speed to better advantage.
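A back-of-the-envelope check of the "fairly well matched" claim, using only the figures quoted above (1 μsec per 18-bit word, 3 μsec per 54-bit operand, 10 to 15 μsec per floating-point operation); the operand-count assumption in the comments is ours, for illustration only:

program dma_balance
  implicit none
  real, parameter :: t_word   = 1.0     ! usec, 18-bit instruction transfer
  real, parameter :: t_oper   = 3.0     ! usec, 54-bit floating-point operand transfer
  real, parameter :: t_alu_lo = 10.0    ! usec, fastest quoted add/multiply
  real, parameter :: t_alu_hi = 15.0    ! usec, slowest quoted add/multiply
  real :: t_xfer
  ! Assume one instruction word, two operands fetched and one result stored
  ! per floating-point operation, all over the direct-memory-access channel.
  t_xfer = t_word + 3.0*t_oper
  print *, 'transfer time per operation   (usec):', t_xfer
  print *, 'arithmetic time per operation (usec):', t_alu_lo, 'to', t_alu_hi
end program dma_balance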
Figure 6-Plot of H vs. VA for pilot ejection problem

A look at a typical simulation program indicated that most of the program execution involves the repetitive calling of derivative-computing, integration-formula, and data-storing subroutines. It would appear that
many time-consuming core accesses could be saved
through storage of complete subroutines in a fast
scratchpad memory. Additional execution time would
be saved if instruction fetching and/or data storage
could be overlapped with arithmetic execution.
Further investigation of derivative computations for
differential-equation solution indicates that the requirements for intermediate storage are relatively
small. The implementation of a typical simulation
block diagram requires one word of temporary storage
for each point where the block-diagram interconnections
branch. Fast-access storage in multiple arithmetic
registers, small scratchpad memories, or special memory
stacks would appear to be especially suited to such
operations and could save many time-consuming core
accesses. Indeed, the organization of derivative computations tempts the designer to organize his scratchpad
storage into a stack for temporary data storage. A
stack-oriented processor would permit a wide variety
of 18-bit operating instructions, with only a minimum of
memory-reference instructions for communicating with
core storage. Unfortunately, such a stack organization
of the fast processor would also increase the total
number of instructions (and instruction fetches !)
required for the total derivative-computing program.
The optimal system would have enough fast scratchpad
storage to store all frequently used subroutines, but
this is a fairly expensive proposition; a future DARE
study (DARE IIF, Table 1) will look into possible
compromises and trades.

SOME CONCLUSIONS
We believe that the success of the DARE I and DARE
III (equation-oriented) and DARE II (block-oriented)
simulation systems has conclusively proved the feasibility and advantages of all-digital on-line simulation.
Without question, all future systems of this type will be
scale-factor-free floating-point systems. In our experience, most users appear to prefer the equation-oriented
systems. On the other hand, the possibility of creating
special frequently used system blocks as macros is a
very convenient feature of the block-oriented DARE II
language. Future continuous-system-simulation systems will superimpose a macro generator on equation-oriented systems; this has already been done in the
batch-processed CSMP-360 and in some of the newer
CSSL systems. In the final analysis, the main advantage of assembler-based purely block-oriented
simulation systems will depend on the extent of their
execution-speed advantage over compiler-based
equation-oriented systems. This speed advantage is



overwhelming with minicomputers which, because of
their small memory sizes, rely on many subroutine
calls for FORTRAN execution. With large digital
computers having larger core memories and floating-point hardware, together with modern, very efficient
FORTRAN compilers, much of the speed advantage of
assembler-based systems may be lost. Estimates of this
remaining speed advantage vary, but might be in the
order of two-to-one, which would still result in significant cost savings.
ACKNOWLEDGMENTS
Project DARE is sponsored by the National Science
Foundation under NSF Grants GK-1860 and GK-15224. DARE I and DARE II were respectively written
by John Goltz (now President of COMPU-SERVE,
Columbus, Ohio) and Tom Liebert (now on the technical staff of the Bell Telephone Laboratories) as
Ph.D. dissertations. Professor John V. Wait is co-principal investigator.

TABLE I-Project DARE On-line Simulation Systems

DARE I: J. Goltz, 1969. PDP-9. Refs. 1, 2. Floating-point, equation-oriented CSSL system, 20 state variables, one derivative block.
DARE IR: J. Moore, 1971. PDP-9. Similar to DARE I, two derivative blocks.
DARE II: T. Liebert, 1970. PDP-9. Refs. 1, 3. Fixed-point, block-oriented system, two derivative blocks, extremely fast.
DARE III: H. Aus, 1971. PDP-9 and CDC 6400. Ref. 4. Floating-point, equation-oriented CSSL time-sharing system, 200 state variables, two derivative blocks.
DARE IIIB: A. Trevor, 1971. CDC 6400. Ref. 5. Batch-processed CSSL, 200 state variables, two derivative blocks.
DARE IF: C. Wiatrowski, 1972. PDP-9 and homemade floating-point processor. Similar to DARE IR.
DARE IIF: PDP-9 and homemade floating-point processor. Similar to DARE II, but floating-point.
DARE IV: PDP-9 and CDC 6400. Modified version of DARE III.

REFERENCES
1 G A KORN
Project DARE: Differential analyzer replacement by
on-line digital simulation
Proceedings Fall Joint Computer Conference 1969
2 J R GOLTZ
The DARE I simulation system
Proceedings SWIEEECO Dallas Texas 1970
3 T A LIEBERT
The DARE II simulation system
Proceedings SCSC Denver Colorado 1970
4 H AUS
DARE III, A time-shared digital simulation system
PhD Dissertation University of Arizona 1971
5 A TREVOR
The DARE IIIB simulation system
M S Thesis University of Arizona 1971
6 The SCI continuous-system simulation language
SCI Software Committee Simulation December 1967
7 G A KORN et al
A new graphic display/plotter for small digital computers
Proceedings Spring Joint Computer Conference 1969
8 C WIATROWSKI
A color television graph plotter
Computer Design April 1970

A panel session-Computer structure-Past, present and future

Possibilities for Computer Structures 1971*

by C. GORDON BELL and ALLEN NEWELL

Carnegie-Mellon University

What computer structures come into existence in a
given epoch depends on the confluence of several
factors:
The underlying technology-its speed, cost, reliability, etc.
The structures that have actually been conceived.
The demand for computer systems (in terms of
both economics and user influence).
One ignores any of these factors at one's peril. In
particular, with technology moving rapidly, a real
limitation exists on our ability as designers to discover
appropriate structures that exploit the new trade-offs
between the various aspects of a computer system.
The design of computer structures is not a systematic
art. So new is it, in fact, that in a recent book (Bell
and Newell, 1971) we found ourselves dealing with
basic issues of notation. We are still a long way from
concern with the sort of synthesis procedures that
characterize, say, linear circuit design. However, the
immaturity is dictated, not so much by youth (after
all we have been designing computers for almost 30
years), as by the shifts in technology that continually throw us into previously uninhabited parts of the space of all computer structures. Whatever systematic techniques start to emerge are left behind.
This note comments on several possibilities for computer structures in the next half-decade. Given the unfamiliarity that we all have with the region of computer space into which we are now moving, there can be no systematic coverage. Neither is it appropriate simply to reiterate what would be nice to have. Such an exercise is not responsive to the new constraints that will limit the new designs. Such constraints will certainly continue to exist, no matter how rapidly logic speed rises and logic costs fall. In fact, it is useful to view any prognostication of new computer structures (such as this paper) as an attempt to reveal the nature of the design constraints that will characterize a new epoch of technology.
We will discuss five aspects of computer structures. Mostly, these represent design features that we think have a good possibility of becoming important in the next few years, though we have reservations on one. We have been actively engaged (with others) in working on particular structures of the type we present. Our selection of these is not a denial that other quite different structures might also be strong contenders for dominance during the next several years. Indeed, according to the point made earlier, with strong shifts in technology no one can know much about the real potentialities for new structures. Thus, that we have been working on these particular structures provides, mainly, a guarantee that we have thought hard enough about their particulars to have some feeling for the design limitations in their local vicinity.
* The ideas expressed in this presentation have emerged from a
number of overlapping design efforts, mostly around CMU and
DEC, but occasionally elsewhere (e.g., at Newcastle-on-Tyne,
the ARPA list processing machine effort, and the effort at the
Stanford AI project). Consistent with this being a short note,
we have attempted to indicate the individuals involved in these
efforts at appropriate places in the text. But we wish here to
acknowledge more generally the contribution of all these individuals. The preparation of this paper was supported by the
Advanced Research Projects Agency of the Office of the Secretary
of Defense (F44620-70-C0107) and is monitored by the Air Force
Office of Scientific Research. The paper is to be published in the
Proceedings of the FJCC, 1971 and may not be copied without
permission.

Minicomputer multiprocessor structures

Consider the multiprocessor structure of Figure 1.
There are p central processors (Pc) and m primary
memories (Mp). We ignore, in this discussion, the
remaining structure that connects the secondary
memories and i/o. The switch (Smp) is effectively a
crossbar, which permits any of the processors access
to any of the memories.
Figure 1-Smp (crosspoint) for connecting p central processors (Pc) to m primary memories (Mp)

There is nothing new per se about a multiprocessor
structure. Many dual processors exist, as do genuine
multiprocessors whose additional processors (beyond
one Pc) are functionally specialized to i/o and display.
General multiprocessors have been proposed and a
very few have come into existence (e.g., the Burroughs
D825). But they have not attained any substantial
status. The main technological reasons appear to be
(1) the cost and reliability of the Smp and (2) the relative cost of many processors. Software (i.e., operating
systems) is also a critical difficulty, no doubt, but not
one that appears yet to prohibit systems from coming
into existence.
Both of these technical factors appear to be changing
sufficiently to finally usher in multiprocessor systems
of substantial scope. The cost of the processor is
changing most rapidly at the minicomputer end of
the scale. Thus, we expect to see minicomputer multiprocessor systems before those with large word-length Pc's. An additional impediment for large Pc's is the bandwidth required through the switch, which is substantially less for 16b/w machines than for 32-64b/w
machines both in terms of cost and reliability.
As a basis for discussing detailed technical issues, let us describe a multiprocessor system involving the DEC PDP-11. Variant designs of this system have been proposed both at CMU and at Newcastle-on-Tyne.* A set of p PDP-11's have access to a set of m Mp's aggregating 2^21 8b bytes.** Each Pc maintains its address space of 2^16 bytes, but an address-mapping component (Da) associated with each Pc permits this address space to be distributed as 2^3 independent pages of 2^13 bytes each. The details of this addressing, though important, need not be discussed here. Similarly, the details of the Smp need not be discussed. Each link through the Smp is essentially a unibus (the bus of the PDP-11; see Bell et al., 1969). Connections are made on a memory-access basis, so that a Pc broadcasts the address to all Mp's and the connection is made to the recognizing Mp for the data transfer.
The three critical questions about the Smp are its performance, measured in terms of Pc effectiveness, its reliability and its cost. Figure 2 gives the calculated expected performance (Strecker, 1970) in terms of total effective memory-cycle access rate of the Pc's (whose number is shown along the abscissa).

Figure 2-Performance of a multiprocessor computer with 16 independent Mp's (t.switch delay: 190 ns; t.cycle(Mp): 600 ns; t.access(Mp): 350 ns; Pc: PDP-11/25; ordinate: memory accesses/sec x 10^6; abscissa: number of processors)

* The original design was proposed by W. Wulf and W. Broadley,
based on a switch design by Bell and Broadley; a second more
general design was proposed by C. G. Bell, H. Lauer and B.
Randall at Newcastle-on-Tyne; the version described here is by
C. G. Bell, W. Broadley, S. Rege and W. Wulf. No published
descriptions are yet available on any of the designs, though some
are in preparation.
** Addressing in the PDP-11 is by bytes, though it is preferable to
view it as a 16b machine.
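A small sketch of the Da relocation step described above (a 16-bit processor address split into a 3-bit page number and a 13-bit displacement, with the page number selecting one of eight page registers that point into the 2^21-byte Mp). The register contents and variable names are invented for illustration; this is not the actual Da design.

program da_sketch
  implicit none
  integer :: page_reg(0:7), va, page, disp, pa
  ! Eight page registers, each holding one of the 2**8 possible 2**13-byte
  ! frames of the shared 2**21-byte Mp (values below are arbitrary examples).
  page_reg = (/ 5, 9, 17, 2, 200, 201, 202, 255 /)
  va   = 41234                              ! some 16-bit processor (virtual) address
  page = ibits(va, 13, 3)                   ! bits 13..15: page number
  disp = ibits(va, 0, 13)                   ! bits 0..12: displacement within the page
  pa   = ishft(page_reg(page), 13) + disp   ! 21-bit physical Mp address
  print *, 'virtual', va, '-> page', page, ' displacement', disp, ' physical', pa
end program da_sketch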

Each instruction requires one to five memory accesses. The curve is
parameterized by the number of Mp's (m= 16 here),
the t.cycle of the Mp (350 ns here) and the delay
through the switch (190 ns here). The criterion we have used for ideal performance is p stand-alone computers
with no switching delays. Thus, the loss is due to both
switching delay and multi-Pc interference. The parameters shown are attainable with today's technology.
The number of memory references per processor decreases as the number of processors increases, since the
calculation assumes a reference to any Mp is equally
likely. The reliability cannot yet be estimated accurately, but appears to be adequate, based on a component count. The cost per Pc is of the order of one
quarter to one times the Pc, measured in amounts of
logic for a 16 X 16 switch. Thus, the Smp cost is appreciable, but not prohibitive.
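The shape of the curves in Figure 2 can be reproduced roughly with the usual uniform-reference interference approximation: if each of p processors requests a uniformly random one of the m memories every cycle, the expected number of busy memories is m(1 - (1 - 1/m)^p). The sketch below uses that approximation with the delay figures quoted in the text; it is not necessarily the exact model used by Strecker.

program smp_interference
  implicit none
  integer, parameter :: m = 16                        ! number of Mp's
  real,    parameter :: t_cycle = 0.350e-6 + 0.190e-6 ! Mp cycle plus switch delay, s (assumed)
  integer :: p
  real    :: busy, rate
  do p = 5, 40, 5
     busy = real(m)*(1.0 - (1.0 - 1.0/real(m))**p)    ! expected busy memories per cycle
     rate = busy/t_cycle                              ! total accesses per second
     print '(a,i3,a,es10.3)', ' p =', p, '   accesses/sec =', rate
  end do
end program smp_interference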
What does one obtain with such a structure? Basically, Pc cycles have been traded for (1) access to a
larger memory space and (2) Mp-level interprocessor
communication. These benefits come in two styles.
Statically, the Smp permits configuration of the
Pc's with various amounts of memory and isolation.
An important design feature, not stressed above, is
that the PDP-II components remain essentially
unmodified, so that they can be moved in and out of the
system at will. This feature extends to permitting the
addition and extraction of components to the system
while in operation. Dynamically, the Smp permits the
set of processors to cooperate on various tasks and to
decrease the system overhead for input/output and
operating systems programs. Coupled with this is
common access to the secondary memory and peripheral
parts of the systems, permitting substantially lower
total system cost as opposed to p independent systems. *

Caches for multiprocessors

A key design parameter in multiprocessor organizations, such as the one above, is the delay through the switch, measured relative to the performance of the Mp's and Pc's. The total instruction (e.g., for a memory-access instruction) of a Pc can be partitioned as:

t.instruction = t.Pc + t.Smp + t.Mp

In current memory technology overlap is possible between Pc and Mp since accessed information is available before the rewrite cycle is completed. How
much this can be exploited in a multiprocessor depends
on t.Smp. Thus, the relevant t.Mp is that which would
obtain in a non-switched system.
Current technology makes all the above terms
comparable, from 50 to 500 nanoseconds. Thus, variations of a factor of 2 in any of the component terms can
have a determining effect on the design. Most important
here is that t.Smp can easily become large enough to
make t.instruction(with Smp) twice t.instruction(without Smp).
The cache appears to offer a solution to this problem
within the currently emerging economic design parameters. The basic concept of a cache is well established. *
To review: a cache operates by providing a small high-access content-addressed memory (M.cache) for
recently accessed words. Any reference to Mp first
interrogates M.cache to see if the information is there,
and only if not is an access made to Mp. The basic
statistical regularity of system performance underlying
the cache is that words recently accessed will be accessed
again. This probability of reaccess depends of course on
the size of the past maintained. Available statistics
show that if a few thousand words of cache can be kept,
then well over 90 percent of the Mp accesses will be
found in the cache, rather than having to go to Mp itself. If technology provides a steep trade-off between
memory size, memory cycle time and cost per word,
then a cache is a valuable structure.
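The argument can be put in one line of arithmetic: with hit ratio h, the effective access time seen by a Pc is roughly h*t.cache + (1 - h)*(t.Smp + t.Mp), and only the fraction (1 - h) of references loads the switch. The sketch below tabulates this, using the switch and Mp times quoted earlier and an assumed cache time; the numbers are illustrative, not measurements.

program cache_effect
  implicit none
  real, parameter :: t_cache = 0.100e-6   ! assumed cache access time, s
  real, parameter :: t_smp   = 0.190e-6   ! switch delay quoted in the text, s
  real, parameter :: t_mp    = 0.350e-6   ! Mp access time quoted in the text, s
  integer :: i
  real    :: h, t_eff
  do i = 0, 9
     h     = 0.1*real(i)                              ! hit ratio 0.0 .. 0.9
     t_eff = h*t_cache + (1.0 - h)*(t_smp + t_mp)     ! effective access time
     print '(a,f4.1,a,f6.3,a,f4.2)', ' h =', h, '   t.eff (usec) =', &
           t_eff*1.0e6, '   switch traffic fraction =', 1.0 - h
  end do
end program cache_effect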
If we associate the cache with the Pc, as in Figure 3,
then the net effect of the cache is to decrease t.Pc (for
fixed computational power delivered). In organizations


* If this latter goal were all that were required, then one might
consider less expensive alternatives. However, a price must be
paid in system overhead for less general coupling and the trade-off
is far from clear. In fact, we are not justifying the design here,
but simply presenting a concrete example.

Figure 3-Multiprocessor computer with cache associated with each Pc

* The first machine really to use a cache was the 360/85 under
the name of "buffer memory" (Conti, 1969). Wilkes (1965)
termed it the "look-aside" memory. "Cache" seems by now an
accepted designation.


such as the 360/85 this permits balance to be achieved
between a fast Pc and a slower Mp. In the case of
multiprocessor, this permits the delay of Smp to be of
less consequence (for aggregated t.Smp and t.Mp
play the same role as does t.Mp in a uniprocessor
system).
There is a second strong positive effect of caches in a
multiprocessor organization of the kind under discussion. As the graph of Figure 2 shows, performance is a
function not only of the delay times, but of the frequency of accessing conflicts. These conflicts are a
monotone function of the traffic on the switch, increasing sharply as the traffic increases. The cache on
the Pc side of switch operates to decrease this traffic, as
well as to avoid the delay times. There is one serious
problem regarding the validity of the data in a system
such as Figure 3, where multiple instances of data coexist. In a system with p caches and an Mp, it is conceivable that a single address could be assigned p+1
different contents. To avoid this problem by assuring a
single valid copy would appear to require a large amount
of hardware and time. Alternatively, the burden might
be placed on the operating system to provide special
instructions both to dump the cache back into Mp and
to avoid the cache altogether for certain references.
In a recent attempt to design a large computer for use
in artificial intelligence (C.ai), we considered a large
multiprocessor system (Bell and Freeman, 1971;
Barbacci, Goldberg and Knudsen, 1971; McCracken
and Robertson, 1971).* The system is similar to the one
in Figure 1; in fact, the essential design of the Smp for
the minicomputer-multiprocessor came from the C.ai
effort. C.ai differs primarily in having 10-20 large Pc's
with performance in the 5 x 10^7 operations/sec class (e.g., the cache-based Pc being designed at Stanford, which is aimed at 10 x Pc(PDP-10) power). An essential requirement for this large multiprocessor was the use of caches for each Pc in the manner indicated.
Why then are caches not needed on the minicomputer multiprocessor? Interestingly enough, there are three
answers. The first is that the performance of minicomputers is sufficiently low, relative to the switch
and the Mp, so that reasonable throughput can be
obtained without the cache. The second is that the
first answer is not quite true for the PDP-11 Pc. To
achieve a reasonable balance in our current design
requires an upgrading of the bus driving circuits on the

* Many people at CMU participated in the C.ai effort; a list
can be found in the reports referenced. Furthermore, the C.ai
effort was itself imbedded in a more general design effort initiated
by the Information Processing Technology Office of ARPA and
was affected by a much wider group.

PDP-11.** The third answer is that the benefits that
accrue from a cache in fact hold for minicomputers as
well. Recently a study by Bell, Cassasent and Hamel
(1971) showed that a system composed of a cache and a
fast minicomputer Pc was able to attain a fivefold
increase in power over a PDP-8. The cost of the cache
was comparable to the Pc, yielding a substantial net
gain (i.e., for a minimal system the power increased by
5 while the cost doubled). Thus, caches would undoubtedly further improve the design of Figure 1 at a
lower cost. Alternatively, one could simply add more
Pc's, rather than increase the cost of the Pc by a cache.

Multiple cache processors
One additional design feature of the C.ai is worth
mentioning, in addition to its basic multiprocessor
structure and cache structure vis-a-vis the Smp.
The general philosophy of the multiprocessor is that
of functionally specialized Pc's working into a very
large Mp. In the context of artificial intelligence,
functional specialization of the entire Pc to a completely specific system (such as the language, Lisp)
seems required to exploit algorithm specialization.*
Thus, we engaged in the design of two moderate sized
Pc's, one for Lisp and one for a system building system
called L* (Newell, McCracken, Robertson and DeBenedetti, 1971).
Figure 4 shows the basic PMS organization of one of
these processors (actually the one of L*, but it makes
little difference to the discussion at hand). The important feature is the use of multiple caches, one for
data and one for the microprogram. Two gains are to be
obtained from this organization. On the performance
side, the gain is essentially a factor of 2, arising from
the inherent parallelism that comes from the lockstep
between the data and instruction streams. The cache is
indicated by the design decision to permit the microcode to be dynamic. Thus, the second gain is in replacing a deliberate system programming organization
for changing the microcode with the statistical structure
of the cache, thus simplifying considerably the total
system organization (including the operating system).

** Modification of these circuits constitutes the primary modification of the PDP-11 Pc for participation in the system. The only
other modification is the use of two bits in the program status
word to indicate extended addressing.
* The argument is somewhat complex, involving the fact that
specialization to artificial intelligence per se (and in particular to
list processing) does not produce much real specialization of hardware. Not until one moves to a completely particular specification
of internal data types and interpretation algorithms can effective
specialization occur.

Figure 4-Multiple (two) cache system (one M.cache for instructions and data (ML program), one M.cache for the microprogram (interpreter), feeding the arithmetic unit and processor state)

The gains here are not overwhelming. But in the light of the many single-cache organizations (Conti, 1969) and non-cache dynamic microprogramming organizations (Husson, 1970; Tucker and Flynn, 1971) being proposed, it seems worth pointing out. The concept could be extended to more than two caches in computers that are pipelined, where additional parallelism is available in the controls.
Register transfer modules

Some time ago Wes Clark (1967) proposed a system
of organization that he called Macromodules. These
traded speed and cost to obtain true Erector set constructability. For a given domain of application, namely sophisticated instrument-oriented laboratory
experimentation, a good case could be made that the
trade-off was worthwhile. The modules essentially incorporate functions at the register-transfer level of
computer structure, thus providing a set of primitives
substantially higher than the gates and delays of the
logic circuit level.
More recently, another module system has been
created, called Register-Transfer-Modules (RTM'S)*
(Bell and Grason, 1971). RTM's differ from Macromodules at several design points, being cheaper (a
factor of 5), slower (a factor of 2), harder to wire, and
more permanent when constructed. On some dimensions
(e.g., checkout time) not enough evidence is yet available. Thus, they occupy a different point in a design

* Also called the PDP-16 by DEC.


space of RT modules. For our purposes here these two
systems can be taken together to define an approach
to a class of computer systems design.
Register transfer modules appear to be highly
effective for the realization of complex controls, e.g.,
instrument controls, tape and disk controls, printer
controls, etc. They appear to offer the first real opportunity for a rationalization of the design of these
aspects of computer systems. Their strong points are
in the rationalization of the control itself and in the
flexibility of data structures.
An extremely interesting competition is in the offing
between minicomputers and register transfer modules. *
As the price of the minicomputer continues to drop, it
becomes increasingly possible simply to use an entire
C.mini for any control job. The advantages are low cost
through standardization and hence mass production. To
combat this the modular system has its adaptation to a
particular job, especially in the data flow part of the
design, thus saving on the total amount of system
required and on the time cost of the algorithm.
An important role in this competition is played by
memory. If substantial memory is required, its cost
becomes an important part of the cost of the total
system. An Mp essentially requires a Pc and, lo, a
minicomputer has been created. Stated another way:
a minicomputer is simply a very good way to package a
memory. Consequently, RT modules cannot compete
with minicomputers in a region of large Mp. This
extends to task domains that require very large amounts
of control, since currently a memory is the most cost
effective way to hold a large amount of control information. Thus, the domain of the RT modules appears to be
strongly bounded from above.
An interesting application of the above proposition
can be witnessed in the domain of display consoles.
First, substantial memory is required to hold the information to be displayed. Thus, in essence, small
computers (P.display-Mp) have been associated with
displays. A few years back costs were such as to force
time-sharing; each P.display serviced several scopes.
But the ratio is finally coming down to 1-1, leading to
simplification in system organization, due to the elimination of a level of hierarchical structure.
Our argument above, however, has a stronger point.
Namely, a minicomputer (namely, a Pc-Mp organization) will dominate as long as there is already the
requirement for the memory. Thus, the specialized display processors are giving way to general organizations.
In fact, it is as effective to use an off-the-shelf minicomputer for the display processor as one specially designed for the purpose. Our own attempt to show this involves a PDP-11 (Bell, Reddy, Pierson and Rosen, 1971).

* Actually there may be a third contender, microprogrammed controllers.
There is an additional reason for discussing RT
modules, beyond their potentiality for becoming a
significant computer structure. They appear to offer the
impetus for recasting the logic level of computer
structure. The register transfer level has slowly been
gathering reality as a distinct system level. There
appears to be no significant mathematical techniques
associated with it. But in fact the same is true of the
logic level. All of the synthesis and analysis techniques
for sequential and combinatorial circuits are essentially
beside the point as far as real design is concerned.
Only the ability to evaluate-to compute the output
given the input, to compute loadings, etc.-has been
important. Besides this, what holds the logic level
intact is (1) a comprehensible symbolism, (2) a clear
relation of structure to function so a designer can
create useful structures with ease, and (3) a direct correspondence between symbolic elements and physical
elements.
RT modules appear to have the potential to provide
all three of these facilities at the register transfer level
(rather than the sequential and combinatorial logic
level). The ability to evaluate is already present and
has been provided in several simulators (e.g., Darringer,
1969; Chu, 1970). The module systems provide the
direct correspondence to physical components, which is
the essential new ingredient. But there is also emerging
a symbolism with clear function-structure connections,
so that design can proceed directly in terms of these
components. For the Macromodules of Clark one can actually design directly in terms of the modules. With our RTMs we have been able to adapt the PMS notation (Bell and Newell, 1971) into a highly satisfactory symbolism.* It is too early to see clearly whether this
conceptual event will take place. If it does, we should
see the combinatorial and sequential logic levels shrink
to a smaller, perhaps even miniscule, role in computer
engineering and science. Actually, even if these modules
do not cause such an emphatic shift in digital design,
it is almost safe to predict this change solely on the
basis of minicomputers and microprogrammed controllers being used for this purpose. This will lead to a
decrease in the need for, and interest in, conventional
sequential and combinatorial logic design.

* Called Chartran in DEC marketing terminology. See Bell and Grason (1971) for examples.
A cautionary note on microprogramming

With the right-shaped trade-off function on memory speeds, sizes and costs relative to logic, microprogramming becomes a preferred organization, because of the
regularity in design, testability and design flexibility
that it offers. Memories of 10^5 bits must be available at speeds comparable to logic and at substantially lower cost per effective gate. With only 10^4 bits there is not
enough space for the microcode of a large Pc. If memory
is too slow or too costly, the resulting Pc's simply
cannot compete with conventional hardwired Pc's in
terms of computational-power/dollar.
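The speed side of this requirement is simple arithmetic: if each emulated instruction costs k microinstructions, the control store must cycle k times as fast as the target instruction rate (a footnote later in this section quotes a factor of 4 to 10). A minimal sketch, with an assumed 1-MIPS target machine; the numbers are illustrative assumptions, not figures from the text.

program micro_speed
  implicit none
  real, parameter :: target_mips = 1.0   ! assumed target instruction rate, 10**6/s
  integer :: k
  do k = 4, 10, 2                        ! microinstructions per emulated instruction
     print '(a,i3,a,f6.1,a)', ' k =', k, '   microstore rate needed =', &
           k*target_mips, ' million cycles/sec'
  end do
end program micro_speed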
The conditions for microprogramming* first became
satisfied with read-only memories (circa 1965). In the
first major experiment, the IBM System/360, a variety
of hardware was used at different performance levels
of the series, all of it M.ro. Some of the memories
permitted augmentation, and in fact this feature
attained some significant use, e.g., the RUSH system
of Alan Babcock (a Joss-like commercial timesharing
system based on PL/I) which is able to be both cost-effective and interpretive by putting parts of the interpreter into the M.microprogram of the 360/50.
More recently read-write memories have become
available at speeds and costs that satisfy the conditions
for microprogramming. This leads, almost automatically, to dynamic microprogramming, in which the
user is able to modify the microcode under program
control. This allows his program to be executed at
higher speeds. The effect is not quite to make the
microcode the new machine language, for the trade-offs
still do not permit 10^6 to 10^7 bits of M.microprogram, which is
required for full sized programs. Thus, the original
functional concept of microprogramming remains
operative: a programmed interpreter and instruction
set for another machine language, which occupies a
much larger Mp.
All this story is a rather straightforward illustration
of the principle that computer structures are a strong
function of the cost-performance trade-offs within a
given set of technologies. Different regions in the space
of trade-offs lead, not to parametric adjustments in a
given invariant computer structure, but to qualitatively
different structures.
The cautionary note is the following. In our headlong
plunge to discover the new organizations that seem to
be effective in a newly emerging trade-off region, we
must still attempt to separate out the gains to be made
from the various aspects of the new system-from the
new components, from the newly proposed organizations, etc. The flurry of work in dynamic microprogramming seems to us to be suffering somewhat in this
regard. The proposed designs (e.g., see Tucker and
Flynn, 1971) appear to be conventional minimum

* The microprocessor must operate at a speed of 4 to 10 times the
processor being interpreted.


computers with wide unencoded words. * They compare
very favorably against existing systems (e.g., members
of the 360 series), but when the performance gains are
dissected they appear to be due almost entirely to the
gains in componentry, rather than to any organizational gains (e.g., Tucker and Flynn, 1971).* The cost
of these systems is usually missing in such analyses.
                                     High performance
                                     technology
                                     microprocessor     Model 50      Minicomputer
number of instructions                      8            11 (6)            5
number of bits                            512           224 (128)         80
loop time (in memory accesses)             10             9 (6)            7
memory bandwidth (megabits/sec)       640~3840            16              16
time for 10 iterations (μsec)             4.3            191              70
time using high performance
  technology (μsec)                       4.3              6.5 (4)        3.5

( ) indicates improvement in coding over Tucker and Flynn.
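For readers unfamiliar with the benchmark, the loop being measured is presumably an iterative Fibonacci kernel of the general shape sketched below; the sketch only illustrates what "loop time" and "time for 10 iterations" refer to, and is an assumption about the benchmark's form rather than code from Tucker and Flynn.

```python
# Illustrative only: an iterative Fibonacci kernel of the kind the table measures.
# The exact instruction sequence used by Tucker and Flynn is not reproduced here.
def fibonacci(iterations=10):
    a, b = 0, 1
    for _ in range(iterations):   # one pass corresponds to one table "iteration"
        a, b = b, a + b
    return a

print(fibonacci(10))   # prints 55
```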
*There appears also to be some confusion in the application of the
term "microprogram" to some of the proposed systems. The
definition given by Wilkes (1969) is functional: a microprogrammed Pc is one whose internal control is attained by another
processor, P.microprogram. Thus, it is the cascade of two processors, one being the interpreter for the other. Certain structural features characterize current P.microprograms: wide words;
the nature of the operations (control of RT paths); parallel
evocation of operations, and explicit next-instruction addressing
(to avoid machinery in the P.microprogram). Many of the
proposed dynamic microprogramming systems maintain some of the
structural features, e.g., wide words, but drop the functional
aspect. This is, of course, essentially a terminological matter.
However, we do think it would be a pity for the term microprogramming to attach to certain structural features, independent
of function, rather than to the functional scheme of cascaded
processors, one the interpreter for the other.
* A microprogrammed processor design using 1971 logic and
memory technology was compared with IBM's 1964 Solid Logic
Technology and core memory used in the 360 Model 50. Since
the newer technology (50 nanoseconds/64 bits) was a factor of
about 80 faster than the Model 50 (2000 nanoseconds/32 bits)
the microprogrammed processor was somewhat (only a factor of
45) faster. Even using the faster technology the microprogrammed
processor's times for multiplication given by Tucker and Flynn
were about the same as the Model 50's.
The accompanying table of a Fibonacci number benchmark given by Tucker and Flynn shows that the main advantage of microprogramming is with high performance technology. A microprogrammed processor has about the same number of instructions and number of memory accesses. Due to the poor encoding of instructions a microprogram takes more bits (hence possibly costs more). By having a comparatively high memory bandwidth it can execute the loop rapidly, but given a Model 50 or a minicomputer constructed with a 50 ns memory the execution times are about the same.


Actually, there are signs of the watchmaker's delusion
(Simon, 1969). A watchmaker, Tempus, attempted to
construct watches out of very small components, but
every time the phone rang with an order he was forced
to start over. He got very few watches completed.
His friend, Hora, decided first to build springs, releases,
escapements, gears, etc., and then larger assemblies of
these. Though he, too, was often called to the phone,
he quite often had time to complete one of these small
assemblies, and then to put these together to obtain an
entire watch.
To apply the moral: Large systems can only be
built out of components modestly smaller than the
final system itself, not directly out of much smaller
components. The dynamic microprogramming proposals take as given the same micro-components as
have existed previously (gates and registers). They do not
propose any of the intermediate levels of organization
that are required to produce a large system. Thus, e.g.,
when they propose to put operating systems directly
in microcode they are close to the watchmaker's
delusion. Insofar as the response is "But of course we
expect these intermediate levels of organization to
exist," then their proposals are radically incomplete,
since the operative concepts of the design are missing.
The situation is even a little worse, for unlike conventional machine language organizations, microprogrammed processors are usually oriented to highly
special technology, have multiple automatic units that
have to be operated in parallel, can even perform in a
non-deterministic manner, are location sensitive, and
provide a combinatorially larger instruction set.
Effective compilers and performance-monitoring software will be mandatory before users can effectively
gain any order-of-magnitude increase in performance
latent in the basic organization. Furthermore, since
these processors are so technology oriented, it is difficult
to guarantee that they will have successors or be members of compatible families.

CONCLUSION
We have touched on a number of aspects of current
research in computer structures that appear to have
possibilities for being important structures in the next
half decade. Our examples-and our style of discussing
them-suggest several basic points about the design of
computer structures. Some of these have been stated
already in earlier sections, but it seems useful to list
them all together:
(1) Computer design is still driven by the changes in
technology, especially in the varying trade-offs.


(2) Distinct regions in the space of trade-offs lead to
qualitatively different designs.
(3) These designs have to be discovered by us (the
computer designers), and this can happen only
after the trade-off characteristics of a new region
become reasonably well understood.
(4) Thus, our designs always lag the technology
seriously. Those that are reaching for the new
technology are extremely crude. Those that are
iterations on existing designs, hence more
polished, fail to be responsive to the newly
emerging trade-offs.
(5) Since the development cycle on new total systems
is still of the order of years, the only structures
that can be predicted with even minimal confidence are those already available in nascent
form. The multiprocessor, cache and RT module
organizations discussed earlier are all examples
of this.
(6) The design tools that we have for discussing
(and discovering) appropriate designs are weak,
especially in the domain over which the structures under consideration here have ranged, essentially the PMS level.
(7) In particular, there is no really useful language
for expressing the trade-offs in a rough and
qualitative way, yet precisely enough so that the
design consequences can be analyzed.
(8) In particular (as well), design depends ultimately
on having conceptual components of the right
size relative to the system to be constructed:
small enough to permit variety, large enough to
permit discovery. The transient character of the
underlying space (the available space of computer structures) reinforces the latter requirement. The notion of M.cache is an example of a
new design component with associated functions,
not available until a few years ago. Even this
small note shows it to be a useful component in
terms of which designs can be sought. The
potential conceptual revolution hiding in the RT
modules provides another example.
REFERENCES
M BARBACCI H GOLDBERG M KNUDSEN
A LISP processor for C.ai
Department of Computer Science Carnegie Mellon University
1971
C G BELL R CADY H MCFARLAND B DELAGI
J O'LAUGHLIN R NOONAN W WULF
A new architecture for mini-computers-The DEC PDP-11
AFIPS Conference Proceedings Vol 36 Spring Joint Computer
Conference 1970

C G BELL D CASSASENT R HAMEL
The use of the cache memory in the PDP-8/F minicomputer
AFIPS Proceedings of the Spring Joint Computer Conference
1971
C G BELL P FREEMAN et al
A computing environment for AI research
Department of Computer Science Carnegie-Mellon University
1971
C G BELL J GRASON
The register transfer module design concept
Computer Design pp 87-94 May 1971
C G BELL A NEWELL
Computer structures
McGraw-Hill 1971
C G BELL D R REDDY C PIERSON B ROSEN
A high performance programmed remote display terminal
Computer Science Department Carnegie-Mellon University 1971
(For IEEE Computer Conference 1971)
Y CHU
Introduction to computer organization
Prentice-Hall 1970
W A CLARK
Macromodular computer systems
AFIPS Proceedings Spring Joint Computer Conference pp 335-336 1967 (This paper introduced a set of six papers by Clark and
his colleagues pp 337-401)
C J CONTI
Concepts for buffer storage
IEEE Computer Group News March 1969
J A DARRINGER
The description, simulation and automatic implementation of
digital computer processors
PhD dissertation Carnegie-Mellon University 1969
S S HUSSON
Microprogramming: Principles and practice
Prentice-Hall 1970
D MCCRACKEN G ROBERTSON
An L* processor for C.ai
Department of Computer Science Carnegie-Mellon University
1971
A NEWELL D MCCRACKEN G ROBERTSON
L DEBENDETTI
L*(F) manual
Department of Computer Science Carnegie-Mellon University
1971
H A SIMON
The sciences of the artificial
MIT Press 1969
W STRECKER
Analysis of instruction execution rates in multiprocessor computer
system
PhD dissertation Carnegie-Mellon University 1970
A TUCKER M J FLYNN
Dynamic microprogramming: Processor organization and
programming
Communications of the ACM 14 pp 240-250 April 1971
M V WILKES
Slave memories and dynamic storage allocation
IEEE Transactions on Computers Vol EC-14 No 2 pp 270-271
1965
M V WILKES
The growth of interest in microprogramming: A literature survey
Computing Surveys Vol 1 No 3 pp 139-145 1969


Computer Structures: Past, Present and Future (abstract)

by FREDERICK P. BROOKS, JR.

University of North Carolina
Chapel Hill, North Carolina

First, Blaauw's law of the persistence of established
technology leads me to predict that both C.p.u. architecture and technology will change little by 1975, and
will be surprisingly similar in 1980.
Second, magnetic bubbles or integrated circuits may at least give us memories in the 100 μsec. range. These will force the complete abandonment of the fast-fading dichotomy between electronic memories, directly addressed, and mechanical memories, treated as input-output. As the memory hierarchy becomes a continuum, radically improved addressing techniques and
block-moving algorithms will be required.
Third, cheap minicomputers lead one to distributed
intelligence systems, with minicomputers replacing disk
control units, display processors, or communications
adapters. Such systems promise to save c.p.u. core,
save c.p.u. cycles, and simplify programming. But first
attempts have saved few c.p.u. cycles and no core, and
programming is worse. The answer seems to be to combine distributed intelligence (separate instruction fetching and interpreting mechanisms) with centralized memory.


Computer Structures: Past, Present and Future (abstract)

by D. B. G. EDWARDS

University of Manchester
Manchester, England

The machine being constructed at Manchester, MU5, has a number of interesting structural features which result from the general aim of designing a system to function efficiently with high level languages.
The first feature is a 'Paging' system which is based on Atlas experience and improved to provide a large virtual address range (34 bits), good protection facilities and an ability to simultaneously handle a number of different page sizes. The next concept, termed 'Naming', was introduced after extensive testing on Atlas of the pattern of operand accesses. A high percentage of accesses is to a limited number of 'Named quantities', and the provision of a small associative buffer memory incorporating a 'Stack' facility is able to significantly reduce references to the main store and hence improve performance. The third concept involves the use of data descriptors, which extend the flexibility of operand accesses by defining elements of a data structure which can be variable in length or arranged in the form of a string. The final feature is the connection of both the processor and its associated main store to a communication highway which links them to other units such as the Mass Core system, Disc backing store or even a second computer system.
In the detailed implementation of the complete system, further use of associative buffer storage is made to minimize the effect of jump orders on instruction accesses and to readdress the storage units provided to give an element of 'fail soft' in that area.

Computer Structures: Past, Present and Future (abstract)

by ALAN KAY

Stanford University
Stanford, California
Computer Design in the seventies will be delineated
by a number of past ghosts and present spectres.
The first is that IC manufacturers are highly motivated toward producing better "FORTRAN" components (faster linear addressed memories, adders, etc.)
and thus most revolutionary designs will run FORTRAN more cost-effectively than the system for which
they were intended. A traditional counter-example to
this statement is the magic which can be done with
CAMs. Unfortunately, the advent of cheap buffer
storage largely obviates all but a masked search in
terms of speed and the necessity to load the CAM still
seems to be a serious liability for overall utility.
A second annoyance is also caused by the cheap
buffer storage. It means that the effective memory
(cache) cycle time is usually between 10 and 20 gate
delays, which means that very little decoding can be
done before a storage cycle is missed. Pipelining helps


alleviate this problem a bit but is antagonistic to
highly branched or recursive evaluators. This means
that it is difficult to hide the data paths on a machine
when attempting to emulate with microcode (which in
fact, now becomes quite "visible" itself).
A more serious problem to confront computer designers is the conspicuous absence of a new, more useful
theory of evaluation on which to base a revolutionary
design. There is some reason to believe that one such
will appear in a few years, but for now, consequential
processes which require essentially a B5500 environment are still being reinvented and understood after
10 to 15 years of life.
The social problems of users and customers who are
unable to adapt to the cost/size tradeoffs implied by
new technology seem to be part of the general inability
to transcend "McLuhanism." The Grosch theory of the
utility of the large processor (etc.) has not redeemed
itself with proof. A "super" processor may give a 10 to 20 times improvement in speed; the elimination of secondary storage would improve many real jobs running in a real environment by 200 to 1000 times!
Minis have embarrassed the computer establishment by being very cost effective compared to the large (747?) machines. Besides being more reliable, their systems are not as bothered by the exponentially growing
complexity problems involved in resource sharing that
the dinosaurs face.
The above implies the following to this discussant:
it is now quite possible to give most users their own
(mini) processor, some memory and a link to various
file systems. Computer "utilities" (as with power) will
rent a service to handle global needs. The increase in
actual computing power to a user may be substantial
enough to allow him a few years of blessed peace from
"Improvitis" during which time he will finally be able
to invent a new theory of algorithmic computation
which is so desperately needed.

A panel session-Computers in sports
The User's Reaction to Football Play Analysis and Player Ranking ... Do They Make Any Difference?

by GIL BRANDT

Dallas Cowboys
Dallas, Texas

The Dallas Cowboys began to examine systematic ranking of college football players in the early 1960s using a computer.
These rankings are employed in the common NFL draft held in the early part of the year, after the Super Bowl. When the draft is in progress, there is not a great deal of time available to make a selection. This means the player selection rankings must be available and provide a listing of the college players in an order most advantageous to the club. The computer was introduced into this ranking process to help eliminate many of the biases which existed previously.
In the beginning, many coaches and members of the team management were skeptical of the computer-produced rankings. However, time has been in favor of the computer, and as a result, most of the NFL teams now employ computer processing of the scouting information. The Dallas Cowboys stand behind their ranking system 100 percent. It has gained Dallas relative unknowns, such as Calvin Hill, who was Rookie of the Year in 1969. The advantages should be obvious.
The computerized method of scouting has also brought some interesting side effects into the overall scouting process. There is now a standard and comprehensive form for all scouts to use. This is read directly by the computer. Scouts are now assigned to a particular area or region, and it has become less necessary to have prospective athletes scouted by many scouts (although the more the better) since the computer program has built-in weighting factors on the scouts themselves.
The Dallas Cowboys coaching staff also utilizes a football play analysis system on a weekly basis. The main advantage of this computer system is to make available analysis of opponent's plays much earlier in the week (typically on Monday after a Sunday game). The tendencies which are found can be incorporated into the practice scrimmage sessions to get the offense and defense teams familiar with the type of attack used by the next weekend's opponent.
The computer is here to stay in professional football, especially with the Cowboys. It was slow starting, but the coaches and team management are finally realizing it is an indispensable tool to aid in player selection and determine tendencies in an opponent's (or one's own) play sequence.
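The weighting of scout reports mentioned above lends itself to a brief illustration. The Python sketch below is purely hypothetical: the scout reliability weights, the grading scale and the aggregation rule are invented for the example and are not the Cowboys' actual model.

```python
# Hypothetical weighted aggregation of scout grades into a draft ordering.
scout_weights = {"scout_A": 1.2, "scout_B": 0.9, "scout_C": 1.0}   # assumed reliability

reports = [                                   # (player, scout, grade on an assumed 0-9 scale)
    ("Player 1", "scout_A", 8), ("Player 1", "scout_B", 6),
    ("Player 2", "scout_B", 7), ("Player 2", "scout_C", 8),
    ("Player 3", "scout_A", 5), ("Player 3", "scout_C", 6),
]

totals, weights = {}, {}
for player, scout, grade in reports:
    w = scout_weights[scout]
    totals[player] = totals.get(player, 0.0) + w * grade
    weights[player] = weights.get(player, 0.0) + w

ranking = sorted(totals, key=lambda p: totals[p] / weights[p], reverse=True)
print(ranking)                                # draft-day ordering, best prospect first
```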

Computers and Scoreboards

by KEN EPPELE

Datex Division
Conrac Corporation

Computer capability in scoreboard control offers
several features:
1. Data presentation in real time.
2. High speed timing data may be gathered, processed and presented to the fans in familiar
units: miles per hour, time behind, etc.
3. Instantaneous comparison to statistical records.
4. Automatic message formatting.
5. High speed presentation, blinking, reversing.
6. Message recall.
7. Character generation including variable size characters, which add flexibility and interest to message presentation.
8. High speed switching concepts permit animation
and slide presentation.
Conrac Corporation has been applying computers to
scoreboard display systems since 1967, when it delivered a mobile golf trailer to IBM, which has been
used on the PGA golf circuit.

This was followed by the Oakland scoreboard, which was the world's first computer-controlled electronic scoreboard to be installed in any major stadium. The computer was programmed to follow the play of the baseball game and develop up-to-the-minute statistics during the course of the game. Each batter's average, for example, is shown when he comes up to bat and reflects his season-to-date average as of the last time at
bat. The computer is also programmed so that a single
entry can cause many events to occur. If, for example,
the count is 3 and 2 on the batter with 2 outs, and the
batter takes a third strike, the operator simply enters
the strike code on the computer keyboard. Internally
the computer updates the player and the team statistics, charging the batter with the strike-out, crediting
the pitcher with one. The computer also causes the
ball, strike, and out indications on the scoreboard to
return to O. It then updates the line score for the team
just retired and automatically brings up the name and
current average of the next batter.
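The "single entry causes many events" idea can be made concrete with the sketch below, which models the strike-out case just described. It is a hypothetical Python rendering, not Conrac's software; the class, its fields and its rules are all assumptions.

```python
class ScoreboardState:
    """Toy model: one keyboard entry cascades into several display updates."""
    def __init__(self, lineup):
        self.lineup, self.batter = lineup, 0
        self.balls = self.strikes = self.outs = 0
        self.strikeouts = {}                      # per-batter totals (assumed statistic)
        self.pitcher_strikeouts = 0

    def enter_strike(self):
        self.strikes += 1
        if self.strikes == 3:                     # third strike ends the at-bat
            name = self.lineup[self.batter]
            self.strikeouts[name] = self.strikeouts.get(name, 0) + 1
            self.pitcher_strikeouts += 1          # credit the pitcher with one
            self.outs += 1
            self.balls = self.strikes = 0         # count indicators return to 0
            self.batter = (self.batter + 1) % len(self.lineup)
            if self.outs == 3:                    # side retired
                self.outs = 0

board = ScoreboardState(["Batter A", "Batter B", "Batter C"])
board.balls, board.strikes, board.outs = 3, 2, 2  # the situation described above
board.enter_strike()                              # single entry: the strike code
print(board.strikeouts, board.pitcher_strikeouts, board.outs, board.batter)
```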
At Ontario Motor Speedway, Conrac installed the world's first automatic timing, scoring and display
system for automobile racing. Radio transmitters, each
generating a unique frequency, are mounted on each
car in the field. Antennas buried in the track sense these
signals and time any car that crosses an antenna. Time
is measured to ± one millisecond, which represents a
distance of four inches at 200 miles per hour. This time
data is processed by the computer and race order information is immediately displayed on in-field pylons which
show the laps completed and the first nine car positions.
During the race, computer print-outs make available
more information, e.g. current order of the entire field,
any car's fastest lap, average speed, and speed of the
current lap, etc. Print-outs are distributed to the press
and track announcers.
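As a quick arithmetic check of the quoted resolution (using nothing beyond the stated speed), at 200 miles per hour one millisecond corresponds to roughly 3.5 inches, consistent with the "four inches" figure:

```python
# Distance covered in one millisecond at 200 mph; illustrative check only.
mph = 200
inches_per_second = mph * 5280 * 12 / 3600
print(round(inches_per_second / 1000, 2), "inches per millisecond")   # about 3.52
```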
The scoreboard for the Dallas Cowboys incorporates
video data terminals. The computer is programmed to
maintain current team and player statistics during the
play of the game. At any time during the game, preformatted messages may be recalled by the operator
with the up-to-the-minute statistics automatically included in the message. Messages may be previewed by
the scoreboard director and displayed on the board at
his direction.
The scoreboard system for the Stadium at Munich
incorporates a minicomputer for primary control of the
matrix scoreboard. This computer also communicates
with a Siemens central computational center which includes a large data bank for all of the Olympic sports.
This establishes a focal point for dissemination of information not only to the facility conducting the event,
but to the facilities for other events. The Conrac com-

puter system accepts these types of messages, stores
them temporarily and presents them at the main
Stadium under command of the local operator. Conrac
is also supplying the world's first Mobile Matrix Scoreboard system to be used for display at the Canoeing
and Regatta events. The mobile equipment is designed
such that it may be operated at one remote facility one
day and at another remote facility the next.
Computers have become an integral part of major
scoreboard facilities and are virtually essential if they
do nothing more than control the hardware in a flexible
manner. Programming flexibility and ease of expansion
allows for computer controlled scoreboards to accomplish other functions, such as statistical analysis or
comparison and data processing, in real time during the
course of the sporting event. Conrac foresees continued
use of computers in future scoreboard installations,
with even greater emphasis on the real-time data processing of the sport.

Football Player Ranking Systems

by ATAM LALCHANDANI

Optimum Systems Incorporated
Palo Alto, California

The TROIKA player selection system has been the
first breakthrough in the application of computer and statistical techniques in
the area of personnel ranking in sports. The success of
such a system can be evidenced by the fact that 23 of
the 26 professional football clubs use a computerized
system for guidance at the yearly draft meetings.
Based on some of the concepts researched above,
extensive work has been done in developing a generalized system for ranking personnel in areas distinct
from sports. It is felt that the tool developed herein can
be a valuable aid in the decision-making concerned with
the hiring and promotion of employees in government
and public institutions. Technological advances can be
made in defining jobs and personnel and the resultant
optimum matching of the two.
OSI is in the process of marketing these ideas to
industry and government and the next couple of years
will show some concrete results in these areas. Currently, OSI sees itself in the research and development

stage, but it will not be long before we have an operational tool that would be a useful addition to all levels
of management.

Portable Computer Timing, Analysis, and Display Systems for Rowing Applications; Legal Problems in Computer Sport Systems

by KENT MITCHELL

JAMCO, Inc.
Palo Alto, California

The firm JAMCO, Inc. has concentrated on making some of the lesser-known sports more popular, using electronic timing devices and display systems. The computer has been used principally (1) to store and retrieve historical information about the sport and its competitors, (2) to analyze data generated by automatic and semiautomatic interval timing equipment, and (3) to operate electronic display equipment as a visual aid to spectators.
JAMCO's specific objective in rowing is to overcome certain spectator and press information problems, which are:
1. The inability presently to view the entire 1¼ mile long race from start to finish.
2. The lack of knowledge about the personal and competitive backgrounds of the 400 to 500 oarsmen who compete in major world-class regattas each year.
3. The poor press coverage given rowing for the above reasons and because typical "official timing" systems and methods for reporting results are inaccurate, unimaginative, misleading, and antiquated.
There has also been recent interest in computer systems which aid in competitive strategy and, as a result, attempt to predict the outcome of future competitions. Legal problems begin to appear when these techniques can actually affect the outcome.
Sports governing bodies, professional and amateur, have already begun to consider these problems, and some regulation has resulted. The implementation of real-time analysis and display systems during competition appears to be the area where the most regulation will be necessary. Additional restrictions may be imposed when these electronic devices used to generate data are attached to the equipment.

Computers in Track

by J. G. PURDY

TRW Incorporated
Sunnyvale, California
There have been a number of scoring tables which
have been developed for track and field. The purpose of
these tables is to compare performances (via a point
score) between the different events, a higher score indicating a better performance. Historically, the scoring
table was introduced as a necessity for the 1912 Olympic Games, where the first decathlon was staged. Here, the scoring table provided a method to evaluate the best
overall athlete in the 10 events of the competition.
The official ruling body in track athletics, the International Amateur Athletic Federation (IAAF), has
adopted scoring systems in 1912, 1934, 1952, and 1962.
Each succeeding scoring system was, supposedly, better
than the previous system. Rule changes, new equipment
and training methods greatly improved the performances in some events while leaving others behind; this
destroyed the "equality" of the point score. The newer
scoring tables attempted to reestablish the equality of
the point scores and often were based on different
principles.
The time has come again for the reevaluation of the
current scoring system. New records are being made, which causes some inequality in the point scores. However, the present tables were based on the physics of
the performances only; physiological considerations
were purposely ignored since the creators thought a fair
system could be developed which was based on physics
alone.
My current research has attempted to model the
physiological effort associated with a performance rather
than just the physics. A rather sophisticated computer
program has been written to generate the scoring tables
for the different events in the many various required
formats. It is anticipated that these new tables will be
presented to the IAAF for possible ratification as the
official decathlon scoring tables.


It should be pointed out that the computer is almost
essential in this task. Analysis of the thousands of performances and listing of the tables in the many different
formats would almost be impossible without a computer.

Prospects for Sophistication in Football
Play Analysis

by FRANK B. RYAN

U.S. House of Representatives
Washington, D.C.

It is a familiar task in football coaching ranks to
analyze opponents from the point of view of trends and
of statistical frequencies which might lead to a more
confident decision on how to conduct game strategy.
This analysis pervades the minds of football coaches at
all levels but finds its fullest expression in the professional
area.
To put things in perspective, a review of current
techniques is worthwhile. Computer-aided analysis is
not new to professional football and had its beginnings
some years ago when modern methods confronted the
age-old problem of assimilating a large quantity of data
efficiently and effectively. The usual procedure begins
by encoding football information which is then converted to machine-compatible form. Reports analyzing
information covering several games are generated in a
pre-structured format which invariably is finalized once
and for all prior to the beginning of a season. Generally
these reports produce simple frequency counts for a
variety of situations and leave interpretation and
weighting of results up to the coach. The really useful

advantages of computer assistance apparently are only
faintly recognized by most modern-day coaching staffs,
including those who claim a heavy dependence upon
this tool.
There are many reasons for this current imbalance.
Tradition, of course, plays an important role as well as
the very human element of "teaching an old dog new
tricks." In addition, many coaches believe that a simple
approach must be the best approach. The main force
here appears to be severe changes which computer aids
impose on the mode of coaching operation. And yet these
changes have not been supported fully by theoretical
developments which would inspire confidence in the
new methods.
As a first step in attacking the general problems facing
computer-aided strategy analysis today, a general retrieval system, called PROBE, was conceived. The plan
was to develop a system with great flexibility in three
important areas: data base definition, report generation,
and formatting of the output display. This system caters
to the individual interests and methods of different
coaching staffs and attempts to bring computer assistance to the coach, rather than the coach to the
computer. At best this approach is an intermediate step toward providing modern technological aids to football
strategy analysis.
Looking ahead to the future, one wonders if the game
of football can survive the impact of sophistication.
There certainly is a point of diminishing returns which,
however, has not yet been reached, and ultimately the
flavor of professional football would appear to be in
jeopardy from excessive analysis. To provide an adequate base for a sophistication in game strategy which
does not compromise the game's basic appeal, there
needs to be attention devoted to a number of areas,
among them linguistics, decision theory, and the interplay between team, mass, and individual psychologies.
Interestingly enough, the game does and will continue
to provide a fruitful model for fundamental studies in
each of these areas.

On the hybrid computer solution of partial differential
equations with two spatial dimensions*
by GEORGE A. BEKEY
University of Southern California
Los Angeles, California

and
MAN T. UNG
Dillingham Environmental Company
La Jolla, California

* This research was supported in part by the U.S. Air Force Office of Scientific Research under Grant Number AFOSR 71-2008.

INTRODUCTION

For a number of years it has been suggested that one of the fruitful areas of application for hybrid computation might lie in the study of distributed parameter systems.1,2 In principle, the combination of analog computer speed with the memory and logical capabilities of digital machines should make it possible to solve partial differential equations both efficiently and rapidly. However, most of the published work in the field has dealt only with linear problems in one spatial dimension and time, where the advantages and limitations of the particular methods do not stand out clearly.
The purpose of this paper is to present a detailed analysis of three methods applicable to the hybrid solution of nonlinear parabolic partial differential equations with two spatial dimensions, and to apply two of these methods to a specific "benchmark" problem. By using different methods to solve the same problem, their relative merit can be evaluated in proper perspective.

REVIEW OF HYBRID METHODS FOR ONE-DIMENSIONAL PARTIAL DIFFERENTIAL EQUATIONS

As a background to the study of two-dimensional systems, the major techniques for one-dimensional equations will be reviewed briefly. Consider a linear, one-dimensional diffusion equation

    a ∂²U(x,t)/∂x² = ∂U(x,t)/∂t    (1)

with the following Dirichlet boundary conditions

    U(0, t) = g(t)
    U(L, t) = c(t)    (2)

and the initial condition

    U(x, 0) = b(x)    (3)

Various alternative hybrid computer methods for solution of (1) differ primarily in the way in which the variables x or t or both are discretized, in order to obtain ordinary differential equations or algebraic equations.

Discrete-space-continuous-time approximations

In this method the x-domain is represented by discrete stations or nodes which divide the x-axis into M segments.* The nodes are numbered consecutively from 0 to M and Equation (1) is approximated by the system of ordinary differential equations

    dUi(t)/dt = [a/(Δx)²][Ui+1(t) - 2Ui(t) + Ui-1(t)]    (4)

    Ui(0) = bi;    i = 1, 2, 3, ..., M-1

* The segments are generally of equal length, but need not be. For example, in the semi-infinite domain 0 ≤ x < ∞ it may be more convenient to divide the x-axis using a logarithmic or exponential scale.


where, by convention, Ui(t) denotes U(iΔx, t) and bi denotes b(iΔx). If the number of stations M is not too
large one can solve Equation (4) in parallel, entirely
on the analog computer.
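For readers who wish to see the semi-discretization of Equation (4) in concrete form, the minimal Python sketch below time-marches the same ODE system with forward Euler, standing in for the parallel analog integration. The coefficient a, the boundary functions g(t) and c(t), the initial profile b(x) and the grid size are assumptions chosen only to make the sketch runnable; they are not values from the paper.

```python
import numpy as np

a, L, M = 1.0, 1.0, 20               # diffusivity, domain length, number of segments (assumed)
dx = L / M

g = lambda t: 1.0                    # U(0, t), assumed boundary function
c = lambda t: 0.0                    # U(L, t), assumed boundary function
U = np.zeros(M + 1)                  # U(x, 0) = b(x) = 0, assumed initial profile
U[0], U[-1] = g(0.0), c(0.0)

dt = 0.4 * dx**2 / a                 # inside the explicit stability limit
t = 0.0
while t < 0.1:
    # Equation (4): dUi/dt = [a/(dx)^2][U(i+1) - 2U(i) + U(i-1)], i = 1..M-1
    U[1:-1] += dt * a / dx**2 * (U[2:] - 2.0 * U[1:-1] + U[:-2])
    t += dt
    U[0], U[-1] = g(t), c(t)

print(U.round(3))
```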

Continuous-space-discrete-time approximation

In this method the spatial variable x is kept continuous while the temporal variable t is represented by a sequence of discrete steps, nΔt, n = 0, 1, 2, ..., P. Then a finite-difference approximation to Equation (1) may be written as

    d²Un(x)/dx² = [Un(x) - Un-1(x)]/(aΔt)    (5)

where, by definition

    Un(x) = U(x, nΔt)
    Un(0) = gn
    Un(L) = cn

This is a split-boundary-value problem whose solution consists of hyperbolic functions at each time step. It can be shown3,4 that direct solution of (5) by iterating the unknown initial condition dUn(0)/dx until the second boundary is matched leads to computational instability.

Method of decomposition

In order to avoid the stability problems inherent in the classical solution, Vichnevetsky5 introduced his Method of Decomposition, which amounts to breaking Equation (5) into two stable first-order differential equations. One is integrated in the forward direction while the other is integrated in the backward direction.
Another approach to the stability problem in Equation (5) was used by Hara,6 who transformed the problem into one of optimal control.

Monte Carlo methods

Finally, Monte Carlo methods7,8 have been used (in two and three spatial dimensions) for the evaluation of the variable U at a limited number of points in the space under consideration.
The application of hybrid computers to the solution of two-dimensional problems is based on generalization of the first three methods discussed above. Three methods have been used successfully in solving two-dimensional partial differential equations: the "Component-Sharing Method," the "Explicit/Implicit Method" and the hybrid implementation of the Alternating-Direction Implicit procedure.

THE "COMPONENT SHARING" METHOD

This method is a generalization of the discrete-space-continuous-time procedure because only the spatial dimensions are discretized. The Component-Sharing method is designed to reduce the number of analog components required to solve all stations of the x-y plane in parallel. The idea is to divide the net into equal subdivisions so that there is enough analog equipment to simulate every node within one subdivision in parallel. Then the same analog circuits are used to simulate other subdivisions in a certain order until the whole net is covered. At each station we obtain an approximation Ui,j^k(t) to the true solution. The superscript k refers to the iteration number. The procedure is to iterate across the medium again, solving each subdivision serially, in the same order as designated above, to come up with a better approximation Ui,j^(k+1)(t) based upon the initial and boundary conditions. In subsequent iterations, the assumed values Ui,j(t) are the ones obtained from the previous computation. In this method the digital computer's role is that of control plus data storage and playback. The first attempt to use this method was carried out by Howe and Hsu9 on the IBM 7090, using a fourth-order Runge-Kutta formula to simulate the analog computer integration.
Consider the following linear parabolic equation:

    a[∂²U/∂x² + ∂²U/∂y²] = ∂U/∂t    (6)

with boundary and initial conditions

    U(0, y, t) = g(y, t)
    U(L, y, t) = c(y, t)
    U(x, 0, t) = h(x, t)    (7)
    U(x, L, t) = r(x, t)
    U(x, y, 0) = b(x, y)

There are many ways of creating a net. For example, the shape may be a square, a rectangle, an entire row or an entire column. The last alternative is adopted in this paper. Allowing t to be the analog independent variable we end up with the following set of difference-differential equations representing (6) for a = 1:

    dUi,j^k(t)/dt = μUi+1,j^(k-1)(t) + μUi-1,j^k(t) + λUi,j+1^k(t) + λUi,j-1^k(t) - 2(μ+λ)Ui,j^k(t)    (8)


for i = 1, 2, 3, ..., M-1 and j = 1, 2, 3, ..., N-1, where μ and λ are defined as

    μ = 1/(Δx)²,    λ = 1/(Δy)²

Equation (8) indicates that we need one integrator per node on the i-j plane. If we have enough integrators, the whole column can be simulated on the analog computer, the same equipment being used sequentially to simulate columns i = 1, 2, 3, ..., M-1 in that order. During the first iteration, solutions for all but the first column must be assumed to provide a set of starting boundary conditions for each column. Each new iteration yields a closer approximation to the true solution to Equation (6). The solutions are said to be convergent whenever, for an arbitrary ε > 0, there exists a number K such that for all k > K and for all i, j and n

    | Ui,j^k(nΔt) - Ui,j^(k-1)(nΔt) | < ε    (9)

Figure 4-Flowchart for the method of component-sharing (nonlinear diffusion equation)
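The component-sharing sweep can be made concrete with a short digital sketch. The Python code below imitates Howe and Hsu's approach of simulating the per-column analog integration numerically (they used a fourth-order Runge-Kutta formula; forward Euler is used here for brevity). The grid size, time step, tolerance and boundary/initial values are illustrative assumptions, not values from the paper.

```python
import numpy as np

M = N = 10                      # stations 0..M in x and 0..N in y (assumed)
dt = 1e-4                       # Euler step standing in for the analog integration
steps = 500                     # time steps covering the interval of interest
mu = lam = 1.0 / (1.0 / M)**2   # mu = 1/(dx)^2, lam = 1/(dy)^2 on a unit square
eps = 1e-4                      # tolerance for convergence criterion (9)

# U[i, j, n] holds the current iterate for station (i, j) at time step n
U = np.zeros((M + 1, N + 1, steps + 1))
U[1:M, 1:N, 0] = 1.0            # assumed initial condition b(x, y); boundaries stay 0

for k in range(1, 51):          # outer component-sharing iterations
    U_prev = U.copy()           # iterate k-1, used for columns not yet swept
    for i in range(1, M):       # sweep the columns serially
        for n in range(steps):  # all nodes of column i integrated "in parallel"
            for j in range(1, N):
                dU = (mu * U_prev[i + 1, j, n] + mu * U[i - 1, j, n]
                      + lam * U[i, j + 1, n] + lam * U[i, j - 1, n]
                      - 2.0 * (mu + lam) * U[i, j, n])       # Equation (8)
                U[i, j, n + 1] = U[i, j, n] + dt * dU
    err = np.max(np.abs(U - U_prev))                          # criterion (9)
    if err < eps:
        break

print(f"stopped after {k} iterations, max change {err:.2e}")
```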


form Vi,j in Figure 3. That is why the continuous
analog coupling terms Ui,j-1 and Ui,j+1 are sampled before they are added to the output of the DAC. A storage block of 14 × 14 × 53 words is needed for the
data of the program. The digital portion of the hybrid
program, as written, will work for any nonlinearity.
Variations occur only in the analog program where
different functions have to be set into the DFGs.
Directing the information flow between the analog computer and the mass storage is the main task of the
digital computer. In practice it is not feasible to write
one hybrid program to fit every occasion because
changes in the initial conditions and boundary conditions result in scaling problems. Digital computer
users grow accustomed to the convenience of floating-point arithmetic. Hybrid programs, especially the analog portion, are restricted to a fixed-point regime. For example, Figure 3 should be rescaled if max Ui,j(t) < 0.5. In that case, only a fraction of the dynamic range of the analog computer is utilized. As it stands, Figure 3 is not yet "optimally scaled" because 0 ≤ Ui,j(t) ≤ 1. In other words, only one-half of the analog computer dynamic range is covered by the variable Ui,j(t). We could have offset Ui,j(t) in such a way that its scaled value reads

    -1.0 ≤ 2[Ui,j(t) - 0.5] ≤ 1.0

This type of biasing of a variable is sometimes done
when accuracy considerations override the extra expenditure of time and equipment. As the solution develops, the program types out the convergence error
ε for each iteration to inform the operator of the progress being made. The quantity ε was defined in Equation (9), and in the current program

    ε = max | Ui,j,n^k - Ui,j,n^(k-1) |

for all i, j and n. It was found that, starting at the
same initial guesses, the nonlinear problem converges
at the same rate as its linear counterpart (a(U) = 1). One possible reason is the fact that the chosen nonlinearity a(U) does not vary drastically with the temperature U. A typical computer output shows that the solution converges to within 0.1 percent in 20
iterations.
The implicit alternating direction method


The application of this method to the benchmark problem is not available at this time. However, Bishop has a working program* on the performance of a gas storage reservoir, also a diffusion equation. He implemented a linear model of the reservoir in two spatial dimensions on the EAI 8900 hybrid computing system.

* The information was supplied by Dr. Kenneth Bishop to the authors in a private communication.

For a grid of 15 × 15 in the spatial dimensions and for 14 time steps his program consumed 2.6 seconds, and his results agreed with comparable digital results within 1 part per 10,000. It is safe to assume that the execution time would remain essentially at 2.6 seconds for the nonlinear problem as well, if one is willing to handle the nonlinearity on the analog computer. The situation in this case is similar to that of the Component-Sharing method. All we need is to add some DFGs to the existing analog diagram.

Solution of the test problem by the explicit/implicit method

Equation (31) becomes

    d²Uj,n(x)/dx² = -λ[Uj+1,n-1(x) - 2Uj,n-1(x) + Uj-1,n-1(x)] + [Uj,n(x) - Uj,n-1(x)]/[a(Uj,n(x))·Δt]    (35)

The above expression can be put in a form similar to that of (22), but this time we have a system of nonlinear differential equations. For j = 1, 2, 3, ..., N-1

    d²Uj,n(x)/dx² - Θj(Uj,n(x))·Uj,n(x) = Rj(x)    (36)

where

    Θj(Uj,n(x)) = [a(Uj,n(x))·Δt]⁻¹    (37)

    Rj(x) = -λUj+1,n-1(x) - νj(Uj,n(x))·Uj,n-1(x) - λUj-1,n-1(x)    (38)

and

    νj(Uj,n(x)) = [a(Uj,n(x))·Δt]⁻¹ - 2λ    (39)

The Explicit/Implicit method is based upon the superposition principle in the decomposition procedure and in matching the boundary conditions of Uj,n(x). To apply this method we must linearize Equation (36). Linearization, which yields a set of linear differential equations with varying coefficients, rests upon the search for a purely space-dependent coefficient wj(x) such that Θj(Uj,n(x)) ≈ wj(x). Once a suitable function wj(x) is found, the Explicit/Implicit method can be applied to the solution of the equation

    d²Uj,n(x)/dx² - wj(x)·Uj,n(x) = Rj(x)    (40)

A predictor-corrector approach to this problem was suggested by Vichnevetsky.14
Let the "predicted value" Uj,n*(x) be represented by the linear sum

    Uj,n*(x) = Σ(k=1 to K) ck·Uj,n-k(x)    (41)

where ck are constants. Substituting Uj,n* into the


right-hand side of Equation (37) yields

    wj(x) = [a(Uj,n*(x))·Δt]⁻¹    (42)

A space-dependent term comparable to νj(Uj,n(x)), say Vj(x), must be found by using the same substitution as done in (42). With the help of wj(x) and Vj(x) we are prepared to solve for Uj,n(x) in Equation (40). The newly found Uj,n(x) can be considered as the "corrected" value. If |Uj,n*(x) - Uj,n(x)| is still deemed too large, we can always reinsert Uj,n(x) into the coefficient (42) and again solve for yet another new value of Uj,n(x). The process can be repeated a number of times to improve the approximation. Hence the original nonlinear Equation (36) can be approached as closely as desired.
Straightforward decomposition as done earlier is no longer permissible, since wj is now a function of x. To solve the nonlinear problem we define

    (43)

This is a Riccati differential equation. It can be shown15 that its solution is well-behaved for the problem under consideration. We now apply the Explicit/Implicit method, using σj(x) instead of [wj(x)]^(1/2), and obtain

    dZj(x)/dx - σj(x)·Zj(x) = Vj(x)    (45)

It can be verified (by substitution) that Zj(x) actually satisfies the ordinary differential Equation (40).
Figure 5 contains the analog diagram for one cell of the problem. The flowchart can be found in Figure 6.

Figure 5-Explicit/implicit method analog diagram. Logic signals are shown by dashed lines. Mono is the abbreviation for monostable multivibrator

Figure 6-Explicit/implicit method flowchart

Results

Since an analytical solution to the problem is not known, a reference solution Ūi,j(t) was generated digitally using the implicit alternating direction method. We then chose to evaluate the hybrid solution Ui,j(t) by computing the criterion function

    Ei,j(t) = | Ūi,j(t) - Ui,j(t) |

which is the absolute value of the difference between the digital and the hybrid solutions. The time histories of three points along the main diagonal of the x-y plane are singled out for our study. Point (1, 1) is closest to the upper left corner (see Figure 2); point (6, 6) is right at the center of the x-y plane and point (9, 9) on the main diagonal is closest to the cut-out. Figure 7 contains the error plots for these three points. It is evident that the accuracy of the two methods is of the same order of magnitude.
The solution times (using the EAI 690 hybrid computer) were:

    Component-sharing method    8 seconds
    Explicit/implicit method    8 minutes

However, the latter time should not be taken as a norm for the explicit-implicit method. By allocating


the solution of W1(x, y) and W2(x, y) to the analog computer, the 8 minutes can be reduced to approximately 10 seconds. The all-digital solution on the small fixed-point digital computer system required 38 minutes. Evidently, this time can be reduced drastically by using larger machines.

Figure 7-Errors in the nonlinear solutions. In this figure E{Ui,j(t)} = |Analytical Ui,j(t) - Computed Ui,j(t)|

DISCUSSION OF RESULTS AND CONCLUSIONS

Based upon the experience encountered we draw the following conclusions about the merits of the two hybrid methods examined in detail in this paper:

Component-sharing method

1. Iterations are needed because the starting values for the interior nodes are not known at the beginning of the computation.
2. Solution is unfolded as a function of time.
3. Unconditionally stable.15
4. Solution speed is limited only by the digital computer.
5. A general executive program can be written to handle both linear and nonlinear equations, even to accommodate a class of geometrical configurations.

One severe drawback of this method is the large memory requirement for any practical problem.

Explicit/implicit method

1. No iteration needed for linear partial differential equations.
2. Solution is displayed as a function of one spatial dimension.
3. Conditionally stable.15
4. Solution speed is limited only by the digital computer.
5. A general executive program may be written but it must receive scaling information prior to execution. Rescaling is necessary when there is a change in Δt, Δx, Δy or the nonlinearity.
6. Digital memory requirement is confined to storing one past time plane (or one dimension less than the Component-Sharing method).

As for the hybrid implementation of the Alternating-Direction Implicit Method, it is too new to be judged.
An exhaustive list of its properties can be obtained only
after someone studies it carefully, in a hybrid environment. However, some of its characteristics are evident:
Hybrid alternating-direction implicit method
1. No iterations needed.
2. Unconditionally stable.
3. Solution is piecewise continuous in the time
dimension.
4. High accuracy can be attained in the hybrid
solution because we only integrate the change
in the dependent variable using the full dynamic
range of the analog computer.

Accuracy comparisons between digital and hybrid
methods are not meaningful, since hybrid computer accuracy is always limited by the precision of its
analog components. While hybrid solution times are
extremely short, the programming effort is considerably
greater and requires knowledge of not only analog and
digital computation but of the interface problems as
well.
Concluding remarks

It can be concluded that the hybrid computer solution of partial differential equations with two spatial
dimensions is at best only a partial success, since it
combines, to some degree, both the advantages and the
disadvantages of the digital and the analog machines.


That is, the hybrid solution receives the benefit of the
speed of the analog computer and the memory and the
logic of the digital computer, but it suffers from the accuracy limitation of the analog and interface hardware.
One firm conclusion we can draw is that the hybrid approach is justifiable only when the program is intended
to be run many times, in order to offset the investment
in programming and checkout. These conclusions are
not startling in any way. In fact, since similar results
had been obtained for hybrid computer solution of one-dimensional partial differential equations, the results we obtained were expected.
Unfortunately, it appears that hybrid computers are
not a panacea for the difficulties of partial differential
equations, which continue to challenge both analysis
and computation.
REFERENCES
1 W J KARPLUS
Analog simulation-solution of field problems
McGraw-Hill New York 1958
2 G A BEKEY W J KARPLUS
Hybrid computation
John Wiley New York 1968
3 S K CHAN
The serial solution of the diffusion equation using
non-standard hybrid techniques
IEEE Trans on Computers Vol C-18 No 9 1969
4 H WITSENHAUSEN
Hybrid solution of initial value problems for partial
differential equations
MIT Electronic Systems Lab Report No 8 1964
5 R VICHNEVETSKY
A new stable computing method for the serial hybrid
computer integration of partial differential equations
Proc SJCC 1968
6 H H HARA W J KARPLUS
Application of functional optimization techniques for the
serial hybrid computer solution of partial differential
equations
Proc FJCC 1968
7 W D LITTLE
Hybrid computer solutions of partial differential equations
by Monte Carlo method
Proc FJCC 1968

8 H HANDLER
High-speed Monte Carlo technique for hybrid computer
solution of partial differential equations
PhD Dissertation Electrical Engineering Department
University of Arizona 1967
9 R M HOWE S K HSU
Preliminary investigation of a hybrid method for solving
partial differential equations
Applied Dynamics Report 1967
10 J DOUGLAS JR
On the numerical integration of ∂²u/∂x² + ∂²u/∂y² = ∂u/∂t
by implicit methods
J Soc Indust Appl Math Vol 3 No 1 1955
11 D W PEACEMAN H H RACHFORD JR
The numerical solution of parabolic and elliptic differential
equations
J Soc Indust Appl Math Vol 3 No 1 1955
12 K A BISHOP
Hybrid computer implementation of the alternating
direction implicit procedure for the solution of
two-dimensional parabolic partial differential equations
AIChE Journal Vol 16 No 1 1970
13 L N CARLING
Hybrid computer solution of heat exchanger partial
differential equations
Annales de l'Association Internationale pour le Calcul Analogique (AICA) 1968
14 R VICHNEVETSKY
Serial solution of parabolic partial differential equations;
the decomposition method for nonlinear and
space-dependent problems
Simulation 1969
15 M T UNG
Hybrid solutions to parabolic partial differential equations
with two spatial dimensions
Ph D Dissertation Electrical Engineering Department
University of Southern California 1970
16 J DOUGLAS JR C M PEARCY
On convergence of alternating direction procedures in the
presence of singular operators
Numerische Mathematik Vol 5 1963
17 C PEARCY
On convergence of alternating direction procedures
Numerische Mathematik Vol 4 1962
18 J DOUGLAS JR
Alternating direction methods for three space variables
Numerische Mathematik Vol 4 1962
19 J DOUGLAS JR A O GARDER C PEARCY
Multistage alternating direction methods
SIAM J Numer Anal Vol 3 No 4 1966

Numerical solution of partial differential
equations by associative processing
by P. A. GILMORE
Goodyear Aerospace Corporation

Akron, Ohio

INTRODUCTION

In a number of recent articles the application of associative processing to a variety of data processing problems has been considered.1,2,3,4,5,6 In this paper we consider the application of the parallel arithmetic capability offered by associative array processors to the numerical solution of partial differential equations. A set of equations concerned with weather forecasting is selected as a representative problem to show the methodology used in applying associative processing technology. An associative array processor implementation of a numerical solution to the equations by a time-marching process is developed for a 50×50 mesh and corresponding execution times are given. The solution yields the time-dependent behavior of three time- and space-dependent variables which represent x and y components of wind velocity and height of a constant pressure surface. The associative array processor employed is the Goodyear Aerospace STARAN IV. Since its organization and operation are not widely known, the next section of this paper is devoted to a description of the associative array processor and its operation.

THE ASSOCIATIVE ARRAY PROCESSOR

General characteristics

The Goodyear-developed Associative Array Processor (AP) is a stored program digital computing system capable of operating on many data items simultaneously; both logical and arithmetic operations are available. The principal components of an AP system are as follows: (1) An associative memory array (AM) in which are stored data on which the AP operates. Typically, the AM may consist of 4096 words, each of 256 bits. (2) A response store configuration for each word of the AM, which provides arithmetic capability, read/write capability, and indication of logical operation results. (3) An (optional) funnel memory of as many words as the AM, each word of 32 to 128 bits, depending on user need. The funnel memory provides the AP system with both high speed temporary storage and high speed I/O to external devices. Data transfer between the AM and funnel memory is on a serial-by-bit, parallel-by-word basis; data transfer between the funnel memory and external devices is on a parallel-by-bit, serial-by-word basis. (4) A data/instruction memory in which are stored the AP program, i.e., the list of instructions executed by the AP, and data items required by (or generated by) the AP but not maintained in the AM. (5) A control unit which directs the AP to execute the instructions specified by the AP program; this control unit is similar to control units found in conventional computers. Communication channels are provided between the data/instruction memory and the sequential control unit and between both of these units and external devices.
One other unit of the AP must be mentioned here; that unit is the comparand register (CR). The CR may contain as many bits as an AM word and is used both to transmit data into the AM on a parallel-by-bit, serial-by-word basis and to specify masking conventions for AP operations.
A simplified representation of the AP is given in Figure 1.

AP operations

We are principally concerned with the parallel
execution of arithmetic operations in the AP and, to a
lesser extent, with internal transfer of data. An understanding of the AP's parallel arithmetic capability can
be facilitated by considering Figure 2 which depicts the
word/field structure of a hypothetical 10 word, 20 bit

AM. (For a full discussion of the AP's logical and arithmetic capability the reader is referred to Reference 7.)
In Figure 2, each of the 20 bit words has been (arbitrarily) divided into two 5 bit fields and one 10 bit field. Other field assignments could have been made and they need not be the same for all words. Field specifications are made by the programmer in accordance with computational and storage requirements at a given stage of a program; the specification is logical, not physical, and can be changed for any or all words at any point in the program by the programmer.

Figure 1-Simplified AP structure

Figure 2-AM structure

If, for i = 1, 2, ..., 10, we denote word i by Wi; the contents of field 1 of Wi by (F1i); the contents of field 2 by (F2i); and the contents of field 3 by (F3i), then (at least) the following computations can be done in "parallel," that is, simultaneously by word, sequentially by bit. (Such parallel computations are often called "vector" computations, since they involve like operations on corresponding elements of two vectors of operands.)

    (F1i) ⊕ (F2i)  or  (F2i) ⊕ (F1i),    i = 1, 2, ..., 10,  ⊕ ∈ {+, -, *, ÷}

The field into which the results of the operations are stored is specified by the programmer. For example, the results of the ± operations could be stored in either Field 1, Field 2, or Field 3. We denote this, for example, by:

    i = 1, 2, ..., 10
or
    i = 1, 2, ..., 10
or
    i = 1, 2, ..., 10

In the first two specifications, the original values (F1i) or (F2i), respectively, would be destroyed; in the third specification (F1i) and (F2i) would be unaltered.
In * or ÷ operations, a double length product or quotient will be available. To save the double length result we would be restricted to placing the result in the "double length" field, F3. For example,

    i = 1, 2, ..., 10

The original values (F1i), (F2i) would be unaltered.
Operations such as those described above are referred to as "within word" arithmetic operations. We have also available "register to word" operations and "between word" operations.
In register to word operations, the contents of a specified field of the comparand register, denoted by (CR), is used as an operand. A typical register to word operation would be:

    i = 1, 2, ..., 10
or
    i = 1, 2, ..., 10

In between word operations, the operand pairs derive
from different words. For example, in the operation
i= 1,2, ... ,8

Numerical Solution of Partial Differential Equations

field 1 of word 1 is multiplied by field 1 of word 3 and
the result placed in field 3 of word 1; field 1 of word 2
is multiplied by field 1 of word 4 and the result placed in
field 3 of word 2 ... ; field 1 of word 8 is multiplied by
field 1 of word 10 and the result placed in field 3 of
word 8. We could likewise specify an operation such as
i=1,2, ... ,9

We note that for between word operations, the
"distance" between words from which operand pairs are
derived is constant, that is, with each word i, we
associate a word i±.!l.
Such between word operations are executed in
parallel but are more time consuming than within word
or register to word operations. The increase in time is
proportional to the distance .!l.
In the preceding examples, operand pairs were
derived from either AM word/field locations or the
comparand register and results were stored in AM
word/field locations. For AP systems incorporating a
funnel memory, one element of each operand pair can
be derived from the funnel memory and results can be
stored in the funnel memory; operations taking place,
as before, in parallel. Simple data transfer operations
between the AM and the funnel memory of course
proceed in a word parallel, bit serial fashion.
As the reader may suspect, the bit serial nature of AP
operations results in long execution times if computation
is considered on a per word basis. The source of computational advantages for an AP lies in the AP's
ability to do many, indeed thousands, of operations in
a word parallel fashion and thus give, for properly
structured computations, effective per word execution
times which are very attractive as we shall see.
In the following section we shall show how computations required in the weather forecasting problem can
be structured to take advantage of the AP's parallel
arithmetic capability.
A SIMPLIFIED WEATHER FORECASTING
PROBLEM
Problem statement

The National Oceanic and Atmospheric Agency
(NOAA), a Federal Agency which includes among its
responsibilities the development of methods of weather
forecasting, has provided Goodyear Aerospace with a
math modelS which is a simplified version of the math
model currently used in weather forecasting computations. The math model consists of a system of difference
equations which are the discrete version of a system of

413

partial differential equations. The simplified model is
both interesting and useful in that it does in fact provide
a representative propagation problem and the computations required for solving the problem are typical of
the computations required for the model actually used
in weather forecasting. We shall show that such computations may be structured for effective AP execution.
The equations in the math model involve three time
and space dependent variables u=u(x, y, t); V=
vex, y, t); h=h(x, y, t) which give, respectively, the
x component of wind velocity; the y component of wind
velocity; and the height (above some reference) of a
surface of constant barometric pressure.
The system of differential equations is given by:

au
au
au
ah
- +u - +v - -fv+g- =0
at
ax
ay
ax
av
av
av
ah
- +u- +v- +fu+g- =0
at
ax
ay
ay
ah
ah
ah
(au av)
-+u-+v-+h -+- =0
at
ax
ay
ax
ay

(1)

The corresponding difference equations are:

Xy

Xy

h t = - {uXYh:l+vXYhl/+hXY(uXY+V,/) }

(2)

The subscripts indicate partial derivatives; superscripts denote spatial averaging; f and g are constants.
These notational conventions are those of Dr. Shuman
of NOAA.
The equations are to be solved over an nXn (say
50X50) uniform square mesh with spacing "d." Initial
conditions specify, at each point of the mesh, values of
u, v, and h at time to. The solution is specified by a
marching (in time) procedure in which we approximate,
for each variable, its time derivative at time tk = to + k.!lt
and then predict its value at time tk+1 by a truncated
Taylor series. For example:
(3)
An alternate marching method proceeds by computing, based on initial values at time to, values at time
to+.!lt = t1, and then proceeding in a "leap frog" technique. For example:
(4)

414

Fall Joint Computer Conference, 1971

-21

-22

-23

-24

-16

-17

-18

-19

-11

-12

-13

-14

X

-6

X

-7
X

1

X

-8
X

-2

-25

X

-9
X

3

X

-4

5

Figure 3-5 X 5 mesh

Computations involved in the two marching methods
are nearly identical; storage requirements for computer
implementations differ in that in the first method,
given by (3), a set of values for the variables u, v and
h at only one time period is saved from step to step
while the second method given by (4), requires saving
sets of values at two time periods at each step. Selection
of the second method may be dictated by numerical
accuracy and stability considerations.
Whatever marching procedure is employed, it is
executed for each variable u, v and h at each interior
point of the mesh, throughout the forecast period.
In order better to expose the computational requirements of a marching procedure we shall in the following
item consider an illustrative example based on a 5X5
mesh.

of these boxes are denoted by the x's of Figure 3. We
shall choose the latter point of view and at each of the
4 box centers (and for each variable) compute an
approximation to the time derivative. The 4 computed
values will then' be averaged and taken as the time
derivative value at point 7. This procedure is repeated
for each interior point of the mesh.
Within each box the time derivative will be computed
in terms of certain space derivatives, as indicated by
Equation (2). For example, in computing Ut, the
approximate time derivative of u, the approximate
space derivative U x is required. The calculation of space
derivatives is specified in a box-to-box fashion as
follows. Consider the box {I, 2, 6, 7}. For this box
center, the derivative of u with respect to x can be
approximated by either

or

where Ui denotes the value of u at mesh point i, and
d is the mesh spacing. If we average the two approximations in the y direction, we have a better approximation, namely

Similarly, for the rest of the boxes formed by the
first two rows of mesh points we have, moving left to
right, the following approximations which we refer to
as "Form 1."
UxY = ~((Ua-U2) jd+ (us-u-r) jd)

Computation example
Uz

For purposes of example we shall consider a solution
of the simplified weather prediction problem over a
5 X 5 mesh having the mesh point ordering conven tion
of Figure 3.
We shall employ marching method (3), the resulting
computations, programming, and timing being but
little changed if method (4) is selected.
For each variable u, v and h, the time derivative
computation at each interior point of the mesh of
Figure 3 will proceed in the following fashion. Consider
point 7 which may be viewed either as lying at the
center of the 4 surrounding points {2, 12, 6, 8} or
viewed as lying at the center of the 4 surrounding
"boxes" specified by the point sets {I, 2, 6, 7},
{2, 3, 7, 8}, {6, 7,11, 12} and {7, 8,12, 13}. The centers

y= ~((U4-Ua) jd+ (U9-US) jd)

UxY = ~((U5-U4) jd+ (UIO-U9) jd)

"FORM I"

Other space derivatives can be approximated in like
fashion.
It is evident that the computations of "Form I" are
amenable to parallel execution. Each of the differences
(U2-Ul), (U7-U6), (Ua-U2), .•• can be computed
independently and hence in parallel; subsequently, the
quotients and sums may be computed in parallel.
Other space derivatives and averages appearing in
Equation (2) similarly involve potentially parallel
arithmetic operations. The question then arises as how
best to exploit the parallelism inherent in the computations by proper implementation of the computations on a computer system such as the AP which

Numerical Solution of Partial Differential Equations

offers parallel arithmetic capability. In the following
section an AP implementation of Equation (2) for
parallel execution will be developed.

AP WORD F IEID

r - - - •••

ASSOCIATIVE PROCESSOR
IMPLEMENTATION
In the preceding section it was seen that the computations involved in solving Equation (2) by a
marching process offer the potential for parallel
execution. In this section we consider the question as
how best to store the required data in an AP and arrange
the computations. We again use for example purposes a
marching process for Equation (2) over the mesh
of Figure 3.
One could associate with each point of the mesh one
AP word, and in designated fields of that word store
current point values of the variables u, v, and hand
results of intermediate computations required to compute approximate time derivatives Ut, Vt, and h t used in
the marching process. Such a scheme would, however,
have certain disadvantages as can be seen by examining
the "Form 1" computations previously specified for the
u~'Y space derivative computations. With such a storage
scheme the differences (U2-U1), (U7-Ua), (U3-U2), ...
can indeed be executed in parallel, but at the expense of
between-word arithmetic operations in the AP; the
subsequent division operation will require a registerto-word arithmetic operation followed (or preceded)
by another between-word operation for the add
(between words 5-distant in the 5X5 mesh example;
between words n-distant in a general nXn mesh
problem). Between-word operations (especially between-distant words) should be minimized for optimal
performance and further analysis of operations required
for the marching process if the one mesh point per word
storage scheme is employed reveals that excessively
many between-word operations are required. Another
storage scheme is suggested by the following considerations.
The {u~'Y} calculations specified previously for the
four "boxes" formed by the first two rows of the mesh
can be rewritten, respectively, in what we shall call
"Form 2" as follows:
U~1I =

( (U2+U7) /2d- (U1 +Ua) /2d)

U~1I=

((U3+US) /2d- (U2+U7) /2d)

U~1I=

((U4+U9) /2d- (U3+US) /2d)

uz

((US+U10) /2d- (U4+U9) /2d)

1l=

"FORM 2"

If we were to define two vectors by VI = (U1, U2, U3, U4, Us)

415

3

2

V6

v1

u6

u1

1

h2

v

7

v2

u7

u2

2

h3

v8

v3

u8

u3

3

h9

h4

v9

v4

u9

u4

4

h10

h5

v 10

v5

u10

u5

5

u11

u6

6

u12

u7

7

u13

u8

8

u14

u9

9

u15

u10

10

u 16

u11

11

u17

u12

12

u18

u13

13

u 19

u14

14

u20

u15

15

u21

u16

16

6

5

4

h6

h1

h7
h8

·
·

·

·

·
·
·

.

.

h21

b16

v21

v16

h22

b17

v 22

v 17

u22

u 17

17

h23

b18

v23

v18

u23

u18

18

h24

b 19

v24

v19

u24

u19

19

h25

b2()

v25

v20

u25

u20

20

Figure 4-Redundant AP storage scheme

and V 2= (Ua, U7, Us, Ug, UlO) then it is readily seen that
the add operations of Form 2 are just those of the
vector sum V = V 1+ V 2 • The subsequent divisions are
given by V*= Y2dV and, finally, the space derivatives
{uz'Y} are given by a "convolution" of the vector V*
in which we subtract from the ith element of V* the
(i-1)th element, i=2, 3, 4,5. Such vector operations
are well suited to AP execution and suggest a storage
scheme in which the data items {Ui, Vi, hd, i= 1,2,
... , 25 (for the example) are stored in a redundant
fashion as follows. In two fields of words 1 through 5
we store respectively not only U1 through Us but also
Ua through U10. Then the two fields of words 1 through 5
will contain respectively the vectors VI and V 2 previously defined. In like fashion we shall store in 4 more
fields of words 1 through 5 values for VI through VIO
and hI through h lO . The first five words of the AP then
contain all the values needed to compute time derivatives for the variables u, V, and h in the first row of
"boxes" determined by the first two rows of the mesh.
We can continue in this fashion by storing in second 5
words of the AP, words 6 through 10, values of u, V,
and h required for computation of time derivatives in

416

Fall Joint Computer Conference, 1971

the second row of boxes determined by the second and
third rows of the mesh. We continue in like fashion till
all mesh point values are stored. For the example
problem this storage scheme is shown explicitly in
Figure 4. In this scheme data are stored redundantly;
the increased storage requirements being tolerated in
return for increased computational speed. The increase
in computational speed is gained by minimizing
between-word operations in the AP and maximizing
parallel computations. The parallel nature of the
computations allowed by the redundant storage scheme
is readily seen by considering Figure 5 in which is
exhibited the three step sequence for computing
{- uxy} for the first row of boxes. (- UxY is computed
since that is the quantity actually required in the time
derivative computations; see Equation (2). It will be
noted that the three steps correspond to the three
vector operations previously described. The addition
operation of the first step is a within-word AP operation
which is done in parallel for all data pairs (Ul' Ua) ,
(U2, U7), ... , (us, UIO); the division operation of the
secon<:l step is a register-to-word AP operation which is
done in parallel for the intermediate data items (Ul +ua) ,
( U2 +U7), ... , (us +UIO). The third an d final step of the
uxY calculation is a between-word AP operation (words
a distance of only 1 away) which is done in parallel for
the intermediate data items (Ul +us) /2d, ... ,
(U5+UlO) /2d. That the {uxY} computations for the
first row of boxes can be done in parallel is evident.
But a moment's reflection reveals that {u x Y } computations for all boxes can be done in parallel because of the
redundant storage scheme. The uxY computations for
the second row of boxes are of the same form as those
for the first row, the only difference being that they are
based on values Us, U7, ... , U15 which comprise the
values of the variable U over the second and third rows
of the mesh.
Due to the redundant storage scheme, these values
are stored in words 6 through 10. Likewise, in words 11
through 15, and 16 through 20 are stored, respectively,
values for the third and fourth, and fourth and fifth
rows of mesh points. The {uxY} computations for the
corresponding boxes are again of the same form as for
the first row of boxes. We have then that under the
redundant storage scheme the uxY computation for all
boxes is specified by the same field wise operations and
that within each of the three computational steps the
computations for all boxes can be carried on in parallel.
We note here that this is true not just for our 5 X 5
example, but for arbitrary n X n meshes, and AP
effectiveness increases with increasing mesh size.
The same analysis applies to computations of spatial
derivatives for the other variables v and h whose values

over the mesh are likewise redundantly stored. The
sets {vxY} and pixy} can, in turn, be calculated in parallel.
Like parallelism exists for computation of space derivatives with respect to y, averaged in the x direction, say
{u1,x} and spatial averages, say {UXY}. We see this by
noting that, for the first box,

UI/I'./ (U7-U2)

/d

UI/x= ((US-Ul) /2d+ (U7-U2) /2d)
similarly, for the remainder of the first row of boxes

UI/X= ((U7-U2) /2d+ (Us-Ua) /2d)
UI/x = (( Us- Ua) /2d+ (U9-U4) /2d)

u1/ = (( U9- U4) /2d+ (UIO- UIi) /2d)
The respective spatial averages for the first row of

St:~ 1
U

Fie tds*
6
h6
h7
hS
hg
h lO

11
Ul+U6
U2+ U7
u +u
3 s
U +u
4 9
u +u
5 lO
u 6 +u ll
u +u
7 12
u +u
S 13
u +u
9 14
u +u
lO 15
(u +u )/2d
1 6
(u +u )/2d
2 7
(u +u )/2d
3 8
(u +u g )/2d
4
(u +u
)/2d
5 lO

nll
hl2
h 13
ht4
h
l5

hl
h2
h3
h4
h5
n6
h7
hS
hg
h
lO

2

4
v6
v7

vl
v2

u6

v8
Vg
v
lO

v3
v4
v5

Us
Ug
u lO

vll
v 12
v l3

v6
v7

ull
ul 2
ul 3
u t4

v8
v9
v
lO

v 14
v
15

..

(u 6 +u

)j2d
ll
(u 7 +u 1 2)/2d
(u +u
)/2d
S 13
(u g +U ) /2d
14
(u +u )/2d
lO 15
(u t +u6) /2d-(u2+u7) /2d
(u +u ) /2d- (u +u )/2d
2 7
3 8
(u 3 +u8) /2d- (u4 +ug)/2d
(u4+Ug)/2d- (u5+ulO) /2d
(u +u
)/2d
5 lO
(u +U ll ) /2d- (u +u t2 )/2d
6
7
(u 7 +u l 2) /2d-( uS+u l3 )/2d
(us +u ) /2d- (u g +u )/2d
13
l4
(u +u ) /2d- (u +u ) /2d
9 14
lO l5
(u +u )/2d
tO l5

..

It

..

* Complete field designation is as follows:
11

12

13

H

14

L

Fields
15

Figure 5-Uzl1 computation

4

3

u7

u

15

ul
u2
u3
u4
u
5
u6
u7
u8
Ug
u
lO

Numerical Solution of Partial Differential Equations

boxes would be simply

uxy = «Ul+U6) 14+ (U2+U7) 14)
uxy = «U2+U7) 14+ (U3+ US) 14)
uxy = «U3+US) 14+ (U4+ U9) 14)
uxy = U4 +U9) 14+ (U5+UIO) 14)

«

For these computations too we have p"arallel execution
for all boxes; this is evidently true also for the variables
vandh.
The actual computation of the approximate time
derivatives involves not only computing spatial
derivatives and averages for the variables u, v and h
over the mesh, but combining these in the fashion
indicated by Equation (2). The combining operations
too are amenable to parallel execution. An indication
of this is seen by considering the product UXYux Y required
in computing Ut. For each box we may store computed
values for ux Y and uxy in the AP word whose index is the
same as the lowest indexed point of the box corners
(i.e., with each box we associate its lower left corner).
For the example computation we have in fact done this
for ux Y and as the spatial averaging equations indicate
we can do the same for the computation of the uXY ,
using the original storage scheme of Figure 4. For each
box then we have the corresponding values for uxY and
UXY in two fields of the AP word associated with the box
and for all boxes we may compute the product uxYux Y
in a single within-word operation. Subsequent to the
computation of a value for a variable at a new time
step, a between-word data transfer operation must be
executed due to the redundant storage scheme employed.
An AP program has been written for one complete
updating of interior mesh points. The program assumes
a funnel memory of 120 bits and is patterned after the
computational procedure given in Reference 9. For a
50 X 50 mesh, the total time required for one updating
of the interior mesh points according to the difference
Equation (2) is 3.5 milliseconds. This time is based on
fixed point arithmetic on 20 bit fields. (Floating point is
available via software and, if employed, overall execution times will typically increase by a factor of approximately 1.4.) Due to the parallel features of the AP
implementation, this execution time will increase very
little (within AP Ifunnel memory capacity) as the mesh
size increases. The independence of execution time from
n, the mesh size, is due to the fact that within the
updating procedure only certain between-word data
transfer operations are explicitly time dependent on n.
This would allow a large AP of say 24,000 words of 256
bits employing a 120 bit funnel memory to execute an
update of all internal mesh points of a 150X150 mesh

417

in a time not more than 1 millisecond greater than the
3.5 milliseconds time required for a 50 X 50 mesh. It is
not a necessity to increase AP size as the mesh size
increases, since the mesh can be divided into blocks and
the updating procedure executed in a sequential by
block, parallel within a block fashion.
Other AP configurations can be employed for the
updating procedure. For example, an AP with no funnel
memory but with two words per mesh point would give
execution times close to those given above. Such a
configuration can be made transparent to the user who
"sees" an AP with half the number of words and twice
the number of bits per word. In such a configuration
execution times increase slightly because certain
operations which are apparently "within word" are in
fact "between word"-words 1 distant. For the example
problem the total storage requirements are not increased by such a configuration but storage capacitysome of it unused-is greater than the AP Ifunnel
memory configuration since the AM word is 256 bits as
opposed to 120 bits for a funnel memory word. The
relative merits of the two configurations are of course
user dependent. Also, an AP with a small, say 32 bit,
funnel memory could be employed in conjunction with
a conventional processor (CP) which would provide
temporary storage for intermediate results and perform
certain minor computations such as accumulating
partial sums involved in computing time derivatives.
For such a configuration, the data transfer between the
AP funnel memory and the CP will markedly increase
execution time but will have little effect on total storage
requirements. For the 50X50 mesh problem the
execution time for one update of interior mesh points
would be 33.5 milliseconds for an AP ICP configuration
which employed an AP with a 32 bit funnel memory tied
to a CP such as the CDC 6600. One can also envision
future systems in which each AP word is tied through
the response store to one head of a large head-per-track
disk. Such a configuration should prove quite attractive
for a variety of applications.
The execution times for both the AP ICP configuration and the stand alone AP compare favorably with an
estimated time9 of 215.5 milliseconds for the same
computations (that is, one complete updating of all
variables at each interior mesh point) using an IBM
360-65 with array processor (2938, model 2) .
REFERENCES
1 E E EDDEY
The use of associative processors in radar tracking and
correlation
Proceedings National Aerospace Electronics Conference

1967

418

Fall Joint Computer Conference, 1971

2 W C MEILANDER
The associative processor in aircraft collision prediction
Proceedings National Aerospace Electronics Conference
1968
3 A COSTANZO J GARRETT
Applications of associative processors in an intercept radar
system
Proceedings National Aerospace Electronics Conference
1969
4 L C HOBBS, et al
Parallel processor systems, technologies and applications
Spartan Books New York 1970

5 L 0 FULMER W C MEILANDER
A modular plated wire associative processor
Proceedings IEEE Computer Group Conference June 1970
6 J A RUDOLPH L C FULMER W C MEILANDER
The coming age of the associative processor
Electronics Magazine Feb 15 1971
7 STARAN IV programming manual
Goodyear Aerospace Report GER-15096 1970
8 National meteorological center memorandum
File No 417 March 1967
9 Implementation of numerical weather forecasting on a 360,
Model 65 with an array processor (2938, Model 2)
IBM Report 322-15001967

Consistency tests for elementary functions
by A. C. R. NEWBERY and ANNE P. LEIGH
University of Kentucky
Lexington, Kentucky

INTRODUCTION

results using assumed values for e', s'. First we compute
the residual r defined by (M(e))2+(M(s))2-1=r,
and we assume that M(e) and M(s) are each wrong
by a fraction t of the admitted error, and that the signs
reinforced the error to build it up to the observed
residual r. Our assumption then is (c+te')2+
(s+ts')2-1 =r, or
t2(e'2+ s'2) +2t(ee'+ss') -r=O.
(1)

The possibility of using consistency tests to determine
the quality of an elementary function subroutine has
been considered by several authors.1, 2, 3 Although none
of the authors thought highly of the idea, our investigations have led us to conclude that consistency
tests do have a definite provable value in some situations. The error in a subroutine has two possible
sources: (a) range-reduction and (b) the reduced-range
approximation. For instance, in approximating the sine
of a large angle one reduces the problem to that of
approximating the sine or cosine of an angle in a reducedrange-perhaps [0, 'IT' / 4]. Since the range-reduction
process will then involve subtracting a large integermultiple of an inaccurately represented 'IT'/2, it is clear
that the reduced-range argument will be in error by a
quantity which varies linearly with the original argument. Since these range-reduction errors are unavoidable and well understood, we have concentrated
our efforts on consistency tests which will help to
evaluate the quality of a subroutine in the reducedrange approximation. We give three examples. In each
case the variable x is supposed to be within the reduced
range; the tests are still valid without this condition,
but there is a diminished likelihood of our bounds being
realistic when x is outside the reduced range.

If tm is the root of (1) of smallest magnitude we can
assert that either M (e) or M (s) is wrong by a fraction
at least I tm I of the admitted error. This is a rigorous
assertion, because throughout we have given the
manufacturer the benefit of the doubt by assuming that
the observed residual r was the result of small errors
reinforcing rather than large errors cancelling. The only
trouble is that we cannot solve the quadratic (1)
because we do not have trustworthy values for c, s.
To get around the difficulty we define a neighboring
quadratic equation
z2(e'2+ s'2) +2z(e'M(e) +s'M(s)) -r=O,

(2)

and let its root of smaller magnitude be zm; we attempt
to relate tm , Zm. If we differentiate (1) with respect to
the parameter e we obtain

2t (::) (C"+8") + 2 ( : ) (cc' +88') + 2tc' = 0,
hence

SINE AND COSINE TEST

(ic)

We can construct a consistency test for the sine and
cosine routine for 0~x<'IT'/4 based on the identity
coS2 X+sin2 X=1. Let c, s denote cos x, sin x; let
M (e), M (s) denote machine values for e, s, respectively, and let e', s' be the "admitted errors" in e, s,
i.e., the manufacturer is prepared to admit that
I M (c) - e I could be as big as e' but no bigger. If values
for e', s' are not available to us we have a good cause for
complaint, although our test can still yield meaningful

= -

-(t-(-e'-2+-s'-2~-c~-ce-'-+-s-S'-) .

(3)

Hence dt/de has the sign-t for the chosen argument
range. From (1) we see that if r>O then tm>O. Hence
from (3) (dt/de) t=tm <0, i.e., the positive root of (1)
decreases if we increase the value of e. But increasing
the value of e is precisely what we are doing when we
replace e by M ( e) on moving from (1) to (2), because
positive r implied M(e) >e. Similarly, the root will also
419

420

Fall Joint Computer Conference, 1971

diminish on replacement of 8 by M(s). We conclude
that for r>O the root Zm of (2) is a close lower bound for
the quantity tm • Unfortunately the case r0. The
replacement of c by M (c) is like a decrease in the
parameter c, implying an algebraic decrease in the
(negative) value t, hence an increase in the magnitude
of the smaller root. This gives us a bound on the wrong
side of tm • To get a bound on the right side we must
replace c by a computable approximation that is
guaranteed ~c. Such a quantity is M(c) +c'; similarly
we replace s by M(s) +s'. Hence, when rO and by
(2') for rO, and when r 0 and 0 < T < 1,
use the following fact based on the manipulation of
inequalities:
If
1 M(z) -z I ~Tz,
then
(7)
I M(z) -z I ~TM(z)/(1+T).

(8)

l+t
=r= - - -1.
(1-t)2

(11)

_ r _ -0
(l+r) -

(12)

This leads to
2

t -

t(3+2r)
(1+r)

+

and
p=

r(l+r)
(3+2r)2 .

TESTING THE TESTS
In order to see whether our tests would indeed give
useful lower bounds in a practical situation, we performed several test runs on an IBM 360/65 using

Consistency Tests for Elementary Functions

standard single-precision software4 for the test functions
and double precision for the computation and processing
of residuals. A description of the tests follows:
The consistency test for the sine and cosine subprograms was run over a range of values (0,2-10, 1r/4)
for the argument x. The residual r was evaluated from
the equation
(M(cos x) )2+ (M(sin x) )2-1 =r

M_(e_x)_M_(_eh_) -1 = r = _(l_+_t_)2 -1
M(e x + h )
1-t

The consistency test for the sinh and cosh subprograms was run over the same grid of values for x and
h. The residual r was calculated using Equation (11)
and the lower bound for the relative error was found
from Equation (12). Care was taken that (x-h) ~O
for all h, x. The table below shows the results obtained.
h

for r>O, equation (2) was solved for the root of smaller
magnitude tm • For rO

(13)

for rO

421

x
.02832
.0332
.124

The' admitted value for the relative error bound is
.465 X 10-6• The first bad binary digit provable by this
technique occurs in the 21st bit position.

Although we had (and still have) no reason to doubt the
manufacturer's word concerning the accuracy of the
subroutines supplied, we note that our tests were able to
prove errors of comparable magnitude to those that the
manufacturer admitted. In one case the provable error
was 89 percent of the admitted error. In view of this we
feel it is safe to conclude that any manufacturer whose
claims are at all extravagant can almost certainly be
proved wrong on the basis of carefully constructed
consistency tests alone. It is also worth noting that our
tests were able to show up a badly coded hyperbolic
function routine, since badly coded routines of this
sort are said to be in circulation.
In summary we would like to state our position on
consistency tests: Firstly we do not advocate them as a
substitute for an elaborate full-scale validation process;

422

Fall Joint Computer Conference, 1971

we merely note that there are many routines in circulation that have evidently not passed any stringent
tests at all. Since few institutions possess the funds and
manpower to do their own full-scale validation, there
is a legitimate market for consistency tests which can
quickly and cheaply give a rough indication of quality.
Secondly we believe that in the literature, particularly,!
there has been a tendency to underrate the potentialities
of the consistency test. Too much importance is sometimes attached to the fact that a subroutine can be
grossly in error while exactly satisfying a mathematical
identity. One can guard against being misled by this
situation, partly by use of redundancy (using several
different identities) and partly by common sense (i.e.,
by avoiding identities which the subroutine programmer
is likely to have used in the given range). Cody2 gives
additional advice on the choice of identity. Thirdly,
although we believe that consistency tests deserve to
be held in higher esteem than is commonly the case, we

are anxious not to over-correct the situation. A consistency test can only provide lower bounds for errors.
Such a test cannot possibly tell "the whole truth"
about a subroutine; on the other hand, it will tell
"nothing but the truth," and we think we have demonstrated that just this much information, delivered
promptly and cheaply, can be very helpful.
REFERENCES
1 C HAMMER
Statistical validation of mathematical computer routines
Proc SJCC 1967331-333

2 W J CODY
Performance testing of function subroutines
Proc SJCC 1969 759-763
3 J F HART et al.
Computer approximations
Wiley New York 1968
4 IBM Publication form C 28-6596-4

Laboratory automation at General Electric corporate
research and development
by P. R. KENNICOTT, V. P. SCAVULLO, J. S. SICKO, and E. LIFSHIN
General Electric Company
Schenectady, New York

INTRODUCTION

of the data being done on the 225 in batch mode. An
on-line analog-to-digital converter was soon installed
which permitted direct on-line recording and processing of analog data. With the upgrading of the 225 to
a GE 265 time-sharing computer in 1966, it became
possible to carryon the collection and processing of
data without devoting an entire computer to the task.
By 1967 the computing requirements had outgrown
the 265 facilities, and it was decided to implement a
special time-sharing system on a modified GE 600. This
computer uses an addressing mechanism particularly
convenient for a multi-user system. At that time, the
success of the early laboratory automation experiments
led to the decision to add a GE/PAC® 4020 process
control computer as a real-time peripheral processor
for the 600 to handle laboratory automation work. A
PDP 9 handles graphics work for the complex (Figure
1). To supplement these computer facilities a modular
line of hardware devices was developed which enable
a particular system to be quickly implemented.

The purpose of a laboratory automation system is to
improve the quality and quantity of the work of the
scientists who use it. An organization such as General
Electric Corporate Research and Development presents a large number of problems which are capable of
being solved by laboratory automation. We wish to discuss a system which was developed to solve many of
these problems. We will first describe the computer
facilities which are a part of the system. Next, we will
describe some of the data communications equipment
which has been developed to interface individual experiments to these computer facilities. Finally, we will
describe three applications which illustrate the use of
the laboratory automation system.
The system we have developed is designed to be
sufficiently flexible to accommodate a large number of
different experiments. In the design, a premium was
placed on hardware which is modular and easy to assemble into a system for the automation of any particular experiment. Since we have found it advisable for
the scientist or engineer to take an active part in the
implementation of the application software for his experiment, a premium has also been placed on system
software features which aid a person who is relatively
inexperienced in computer programming to quickly
create and test his software. The Research Center consists of two locations approximately three miles apart
in Schenectady together with a number of outlying
locations in upstate New York. Consequently, data
communications assume a greater importance in our
system than would be the case with a laboratory automation system dealing with a single central location.
The history of laboratory automation at Corporate
Research and Development goes back to the installation in 1964 of the first on-site computer, a GE 225.
The first laboratory automation experiments consisted
of off-line recording of data on paper tape, processing

GE 600 SYSTEM
The GE 600 serves the main computational requirements for the Center.l It is a large scale computer with
32 million words of random access storage. It offers
the Center's engineers and scientists a convenient and
flexible computer facility. A single file system serves as
the data base for batch, remote batch, and time-sharing.
Hence, users may access the same data files from any
of these modes.
The operating system is organized as a central executive handling scheduling ~nd input/output. It supports
a number of subsystems which handle the applications
work. Among these subsystems are modules imitating
the standard commercial offerings of General Electric
and Honeywell Information Systems as well as modules
supporting systems available only in the Center.
423

424

Fall Joint Computer Conference, 1971

Figure I-General Electric Corporate Research and Development
computer facilities

To facilitate programming, a variety of languages
have been made available. Among these are FORTRAN, BASIC, TRAC, and ALGOL. A large library
of statistical, mathematical, and numerical analysis
routines supplement these languages. Various editors
and debugging tools are also available.
GE/PAC 4020 SYSTEM
The GE/PAC 4020 supports a data logging system
which is able to log several types of data being transmitted at varied rates from many users simultaneously
in real-time. In addition, it supports high-speed paper
tape input/output and on-line plotting for the complete
computer complex.
Data can be transmitted from an experiment to the
4020 either in digital or analog form. Communication
in digital form is in the asynchronous mode and includes rates of 110, 300, 600, or 1200 baud. Provision
is made for handling both forms of data either on inhouse cables or over switched phone lines with the use
of digital and analog modems.
The operating system in the GE/PAC 4020 is organized so as to optimize its real-time response. Thus, it
will maintain its designed response time as the user
load increases until it reaches the point of overload,
whereas the more common type of time-sharing system
simply reduces its response time gradually as the user
load builds up. This design is necessary in a real-time
peripheral processor such as the 4020 in order to avoid
loss of data due to poor response time.

The difference in approach to response time between
the two systems results in the possibility that considerable data may have to be stored in the 4020 before the
600 can store it on its disc file. The data is temporarily
buffered on the 4020 drum store before transmission
over the memory interface in order to accommodate
this storage requirement.
Communication between the GE/PAC 4020 and the
GE 600 takes place over a memory interface controller
(see Figure 1). The GE/PAC 4020 can read from or
write to the GE 600 core memory. Each computer can
interrupt the other computer. A program exists in the
600 which handles all the requests made by the 4020.
This program writes data to the disc and prepares plot
and paper tape punch files.
Each computer can perform its functions independently of the other. However, since the buffering capability of the 4020 is limited, it cannot tolerate an
extended outage of the 600. This is because the 4020
must eventually transmit the data it collects to the 600
for permanent storage on the disc. Once the data is
stored on the disc it is available for access by user
programs which run on the 600.
In order to insure real-time service for all higher speed
devices on the system, an automatic double buffering
technique was employed in both hardware and software. This consists of assigning two buffer control
words for each device or channel which has a high speed
capability. The hardware was modified to automatically
switch between the two buffer control words each time
an operation on one of them was satisfied. The software
was organized in a way which guarantees there will
always be two read operations under way for these devices at any given time. This technique increases the

Ut.HSOO BAUD
MOf}EMS
9CHANNfLS

Figure 2-Data collection computer hardware

Laboratory Automation

time available to replenish the buffer control word
from one sample time to N times the sample time
where N is the number of samples in the buffer.
GE/PAC 4020 HARDWARE
The data collection hardware is shown in Figure 2.
It consists of a special low-speed analog scanner handling the analog modems described below, a set of 9
standard asynchronous communications interfaces
which can be plug-adjusted to operate at speeds between 110 and 1800 baud, the on-line plotter, paper
tape I/O equipment, and the high-speed analog scanner
serving the experiments hardwired to the computer
system.
Data in analog form is received at the GE/PAC 4020
by this high speed scanner. (See Figure 3) This peripheral was developed by the Design Engineering Group
specifically for the laboratory automation system. The
analog scanner was designed to place as much control
as possible in hardware instead of software. Not only
does this reduce software load, but, in a multi-user
system, it also assures the individual user of a more
uniform sampling rate for his data than would be possible in a software-driven data collection system. It is
capable of accepting 120 analog lines, although only 32
have been implemented. Each line can be programmed
to be turned off or to sample at one of the following
data rates: 120, 60, 30, 15,7.5,3.75, 1.87, .937 samples
per second. A specific block of core storage is dedicated
to the high speed scanner control, with individual lines
having their own specific locations for control words.
Programming of the high speed scanner is accomplished

Figure 3-High speed scanner

425

by storing control words in this core area for starting,
stopping, or specifying the sampling rate for each line,
and then initiating the transfer of the entire block to
the scanner control hardware. The information so
transferred controls the scanner hardware until it is
updated by another transfer of the control block.
The sampling rate is under control of a master clock
running at 4 MHz. The exact frequency is under servo
control of the 60 Hz. line frequency. This enables the
user who wishes to synchronize his experiment with
the data collection system to do so by use of the power
line frequency.
When the command to commence sampling a line is
received by the line control logic, a status signal is
raised. Actual sampling does not begin, however, until
a control signal is also raised. While the status signal
can be used for this control signal, it is often more
convenient to carry both signals to the user's site
where they furnish status indications and control capability. By the use of pulses of suitable length on the
control line, it is possible for the. user to cau~e the
sampling of a single datum, continuous samphng of
data, or an end-of-file indication in the hardware for
his line.
The scanner has two stages of automatic gain control. The first stage, which exists in each line, has two
ranges, Xl and X64. It is set by the automatic ranging
logic while the previous line is being sampled, thus
allowing time for settling for the somewhat less expensive line amplifiers. The second stage, which is common
to all lines, has three ranges, Xl, X4 and X16 giving a
total of six ranges. Since the second stage must be set
during the actual sampling time of a given line, it
must have a faster settling time than the first stage.
The line to be sampled is connected to the common
amplifier and analog-to-digital converter by a field
effect transistor multiplex switch. The line must be
sampled in a time sufficiently short to allow all possible
lines to be sampled at their maximum rate, i.e., 120 X
120 samples/sec or 69.4 ,usec. The digital representation
of the output of the common amplifier, together with
the bits representing the ranging amplifier settings, is
.
placed in core memory by cycle stealing.
The location in core where the data will be stored IS
controlled by a word for each line stored in the dedicated scanner control block. This word has address and
tally information, and is automatically updated by the
scanner hardware after each sample. When the tally
for a given line runs out, an interrupt is generated for
that line and hardware provides an alternate buffer.
Thus , after the control information is transmitted . to
the scanner hardware, no additional software attentIOn
is required until the interrupt indicates that a buffer
is full.

426

Fall Joint Computer Conference, 1971

More than one input line on the High Speed Scanner
can be assigned to a single user by placing pointers to
the same buffer in the control word of each line. The
data from each line will be stored sequentially in the
core buffer, thus allowing the sampling of several variables at once. By sampling the same variable on several
lines, a sampling rate higher than the maximum 120
samples per second can be realized.
GE/PAC 4020 SOFTWARE
The operating system on GE/PAC 4020 data logging
system is based on the Real-Time Multiprogramming
Operating System (RTMOS) of the General Electric
Process Computer Department. RTMOS is a large
grouping of programs and subroutines, available only
on GE/PAC computers, that supervises the interaction
of process events, time, computer peripherals, and the
central processor of the computer.2 The core resident
routines of RTMOS which perform scheduling, core
management and related routines are the only parts of
RTMOS used on the GE/PAC 4020 Data Logging
System. An I/O system which supports time-sharing
and a set of functional programs which supports the
teletype command system were designed and implemented for this system. All data gathering, table manipulation, and utility programs were also developed for
this system.
The I/O system was designed so that all devices on
the system use common feeder, driver, and interrupt
handling routines. This is accomplished by associating
a table with each device type. Depending on the device
type presently being serviced, the proper table is accessed and appropriate operations or subroutines are
executed. In most cases, a new device can be implemented simply by creating a table for that device.
An operating system running under RTMOS is organized as a set of functional programs which can be
scheduled and will run independently. In the datalogging system we have developed, two types of functional programs are found-applications programs and
a type of program which supports the teletype command system called a manager program. With each
manager there is associated a group of commands. For
each command in a given manager's list, there is a
corresponding applications program. When a command
is given to a manager from a user's teletype, the applications program corresponding to that command is run
by the manager. At the same time, the manager removes itself from core. When the ~pplications program
terminates, it is automatically removed from core and
the manager which was previously running is brought
back and run again. Any applications program under

a given manager can replace itself with another applications program from that manager's list.
The manager program concept enables new commands and their associated applications programs to be
easily added to the system. Also, since one program
can replace itself with another program, a program
chain can be implemented.
There are three basic tables which are used by the
4020 operating system, the User ID table, the analog
table, and the digital table. All users have their user
identification, password, and billing number entered in
the User ID table. Users who transmit data in the
digital mode have an entry in the digital table. Similarly, users who transmit data in the analog mode
through the high speed scanner have an entry in the
analog table. The digital table indicates whether the
user is a local user transmitting on a specific hardwired line or a remote user transmitting over any
available phone line. The analog table indicates which
lines are assigned to a user and the sampling rate for
each line. Automatic table manipulation is provided.
Thus, new users can be brought on the system in a
matter of minutes. Existing users of the analog system
can change variables to be sampled or sampling rates
simply by making a change in their table specification.
In addition to these real-time data logging functions,
three utility functions are performed by the GE/PAC
4020. Data which have been produced off-line on paper
tape can be sent to the 600 disc through the 100 frames/
sec paper tape reader. Data which have been refined
on the GE 600 can be punched on paper tape on a 120
frames/sec punch. On-line plotting of data is done on
'the GE/PAC 4020 to relieve the GE 600 of this
task.
DATA COMMUNICATIONS
We have described the computer facilities of our
laboratory automation system. A second part of the
system is the data communication facilities. It is much
more difficult to standardize this part of the system
because of the varying needs of the individual experiment and the fact that these varying needs impact
more closely on the data transmission system than on
the computer facilities.
This variation of requirements from experiment to
experiment is found in several forms. The output of
some experiments is an analog signal such as the output
of an amplifier, while the output of others is digital,
such as the output of a scaler. The question of speed
arises in two aspects in the design of the communications for a particular experiment. First, the rate at
which data is conveniently produced by the experi-

Laboratory Automation

ment dictates the equipment to be used. If the rate is
slow enough to allow transmission over the 110 baud
channels characteristic of teletypes, the system required
is much simpler than if higher rates are required. Second, the response to the experimenter dictates the
form of the communications systems. It may be possible to simply collect data and process it at the end of
the experiment. This would result in a simpler system
than if it were necessary to feed back processed data to
influence the future course of the experiment. The
amount of logic required at the experimental site is
another aspect of data communications to be considered. In order to effect enough time-saving to justify
the effort required for automation, it may be necessary
to automate much of the control of the experiment,
while for other experiments simply collecting the data
is sufficient. A final aspect of data communications design is the nature of the experiment. If the experiment
is to be performed only one or a few times, it will require a simpler interface than one which will be routinely performed many times. Small tasks which the
experimenter does willingly a few times become onerous
burdens when they must be repeated many times. The
following is a description of the data communication
hardware and software available to the staff which
can be used with the computer facilities available at
Corporate Research and Development.
We have described the high speed scanner, the analog
peripheral of the GE/PAC 4020. In support of this
scanner, a cabling system has been installed which
runs the length of the main building at the Center.
This provides cable facilities close to the experiments
and makes for a convenient way for the staff to connect their experiments to the laboratory automation
system. Figure 4 shows the analog transmission system.

Figure

q.--ATIHIC}IT

data transmission system

427

Figure 5-Digital data transmission system

Transmission is by individually shielded twisted pairs.
While the user can connect to the system in a number of
ways, the control terminal shown in the figure has
proven convenient. Within the terminal is a light indicating the status of the line; logic to produce the appropriate pulse length for single samples, continuous samples, or end-of-file; and an amplifier with up to 60 db
gain to aid in interfacing with the experiment.
Figure 5 illustrates a family of the digital modules
available for data collection and control. The experiment can supply data in either analog or digital form.
Analog information is converted to digital form by
either a digital voltmeter or by an AjD converter in
the instrumentation control console. The digital data
is sent on either the instrumentation control console or
to a digital data controller. The latter device formats
the data for either off-line recording or transmission to
a computer by the high-speed communications terminal. Off-line recording can be done with paper tape or
incremental magnetic tape. The high-speed communications terminal serializes the data for transmission
over asynchronous lines, and operates at either 110 or
1200 baud.
The functions of the digital voltmeter, digital data
controller, and high speed communications terminal
are all combined in the instrumentation control console. This device finds its greatest use when information must flow both to and from the experiment. It can
be interfaced to the GEjPAC 4020 or, if local control
information is to be generated, to amini-computer. In
cases where response time is not a factor, either the
instrumentation control console or the high speed communications terminal can be interfaced directly to the
GE 600, thus avoiding the additional step of the 4020.

428

Fall Joint Computer Conference, 1971

speed communication terminal is used to interface the
digital data controller to transmission line data sets.
I t formats the data into bit-serial for asynchronous
transmission at either 110 or 1200 baud. It can also
receive serial data from the computer and format it
into bit paraJlel for displays, plotters, and control signals. The keyboard shown on the housing is used during
the conversation mode with the 4020 Computer for 1200
baud data logging.
For those cases where it is inconvenient to use the
dial-up communications network, an electronic system
simulating a 1200 baud frequency shift keying system
is available to be used with the in-house cabling system.
The call-up for this system is accomplished by pushing
a button. When the computing system responds, light
is turned on to indicate that the computer has responded to the call.
Figure 8 is an illustration of the instrumentation control console. It contains a keyboard with a 64 ASCII
character set, controls for systems configuration, and
displays for computer commands. The control console
is designed with a pair of data busses, and will accept
five separate plug-in modules. Connections to the data
bus are accomplished by depressing buttons on the
front panel of the console. There are two data busses
in the console. The output bus accepts data from one
of four sources: the digital data controller, the binary
A/D converter, the keyboard, or an external device.
One of three transmission/recording devices can be
connected to the output bus: the high speed communication terminal, a mini-computer, or a teletypewriter.
To set up a system, the operator selects one of the data
sources and one of the transmission/recording media.

a

Figure 6-Digital data controller

The digital data controller interfaces a variety of
digital signals to the data communication media. Figure
6 illustrates the data controller plug-in module with its
housing. It can be programmed either manually or
automatically. The input to the data controller is from
one to eight sources of 4-wire BCD information, each
having from one to eight decades. Output is eight-line
ASCII code with even parity. Commas, carriage return, and line feed characters are inserted where
necessary.
Figur~ 7 illustrates the high speed communication
terminal plug-in module and its housing. The high

Figure 7-High speed communication terminal

Figure 8-Instrurnentation control console

Laboratory Automation

This connects a data source to a transmitting system
with a recording medium. In the same way, the receive
bus can accept data from the high speed communication terminal, a mini-computer, a teletypewriter, or the
keyboard. The receive bus can transmit data to either
an external device or a digital-to-analog converter. The
system controller plug-in module monitors data on the
receive bus and decodes data for the console displays.
The instrumentation control console with its data
busses provides a convenient way to configure a data
acquisition and control system. It is portable and
quickly placed into operation. The plug-in module concept increases its flexibility. In addition, the ability to
perform maintenance on the plug-in level increases the
system up-time.
While a variety of mini-computers have been interfaced to the laboratory automation system, the majority of our experience has been with the GE/PAC
30. This is a 1 microsecond sixteen bit word machine
with up to 16 k bytes of core store and an optional
65 k disc. A variety of analog or digital peripherals are
available for interfacing to experiments as well as the
conventional I/O peripherals. A variety of software
has been written for use on the GE/PAC 30. Included
are programs for data collection, graphics display,
stepping motor controllers, and 110 or 1200 baud data
transmission.
A peripheral which has proved useful in software
development for the GE/PAC 30 is the TRICOM
switch shown in Figure 9.10 This is an automatic threeway switch handling low-speed asynchronous communications lines. It allows the operator to interface the

Figure 9-TRICOM switch


teletype with the 600 time-sharing system in order to
utilize the editing facilities for writing his program.
When the program is ready the user initiates an assembly and supplies names of any library subroutines
required. When assembly is complete, the resulting
binary code is loaded, together with library subroutines, into a pseudo-core image of the GE/PAC 30 in
600 core. When this operation is complete, a special
character switches the TRICOM to connect the 600
to the 30 for transmission of the core image. When
transmission is complete, the TRICOM switch connects the teletype to the 30 for running the program.
Figure 9 also shows the path of data from the GE/PAC
30 to GE/PAC 4020 to GE 600.
APPLICATIONS
The computer facilities and data communication
equipment are the tangible parts of our laboratory
automation system, but even more important are the
people and facilities that make it work. The scientist
or engineer wishing to automate his experimental work
can find personnel familiar with the computer and data
communication systems. These people are able to give
applications advice or, if necessary, to design and implement extensions to these systems. Applications information is also provided on a variety of measurement
equipment available from an instrument pool. Finally,
if special equipment is required, a model shop can
assist in its design and fabrication.
To automate each research or development project,
a careful study is made to determine the data collection and control requirements for each experiment.
Based upon the specific needs for hardware and software, in many cases a system is assembled from the
available digital and analog subsystems. If the in-house
terminal hardware, software, and computer systems
cannot adequately meet the data collection needs, additional in-house development is undertaken or outside
vendors are investigated.
When the hardware portion of a laboratory automation project is complete, the applications programs must
be written. This is the responsibility of the experimenter who will benefit from the project. We have
found that, in general, it is easier for a scientist to
learn the necessary computer programming to implement his applications software than it is for a computer
programmer to learn the necessary science to write the
programs himself. This principle is particularly true
when the programming is done in a time-sharing environment such as ours. Following are descriptions of
three experiments which have been automated and for
which the application software has been implemented


TABLE I-Additional Laboratory Automation

(1) Infrared Spectroscopy
(2) Nuclear Magnetic Resonance
(3) X-ray Crystallography
(4) Capacitance/Voltage Measurements on Semiconductors
(5) Stress/Strain Relaxation Measurements
(6) Optical Spectroscopy
(7) Atomic Absorption Spectroscopy
(8) Electronic Micro Balance
(9) Gas Mass Spectroscopy

in this manner. In addition to the three experiments
defined in detail, Table I gives a brief listing of other
laboratory areas where automation has been incorporated.
SPARK SOURCE MASS SPECTROGRAPH3
The spark source mass spectrograph is an instrument
for analyzing trace impurities in electrically conducting
solids. In some respects it resembles the better-known emission spectrograph, but it is approximately
1000 times more sensitive. In the course of an analysis
it produces a photographic plate which contains the
data from one sample (Figure 10). There are 16 exposures arranged horizontally along the plate. Each
element in the sample being analyzed contributes a
group of one or more lines in each exposure. The blackening of each line depends on the amount of the particular element in the sample.
It is these lines which contain the information about
the sample one wishes to recover. Two aspects of the
plate suggest an automated system for the recovery.
First, the individual lines vary in shape from place to
place on the plate. Thus, it is necessary to sample a

Figure 10-Spark source mass spectrograph

number of points across a given line to obtain an accurate representation of the line. Second, the variability of the photographic emulsion from plate to plate
makes it necessary to derive the characteristic curve, or
the relation between the blackening of a line and the number
of ions striking the line, from the plate itself. Fortunately, there is enough information on the plate to do
this, but it entails a lengthy calculation. The automation system enables one to sample the blackening of a
line on the plate in sufficient detail to obtain an accurate representation of the line, to utilize isotopic
ratio information derived from the plate to obtain the
derivative of the characteristic curve, to solve the resultant differential equation for the characteristic
curve, and to utilize the resulting information to convert the blackenings of the various lines on the plate
into concentrations of impurities in the sample. Thus,
there is a requirement for a system to collect analog
data. The speed with which the microdensitometer can
produce data is about 10 to 15 samples per second. It is
not necessary for the resulting data to be fed back to
the operator during the time he is collecting it, so the
relatively large calculation can be done after the collection has taken place. The application is a routine one,
thus justifying some effort to make it convenient for
the operator to use.
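The data reduction described above can be pictured with a short numerical sketch. It assumes the derivative of the characteristic curve has already been obtained, from the isotopic-ratio data, at a set of blackening values; the stand-in derivative values, function names, and normalization below are illustrative only and are not the authors' program.

import numpy as np

# Stand-in samples of the derivative of the characteristic curve,
# d(log relative exposure)/d(blackening), at increasing blackening values.
blackening = np.linspace(0.05, 2.0, 40)
dlogE_dB = 1.0 / (blackening * (2.2 - blackening))

# "Solve the differential equation": cumulative trapezoidal integration
# gives the characteristic curve itself (log exposure versus blackening).
logE = np.concatenate(([0.0], np.cumsum(
    0.5 * (dlogE_dB[1:] + dlogE_dB[:-1]) * np.diff(blackening))))

def relative_exposure(b):
    # Convert a measured line blackening to a relative exposure by
    # interpolating the reconstructed curve.
    return 10.0 ** np.interp(b, blackening, logE)

# An impurity concentration then follows from the exposure of its line
# relative to that of a line of known concentration.
print(relative_exposure(1.2) / relative_exposure(0.6))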
In order to use the system the operator places a plate
on the carriage of the microdensitometer, moves the
carriage to position the line he wishes to record on a
projection screen, and presses a switch indicating to
the logic that the plate is ready for scan. The computer
initiates and times the scan. During the scan the operator enters on a teletypewriter the alphanumeric information necessary to describe the line being scanned.
Interlocks insure that a complete set of both analog
and alphanumeric information is entered for each line.
Analog information is transmitted to the computer
center over analog cables where it is digitized and
stored on bulk storage. The teletype is connected to
the computer system in a current-loop circuit. After
collection of the information, it is processed and the
results returned to the operator over the teletypewriter
circuit.
The use of the system has resulted in an improvement in both the quantity and the quality of the work.
Whereas before using the system, it required 16 hours
to process the data on a plate, it now requires one
hour-an improvement of a factor of 16. A group of 12
laboratories participated in a round-robin analysis in
which they each analyzed the same sample of copper.
The average precision of all laboratories was 40 percent,
while our laboratory reported results which proved to
have a precision of 15 percent, an improvement of over
a factor of two.


THE ELECTRON MICROPROBE
In the electron microprobe analyzer a beam of electrons from an electron gun is focused by magnetic
lenses to a micron size spot on the surface of a specimen
causing the emission of x-rays. Qualitative chemical
analysis of the excited region can then be performed by
scanning a crystal spectrometer to establish the wavelength distribution of the emitted radiation. Quantitative analysis involves tuning the crystal spectrometer
to a specific wavelength corresponding to a particular
element and counting the x-ray pulses emitted both
from the specimen and a standard. Although standards
of similar composition to the specimen provide the
simplest method of calibration, i.e., by direct comparison, they are usually not available, particularly at the
homogeneity levels required for microprobe analysis.
An alternative method which needs only the use of pure
elemental or simple binary standards is based on theoretical correction procedures which have been described
extensively in the literature4,5 and relate x-ray intensity to composition by equations of the form:

K_A = C_A × (electron backscatter factor) × (electron penetration factor) × (x-ray absorption factor) × (secondary x-ray fluorescent factor)

where

K_A = the ratio of measured x-ray intensity of element A to that of an A standard, with both intensities corrected for background, drift, and counter tube dead time, and
C_A = the weight fraction of A.
Since all of the above factors are themselves complicated functions of composition, the calibration equation must be solved iteratively. Furthermore, intensity
ratios of at least N - 1 components (N = the total
number of components) are necessary and therefore
N - 1 calibration equations must be solved simultaneously. Since hand calculation for even a few data
points in a binary system takes hours, a computer is

Figure 11-Electron beam microprobe system

essential for analysis of the multipoint multicomponent
systems frequently encountered in practice. During
the past ten years dozens of different computer programs have been written to reduce computation times
to minutes even for the most complex systems.5
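A compact way to see why the solution must be iterative is the fixed-point scheme sketched below; correction_factor() is only a placeholder for the published composition-dependent backscatter, penetration, absorption, and fluorescence expressions, and the renormalization step is a simplification of ours, not a detail of GEMAGIC.

def correction_factor(element, composition):
    # Placeholder: a real program evaluates the four correction factors here.
    return 1.0 + 0.1 * sum(c for e, c in composition.items() if e != element)

def solve_composition(k_ratios, tol=1e-6, max_iter=100):
    comp = dict(k_ratios)                      # first guess: C_i = K_i
    for _ in range(max_iter):
        new = {el: k / correction_factor(el, comp) for el, k in k_ratios.items()}
        total = sum(new.values())
        new = {el: c / total for el, c in new.items()}   # keep fractions summing to 1
        if max(abs(new[el] - comp[el]) for el in comp) < tol:
            return new
        comp = new
    return comp

print(solve_composition({"Ni": 0.62, "Al": 0.30, "Th": 0.02}))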
Since it is often desirable to map out the composition
of a specimen an x-y matrixing system was developed
for our Cameca microprobe which can be used to automatically control specimen position and collect digital
x-ray data as shown in Figure 11. The system shown is
for a single spectrometer, but is in fact connected to
four spectrometers and a digital specimen current
readout. In practice the region to be analyzed is manually positioned while being observed by an optical
microscope coaxial with the electron optical system. A
fixed number and size of x and y steps are selected
(e.g., a 3 X 10 matrix with x steps of 2 microns and
y steps of 5 microns). Operation of the system is then
initiated by a switch on the data scanner. Amplified
x-ray pulses from each spectrometer are converted to
pulses of fixed size and time duration which are counted
by scalers for a preset time. The parallel data from
the scalers is serialized by the data scanner and transmitted to the teletype where the results are printed
and a paper tape punched and/or the data is transmitted to a GE/PAC 30 computer through TRICOM.
The operator can then call the 600 on time-sharing
and transmit the data and any other information into
a data file. The 30,000 word microprobe analysis program, GEMAGIC (written by J. Colby7 and modified
by R. Bolon8) can then be run in remote batch mode
with the results either printed on a high speed printer
in the computer room or returned on the teletype.
Options are also available for plotting the results on a
Calcomp plotter, an example of which has been re-


Figure 12-Electron beam microprobe diffusion scan (microprobe analysis, diffusion profile Fiedler-Ciccarelli; abscissa: distance in microns)

drawn and shown in Figure 12 for a diffusion scan
across a nickel aluminum coated, thorium-doped nickel
alloy.

displacement meter, and the concentration of nitrogen in the expired gas, which is determined by measuring mass 28 on a small mass spectrometer sampling the expired gas.
The two signals are transmitted to the Center using
analog modems over standard voice-grade telephone
lines. There they are digitized by the 4020 and stored
on 600 bulk storage. The calculation made on the data
is an integration of the nitrogen concentration over the
total volume of gas flow. To provide backup in the
event of failure of the data collection system, an analog
tape recording of the information is made at the same
time the data are being collected.
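A minimal sketch of the washout integration follows; the signals are invented, and the final division by an assumed initial alveolar nitrogen fraction is the usual open-circuit washout step rather than a detail given in the text.

import numpy as np

# Invented, smooth stand-ins for the two digitized signals.
t = np.linspace(0.0, 300.0, 3001)                     # seconds
flow = 0.25 * np.abs(np.sin(2 * np.pi * t / 4.0))     # expired gas flow, L/s
f_n2 = 0.79 * np.exp(-t / 60.0)                       # expired N2 fraction (mass 28)

# Integrate nitrogen concentration over the total gas flow to get the
# volume of nitrogen washed out of the lung.
washed_out_n2 = np.trapz(f_n2 * flow, t)              # liters of N2

frc = washed_out_n2 / 0.79                            # assumed initial N2 fraction
print(round(frc, 2), "liters (illustrative numbers only)")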
Two other measurements are made with the same
system, but with different settings of the mass spectrometer. These are oxygen uptake using mass 32 and
carbon dioxide output using mass 44. These tests are
typically measures of the patient's metabolism, but, in
the case of shock patients, also give information regarding the patient's success in fighting trauma. As the
body reacts to a loss of blood, for example, it decreases
the flow of blood to the extremities, resulting in a decrease in oxygen uptake. As the body recovers, it increases circulation to the extremities, with a resulting
increase in oxygen uptake.
Another measure is pulmonary venous admixture.
This is the percentage of blood flowing through the
lung which is shunted through non-active tissue and
consequently is not oxygenated. The measurement is
made by calculating a mass balance for oxygen across
the lung membrane. The required data are the gaseous

PHYSIOLOGICAL MEASUREMENTS
The last application we wish to discuss concerns
some physiological measurements being made at the
Trauma Unit of Albany Medical Center Hospital.9 This
is a single bed unit devoted to the care of, and research
on, extremely seriously injured patients. Five measurements are currently being made on patients in the unit
with the aid of our laboratory automation system by
Dr. S. Powers and his staff. (Figure 13)
Functional residual capacity of a patient's lung is a
measurement of the lung volume which is usable in
respiration. The measurement is useful, for example,
in reaching a decision regarding the use of a respirator
for the patient. The measurement is made by abruptly
switching the atmosphere being breathed by the patient
from air to 100 percent oxygen and measuring the
amount of gas necessary to wash out the nitrogen from
the air in the lung by oxygen. The information needed
to calculate functional residual capacity is total gas
flow from the lung which is measured by a positive

Figure 13-Physiological tests


oxygen and oxygen content of blood flowing into and
out of the lung. The gaseous oxygen content is measured
by the mass spectrometer system, while the blood oxygen content measurements are made with a blood
oxygen analyzer.
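The mass balance referred to above is commonly written as the shunt equation; the specific form below is assumed here, since the paper does not spell it out, and the oxygen contents used are illustrative.

def venous_admixture(cc_o2, ca_o2, cv_o2):
    # cc: end-capillary, ca: arterial, cv: mixed venous O2 content (mL O2/dL blood)
    return (cc_o2 - ca_o2) / (cc_o2 - cv_o2)

print(f"{100 * venous_admixture(20.0, 18.5, 14.0):.1f} percent of flow shunted")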
The fifth measurement is cardiac output. This is
measured using a dye-dilution technique. A dye is injected into the bloodstream and the rate at which it is diluted is
measured using a densitometer through which a portion
of the patient's blood flows.
A by-product of the automation of these tests is the
ability to store the results in the computer files for
later retrieval. It is thereby possible to obtain a record
of all tests together with the trend of their results for
a given patient.
Each of these measurements is a standard physiological test, and nothing is remarkable about doing them
per se. The interesting aspect of this work is that the
tests are being made on patients actually in shock, and
that the laboratory automation system aids in obtaining the results within 5 to 10 minutes from the time the
tests are made. Thus the physician attending the patient can make decisions regarding the care of the
patient much more quickly, and, with the aid of the
additional information furnished by these tests, more
accurately.
CONCLUSION
As the above applications demonstrate, the laboratory
automation system has permitted experiments which
would not have been possible without it. While this in
itself would permit us to view it as a success, a far
more important aspect of the system impresses us. The
development of the system has been evolutionary and
will continue to be so. There is, for example, a requirement now for synchronous communication channels
between the mini-computers and the 4020. We find,
however, that this evolution has been provided with a framework, in the form of the computer operating system and the data communications system, about which to grow. Thus, a sound system design at the time these
systems were first conceived has provided, and will
continue to provide, direction for the orderly growth
of our system into areas which we could not have
foreseen at that time.


ACKNOWLEDGMENTS
We wish to acknowledge the assistance of J. Newell of
Albany Medical Center Hospital in describing the
physiological test application. The system we describe
here would not have been possible without the contributions of many of our colleagues at the Corporate
Research and Development Center. We take this opportunity of acknowledging the work of a long list of
contributors.
REFERENCES
1 R KERR A BERNSTEIN G DETLEFSON
J JOHNSON
Overview of R +DC operating system
General Electric Report 69-0-355 1969
A J BERNSTEIN J C SHARP
A policy driven scheduler for a time sharing system
Comm ACM Vol 14 p 74 1971
2 Anonymous
GE/PAC 4020 RTMOS manual
General Electric Company
3 P R KENNICOTT
A system for the quantitative evaluation of mass
spectrograph plates
14th Annual Conference on Mass Spectrometry Dallas
1966
4 R CASTAING
Application of electron probes to local chemical and
crystallographic analysis
PhD Thesis Univ of Paris France 1951
5 K F J HEINRICH Editor
Quantitative electron probe microanalysis
National Bureau of Standards Special Publication
Washington Vol 298 1968
6 D R BEAMAN J I ISASI
A critical examination of computer programs used in
quantitative electron microprobe analysis
Anal Chem Vol 42 p 1540 1970
7 J COLBY
Quantitative microprobe analysis of thin insulating films
Adv X-Ray Anal Vol 11 p 287 1968
8 R BOLON
Private communication
General Electric Company
9 S R POWERS JR MD ET AL
Analysis of mechanism of disturbed physiology in critically
ill patients
7th Annual Meeting of Soc of Engineering Science
St Louis November 3 1969
10 H HURWITZ
Unpublished
General Electric Company

Multicomputer processing in laboratory automation*
by

C. E. KLOPFENSTEIN

University of Oregon
Eugene, Oregon

and

C. L. WILKINS
University of Nebraska
Lincoln, Nebraska

INTRODUCTION

number of techniques to facilitate program debugging
using simulation methods. Also, the use of high level
languages will be discussed in order to show how the
usefulness of the minicomputer may be maximized by
offsetting its limitations.
If we consider, first, the program assembly process,
it is apparent that the rate-limiting step is input and
output. The slow devices we are forced to use can easily
insure that an assembly on a laboratory computer can
take anywhere from several minutes to an hour or more.
One solution would be to install high-speed peripheral
devices (lineprinters, magnetic tapes, magnetic drums,
disks, etc.) but, of course, these may easily cost more
than the computer itself, so this approach is often unsatisfactory. Short of doing this, is there any way for
the user to avoid being I/O limited in the critical
program development and debugging phases? We
believe the answer is yes. The following pages will
describe software we have developed for one particular
minicomputer, the Varian 620/i, to permit the programmer to use a much larger computer system for
much of his program development. To achieve this end,
both an assembler and simulator have been written
(in FORTRAN) to run on the IBM 360 to assemble
620/i code, and to simulate the execution of the programs thus produced. Additionally, high level language
compiler development is under way. These compilers
will operate in an analogous fashion, producing machine
programs which will then directly execute on the smaller
computer. While we are well aware that the techniques
mentioned above have been in use by computer designers
and manufacturers for years, no such widespread use
has been evident among those who have recently discovered the usefulness of the small computer as a
laboratory tool. Accordingly, the end user rarely

Widespread availability of minicomputers in the
scientific laboratory has become a reality in recent
years. With the price and size reductions which have
taken place, it is now practical to use these machines
as routine tools in the laboratory environment. However, in order to effectively implement this new research
tool the scientist has had to learn its limitations as
well as its capabilities. It is well-known by computer
scientists that, in general, the trade-off made by
programmers is memory size vs. speed of execution. In
other words, the more memory the programmer has
available, the faster running program he may write
and, conversely, the less memory he has, the slower
will be the execution times. While there are a great
many qualifications to this broad generalization, it still
represents a recognizable truth.
Since memories available in the laboratory minicomputer are generally in the 4 to 12 K word range,
the experimenter has had to painfully learn the lesson
mentioned above. Attention to good programming
practice has helped alleviate the problems of limited
memory, but major problems still remain. Nowhere
does this become more evident than in the program
development stage. Traditionally, the minicomputer
has employed multipass assembly procedures, usually
utilizing a slow output device such as the teletype to
produce program listings. In practice, it rapidly becomes
apparent that such an assembly procedure is highly
unsatisfactory in all but a very few cases. In this paper
we will describe an alternative to this approach and a

* We gratefully acknowledge support of the National Science
Foundation through grants GJ-441 and GJ-393.

specifies the sort of software mentioned above and,
consequently, little such software has been provided by minicomputer producers.**
We began, about two years ago, the development
of a program to train chemistry students, both graduate
and undergraduate, in the use of the small computer in
the chemistry laboratory. Accordingly, small computers
especially configured for this purpose were acquired and
employed in the early stages of course development.
Since we were using 4K computers equipped with
ASR 33 teletypes and fast paper tape I/O (300 cps read,
120 cps punch) for instructional purposes, we noted at
the outset that our student programmers would often
spend 45 to 60 minutes assembling a program and
comparable times for performing the elementary
debugging associated with syntax errors, mispunches,
and the like. We estimated that as much as 90 percent
of the computer time was being used for program
development, with only about 10 percent being used
for running the experiments in laboratory data acquisition and control which were our prime interest. This
was clearly an unacceptable ratio and we immediately
began to consider how we might make use of off-line
(from the small computer) assembly and simulation
techniques in order to improve the ratio and to free the
laboratory computer for those tasks it does best.
THE USE OF DAS 360 AND SIM 620
Our solution was two programs, written in
FORTRAN (DAS360 and SIM620) which would
rapidly assemble and simulate 620/i code on the
IBM 360. Now it became possible to provide far more
adequate diagnostic messages (necessarily limited and
cryptic in the Varian assembler) and to greatly facilitate
the whole programming process. Students could now
enter their programs via punched cards to the IBM
360, operating in the batch mode, and have a program
assembled in a few seconds, as well as have a listing
returned with comprehensive diagnostic messages. At
the University of Oregon, where a research computer
equipped with IBM compatible magnetic tape was
available, the student programs were assembled on the
360, the assembled programs written on magnetic tape
and the resulting tape transferred to the 620/i where
the programs were dumped on paper tape and returned
to the students along with the assembly listing for
debugging or execution. At the University of Nebraska,
students were permitted to use remote job entry CRT

** This situation has changed rapidly and now this software is
increasingly available.

terminals (first IBM 2260, later Bunker-Ramo 2206 terminals) and the University of Nebraska Remote Operating System (NUROS) to enter their Varian
620/i programs first to a disk file, then directly into the
job queue for assembly. Students had the option of
placing the assembled programs together with their
error messages on a disk file for immediate viewing on
the terminals in the chemistry building, and obtaining
a printed listing later or, alternately, obtaining a
printed listing only. Punched card output of the
assembled programs was available to all students. Both
the assembler and simulator were stored on the 360
disk files and their execution could be invoked via a
catalogued procedure in much the same way as a user
would invoke any other program (e.g. the FORTRAN
or COBOL compilers). Figure 1 contains a block
diagram of the major components of the University of
Nebraska system. In this way, it became possible to
instruct far more students than could have conceivably
been handled had we been restricted to the use of the
small computer only. Furthermore, it was now possible
for any student or faculty member at either of the
Universities to make use of the programs, the only
restriction being that they needed to have a valid
account number to allow them to use the 360. Quite
simple instructions for the use of both assembler and
simulator were developed and freely distributed to any
student who desired them.
The simulator (SIM620) proved to be at least as
useful as the assembler for the program debugging

(Components shown: IBM 360/65; 2400 baud data sets; 9-track, 800 BPI magnetic tape; Bunker-Ramo multistation controller, standalone CRT, and CRT units; Varian 620/i (8K) computers, a Varian 620/L (8K), and a Data General Supernova (4K); paper tape links; experiments)

Figure 1-Block diagram, University of Nebraska distributed computer system for chemistry


phases. As we mentioned earlier, it was our observation
that, initially, as much as 90 percent of the minicomputer time was absorbed by the process of program
debugging. Since those using the computer were either
(a) beginning students who had never seen a computer
before, (b) researchers writing code for a specific
research application (many of whom had never seen a
computer before, either) , or (c) students and researchers
engaged in the final check-out of completed programs,
it was apparent that, short of buying many expensive peripherals or several more teaching computer systems, we were not going to be able to significantly change the debugging/experiment use ratio with laboratory computer systems. The obvious solution to the problem was to make use of the powerful, fast 360, insofar as possible. We therefore developed a program which allowed the larger computer to simulate the operation of programs
assembled for use with the laboratory computer. This
simulator also allows the programmer to enter limited
amounts of data in order to test the operation of his
620/i program. The results of this simulation are quite
useful and allow programs to be debugged very much
faster and more efficiently than would otherwise be
possible. A listing containing a full trace of program
operation is produced. The contents of any or all
operational registers before the execution of every
instruction, the instructions executed and their memory
locations may be included in this listing. At the University of Nebraska, this listing can be placed on
magnetic disk files at the computer center and viewed
by the student on the chemistry department CRT
terminals. In this way, the student has, in effect, a
remote 620/i to use for program debugging. The information thus obtained can be collected in a small
fraction of the time it would otherwise take. Through
use of the simulator, large numbers of students can
simultaneously check programs for correct operation
and only after they are reasonably certain their programs are error-free do they need to use the laboratory
computer. This makes possible the restriction of its
use to the types of laboratory problems most important
to successful implementation of the system in the
chemistry laboratory.
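The kind of trace such a simulator produces can be suggested with a toy interpreter; the three-instruction machine below is invented for illustration and is not the Varian 620/i instruction set or the SIM620 program.

def simulate(memory, max_steps=50):
    acc, pc = 0, 0
    while pc < len(memory) and max_steps > 0:
        op, arg = memory[pc]
        # Register contents are printed before each instruction executes,
        # as in the full-trace listing described above.
        print(f"PC={pc:04o}  ACC={acc:06o}  {op} {arg:o}")
        if op == "LDA":
            acc = memory[arg][1]
        elif op == "ADD":
            acc = (acc + memory[arg][1]) & 0o177777
        elif op == "HLT":
            return acc
        pc += 1
        max_steps -= 1
    return acc

program = [("LDA", 3), ("ADD", 4), ("HLT", 0), ("DAT", 0o25), ("DAT", 0o14)]
print("result:", simulate(program))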
Once DAS360 and SIM620 had been successfully
implemented, we turned our attention to improvement
of debugging tools for that essential element of the
experiments, the on-line testing phase. Particularly
important for this is the availability of a fast effective
means of interpretatively executing programs, changing
and displaying core contents, and trapping certain
conditions. With the 620/i, as with most small computers, utility programs to perform certain of these
tasks were provided. In order to attain all the capa-

(Software shown: DEC PDP-10: TECO, BASIC, DAS10, SIM620, RT EXEC; IBM 360/65: DAS360, SIM620; Varian 620/I: DAS, PMTR, AID, BASIC, FORTRAN, CLASS; Bunker-Ramo multi-station CRT units: BASIC, CALCTRAN, NUROS, PMTR)
Figure 2-Interrelationships of software packages

bilities we required it was, however, necessary to
extensively modify the software provided by the
manufacturer. The resulting program (AIDF) is
described below.
The basic premise behind the design of AIDF was
that nearly all of the scientists and students who debug
620/i assembly programs can easily interpret machine code in octal format. Accordingly, a simulator-interpreter was prepared that provides complete
program tracing and trapping facilities, with or without
a teletype listing of the conditions of all registers before
and after execution of each instruction. In this regard,
listings similar to that provided by the 360 simulator
are provided. However, since the programmer generally
will use AIDF interactively, a much more flexible
command set was made available. Provisions were
made for listing and changing core from the teletype
during execution of programs, as well as for simulating
memory protection and features to pass control to
command mode on execution of certain user defined
"illegal" instructions. Also, the user can define certain
unused operation codes to perform calls to his routines
for fast non-interpretive execution. Multiply-Divide
and other optional hardware are in this way emulated
by the interpreter. To provide even more rapid debugging capabilities, all interpretive output may be
directed to a storage oscilloscope. In this latter mode,
execution rates of about 10 instructions per second with
a full trace are attained. Using this debugging aid,
students spend less wasted time at the computer
console, and they have hard copy to carry away and
survey at their leisure. Without AIDF, many fewer
students would have access to the minicomputer for
debugging assembly language programs. The interrelationships of these various software packages as
well as an I/O translator (PMTR) and some others are
summarized in Figure 2.


TABLE I-Basic Subroutines

A. For Analog Input
CALL SADC, G, C: Dual slope integrating ADC, 14 bit, 110 conv/sec max. G = gain (1=1, 2=8, 4=64, 8=512); C = channel (1=1, 2=2, 4=3, 8=4)
CALL SADCF, C, S: Succ. approx. ADC, 13 bit, 100 Kc conv/sec max. C = channel (1=1, 2=2, 4=3, 8=4); S = sense on DIO for block in start (0-7)
CALL RADC, V: Reads one value from DSIADC
CALL RADCB, S, N, A(I): Reads N values at S points/sec into array A from DSIADC
CALL RADCT, V, T: Simultaneously reads one value from DSIADC and the timer
CALL RADCF, S, N, A(I): Equal RADC except for fast ADC

B. For Analog Output
CALL ODAX, C, V: Outputs V to DAC number C. Channels 2 and 7 are 10 bit DAC's; channels 4, 5, and 6 are 15 bit DAC's
CALL ODAC, X, Y: Simultaneously outputs X and Y to channels 2 and 7 and strobes the 611 storage scope beam. Used for scope and X-Y recorder drivers. (Range ±500)

C. For Scope Control
CALL SCOPE: All output goes to scope
CALL TELY: All output goes to teletype
CALL SIZE, S: Sets scope characters to size S; usual size is 4
CALL POS, X, Y: Causes next character to be put at X-Y on scope (range ±500)
CALL ERASE: Erases scope

D. X-Y Recorder Pen Control
CALL PENU: Lifts pen and waits one second
CALL PEND: Puts pen down and waits one second

E. External Controls
CALL EEXC, C: Strobes channel C (0-7) on DIO
CALL EXCD, C, D: Strobes channel C (0-7) on device D (0-63)

F. Senses
CALL ISEN, C, A: Returns A = 1 if sense on channel C of DIO is true, otherwise A = 0

G. Digital I/O
CALL ODIO, W: Outputs W to lamps and digital drivers on DIO
CALL RDIO, C, W: Inputs from DIO to W from channel C (0 = 16 bit buffer, 1 = switches)
CALL RCTR, C, T, A: Reads the digital counter on channel C (1 or 2) for time T (1 or 10 sec) into variable A

H. System Control
CALL MAGT: Loads the Magnetic Tape Operating System

HIGH LEVEL LANGUAGES
We will now examine the potential usage of high
level languages with minicomputers. The advantages
of providing laboratory users with high level language
capabilities are multifold. The most important is that
the time between conception and execution of an
experiment is reduced to a minimum, which not only
increases the effective usefulness of the minicomputer,
but also greatly lowers the "activation barrier" that keeps scientists who have never used the small computer from trying various experiments. Likewise, it becomes
possible to incorporate experiments that illustrate data
acquisition techniques into laboratory courses which are
already established in the curriculum of a university
or college.
One popular misconception held among those who
have not previously used portable small computers
(8K core with teletype I/O) is that high level languages
such as BASIC and FORTRAN cannot be used for
real-time applications due to their limited speed of
execution and the large core storage space required for
their use. However, in most experiments, the input data
rate requirements are specified independently from the
output requirements, and usually even the time between
the input and output functions may be specified
separately. Accordingly, for those applications where
the timing requirements can be modularized, it seems
reasonable to provide high level languages for coding the
calculations, with assembly language sections or routines
for input and output. Core storage remains a problem,
but even with only 8K of core, programs with over 150
FORTRAN statements can be supported.
Provisions for calling subroutines are part of the
languages FORTRAN and BASIC, and since these are
the most commonly used compiler and interpretive
languages today, we have provided a set of subroutines
that allows the high level user access to virtually every


peripheral device required for real-time computing.
Table I lists these routines. The devices include clocks,
timers, analog-digital converters, digital-analog converters, relays, lamp indicators, and the like. Electrical
connections are provided through a standard front panel
using VICI or Banana plugs so that students can easily
and rapidly change experiments without requiring even
a screw driver. Our experience indicates that even
relatively demanding tasks, such as low resolution mass
spectrometry data processing, can have the calculation
and teletype input-output code written in FORTRAN.
Student experimenters prefer to use the slower, but
more interactive, Real Time Basic package. The most
important feature of Real Time Basic is that programs
can be written, tested, and modified interactively from
an on-line teletype eliminating the compilation and link
edit steps of FORTRAN. Statements are checked for
correct syntax on loading rather than at run time to
even further accelerate the rate of program development. The experiments we have developed for students
to perform in BASIC range from analysis of gas
chromatography data to complicated simulations in
physical chemistry. A typical student-written program
for acquiring and plotting 250 pairs of time and voltage
readings (from a pH meter, in this case) is as follows:
90  DIM V(250),T(250)
100 CALL SADC,4,1
110 FOR I=1 TO 250
120 CALL RADCT,V(I),T(I)
130 CALL ODAC,V(I),T(I)/10.
140 NEXT I

The language CLASS (Computer Language for
Spectroscopy Systems) was developed by Varian
Associates to provide a simple programming medium to
solve slow to medium speed on-line data collection and
massaging problems. This string processing language
provides an easy to learn system for instrument control
and acquisition of lists of data-all arithmetic operations
are performed in a double precision (31 bit) stack, and
data may be stored in either single or double word
formats. Programs are written as strings composed of
macro calls which are executed interpretively. Any
string or collection of strings can easily be defined by
new macro names, or, old, unused macros may be
deleted to free needed core space. All program generation
steps occur interactively at the teletype immediately
before an experiment.
We have used CLASS to support acquisition and
processing of data from various spectrophotometers and
electron spin resonance spectrometers.
The sample CLASS program given below demonstrates the code required to collect a 3000 point electron
spin resonance spectrum in each of the possible pro-

(PDP-10 timesharing and 360/50 batch systems linked by a 3000 foot, 50,000 baud line)
Figure 3-Block diagram, University of Oregon distributed
computer system for chemistry

gramming modes. Effective use of the lowest string
mode requires a complete understanding of the architecture of the language, whereas use of the highest level
requires only instruction in the use of a teletype.
Lowest "Assembly" Level:

TTY
STRING
R24
R77
•
R17
C2
R33
A100
C1
C5999
R76
R76
•
M
GO

Simple Macro Level:

TTY
RETURN
PEND
END
COLLECT
FILT2
PUT
DEST 100
STRT1
END 5999
RIGHT
RIGHT
END
GO

Highest Macro Level:

TTY
EPR
FILT2
DEST 100
STRT1
END 5999
GO

On execution of the sample programs given above,


the recorder arm would first be moved to the extreme
left with the pen up. After that, the pen would be
lowered, three thousand data points collected (at a rate
of one point every two deciseconds) and the data
plotted. For each data point the recorder would be
stepped two recorder increments to the right.
The greatest advantage in the use of linked computers is that expensive bulk storage capabilities need
to be provided at only one location. At the University of
Oregon, we are currently implementing a communications network that will support direct transmission of
data between several Varian 620/i computers and a
centrally located Digital Equipment Corporation
PDP-10 computer. The tremendous volumes of data
generated in gas chromatographic mass spectrometer
and Fourier infrared experiments are collected with an

8K machine and transferred for computation to the
PDP-10. The system hardware is outlined in Figure
3.
SUMMARY
In this paper we have discussed an approach to laboratory computing which allows the experimenter to take
advantage of each of a variety of programming languages and hardware facilities. Through use of the
distributed computing systems described, it has been
possible to serve a wide range of users ranging from the
inexperienced student to the experienced researcher, and
to allow them to make use of laboratory computers
routinely.

Enhancement of chemical measurement techniques
by real-time computer interaction

by

SAM P. PERONE
Purdue University
Lafayette, Indiana

The small, dedicated, laboratory computer can
provide enhanced capability for electroanalytical measurement techniques in the chemistry laboratory. The
work described here is based primarily on three recent
publications by Perone, Jones, and Gutknecht.1,2,3
The particular electrochemical analysis techniques to
which computerization has been applied are stationary
electrode polarography (SEP),4 and the closely-related
derivative voltammetry.5,6 However, the principles and
methodology described should be generally applicable
to other electroanalytical techniques, and, perhaps, to
chemical experimentation, in general.
Certainly, one very important way in which the
on-line digital computer can improve the capabilities
of chemical measurement techniques is simply to
provide automated experimentation, data assimilation,
and straightforward data processing. This has been
demonstrated amply already for electroanalytical instrumentation.7,8,9 However, these approaches are
bounded ultimately by the inherent limitations of the
particular measurement technique. To take full advantage of the dedicated computer, one should utilize
its capabilities for rapid "intelligent" feedback and
incorporate the computer into the experimental control
loop, as shown in Figure 1. Thus, the computer could
monitor the experiment; process the data as the experiment progresses; and, "intelligently" modify the course
of the experiment to provide optimum measurement
conditions. Of course, the "intelligence" is related to the
programming skill and the experimental intuition of the
programmer. Moreover, the transfer function for
"intelligent" response will depend on the degree of
sophistication and number of calculations and decisions
which must be made in "real-time"-i.e., between
successively acquired data points; also important are
the computer's speed, and hardware arithmetic and
logical capabilities. (A more quantitative discussion of
response factors is given below.)
The important feature of this latter approach-real-

time computer optimization of analytical measurements-is that a measurement technique can be
generated, which would be unattainable without the
aid of an on-line computer. Moreover, in the process of
developing this application, the experimenter must
necessarily investigate systematically those experimental parameters which most critically determine the
computer optimization of the measurement. This can
be done only with the computer in the control loop.
However, the results of these investigations then
provide the foundation for the design of new instrumental methods which might be implemented independent of an on-line computer. These aspects of
laboratory computer applications-real-time computer
interaction with instrumentation, and computerized
experimental design of interactive instrumentation-will be discussed below.
RESPONSE FACTORS FOR REAL-TIME
COMPUTER INTERACTION
The response of the computerized control loop
depicted in Figure 1 can be quantitatively evaluated.
The analysis is similar in many respects to that applied
in characterizing analog operational amplifier response.10
That is, one can describe a transfer function for the
digital computer control element. This transfer function
can be defined as the dependence of computer power
on stimulus frequency.
To define these terms, consider that a digital computer is a programmable device which executes
arithmetic, logical, and input/output operations in
sequential fashion. (References 9, 11, and 12 may
provide useful background on computer programming
for laboratory applications.) Thus, computer power is
directly proportional to execution time available. It is
also directly related to inherent hardware capabilities,
such as instruction execution time, micro-programming


between stimuli, as given in Equation 1,

P = 1/f - τ    (1)

where P is in units of time. The dependence of P on f and τ is shown graphically in Figure 2. The smallest value of P shown on the ordinate axis in Figure 2 is 1.0 µsec.
The type of response analysis presented above is admittedly simple-minded. Nevertheless, it can be very useful in the design and implementation of computer-interactive instrumentation. Certain related factors must be considered, however. First of all, the discussions here relate only to dedicated systems, where the computer is interfaced to a single instrument. Thus, it is assumed that the execution of the real-time service program begins "immediately" upon request from the interfaced instrument. In fact, there is some minimum response time, τ0, required. The magnitude of τ0 depends on the computer hardware features and also depends on the programmer's choice of response mechanism. If he chooses to use the computer's interrupt system, τ0 may be as short as a microsecond. If he chooses to use program-controlled response, where the computer is programmed to sit in a loop waiting for a service request, τ0 may be several microseconds.

Figure l-Block diagram of laboratory instrumentation with
computerized feedback loop
characteristics, input/output structure, etc. However,
for a given computer and a specific experimental
application, the critical variable is execution time
available.
The stimulus frequency, j, is simply the frequency at
which the computer is "poked" by the experiment
requesting some external service. In the simplest
situation, f is the data acquisition frequency-the rate
at which digitized data are made available to the
computer from the experiment.
Consider now the role played by the on-line computer
during the execution of a given experiment. In some
experiments, the computer's service, as each datum is
made available from the digital data acquisition system,
may involve only inputting the datum, saving it in
memory, and some bookkeeping. Typical service time,
τ, may be 20 or 30 µsec. In a more complex case, where some computational evaluation of the data, logical decisions, and possible experimental control operations may be required also in real-time, τ may be considerably
longer. This latter case is the type with which we are
concerned here.
Now we can define a transfer function for a computerized system. The real-time computer power, P,
can be equated to the available computational time

Figure 2-Dependence of real-time computational power on stimulus frequency (log P, µsec, versus log f)
A. τ = 10^-6 sec
B. τ = 10^-5 sec
C. τ = 10^-4 sec
D. τ = 10^-3 sec

A second assumption is that the computer is the
slowest element in the control loop. That is, it is assumed that all analog instrumentation controlled or
measured by the computer does not limit the overall
system response.
Another relevant consideration is that the value for
service time, τ, used in any calculations should be the
"worst case" value. That is, where alternative program
pathways exist, assume that conditions will always
require the longest path.


gramming had to have a worst-case execution time, τR, less than 82.4 µsec. This allowed programming
requiring no more than 51 machine cycles.
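The budget quoted here (a 1.6 µsec machine cycle, a 4.8 µsec worst-case program-controlled response, an 8-cycle minimal service program, and a 10 kHz acquisition rate, as laid out under SOME SAMPLE CALCULATIONS below) can be checked with a few lines; the variable names are ours, not the author's.

cycle = 1.6                      # basic machine cycle time, microseconds
tau_0 = 4.8                      # worst-case program-controlled response
tau_m = tau_0 + 8 * cycle        # minimum service time: response plus 8 cycles
f = 10000                        # data acquisition frequency, Hz

tau_a = 1e6 / f - tau_m          # Equations (1) and (2), in microseconds
print(round(tau_a, 1), "microseconds available")       # 82.4
print(int(tau_a // cycle), "whole machine cycles")     # 51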
The following discussion presents a specific application of real-time computer programming for the
enhancement of stationary electrode polarographic
measurement capabilities. The work described is taken
from Reference 1 and illustrates the implementation of
principles discussed above.

REAL-TIME COMPUTER CONTROL IN
STATIONARY ELECTRODE POLAROGRAPHY

SOME SAMPLE CALCULATIONS
Consider now how one might use the system response
analysis described above for the design of a particular
experimental application involving real-time computer
interaction. First, the essential elements of the minimal experimental service program, including the required input/output and bookkeeping instructions to be executed for each stimulus, must be established. The time required for these operations plus the minimum response time, τ0, corresponds to the minimum service time, τM. The transfer function for this value of τM establishes the real-time computer time available, τA, where

τA = 1/f - τM    (2)

Thus, the programmer can calculate τA for the specific data acquisition or service frequency, f, required in his application. Then, he must establish τR, the real-time computational time required to provide the desired calculations, decisions, and experimental control operations to allow computer interaction with the experiment. The linear combination of the two programming segments results in a total real-time service time, τT, where

τT = τM + τR    (3)

If the programming is such that τR ≤ τA, the proposed application is feasible. Alternatively, for a given value of τT, one can calculate the maximum data acquisition frequency allowed. This can be obtained by setting P = 0. Then, 1/f = 1/fmax = τT.

A specific example should illustrate the above discussion. One computer system used by this author has a basic machine cycle time of 1.6 µsec, and each instruction is some integral multiple of this value. For a particular application, program-controlled service was used requiring a worst-case response, τ0, of 4.8 µsec. The additional minimal service programming required 8 cycles, 12.8 µsec. Thus, τM was 17.6 µsec. The desired data acquisition frequency, f, was 10 kHz. Therefore, the value of τA was computed from Equations 1 and 2 to be 82.4 µsec. Thus, the real-time interactive pro-

Stationary electrode polarography (SEP) is a
chemical analysis technique where solutions are
analyzed by the measurement of electrolysis currents
that flow when the cell voltage is swept. Despite its
many desirable characteristics for automated analysis,7
SEP is limited seriously for application to mixtures.
Because of the continuous nature of the experiment,
currents from easily-reducible species continue to flow
and contribute to, distort, or mask currents measured
for more-difficultly-reducible species. The non-ideal
aspect of the technique is the fact that a continuous
linearly varying potential is applied to the electrolysis
cell, regardless of the composition of the sample. If the
linear sweep were discontinuous, stopping briefly after
each reduction step to allow the more complete dissipation of the easily-reducible species in the diffusion
layer around the electrode, the interference with
reduction steps for more difficultly-reducible species
would be considerably diminished. However, such a
discontinuous or "interrupted-sweep" experiment would
require some foreknowledge as to the composition of
the mixture, and this is not likely in real
analytical situations.
The work of Perone, Jones, and Gutknecht1 illustrated how one can take advantage of the control
capabilities of the on-line digital computer to overcome
the resolution problems of stationary electrode polarography. The approach taken was to allow the computer
to interact with the experiment in real-time to generate
an interrupted-sweep experiment which was effectively
"sample-oriented". Some details of that work will be
presented here.
Sample-oriented analysis-Interrupted sweep approach
Resolution limits

The theories of conventional stationary electrode
polarography4 and stationary electrode polarography
with derivative read-out5,6 allow the accurate prediction

444

Fall Joint Computer Conference, 1971

TABLE I-Values of Current Function Ratios, χ(at)p/χ(at)E, χ'(at)p/χ'(at)E, and χ''(at)p/χ''(at)E, as Functions of (E - E1/2)a

(E - E1/2)    χ(at)p/χ(at)E    χ'(at)p/χ'(at)E    χ''(at)p/χ''(at)E
-150          1.813            6.044              15.01
-175          1.991            7.983              24.08
-200          2.153            10.18              35.03
-225          2.299            12.54              50.26
-250          2.436            15.02              68.00
-275          2.563            17.63              88.90
-300          2.671            20.22              115.6
-325          2.788            22.99              144.5
-350          2.896            26.16              172.6
-400          3.096            32.28              241.0
-450          3.304            38.90              330.0

a See References 4 and 5 for definition of symbols.

of resolution limits. This can be done by calculating
the theoretical ratio of the current functions at the peak
and at some potential beyond the peak. Using the
mathematical approaches outlined previously,4-6 and
considering only reversible systems, this has been done
for 0-, 1st-, and 2nd-derivative measurements.13 The
results are shown in Table I. [The peaks chosen for the
1st- and 2nd-derivative current functions are the largest ones in each case, at n(E - E1/2) = 18.8 and -14.4 mV, respectively.] Note that the interference
with a succeeding reduction diminishes considerably
with derivative measurements and with increased
potential separation of reduction steps; resolution
increases with the order of the derivative; however,
even with the derivative measurement, the resolution is
not particularly good when reduction steps are closer
together than about 300/n mV.
Thus, considering arbitrarily a separation of 300/n mV and equal n- and D-values, it would be possible to resolve, with 5 percent contribution of the first reduction step to the second, concentration ratios of 1:7, 1:1, and 5.8:1 for the 0-, 1st-, and 2nd-derivative measurements,
respectively. These calculations are exclusive of any
other background contributions, with the assumption
that they are negligible or can be measured independently for correction.
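These ratios appear to follow from Table I by allowing the first species a contribution of 0.05 times the current-function ratio at a 300/n mV separation; a quick check under that reading:

ratios_at_300 = {"0th": 2.671, "1st": 20.22, "2nd": 115.6}   # Table I, -300 row
for order, r in ratios_at_300.items():
    print(f"{order} derivative: C1/C2 up to {0.05 * r:.2f}")
# prints about 0.13 (roughly 1:7), 1.01 (roughly 1:1), and 5.78 (roughly 5.8:1)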
It would, of course, be possible to correct mathematically for the interference caused by overlapping
reduction steps. However, a limit is reached with this
approach when the interference is so great as to preclude
even the recognition of a succeeding reduction step. In any
event, this approach has been considered2 and will be
discussed later.

Figure 3-System block diagram (current measurement and differentiator instrumentation; SEP-I)

Details of Interrupted-Sweep Experiment

The interrupted-sweep experiment involves a computer-controlled potentiostat (described in Reference
1), and real-time analysis of fast-sweep derivative
polarographic data. (A system block diagram is shown
in Figure 3.) The computer continuously monitors the
experimental output, is instantaneously aware of the
occurrence of reduction steps, and can interrupt the
linear potential sweep at an appropriate potential
cathodic of each peak. The interrupt potential is held
for a length of time computed to allow sufficient
depletion of the electro active species in the diffusion


Figure 4-Comparison of normal and interrupted-sweep
stationary electrode polarography
A. Stationary electrode polarogram (without interrupt)
B. Stationary electrode polarogram (with interrupt)
C. Applied cell potential (without interrupt)
D. Applied cell potential (with interrupt)


layer, and then the sweep is restarted. The interrupt
delay time, T', is calculated in proportion to the magnitude of the reduction step, with the restriction that
T' not be so long as to cause convection processes to
occur or to allow significant electrolysis of the next
electroactive species. Thus, the controlling potential
function-which is basically a series of ramp-and-hold
steps-will be different for each experiment, depending
on the sample mixture and composition. The experiment
is custom-tailored to the sample-i.e., sample-oriented.
A simplified comparison of the continuous- and interrupted-sweep experiments is shown in Figure 4. A flowchart of the computer-controlled experiment is given in
Figure 5. The computer used in this work was a HewlettPackard Model2ll5A with l6-bit word size, 8,192-word
core memory, 2.0 p.sec cycle time, and a hardware
extended arithmetic unit. The data acquisition system
included aI~-bit, 33 J.Lsec conversion-time analog-todigital converter (ADC). The timing was provided by
an external 10 MHz crystal clock scaled down to 1 KHz,
which was the maximum data acquisition frequency, f,

Figure 5-Flowchart for real-time computer-optimized SEP (real-time computations loop)

Figure 6-Analytical measurements from first- and second-derivative voltammetric data (abscissa: E(t))
A. First derivative
B. Second derivative

used for all experiments. A detailed discussion of interfacing and control logic is given in Reference 1.
The information taken in by the computer is extracted from either the 1st- or 2nd-derivative signal.
Only the negative region of the derivative signal is seen
by the analog-to-digital converter, and the measuring
circuit is arranged so that only the largest peak in each
case is the correct polarity. The result is shown in
Figure 6. The computer is programmed to look for
sharp peaks-above an arbitrary threshold-and to
measure and store peak heights, peak areas, and peak
locations in real-time.
When a complete peak is observed-i.e., one which
goes above threshold, goes through a sharp maximum,
and then comes back down below threshold-the
computer executes the maximum real-time programming, calculating the n-value, the potential at which the
sweep should be interrupted, ED, and the delay time,
T'. (The maximum total real-time programming time, τT, is about 800 µsec.) Then the computer allows the
sweep to continue (if necessary) until the desired
interrupt potential (ED) is reached; the sweep is
interrupted at this point for time, T'; the sweep is then
reinitiated with the computer looking for the next
reduction step, ready to reexecute a similar interrupted-sweep. (The delay time, T', is computed in
proportion to the peak height, with the restriction
that T' be between 100 and 1000 ms.)
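The real-time logic just described can be sketched as a small state machine; the thresholds, the n-value estimate, and the interrupt-potential rule below are placeholders, not the relations of Reference 1.

def interrupted_sweep(samples, threshold=50, area_threshold=500):
    # samples: (applied cell potential, digitized derivative reading) pairs
    state, height, area = "below", 0, 0
    for e_cell, deriv in samples:
        if state == "below" and deriv > threshold:
            state, height, area = "in_peak", deriv, deriv
        elif state == "in_peak":
            area += deriv
            height = max(height, deriv)
            if deriv < threshold and area > area_threshold:
                n = max(1, round(height / area * 10))      # placeholder n estimate
                e_hold = e_cell - 0.100 / n                # placeholder hold potential
                delay_ms = min(1000, max(100, 2 * height)) # clamp to 100-1000 ms
                yield e_hold, delay_ms                     # hold the sweep, then resume
                state = "below"

ramp = [(-0.2 - 0.001 * i, 80 if 100 < i < 140 else 10) for i in range(400)]
print(list(interrupted_sweep(ramp)))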
Advantages of real-time calculations

Uses of Integral Data
It was possible to integrate the derivative peaks
(Figure 6) seen by the computer in real-time and to use



these integral data for analytical purposes, for calculating appropriate interrupt potentials, and for providing diagnostic information. The area under a peak is directly proportional to the peak height and, therefore, to the concentration of electroactive species. Thus, the peak integral, Qp, is a concentration-dependent output. Moreover, should the peak location routine in the program fail because the derivative peaks are too noisy, broad, or small, the peak integrals will still be taken and provide a useful, reliable source of analytical
information. In addition, a peak area threshold is
incorporated into the program, whereby the area of a
given peak must exceed some arbitrary value before
the computer will recognize a signal excursion as a
bona fide reduction peak. This is a very useful processing
parameter.
In the case of 1st-derivative read-out, the value of the integral of the observed peak is equivalent to the peak height of the conventional stationary electrode polarogram. It has been shown previously that, for a reversible system, the ratio of the 1st-derivative peak height, I'p, to the conventional peak height, ip, is related to n,6 as given by Equation 4,
(4)

where v is the scan rate in volts/sec. Thus, a determination of the ratio of the derivative peak height to
the peak integral can lead to an evaluation of n, and the
computer is programmed to do this in real-time so that
the information is available for interrupted-sweep
decisions. This information is also useful for qualitative
identification of species or for providing an error
diagnostic if wrong n-values are obtained for known
systems.
A similar relationship exists between the n-value and
the ratio of the 2nd-derivative peak height to the peak
integral. The measured integral is equivalent to the
difference between the positive- and negative-going
1st-derivative peaks. The relationship can be calculated
from previous theoretical data5 and is given in Equation
5. (Note the error in Equation 2 of Reference 1.)
(5)

Thus, the n-value can be obtained from either the 1st- or 2nd-derivative measurements. For reversible processes, the computed n-value is accurate. For irreversible processes, the n-value at least reflects the broadness of


the peak, and this is useful for the interrupt potential
calculations discussed below.
Selection of interrupt potential

In the interrupted-sweep experiment, the potential, ED, selected to be held during the delay period, T', is critical. The objective is to select a potential which is cathodic enough to deplete adequately the electroactive species in the diffusion layer. That is, the potential should be chosen such that the concentration ratio, CO/CR, approaches zero at the electrode surface.
The obvious problem is that selecting a value of ED
cathodic enough to truly deplete the electroactive
species at the electrode would eliminate the possibility
of observing a succeeding closely-spaced reduction.
Thus, a compromise must be reached, and a knowledge
of the n-value for the reduction step on which the delay
is made is useful in selecting the appropriate interrupt
potential.
In this work, interrupted-sweep experiments were
run with the interrupt potential (ED) selected by the
computer after it has observed a complete derivative
peak-i.e., when the derivative signal is going through
zero. ED is selected relative to the potential, Ez, at which the derivative signal goes through zero. The computer uses information provided initially by the operator in order to calculate ED-Ez. That is, the operator initially specifies n(ED-Ez); the computer determines n and Ez, and then selects ED. Experiments
are reported below where the operator-selected value of
n(ED-Ez) was varied for a series of runs to observe
the effect of ED on quantitative resolution.
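Since the operator supplies the product n(ED-Ez) and the program determines n and Ez from the observed peak, the interrupt potential follows from a one-line computation. A minimal sketch, with hypothetical names and millivolt units throughout:

def select_interrupt_potential(e_z_mv, n, n_ed_ez_mv):
    # ED = Ez + [n(ED - Ez)]/n; n_ed_ez_mv is the operator-specified product in mV
    return e_z_mv + n_ed_ez_mv / n

# Example: with Ez = -450 mV, n = 2, and an operator setting of -100 mV for
# n(ED - Ez), the sweep would be interrupted at ED = -500 mV.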
The influence of ED on the surface concentrations of species in the redox couple, O and R, is shown in Table II. The calculations for Table II are based on the Nernst equation and reversible behavior. Also, it should be noted that Ez-E1/2 is -28.5/n mV for the 1st-derivative measurement, and -66.5/n mV for the 2nd-derivative measurement.

TABLE II-Interrupt Potential Required for Specified Surface Ratio, CO/CR

CO/CR      ED-E1/2, mV
1/1        0
1/3        -28.5/n
1/10       -59.1/n
1/50       -100/n
1/100      -118/n
1/1000     -177/n
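The entries of Table II can be regenerated from the Nernst equation at 25 deg C under the stated assumption of reversible behavior, ED - E1/2 = (59.1/n) log10(CO/CR) mV. The short check below, with hypothetical naming, closely reproduces the tabulated values:

import math

def interrupt_potential_mv(ratio, n=1):
    # Potential relative to E1/2 (mV) that fixes the surface ratio CO/CR at 'ratio'
    return (59.1 / n) * math.log10(ratio)

for ratio in (1.0, 1/3, 1/10, 1/50, 1/100, 1/1000):
    print(f"CO/CR = {ratio:.4g}: {interrupt_potential_mv(ratio):.1f}/n mV")
# Prints 0.0, -28.2, -59.1, -100.4, -118.2, -177.3 (per n), in close agreement
# with Table II.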
Results

Two different two-component systems were studied in this work. The first system consisted of Tl(I) and Pb(II) in 1.0M NaOH electrolyte. The half-wave potentials for these two species are separated by


approximately 280 mV. The second system studied was that of Pb(II) and Cd(II) in a 2M ammonium acetate-acetic acid electrolyte. The E1/2 separation for this case was approximately 150 mV. All runs were made at a scan rate of 1.00 V/sec. Data points were taken at 1- or 2-mV intervals, and both the first- and second-derivatives of the reduction currents were observed for each system.
Various experiments were applied to the two systems. These included normal SEP and interrupted-sweep SEP with computer-selected interrupt potential. The values of (ED-Ez) employed varied from 0/n mV to values which resulted in noticeable charging spike interference. The delay time, T', for the first peak in each example was always near to or equal to 1000 msec, since the first peaks were always quite large. The results of these tests are summarized in Table III. Also included in these tables are the theoretical estimates of overlap error based on the data from Table I.
TABLE III-Peak Derivative Measurements of Smaller Component With and Without Computer Interaction for Binary Mixtures

Mixture        Condition           ED-Ez, mV   % Std., I'p            % Std., I''p
30:1 Tl:Pb     w/o interrupt       --          50.8 (a)               92.6 (b)
               interrupted-sweep   0/n         89.9                   100.8
               interrupted-sweep   -20/n       92.8                   100.8
               interrupted-sweep   -100/n      97.9                   --
10:1 Pb:Cd     w/o interrupt       --          45.3 (c)               91.9 (d)
               interrupted-sweep   0/n         104.4                  100.0
               interrupted-sweep   -20/n       104.8                  86.9
               interrupted-sweep   -30/n       102.3                  --
100:1 Tl:Pb    w/o interrupt       --          no peak detected (e)   88.7 (f)
               interrupted-sweep   -40/n       70.8                   101.0
               interrupted-sweep   -100/n      83.5                   102.9
1000:1 Tl:Pb   w/o interrupt       --          no peak detected (g)   no peak detected (h)
               interrupted-sweep   -40/n       no peak detected       78
               interrupted-sweep   -100/n      no peak detected       98

Predicted errors: (a) -48.4%, (b) -5.2%, (c) -45.6%, (d) -8.2%, (e) -160%, (f) -17%, (g) -1600%, (h) -170%. (Predicted errors based on Table I and known concentration ratios.)

Figure 7-First-derivative curves for 30:1 [Tl(I)]-[Pb(II)] system
3.21 X 10^-4 M Tl(I), 1.05 X 10^-6 M Pb(II), 1.0M NaOH
Upper trace: W/O interrupt
Middle trace: With interrupt; ED-Ez = -10/n mV
Lower trace: With interrupt; ED-Ez = -40/n mV
The signals observed are: (A) Tl(I) signal; (C) Pb(II) signal; (B) and (D) charging spikes; and (E) and (F) current decay occurring during interrupt.

The 1st-derivative data for the [Tl(I)]-[Pb(II)] system show continued improvement with increasing (ED-Ez). However, beyond (ED-Ez) = -100/n mV, some distortion apparently is caused by the charging spike of the restarted sweep overlapping slightly with the Pb(II) signal.
The error due to overlapping reduction signals shown in the 2nd-derivative data for the [Tl(I)]-[Pb(II)] system is small even without the interrupted-sweep. (This is predicted, of course, from Table I.) The


improvement with the interrupted-sweep is significant, however. The experimental effects of the interrupted-sweep for both the first- and second-derivatives can be visualized in Figures 7 and 8.

Figure 8-Second-derivative curves for 30:1 [Tl(I)]-[Pb(II)] system
3.21 X 10^-4 M Tl(I), 1.05 X 10^-6 M Pb(II), 1.0M NaOH
Upper trace: W/O interrupt
Lower trace: With interrupt; ED-Ez = -10/n mV

The first-derivative data for the [Pb(II)]-[Cd(II)] system, using the interrupted-sweep experiment, show some contribution to the Cd(II) peak from the charging spike, even with small values of (ED-Ez). This interference is illustrated in Figure 9, and results in a small positive error for the second peak.

Figure 9-First-derivative curves for 10:1 [Pb(II)]-[Cd(II)] system
1.05 X 10^-4 M Pb(II), 1.11 X 10^-6 M Cd(II), 2M NH4OAc, 2M HOAc
Upper trace: W/O interrupt
Lower trace: With interrupt; ED-Ez = 0/n mV

The width of the charging current spike for both the first- and second-derivatives is observed to be approximately 100 mV. Thus, if ED is to be set cathodic of Ez, the end of the first signal peak and the start of the second signal peak must be at least 100 mV apart. This is a significant limitation on the interrupted-sweep experiment.
The second-derivative data for the [Pb(II)]-[Cd(II)] system show better overall results than the first-derivative data. This was to be expected, on the basis of earlier studies5,7 and the data of Table I. It was observed that for both systems the peak integral data, Q'p and Q''p, show greater error than the peak


height data. This is not unexpected, because the area from the peak base is missed when peak overlap occurs. Thus, peak areas should not be the primary source of analytical data.
The results for [Tl(I)]-[Pb(II)] 100:1 and 1000:1 mixtures in 1.0M NaOH show a considerable improvement in quantitative resolution when the interrupted-


sweep experiment is employed. For the 100:1 mixture, the second peak is undetectable with a 1st-derivative read-out (see Figure 10). With the interrupted-sweep, however, not only is the second peak detectable, but the measured value comes up to 84 percent of the correct value, for 1st-derivative read-out, and 100 percent for 2nd-derivative read-out. For the 1000:1 mixture the second peak is undetectable, even with a 2nd-derivative measurement. Employing the interrupted-sweep experiment with 2nd-derivative read-out, however, reliable and quantitative detection could be obtained.

Observations
The work described here was intended to demonstrate
that an on-line digital computer could be used to
optimize an experimental measurement technique by
real-time interaction with the experiment. The results
clearly show a dramatic improvement in quantitative
resolution of overlapping reduction signals, provided a
minimum El/2 separation of about 150 mV is present.
Thus, the optimized measurement is subject to at least
this one severe limitation but nevertheless appears quite useful. Most importantly, the principle of real-time computer-optimized measurements in electroanalysis was demonstrated by application to a real
system.
COMPUTERIZED EXPERIMENTAL DESIGN
OF INTERACTIVE INSTRUMENTATION

Figure 10-First-derivative curves for 100:1 [Tl(I)]-[Pb(II)] system
2.68 X 10^-4 M Tl(I), 2.63 X 10^-6 M Pb(II), 1.0M NaOH
(Arrow shows Pb(II) peak)
Upper trace: W/O interrupt
Middle trace: W/O interrupt; sensitivity increased 5X
Lower trace: With interrupt; ED-Ez = -40/n mV; sensitivity same as middle trace

The experimental method described above illustrates
the application of an on-line digital computer to
generate, evaluate, and optimize a new electroanalytical
approach. The general-purpose laboratory computer is
well-suited for this task because of the ease with which
programmed control functions can be modified during
the development of experimental control characteristics.
However, having arrived at an optimum set of experimental control features, the continued dedicated use of
an on-line computer for routine application of the
technique might not be economically feasible. Thus, a
more practical approach should be taken to adapt the
technique developed with the general-purpose computer
system for routine laboratory application. In another
publication,3 Jones and Perone described the incorporation of the optimum parameters determined from the
earlier work,1 summarized above, into a specialized
instrument designed to generate the interrupted-sweep
experiment without the need for an on-line computer.
This later work3 demonstrated the value of the com-


puter-controlled experimentation for the design of
interactive experimental techniques that can then be
hardware-implemented. It also demonstrated that
many programmed control operations can be converted
easily to hardware logic and analog functions by using
medium-scale integrated circuit (MSI) modules.14
The optimum parameters selected from the earlier
work were incorporated into a device in which few
manual operations were needed and which, as a result,
would implement the technique for most, but not all, of
the cases studied. Detailed descriptions of the instrumentation and approach are presented in Reference 3.
Essentially the same functions as in the computerized
technique were provided, but some changes were made.
Only the 2nd-derivative peak height, I''p, was used to extract quantitative information. The first-derivative zero crossing, E'z, was used for qualitative identification,
rather than the peak center of the 1st- or 2nd-derivative
which was used in the computer-controlled technique.
This was because of the ease with which the zero crossing
could be detected with a hardware comparator. The
peak width of the second derivative was used as an
indication of n,5 rather than the ratio of the peak height
to its integral as in the earlier work. This was because
the 2nd-derivative peak-width measurement was much
easier to accomplish with hardware and gave very
reproducible results. The interrupt time delay was
proportional to the second-derivative peak height, as it
was for the computerized approach, and was again
limited to between 100 and 1000 msec. Because the
earlier work had shown that adequate resolution could
be obtained up to at least 100:1 mixtures for ED = Ez,
no provision was made in the hardware device for
adjusting ED-Ez in normal automated operation.
When the hardware instrumentation approach3 was
compared with the computerized experimentation of the
earlier work,1 several observations were made. First, the
analytical results and limitations were essentially
identical for the two approaches. The only exception
was that the hardware instrumentation failed for a
1000:1 mixture. This was because the hardware device used a slightly less sensitive, though more reliable, peak detection method. Also, quantitative resolution of 1000:1 mixtures would require minimum peak separations of about 250/n mV and a value of ED-Ez of -100/n mV.1 Because the hardware device was
designed for completely automated operation, ED was
set to always equal Ez, and this allowed application to any system where peaks were separated by 150 mV or greater, up to peak ratios somewhat greater than 100:1.
The limitations mentioned above reflect the objective
of hardware development. The device was built to
handle most analytical situations with a minimum of

operator manipulations. Moreover, the cost was only
about five percent of the computerized system. The
device is much simpler to operate than the computerized
instrumentation because only the initial cell potential,
amplifier gain, and sweep mode (with or without sweep
interrupt) need to be specified by the operator before
an experiment. Nevertheless, it provides truly interactive instrumentation. Although the feedback is less
flexible or "intelligent" than in the computerized
system, it is optimized. In addition, the hardware
device offers one other distinct advantage over the
computerized approach. The sweep rate of the computerized system was limited by the time necessary to
perform the real-time calculations. This amounted to
about 800 μsec between data points, which limited the
data rate to about 1 KHz. If data are taken at each
mV during the sweep, this limited the sweep rate to
"about IV/sec. In the hardware device, this limitation
does not exist as the hardware can perform several
logical, arithmetic, storage, and control operations in
parallel; thus, fewer real-time steps are required. With
proper modifications to the potentiostat, current
follower, differentiators, and filters, the interrupted-sweep experiment could be run at sweep rates of
100 V/second. With appropriate scaling of T', this would
allow much shorter experimental times and might be
useful for the analysis of unstable systems.
The most important point here, though, is that the
hardware instrumentation could not have been designed
readily without the prior investigative study1 using the

Figure 11-SEP curves for real 1:1 [In(III)]-[Cd(II)] system
Peak potential separation: 48 mV
3.84 X 10^-6 M In(III), 3.95 X 10^-6 M Cd(II), 1.0M HCl
Voltage range shown: -0.300 to -0.800 V vs. S.C.E.
Maximum peak current: 3.7 μA


on-line general-purpose computer. The optimum design
parameters-as well as the very feasibility-of the
interrupted-sweep technique were evaluated with the
computerized experimentation.
COMPUTERIZED RESOLUTION OF
CLOSELY-SPACED PEAKS IN STATIONARY
ELECTRODE POLAROGRAPHY
The experimental work described above demonstrated how computerized instrumental interaction
could improve the quantitative resolution capabilities
of SEP. However, that approach fails for peaks separated
by less than about 150 mV. An alternative approach must be used to handle electroanalytical samples where SEP peaks are more severely overlapped. One approach taken has been presented by Gutknecht and Perone.2
The approach involved extracting the analytical
information from SEP data using mathematical
deconvolution techniques. An empirical equation was
developed which describes the general stationary
electrode polarogram for a wide variety of electroactive
species. The function is fit to a number of standard
polarograms, and the constants of the function, as
specifically determined for each species, are stored in
computer memory. Upon analysis of an unknown
mixture, these constants are used to regenerate the
standard curves, a composite of which is then fit to the
unknown signal. In the fitting process, account is taken
of overlap distortion as well as experimental fluctuation
of peak potentials.
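The fitting step can be pictured as a linear combination of regenerated standard curves adjusted to the unknown signal. The sketch below is schematic only: it omits the empirical equation of Reference 2 and the allowance for peak-potential shifts, and simply chooses amplitudes by linear least squares over stored standard curves sampled on the same potential grid:

import numpy as np

def fit_composite(unknown, standards):
    # unknown: SEP current samples for the mixture (1-D array)
    # standards: list of normalized standard curves, one per species
    basis = np.column_stack(standards)        # each column is one standard curve
    amplitudes, *_ = np.linalg.lstsq(basis, unknown, rcond=None)
    return amplitudes                         # scale factors ~ concentrations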
The small computer was used to perform several
different functions in the development of the on-line
electroanalytical system. Primary among these were
experimental control, timing, synchronization, and data
acquisition functions. In addition, with the aid of an
oscilloscopic display system, off-line simulation studies
were carried out to evaluate empirical equations
developed for later on-line data processing. Finally,
the computer was used to process SEP data acquired
on-line for qualitative and quantitative information.
The processing approach proved especially valuable for

Figure 12-SEP curves for synthetic 1:1, 1:5, and 5:1 mixtures of [In(III)]-[Cd(II)]
Peak potential separations 38-42 mV
Voltage range shown: -0.300 to -0.800 V vs. S.C.E.
Upper trace: 4.80 X 10^-6 M In(III), 4.94 X 10^-6 M Cd(II), 1.0M HCl; maximum peak current: 4.8 μA
Middle trace: 0.960 X 10^-6 M In(III), 4.94 X 10^-6 M Cd(II), 1.0M HCl; maximum peak current: 3.0 μA
Lower trace: 4.80 X 10^-6 M In(III), 0.988 X 10^-6 M Cd(II), 1.0M HCl; maximum peak current: 3.5 μA


the analysis of mixtures of similar concentrations where
the overlap was so severe as to preclude visible recognition of the individual signals.
A discussion of the numerical deconvolution approach
is beyond the scope of this paper, and the reader is
referred to the original work for details.2 However, a
brief summary of the results of that work can be
presented.
It was shown that for mixtures of similar concentrations of In(III) and Cd(II) in 1.0M HCl, with a peak potential separation of 48 mV, it was possible to detect and quantitatively resolve the overlapped peaks with relative errors on the order of 1 to 2 percent. A polarographic trace for a 1:1 mixture is shown in Figure 11. The approach was applicable to mixtures with concentration ratios as great as 10:1.
To establish the limiting peak separation which could
be handled by the deconvolution technique, synthetically generated polarograms of In(III)-Cd(II) mixtures were used where peak separations were varied.
The limiting peak separation was found to be about
40 mV. Mixtures with concentration ratios as great as 5:1 could be qualitatively identified and quantitatively resolved with about one to two percent relative errors. By contrast, simple simultaneous-equation calculations led to relative errors on the order of 10 to 35 percent.
Moreover, the visual detection of the two individual
peaks was not possible, as shown in Figure 12.
It should be obvious that the numerical deconvolution approach and the computerized interaction
approach are complementary in many ways. The latter
is applicable to widely-spaced peaks with large concentration ratios; the former is applicable to very
closely-spaced peaks, but can handle more limited peak
ratios. Both approaches represent a significant enhancement of measurement capabilities in stationary electrode polarography.
CONCLUSIONS
Certain observations should be made here. First of all,
real-time computer interaction with experimentation is
not the answer to all measurement problems. In some
cases, in fact, it makes the problem worse. For example,
the interrupted-sweep approach is completely inappropriate for SEP measurements of closely-spaced
reduction peaks. One is tempted to generalize and state
that the interactive approach fails when the interaction
distorts the fundamental processes of interest. It was
shown here how one might use the numerical analysis
capabilities of the small computer for solution of these
measurement problems.

A second point is that one may not need to devise a
real-time interaction scheme to achieve computerized
optimization of experimental measurements. A perfectly
adequate approach might involve an iterative method
where the computer is programmed to analyze the data
from a completed experimental run; make decisions
regarding modification of controlled parameters for
improved measurements; and then reinitiate the
experiment under new conditions.
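Such an iterative, between-run scheme is easy to picture in outline. In the sketch below, run_experiment and analyze stand in for the laboratory control and data-reduction steps; both names, and the stopping rule, are hypothetical:

def iterative_optimization(params, run_experiment, analyze, max_runs=10):
    # Repeat complete runs, letting the post-run analysis propose new
    # controlled parameters, until no further improvement is reported.
    best = None
    for _ in range(max_runs):
        data = run_experiment(params)              # one complete experimental run
        result = analyze(data)                     # post-run, not real-time, analysis
        if best is not None and result["merit"] <= best["merit"]:
            break
        best = result
        params = result["suggested_params"]        # modified conditions for the next run
    return best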
A third point to be made here is that, as demonstrated
above, it may not be necessary to require a digital
computer for implementation of real-time interactive
instrumentation. However, the investigation of the
approach and the establishment of the optimum mode
of interaction are greatly facilitated by the on-line
digital computer. Subsequent hardware implementation
of the approach can be straightforward and economical.
A final observation to be made here is to attempt to
define in general the experimental situations where
real-time computerized interaction is advantageous
and/or necessary for optimization of measurements.
These situations seem to include those where separate
dynamic experiment-associated chemical or physical
processes occur which interfere with the measurement
of interest at a particular time during the experiment.
If the interfering processes can be independently
evaluated by real-time computations, computerized
interaction may be advantageous. If post-mortem
analysis of unoptimized experimental measurements
does not provide adequate information for subsequent
experimental modifications, real-time computer interaction may be necessary for optimization.
It would be presumptuous on the part of this author
to describe specifically how other measurement techniques might be optimized by real-time computer
methods. Only the experienced worker skilled in
the particular analytical method has the appropriate
understanding and intuition for proper experimental
design. However, several analytical methods have been
recognized as being amenable to optimization by real-time computer measurements. These include gas chromatography,15 kinetic and other clinical methods of analysis,16,17 as well as coulometric analysis.18 Undoubtedly, many such applications will be developed
in the near future.

ACKNOWLEDGMENTS
The support of the National Science Foundation,
Grants No. GP-8677 and GP-21111, is gratefully
acknowledged.


REFERENCES
1 S P PERONE D O JONES W F GUTKNECHT
Analytical chemistry Vol 41 1154 1969
2 W F GUTKNECHT S P PERONE
Analytical chemistry Vol 42 906 1970
3 D O JONES S P PERONE
Analytical chemistry Vol 42 1151 1970
4 R S NICHOLSON I SHAIN
Analytical chemistry Vol 36 706 1964
5 S P PERONE T R MUELLER
Analytical chemistry Vol 37 2 1965
6 C V EVINS S P PERONE
Analytical chemistry Vol 39 309 1967
7 S P PERONE J E HARRAR F B STEPHENS R E ANDERSON
Analytical chemistry Vol 40 899 1968
8 G LAUER R ABEL F C ANSON
Analytical chemistry Vol 39 765 1967
9 G LAUER R A OSTERYOUNG
Analytical chemistry Vol 40 30A 1968
10 Handbook of operational amplifier applications
Burr-Brown Research Corporation Tucson Arizona 1963
11 Introduction to programming
Digital Equipment Corporation Maynard Massachusetts 1969
12 S P PERONE
Journal of chromatographic science Vol 7 714 1969
13 P E REINBOLD
MS thesis Department of Chemistry Purdue University Lafayette Indiana 1968
14 J S SPRINGER
Analytical chemistry Vol 42 23A 1970
15 R G THURMAN K A MUELLER M F BURKE
Journal of chromatographic science Vol 9 77 1971
16 G E JAMES H L PARDUE
Analytical chemistry Vol 41 1618 1969
17 G P HICKS A A EGGERT E C TOREN JR
Analytical chemistry Vol 42 729 1970
18 F B STEPHENS F JAKOB L P RIGDON J E HARRAR
Analytical chemistry Vol 42 764 1970

The television/computer system-The acquisition and
processing of cardiac catheterization
data using a small computer*
by H. DOMINIC J. COVVEY, ALLAN G. ADELMAN, CLARENCE H. FELDERHOF,

PAUL MENDLER, E. D. WIGLE and KENNETH W. TAYLOR
Toronto General Hospital
Toronto, Ontario, Canada

INTRODUCTION


One of the prime objectives of cardiovascular research is to assess the functional state of the heart, especially the left ventricle, its main pumping chamber. The functional state of the left ventricle is determined by its dimensions,1 volume,2 the velocity of wall movement,3 the intracavity pressure,4 and the wall tension and stress.5,6,7,8
This paper describes a semi-automated technique for
obtaining parameters which indicate the functional
state of the heart from left ventricular cineangiograms
(35 mm. X-ray cine films of the heart taken while injecting a contrast material into the pumping chamber)
and simultaneously recorded intracardiac pressures.
Doing the measurements necessary to obtain function
data is tedious and time consuming as dimensions must
be measured from each frame of a cineangiogram of the
left ventricle taken at 60 frames per second over several
seconds, and correlated with the instantaneous cavity
pressure. Displaying resultant function parameters in
an intelligent fashion is also critical if they are to be
useful to the clinician.
Two years ago the Cardiovascular Laboratory at Toronto General Hospital undertook to quantitatively assess the functional state of the human left ventricle. The resources available to this department dictated that any application of data processing equipment be modest, and that the equipment be housed within existing space. We, therefore, chose a small computer, optimizing on both the low cost and expandability of a local on-line system and the availability of software, peripherals and interfacing electronics. This equipment, in conjunction with a standard broadcast television system, provides a powerful data acquisition and processing facility which occupies one office in the unit and has the capacity to deal with the large volume of multi-formatted data (digital, analog, pictorial, patient records) resulting from heart investigation procedures. The total cost of the system has been about $100,000 for hardware and $20,000 for personnel. It is being used for research and development and is more sophisticated than necessary for routine work. For routine work a simpler configuration can be used. For example, a minimum system might include: a PDP-8/E ($5,000), a teletype ($1,700), disc or tape ($8,000-$10,000), interfacing ($2,000-$4,000), and some basic television equipment ($5,000), or not more than $27,000.
Other techniques have been developed for obtaining this information. They range from totally automated border recognition and dimension extraction systems9,10,11 to hand measurements.12,13 Between these extremes, there are: (a) semi-automated systems similar to ours,14,16 (b) a light pen system15 and (c) various techniques for obtaining border coordinates by standard X-Y position digitizers or scanners.17,18,19,20 Totally automated systems involve long setup procedures, the digitization of entire pictures, and the presentation of this data to a large computer for analysis. This is often slow and extremely expensive. On the other hand, manual measurements are tedious, time consuming, and the resultant data often requires some machine processing.
Three systems similar to ours23 have been developed. Two use a dimension measuring interface similar to the one described in the text and illustrated in Figures 3-7: (1) the Bugwatcher,21 which uses a much more expensive computer, is very similar to ours in design, but is not used for cardiovascular work, and (2) an analog

* Work supported by the Ontario Heart Foundation and
Toronto General Hospital



Figure 1-Diagram of the Television/Computer System. The computer (DEC PDP-8/I) can transfer data to or receive it from a number of peripherals (see Table II, text). The television system is composed of a master synchronization generator and sync and video distribution circuits (see Table I, text for details). In addition, there are the interfaces which connect the computer and television systems. These include: the dimensional analysis interface (DAI), the light pen and light pen interface (LPI) and the analog-digital (A/D and D/A) converters. Certain peripherals are shared by both systems, permitting easy communication between them.

system,22 which, although real-time operation appears
possible, suffers the limitation of having to depend on
an expensive video magnetic disc and of making only
area, length and one left ventricular width available as
data. The third15 uses a light pen similar to that described here, but employs a scan converter and storage
oscilloscope instead of a magnetic disc recorder for refreshing the display, and is less flexible in use.


THE TELEVISION/COMPUTER SYSTEM
The system we have designed (Figure 1) is based upon interfacing a television system with a small computer. The television system (Table I), standard 525-line broadcast equipment, is inexpensive but very flexible. We use television as a brightness/voltage, dimension/time converter for pictures. In addition, available television circuits permit a number of useful functions, e.g., selecting parts of a picture for examination, changing the quality of a picture, superimposing windows or other signals on a picture, recording televised signals, and presenting calculated or plotted data.
The main features of this work are the interfaces which permit the analog signals from the television system to be converted into digital format for input to the computer and which allow the computer, in turn, to communicate with the television system.

TABLE I-The T.V. System
The main sub-elements of the Television System are:
1. Monitors (CONRAC, SONY, H/P)
2. Number and bar generating circuits
3. Television cameras (SHIBADEN)
4. Magnetic tape (SONY 2")
5. The Vista 1 H
6. The Video Disc


TABLE II-The Computer System
The computer peripherals available to the 4K PDP-8/I are:
1. Magnetic tape (DEC, TU-55)
2. 32K Magnetic Disc (DEC, DF-32)
3. A/D, D/A converters, relay drivers, pulse inputs (DEC, AX08)
4. Oscilloscope display (TEKTRONIX, RM 503)
5. Teletype (ASR-33)
6. 150 LPM printer and graphic output terminal (LEIGH, ALPHAGRAPHIC)
7. 4800 Baud CRT terminal (INFOTON, VISTA 1 H)
8. Video disc via the AX08 (COLORADO VIDEO, VIDEO PLOTTER)

The interfaces ensure that the dimension data measured from each
television line in a picture are identified with that line,
and are presented to the computer at an acceptable
rate.
The computer, a 4K PDP-8/I (recently upgraded to 8K), has available to it a number of standard peripherals (Table II) for inputting and outputting the data
and for combining dimension data with other measurements during catheterization.


Figure 3-Cutouts, and pictures from film or videotape can be
measured by the DAI. The video signal from these pictures is
adjusted by means of the camera control and presented to the
DAI for analysis. A special effects window, which allows the
manual selection of a portion of the picture for analysis, is mixed
with the video signal and appears on the monitor. The switching
waveform from this special effects window is input to the DAI.
The DAI can feed back the measured image to the monitor and
transfer the measurements to the computer which can display,
record or print the dimensions.

Presently, these measurements are made after the
films obtained during cardiac catheterization are processed and returned to the unit. This entails a delay of
about 24 hours before the catheterization data is available. However, ultimately our aim is to obtain data directly from video tape recordings of the angiogram and to play back the results into the television system
during the catheterization. 22 ,23
CARDIAC CATHETERIZATION


Figure 2-A diagram of the cineangiographic system used in
these studies. The X-ray image of the left ventricle is brightened
electronically and a 35 mm. cine camera photographs the images
at 60 frames per second. The image is also relayed via a television
camera to a television monitor and a video taperecorder. The cine
film is later projected by a 35 mm. projector onto a screen where
the left ventricular silhouette can be outlined. If the spatial
relationships of the various interfaces are kept constant, the only
variable affecting magnification in the system is the midplane
of the left ventricle.

Cardiac catheterization is done by introducing
catheters (small, 2 mm. O.D., tubes) into the peripheral arteries or veins and advancing these into the
heart chambers. The catheters are used to record pressure or to inject radio-opaque material to obtain highcontrast serial X-ray pictures of the chambers of the
heart. In left ventricular function studies, one catheter
is used for injecting contrast material and one for
monitoring pressure. The main data obtained from this
procedure are: (a) sequential films of the left ventricle
(left ventricular cineangiograms) and (b) the analog
left ventricular pressure recording.
In our work the angiograms have been recorded on
35 mm. cine film at 60 frames per second directly from
an image intensifier (Figure 2). However, it is also possible to process full-size films from a high-speed film changer for greater detail, or 16 mm. cine film of the televised X-ray image (kinescope recording) when television format would be advantageous for presenting other data on the frame. The pressure is recorded on photosensitive paper in an Electronics for Medicine oscillographic recorder, and also directly on film.



Figure 4-Schematic diagram showing how measurements are
obtained from a television picture. When a line of the T.V. scan
crosses the left edge of a dark image, a fall in the video voltage
below a Schmitt trigger threshold starts a 10 MHz clock; when the line crosses the right edge of the image, the rise in the video voltage
stops the clock. The number of clock pulses is proportional to the
width of the image and the number of television lines crossing the
image is proportional to its length. The threshold can be adjusted
to define the edge of the image and a special effects window allows
manual selection of the area of the picture which contains the
image to be measured.
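The measurement principle in Figure 4 amounts to gating a counter while the video voltage of one scan line sits below the trigger threshold. A minimal sketch with hypothetical names (in the hardware the count comes from the 10 MHz clock rather than from sample indices):

def line_width_counts(video_line, threshold):
    # Return the number of samples between the fall below and the rise above
    # the threshold on one television line, i.e., the dark-image width.
    start = None
    for i, v in enumerate(video_line):
        if start is None and v < threshold:
            start = i                      # left edge: voltage falls below threshold
        elif start is not None and v >= threshold:
            return i - start               # right edge: voltage rises back above
    return 0 if start is None else len(video_line) - start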



LEFT VENTRICULAR FUNCTION DATA

Dimension data
The Dimensional Analysis Interface (DAI)

Dimensions are measured from 35 mm. cine film projected onto paper by a stop-frame projector (Tage Arno). A technician outlines the ventricle in each frame, draws axes on the outlines, cuts them out and puts them one-by-one on a light box where they are viewed by a television camera. Alternatively, dimensions may be measured directly from each 35 mm. cine frame (Figure 6). The results obtained directly from film are less consistent than those obtained from cutouts. However, with model studies,25 the differences between measurements from film and cutouts were not significant.
The television image is input to the Dimensional Analysis Interface (DAI). The dimensional analysis interface measures the width of the image on each television line by a threshold technique (Figure 4) and makes the width and corresponding line number available to the computer. The way this is done is shown in detail in Figures 3-6. Programs calibrate the measured widths and length (proportional to the number of television lines crossing the left ventricular image) and calculate scaled widths, the area, the length and volume. The latter is calculated assuming that the left ventricle is circular in latitudinal cross-section.26 The raw data is output on paper or magnetic tape (Figure 7). The tape record is later processed by programs which compute wall tension, stress, velocity and plot any selected dimension data.
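The geometric step, widths on successive lines summed into length, area, and a volume built from circular cross-sections, can be sketched as follows. The calibration factors and function names are assumptions for illustration, not the VOLZ/WVOL code itself:

import math

def ventricular_dimensions(width_counts, mm_per_count, mm_per_line):
    # width_counts: DAI width (clock counts) for each TV line crossing the image
    # mm_per_count, mm_per_line: calibration factors (cf. the WVOL-type program)
    length_mm = len(width_counts) * mm_per_line
    area_mm2 = sum(w * mm_per_count for w in width_counts) * mm_per_line
    volume_mm3 = sum(math.pi * (w * mm_per_count / 2.0) ** 2
                     for w in width_counts) * mm_per_line   # stack of circular discs
    return length_mm, area_mm2, volume_mm3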


Figure 5-Schematic diagram of the DAI. Voltage transitions
in the video signal turn on or off a Schmitt trigger which has an
adjustable threshold. The output of the trigger is gated with the
signals from the computer, with the television FIELD drive and
BLANKING signals, and with the switching waveform from a
SPECIAL EFFECTS generator. When these input conditions
are satisfied, the time the trigger is on for each television line is
measured by the number of clock pulses accumulated in a "width"
register and is fed back to the television monitor as a bright line
(Figure 6). The line number is recorded in a second register. At
the end of each line during which the clock was on, the output
control transfers the width and line registers to their respective
buffers and indicates to the computer that it is ready for a
TRANSFER REQUEST. When a TRANSFER REQUEST is
received by the output control, the contents of the buffers are
copied into the computer accumulator through a series of output
gates and level converters.


Figure 6-Cutout (top) and film (bottom) television images (left) along with a representative video waveform (center) and the feedback images from the DAI (right). The "window" surrounds both the television and feedback images. The video waveform is from the
brightened line that crosses the upper part of the images. The video waveform of the cutouts has sharp edges, high contrast and is uniform,
whereas that of the film has a diffuse edge, lower contrast and is not uniform. The result is that the measurement of the cutout is much
more accurate and objective than that of film and that small changes in the threshold of the trigger will alter the measured size of the
film image but will have little effect on the measured size of the cutout.

The light pen and light pen interface (LPI)

Because obtaining cutouts for use by the dimensional
analysis interface is tedious and time consuming, the
light pen and light pen interface were developed to
input the left ventricular border directly to the computer and to further reduce the time involved in processing dimension data. With the light pen, the image
of the left ventricle displayed on a television monitor
from 35 mm. film or video tape, is outlined manually,
the coordinates of the border being input by the interface directly to the computer. This system is illustrated
in detail in Figures 8-10. Two monitors are used in
practice, one to allow viewing of the picture with adequate contrast, the other for use as a "tablet". All the
operator needs to do is advance the film frame he

wishes to process, outline the border as he recognizes it
by eye, and indicate to the computer that the border
is complete. The video disc constantly refreshes the
track of the light pen and keys this into the picture the
operator is viewing. The computer is interrupted 60
times per second to accept the X and Y coordinates of
the position of the light pen. Multiple terminals are
thus easily accommodated even by a small computer.
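With border coordinates arriving 60 times per second, dimensions can be derived once the outline is closed. The paper does not specify the numerical method; the sketch below uses the shoelace formula for the enclosed area as one plausible choice, with calibration applied afterwards:

def border_area(points):
    # points: list of (x, y) light pen coordinates outlining a closed border
    area2 = 0.0
    n = len(points)
    for i in range(n):
        x0, y0 = points[i]
        x1, y1 = points[(i + 1) % n]
        area2 += x0 * y1 - x1 * y0
    return abs(area2) / 2.0                # in raster units; calibrate to cm^2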

Analog signals: pressure
The trace marker

When first using the dimensional analysis interface
(DAI) to process cutouts, pressures were obtained from



Figure 7-These are flow charts of two programs: VOLZ which
calculates the dimensions of images and WVOL which calculates
the calibration factors. The latter can be run in a repetitive mode
to check the operation of the system and detect electronic faults.
Because of the high data rates, both VOLZ and WVOL simply
store the measured widths and their corresponding line numbers
in assigned memory areas until the buffers are full. Thus, several
scans of the same image are available. Checks are done to ensure
that no lines were missed and that each discrete image is chosen
from the several scans stored in memory. This is done by checking
for sequential line numbers.

the oscillographic tracings on which the time of occurrence of each cine frame was marked (Figure 11). A
line was drawn by hand from each mark to the left
ventricular pressure tracing and the pressures read off,
calibrated and tabulated by hand. This method of obtaining instantaneous pressures was used on 80 cases
in conjunction with the DAI and cutouts. It is tedious
and involves many possible inaccuracies. In particular,
there is the possibility of losing the time correlation
between the cine frames and the pressure tracing.
To ensure correlation between the film frames and
the instantaneous left ventricular pressure it is necessary to guarantee that neither the oscillographic paper
nor the film has stopped. Also, frames and trace
marker pulses must be counted accurately until the
heart cycle of interest is reached. To assure us of the

Figure 8-The track of the light pen placed against a monitor displaying a blank, bright raster (monitor #1) can be recorded by the video disc. This track may be keyed (SEG KEY) into the output of a television camera which is viewing a cine film (monitor #2), permitting an operator to outline areas of interest in the picture viewed by the television camera. Simultaneously, the digital horizontal and vertical coordinates of the position of the light pen can be output to the computer by the light pen interface.


Figure 9-Schematic diagram of the light pen interface (LPI).
The FIELD drive pulse clears the HCR (Horizontal Coordinate
Register) and the VCR (Vertical Coordinate Register) and
provides input conditions to the CLOCK GATE and the
INHIBIT. During a television line the BLANKING signal
provides a condition to the INHIBIT. At the beginning of each
line the BLANKING adds a count to the VCR if the VCR GATE
is not inhibited and starts the clock if the CLOCK GATE is
not inhibited. The output of the clock is recorded in the HCR.
If a light pen pulse (LPP) does not occur before the end of a line,
the BLANKING clears the HCR through the LINE CLEAR
GATE. If an LPP occurs during a line, a Schmitt trigger (ST)
turns the INHIBIT on. The latter INHIBITS the VCR gate,
the LINE CLEAR GATE and the CLOCK GATE effectively
freezing the contents of the HCR and VCR. The INHIBIT also
provides a signal to the output control indicating to the computer
that the contents of the HCR and VCR are available for transfer.


Figure lO-The track of the light pen on a blank, bright raster (top, right). This track has been keyed into a picture of a left
ventricular cineangiogram (bottom left) and imaged by a television camera onto a second monitor (top left). By using the bright, blank
raster as the drawing tablet and by following the border as it appears on the picture, the operator has been able to outline the left
ventricular silhouette (bottom right) with the recorded track.

simultaneity of the two records, a second trace marker
was added which has every tenth mark accentuated
and which uses this accentuated mark to place a bright
dot on every tenth film frame. In this way the number
of errors due either to stopped recordings or counting
mistakes has been greatly reduced.
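The correlation task reduces to sampling the pressure record at each marker pulse, with every tenth (accentuated) pulse available as a cross-check on the frame count. A hedged sketch, assuming the marker times and the digitized pressure channel are already available as time-ordered arrays:

def frame_pressures(marker_times_s, pressure_times_s, pressure_values):
    # Return one pressure reading per cine frame by taking the pressure sample
    # nearest to each marker-pulse time (both time lists assumed increasing).
    readings = []
    j = 0
    for t in marker_times_s:
        while (j + 1 < len(pressure_times_s) and
               abs(pressure_times_s[j + 1] - t) <= abs(pressure_times_s[j] - t)):
            j += 1
        readings.append(pressure_values[j])
    return readings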

The frame marker

Another way of removing the correlation and frame
counting problems is to place the pressure directly on
the film frame and number the frames. For ease and
flexibility this was done by keying the analog pressure



signal (displayed as a horizontal bar), the frame number, and the digital value of the pressure into the video
output of the X-ray system (Figure 12) and kinescope
recording this on 16 mm. film. Alternatively, an optical
system is being developed to superimpose these signals
directly on the 35 mm. cine film. The pressure bar
may be measured automatically by the DAI. The
numerical indication of pressure is useful for manual
keyboard entry into the computer while using the
LPI. In each case, the computer calculates true pressure after being given the appropriate scale factors.


Figure 11-Pressure is recorded using a Statham pressure
gauge and displayed on a channel of the Electronics for Medicine
oscillographic recorder. Along with the pressure trace there are
two markers, one recording the output of a photodiode-fluorescent
screen combination in the pulsed X-ray beam and one redisplaying
this and accentuating every tenth pulse to facilitate counting up
to the particular heart cycle desired. Each pulse corresponds to
the exposure of a cine film frame. A mark also appears on every
tenth film frame to ensure trace-to-film correlation.
For manual digitization of the pressures a line is drawn using
these markers as a reference, and the point at which the line
intersects the pressure tracing recorded in units of height. These
values are later calibrated to true pressure. The change-over to
semi-automatic digitization involves sampling the recorded
pressure waveform at the time of occurrence of the peaks on the
marker channel.

Figure 12-Block diagram of the numerical pressure display on
film. The pressure (appearing as a voltage waveform) is filtered to
remove high frequency noise, and, at a resetable time after the
beginning of each television field, is sampled and analog-to-digital
converted to a Binary Coded Decimal number. This number is
decoded to 7-segment format. The number generating circuits
then create numbers on the screen by keying a bright pattern into
the video at the position set by the horizontal and vertical
position controls.

CONCLUSION

It should be noted that this system may be useful in
other areas where dimensional or geometrical analysis
of pictures is desired, e.g., area measurement of plane
objects (DAI), or measurement of the shape and size
of chromosomes (LPI). In addition, analog data presented as bars in a television picture or as televised
graphs or traces may be handled very easily; in the
first case automatically through the DAI or, in the
second, manually through the LPI. The LPI may also
be used to count and mark objects in a picture.
Lastly, the entire computer system is available as a
general purpose laboratory data acquisition and processing system. We have used it for regional myocardial
blood flow measurements, for the processing of Xenon
washout curves from the lungs, for statistical analysis,
for plotting, for digitizing selected television lines and
for the analog-to-digital conversion of electrocardiograms.
Using the television/computer system we have processed, in a 3½-month experimental run, the pressures
and dimension data from one complete cardiac cycle
(60-100 frames) for each of 80 patients. Other routine
studies are under way on both normal left ventricles
and on pre- and post-operative left ventricular function. The Cardiovascular Unit at Toronto General

Figure 13-Pictures of a television monitor with the pressure
bar and the numerical display mixed with the X-ray video. The
number on the left in each picture is the frame number, the one on
the right is the numerical value of the pressure. Two different
frames have been simulated to show how the display would look
for two different pressures.


Figure 14A-Mrs. M's ventricle exhibits very poor contraction,
a more rounded shape (more spherical than cylindrical) and
relatively no change in the long axis length or widths perpendicular
to this axis.


Figure 14B-Mr. T's ventricle has excellent left ventricular
contraction, an elongated shape (more cylindrical than spherical)
and major changes in the long axis and widths.

Figures 14A and 14B-Geometry. Four frames from a left ventricular cineangiogram are shown, one at the end (end-systole) and one
at the start (end-diastole) of the cardiac contraction cycle and two intermediate.

Hospital averages 6 investigations per day. We have
thus been able to process roughly 20 percent of the
cases available at the Unit. Using the hand techniques
for measuring volume, we would have been able to
process (obtaining volumes only) less than 5 percent
of the available cases. It is projected that we will be
able to process 50-75 percent of the available cases with
the total implementation of the light pen interface and
the numerical pressure indication system. This increase
can be achieved without the addition of any staff, and
with no additional expenditures on hardware. Furthermore, the system will be capable of providing more advanced information to the clinical staff than is currently
available.
RESULTS
To illustrate the kind of information generated by
the system the results of studies done on 2 patients,

one with very poor left ventricular function and one
with excellent left ventricular function, are shown in
Figures 14-19. Figure 14 shows that the poorly contracting ventricle exhibits little change in all dimensions (width, length and area), whereas the normal
ventricle shows marked reduction in its dimensions.
Figure 15 illustrates that the left ventricular widths
are larger and show little change from the start to the
end of contraction in the poor ventricle compared to
the normally contracting ventricle. Similarly, in Figure
16, the poor ventricle's volumes are larger, the rise
time slower, and variation smaller in contrast to the
normal left ventricle. The pressures (Figure 17) show
a slower rise time and a higher minimum pressure in the
poor ventricle compared to the good one. The pressure-volume correlation shown in Figure 18 demonstrates
that the stroke work (i.e., the work done in ejecting
blood) is much smaller in the poorly contracting ventricle, that this work is done at a higher pressure in a larger chamber, and that little blood is ejected. This means that the poor ventricle has a mechanical disadvantage relative to the normal ventricle. Finally, the wall tension (Figure 19) starts higher, rises more slowly, and remains higher longer in the poor ventricle compared to the good one, indicating that the poor ventricle uses more energy to do less stroke work.



Accuracy in these studies

The accuracy of the measurements of the left ventricle is most affected by picture quality, e.g., if the
border of the left ventricle is blurred, the position of
the edge cannot be precisely determined. Various arbitrary criteria may then be used to determine the edge
automatically and each will produce different measurements of the size of the left ventricle. For this reason,
we have chosen to rely on the eye to choose the border.
The eye is superior to the machine in recognizing the
boundary since it is very sensitive to small changes in
contrast and since it uses the whole picture to assist in


Figure 15B-In Mr. T's width plots, the marked change in
widths is apparent. The rise times are fast, indicating vigorous
contraction. All the traces are roughly in phase, indicating that
all areas are contracting together.

Figures 15A and 15B-Width plots. The plot of three widths from the left ventricle at ¼, ½, and ¾ of the way along the long axis.



Figure 15A-It should be noted in Mrs. M's width plots, that
the values are large, that there is little change from end-systole
(small widths) to end-diastole (large widths), and that the rise
times are slow. The width at the base of the heart (the top,
width 1) shows relatively better contraction compared to the
other two widths. Width 3 (at the apex) is out of phase with
number 1. These findings are in keeping with the visual findings
of a lack of contraction in this area.



judging the border. This is put to use in the LPI and
in the making of cutouts or the selection of the threshold when using the DAI.
The DAI, when measuring films and cutouts from
films of model ventricles, was generally within five percent of the true volume of the modeJ.25 In routine
X-ray films, however, the measured size of the ventricle
is very dependent on the threshold chosen. In these
cases, the eye must be used to judge the position of
the border.
But, even the eye may find it difficult to choose a
border, since the border may be very unsharp or irregular due to poor mixing of the radio-opaque dye and
the non-uniformities of the wall itself. Some of these
problems may be resolved by better injections, improved picture quality, more understanding of the
attenuation of the X-rays by the contrasted heart, and
more knowledge of the behavior of the left ventricular
wall irregularities (trabeculae carneae and papillary
muscles) during the heart cycle. Work needs to be
done in all of these areas to improve the accuracy and
meaning of the results obtained. This is particularly



Figure 16-The volume curves. It should be noted that these curves are displayed on one sheet for convenience only and that they are not in phase, since the two hearts are beating at different rates.
Mrs. M's (#1) ventricular volumes are large and there is little change in volume during contraction. The rate of change of volume is slow and the usual characteristic features of volume changes during a heart cycle are not evident.
In Mr. T's (#2) curve there is a large change in volume with a good rise time. Although noisy, the curve seems to show a rapid-filling
phase (frames 15-20), a reduced-filling phase (frames 20-27), atrial systole (the contribution of the contraction of the atrium) (frames
36-43) and an ejection phase (frames 43-60). These features are not apparent in Mrs. M's ventricular volume plot.


Figure 17-Pressure plots. These are the pressure tracings for the two ventricles. The curves are not in phase. The features that are important are: Mrs. M's peak pressure is lower than that of Mr. T, whereas Mr. T's minimum pressure is lower than that of Mrs. M. The higher minimum (diastolic) pressure in Mrs. M's ventricle is typical of a large, dilated, poorly contracting ventricle. Mrs. M's pressure rise is more gradual, indicating a more poorly contracting ventricle than Mr. T.


[Figures 18A and 18B plots: left ventricular pressure (mm Hg) versus volume (cm³) loops; L.V. injection, Mrs. M and Mr. T.]

Figure 18A-The main features of Mrs. M's loop are that its
area, which is proportional to the stroke work, is small. The
volume of blood ejected relative to the volume of the cavity is
small. Moreover, the centroid of the loop is shifted upward
(high pressure) and to the right (high volume) compared to a
normal ventricle.

Figure 18B-Mr. T's plot is characterized by a high stroke work
(area enclosed by the loop) and a shift in the centroid of the loop
downward and to the left as compared to Mrs. M.

Figures 18A and 18B-Pressure-volume plots. Smoothed pressure-volume plots for Mrs. M and Mr. T.

Other errors in left ventricular measurements arise
from changes in the X-ray system magnification (image
intensifier), spatial distortion by the intensifier, film
shrinkage, human error in outlining and cutting out the
paper for the cutouts, the spatial orientation of the left
ventricle when it is filmed, and the orientation of the
cutout while being measured. Studies of these errors
are in progress and techniques will be refined to reduce
them.
FUTURE OBJECTIVES
The light pen and light pen interface are becoming
operational on a routine basis. This speeds the processing time, allows more cycles per patient to be done, and

provides more information per frame (the coordinates
of the border are available and hence shape may be
measured). With the ability to process a larger number
of heart cycles per patient, we should be able to study
the stressing effects of drugs and other inotropic agents,
such as pacing, within a given angiogram or by repeat
angiograms. This should provide further information
about the functional state of the left ventricle.
Meanwhile, a systematic study of the X-ray system
and other factors affecting border recognition is being
pursued. This should eventually provide us with better
quality pictures.
A number of projects are under way in the development of the television/computer system:
(1) We will attempt to improve the automated system (DAI) by using more reliable border recognition criteria (presently merely a voltage threshold sensed by a Schmitt trigger; a software sketch of this kind of hysteresis thresholding follows this list) and also by enhancing the image presented to it, finally perhaps carrying out rapid measurements on videotape recordings and displaying the results during the catheterization.
(2) We are planning a more efficient semi-automated system involving improvements to the LPI and the use of the DAI in conjunction with it; for example, the LPI can be used to delineate areas to be quickly and automatically measured by the DAI (the left atrium and aorta may be cut off using the LPI, and the left ventricle measured by the DAI).
(3) More use of automation in the handling of analog data is being attempted.
(4) We are considering the development of a videodensitometric system to provide two important measurements: (a) of the blood flow, and (b) of the depth of opacified objects (assuming that the density of an image is due to the depth of absorbing material in the path of the X-rays).
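The border criterion mentioned in item (1) amounts to hysteresis thresholding: the video signal must rise above an upper level before a border is declared and fall back below a lower level before the detector re-arms, which suppresses chatter from noise. The sketch below only illustrates that principle in software; it is not the authors' hardware, and the scan-line values, thresholds, and function name are invented for the example.

def schmitt_edges(scan_line, low, high):
    """Detect border crossings along one video scan line using
    Schmitt-trigger (hysteresis) thresholding: the signal must rise
    above `high` to register an entry and fall below `low` to re-arm."""
    inside = False
    edges = []
    for i, value in enumerate(scan_line):
        if not inside and value >= high:
            inside = True
            edges.append((i, "enter"))
        elif inside and value <= low:
            inside = False
            edges.append((i, "leave"))
    return edges

# Invented noisy scan line: background, a dye-filled plateau, background.
line = [5, 6, 7, 40, 42, 39, 41, 38, 40, 8, 6, 5]
print(schmitt_edges(line, low=15, high=30))   # [(3, 'enter'), (9, 'leave')]

The separation between the entry and exit indices, scaled by the known magnification, would give the chamber width along that scan line; the thresholds here are arbitrary.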
Presently, the system is being assessed as to its usefulness in the clinical and biophysical studies now being
carried out.27 It has, however, already been valuable in
providing clinical data in the Cardiovascular Unit and
may have many applications for measurements on
pictures and analog signals elsewhere within the hospital, in research, and in industry.
ACKNOWLEDGMENTS
The authors are grateful to the technical staff of the
Department of Medical Engineering and Biophysics,
particularly to Mr. Paul Mendler for his work on the
DAI and LPI, to Mr. Roy Liggins for his excellent
technical assistance, to Mr. Donald Mills for constructing the DAI, to Mr. Robert Kubay for building the
trace marker circuits, to Mr. Robert Growcock for the
digital number display construction, and to Mr. Franz
Schuh and Mr. Louis Rostocker for their mechanical
assistance. In addition, the authors would like to express their appreciation to Mr. Eric Covington for his
contribution to the programming, to Mrs. Yasna Polic,
Mrs. Ulla Nordin and the Department of Art as Applied to Medicine for the diagrams, to Mr. Barry
Bassett for the photographs in Figures 6, 10 and 13, to
the Department of Medical Photography, Toronto
General Hospital for photographing the figures, and to
Miss Vivian Martin for her excellent secretarial assistance. This work was supported by the Ontario Heart
Foundation and Toronto General Hospital.
[Figure 19 plot; horizontal axis: frame of systole (frames 1 through 17).]

Figure 19-Wall tension plots. These plots of the wall tension of each ventricle are not in phase. The tension plots show Mrs. M starting at a higher tension than Mr. T, rising more slowly, peaking higher and later, and remaining higher longer than Mr. T. Mrs. M's curve is characteristic of a dilated, poorly functioning left ventricle, and Mr. T's of a fairly normal ventricle.

REFERENCES

1 A BURTON
The importance of the size and shape of the heart
Vol 54 No 6 p 801 1957
2 H T DODGE W A BAXLEY
Left ventricular volume and mass and their significance in
heart disease
The American Journal of Cardiology Vol 23 p 528
April 1969
3 E H SONNENBLICK W W PARMLEY
C W URSCHEL
The contractile state of the heart as expressed by the
force-velocity relations
The American Journal of Cardiology Vol 23 p 488
April 1969
4 I L BUNNELL C GRANT D C GREENE
Left ventricular function derived from the pressure-volume
diagram
American Journal of Cardiology Vol 39 p 881 December
1965


5 J H GAULT J ROSS E BRAUNWALD
Contractile state of the left ventricle in man
Circulation Research Vol XXII p 451 April 1968
6 H L FALSETTI R E MATES C GRANT
D C GREENE I L BUNNELL
Left ventricular wall stress calculated from one-plane
cineangiography
Circulation Research Vol XXVI p 71 January 1970
7 A Y KWONG P M RAUTAHARJU
Stress distribution within the left ventricular wall
approximated as a thick ellipsoidal shell
American Heart Journal Vol 75 No 5 p 649 May 1968
8 D D STREETER R N VAISHNAV D J PATEL
H M SPOTNITZ J ROSS E H SONNENBLICK
Stress distribution in the canine left ventricle during
diastole and systole
Biophysical Journal Vol 10 p 345 1970
9 R NATHAN
Digital video data handling
Technical Report No 32-877 Jet Propulsion Laboratory
California Institute of Technology January 5 1966
10 D A WINTER B G TRENHOLME D MYMIN
E L MYMIN
Computer processing of videoangiographic images noise
reduction image enhancement and data extraction
Canadian Cardiovascular Society Conference October
15-17 1970
11 C K CHOW T KANEKO
Boundary detection of radiographic images by a threshold
method
Unpublished manuscript received from the IBM
Corporation Thomas J Watson Research Center
December 1970
12 R J GOERKE E CARLSSON
Calculation of right and left cardiac ventricular volumes
Investigative Radiology Vol 2 p 360 September-October
1967
13 M E SANMARCO S H BARTLE
Measurement of left ventricular volume in the canine heart
by biplane angiocardiography: accuracy of the method
using different model analogies
Circulation Research Vol XIV p 11 1966
14 A G TSAKIRIS D E DONALD R E STURM
E H WOOD
Volume ejection fraction and internal dimensions of left
ventricle determined by biplane videometry
Federation Proceedings Vol 28 No 4 p 1358 1969
15 P H HEINTZEN V MALERCYZCK
J PILARCZYCK K W SCHEEL
A new method for the determination of left ventricular
volume by use of automatic video data processing
Abstract American Heart Association 43rd Scientific
Sessions 24th Annual Meeting 1970

16 M R GARDNER H R WARNER
Dynamic aortic diameter measurement in vivo
Computers and Biomedical Research 1 p 50 1967
17 C B CHAPMAN O BAKER J H MITCHELL
R G COLLIER
Experiences with a cinefluorographic method for measuring
ventricular volume
The American Journal of Cardiology 18 p 25 1966
18 A JARLOV T MYGIND CRISTIANSEN
Left ventricular volume and cardiac output of the canine
heart
Medical and Biological Engineering Vol 8 No 3 p 221 1970
19 A H GOTT
Cooperative heart study cardiac volume analysis
SPIE Journal Vol 8 p 233 1970
20 H T DODGE SANDLER D W BALLEW
J D LORD JR
The use of biplane angiocardiography for the measurement
of left ventricular volume in man
American Heart Journal 60 p 762 1960
21 D DAVENPORT G J CULLER JOB GREAVER
R B FORVAREL W G HANEL
The investigation of the behaviour of microorganisms by
computerized television
IEEE Transactions on Biomedical Engineering Vol
BME-17 No 3 p 230 1970
22 M L MARCUS W SCHUETTE W WHITEHOUSE
J BAILEY D L GLANCY S E EPSTEIN
A completely automated video tracing technique for the
determination of dynamic changes in ventricular volume
Abstract American Heart Association 43rd Scientific
Sessions 24th Annual Meeting 1970
23 H D COVVEY
Measuring the human heart with a real time computing
system
Data Processing Magazine p 27 May 1970
24 H D COVVEY A G ADELMAN
C H FELDERHOF E D WIGLE K W TAYLOR
The television/computer dimensional analysis interface
Submitted for publication March 1971
25 C H FELDERHOF A G ADELMAN
H D COVVEY E D WIGLE K W TAYLOR
Unpublished Data
26 H D COVVEY
The measurement of left ventricular volumes from T V with
a PDP-8/I
The proceedings of the Digital Equipment Computer
Users Society Atlantic City New Jersey p 255 Spring 1970
27 H D COVVEY
A television/computer system for the rapid processing of
x-ray pictures and analog signals in studies of the left
ventricle
Master of Science Thesis University of Toronto
January 1971

Cost benefits analysis in the design
and evaluation of information systems
by I. LEARMAN
Medical Systems Technical Services Inc.
Rolling Hills, California

INTRODUCTION

In June of 1969, a report1 was prepared for the Federal Hospital Council by the staff of the Health Care Technology Program of the National Center for Health Services Research and Development. The report, entitled Summary Report on Hospital Information Systems, has as its primary objectives "to give a broad view of the components of automated information systems, to briefly evaluate the cost and effectiveness of such systems, and to estimate their future importance."
In meeting the first and last objectives, the report is quite good and should be read by all who are interested in the field. It is in the areas of cost and effectiveness that the report is weak. This should in no way be considered a reflection upon the authors, who fully recognize the paucity of data in these matters. In fact, the authors wisely address first the moneys being spent on hospital information systems and then, independently, their general performance and acceptance. Indeed, among the excellently thought out conclusions is the following:

"The discrepancy which exists between the apparent success and enthusiasm for the use of the computer (sic) in the business and chemistry applications as opposed to the patient management areas suggests that the need for and utility of the computer were more easily recognized, which resulted in a high degree of motivation to see these projects through to a successful operational stage. It might then be assumed that the lack of success in other areas has been a result of an inability on the part of our hospitals to precisely define either the need or practical utility that the computer can serve in other patient management areas. We would be willing to speculate that until such time as other medical services, independent of external pressures, are capable of first recognizing and then demanding more efficient utilization of their time and services, attempts to automate these activities will continue to fail."

It is the author's judgment that, concomitant with the increased recognition of and demand by other medical specialties for computer applications, is the need for reliable and acceptable methods of assessing the practicality and effectiveness of those applications. The paper attempts to address such methods of assessment. Specifically, we will discuss absolute cost effectiveness: the measure of worth of applying technology to the enhanced transfer and processing of information. The segments of technology include analysis, design, implementation, operation and maintenance. These are measured in the utilization of the available resources employed in each technology segment: people, equipment, material and facilities.
The paper will not address relative cost effectiveness, the optimization of resource selection. For such discussions, the reader is referred to a paper2 on the subject published by the author. In what follows, a case will be presented for the establishment of absolute cost effectiveness parameters during the planning and design of a hospital information system.

THE ANALYSIS OF COST EFFECTIVENESS
IN DESIGN
The steps in cost effectiveness analyses take the
following general sequence:
• Selection of Domains of Analysis
• Analysis of "Current" Operations
• Determination of Absolute Cost Effectiveness
• Relative Cost Effectiveness
• Refinement of Absolute Cost Effectiveness

Note that relative cost effectiveness (optimization of resources) is considered in the context of this paper as a refinement of the measure of worth. In the evaluation of cost effectiveness, this step would not be available.

Domain of analysis

In order to attain a valid measure of worth, it is essential to compare old apples to new apples, or manual oranges to automated oranges. We call the sum of all valid areas of measurement the domain of analysis. The criteria for selecting valid areas for current (manual, automated, or modeled) and new (improved manual or automated) information handling applications are as follows:
First, each application must have a common, definable entry and output. Second, the designed change must have some implication for benefits. Third, the application in the current system must have a functional equivalence relationship to the new system.
Some examples may be in order:

Application: Laboratory Order to the Lab
Criterion 1: Order entered at the ward, order output at the Lab.
Criterion 2: Faster turnaround time, reduced transcription error.
Criterion 3: The current system requires the physician to write the order, the nurse to transcribe it, the ward clerk to send it through the tube, and the secretary to log it, separate it, and send it to the proper department(s). The new system may require the physician to enter the order directly into a device; the nurse and the proper laboratory department(s) receive it via output devices. But the function is equivalent. This would identify a valid area for our domain of analysis.

Figure 1-Cost analysis; goal: hard reductions in required resources. Major areas:
Management Systems: Resource Allocation; Personnel; Payroll; Administration; Accounts Receivable; Accounts Payable; Purchasing; Hospital Business Office; Insurance Office; Admitting and Patient Logistics.
Instructional Systems: Student Records; Computer Aided Instruction; Computer Managed Instruction; Availability of Instructional Data.
Hospital: Nursing Units; Surgery; Pharmacy; Radiology; Clinical Laboratories; Medical and Dental Records; Dietary.

Figure 2-Result of cost analysis: current costs, admissions and reservations information handling, current system, in hours per day. [Table: admitting-clerk hours per day for each procedure (reservations; bed availability check; nursing unit check; bed control card; typing the daily admissions and transfer lists; notifying surgery and nursing units of room changes; checking discharges and pulling and marking for the information desk; emergency admissions; assigning beds and notifying units; preparing the pre-admit form; handling patient transfer requests; calling and searching for pre-admit information; typing the admission form), shown for the present operation (total 33.55 hours, 4 personnel) and extrapolated to 1974 (total 40.02 hours, 5 personnel).]

Figure 3-Result of cost analysis: new system (admissions and patient logistics information system), requirements in hours per day. [Table: admitting-clerk hours per day in 1974 under the new system for the same procedures, plus new tasks (entering admission data, setting up the admit package, notifying of patient arrival, indicating admission laboratory work); many clerical steps drop to zero, giving a total of 21.69 hours and 3 people.]


Figure 4-Results of cost analysis: admissions and reservations cost comparisons. [Table: projected number of admissions clerks each year from 1970 through 1979 under the current system versus under the admissions and patient logistics information system; the yearly difference in staffing is converted into annual and monthly dollar reductions, together with 5-year and 10-year annual and monthly averages of the savings.]

Figure 1 lists the major functional areas in a teaching
hospital where significant numbers of such areas of
analysis have been found.

Cost analysis-Quantitative benefits
Hard savings

For each area in our domain of analysis, a cost
analysis is performed. The sequence is as follows:
• Selection of Applicable Procedures
• Determination of "Current" Resource Requirements
• Statement of Growth Assumptions
• Extrapolation of Resource Requirements to Operational Era
• Determination of "System" Resource Requirements
• Comparison of Extrapolated "Current" to "System"
Each procedure in the area is defined together with
the current required resources to accomplish the procedure over an operational duration (minute, hour,
shift, day, week, month). An extrapolation of required
hospital resources is then determined for the same
operational time frame as the new system (2 years
hence, 5 years hence, etc.) .
The new information system design will point to a
different set of resources to accomplish the tasks.
During this phase of the analysis, no attempt is made
to include development and operational costs for the
information system; only those differences in carrying
out the tasks in the domain are compared.
The cost entities included in these analyses are personnel salaries, overhead and fringe benefits; disposable material such as forms or cards; equipment capital costs, rentals, and maintenance; and facilities requirements for offices, storage, equipment, etc. These cost entities must be established for both the extrapolated "current" operations and the "system" operations.
Figures 2, 3 and 4 demonstrate the results of such
an analysis for admitting clerks.
Figure 2 demonstrates the procedures identified for
analysis and the present and extrapolated labor.
Figure 3 demonstrates the same procedures but
with the labor resources required in the new system.
Figure 4 compares the two and summarizes the potential hard savings.
We use the term "hard" to describe cost savings
which can be taken to the bank-reduction in personnel, material, equipment. In Figure 5, the column
labeled "Reduction Assumed" (R) includes all such
savings in costs per full operational month.
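The hard-savings comparison of Figures 2 through 4 reduces to converting extrapolated hours per day into whole positions and pricing the difference. A minimal sketch of that arithmetic follows; the 8-hour shift, the nearest-integer staffing rule, and the salary figure are assumptions of ours, not figures from the study.

def positions_required(hours_per_day, shift_hours=8):
    """Whole positions needed to cover a daily workload (nearest-integer
    staffing rule; an assumption, chosen because it reproduces the 4, 5 and
    3 clerk counts shown in Figures 2 and 3)."""
    return round(hours_per_day / shift_hours)

def annual_hard_saving(current_hours, new_hours, cost_per_position):
    """Dollar value of the staffing reduction for one comparison year."""
    reduction = positions_required(current_hours) - positions_required(new_hours)
    return reduction * cost_per_position

# 1974 extrapolation from Figures 2 and 3: 40.02 vs. 21.69 clerk-hours per day.
# The $7,500 annual cost per clerk is illustrative only.
print(positions_required(40.02), positions_required(21.69))   # 5 3
print(annual_hard_saving(40.02, 21.69, 7500))                 # 15000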
Partial time savings

The most common source of error found in reviewing cost effectiveness analyses has been in the quantification of part-time labor savings. It is tempting to sum all the minutes and hours of partial time saved and to include the total in the quantitative benefits. The error in doing this is that it implies perfect management, certainly an ideal but never a reality. For it would mean that all of the partial times could be used fully in performing other functions, resulting in equivalent benefits. We have found that an administrative efficiency factor should be used to weight such potential benefits properly.
Figure 5 shows the relationship among four quantifiable factors for major areas of a 400 bed hospital.
"Total Information Processing" are those labor dollars spent in the current system within the domains of analysis.
"Total Potential Savings" are all the hours of labor dollars saved by the new system.
"Reductions Assumed" are the savings which we have called quantitative benefits (R).
"Potential Time Saved (P)" are the partial hours of labor dollar savings.
It is the last figure (P) which we will penalize by assigning a 33⅓ percent Administrative Efficiency Factor. In other words, for every hour of partial time saved, 20 minutes are utilized effectively. Note from the figure that the greatest P exists at the Nursing Units.

Figure 5-Summary of cost analyses for selected major hospital entities, full operating system, dollars per month

Area              Total Information   Total Potential   Reductions     Partial Time
                  Processing          Savings           Assumed (R)    Saved (P)
Business Office   $ 20,050            $ 11,785          $ 8,300        $ 3,485
Admitting         3,650               2,000             1,310          690
Pharmacy          8,110               5,960             5,500          460
Laboratory        25,142              12,315            9,700          2,615
Nursing Units     127,400             85,400            15,000         70,400
Surgery           2,990               1,500             1,250          250
Radiology         3,342               1,928             1,040          888
Dietary           2,150               1,575             1,200          375
TOTALS            $192,834            $122,463          $43,300        $79,163

Intangible benefits

Perhaps the most difficult role of the analyst is the assignment of values to benefits which are not explicitly measured in terms of identifiable resources. One can simplify the problem monumentally by dividing such benefits into two general types:
• Plausible Quantitative Measures-These can be associated with a product of the information flow which is in itself measurable. Examples of these are benefits which can be measured as factors of revenues or costs (improved bed utilization resulting from more timely and effective bed reservation and surgery scheduling systems).
• Judgmental Criteria-These are the most difficult to measure and are used as supportive (or deciding) arguments when all measurable data have produced a borderline or negative cost effectiveness picture. Examples include enhanced care of high risk patients, availability of processing power for research support, etc.
Several examples of assigning quantitative values to intangible benefits are described below.
(1) If increased administrative effectiveness can be measured as a percentage of management time, that factor of administrative payroll can be taken as a benefit.
(2) Increased bed utilization enhances the effectiveness of patient management. The resulting higher census and better scheduling can be measured as an increase in hospital revenues.
(3) The transition to, and the operating environment of, a new system can profoundly affect the rate of personnel turnover. The increase or decrease of such turnover can be measured in terms of personnel acquisition costs.
As implied in the last example, such measurements can produce negative results. The Quantitative Benefit (Q) is the sum of these measurements (Σ EiXi) and is given one-sixth the weight of the hard benefits.

An equation for cost effectiveness (E/C)

The analytic expression

E/C is a function of time based upon the accrual of benefits and costs and can be expressed analytically as follows:

E/C = ∫₀ᴸ B(t) dt / ∫₀ᴸ C(t) dt    (1)

where B(t) is the accrual of benefits, C(t) is the accrual of costs, and L is the life of the system including development and implementation.
The author has used two approximations to this equation in performing cost effectiveness analyses.

Approximation #1

E/C = (1/Cav)(1/L) ∫₀ᴸ B(t) dt = (1/Cav)(1/L) Σ B(t) ≈ Σ P(Bmax) / (L·Cav)    (2)

where

Bmax = Rmax + Pmax/3 + Qmax/6    (3)

The benefits and costs per month accrue over the years of development, implementation, and operation. Using Bmax as the fully accrued monthly benefits and Cav as the average monthly system costs over L years, one can approximate the integral such that

E/C = ΣP (Bmax) / (L·Cav)    (4)

where ΣP equals the percentages per year of benefits accrued, summed over the L years of system life.

In the example which follows, we have used an L of 7 years and a ΣP of 4.5 (made up of .1, .2, .4, .8, 1, 1, 1). If

E/C = ΣP (Bmax) / (L·Cav)    (5)

is greater than 1, the system is judged as cost effective.
Examining the end points of this equation is an interesting exercise. Should there be no partial time saved (P) or qualitative benefits (Q), then the hard benefits (R) must exceed the total cost of the new system (Cav) over the system lifetime span. Cav includes all the costs of all the resources as indicated under Hard Savings in the discussion of Cost Analysis-Quantitative Benefits. Now let us assume no hard benefits and no qualitative benefits. Then the total partial labor saved would have to exceed three times Cav. In our example from Figure 5, P equaled approximately $80,000. The average monthly cost of the major system was $40,000. Without the other two factors exceeding $35,510, the system would not have been judged as cost effective, since with 3 Cav = $120,000 and P = $80,000,

4.5 (80,000) / [7 (120,000)] = 9/21 = 0.429.

Then E/C > 1 requires

4.5 (R + Q/6) / [7 (40,000)] > 0.571,   that is,   R + Q/6 > $35,510.

Although unlikely, the other extreme case would mean there were only qualitative benefits. Using the same example, Q would have to exceed $360,000 if R and P were zero, in order for E/C to be greater than 1.
The actual example turned out the following results:

E/C = [4.5 / (7 (40,000))] (43,000 + 80,000/3 + 230,000/6) = 4.5 (108,000) / 280,000 ≈ 1.73

The following table indicates the cost effectiveness values for our example under varying extremes:

R = 0             E/C = 1.04
R = 0, P = 0      E/C = .64
R = 0, Q = 0      E/C = .43
P = 0             E/C = 1.3
P = 0, Q = 0      E/C = .69
Q = 0             E/C = 1.12

These figures indicate that for our example, cost effectiveness could not be achieved by considering only one benefit category, and cost effectiveness could be achieved considering any two benefit categories. Approximation #1 was found to be most useful in determining cost effectiveness of integrated hospital information systems.
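For reference, Approximation #1 is easy to reproduce in a few lines of code. The sketch below is only an illustration; the function name is ours, and the dollar inputs are the monthly figures from the example above (hard savings R, partial time saved P, qualitative benefit Q, and average monthly cost Cav).

def ec_approx1(r, p, q, c_av, life_years, sum_p):
    """Approximation #1: E/C = (sum of P fractions) * B_max / (L * C_av),
    with B_max = R + P/3 + Q/6 (partial savings weighted by the one-third
    administrative efficiency factor, qualitative benefits by one-sixth)."""
    b_max = r + p / 3 + q / 6
    return sum_p * b_max / (life_years * c_av)

# Monthly dollar figures from the worked example; L = 7 years, sum of P = 4.5.
print(round(ec_approx1(43000, 80000, 230000, 40000, 7, 4.5), 2))   # 1.74 (the text rounds this to 1.73)
print(round(ec_approx1(0, 80000, 230000, 40000, 7, 4.5), 2))       # 1.04, the R = 0 row of the table
print(round(ec_approx1(43000, 0, 0, 40000, 7, 4.5), 2))            # 0.69, hard savings alone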
Approximation #2

E/CL = RL/CL + PL/(3·CL) + [ΣP/(L·CL)] (QLmax/6)    (6)

where RL and PL are the cumulative benefits up to time L, CL are the cumulative costs up to time L, QLmax are the full qualitative benefits being derived at time L, E/CL is the cumulative cost effectiveness calculated at time L, and L is an integer year of system life such that 0 < L ≤ Lmax.

Factors of Approximation #2

(a) RL/CL
This is the cost factor; if RL is greater than CL, the system actually saves money. If the quality of performance is about the same, a system with RL greater than CL (RL/CL > 1) would be cost effective.

(b) PL/(3·CL) + [ΣP/(L·CL)] (Q/6)
This is the quality factor; if it is greater than zero, quality of performance will increase; if it is greater than one, the system is cost effective even if there are no cost savings.

Example Using Approximation #2

For RL, PL and CL we have used real data from an actual analysis of a clinical laboratory system.

Cost effectiveness calculations

For ΣP we used the following:

First year,   P1 = .132     ΣP (through year 1) = .132
Second year,  P2 = .4       ΣP (through year 2) = .532
Third year,   P3 = .875     ΣP (through year 3) = 1.407
Fourth year,  P4 = 1        ΣP (through year 4) = 2.407
Fifth year,   P5 = 1        ΣP (through year 5) = 3.407
Sixth year,   P6 = 1        ΣP (through year 6) = 4.407
Seventh year, P7 = 1        ΣP (through year 7) = 5.407
Eighth year,  P8 = 1        ΣP (through year 8) = 6.407
Ninth year,   P9 = 1        ΣP (through year 9) = 7.407

For L we have used 1 through 9.
For Q we have used the following:

1971 Management Payroll = $500,000; Management Effectiveness Increase = .2
Q1 = .2 (500,000) = $100,000
1971 Non-Professional Payroll = $2,020,000; Operational Effectiveness Increase (accuracy, duplications, etc.) = .09
Q2 = .09 (2,020,000) = $182,000
Q1 + Q2 ≈ $280,000

We have extrapolated Q according to the anticipated growth of payroll.
Figures 6 and 7 show the cost effectiveness of the system for life spans of one through nine years. According to our definition, if the system had a life span of 3 years, it would not be considered cost effective. In the fourth year there is marginal cost effectiveness. For life spans of 5 through 8 years, the cost effectiveness is good. In the ninth year it becomes excellent.
Analysis of the factors

Figure 8 summarizes the factor values which make
up the total cost effectiveness.
The R factor, the cost savings, becomes marginally
cost effective in 1978 and good in 1979.
This says that the system actually pays for itself in
real dollar tradeoffs starting with a life span of eight
years.
The P and Q factor, the qualitative benefits, do not
become cost effective through the nine-year life span.
This says that there must be cost savings for the system to be cost effective.
Figure 6-E/C years 1-3

E/C1 = R1/100.3 + P1/[3(100.3)] + [.132/(1)(100.3)](280/6) = .10 + .09 + .06 = .25
E/C2 = 54.8/243.1 + 139.6/[3(243.1)] + [.532/(2)(243.1)](590/6) = .23 + .19 + .10 = .52
E/C3 = 147/406.6 + 294.8/[3(406.6)] + [1.407/(3)(406.6)](925/6) = .32 + .24 + .17 = .73

Figure 7-E/C years 4-9

E/C4 = 315/556.1 + 432/[3(556.1)] + [2.407/(4)(556.1)](1275/6) = .56 + .26 + .23 = 1.05
E/C5 = 441/695.3 + 566/[3(695.3)] + [3.407/(5)(695.3)](1630/6) = .65 + .27 + .27 = 1.19
E/C6 = 622/857.4 + 712/[3(857.4)] + [4.407/(6)(857.4)](1805/6) = .72 + .28 + .25 = 1.25
E/C7 = 799/901.1 + P7/[3(901.1)] + [5.407/(7)(901.1)](2190/6) = .89 + .32 + .29 = 1.5
E/C8 = 982/939.1 + P8/[3(939.1)] + [6.407/(8)(939.1)](2580/6) = 1.04 + .36 + .37 = 1.77
E/C9 = 1169/962.1 + P9/[3(962.1)] + [7.407/(9)(962.1)](2980/6) = 1.21 + .41 + .42 = 2.04

Figure 8-Analysis of E/C factors

YEAR    R FACTOR         P & Q FACTOR    E/C
1       .1               .15             .25
2       .23              .29             .52
3       .32              .41             .73
4       .56              .49             1.05 marginal
5       .65              .54             1.19 good
6       .72              .53             1.25 good
7       .89              .61             1.5  good
8       1.04 marginal    .73             1.77 good
9       1.21 good        .83             2.04 excellent

The E/C column adds the two factors to give the resultant cost effectiveness.
Approximation #2 has been found to be most useful when the system life is not certain, or when the accrual of benefits is known with more certainty, as for a specific hospital area.
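Approximation #2 is equally mechanical to compute. The sketch below evaluates equation (6) for year 4 of the clinical laboratory example, with the cumulative figures read from Figure 7 (thousands of dollars); the function name is ours.

def ec_approx2(r_cum, p_cum, c_cum, sum_p, q_full, year):
    """Equation (6): E/C_L = R_L/C_L + P_L/(3 C_L)
    + [sum of P through year L / (L * C_L)] * (Q_Lmax / 6)."""
    cost_factor = r_cum / c_cum
    quality_factor = p_cum / (3 * c_cum) + (sum_p / (year * c_cum)) * (q_full / 6)
    return cost_factor + quality_factor

# Year 4: cumulative R = 315, cumulative P = 432, cumulative cost = 556.1,
# sum of P fractions = 2.407, full qualitative benefit Q = 1275.
print(ec_approx2(315, 432, 556.1, 2.407, 1275, 4))   # ~1.06; Figure 7 rounds the terms to .56 + .26 + .23 = 1.05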
The application of judgmental criteria

Candidly, the relationship between the Cost Effectiveness equation and the Judgmental Criteria depends
very much on the particular institution. In one case,
the equation will or will not lend support to a hospital
management and staff already convinced of the desirability of an automated information system and its
potential to patient care and hospital efficiency.
On the other hand, such judgmental aspects could
tip the scales in either direction when E/C approximates 1.
In any case, the cost effectiveness analysis provides
the data and the insight for optimal decision making.
EVALUATION OF SYSTEM E/C
It is one thing to a priori design a cost effective system; it is another matter to determine the validity of
the design by an evaluation of cost effectiveness after
the system has become operational. In the main body
of this paper, the author has attempted to present a
case for the establishment of E / C parameters during
the design. One of the arguments is to optimize the
design. Another is to provide management with decision
making tools. The third argument relates to evaluation.
Without the data gathering pursuits of the design
phase, the evaluation would have little with which to
compare since the former "current" operation would
have vanished.
The first step in the evaluation is the updating of
the domain of analysis. In the great majority of the


cases this is actually an expansion. Since the design
phase, new applications will have emerged through
improved technology or continued enlightenment of
hospital personnel as to the potential benefits of the
system.
The determination of system cost effectiveness
starts with establishing the criteria for evaluation. They
include the benefits expected from the design as well as
unpredicted benefits or negative influences.
The other major criterion is the actual system cost.
The determination of system cost will be a matter of
record. The hospital can establish separate cost codes
for all information system equipment, personnel, facilities, material, etc.
The determination of benefits will require the same
kind of analysis performed during the design study on
the current system. Measurement parameters will
include:
• Personnel Time-the effort required to perform
the activity.
• Classification of Personnel-the task may be performed by different classifications than previously.
• Transit and Cycle Times-most related to the
qualitative measurements.
• Stagnation Points-the new system can have its
own bottlenecks.
• Patient Status Factors-average stay, census, etc.
• Resource Allocations-reductions or increases in
personnel, equipment, material, etc.
• Qualitative Assessments-selective survey.
Essential to the evaluation is the data collection
methodology and the resources utilized for data
gathering.
The control data for "current" operations and their
domain equivalents for "system" operations should be
gathered by the regular hospital staff. The new system
will impose a set of unique man-to-man and man-to-machine interactions which will require trained observers. Both the "current" and "system" operations
will require test and simulation models where data describing the effect of perturbations and contingencies
can be examined. Finally, acceptance and comfort of
the new system will require opinion and judgmental
data collections from patients, practitioners, etc.
The steps in the evaluation are exactly the same as
outlined for the Absolute Cost Effectiveness.
• Cost analysis of operational system.
• Compare with current (extrapolated) system from
design study-results in Rand P.
• Determine qualitative benefits-Q.


• Determine actual costs to develop, operate and maintain the system, Cs.
• Select the best approximation and calculate E/C.
• Acquire judgmental assessments.
SUMMARY
The absolute cost effectiveness analyses described in
this paper result in a set of decision enhancing estimates regarding the worth of developing an information system. Implied in these data are the following
criteria:
(a) Experience in hospital operations.
(b) Cognizance of current and anticipated information systems with hospital applicability.
(c) An information system design philosophy and
plan with associated cost estimates.
(d) An existing library of departmental data.
Depending on the extent to which the criteria are
met, an absolute cost effectiveness analysis and in-

formation plan can be performed in 2 to 4 months. In
essence, the analysis provides balance sheets. It enables
hospital management to attain a firmer understanding
of the potential implications of the technology to the
institution. It allows decision makers at every level a
better picture of available design alternatives. And
finally, it can provide the data to analyze the impact
of the system during actual operation. In short, if used
properly, cost effectiveness analysis can help take us
out of the "pin the tail on the donkey" era of hospital
information system design and implementation and
provide the means for judging the impact of the system
when in full operation.
REFERENCES
1 NCHS-RD-69-1
2 I LEARMAN
Relative cost effectiveness analysis in the evaluation and
selection of hospital information systems
Presented at the Stony Brook Symposium on HSC
Information Systems March 1970

Factors to be considered in
computerizing a clinical chemistry
department of a large city hospital
by R. MOREY, M. C. ADAMS and E. LAGA
Massachusetts Institute of Technology
Cambridge, Massachusetts

INTRODUCTION

The present biochemical testing facilities at Boston City Hospital perform about one million tests per year. To handle this workload with present personnel and equipment, the facilities must impose certain undesirable constraints on those who request tests and, in general, must limit the services which can be provided and the data which can be accumulated for subsequent analysis. These constraints ultimately imply a reduction in the health care available for patients.
The operation of the test facilities can be divided into three categories-input, processing, and output, or:
Receiving, identifying and placing samples; transcribing the tests appropriate to each sample; and assigning appropriate groups of tests to the test equipment.
Setting up equipment, assembling samples, and running tests.
Reducing data from tests; recording and storing data for future reference; and transmitting results to the requester.
The first phase of the program was to address the input and output categories relative to a specification of improvements which can be introduced into the existing test facilities of the hospital.
The essential objective of the program was to provide, in a relatively short period of effort, a specification for a clinical chemistry operation of significantly increased effectiveness. The specification was to be applicable to the existing clinical chemistry operation and consistent with a long-term program of continued improvement in service, reliability and cost savings. Thus, the short-term considerations must be consistent with the long-term goals.
To accomplish the above objective, a comprehensive set of requirements was established at the outset of the proposed program in conjunction with the Boston City Hospital. These requirements were based on an intensive analysis of the present clinical chemistry operations and took into account current and projected user needs, accumulated experience in the hospital, and experience obtained in operating comparable systems.
Concurrent with the development of requirements, the operational flow in the present facility was delineated, and cost data were collected and analyzed in relation to the requirements established to determine future needs. Appropriate existing systems and equipment were evaluated relative to needs, and recommendations based upon available funding were prepared for use as a basis for implementation.
In relation to the approach outlined, managing the "throughput" was a major aspect of the program. The operation had to be designed to handle a normal stream of tests as well as to respond to "demand" tests. The testing process had to be rigidly controlled to assure reliable results. The entire operation was under constant supervision and monitoring, including feedback controls, abnormality signalling, test-interference protection, and comparison of results against known standards.
The hospital biochemical laboratory nowadays performs the majority of the requested analyses by automated equipment. Except for specialized tests, and in small hospitals, manual methods have been superseded by automated and semi-automated methods. Blood chemistry tests are being called for in an ever-increasing manner to aid diagnosis and to monitor a patient's condition during his stay in a hospital. When an analysis becomes very complex, automation is often the only means to develop a consistent and repeatable analysis.

The hospital laboratory staff must translate chart values, typically obtained on a recorder, into values expressed in concentration per unit volume. This requires mathematical or graphical manipulation and interpretation, including calculation of baseline drift, in which errors can occur due to fatigue or inattention. The staff must prepare daily log sheets, laboratory summary and statistical reports, and quality control reports, and the possibility of errors always exists.
These procedures are all clearly within the capability of currently available computing systems. A solution would be to utilize an on-line system (hardware/software) that provides the above services during the time of actual specimen processing.
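The chart-reading chore described above (subtracting an estimated baseline drift and converting peak heights to concentrations against a standard) is exactly the sort of routine arithmetic an on-line system would take over. A minimal sketch follows, assuming a linear drift model and invented numbers; it is not the method or the software actually used at BCH.

def corrected_peak(peak, baseline_start, baseline_end, position, run_length):
    """Subtract a linearly interpolated baseline drift from a recorded peak height."""
    drift = baseline_start + (baseline_end - baseline_start) * position / run_length
    return peak - drift

def concentration(sample_peak, standard_peak, standard_conc):
    """Convert a drift-corrected peak height to concentration by simple
    proportion against a drift-corrected standard of known concentration."""
    return sample_peak / standard_peak * standard_conc

std = corrected_peak(52.0, 2.0, 6.0, position=5, run_length=40)    # standard cup
smp = corrected_peak(44.5, 2.0, 6.0, position=23, run_length=40)   # patient cup
print(round(concentration(smp, std, standard_conc=100.0), 1))      # mg/100 ml, invented values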
The laboratory at BCH is inundated with large numbers of specimens every morning, requiring multiple sorting to prepare them for testing by automatic analytical instruments. There is always the possibility of losing the identification of the specimen, the test, and even the patient.
The tests themselves are now automated, and this is the only way a laboratory can keep up with the volume of tests; but an imbalance is caused by the non-automation of data input, output and record keeping, which reduces the advantage obtained from the technological advances in instrumentation.
Lame, in an article in Laboratory Medicine, November 1970, states that a survey of ten clinical laboratories
concluded that a minimum of 20 percent and more
likely 30 percent of the technologist's time was being
spent in performing clerical duties. Our study at BCH
indicated that the 30 percent figure is a good estimate.
Writing test reports, keeping log books, labeling
vacutainers and test tubes, preparing work sheets, hand
and slide rule calculations, billing, updating patient
files, answering telephone enquiries, are but a few of the
clerical duties being performed in a typical clinical
laboratory.
It is evident that the solution to this problem may be
found by examining other industrial and research
operations where the introduction of data processing
and the use of a dedicated on-line computer has been
most beneficial.
The doctor's prime requirement is a supply of up-to-date information concerning his patients. Test requests
are initiated by the doctors who desire clear, concise
laboratory reports of the results obtained. To provide
this capability using a laboratory data handling system,
the laboratory staff must be able to enter information
on test results into the system. Also queries as to the
status of on-line instruments and the contents of
patient files must be entered. Similarly, test procedures
must be capable of modification as procedures are
changed in the normal course of laboratory develop-

ment. In addition, routine reporting of test results on
each patient and documentation of laboratory work
performed can be initiated by the system itself or may
be requested by the laboratory operating staff.
A computer will function well only in an organized
system. A comprehensive system analysis such as was
carried out for BCH and is delineated later in this report
is an essential prerequisite prior to computerization to
determine the most efficient procedures to use.
The disadvantages of computerization are concerned
with cost, the transition period and the development of
operating procedures in case the computer system goes
down due either to a power failure or some mechanical
or software problem.
Another disadvantage in computerizing the clinical
laboratory is the tendency to treat computerization as
an end, rather than a means.
It is possible in the interim stages that a computerized system will add to the cost rather than reduce the overall cost of the laboratory, but in general, hospitals which have automated have found this to be a short-range objection. There may also be resentment in that the laboratory will be dependent on outside vendor personnel from the supplier of the system. One solution
would be to use the hospital electronic data processing
personnel as an interface between the laboratory and
the system vendor or even to utilize a programmer on
the laboratory staff. When laboratory personnel understand the operation and run their own computer
system, feelings of distrust and antagonism should be
overcome.
One essential requirement for any laboratory system is a high degree of reliability, so that the results can be considered significant and the system is worth the expense in money and personnel.
A projected use for the computer once the reliability
of the hardware and software are verified is for the use
of the system in differential diagnosis. The computer
will be able to extend and provide backup for the
physician's medical decision-making ability. It could be used to generate differential diagnoses on the basis of the laboratory data and to generate lists of additional tests to be done to arrive at more specific diagnoses. Such adaptation of computer
science would affect both the quality and cost of
health care.
SYSTEMS ANALYSIS OF EXISTING
OPERATION AT BCH (1970)
A flow chart indicating the major operational steps in the Clinical Biochemistry Department of the Boston City Hospital (BCH) is shown in Figure 1. This includes (a) test requesting and specimen collection, which are performed by ward personnel; (b) accessioning of samples and of request forms, preparation of worklists and result lists, and administrative logging, which are performed by technical and clerical staff of the laboratory; (c) analysis, computation of results and transcription of results to report forms, which are performed by the analysts; and (d) sorting of results by patient and ward and creation of alphabetical long-term files, which is done by the clerical staff of the laboratory.

Figure 1-Flow chart of operations
In the following tables and graphs, the present and future workload as it relates to in- and out-patient services is considered, together with within-laboratory work scheduling of the staff, its analytical performance, and some aspects of budgeting. Next will be discussed some of the discrete operations involved in the requesting, accessioning, analyzing and data processing steps of the entire operation. This will be followed by a discussion of the characteristics of hardware, software and peripheral equipment currently available from different computer manufacturers. Alternative proposals for the introduction of a computer-assisted system for on-line monitoring of AutoAnalyzers and for the preparation of cumulative reports will conclude this section.

Figure 3-BCH-CC: Annual chemistry load (1958-1970) [test load and test-unit load]

Figure 2-BCH-CC: Annual patient load (1960-1970) [in-patient days and out-patient visits]

Figure 4-BCH-CC: Annual chemistry load forecast (1969-70) [request, test, and test-unit load projected through 1980]




Figure 5-BCH-CC: Annual test load, by patient (1958-70) [panels: test load per in-patient admission, per in-patient day, and per out-patient visit, with estimated workload]

The recent decrease in utilization of both the in-patient and out-patient facilities of BCH is shown in Figure 2. This has been accompanied by a steadily increasing request load, test load and test-unit load* (Figure 3) in Clinical Biochemistry. If the test load keeps rising at the current rate, there will be an increase from the present 1 million tests per year to about 1,500,000 tests per year by 1975 and to 2,000,000 tests per year by 1978-79 (Figure 4). The test load for chemistry per out-patient visit is still about 7 to 10 times smaller than that per in-patient day; the latter is expected to rise faster in the next decade than the former (Figure 5). A breakdown of the chemistry load by test for the 1958-70 period and its breakdown into automated and non-automated tests shows an increasing demand for tests and the increasing use of automated tests during this period (Figure 6). About 3 automated tests are responsible for 25 percent of the load, about 6 for 50 percent, about 9 for 75 percent, and about 13 for 90 percent (Figure 7). This is indicative of a high-volume operation with rather limited diversification in terms of tests offered. The latter situation obviously creates a more acute demand for computer assistance than a more diversified, low-volume operation.
Monthly variations in test load for day, evening and night shifts in 1970 are shown in Figure 8. By the end of 1970, about 10 percent of the work load was performed on the SMA 12/60s. The workload on Mondays is occasionally twice as large as on any other single day of the week. The average test load per request is from 8 to 10, depending on the day of the week and the month of the year (Table I).

* Test units are tests weighted by the U.S. Veterans Administration AMIS Test Weighting System.
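The test-unit load referred to in the footnote above is simply a weighted test count. The sketch below shows the bookkeeping; the weights and counts are invented for illustration and are not the actual AMIS values.

# Hypothetical AMIS-style weights (workload units per test) and monthly counts.
weights = {"glucose": 1.0, "BUN": 1.0, "electrophoresis": 4.0, "sodium": 0.5}
counts = {"glucose": 5200, "BUN": 4800, "electrophoresis": 150, "sodium": 3900}

test_load = sum(counts.values())
test_unit_load = sum(counts[name] * weights[name] for name in counts)
print(test_load, test_unit_load)   # 14050 12550.0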

Figure 6-BCH-CC: Annual automated test load, by test (1958-70) [tests include BUN, SGOT-LDH, bilirubin, total protein, albumin, alkaline phosphatase, calcium, phosphorus, amylase and uric acid]

[Figures 7A and 7B plots: cumulative fractional test load ranking; a small number of high-volume chemistries (BUN, glucose, creatinine, sodium, potassium, chloride, CO2, uric acid, cholesterol, LDH, transaminase, bilirubin, total protein, albumin, calcium, phosphorus, alkaline and acid phosphatase, amylase, electrophoresis, BSP, salicylates, thymol, latex, and urine electrolytes) account for most of the load.]

Figure 7A-BCH-CC: Fractional test load ranking, by test (1969)

The Medical Services are responsible for
about 50 percent of requests and about 60 percent of
tests performed. The remainder of the requests originate
from the Surgical Services, Clinics, and Outpatient
Services (Figure 9).
Within-laboratory work scheduling for analysis,
administrative personnel and supervision is shown on a
time basis for different shifts and work assignments in
Figure 10. In total, about 96 analytical man-hours are
available per day. Two-thirds of these are spent on true
analytical work and one-third on clerical jobs. About
32 man-hours per day are available for administrative
and supervisory assignments.
At a current 95 percent level of test automation in
the laboratory, a total of 14 analysts performed about
70,000 tests per analyst-year, which is above the
national average for hospital laboratory performance.
From the available budget figures, it appears that
the operational cost for the annual test load of about

Figure 7B-BCH-CC: Fractional test load ranking,
by test (1969)

TABLE I-BCH daily chemistry test load (March 1970). [Columns: week-day all day, day routines, day emergencies, night emergencies, and week-end Saturday and Sunday; rows: samples per day and test load (average, minimum, maximum, and average as a percentage of the all-day figure), and average tests per sample. On an average week-day about 407 samples and about 4,050 tests are handled; day routines account for about 86 percent of the test load and day and night emergencies for roughly 7 and 6 percent; the average test load is about 10 tests per sample on week-days and 4 to 5 on week-end days.]


[Plot residue: request, test, and test-unit load by originating service (medical, surgical, clinics, out-patient) and by shift (weekday, Saturday, Sunday).]

[Figure 10 chart: within-laboratory work schedule over the day, by shift: blood collection peak, accessioning, worklist preparation, AutoAnalyzer tests, semi-automated tests, result card completion and checking, and supervision/maintenance.]

Figure 10-BCH-CC: Work scheduling by staff (1970)

TABLE II-BCH-CC request forms (1970). [Facsimile of the biochemistry and electrophoresis request forms: patient name, hospital number, ward, service, date, diagnosis, previous-chemistry indicator, test check-boxes and results column; emergency reports are telephoned to the services upon completion, and name and service must appear on the specimen as well as on the form.]

TABLE III-Automated test procedures (1970). [For each of about 25 chemistries (BUN, creatinine, glucose, uric acid, sodium, potassium, chloride, carbon dioxide, bilirubin, total protein, albumin, alkaline phosphatase, LDH, SGOT, amylase, calcium, phosphorus, chemical profile, prothrombin time, urine sodium and potassium, trace metals, cholesterol, acid phosphatase, electrophoresis): the analytical method (AutoAnalyzer colorimetric or flame, SMA 12/60, IL flame, PE AAS 303, Coleman, Analytrol), the sampling rate (roughly 70 to 180 samples per hour), the percentage of repeats (roughly 5 to 20 percent), and the criteria used for repeating an analysis.]

manually transferred to a serum tube on which the
accession number has been transcribed. Sample splitting
from serum tube to Auto-Analyzer cups is done
manually, using work lists to select serum samples and
to list the sequential position number of the sample on
the Auto-Analyzer sample tray for each accession
number. Results are collated after the Auto-Analyzer
run from recorder chart and final results are manually
computed from the raw data. The obtained results are
matched to cup number, next to accession number and
finally to the patient's name before final transcription
on the report form.
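The collation chain just described (cup position, then accession number, then patient name) is exactly the kind of bookkeeping a laboratory computer removes. A minimal sketch of that lookup, with invented sample data:

# Hypothetical run of one AutoAnalyzer tray: cup position -> accession number.
cup_to_accession = {1: "A-1047", 2: "A-1048", 3: "A-1052"}

# Accession log kept at sample receipt: accession number -> patient name.
accession_to_patient = {"A-1047": "DOE, J.", "A-1048": "ROE, M.", "A-1052": "SMITH, K."}

# Raw results read off the recorder chart, by cup position (invented values).
results_by_cup = {1: 96, 2: 143, 3: 88}

# Match each result to its cup, then to its accession number, then to the patient.
for cup, value in results_by_cup.items():
    accession = cup_to_accession[cup]
    patient = accession_to_patient[accession]
    print(f"{patient} ({accession}): glucose {value} mg/100 ml")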

[Figure 11 plot: cumulative frequency of the number of tests per request; median = mode = 10.5, mean = 7.79.]

Figure 11-BCH-CC: Frequency of multiple test requests (1970)

[Figure 12 plot: number of automated test procedures and percentage of the test and test-unit load automated, 1958-1970.]

Figure 12-BCH-CC: Percentage automation of test load (1958-70)


Table III lists the currently available instrumentation in the laboratory. Eighteen analog Auto-Analyzer
channels are potential candidates to be multiplexed in
an on-line computer-assisted data handling system.
The sampling rates for different tests are indicated.
Also shown is the percentage of repeat tests and the
criteria used for repeating an analysis. Figure 12
indicates the number of automated test procedures and
the percentage of test and test unit-load which is
automated.
For routine specimens, the time delay between receipt
of the sample and result leaving the laboratory is about
eight hours. One report form goes to the patient's chart
and one to the Billing Office. Long term files for both

in- and out-patients are kept in the laboratory using
another duplicate of the report forms, filed alphabetically. File retrieval is done manually. Requests for retrospective cumulative retrieval obviously have to be limited because of the limited personnel available to perform the task. No separation exists in the file structure between active and inactive patient files.
DESCRIPTION OF AVAILABLE LABORATORY
COMPUTER SYSTEMS (1970)
A standard procedure was established for evaluating
the various turnkey systems available at present. All

TABLE IV

HARDWARE CHARACTERISTICS OF 12 COMPUTER SYSTEMS (1970)

[The table compares twelve systems in their standard configurations (BSL ClinData, Concomp, DEC ClinLab-12, DNA CLS, Infotronics CL II, IBM 360/40 batch, IBM 1130, IBM 1800, IBM 1080, IBM System/7, Meditech, Spear Class 300-B) on: availability as a turnkey system; whether the computer is located in the laboratory; purchase and monthly rental price; maintenance contract after the first year; number of Auto-Analyzer channels; time to complete installation; computer used; number of operating installations (Jan. 1971); total responsibility by vendor; factory on-the-job training; teletype, specific console, and CRT terminals; power requirement; BTU generated by the equipment per hour; and availability of a user group for consultation.]

TABLE V

SOFTWARE CHARACTERISTICS OF 12 COMPUTER SYSTEMS (1970)

[For the same twelve systems (standard configurations), the table indicates whether the delivered software provides: patient directory; specimen collection lists; test work lists; load lists; ward, day, and query reports; abnormal-value report; unfinished-procedures report; quality-control report; daily statistical report; billing report; management report; EDP compatibility; retrieval programs; a sample identification system; cumulative reports (number of days retained); and whether the computer manufacturer acts as consultant.]

The initial step was to obtain relevant literature on the systems. A preliminary evaluation was made after reading this information to determine the applicability of such systems to the requirements of BCH. An important part of this evaluation was to ensure that the companies had the proper software to meet all or most of the BCH requirements, including Day Reports, Cumulative Reports, Patient File, Quality Control, Work Lists, and the other miscellaneous programs needed to support a clinical laboratory operation.
The second step, after familiarization with the literature, was to request a representative from each company to come and give a presentation of its system. Generally this presentation was given by a technical person from the Bio-Medical Department rather than by a salesman. The presentations augmented the technical data previously sent, showing a typical hospital arrangement, the consoles available, and typical formats of chemical and patient data output. There was a discrepancy in that in certain cases the literature stated that certain software was available, whereas the presentation indicated that some programs were still under active development.

TABLE VI

INPUT/OUTPUT/STORAGE CHARACTERISTICS OF 12 COMPUTER SYSTEMS (1970)

[For the same twelve systems (standard configurations), the table lists: data input equipment (mark-sense cards, Port-a-Punch cards, bar-coded cards, keyboard terminal, key-mat terminal, CRT at the CPU, remote CRT, teletype at the CPU, teletype terminals, special-purpose terminals); data output equipment (teletype, Kleinschmidt printer, line printer, plotter); and data storage (DEC tape, IBM tape, disc).]

This is understandable, because verifying software without an expensive in-plant simulation of the programs is extremely difficult. As evidence, bugs are still being found in the Apollo flight programs after six years and millions of dollars' worth of verification testing.
The vendors were carefully questioned about the operating characteristics of the computer, including air conditioning requirements, preventive maintenance procedures, and the response time for emergency service.
The third step in our evaluation was to try to visit representative systems in hospitals. This was difficult: in most cases the systems were in the process of being installed, were of an early configuration not representative of later designs, or were installed but not yet operational.
CONCLUSIONS
In general, most of the systems were similar in cost and in the type of software available. The major factor separating the companies was the number of working systems they had in the field, which is directly proportional to the length of time they have been marketing systems. The number of operating systems is thus not, by itself, an efficient means of evaluating the companies; evaluation must instead be based on the service that the individual companies will provide with their equipment.
A general survey of 12 computer systems available in 1970 with respect to hardware characteristics (Table IV), the accompanying software (Table V), and the associated input/output and storage devices (Table VI) has been compiled.
AVAILABLE LABORATORY COMPUTER
SYSTEMS (1970)
A form was circulated to hospitals with known installed systems, requesting their comments on the equipment as installed in the clinical laboratory; it was emphasized that the responses would in no way be connected with the user by name.


An analysis was made of the returns for the various systems.
It was noted that the longer a particular system has been marketed, the higher the probability that its deficiencies have been rectified. It is only when major changes are incorporated, such as substituting disk storage for tape storage, that major difficulties are encountered by a particular user.
The efficiency with which a particular system is used is also bound up with the attitude of the technicians and the dedication with which the laboratory director endeavors to integrate the system into his operation.
The returns showed that a systems analysis such as the one performed at BCH is essential. The type of hospital, i.e., teaching versus non-teaching, is also important in estimating the number of tests per patient to be carried out. A further observation is that the time needed to integrate the computerized system fully into the work schedule of the laboratory can be excessively long; this is probably a reflection of software efficiency and technician acceptance. Down time was reported to be excessive in a number of cases, which indicates that the laboratory must always maintain the capability of switching to a manual mode when the need arises.
ACCESSIONING AND POSITIVE SAMPLE
IDENTIFICATION SYSTEMS (1970)
At this time there is no positive sample identification system that would be practical for Boston City Hospital, because Boston City does not have a collection team but leaves the collection of blood to the individual ward nurses; positive identification would therefore require a large investment in sample-marking equipment at each collecting station. An embossed card can nevertheless be used for sample I.D. (see Figure 13). This can be done in the following way: the form is imprinted with the patient's name in human-readable form, and with the patient's I.D. and ward number in machine-readable form. The blood is drawn and a request for test is attached to the vacutainer. These are sent to the lab, where the technician assigns an accession number and marks that number on the mark-sense card. The container is placed under the assigned number on the work table, and the cards are collected and fed through the card reader. The computer then makes up the work list, which tells the technician which specimen, by accession number, is to be placed in which location in the sample loader. Upon completion of the tests the computer types out the ward report, which carries the patient's I.D. number and the test results. The reports are then sent to the wards, where the nurse assigns the results to the patients according to their patient I.D.

[Figure 13-Sample accessioning and identification scheme. Flow from ward to laboratory: the request card accompanies the specimen to the lab; an accession ("BUN number") work list assigns loader positions 1 through 7, with standards, controls, and a blank; and the samples pass to the Auto-Analyzers.]

With this method all queries are by patient I.D. number only. There is no flagging of out-of-normal-limits results, because the patient's age and sex are unknown. If working without the patient's name is satisfactory, this method is a very fast and efficient way of obtaining positive sample identification. If the patient's name is required, the technician can type in the patient's data; on most systems about four entries a minute can be made. Another way of accomplishing this is to have the wards send a copy of their collection lists the night before, so that the night shift in the lab can update the patient data for the following day. This would take care of all patients except out-patients and stats, whose data would have to be entered upon receipt of the blood sample in the lab. Tables VII and VIII present a survey of the schemes available from different manufacturers.
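A minimal sketch of the mark-sense accessioning and ward-report flow just described (card layout, field names, and sample values are assumptions for illustration, not any vendor's format):

# Sketch of mark-sense-card accessioning: the computer builds a work list that tells the
# technician which accession number goes in which sample-loader position, then reports
# results by patient I.D. only. Field names are illustrative.

cards = [  # one record per mark-sense card read: (patient I.D., ward, accession no., test)
    ("123456", "W-3", "71-0101", "BUN"),
    ("123789", "W-3", "71-0102", "BUN"),
    ("124555", "E-1", "71-0103", "BUN"),
]

def build_work_list(cards):
    """Assign sequential loader positions to accession numbers, in card order."""
    return {pos: acc for pos, (_pid, _ward, acc, _test) in enumerate(cards, start=1)}

def ward_report(cards, results_by_accession):
    """Group results by ward, keyed by patient I.D. (no names, so no age/sex flagging)."""
    report = {}
    for pid, ward, acc, test in cards:
        report.setdefault(ward, []).append((pid, test, results_by_accession[acc]))
    return report

work_list = build_work_list(cards)            # {1: '71-0101', 2: '71-0102', 3: '71-0103'}
results   = {"71-0101": 14, "71-0102": 22, "71-0103": 9}
for ward, lines in ward_report(cards, results).items():
    print(ward, lines)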
RECOMMENDATIONS
There are several ways to computerize the lab at BCH. These depend on the degree of involvement of the laboratory in the installation and on the projected utilization of the equipment after installation.

TABLE VII

SAMPLE ACCESSIONING SYSTEMS (1970)

[The table compares the accessioning and positive-identification schemes of Technicon (AA II with IDee), Spear, Vickers (M300), Identicon (Identiscan TM100, Monmouth modification), Dupont (Automatic Clinical Analyzer), American Cyanamid (OMS System), and Damon Engineering (Microtainer): armbands carrying the patient's hospital I.D. as a bar code; request-card accessioning; sample-container codes (preprinted or post-punched vertical Hollerith, retro-reflective or fluorescent bar codes, electrostatically printed vacutainer labels); serum-transfer systems; samplers (Technicon AA 40-position samplers, Vickers M300 300-position sampler, linear sample/reagent-pack input tray); and code readers (reflection, transmission, and UV-fluorescence scanners). Many entries were still under development in 1970; NA = no information available.]

Do they want a turnkey system, essentially another instrument, that will help them with their existing workload, or do they hope to expand and use the computer system for other functions (see Figure 14)?
If only a turnkey system is required, our recommendation is that they proceed with plans to purchase a system from either Digital Equipment or SPEAR.

If the lab plans on imaginative use of the computer system, it is felt that a bio-engineer should be hired. With this person on the staff, the use of the equipment would be greatly enhanced; he would assist the director in whatever is required in the areas of new test procedures or new programs. An interesting development along this line is noted by Doctor Lame in Laboratory Medicine, November 1970, who reports that his savings came not from relieving technicians of their mounting clerical duties but from the use of the computer as a lab management system.

TABLE VIII

TEST REQUEST AND SAMPLE IDENTIFICATION SYSTEMS

[The table compares Addressograph-Multigraph, Hewlett-Packard (COPAC), and IBM card-image approaches to test requesting: telephone-activated request systems; card codes (bar code, dial code, Hollerith punch); admissions plate punches; nursing-station and service-area punches (Templa Punch 501/501T, A&M 12-45 and 9620, Selecta Punch 5082, IBM 2956); bar-to-Hollerith and dial-to-Hollerith conversion units; simultaneous bar/mark/Hollerith readers; and Hollerith-compatible card readers (HP 2761A optical mark reader, IBM card readers), together with representative installations (BCH, BSL, Spear, Mayo Clinic, Monmouth County Hospital, and others).]

[Figure 14-Logic diagram for implementation of a computer system of BCH (1970).]

[Figure 15-Basic data acquisition system: Auto-Analyzer outputs to computed results.]

[Figure 17-Integrated computer system: adds manual test input, patient data, and test requests.]

If maximum involvement is what the lab desires, it should proceed in the following manner. Negotiate with a contractor such as D.E.C., IBM, or B.S.L. for a simple data acquisition system that will do the peak detection from the Auto-Analyzers and print out the results and prescribed measurements on a teletype. An example of this type of system is shown in Figure 15. This would eliminate some of the technicians' clerical work and reduce the mounting workload.
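Such peak detection can be as simple as picking local maxima above a baseline on the digitized recorder trace; the sketch below illustrates the idea only (thresholds, calibration factor, and sample data are invented, and no vendor's algorithm is implied):

# Naive peak picking for a digitized Auto-Analyzer trace: a sample is a peak if it
# exceeds a baseline threshold and is higher than its neighbours. Illustrative only.

def find_peaks(trace, baseline=5.0):
    peaks = []
    for i in range(1, len(trace) - 1):
        if trace[i] > baseline and trace[i] > trace[i - 1] and trace[i] >= trace[i + 1]:
            peaks.append((i, trace[i]))
    return peaks

trace = [1, 2, 8, 20, 35, 22, 9, 3, 2, 7, 18, 30, 17, 6, 2]   # made-up chart samples
for position, height in find_peaks(trace):
    concentration = height * 0.5        # assumed calibration factor from standards
    print(f"cup at sample {position}: peak {height}, result {concentration:.1f}")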
The next phase would be to hire a Bio-Engineer who
will familiarize himself with lab operations and require-

[Figure 16-Advanced computer system.]

TABLE X

COST ANALYSIS OF COMPUTER CONFIGURATIONS

[The table compares the candidate configurations (basic acquisition systems: DEC Basic, BSL Chem Lab, IBM System 7; a time-shared system: Meditech; in-lab dedicated computers: DEC Advanced, Spear, BSL ClinLab, DNA) with respect to time spent on the daily laboratory tasks (worklist preparation; Auto-Analyzer, semiautomated, and manual tests; result-card completion, checking, correlation, refiling, and ward distribution; results entered on worklists; patient-file sorting; maintenance; telephone calls) and with respect to cost; total technician-hour figures of 215, 173, 56, and 46 appear for the alternative configurations.]

[Figure 1-Portion of syntax for PL/I. Each rule specifies a possible replacement for the non-terminal to the left of the colon. If the left side is omitted, it is the same as the previous line. Rules 12 and 13 specify special classes of terminal symbols.]

[Figure 2-Steps in the generation of a DO loop. In each step, the non-terminal in the rectangle is replaced according to the rule whose number appears at the right.]

[Figure 3-Generation of a DO loop with Emily. These photographs show the same steps as shown in Figure 2. The menu displays all the choices available in the implemented PL/I syntax. An arrow indicates the syntax rule the user will select next. Up to twenty-two lines of text may be shown in the text area, so it appears empty with only 3 lines.]

tions are available elsewhere.10,1,11 Emily has been implemented for an IBM 2250 Graphic Display Unit, model 3. The 2250 displays lines and characters on a 12" by 12" screen. The user can give commands to the system with a light pen, a program function keyboard, and an alphameric keyboard.

THE EMILY SYSTEM

Emily is primarily intended for construction and modification of computer programs written in higher level languages. Many such systems exist, but all existing systems require the programmer to enter his text as a sequence of characters. With Emily, the user constructs his text by selecting choices from the menu to replace certain symbols in the text. For example, the symbol (STMT) might be replaced by

DO (ARITHV) = (ARITHX) TO (ARITHX);
(STMT*)
END;

Replaceable symbols begin with '(', end with ')', and contain a name that usually has some relation to the meaning of the string generated by the symbol. Such symbols are called non-terminal symbols, because of their role in the Backus-Naur Form (BNF) notation for describing programming languages.12

In BNF a syntax for a formal language has three parts: a set of terminal symbols, a set of non-terminal symbols, and a set of syntactic rules. The terminal symbols are those characters and strings of characters (punctuation, reserved words, identifiers, constants) that can be part of the completed text. The non-terminal symbols are a specific set of symbols introduced only to help describe the structure of the formal language. Every non-terminal symbol must be replaced by terminal symbols before the entire text is complete,

but the only allowable replacements for a given nonterminal are specified by the syntactic rules. In
each rule, the given non-terminal is on the left followed
by a colon followed by the sequence of symbols that
may replace the non-terminal. As an example, Figure 1
shows a portion of the syntax for PL/I. Figure 2 shows
a DO loop generated using this syntax.
It is important to note that a string generated according to a syntax is not simply a sequence of characters, but can be divided into hierarchies of substrings
on the basis of the syntactic rules. Each non-terminal
in the sequence of symbols for a rule generates a subsequence. The DO statement in Figure 2 can be one
of a sequence of statements in some higher DO loop
and can also contain a subordinate sequence of statements (generated by (STMT*)). Replacement of a
non-terminal by a rule can be thought of as replacing
the non-terminal with a pointer to a copy of the rule.

The non-terminals in this copy can be further replaced
by pointers to copies of other rules. In a diagram each
syntactic rule used in the generation of the string is
represented by a node (a rectangle). The node contains
one pointer to a subordinate node for each non-terminal
in the syntactic rule. The subordinate node is called a
subnode or a descendant, while the pointing node is
called the parent.
Emily text structure

Text in the Emily system is stored in a file, which
may contain any number of fragments. Each fragment
has a name and contains a piece of text generated by
some non-terminal symbol. Generated text is physically
stored in a hierarchical structure like that described
above. Each node is a section of memory containing
(a) the number of the syntax rule for which this node


was generated, and (b) one pointer to each subnode. In
a completed text, there is one descendant node for each
non-terminal in the syntax rule and the pointer to a
descendant is the address of the section of memory
where it is stored. If no text has been generated for a
non-terminal symbol, there is no subnode and the corresponding pointer is replaced by a code representing
the non-terminal symbol. If a subnode of a node is an
identifier, the pointer points at a copy of the identifier
in a special area. All pointers at a given identifier point
to the same copy in this identifier area. Other than
identifiers, each node is pointed at exactly once within
the text structure. This guarantees that if a node is
modified, only one piece of text is affected.
Notice that punctuation and reserved words do not
appear in this representation of text. Instead, they can
be generated because the syntax rule number identifies
the appropriate rule. Two tables in Emily contain
coded forms of the syntax rules. One table, called the
abstract syntax, controls the hierarchical structure of
generated text. It specifies which syntax rules can replace a given non-terminal symbol and the sequence of
non-terminal symbols on the right-hand-side of each
syntax rule. Another table, the concrete syntax, tells how
to display each rule; it includes punctuation, reserved
words, and formatting information like indentation and
line termination.
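A minimal sketch of this storage scheme, with invented rule encodings and written in Python rather than the PL/I Emily is actually built in, might look like this:

# Sketch of Emily-style hierarchical text: each node records which syntax rule produced
# it plus one slot per non-terminal in that rule; the concrete syntax supplies punctuation
# and reserved words when the node is displayed. Rule 1 follows Figure 1 loosely.

ABSTRACT = {1: ["ARITHV", "ARITHX", "ARITHX", "STMT*"]}          # non-terminals per rule
CONCRETE = {1: "DO {0} = {1} TO {2};\n{3}\nEND;"}                # display template per rule

class Node:
    def __init__(self, rule):
        self.rule = rule
        # one slot per non-terminal: either a subnode, an identifier string, or None
        self.slots = [None] * len(ABSTRACT[rule])

def display(item, nonterminal="STMT"):
    if item is None:                      # nothing generated yet: show the non-terminal
        return f"({nonterminal})"
    if isinstance(item, str):             # identifiers/constants point into a shared area
        return item
    parts = [display(s, nt) for s, nt in zip(item.slots, ABSTRACT[item.rule])]
    return CONCRETE[item.rule].format(*parts)

loop = Node(1)
loop.slots[0], loop.slots[1], loop.slots[2] = "I", "1", "20"
print(display(loop))      # prints the DO loop with (STMT*) still unexpanded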
Creating text

The Emily user creates hierarchical text in a series
of steps very similar to Figure 2. In each step the right
side of a rule is substituted for a non-terminal symbol.
Before the user creates any text, the fragment contains
a single non-terminal symbol. In the case of Figure 2,
that symbol is (STMT). The user sees the result of
each step on the 2250 display. Figure 3 shows the steps
of Figure 2 as they appear on the screen.
While using the Emily system the 2250 screen appears to be divided into three areas: text, menu, and
message. The text area occupies the upper two-thirds
of the screen and displays the text the user is creating.
The lower third of the screen is the menu where Emily
displays the strings the user can substitute in the text.
The bottom line of the screen is the message area, where
Emily requests operands and displays status and error
messages.
Non-terminal symbols** in the text area are underlined to make them stand out. One of the non-terminals

** When it is displayed, a non-terminal is the end (or terminal)
of a branch of the hierarchical structure. It is called a non-terminal because it must be replaced with a string of terminals before
the text is complete.

is the current non-terminal and is surrounded by a
rectangle. The menu normally displays all strings that
can be substituted for the current non-terminal. These
strings are simply the right sides of the syntax rules
that have the current non-terminal on the left.
When the user points the light pen at an item in the
menu Emily substitutes that item for the current nonterminal. Usually, the substitution string contains
more than one non-terminal and the new current nonterminal is the first of these. The user can also change
the current non-terminal by pointing the light pen at
any non-terminal in the display. Emily moves the rectangle to that non-terminal and changes the menu accordingly. When the current non-terminal is an identifier, the menu displays identifiers previously entered in
the required class (some of the classes for PL/I are
(ARITH), (CHAR), and (ENTRYNM)). The user
may select one of these, or he may enter a new identifier
from the keyboard. Constants are also entered from the
keyboard.
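Text creation is thus repeated substitution of a chosen rule for the current non-terminal; a small stand-alone sketch of that one step (the rule table here is illustrative, not Emily's):

# Sketch of the menu-substitution step: the syntax table lists, for each non-terminal,
# the rules (menu entries) that may replace it; choosing one creates a node whose empty
# slots become the new non-terminals to fill. Tables are illustrative.

RULES = {  # non-terminal -> list of (rule name, non-terminals on the right-hand side)
    "STMT":   [("DO-loop", ["ARITHV", "ARITHX", "ARITHX", "STMT*"]),
               ("assign",  ["ARITHV", "ARITHX"])],
    "ARITHX": [("sum",     ["ARITHX", "ARITHX"]),
               ("var",     ["ARITHV"])],
}

def menu_for(nonterminal):
    """What the menu area would show for the current non-terminal."""
    return [name for name, _ in RULES.get(nonterminal, [])]

def substitute(nonterminal, choice):
    """Replace the current non-terminal by the chosen rule; return the new open slots."""
    name, rhs = RULES[nonterminal][choice]
    return {"rule": name, "slots": [{"nonterminal": nt, "filled": None} for nt in rhs]}

print(menu_for("STMT"))           # ['DO-loop', 'assign']
node = substitute("STMT", 0)      # user picks the DO-loop rule with the light pen
print(node["slots"][0])           # first open slot becomes the new current non-terminal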
Viewing text

Since text is stored hierarchically within Emily, it
can be viewed with operations that take advantage of
that structure. The user may wish to descend into the
structure and examine the details of some minor substructure. Alternatively, he may wish to view the
highest levels of the hierarchy with substructures represented by some appropriate symbol. Both of these
viewing operations are possible with Emily.
The symbol displayed to represent a substructure is
called a holophrast. This symbol begins and ends with
an exclamation mark and contains two parts separated
by a colon. The first part is the non-terminal symbol
that generated the substructure and the second part is
the first few characters of the represented string. Figure 4 shows three examples of holophrasts. Note that
contraction to a holophrast only changes the view of
the file and it does not modify the file itself. Moreover,
the user never enters a holophrast from the keyboard;
they are displayed only as a result of contracting text.
The user contracts a structural unit in the display
by pushing a button on the program function keyboard
and then pointing at some character in the text. The
selected character is part of the text generated by some
node in the hierarchical structure. The display of this
node is replaced by a holophrast. If the user points at
a holophrast, the father of the indicated node contracts
to a holophrast which subsumes the earlier one. To expand a holophrast back to a string, the user returns to
normal text construction mode and points the light pen
at the holophrast.


The operations to ascend and descend in the text
hierarchy are also invoked by program function buttons. To descend in the hierarchy the user pushes the
IN button and points at a part of the text. The selected
node becomes the new display generating node; subsequent displays show only this node and its subnodes.
The OUT button lets the user choose among the ancestors of the display generating node and then makes
the selected ancestor the new display generator.
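A sketch of how a holophrast string could be formed and substituted into the display (the !non-terminal:prefix! format follows the description above; the node representation is simplified to a pair for illustration):

# Sketch of holophrast formation: contracting a structural unit replaces its displayed
# string by "!" + generating non-terminal + ":" + first N characters + "!".

def holophrast(nonterminal, text, n=7):
    return f"!{nonterminal}:{text[:n]}!"

def contract(display_line, nonterminal, text, n=7):
    """Replace the substring generated by a node with its holophrast in the display."""
    return display_line.replace(text, holophrast(nonterminal, text, n), 1)

line = "S = S + A(I);"
print(contract("DO I = 1 TO 20; S = S + A(I); END;", "STMT", line))
# -> DO I = 1 TO 20; !STMT:S = S +! END;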
System environment

At Argonne National Laboratory, the 2250 is attached to an IBM 360 model 75. The 75 is under control of the MVT version of OS/360. Unit record input/
output is controlled by ASP in an attached 360/50. The
360/75 has one million bytes of main core and one
million bytes of a Large Capacity Storage Unit.
The Emily system itself requires 60K bytes of main
core (the maximum permitted for a 2250 job at Argonne) and about 400K bytes of LCS. Emily is written
in PL/I and uses the Graphic Subroutine Package to
communicate with the 2250. Files for Emily are stored
on a 2314 disk pack. Emily is table driven and can
manipulate text in any formal language. To date,
tables have been created for four languages: PL/I,

[Figure 4-Examples of holophrasts. All three examples show the DO loop, but each has been contracted differently: the whole loop, the contained statement (!STMT:S = S +!), and the contained expression (!ARITHX:S + A(I!). Each holophrast begins and ends with an exclamation mark and contains the non-terminal that generated the string, a colon, and the first N characters of the represented string. The user may change N, the number of characters of the substring; in the examples, N is seven.]

GEDANKEN,13 a simple hierarchy language for writing
thesis outlines, and a language for creating syntax
definitions.
USER ENGINEERING PRINCIPLES
The first principle is KNOW THE USER. The system
designer should try to build a profile of the intended
user: his education, experience, interests, how much
time he has, his manual dexterity, the special requirements of his problem, his reaction to the behavior of
the system, his patience. One function of such a profile
is to help make specific design decisions, but the designer must be wary of assuming too much. Improper
automatic actions can be an annoying system feature.
A more important function of the first principle is to
remind the designer that the user is a human. He is
someone to whom the designer should be considerate
and for whom the designer should expend effort to provide conveniences. Furthermore, the designer must
remember that human users share two common traits:
they forget and they make mistakes. With any interactive system problems will arise-whether the user is
a high school girl entering orders or a company president asking for a sales breakdown. The user will forget
how to do what he wants, what his files contain, and
even-if interrupted-what he wanted to do. Good system design must consider such foibles and try to limit
their consequences. The Emily design tried to limit
these consequences by explicitly including a fallible
memory and a capacity for errors in the intended user
profile. Other characteristics assumed are:
curious to learn to use a new tool,
skilled at breaking a problem into sub-problems,
familiar with the concept of syntax and the general
features of the syntax for the language he is using,
manually dextrous enough to use the light pen,
not necessarily good at typing.
Throughout the following discussion, reference is
made to 'modularity' and 'modular design.' These
terms refer to the structure of the program, but have
important consequences for user engineering. A modular program is partitioned into subroutines with distinct functions and distinct levels of function. For
instance, a high level modular subroutine implements
a specific user command but modifies the data structure
only by calls on lower level modules. To be useful for
the general case, the lower modules must have no functions dependent on specific user commands. In the
Emily system, for example, there are user commands to
MOVE and COPY text and there are low level routines



for the same functions. These low level routines always
destroy the existing information at the destination,
but the user commands are defined to move that existing information to the special fragment *DUMP*. The
low level routines must be called twice (destination→*DUMP*; source→destination) to implement the user
commands, but these same routines are used in several
other places in the system. Designing adequate modularity into a system requires careful planning at an
early stage, but pays off with a system that takes less
time to implement, is easier to modify, and can be debugged with fewer problems and more confidence of
success.
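As a concrete illustration of this layering (fragment names and the transfer routine are hypothetical, though the *DUMP* convention is the one described above), a user-level MOVE built from a destructive low-level transfer might look like:

# Sketch of modular layering: one low-level routine that always destroys the destination,
# reused twice to give the user a non-destructive MOVE (old destination text goes to *DUMP*).

fragments = {"A": "old text at A", "B": "text to move", "*DUMP*": ""}

def transfer(source, destination):
    """Low level: overwrite destination with source's text (destination's old text is lost)."""
    fragments[destination] = fragments[source]
    fragments[source] = ""

def user_move(source, destination):
    """User command: save what the destination held, then move the source there."""
    transfer(destination, "*DUMP*")   # first call: destination -> *DUMP*
    transfer(source, destination)     # second call: source -> destination

user_move("B", "A")
print(fragments)   # A holds the moved text, *DUMP* holds A's old text, B is empty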
Specific user engineering principles to help meet the first principle can be categorized into
MINIMIZE MEMORIZATION,
OPTIMIZE OPERATIONS,
ENGINEER FOR ERRORS.

The principles are outlined in Figure 5.

User Engineering Principles
First principle:
Know the user
Minimize Memorization
Selection not entry
Names not numbers
Predictable behavior
Access to system information
Optimize Operations
Rapid execution of common operations
Display inertia
Muscle memory
Reorganize command parameters
Engineer for Errors
Good error messages
Engineer out the common errors
Reversible actions
Redundancy
Data structure integrity

Figure 5-User engineering principles

Minimize memorization

Because the user forgets, the computer memory
must augment his memory. One important way this
can be accomplished is by observing the principle
SELECTION NOT ENTRY. Rather than type a character
string or operation name, the user should select the
appropriate item from a list displayed by the computer.
In a sense, the entire Emily system is based on this
principle. The user selects syntax rules from the menu
and never types text. Even when an identifier is to be
entered, Emily displays previously entered identifiers;
though the user must type in new identifiers. Because
the system is presenting choices, the user need not remember the exact syntax of statements in the language,
nor the spelling of identifiers he has declared. Moreover,
each selection-a single action by the user-adds many
characters to the text. Thus if the system can keep up
with the user, he can build his text more quickly than
by keyboard entry.
The principle of 'selection not entry' is central to
computer graphics and by itself constitutes a revolution
in work methods. The author first saw the principle in
the work of George14 and Smith 7 but has since observed
it in many systems. The fact is that a graphic display, attached to a high bandwidth channel, can display
many characters in the time it would take a user to
type very few. If the choices displayed cover the user's
needs, he can enter information more quickly by selection. Ridsdale 15 has reported a patient note system used
in a British hospital that is based on the principle of
selection. In this system, selection is not by light pen
but by typing the code that appears next to the desired choice in the menu.
Experience with Emily suggests that keyboard code
entry is better than light pen selection because of two
user frustrations. First, the menu does not provide a
target for the light pen while the display is changing;
and second, the delay can vary depending on system
load. With keyboard codes, the user can go at full
speed in making selections he is familiar with, but
when he gets to unfamiliar situations he can slow down
and wait for the display. Thus, his behavior can travel
the spectrum from typing speed to machine paced
selection.
The second principle to avoid memorization is
NAMES NOT NUMBERS. When the user is to select from
a set of items he should be able to select among them
by name. In too many systems, choices are made by
entering a number or code which the system uses to
index into a set of values. Users can and do memorize
the codes for their frequent choices, though this is one
more piece of information to obscure the problem at


hand. But when an uncommon choice is needed, a code
book must be referenced. Symbol tables are understood
well enough that there is no excuse for not designing
them into systems so as to replace code numbers with
names. In Emily, there are names for files, fragments,
display statuses, syntaxes, and non-terminals. Conceivably, the user could even supply a name to be displayed in each holophrast. In practice, though, so
many holophrasts are displayed that the user would
never be done making up names. For this reason, the
holophrast contains the non-terminal and the first few
characters of the text, a system-generated 'name' with
a close relation to the information represented by that
name.
It is also possible to forget the meaning of a name, so
a system should also provide a dictionary. System
names should be predefined and the user should be allowed to annotate any names he creates. The lack of a
dictionary in Emily has sometimes been a nuisance
while trying to remember what different text fragments
contain.
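A sketch of the 'names not numbers' idea, together with the annotation dictionary the text argues for (all entries are invented examples):

# Sketch of name-based selection: user-visible names map to internal values, and every
# name can carry an annotation so its meaning is never lost. Entries are illustrative.

symbols = {}   # name -> (value, annotation)

def define(name, value, annotation=""):
    symbols[name] = (value, annotation)

def lookup(name):
    return symbols[name][0]

def describe(name):
    _value, note = symbols[name]
    return f"{name}: {note or '(no annotation)'}"

define("thesis-outline", 17, "fragment holding chapter headings for the outline syntax")
define("status-compact", 3,  "display status that shows holophrasts at depth one")

print(lookup("thesis-outline"))    # the user never types the internal number 17
print(describe("status-compact"))  # the dictionary answers 'what did that name mean?'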
The next principle, PREDICTABLE BEHAVIOR, is not
easy to describe. The importance of such behavior is
that the user can gain an 'impression' of the system and
understand its behavior in terms of that impression.
Thus by remembering a few characteristics and a few
exceptions, the user can work out for himself the details
of any individual operation. In other words, the system
ought to have a 'Gestalt' or 'personality' around which
the user can organize his perception of the system. In
Emily all operations on text appear to make it expand
and contract. Text creation expands a non-terminal to
a string and the viewing operations expand and contract between strings and holophrasts. This commonality lends the unity of predictable behavior to Emily.
Predictable behavior is also enhanced by system
modularity. If the same subroutine is always used for
some common interaction, the user can become accustomed to the idiosyncrasies of that interaction. For
instance, in Emily there is one subroutine for entering
names and other text strings so that all keyboard interactions follow the same conventions.
The last memory minimization principle is ACCESS TO
SYSTEM INFORMATION. Any system is controlled by
various parameters and keeps various statistics. The
user should be given access to these and should be
able to modify from the console any parameter that he
can modify in any other way. With access to the system
information, the user need not remember what he said
and is not kept in the dark about what is going on.
Emily provides means of setting several parameters,
but fails to have any mechanism for displaying their
values. This oversight is due to a failure to remember


that the user might not have written the system.
Another such oversight is a failure to provide error
messages for many trivial user errors. Even worse, the
'MULTIPLE DECLARATION' error message originally failed to say which identifier was so declared. This
has been corrected, but should have been avoided by
attention to the 'Access to system information' principle
of user engineering.
Optimize operations

The previous section stressed the design-the logical
facilities-of the set of commands available to the user.
'Optimize operations' stresses the physical appearance
of the system-the modes and speeds of interaction and
the sequence of user actions needed to invoke specific
facilities. The guiding principle is that the system
should be as unobtrusive as possible, a tool that is
wielded almost without conscious effort. The user
should be encouraged to think not in terms of the light
pen and keyboard, but in terms of how he wants to
change the displayed information.
The first step in operation optimization is to design
for RAPID EXECUTION OF COMMON OPERATIONS. Because Emily text is frequently modified in terms of its
syntactic organization, a data structure to represent
text was chosen so as to optimize such modification.
The text display is regenerated frequently, so considerable effort was expended to optimize that routine. More
effort is required, though; it is still slow largely because
a subroutine is called to output each symbol. Less frequent operations like file switching do not justify special
optimization. Lengthy operations, however, should
display occasional messages to indicate that no difficulty has occurred. For instance, while printing a file
Emily displays the line number of each tenth line as it
is printed.
As the system reacts to a user's request, it should
observe the principle of DISPLAY INERTIA. This means
the display should change as little as necessary to carry
out the request. The Emily DELETE operation replaces a holophrast (and the text it represents) with a
non-terminal symbol. The size and layout of the display do not change drastically. Text cannot be· deleted
without first being contracted to a holophrast, thus
deletion, a drastic and possibly confusing operation, does not add the disorientation of a radically changed
display. The Emily display also retains inertia in that
the top line changes only on explicit command. Some
linear text systems always change the display so the
line being operated on is in the middle of the display.
Because the perspective is constantly shifting, the user


is sometimes not sure where he is. The Emily automatic
indentation provides additional assistance to the user.
As text is created in the middle of the display, the
bottom line moves down the display. Since this line is
often not indented as far as the preceding line, its
movement makes a readily perceptible change in the
display.
One means of reducing the user's interaction effort is
to design the system so the user can operate it on
'MUSCLE MEMORY.' Very repetitive operations like driving a car or typing are delegated by the conscious mind
to the lower part of the brain (the medulla oblongata).
This part of the brain controls the body muscles and
can be trained to perform operations without continual
control from the conscious mind. One implication of
muscle memory is that the meaning of specific interactions should have a simple relation to the state of
the system. A button should not have more than a few
state dependent meanings and one button should be
reserved to always return the system to some basic control state. With such a button, the muscle memory can
be trained to escape from any strange or unwanted
state so as to transfer to a desired state. In Emily the
buttons of the program function keyboard obey these
principles. The NORMAL button always returns the
entire system to a basic state waiting for commands.
Other buttons have very limited meanings and it is
almost always possible to abort one command and invoke another simply by pushing the other button (without pushing NORMAL first).
A second implication of muscle memory for system
design is that the system must be prepared to accept
commands in bursts exceeding ten per second. (Typing
100 words per minute is 10 characters per second. A
typing burst can be faster.) It is not essential that the
system react to commands at this rate, because interactive computer use is characterized by command
bursts followed by pauses for new inspiration. But if
command bursts are not accepted at a high rate, the
muscle memory portion of the brain cannot be given
full responsibility for operations. The conscious brain
has to scan the system indicators waiting for GO. Command bursts from muscle memory account for the unsuitability of the light pen for rule selection as discussed
under 'selection not entry.'
In addition to optimizing the interaction time, the
system designer must be prepared to REORGANIZE
COMMAND PARAMETERS. Observation of users in action
will show that some commands are not as convenient
as their frequency warrants while other commands are
seldom used. Inconvenient commands can be simplified
while infrequent commands can be relegated to subcommands. Such reorganization is simplified if the origi-

nal system design has been adequately modularized.
High level command routines can be rewritten without
rewriting low level routines and the latter can be used
without fear that they depend on the higher level.
A good example of command reorganization in Emily
has been the evolution of the view expansion commands.
In the earliest version, pointing the light pen at a holophrast expanded it one level, so that each of the subnodes of the holophrast became a new holophrast. With
this mechanism, many interactions were required to
view the entire structure represented by a holophrast.
Very soon the system-designer/user added a system
parameter called 'expansion depth.' This parameter
dictated how many levels of a holophrast were to be
expanded. To set the expansion depth, the user pushed
a button (on the program function keyboard) and
typed in a number (on the alphameric keyboard). It
soon became obvious that users almost always set the
expansion depth to either one or all. Consequently, two
buttons were defined, so that the user could choose
either option quickly. Later, the button for typing in
the expansion depth was removed and that function
placed under a general 'set parameters' command. Further experience may show that only the 'expand one
level' button is required. It would take effect only
during the next holophrast expansion. At all other
times, holophrasts would always be expanded as far as
possible.
Engineer for errors

Modern computers can perform billions of operations
without errors. Knowing this, system designers tend to
forget that neither users nor system implementers
achieve perfection. The system design must protect the
user from both the system and himself. After he has
learned to use a system, a serious user seldom commits
a deliberate error. Usually he is forgetful, or pushes the
wrong button without looking, or tries to do something
entirely reasonable that never occurred to the system
designer. The learner, on the other hand, has a powerful, and reasonable, curiosity to find out what happens
when he does something wrong. A system must protect
itself from all such errors and, as far as possible, protect
the user from any serious consequences. The system
should be engineered to make catastrophic errors difficult and to permit recovery from as many errors as
possible.
The first principle in error engineering is to provide
GOOD ERROR MESSAGES. These serve as an invaluable
training aid to the learner and as a gentle reminder to
the expert. With a graphic display it is possible to pre-


sent error messages rapidly without wasting the user's
time. Error messages should be specific, indicating the
type of error and the exact location of the error in the
text. Emily does not have good messages for user
errors. Currently, the system blows the whistle on the
2250 and waits for the next command from the user.
Each error is internally identified by a unique number,
and it will not be difficult to display the appropriate
message for each number.
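Since each error already carries a unique internal number, attaching a specific message is a small table lookup; a sketch with invented error codes and wording:

# Sketch of turning internal error numbers into specific, located error messages.
# Numbers and wording are illustrative.

MESSAGES = {
    101: "identifier '{0}' is multiply declared",
    102: "no rule can replace non-terminal {0} here",
    103: "operand missing for command {0}",
}

def report(error_number, *details, location=None):
    text = MESSAGES.get(error_number, "error {}".format(error_number)).format(*details)
    where = f" at {location}" if location else ""
    return f"ERROR {error_number}{where}: {text}"

print(report(101, "SUM", location="fragment MAIN, line 12"))
print(report(103, "MOVE"))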
It is not enough to simply tell the user of his errors.
The system designer must also be told so he can apply
the principle ENGINEER OUT THE COMMON ERRORS. If an
error occurs frequently, it is not the fault of the user,
it is a problem in the system design. Perhaps the keyboard layout is poor or commands require too much
information. Perhaps consideration must be given to
the organization of basic operations into higher level
commands.
Emily provides several means of feedback from the
user to the system designer. (Though for the most
part, they have been one and the same.) A log is kept
of all user interactions, user errors, and system errors.
There is a command to let the user type a message to
be put in the log and this message is followed by a
row of asterisks. When the user is frustrated he can
push a 'sympathy' button. In response, Emily displays
at random one of ten sympathetic messages. More importantly, frustration is noted in the log and the system
designer can examine the user's preceding actions to
find out where his understanding differed from the system implementation.
'Engineering errors out' does not mean to make them
impossible. Rather they should be made sufficiently
more difficult that the user must pause and think before he errs. In Emily, time consuming operations like
file manipulation always ask the user for additional
operands. If he does not want the time consuming
operation he can do something else. To delete text, the
user must think and contract it to a holophrast. This
means that large structures cannot be cavalierly
deleted.
A single erroneous deletion can inadvertently remove
a very large substructure from the file. To protect the
user the system must provide REVERSIBLE ACTIONS.
There ought to be one or more well understood means
for undoing the effects of any system operation. In
Emily, a deleted structure is moved to *DUMP*. If
the user has made a mistake, he can reach into this
'trash can' and retrieve the last structure he has deleted. (Deletion does destroy the old contents of
*DUMP*.) A more general reversible action mechanism
would be a single button that always restored the state
existing before the last user interaction. Emily has no


such button, but the QED system16 supplies a file containing all commands issued during the console session.
The user can modify this file of commands and then
use it as a source of commands to modify the original
text file again.
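A sketch of the QED-style safety net mentioned here: log every command and recover by replaying an edited copy of the log against the saved original (the command set and text are invented for illustration):

# Sketch of reversibility by command logging: every editing command is appended to a log;
# to undo a mistake, the user edits the log (e.g., drops the bad command) and replays it
# against the saved original.

def apply(text, command):
    op, *args = command
    if op == "append":
        return text + args[0]
    if op == "delete":                 # delete first occurrence of a substring
        return text.replace(args[0], "", 1)
    return text

def replay(original, log):
    text = original
    for command in log:
        text = apply(text, command)
    return text

original = "S = S + A(I);"
log = [("append", " T = T + 1;"),
       ("delete", "S = S + A(I);"),    # the mistaken deletion
       ("append", " RETURN;")]

as_typed = replay(original, log)
repaired = replay(original, [c for c in log if c != ("delete", "S = S + A(I);")])
print(as_typed)    # the session as it actually happened: the first statement is gone
print(repaired)    # same session replayed without the bad delete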
Besides helping the user escape his own mistakes,
error engineering must protect the user from bugs in
the system and its supporting software. Modular design
is important to such protection because it minimizes
the dependencies among system routines. The implementer should be able to modify and improve a routine
with confidence that his changes will affect only the
operation of that routine. Even if the changes introduce
bugs, the user will be protected if the designer has observed the principles of redundancy and data structure
integrity.
REDUNDANCY simply means that the system provides
more than one means to any given end. A powerful
operation can be backed up by combinations of simpler
operations. Then if the powerful operator fails, the user
can still continue with his work. Such redundancy is
most helpful while debugging a system, but very few
systems are completely debugged and any aids to the
debugger can help the user. As an adjunct of redundancy, the system must detect errors and let the user
act on them, rather than simply dumping memory and
terminating the run. In Emily, the PLjI ON-condition
mechanism very satisfactorily catches errors. They are
passed to a subroutine in Emily that tells the user that
a catastrophe has occurred and names the offending
module. Control then returns to the normal state of
waiting for a command from the user, who has the option to continue or call for a dump.
A system should provide sufficient DATA STRUCTURE
INTEGRITY that regardless of system or hardware trouble
some version of the user information will always be
available. This principle is especially applicable to
Emily where most of the information is encoded by
pointers. A small error in one pointer can lose a large
chunk of the file. Some effort has been spent ensuring
that errors in Emily will not damage the part of the
data structure kept in core during execution. But if an
error abruptly terminates Emily execution (such errors
are generally in the system outside Emily) the file on
the disk may be in a confused state. Currently, the only
protection is to copy the file before changing it, but
there are file safety systems that do not rely on the
user to protect himself, and one of these should be
implemented for Emily.
Protection and assistance for the user are keywords
in user engineering. The principles outlined in this
paper are not as important as the general approach of
tailoring the system to the user. Only by such an ap-


proach can Computer Science divest the computer of
its image as a cold, intractable, and demanding machine. Only by such an approach can the computer be
made sufficiently useful and attractive to take its place
as a valuable tool for the creative worker.

ACKNOWLEDGMENTS
I am grateful to Dr. John C. Reynolds and Dr. William
F. Miller. Any success of the Emily project is due to
their persistent advice and encouragement.

REFERENCES
1 W J HANSEN
Creation of hierarchic text with a computer display
Argonne National Laboratory ANL-7818 Argonne
Illinois 1971
2 J McCARTHY D BRIAN G FELDMAN
J ALLEN
THOR-a display based time sharing system
AFIPS Conf Proc Vol 30 (SJCC) 1967 pp 623-633
3 W WEIHER
Preliminary description of EDIT2
Stanford Artificial Intelligence Laboratory Operating
Note 5 Stanford California 1967
4 DEC LIBRARY
PDP-6 time sharing TECO
Stanford Artificial Intelligence Laboratory Operating
Note 34 Stanford California 1967
5 STANFORD UNIVERSITY COMPUTATION
CENTER
Wylbur reference manual
Campus Facility Users Manual Appendix E Stanford
California 1968
6 D C ENGELBART
Private communication
Stanford Research Institute Menlo Park California 1971

7 L B SMITH
The use of man-machine interaction in data-fitting problems
Stanford Linear Accelerator Center Report 96 Stanford
California 1969
8 J G MITCHELL
The design and construction of flexible and efficient
interactive programming systems
Department of Computer Science Carnegie-Mellon
University Pittsburgh Pennsylvania 1970
9 R B MILLER
Response times in man-computer conversational
transactions
AFIPS Conf Proc Vol 33 (FJCC) part 1 1968 pp 267-277
10 W J HANSEN
Graphic editing of structured text
in Advanced Computer Graphics R D PARSLOW
R E GREEN editors Plenum Press London 1971
pp 681-700
11 W J HANSEN
Emily user's manual
Argonne National Laboratory Argonne Illinois
forthcoming
12 J W BACKUS
The syntax and semantics of the proposed international
algebraic language of the Zurich ACM-GAMM
conference
Proc International Conf on Information Processing
UNESCO 1959 pp 125-132
13 J C REYNOLDS
GEDANKEN-a simple typeless language based on the
principle of completeness and the reference concept
Comm ACM Vol 13 No 5 1970 pp 308-319
14 J E GEORGE
Calgen-an interactive picture calculus generation system
Computer Science Department Report 114 Stanford
University Stanford California 1968
15 B RIDSDALE
The visual display unit for data collection and retrieval
in Computer Graphics in medical research and hospital
administration R D PARSLOW R E GREEN editors
Plenum Press London 1971 pp 1-8
16 K THOMPSON
QED text editor
Bell Telephone Laboratories Murray Hill New Jersey 1968

Computer assisted tracing of text evolution
by W. D. ELLIOTT, W. A. POTAS and A. VAN DAM

Brown University
Providence, Rhode Island

INTRODUCTION

Many situations exist in which convenient access to the detailed evolutionary information associated with a text's development is desirable:

(1) The principal author of a paper edited by a number of people, perhaps within the context of a group project, might desire to see who made what changes and comments.

(2) Many periodicals retain the entire evolutionary development of each article, so that all of the changes leading to the development of a final version can be identified in case, for example, a resulting article is legally challenged.

(3) There are many instances when an author would like to be able to determine the exact nature of his editing changes. One example is that of an author who wants to study how his thoughts on a particular issue have evolved and matured over a period of time or who wants to determine how an error has crept into his writing. Another example is the case of an individual who has written a computer program (a particular kind of text) for use on one computer system and who must then convert the program to run it on another system. If he later improved this new edition and then discovered he had to run the improved program under the old system, some manner of noting all the individual changes made to the original would be important in order to separate the conversion changes from the improvement changes.

Conventionally, manuscripts are written by editing a series of hard copy rough drafts until a final version is produced, where the creation of each new draft typically entails many changes of wide-ranging complexity. If a writer desires to compare the changes between two different drafts for phraseology or content, wishes to review the manner in which his treatment of a topic has changed during the course of editing, or must determine how or when during editing an error has been introduced into his text, his marked-up hard copy drafts (if they have been saved) provide a rough means of tracing the way in which his text developed. Several other attributes of hard copy are the portability of hard copy drafts, the ability to easily access material anywhere within these drafts, and the ability to identify different editors or distinguish between different editing passes on the basis of handwriting style or color of ink.

The principal disadvantage of hard copy editing is its tendency to lead to congestion when many changes are made to a single draft. As a result, a writer may be inhibited from making further editing changes to a draft because of the increase in information density on a page which is fixed in size. Fixed page size also encumbers a writer in making editing changes which involve points on different pages, such as moving or copying text from one place to another. Furthermore, the changes indicated on any single draft may be sufficiently convoluted that the evolutionary nature of the changes may become difficult to discern. A writer faced with such a mass of editing changes could either continue to edit the same draft, in which case the text would become even more illegible, or he could frequently start new drafts to insure a legible presentation of the evolutionary information. Creation of a plethora of drafts would, however, impair later access to the developmental changes because of the necessity of searching through the great number of drafts.

The conventional method employed to deal specifically with evolutionary information uses revision bars, vertical lines placed in the margin of a text to call attention to newly made changes. Although revision bars show the points of change more clearly than heavily edited hard copy drafts, evolutionary information associated with overlapping editing changes is obscured. For example, whenever several editing changes within a

given portion of the text overlap, the revision bars for
the edits merge and the distinction between the individual editing changes is lost. A more significant drawback
to revision bars is the loss of all deleted text, any annotative information explaining editing changes, and
information concerning the description of the individual
edits (the context in which changes were made, in
particular). This information could be partially preserved by annotating the revision bars with information
specifying what the indicated changes were, who made
each change, and when each was made and why. (As
an example, D. C. Engelbart's NLS computer assisted
editing system1 provides a facility for storing the
initials of the last person to edit a given statement as
well as the date and the time of the editing session.)
However, the loss of any deleted material and the general inability to see exactly what changes were made
without comparing separate drafts remain inherent
problems in any revision bar system.
Computer based editing systems such as NLS and
HES2 have been developed which provide a writer with
excellent facilities for online composition and editing of
his text. Despite their high degree of sophistication,
none of these systems have the ability to retain any
significant degree of evolutionary information concerning a text's development. Except for Engelbart's NLS
system which provides the limited facilities noted
above, the only manner in which text editing systems
provide evolutionary information concerning a text's
development is by allowing printing or offline storage
of intermediate drafts which must be manually compared in order to identify specific changes.
Although hard copy, revision bar, and current computer aided editing systems cannot individually provide
optimal capabilities for both editing and evolutionary
tracing, each does have its own particular advantages.
It is the purpose of this paper to consider how facilities
for tracing the evolution of a changing text can be integrated with the facilities provided by advanced currently existing computer aided text editing systems to
produce a more comprehensive writing system.
A COMPUTER AIDED WRITING SYSTEM
WITH EVOLUTIONARY TRACING
In contrast with the above methods, the next system
to be considered is capable of preserving the finely delineated details of a text's evolution without requiring
the retention of separate drafts. This system, computerized of necessity and using an interactive graphics
terminal for system/user interface, has been implemented by the authors. The record of a text's evolutionary development is accumulated by storing a complete description of each editing operation, as the
operation is performed, into the system's internal data
structure representation of a text. For example, text
which an author has deleted and which does not subsequently appear on the display screen is retained in the
internal data structure as part of the system entry for
that change. The system identifies all changes by editor
and date, and allows an editor to optionally associate
explanatory annotation with any change.
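The shape of such an entry can be pictured with a short Python sketch; the field names, initials, and date below are illustrative only, since the paper does not publish the actual record layout.

from dataclasses import dataclass
from typing import Optional

@dataclass
class EditRecord:
    # One traced editing operation as the system might store it; every field
    # name here is an assumption made for illustration.
    number: int                       # edit number identifying the change
    kind: str                         # "insert", "delete", "substitute", "move" or "copy"
    editor: str                       # editor's initials
    date: str                         # date of the editing session
    deleted_text: str = ""            # material removed from the display but retained
    inserted_text: str = ""           # material added by the edit
    annotation: Optional[str] = None  # optional explanatory note attached by the editor
    nullified: bool = False           # set later by active review

# A deletion never discards its text; the deleted characters travel with the record.
example = EditRecord(number=3, kind="delete", editor="WDE", date="4/7/71",
                     deleted_text="hard copy", annotation="phrase superseded")
print(example.deleted_text)   # -> hard copy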
The online storage and CPU requirements needed to
support a system providing access to the evolutionary
development of a text are of necessity quite large, since
each edit recorded in the system requires that a significant amount of information be stored, including all
"deleted" material, and since the total amount of information grows commensurately with the number of
editing operations performed. Additionally, the display
is called upon to present a substantial amount of information to the user, resulting in a relatively dense
presentation. Although the system has its inherent
shortcomings, it does possess two distinct advantages
over currently available editing methods-computer
assisted editing which offers both convenient editing
functions (including insert, delete, substitute, move and
copy) and immediate display of the updated text, and
use of the accumulated evolutionary information to
provide both passive review facilities for examining
prior editing changes and a set of more advanced active
review facilities which enable the user to modify specific
editing changes ex post facto.
PASSIVE REVIEW
Although this system is able to accumulate a complete record of editing changes made to a text, the
measure of the system's usefulness is its ability to present this information to the user so that he can easily
deal with the text and its stored evolutionary information. The system facilities described below provide an
"instant replay" capability for displaying in any of the
basic display modes the development of selected portions of a text, with a number of options available to
filter the material to be displayed. Normal display always shows a clean, updated version of a text, incorporating all applicable changes, while proofreader display shows a previous version of the text and superimposes stylized markings, functionally similar to a
proofreader's blue-pencillings, to indicate the changes
transforming this version into the version normal display would show.
Since an entire text does not in general fit on the
display screen, the spatial review facility allows a user
to scroll through his text until that particular portion
is displayed which he wants to examine. Random access
to locations in a text, as contrasted with the sequential
nature of spatial review, is provided by allowing a user
to associate an identifier with a specific location in his
text. When subsequently requested, the system displays the text starting at the location associated with a
selected identifier.
Because each successive editing change stored in the
system defines an incrementally different version of a
text, the aggregate of information available is sufficient
to recreate the historical sequence of versions corresponding to the successive transformations of the original text. The system provides a chronological review
facility by allowing a writer to request that any particular version from this historical sequence be displayed, using either display mode. In proofreader
display mode, another feature is the ability to show the
transformation of any selected version into any later
version. Whenever an editor chooses to examine the
sequential transformation of his text from an earlier
version into a more recent version, he can indicate
whether or not the system is to confine itself to the
chronological changes in a particular section or whether
it is to jump from one location to another in displaying
the chronologically successive changes to the entire
text. If individual changes are being chronologically
reviewed in the latter manner and an editing change
involving multiple text fields, such as a move (text
deleted in one location and copied elsewhere), is encountered, the system will by default show the spatial
area containing the deleted text associated with that
change. If the user desires to see the spatial area containing the other field, he can issue a simple system
command.
In order to allow an author to review by editor the
changes made in a text, the system can filter out all
editing changes which have not been made by a selected editor, and display in either proofreader or
normal display mode only those editing changes or
versions which that editor has produced.
DISPLAY AND DATA STRUCTURE
FACILITIES
As suggested above, a particular display may contain a substantial amount of information, especially in
proofreader display mode. The effort to prevent a display from becoming congested and thus preserve its
legibility resulted in the provision of several system
facilities dealing with this area.
First, design of the display modes provided by the
system had to take into account the limitations imposed by the large amount of information to be dis-
played and the requirements of the graphics terminal
employed. Consequently, the design emphasis was on
creating a display format which would be as straightforward and canonical as possible. A complete description of the display modes appears in Appendix I.
Briefly, for proofreader display, the display screen is
divided into two major sections, with one section displaying the text with embedded proofreader marks, and
the other containing descriptive information identifying
the changes indicated by the proofreader marks (including when and by whom the changes were made).
Second, the system is designed to associate user-specified priority levels with each editing change. In
preparing a proofreader display, only those changes
with a priority greater than a chosen limit are displayed
with the stylized proofreader markings. Remaining
changes are incorporated into the displayed text but
are not separately indicated. This edit priority facility
is designed to prevent the display from becoming congested with proofreader-marked and annotated editing
changes of a minor nature (spelling corrections, for
example). Otherwise, not only might the significant
edits tend to be eclipsed on the display screen by a
mass of minor ones, but the system itself might have
difficulty in creating the display since the high density
of edits could tax the capability of the display screen
to simultaneously show both the proofreader-marked
text and the descriptive information associated with
each change.
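A minimal sketch of this priority filter, assuming integer priority levels and a simple list of edit descriptors (both assumptions of the example, not the system's actual representation):

# Hypothetical edits with user-assigned priorities; numbers and descriptions are invented.
edits = [
    {"number": 1, "priority": 1, "desc": "spelling correction"},
    {"number": 2, "priority": 5, "desc": "paragraph substituted"},
    {"number": 3, "priority": 4, "desc": "sentence moved"},
]

def marked_edits(edits, priority_limit):
    # Only edits above the chosen limit keep their proofreader markings;
    # the rest are incorporated into the displayed text without being flagged.
    return [e for e in edits if e["priority"] > priority_limit]

print([e["number"] for e in marked_edits(edits, priority_limit=3)])   # -> [2, 3]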
Another system capability which reduces the number
of edits to be displayed is the ability to specify that
particular editing changes are to physically alter the
text without being traced. Since no information pertaining to physically executed changes is stored in the
system, such changes can never be reviewed or modified.
Physical execution of editing changes is thus particularly well suited for minor corrections which do not
warrant the retention of tracing information.
Physical execution in a single edition text not only
contributes to display clarity, but reduces system overhead as well by eliminating the necessity of dealing
with the internally stored editing information. If the
evolutionary development of a text has become static
after a period of time and the traced information associated with each editing change is no longer needed,
the user may specify that all traced edits (or all editing
changes up to some particular one) be physically executed. Additionally, a user can request that a copy of
the current internal data structure (comprised of the
text and its evolutionary information) be placed into
archival storage before physical execution takes place.
If the need to reference information concerning earlier
editing changes ever arose again, the user could call
this file into the system and examine it as before. In
effect, this procedure produces a series of files similar
to a sequence of hard copy drafts, while retaining the
evolutionary information intact and readily accessible.
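The housekeeping step just described might look roughly like the following sketch, in which the dictionary layout and the deep-copy archive are invented for illustration:

import copy

archive = []   # stand-in for the archival storage mentioned above

def physically_execute(data_structure, up_to, archive_first=True):
    # Optionally archive the whole structure, then discard tracing information
    # for every traced edit numbered up to and including 'up_to'.
    if archive_first:
        archive.append(copy.deepcopy(data_structure))
    data_structure["traced_edits"] = [n for n in data_structure["traced_edits"] if n > up_to]
    return data_structure

document = {"text": "evolutionary text editing", "traced_edits": [1, 2, 3, 4]}
physically_execute(document, up_to=2)
print(document["traced_edits"])   # -> [3, 4]; edits 1 and 2 can no longer be reviewed
print(len(archive))               # -> 1; the pre-execution state remains retrievable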
ACTIVE REVIEW
The active review facilities provide means for a user
to modify the current state of his text by altering the
effect of prior editing changes. The three principal
types of modification are nullification of a past editing
change (in which case the current text is altered to
appear as though the nullified editing change had never
occurred), reacceptance of such a nullified edit, or
physical execution of a prior change, whether active or
nullified. With active review, an author can reject and
thereby nullify the effect of an editing change which
he later determines to be counterproductive or which,
for example, has inadvertently deleted a portion of his
text. Consequently, no ideas can ever be permanently
lost from a manuscript unless a writer specifically requests that particular edits be physically executed.
With the addition of these facilities, an author always
has five possible actions available to him for dealing
with his text. He can make a new editing change and
have it either traced or physically executed, or he can
nullify, reaccept, or physically execute a prior change.
In order to keep nullified editing changes accessible for
possible reacceptance, a third display mode, full display, is provided which has the same form as proofreader display, but which identifies nuflified edits as well
as currently active editing changes. Since a writer frequently deals with a number of editing changes which
logically fall into particular groups or categories (all
changes in a section of text, all changes in tense, etc.),
the system provides the ability to group an arbitrary
number of editing changes so that active review
modification will be applied to each edit specified in the
group when the group itself is referenced. Although new
editing changes must always be performed on the latest
(or current) version of text, modification of any past
editing change or a group of editing changes can be done
without regard to the chronological order in which the
edits were originally performed.
Because active review provides for modification of
editing changes already made, the fact that edits are
not necessarily independent of each other must be
taken into account when the system performs active
review modification of any particular edit. Any pair of
editing changes in a text may be either spatially unrelated or overlapping. If two edits are spatially unrelated, then active review or physical execution of
either has no effect on the other. If not, the edits are
implicitly related because chronologically later editing

changes depend on the context established by earlier
editing changes. Changing an antecedent edit by active
review thus has an impact on spatially related later
edits. If, for example, a number of relatively minor
substitutions were made to a section of text inserted in
a previous version, a user would probably want all of
these substitutions to be nullified if the enclosing insert
was nullified. However, if the changes were of a more
substantial nature, possibly substituting whole paragraphs for already existing text, the user might want
this material to be retained even if the enclosing insert
were to be nullified. The coupling mechanism provided
by the system to control the interaction between spatially overlapping editing changes gives the user the
ability to choose either of these coupling arrangements,
as appropriate, for any particular set of related edits.
The coupling mechanism employs two coupling fields
for each editing change. The emanate scope value associated with an edit specifies whether or not modification
of this edit by active review will be allowed to affect
any spatially overlapping edits. Conversely, the receive
scope value associated with an edit specifies whether
text associated with that edit itself is to be changed
when active review modification of an overlapping edit
is made. In order for any modification of the text
associated with an overlapping edit to take place, both
of the following conditions must be satisfied: the edit
to be modified must emanate scope and the overlapping
edit must receive scope. An example of how edit coupling can be employed is given in Figure 1.

SCHEMATIC: The following sequence of changes is made to a section of text:
1. Text is moved from "MF" to "MT".
2. "SI" is substituted for "SD".
3. The move (1) is nullified.
If the coupling between the move and the substitute specifies that the overlapping substitute is to be subjected to the same modification as the move (positive coupling), then nullification of the move will cause the substitute to be nullified. Otherwise, the material inserted by the substitute will remain as part of the current text (negative coupling).

EXAMPLE:        scenic view in the park
CHANGE 1:       view in the scenic park
CHANGE 2:       view in the baseball park
CHANGE 3:       (the move of CHANGE 1 is nullified)
POS. COUPLING:  scenic view in the park
NEG. COUPLING:  scenic view in the baseball park

Figure 1-Edit coupling example

The effect of
physical execution on overlapping edits coupled so that
modification of one is to affect the other should particularly be noted. When an editing change is physically
executed, overlapping portions of spatially related edits
coupled in this manner to that change are likewise
physically executed. Consequently, because physical
execution of an edit results in the loss of evolutionary
information which cannot be reconstructed, the user
must be especially certain that all edits overlapping an
edit to be physically executed have the intended coupling values. The ability to interrogate and/or alter
both the coupling specification fields of an edit and its
priority level is provided. User modifiable defaults for
edit priority, coupling field values, and other parameters
associated with new, traced editing changes are provided by the system so that a user need supply only a
minimum of information when making any particular
change.
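A small sketch of the coupling rule, with the two coupling fields reduced to booleans (an assumption; the paper does not say how the fields are encoded):

from dataclasses import dataclass

@dataclass
class CoupledEdit:
    number: int
    emanate_scope: bool    # may active review of this edit affect overlapping edits?
    receive_scope: bool    # may this edit be affected when an overlapping edit is modified?
    nullified: bool = False

def propagate_nullification(modified, overlapping):
    # The overlapping edit is nullified as well only when the modified edit
    # emanates scope and the overlapping edit receives scope.
    if modified.emanate_scope and overlapping.receive_scope:
        overlapping.nullified = True

move = CoupledEdit(number=1, emanate_scope=True, receive_scope=True)
substitute = CoupledEdit(number=2, emanate_scope=True, receive_scope=True)

move.nullified = True                        # change 3 in Figure 1
propagate_nullification(move, substitute)    # positive coupling
print(substitute.nullified)                  # -> True; with receive_scope=False it would stay active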
The concept of active review may be better understood by considering the following model. The text
initially input into the system prior to editing defines a
base text relative to which all future editing changes
are made. Whenever a user specifies that an edit is to
be performed, two modifications to the internal data
structure are made by the system. The editor's initials,
the date, and any annotation explaining the change are
stored in that part of the data structure reflecting the
chronology of editing changes. All information concerning the scope of each editing change and the operation
performed, along with any new material added to the
text or moved within it, is incorporated inline into the
base text. The chronological components of all editing
changes can be thought of as a chronologically ordered
set of editing information, to be termed an edit vector.
The active review facility provides the ability, through
edit rejection, to selectively eliminate edits from an edit
vector. By performing new editing changes or reaccepting nullified changes, additional edits are included
in an edit vector. In this way, the edit vector becomes
a selector on the chronologically ordered set of all
editing changes, indicating which ones are active (nonnullified) and thus take part in defining the current
text version.
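The selector role of the edit vector can be sketched as follows; representing each edit as a small text-transforming function is purely an expository device, not the system's inline encoding:

# Chronologically numbered edits over the base text "hard copy editing";
# the transformations themselves are stand-ins for the inline edit codes.
all_edits = {
    1: lambda s: s.replace("hard copy ", ""),            # delete "hard copy"
    2: lambda s: s.replace("editing", "text editing"),   # insert "text"
}

def current_version(base_text, all_edits, edit_vector):
    # Apply, in chronological order, only those edits the vector marks as active.
    text = base_text
    for number in sorted(edit_vector):
        text = all_edits[number](text)
    return text

edit_vector = [1, 2]            # both edits active
print(current_version("hard copy editing", all_edits, edit_vector))   # -> text editing

edit_vector.remove(2)           # edit rejection nullifies the insertion
print(current_version("hard copy editing", all_edits, edit_vector))   # -> editing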
In terms of this model, the way in which a version
of the text is prepared for display is indicated in Figure
2. To create a proofreader display, the display processor
uses the information contained in both the edit vector
and the base text to display a version with proofreader
marks for all active edits above a user specified priority
level. The edit vector acts as a selector by allowing
changes to be incorporated into the version of text displayed only if they are specified in the edit vector. It is
important to note that only the spatial portion of the text to be immediately displayed needs to be processed in this manner.

[Figure 2: block diagram of the display generating subsystem, comprising the chronological version selector and the display processor.]

Figure 2-Display generating subsystem

Chronological review, in this light, consists merely of
truncating the edit vector at some point, implicitly
nullifying all succeeding editing changes. Review by
editor is accomplished by implicitly nullifying all
changes in an edit vector which have not been made by
the selected editor. In this case, only versions of the
text created as a result of this editor's changes may be
displayed, and only changes made by this editor will be
shown in the proofreader display mode. Spatial review,
while utilizing an edit vector to define the version of
text to be displayed, neither explicitly nor implicitly
affects an edit vector.
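Both review operations amount to deriving a temporary edit vector, roughly as in this sketch (the edit numbers and editors' initials are invented):

edit_vector = [1, 2, 3, 4, 5]                                     # all currently active edits
editor_of   = {1: "WDE", 2: "WAP", 3: "WDE", 4: "AVD", 5: "WAP"}  # who made each edit

def chronological_review(edit_vector, up_to):
    # Truncate the vector: edits made after 'up_to' are implicitly nullified.
    return [n for n in edit_vector if n <= up_to]

def review_by_editor(edit_vector, editor_of, initials):
    # Keep only the selected editor's changes; the rest are implicitly nullified.
    return [n for n in edit_vector if editor_of[n] == initials]

print(chronological_review(edit_vector, up_to=3))        # -> [1, 2, 3]
print(review_by_editor(edit_vector, editor_of, "WAP"))   # -> [2, 5]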
MULTIPLE EDITIONS
Another major facility provided in this system is the
ability to define multiple, distinct text editions which
utilize a single, common text data structure. In the
above model, this corresponds to the maintenance of
multiple edit vectors. This multiple edition facility is
particularly well suited for use by a group of collaborating authors. Each author can maintain his personal
edition based on the common data structure, selecting
editing changes for incorporation into his edition via
active review and adding new changes to reflect his
particular contribution and style (with or without
direct consultation with his colleagues). A composite
document can later be produced through comparison of
the individual editions and selection of those editing
changes representing the best content and phraseology
of the group (subject to final editing touches). Alternatively, a single author can use this feature to maintain
separate editions of a text which are substantially similar but which differ in presentation to meet the needs
of different audiences.
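In terms of the earlier model this is simply one edit vector per edition over a shared set of edits, as the following sketch suggests (the edition names and edit numbers are invented):

# One common, chronologically numbered set of edits; each edition selects from it.
editions = {
    "edition_A": [1, 2, 4],   # edits 1, 2 and 4 are active in A; edit 3 was rejected
    "edition_B": [1, 3],      # B accepted a different selection
}

def append_edit(editions, new_number, update_in):
    # A new edit is appended only to the specified editions; in every other
    # edition it is treated as a nullified change that may be accepted later.
    for name, vector in editions.items():
        if name in update_in:
            vector.append(new_number)

append_edit(editions, 5, update_in={"edition_A"})
print(editions["edition_A"])   # -> [1, 2, 4, 5]
print(editions["edition_B"])   # -> [1, 3]; edit 5 remains available but nullified here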


Besides employing editing functions and the facilities
of active review to alter a particular edition, the multiple edition facility allows a writer to create a new
edition by copying another edition (including its text
and evolutionary information). Whenever an editor
initiates additional editing changes to a particular
edition after multiple editions have been created, he
must specify directly or by system default which edition(s) are to be updated (i.e., to which edit vector(s)
this edit is to be appended). Each new editing operation
is automatically nullified in those editions not specified
to receive the change.
Whenever a text has more than one edition defined
on it, physical execution of editing changes is performed
in an altered manner. In order to comply with the
two requirements that all editions be based on a
common data structure and that active review changes
in one edition not affect any other edition, virtual
rather than physical execution is employed. Furthermore, virtual execution allows all changes initiated in a
multiple edition environment to be accessible in all
editions. As far as a user can tell, the results of virtual
execution and physical execution are the same in that
editing changes will never again be displayed with
proofreader markings and may never be further modified in the affected edition. The difference is that
evolutionary information associated with virtually executed changes must be retained even though each such
edit appears to be physically executed. Virtual rather
than physical execution is always performed when the
physical execution function is specified and more than
one edition of a text exists. New editing changes virtually executed in specified editions are treated as nullified editing changes in all other editions. In this manner,
provision is made for later acceptance of such changes
into any of the other editions. When an already existing
editing change is physically executed, no modification
is made to this change in other editions.
Except as noted above, all information concerning
each edit is stored by the system in such a manner that
changes performed by an editor in one edition do not
appear in any other edition. In particular, an edit can
have a different priority or different scope coupling
parameters in each edition. A further feature of the
system is the ability to spin off any edition as the base
text for a separate data structure, either with information concerning all but virtually implemented editing
changes intact or with physical implementation of all
edits in accordance with the edit vector for that edition.
In comparing separate editions, it is convenient for
a writer to have some method for associating locations
in one edition with those in another. The system accomplishes this by providing a spatial coupling facility
which allows a user to associate a location in each of

several editions with a single identifier. After establishing such locations, a user examining text near a coupling
point in one edition can employ the coupling to examine
the material at the corresponding location in another
edition. Such coupling designated by the user is in addition to the natural coupling inherent in the data structure--unless specified otherwise, whenever the user
switches editions, he automatically sees the text in the
same general spatial area.
CONCLUSION
It is the principal objective of this particular computerized writing system to present in one implementation many of the separate advantages favoring other
kinds of editing systems. By providing writers with this
new, generalized writing system characterized by ease of
editing and both review and use of evolutionary information, it is hoped that writers' abilities to create
texts may be significantly augmented.
Any implementation realizing these facilities, by the
nature of the system's massive information storage and
processing requirements, places a heavy demand on
computer system resources. Although further advances
will be made in computer hardware capabilities and in
the creation of display presentations which are more
human factored, the current experimental implementation of this writing system is significant because it will
allow determination of the extent to which the ability
to passively review and actively modify evolutionary
information does augment the writing process. It also
provides for establishment of criteria against which the
effectiveness of other means for displaying a text's
evolution can be judged.

ACKNOWLEDGMENTS
The authors wish to express their appreciation to
William P. Braden, John V. Guttag, and Daniel E.
Stein for their contributions to the design of the computer assisted writing system described herein.
REFERENCES
1 D C ENGELBART W K ENGLISH
A research center for augmenting human intellect
Proceedings AFIPS 1968 Fall Joint Computer Conference Part 1 1968
2 S CARMODY W GROSS T NELSON D RICE A VAN DAM
A hypertext editing system for the /360
Proceedings 2nd Annual Conference Computer Graphics University of Illinois Urbana Illinois 1969


APPENDIX I
DISPLAY FORMATS
A user can choose to view his text in any of three
modes of display-normal, proofreader, or full display.
Normal display consists simply of a text area in which
an updated version of the text is displayed in double
spaced ragged-right lines, and a prompt area at the
bottom of the screen in which system messages for the
user are displayed.
The proofreader display (Figure AI) consists of a
text area, a descriptor area, and a prompt area. The
text for the chronological version of a particular edition
is comprised of not only text presented by normal display but also text associated with active (non-nullified)
deletions; proofreader marks are superimposed to
identify editing changes. A line is drawn through all
deleted text in order to distinguish it from current text,
and vertical lines are embedded in the text to delimit
the spatial scope of each editing change. When the
vertical line denotes the beginning of a scope, it rises
to meet a horizontal line above the text line; when it
denotes the close of an edit's scope, it drops to meet a
similar horizontal below the line of text.
The descriptor area provides detailed information about the editing changes displayed and is comprised of marginal areas on both sides of the text, a tag block, and an annotation block.

[Figure A1 layout: the screen is divided into a descriptor area (a marginal block on either side of the text area, a tag block, and an annotation block) and a prompt area.]

Figure A1-Proofreader and full display format

[Figure A2 shows a sample proofreader display: proofreader-marked text in the text area, edit-type codes such as "3I", "4SI;4SO" and "5MD" in the marginal blocks, a tag block for codes that do not fit in the margins, an annotation block listing each edit's number, type, editor's initials, date, and scope, and a prompt-area message reading "FUNCTION IGNORED: NEW CHANGES MUST BE MADE TO CURRENT VERSION".]

Figure A2-Sample proofreader display

In the left margin, editing
changes beginning their scope on a line are identified by
short codes indicating their edit type. Similarly, the
closing scopes of editing changes are identified in the
right margin. The order in which these codes are listed
corresponds to the order in which the edit scope delimiters occur on each line. If there is insufficient space
to list a given line's codes in the appropriate margin
area, this information is placed in the tag block and a
reference to this tag is made in the margin.
The annotation block contains a fuller description of
each change, including the edit number, the editor's
initials, the date of the change, and an indication where
each end of its scope lies. Figure A2 indicates how a
representative proofreader display appears.
Full display presents the text and all editing changes,
whether nullified or active, which are included in the
version of the particular text edition displayed. In order
to distinguish between accepted and nullified edits, text
within the scope of the former is displayed in small
letters whereas that of the latter is shown in capital
letters. This display, as contrasted with the proofreader
display, uses lines drawn through text to identify not
only text deleted by active edits, but also text inserted
by nullified edits.
The following example illustrates the manner in which an edited line of text would be displayed in each of the three display modes. For the sake of clarity, the example has been chosen so that the edits have no overlapping scopes. The initial text to be edited is the phrase "hard copy editing". The first change is to insert "text" after "copy", resulting in "hard copy text editing". Then "evolutionary" is inserted after "copy", producing "hard copy evolutionary text editing". "Hard copy" is then deleted, resulting in "evolutionary text editing". Finally "editing" is deleted, resulting in "evolutionary text". If upon reviewing his edits the writer accepts his second and third edits while rejecting the first and fourth edits, the edited line would appear on the three displays as shown in Figure A3.

NORMAL DISPLAY:       evolutionary editing

PROOFREADER DISPLAY:  hard copy (struck through) | evolutionary | editing

FULL DISPLAY:         hard copy (struck through) | evolutionary | TEXT (struck through) | EDITING

Figure A3-Display mode example

APPENDIX II
INTERNALS OF THE SYSTEM
A text's internal data structure consists of two major
and a number of ancillary areas, all of which are arbitrary length segments and program pageable to disk.

The two main areas are the text area, which contains a
linear string of text with embedded edit codes defining
the nature and scope of each editing change, and the
edit area, which gives the location in the text of each
edit and contains additional information concerning the
execution status and priority of each traced edit in
each text edition.
Every time a traced editing change is performed, the
text pertaining to that change is bracketed inline by a
pair of edit codes and an entry is made in the edit
area. The correspondence between edit codes and the
related edit area entry is maintained by the use of edit
numbers. The edit's start code, besides delimiting the
scope of a change, contains the bulk of the information
describing an edit which is required by the display
processor. This information consists of the edit number,
an index into the table containing any annotative information pertaining to the edit, the type of editing
change, and the priority, nullification status, virtual
execution status, and the emanate and receive scope
values for the edit in each edition. The edit's end code,
which contains only the edit number, merely delimits
the end of the scope of a change. Each page in the text
area has a header which lists those edits whose scope
overlaps from the previous page. Consequently, the
display processor must only go back to the beginning of
a page in order to determine text to be displayed in any
spatial area.
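A rough sketch of this arrangement follows; the bracketing syntax and field names are invented for illustration, and in the actual system the start code itself carries most of the descriptive information rather than a separate table:

import re

# Text area: a linear string with paired start/end codes bracketing each change.
# Here "<S2>" and "</S2>" stand in for the start and end codes of edit number 2,
# and the deleted and inserted material of a substitution both remain inline.
text_area = "view in the <S2>scenic|baseball</S2> park"

# Edit area: one entry per edit number, holding the descriptive information
# (kept in a separate table here purely as a simplification).
edit_area = {
    2: {"kind": "substitute", "editor": "WAP", "date": "4/8/71",
        "priority": 4, "nullified": False,
        "emanate_scope": True, "receive_scope": True},
}

# The edit number in the start code ties the inline scope to its edit area entry,
# which is how the display processor finds type, priority, status and coupling values.
for number in re.findall(r"<S(\d+)>", text_area):
    print(number, edit_area[int(number)]["kind"])   # -> 2 substitute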

Planning computer services for a complex environment
by JOHN E. AUSTIN
Harvard University
Cambridge, Massachusetts

INTRODUCTION

A responsibility that is being faced by more and more corporations, research laboratories, universities, government agencies, and other large complex organizations is that of providing effective computer services from a variety of sources to serve a multiplicity of user needs. At Harvard University this problem has been faced for a number of years and various planning processes have been developed to cope with it. In the 1970-71 academic year a new office was created, the Office for Information Technology, with planning and coordination of computer services as its first priority.

The University at that time had an IBM 360/65 in its Computing Center providing batch processing, timesharing, and process control services through a central batching room and a number of remote job entry stations. There was also a 360/30 operated by the Comptroller's Office. In addition to these two facilities, there were a number of smaller computers located in the various departments, and considerable use of timesharing services purchased from commercial vendors. At the time the Office for Information Technology was created the Computing Center was incurring deficits because of a downturn in research funding and because of shifts of interest to other forms of computing services. Because of the uncertainty about the future, the Office was asked to analyze the usage of computing in the University and to propose alternatives for services.

This paper summarizes part of the analysis and several of the alternatives that were proposed. The actual choices and implementation processes involved considerable negotiation. But for non-technical administrators of any complex endeavor to make a decision that has a number of technical ramifications, they must have a point of departure and this paper represents just that.

The choices, as they would be for the corporation, research laboratory or government agency, involved neighboring institutions, which in Harvard's case is the Massachusetts Institute of Technology (MIT). They also involved commercial services. The criteria for selection, in the last analysis, were a combination of cost-effectiveness to the University and flexibility for the users.

CLASSES OF COMPUTER USE
We found it convenient to divide uses of computers
at Harvard into six basic classes: User written program
compilations/executions, Use of package programs,
Large scientific production, Administrative production,
On-line data collection and process control, and On-line
interaction. The adequacy of these descriptions can be
argued, but they show up as groups in the statistics
derived from the 360/65 at the Harvard Computing
Center, as well as in discussions with faculty and
administrative groups throughout the University. An
individual can be a member of different classes at
different times in that, for example, an administrative
or scientific production job requires program compilation at some previous point in time.
The following sections will discuss characteristics
and volumes of these types and outline alternative
sources of services.
User written program compilations/executions

The largest user class of batch processing at the
Computing Center in numbers of jobs per day is
represented by the person who has written a
FORTRAN program and wishes to have it compiled
and, frequently, executed. He typically requires less
than 150k bytes of core and during a typical week he
had an average turnaround time of 23 minutes. This
class accounts for 36 percent of the number of jobs at
the Computing Center.
It is more difficult to measure the extent to which this
use is made of the commercial timesharing system.
At the Harvard Business School (HBS) it is probably
less than 10 percent. In the Faculty of Arts & Sciences
(FAS) it may be as high as 90 percent. The reasons are
pedagogical: the FAS students are learning how to
program, the HBS students are learning how to analyze
management problems.
Some members of this user class also have available
to them certain other facilities. There are the IBM 1620
computers available to users at the School of Public
Health, in the Biophysics and Chemistry departments.
Members of the Physics Department can use the XDS
Sigma 7 located there.
The needs of this user class can be generally described
as follows:
Rapid turnaround time on an unscheduled basis is
frequently important. If the individual is working on
a project, he usually wishes to get on with it as
quickly as possible. In the case of students, it may be
vital to the completion of an assignment.
Means of providing this service can depend on the
size and computational requirement of the program.
If it is small and requires small amounts of core and
CPU time, timesharing offers the obvious advantage
of being able to complete the task, including several
iterations, in one session. If the program is large in
the number of coded instructions or in the amount of
data to be processed or in the amount of output to
be produced, then batch processing can be cheaper.
In some cases a combination of terminal access to a
system with a deferred batch execution is the best
method.

the system, to know that it will remain operational, and
know that he can get his work accomplished within a
reasonable time.

Use of package programs
The next largest user class, and one that is growing,
is the person who comes to the computer to use a
program that has been prepared by someone else. This
may be a statistical program like Datatext or SPSS, or
it may be a structured presentation of a large data base
like the King Charles County Case or the Management
Game at the Business School. The user in this class may
bring his own data to be processed by the package
program, or he may be given a decision-making aid in
which both the logic and data are part of the package.
This user is seeking a service that is more than simple
computation. The computer (and its associated software) is responding to the user in terms related to his
kind of work.
At the Computing Center this currently represents
about 12 percent of the jobs. On the timesharing system
this class represents most of the HBS usage (90 percent)
and a small but growing amount of FAS usage (10
percent) .
The service needs of package program users vary
greatly depending on the size, computational requirements and intended use of the package. For some
packages used mainly by a local subgroup of users these
functions will be performed locally but for packages of
more general interest it may be determined that the
package is of such universal appeal as to be managed
centrally.

Service alternatives
Service alternatives

This user class could be served in several ways. For
the batch user whose mode of entry is punched cards
and whose principal output is paper listings, a relatively
convenient card reader/printer station is sufficient, the
location of the central processor not being important.
For many users it is essential to have a consulting
service available at the reader/printer station to help
them with problems they encounter in attempting to
run their programs. The range of this service will be
that currently covered by the programming assistants
at the Computing Center. In addition, it is important
that a source of information and advice on all features
of the system be available to the user.
For the timesharing user in this class whose mode of
entry is typing on a keyboard in his own work place,
ease of remote access to the system, reliability, and
fairly fast response are his main concerns. He would
like to be able to depend on the time when he can get on

By and large this usage class is not interested in
proximity to the central processor. Since the results are
the main concern, and since packages are frequently
obtained from many different service sources, a rule of
thumb would be that cost-effectiveness to the user
should dictate where the computing is done.
Program packages are in one of the fastest growing
segments of the computer market. It is also one of the
areas in which universities make major contributions
and can profitably share their results. Because of the
diversity of systems used throughout the academic and
commercial worlds, it is important to the potential
package program user that he not be constrained from
obtaining this service because it will not run on the
system he is required to use.
It should be noted that the requirements for package
development are different from those of package use.

There is a general fear that if service requirements are
met solely from outside sources, the resulting inconveniences will inhibit experimentation and development
of new packages.

Large scientific production

This usage class may overlap with the package program user to some extent. Its distinguishing characteristics are that it uses large amounts of core and/or CPU time relative to input/output. The programs may be lengthy and by running again and again with new data, solve complex problems in chemistry, physics, astronomy, and other sciences.

This usage class does not represent a large number of jobs at the Computing Center (about 4 percent), but the demands on the resources are such that the costs incurred to serve them are high; and their portion of income to the Center is also significant. Of the approximately 320 active customers of the Center in a recent month, twenty of them provide 50 percent of the income.

Historically it was the large, government-supported scientific project that was the high priority user of the Computing Center. Other activities are beginning to catch up in volume and dollars with this usage class, but government-supported work still tends to get considerable attention because of the general overhead component of the research contract.

The needs of these users are:

Large resources. They will use high speed processing and large quantities of core in abundance. Programs will tend to conform to the largest dimensions of the system, whatever they are.

Reasonable rates. Most of these users have access to government-financed computing centers. When the local rate is too high, the differential will overcome the inconveniences of going elsewhere to get the work done. Members of this class will do as much computing as they can get money for.

Reasonably convenient access to the machine. Most of these users have a high technical skill, need very little help from the staff, and would like to use the computer as they do any lab equipment.

Service alternatives

This class uses batch processing almost exclusively because large CPU/large core jobs do not blend well with a multi-user environment. Satisfactory service could be obtained from a number of sources including a Harvard-operated center, MIT, or a government-financed center. To the extent that this class is directed to other centers there is the risk that they will feel unsupported by the University.

Administrative production
One of the best understood and most visible classes
of computer usage in a university is that of administration. These users have several things in common: they
all face deadlines which they must meet; they can
schedule most of their work long in advance; and they
have considerably more input/output relative to
computation than any other type of user. In addition,
there is considerable need for security of files and for
the operational environment to be protected against
unauthorized access.
At the present time about half of the Harvard
administrative jobs are run at the Comptroller's Data
Processing Center on the 360/30 and half at the
Computing Center. The Comptroller's workload is
mostly his own with some work being done for the
Personnel Office, Widener Library, and others. The
Computing Center jobs are those of the Printing
Office, the Press, the Medical School, the Development
Office, the Admissions Office, the Law School, the
Division of Engineering & Applied Physics, Widener
Library, the Registrar of the Faculty of Arts & Sciences,
and the Business School.
The needs of the administrative production class are:
Dependable scheduling. They must be able to rely
on completion of their jobs at the appointed time.
Security. Files and output must be protected
against unauthorized access.
Data control. More attention to operational considerations is required of administrative work.
This could be a growing area of interest at Harvard.
The possibility of having a comprehensive information
source will enable the Deans and the budget officers
of the various departments to plan better and to
monitor and to analyze their operations throughout
time.
Service alternatives

As a user of batch processing, the administrative
production class requires a machine with a high level of
disk, tape, and printer capacity to do the work. Technically there is no compelling reason to have the
machine located in or near the administrative offices.
In most places it happens to be so located and there
would probably be a strong desire to have at least a
high speed card reader/printer station in the administration building for Harvard administrative
production.
It is important to recognize that this work requires
the mounting and storage of many reels of tape and disk
packs under tightly scheduled conditions. Operator
error can be both painful and costly, as would theft or
other forms of abuse. Any combined center doing administrative work, whether Harvard-operated or elsewhere, would have to make very careful provision for
this aspect of the operation.
On-line data collection and process control

There is a class of computing at Harvard that is
much less visible than the others but is certainly a
sizable one. There are a number of laboratories in which
a part of the instrumentation consists of a small computer to collect experimental data or control experimental processes. In some cases these same computers
analyze and display the data with plotters, printers or
CRT displays.
The most important current consideration in this
class is the service provided by the Computing Center
to the Cambridge Electron Accelerator. The commitment made by the CEA was 10 percent of the total
dollar amount pledged to support the Center this year.
The arrangement is that when the accelerator is actively
conducting its colliding beam experiments, the 360/65
acts as a high speed data collector at the end of a cable
directly connected to the accelerator. The data is
processed in 100k of core in the 360/65 and displayed
on units back at the CEA facility. The important
elements in the relationship are the direct connection
over a high speed cable, continuous on-line operation
for long periods (of up to two weeks), the cycle time of
the 360/65 CPU and the 100k of core.
It is essential to the CEA that a computer with a
speed of at least that of a 360/50 and 100k of core be
available at the end of a high speed cable for the life of
the colliding beam project. The service alternatives are
quite limited from a practical standpoint, because the
reprogramming required to run on another machine
would be added delay and expense.

feature is the opportunity for the user to interact conversationally (as they say) with a program. Because of
that, and because the service sources have to be thought
of in different terms, it tends to be considered as a
separate kind of computing.
Experience with commercial timesharing service at
Harvard has been that the demands have exceeded the
supply very early in the life of any contract. There
have been many difficulties in getting the level of
service that we thought we were contracting for, but
in spite of those difficulties many, many thousands of
terminal connect hours have been used. Members of
the several faculties predict that the demand will
increase by at least 50 percent next year.
The Computing Center announced its CALL/360
service early in February, but there has not been enough
time to tell whether that service will be used to its
capacity. Continuance of CALL/360 beyond the
Spring Term depends on user response and billings.
The needs of individual timesharing users are those
previously mentioned: reliability and responsiveness.
In the University community there are several larger
needs:
Capacity. As demands grow, they continually
exceed the ability of any one system to satisfy them.
When several sources are used, there is a problem of
storing programs and files on multiple systems, and
if they are different, there is the problem of incompatibility.
Languages and other software subsystems. The
Harvard community has not only large demands but
a diversity of needs for different computer languages
and package programs. These are not always available
on one system, and one system cannot always
operate effectively if it tries to offer services at too
many levels.
Variety. Compared with batch processing, timesharing can be a much more pleasing mode of operation. For this reason classes of users now content
with batch processing may convert to timesharing,
and it will be necessary to partition the class of
timesharing users into finer categories.
These categories are the same as the ones above,
namely:

On-line interaction

Interactive timesharing is the class of computer use
that has had the most growth, the most problems, and
is in the opinion of many people the most important
class for the future. It is not an exclusive class in that
program compilation and use of program packages are
the primary timesharing activities. Its distinctive

User program compilations/executions. It is extremely convenient (to the point of inducing carelessness) to write programs in the timesharing mode
because editing, testing, correcting can follow each
other rapidly.
Package programs. The possibility of having a
library of commonly used programs should remove
the need of having small computers such as the IBM
1620s since many of the programs are small and the
users don't need a computer all the time. Many of
the data analysis packages are more useful when run
in timesharing mode. These will also require setting
up, maintaining, and retrieving from large data
bases.
Large scientific production. Although users will
not sit and wait for such jobs to be done, the initiation
and control of such jobs may be done by remote
access to the computer.
Administrative production. It is convenient to
update the information in files and to do editing of
such files on-line. The capability of doing on-line
analyses by timesharing is then available.
Service alternatives

There is such a diverse set of needs for timesharing
that one source is probably not the best solution to
the problem.
A contract with a commercial service corporation is
one alternative for one type of service. For next year
there are various other alternatives or additions
possible. We could replace the commercial service with
another timesharing vendor. We could try to renegotiate
the present contract at a lower level and supplement it
from another vendor. We could install one or more
small basic one-language timesharing machines like the
HP-2000 for the use of beginning programming students. We could offer CALL/360 on a more extended
basis if the trial use on the 360/65 proves effective.
There is also the possibility of some service available
from MIT on Multics.
In any case, more timesharing service than is now
available will be required for next year and in the years
to come. Furthermore, we predict that other usage
classes will tend to convert gradually to on-line systems
as such systems acquire greater capacity, become more
reliable and evolve into networks.

ALTERNATIVES FOR PROVIDING
COMPUTER SERVICES
The minimum risk alternative

A general review of all six classes of computer use
points up the two fixed obligations for Harvard-operated
computers: the service to the CEA and the administrative
for the Comptroller. There has to be a computer with
the speed of a 360/50 at the end of the cable stemming
from the CEA and a 360/30 or its equivalent to serve
the administration with input and output under
administrative control. These obligations are both
technically and organizationally based. All other
services could be procured from other sources.
During the month of February this possibility was
explored with two batch processing computer service
sources: MIT and a commercial vendor. The user
classes under consideration were user written program
compilations/executions, use of package programs, and
large scientific production.
In our analysis of rates we found that the jobs run by
the Harvard Computing Center in January would have
cost about 0.9 times as much at MIT and 1.3 times as
much at the commercial vendor. We tried to consider
the logistical problems in getting that volume of work
done at these two places, and found that the work
being submitted from IBM 2780 RJE stations could be
processed much as they are now with a few changes in
the job control cards. The large problem would be
handling the work now submitted at the Computing
Center batching room. In all likelihood we would have
to provide a very high speed I/O terminal (on the
order of a 360/25) in addition to regular courier service.
We would also have to provide assistance to users as we
do now and have at least one systems programmer who
knew the other system thoroughly.
At the same time, the Office for Information Technology would help find other sources of both batch
processing and timesharing services, as is done now, for
special needs. If large scientific production users could
get a better deal at the AEC financed center, arrangements for a remote job entry station could be made
for them.
This alternative has the least financial risk to the
University in that it involves the lowest fixed-cost
arrangement that can be made. It also carries with it a
more complicated management problem in making all
these services effective (as has been the case with the
timesharing service contract this year).
The minimum change alternative

Given the utilization rates on the 360/65, it is the
opinion of the Computing Center staff that the University needs a machine of that size to do the work that
needs to be done. Replacement of the 360/65 by the
recently announced IBM 370/155 has a number of
unquestionable advantages.
The hardware costs for an equivalent machine
are lower.
The design of the 370/155 and its peripherals
provides for a greatly enhanced file handling capability.
The 370/155 could emulate the 360/30 DOS
operation run by the Comptroller, and therefore,
eliminate the need to either do a quick conversion of
his system or to keep the 360/30 beyond this next
summer. The conversion to OS/360 would continue,
but at a more thoughtful pace.
All of the present users of the 360/65 at the
Computing Center could continue to get services with
no disruption whatsoever, including the service
provided to the CEA.
This alternative has at least two possible versions:
A 370/155 configured to meet the batch processing
load of a 360/65 only could be installed in the Computing Center building. This would be a simple
replacement of the 360/65.
The only major technical problems to be overcome
in this case are those connected with emulation of the
360/30 prior to the Comptroller's conversion to OS.
Many of those DOS programs require operator
intervention which complicates life in a multiprogramming environment.
The second version of this alternative would be to
put the 370/155 on the fifth floor of the administration building in the area now occupied by the
Comptroller's 360/30. On that same floor there would
be space for a separate I/O area for the Comptroller.
There would then be a fast I/O station in the batching
room at the Computing Center building to handle the
work submitted there.
This version has the advantage of giving greater
security to the hardware and to the files which would
be an improvement from an administrative point of
view. It has the disadvantage of having to provide a
path for the CEA service over the broad band communication cable from the Computing Center to the
administration building.
This alternative, in either of its versions, is a continuation of the Computing Center concept as it exists
now but at a lower cost. There would still have to be a
procurement of special services-particularly timesharing services-from other sources.
The integrated center alternative

An alternative building on the previous one would be
to structure a system that would combine batch and
remote batch processing with a timesharing service.
The 370/155 is designed to take advantage of multiple
channels and high speed disks for both batch and on-line

file handling. With the addition of 500k bytes of high
speed core a 50 terminal CALL/360 system and CRBE
would be able to run with minimum interference with
the batch stream. In about a year IBM's new Time
Sharing Option may be running under OS and could
provide a significant new form of computer access.
The advantage of this alternative would be a substantial increase in timesharing capability for those
who needed only BASIC or FORTRAN at a medium
cost. There might also be a number of users who would
welcome TSO when it becomes available.
LONG-RANGE IMPLICATIONS
What do these alternatives point to beyond the
immediate planning period?
The minimum risk alternative suggests that
Harvard wishes to relinquish a large part of its
computer services operations business. It also opens
the way for a combined center with MIT if that
should be desirable for the two institutions.
The risks of this alternative are that control of
service quality would be in someone else's hands and
this could have repercussions among the many
computer users who judge a university on its resources. On the other hand, it is just as likely that
such a move could be among the first of many such
moves by universities located in urban areas where
diverse computer services are locally available. It is
no longer true that computing is an exotic activity
requiring a research and development environment.
While the field is still undergoing great change, there
is reason to believe that buying services as needed
will prove more beneficial to the educational and
research institution than attempting to support them
at a very high level internally.
The minimum change alternative suggests that
there is a cost-effective level-to be determined by
financial commitments from users-at which some
internally provided services can be maintained. Just
as we continue to have a Buildings and Grounds
Department and a Printing Office, so we should have
a computer service center where University needs
can be met by resources managed by other University
staff. There is an opportunity to influence priorities,
to control costs and to dictate what services and
service levels will be provided. In addition, the
administrator gets the security and control that his
files require.
In the years ahead, we would be saying, there is a
continuing need for this and it is part of the overall
service component of a university.


The integrated center alternative suggests that the
services of a single fast processor can handle a variety
of computing needs effectively and it may well be
demonstrated that it can. The advantages are that
one management takes care of everything and that is
much less wasteful than having several faculties each
doing their own thing.
The costs of getting services from many sources
would be higher in some cases and lower in others. The
advantages of diversity would have a cost in inconvenience. The maintenance of a resource center is a
hedge against fluctuations in the marketplace. Everything has its price.


BIBLIOGRAPHY
1 D N FREEMAN J R RAGLAND
Response-efficiency trade-off in a multiple university system
Datamation pp 112-113 March 1970
2 F WARREN McFARLAN
Problems in planning the information system
Harvard Business Review pp 75-89 March-April 1971
3 CHARLES MOSMANN EINAR STEFFERUD
Campus computing management
Datamation pp 20-23 March 1 1971
4 ANTHONY RALSTON
University EDP: Get it all together
Datamation pp 24-26 March 1 1971
5 MICHAEL M ROBERTS
A separatist's view of university EDP
Datamation pp 28-30 March 1 1971

A high performance computing system for time critical
applications
by T. J. GRACON, R. A. NOLBY and F. J. SANSOM
Control Data Corporation
Sunnyvale, California

INTRODUCTION

SYSTEM DESIGN

The increasing complexity of current time critical
computer applications has generated requirements for
large scale, general purpose, digital computers in real-time systems. Yet, few real-time applications can
singly provide the economic support for such a major
system. This high computational capability-reasonable cost/study need has led to the development of
systems that are capable of running two or more time
critical jobs concurrently, in addition to local batch
processing, remote batch, communications, and interactive graphics.
This paper discusses recent design improvements*
in such systems. The new design handles the two most
important system tasks in a multiprogramming real-time system (CPU time scheduling and real-time data
interface) through a hardware real-time monitor which
schedules tasks (interrupts) using a relative urgency
algorithm and an extension of the central memory of
the system to provide for data input/output. Also
provided is the means to guarantee job integrity so that
up to 15 time critical jobs can run concurrently.
The system is currently being implemented for the
Naval Air Development Center, Johnsville, Pennsylvania, for use in supporting research and development
activities in the field of naval aviation. The computer
application areas include real-time man-machine simulations, acoustic research signal processing, a quick-response capability for Southeast Asia problems, direct
support of operational systems, a batch processing and
graphic capability for general scientific and engineering
problems, management information, and computerized
NIF accounting.

Time critical tasks-response

An application is said to be time critical if it demands
a response within a fixed time after it has received
a stimulus (i.e., interrupt).
The system responds in two ways. It first must sense
the interrupt, and capture all data needed to process the
task associated with the interrupt. Then it must
perform the processing required and have the results
available within the required time. In most hybrid
systems, devices such as sample/hold units and data
buffers are provided in the interface to automatically
store the current values of the needed variables at the
time of the interrupt. The system maintains the
responsibility for sensing the interrupt and performing
the required computation on this captured data within
the time tolerance allowed.
The extension to a multiple interrupt job is straightforward. A good system will recognize all active interrupts and schedule the required tasks so that each
task is completed when the interrupting system needs
the results.
Note that in a system which guarantees that all
interrupts will be processed within the required time,
there is no reason for a preferential (priority) treatment
of any interrupt. Internally, the system often has to
select between conflicting requests while scheduling the
tasks and does so through a time-dependent priority
rule evaluation. This CPU scheduling algorithm is
hardware implemented in a special unit named the
Hardware Real Time Monitor which is discussed later.
The same interrupt task may have different priorities
depending on the state of all requests at that time.
This is an internal procedural matter, and the user has
no need to specify priorities between his tasks.
In fact, letting the user specify priorities for interrupt
processing in a multiple job environment causes difficulties.

* While some of the concepts providing the philosophy and
theoretical basis for these designs have been previously reported
(see Reference 1), this paper reports the first known
implementation of them in a system.

For example, where a priority tree is to be shared
among simultaneous users, the users must meet and
decide by committee which interrupts are assigned to
each job. Inevitably, each job finds itself running in an
environment with other jobs that have interrupts of
higher priority than its own. The result is of course
that a job's successful execution is dependent on the
benevolence of the other jobs concurrently in execution.
Since the allocation of interrupts between users who are
resident simultaneously within the system will vary as
jobs enter and leave the system, it becomes impossible
to guarantee consistent results from multiple runs of
the same program.
To prevent any possible interjob interference, central
processor time must be properly allocated. Before a
time critical job is allowed to start, the system must
guarantee that it can coexist with the time critical jobs
presently running. The method used in the NADC
system of analyzing this situation is a static scheduler
system program which is run prior to the job being
allowed into real-time status. This program compares
the worst case requirements of the time critical job
requesting initiation against the worst case requirements of time critical jobs running in the system.
Control card information such as frame time (FT),
required compute time (RCT), and the number of
analog and digital channels being converted each frame
time provide the static scheduler with sufficient information to determine if the requesting job will fit,
without conflict, into the system.
If the petitioning time critical job will possibly
conflict with other running time critical jobs, it will not
be allowed to enter the system. Of course, the system
operator has ultimate control over all jobs in the system
and can suspend a running job to release time if the
facility manager decides to assign a higher priority to
the requesting job.
Once the new job is allowed in the system, the task of
maintaining job integrity is reduced to a task of
monitoring the individual interrupts for violation of the
parameters supplied on the job card. The monitoring is
done automatically in the Hardware Real Time
Monitor.
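The paper does not give the exact worst-case test the static scheduler applies. As an illustration only, the following C sketch uses a simple utilization bound (the summed RCT/FT of the resident jobs plus the petitioner must not exceed the CPU) as an assumed stand-in for that check; the structure and all names are invented for this sketch.

/* Illustrative worst-case admission check in the spirit of the
 * static scheduler described above.  The utilization bound used
 * here (sum of RCT/FT <= 1) is an assumed, simplified stand-in
 * for the actual test, which the paper does not spell out.
 */
#include <stdio.h>

struct job_spec {            /* worst-case control-card parameters */
    double rct;              /* required compute time per frame (ms) */
    double ft;               /* frame time (ms) */
};

/* Return 1 if the candidate job fits with the jobs already running. */
static int static_schedule_ok(const struct job_spec *running, int n,
                              const struct job_spec *candidate)
{
    double u = candidate->rct / candidate->ft;
    for (int i = 0; i < n; i++)
        u += running[i].rct / running[i].ft;
    return u <= 1.0;
}

int main(void)
{
    struct job_spec running[] = { { 10.0, 25.0 }, { 5.0, 30.0 } };
    struct job_spec petitioner = { 20.0, 40.0 };

    if (static_schedule_ok(running, 2, &petitioner))
        printf("job admitted to real-time status\n");
    else
        printf("job rejected: insufficient CPU time\n");
    return 0;
}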

Time critical tasks-data handling

Traditionally, real world data has been handled in a
time critical system in a manner similar to I/O data in
a conventional computing system. The data is captured
by the A/D conversion gear under the guidance of a
device controller and then transmitted through a
normal input channel into some buffer from which it is
operated on by the CPU. Output is performed in a
similar manner, with data loaded into buffers from which
it exits on a data channel through a device controller
and into the D/A gear.
The approach, while proven workable in a large
number of systems, has some drawbacks. It requires
that a portion of the system's I/O capability be devoted
to, or at least be on short call notice to, the real-time
instrumentation, limiting the amount of normal I/O
processing that can be done. Normally, the system's
I/O facilities were designed for non-real-time data
transfer and have inherent design traits such as low
bandwidth and a time consuming "activate" requirement
which limit their performance in a time critical
applications environment.
To alleviate some of these problems, a special I/O
port was designed for the NADC system. This unit,
called DADIOS for Direct Analog Discrete Input/Output
System, exists as an extension of central memory.
Every A/D and D/A channel in the system has a one
word storage buffer which is directly addressable by the
CPU. The data conversion is controlled by a combination
of timing and interrupt signals from HRTM
and channel addressing capability within DADIOS.
The net effect of the system is that real-time data can
be transferred into and from the virtual central memory
without requiring any CPU time or standard I/O
resources. Other capabilities include hardware fix/float
conversion and program controllable allocation of
pooled instrumentation to jobs. The use and design of
DADIOS are discussed later.

SYSTEM OPERATION

The job processing analysis presented in this section
begins with a variety of scheduling algorithms, and the
aspects of job entry, including control card requirements,
job control, and system monitors. Data transfer
and any required analog control across the interface
during a time critical run is discussed, followed by
methods employed in the system under discussion to
recognize interrupts and cause the CPU to begin processing
the interrupt-specified task.

Interrupt philosophy and scheduling

Before discussing the hardware and software concepts
utilized in this system, it is necessary to define "interrupt
response" and "scheduling."
In a conventional single real-time job environment,
the response to an interrupt will normally be quoted as
the time from interrupt stimuli until CPU action is first
initiated (assuming the interrupt is highest priority). The
first action taken by the CPU is usually the initiation of
the data conversion cycle. Even in this environment,
this is not the most meaningful measure of interrupt
response. Rather, response should measure the period
from stimuli until results are obtainable. This response
is, then, an "outside world" response which also is a
function of CPU speed and the manner of treatment of
interrupts within the system. Outside world response
requirements are usually dictated by the problem which
is generating the interrupt. The important parameter is
not how fast the machine can get the CPU working on
the highest priority interrupt, but rather the period of
time in which the machine will guarantee finishing the
interrupt routine for each interrupt specified.
In the multi-real-time job environment, interrupt
response is again best specified by the outside world
response, that is, from stimuli to completion of results.
Since we are dealing with multiple real-time users,
levels of interrupt take on a different meaning, for the
response requirements of each of the multiple users
must be satisfied. A requirement thus exists to devise an
algorithm that will schedule interrupts into the CPU.
There are two requirements for the execution of this
algorithm:
• Outside world response will be guaranteed to meet
the requirements specified by the user.
• Absolute integrity between jobs must exist. That
is, any attempted CPU overruns of one job may
create problems in that job, but must not be
allowed to affect another user's CPU allocations in
the system.
This algorithm is broken into two portions. The first is
termed a static portion (Static Scheduler-non-realtime), and the second is called a dynamic portion
(Dynamic Scheduler-real-time). The Static Scheduler
ensures that job mixes with potential conflicts in system
resources do not coexist in the machine. The schedule
is determined in batch mode prior to the job entering
real-time status. The scheduling is performed with the
worst-case conditions of the algorithm utilized in the
Dynamic Scheduler; a worst case fit of the real-time
job entering the system is calculated and compared to
the worst case conditions of all real-time jobs presently
running in the system. If the job fits, it is allowed to
proceed into real-time; otherwise, the user is notified
that system resources are not available.
To realize the importance of dynamic scheduling it
must be viewed as a task of monitoring jobs for violations of specified parameters. The Static Scheduler
had determined, prior to job entry into the real-time
state, that the job would fit within all restraints of the
dynamic scheduling algorithm used, provided the job
abides by the parameters used in statically scheduling it
into the system. The specified parameters (RCT,
Tolerance, Period) are dynamically monitored by the
Dynamic Scheduler and any job which attempts to
violate any of these parameters is flagged and aborted
for one Period. This action prohibits the job from
interfering with any of the others in the system.
When a job is put into real-time, the Dynamic
Scheduler takes over the scheduling of interrupts in
the system.
Many algorithms for dynamically scheduling the
CPU have been devised, and three of the more common
methods are presented in the following discussion.
Time slicing

The time slicing algorithm is perhaps the easiest to
understand. As indicated by Figure 1, a specified period
of CPU time is assigned to specific jobs in a fixed
pattern every revolution (R) of the time slice wheel.
Interrupts for a specific job have priority only during
that job's time slice. For example, if the CPU is active
in time slice A, only interrupts associated with job A
will be allowed into the CPU. The priority of interrupts
within job A will be determined by the priority structure
of its hardware or software interrupt tree. When time
slice B is reached, all work on interrupts for job B will
have priority. Any unused time, of course, will be
assigned to batch work.
The time slicing algorithm basically attempts to
synchronize several events by slicing all events into
many synchronous subevents. A goal in this algorithm

R
REVOLUTION
Figure 1-Time slicing algorithm

552

Fall Joint Computer Conference, 1971

I

REAL COMPUTE TIME

I

rI

RCT
CPU

---I

I
FRAME TIME (FT)

\-

I

-I

Figure 2-LTTG-JB parameters

Least-ti:m.e-to-go job basis

The least-time-to-go job basis (LTTG-JB) scheduling algorithm enters a factor of complexity over time
slicing into the process of scheduling, but also provides
a more efficient scheduling mechanism (it can be
illustrated that time slicing is a synchronous subset of
LTTG-JB). The two parameters required per job
scheduled are as follows (Figure 2) :
• Frame Time (FT) -This is the scheduling frame
time which is often identical to the problem frame
time. However, if multiple interrupts are used in a
job, the scheduling frame will usually be equated to
the most critical interrupt in the job .
• Real Compute Time (RCT)-The RCT is the
total CPU time specified per FT for the servicing of
all interrupts which may require service during
that FT. Actually, the CPU time will be split
into many segments, but the composite of these
segments will not be allowed to exceed the RCT
specified (see Figure 3) .

--------_--11 ~
FT

JOB A

~

t.

Acpu~
FT

. Figure 3-Real compute time

~

~

~

~

~

t.

~

•• •••

I
-1

•~ ~

t_ \.-

BATCH

/

1/2 UNIT

5 UNITS, RCT

~

2 UNITS

in a FORTRAN call to the Static Scheduler to determine worst case job fit. If the job is allowed into realtime mode, these parameters will then determine which
job has priority at a given instant of time by calculating
which job has the least-time-to-go before CPU calculations for a particular frame must be complete.
Explanation of this algorithm is best given by
illustration. Assume a three job situation where
specifications of FT and RCT for each job are as
indicated in Figure 4, and each job consists of a single
interrupt. As illustrated in Figure 5, job priorities are
rescheduled every job frame sync time. The priority
level of the job is determined by the time left until the
end of that job frame. Thus, the job with the leasttime-to-go will have priority 1, the job with the next
least-time-to-go is given priority 2, and so on. If a job
attempts to use more CPU time than specified as RCT
for that job, the system will abort CPU utilization of
that job until the next frame. This ensures system
integrity by preventing one job from affecting another.
This algorithm can also be easily implemented by hardware or software. Many 6000 time critical simulation
systems have been implemented utilizing a dynamic
software scheduler.

JOBC

I
I·

1/2 UNIT

Figure4-Job specifications

JOB B

~

=

----------_---11 FT ~ 4 UNITS, RCT

The above parameters will be supplied by the user

RO

3 UNITS, RCT

JOB 2 •

JOB 3

is to make the revolution (R) of the time slicer very
small, such that response to the multiple users is fast.
The inherent system overhead is a practical problem
encountered when the system is forced periodically to
jump from one job to another. The time slicing algorithm can be easily implemented with either a software or hardware monitor.

I

JOB J - _ _ _ _ _ _ _ _---' FT

•
1_

~

~

~

~

~

t.
•
t_. '- t_
~

--

t.

• ••

-

Indicates the time when the interrupt occurs.

.D.

Indicates amount of CPU time devoted to a particular job.

Figure 5-Least time to go-Job basis dynamic scheduling
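As an illustration of the least-time-to-go job-basis rule just described, the following C sketch ranks jobs by the time remaining in their current frames and locks out a job whose RCT is exhausted. The names and time units are invented here; the actual NADC implementation is the software or hardware monitor discussed in the paper.

/* A minimal sketch of the least-time-to-go job-basis rule: the job
 * whose current frame ends soonest gets priority.  All structures
 * and values are invented for illustration only.
 */
#include <stdio.h>

struct rt_job {
    const char *name;
    double ft;        /* frame time, in arbitrary time units        */
    double rct_left;  /* compute time still allowed this frame      */
};

/* Time remaining until the end of the job's current frame at "now". */
static double time_to_go(const struct rt_job *j, double now)
{
    double into_frame = now - (long)(now / j->ft) * j->ft;
    return j->ft - into_frame;
}

/* Pick the runnable job with the least time to go; a job that has
 * exhausted its RCT for this frame is aborted until the next frame. */
static int pick_job(const struct rt_job *jobs, int n, double now)
{
    int best = -1;
    for (int i = 0; i < n; i++) {
        if (jobs[i].rct_left <= 0.0)
            continue;                       /* RCT overrun: locked out */
        if (best < 0 ||
            time_to_go(&jobs[i], now) < time_to_go(&jobs[best], now))
            best = i;
    }
    return best;                            /* -1 means run batch work */
}

int main(void)
{
    struct rt_job jobs[] = {
        { "A", 4.0, 2.0 }, { "B", 5.0, 3.0 }, { "C", 10.0, 0.5 },
    };
    double now = 3.0;
    int k = pick_job(jobs, 3, now);
    printf("at t=%.1f run job %s\n", now, k >= 0 ? jobs[k].name : "batch");
    return 0;
}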


Least-time-to-go interrupt basis

The basic least-time-to-go interrupt basis (LTTG-IB) scheduling algorithm is identical to LTTG-JB with
the exception that every interrupt of each job is
scheduled. For example, assume a system running with
three jobs, each job using ten interrupts. In this example, the LTTG-JB algorithm would schedule against
a single specified FT which is short enough to provide
the response required by all ten interrupts in the job.
Also, the specified RCT will contain total job RCT
requirements for all interrupts in the job. Thus, scheduling is performed against the three groups of job
parameters.
When utilizing the basic LTTG-IB algorithm, FT
and RCT are specified for each interrupt; therefore,
scheduling in this example would be performed against
all thirty groups of interrupt parameters.
Up until this point, discussion has been of a basic
LTTG-IB algorithm which utilized the FT and RCT
parameters specified in the section on LTTG-JB. This
algorithm, however, lends itself to a more powerful
scheduling mechanism by redefining the two parameters
of FT and RCT into three parameters called Tolerance,
Period, and Real Compute Time (Figure 6).
• Tolerance (T)-The parameter that "least-time-to-go" is scheduled against. In the synchronous
periodic situation, T will usually be equivalent
to FT in the basic LTTG-JB algorithm.
• Period (P)-Specifies the minimum period in
which this interrupt can occur again. For periodic
synchronous interrupts, P will usually equal T.
• Real Compute Time (RCT)-The total CPU
time specified to be completed in the T allowed.
These parameters provide efficient scheduling of
normal periodic synchronous interrupts, periodic interrupts demanding fast response times, and also allow a
means for the scheduling of asynchronous interrupts.
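A minimal sketch, with invented names and time units, of how the three parameters might be represented and applied: remaining tolerance orders the schedule, and the period gates how soon an interrupt may be accepted again.

/* Sketch only: the field and function names are invented, and the
 * real HRTM performs these checks in hardware.
 */
#include <stdio.h>

struct intr_spec {
    double rct;        /* real compute time allowed per occurrence      */
    double tolerance;  /* results must be ready this long after stimulus */
    double period;     /* minimum spacing between occurrences           */
    double last_time;  /* time of the previous accepted occurrence      */
};

/* Period check: may this interrupt be scheduled again at time "now"? */
static int period_ok(const struct intr_spec *s, double now)
{
    return (now - s->last_time) >= s->period;
}

/* Least-time-to-go comparison on remaining tolerance. */
static int more_urgent(double tol_left_a, double tol_left_b)
{
    return tol_left_a < tol_left_b;
}

int main(void)
{
    struct intr_spec s = { 1.0, 4.0, 4.0, 10.0 };   /* illustrative values */
    printf("occurrence at t=13.0 accepted: %s\n",
           period_ok(&s, 13.0) ? "yes" : "no");
    printf("occurrence at t=14.5 accepted: %s\n",
           period_ok(&s, 14.5) ? "yes" : "no");
    printf("tolerance 1.5 more urgent than 2.0: %s\n",
           more_urgent(1.5, 2.0) ? "yes" : "no");
    return 0;
}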
The following example illustrates the scheduling this
algorithm provides. Assume a two-job system with two
interrupts per job and parameters as defined in Figure
7. Recall that T is the parameter that "least-time-to-go"

J

I

- - - - i..
-t

I

tRaJ

A2

RCT- 1 UNIT, TP-4 UNITS
(PERIODIC)

~T~

6 Bl@J
~

~

RCT 1/2 UNIT, T3 UNITS,
P'B UNITS
(PERIODIC, FAST RESPONSE)

-==1-.-1

RCT' 2 UNITS, T=I'--5 UNITS
(PERIODIC)

I' _ _
T

I

RCT,
_

N

B2 _

:.,J
I-L....

RCT 1/4 UNIT, T=1 UNIT,

l _ _ _ _._ _ _ _

II"

10 UNITS

I' ----lIASYNCHRONOUS)

Figure 7-Tolerance, period, and real compute time, job examples

is scheduled against, and P is the parameter which
prevents an interrupt from being scheduled more
frequently than the user has specified. Figure 8 illustrates a dynamic scheduling of the system. Again, it is
to be pointed out that the system will not allow an
interrupt to overrun its RCT specifications, and all
unused CPU time is available for batch work.
The quantity of parameters (up to 64 in the current
implementation) which must be scheduled against one
another makes implementation of this algorithm by a
software scheduler impractical. The hardware implementation of this scheduler is presented later.
CPU exchange process

Each independent job running in the system requires
that a number of items of information be saved and

Al

~

A2

f. t. t. t. t. t.

BI

t • _ t...

t...

82

~

I

~

~

t.

~

iLk

~

.t

~

t.

t•

.

~

t.

t•• t...
I~k

RCT - - + -_ _ _
U£J--L

~LERANC~

PERIOD (P)

Figure 6-Tolerance parameter

~

------l.~1

PRIORITY

11:11 il rp~ I~ll ~ I ~
}

5S

I.

I----- P ----II---- P - -

BATCH

:-----,,1

~ 5~

M~ I :~ ~~:lml:q :q ~

Indicates the point in time when the tolerance = O. This is the reference against
which the interrupts are scheduled. In cases where Tolerance equals Period,
the symbol ~ is not

Figure 8-Dynamic scheduling

554

Fall Joint Computer Conference, 1971

maintained. Such information includes the contents of
all CPU registers, the control card buffer, the buffer
containing the job's history of events, messages,
charges, the time limit and times consumed, and many
other flags relating to the status of the job.
Because information for each job is independent of
other jobs, the items are grouped into control points.
An active job is always signed on at a unique control
point, and all software related requests and operations
within the system are linked to the respective control
point.
The time critical operating system normally provided
runs seven control points (1 through 7) for user jobs
and utility operations. (A version of the operating
system which provides for 15 control points is available.) The system maintains other control points for
internal operations. Each of the seven control points
can run an independent job using any desired external
equipment without interference, up to the limits of the
system resources. Any number of these control points
may be used to execute concurrent real-time jobs
providing sufficient resources are available.
The Exchange Jump instruction is used by the system
for interrupting the CPU. This instruction allows a
complete exchange of the CPU executing environment
in 5 μsec. For this reason the Exchange Jump is used
for the processing of interrupt routines and for sharing
the CPU by control points.
The system reserves space for one Exchange Jump
package for each control point. The CPU registers for
a control point which is not executing (idle, waiting
for I/O, or waiting for a higher priority program to
complete) are stored in this exchange package area.
Additional Exchange Jump packages may be defined by
the system programmer, one for each requested interrupt. When that interrupt occurs, the system initiates
the user's interrupt routine by using that interrupt's
exchange package.
In the hardware monitor system all incoming interrupts are recognized by the hardware real-time monitor
(HRTM). A hardware scheduler in the HRTM
schedules the new interrupt against the ones currently
in the schedule using the LTTG-IB algorithm. If a CPU
task change is indicated, the EXJ controller in the
HRTM locks out all batch processing exchange
requests and issues the EXJ instruction through a
hardware modification directly to the CPU.
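The following toy C sketch illustrates only the exchange-package idea described above: the executing register environment of a control point lives in a small memory block, and a task change is a swap of the live environment with a stored one. The fields are invented and are not the actual 6000-series exchange package layout.

/* Invented, simplified stand-in for an exchange package; not the
 * real register set of the 6000 series.
 */
#include <stdio.h>

struct exchange_pkg {
    unsigned long p;        /* program address            */
    unsigned long a[8];     /* address registers (sketch) */
    unsigned long x[8];     /* operand registers (sketch) */
};

/* Swap the running environment with the package of the next task. */
static void exchange_jump(struct exchange_pkg *running,
                          struct exchange_pkg *stored)
{
    struct exchange_pkg tmp = *running;
    *running = *stored;
    *stored = tmp;
}

int main(void)
{
    struct exchange_pkg cpu = { .p = 0x1000 };   /* batch job executing */
    struct exchange_pkg intr = { .p = 0x4200 };  /* interrupt routine   */

    exchange_jump(&cpu, &intr);   /* HRTM-initiated task change         */
    printf("CPU now executing at %#lx\n", cpu.p);
    exchange_jump(&cpu, &intr);   /* interrupt done: resume batch       */
    printf("CPU now executing at %#lx\n", cpu.p);
    return 0;
}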
Theory of operation of real-time jobs

A job is submitted with the real-time parameter
(RT) on the job card to distinguish the job from normal
batch jobs. The real-time job is then held in the input
queue until it is directed to a cleared control point.
The job usually consists of three records. The first
record contains control cards. For each group of special
equipment to be used by the job, a file must be created
using the standard REQUEST control card. Other
control cards are used to perform tasks associated with
the job. The second record contains the program. Files
used by the program must be defined in the program.
The third record is the data record. It must contain the
real-time control cards in addition to any other necessary data. A job enters real-time mode with a
FORTRAN call (SIM RUN). Prior to placing the job
in real-time mode, the system determines if the program
will adversely affect other real-time jobs by performing
the static schedule check and a check for availability
of interrupt hardware. If other jobs will not be affected,
and if the requested hardware is available, the system
accepts the job into real-time status, clears all interrupts associated with the job, and control of the
execution of the time critical job is passed to the
Hardware Real Time Monitor and Central Resident
Monitor (CRM).
After acceptance into real-time and upon occurrence
of an interrupt, the following sequence of events is
initiated:
• The hardware monitor begins decrementing the
interrupt's tolerance counter.
• The interrupt is passed on to DADIOS where all
channels associated with that interrupt are converted.
• An "End of Convert" signal is passed on to HRTM
which causes the start of scheduling.
• HRTM performs the dynamic schedule comparing
the remaining tolerance of all tasks waiting for or
using the CPU.
• If the remaining tolerance for this task is less than
all others, HRTM issues an exchange jump to
CRM.
• CRM receives the number of the most critical task
from HRTM and, after performing certain accounting functions, places it in execution.
When the task completes execution, it returns to
CRM by calling from FORTRAN (SIM WAIT or
SIM IDLE), which informs HRTM that processing
for this interrupt is complete for this period.
If HRTM detects an RCT overrun (i.e., a task
attempting to use more than its assigned CPU time)
from real-time status, or if a synchronous interrupt
attempts to interrupt too often, an error condition
occurs and the action taken is dependent upon the mode
selection in the original FORTRAN SIM RUN call.
The user may exit from real-time mode by calling
SIM STOP or by encountering some unrecoverable
error condition. After exiting, the user's control point
returns to batch mode. He may now do post-processing,
or he may return to real-time by calling SIM RUN.
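The interface described above is FORTRAN (SIM RUN, SIM WAIT, SIM STOP). The C sketch below only mirrors that control flow with invented stand-in functions; it is not the CDC library.

/* sim_run, sim_wait and sim_stop are invented C stand-ins that mirror
 * the control flow of a time critical job; the real calls are the
 * FORTRAN SIM RUN / SIM WAIT / SIM STOP interface.
 */
#include <stdio.h>

static int frames_done = 0;

static void sim_run(void)  { puts("enter real-time (static schedule checked)"); }
static void sim_wait(void) { puts("interrupt servicing complete for this period"); }
static void sim_stop(void) { puts("leave real-time, control point back to batch"); }

/* Body of one interrupt routine: read converted inputs, compute,
 * write outputs, then hand the CPU back to the monitor. */
static void interrupt_routine(void)
{
    frames_done++;
    sim_wait();
}

int main(void)
{
    sim_run();                         /* job accepted into real-time    */
    for (int i = 0; i < 3; i++)        /* pretend three interrupts occur */
        interrupt_routine();
    sim_stop();                        /* post-processing may follow     */
    printf("%d frames processed\n", frames_done);
    return 0;
}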
Time critical programming language

The standard CDC FORTRAN Extended Compiler
has been modified to distinguish between normal
variables and those that are assigned to DADIOS
(virtual memory). The compiler distinguishes between
the individual DADIOS channels (ADC's, DAC's,
etc.) as well as the particular function on each channel
(i.e., with or without fix/float conversion). Finally,
interrupt definition can be specified. The user is able to
address DADIOS channels either directly or indirectly.
The linking of indirectly addressed channels is accomplished at run time so that changes in physical
channel assignment can be made without requiring
recompilation.
Interrupt specification

Interrupts are specified by the statement form:

INTERRUPT (I=n, H=m, S=i, R=x, T=y, P=z, E=j)

where

n=logical interrupt number.
m=hardware interrupt number.
i=interrupt set number.
x=required compute time for the interrupt (in units of 10 μsecs).
y=tolerance of the interrupt (in units of μsecs).
z=period of the interrupt (in units of 10 μsecs).
j=external interrupt indicator.

The set designator allows a means to define an
interrupt with several "sets" of descriptors (RCT,
PER, TOL). At the beginning of a real-time job, all
sets of interrupt descriptors are statically scheduled
against all other sets of interrupt descriptors in the
system. If the schedule is successful, it allows the user
to switch from one set of descriptors to another during
real-time without having to reschedule.

Loader control card

The entry point for a particular interrupt is specified
by the loader control card

INTSEGM(n)

where

n is the logical interrupt number.

The loader control card INTSEGM is recognized by
the compiler if it appears between subprograms.
Compiler processing places it in the desired position on
the binary output file. The loader, in turn, associates
interrupt number n with the next entry point it
encounters, and places this information in a table for use
at initialization time. An example of its placement is
presented later in this section.

DADIOS variable specifications

Variables to be assigned to DADIOS channels are
specified in labeled common blocks. The functions to be
performed (fix/float) are based on the type of the
variable (integer, real, etc.).
DADIOS variables are handled differently, in that
instead of allocating core to these variables, DADIOS
channel addresses are assigned to them. When these
variables are used (at run time), hardware will route
the data over the appropriate channels to/from the
DAC/ADC's, etc.
The variables are specified by the statement form:

COMMON/*ADCn/  {k1.}a1, ..., {ki.}ai
COMMON/*DACn/  {k1.}a1, ..., {ki.}ai
COMMON/*IDISn/ {k1.}a1, ..., {ki.}ai
COMMON/*ODISn/ {k1.}a1, ..., {ki.}ai

where

n is an integer constant specifying a DADIOS unit
number (1 ≤ n ≤ 4);
ki is an (optional) integer constant which specifies
the channel (relative to the first channel for this
type) to be associated with the following ai
(default = 1);
ai are variable names, array names, or array declarations
in which one, two, or three constant dimensions
are specified.

DADIOS variables can be typed either implicitly by
the labeled COMMON statement or explicitly by the
TYPE statement, as standard FORTRAN conventions
allow. The type (integer or real) is reflected in the
variable's address (1 bit). The hardware then determines
whether or not conversion is to be made.
In order to obtain the desired interrupt/element
association, the main program must contain specifications
for all interrupts to be used. For each interrupt,
the interrupt statement must be followed by the
associated DADIOS variable statements.
Subroutines associated with each interrupt are
required to have the corresponding DADIOS variable
statements, but do not require the interrupt statements.
Example:

PROGRAM TEST (INPUT, OUTPUT, HFILE)            NOTE 1
DIMENSION DATA-
INTERRUPT (I=1, H=1, R=1000,                   NOTE 2
           T=2500, P=2500)
COMMON/*ADC1/ A(32)
COMMON/*DAC1/33, B(32)
INTERRUPT (I=2, H=3, R=500,                    NOTE 3
           T=2500, P=3000)
COMMON/*ADC2/ C(32)
COMMON/*IDIS1/ ID(10)

EXECUTABLE STATEMENTS

END
INTSEGM(1)                                     NOTE 4
SUBROUTINE INT1
COMMON/*ADC1/ A(32)                            NOTE 5
COMMON/*DAC1/33, B(32)

END
INTSEGM(2)                                     NOTE 6
SUBROUTINE INT2
COMMON/*ADC2/ C(32)                            NOTE 7
COMMON/*IDIS1/ ID(10)

END                                            NOTE 8

NOTE 1: Information pertaining to the hybrid environment
is maintained in file HFILE for system usage.
NOTE 2: (Logical) interrupt 1 is to be associated with
hardware interrupt 1, an internal interrupt (by virtue of
the default value of the omitted E parameter), hence
synchronous, and is to have an RCT of 10 ms, a
TOLERANCE of 25 ms, and a PERIOD of 25 ms. The
first 32 ADC elements of DADIOS unit 1, as well as the
32 DAC elements, starting at element 33, of the same
DADIOS unit, are to be associated with interrupt 1.
NOTE 3: (Logical) interrupt 2 is to be associated with
hardware interrupt 3, also synchronous, and is to have
an RCT of 5 ms, a tolerance of 25 ms, and a period of
30 ms. The first 32 ADC elements of DADIOS unit 2
and the first 10 input discrete channels (each 16 bits
wide) of DADIOS unit 1 are to be associated with this
interrupt.
NOTE 4: This loader directive will associate the primary
entry point of subroutine INT1 with interrupt 1, and
will cause all unsatisfied externals, up to this point, to
be satisfied.
NOTE 5: This is a duplication of the element specification
to establish addresses for the DADIOS variables A and
B for this (interrupt) subroutine.
NOTE 6: This loader directive will associate the primary
entry point of subroutine INT2 with interrupt 2, and
will cause all unsatisfied externals, since the last
INTSEGM directive, to be satisfied.
NOTE 7: These statements establish addresses for DADIOS
variables C and ID for this subroutine.
NOTE 8: In addition to standard end-of-program processing,
all unsatisfied externals since the last INTSEGM
directive will be satisfied.

HIGH PERFORMANCE LINKAGE-HARDWARE
The High Performance Linkage System, as shown in
Figure 9, uses a DADIOS (Direct Analog Discrete
Input Output Subsystem) to gain fast data transfer,
an ACS (Analog Control Subsystem) to control any
analog devices in the system, a HRTM (Hardware
Real Time Monitor) to efficiently schedule "interrupts" into the system together with a CB (Control
Board) to tie system control functions together. The
tie of the system to the 6000 mainframe for data
transfers is via a Data Bus Extension modification to
the mainframe and a Bus Adapter which allows up to
four 6000 CPU's to access the linkage. The tie of the
interrupt scheduling structure of the HRTM to the
6000 mainframe is via an exchange jump modification

High Performance Computing System

to the mainframe. These devices are discussed in the
following paragraphs.

ACCESS EXPANSION

*

DATA BUS
EXPANSION

DATA
MODULES

Mainframe modifications and bus adapter

IN .....UIllDiTI
EXPANSION

ALLOWED TO I.
Of ANY TYPE

8000 SERIES

COMPUTER

The Data Bus Extension modification to the mainframe extends the central memory bus to the outside
world and contains the hardware to allow this extension
to function as an extension of central memory of the
6000 mainframe. The Bus Adapter adapts this extended
memory port (together with identical ports on up to
four 6000 mainframes) to the DADIOS, HRTM, and
ACS. This device's function is primarily one of timing
and resynchronization.
The Exchange Jump modification to the mainframe
provides an external port into the exchange jump
mechanism of the 6000 mainframe.

The DADIOS is a multi-programmed linkage subsystem which allows n jobs (n::; 15) running in the
system to concurrently utilize DADIOS for data conversion. Note, that the system may connect to as many
as 4 CDC 6000 or CYBER 70 mainframes. DADIOS
is a combination of linkage modules providing the
following capabilities:
• Simplified programming for data transfer to/from
central processor unit and DADIOS on extended
central memory read and write buses.
• Access from peripheral processors via standard
data channel for setup.
• Individual assignment of channels (in groups of 8)
with each job number under program control

CAPABILITY
FOR ADDITIONAL
COMPUTERS

USER DATA
(ANALOG &
DISCRETE)

.1IUA1.

COIIDtT'OII'M

JOB NUMBER
FROM HRT..

* ~~~~o: TO FOUR 8000 SERIES COMPUTERS
** EXPANSION TO SIXTEEN UNIT.

Figure 10-DAD lOS

•
•

Dadios

DATA BUS
EXTENSION

557

•
•
•

(element reservation of ADC's, DAC's and discretes).
Data integrity between concurrently processing
jobs.
Program selectable hardware fixed-to-floating point
conversion for all ADC channels.
Program selectable hardware floating-to-fixed point
conversion for all DAC channels.
Intermixing of various types of instruments.
Expansion capability.

An overall block diagram of DADIOS is shown in
Figure 10. DADIOS can be addressed from the CPU
and pass data to or from the CPU through the Bus
Adapter and Data Bus Extension.
The Bus Adapter and Data Bus Extension are
transparent in that they do not change addresses or
data in any way. In effect, DADIOS communicates
directly with the CPU with its primary· purpose being
to pass data between the outside world and the CPU.
The path of DAD lOS via the Data Channel is primarily
used for DADIOS setup.
As indicated by Figure 10, DADIOS is comprised of
a series of modules which can be configured into a
linkage system as required. Brief descriptions of each
module indicated in Figure 10 are given in the following
paragraphs.
Address :module

SETUP
CDC
6000 SERIES
COMPUTER

-INTERRUPTS
(FOR DATA TRANSFER AND CONVERSION TIMING)

EXCHANGE
JUMP

USER INTERRUPTS
TIMING SIGNALS FOR USER'S USE

Any channel in each of the input or output modules
can be addressed. The address carries a bit to indicate
which fixed-to-floating point or floating point to fixed
point conversion is to be done.
Float and unHoat :module

DADIOS
HRTM
CB
ACS

-

DIRECT ANALOG DISCRETE INPUT OUTPUT SUBSYSTEM
HARDWARE REAL TIME MONITOR
CONTROL BOARD
ANALOG CONTROL SUBSYSTEM

Figure 9-High performance linkage system

Sixteen-bit fixed point data words are converted into
60-bit floating point words (and vice versa) in this
module.

558

Fall Joint Computer Conference, 1971
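As a rough illustration of the fix/float idea, the sketch below scales a 16-bit two's-complement count to a fraction of full scale and back, with a C double standing in for the 60-bit floating point word. The scaling convention is assumed for illustration only and is not taken from the hardware description.

/* Assumed scaling: the 16-bit count is treated as a fraction of
 * full scale (2^15).  A double stands in for the 60-bit word.
 */
#include <stdio.h>
#include <stdint.h>

#define FULL_SCALE 32768.0      /* 2^15, assumed full-scale count */

static double adc_to_float(int16_t raw)   { return raw / FULL_SCALE; }

static int16_t float_to_dac(double frac)
{
    double v = frac * FULL_SCALE;
    if (v >  32767.0) v =  32767.0;       /* clamp to converter range */
    if (v < -32768.0) v = -32768.0;
    return (int16_t)v;
}

int main(void)
{
    int16_t sample = 16384;               /* half of positive full scale */
    double  f = adc_to_float(sample);
    printf("ADC count %d -> %f of full scale -> DAC count %d\n",
           sample, f, float_to_dac(f));
    return 0;
}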

Integrity module

Integrity between jobs is provided by this module so
that one job cannot interfere with the operation of
another job in any way or at any time. For example,
the module prevents a user from altering data in any
channel not assigned for his use.

Data modules

Four types of Data Modules (Serial Input, Parallel
Input, Serial Output, Parallel Output) can be provided.
Each module provides buffering for 16 data words each
16-bits wide. Data can be routed to or from these
modules through the Float and Unfloat Module or
directly from the CPU or instruments.
The data flow through the DADIOS system is
illustrated by the following paragraphs.

Data output

Again it is important to understand that DADIOS is
an extension of central memory. The registers of the
D/A converters and the discrete signal buffers serve as
CM locations into which data can be written. The
sequence for writing the data is as follows:
1. The program executing in the CPU writes data to
the first rank register of the addressed D/A or
discrete location. (A bit in the address will
indicate whether or not the data is to be converted from floating-to-fixed point via the
floating-to-fixed hardware.)
2. The sync pulse will transfer data from first to
second rank registers. In the case of analog
channels, the second rank data is converted and
presented as an analog voltage.

Data input

Since data is not stored in central memory, but
rather in the DADIOS interface, the sequence of events
for acquiring the data is straightforward.
1. A sync pulse for the interrupt is issued to the
DADIOS system.
2. DADIOS has the capability to utilize one A/D
converter per analog channel. In this case, the
interrupt would issue a start convert pulse to all
channels previously assigned to that interrupt
and they would all convert simultaneously.
When data conversion is complete, the data can
essentially be thought of as residing in central
memory.
If a multiplexed analog system is used as
instrumentation on DADIOS, the interrupt will
place the applicable sample and hold units into
hold mode. The multiplexer-converter would then
start converting the analog data and placing it
into the applicable digital buffer for that channel.
When all channels associated with the interrupt
have been converted and the data is contained
in the digital buffer, the data is directly addressable from the central processor.
3. Memory references by the CPU can now act on
the data contained in DADIOS, and the references will indicate whether the data is to be
presented in the CPU in fixed or floating point
format.

Analog control subsystem (ACS)

For systems where control of analog computing
devices (i.e., analog computers) is required, the ACS
concept has been developed to allow this control to be
accomplished by the central processor. As indicated by
Figure 9, ACS is also treated as an extension of central
memory. Thus, if a mode is to be changed, a pot or
digital coefficient unit is to be set, a DVM reading is to
be made, etc., these operations are initiated by writing
into the ACS section of "extended memory" and any
information to be received from the analog device is
received via reading from the ACS section of "extended
memory."
Hardware real-time monitor

The Hardware Real Time Monitor (HRTM) is
designed to provide faster response to external interrupts of real-time jobs than can be done with a software
monitor system. This is accomplished by performing the
detection and dynamic scheduling between external
interrupts in hardware circuits. The following capabilities are provided with the HRTM:
• Scheduling and monitoring of up to 15 concurrently
executing real-time hybrid simulation jobs.
• Dynamic scheduling between all interrupts on a
"least-time-to-go interrupt basis."
• Assignment of any interrupt to any job, providing
more efficient usage of external interrupts.
• Use of three scheduling parameters (real compute
time, tolerance, frequency of occurrence) to allow
more CPU time for real-time jobs.
• A maximum of 64 external interrupts (a minimum
of eight, expandable in groups of eight), terminated
on a control board (for user termination).
• Availability of providing data to allow time-accounting on an interrupt basis.
• Access to HRTM via Data Bus Extension (used
for central monitor information and error status
to central monitor and user).
• Access to peripheral processor via standard data
channel for setup and monitor functions.
• Error detection circuitry to prevent one time
critical job from interfering with another.

Figure 12-Control board

System design

A diagram showing how the HRTM fits into a real-time system is shown in Figure 9. The HRTM can gain
access to the CPU by directly providing an exchange
jump to the mainframe. The CPU has the ability to
read directly from the HRTM through the extended
memory via the Read/Write Bus Adaptor. Any PPU
may setup the HRTM and input status information via
the Data Channel Adapter. The HRTM receives all
interrupts and returns all control signals to a control
board.
System description

An overall block diagram of the HRTM is shown in
Figure 11.
Error detection
The Error Detection circuitry is necessary to ensure
that one user does not interfere with another. The
enable/ disable logic will prevent any unused interrupts
from requesting processing. The following parameters
are monitored and if any errors are found, the job is
removed from real-time.
• The required compute time of each interrupt is
monitored and counted down. If the RCT for
an interrupt counts to zero and the computation
is not complete, the job is considered in error and
it will not be allowed to continue until its next
frame. The user will now be notified of this condition
and he can take any action he wants; however, it
is noted that not allowing him to continue until the
next frame protects other users from feeling any
effects of his overrun of his schedule.
• The occurrence period for each interrupt is monitored, and if interrupts occur faster than the
specified period, the job is considered in error. The
HRTM will not allow interrupts to be scheduled
faster than the specified period. User will be
notified of the error condition.
• Error signal lines from the DADIOS interface
system are sent to the HRTM via the control
board. These are errors which occur if the operating
program attempts to write out to a DAC or output
discrete, or read an ADC or input discrete assigned
to another job number. They are brought to
HRTM so errors can be associated with the
interrupt which had them.
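A small sketch, with invented names and tick-based units, of the RCT check described in the first bullet above: the remaining budget is counted down while the task runs, and a task that exhausts it is held until its next frame.

/* Invented illustration of the per-interrupt RCT monitoring; the real
 * checks are performed in the HRTM hardware.
 */
#include <stdio.h>

struct monitor {
    int rct_units;       /* compute-time budget per frame, in ticks */
    int rct_left;        /* ticks remaining this frame              */
    int locked_out;      /* 1 = overran; no CPU until next frame    */
};

static void run_one_tick(struct monitor *m)
{
    if (m->locked_out) return;
    if (--m->rct_left <= 0) {
        m->locked_out = 1;          /* overrun: protect the other jobs */
        printf("RCT exhausted: task held until next frame\n");
    }
}

static void next_frame(struct monitor *m)
{
    m->rct_left = m->rct_units;     /* budget and lockout are reset    */
    m->locked_out = 0;
}

int main(void)
{
    struct monitor m = { 3, 3, 0 };
    for (int t = 0; t < 5; t++)     /* task tries to run for 5 ticks   */
        run_one_tick(&m);
    next_frame(&m);
    printf("new frame: %d ticks available again\n", m.rct_left);
    return 0;
}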

Figure 11-HRTM block diagram

Control board

The Control Board (CB) is the device in the linkage
system which allows the system to be tailored to a
particular user's needs.
All Period and Tolerance Parameters (used for
generation of synchronous time frames) from, and
interrupt schedule lines to, the HRTM, user external
interrupt signals, timing signals to user, and interrupt
signals to the DADIOS are terminated on the Control
Board. Figure 12 indicates typical ties to the Control
Board.
The basic reason for the Control Board is to adapt the
flexibility of the system to the user's needs. The more
flexible the user wants to be, of course, the more
combinations of paths between the user, DADIOS, and
HRTM will exist and the more complicated the Control
Board will be. Examples of some of the basic functions
that can be provided by the proper use of the Control
Board are:

• Ability under program control to assign any
interrupt source to a particular portion of DADIOS
equipment.
• Ability to assign an interrupt source either to an
external source or to an internal source (usually
a Tolerance = 0 or Period = 0 signal from HRTM).
• Ability to hold up the scheduling of an interrupt
until the data has been converted. This is accomplished by:

1. The incoming interrupt is sent to DADIOS to
initiate any conversion which is to take place.
2. After all conversion is complete, the acknowledge
signal is sent from DADIOS to the Control
Board and then to HRTM to start the dynamic
scheduling of this interrupt.

This process permits the conversion of data to
completely take place prior to involving the CPU
with the interrupt task and prevents the occurrence
of a CPU wait-for-data dead spot.

The NADC computing system

The Naval Air Development Center, Johnsville, Pa.,
is acquiring a large computing complex to perform time
critical simulation studies as well as scientific batch
processing. The NADC configuration is shown in
Figure 13.
The complex is currently expanded to handle eight
interrupts in each HRTM and 128 channels and
128 D/A channels in the dual 6600 complex.

Figure 13-NADC computing system

REFERENCES
1 M FINEBERG O SERLIN
Multi-programming for hybrid computation
Proceedings 1967 Fall Joint Computer Conference
2 Time critical simulation systems-General information
manual
Control Data Publication No 44629200

Effective corporate networking, organization, and
standardization
by PAUL L. PECK
The MITRE Corporation
McLean, Virginia

INTRODUCTION

As investment in automatic data processing systems
has increased, methods to improve the productivity of
these systems have constantly been sought. One of the
most promising methods is networking-the integration
of a number of independent data processing installations
connected by data communications to provide improved
data processing support for the linked installations.
This paper discusses the advantages of networking,
addresses the advantages of utilizing homogeneous
configurations in establishing a corporate ADP network
and presents a management concept and a proposed
network standards guide which it is believed will
promote the acceptance, growth and effectiveness of
the corporate ADP network.

BACKGROUND

Corporate network development has been hindered
by compatibility limitations. Traditionally, data processing
has been considered a support function, and
decisions on the level and type of data processing
support were made at the installation level, not at the
corporate level. In many companies there was a tendency
to deplore the proliferation of incompatible computer
systems and data banks as new systems were
installed, but to do nothing to achieve compatibility
since this was not a corporate objective. Compatibility
denotes the ease with which a program running on one
system can be transferred to another or the ease with
which data generated in a particular system format can
be utilized by another system. A popular conception is
that compatible computer systems will accept programs
written in standardized languages and perform the
same computations, producing the same results from
the same data. Because of compatibility limitations, the
option of quickly and economically creating an efficient
corporate network with its associated advantages of
workload sharing, data sharing, program sharing,
remote service, program exchange and joint program
development was closed to corporate management.
The incompatibility among systems produced by
different vendors, as well as incompatibility among
successive systems produced by the same vendor, stems
from differing approaches to hardware and operating
system design and is magnified at each installation by
configuration differences, utilization of assembly rather
than procedure-oriented languages, and the introduction
of installation-peculiar operating procedures.1,2,3,4
Much of the existing hardware incompatibility stems
from differences in character/word lengths, internal
computer codes (BCD, ASCII, EBCDIC, etc.),
boundary alignment considerations (padding, packing,
justification, etc.), error checking techniques and
numeric representation alternatives (binary, floating
point, etc.).
In addition to these hardware differences, basic
software differences exist in the areas of operating
systems and languages. Each manufacturer provides a
specific operating system to use with his hardware.
These operating systems offer widely differing services
in procedure-oriented language and utility support,
hardware support, file management and input/output
control, systems services and job control. Assembly
languages differ extensively with regard to the size and
nature of the instruction sets and optional features
provided.
Data compatibility as such rarely exists. Existing
corporate data banks are often not compatible because
each installation data base was developed using unique
hardware and software. The basic data definitions, data
formats, and data structures were designed to satisfy
local, not corporate, requirements. This has necessitated
the development of either special data bases or special
programs for format translation to satisfy reporting
and interface requirements. Standard means of de-
scribing data elements and standard approaches to
data bank development have not been used in the past.
To highlight the extent of the incompatibility that
presently exists (even among systems designed to
support a common objective), consider the following
facts determined in the analysis required to support the
World Wide Military Command and Control System
(WWMCCS) procurement which will upgrade the
processing capabilities of up to 109 existing ADP
centers. According to Phil Hirsch,5 the ADP centers
to be included in the WWMCCS procurement "are
supported by 30 different programming languages
(dialects are ignored), and 802 separate programs. At
least 75 percent of these are in machine-dependent code,
primarily Autocoder. There are 20 FORTRAN programs, 8 in COBOL, and 272 in JOVIAL, which may
or may not be standardized dialects. . . . Today each
WWMCCS installation is largely self-contained. The
workload is handled on a batch basis, using local data
bases. Although installations are interconnected, the
terminals usually are off-line devices."
Since the WWMCCS systems to be replaced range in
age from one to 10 years, this situation is probably
similar in type, if not in scale, to that found today in
many large, decentralized corporations.
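The format-translation programs mentioned earlier frequently reduce to table-driven conversion between internal character codes. The C sketch below shows the technique with a three-entry fragment of an EBCDIC-to-ASCII table; it is illustrative only and far from a complete mapping.

/* Three-entry fragment, invented for illustration; not a complete
 * EBCDIC-to-ASCII translation table.
 */
#include <stdio.h>

struct code_pair { unsigned char ebcdic, ascii; };

static const struct code_pair table[] = {
    { 0xC1, 'A' }, { 0xC2, 'B' }, { 0xF1, '1' },   /* fragment only */
};

static int to_ascii(unsigned char e)
{
    for (unsigned i = 0; i < sizeof table / sizeof table[0]; i++)
        if (table[i].ebcdic == e)
            return table[i].ascii;
    return -1;                      /* no mapping defined in this sketch */
}

int main(void)
{
    unsigned char record[] = { 0xC1, 0xC2, 0xF1 };
    for (unsigned i = 0; i < sizeof record; i++)
        putchar(to_ascii(record[i]));
    putchar('\n');                  /* prints: AB1 */
    return 0;
}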
FACTORS THAT EASE THE
IMPLEMENTATION OF NETWORKING
Because procedure-oriented languages (POLs) such as FORTRAN (the de facto scientific programming language) and COBOL (the standard commercial programming language) were developed to facilitate programming and to ease system-conversion problems, programmers can now code in languages that are somewhat independent of hardware. The use of these languages therefore eliminates many compatibility limitations. Program transferability was further promoted when ANSI sanctioned standard specifications for FORTRAN and COBOL.
With the advent of third generation equipment,
intra-system compatibility (compatibility among systems produced by the same manufacturer) became
realizable. Compatibility is a major design objective of, and is widely promoted by, all major computer manufacturers. For example, in 1964 IBM announced the
System/360, its new line of compatible computers.
Since the System/360 was designed for both upward and
downward compatibility, a significant step toward
intra-system compatibility had been taken. IBM's new
series, the System/370, offers intra-system compatibility and is compatible with the System/360 series.
Control Data Corporation explicitly states that programs designed for the CDC 6000 Series will operate

on any CDC 6000 Series configuration. Furthermore,
Control Data has announced that its new system, the CDC 7600, will be compatible with the 6000 Series.
RCA stresses that its Spectra 70 Series, in addition to
being upward and downward compatible, is compatible
with the IBM 360 Series.
System/360 plug-to-plug compatible magnetic disk
drives, magnetic tape drives, and large core storage
units are now available from independent peripheral
manufacturers. Plug-to-plug compatibility means that
the new device is physically and electrically interchangeable with the IBM peripheral, that neither the
peripheral nor the equipment to which it connects
requires any modification to effect the replacement, and
that no modifications are required to the operating
system and user programs.6 Thus, a network consisting
of IBM systems can maintain its compatible status and
still take advantage of peripheral development which
provides price-performance benefits.
Perhaps the most significant developments have
taken place in the area of data communications. Data
transmission is growing at a tremendous rate with a
corresponding decrease in its costs as the Bell System
improves its digital transmission capability. Private
microwave links for data communications are being
developed by Microwave Communications, Inc. and
Datran. Digital multiplexer and modem advances
which have made possible more efficient utilization of
existing carrier channels for data communications were
spurred by the FCC decision in the Carterfone case
which permitted the attachment of customer-provided
data sets.7,8,9,10
Significant research has been conducted in the areas
of routing, buffering, synchronization, error control,
reliability and computer-communications interface.
The ARPA network has developed a separate communications processor, the Interface Message Processor
(IMP) to connect host computers to the telephone
network. The Interface Message Processor is an
augmented, ruggedized version of the Honeywell
DDP-516, and includes 12K 16-bit words of core
memory, 16 multiplexed channels, 16 levels of priority
interrupt and logic supporting host computers and high
speed modems. Since the ARPA network is a heterogeneous network (a network of dissimilar systems),
special hardware interfaces have been developed to
connect the IMPs to a wide variety of different hosts.11 The MERIT network is engaged in similar research and has developed a similar communications processor.12
NETWORKING
Networking is the integration of a number of independent data processing installations connected by data communications to provide improved data processing support for the linked installations. Existing
networks include the TUCC Network, the Control Data
Cybernet Network, the Octopus Network, the TSS
Network, the ARPA Network and the MERIT
Network.11-18
Network advantages

Potential benefits of networking include improved
operational efficiency, increased availability of resources, improved ADP backup capability, and possible
reductions in ADP support costs. Since alternate data
processing resources are available, a user may realize
an improvement in turnaround time by submitting his
job to another network node when the local facility is
saturated. Integration of activities is inherent in networking; consequently, the growth of common, compatible programs, data bases, and data formats will be
promoted and duplication of effort will be reduced.
Corporate data processing costs should decrease
because of higher system productivity resulting from
the increase in workload sharing, program sharing, data
sharing, joint program development and program
exchange between installations. Workload sharing, the
transmission (either manually or automatically) of a
discrete job entity to another ADP installation for
execution, tends to eliminate the extremes of underutilization and overloading of individual installations.
Program sharing and data sharing are variations of
workload sharing. Data sharing permits a user to send
his programs to another installation for the purpose of
utilizing data there. Program sharing enables a user to
take advantage of programs at other installations by
sending his data to the program. In all types of sharing
the output is returned to the user at his location. Both
data sharing and program sharing minimize use of
communications facilities. Communications traffic is
further reduced by remote service which enables the
user to utilize both a program and data at another
installation and receive the output at his location.
Program exchange is the exchange of techniques, subroutines or complete programs which can be used without incurring the expense of additional modification.
Since successful implementation of networking has
led to improvements in processing capability and to cost
reductions,13,14,15 corporate officers responsible for data
processing should seriously consider the implementation
of corporate ADP networks.
No quantitative studies of the advantages of networking are cited because I have not been able to find
any. The difficulty in quantifying the utility of networking arises because of the general unavailability of data. Even when data is available, ongoing network
development makes it difficult to determine how much
of the improvement in effectiveness is due to implementation of the network and how much is the result of
tuning the system. However, as an indication of the
utility of networking the reader should consider the
experience of the TUCC Network.
The Triangle Universities' Computation Center
(TUCC) was established in 1965 as a cooperative
venture among three major North Carolina universities:
Duke University, North Carolina State University
(NCSU), and the University of North Carolina (UNC).
The TUCC network is a homogeneous network
(similar systems are linked), the center of which is a
System 360/75 with one million bytes of high speed core
and two million bytes of Large Capacity Storage,
operating under OS/MVT. There are approximately
100 terminals (high, medium, and low speed) in the
network. The high speed terminals are a 360/50 and an
1130 at UNC, a 360/40 at NCSU and a 360/40 at Duke.
The 360 systems are multi-programmed with a partition
for local batch work and a telecommunications partition
for TUCC remote I/O services. The medium speed
terminals are IBM 2780s (or equivalents) and 1130s,
and the low-speed terminals are teletypes, IBM 2741s
(or equivalents) and IBM 1050s.
According to M. S. Davis, former director of TUCC,
the primary incentive for establishing the TUCC
Network was economic. The network was formed
because it was believed that a larger ADP system
serving the three major universities would provide
economy of scale and reduce the effect of the shortage
of competent systems programmers.15 The question is
often asked if the three universities are better off with
the network than if each university had upgraded its
individual ADP system. TUCC officials have estimated
that if the net hardware cost of the Model 75 were
divided three ways, each member would have an
additional $10,800 per month with which to upgrade his
existing system. Each university could then install a
Model 50 with 256K memory and a 2314 disk file. The
network, however, is realizing substantially more
power than would be available with the three separate
systems. The throughput of the Model 75 alone is
about six times that of a Model 50.13
Advantages of homogeneous networks

Networking is not a panacea; however, if several
decentralized data processing facilities require upgrading in the same time period, the acquisition of
common hardware and software is recommended so
that corporate management will preserve the capability of combining these facilities into a network with
minimum expenditure of resources and time.
It is recognized that significant effort is being directed
toward establishing heterogeneous networks, e.g., the
ARPA Network and the MERIT Network. However,
these networks are research-oriented, not profit-oriented. In each of these networks, special communications processors, network control systems and
communications-computer interfaces have been developed. In both networks extensive effort has been
directed toward establishing a network protocol and
the MERIT network has proposed the development of
a standard data description language to facilitate
transmission of data between computers, systems and
programs and to provide a convenient and complete
format for the storage of data and its associated
descriptor information. Development and implementation of a standard data description language would
significantly reduce compatibility limitations; however, information interchange between dissimilar systems presents many problems,19,20,21,22 and some lead time
must be anticipated before such a capability will
be implemented.
Workload sharing, program sharing, data sharing,
program exchange and joint program development are
easier to implement in a homogeneous network because
program modification and data translation hardware
and software can be kept to a minimum, the cost of
developing interface hardware and software can be
minimized, and network protocol is easier to implement.
An often-voiced disadvantage of a homogeneous
network is that the user is not provided the opportunity
to utilize special hardware and software capabilities
provided by other vendors. Presently, to take advantage of the special capabilities of a dissimilar system
the user tailors his program to comply with the hardware and software requirements of the dissimilar
system. The user must familiarize himself with the
hardware, software, and operating idiosyncrasies of
the dissimilar system to effectively utilize its capabilities. Consequently, the need for these special
capabilities must be scrutinized by corporate officials
and alternative means of providing these capabilities
considered. For example, in the long run it may be
more economical to lease time on the dissimilar system or to use a standard system even though it may
not be best suited for processing certain types of programs.
In summary, homogeneous networks are best able to
satisfy corporate networking requirements because a
minimum expenditure of resources and time is required
to implement the network.

ORGANIZING FOR EFFECTIVE CORPORATE
NETWORKING
Although common hardware and operating systems
provide a foundation for the quick and economical
development of networks, corporations must establish
a corporate ADP focal point which will be responsible
for the development and maintenance of a network
standards guide. The need for a corporate ADP focal
point and the utility of a network standards guide in
implementing a network will be evident to anyone who
has attempted to run programs at one installation that
were developed elsewhere. Since the individual computer installations in a network are frequently staffed
by professionals of dissimilar backgrounds and since the
goals of the member ADP installations tend to be
parochial rather than corporate, some means of
facilitating communication and cooperation among the
installations is required. A network standards guide
defining corporate standards serves as a common
reference point for all installations.
The formation of corporate ADP networks has been
hindered by the absence of a corporate focal point for
standardization. As evidence of this situation, consider
that in general:
• short-term individualized solutions, rather than
common applications programs have been developed;
• standard programming approaches to specific
classes of applications have not been developed;
• standardized benchmarks for the evaluation of
programming approaches do not exist;
• ANSI standard procedure-oriented languages are
not utilized;
• a variety of documentation techniques exists; and
• compatible data banks are rare.
The creation of a Corporate ADP Coordinating Office
and a Corporation User Group will ease the solution of
these problems and facilitate integration of individual
ADP installations into a corporate network.
Corporate ADP coordinating office

This office, which reports directly to the corporate
data processing director is responsible for developing
corporate ADP policy and for providing direction and
assistance in the establishment and maintenance of a
corporate ADP network. Since the corporate data
processing director is responsible for controlling and coordinating all data processing activities, the Corporate Data Processing Office personnel will serve as functional staff with implied line management power because of the authority of the corporate data processing director. Suggested functions of this office are listed in Figure 1.

• Develop the network standards guide.
• Develop benchmarks for the comparison of various approaches to the solution of vital corporate problems and evaluate these approaches.
• Develop and maintain a library of corporate program documentation.
• Provide a vehicle for the dissemination of information among the decentralized facilities.
• Review corporate ADP system needs.
• Monitor and provide assistance in corporate ADP procurement efforts.

Figure 1-Functions of the corporate ADP coordinating office
No organization is recommended because specific
organizational relationships will be developed in
accordance with the management concepts of the
corporate data processing director. Two networks with
centralized management are the Octopus Network and
the Cybernet Network. The Computation Department,
Lawrence Radiation Laboratory, University of California/Livermore has overall responsibility for operation of the Octopus network and maintains separate
project groups for software design and development,
program evaluation, documentation dissemination and
standards development and maintenance. All activities
in Control Data Corporation's Cybernet Network
including hardware/software development, resource
accountability, and documentation development and
dissemination are controlled by the Data Services
Division.
A major function of the Corporate Data Processing
Office is the development and enforcement of the
detailed standards that comprise the network standards
guide. This office is responsible for:
• determination of the type and degree of standardization required;
• implementation of the standardization program;
and
• management of the program.
To ensure successful implementation of the network,
the network standardization program must have the complete backing of the corporate data processing director; the importance that he attaches to the network standards guide must be well publicized; and the most effective means of initiating the network standardization program must be determined. The two basic methods of initiating the network standardization program, the phased implementation approach and the one-step implementation approach, must be evaluated.
The advantages of each approach (decreased costs,
increased effectiveness, etc.), the probability and cost of
implementing each approach, and the impact on networking and current installation activities must be
determined. If the phased implementation approach
proves to be the better choice, the areas to be standardized must be specified and a phasing schedule
developed.
The network standards section of the Corporate ADP
Coordinating Office will be responsible for the implementation and management of the standardization
program. This continuing effort includes the review,
modification, and enforcement of existing standards and
initiation of new standards as they are needed. Furthermore, to combat organizational parochialism and a
breakdown of the network due to blurring of responsibilities, this office must continually review the relationship of individual computer systems to the network to
ensure that the overall interests of the company are
being maintained. The Corporate ADP Coordinating
Office thus serves as the network control mechanism.

Corporation user group

To ensure the effectiveness of the network standardization program, a users' group is needed. This
users' group, composed of key installation representatives, will advise and assist the Corporate ADP Coordinating Office in integrating the decentralized ADP
facilities into a corporate network. As such, one of
their prime functions is to help develop the network
standards guide. This participation should encourage
coordination and communication among the decentralized installations, thus making it easier to establish
the network standardization program. Once the network
is established, this group will continue to support the
Corporate ADP Coordinating Office by proposing and
reviewing network standards as required.
NETWORK STANDARDS GUIDE
Interchangeability of programs and data is fundamental to economical networking, and interchangeability is a function of standardization.

Standardization is the process of developing, reviewing, promulgating, and enforcing guidelines for controlling the performance of the discrete elements which interact in the operation of a system. Within an installation, ADP standards facilitate integration of system elements (hardware, software, procedures), make possible reductions in both operating and software costs, ease long range planning, and are a vital element in increasing flexibility. Standards increase the ability of an ADP installation to respond to changing operational needs.

Similarly, just as installation standards are needed to integrate the discrete system elements, network standards are required if installations are to be integrated into a network.

The network standards guide will consist of the following three categories of standards: operating, software, and management. Figure 2 is a proposed outline for the network standards guide and indicates how these categories might be further subdivided.

Operating Standards
• Network Protocol
• Configuration
• Job Preparation
• Job Processing
• Job Termination

Software Standards
• System Analysis Standards
• Programming Standards

Management Standards
• Data Conventions
• Program Classification Rules
• Networking Rules
• Utilization, Review, and Modification Rules

Figure 2-Network standards guide outline
Operating standards

Operating standards include network protocol, configuration rules, job preparation, job processing, and
job termination procedures. Network protocol consists of the operating rules and procedures to be utilized in the
receipt, processing, and transmission of programs and
data initiated at other installations. Included here are
error control procedures, message transmission techniques, and methods of determining and specifying
program priorities.
Configuration rules are needed to ensure that a
nucleus of hardware and software compatibility is
maintained among the network installations. Although
each decentralized installation is required to maintain
this degree of compatibility, it is recognized that certain
installations may require special purpose hardware or
software (e.g., large core storage, graphics systems and

text processing systems) to satisfy local needs. The
addition of these special capabilities is encouraged; however, the Corporate ADP Coordinating Office
should determine if these capabilities can be effectively
utilized by other installations and develop standards for
their use if it proves necessary. The maintenance of
software compatibility requires continuous monitoring
of vendor modifications to the operating system and
the support software. Because of the number and
variety of special features and peripherals available, a
strictly enforced configuration control policy is needed
if system compatibility is to be maintained. For
example, to maintain hardware compatibility the
required computer configuration must be defined and
channel assignments and device addresses must be
standardized.
Job preparation standards include procedures for
preparing both the input data and the programs required for a data processing run. Programs should be
categorized and common job submission and job control
statements developed. Inherent in the development of
these common control statements is the assumption
that common default options will be specified for each
run category. At program compilation time, a programmer usually has several options which affect
ancillary aspects of the compilation (e.g., optimize or
do not optimize the resulting machine language program; list or do not list the assembly language code).
If the programmer does not specify any options,
standard default options are invoked.
Job processing refers to all job functions performed
by the computer operator from initiation through
termination of a programming run. Initiation involves
the preliminary set-up of all peripheral devices, the
initial setting of console switches, and the actual run
initiation procedure. Execution includes any additional
set-up activity that can be overlapped, listing of unusual events and operator messages, and operator
responses to interrupts and error conditions.
Job termination standards refer to take-down
procedures utilized under both ordinary and abnormal
conditions, control of installation data, and disposition
of computer results.
Software standards

Software standards refer to those practices which
systems analysts and programmers use in their daily
work. Adherence to these standards promotes resource
sharing and facilitates communication between the
systems design, programming, and user functions.
Systems analysis standards are those practices that ease
the preparation of the system specification. The level of detail of a good system specification is such that the
structure, functions, flow, and control of the system is
defined so that a programmer can readily program,
test, and implement the system.
Programming standards are those guidelines used by
programmers in their daily work. As data processing
capability has increased, the methods of best utilizing
this capability have changed. To increase the overall
effectiveness of ADP support, good programming
standards are necessary. Modular program design
techniques (so that the program can be handled as a
series of subtasks), standard programming approaches
to the solution of well defined applications, and common
testing and checkout procedures should be utilized.
Frequently, a number of software programs are available for solving certain types of problems (e.g., square
root and trigonometric conversion routines). The
development of benchmark tests to evaluate these
approaches will promote efficiency since the advantages
and disadvantages of each technique will be determined
and a network standard will be developed.
It is generally agreed that there is a need to provide
information on what a program is supposed to do and
how it does it. Recognizing this, ADP installations have
independently developed a variety of program documentation guidelines, resulting in a lack of real standardization in documentation. Programming documentation consists of recording the detailed logic and
coding of a program. Entire books have been written
on programming documentation,23,24 and documentation
standards have been established by ANSI. At a minimum, programming documentation should consist of a
summary change log, general program description,
logical and mathematical description, detailed program
flow charts, detailed program, file, and data descriptions, an assembly listing, and a run book (providing calling sequences, job control and configuration requirements, etc.).
Management standards

Before the operating and software standards discussed above can achieve maximum effect, a framework
must be developed for their utilization. This framework
will consist of general policies to promote conformity of
usage and will ultimately determine the effectiveness
of networking for workload sharing, program sharing,
data sharing, remote service, program exchange, and
joint program development.
Management standards can be categorized as data
conventions, programming classification rules, networking rules, and utilization, review and modification
rules.


Data conventions

Data conventions ease data exchange because they
insure that the data itself is consistent.
Standardization guidelines are needed in the areas of:

• data element definition;
• terminology;
• definitions of values and constants;
• data names;
• data exchange formats; and
• data file structures.

The ASCII (American Standard Code for Information Interchange) code should be used for all data
to be transferred between facilities in order to reduce
the number of code translation programs each facility
must maintain and to ease the installation of noncompatible terminals if the need should arise in the
future. Common data exchange formats will further
reduce the complexity of the translation programs.
Program classification rules

One of the benefits of networking is that common
program development and program exchange will be
encouraged. If program exchange is to be effective, the
amount of effort required to search for available programs, sub:r:outines, and approaches must be reduced.
Each ADP installation must therefore categorize its
programs. After this has been done, either standard
categories, with clear definitions of what is included in
each classification, could be developed and all programs
classified; or each facility could distribute copies of its
categories, category definitions, and the programs which
fall into each category. Program classification (a first
step in eliminating duplication of effort) is essential if
common program development and program exchange
are to yield maximum gains and if software conversion
costs are to be kept to a minimum.
Networking rules

Guidelines must be developed which describe the
administrative practices and technical restrictions
required in a network environment. These include
priority determination rules, restrictions on the mixing
of languages within a program, incentives to use ANSI
FORTRAN and COBOL, and configuration rules which
restrict programmers to utilization of a subset of the
available facilities (core, peripherals, etc.).


Utilization, review and modification rules

These management standards define the conditions
under which the various standards apply, the review
cycle and the steps in the standards modification
process. Utilization rules must consider the needs of the
network yet make allowances for the differences which
exist between installations. For example, installation-peculiar programs need not be subject to the strict
rules which apply to programs which will either be run
at or utilized by other ADP installations. The standards
section of the Corporate ADP Coordinating Office
together with the Corporation User Group, should
develop and implement formal review and modification
procedures.
IMPLEMENTATION OF THE NETWORK
STANDARDS GUIDE
The implementation of a standardization program for
a single installation requires a significant amount of
preparation and a great deal of marketing ability.
Additional difficulties are encountered when standards
are to be developed that apply to more than one ADP
facility. These difficulties are magnified when the
separate facilities are to be integrated into a network.
In order to facilitate the introduction of the network
standards guide, the following plan is suggested:
(1) Form a standardization team consisting of
competent representatives from each installation which is to be assimilated into the network.
(2) Announce the standardization program with a
letter from corporate headquarters and a
personal visit by the corporate data processing
director to each decentralized ADP installation.
(3) Develop an outline for the network standards
guide (e.g., an expansion of Figure 2).
(4) Submit the outline to the ADP installation
managers.
(5) Develop the actual standards:
• survey existing standards;
• determine which standards should be kept
and improved;
• develop the new standards which are needed;
(6) Submit these standards for review and approval
by the ADP installation managers.
(7) Distribute copies of the standards to all staff
members of all ADP facilities.
(8) Arrange open meetings at each installation so that the corporate data processing director and the members of the standards section and Corporation User Group can explain how the standards program was developed.
(9) Initiate training for each ADP installation staff group (programmers, operators, etc.) which will be affected by the standards.
(10) Implement the standards.
(11) Establish a formal network standards review committee composed of corporate and installation data processing personnel.
(12) Enforce the standards:
• incentives to abide by the standards should be given to each ADP facility by the Corporate ADP Coordinating Office;
• incentives should also be given to staff members of each ADP facility by their managers; and
• continual monitoring and guidance should be provided by both the Corporate ADP Coordinating Office and the managers of the decentralized installations.

SUMMARY
The benefits of workload sharing, program sharing,
data sharing, remote service, program exchange, and
joint program development are such that networking
should be seriously considered by corporations with
decentralized ADP facilities. Effective networking
provides improved capability, improved operational
efficiency and possible reductions in ADP support costs.
Corporate network development has been limited in the past by compatibility limitations; however, the
effect of these limitations has been diminished by the
development of families of compatible computer
systems. Homogeneous networks are proposed as the
best method of satisfying corporate networking requirements because a minimum expenditure of resources and
time is required to implement the network.
Effective networking requires extensive standardization. It is recommended that a Corporate ADP
Coordinating Office and a Corporation User Group be
formed. The establishment of a Corporate ADP
Coordinating Office will provide a focal point for
implementation and management of the network and
the creation of a Corporation User Group is fundamental to the creation and continued updating of a
relevant network standards guide.
The network standards guide may be divided into
operating standards, software standards, and management standards. The latter part of this paper details the
functions to be standardized and suggests a plan for
implementation of the network standards guide.


ACKNOWLEDGMENTS
The author is grateful to J. J. Powell, P. H. Messing,
J. J. Peterson and S. A. Veit for their many suggestions
and thoughtful review of this paper.
REFERENCES
1 J GOSDEN et al
Achieving inter-ADP center compatibility
The MITRE Corporation MTP-312 May 1968
2 J A WARD
A panel session-software transferability
Proceedings of the AFIPS Spring Joint Computer
Conference Vol 34 pp 605-612 1969
3 K SATTLEY R MILLSTEIN S MARSHALL
On program transferability
RADC Technical Report TR-70-217 November 1970
4 P L PECK
The implications of ADP networking standards for
operations research
Proceedings of the U.S. Army Operations Research
Symposium pp 269-281 1969
5 P HIRSCH
WIMMIX: It's the biggest, but will it be the best
Datamation pp 84-90 October 1969
6 C R FROST
IBM plug-to-plug peripheral devices
Datamation pp 24-34 October 15 1970
7 R A OHARE
Modems and multiplexers
Modern Data pp 58-79 December 1970
8 J E BUCKLEY
A survey of communication tariff developments
Datamation pp 127-132 December 1969
9 A R WORLEY
Practical aspects of data communications
Datamation pp 60-66 October 1969
10 S J KAPLAN
The advancing communication technology and computer
communications systems
Proceedings of the AFIPS Spring Joint Computer
Conference Vol 32 pp 119-133 1968
11 F E HEART et al
The interface message processor for the ARPA computer
network
Proceedings of the AFIPS Spring Joint Computer
Conference Vol 36 pp 551-567 1970


12 E M AUPPERLEE
MERIT computer network hardware
Courant Institute of Mathematical Sciences Computer
Network Seminar November 30 1970
13 F P BROOKS JR J K FERRELL T M GALLIE
Organizational financial and political aspects of a
three-university computing center
Proceedings 1968 IFIP Congress Vol 2 pp 923-927 1968
14 D N FREEMAN J R RAGLAND
The response-efficiency tradeoff in a multiple-university
system
Datamation pp 112-116 March 1970
15 M S DAVIS
Economics-point of view of designer and operator
Proceedings of the Interdisciplinary Conference on
Multiple Access Computer Networks pp 4.11-4.17
April 1970
16 W J LUTHER
Introduction to cybernet
Courant Institute of Mathematical Sciences Computer
Network Seminar November 30 1970
17 J G FLETCHER
Livermore time sharing system-part 1: octopus
Computation Department Lawrence Radiation Laboratory
University of California/Livermore December 1970
18 B HERZOG
MERIT proposal summary
MERIT Computer Network February 1970
19 H SMELTZER H FICKES
Information interchange between dissimilar systems
Modern Data pp 56-67 April 1971
20 C S CARR S D CROCKER V G CERF
Host-host communication protocol in the ARPA network
Proceedings of the AFIPS Spring Joint Computer
Conference Vol 36 pp 589-597 1970
21 A K BHUSHAN R H STOTZ
Procedures and standards for inter-computer communications
Proceedings of the AFIPS Spring Joint Computer
Conference Vol 32 pp 95-104 1968
22 J L LITTLE C N MOOERS
Standards for user procedures and data formats in
automated information systems and networks
Proceedings of the AFIPS Spring Joint Computer
Conference Vol 32 pp 89-94 1968
23 M GRAY K R LONDON
Documentation standards
Brandon/Systems Press 1969
24 D WALSH
A guide for software documentation
Inter-Act Publications 1969

Multi-dimensional security program
for a generalized information retrieval
system
by JOHN M. CARROLL, ROBERT MARTIN, LORINE McHARDY, and HANS MORAVEC
University of Western Ontario
London, Ontario, Canada

INTRODUCTION

GIRS is a generalized information retrieval system which permits the creation and modification of a data base, as well as the retrieval of specified data from the base.

This system is data independent and flexible, so that the user can fit it to his particular application. Its structure is as applicable to the storage of abstracts of technical reports as it is to personnel files. A multilevel protection scheme guarantees security of information against unauthorized examination or modification.

The basic model for data storage is a single-page typewritten record, which can be entered into the file with minimal restrictions placed on its structure. This format has been found useful for storing personnel records, inventory records, quality-control records, and bibliographic entries for information storage and retrieval systems.

The multi-dimensionality of the system's security provisions arises from the fact that the password or passwords assigned to users determine:

(1) which subset of ten available processing functions they can exercise (level 1 protection),
(2) on which portions of records (items) they can exercise these functions (level 2 protection), and
(3) which records they are privileged to work with, or conversely, which records are locked against them (level 3 protection).

Thus data protection is provided at two levels (2 and 3). Associated with each item name is a protection code which applies to the particular item in all records. A particular item, say, Salary in a personnel file, may be protected so that only certain persons may examine that item in any record. This is level 2 protection. In addition, each item in each record has a protection code applying only to that one item. Thus, while a person may be authorized to examine the Salary item in general, he may be prevented from examining the salaries of his superiors by virtue of the protection on these items within specific records. This is level 3 protection.

The system is programmed in Fortran for maximal portability among computer systems, although it is presumed that routines will be written in an appropriate assembly language in the interest of system efficiency and flexibility at a particular installation.

DATA STRUCTURE

Any number of data bases can exist simultaneously, each identified by a unique file name. As shown in Figure 1, a file consists of a series of records, each of which consists of a set of items. The items are further subdivided into elements, the smallest unit of information in the structure.

Figure 1-Structure of the data base (file, record, item, element hierarchy)

A file can contain up to 99,998 records, each of which is identified by a 5-digit record number. Each record is further divided into items, with the restriction that all records in a file have the same number of items and that they be present in the same order in all records.

Each item has a unique name within the file, and consists of an integral number of lines (1 to 10) of 72 characters each. Further, the total number of items in a given file must not exceed 10, and the total number of lines in a record must be less than or equal to 20. Each item can be subdivided into elements by use of delimiters.

The items are numbered from 0 to 9, and the lines within an item are also numbered from 0 to 9. Each line in the record is therefore identified by a 7-digit number, consisting of the 5-digit record number, a 1-digit item number, and a 1-digit line number within the item.
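To make the addressing concrete, the short sketch below packs and unpacks the 7-digit line identifier just described. It is written in Python for illustration only; GIRS itself is programmed in Fortran, and the function names here are invented, not the system's.

# Illustrative sketch only; the arithmetic mirrors the 7-digit identifier above.

def line_identifier(record_no, item_no, line_no):
    """Pack a 5-digit record number, 1-digit item number, and 1-digit line number."""
    assert 1 <= record_no <= 99998 and 0 <= item_no <= 9 and 0 <= line_no <= 9
    return record_no * 100 + item_no * 10 + line_no

def split_identifier(ident):
    """Recover (record number, item number, line number) from the 7-digit identifier."""
    return ident // 100, (ident // 10) % 10, ident % 10

# Line 3 of item 4 in record 00027 is line 0002743:
print(line_identifier(27, 4, 3))    # 2743 (leading zeros suppressed by print)
print(split_identifier(2743))       # (27, 4, 3)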
These restrictions were imposed after observing that the quantitative data in many files can be summarized on a single typewritten face sheet. More narrative data can be held on tape and retrieved in batch for transmission in hardcopy form. Note that an item can include several related data elements; elements are coalesced into items on the basis of security considerations. In a specific implementation record size could easily be made larger if desired.

Two sample data structures are shown in Figure 2. The first is an example of the use of this structure to store abstracts of technical papers. The record consists of 4 items, "TITLE," "AUTHOR," "PUBLISHER," and "ABSTRACT," comprising 1, 2, 1, and 10 lines respectively. The first of these is not subdivided into elements, while the last three are. The second example shows a personnel record format using 6 items and a total of 12 lines.

Figure 2-Sample data structures: (a) format of a record for storing bibliographic entries, with items TITLE, AUTHOR, PUBLISHER, and ABSTRACT; (b) format of a personnel record, with items NAME, ADDRESS, NUMBER, VITA, MEDICAL, and HISTORY

The data used to create a new data base must be contained in a disk file in card image format organized as shown in Figure 3.

Figure 3-Card input for creation of a file: password count card, password detail cards, item count card, item detail cards, record count card, record detail cards, termination card

The data file is to be written as fixed length Fortran records of 80 characters in ASCII mode. This file can be read using the random access mode of PDP-10 Fortran. To avoid confusion, the term "record" will refer to one of these 80 character records. The term "data record" will refer to one of the user's input data records identified by a five-digit number. The file is divided into two parts: a 70 record header, followed by the user's data. A diagram of the structure is given in Figure 4.

Figure 4-Format of a data file (a 70-record header followed by the data records)

The header is fixed length and contains all the information necessary to access the data. Its content areas follow:

Records 1-10:  passwords and protection keys, 5 sets/record
Records 11-50: 4 lines per item containing the item name, # lines in the item, level 2 protection code, and up to 24 element names
Records 51-70: index to data records containing (data record #, Fortran record #) pairs, 8 pairs/record

The index is set up to reference every 160n-th record, where n is the value given on the record count card.
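As a rough picture of this header layout, the sketch below (an illustrative Python fragment, not GIRS code) treats the password sets and item descriptions as stored consecutively in the order given above; the per-set and per-item record formulas are inferred from the counts quoted, not taken from the paper.

# Header map constants come from the text; the formulas assume consecutive storage.

HEADER_RECORDS   = 70                 # Fortran records 1-70 form the header
PASSWORD_RECORDS = range(1, 11)       # 5 password/protection-key sets per record
ITEM_RECORDS     = range(11, 51)      # 4 records (card lines) per item
INDEX_RECORDS    = range(51, 71)      # 8 (data record #, Fortran record #) pairs per record

def password_set_record(set_no):
    """Fortran record holding password set set_no (0-49), five sets per record."""
    return PASSWORD_RECORDS[0] + set_no // 5

def item_detail_records(item_no):
    """The four consecutive Fortran records describing item item_no (0-9)."""
    first = ITEM_RECORDS[0] + 4 * item_no
    return list(range(first, first + 4))

def first_data_record():
    """User data begins immediately after the fixed-length header."""
    return HEADER_RECORDS + 1

print(password_set_record(7))     # 2
print(item_detail_records(3))     # [23, 24, 25, 26]
print(first_data_record())        # 71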

ACCESS PROCEDURE

To access the system, it is necessary to log into the PDP-10/50 host system. This requires use of a project-programmer pair and password. The user then copies to the disk the Generalized Information Retrieval System file and the particular data base file he wishes to use (unless the user is going to create a new file).

If the desired file is found (or a new one created), GIRS will request:

"PASSWORD?"

The user must then type in his password for the GIRS
system. Conventional print inhibit provisions for password protection are provided by the host system.
The password is matched against a list made up from
password detail cards.
There are as many cards as indicated on the password
count card. Each of them has the following format:
Col.
 1-2    blank
 3-7    password
 8      blank
 9-12   level 1 protection key
 13     blank
 14-17  level 2 protection code
 18     blank
 19     level 3 protection code
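A minimal parsing sketch for such a card follows (Python, for illustration only; the card image, the password value, and the dictionary keys are invented, and GIRS's own Fortran reads these columns directly).

# Reads one 80-character password detail card laid out in the columns above.

def parse_password_card(card):
    card = card.ljust(80)
    return {
        "password":    card[2:7].strip(),    # cols 3-7
        "level1_key":  int(card[8:12]),      # cols 9-12, sum of function key values
        "level2_code": int(card[13:17]),     # cols 14-17, sum of item protection codes
        "level3_code": int(card[18:19]),     # col 19
    }

card = "  SESAM 0013 0273 2"
print(parse_password_card(card))
# {'password': 'SESAM', 'level1_key': 13, 'level2_code': 273, 'level3_code': 2}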


Columns 20-80 are normally blank although columns
20-33 may be utilized in connection with an alternative
scheme for level 3 protection.
LEVEL 1 PROTECTION
The level 1 protection key is the sum of the key values
for those functions that the user whose password this
is can use.
CREATE      0
SEARCH      1
DISPLAY     2
END         4
TOTAL       8
INSERT     16
REMOVE     32
MODIFY     64
ACCESS    128
PROTECT   256

These functions are described as follows:
CREATE-is used to create a new data base. It is
available to any GIRS user. If a user enters the name
of a data base which is not found, GIRS will assume
the user wishes to create a new data base and CREATE
is the only command that will be accepted; conversely,
if a specified file is found, the system will disallow the
CREATE command, making it impossible accidentally
to destroy an existing file by its use.
SEARCH-is used to scan parts of the data base for
certain character strings. On completion of a search, a
summary of successes will be given in the form:
'SEARCH SUCCESSFUL IN RECORD NNNN ... '
Any item which the user is not privileged to see will be
ignored in the search summary.
DISPLAY-is used to display information from the
data base. Any volume of information can be displayed,
from a single element to a complete file. Large volumes
of output will ordinarily be run on a batch terminal.
All data displays are consistent with the level 2 and
level 3 protection codes. Items which the user is not
authorized to see are ignored.
END-is used to exit from the system.
TOTAL-is used to aggregate the contents of a defined
numeric element over a specified set of records.
The variables are the same as in the SEARCH and
DISPLAY commands. If the element defined is nonnumeric, an error response will be generated. A 'blank'
option aggregates data over the entire set of records.
Level 2 protection is observed although data aggregates
are obtained irrespective of level 3 protection codes
except where the set of records defined contains six or

less. In this case, a response is generated advising the
user to employ the DISPLAY command.
INSERT-is used to insert either entire items or entire
records.
REMOVE-makes it possible to remove an entire
record or an entire item (i.e., one item from each record)
from the file. The record number is then free for reuse.
When removing an item, the remaining items are renumbered. This renumbering is independent of the
level 3 protection code.
MODIFY-is used to change an item in a specified
record. The user must specify the item name, record
number, and new contents.
ACCESS-is used to change the list of passwords and
associated protection keys for the file. Its use will
normally be restricted to one or two persons only. It
provides two options: to add a new password and its
protection keys, or to delete a password. To modify the
keys for an existing password it is necessary first to
delete the password, and then reinsert it with new keys.
PROTECT-is used to change level 2 and 3 protection
in the file; its use should be restricted to one or two
users. The command has an ITEM option used to
change a level 2 protection, and a RECORD option
used to change a level 3 protection. If the ITEM
option is omitted, protection is changed for all items in
the RECORD named.
As an example of the use of level 1 (function) protection: a user privileged only to search a file for statistically
aggregated information, that is, not permitted to see
individually identified information, would have the
level 1 code
CREATE        000000000
SEARCH        000000001
END           000000100
TOTAL         000001000

LEVEL 1 CODE  000001101 = 13
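A sketch of the resulting permission test appears below (Python for illustration; the dictionary and function are ours, but the key values are those tabulated above, and CREATE's value of 0 means the command is open to any user).

FUNCTION_KEYS = {
    "CREATE": 0, "SEARCH": 1, "DISPLAY": 2, "END": 4, "TOTAL": 8,
    "INSERT": 16, "REMOVE": 32, "MODIFY": 64, "ACCESS": 128, "PROTECT": 256,
}

def function_permitted(level1_key, command):
    """A key value of 0 (CREATE) is open to every user; otherwise the
    corresponding bit must be present in the user's level 1 key."""
    value = FUNCTION_KEYS[command]
    return value == 0 or (level1_key & value) != 0

# The statistical user of the example, whose key is 13 = SEARCH + END + TOTAL:
for cmd in ("CREATE", "SEARCH", "TOTAL", "DISPLAY", "MODIFY"):
    print(cmd, function_permitted(13, cmd))
# CREATE, SEARCH and TOTAL are permitted; DISPLAY and MODIFY are not.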

LEVEL 2 PROTECTION
The level 2 protection key is the sum of the level 2
protection codes for those items which the user can
access, as given on the item detail cards.
As an example of the use of level 2 (item) protection:
a user who is privileged only to work with items 1, 5,
and 9 of each record would have the level 2 code:
ITEM 1        0000000001
ITEM 5        0000010000
ITEM 9        0100000000

LEVEL 2 CODE  0100010001 = 273
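The corresponding item filter might look like the sketch below (illustrative Python; the item names are those of the personnel record of Figure 2(b), and the protection codes other than VITA's 128 are assumed for the example).

def visible_items(level2_key, item_codes):
    """item_codes maps item name -> level 2 protection code (a power of 2,
    or 0 for an item open to all users)."""
    return [name for name, code in item_codes.items()
            if code == 0 or (level2_key & code) != 0]

item_codes = {"NAME": 0, "ADDRESS": 1, "NUMBER": 2,
              "VITA": 128, "MEDICAL": 16, "HISTORY": 256}
print(visible_items(273, item_codes))
# ['NAME', 'ADDRESS', 'MEDICAL', 'HISTORY']   (key 273 = 1 + 16 + 256)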


Item detail cards comprise a set of one or more cards
for each item in the record. The first card of each set
contains the following data:
Col.
 1-10   item name
 11     blank
 12-13  # lines in the item (1-10)
 14     blank
 15-18  level 2 protection code, which should be either a power of 2, or 0 for an item which all users can access
 19     blank
 20-80  element names (1-10 characters each) separated by backslashes and terminated with a dollar sign ($) [maximum of 24 element names]

For example, the item detail card for the fourth item of
example (b) in Figure 2 would be:
VITA       02 0128 BIRTH DATE\BIRTHPLACE\NEXT OF KIN\RELATION\ADDRESS $
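The sketch below (again illustrative Python, not the system's Fortran) decodes this example card; the element names are separated by backslashes and the list is terminated by the dollar sign.

def parse_item_card(card):
    card = card.ljust(80)
    element_field = card[19:].split("$", 1)[0]          # cols 20-80, up to the $
    return {
        "item_name":   card[0:10].strip(),              # cols 1-10
        "lines":       int(card[11:13]),                # cols 12-13
        "level2_code": int(card[14:18]),                # cols 15-18
        "elements":    [e.strip() for e in element_field.split("\\") if e.strip()],
    }

card = "VITA       02 0128 BIRTH DATE\\BIRTHPLACE\\NEXT OF KIN\\RELATION\\ADDRESS $"
print(parse_item_card(card))
# {'item_name': 'VITA', 'lines': 2, 'level2_code': 128,
#  'elements': ['BIRTH DATE', 'BIRTHPLACE', 'NEXT OF KIN', 'RELATION', 'ADDRESS']}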

Figure 5-Mapping of a security plan involving three major users with their subusers, and provision for information interchange among major users

LEVEL 3 PROTECTION

The level 3 protection code is set to correspond to the security clearance of the user. He can see items of records whose protection code is less than or equal to the level 3 protection code stored with his password.
Record detail cards contain the actual data in the
following format:
Col.
 1-5    record number
 6      item number (0-9)
 7      line number within item (0-9)
 8      level 3 protection code (must be the same for all lines in one item); the user can access the item if his level 3 protection key is ≥ this value
 9-80   item contents

These cards must be sorted in ascending order on
columns 1 to 7, when creating a new data base.
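As an illustration of how such a card and the level 3 rule fit together, a Python sketch follows; the sample card contents are invented, and the field names are ours, not GIRS's.

def parse_record_card(card):
    card = card.ljust(80)
    return {
        "record_no":   int(card[0:5]),     # cols 1-5
        "item_no":     int(card[5]),       # col 6
        "line_no":     int(card[6]),       # col 7
        "level3_code": int(card[7]),       # col 8
        "contents":    card[8:].rstrip(),  # cols 9-80
    }

def line_visible(user_level3_key, fields):
    """The user may see the line only if his key is >= the code in column 8."""
    return user_level3_key >= fields["level3_code"]

fields = parse_record_card("00027432ANNUAL SALARY 14500")
print(fields["record_no"], fields["item_no"], fields["line_no"])   # 27 4 3
print(line_visible(2, fields), line_visible(1, fields))            # True False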
OPTIONAL SYSTEM TO LOCK RECORDS

An additional system for level 3 protection permits locking all items of selected records against certain users. This system is based on the principles of modular algebra and permits mapping a need-to-know security plan into the access system.

Each user is given a pair of divisors (bases) and a pair of remainders (moduli). Every record number the user desires to access is tested for both congruences before the user is privileged to see it.

In the security plan illustrated in Figure 5, there are three major users (A, B, C) which might stand for the accounting, personnel, and production departments of a firm. The records which are the exclusive property of each major user are partitioned into exclusive subsets, which might correspond to: personnel administration, wage and salary administration, training, etc., for the personnel department. In addition, there are intersection or shared-channel sets to facilitate exchange of information between users AB, AC, BC, and all major users.

Columns 20-33 of the password cards can be utilized to store access codes (divisors and remainders) required to access particular sets of records:

Col.
 20     blank
 21-23  divisor #1
 24     blank
 25-26  remainder #1
 27     blank
 28-30  divisor #2
 31     blank
 32-33  remainder #2
 34-80  blank
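A sketch of the congruence test is given below (illustrative Python, not the paper's implementation); a major user, who holds only one divisor-remainder pair, can be represented by supplying a trivial second pair.

def record_accessible(record_no, d1, r1, d2, r2):
    """True when record_no leaves remainder r1 on division by d1
    and remainder r2 on division by d2."""
    return record_no % d1 == r1 and record_no % d2 == r2

# Subuser A-1 of Table I holds divisor/remainder pairs (2, 0) and (13, 0),
# so only record numbers divisible by both 2 and 13 pass the test.
for rec in (26, 52, 27, 40):
    print(rec, record_accessible(rec, 2, 0, 13, 0))
# 26 True / 52 True / 27 False / 40 False

# Major user A, with the single pair (2, 0), handled via a trivial second pair:
print(record_accessible(40, 2, 0, 1, 0))    # True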

576

Fall Joint Computer Conference, 1971

TABLE I-Partitions of 10,000 Record Numbers

   Level 3 Key        Records      Record Numbers
 D1   R1   D2   R2    Accessible   Assignable      Utilization
  2    0                 5000          2667        Major user A
  3    0                 3333          1334        Major user B
  5    0                 2000           667        Major user C
  6    0                 1667          1333        Shared channel, users A & B
 10    0                 1000           667        Shared channel, users A & C
 15    0                  667           333        Shared channel, users B & C
 30    0                  333           333        Shared channel, users A, B & C
  2    0   13    0         205           205        Subuser A-1
  2    0   13   12         205           205        Subuser A-13
  3    0   11    0         121           121        Subuser B-1
  3    0   11   10         121           121        Subuser B-11
  5    0    7    0          95            95        Subuser C-1
  5    0    7    6          95            95        Subuser C-7

Table I illustrates the results of partitioning 10,000
record numbers. A record number is selected from a
computer-produced list of the members of each set and
subset so that the number can be accessed by the major
user, subuser, or combination of users who have the
specified need to know. Of course, additional passwords
can be issued to give various subusers access to partition
sets, which can function as common channels for
information interchange among them. Note that major
user A can access 5,000 records. Out of this set, 2,667
records belong to A exclusively; they cannot be seen by
any other major user. User A, however, can allocate
these records among his subusers (designated by the alpha-numerics A1 to A13); they are parcelled out at the
the rate of 205 records per subuser. These records can be
seen only by User A and the subuser designated. (Of
course, A can hold back a few sets of records for his
exclusive use--becoming, actually, his own subuser.)
User A also has the use of 1,333 records, which he shares with User B; only Users A and B can see these
records.
User A can also use records taken from the set of 667
records that he shares with C; only Users A and C can
see these records.
Finally, User A has the use of records taken from a

common set of 333 records; these records can be seen by all three major users: A, B, and C.
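The counts quoted above can be checked directly; the short script below (added for illustration, not part of the paper) reproduces them for 10,000 record numbers under the divisors of Table I.

# Divisors 2, 3, 5 belong to major users A, B, C; 13 partitions A's exclusive set.

N = 10_000
records = range(1, N + 1)

accessible_A = sum(1 for r in records if r % 2 == 0)                        # 5000
exclusive_A  = sum(1 for r in records if r % 2 == 0 and r % 3 and r % 5)    # 2667
shared_AB    = sum(1 for r in records if r % 6 == 0 and r % 5)              # 1333
shared_AC    = sum(1 for r in records if r % 10 == 0 and r % 3)             # 667
shared_ABC   = sum(1 for r in records if r % 30 == 0)                       # 333
subuser_A1   = sum(1 for r in records if r % 26 == 0 and r % 3 and r % 5)   # 205 (A-exclusive records divisible by 13)

print(accessible_A, exclusive_A, shared_AB, shared_AC, shared_ABC, subuser_A1)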
PROVISION FOR ENCIPHERMENT
There are two points at which this system remains
vulnerable to unauthorized entry: a user possessing the
general project-programmer pair and password required
by the PDP-10 software to access the GIRS system can
make use of the peripheral interface program to assign
the entire GIRS file to some display device; and a wiretapper can intercept the transmission of confidential
file information to a legitimate user at a remote terminal.
Use of an on-line crypto system can protect the files
at these points. The item contents from the record detail cards will be stored in enciphered form for items whose sensitivity requires such precaution. Decipherment can be accomplished at programmable remote
terminals; for such items, only enciphered contents will
be transmitted over telecommunications lines or be
accessible by the peripheral interface program.
A suitable deciphering scheme has been described
(1). Essentially it consists of adding modulo 2 to the
cipher text stream the bits of the key string used to
encipher it. The key string is regenerated and correctly
synchronized by using an arithmetic congruential
pseudorandom-number generator whose seed string (i.e., ring contents) is produced by a second generator
whose operation is specified by a password-which may
be the same as used to gain access, or be totally
different. In the CRYPTO mode, the output received
in answer to a DISPLAY command would be stored on
the disk of the programmable terminal-in this case,
a PDP-8/I.
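The sketch below gives the general shape of such a scheme in Python. It is purely illustrative and is not the transformation of reference 1: the generator constants and the password-to-seed step are arbitrary stand-ins, and a production system would use the cited method.

def keystream(seed, length):
    """Key bytes from a simple linear congruential generator (constants arbitrary)."""
    state = seed
    out = bytearray()
    for _ in range(length):
        state = (69069 * state + 1) % 2**32
        out.append(state >> 24)            # high-order byte of each state
    return bytes(out)

def crypt(data, password):
    """Add the keystream modulo 2 (exclusive-or). Enciphering and deciphering
    are the same operation, so applying crypt twice restores the text."""
    seed = sum(password.encode()) or 1     # toy mapping from password to seed
    ks = keystream(seed, len(data))
    return bytes(b ^ k for b, k in zip(data, ks))

cipher = crypt(b"SALARY 14500", "SESAME")
print(cipher)                    # enciphered bytes
print(crypt(cipher, "SESAME"))   # b'SALARY 14500'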
CONCLUSIONS
This generalized information retrieval system provides
a test bed for continuing experimentation with security
provisions for multiple-access computer communications systems. By such experimentation, it is anticipated that the optimal trade-offs between security and
economy can be determined for a wide range of information retrieval applications.
REFERENCES
1 J M CARROLL P M McLELLAND
Fast "infinite-key" privacy transformation for resource
sharing systems
AFIPS Conference Proceedings FJCC Vol 37 pp 223-230
1970


2 D F BOOTH
File security for a shared file, remote-terminal system
Conf. on Computers: Privacy and Freedom of
Information Queens University Kingston Ontario
May 21-24 1970
3 T D FRIEDMAN
The authorization problem in shared files
IBM Systems Journal Vol 9 No 4 pp 258-280 1970


4 L J HOFFMAN
Computers and privacy: a survey
Computing Surveys Vol 1 No 2 pp 85-103 1969
5 W J STUBGEN M A SHEPHERD
A real-time information editing and retrieval system
Department of Computer Science University of Western
Ontario London Ontario May 1970

Insuring confidentiality of individual
records in data storage and retrieval
for statistical purposes
by MORRIS H. HANSEN
Westat Research, Inc.
Rockville, Maryland

Much has been written about the question of privacy
and the need for the protection of confidentiality of
individual records in data storage and retrieval systems.
The ability to insure confidentiality is a prime tool in
the protection of privacy. The goal of this paper is to
summarize from the point of view of a statistician some
of the aspects and principles of confidentiality and
some of the implications of these principles for computer-based storage and retrieval systems for statistical
purposes. The remarks will have special relevance to
open retrieval systems, that is, retrieval systems in
which customers for information retrieval are the
general public, or perhaps specified agencies or groups
or individuals, and these customers can retrieve any
desired statistics from the confidential records in the
files subject to a review to insure that the output conforms to prescribed rules designed to avoid disclosure
of individual information. These rules may be concerned with the minimum number of cases on which an
individual statistics or frequency count is based or with
other aspects, as is discussed later. The access to the
data may be restricted to certain authorized types of
data through control passwords or keys.

MEANING OF CONFIDENTIALITY

What is meant by confidentiality needs clarification. An obvious meaning is that the individual records, with the names or other identifying information included, will not be made available to other than authorized persons. But beyond this the definition of what is adequate protection of confidentiality needs further clarification.

The Census Bureau has a well-established and well-earned record for preservation of confidentiality of its records. Much of this paper will draw on the Census Bureau experience as an illustration. With the great concern of the Congress and others over the potential for invasion of privacy in statistical information systems, and especially the proposed and much-discussed federal statistical data center, it is useful to examine how the Census Bureau has come to be widely accepted as a model in the confidentiality protection given to its records. It will be seen that the experience points to serious and as yet unresolved problems, and that the problems are especially difficult for a storage and retrieval system such as a federal data center with access to statistical summaries by persons not authorized to see the individual confidential records.

The Census law (Title 13, U.S.C., Sec. 9-a-2) provides that there shall not be ". . . any publication [or otherwise make information available] whereby the data furnished by any particular establishment or individual under this title can be identified."

Various interpretations can be made of this language. One is that no inference can be made about the results reported by any individual. This is not a tolerable interpretation. At the other extreme, the law cannot reasonably be interpreted to mean that there is no violation of confidentiality provided the name or address (or other specific identification such as Social Security number) is not associated with the information and made available.

A reasonable as distinguished from a rigorous or literal interpretation of the language of the law is required if any statistics are to be published.*

* Here and elsewhere in this paper the term "publication" refers to any means of making information available to persons who do not have authorized access to the confidential records and who are not subject to penalties for disclosure.

For example, the publication of an aggregate of retail sales for hardware stores in a county reveals that no individual hardware store had sales of a greater amount
than this aggregate, and this much is revealed about
each individual hardware store. Similarly, sometimes
the existence (or nonexistence) of an item in each report
can be inferred from the publication of statistical
aggregates. Thus, the fact that in an age distribution
for a specified area from a population census no person
is reported as over 75 years of age reveals for each
individual person that his age was reported as 75 years or under.** In publishing statistics for large areas such
considerations may be of little consequence. But in
publishing statistics for smaller and smaller areas the
problem increases, and the primary role of the decennial
census is to produce small area statistics. Statistics are especially needed, and produced, from the decennial
Censuses of Population and Housing for counties,
cities, towns, census tracts, and even city blocks within
cities or other communities. The storage and retrieval
of geographically detailed statistical information may
also be a primary goal of other information systems
based on a set of administrative records or integrated
from administrative systems and perhaps also from
statistical surveys.
Years of experience and precedent in publishing
statistics by the Bureau of the Census without serious
problems suggest the acceptability of the rules and
principles that have been followed to avoid unreasonable
disclosure of data for individuals in statistical aggregates. However, the computer adds new capabilities as
its capacities and applications increase, and these may
call for reexamination and some new rules and principles. It is desirable to recognize, in applying
past principles and in developing any new ones, and as
has been illustrated in the above discussion, that if any
statistics are to be published nondisclosure cannot be
absolute. Rules for nondisclosure are necessarily based
on an interpretation of what is reasonable, and supported by precedents and past experience.

SOME PRINCIPLES AND QUESTIONS FOR
GUIDING NONDISCLOSURE IN
PROTECTING CONFIDENTIALITY
Some relevant principles or questions concerning
rules for protecting confidentiality of individual records
will be presented. Clear and unequivocal answers may
not exist. Nevertheless, reasonable decisions have been
made and must be made, in order to publish census and
other statistical results.

** Additional illustrations are presented in

1. What constitutes protection against exact or approximate disclosure?

Protection against exact or approximate disclosure of
specific items of information in a record must be provided. However, "approximate" disclosure must be
interpreted or defined. Issues concerning the approximate disclosure of magnitudes, as distinguished from
frequency counts, involve some special considerations.
Illustration: In some studies the Census Bureau has
interpreted the disclosure of a magnitude, X, as not to
be an approximate disclosure when the range of interpretation is of the order of (.75 to 1.5) X. Frequencies
in a distribution may automatically meet this condition
if intervals are broad enough for the upper limit of the
interval to be at least double the lower limit, as in the
following illustration:
Number of employees
Less than 5
5-9
10-24
25-49
50-99
100-199, etc.
Under this rule even an individual case may be reported
in such an interval without making an approximate
disclosure. Of course the individual is not identified, but
frequencies as low as 0, 1, 2 or 3 are shown in such
intervals, as in employment size classes for retail stores,
by type, within a county, for example, and a person
with commonly available local knowledge may be able to identify a particular store indicated by a frequency of 1, and thereby learn its reported employment within the range of the class interval.
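A compact way to state the interval condition quoted above is sketched below. The sketch assumes, for illustration, that each class runs from its lower limit up to the lower limit of the next class (so that "5-9" is treated as the interval from 5 to 10); it is not part of the Bureau's rules.

# Sketch of the interval rule: an interval is broad enough, in the sense used
# in the text, when its upper boundary is at least double its lower boundary,
# matching the factor-of-two spread of the (.75 to 1.5)X range of interpretation.
def interval_is_broad_enough(lower, upper):
    return upper >= 2 * lower

# Class boundaries for the employee-count illustration in the text.
classes = [(0, 5), (5, 10), (10, 25), (25, 50), (50, 100), (100, 200)]
for lower, upper in classes:
    print(lower, upper, interval_is_broad_enough(lower, upper))   # True for every class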
Effect of sensitivity of the information

Should disclosure rules take some account of the
sensitivity of the information, and be more restricted
with highly sensitive information than with less sensitive information? Some information loses sensitivity
with time; some may not, or the sensitivity may
increase with time. Some information is essentially in
the public domain. These factors should, and in fact do, have some impact on the confidentiality treatment,
but still without completely specified formal rules. * For

* Alan F. Westin has expressed a need for developing a classification system for personal information to identify types that need
various degrees of control. See, for example, Privacy and Freedom
(Atheneum, 1967).2


example, is there any point in regarding the size of a
family or a household (which often is known to everyone
in the neighborhood) as equally confidential as the
income of the head of the household? Similarly, should
the industry code derived from the types of production
reported by a manufacturing company be protected as
confidential, when often the company spends much
money to let the public know of the types of products it
makes or the services in which it is engaged? Should
the number of employees reported for a plant be
protected as equally confidential as the reported sales?
Such questions may have more difficult implications
than is readily apparent. Thus, in some instances the
number of persons in a household may indicate illegal
occupancy to a landlord or to housing code authorities.
Again, the industry in which a company is classified may
affect the rate of taxation for unemployment compensation. If a company is classified in a high-risk
industry instead of a lower-risk one, and if the industry
code derived from a confidential statistical report of a
company is made public, will it influence the company's
tax rate?

Disclosures with supplemental knowledge or collusion

Is it necessary to provide protection against disclosures that can be achieved by collusion, or by
supplemental knowledge in addition to one's knowledge
of his own affairs? A common rule in avoiding disclosure is that there must be at least three nontrivial
cases aggregated in a cell (based on aggregates of
magnitudes) so that, for example, a business respondent
will not know his competitor's response. Presumably it
is not feasible, and the Census Bureau accepts the
principle that it is not feasible, to protect against
disclosure by collusion. Otherwise, again, nothing
could be published. However, the issue of possible disclosure through taking advantage of supplemental
knowledge needs further attention, especially in view
of the computer capabilities. There is an important
difference between analysis to achieve disclosures,
with and without the computer. Consider, as an illustration, a cross-tabulation made in great detail.
Assume 10,000 persons in a file for an area, and
information for each person on 50 characteristics
(something like the results of the questions in a 1970
Population and Housing Census sample questionnaire).
Suppose that the record includes some characteristics
with two alternative responses, as for sex. Others may
have three, five, ten, or twenty alternative classifications
(as with ten intervals for an age tabulation). A question

581

such as occupation may be recorded and tabulated in
100 or many more classes.
If we assume 10 of the questions have 2 alternatives,
10 of the questions have 3 alternatives,
10 of the questions have 5 alternatives,
10 of the questions have 10 alternatives,
10 of the questions have 20 alternatives,

and if we conceive of a cross-tabulation in the fullest
possible detail of these 50 questions the number of
possible cells becomes 2^10 × 3^10 × 5^10 × 10^10 × 20^10, or roughly 10^38 cells, which is an astronomical number. It is likely that in
such a detailed cross-tabulation each person would be
unique, with each cell showing a frequency of zero or 1.
A cross-tabulation of only five of these questions
(one from each of the indicated numbers of alternatives) would yield a tabulation with about 6,000 cells,
so that a population of 10,000 would have an average
of 1.7 per cell in such a tabulation. Of course, many
cells may be impossible or blank, and some cells might
have several cases. Nevertheless, tabulations in such
detail may make it feasible for a person or organization
(such as a welfare or taxing agency or a credit bureau)
with certain of the same information on some of the
people to identify many of them in the tabulation and
ascertain other information for them. With a computer
the comparison and identification become far more
feasible. Consequently, consideration must be given to
the amount of detail in which tabulations will be made
available in order to preserve confidentiality. Or should
and can any possible violations of confidentiality be
ignored that can be achieved only through the use of
extensive supplemental information? In the computer
age this seems unreasonable.
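The arithmetic behind these figures can be checked directly; the lines below (illustrative only) reproduce the cell counts and the average occupancy used in the example above.

# Cell counts for the cross-tabulation example in the text.
import math

alternatives = [2] * 10 + [3] * 10 + [5] * 10 + [10] * 10 + [20] * 10   # 50 questions
full_cells = 1
for k in alternatives:
    full_cells *= k
print(round(math.log10(full_cells)))            # 38: about 10^38 cells in the full tabulation

five_question_cells = 2 * 3 * 5 * 10 * 20       # one question from each group of alternatives
print(five_question_cells)                      # 6000 cells
print(round(10_000 / five_question_cells, 1))   # about 1.7 persons per cell on average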
Some interesting discussion and examples of principles and procedures for using collateral information
to extract information for individual records from a
statistical data bank with retrieval allowed only for
statistical aggregates, and by obtaining legitimate
responses to queries, are given in an article by Hoffman
and Miller. 3
The presence of errors, or differences in time reference
or in the treatment of individual items of information
in two sets of records, is common. Such errors or
differences would make more difficult the problem of
using collateral information to extract individual
information from statistics derived from a set of
confidential records used for statistical purposes.
However, with sufficiently extensive and detailed
independent information available to use in identification, and even in the presence of such errors or differences, the probability of correctly identifying a person and picking up the desired confidential information
increases as the number of cells in a cross-tabulation is
increased, or with appropriately designed queries of
increasing detail.


Indirect disclosures

Indirect as well as direct disclosures must be considered, and these can be a major source of difficulty.
Thus, suppose a small county has six hardware stores,
and that a city within the county has four of them. If
retail sales are published for the county, and also for the
city (we assume each would individually meet disclosure requirements) an indirect disclosure occurs.
Each of the two stores in the balance of the county
could directly determine his competitor's sales by taking
the difference between the county statistics and the
city statistics. Thus, if disclosure is to be avoided the
data for the city can be made available, and not the
county, or for the county and not the city. Indirect
disclosures should be avoided, at least in any sensitive
type of information.
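The subtraction involved in such an indirect disclosure is trivial to carry out; the figures in the sketch below are invented purely to make the arithmetic concrete.

# Worked arithmetic for the county/city example above (all figures invented).
county_total = 910_000   # published retail sales for all six hardware stores in the county
city_total = 640_000     # published retail sales for the four stores in the city

balance = county_total - city_total   # combined sales of the two remaining stores
own_sales = 120_000                   # one of those two stores knows its own figure
competitor_sales = balance - own_sales
print(competitor_sales)               # 150000: the competitor's exact sales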
Priorities needed in statistics subject to indirect
disclosure restraints

The consequences of indirect disclosures are that
priorities are necessary in determining which statistics
will be made available and which will not, in order to
avoid making available some relatively unimportant
information and thereby subsequently denying statistics
that have highly important uses. The providing of
information forecloses making information available
for an alternative, as illustrated above. As another and
more serious illustration, it is often true that in the
Manufactures or Business Censuses information can be
shown for a state total, or for a metropolitan area total,
but not for both, and many similar situations arise.
Exactly the same kinds of problems can arise in the
publications of Population and Housing Census data,
especially for small areas where the frequencies get
small. In these Censuses, however, some of the data
may be less sensitive, and disclosure analysis may not
need to be pressed as rigorously. For sensitive data,
however, the question becomes: how should one determine the priorities? Obviously, it is public interest and
utility that should be determining, but this problem
poses many questions beyond the scope of this discussion. Of particular importance, however, is the
consequence that the priority problem means that the
first comer, who may have a limited use or need in

Random modification of data to avoid
approximate disclosure

There has been some consideration of random
modification of data within the range of, for example, a
factor of .5 to 1.5, with the choice of factor within the
range made at random, as a means of avoiding approximate disclosure. With this approach an actual report
of 850 employees in an establishment might be modified
to become 595 = 850 (.7) where the .7 was chosen at
random from the interval .5 to 1.5. The average effect
of such modifications on simple aggregates or averages
would be relatively small (over a large experience) and
numbers so modified in reports can be subjected to less
rigorous disclosure rules or even no disclosure analysis.
In the case of attributes the approach must be modified
to change some fraction of ones to zeros and of zeros to
ones, where changes are made at random in ways that
do not unduly violate internal consistency of the data
for the individual record.
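A minimal sketch of the random multiplicative modification just described follows; it is illustrative only, with the factor range and the 850-to-595 example taken from the text.

# Random multiplicative modification of a reported magnitude.
import random

def modify(value, rng, low=0.5, high=1.5):
    # Multiply the reported value by a factor chosen at random in (low, high).
    return round(value * rng.uniform(low, high))

rng = random.Random(0)
reported = 850                   # actual employment reported by an establishment
print(modify(reported, rng))     # published value; a factor of .7, for example, would give 595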
The impact may be more serious with cross-tabulations where the independent variables, those used in sorting into various classes or cells, have been so modified. In this latter case a bias is introduced that may or may not be serious in its magnitude. Such a bias
is not necessarily reduced simply by increasing the
number of cases within a class.
The random modification of data to avoid approximate disclosure has been considered extensively for
various applications in the Bureau of the Census over
the past decade or more, but has actually been applied
to a very limited extent, so far as I am aware. It has
developed and been discussed independently, and again,
with limited applications, as a means of preserving
confidentiality in retrieval or publication of information.4,5 This approach deserves more exploration. It may be that an announced program of random modification
of a relatively small fraction of the records selected at
random can accomplish much in avoiding disclosure for
all of the records in the set.
Disclosure with statistical information from samples

If information in some statistics is based on a sample
of a population, the chance of disclosure is reduced, and the thinner the sample, the less the chance of disclosure in statistical tabulations of a given amount of detail.
For a small enough sampling fraction, even if disclosure rules are not fully observed, the chance of
pay-off may be small enough to make prohibitive (as a
practical matter) the cost of taking advantage of the
potentials for disclosure.
In recognition of this principle the Census Bureau
decided to put in the public domain the statistical data
recorded for each household from the 1960 Census for
a 1 in 1000 sample of households after deleting certain
information from the records that would facilitate
identification. Of course the name and address were
deleted. In addition, geographic identification was
deleted below the level of broad city-size class within
geographic divisions of the country (there are nine
geographic divisions, each consisting of several states).
In addition, some extreme cases were modified for
sensitive types of information so that, for example, the
upper boundary of income reported may have been
reduced. Beyond this, the full household information
was included in a magnetic tape file or a set of punched
cards for the 1/1000 sample, including the housing
information and the full listing of individual household
members, with the information reported for each
individual member. The purpose was to make it feasible
for various users to make their own summaries or crosstabulations or correlations to meet a wide range of
needs. It was a great success, with a large number of
users of the tape putting it to many uses that could not
be served directly by the Census tabulations.
From the point of view of confidentiality, anyone who
has a supplemental source of more limited information
but that duplicates a number of the items of information
in the Census file for individuals or families or households for some part of the population could use that
information to identify many of the individual cases in
this sample that were also in his file. Of course he could
expect to find less than 1 in 1000 of the cases in his file,
but for those found he would then have identified the
additional information in the 1/1000 file.
Suppose, for example, that a credit bureau had
records for a "chunk" of the population in a metropolitan area, including, perhaps, information on age for
the head of the family, the number of persons in the
family (not necessarily the same as in the household),
occupation of the head of the household, whether the
home was owned or rented, and the value of the home
or the amount of the rent paid. With such information,
and even with errors and with differences in time
references in both sets, he might run his tape against
the Census 1/1000 sample tape, perhaps for a larger area or areas, and identify with a fairly good chance of
success (but with much less than certainty) the cases
in the 1/1000 file that were also in his records. He
would thereby acquire the additional Census information for the identified cases (including misinformation
for cases that were misidentified). But it would cost
him a considerable amount both in efforts and dollars,
and at the very best he could expect to find a pay-off
of less than 1/1000 in the sense of obtaining Census
information for the cases in his file. The possibility of
misuse arises only in the case of someone with a file of
supplemental information that is sufficiently relevant
for some subgroup of the population. Even then the
pay-off presumably would be very small because of the
presence of errors and time reference differences in each
source, and the great effort in relation to the number of
successful matches (and of course he would not know
which of his linked records were the unsuccessful
matches). The pay-off might be small not only because
of the very small fraction of "finds," but also because
the information in the Census records, in general, is
not all that sensitive.
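The kind of matching contemplated here amounts to a join on the shared quasi-identifying items; everything in the sketch below (field names, records, values) is invented for illustration.

# Illustrative matching of an outside file against a public-use sample
# on shared quasi-identifying fields.
KEYS = ("age_of_head", "family_size", "occupation", "tenure")

def link(outside_file, sample_file):
    # Index the public-use sample by the shared items, then look up each outside record.
    index = {}
    for rec in sample_file:
        index.setdefault(tuple(rec[k] for k in KEYS), []).append(rec)
    return [(rec, hit)
            for rec in outside_file
            for hit in index.get(tuple(rec[k] for k in KEYS), [])]

credit_file = [{"age_of_head": 44, "family_size": 3, "occupation": "clerk",
                "tenure": "owner", "name": "known to the outside file"}]
census_sample = [{"age_of_head": 44, "family_size": 3, "occupation": "clerk",
                  "tenure": "owner", "income": "item not in the outside file"}]
print(link(credit_file, census_sample))   # one candidate match, carrying the extra item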
Presumably because of such factors no evidence has
come to light of any such misuse. At the same time the
1/1000 sample has served many highly useful purposes,
so much so that the Census Bureau is proposing to
extend the program along the same lines for 1970, and
to increase the size of sample from 1/1000 to 1/100.

Disclosure of disclosure rules

There is some thought that rules for disclosure should
not be disclosed, and that the availability of the rules
will increase the ability of one who wishes to arrive at
desired disclosures through analysis of the information
that is made available. On this principle, apparently,
the Bureau of the Census has not published its various
disclosure rules in full, although some of the rules
are more or less obvious, and have been made available.

SOME IMPLICATIONS FOR AN OPEN
RETRIEVAL SYSTEM
There is need to bring the issues of confidentiality as
related to storage and retrieval of information into
fuller discussion. The implications of some of the points
and principles that have been made above may not be
obvious, and study and exploration are needed.


There is no basis for simply assuming that an all-powerful software system can be designed that will take
care of the problems of preserving confidentiality in a
national statistical data center if one were to be created.
Obviously, such a software system cannot be designed
until the principles and specific rules of what constitute
disclosure and nondisclosure are agreed upon. Unless
the principle of reasonable disclosure, instead of no
disclosure, is adopted, it appears that little or no
information could be made available. If the principle of
reasonable disclosure is adopted, it will be necessary to
define what constitutes reasonable disclosure.
It also must be determined how far the disclosure
system will protect against the potentials for disclosure
that are made possible by the use of extensive supplemental information acquired through other sources.
The availability of such supplemental information can
make it feasible to extract increasing amounts of
confidential information by making increasingly detailed
tabulations or queries, as illustrated earlier, as well as
from records such as the 1/1000 sample. Unless the
system makes no attempt to protect the disclosure of
additional information from sources that have extensive
and detailed supplemental information, the disclosure
rules may have to be so designed that few of the kinds of anticipated uses of, say, a national statistical data center could be served.
A particularly difficult problem is that of indirect
disclosure, through comparisons or analyses of successive tabulations or results of queries. With disclosure
analysis that takes account of indirect disclosures
many requests might have to be drastically curtailed
after a few initial uses. If there were no auditing for
indirect disclosure anyone could specify changes in the
classifications or specifications for a sequence of
tabulations in such a way as to reveal, after analysis,
the desired characteristics of many or all of the individual records. Some computer programs have been
prepared for dealing with indirect disclosure analysis,
and are in use in the Bureau of the Census, but the
complexities in a system of open access (subject to
restraints on disclosures) seem enormously challenging.
A system of recording who has retrieved information,
what kinds and how much, for post-audit on a judgmental basis may offer a sufficient protection, especially
if a rule of reason is used.
But suppose the problem of indirect disclosure is
solved (and in theory, at least, it appears that it can be
solved). The problem of priorities still remains. Must
all high-priority statistics be listed in advance? Is this
feasible? If not, minor or trivial uses of the data may
override the subsequent possibility of acquiring information the need for which was not originally foreseen.

One unimportant use may foreclose any possibility of
providing information on an urgent and unforeseen
current problem.
The issue of priorities is not a new one, as we have
seen. It exists in a system in which there is no general
access to the stored records. It appears that the problem
may be greater in a system that allows access without
going into a judgment filter of evaluating public interest
and need, or potentials for foreclosing future uses, as is
now done in the Bureau of the Census activities. This
problem may be a sufficiently serious one to foreclose
effective development of anything like a federal
statistical data center or data bank that retains confidential records in storage, and permits access by the
public or specified groups to statistical tabulations that
are audited for disclosure by computer software. The
priority problem remains even if other problems prove
manageable and can be brought under control.
There is need for fuller discussion of some of these
issues by scientific and professional groups. It is not
sufficient for these discussions to be conducted separately and in isolation. There is need for interchange
using some organized approaches arranged to discuss
the issues and problems.

REFERENCES
1 P HIRSCH
The world's biggest data bank
Datamation May 1970 pp 66-73
2 A F WESTIN
Privacy and freedom
Atheneum New York 1967
3 L J HOFFMAN W F MILLER
Getting a personal dossier from a statistical data bank
Datamation May 1970 pp 74-75
4 R F BORUCH
Educational research and the confidentiality of data
ACE Research Reports Vol 4 No 4 1969
5 R F BORUCH
Maintaining confidentiality of data in educational research:
a systemic analysis
American Psychologist Vol 26 No 5 May 1971 pp 413-430
6 I P FELLEGI A B SUNTER
A theory for record linkage
Journal of the American Statistical Association Vol 64
No 328 1969 pp 1183-1210
7 I P FELLEGI
On the question of statistical confidentiality (unpublished)
Revision of a paper given at the 1970 annual meetings
of the American Statistical Association
8 The computer and invasion of privacy
Hearings before a Subcommittee of the Committee on
Government Operations House of Representatives
89th Congress Second Session July 26-28 1966


9 C KAYSEN chairman
Report of the task force on the storage of and access to
government statistics
Executive Office of the President Bureau of the Budget
October 1966
10 Privacy and the national data bank concept
35th Report by the Committee on Government
Operations 90th Congress 2nd Session House Report
No 1842 August 2 1968


11 E V COMBER
Management of confidential information
AFIPS Conference Proceedings Vol 35 1969 Fall Joint
Computer Conference
12 M H HANSEN
Some aspects of confidentiality in information systems
Papers from the Eighth Annual Conference of the Urban
Regional Information Systems Association Louisville
Kentucky September 1970

The formulary model for flexible
privacy and access controls*
by LANCE J. HOFFMAN
University of California
Berkeley, California

INTRODUCTION

This paper presents a model for engineering the user interface for large data base systems in order to maintain flexible access controls over sensitive data. The model is independent of both machine and data base structure, and is sufficiently modular to allow cost-effectiveness studies on access mechanisms. Access control is based on sets of procedures called formularies. The decision on whether a user can read, write, update, etc., data is controlled by programs (not merely bits or tables of data) which can be completely independent of the contents or location of raw data in the data base. The decision to grant or deny access can be made in real time at data access time, not only at file creation time as has usually been the case in the past. Indeed the model presented does not make use of the concept of "files," though a specific interpretation of the model may do so. Access control is not restricted to the file level or the record level, although the model permits either of these. If desired, however, access can be controlled at arbitrarily lower levels, even at the bit level. The function of data addressing is separated from the function of access control in the model. Moreover, each element of raw data need appear only once, thus allowing considerable savings in memory and in maintenance effort over previous file-oriented systems.
Specifically not considered in the model are privacy problems associated with communication lines, electromagnetic radiation monitoring, physical security, wiretapping, equipment failure, operating system software bugs, personnel, or administrative procedures. Cryptographic methods are not dealt with in any detail, though provision is made for inclusion of encrypting and decrypting operations in any particular interpretation of the model.
Specific interpretations of the model can be implemented on any general-purpose computer; no special time-sharing or other hardware is required. The only proviso is that all requests to access the data base must be guaranteed to pass through the data base system.

* Prepared for the U.S. Atomic Energy Commission at the Stanford University Linear Accelerator Center under Contract No. AT(04-3)-515.

ACCESS CONTROL METHODS
Access control in existing systems

In most existing file systems which are concerned
with information privacy, passwords1,2 are used to provide software protection for sensitive data. Password
schemes generally permit a small finite number of specific types of access to files. Each file (or user) has an
associated password. In order to access information in
a file, the user must provide the correct password.
These methods, while acceptable for some purposes, can
be compromised by wiretapping, electromagnetic radiation monitoring, and other means. Even if this were
not the case, there are other reasons3 why password
schemes as implemented to date do not solve satisfactorily the problem of access control in a large computer data base shared by many users.
One of these reasons is that passwords have been
associated with files. In most current systems, information is protected at the file level only-it has been
tacitly assumed that all data within a file is of the same
sensitivity. The real world does not conform to this
assumption. Information from various sources is constantly coming into common data pools, where it can
be used by all persons with access to that pool. A problem arises when certain information in a file should be
available to some but not all authorized users of the file.
In the MULTICS system4 for example, if a user has
a file which in part contains sensitive data, he just cannot merge all his data with that of his colleagues. He often must separate the sensitive data and save that in a separate file; the common pool of data does not contain this sensitive and possibly highly valuable data. Moreover, he and those he permits to access this sensitive data must, if they also wish to make use of the nonsensitive data, create a distinct merged file, thus duplicating information kept in the system; if some of this duplicated data must later be changed, it must be changed in all files instead of only one. Figure 1, taken from Hoffman's survey5 of computers and privacy, graphically illustrates this situation by depicting memory allocation under existing systems and under a more desirable system.

Figure 1-Use of computer storage in file systems (an existing file system holds unnecessarily duplicated copies of information in files A, B, C, and D; a desirable file system stores each item once, with access control information held separately)

The file management problems presented and the memory wastage (due to duplication of data) tend to inhibit creation of large data bases and to foster the development of smaller, less efficient,* overlapping data bases which could, were the privacy problem really solved, be merged.

* A simple cost model for information systems is presented by Arvas.6 He there derives a simple rule to determine when it is more efficient to consolidate files and when it is more efficient to distribute copies of them.

Several years ago Bingham7 suggested the use of User's Control Profiles to associate access control with a user rather than a file. This allows users to operate only on file subsets for which they are authorized and to some extent solves the memory wastage problem. Weissman has recently described a working system at SDC which makes use of security properties of users, terminals, and files.8 He presents a set-theoretic model for such a system. His model does not deal with access control below the file level.
Hsiao9 has recently implemented a system using authority items associated with users. Hsiao's system controls access at the record level, one step beneath the file level. In it, access control information is stored independently of raw data, and thus can be examined or changed without actually accessing the raw data. Hsiao's system and the TERPS system10 at West Sussex County in England are two of the first working systems which control access at a level lower than the file level.

Access control in proposed systems

Some other methods have been proposed for access control, but not yet implemented. These include Graham's scheme11 which essentially assigns a sensitivity level to each program and data element in the system,** another which allows higher-level programs to grant access privileges to lower-level programs,12 and still others which place access control at the segment level13,14 via machine hardware and "codewords". These methods may prove acceptable in many contexts. However, they are not general enough for all situations. If distinct sensitivity levels cannot be assigned to data, as is sometimes the case, Graham's scheme cannot be used. The other methods, while working in principle on a computer with hardware segmentation, seem infeasible and uneconomical on a computer with another type of memory structure such as an associative memory15,16,17,18,19 or a Lesser memory.20 These objections are covered in more detail elsewhere.5

** Evidently this scheme has now been implemented.

Desirable characteristics for an access control method

It seems desirable to devise a method of access control which does not impose an arbitrary constraint (such as segmentation or sensitivity levels) on data or programs. This method should allow efficient control of individual data elements (rather than of files or records only). Also, it should not extract unwarranted cost in storage or elsewhere from the user who wants only a small portion of his data protected. The method should be independent of both machine and file structure, yet flexible enough to allow a particular implementation of it to be efficient. Finally, it should be sufficiently modular to permit cost-effectiveness experiments to be undertaken. We would then finally have a vehicle for exploring the often-asked but never-answered question about privacy controls, "How much does technique X cost?"
We now present such a method.
THE FORMULARY METHOD OF ACCESS
CONTROL
We now describe the "formulary" method of access
control. Its salient features have been mentioned above.
The decision to grant or deny access is made at data
access time, rather than at file creation time, as has
generally been the case in previous systems. This, together with the fact that the decision is made by a
program (not by a scan of bits or a table), allows more
flexible control of access. Data-dependent, terminal-dependent, time-dependent, and user response-dependent decisions can now be made dynamically at data
request time, in contrast to the predetermined decisions
made in previous systems, which are, in fact, subsumed
by the formulary method. Access to individual related
data items which may have logical addresses very close
to each other can be controlled individually. For example, a salary figure might be released without any
identification of an employee or any other data.
For any particular interpretation, the installation
must supply the procedures listed in Table I. These
procedures can all be considered a part of the general
accessing mechanism, each performing a specific function. By clearly delimiting these functions, a degree of
modularity is gained which enables the installation to
experiment with various access control methods to arrive at the modules which best suit its needs for efficiency, economy, flexibility, etc. This modularity also
results in access control becoming independent of the
remainder of the operating system, a desirable but
elusive goal. 8 While the formulary model and its central
ACCESS procedure remain unchanged, each installation can supply and easily change the procedures of
Table I as desirable. These procedures are all specified
in the body of this paper.
The basic idea behind the formulary method is that a user, a terminal, and a previously built formulary (defined below) must be linked together, or attached, in order for a user to perform information storage, retrieval, and/or manipulative operations. At the time the user requests use of the data base system, this linkage is effected, but only if the combination of user, terminal and formulary is allowed. The general linking process is described later in this section.

TABLE I-Procedures Supplied by the Installation

FOR EACH INTERPRETATION, THE INSTALLATION MUST SUPPLY:
• AT LEAST ONE TALK PROCEDURE
• CODING FOR THE ACCESS ALGORITHM
• PRIMITIVE OPERATIONS
  • FETCH
  • STORE
• A FORMULARYBUILDER PROCEDURE
• AT LEAST ONE FORMULARY, CONSISTING OF
  • CONTROL PROCEDURE
  • VIRTUAL PROCEDURE
  • SCRAMBLE PROCEDURE (may be null)
  • UNSCRAMBLE PROCEDURE (may be null)
Virtual memory mapping hardware is not required to
implement the model but the model does handle systems equipped with such hardware. It is assumed that
enough virtual addressing capacity is available to
handle the entire data base. Virtual addresses are
mapped into the physical core memory locations, disc
tracks, low-usage magnetic tapes, etc., by hardware
and/or by the FETCH and STORE primitive operations (see below) for a particular implementation.
Definitions and notation

The internal name of a datum is its logical address
(with respect to the structure of the data base). The
internal name of a datum does not change during continuous system operation.
Examples:
(1) A "tree name" such as 5.7.3.2 which denotes
field 2 of branch 3 of branch 7 of branch 5 in
the data base
(2) "Associative memory identifiers" such as (14,
273, 34) where 14 represents the 14th attribute,
273 represents the 273rd object, and 34 represents the 34th value, in a memory similar to the
one described by Rovner and Feldman. 21
A User Control Block, or UCB, is space in primary
(core) storage allocated during the attachment process
(described below). It contains the user identification,
terminal identification, and information about the
VIRTUAL, CONTROL, SCRAMBLE, and UNSCRAMBLE procedures of the formulary the user is
linked to. (An entity with the same name and used
similarly has recently been presented independently in
a non-implemented model by Friedman. 22 )
Usually this information is just the virtual address
of each of these procedures. The virtual addresses are
kept in primary storage in the UCB since a formulary,
once linked to a user and terminal, will probably be
(oft-) used very shortly. The first reference to any of
these addresses (indirectly through the UCB) will
trigger an appropriate action (e.g., a page fault on some
computers) to move the proper program into primary
storage (if it is not there already). It will then presumably stay there as long as it is useful enough to merit keeping in high-speed memory. The virtual addresses of procedures of a formulary cannot change while they are contained in any UCB. This constraint is easy to enforce using the CONTROL procedure described below, which controls operations on any datum, including formularies. Each UCB is always kept in high-speed primary storage in the data area of the ACCESS procedure.

The ACCESS procedure

All control mechanisms in the formulary model are invoked by a central ACCESS procedure. This ACCESS procedure is the only procedure which directly calls the primitive FETCH and STORE operations and which performs locking and unlocking operations on data items in the base. All requests for operations on the data base must go through the ACCESS procedure.
The ACCESS procedure is a very important element of the formulary model. It is described in full detail and its algorithm is supplied below.
The user communicates only indirectly with ACCESS. The bridge (see Figure 2) between the system-oriented ACCESS procedure and the application-oriented user is provided by the (batch or conversational) storage and retrieval program, TALK.

TALK, the application-oriented storage and retrieval procedure

To access a datum, the user must call upon TALK, the (nonsystem) application-oriented storage and retrieval procedure. TALK converses with the user (or the user's program) to obtain, along with other information, (1) a datum description in a user-oriented language, and (2) the operation the user wishes to perform on that datum. TALK translates the datum description in the user-oriented language into an internal name, thus providing a bridge between the user's conception of the data base and the system's conception of the data base. The TALK procedure is described in more detail below.

Figure 2-User/data base interface (TALK, the conversational storage and retrieval procedure, links the user to the data base through ACCESS and the primitive operations)

Formularies-what they are
A formulary is a set of procedures which controls
access to information in a data base. These procedures
are invoked whenever access to data is requested. They
perform various functions in the storage, retrieval, and
manipulation of information. The set of procedures and
their associated functions are the essential elements of
the formulary model of access control.
Different users will want different algorithms to
carry out these functions. For example, some users
will be using data which is inaccessible to others; the
name of a particular data element may be specified in
different ways by different users; some users will
manipulate data structures-such as trees, lists, sparse
files, ring structures, arrays, etc.-which are accessed
by algorithms specifically designed for these structures.
Depending on how he wishes to name, access, and control access to elements of the data base, each user will
be attached to a formulary appropriate to his own
needs.
Procedures of a formulary

In this subsection, we describe the procedures of a
formulary. These procedures determine the accessibility, addressing, structure and interrelationships of
data in the data base dynamically, at data request
time. They can be arbitrarily complex. In contrast,
earlier· systems usually made only table-driven static
determinations, prespecified at file creation time.
Each procedure of a formulary should, if possible,
run from execute-only memory, which is alterable
only under administrative control. The integrity of the
system depends on the integrity of the formularies and
therefore the procedures of all formularies should be
written by "system" programmers who are assumed
honest. These procedures should be audited for program
errors, hidden "trap doors," etc., before being inserted
into the (effective) execute-only memory under administrative control. Failure to do this may result in
the compromising of sensitive data, since an unscrupulous programmer of a formulary could cause the formulary to "leak" sensitive information to himself or to his
agents.
A formulary has four procedures: VIRTUAL,
SCRAMBLE, UNSCRAMBLE, and CONTROL. The
first three are relevant but not central to access control; the decision on whether to grant the type of access
desired is made solely by the CONTROL procedure.
The first three procedures are explicitly included in
each formulary for three reasons:
(1) to centralize in one place all functions dealing
with addressing and access control;
(2) to give the model the generality necessary to
model existing and proposed systems; and
(3) to provide well-delimited modules for cost/
effectiveness studies and for experimentation
with different addressing schemes and access
control schemes.
a. The VIRTUAL procedure. VIRTUAL translates
an internal name into the virtual address of the corresponding datum. VIRTUAL is a procedure with two
input parameters:
(1) the internal name to be translated
(2) a cell which will sometimes be used to hold
"other information" as described below.

VIRTUAL returns
(1) the resulting virtual address
(2) a completion code (1 if normal completion)
Recall that enough virtual addressing capacity is assumed available to handle the entire data base. Virtual addresses are mapped into the physical core memory locations, disc tracks, low-usage magnetic tapes, etc., by hardware and/or by the FETCH and STORE primitive operations for a particular implementation.
b. The SCRAMBLE procedure. SCRAMBLE is a procedure which transforms raw data into encrypted form. (In some specific systems, SCRAMBLE may be null.) SCRAMBLE has two input parameters:
(1) the virtual address of the datum to be scrambled
(2) the length of the datum to be scrambled
SCRAMBLE has three output parameters:
(1) a completion code (1 if normal completion)
(2) the virtual address of the scrambled datum
(3) the length of the scrambled datum
Note that if an auto-key cipher (one which must access the start of the cipher-text, whether or not the information desired is at the start) is used, all of the information encrypted using that cipher, be it as small as a single field or as large as an entire "file," must be governed by the same access control privileges. Therefore, some applications may choose to use several (or many) auto-key ciphers within the same "file." It is inefficient and usually undesirable to scramble data items at other than the internal name level, e.g., scrambling as a block (to effectively increase key length) the data represented by several internal names. In cases where internal names represent data which fit into very small areas of storage, greater security may be obtained by other methods (e.g., use of nulls). We do not discuss encrypting schemes in this paper. The interested reader is referred to work by Shannon,23 Kahn,24 and Skatrud.25
c. The UNSCRAMBLE procedure. UNSCRAMBLE is an unscrambling procedure which transforms encrypted data into raw form. (In some specific systems, UNSCRAMBLE may be null.) UNSCRAMBLE has two input parameters:
(1) the virtual address of the datum to be unscrambled
(2) the length of the datum to be unscrambled
UNSCRAMBLE has three output parameters:
(1) a completion code (1 if normal completion)
(2) the virtual address of the unscrambled datum
(3) the length of the unscrambled datum.
d. The CONTROL procedure. CONTROL is a procedure which decides whether a user is allowed to perform the operation he requests (FETCH, STORE,
FETCHLOCK, etc.) on the particular datum he has
specified. CONTROL may consider the identification
of the user and/or the source of the request (e.g., the
terminal identification) in order to arrive at a decision.
CONTROL may also converse with the requesting user
before making the decision.
CONTROL has two input parameters and two output parameters. The two input parameters are:
(1) the internal name of the datum
(2) the operation the user desires to perform
The two output parameters are:
(1) 1 if access is allowed; otherwise an integer
greater than 1
(2) "other information" (explained below).


In some specific systems, data elements may themselves contain access control information. Consider
three examples:
Example 1.
DATUM: | R | W | 30 bits of actual data |

If bit R is on, DATUM is readable.
If bit W is on, DATUM is writeable.

Example 2.

| SALARY   $25,000 |

Reading or writing of salaries of $25,000 or over requires special checking. CONTROL must inspect the
SALARY cell before it can do further capability checking and eventually return 1 or some greater integer as
its first output parameter (see Figure 3). Note that
return of an integer greater than 1 actually transmits
some information to the user; if he knows that he will
not be allowed to alter salaries which are $25,000 or
over, a denial of access actually tells him that the
salary in question is at least $25,000. In the formulary
model, CONTROL can only make a yes or no decision
about access to a particular datum. Any more complex
decisions, such as one involving release of a count which
is possibly low enough to allow unwanted identification
of individual data26 (e.g., "Tell me how many people
the Health Physics Group treated for radiation sicknesses last year who also were treated by the Psychiatric Outpatient Department at the hospital"), can
only be made by a suitably sophisticated TALK
procedure.

can then examine this "other information." If a virtual
address has been put there by CONTROL, VIRTUAL
will not duplicate the possibly laborious determination
of the datum's virtual address, since this has already
been done. VIRTUAL will merely pluck the address
out of the "other information" and pass it back.
Note that CONTROL can be as sophisticated a procedure as desired; it need not be merely a table-searching algorithm. Because of this, CONTROL can consider many heretofore ignored factors in making its decision (see Figure 3). For example, it can make decisions
which are data-dependent and time-dependent. It can
require two keys (or N keys) to open a lock. Also it
can carryon a lengthy dialogue with the user before
allowing (or denying) the access requested.
CONTROL is not limited to use at data request
time. In addition to being used to monitor the interactive storage, retrieval, and manipulation of data, it
can also be used at initial data base creation time for
data edit picture format checking, data value validity
checking, etc. Or, alternatively, one could have two
procedures CONTROL1 and CONTROL2, .in two
different formularies, F1 and F2. F1 could be attached
at data input time and F2 at on-line storage, retrieval,
manipulation, and modification time.
Simultaneous use of one formulary by multiple users

Note that the same formulary can be used simultaneously by several different users with different access permissions. This is possible because access control
is determined by the CONTROL procedure of the attached formulary. This procedure can grant different
privileges to different users.

Example 3.
RecordN
RecordN-1

""'"

Record N+1

347 1346 storage units
of actual data
The record contains its own length (and, therefore,
also points to its successor). This type of record would
appear, for example, in variable length sequential records on magnetic tape and in some list-processing
applications.
In systems of this type, CONTROL might often
duplicate VIRTUAL's function of transforming the
internal name of a datum into that datum's virtual
address. To achieve greater efficiency, CONTROL can
(when appropriate) return the datum's virtual address
as "other information." VIRTUAL, which is called
after CONTROL (see the ACCESS algorithm below),

~~~~~ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _

R~"N

~~

NOTE:

L TIME- DEPENDENT

..

~

-

2. FEEDBACK LOOPS

,. •

,. •

____________

3. TWO - KEY SYSTEM

Figure 3-A sample CONTROL procedure

~

Formulary Model for Flexible Privacy and Access Controls

Building a formulary

Before a formulary can be attached to a user and a
terminal, the procedures it contains must be specified.
This is done using the system program FORMULARYBUILDER. FORMULARYBUILDER converses with
the systems programmer who is building a formulary
to learn what these procedures are, and then retrieves
them from the system library and enters them as a set
into a formulary which the user names. The specifics
of FORMULARYBUILDER depend on the particular
system.***
The attachment process-the method of linking a
formulary to a user and terminal

In order to allow information storage and retrieval
operations on the data base to take place, a user, a
terminal, and a formulary which has been previously
built using FORMULARYBUILDER must be linked
together. This linking process is done in the following
manner.
The first time ACCESS is called (by TALK) for
a given user and terminal, it will only permit attachment of a formulary to the user and terminal (i.e., it
will not honor a request to fetch, store, etc.). The attachment is permitted only if the CONTROL program
of the default formulary allows. The default formulary,
like all other formularies, contains VIRTUAL, CONTROL, SCRAMBLE, and UNSCRAMBLE procedures. For the default formulary, they act as follows:
CONTROL

CONTROL takes the internal name
representing the formulary and decides whether user U at terminal T
is allowed to attach the formulary
represented by the internal name.
U and T are maintained in the UCB
and passed to CONTROL by
ACCESS.
VIRTUAL
VIRTUAL takes the internal name
representing the formulary and returns the virtual address of the
formulary.
SCRAMBLE
No operation.
UNSCRAMBLE No operation.
The ATTACH attempt, if successful, causes informa-

*** An extension to FORMULARYBUILDER which would
allow a user to grant capabilities to other users, and then allow
these users to grant capabilities to still other users, etc., has been
proposed by Victor Lesser. The formulary model does not
currently adequately handle this area of concern.


tion about the formulary specified by the user to be
read into the UCB (which is located in the data area
of the ACCESS procedure). ACCESS then uses this
information (when it is subsequently called on behalf
of this user/terminal combination) to determine which
CONTROL, VIRTUAL, SCRAMBLE, and UNSCRAMBLE procedures to invoke.
Independence of addressing and access control

After the attachment process, the User Control
Block (UCB) contains the user identification U, terminal identification T, and information about (usually
pointers to) the VIRTUAL, CONTROL, SCRAMBLE,
and UNSCRAMBLE procedures of a formulary.
Whether the user can perform certain operations on a
given datum is controlled by the CONTROL program.
The addressing of each datum is controlled by the
VIRTUAL program. Addressing of data items is now
completely independent of the access control for the
data items.
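Restating the separation just described in sketch form: the UCB carries the user, the terminal, and the attached formulary's procedures, and a central ACCESS routine consults CONTROL before VIRTUAL resolves an address. This is a hypothetical rendering, not the ACCESS algorithm given later in the paper.

# Hypothetical UCB contents and CONTROL-before-VIRTUAL ordering.
from dataclasses import dataclass
from typing import Callable

@dataclass
class UCB:                      # User Control Block, filled in at attachment time
    user: str
    terminal: str
    control: Callable
    virtual: Callable
    scramble: Callable
    unscramble: Callable

def access(ucb, internal_name, operation):
    allowed, other_information = ucb.control(internal_name, operation)
    if allowed != 1:
        return None, allowed                    # access denied; code greater than 1
    address, code = ucb.virtual(internal_name, other_information)
    # The FETCH or STORE primitive (and SCRAMBLE/UNSCRAMBLE) would be applied here.
    return address, code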

Breaking an attachment

An existing attachment is broken whenever
(1) the user indicates that he is finished using the
information storage and retrieval system (either
by explicitly declaring so or implicitly by logging out, removing a physical terminal key,
reaching the end-of-job indicator in his input
card deck, etc.), or
(2) the user, via his TALK program, explicitly detaches himself from a formulary.
Subdivision of data base into files not required

Note that while the concept of a data set (or a "file")
MAY be used, the formulary method does not require
this. This represents a significant departure from previous large-scale data base systems which were nearly all
organized with files (data sets) as their major subdivisions. Under the formulary scheme, access to information in a data set is not governed by the data set
name. Rather, it is governed by the CONTROL procedure of the attached formulary. Similarly, addressing
of data in a data set is governed by the VIRTUAL
procedure and not by the data set name. Subdividing
a data base into data sets, while certainly permitted
and often desirable, is not required by the formulary
model.


Concurrent requests to access data-the LOCKLIST
The problem of two or more concurrent requests for
exclusive data access necessitates a mechanism to control these conflicts among competing users. This problem has been discussed and solutions proposed by
several workers.28,9,27 In the formulary model, data can
be set aside (locked) dynamically for the sole use of
one user/terminal combination in a manner similar to
Hsiao's "blocking"9 using a mechanism known as the
LOCKLIST.
The locking and unlocking of data to control simultaneous updating is an entirely separate function from
the access control function. Access control takes into
account privacy considerations only. Locking and unlocking are handled by a separate mechanism, the
LOCKLIST. This is a list of triplets maintained
by the ACCESS program and manipulated by the
FETCHLOCK, STORELOCK, UNLOCKFETCH,
and UNLOCKSTORE operations. Each triplet contains (1) the internal name of a current item, (2) the
identification of the user/terminal combination which
caused it to be locked, and (3) the type of lock (fetch
or store). Any datum represented by a triplet on the
LOCKLIST can be accessed only by the user/terminal
combination which caused it to be locked.
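A minimal Python sketch of the LOCKLIST idea follows; the class and method names and the representation of triplets as Python tuples are assumptions of the sketch, not the model's data structure. Each entry pairs an internal name with the locking user/terminal combination and the lock type, and a datum represented on the list may be used only by the combination that locked it.

# Hypothetical LOCKLIST sketch: a list of (internal_name, user_terminal,
# lock_type) triplets, as described in the text above.
class LockList:
    def __init__(self):
        self.entries = []

    def lock(self, internal_name, user_terminal, lock_type):
        # lock_type is "fetch" or "store"
        for name, owner, _ in self.entries:
            if name == internal_name and owner != user_terminal:
                return False          # already locked by another user/terminal
        self.entries.append((internal_name, user_terminal, lock_type))
        return True

    def may_access(self, internal_name, user_terminal):
        # A locked datum may be accessed only by the locking combination.
        return all(owner == user_terminal
                   for name, owner, _ in self.entries if name == internal_name)

    def unlock(self, internal_name, user_terminal, lock_type):
        self.entries = [e for e in self.entries
                        if e != (internal_name, user_terminal, lock_type)]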
Data items which can be locked are atomic, i.e.,
subparts of these data items cannot be locked. This
implies, for example, that if a user wishes to lock a tree
structure and then manipulate the tree without fear
of some other user changing a subnode of the tree, either
(1) the tree must be atomic in the sense that its
subnodes do not have internal names in the data
base system, or
(2) each subnode must be explicitly locked by the
user and only after all of these are locked can
he proceed without fear of another user changing
the tree. ****

**** A more general and elegant method of handling concurrent requests to access data is being developed by R. D. Russell as part of a general resource allocation method. Much of the housekeeping work currently done in the formulary model can be handled by his method.

The TALK procedure-details

To access a datum, the user must effectively call upon TALK, the (nonsystem) application-oriented storage and retrieval procedure. TALK converses with the interactive user and/or the user's program and/or the operating system to obtain
(1) a datum description in a user-oriented language
(2) the operation the user wishes to perform on that datum
(3) user identification and other information about the user and/or the terminal where the user is located.
Depending on the particular system, the user explicitly gives TALK zero, one, two, or all three of the above parameters. TALK supplies the missing parameters (if any), converts (1) to an internal name, and then passes the user identification, the terminal identification, the internal name of the datum, and the desired operation to the ACCESS procedure, which actually attempts to perform the operation.
Note that one system may have available many TALK procedures. A user requests invocation of any of them in the same way he initiates any (nonsystem) program. Sophisticated users will require only "bare-bones" TALK procedures, while novices may require quite complex tutorial TALK procedures. They may both be using the same data base while availing themselves of different datum descriptions. As an example, one TALK procedure might translate English "field names" into internal names, while another TALK procedure translates French "field names" into internal names. This ability to use multiple and user-dependent descriptions of the same item is not available with such generality in any system the author is aware of, though some systems allow lesser degrees of this.29,30
Different TALK procedures also allow concealment of the fact that certain information is even in a data base, as illustrated in Figure 4. The remarks above about using different TALK procedures also apply if a system uses only one relatively sophisticated TALK procedure which takes actions dependent on the person or terminal using it at a given time.

Figure 4-Concealment of the fact that a data base contains certain information. (Two terminal dialogues, each beginning WHAT PROGRAM? and asking to see "salary of robert d. jones": TALK1 replies YOU ARE NOT PERMITTED READ ACCESS TO THE ... FIELD because CONTROL determined that the user was not permitted read access; TALK2 intentionally replies NO FIELD NAMED ..., concealing the field's existence from the user.)
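A rough Python sketch of the TALK role described above is given below. The English field-name table, the default user and terminal values, and the fact that the sketch merely returns the four values it would hand to ACCESS (rather than calling it) are all simplifying assumptions of the sketch.

# Hypothetical sketch of a minimal TALK procedure. FIELD_NAMES and the
# defaults are invented; a real TALK would pass the result to ACCESS.
FIELD_NAMES = {"salary": 1001, "address": 1002}   # user-oriented name -> internal name

def talk(description, operation, user=None, terminal=None):
    # Supply any parameters the user did not give explicitly.
    user = user or "default-user"
    terminal = terminal or "default-terminal"
    internal_name = FIELD_NAMES.get(description)
    if internal_name is None:
        # A TALK procedure may also use this reply to conceal a field's existence.
        return "NO FIELD NAMED %s." % description
    # These are the values TALK would pass to the ACCESS procedure.
    return (user, terminal, internal_name, operation)

print(talk("salary", "FETCH"))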

The ACCESS procedure-details

ACCESS uses the VIRTUAL, CONTROL, UNSCRAMBLE, and SCRAMBLE procedures specified in the UCB to carry out information storage and retrieval functions. Its input parameters are:
(1) information about the user, terminal, etc., defined by the installation. This information is passed by the procedure that calls ACCESS;
(2) internal name of datum;
(3) an area which either contains or will contain the value of the datum specified by (2);
(4) the length of (3);
(5) operation to perform-FETCH, FETCHLOCK, STORE, STORELOCK, UNLOCKFETCH, UNLOCKSTORE, ATTACH, or DETACH. FETCHLOCK and STORELOCK lock datums to further fetch or store accesses respectively (except by the user/terminal combination for which the lock was put on). UNLOCKFETCH and UNLOCKSTORE unlock these locks. ATTACH and DETACH respectively create and destroy user/terminal/formulary attachments.
(6) a variable in which a completion code is returned by ACCESS.


ACCESS itself handles all operations of (5) except
FETCH and STORE. For FETCH and STORE operations on the data base, it invokes the FETCH and
STORE primitives specified below.
Note that some means must be provided to determine which formulary is attached so the CONTROL,
SCRAMBLE, UNSCRAMBLE, and VIRTUAL procedures of that particular formulary can be invoked.
One method is to have those procedures themselves
determine which formulary is attached by examining
data common to them and to the ACCESS procedure.
These data are initially set by the ACCESS procedure
and then are referenced by the other procedures. A
working system using this method is illustrated in
another report.31 An alternative method, if ACCESS is
written in a more powerful language or in assembly
language, would be to use a common transfer vector.
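The "data common to them and to the ACCESS procedure" approach can be pictured as a small table that ACCESS fills in when an attachment is made and that later invocations consult. The Python sketch below is only meant to make that indirection concrete; the table, function names, and formulary object are all inventions of the sketch.

# Hypothetical sketch of the shared-data method of finding the attached
# formulary's procedures; not any particular implementation.
ATTACHED = {}   # (user, terminal) -> attached formulary object

def note_attachment(user, terminal, formulary):
    # Recorded by ACCESS when an ATTACH operation succeeds.
    ATTACHED[(user, terminal)] = formulary

def attached_procedures(user, terminal):
    # Later calls look up the CONTROL, VIRTUAL, SCRAMBLE, and UNSCRAMBLE
    # procedures of whichever formulary is attached for this combination.
    f = ATTACHED[(user, terminal)]
    return f.control, f.virtual, f.scramble, f.unscramble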
Note that the procedures TESTANDSET and
IDXLL and their corresponding calls can be removed
from ACCESS if no user will ever have to lock out
access to a datum which ordinarily can be accessed by
several users at the same time or if the installation
wishes to use another method to control conflicts
among users competing for exclusive access to datums;
this makes the procedure considerably shorter. Such a
"no parallelism" version of the ACCESS algorithm is
given elsewhere.31
An ALGOL algorithm for the ACCESS procedure
follows. This procedure is quite important and should
be examined carefully. The comments in the algorithm
should not be skipped, as they often suggest alternate
methods for accomplishing the same goals.

THE ACCESS ALGORITHM
procedure access (info, intname, val, length, opn, compcode) ;
integer array info, val; integer length, opn, compcode;
begin comment If OPN = FETCH, VAL is set to the value of the datum represented by INTNAME.

If OPN = STORE, the value of the datum represented by INTNAME is replaced by the value
in the VAL array.
If OPN = FETCHLOCK or STORELOCK, the datum is locked to subsequent FETCH or
STORE operations by other users or from other terminals until an UNLOCKFETCH or
UNLOCKSTORE operation, whichever is appropriate, is performed.
If OPN = UNLOCKFETCH or UNLOCKSTORE, the fetch lock or store lock previously
inserted by a FETCHLOCK or STORELOCK operation is removed.
If OPN = ATTACH, the formulary represented by internal name INTNAME is attached to the
user and terminal described in the INFO array.
If OPN = DETACH, the formulary represented by internal name INTNAME is detached from
the user and terminal described in the INFO array.
VAL is LENGTH storage elements long.
Note that a FETCH (STORE) operation will actually attempt to fetch (store) LENGTH
storage elements of information.


It is the responsibility of the TALK procedure to handle scrambling or unscrambling algorithms
that return outputs of a different length than their inputs.
ACCESS returns the following integer completion codes in COMPCODE:
1  normal exit, no error
2  unlock operation requested by user or terminal who/which did not set lock
3  operation permitted but gave error when attempted
4  attempt to unlock datum which is not locked in given manner
5  cannot handle any more User Control Blocks (would cause table overflow)
6  attempt to detach nonexistent user/terminal/formulary combination
7  operation permitted for this user and terminal but could not be carried out since datum was locked (by another user/terminal) to prevent such an operation
8  cannot put lock on as requested since LOCKLIST is full
9  datum already locked by this user and terminal
10 error return from VIRTUAL procedure
11 operation on the datum represented by INTNAME not permitted by CONTROL procedure of the attached formulary
12 end of data set encountered by FETCH operation.

Note that by the time the user has left the ACCESS routine, the data may have been changed by another user
(if the original user did not lock it). Note that ACCESS could be altered to allow scrambling and unscrambling to
take place at external devices rather than in the central processor.
Important: ACCESS expects the following to be available to it. The installation supplies these in some way other
than as parameters to ACCESS (for example, as global variables in ALGOL or COMMON variables in FORTRAN):
(1) ISTDUCB  the default User Control Block. Its length is NUCB storage units.
(2) NUCB     see (1).
(3) UCB      a list of User Control Blocks (UCB's) initialized outside ACCESS to ucb (1, 1) = -2, ucb (i, j) = anything when ¬(i = j = 1). UCB is declared as integer array [1:maxusers, 1:nucb].
(4) MAXUSERS the maximum number of users which can be actively connected to the system at any point in time.
(5) ITALK    the length of the INFO array (which is the first parameter of ACCESS). INFO contains information about the user and terminal which is used by ACCESS and also passed by ACCESS to procedures of the attached formulary. INFO[1] contains user identification.
(6) LOCKLIST a list of locks (each element of the LOCKLIST array should be initialized outside ACCESS to -1). LOCKLIST is declared as integer array [1:4, 1:maxllist].
(7) MAXLLIST the maximum length of the LOCKLIST.
(8) CS1      a semaphore to govern simultaneous access to the critical section of the ACCESS procedure (initialized to 1 outside ACCESS).

ACCESS assumes that the variables FETCH, STORE, FETCHLOCK, STORELOCK, UNLOCKFETCH,
UNLOCKSTORE, ATTACH, and DETACH have been initialized globally and are never changed by the
installation;
integer array iucb [1:nucb], reslt [1:length];
integer i, ii, islot, j, k, yesno, other, n, datum;
integer procedure testandset (semaphore); integer semaphore;
begin comment TESTANDSET is an integer function designator. It returns -1 if SEMAPHORE was in the
state LOCKED on entry to TESTANDSET. Otherwise, TESTANDSET returns something other than -1. In all
cases, SEMAPHORE is in state LOCKED after the execution of the TESTANDSET procedure, and must be
explicitly unlocked in order for it to be used again.


TESTANDSET is used to implement a controlling mechanism to prevent conflicts among users competing for the
same resource, as discussed in work by Dijkstra.27 It will NOT prevent "deadly embraces".32 No explicit code is given
here, since the function is machine-dependent.
This procedure can be removed if no user will ever have to lock out access to a datum which ordinarily can be
accessed by several users at the same time or if the installation wishes to use another method to control conflicts
among users competing for exclusive access to datums;

end testandset;
integer procedure idxll (intname, opn); integer intname, opn;
begin comment IDXLL, given an internal name INTNAME, returns the relative position of INTNAME on the

LOCKLIST if the datum represented by INTNAME is locked in a manner affecting the operation OPN. Otherwise,
IDXLL returns the negation of the relative location of the first empty slot on the LOCKLIST. If the LOCKLIST is
full and the INTNAME/OPN combination is not found on it, IDXLL returns O.
This procedure can be removed if no user will ever have to lock out access to a datum which ordinarily can be
accessed by several users at the same time or if the installation wishes to use another method to control conflicts
among users competing for exclusive access to datums;
integer firstempty;
j := if opn = FETCH or opn = UNLOCKFETCH or opn = FETCHLOCK then 1 else 2;
idxll := firstempty := 0;
for i := 1 step 1 until maxllist do
begin ii := -i;
if locklist [1, i] = -1 then firstempty := i
else if locklist [1, i] = intname and locklist [2, i] = j then begin idxll := i;
go to RET
end;
end;
if firstempty ≠ 0 then idxll := -firstempty;
RET:
end idxll;
procedure ret (i); integer i;
begin comment RET sets the completion code compcode to i and then causes exit from the ACCESS procedure;
compcode := i; go to FIN
end ret;

compcode := 1;
comment first let's see if we recognize the user/terminal combination in INFO;

islot := 0;
for i := 1 step 1 until maxusers do
begin ii := i;
if ucb [i, 1] = -2 then begin comment end of list of ucb's;
if islot = 0 then begin if ii ≠ maxusers then ucb [ii + 1, 1] := -2;
go to XFER;
end
else go to PRESETUP;
end
else if ucb [i, 1] = -1 then
begin comment remember this slot if vacant; islot := ii end
else begin for j := 1 step 1 until italk do
if ucb [i, j] ≠ info [j] then go to ILOOPND;
go to SETUPPTRS
end;
ILOOPND:
end i loop;
if islot = 0 then ret (5); comment cannot handle any more UCBs;
PRESETUP:
ii := islot;
XFER:
for k := 1 step 1 until italk do ucb [ii, k] := info [k];
for k := italk + 1 step 1 until nucb do ucb [ii, k] := istducb [k];
SETUPPTRS:
for i := 1 step 1 until nucb do iucb [i] := ucb [ii, i];
comment set up pointers to appropriate user control block for particular implementation. Note well: Setting up
pointers to appropriate user control blocks is quite dependent on the particular system;
comment We have now associated user and terminal with the user control block (representing a formulary) in
relative position i of the UCB table;
if iucb [nucb] ≠ intname and opn = DETACH then ret (6);
comment attempt to detach user/terminal/formulary combination not currently attached;
control (intname, opn, yesno, other);
if yesno > 1 then ret (11);
comment return 11 if CONTROL does not permit operation;
if opn = ATTACH then begin ucb [ii, nucb] := intname; go to FIN
end;
comment Note well: In many implementations, pointers to each procedure of the formulary (obtained by having
VIRTUAL transform intname into a virtual address) might be put into the UCB upon attachment. In others, the
philosophy used here of only putting one pointer--to the formulary--into the UCB will be followed. The decision
should take into account design parameters such as implementation language, storage available, etc.;
if opn = DETACH then begin comment detach formulary (this leaves an open slot in the ucb array);
ucb [ii, 1] := -1; go to FIN
end;
if opn = UNLOCKFETCH or opn = UNLOCKSTORE then
begin i := idxll (intname, opn); comment find internal name on LOCKLIST;
if i ≤ 0 then ret (4); comment cannot find it;
for j := 1 step 1 until italk do
if locklist [2 + j, i] ≠ iucb [j] then ret (2);
locklist [1, i] := -1; comment undo the lock and mark its slot in the LOCKLIST array empty;
go to FIN
end unlock operation;
TRY:
if testandset (cs1) = -1 then go to TRY;
comment loop until no other user is executing the critical section below;
comment ACCESS should ask to be put to sleep if embedding system permits;
comment ------- enter critical section for locking out datums -------;
i := idxll (intname, opn);
comment get relative location of locked datum in locklist;
if i > 0 then begin comment datum found on locklist so see if it was locked by this user and terminal;
for j := 1 step 1 until italk do
if locklist [2 + j, i] ≠ iucb [j] then ret (7);
comment data already locked by another user or terminal;
if opn = FETCHLOCK or opn = STORELOCK then ret (9);
comment datum already locked by this user and terminal, so return completion code of 9;
end;
i := -i;
if opn = FETCHLOCK or opn = STORELOCK then
begin comment this is a lock operation;
if i = 0 then ret (8); comment cannot set lock since locklist is full;
locklist [2, i] := if opn = FETCHLOCK then 1 else 2;
comment set appropriate lock;
for j := 1 step 1 until italk do locklist [2 + j, i] := iucb [j];
comment place user and terminal identification into LOCKLIST;
locklist [1, i] := intname; comment place internal name on LOCKLIST;
go to FIN;
end lock operation;
virtual (intname, datum, other, compcode);
comment VIRTUAL returns in datum the virtual address of the datum specified;
if compcode > 1 then ret (10); comment error return from VIRTUAL;
if opn = STORE then
begin comment store operation;
scramble (val, length, compcode, reslt, n);
if compcode > 1 then ret (3);
comment operation permitted but gave error when attempted;
comment now perform a physical write of n storage units to the block starting at reslt;
store (datum, reslt, n, compcode);
if compcode > 1 then ret (3)
end
else
begin comment fetch operation;
fetch (datum, reslt, length, compcode);
if compcode = 2 then ret (12); comment end of data set encountered;
if compcode > 1 then ret (3);
unscramble (reslt, length, compcode, val, n);
if compcode > 1 then ret (3);
end fetch operation;
FIN:
comment ------- leave critical section for locking out datums -------;
cs1 := 1;
end access;
FETCH and STORE primitive operations
The two primitive operations FETCH and STORE
are supplied by the installation. These primitives
actually perform the physical reads and writes which
cause information transfer between the media the data
base resides on and the primary storage medium (usually, magnetic core storage). They are invoked only
by the ACCESS procedure.
The primitive operations cannot be expressed in
machine-independent form, but rather depend on the
specific system and machine used. They are defined
functionally below.

FETCH (ADDR, VALUE, LENGTH, COMP)

This primitive fetches the value which is contained in the storage locations starting at virtual address ADDR and returns it in VALUE. This value may be scrambled, but if so unscrambling will be done later by UNSCRAMBLE (called from ACCESS), and LENGTH is the length of the scrambled data. The value comprises LENGTH storage elements. Upon completion, the completion code COMP is set to:
1 if normal exit
2 if end of data set encountered when physical read attempted
3 if length too big (installation-determined)
4 if illegal virtual address given to fetch from
5 if error occurred upon attempt to do physical read.

STORE (ADDR, VALUE, LENGTH, COMP)

This primitive stores LENGTH storage elements starting at virtual address VALUE into LENGTH storage elements starting at virtual address ADDR. The information stored may be scrambled, but if so the scrambling has already been done by SCRAMBLE (called from ACCESS), and LENGTH is the length of the scrambled data. Upon completion, the completion code COMP is set to:
1 if normal exit
3 if length too big (installation-determined)
4 if illegal virtual address given to store into
5 if error occurred upon attempt to do physical write.
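Since the primitives are installation-supplied, only their interfaces and completion codes are fixed by the model. The Python sketch below imitates the functional definitions above against an in-memory array that stands in for the data base medium; the array, its size, and the length limit are inventions of the sketch, and values are returned rather than placed in output parameters.

# Hypothetical in-memory stand-ins for the installation-supplied FETCH and
# STORE primitives. Completion codes follow the definitions in the text:
# 1 normal, 2 end of data set (FETCH only), 3 length too big,
# 4 illegal virtual address, 5 physical read/write error (not modeled here).
STORAGE = [0] * 1024          # stand-in for the physical data base
MAX_LENGTH = 256              # installation-determined limit

def fetch(addr, length):
    if length > MAX_LENGTH:
        return None, 3
    if addr < 0 or addr >= len(STORAGE):
        return None, 4
    if addr + length > len(STORAGE):
        return STORAGE[addr:], 2              # end of data set encountered
    return STORAGE[addr:addr + length], 1

def store(addr, value, length):
    if length > MAX_LENGTH:
        return 3
    if addr < 0 or addr + length > len(STORAGE):
        return 4
    STORAGE[addr:addr + length] = list(value[:length]) + [0] * (length - len(value))
    return 1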
A NOTE ON THE COST OF SOME
PRIVACY SAFEGUARDS
As mentioned above, a desirable property for an access control model is that it be sufficiently modular to
permit cost-effectiveness experiments to be undertaken. In this way the model would serve as a vehicle
for exploring questions of cost with respect to various
privacy safeguards.
Using the formulary model, an experiment was run
on the IBM 360/91 computer system at the SLAC
Facility of Stanford University Computation Center.
This experiment was designed to obtain figures on the
additional overhead due to using the formulary method
and on the costs on encoding (and conversely the cost
of decoding data). Early results31 seem to indicate that
the incremental cost of scrambling information in a
large computer data base where fetch accesses (and
hence unscrambling operations) are relatively infrequent is infinitesimal.
It is easy to use the formulary model to carry out
various other experiments dealing with relative costs
of diverse encoding methods and data accessing
schemes. We hope to do more of this in the future.
SUMMARY
We have defined and demonstrated a model of access
control which allows real-time decisions to be made
about privileges granted to users of a data base. Raw
data need appear only once in the data base and arbitrarily complex access control programs can be associated with arbitrarily small fragments of this data.
The desirable characteristics for an access control
method laid out in the section on access control methods
are all present (though we have not yet run enough experiments to make general statements about efficiency) :
(1) No arbitrary constraint (such as segmentation

or sensitivity levels) is imposed on data or
programs.
The method allows control of individual data
elements. Its efficiency depends on the specific
system involved and the particular controls used.
No extra storage or time is required to describe
data which the user does not desire to protect.
The method is machine-independent and also
independent of file structure. The efficiency of
each implementation depends mainly on the
adequacy of the formulary method for the particular data structures and application involved.
The discussion above illustrates the modularity
of the formulary mode.

ACKNOWLEDGMENTS
This paper is a condensation of a Ph.D. dissertation
at the Stanford University Computer Science Department. The author is deeply indebted to Professor
William F. Miller for his encouragement and advice
during the research and writing that went into it.
Many other members of the Stanford Computer Science
Department and the Stanford Linear Accelerator
Center also contributed their ideas and help, in particular, John Levy, Robert Russell, Victor Lesser,
Harold Stone, Edward Feigenbaum, and Jerome Feldman. The formulary idea was initially suggested by the
use of syntax definitions ("field formularies") for
input/output data descriptions as described by
Castleman. 33
REFERENCES
1 P S CRISMAN (EDITOR)
The compatible time-sharing system-a programmer's
guide MIT Press Cambridge Massachusetts 1965
2 J D BABCOCK
A brief description of privacy measures in the RUSH
time-sharing system
Proc AFIPS SJCC Vol 30 pp 301-302 Thompson Book Co
Washington D C 1967
3 B W LAMPSON
Dynamic protection structures
Proc AFIPS FJCC pp 27-38 1969
4 F J CORBATO V A VYSSOTSKY
Introduction and overview of the Multics system
Proc AFIPS SJCC pp 185-196 1965
5 L J HOFFMAN
Computers and privacy: a survey
Computing Surveys Vol 1 No 2 pp 85-103 1969
6 C ARVAS
Joint use of databanks
Report No 6 Statistiska Centralbyran Stockholms
Universitet Ukas P5 Sweden 1968
7 H W BINGHAM
Security techniques for EDP of multilevel classified
information
Document RADC-TR-65-415 Rome Air Development
Center Griffiss Air Force Base New York 1965
8 C WEISSMAN
Security controls in the ADEPT-50 time-sharing system
Proc AFIPS FJCC pp 119-133 1969
9 D K HSIAO
A file system for a problem solving facility
Ph D Dissertation in Electrical Engineering University
of Pennsylvania Philadelphia Pennsylvania 1968
10 M G STONE
T E RPS-file independent enquiries
Computer Bulletin Vol 11 No 4 pp 286-289 1968
11 R M GRAHAM
Protection in an information processing utility
Communications of the ACM Vol 11 No 5 pp 365-369
1968
12 J B DENNIS E C VAN HORN
Programming semantics for multi-programmed computation
Communications of the ACM Vol 9 No 3 pp 143-155 1966
13 J K ILIFFE
Basic machine principles
MacDonald and Co London England 1968
14 D C EVANS J Y LE CLERC
Address mapping and control of access in an interactive
computer
Proc AFIPS SJCC Vol 30 pp 23-30 Thompson Book Co
Washington D C 1967
15 J A FELDMAN
Aspects of associative processing
Technical Note 1965-13 Lincoln Laboratory MIT
Cambridge Massachusetts 1965
16 R G EWING P M DAVIES
An associative processor
Proc AFIPS FJCC 1964
17 R G GALL
A hardware-integrated GPC / search memory
Proc AFIPS FJCC 1964
18 J MC ATEER et al
Associative memory system implementation and
characteristics
Proc AFIPS FJCC 1964
19 J I RAFFEL T S CROWTHER
A proposal for an associative memory using magnetic films
IEEE Trans on Electronic Computers Vol EC-13 No 5
1964
20 V R LESSER
A multi-level computer organization designed to separate data-accessing from the computation
Technical Report CS90 Computer Science Department Stanford University Stanford California 1968
21 P D ROVNER J A FELDMAN
The Leap language and data structure
Proc IFIP 1968 C73-C77
22 T D FRIEDMAN
The authorization problem in shared files
IBM Systems Journal Vol 9 No 4 1970
23 C E SHANNON
Communication theory of secrecy systems
Bell System Technical Journal Vol 28 pp 656-715 1949
24 D KAHN
The codebreakers
Macmillan New York New York 1967
25 R O SKATRUD
The application of cryptographic techniques to data processing
Proc AFIPS FJCC pp 111-117 1969
26 W F MILLER L J HOFFMAN
Getting a personal dossier from a statistical data bank
Datamation pp 74-75 May 1970
27 E W DIJKSTRA
Cooperating sequential processes
Department of Mathematics Technological University Eindhoven the Netherlands 1965
28 A SHOSHANI A J BERNSTEIN
Synchronization in a parallel accessed data base
Communications of the ACM Vol 12 No 11 pp 604-607 1969
29 R S JONES
DATA FILE TWO-A data storage and retrieval system
Proc SJCC pp 171-181 1968
30 R H GIERING
Information processing and the data spectrum
Technical note DTN-68-2 Data Corporation Arlington Virginia 1967
31 L J HOFFMAN
The formulary model for access control and privacy in computer systems
Report 117 Stanford Linear Accelerator Center Stanford California 1970
32 A N HABERMANN
Prevention of system deadlocks
Communications of the ACM Vol 12 No 7 p 373 1969
33 P A CASTLEMAN
User-defined syntax in a general information storage and retrieval system
in Information Retrieval The User's Viewpoint An Aid to Design International Information Inc 1967

Integrated municipal information systems: Benefits for
cities-Requirements for vendors
by STEVEN E. GOTTLIEB
BASYS, Inc.
Wichita Falls, Texas

INTRODUCTION
The Fall Joint Computer Conference's call for papers
this year says that, "The scope of the conference will
encompass the entire information processing field."
It goes on to say that the primary theme is "the use of
computers to improve the quality of life."
So many of the new projects we as individuals or as nations undertake today are purported to be activities which, if accomplished, will improve the quality of life.
If I appear to be suggesting that this phrase is a little
overworked, then, indeed, I have made my point. The
problem, however, is not with the phrase itself, but
rather with the broad, all encompassing meaning the
user frequently wishes to imply when he cannot ascribe
specific benefits to the project in which he is engaged.
I would, therefore, like to describe for you a few of the
direct and indirect benefits which I see accruing from
the USAC program and specifically from the Wichita
Falls project to create an Integrated Municipal Information System. Before doing that, however, let me
address the goal toward which we in Wichita Falls
are striving.

THE DEFINITION OF AN IMIS UNDERSCORES THE GOAL OF THE WICHITA FALLS PROJECT

The USAC program is aimed at cities whose population is between 50,000 and 500,000. The Wichita Falls project is one of only two aimed at the development of what is called a total Integrated Municipal Information System (IMIS).
Total-In this context, means that the system considers all aspects of municipal activity and includes such diverse things as:
• The maintenance of criminal history information on previously convicted law breakers or of health or welfare records for those persons receiving treatment or aid
• The generation and posting of utility bills
• The posting of the general and subsidiary ledgers
• The maintenance of land use records
• The development and maintenance of the voter roll
Integrated-Refers to the development of a unified, multi-functional data base; that is, a data base shared by all the generators and authorized users of data within the municipal government. Frequently in cities, as in any large complex organization, there is a multiplicity of requirements for the same data. Too frequently, however, these requirements are satisfied by each user independently collecting and storing the data for himself. The tax assessor, the fire department, and the building inspector, for example, all require similar information about buildings including such things as:
• Address
• Dimensions
• Construction type
• Number of access ways, etc.
There appears to be no reason why this data cannot be collected by only one of these departments on a single inspection and made available to the others. Integrated, in the context of this system, then really includes not only the development of a unified, multi-functional data base, but also the development of unified multipurpose data collection and dissemination methods.
Municipal-Implies that the system concerns itself only with what is carried on by the city government. In this case, however, that's not an adequate definition. The city, though in many ways seemingly autonomous, has many interrelationships with other institutions. A city, for example, must interact with and be responsive to the needs of both the county and state in which it is
located. Additionally, cities must also interface with
outside organizations or special districts including:
•
•
•
•
•

Independent school boards
Water districts
Citizen or civic organizations
Councils of Governments
Economic development districts

In addition to these and others, cities must also be
responsive to the many reporting demands placed on
them by the Federal Government. Municipal is meant,
therefore, to include not only the organizational units
internal to the city government but all organizations
with which the municipality must interface.
Information-Again, the scope of the word is broad
and refers not only to the traditional information
requirements of top management but perhaps more
importantly to the information requirements of both
middle management and those who carry out the day to
day activities in which a city is engaged.
System-This includes the aggregation into a single
functioning mechanism of not only all the pieces to
which I have referred but also one other key element.
That is, most cities (in the 50-500,000 population
range), unlike many other large complex organizations,
do not really have a large number of single process (payroll and billing type) applications suitable for computerization. Those cities rather have many multiple process
applications. There are many departments which need
and process considerable information; that is, they act
on a large number of information categories but they
generally do so relatively infrequently. In Wichita
Falls, a city of only 100,000, we have, for example,
identified well over 6000 individual data elements
which are used in multiple combinations and must be
kept readily accessible. We have not, however, been
able to identify any single transaction type beyond conventional utility collection which could be considered to
have a high transaction rate. This imposes a few unique
design considerations in that no one application's file
requirements clearly dominate the data base design.
To use their information effectively, cities frequently
have to have a large number of people who "massage"
the information, putting it in a form useful for managerial decisions ranging from "what are my budget
requirements for next year" to "which of the traffic
signals should have preventive maintenance performed." The system we are building includes the first
step toward the solution of this problem by making
more effective use of the city's valuable personnel
resources. This will be accomplished by allowing the
computer to make some, and perhaps many, of the
routine decisions which too frequently occupy so much

of a manager's time. By way of example of such decisions, consider the time spent in determining:
• Which properties in the city should be reappraised
• Which vehicles and equipment now require preventive maintenance
• What is the best schedule for preventive maintenance considering skill requirements and available resources
• Which persons should be sent notifications of their
failure to pay tickets or their need to come in for
an additional medical examination
With this perspective in mind, the goal of the
Wichita Falls IMIS can be stated as the design, development, and implementation of a multi-functional data
acquisition and storage system which is capable of
making routine decisions required during the daily
operation of the municipality. Toward this end, we in
Wichita Falls have made considerable progress.
TWO PHASES OF THE WICHITA FALLS
PROJECT HAVE BEEN COMPLETED
At this time our project is one-third over and we
have completed both the Analysis and Conceptualization Phases and are well into the Design Phase. As part
of an early demonstration to prove the feasibility of
building and using an integrated data base in a municipal environment, we will by November 1971 have
implemented both an automated purchase order and
vendor performance application, as well as an automated tax assessment update application.
TRANSFERABILITY IS A KEY ASPECT
OF THE USAC PROGRAM
Transferability is the principal justification for the
Federal Government's funding the IMIS development.
Transferability may be simply defined as the expectation that the system, or at least significant parts of it,
will be usable by other cities.
Having progressed to the point we are today, I can
say with assurance that there are worthwhile products
already derived from the Wichita Falls project which
in fact are transferable and could be used by other
cities. These products can be viewed as belonging to
one of three categories:

• Concepts-The new ideas which have been developed through the project
• Methodology-The techniques which enabled us to
develop our programs or concepts
• Programs-The actual computer programs that
result from the Development Phase
A point to be made, however, is that the closer we get (on the above scale of products) to computer programs, the more difficult the transfer and the lower the probability of direct transfer to another city, while the closer we stay to concepts, the higher the transferability is expected to be.
With the recognition, therefore, that one of the key
issues in this project is the development of products
which are transferable, let me discuss the relationship
to other cities of two major products which have already
been developed:
• The Analysis Phase documentation
• The Conceptualization methodology
The documentation of the Analysis Phase of the
Wichita Falls project, which is over 6500 pages and
required some ten man-years to compile and produce,
for the first time provided a comprehensive view of all
the activities which go into making a "real" city work.
It provides, to prospective city managers and other
students of public administration, an additional
perspective on the complexity of a successfully operating
municipal government.
Additionally, the documentation provided Department Heads and Division Directors in the City of
Wichita Falls an opportunity to more fully examine the
operations of their organizational unit. Further, it
served as a "before" snapshot of municipal operations
to be compared with an "after" shot from the resulting
design, development, and implementation documentation. This comparison will enable costs to be compared
between some of the old and new systems. The final,
but by no means least significant, benefit to be derived
from the analysis is the fact that there is now a document available which another city can use as a basis on
which it can perform a far less costly analysis of its
own activities. Specifically, another city need only prepare exception documentation for those activities which it provides and which are significantly different from those in Wichita Falls.
The methodology we developed in the Conceptualization Phase is referred to as the Top-Down Bottom-Up Approach. This methodology enabled us to formulate a general system design which, from what we have been able to determine so far, should be transferable to a large number of other cities.
The approach was as follows. Consider that the purpose of a city, simply stated, is to provide public services desired or demanded by its citizens or society. These services or functions, of which there are many,
include such things as:
• Protecting the people from those who break the law
• Providing water and sanitary services
• Providing for the transportation of people and
goods
The total of these functions can be broadly grouped into
four sectors (originally defined by USAC as subsystems) :
• Public Safety
• Human Resources Development
• Public Finance
• Physical and Economic Development

The functions themselves can be divided into components, and the components into applications in a
typical hierarchical fashion similar to the organization
of most municipal governments. This Top-Down
hierarchy provides the overall functional framework
which is to be served by the information system. To
structure the information system, however, one must
also look at the information requirements of each of the
lowest level applications, including such things as:
• Maintaining traffic signals
• Determining if a given vehicle or person is stolen
or wanted
• Putting out fires
• Processing platting changes
• Issuing parade licenses
If one then considers those applications which have
common information characteristics one can begin to
see the data exchange necessary for effective data
sharing.
In the Wichita Falls project we aggregated those
operational applications with what appeared to be the
highest number of common information characteristics
into what we called a DISC, a Decision Information
Set Center. A DISC is a hypothetical module containing
many processes, all related to the same or similar data.
The DISC provides a convenient means of simultaneously considering the data and logical file requirements of many informationally related applications
(Figure 1). It was found that DISCs could be defined
in three levels:

• Operational
• Operational/Analytical
• Analytical

Figure 1-The hierarchy from USAC sectors (Public Safety, Human Resources Development, Public Finance, Physical and Economic Development) through functions, components, and applications to the analytical, operational/analytical, and operational DISCs and their associated subsystems

The lowest level or operational DISCs provide the
input to more analytical applications which could be
aggregated into the higher level operational!analytical
DISCs. Further, the output of these DISCs could in
turn be aggregated into DISCs of exclusively analytical
applications such as those associated with annual
resource allocations or comprehensive planning. It was
further found that the output of these analytical DISCs
provided feedback to the first level operational DISCs,
thus establishing essentially closed systems. These
closed systems are, in effect, subsystems of the total
IMIS. For the IMIS postulated in Wichita Falls, ten
such subsystems were identified:
• Public Finance
• Disaster Control
• Law Enforcement
• Urban Development
• Urban Transportation
• Urban Environment
• Welfare
• Education
• Health
• Human Development

The significance of the subsystem is NOT that there
are ten, since a different aggregation of the operational
applications could change the number of subsystems
slightly, but rather that a city can be viewed, informationally, as being comprised of a number of subsystems
with associated data flows.
From the viewpoint of transferability, the Wichita
Falls project's Conceptualization Phase resulted in
what we feel to have been a success. We have a concept
and a methodology which are clearly transferable, but
we also have a product, a general systems design which
is also felt to be transferable, though clearly, modification will be necessary to accommodate, among other
things, the variation in services provided by other cities.
THE DESIGN PHASE HAS PROVIDED NEW
AREAS OF STUDY FOR SOFTWARE
VENDORS
The Design Phase in which we are now heavily
involved is beginning to provide some interesting
problems for further study. Because we have not yet
progressed far enough in the phase to discuss its products, I have chosen to mention a basic design philosophy and comment on some areas in which I believe
the computer industry should provide some additional
guidance.
Because of the complexity of the system to be built,
and the desire to minimize data redundancy, while at
the same time providing the data to multiple users, we
chose to develop an integrated data base. In order to
assist in the management of this data base, we further
chose to implement a vendor supplied data base
management system. The efforts of our project staff
were then divided into three main areas:
• Data Base Design
• Application Design
• Application Programming (this is actually part of
the Development Phase)
Clearly, none of these areas can be considered in a vacuum, nor do we really split the staff into three
distinct groups. For the purpose of our discussion,
however, it is appropriate to consider the three groups
as operating independently but toward a common
objective.
Toward that end then, the application design teams
provide programming specifications and detailed flow
charts to the programmer group while concurrently
providing data requirements to the data base group
which designs the base by establishing the files and
associated linkages. This over-simplification thus
enables me to separate out and address only the data
base portion of the Design Phase.
Initially, when we began the phase, I was surprised
to find how little was generally known about data base
management. Obviously, part of the problem lies in the
fact that it is a relatively new field in which few people
have had any experience.
What is particularly unfortunate, however, is not
how little is known about data base management, but
rather how little is known and how little effort goes into
understanding the functions of cities (a prime sales
target for computer and software vendors). All too
often I was told how a given system could handle any
data base problem the City might have. The substantiation was based on the fact that the system had
just been implemented or was to be implemented by
Company XYZ which, I was told, was bigger than
Wichita Falls and had bigger files to work with. I
have no doubt that most existing data base and file
management systems are capable of handling relatively
large files. I have considerable doubt, however, about
their ability to efficiently process multiple, extensively
linked files with individually low transaction rates and
with the high number of different processes found in
cities. What we have found thus far in our project
(and the results are available to each of you) is that
USAC-sized cities do in fact have unique data processing problems. It does not appear that these problems
are necessarily more complex than those associated
with industry. They are merely different.
I suggest to you, therefore, that you have in cities a
major and virtually untapped market place, one which
you can and should serve efficiently by studying their
needs and solving their problems. Do this not by looking
at cities as non-profit companies whose needs can be
served by the old products and methods developed in
the past, but because of their importance and because,
in fact, their problems are different, by a fresh examination and by the development of products designed
to meet their unique needs.

Geocoding techniques developed by the census use study
by CABY C. SMITH and MARVIN S. WHITE, JR.
U.S. Census Bureau
Los Angeles, California

HISTORICAL PERSPECTIVE

The Census Use Study (CUS) was established in September 1966 in New Haven, Connecticut. The emphasis of the study is on small area data, i.e., data relevant to areas smaller than a city. The CUS took the lead in geocoding by developing the DIME file and ADMATCH, an address matching system. The CUS at New Haven also developed the Health Information System to provide health planners with powerful statistical tools. In July, 1969, the Southern California Regional Information Study (SCRIS) was established in the Los Angeles area in order to transfer the experience gained at New Haven to a large urban area. SCRIS is continuing the geocoding work begun at New Haven and has embarked into other fields, while continuing to improve existing computer programs. For instance, SCRIS has produced an IBM 360/OS and an RCA SPECTRA 70 version of ADMATCH. A computer mapping system was also developed at SCRIS from the basic research activities carried on at New Haven. Investigations have begun on several other fronts including an extension of a fallout shelter study, the Summary Tape Retrieval Information Processor (STRIP), a generalized file matching system, and several special purpose tools related to small area data and geocoding. In addition, the New Haven CUS and SCRIS have produced a series of publications on these and other related topics for public information. SCRIS is producing a series of transportation related publications, which should be of interest to most transportation planners.

GEOCODING1,2

By geocoding we mean the process of attaching relevant geographic codes to data which has some less useful geographic codes. Usually this means converting street address to some area code, e.g., census tract through ADMATCH, but may involve more complicated tabulations, such as from street intersections to police precinct.

A wealth of data, useful for planners, is scattered in existing files, e.g., Assessor's files, welfare files, and motor vehicle registrations. Accessing the data is often difficult or impossible due to lack of useful geographic codes, confidentiality rules, etc. One can often surmount these barriers through geocoding, usually ADMATCHing the data file to determine census tract and block from street address.

An agency responsible for confidential information about individuals is frequently willing to release summaries by some large enough geographic area. The Bureau of the Census is a good example of such an agency. Census data on individuals is confidential and suppressed, but tabulations by block group, census tract and other geographic areas are released. The SCRIS staff has successfully geocoded a number of such files including welfare and building permit files.

Often an agency will release tabulations of data by a geographic area important to that agency but not to other users. For example, welfare departments may release tabulations by welfare district, which probably do not conform to other statistical or planning zones. Worse, welfare district boundaries may change rapidly as caseloads vary and thus destroy historical continuity and render analysis by other zones prohibitively expensive. Provided street address is available, geocoding easily solves these problems.

ADMATCH1,3

ADMATCH was designed to perform this type of geocoding, obtaining some geocode like census tract from street address. ADMATCH operates by linking two logically connected files, a data file and a reference file (see Figure 1). The data file contains a street address (or range of addresses) and the interesting data; the reference file, a geographic base file, contains street address and the corresponding geocodes. ADMATCH has rendered a great deal of otherwise inaccessible data easily accessible.

Figure 1-ADMATCH system overview (the data file and the reference file each pass through a preprocessor program and a sorting process before the matcher program produces the geocoded data file)
ADMATCH performs this file linkage in two phases.
The Preprocessor analyzes a character string address

according to syntax and keywords specified by the user
and creates a standardized version of the address called
the "match key." The Matcher compares the data
record match key to all the reference file records with
the same street name and selects the best match. The
best match is determined according to a weighting
scheme determined by the user. Reliable performance
and cost estimates for the IBM 360/DOS ADMATCH
are available from CUS Report No. 14, Geocoding with
ADMATCH: A Los Angeles Experience. The OS
ADMATCH is much more efficient than the DOS
version, but specific cost benchmarks are not yet
available.
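A toy Python sketch of the two phases just described may make them concrete; the abbreviation table, the scoring weights, and the record layout are all invented for this illustration and are not ADMATCH's actual keyword tables or weighting scheme.

# Hypothetical sketch of ADMATCH-style matching: standardize the address
# into a match key, then score it against reference records for the same
# street and keep the best match.
ABBREV = {"st": "street", "ave": "avenue", "blvd": "boulevard"}

def match_key(address):
    # Preprocessor: reduce an address string to (house number, street name).
    words = address.lower().replace(".", "").split()
    number = words[0] if words and words[0].isdigit() else None
    street = " ".join(ABBREV.get(w, w) for w in words[1:])
    return number, street

def best_match(data_address, reference_records):
    # Matcher: weight the comparisons and select the highest-scoring record.
    number, street = match_key(data_address)
    best, best_score = None, 0
    for rec in reference_records:     # rec: {"street", "low", "high", "tract"}
        if rec["street"] != street:
            continue
        score = 1                                     # street name agrees
        if number and rec["low"] <= int(number) <= rec["high"]:
            score += 2                                # house number in range
        if score > best_score:
            best, best_score = rec, score
    return best    # its geocodes (e.g., census tract) are then copied to the data record

print(best_match("123 Main St.",
                 [{"street": "main street", "low": 100, "high": 198, "tract": "42A"}]))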
GEOGRAPHIC BASE FILES
The reference file required by ADMATCH is one of
a class of files called Geographic Base Files (GBF).
Since urban planning is so extensively related to
geography, close attention should be paid to GBFs
and their multitude of applications. A GBF is minimally
an extended correspondence table for two or more
geographic codes. However, a GBF may be much more
complicated, viz., it may reflect any geography related
structure or information. For example, street network
and land use information may be contained in a sufficiently finely resolved GBF.
There are a multitude of applications for GBFs. A
very important application, providing linkage between
otherwise incommensurate data, was already discussed.
A GBF can serve as a geographic base for information
systems in a number of ways. First, the desired information (e.g., street or area maintenance information)
can be coded directly into the GBF. Or a GBF could
act as an index to another file or a series of files, i.e., the
GBF might contain pointers to other files. The Address
Coding Guide (ACG) is a GBF in wide-spread use. Each
record in the ACG represents one block face and contains address range information and geocodes like
census tract, county and place.
The ACG was developed by the Bureau of the
Census in part to facilitate the mail-out mail-in census
but more importantly for us to provide a GBF with
nation-wide standards. The Metropolitan Map Series
(MMS), which serves as a source for ACG coding, was
the first step in standardization on a nation-wide basis.
Metro Maps are produced by the Census Bureau for
each Standard Metropolitan Statistical Area (SMSA)
in the United States.
DIME4,2,5
The Dual Independent Map Encoding (DIME)
file is the most complete type of GBF developed to date.
The DIME concept was developed and first implemented at the New Haven CUS. Each record in the
DIME file represents a street segment bounded by a
node at each end. Nodes are placed on a map (Metropolitan Map Series) at each interesting place, i.e., each
street intersection, each intersection of a street with an
important nonstreet feature like railroad crossings, and
curves. Furthermore, the DIME file contains the
coordinates of each node. The node and coordinate
information renders the DIME file extraordinarily
useful. The DIME file has all the applications of any
GBF and a great deal more than most. In fact, as far as
network related geographic features are concerned, the
DIME file has the ultimate form for a complete GBF.
It can contain all relevant information, provided it is
finely enough resolved. For example, the DIME file
contains geocodes for both left and right sides of the
segment, such as block number, and may also contain
street usage codes.
Some of the most interesting applications for the
DIME file are transportation related. Transportation
planners routinely perform network and node analyses
on traffic flow. The DIME file in conjunction with
DAM (DIME Aggregation Manager), a system to
abstract higher level networks, can be used to analyze
traffic networks at any desired level of detail. SCRIS is
presently investigating the possibility of producing
from the DIME file, a network file for input directly to
the Bureau of Public Roads Urban Transportation
Planning system. Other transportation related applications of the DIME file are automated routing of busses,
etc., either real time or batch, traffic flow modeling, and
intersection or node related studies. Nearly all police
and traffic safety records are coded to intersection, like
Hollywood and Vine. An intersection file can be constructed from the DIME file.
Since the DIME file contains coordinates for each
node, a number of other applications are possible.
Using the DIME file as a base for computer mapping is
one of the more exciting applications. In fact, the CUS
staff at both New Haven and SCRIS have mapped
DIME files themselves for editing purposes. More will
be said below about computer mapping. Areas and
centroids can easily be calculated for areas bounded by
streets, using the DIME file. The CUS is studying the
feasibility, in terms of cost, of further DIME applications.
Two additional features of the DIME file should be
noted. The DIME file offers special advantages for
editing and updating. These boil down to the fact that
the DIME file is an interconnected network of records
and that these records are unequivocally tied down to
specific lines on a map.
Since DIME records are tied to maps, records may be referenced with only the map as a source. Thus, the
clerical step of finding a serial number or other key in a
listing is eliminated. A significant source of errors and
cost is also eliminated. In the case of an ACG file, for
instance, a clerk must resolve ambiguities that arise
when the same street name occurs on two faces of the
same block by scanning a listing.
The fact that DIME records form a connected link
network means that topological edits may be performed. For example, DIME records may be chained
around a block or census tract. Boundaries that do not
close are flagged as errors and corrected. Since nodes
occur at every intersection, no two segments represented by DIME records should intersect. Whether
two segments intersect is easily determined by a vector
cross product calculation. Thus, nodes with incorrect
coordinates can be located by noting that segments
containing them intersect with others.
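The cross product test mentioned above can be sketched in a few lines of Python; the function names are invented, and collinear (merely touching) cases are ignored for simplicity.

# Hypothetical sketch of the topological edit: two segments taken from DIME
# records should not properly cross; a crossing flags suspect coordinates.
def cross(o, a, b):
    # z-component of the cross product (a - o) x (b - o)
    return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])

def segments_cross(p1, p2, q1, q2):
    d1 = cross(q1, q2, p1)
    d2 = cross(q1, q2, p2)
    d3 = cross(p1, p2, q1)
    d4 = cross(p1, p2, q2)
    return ((d1 > 0) != (d2 > 0)) and ((d3 > 0) != (d4 > 0))

print(segments_cross((0, 0), (2, 2), (0, 2), (2, 0)))   # True: flagged for correction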
COMPUTERIZED RESOURCE ALLOCATION
MODEL (CRAM)5,6
We have seen a number of applications for GBFs in
general and the DIME file in particular. The most
sophisticated application being developed by the CUS
is the Computerized Resource Allocation Model
(CRAM). Basically, CRAM is a generalized system for
determining the service areas for a set of facility
locations. In its most general form, service areas are
constrained by facility capacities and travel times.
CRAM is a refinement and extension of the NAPS
(Network Allocation of Population to Shelter) system
developed by System Development Corporation in
connection with a CUS contract with the Office of
Civil Defense.
The system uses a DIME file as its network base and
in addition requires demand and facility capacity
inputs. The problems which can be attacked through
CRAM run from simple fixed source districting
problems such as school district boundaries, park
planning, and site location to much more complicated
problems of emergency vehicle routing, delivery or bus
route planning and freeway location studies.
CRAM uses a modified version of the Moore Shortest
Path Algorithm to do its geographic analysis. The
Moore technique (not CRAM's version) finds shortest
paths between points using a point-to-point incidence
matrix (with associated distances or costs) as its road
map. In this technique, the network is viewed as a set
of interconnected nodes, the connections being links.
The DIME file, however, directly represents a set of
interconnected links, the connections being nodes.
The DIME file can be modified to fit the Moore technique easily enough, but there are advantages to
modifying the technique to fit the DIME structure.
The advantage in modifying the technique arises
mainly in the allocation of demand. Demand on a
facility is generally the number of persons desiring to
use the facility. Information about numbers of persons
usually relates to blocks or block groups-not nodes or
links. However, the disaggregation of persons to links
is much more reliable than disaggregation to nodes.
This is true because links have length and the disaggregation may be varied according to length, but not
for nodes.
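The general shape of such a shortest-path computation over a link list can be suggested with the following Python sketch; it is a standard label-setting search, not the Moore algorithm as modified for CRAM, and the link tuples and distances are invented:

# A minimal shortest-path sketch over a DIME-style link list (not the CRAM or
# Moore code itself; the link tuples and field layout are illustrative assumptions).
import heapq
from collections import defaultdict

# Each link: (from_node, to_node, length); DIME segments carry both end nodes.
links = [(1, 2, 300.0), (2, 3, 250.0), (1, 4, 400.0), (4, 3, 150.0), (3, 5, 500.0)]

adjacency = defaultdict(list)
for a, b, d in links:
    adjacency[a].append((b, d))   # treat links as two-way streets here
    adjacency[b].append((a, d))

def shortest_paths(source):
    # Label-setting search: repeatedly settle the closest unsettled node.
    dist = {source: 0.0}
    heap = [(0.0, source)]
    while heap:
        d, node = heapq.heappop(heap)
        if d > dist.get(node, float("inf")):
            continue
        for nxt, length in adjacency[node]:
            nd = d + length
            if nd < dist.get(nxt, float("inf")):
                dist[nxt] = nd
                heapq.heappush(heap, (nd, nxt))
    return dist

print(shortest_paths(1))   # travel distance from node 1 to every reachable node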
UNIMATCH7
The applications for GBFs and file linkage in general
are so extensive that SCRIS has begun the development
of a generalized file linkage system-UNIMATCH.
UNIMATCH is structurally similar to ADMATCH in
that it consists of a STANDARDIZER and a
MATCHER. However, UNIMATCH will not be
limited to matching street addresses. Instead, street
intersections, major traffic generators or any logical
connection may serve to link two files. This generality
is achieved by allowing the user to specify what fields
to compare, what comparisons to make (e.g., character,
numeric or parity comparisons), what significance to
attach to a success or failure to compare and finally
what action to take depending on the level of success of
the comparison. That action might be further comparisons or the copying of selected fields from one file to
another. The impetus for designing and implementing
UNIMATCH comes from transportation planning
needs. Originally, the system was to be named TRAM
(Transportation Related Address Matcher). Great
efforts were made to name the system TROLLEY, but
no suitable words to fit the acronym made themselves
obvious.
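The flavor of such user-specified comparison rules can be suggested with a small Python sketch; the field names, comparison routines, weights, and threshold below are invented for illustration and are not the UNIMATCH specification language:

# A toy record-linkage sketch in the spirit described above (UNIMATCH itself is
# a STANDARDIZER plus MATCHER; the rule format and weights here are assumptions).

def char_cmp(a, b):
    return str(a).strip().upper() == str(b).strip().upper()

def num_cmp(a, b):
    return float(a) == float(b)

def parity_cmp(a, b):
    # e.g., an address number and a block-face address range agree in odd/even parity
    return int(a) % 2 == int(b) % 2

# Each rule: (field in file A, field in file B, comparison, weight on success)
rules = [("street", "street_name", char_cmp, 3),
         ("tract", "census_tract", num_cmp, 2),
         ("house_no", "low_address", parity_cmp, 1)]

def match_score(rec_a, rec_b):
    return sum(w for fa, fb, cmp_fn, w in rules if cmp_fn(rec_a[fa], rec_b[fb]))

a = {"street": "Vine", "tract": 1942, "house_no": 6253}
b = {"street_name": "VINE", "census_tract": 1942, "low_address": 6201}
score = match_score(a, b)
print(score, "-> link records" if score >= 5 else "-> refer to clerk")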
COMPUTER MAPPING8,9
The CUS has investigated a variety of available
computer mapping programs and has developed a
mapping system-GRIDS (Grid Related Information
Display System) at SCRIS to meet needs that were not
being met. The existing computer mapping systems
required quite a bit of programming ability, a good deal
of data preparation, and in many cases required large
and expensive computing facilities. For these reasons,
mapping of data was left mainly in the hands of
draftsmen. For these same reasons, GRIDS, which is a
system for producing inexpensive printer maps quickly
and easily, was created.

GRIDS is an especially good analytical tool for
planners. Voluminous data is much more easily assimilated in map form and GRIDS handles finely
resolved data especially well. Also, each map costs
only $5 to $10. Furthermore, GRIDS will run on nearly
any system with a FORTRAN compiler-it has run on an
IBM 360 model 30 with 32K bytes of storage.
The finest unit of resolution for GRIDS is one grid
cell. The system produces maps by dividing the area to
be mapped into a network of rectangular grid cells. A
grid cell may be as small as one printed character or as
large as 55 X 55 characters.
GRIDS is very flexible both in digesting input data
and producing maps. There are three types of maps
available: (1) shaded, in which data values are represented by overprinted characters of varying darkness;
(2) density, in which a character is splattered randomly
throughout each grid cell with more characters representing higher values; (3) value, in which the actual
data value is printed in each cell. Figure 2 is an example
of a GRIDS shaded map.
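A toy version of the shaded-map idea, in Python rather than FORTRAN, may make the mechanism concrete; the sample observations and shading characters are invented and this is not the GRIDS program itself:

# A small printer-map sketch of the "shaded" idea: observations are summed into
# rectangular grid cells and each cell is printed with a character whose darkness
# reflects the cell value.

# Observations: (x, y, value), e.g., population counts with coordinates.
points = [(0.1, 0.9, 5), (0.2, 0.8, 40), (0.8, 0.2, 300),
          (0.5, 0.5, 120), (0.55, 0.45, 90), (0.9, 0.1, 700)]

ROWS, COLS = 5, 10
grid = [[0.0] * COLS for _ in range(ROWS)]
for x, y, v in points:
    col = min(int(x * COLS), COLS - 1)
    row = min(int((1.0 - y) * ROWS), ROWS - 1)   # row 0 is the top of the map
    grid[row][col] += v                          # sum values falling in each cell

shades = " .:+*#"                                # darker character = larger value
top = max(max(row) for row in grid) or 1.0
for row in grid:
    print("".join(shades[int(v / top * (len(shades) - 1))] for v in row))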
GRIDS accepts as many as 8 input variables and
2 coordinates to be mapped, provides for manipulation
of the data in any desired way, and produces up to five
different maps for each run. No previous preparation of
the data is necessary and no knowledge of the data
values is necessary. GRIDS provides for data manipulation through MAPTRAN, a built-in FORTRAN-like programming language, or through a user exit routine if necessary. GRIDS is especially easy to use because all specifications are free-format keyword type, and there are many default values for the user with simple needs.
GRIDS has many important applications. Because
GRIDS is inexpensive and flexible, it is a very good
analytical tool. GRIDS may also be used as a prelude
to more cosmetic and thus more expensive computer
mapping techniques.
The CUS has also used, and is beginning a further
study of, the Geospace Plotter. This plotter is a Cathode
Ray Tube Photographic device with very good resolution and a selection of 32 intensity levels. One may
select either 100 or 200 dots per inch resolution, with
each dot addressable. Geospace plotter maps approach
the quality of maps produced by a draftsman. These
maps are excellent for public display and published
reports, e.g., to decision makers at all levels.
The software supplied by the Geospace Corporation
called ALPACA is very similar to software for pen
plotters. SCRIS is presently improving the efficiency of
ALPACA, particularly for mapping applications.
SCRIS is also developing a mapping system, which will
be particularly useful with the DIME file. Figure 3, a
Geospace Plotter map of New Haven using the DIME

file for the street network base, is a sample of the quality obtainable through these techniques.

Figure 2-Sample GRIDS output using the shaded option (a shaded map of population density per square mile, Los Angeles County, 1970). The legend displays each shade used, the corresponding data value range and frequency (count of grid cells) for each shade

Figure 3-Geospace plotter shading map (Taken from Census Use Study Report No. 2, p. 41)

HEALTH INFORMATION SYSTEM10,11

The Health Information System (HIS) was developed in New Haven by the CUS. Many of the
computer tools mentioned above are incorporated into
the HIS. Initially, the HIS concentrated on maternal
and child care but has now become a more general
information system for many health related fields, such
as social pathology and health care delivery systems.
The purpose of the system is to pinpoint geographically
those neighborhoods where there is a significant health
risk and define the characteristics of the population to
provide health planners with analytic tools for approaching the solution of health problems. The HIS
gathers data from a variety of sources and analyzes
that data through advanced statistical techniques.
Naturally, HIS techniques may be transferred to other
fields such as education and crime.
Briefly, the analysis proceeds as follows. Nearly 300 data items are collected from such sources as the First Count Summary Tapes, Vital Records, Mental Health files, and the Mental Retardation Register. These data are linked through ADMATCH and summarized by block group. Correlational, factor, and further multivariate analyses are performed on the data to produce a few constructs or typologies. A typology is a synthesis of many items, ordered logically by their contribution to the whole. The resulting typologies themselves and maps displaying which block groups are typical for a given typology (i.e., which block groups rank in the top quartile among all the block groups with respect to this typology) are extremely useful to planners.

The CUS plans to also perform cluster analyses on the 1970 data base to obtain a more homogeneous grouping of neighborhoods. A time-series analysis of the 1967-1970 data will also be carried out. The emphasis will be on comparing changes in configurations of data items rather than just one-dimensional comparisons. For example, illegitimacy is correlated with a number of other variables including family disorganization, high welfare rolls, poor health of mother and child, and low socio-economic status. This time-series analysis will consider not only illegitimacy but the configuration of variables associated statistically with illegitimacy.

The HIS is now being expanded to Los Angeles through SCRIS and UCLA. The HIS methodology is already being used in Nebraska and Iowa by the Comprehensive Health Planning Council there. The HIS is partially implemented in a dozen places around the U.S.

PRESENT ACTIVITIES12
SCRIS is presently engaged in a wide array of special
purpose activities to demonstrate many data processing
capabilities to the planning community and to investigate the effectiveness-cost relationship of these
activities. Two of these, the Summary Tape Retrieval
Information Processor (STRIP) and SCRIS Report
No.5, will serve to indicate the nature of the investigations.
STRIP consists of several related programs to select
Census Summary Tape records by geographic code and
further select certain data items and produce tabulations of these items. STRIP operates in two phases.
The first phase performs the selections and reformats
the data to binary for more efficient processing later.
Naturally, these intermediate data sets are available to
the user for his own processing. The second phase
produces set reports as specified by the user.

Figure 4-Age-sex pyramid for Pasadena, California (Taken from SCRIS Report No. 5)

SCRIS Report No.5, 1970 Census Data: Characteristics of Cities and Unincorporated Places, is a
demonstration of how useful census data can be accessed and displayed. The programs used to produce the
report are being packaged for distribution. The report
contains tabulations of certain demographic characteristics by place. These include white population,
Negro population, rents and home values. Age-sex
pyramids were also tabulated for each place. An age-sex
pyramid is a two-way graph, percent male increasing
to the left from the middle, percent female increasing
to the right, and age increasing vertically. Age-sex pyramids are used extensively by planners as they
indicate information on the age and flavor of a community.
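A few lines of Python suggest how such a pyramid is laid out on a printer; the age groups and percentages below are invented, not the Pasadena figures shown in Figure 4:

# A minimal age-sex pyramid printout in the layout described above:
# percent male grows leftward from the center, percent female grows rightward.

pyramid = [("75+",      3.2, 6.1),
           ("65-74",    6.0, 7.8),
           ("45-64",   11.5, 12.3),
           ("25-44",   12.8, 13.1),
           ("UNDER 25", 13.9, 13.3)]   # (age group, percent male, percent female)

WIDTH = 20   # printed character positions reserved on each side
for group, male, female in pyramid:
    left = ("X" * int(male)).rjust(WIDTH)      # males grow leftward from center
    right = ("X" * int(female)).ljust(WIDTH)   # females grow rightward
    print(f"{left} {group:>8} {right}")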
A careful examination of Figures 4 and 5, a sample
from SCRIS Report No.5, reveals a great deal about
Pasadena. Unfortunately, there is no information
whether the ladies over 75 are little or not.
Figure 5-Selected 1970 Census data tabulations for Pasadena, California (Taken from SCRIS Report No. 5)

SUMMARY

The CUS is involved in a collage of data processing activity, all related to small area data, and much of it
connected to geocoding and geographic analysis. A
great deal of work has been done to provide planners
with tools needed to analyze local and census data.
ADMATCH has unlocked much information;
UNIMATCH promises to unlock considerably more.
The DIME file is the basis not only for file linkage
through ADMATCH but also for computer mapping
and quite sophisticated analyses and modeling such as
CRAM.
We conduct research at the CUS in close conjunction
with planners and others actually using the data to
make the results of our research immediately relevant
to the needs of users. CUS research serves as a foundation for further research both by CUS and others.

REFERENCES
1--
ADMATCH users manual
US Bureau of the Census
Census Use Study Washington DC 1970
2 J P CURRY G FARNSWORTH
The DIME geocoding system
US Bureau of the Census
Census Use Study Report No 4 Washington DC 1970
3 M JARO
Geocoding with ADMATCH
US Bureau of the Census
Census Use Study Report No 14 Washington DC 1970
4 R CRELLIN G FARNSWORTH
ACG-DIME updating system: An interim report
SCRIS Report No 4
Los Angeles California 1970
5 G FARNSWORTH
DIME applications and the computerized resource
allocation model
American Statistical Association Joint Statistical
Meetings Detroit Michigan December 29 1970
6 G FARNSWORTH
Computerized resource allocation model
Unpublished paper SCRIS 1970
7 M JARO
Conversation concerning UNIMATCH
8-Computer mapping
US Bureau of the Census

Census Use Study Report No 2 Washington DC 1970
9 M JARO
Grid related information display system: GRIDS
To be published by the US Bureau of the Census
10 J DESHAIES
Health information system
US Bureau of the Census
Census Use Study Report No 7 Washington DC 1969
11 J DESHAIES
Conversation on recent health information system
developments
12-1970 census data: Characteristics of cities and
unincorporated places
SCRIS Report No 5 Los Angeles California 1971


URBAN COGO-A geographic-based land
information system*
by BETSY SCHUMACKER
Massachusetts Institute of Technology
Cambridge, Massachusetts

INTRODUCTION

COGO, a geometric problem solving system, has been in existence for ten years in its basic form and for four years in its more advanced form. COGO provides a command-structured language and a set of processing routines to define and describe such geometric objects as points, curves, courses, chains, and vertical profiles, and to perform geometric computations such as locations and intersections to find new point coordinates. It also includes file capabilities to enable users to gradually derive problem solutions over a period of time as well as to provide the ability for different users to try different problem solutions against the same set of data. It provides dynamic memory management and dynamic program management as a subsystem of the ICES System.

URBAN COGO expands upon ICES COGO in several ways:

• New geometric objects can be defined, namely blocks, regions, networks, and three dimensional objects such as building and street overpasses and underpasses.
• Expansion of file capabilities and the provision for hierarchies of files and subdivisions of files.
• Graphical output capabilities including both soft and hard copy displaying of objects or groups of objects with or without translation, rotation, or magnification, density mapping, selective mapping, and detailed mapping with full annotations.
• Graphical input capabilities by digitizing on a display screen or digitizing from hard copy on a flat-bed plotter-digitizer.
• Data attribute capabilities to associate textual data (e.g., land use information) with any or all of the geometric objects.
• Processing facilities for the data attributes to be able to sort, tabulate, or report on such data, and groupings and analyses to perform statistical tests on such data.

It is the author's belief that the combination of the original COGO concepts and the expansions summarized above form the basis for URBAN COGO to provide the base and the direction for urban information systems of the future.

GEOMETRIC OBJECTS

The geometric objects currently allowed in the system are

points
curves
courses
chains (or parcels)
profiles
networks
blocks
regions

Points

Points are the basic geometric unit of the system and
have absolute values associated with them. Points are
identified by number and can have x, y, and z coordinates stored as their values.

* The work reported herein was supported in part by grant no.
GK-25622X from the National Science Foundation.


STORE POINT 17 X 113.21 Y 7521.831 Z 37.512
Points can be identified and have values stored for
them by use of the STORE command (above) or by
use of any of the LOCATE or INTERSECT commands (below) or by digitizing them on a graphic
display unit or on a plotter digitizer unit.
LOCATE 512 FROM 17 DISTANCE 153.91 BEARING N 37 23 E

INTERSECT CHAIN 'A' WITH NETWORK 'C' POINT 801

Curves

Curves are also absolute objects, are identified by number, and are either circular arcs, circles, or spiraled arcs with circular arcs. They are planimetric and are defined by storing or by digitizing.

Networks

Networks are relative objects and are identified by an 8-character name. Networks consist of links which, in turn, are defined by begin and end nodes which are point numbers. Links can be singly-directional or dual-directional and can vary from network to network. Networks can be defined by storing or digitizing.

Courses

Courses are relative objects defined by the point number at each end. They are identified by a four-character name and defined by storing or digitizing.

Chains

Chains are relative objects, identified by an 8-character name, and defined by storing or digitizing. Chains consist of points, curves, and courses, either with continuous boundaries or with gaps. While chain is the general name for this object, a more specific name for some applications is parcel, which name the system recognizes as a synonym for chain. The word parcel implies the same thing as a parcel of land, which is the smallest legally recognized unit of land.

The points and curves which define a chain locate the chain in space; the sequence of the items which define it (i.e., the sequence of the points, curves, or courses) defines the topology of the chain.

STORE CHAIN 'A' POI 159 CURVE 7 POI 153 CUR 91

Blocks

Blocks are relative objects identified by an 8-character name and defined by storing. Blocks consist of chains. One example of a block is a street block which consists of parcels (chains) of land.

STORE BLOCK 'D' CHA 'B' 'A' 'F' CHA 'G'

Regions

Regions are relative objects identified by an 8-character level and an 8-character name. Any number of levels of regions may be defined and stored. A first level region consists of blocks, a second level region consists of first level regions, etc. This capability permits the user to go from very fine geometric data to very gross geometric data (in terms of size) in any way he wishes.

Object grouping

Standard parcels of land may be stored as chains, street blocks containing these parcels may be stored as blocks, census tracts containing these blocks as first level regions, counties containing these census tracts as second level regions, states containing these counties as third level regions, and the country as a fourth level region.

Users interested only in gross areas may start out with census tracts as chains, counties as blocks, and states as first level regions.

In other words, by designing the system so that the user can be very accurate at the base level, any grossness of accuracy is also possible merely by defining the base levels to be something different. Thus, in the first use above, point coordinates are stored as accurately as they can be measured by surveying and everything else has the same accuracy.

Thus, the accuracy achieved is not what the system imposes upon the user, but what the user himself imposes by his choice of base level and his own needs.
DATA ATTRIBUTES
Data attributes can be associated with any of the
geometric objects. The allowable categories of attributes and range of values for these are defined as
essentially a system setting for a particular execution of
the system. The system setting aspect of this definition
permits different groups of users to access the same geometric data base with their own associated attribute
data. Thus urban planners, transportation analysts,
city managers, utility company inspectors, etc., can all
use the same base geometric data of a city but utilize
that attribute data which is meaningful and useful
to them.
The allowable names for these attributes and the
range of values each may have permits (1) the user to
use nomenclature which is familiar to him and (2) the
system to do editing "for free" on this data while it is
being stored, updated, modified, or accessed.
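A small Python sketch suggests how such "free" editing works in principle; the category table follows Table I loosely, but the data structures and the store_text routine are assumptions made for illustration, not the system's internal format:

# Values are checked against the allowable categories at the moment they are stored.

categories = {"WARD":     {"type": int, "allowed": {4, 5, 9}},
              "BUILDING": {"type": str, "allowed": {"YES", "NO"}},
              "STORIES":  {"type": int, "allowed": None},      # any integer
              "USE":      {"type": str, "allowed": {"APARTMENT", "MERCANTILE", "MIXED"}}}

def store_text(attributes):
    for name, value in attributes.items():
        spec = categories[name]                      # unknown category -> KeyError
        if not isinstance(value, spec["type"]):
            raise ValueError(f"{name}: {value!r} has the wrong type")
        if spec["allowed"] is not None and value not in spec["allowed"]:
            raise ValueError(f"{name}: {value!r} is not an allowed value")
    return attributes                                # accepted as stored

print(store_text({"WARD": 4, "BUILDING": "YES", "USE": "MERCANTILE"}))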

Definition
The definition of allowable categories of attributes
can be performed at the beginning of each execution,
or can be defined once and stored on a file and then
utilized for any number of executions by giving the
name of the file to use at the beginning of an execution.
Figure 1 illustrates how allowable categories of data
attributes and their valid values are defined for chains
(in this case, land parcels). The allowable categories
are as shown in Table I.
Figure 2 illustrates a similar definition but this time
for links of street networks.
Both illustrations include the request to file the
allowable categories so that they can be used in future
executions merely by giving the name of the file with
the command COGO.
COGO 'UG2' 'UGOAT'
TABLE I-Allowable Attributes for Parcels

name            meaning                                range of values
STREETNO        street number of parcel                any integer value
WARD            ward in which parcel occurs            integer 4, 5 or 9
BUILDING        is there a building on the parcel      YES or NO
TYPE            type of material of the building       BRICK, WOOD, STONE, or CONCRETE
STORIES         number of stories in the building      any integer value
USE             type of use of the building            APARTMENT, MERCANTILE, etc.
MIXED           if mixed use, more detail about it     any alphanumeric value
LAND VALUE      assessed land value                    any integer
BUILDING VALUE  assessed building value                any integer
OWNER           name of owner                          any alphanumeric
YEAR            year in which assessment was made      any integer

Figure 1-Category definition for chains

Figure 2-Category definition for links

Storage
Actual attribute values for specific objects, e.g., for
specific chains, are stored through use of the STORE
TEXT command. Figure 3 shows some texts being
stored for some land parcels, the allowable categories
of which are the same as defined previously (in Figure
1). Once stored, the data attributes can be modified by
using the UPDATE TEXT command, deleted via the
DELETE TEXT command, summary listed via LIST,
or printed by PRINT TEXT.
The attributes can be used for such things as sorting,
selective tabulation, creation of statistical sample
vectors, selective displaying or plotting, display or plot
annotation, and density mapping. Such use is further
described in the respective functional sections.
Any number of categories of information may exist for
any particular object type and any number of object
types may have categories defined for them.
GRAPHICAL INPUT/OUTPUT
Graphical capabilities exist for both input, output, and identification, using a "soft-copy" device (interactive display unit) or a "hard-copy" device (flat-bed plotter/digitizer unit) or any combination thereof. The initial work on the system has separated the soft from the hard by essentially assuming that gross sketching and/or gross figure definition is performed on the soft device. These separations are not system imposed but rather decisions made as to the use of the devices dictated by the units themselves, and probably would hold true for most display and plotter units on the market today.

Hardware

The device used in developing the system is shown in Figure 4. It consists of an interactive storage tube graphic display, a keyboard, a printer, a flat-bed plotter/digitizer, and a digitize function keyboard. Probably the most unique and functionally useful thing about the unit is the fact that the whole set of basic units (i.e., display, keyboard, printer, plotter/digitizer, and function keyboard) are all parts of the same unit, i.e., all controlled by one control unit and require but a single hook-up channel to a computer, be it a telephone link to a remote computer or a direct attachment to a local dedicated computer.

Figure 3-Storing of textual attributes for chains

Figure 4-Interactive display and plotter/digitizer

Even though the design and development of the
URBAN COGO system and the graphics part in
particular has been done using the specific device
described above, the system has been designed to
permit the use of any such hardware as long as it can be
driven in a functionally similar manner. Merely using a COGO system setting command, together with the appropriate interface programs to physically drive the units, achieves compatibility with other plotters and interactive displays.
SET SYSTEM UNITP 0.001 PLOTX 30.0 PLOTY 24.0

The set system command can also be used to limit the
size of the plot (as above) or of the display, thus permitting different plotter table sizes, different storage
tube sizes, and different physical paper sizes on the
plotter. The UNITP parameter tells the system what
the size of a plotter unit is, in this case, .001 inch.

Graphic input

Inputting of information from the digitizer (flat-bed digitizer) is most powerful.

DIGITIZE POINT n SCALE {length | length PER value} ORIENTATION direction DATUM {X value Y value | POINT m}

The digitize command puts the user into digitize mode and sets up basic values to be used in the ensuing storing of the digitized points. n is the point number at which the system is to begin storing points. The SCALE parameters define the scaling to be performed in translating from digitizer coordinates to actual coordinates. It can be given as 200 PER 1, meaning perhaps 200 feet per inch, or it can be given as 200 per whatever length is digitized as two points after the command is given. The ORIENTATION defines what rotation is to take place upon the digitized points prior to storing them and can be given as N 0 E, where the corresponding direction of due north on the map which is being digitized is defined by 2 points digitized after the command is given.

The last parameter, that of DATUM, defines a base point about which translation, rotation, and scaling are to occur. The values define the actual coordinates of the base point (if the POINT m option is used, the coordinate values of stored point m are used), and the digitizer coordinates of that point are defined by digitizing it after the command is given.

Thus, after issuing the command with its appropriate parameters and digitizing 5 or 3 points (5 if SCALE length is given, 3 if length PER value is given: 2 for scale in the first case, 2 for orientation, 1 for datum), the system is ready to start accepting digitized information to translate, rotate, and scale it, and then to store it.

Any kind of allowable geometric object from points to chains and points to networks can be defined and stored in this way. To enable the user to tell the system what object or objects he wishes to define, the function keyboard is used. Its template for this command is currently defined as shown in Table II.

TABLE II-Function Key Definitions

Fcn. Key   Action
1          POINT
2          COURSE
3          CURVE
4          CHAIN
5          NETWORK
6          LINK
7          END OF OBJECT
8          END OF OBJECT CLASS
9          PLOT everything digitized up to this point
10         DISPLAY everything digitized up to this point
11         NAME the last object by keying in from keyboard
12         TEXT (or label information) is to be located at the next point digitized and the textual information itself entered from the keyboard
16         RETURN to the standard keyboard command environment
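The arithmetic implied by the SCALE, ORIENTATION, and DATUM parameters can be sketched in a few lines of Python; the numeric values below are invented examples and this is not the system's own transformation code:

# Translate a digitized point to the datum, rotate it toward due north,
# then scale it into ground units anchored at the datum's stored coordinates.
import math

SCALE = 200.0                 # e.g., 200 feet of ground distance per digitizer inch
THETA = math.radians(12.0)    # rotation that brings the digitizer axes to due north
DATUM_DIG = (3.25, 4.10)      # digitizer coordinates of the base point
DATUM_GND = (1000.0, 5000.0)  # actual coordinates of the same base point

def to_ground(xd, yd):
    dx, dy = xd - DATUM_DIG[0], yd - DATUM_DIG[1]
    xr = dx * math.cos(THETA) - dy * math.sin(THETA)
    yr = dx * math.sin(THETA) + dy * math.cos(THETA)
    return DATUM_GND[0] + SCALE * xr, DATUM_GND[1] + SCALE * yr

print(to_ground(3.25, 4.10))   # the datum itself maps to its stored coordinates
print(to_ground(4.25, 4.10))   # one digitizer inch away from the datum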


TABLE III-Digitizing Sequences
Function key 4
Digitize point
Digitize point
Function key 3
Digitize 5 points
Function key 7
Function key 2
Digitize 2 points
Function key 7
Function key 7
Digitize point
Function key 3
Digitize 3 points

Function key 7
Digitize point
Function key 7
Function key 8
Function key 5
Function key 6
Digitize 6 points
Function key 6
Digitize 2 points
Function key 8
Function key 16

The functions are hierarchical in that once the chain
key is pushed, everything that follows, be it points
digitized or combinations of point, curve, course function keys and digitized points, will be stored as one chain
until the end of object or end of object class key is
pushed. Thus, the sequence in Table III will store 2
chains and 1 network.
The objects digitized in Table III consist of, respectively:
chain 1:
2 points, 1 curve, 1 course
chain 2:
1 point, 1 curve, 1 point
network 1: 5 links composed of 6 points, 1 link composed of 2 points
If no name is given by the user (as above), the system
will automatically assign unique names to all objects
defined.
The plot and display functions are very useful to the
user by enabling him to take spot checks on what he
has done so far without removing him from digitize
mode.
It has been found that the accuracy obtained from
inputting coordinate data in this fashion is entirely
dependent upon the amount of time the user wishes to
take to accurately position the cross-hair over a position
to be digitized. With enough care, accuracy to better
than .01 inch can be achieved.

Graphic output

A wide variety of graphical output capabilities exist in the system and will be expanded upon in the future. Straightforward plotting or displaying of geometric objects was the first graphic capability implemented. All objects, points, courses, curves, chains, blocks, regions, of any level, and networks, can be drawn and translated, rotated or magnified. It is thus possible to display a large section of the city, home in on a particular subarea of interest through one or many repeated magnifications, and then get a hard copy plot of the area of interest. Figures 5 and 6 show results of a display and a plot, respectively, of several blocks of a city.

Figure 5-Display of several blocks

A display or a plot can be annotated with the name of each object on it, be it all point numbers or all chain names or all block names, etc. The user can also point to an object and have its name or number drawn. Stored attributes about the plotted or displayed objects can also be drawn. Figure 7 shows a block of a city plotted and then annotated with the names of the parcels in the block and the use of each of the parcels as recorded in the assessor's office. The caption "mixed" means that the parcel has mixed usage. Figure 8 shows a plot of what each mixed usage is, drawn so as to use the sheet as an overlay.

A selective draw can also be done, whereby the user requests all of a certain object type whose data attributes satisfy certain criteria to be displayed or plotted. Figures 9 and 10 illustrate this capability. The first figure is a display of all chains in a section of a city whose land value is greater than $75,000 or whose building value is greater than $150,000. The second figure is a display of all chains in the same section of a city whose land value is greater than $75,000 and whose building value is greater than $150,000.

Just as objects which are drawn in a standard fashion can be annotated, so can objects that are selectively drawn. Figure 11 shows the result of a selective plot and an annotation. The plot requested was all chains in a section of a city whose land value is greater than
$100,000. That plot was then annotated with the chain
names and the land value for each chain.
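The selection criteria used in Figures 9 through 11 amount to simple predicates over stored attributes, as this illustrative Python sketch suggests (the parcel records and the print stand-in for drawing are assumptions, not the system's own plotting code):

# Selective draw: plot only the objects whose attributes satisfy a criterion.

parcels = [{"name": "1347", "land": 120000, "building": 90000},
           {"name": "1348", "land": 60000,  "building": 200000},
           {"name": "1349", "land": 80000,  "building": 160000}]

def selective_draw(parcels, criterion):
    for p in parcels:
        if criterion(p):
            print("draw chain", p["name"], "land value", p["land"])

# Figure 9 style: land value > $75,000 OR building value > $150,000
selective_draw(parcels, lambda p: p["land"] > 75000 or p["building"] > 150000)
# Figure 10 style: land value > $75,000 AND building value > $150,000
selective_draw(parcels, lambda p: p["land"] > 75000 and p["building"] > 150000)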
Another graphic capability is density mapping. Sets of ranges of values for an attribute of chains, blocks, or regions can be drawn, each set with a different shading line. Figure 12 illustrates this by showing different ranges of land value for parcels in a block of a city. The ranges chosen for this were 1,000 to 10,000; 10,000 to 25,000; 25,000 to 75,000; and 75,000 to 500,000, each range designated by a different shading pattern in the figure. This capability can be used on the display as well as the plotter.

Figure 6-Plot of several blocks

Figure 7-Annotated plot of a city block

Statistical charts and graphs can also be drawn.
They are described in the section about the statistical
capability of the system.
Additional graphical capabilities are planned for the system but are not yet implemented. Full mapping capabilities, including full or partial dimensioning and symbol labeling, are planned. Examples of such use are tax mapping and utility network schematics. Planar perspectives of three-dimensional objects are also planned. Usage for these includes skyline perspectives along a street and multi-tiered transportation and utility networks.

A great deal of emphasis is being placed upon the graphical capabilities of the system. It is the author's opinion that this will prove to be one of the major features of URBAN COGO.

Figure 8-Annotated overlay of mixed usage

Figure 9-Selective display on land value

Figure 10-Selective display on land value

STATISTICAL ANALYSES

URBAN COGO currently has the capability to perform parametric and non-parametric statistical analyses on groups of data attributes. Sample vectors, upon which the analyses are to be performed, can consist of raw data (the actual attribute values) or of data formed through groupings, exclusions, inclusions, additions, subtractions, multiplications, divisions, powers, etc., upon the raw data. Besides the standard statistical values, the system can perform the following:

Non-Parametric
Chi-square analysis
Kolmogorov-Smirnov test
Mann-Whitney U test
Kendall Rank Correlation test

Figure 11-Annotated selective plot

Parametric
Variance, covariance
Correlation
Regression

Additional analyses, such as time series and factor analysis, are planned.

Using the statistical subset of URBAN COGO, it is
possible, through combining, rejecting, creating new
vectors with the results of analyses, and so forth, to
combine and aggregate in almost any way to any level
required, starting at even the most raw level. Capabilities exist for the sample vectors to be saved in a file
so that statistical investigation and analysis can be
done at the user's leisure.

Figure 12-Density plot on land value


The statistical subset also provides the user with the
ability to graphically portray a vector or group of
vectors. Graphs, scatter diagrams, and histograms can
be drawn on the printer, the display, or the plotter.
Figures 13 and 14 illustrate the plotting of a graph and
a histogram, respectively. The first is a graph of the
land values of all parcels in a section of a city. The
second is a histogram of the use of the parcels in a
section of a city.

Figure 13-Statistical graph plot on land value

Figure 14-Histogram plot on usage

PROCESSING FUNCTIONS

Various processing facilities have been included. ICES COGO contained the ability to locate points in the XY plane in many ways, including:

-at a given distance and direction from another point
-at the intersection of 2 lines
-at the intersection of 2 curves
-at the intersection of a curve and a line
-as a projection onto a course or a curve
-at a given distance along a course or curve

URBAN COGO has expanded upon this to enable the
user to intersect higher level objects and store the
points of intersection as new points. Networks, chains,
and blocks can all be intersected with similar objects,
e.g., network with network, or with other objects,
e.g., network with block.
It is planned to expand upon the location capability
as well as the intersect capability to move along any
planar slice in 3-space when the three-dimensional
object capability is implemented.
Translation and rotation of networks, chains, and
blocks are also provided with the ability to do so on
any of the three standard planes-XY, XZ, or YZ.
Thus, with this capability, a user may store groups of
objects in their own local coordinate system and when
they are fully checked, translate and rotate them to a
more global coordinate system.

Tabulation

Tabulation facilities are also available and can easily
be added to for any specific use of the system. General
capabilities such as selective tabulation of objects whose
data attributes satisfy certain criteria, or the sorting of

URBAN COGO

a class of objects on one or more data attributes are
now part of the system. Major report generation could
easily be added to the system, but is highly user and
type-of-data dependent. An illustration of one such
report is the generation of tax bills for the assessor's
office. Adding this capability would be very easy once
the format for the bills has been defined. It is easy to
foresee many facilities being added to the system in the
area of report generation, but primarily by and for a
specific set of users.
CENSUS DATA
Many potential users of URBAN COGO will want to
interface with census programs and to use census data.
Commands and their associated programs are being
developed to be able to retrieve any or all of the data
on a census tape and store it in a defined way in the
URBAN COGO data base. This facility is being
developed so that not only census data can be retrieved
and used in this way but almost any kind of data that
a user might want to utilize in the system.
The retrieving and the defining of the correspondences
between the original data and its URBAN COGO
counterpart is structured so that attribute data as well
as geo-data (such as that used in the DIME system)
can be handled.
It is foreseen that this capability will be one major
way in which users will be able to interface other
systems with URBAN COGO.

EDITING

There is one very important fall-out of the work done to date on the system, namely the ability of the system capabilities to be used for data editing, data error detection, and self-checking. The features which have proved very useful for error detection are:

-the definition of allowable categories of data and the allowable values these attributes may have;
-the display and plot facilities to catch errors in the definition of geographic objects;
-the sorting and statistical tabulation commands to find objects for which no data is stored or for which only partial data is stored.

The fact that a user can essentially use the system to check the system is of major import, and the fact that it is provided as fall-out (by design) is also of major importance. It is most likely, however, that this fact will only be recognized through use.

USAGE SAMPLING

The potential uses for and users of the system are many and varied. A brief summary of some follows:

-City Government: tax billing, analyses of effects of tax changes, mapping, management reporting, area redevelopment;
-State Government: transportation route location and analysis, congressional districting, maintenance of highways and highway signs;
-Utility Companies: maintenance, location of manholes, prime power supplies, utility network mapping;
-Consulting Companies: transportation studies, airport location studies, urban planning;
-Legal Services: title searching.

The specific uses of the system are potentially too numerous and varied to describe. Perhaps the best source is the imagination of each user.

SUMMARY

While it is believed that the URBAN COGO system described above provides the base for a new kind of urban information system, it is also believed that much work still remains to be done with it in order to prove this belief. Two major areas still remain to be researched:

-the design and development of a new file structure
-the use of the system in some specific applications

This work is planned to be performed in the coming year.

The author believes that the building of the system upon geographic data as its base and the heavy emphasis on graphical capabilities in the system will prove to be the way-of-the-future for urban information systems.
ACKNOWLEDGMENTS
The URBAN COGO system reported on above is being
developed by the Urban Geometrics project of the
Urban Systems Laboratory of the Massachusetts
Institute of Technology. The author would like to
acknowledge the advice and wisdom of Professor C. L.
Miller, Director of the Laboratory, who originated the
COGO system and who formulated the original concept
of URBAN COGO. The author would also like to
acknowledge those M.I.T. students (all of whom are
members of the Chi Phi Fraternity) who are and have
worked hard and diligently on making the system a
reality.


The URBAN COGO project is currently being
supported, in part, by a grant from the National Science
Foundation. Previously, the project was sponsored in
part by grants to the Urban Systems Laboratory from
the Ford Foundation and the IBM Corporation.
BIBLIOGRAPHY
1 A J CASNER W BLEACH
Cases and text on property
Little Brown and Company Boston 1969
2 F E CLARK
A treatise on the law of surveying and boundaries
The Bobbs-Merrill Company Indianapolis 1939
3 M CLAWSON C L STEWART
Land use information
Resources for the Future Inc The Johns Hopkins Press
Baltimore 1965
4 R T HOWE
Fundamentals of a modern system of land parcel records
Department of Civil Engineering University of
Cincinnati May 1968
5 LOCKHEED MISSILES AND SPACE COMPANY

California statewide information system study
Sunnyvale California July 1965
6 C L MILLER
Engineers' guide to ICES-COGO I
Department of Civil Engineering Report No R67-46
Massachusetts Institute of Technology August 1967
7 B SCHUMACKER
An introduction to ICES
Department of Civil Engineering Report No R67-47
Massachusetts Institute of Technology 1967
8 B SCHUMACKER
URBAN COGO users' guide
Urban Systems Laboratory, Massachusetts Institute of
Technology June 1970
9 Census use study reports
US Bureau of the Census Reports Nos 1 through 11
Washington DC 1970
10-Urban and regional information systems: support for
planning in metropolitan areas
US Department of Housing and Urban Development
Washington DC October 1968
11-Operational and maintenance manual for interactgraphic 1
Computervision Corporation Burlington Massachusetts
October 1970

Understanding Urban Dynamics*
by GERALD O. BARNEY
Center for Naval Analyses
Arlington, Virginia

INTRODUCTION

known as a "complex system"-a system whose
behavior is dominated by multiple-loop, non-linear
feedback processes. Mathematical analysis is not too
helpful in understanding complex systems since their
non-linear properties are as yet very difficult to treat
analytically. Currently, the only successful method of
dealing with systems as complex as the urban system is
experimentation-with the actual system or with some
representation of the actual system. In the case of the
urban system, most of the experimentation is done with
a mental representation-the mental image (or model)
we each have of how the urban system operates.
Our public officials are constantly performing
experiments with their mental models as they evaluate
proposed changes and additions to laws and policies.
Although most public officials are probably not explicitly aware of it, their experiments involve three
separate and distinct steps. The official first brings to
mind his latest mental image of how the system operates;
he then uses his mental model to deduce the effects of
the proposal; and finally he judges his deduction of the
effects against his set of values and goals. In the past,
it has not been too important to distinguish these three
steps, but as the policy and legislative issues become
more complex, it is increasingly important to know
whether disagreements over a given proposal stem from
different conceptions of how the system works, from
inaccurate or inconsistent deductions of effects, or
from more basic differences of values and goals.
In turning to the computer for assistance, we are
forced to consider each step separately. Our mental
image must be developed and expressed in a language
that can be used to instruct the computer. Any consistent, explicit mental image of any system can be so
expressed. Our mental images are the results of our
experiences, and expressing these experiences explicitly
for the computer permits others to examine, correct,
and comprehend our mental images and to contribute
to a broader understanding through their different
experiences. Given the expression of our mental image,

As indicated by published reviews and unpublished
criticisms, some readers have had difficulty in understanding several of the most important points of Urban
Dynamics** by Professor Jay W. Forrester. The book
contains several stumbling blocks. For example, certain
pet theories that for years have been thought to be
important in the dynamics of an urban area are scarcely
even mentioned (e.g., transportation, crime, pollution,
discrimination and suburbs). Also, several measures of
urban characteristics appear to be sufficiently different
in the model from those found in real urban areas to
distract one's attention from the main points of the
book. But there is a message in Urban Dynamics, and
when it is comprehended, these stumbling blocks
become less significant. This paper is intended to help
the reader of Urban Dynamics to understand the
message of the book and to see beyond many of the
criticisms that have been made.
WHAT IS URBAN DYNAMICS?

Urban Dynamics is an analysis of how the urban
system operates and how it can be more effectively
managed. The development of large concentrations of
relatively unskilled persons and the blighting effect
these concentrations have on our people and cities are
the primary issues discussed in the book. The analysis is
based on a computer model which, in the most general
terms, simulates the interactions among population,
housing, employment (industry) and municipal
serVIces.
The urban system is an example of what has become

* Dr. Barney is employed by the Center for Naval Analyses of the
University of Rochester, 1401 Wilson Boulevard, Arlington,
Virginia 22209. This paper was written while Dr. Barney was on
leave at the Massachusetts Institute of Technology.
** Published by the MIT Press, Cambridge, Massachusetts, 1969.

the computer can point out inconsistencies, determine
sensitivities, and deduce implications much more
accurately than can the human mind-and without
changing the ground rules part way along as the human
mind is so prone to do.
But probably the most important contribution the
computer makes is that it forces us to give separate
consideration to questions of values and goals. Given
the implications of a proposed change in a law or
policy, we are forced to ask if this is what we want, if
this is consistent with our values, and if this brings us
any nearer our collective goals. With finite resources,
cities can't be everything to everyone. Given a better
understanding of the options available and the effects
of any given proposal, debate must then center on the
desirability of the effects and the values necessary for
judging desirability.
By passing laws and changing policies, our public
officials are making changes in the very structure of
our society. To be of assistance in the analysis of the
questions they face, a model must not only reproduce
the behavior of a city in a general sense, it must also
correctly reflect the basic causal mechanisms at work.
Many basically different models can reproduce urban
history, but a model that is to be used to examine the
effects of change in structure must correctly reflect
all of the important causal mechanisms-some of which
are not yet easily measured. This is a formidable requirement, and success for now must be measured not
against an absolute standard of accuracy but rather
against our only alternatives-inexplicit mental models
or intuition.

THE HEART OF THE MODEL

The Urban Dynamics Model is an explicit expression of a distillation of several mental images. Its subject is the causes of urban decay-the concentration of large numbers of relatively unskilled people in urban areas, and all the attendant problems. In the model, just as in real cities, people migrate in and out and move among the socio-economic classes (Forrester defines three such classes: Underemployed, Labor, and Management-Professional) in response to a variety of conditions, including population, housing, employment, and municipal services. The heart of the model is the enumeration and description of the multitude of urban conditions which influence migration and economic advancement.

The concept of "attractiveness," the central idea behind Forrester's description of migration, is frequently misunderstood. Attractiveness is not an indication of a city's beauty but rather a measure of a city's drawing and holding power for people in the three socio-economic classes. Although a city's attractiveness is generally different for the Underemployed, Labor, and Management-Professional populations, there is an attractiveness for each of the three classes determining the rates at which they are drawn to the city and how effective the city is in holding them there once they arrive. Forrester inadvertently confuses many readers when he gives the attractiveness indicators the following three apparently unrelated names: Attractiveness for Migration Multiplier (AMM), Labor Arrival Multiplier (LAM), and Management Arrival Multiplier (MAM). The identical modeling function of the three attractiveness multipliers is indicated in Figure 1.

Figure 1-A flow diagram summarizing how the attractiveness indicators are used to influence the arrivals and departures for the three socio-economic classes (see Urban Dynamics, pages 134, 160 and 165)

The attractiveness multipliers are especially important in that they reflect the population's response to a variety of "incommensurables." Population, housing, housing programs, economic advancement potential, public expenditures and employment opportunities are reduced to a common scale, or commensurated (differently for the three classes), to give a composite attractiveness for each class. For example, attractiveness for the Underemployed population increases with increased
housing programs, economic advancement potential,
public expenditures, and employment opportunities,
but decreases with increased Underemployed population
(reflecting more competition for jobs, housing, etc.).
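The commensuration step is easier to see with a deliberately simplified sketch. The weights, component names, and linear form below are assumptions invented for illustration; Forrester's published model uses tabulated, nonlinear multipliers rather than anything this simple.

# Illustrative sketch only: a composite "attractiveness" per socio-economic
# class. The component names, weights, and linear form are assumptions made
# for illustration; Forrester's model uses tabulated nonlinear multipliers.

WEIGHTS = {   # how each class weighs the commensurated components (hypothetical)
    "Underemployed":           {"housing": 0.3, "housing_programs": 0.2,
                                "advancement_potential": 0.2,
                                "public_expenditures": 0.2, "jobs": 0.1},
    "Labor":                   {"housing": 0.3, "housing_programs": 0.0,
                                "advancement_potential": 0.2,
                                "public_expenditures": 0.1, "jobs": 0.4},
    "Management-Professional": {"housing": 0.2, "housing_programs": 0.0,
                                "advancement_potential": 0.3,
                                "public_expenditures": 0.1, "jobs": 0.4},
}

def attractiveness(klass, conditions, own_population, crowding_scale=1_000_000):
    """Composite attractiveness of the area for one class (1.0 = 'normal').

    Rises with favorable conditions, falls as the class's own population
    grows (more competition for jobs, housing, etc.).
    """
    composite = sum(w * conditions[name] for name, w in WEIGHTS[klass].items())
    return composite / (1.0 + own_population / crowding_scale)

normal = dict.fromkeys(WEIGHTS["Underemployed"], 1.0)
improved = dict(normal, housing_programs=1.5, housing=1.2)
print(attractiveness("Underemployed", normal, 377_000))    # baseline
print(attractiveness("Underemployed", improved, 377_000))  # aid program raises it
print(attractiveness("Underemployed", improved, 450_000))  # inmigration erodes it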
Wherever differences in attractiveness exist, the population gradually migrates to the more attractive area,
and the changed population distribution gradually
reduces the difference in attractiveness.
Significant differences in attractiveness can exist
only across boundaries where migration is restricted
(e.g., between cities in Mexico and California). Within
the United States, however, attractiveness is essentially
constant. When all components of attractiveness are
considered, New York City, Chicago, Colorado Springs,
and Bend (Oregon) have very nearly equal attractiveness for a given socio-economic class.
Another important part of the model is the description of the factors that determine how fast people
advance from Underemployed to Labor and from
Labor to Management-Professional. The rate of
advancement from Underemployed to Labor (UTL) is
particularly important, since it is only through this
transition that the Underemployed can escape poverty.
The conditions that influence UTL are total labor and
underemployed jobs, Labor and Underemployed
populations, education level of Underemployed, job
training programs, and the ratio of Labor (teachers)
to Underemployed (students).
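The paper names the inputs to UTL but not its functional form. Purely as an illustrative sketch, the advancement rate can be pictured as a normal fraction of the Underemployed population scaled by job availability and training conditions; the multiplicative shape, the normal fraction, and the job figure below are assumptions, not Forrester's published equation.

# Illustrative sketch only: the shape of an Underemployed-to-Labor (UTL)
# advancement rate. The multiplicative form and the normal fraction are
# assumptions; Forrester's UTL equation uses tabulated multipliers.

def utl_rate(underemployed, labor, jobs_under, education=1.0,
             training_programs=1.0, normal_fraction=0.015):
    """Persons per year advancing from Underemployed to Labor (a sketch)."""
    job_availability = jobs_under / max(underemployed, 1)   # jobs per person
    teacher_ratio = labor / max(underemployed, 1)            # Labor as "teachers"
    return (normal_fraction * underemployed *
            job_availability * education * training_programs * teacher_ratio)

# With populations of the order shown later in Table I (the jobs figure is invented):
print(round(utl_rate(underemployed=377_000, labor=393_000, jobs_under=340_000)))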
THE FAILURE OF CURRENT URBAN
PROGRAMS
The importance of the advancement and attractiveness concepts can be seen in Forrester's analysis of
current urban programs. In actual practice, these
programs generally have a similar and characteristic
development pattern: an initial period of slight improvement and generation of hope, followed within a few
years by readjustments within the urban system which
result in a loss of gained ground and general disillusionment of the Underemployed. The net result has been
increased concentrations of Underemployed, continually
decaying conditions, and growing hostility of the
Underemployed toward the "System" and toward the
"Establishment" that they think controls the "System."
Actually, as Forrester's analysis shows, the failure of
our urban programs is due not to the control of the
establishment, but rather to a collection of feedback
processes that are at work within the system and are
almost beyond the influence of the establishment. The
advancement and attractiveness concepts are important
in understanding the operation and effects of these
feedback processes.

Figure 2-Illustration of two negative feedback loops which tend to undermine the effects of direct aid to the underemployed

There are many interacting feedback processes that
cause urban problems to feed on themselves and defeat
our urban programs; in a highly simplified way, two of the most dominant interactions are
illustrated in Figure 2. Shortly after the initiation of
any given program (housing, food, health, job training,
etc.), conditions for the Underemployed do measurably
improve, and the improvement encourages continuation
of the program. In time, the increased attractiveness of
the area is evident to the Underemployed, and, as a
result, a somewhat larger number move into the area
and a somewhat smaller number leave than would
have, had the attractiveness not increased. There
follows a period of somewhat expanded growth of the
Underemployed population in the area, and this
population growth increases the pressures on the available schools, housing, employment opportunities,
shopping and recreational facilities, and transportation
systems. The effects of this first feedback loop are felt
within a few years when the increased crowding and
congestion begin to drop the total attractiveness back
toward what it was before the program was started.
The effects of the second feedback loop are delayed
by another few years. The enlarged Underemployed
population, which resulted from the initial success of
the program, increases the demand for Underemployed
housing. As a result, Underemployed housing competes
more and more vigorously for available land. This
competition not only takes vacant land but preserves
very decayed housing stock that might otherwise be
destroyed to permit an alternative land use. The increased demand for Underemployed-housing land-use
drives up the cost of land for Labor and Management-Professional housing, thus reducing the attractiveness
of the area for these two populations. This reduced
attractiveness for Labor and Management-Professionals, combined with the more intense competition
for land, implies more business expenses, fewer business
opportunities, and increased difficulties for all forms of
business activity. Declining business activity reduces
the advancement potential for the Underemployed.
The lowered advancement potential in turn diminishes
the chances of the Underemployed escaping from the
urban poverty trap, destroys their chances of improving
their living conditions, and increases their frustration
and hostility.
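The interaction of the two loops can be caricatured in a short time-stepped sketch; every coefficient below is invented for illustration and none of them comes from Urban Dynamics.

# Toy illustration only (invented coefficients, not Forrester's equations):
# a direct-aid program raises Underemployed attractiveness, inmigration
# follows, and the resulting crowding and land competition slowly erode
# the initial gain.

def simulate(years=20, program_boost=0.3):
    population = 377_000           # Underemployed in the area at the start
    business_land = 5_800.0        # acres of income-producing land (order of Table I)
    history = []
    for year in range(years):
        crowding = population / 377_000 - 1.0
        business = business_land / 5_800.0
        attractiveness = 1.0 + program_boost - 2.0 * crowding + 0.3 * (business - 1.0)
        inmigration = 30_000 * (attractiveness - 1.0)            # loop 1: people respond
        population += inmigration
        business_land -= 0.5 * max(inmigration, 0.0) / 1000.0    # loop 2: housing takes land
        history.append((year, round(population), round(attractiveness, 3)))
    return history

for year, pop, attractiveness in simulate():
    print(year, pop, attractiveness)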

THE PROBLEM AND FORRESTER'S SOLUTION
The really basic problem is this: Given the existence
of feedback processes which tend to counter the effects
of direct improvements, how can a city best improve
the lot of its Underemployed with the limited resources
it has available? Forrester's approach to this problem is
significantly different from many approaches used in
the past. He does not start by stating what cities
should be like but rather asks the very practical
question: In what ways can the urban system be made
to operate differently? The effects of changing the way
the system works are then investigated with the model,
and the "best" alternative is recommended.
But "best" for whom: the rich, the poor, the absentee
landlords, the stock brokers? What values are to be
used in judging the possibilities? Forrester doesn't
explicitly discuss the values he uses, but implicit in his
discussions of the alternatives are two values: the
solution must be lasting (decades at least) as opposed
to temporary (a few months to a few years), and the
solution must lead to increased upward economic
mobility for the Underemployed. Others (slumlords,
for example) might have different objectives and values
to use in evaluating the results of the computer runs,

but the values behind Forrester's discussions deserve
careful consideration. Temporary solutions have produced much disillusionment, frustration, and resentment among the urban poor, and while the poor may
not be able to agree on which particular urban conditions are most in need of improvement, increased
upward economic mobility provides them with both
hope and freedom of choice.
Forrester's choice for the best way to revive a
decayed urban area is to demolish five percent of the
slum housing stock (some of which would have been
destroyed anyway and much of which is already
vacant) each year and to provide business encouragements that increase new enterprise construction 40
percent over what would have occurred under the
same conditions without the added incentives. The
demolition need not involve active intervention by the
city; it is probably best accomplished through changes
in tax laws and zoning. The increased availability of
land improves the attractiveness of the area to business,
Labor, and Management-Professionals. This in turn
results in an upsurge in the demand for labor, which in
turn increases the opportunities for upward economic
advancement for the Underemployed. As the area
becomes more of a place for the Underemployed to get
ahead, its attractiveness to the Underemployed begins
to increase, and if unchecked, a new Underemployed
inmigration (and a resulting demand for Underemployed housing) would within a few years increase
competition for the available land to the point that
business opportunities (and the associated advancement potential for the Underemployed) would again be
decreased. In Forrester's solution, *** attractiveness
and migration are limited by reduced housing stock
available to the Underemployed.
COMMON MISUNDERSTANDINGS
Forrester's proposed program of slum housing
demolition provides an interesting demonstration of
the need for more than intuition in predicting the dynamics
of urban systems. Intuitively it would seem that slum
housing demolition could do nothing but make housing
scarcer and ultimately drive the Underemployed out of
the area. At first glance, the analysis from the model
also seems to support these conclusions, in that the
ratio of Underemployed to Underemployed-housing
increases significantly. But something else is happening.
The net immigration of Underemployed increases by
almost 4,500 persons per year, a factor of approximately 450. Why? Because even though housing is
tighter, the increased business activity and potential
for economic advancement actually make the area very
attractive to the Underemployed.

*** See Urban Dynamics, Section 5.7.
It seems paradoxical that reduced housing makes it
possible for more Underemployed to migrate into the
area. In spite of the increased influx, the Underemployed
population actually decreases by 10 percent because
many more are advancing into the Labor population.
The net annual-advancement-rate from Underemployed
to Labor rises from about 5,500 to just over 9,000
persons per year, 165 percent of the old rate. In contrast to the urban renewal programs of the fifties, this
program works slowly and does not completely disrupt
and destroy whole communities. In Figure 3, the
effects of the program on population movements to,
from, and within the urban area are illustrated. In
addition to being economically viable, the area is now
an efficient upgrader of the population. The overall
effects are indicated in Table 1.
OMISSIONS?
The Urban Dynamics work has been criticized for the
omission of a variety of factors that are alleged to be
central to the urban problem. Influences of the suburbs,
Figure 3-Equilibrium population flows before and after revival. The dashed lines represent the boundary of the urban area


TABLE I-Changes in the Population Mix and Land Use Distribution Resulting from Slum-housing Demolition and Industry Encouragement

                                        Equilibrium(1)   Revival(2)    Percent
                                            Mode            Mode      Difference

Land used for income-producing
  activities (acres)                         5,800           8,600       +50%
Land used for housing (acres)               75,700          77,900       + 3%
Ratio of housing-land to
  income-producing land                      12.96            9.05       -30%
Management-Professional population          71,000         109,000       +50%
Labor population                           393,000         600,000       +50%
Underemployed population                   377,000         336,000       -10%
Total population                           841,000       1,045,000       +24%

(1) This is the starting point for all runs in chapter five.
(2) See chapter five, section 5.7.

transportation systems, discrimination, pollution, and
"external driving forces" are among the factors frequently cited. Although some of these factors are very
important to certain urban phenomena, the urban
dynamics model already contains enough detail to
produce the urban phenomenon about which the book
is written: the concentration of large numbers of
relatively unskilled people in decaying sections of our
cities. Additional details are likely to make an already
complex model more confusing, and unless they can be
shown to be required to produce the problem under
study, extra details are probably best left out. Some
comments on several of the frequently-noted "omissions" are given in the following paragraphs.
First, Urban Dynamics does not assume (as has been
asserted) that the dynamics of urban areas are independent of depressions, world wars, technological
change, earthquakes, and other "external driving
forces." What Urban Dynamics does assume is that the
management of a given urban area has little or no
influence on the external forces acting on the area, and
that, come what may, urban areas must be managed as
effectively as possible. Although uncertainties may
dictate a more or less cautious advance, Urban Dynamics
asserts that effective urban management is possible in
spite of uncertainties. The effectiveness of urban
management seems to be limited not so much by an
inability to predict the future course of external driving
forces as by an inadequate understanding of the time-dependent consequences of the many non-linear feedback processes at work.
Concerning suburbs and their effects, it should be
noted that every city is in competition with its environ-
ment (that is, the remainder of the nation) for people
and industry. A city's suburbs are a part of its environment, and this basic competitive influence of the
suburbs on a city is included in the attractiveness and
migration concepts of the model. A city's suburbs are
different from the remainder of its environment only
in that they are close enough to the city to allow suburbanites to commute to jobs in the city without
actually living in the city.
The definitions of the attractiveness indicators and
the system boundary are closely related to the suburbs
question. In defining the term "urban area" and in
specifying the system boundary, Urban Dynamics
assumes that the population both lives and works in
the "urban area" inside the system boundary. In
determining the migration rates, the attractiveness of
the area as a place to work is not differentiated from
the attractiveness of the area as a place to live. Daytime and nighttime populations are equal. In this
approximation, some very important effects can and
have been studied. It is interesting to note, however,
that the conditions resulting from Forrester's solution
(good business activity, many job opportunities, a
shortage of housing-especially for labor (Labor-to-Housing Ratio = 1.332)) are exactly the conditions that
lead to tremendous highway-expansion programs and
suburban growth. An expanded suburban population
that is allowed to enjoy the attractiveness of the city as
a place to work and to enjoy the attractiveness of the
suburbs as a place to live would probably have a
significant impact on Forrester's recommended solution. Without expansion, the model cannot analyze
this impact. The book does, however, suggest ways of
minimizing the effect.
Although racial discrimination has declined during
the last decade, it is still a significant problem for many
Americans-but not their only (and perhaps not even
their most serious) problem. But through preoccupation
with the discrimination issue, there is danger of losing
sight of a basic obstacle to the economic recovery of the
victims of discrimination. Discrimination in the past
has put many more blacks than whites into the Underemployed category, and now this imbalance makes the
basic problem of the Underemployed appear to be a
problem of discrimination. But the problems of Watts
and Harlem have common roots with the problems of
Appalachia. The Underemployed-black or white-are
trapped by the same basic mechanism. If today we
could completely eliminate all discrimination (and who
could deny that a significant amount still exists), the
Underemployed black's problem would not be solved.
As an Underemployed person, the "System" would
keep him right where he is in the inner city ghetto,

with virtually no hope of escape. Urban Dynamics
describes the basic obstacle to his advancement and
indicates how, and at what cost, the "System" can be
made to work for the Underemployed, black and white.
Until the feedback processes at work in the System are
better understood by our leaders and by the general
public, many obstacles to the advancement of Underemployed blacks will continue to be easily confused
with discrimination.
Although it cannot be called an omission, the
"infinite environment" approximation has concerned
many people. The urban area of the model is located in
an "environment" which acts as an infinite source (or
sink) for people of the three socio-economic classes.
Migration rates depend only on the relative attractiveness. The attractiveness of the environment is independent of the number of people that are drawn from it or
added to it. More specifically, this approximation enters
the model through the assumption that, on an annual
basis, an urban area for periods of at least 50 years can:
(1) draw the equivalent of 1 percent of its Underemployed population from its "environment,"
(2) deposit into its "environment" about 1.5 percent
of its Labor population,
(3) deposit into its "environment" about 6 percent
of its Management-Professional population,
without significantly altering conditions in the "environment." One city, critics argue, can probably do this, but
if all cities were doing this (as they might if Forrester's
solutions were adopted as federal policy) conditions in
the "environment" would change. It is then argued
that the ensuing changes in relative attractiveness
would lead to different conditions than those suggested
in Forrester's analysis.
Since the departing Labor and Management-Professional population could well be used in starting
new communities (which will be needed if our population continues to grow as assumed in the model), the
most likely change in the environment would be a
drying up of the source of Underemployed. This seems
unlikely to destroy the usefulness of Forrester's suggestion.
THE MESSAGE
In spite of its first appearance, much of what Urban
Dynamics says closely resembles what urban experts
have been saying for years. There is nearly complete
agreement, for example, that the fundamental characteristics of a decayed urban area are an inappropriate
population mix and an economically unsatisfactory
distribution of land use. Without Managers and
Professionals to recognize opportunities and to organize
income producing activities, and without a large Labor
population from which skills can be learned, it is not
surprising that the economic advancement of the
Underemployed living in decayed urban areas is rather
limited. Yet relatively inexpensive shelter, welfare
income, televised entertainment, public transportation,
and municipal services attract the Underemployed into
our urban poverty traps. As the number of Underemployed in an area increases, many small changes
interact to shift land use toward housing and away from
income-producing activities. This is another way of
describing what Forrester calls excess housing. This
mode of operation in which problems feed upon themselves and in which the Underemployed are trapped for
generations is widely agreed to be The Urban Problem.
The new and important contributions that Urban
Dynamics makes are in three areas. First, it describes
the basic characteristics of the urban system which
cause decay to feed on itself. This phenomenon is complex and not easily summarized, but a particularly
important aspect of it is the close coupling between
land use and population mix. As is very clearly illustrated in the book, neither land-use nor population-mix
can be managed independently. A change in one always
produces a change in the other. Municipal responsibility for land-use management has long been recognized, but the fact that every land-management and
municipal-service decision affects the relative attractiveness of the area to the various socio-economic
elements of the population (and thus determines the
population mix) has only rarely been openly discussed.
Urban Dynamics discusses this point quite openly and
points out how land-use policies, tax laws, assessment
practices and zoning procedures play a major role in
bringing together the population-mix we find in our
slums.
A second and related contribution is the analysis of
the failure of past urban programs to achieve any
lasting impact on the basic problem. Urban renewal
(as practiced in the fifties) and the relocation of
Underemployed in low-rent suburban housing have
either completely destroyed and replaced a community
or transplanted the Underemployed to a new location
where they are needed and wanted no more than they
were before the relocation. These brute force solutions
do not recognize and deal with the basic causes of the
difficulties and can lead to nothing more productive
than localized temporary improvements.
The final and most significant contribution of Urban
Dynamics is that it provides an approach through which
even a single city, acting alone, can make a lasting and
significant impact on the distribution of land-use and
the population-mix in its blighted urban areas. The
solution does not involve pushing the Underemployed
out but rather gradually attracting Management,
Labor, and business back. This approach requires
patience, tenacity and understanding, but it treats
the problem rather than the symptoms. Lasting solutions will be achieved only when the underlying feedback processes are recognized and dealt with. Some
basic changes and new responsibilities for municipal
management are required, but only if we establish as
our goal the rebalancing of both the population-mix
and the distribution of land-use to maximize the upward
economic mobility of the Underemployed, can we hope
to eliminate the frustrations of our inner-city Underemployed and the explosive atmosphere that accompanies their disillusionment. In that cities have
open to them an effective, independent, and imperative
course, they are truly "masters of their own fate."

Bankmod-An interactive decision aid for banks
by WOLFGANG P. HOEHENWARTER and KENNETH E. REICH
Bank Administration Institute
Park Ridge, Illinois

INTRODUCTION

Use of mathematical modeling techniques to assist bank management in managing the sources and uses of funds has become the subject of increasing bank research effort. This emphasis on use of modeling techniques stems from a concern that planning and managing for a sustained increase in bank profits is becoming increasingly critical.

A recent study1 showed that while bank profits tripled in the decade 1958-1968, much of this increase is due to unusual factors such as rising interest rates, reduction in excess liquidity and the impact of now discontinued accounting practices. The improvements resulting from these factors will not be available to the industry for future profit growth. In addition, this study and others2,3 forecast that the nature of credit demand and deposit supply will also change significantly. To maintain profitability, bank management will therefore have to revise their concepts about balance sheet structure and utilize techniques which permit banks to assume more risk in their mix of assets and liabilities. Furthermore, they need to improve the profit margin on funds managed through more opportunistic lending and investment policies and more sophisticated funds management techniques.1

For these reasons, Bank Administration Institute has undertaken a multi-stage project to develop a series of balance sheet management models. The initial prototype, now in the testing and demonstration phase, is described in this paper. Following an introductory discussion of decision-making problems in banks, the paper outlines the approach BAI has taken and describes the major features of the model, including the handling of the man-machine interface.

CLASSIFICATION OF MANAGEMENT PROBLEMS FACING BANKS

Managing a bank consists largely of continually determining the mix of sources and uses of the bank's funds in order to maximize long term performance while simultaneously satisfying certain constraints4 of the regulatory agencies. These constraints are designed primarily to protect depositors. From the stockholder's perspective, future performance is defined in terms of the expected future return and the uncertainty associated with that return. Stated another way, management must choose the appropriate combination of return and risk which maximizes performance from the view of the stockholders. To achieve this optimum trade-off between risk and return, management must make a variety of decisions which directly or indirectly affect the mix of sources and uses of funds.

For purposes of applying modeling techniques, decisions can be classified into three types:

Long term strategic decisions

These relate primarily to the overall level and direction of the bank's business. Examples include:

• Introduction of major new services
• Expansion of physical facilities
• Implementation of marketing programs
• Recruitment and development of personnel

Major decisions in these areas tend to be made relatively infrequently, have a gradual impact on bank performance and are usually made only after lengthy, in-depth studies. It would be desirable to have models to aid in these decisions as they substantially determine the ultimate growth and profitability of the bank. However, the large amount of necessary information required, the difficulties in quantifying the interrelationships among the many variables, and the uncertainties involved make this task extremely difficult.

Intermediate term balance sheet management decisions

These decisions relate to average levels of sources and uses of discretionary funds for periods of a month
or more in length and a planning horizon of perhaps
one to two years into the future. Examples include:
• Balancing anticipated sources and uses of funds to
meet liquidity and capital adequacy constraints
while maximizing profitability.
• Allocating funds between loan and security portfolios and, within these portfolios, among asset
units of various types, maturities and yields.
• Determining appropriate adjustments to make if
actual flows of sources and uses of funds deviate
significantly from the expected.
These decisions are made relatively frequently and
have both an immediate and long term impact on bank
performance. In the long run, the overall yield on assets
held and the cost of funds used are affected. In the short
run, a decision to reallocate assets could result in selling securities at a substantial gain or loss thereby
drastically affecting earnings for that reporting period.
In contrast to the problems of modeling long term
strategic decisions, these decisions require relatively
less information, which in turn is much less resistant to
quantification.
Short term money management

If balance sheet management is concerned with the
bank's average position in very short term funds over
a span of time, money management is concerned with
handling of daily fluctuations around this average.
Models for these latter decisions seem to offer limited
opportunity for profit improvement because of the
relatively small percentage of total assets involved in
short term funds.
BANKMOD I: modeling intermediate term, balance
sheet management decisions

Based upon this analysis of bank management decisions, BAI has decided to concentrate on modeling
the balance sheet management problem with particular
emphasis on the management of the security portfolio
and borrowed or purchased funds. This problem is susceptible to quantification while at the same time offering potential for a significant payout.

USE OF DISCRETE SIMULATION APPROACH

Much of the earlier work on balance sheet or asset management models was based on optimization techniques.5,6 The asset management problem resembled the resource allocation problem in industry, and attempts were made to apply linear programming techniques to its solution. With bank asset management, however, serious difficulties arose in properly quantifying the many variables and their interrelationships and in adequately specifying an objective function. Acceptance was further hindered by difficulties in gaining management understanding of the optimization approach, its benefits and its limitations. Additionally, decision-makers had fears of becoming obsolete and being replaced by some mechanized decision process. It is not surprising, therefore, that asset management optimization models have not been widely accepted. The effort expended in their development has, however, been useful in contributing to a better understanding of the problem.

The introduction of time-sharing with conversational capabilities permits use of interactive simulation to handle problems not readily amenable to optimization techniques. This approach eliminates the need to express a manager's judgment in the form of a utility function to drive an optimization model. Rather, with interactive simulation, the manager is included in the solution-finding process. He can ask "what if" questions and use his judgment to evaluate the answers.

In bank balance sheet management, finding the "best" possible policy requires evaluating often conflicting components of performance. For example, an increase in profits may coincide with a deterioration in the bank's liquidity and capital positions (i.e., cause the bank to be considered less "safe" from the depositor's point of view). It is simpler to provide the banker with information on the effects of various asset management policies and use his expertise to select the appropriate solution. The banker's judgment is brought to bear upon the problem of selecting the most desirable trade-off between profit and safety.

This approach greatly simplifies the problem, reducing both development and running costs. It is also attractive to management in that the decision making process in the simulation corresponds to that used in the real world. This understanding of the decision process facilitates management acceptance of the simulation results. In contrast, bank managements have apparently been more reluctant to accept the prescriptions of an optimization model because of failure to understand how such decisions are developed and an unwillingness to have a machine appear to take over the management role.

DESCRIPTION OF BANKMOD

The previous sections have shown the need for a decision aid for managing the balance sheet and outlined the
approach BAI is taking to it. This section describes the
resulting model. The paper will not discuss the underlying mathematical structure which cannot be treated
adequately in the space available here. The interested
reader is referred to special literature on banking,
especially Gray, Kenneth B., Jr., "Managing the Balance Sheet: A Mathematical Approach to Decision
Making," Journal of Bank Research, Spring, 1970,
which presents a mathematical structure related to
BANKMOD.


Overview of the model

The model is a "what if ... ", or statement projection
model. Basically, it is a financial reporting system
computing future bank statements resulting from the
impact of environmental changes and user decisions on
the initial state of the bank. These financial statements
and various analytical reports provide the banker with
the information he needs to select the most appropriate
decision strategy.
Prior to running the model, the user develops a set
of environmental assumptions consisting primarily of
his forecast of future interest rates and of deposit and
loan levels. He also enters the initial state of the bank
including balance sheet and portfolio data.
With the assumption set and the initial state of the
bank, the simulation can be run without user decision.
The model then computes income, takes maturities,
and adjusts loans and deposits to the forecasted levels.
An excess or deficit of funds is handled by sale or purchase of Fed funds.
While the simulation can be run without any user
decisions, normally the banker would interact with the
simulation. This interaction of the user and the model
is illustrated in Figure 1.
His decisions would include reinvestment of funds becoming available (rather than simple handling through
Fed funds), portfolio shifts for yield and appreciation,
purchasing funds (e.g. issuing CDs), and changing
the bank's financial leverage. Upon completion of a
simulation run, the user can save the decision set and
use it in a subsequent simulation where modifications
to that decision set can be made. Successive iterations
can be run until the decision set is considered to be
reasonably optimum.
At this point another feature of the model can be
utilized to test the decision set against different assumptions about the environment. Figure 2 illustrates how
this feature of the model operates.

Figure 2-Iterative development of the optimal decision set

The user, after developing a satisfactory set of decisions for the most likely environment, then tests this decision set against assumption sets which represent deviations from the expected (e.g., higher loan demand and
interest rates, lower loan demand and interest rates).
This permits the decision set to be tested for its sensitivity
to deviations of the economic environment from that
which is expected. This sensitivity analysis is performed
simply by calling in an alternative assumption set and
running the saved decision set against it. Should analysis reveal that the decision set is not sufficiently hedged
against adverse changes in the environment, the decision set can be modified appropriately.
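A minimal sketch of this save-and-retest workflow is given below; the simulation call is a stand-in, and only the set names (which appear in the sample run of Figure 8) come from the paper.

# Hypothetical sketch of testing one saved decision set against several
# named assumption sets; run_simulation() is an illustrative stand-in,
# not BANKMOD's actual command interface.

def run_simulation(assumption_set, decision_set):
    """Stand-in for a BANKMOD run; returns a few horizon-level indicators."""
    # ... the statement projection would happen here ...
    return {"net_income": 0.0, "liquidity_ratio": 0.0, "capital_ratio": 0.0}

decision_set = "WPH-TEST2"                       # developed under the expected environment
assumption_sets = ["EXPECTED", "TIGHT", "EASY"]  # named environments (see Figure 8)

for name in assumption_sets:
    report = run_simulation(name, decision_set)
    print(name, report)
# If results under TIGHT or EASY are unacceptable, the decision set is
# modified and the loop is repeated until it is adequately hedged.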
Figure 1-Interactive simulation process

Structuring BANKMOD into decision periods and
decision points

To make BANKMOD conform as much as possible
to reality and still be manageable, the planning horizon
with which the banker is concerned is divided into a
specified number of periods of uniform length (e.g.,
month, quarter). Within a period the model considers
the environment to be constant. Balance sheet changes
occur only at the points separating the periods (the
"decision point").
In reality, however, not only does environment
change continuously, but decisions are also made continuously rather than at discrete decision points. Thus the concept of a steady state during the period introduces an element of artificiality with some slight distortion of income computations. However, this distortion is not considered to be of consequence for several reasons. First of all, forecasts of deposit and loan levels are assumed to be the averages prevailing during the period. Second, although security maturities are considered to occur at the end of the period during which they actually occur, the income distortion is minor. The discrepancy in earnings is limited to the difference between the yield of the maturing security and the rate at which the proceeds of the matured security would be invested, this difference multiplied by the time between actual maturity date and the end of the period.
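Written out with hypothetical numbers (and an assumed 360-day money-market year), the bound on this distortion looks like the following.

# The end-of-period maturity convention mis-states income by roughly
#   error = par * (reinvestment_rate - maturing_yield) * (days_early / 360),
# i.e., the rate difference applied over the time between the actual maturity
# date and the end of the decision period. All figures below are hypothetical.

par = 1_000_000          # proceeds of the maturing security ($)
maturing_yield = 0.045   # yield of the security that matured
reinvest_rate = 0.050    # rate the proceeds would actually have earned
days_early = 45          # actual maturity precedes the period end by 45 days

error = par * (reinvest_rate - maturing_yield) * days_early / 360
print(round(error, 2))   # about 625 dollars of understated income for the period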
Figure 3-Continuous and discrete environment

In any event, deviations due to the decision period convention are small compared to potential errors in the rate and level forecasts embodied in the assumptions.

Balance sheet updating

When the simulation is run all adjustments to the balance sheet can be considered as occurring at the decision points. These adjustments are of two kinds: automatic and user decisions. Several types of automatic adjustments occur:
• Deposits and loans are adjusted to the levels forecasted in the assumption set.
• Securities, CDs, and other borrowed funds with a maturity structure are matured automatically.
• Certain balance sheet categories are adjusted in relation to others. For example, reserve is set automatically to the legal minimum required by deposits.
• Other categories are adjusted to reflect accrual of income. For example, book value of securities is adjusted to reflect amortization of premiums and accretion of discount. Equity is adjusted to reflect earnings.

These changes occur regardless of any decisions by the user.

Adjustments also occur as the result of user decisions. These include:

• Purchase or sale of securities.
• Issuance of large, negotiable CDs.
• Sales of capital notes or debentures.
• Issuance of stock.

These decisions can generate further adjustments of the automatic type. For example, a decision to issue CDs will also cause an automatic adjustment in reserve.

Because the net effect of automatic adjustments and user decisions is unlikely to be exactly offsetting, any discrepancy between a change in the source of funds and a change in the use of funds is handled by the purchase or sale of Fed funds. Excess funds earn at the Fed funds rate; a deficit is charged at the Fed funds rate.
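A rough sketch of this balancing step follows; the category names and amounts are invented for illustration and are not BANKMOD's internal records.

# Illustrative sketch: after automatic adjustments and user decisions are
# applied at a decision point, any gap between sources and uses of funds is
# closed with Fed funds (sold if funds are in excess, bought if deficient).

changes_in_sources = {              # + means more funds made available
    "deposit growth (forecast)": +4_000,
    "CDs issued (decision)":     +2_000,
}
changes_in_uses = {                 # + means more funds absorbed
    "loan growth (forecast)":       +5_500,
    "securities bought (decision)": +1_200,
}

net_funds = sum(changes_in_sources.values()) - sum(changes_in_uses.values())
fed_funds_rate = 0.045

if net_funds >= 0:
    print(f"sell {net_funds} of Fed funds, earning at {fed_funds_rate:.1%}")
else:
    print(f"buy {-net_funds} of Fed funds, charged at {fed_funds_rate:.1%}")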

Figure 4-Effect of decisions at discrete points

Performance reporting

The major feedback to the user in the interactive
simulation process is the system of reports available
from the model. These reports resemble those which
would normally be furnished by a bank's accounting
system. They provide information on operating income,
security gains and losses, the composition of the balance
sheet, status of the portfolio and a variety of analytical
ratios. One set of reports provides information regarding performance and status for the period being simulated and highlights the impact of the user's decisions
for that period. These reports are designed to aid the
user in further decision-making. The second group of
reports shows results for the entire simulation horizon
to facilitate comparing the results of one simulation
with another.

Figure 5-Formulate mode (a terminal dialog in which the user selects optional balance sheet categories, defines loan and deposit groups, and sets the decision period, planning horizon, funds unit, and portfolio-array specifications)

Uses of the model

The BAI simulation model is especially helpful in aiding management decision-making because it provides a consistent, comprehensive technique for examining sets of decision alternatives under a variety of assumptions about the economic environment. Specifically it aids management in analyzing a set of decisions according to the following criteria:

• Change in average yield earned on securities (paid for purchased funds).
• Related effect on liquidity and capital adequacy of portfolio changes to improve yield.
• Cost of providing funds through sale of securities if supply of funds is inadequate to meet forecasted demand.
• Opportunity to improve earnings performance by aggressively managing the security portfolio for market appreciation.
• Improvement in earnings resulting from changing the leverage of the bank (ratio of assets to equity).

The model is therefore designed to quickly summarize performance and balance sheet status according to these criteria and to evaluate a decision strategy and its sensitivity to changes in the economic environment.

SYSTEM DESIGN

Since BAI is an organization supported by a majority of banks in the U. S., it was of the highest priority to develop a model which would help not only a few very large banks but which would also be of use to medium-sized and smaller banks as well. This requirement has been recognized in several ways. A bank using BANKMOD can tailor a model to fit his bank by including only those balance sheet categories, and associated routines, which are appropriate to his bank. In addition, the model is designed to run on a nationwide time-sharing system thereby enabling any bank to access the model by simply installing a terminal and paying for the necessary CPU and connect time.

The interactive simulation process described above is embedded into a very powerful system with several large programs working together. The individual pro-

grams or modes of the system are:
Formulate
Assumption
Real State
Input-Forms
Query
Simulate
Originally, these modes are programs in the shared
library of the T/S company. When run, they build and
utilize files in the user's private library. This guarantees
security for the user's proprietary data, such as the
initial state of the bank and forecasts of loan and
deposit levels. The major features of the model can be
described in terms of these modes.

Formulate Mode

In the Formulate Mode (FMODE) the user interactively tailors the model to fit his needs. Figure 5
illustrates how the user formulates a specific model for
his bank. The pattern of interaction with the user is
for the program to print the data to the left of the
question mark while the user's response immediately
follows the question mark.
The user establishes the various parameters of the
model. In addition to the basic balance-sheet structure,
he can add optional asset and liability categories which
may be appropriate to his bank. He can also provide
further breakdowns of loan and deposit categories as
desired.
In this mode the user also defines the length of a
period to fit his requirements for accuracy, and the
number of periods to be simulated. He also defines the
size and degree of resolution incorporated in the portfolio array structure. FMODE uses this information to
specially tailor all the other modes. That is, when the
other modes are run, all their arrays, records and files
are custom-built in order to run the program more
efficiently, particularly to minimize response time.
The user normally runs FMODE once to "establish"
his model bank, but may set up additional models
with different period lengths and different levels of
balance sheet detail.
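A rough sketch of the kind of specification FMODE captures is shown below; the layout is assumed rather than the program's actual private-library file format, and the entries merely echo the sort of choices shown in Figure 5.

# Illustrative sketch of a formulated model bank: optional balance sheet
# categories, user-defined loan and deposit groups, period length, planning
# horizon, and portfolio-array resolution (cf. Figure 5). The layout and the
# numeric values are assumptions, not BANKMOD's actual file format.

model_bank = {
    "frb_member": True,
    "optional_assets": ["DUE FROM-FOREIGN", "TREASURY BILLS", "AGENCIES"],
    "optional_liabilities": ["CD'S-MONEY MARKET", "CAP NOTES"],
    "loan_groups": {"IN": "INSTALLMENT", "RE": "REAL-ESTATE",
                    "CM": "COMM'L-METRO", "CN": "COMM'L-NATIONAL"},
    "decision_period": "QUARTER",
    "planning_horizon_periods": 4,
    "funds_unit": "THOUSAND",
    # portfolio array: (minimum purchase yield, increment, number of points)
    "portfolio_array": {"BIL": (2.0, 1.0, 6), "GOV": (2.5, 0.5, 10)},
}

periods = model_bank["planning_horizon_periods"]
print(f"{periods} decision periods of one {model_bank['decision_period'].lower()} each")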

Figure 6-Illustration of assumption input (a page of the named-assumption input form covering market-determined rates: daily funds, yields to maturity for Governments, agencies, and prime municipals, and the price of the bank's stock)

Assumption Mode

In the Assumption Mode (AMODE) the user describes the economic and institutional environment in
which the simulation will take place. Certain assumptions, primarily economic, must be forecast for each
period in the simulation. Examples are yield curves,
loan and deposit levels and Regulation Q limits. (See
Figure 6 for an illustration of a completed page of the
named assumption input form.) Other assumptions are
considered to remain constant over the entire planning
horizon. These include assumptions such as bid-ask
spreads and tax rates.
The user can establish several sets of economic assumptions and assign each a name. One set may represent the environment considered most likely to prevail.
Another may be based upon a more active economy
with greater loan demand and higher interest rates. A
third may forecast a less active economy with all its

implications. Each set, however, must be internally
consistent. These sets of assumptions are used to check
the sensitivity of one set of decisions against different
economic forecasts.
The amount of data required for each named assumption set is approximately 100 items per period in the
simulation horizon. Additionally, about 75 items are
required to specify certain institutional factors which
remain constant over all periods in the simulation and
all environment forecasts. The actual number of items
will depend on the size of the formulated model.
Because of the amount and importance of the data
entered as assumptions, considerable attention was
given to insure the efficiency and accuracy of the input
procedures. The model will generate forms for the
banker to use in assembling the required data. Once the
data have been prepared, a technical assistant can be
used to enter the data through the terminal keyboard.
As the data are entered, the program intensively checks
for technical validity and, to some extent, checks for
logical validity as well.
Another feature permits the user to develop a set of
assumptions which are a partial modification of an established set. The established set is copied under a new
name and the user then makes the desired modifications to this newly generated set.
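A minimal sketch of this copy-and-modify convention follows; the data layout and figures are assumptions for illustration, not AMODE's actual files.

# Illustrative sketch: named assumption sets, with a new set created by
# copying an established one and modifying only what differs. The structure
# is assumed, not AMODE's actual file layout; the rate figures are invented.
import copy

assumption_sets = {
    "EXPECTED": {
        "periods": [
            {"fed_funds_rate": 0.045, "loan_level": 250_000, "deposit_level": 400_000},
            {"fed_funds_rate": 0.047, "loan_level": 260_000, "deposit_level": 405_000},
        ],
        "constant": {"tax_rate": 0.48, "bid_ask_spread": 0.002},
    }
}

# Copy EXPECTED under a new name, then raise rates and loan demand ("TIGHT").
tight = copy.deepcopy(assumption_sets["EXPECTED"])
for period in tight["periods"]:
    period["fed_funds_rate"] += 0.010
    period["loan_level"] = int(period["loan_level"] * 1.05)
assumption_sets["TIGHT"] = tight

print(sorted(assumption_sets))   # ['EXPECTED', 'TIGHT']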

Real State Mode

The Real State Mode (RMODE) is used to enter the most recent actual "state of the bank" data. This includes the current balance sheet amounts and a somewhat aggregated description of the portfolios.

The model will generate forms for assembling the necessary data. (See Figure 7 for an illustration of a completed page of the real state input form.) The data would then be entered manually through the terminal keyboard. For those banks which have automated general ledger and portfolio files, it would be possible to use a utility program to transform the data to BANKMOD specifications and transmit it directly into the model.

Input Forms Mode

This mode (IMODE) provides the input forms described for the AMODE and RMODE. The user specifies the number of copies and whether they should be printed in-house on the terminal or on a high speed printer at the computer center for mailing to the user.

Figure 7-Illustration of real state input (a page of the real state input form listing book values, in thousands of dollars, for asset categories such as cash, reserve, Treasury bills, Governments, agencies, municipals, loans, and buildings and equipment)

Query Mode

The Query Mode (QMODE) provides for convenient access to certain permanent files that are kept on
the computer. For example, it might be necessary to
get the contents of the portfolio files if this information
has been misplaced since the last run, or it might be of
interest to check the nature of certain decisions for a
certain period while performing a sensitivity analysis.
Simulate Mode

The Simulate Mode (SMODE) is the heart of the
whole BANKMOD system. Here all the work, outlined
above in the description of BANKMOD, is carried out.
The user proceeds period by period from the initial
state of the bank (as entered in RMODE) to the end
of the planning horizon. In doing so, the user requests
specific information, makes decisions regarding the discretionary assets and liabilities, and calls for various
performance and analytical reports.
Within each period, the decision-making process is
an interactive conversation as portrayed in Figure 1
above. The user requests information about the status
of the bank, makes appropriate decisions, and calls for
reports of various levels of detail to evaluate the impact
of his decisions. These reports are called "snapshot"
reports and show performance data for only the period
being simulated. When satisfied with the decisions for
a period, the user causes the simulation to proceed to
the next period until the end of the simulation horizon
has been reached.

Once the user has completed the decision-making
process over the entire horizon, he can obtain so-called
"horizon" reports summarizing performance and status
over each period and in total for all periods. These reports can be used for comparing the results of separate
simulation runs. This can be done to evaluate different
decision sets (representing different strategies) or for
evaluating a decision set against several different
environments.
The interactive decision process is discussed in more
detail in the following section of the paper where it is
used to illustrate how the model handles the manmachine interface.

COMMUNICATIONS HANDLING AND THE
MAN-MACHINE INTERFACE
As has been intimated, it was of the highest priority
to develop a model which would help not only a few
giant banks but which would be of use to medium-sized
banks and possibly small banks as well. This directly
implies that the model must be designed to operate in
a way understandable to the bank management-its
use must not depend on a sophisticated management
science department since only the largest banks have
such departments. Further, the user should be able to
run the model and interpret the results with a minimum
of formal instruction. It seems clear, therefore, that
the acceptance and the ultimate value of the model
depended heavily on the quality of its interactive capabilities. That is to say, successful use of the model depends on a satisfactory solution of the man-machine
interface problem.
It has been frequently pointed out that adequate
handling of the man-machine interface is one of the
most difficult and vexing problems in using computer
time-sharing systems (or for that matter, on-line realtime systems as well).8 The difficulty involves achieving accurate, unambiguous and efficient communication
between the user and the computer. The user must find
it easy and natural to enter input data, to guide the
simulation process, and to extract and interpret simulation outputs. The next sections describe how BANKMOD satisfies these requirements and concludes with
an illustrative run of the Simulate Mode.

Input

BANKMOD offers the user format-free input with
extensive error checking. The program prevents the
user from causing a program or system breakdown by

accidentally entering faulty responses. It also prevents
the user from entering information which does not
meet the built-in logic tests.

User-computer interaction

Great effort has been spent to make user-computer
interaction as easy and natural as possible. First, the
user communicates with the model using terms familiar
to him. Second, the communication process is standardized throughout the model, so that whatever mode
is being used, the user employs the same technique for
entering data, issuing commands, and requesting
output.
In addition, the method of communication with the
model has been reduced to two types: indicated response and sequence-free commands. The indicated response is used where the input cannot be sequence
free, such as entering certain parameters at the initiation of the run. A simple command structure is used to
enter user decisions, requests for information and reports, and simulation control commands on a completely sequence-free basis.
Once the user has become familiar with these communications standards, he can run the simulation with
a minimum of distraction for system mechanics and
concentrate on the problem-oriented aspects of the
simulation.

Output

The speed and format of the output are of critical
importance. Speed of response is facilitated by the design of the system. Most of the output requires only
nominal calculation with very short response time.
Major computations with longer delays (of the order
of 10 to 20 seconds) are made only when moving from
one simulation period to the next. This design provides
the user with quick response in decision-making interaction and defers the longer response times to moments when the user is psychologically better conditioned to endure the delay.
The proper selection and presentation of output information is also of critical importance for a model
such as this. Not only is the total amount of data quite
large, but decisions must be based upon this data. One
of the major research efforts was, therefore, devoted to
structuring the output to maximize the information
conveyed and minimize the total data presented. For
the first evaluation of the effect of a decision, a banker
can ask for a report of a few key indicator variables.

Figure 8-Illustration of a simulate mode run (a terminal transcript showing selection of a named assumption set and a saved decision set, decision and information commands, funds-balance and FLASH reports, a portfolio summary, and saving of the modified decision set)
If necessary or desired he can then request additional
reports providing greater levels of detail. Without
this report structure whereby the user requests a specific type of information and specifies the level of detail, the overwhelming amount of information would
make the interactive approach totally impractical.

Sample run
To illustrate the design concepts described above,
this section concludes with a sample run. Since the
communications standards are uniform throughout
the model and the Simulation Mode is the most extensively used, a run of this mode is used for illustrative purposes. (The sample run referred to in the
material following is reproduced in Figure 8.)
After the user has started SMODE, the computer is
in one of two states: either awaiting a pre-defined
indicated response or awaiting a command. Examples
of pre-defined indicated responses are shown in Figure
8 at the start of the SMODE run. The program here
requests information regarding the assumption set and
decision set to be used in the simulation. Any input
other than what is indicated will be rejected resulting
in a repetition of the inquiry. The request and response
are kept as short as possible.
Commands are given in those parts of the run which
are sequence-free. Commands are essentially of four
types-Control, Information, Decision and Report-and all commands begin with the character "*".
Control commands steer the flow of the program. An
example in Figure 8 is the command "*N" which
causes the simulation to move from decision point 0
(the starting point) to decision point 1 (the beginning
of the first full calendar quarter). Another example
near the bottom of Figure 8 is "*END" causing the
simulation to proceed to the last period and terminate.
Another control command, not illustrated in the example, is "*PAUSE". This command stores the status
of the simulation permitting the user to sign off and
later resume the simulation at the point where he
paused. This has proved to be very convenient in the
use of the model since the user can interrupt the simulation to consider decision strategy or avoid excess
fatigue without elaborate requirements for saving the
status of the simulation.
The information command is a multi-purpose command of the format "*I cat". The abbreviation "cat"
signifies that the user specifies the balance sheet category for which he desires information. In the sample
run, the command "*I BIL" shows the use of this
command to obtain information about the bank's

holdings of Treasury Bills. After providing the basic
information, the program asks the user if he wishes
detailed portfolio data. If the user provides the response
"Y", the portfolio array as shown in the lower portion
of Figure 8 is printed.
The decision command is also a multi-purpose command taking the format "*D cat" where "cat" refers
to the balance sheet category to be changed. The use
of this command is shown at the mid-point of Figure 8.
The command "*D MUP" indicates that the user
wants to enter a decision to buy or sell prime municipals. The computer responds with the format headings prescribing how the data is to be entered. The
command "*D GOV" illustrates a command to buy
or sell U. S. Government bonds.
The run illustrates several uses of the report command. The command "*FUN" illustrates a request for
the current funds balance (net funds currently available
for investment). This is essentially a one-item report.
A more elaborate report is obtained by the command
"*FLASH" which shows the effect of the decisions on
certain key indicators of performance. The report combines tabular presentation of data with a graphical display of the direction and magnitude of the changes
resulting from the decisions. Additional "snapshot"
reports are available providing greater levels of detail
of performance and bank status for the period being
simulated. Also, at the completion of the simulation,
the user can request various "horizon" reports which
summarize the simulation over all the periods in the
simulation.
The sample run terminates after asking the user
whether or not the decision set should be named and
saved for future simulations.
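As a sketch of the command structure just described, the following Python fragment (the writer's illustration; the handler bodies are hypothetical placeholders, only the command names come from the text) shows how a simulation-mode loop could dispatch the four command types.

```python
# Illustrative sketch only: a command loop in the style described above.  The
# command names (*N, *I, *D, *FUN, *FLASH, *PAUSE, *END) come from the text;
# the handlers here print placeholders and stand in for the model's routines.

def simulation_mode(read=input, write=print):
    handlers = {
        "N":     lambda arg: write("ADVANCING TO NEXT DECISION POINT"),
        "END":   lambda arg: write("RUNNING TO FINAL PERIOD"),
        "PAUSE": lambda arg: write("STATUS SAVED; SIGN OFF AND RESUME LATER"),
        "I":     lambda arg: write(f"INFORMATION REPORT FOR {arg}"),    # e.g. *I BIL
        "D":     lambda arg: write(f"DECISION ENTRY FOR {arg}"),        # e.g. *D GOV
        "FUN":   lambda arg: write("FUNDSBAL.: ..."),
        "FLASH": lambda arg: write("KEY INDICATORS, BEFORE VS. AFTER DECISIONS"),
    }
    while True:
        line = read("ENTER COMMANDS :").strip().upper()
        if not line.startswith("*"):
            write("PLEASE RE-ENTER")          # any other input is rejected
            continue
        cmd, _, arg = line[1:].partition(" ")
        handlers.get(cmd, lambda a: write("PLEASE RE-ENTER"))(arg)
        if cmd in ("END", "PAUSE"):           # both terminate the interactive loop
            break
```

In the actual model the handlers would, of course, invoke the simulation and report routines rather than print placeholders.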


Simulation of large asynchronous logic
circuits using an ambiguous gate model
by S. G. CHAPPELL
Bell Telephone Laboratories
Naperville, Illinois

and

S. S. YAU
Northwestern University
Evanston, Illinois

INTRODUCTION

Digital logic simulation is the process whereby the action of a logic circuit due to a specified input is predicted based upon some model of the circuit. Logic simulation is becoming increasingly necessary as larger and more complex computers are built. Because of the cost of building hardware it is not wise to commit a circuit design to manufacture without first verifying the operation of the circuit by simulation. This is true even for large computers (say 50,000 gates) where simulation will eliminate many logic errors and may save construction of a prototype model. Simulation may be used to predict the output of the circuit due to specified faults as well as to predict the output of the good (fault free) circuit. A dictionary is generally compiled of the output of the circuit in the presence of known faults. By comparing the actual (perhaps faulty) circuit output to the correct output, it is possible to detect and diagnose a fault in the circuit.
Many simulators have been proposed1-9 which allow simulation of the good circuit and several have been proposed which allow simulation of the traditional stuck-at (stuck-at-one and stuck-at-zero) faults. Among these simulators, the following are more significant. Eichelberger1 first proposed a simulator for logic circuits, based on the work of Yoeli and Rinon,2 which uses a ternary logic 0, ½ and 1, where logical 0 and 1 are the Boolean 0 and 1 and the ½ represents a don't-know value. However, his simulation and hazard detection are based on the Huffman Model, which is not an accurate representation of a general asynchronous logic circuit because the delay in all the gates is lumped into the delay elements. Seshu's Sequential Analyzer3 for logic circuits provides fault simulation and race analysis which is also based on the Huffman Model. However, only binary simulation logic is used and even combinational logic is not simulated correctly since hazards are suppressed due to the use of the leveling technique. Leveling means the output of a gate is not calculated until all its inputs are known. In this way the output of each gate is calculated only once. Later Chang4 extended the simulator to include shorted input diodes on a DTL gate. Szygenda, Rouse and Thompson5 proposed a simulator which uses ternary simulation logic and provides various gate delays and ambiguity regions (regions where the simulation model cannot predict the output of a gate). However, the third logic value is only used for circuit initialization. During simulation a Potential Error Flag is used to represent the ambiguity region and an additional logic element is required to manipulate this flag. In addition, the uncertainty region associated with the turn-on delay is assumed to be the same as that for turn-off delay for a gate.
The simulator proposed in this paper is the only simulator which is able to accurately simulate the effects of shorted diode and shorted net (gate outputs) faults in an asynchronous sequential circuit. In addition, the gate model used here allows specification of minimum and maximum turn-on and turn-off delays where the interval between the minimum and maximum delay is treated as a third simulation value x (the ambiguity or don't-know value). The ambiguity value allows efficient handling of each gate (no extra elements are required) and the use of a new high-frequency rejection technique. This technique provides easy suppression of transient input conditions of shorter duration than the appropriate minimum transition delay.

Figure 1-Load vs transition time curves for TTL logic (delay in nanoseconds versus fanout load; panels (a) and (b))
The gate model used allows detection of certain constrained hazards and produces a worst case timing
analysis of the circuit (based on the transition delays
assigned to each gate) for both the good and the faulty
circuit. This simulator has been implemented for circuits of up to 50,000 gates, and its speed is comparable
to that of the latest simulator.5
THE GATE MODEL
In order to provide an accurate simulation of a logic
circuit, an accurate model of each logic element is
necessary. Only gates will be simulated here since the
actual logic circuit is composed only of interconnected
gates. Attempts to simulate larger modules such as

flip-flops and registers may introduce logic and timing
errors. The only exception to this rule is that a pure
delay element is allowed to simulate actual delay lines
or long wiring runs.
If the load versus time curve for a typical TTL gate
shown in Figure 1 is examined, it is apparent that the
turn-on and turn-off (transition) delays are different.
It is also known that there is considerable variation in
transition delays among supposedly identical gates. In
addition, such factors as the loading, the length of the output net, and temperature affect the speed of a gate.
However, if reasonable design constraints are imposed
and the gates are reliably manufactured, the transition
delays will usually fall within certain bounds. Therefore,
a logic gate may reasonably be characterized by its
minimum and maximum turn-on (0 to 1 transition) and
turn-off (1 to 0 transition) delays.
The turn-on (turn-off) delay is the time between
application of the input signal and the time the output
signal reaches 90 percent (10 percent) of its final value
(initial value). If the gate is operating in saturation,
the turn-off time includes the storage time as well as
the decay time of the gate (transistor). However, this
characterization of a gate is very general and may be
applied to any gate regardless of its mode of operation
or location in the logic circuit. Another factor which
must be considered is the high-frequency rejection of a
logic gate. That is, a gate will not respond to an input
pulse of shorter duration than the appropriate minimum
transition delay. Therefore, the gate model must both perform high-frequency rejection and account for
variations in gate transition delays. In this paper, the
gate model will be referred to as the gate, while the
real gate will be called the real or actual gate.
Assume the gate G may be characterized by the
following four parameters:
a = minimum turn-on delay
b = maximum turn-on delay

c = minimum turn-off delay
d = maximum turn-off delay
The transition delays for G may be set to any integral
values and are typically selected based on statistical
information about the behavior of the gate being used
for some given load. If a = b and c = d, then the gate model G is said to be unambiguous; if a < b or c < d, the gate model is said to be ambiguous.
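To make the four-parameter characterization concrete, here is a minimal Python sketch (the writer's illustration, not the authors' simulator): a gate described by (a, b, c, d) schedules its output to the ambiguity value x at the minimum delay and to the final Boolean value at the maximum delay, and ignores input conditions shorter than the appropriate minimum delay as a simplified stand-in for the high-frequency rejection described above.

```python
# Sketch of the ambiguous gate model described above (illustrative only).
# a, b = minimum and maximum turn-on delay; c, d = minimum and maximum
# turn-off delay; 'x' is the ambiguity (don't-know) value reported between them.

X = "x"  # third simulation value

class AmbiguousGate:
    def __init__(self, a, b, c, d):
        self.a, self.b, self.c, self.d = a, b, c, d  # integral delays

    def output_events(self, t, new_value, input_stable_for):
        """Events (time, value) scheduled when the gate's Boolean output
        changes to new_value because of an input change at time t."""
        lo, hi = (self.a, self.b) if new_value == 1 else (self.c, self.d)
        # Simplified high-frequency rejection: an input condition shorter
        # than the appropriate minimum transition delay produces no response.
        if input_stable_for < lo:
            return []
        if lo == hi:                       # unambiguous gate (a = b, c = d)
            return [(t + lo, new_value)]
        return [(t + lo, X), (t + hi, new_value)]  # ambiguous interval, then final value

# Hypothetical gate: turn-on delay 4-6 ns, turn-off delay 2-3 ns.
g = AmbiguousGate(a=4, b=6, c=2, d=3)
print(g.output_events(t=10, new_value=1, input_stable_for=8))   # [(14, 'x'), (16, 1)]
```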

Adaptive Memory Trackers

\bar{u}_{dn}^{(\nu-s)} = \sum_{i=0}^{N} M_{di}\,\bar{u}_{i(n-1)}^{(\nu)} + N_d\,u_n + \sum_{j=0}^{s} S_{dj}\,u_{n-\nu+j}, \qquad d = 0, 1, \ldots, N,  (3)

and the penetrating memory filter is:

\bar{u}_{dn}^{(\nu+p+1)} = \sum_{i=0}^{N} J_{di}\,\bar{u}_{i(n-1)}^{(\nu)} + K_d\,u_n + \sum_{j=1}^{p} P_{dj}\,u_{n-\nu-j}, \qquad d = 0, 1, \ldots, N.  (4)

|E| > K\sigma,  (7)

where K is a constant. This is shown in Figure 2.
The penetrating memory filter for this case is the well-known growing memory least squares filter:11

\bar{u}_n^{(\nu+1)} = u_{0n}^{(\nu+1)} = \frac{\nu(\nu-1)}{(\nu+1)(\nu+2)}\,\bar{u}_{0(n-1)}^{(\nu)} + \frac{\nu(\nu-1)}{(\nu+1)(\nu+2)}\,\bar{u}_{1(n-1)}^{(\nu)} + \frac{2(2\nu+1)}{(\nu+1)(\nu+2)}\,u_n

\dot{\bar{u}}_n^{(\nu+1)}\,T = u_{1n}^{(\nu+1)} = -\frac{6}{(\nu+1)(\nu+2)}\,\bar{u}_{0(n-1)}^{(\nu)} + \frac{(\nu-1)(\nu+4)}{(\nu+1)(\nu+2)}\,\bar{u}_{1(n-1)}^{(\nu)} + \frac{6}{(\nu+1)(\nu+2)}\,u_n  (8)

The shrinking memory filter for this case may be derived from (8) and the equations for the finite memory least squares filter.11 The result is:


\bar{u}_n^{(\nu-1)} = u_{0n}^{(\nu-1)} = \frac{\nu-9}{\nu-1}\,\bar{u}_{0(n-1)}^{(\nu)} + 5\,\bar{u}_{1(n-1)}^{(\nu)} + \frac{2(2\nu-3)}{\nu(\nu-1)}\,u_n + \frac{2}{\nu-1}\,u_{n-\nu+1} + \frac{2(\nu+3)}{\nu(\nu-1)}\,u_{n-\nu}

\dot{\bar{u}}_n^{(\nu-1)}\,T = u_{1n}^{(\nu-1)} = -\frac{18}{(\nu-1)(\nu-2)}\,\bar{u}_{0(n-1)}^{(\nu)} + \frac{\nu+10}{\nu-2}\,\bar{u}_{1(n-1)}^{(\nu)} + \frac{6}{\nu(\nu-1)}\,u_n + \frac{6}{(\nu-1)(\nu-2)}\,u_{n-\nu+1} + \frac{6(\nu+2)}{\nu(\nu-1)(\nu-2)}\,u_{n-\nu}  (9)

Figure 3-Adaptive memory, straight line, least squares tracker when physical model is constant, Δf(E) given by Figure 2, with K = 0.8

In these equations, u_1n(f)/T denotes the velocity output.
The figures which follow show the results of simulations for this adaptive memory tracker on an IBM 1130 computer, using disturbances d_n which are independent samples from a pseudo-normal distribution generated by a subroutine within this computer. In these figures, the values P_n which constitute the physical model are shown by a dashed line, the inputs u_n = P_n + d_n are shown by crosses, and the tracker outputs ū_n(f) are shown by the solid line. At the top of each figure is a plot of the values that the adaptive variable f assumes for each value of n.
A convenient measure of the tracker response as a function of f when the physical model is strictly linear is the noise-reduction ratio.11 This may be used, therefore, in Figure 3, which follows, but only rarely in the remaining figures.
A simple measurement of the tracker response over the interval i = j to i = n is given by the response:

R_{j,n} = \frac{1}{n-j+1} \sum_{i=j}^{n} \bigl| \bar{u}_i^{(f)} - P_i \bigr|  (10)

Figure 2-Change in memory length (Δf) as a function of error signal (E) for trackers shown in Figures 3, 4, and 5 (Δf = +1 for |E| below the threshold K, -1 above)

The response is viewed visually in an easy way in these figures as R_n,n = |ū_n(f) − P_n|, the magnitude of the difference between the solid line representing the tracker output ū_n(f), and the dashed line representing the physical model P_n at each value of n. Values of the response R_j,n over particular intervals will be displayed in these figures where necessary in order to lend preciseness to the description of the tracker response.
Figure 3 shows the case where the physical model is constant and K = 0.8. This may be called a regulator, homeostat,12 or, even, sociestat,13 where the control variable is memory length. The slow increase in f and resulting slow improvement in the tracker response are due to the low value of K = 0.8.
In Figures 4 and 5, the value of K is K = 1.0, and the
physical model is piecewise linear, over the path A, B,
C, D, E, F, G. It is assumed that BA is extended before
A linearly, and the tracker is started at A with the value
of f as shown. The operation of the tracker before A is
similar to that shown in Figure 3. This physical model
corresponds to the case where there are repeated
changes in environmental constraints. It may be
produced in an approximate sense by an aircraft making
repeated changes in bearing of 90° through high g
maneuvers, or through repeated changes in laws, either
in a society or, as in learning reversal experiments,
on organisms.14
In Figure 5, the tracker does not track BC, CD, DE, EF, and FG until B1, C1, D1, E1, and F1, respectively,


Figure 4-Adaptive memory, straight line, least squares tracker when physical model is piecewise linear, Δf(E) given by Figure 2, with K = 1.0

for one of two reasons:
Either
(i) The tracker does not sense data from the physical model BB1, CC1, DD1, EE1, and FF1, and in the absence of data simulates data from the substitute model BB0, CC0, DD0, EE0, and FF0; or
(ii) The tracker is being deceived and receives false data from the substitute model BB0, CC0, DD0, EE0, and FF0.
In either case, the tracker senses data from the physical model starting at B1, C1, D1, E1, and F1. The values s_n of the substitute model are shown in Figure 5 as a dotted line, so that the values of the input data u_n along the paths BB0, CC0, DD0, EE0, FF0, are given by u_n = s_n + d_n.
It can be seen that this tracker shows very good
learning characteristics. Not only does the tracker show

improved response to each change in path of the
physical model, but the tracker also distinguishes
between the conditions of Figures 4 and 5, in the former
case stabilizing to values of f in the range 52 ≤ f ≤ 76, and in the latter case stabilizing to values of f in the range 40 ≤ f ≤ 65. It can be seen that in Figure 4 the response improves from R94,187 = 2.17σ to R375,468 = 0.69σ and R469,561 = 1.02σ. In Figure 5 the response improves from R126,187 = 4.95σ to R407,468 = 1.70σ and R500,561 = 1.92σ. The improvement in response shown by this simple tracker is similar to that shown by the higher vertebrates.14 Such performance, obtained with the filter model, K = 1, e = e1 = 1, s = 1, p = 0, all being fixed,
suggests that more sophisticated examples, where some
of these are allowed to vary, either in a preset way or
adaptively, may be of more value for future work and
research.
In the above examples, the adaptive memory tracker
is unbounded. Practical implementation within a
computer (whose physical memory is finite) requires a
value, F, so that f ≤ F for all f.* This is easily accomplished with the above tracker by adding the following proviso to (6) and (7):

If ν = F then Δf = −1 (s = 1) regardless of |E|; i.e., in this case, the value of |E| is ignored and (9) is used.  (11)

To give a more sophisticated example, consider the same filter model (straight line, least squares), the same e = e1 = 1, but Δf(E), the change in memory length as a function of error signal, given as shown in Figure 6. Here, Δf(E) is again a fixed function, but now Δf lies in the range −50 ≤ Δf ≤ +50, as shown. This implies

Figure 5-Adaptive memory, straight line, least squares tracker when physical model is piecewise linear with substitute, Δf(E) given by Figure 2, with K = 1.0

Figure 6-Change in memory length (Δf) as a function of error signal (E) for tracker shown in Figures 7 and 8

* Of course there is no need from the standpoint of performance to exceed the required specifications for the problem at
hand.


values of s = 0, 1, ..., 50 and values of p = 0, 1, ..., 49.
This is of little interest from an implementation standpoint, and the corresponding equations for the shrinking
memory and penetrating memory filters need not be
given. This may be of interest, however, for biological
systems. It is reasonable to suppose that there is more
flexibility in such systems than implied by the binary
alternative shown in Figure 2. One purpose of this
example is to show that large variations in memory
length as a function of error signal have unfortunate
consequences in terms of tracker response. In particular,
such variability in short or medium term temporal
memory occurs in some cases of hysteria, or amnesia in
senile dementia. At certain times of day, a patient's
temporal memory extends over a period of hours; at
other times of day, the patient's temporal memory
extends only over a period of minutes.
Figures 7 and 8 demonstrate the operation of this
tracker for the same cases as presented in Figures 4
and 5. Here the tracker is bounded with F= 175 (corresponding to a form of permanent amnesia). The values
of Δf(E) given by Figure 6 are limited by the proviso that f lie in the range 2 ≤ f ≤ 175.
As in these earlier figures, the line BA is extended
linearly before A, in order to provide previous input
data. In Figures 7 and 8, however, the tracker is started
normally at f = 2.
The tracker response over the path AB indicates
that such a tracker is a satisfactory homeostat as long
as F is sufficiently high. There are sudden sharp drops in
f, which would seriously degrade the tracker response
were the value of F too low. The chosen value of F = 175
is sufficiently high to prevent this degradation.
However, the response of the tracker to repeated
changes in environmental constraints is poor. In Figure
7 the response values over the indicated intervals vary

Figure 7-Adaptive memory, straight line, least squares tracker when physical model is piecewise linear, Δf(E) given by Figure 6, with F = 175


Figure 8-Adaptive memory, straight line, least squares tracker when physical model is piecewise linear with substitute, Δf(E) given by Figure 6, with F = 175

between 0.70σ and 0.98σ; in Figure 8 the response values over the indicated intervals vary between 1.50σ and 1.74σ. There is no improvement in the response, such as shown by the decrease in response values of Figures 4 or 5. The response values of Figures 7 and 8 are of the same order as the final response values of Figures 4 and 5, respectively, but here the response is very erratic, with many sharp increases in R_n,n over all values of n after B in Figure 7 and B1 in Figure 8.
Thus the response of the tracker to repeated changes
in environmental constraints is poor in Figure 7, and
worse in Figure 8. This erratic response with no learning
is similar to the response of the above mentioned
patients in such a situation as, for example, repeated
transfers from one hospital to another. The response of
these patients in this case may, in fact, become so poor
as to result in death.
In all of the above, including the simulations, the
model or plant error15 is zero. It follows from8,16 that
the implementation of an adaptive memory tracker
which uses a least squares recursive filter model would
result in non-zero model or plant errors accumulating
without bound as the number of recursions increases.
This may be avoided either by considering other filter
models, such as the stable filter models indicated in
References 8 and 17, for example, or by restarting the
adaptive memory tracker at regular intervals before
these errors become too great. Note in particular that
this tracker does not distinguish between the accumulation of errors in ū_j and the variation in u_j caused by the disturbances d_j, j = n, n−1, ..., n−e+1, as shown by the equation for the error signal (2). Thus, as n increases, an accumulation of such errors in ū_n which exceeds the variations in u_n caused by the disturbances
will result in increases in E. Since adaptive memory


trackers in normal operation have Δf(E) as a monotonically decreasing function*, it follows that the value
of the adaptive variable f will eventually reduce to
f = 2, its minimum value, at which point the tracker
can be restarted.
The restart equations for the above examples at n = n_c are just

\dot{\bar{u}}_{n_c}^{(2)}\,T = u_{1n_c}^{(2)} = u_{n_c} - u_{n_c-1}.  (12)
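To show how the pieces above fit together, the following Python sketch (the writer's illustration, not the authors' program) combines the growing-memory update (8), the shrinking-memory update (9), the binary Δf rule of Figure 2, the bound F of proviso (11), and a restart in the spirit of (12). The one-step prediction error used for E and the noise scale sigma are assumptions standing in for the paper's error signal (2).

```python
# Minimal sketch (illustrative only) of the adaptive-memory, straight-line,
# least-squares tracker: memory grows by one sample via (8) when the error
# signal is small, shrinks by one via (9) otherwise, is capped at F as in (11),
# and restarts from the two latest samples, as in (12), when it bottoms out.

def adaptive_memory_tracker(u, K=1.0, sigma=1.0, F=175):
    """u: noisy samples u_n = P_n + d_n (T = 1).  Returns smoothed positions."""
    f = 2                               # adaptive memory length, minimum value 2
    pos, vel = u[1], u[1] - u[0]        # u0 (position) and u1 (velocity * T)
    out = [u[0], pos]
    for n in range(2, len(u)):
        nu = f
        E = u[n] - (pos + vel)          # assumed error signal (e = 1 case)
        if abs(E) <= K * sigma and f < F:
            # Growing-memory update, equation (8): memory nu -> nu + 1
            a = nu * (nu - 1) / ((nu + 1) * (nu + 2))
            g = 2 * (2 * nu + 1) / ((nu + 1) * (nu + 2))
            c = 6 / ((nu + 1) * (nu + 2))
            pos, vel = (a * pos + a * vel + g * u[n],
                        -c * pos + (nu - 1) * (nu + 4) / ((nu + 1) * (nu + 2)) * vel + c * u[n])
            f += 1
        elif nu >= 3:
            # Shrinking-memory update, equation (9): memory nu -> nu - 1,
            # removing the two oldest samples u[n-nu] and u[n-nu+1].
            o1, o0 = u[n - nu + 1], u[n - nu]
            pos, vel = (
                (nu - 9) / (nu - 1) * pos + 5 * vel
                + 2 * (2 * nu - 3) / (nu * (nu - 1)) * u[n]
                + 2 / (nu - 1) * o1
                + 2 * (nu + 3) / (nu * (nu - 1)) * o0,
                -18 / ((nu - 1) * (nu - 2)) * pos + (nu + 10) / (nu - 2) * vel
                + 6 / (nu * (nu - 1)) * u[n]
                + 6 / ((nu - 1) * (nu - 2)) * o1
                + 6 * (nu + 2) / (nu * (nu - 1) * (nu - 2)) * o0,
            )
            f -= 1
        else:
            # Memory already at its minimum f = 2: restart as in (12)
            # (position restarted at the latest sample is an assumption).
            pos, vel, f = u[n], u[n] - u[n - 1], 2
        out.append(pos)
    return out
```

Run on a constant physical model with small pseudo-normal disturbances, this sketch should behave like Figure 3: the memory length climbs on balance and the smoothed output settles toward the constant.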

In the above discussion, only simple examples have
been considered, in order to display basic values of the
mathematical model in an easy way and, in particular,
to highlight the effect of adaptive changes in memory
length, all other parameters being fixed. It seems clear
that the mathematical model will have more value
when other parameters, such as Δf(E), or e and e_i,
are allowed to vary, and especially when the physical
models or the statistics of the disturbances are more
complicated. It is recommended that these ideas be
extended and applied.

ACKNOWLEDGMENT
Valuable assistance was provided by Lowell Dean
McMahan, who performed the programming for the
simulations.

REFERENCES
1 T R BENEDICT G W BORDNER
Synthesis of an optimal set of radar track-while-scan
smoothing equations
IRE Transactions on Automatic Control pp 27-36
July 1962
2 H R SIMPSON
Performance measures and optimization condition for a
third-order sampled-data tracker
IEEE Transactions on Automatic Control pp 182-183
April 1963

* It may be of interest to consider functions, Δf(E), which are not
monotonically decreasing, in order to study organisms whose
memory operation is of an abnormal or so-called paradoxical
type.

3 E NAGEL
The structure of science-Problems in the logic of scientific
explanation
Harcourt Brace and World 1961
4 L APOSTEL
Towards the formal study of models in the non-formal
sciences, in The concept and the role of the model in mathematics and natural and social sciences
Editor H Freudenthal Proceedings of the Colloquium sponsored by the Division of Philosophy of Sciences organized at Utrecht January 1960 by H Freudenthal
Reidel Dordrecht 1961
5 W R ASHBY
An introduction to cybernetics
Chapman and Hall Ltd 1956
6 D A NORMAN Editor
Models of human memory
Academic Press 1970
7 J M NIELSON
Memory and amnesia
San Lucas Press 1958
8 G EPSTEIN
On finite-memory recursive filters
IEEE Transactions on Information Theory Vol IT-16
No 4 pp 486-487 July 1970
9 M BLUM
Recursion formulas for growing memory digital filters
IRE Transactions on Information Theory Vol IT-4
pp 24-30 March 1958
10 R E KALMAN
A new approach to linear filtering and prediction problems
Journal of Basic Engineering Vol 82D pp 35-45 March
1960
11 N LEVINE
A new technique for increasing the flexibility of recursive
least squares data smoothing
The Bell System Technical Journal pp 821-840 May 1961
12 W B CANNON
The wisdom of the body
NY 1932
13 A L STINCHCOMBE
Constructing social theories
Harcourt Brace and World 1968
14 M E BITTERMAN
The evolution of intelligence
Scientific American Vol 212 No 1 pp 92-100 January 1965
15 H W SORENSON
Least-squares estimation from Gauss to Kalman
IEEE Spectrum pp 63-68 July 1970
16 G EPSTEIN
Comment on 'On finite-memory recursive filters'
IEEE Transactions on Information Theory Vol IT-17
No 5 September 1971
17 G EPSTEIN
A note on the derivation of finite-memory almost-least-squares
recursive filters
IEEE Transactions on Information Theory Vol IT-17
No 6 November 1971

A panel session-Planning community information utilities
Conference Results

by BARRY W. BOEHM

The RAND Corporation
Santa Monica, California

Although diversity of opinion characterized many of the individual discussions, the conference yielded a surprisingly strong degree of consensus on a series of four major related points.
1. Mass information utilities of some sort will be with us by the 1980s. They will probably be based on cable TV to the home and a low data-rate return line, although some participants felt that two-way video, and particularly Picturephone, offered a strong alternative. Some precursors exist now in airline and ticket reservation systems, IBM's Advanced Administrative System, and Mitre's TICCET system for elementary education in Reston, Va. Some commercial planning is going on, including efforts at Hughes and RCA, and a multi-client study by A. D. Little. Paul Baran cited a market analysis by the Institute for the Future, indicating very little large-scale penetration before the 1980s and about a $15-20 billion market by the end of the 1980s.
2. Mass information utilities carry a great deal of social risk. Especially within the tight constraints of maintaining economic self-sufficiency, it will be difficult to avoid effectively discriminatory service policies between rich and poor users, urban and rural users, English-speaking and non-English-speaking users. Thus, any explicit or implicit public support of such utilities will have to be carefully thought through. An information utility will probably widen the gap between the information-rich and the information-poor, although probably not in the long run. Telepurchasing and continuous credit would increase the temptations and hazards of overspending, but could on the other hand eliminate the overcharging for goods in ghetto stores.
Information privacy aspects will certainly be touchy, even though economics will probably dictate a decentralized file structure. The polling and voting area is particularly sensitive to the quality of safeguards. Provision for an "Information Bill of Rights" is certainly necessary, and perhaps also for such things as anonymous coin-operated terminals and a "Fifth Amendment" switch (to insure user anonymity) on the console. Management aspects of the utility will be just as touchy. Even if the utility were just a distributor, the management would have considerable power over priorities. And, if users strongly adjust their lives around the utility, they won't have much choice but to go along with the management.
3. A well-designed, scientifically-evaluated Prototype Community Information Utility (PCIU) would greatly reduce the long-term social risk. It would provide an opportunity to sense the resulting social strains and experiment with ways to reduce or eliminate them. It would also reduce the economic risk to business in interfacing with an information utility. As Bruce Gilchrist expressed it, a PCIU could provide the kind of future socioeconomic insights we might have gained by equipping a representative community in 1920 with two cars per family and the associated services.
However, developing a PCIU wouldn't be easy or cheap. Very rough estimates for providing a full range of services to a representative city of 90,000 people were: approximately 80 million computer instructions per second, 15,000 file accesses per second, 10 million statements of computer program, 7-10 years of development time and a development cost of $500 million-$1 billion. This puts the full-scale PCIU into the category of a major national effort; however, some perspective is restored when one considers that the nation's computing bill for the manned spaceflight effort during the 1960s has been estimated at $2 billion. In any case, the PCIU certainly would need a great deal of careful preliminary planning before proceeding into development.
4. The next steps toward a PCIU would be valuable whether or not they resulted in a PCIU. These steps include:

a. Development of a thorough, detailed PCIU plan
including analysis of management, services, and
technical alternatives, and delineation of principles and pitfalls common to any information
utility implementation.
b. Evaluation of related experiences to date, and

their implications with respect to PCIU development and operation.
c. Development and evaluation of some low-cost
pre-prototypes, involving groups of 100-1000
users and based on existing resources; e.g., an
existing CATV system, an existing educational
network, or a just-developing planned community.
Even if such studies were not followed by actual
development of a PCIU, the insights they would provide on the social and economic implications of any
sort of information utility would be invaluable in guiding the development of alternative systems, in order to
make sure that the utility would serve the people and
not vice versa.

Software Design for the Community
Information Utility

by DONALD COHEN

The RAND Corporation
Santa Monica, California

The aim of our current efforts is to present a framework that will support further consideration, and hopefully development, of a Prototype Community Information Utility (PCIU). There is a continuing explosive
growth in the number and size of data bases and information services that are potentially useful to various
segments of the community. If a common point of
contact can be developed among these data bases and
services (no mean problem in itself), then the economic
and social value of a properly integrated data bank
could greatly exceed the value of the individual data
bases. To exploit such a data bank in any effective
manner requires a number of things:
• Data bases, individually and collectively, must be
organized in a consistent manner, their utility and
scope of application carefully defined, and their
validity certified.
• A continuing process must be established for the
collection, organization, evaluation, and certification of new data.
• The set of services that utilize the data bank must
be carefully selected, defined, and implemented so
that the system is not burdened with applications

that are individually of limited scope and difficult
to interface with one another.
• The system that supports these services must be
virtually failsafe, guarantee a very high degree of
data security and privacy, and provide the basis
for management mechanisms that insure its continued operation within accepted guidelines.
Preliminary estimates of the PCIU workload generated by a moderate sized city (75,000-100,000 people)
indicate that an instruction execution rate of 80-85
million instructions per second and some 15-20 thousand
file accesses per second will be required. Some 40-50
thousand on-line consoles will have to be supported
with possibly 20-50 percent of these active at any one
time. Extending the system to a large metropolitan
area of several million potential users will require an
exponential increase in system capacity and complexity.
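As a rough cross-check of these figures, the short sketch below (a back-of-envelope calculation by the writer; the midpoint values chosen are assumptions, not from the panel) converts the quoted aggregates into per-active-console rates.

```python
# Back-of-envelope conversion of the workload estimates quoted above.
# Assumed midpoints: 45,000 consoles, 35 percent active, 82.5 MIPS,
# 17,500 file accesses per second.
consoles = 45_000
active_fraction = 0.35
instructions_per_second = 82.5e6
file_accesses_per_second = 17_500

active = consoles * active_fraction
print(f"{active:,.0f} active consoles")
print(f"{instructions_per_second / active:,.0f} instructions per second per active console")
print(f"{file_accesses_per_second / active:.2f} file accesses per second per active console")
```

On these assumptions each active console would draw on the order of five thousand instructions and about one file access per second.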
No single processor exists currently, or is likely to
exist in the foreseeable future, that can handle a workload of even the prototype system. A very large multiprocessor system will be required in which groups of
processors are devoted to one of the three major tasks
of message concentration (front-end communications
processors), message processing, and data management.
Within each group processors may be either partially
or completely interchangeable. In addition, redundant
equipment will be required in case of severe hardware
failures or system overloads. As the PCIU would be a
large-scale social experiment, strong emphasis on system
measurability is required.
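The division of labor among the three processor groups can be pictured with a toy Python sketch (purely the writer's illustration; the pool sizes and message handling are invented): interchangeable workers in each group drain a queue and hand results to the next group.

```python
# Toy sketch (illustrative only) of the three processor groups named above:
# front-end message concentration, message processing, and data management,
# modelled as pools of interchangeable workers joined by queues.
import queue
import threading

def start_pool(size, inbox, handler):
    """Start `size` interchangeable workers that drain `inbox`."""
    def worker():
        while True:
            handler(inbox.get())
            inbox.task_done()
    for _ in range(size):
        threading.Thread(target=worker, daemon=True).start()

incoming = queue.Queue()
to_process = queue.Queue()
to_store = queue.Queue()

start_pool(2, incoming, lambda m: to_process.put(m.strip()))             # front-end concentrators
start_pool(4, to_process, lambda m: to_store.put(m.upper()))             # message processors
start_pool(2, to_store, lambda m: print("data management stored:", m))   # data managers

for request in ("  pay water bill ", " renew library book "):
    incoming.put(request)
for q in (incoming, to_process, to_store):
    q.join()   # wait until every stage has drained
```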
The PCIU will be predominantly a closed system
with its emphasis on well-defined applications that
manipulate medium- to large-scale data bases. Development of new applications, on-line programming by
users, and updates to various data bases will have to be
rigidly controlled to insure that bread-and-butter applications receive sufficient support, to minimize the effects
of system failures, and to maintain the integrity of the
data bank.
The detailed design of the PCIU must be preceded
by an in-depth analysis in several areas. First, potential
applications must be examined to determine workload
and file requirements, the needs of these applications
in terms of system resources, and the interfaces among
an application, the system, and other applications. This
will provide the first realistic estimate of system size
and complexity. Once this data is available, alternative
software designs can be proposed and examined for the
degree of difficulty inherent in each to provide the
necessarily high level of overall system control.
A third stage of the pre-design process should be a
simulation of feasible alternative designs including the
mechanisms that are proposed to handle the high vol-


ume of processor and file activity that is anticipated.
At the same time, those portions of the system design
that are responsible for insuring that data can be kept
secure and that the effects of system failure can be
minimized should be subjected to a detailed evaluation.
Only when there is sufficient confidence in the preliminary system design, along with quantitative justification for that confidence, should the detailed development of a PCIU be attempted.

Management Prospects and Problems

by BURT NANUS

University of Southern California
Los Angeles, California

There is no doubt that if a community information
utility (CIU) is to be a reality, it will have to integrate
smoothly with all other aspects of urban life. This
represents an enormous challenge because it is clear
that the CIU will produce fundamental and far reaching
changes in the major subsystems of urban life-i.e.,
the economic, political, educational and life support
subsystems-and it is by no means certain that all the
changes will be beneficial.
The only way to realize the many benefits of the
CIU is by the most scrupulous and careful management
of its design, implementation and operation. The CIU
must be managed so that resources are allocated for the
effective achievement of social ends, and these ends
must represent a balancing of the interests of at least
three constituencies, as follows:
1. Society's Objectives-the CIU must be designed
in such a way as to assure equal and fair treatment to all users, to contribute to individual
self-fulfillment and to enhance the awareness
and general welfare of the citizenry. To do this,
it should place a higher priority on public than
private services, should be self supporting in the
long run, and should be operated so as to protect
the privacy and dignity of individuals.
2. User Objectives-the users of the CIU should
be able to expect reasonable and fair prices, high
quality of service, protection for proprietary data
and programs, and a voice in the setting of
standards and priorities.


3. Supplier Objectives-suppliers of CIU equipment and services should receive fair profits,
rewards for technical excellence and social concern, and protection from losses due to the
actions of other suppliers, users or regulatory
agencies.
Given an appropriate set of ends, of which the above
is merely suggestive, it should be possible to explore the
appropriateness of alternative organizational configurations. Many models are worth considering, and they
cover the entire spectrum from a purely public agency
to a privately held corporation. Along this spectrum are
such models as urban entrepreneurship, the non-profit
corporation, the heavily regulated but privately owned
utility, the COMSAT-like consortium of private companies and the government regulatory agency model,
to name a few. One model that appears particularly
attractive at this time is the public authority form
(e.g., the Port of New York Authority) which would
permit the CIU to operate outside the regular structure
of government with relative administrative autonomy,
but within carefully defined limits and a mandate to
act in the public interest.
The legal organization form of the CIU and the
definition of the locus of power for policy making are
only the first of a series of management problems that
will have to be resolved in establishing the CIU. Other
difficult problems are related to determination of acceptable levels of service, pricing structure, competitive
structure, funding methods, government relationships,
research and development policy, regulatory issues,
consumer safeguards and public relations policies. Some
aspects of these problems have been solved successfully
in other contexts, but many are unique to the CIU and
will require extensive experimentation and management
research.

Planning Community Information Utilities

by NORMAN R. NIELSEN

Stanford University
Stanford, California

Other members of the panel have discussed the
various services which a Community Information
Utility (CIU) might provide as well as the various
hardware, software, and communication facilities which


it might employ. The CIU's actual configuration and
mix of services will be determined by a number of interrelated factors stemming from areas such as sociology,
psychology, computer science, economics, political science, communications, and electrical engineering.
Nevertheless, a study of the economic considerations
which underlie the CIU concept can indicate the more
likely paths for development.
What a CIU might "look like" is of major interest. Economic considerations point toward a CIU built around a cable system which would link terminal and computer for both input and output purposes. The home TV set would be the primary output device, with some slow speed (pointer, touch tone pad, keyboard, etc.) mechanism for user input. The "central" computer system of the CIU would likely be merely a message switching computer. It would pass inputs and outputs among and between the users on the cable and the various application or service computing systems as well as control the output video generation for cable users. The application systems would be developed and operated independently, although each would communicate with the central computer via a standard interface.
Although inter-CIU communication could be handled indirectly through each of the application services as appropriate, there are economic advantages to the direct connection of CIUs. The use of a network interface processor in conjunction with the central CIU computer would not only minimize communication resource usage but would also permit efficient use of services available on other CIUs. This latter situation has positive implications for development costs, start-up costs, and the required critical mass of a CIU (see below).
Despite the desirability, it is quite unlikely that the
CIU will offer the full range of services that are frequently talked about. The provision of dynamic video
output (e.g. film clips) to individual users is virtually
prohibited by economic considerations. Voting services
face very tough cost-benefit questions. On the other
hand, some services such as education appear in a much
more favorable light. Despite the independence of the
various applications services, it would appear that the
whole will be greater than the sum of the parts; that is,
the individual applications will tend to become more
valuable as additional applications are added to the
CIU. The CIU will also face a critical mass effect in that a certain level and variety of service must be provided and used before the CIU can develop in a viable fashion.
The decentralized organization of the CIU is likely
to imply a decentralized file structure, even though
each service could talk to any other service. Such a
development favorably affects the privacy and file

integrity issues, since it lessens protection problems
and renders file integration more difficult. By the same
token, however, it hinders the applications, efficiencies,
and other benefits that would be realizable with integrated files.
Another major concern is the likely cost of a CIU.
It would appear that the communication and terminal
systems might run on the order of $25 per user per
month. Estimated usage costs (admittedly very gross)
could easily run to another $25 per terminal per month.
Thus, the widespread use of CIUs could have a big
impact upon consumption patterns as well as upon the
manner in which many businesses are conducted.
This raises the question of how to pay for a CIU.
It is most likely that there would be some combination
of fixed charges and variable charges based upon actual
usage. Many services would likely be subsidized or
otherwise supported by the provider (rather than by
the user). The computer base for the system opens up
a number of possibilities for splitting payments between
users, service operators, program developers, and other
support organizations. The basic economics permit a
wide range of alternatives, so it is likely that noneconomic considerations will have a large impact upon
the final charge structure.
Two economic problems face the development of a
prototype CIU system. First, and most obvious, is the
need for massive funding for software development and
start-up costs. The second problem relates to the impact
of the provision of these funds. Clearly one can't have
users without services nor services without users. Hence,
some type of subsidization or guarantee will likely be
needed to get the prototype started. However, such
support will alter participant behavior, partially obscuring the desired marketing and behaviorial data. A
number of other economic factors combine to indicate
that the development and operation of a prototype
CIU will be a valuable but non-straightforward endeavor.

Information Services

by EDWIN B. PARKER

Stanford University
Stanford, California

Two general classes of services will be required in a
community information utility developed as an exten-


sion of cable television. One class is that provided by
the private sector of the economy and the other is
public sector services. With respect to all services available through the private sector, such as banking, shopping, entertainment, and all business and commercial
services, the information utility should provide a standard information transmission service such that all potential suppliers of service can reach their potential
customers. In other words, the utility should provide
non-discriminatory competitive access to all computer
and other information services without putting itself
in the position of being a monopoly supplier of services.
This implies that the utility specify technical interface,
access and communication standards, but avoid responsibility for the contents of information transmitted
through the utility. Detailed discussion with potential


suppliers of services will be required to establish adequate interface standards.
Special arrangements may have to be made to develop
public sector services, the most important of which are
education and information retrieval services. In the
initial stages of development of education services,
highest priority should be given to the delivery to
homes of pre-school, supplementary and continuing
education services. The greatest potential of the information utility may lie in its promise to provide economical and effective life-long learning. Also important
will be the provision of public access via the utility to
"public" government information that's otherwise difficult to obtain. Online voting and polling services
should be given low priority or deferred because of a
variety of political dangers.

A panel session-Computers and the problems of society
Computers and Urban Problems

by PETER KAMNITZER

University of California
Los Angeles, California

The recent focus of interest on urban problem solutions has progressed from early enthusiasm to the realization of the enormous difficulties awaiting the problem solver. The social and political problems attending problem perception and definition, priority establishment, cost and benefit distribution, information and program control seem to overwhelm the potentially available technological solutions to the manifestations of present urban ills.
Utilization of computer technology in the area of urban problems has grown from extensive data storage and retrieval and data analysis to simulation and modelling on a useful but as yet limited operational scale. Large scale simulation models have been attempted particularly with regard to urban transportation and its impact on land use. Computer graphics and on-line interactive man-machine systems are showing promise as useful aids to planning and decision making. Continuing progress will depend on further urban research; on development of uniform data formats; on faster, cheaper, more reliable and more powerful computers; and on financial and institutional encouragement.
Urban problem patterns on a short range scale will basically follow present physical and social trends resulting in further congestion, pollution, slums, sprawl, etc. On a long range scale they will increasingly be associated with the impact of technological and social forces on changing urban patterns within established as well as totally new concepts of urban life.
Computers can contribute to the amelioration of urban problems predominantly in two major categories: through their effect on a changing urban fabric and through their effect on the process of planning and decision making. They can be used in city building and rebuilding through the utilization of automation, communication and systems control (automated transportation, construction methods, controlled environments, interactive communication, etc.). Planning and decision making can be greatly enhanced by on-line interactive computer methods. The complex, open-ended urban system with its absence of clear goal definitions tends to defy total optimization. In contrast, the man-machine mode permits man's value judgments to become part of the problem solving process itself. An Urban Simulation Laboratory is proposed which would bring together all means of simulation (mathematical modelling, man gaming and perceptual environment simulation) for purposes of research and experimentation with hypothetical solutions to urban problems. Interactive information display would permit queries by researchers and community representatives in an "if-then" mode, and thus would significantly contribute to urban decision making within the context of an informed and participatory society.
Successful implementation of innovation depends on a general climate of positivism, government subsidy of innovation, retraining programs, continuing education, as well as on user participation in the planning and implementation processes. New institutional arrangements between universities, research institutions, government and private industry are suggested to maximize learning, research and problem solving opportunities. The author cautions not to forget the "art" of problem solving in the commendable attempt of creating a rigorously applied "science" for "Computers and the Problems of Society."

The Current Crisis in American Education

by NORTON F. KRISTY

Refocus
Los Angeles, California

Public education in America is in crisis. Its critics
point out that it is not doing its job at all effectively-at
least according to the current expectations of upgrading
the economically and culturally handicapped. The costs
of public education have risen alarmingly in the past

20 years, and are currently out-running the tax base at
all levels of education from primary school to graduate
school. In the early sixties, many educators joined
forces with systems specialists and computer development people in what has turned out to be a romantic
dream. The dream was to some way, some how, combine
concepts of system analysis and computer technology
with principles of programmed learning in a way that
would "revolutionize learning". Those high hopes have
proved to be remarkably short-lived.
Now, in 1971, the conservatives appear to be in
ascendence. There is widespread disillusionment with
the "failure" of educational technology and innovation.
However, the failure of educational technology and
computer applications has largely been the result of
1. Poor administration of research and development monies by a tangled skein of governmental
agencies competitively involved in educational
research.
2. A grossly inadequate funding program for educational research. Those monies which were available tended to be spent on short-term, fragmented research programs. In other words, there
is an almost desperate need to coordinate research, particularly at the Federal level, into one
reasonably well-orchestrated program that will
support risk-taking, and will give a financial
base to promising ideas for an extended period
of time.
3. Premature implementation of educational technology. Implementation of computer applications
to education must be at least as well planned as
the implementation of a major new military
system.

Since we in the educational community are now at a
point of considerable disillusionment concerning the
value of computer technology, the role of the Federal
Government in the past five years needs to be recounted. The Bureau of Research of the U.S. Office of
Education in 1966/67, planned for an accelerating program of investment in computer applications ranging
from administrative applications to computer assisted
instruction. Over the time period 1967-69, less than 20
percent of these planned-for funds were ever in fact
expended. Many programs were initiated and then not
funded. Those that were funded were given much shorter
periods of time than had been initially planned to cover
the proposed work.
In spite of this, considerable technical progress has
been made on the effective application of computers in
education. However, it is the political situation which

will control the future of wide-scale research, development and installation of such applications. The educational enterprise in America is a fragmented instrumentality composed of more than twenty thousand
school districts and almost three thousand institutions
of higher learning. This educational enterprise has no
centralized authority that can promote, support and
press for change. At the same time, it has most of the
limitations of small-scale organizations without much
compensating freedom of action or flexibility of response
to user requirements. Finally, education in America
has now run out of money. It cannot mount a sustained
program of experimentation, development, and implementation dealing with the very technologies and instructional practices that could redeem it.
A highly feasible method of bringing computer technology to important use in public education in the next
decade is a reorganized state/federal program of major
proportion. This program, if it were to be created,
should concentrate first on higher education, for the
cost/effectiveness is greater there.

International Implications: Need for
World Simulation

by JOHN McLEOD

SCi World Simulation

Technological trends are causing profound changes
at all levels and in all sectors of society. Some of these
changes are considered desirable by those affected, some
are not.
Computers are so inseparably intertwined with technology that many of the undesirable, even alarming,
trends are being blamed on computers. Whether or not
this is justified, it seems that if the undesirable trends
are to be checked and the desirable ones reinforced, we
will have to call on computers for help.
The reason computers will be necessary is that problems of society today, stemming from or aggravated
by the "population explosion" and its multiple side-effects, are much too complicated for comprehension
by the unaided human intellect.
In the last analysis, if humanity is to survive, people
must solve the problems of society. But first there must
be understanding. And to acquire understanding, people
must have a tool for keeping track of the myriad pieces


of information, and the dynamic interactions among
them, that contribute to the problems and which must
be taken into account in any proposed solution. If the
interrelationships as well as the facts are properly fed
into a computer, the result will be a computer model of
the system of interest. Experiments can then be designed and run on the model which will impart an
understanding of the real-world situation.
However, all sub-systems of our society are so interrelated, even up to and including nations, that only a
model including all the nations of the world can give us
insight into social problems which transcend national
boundaries-as so many important ones do.
For the foregoing reasons it is urged that work on the
development of a world simulation be officially encouraged and adequately funded by our government.

Computers and National Security

by E. W. PAXSON

The RAND Corporation
Santa Monica, California

The digital computer was spawned by World War II.
Military requirements have continued to pace computer development. Computer technology and weapon
system sophistication have marched in tandem. Neither
has dominated, but current weapon systems, operations,
and management are impossible without the computer.
Almost 90 percent of the Government's current stock
of 5000 plus computers are devoted to defense and
space activities, reversing the civilian use pattern of at
least ten times that number of machines.
Weapons have been developed and fielded in response
to a real external threat. But there has been no true
arms race and one is unlikely. We have relied largely
on advanced technology to generate a posture deterring
the catastrophe of nuclear war. Our lead in computer
science is a major contributing factor in giving us the
technological edge and in contributing to deterrence.
Will we continue to have this edge? Pressures from
the domestic sector, the flyback effect after termination
of our involvement in South East Asia, our hopes for
favorable Strategic Arms Limitations Talks will undoubtedly lead to a decrease in defense funding. Since
our Research and Development system, unlike that of
Russia, is closely coupled to weapon system develop-


ment, there will be a cutback in R&D, including computer sciences, as new systems are cancelled or stretched
out.
Scientific research has relied heavily on Government
funding. Current Congressional attitudes toward basic
research are that it must be directly 'relevant' to military matters. In the USSR, the State takes the opposite attitude.
Are industrial motivations strong enough to fill these
gaps?
Under the concept of strategic sufficiency, President Nixon has asked for options of greater flexibility in deterrence and nuclear war management than the implementation of Assured Destruction, which can imply the death of half of our people in retaliation.
The computer implications are heavy. Surveillance and all operating weapon systems must be tightly linked to produce the required data-base updates to permit finger-tip command and control of a major crisis. Not only are the data processing requirements immense, but there is an imperative need for adaptive, on-line, man-machine intelligence (not artificial intelligence) to explore the 'what if?' and 'what then?' of combat situations in far less than the all-too-short real time available. We still talk to computers and not with them. Machine technology at the LSI and memory levels is well out of balance with the required, and expensive, software.
As military budgets are decreased, basic research and
development should obviously increase in proportion.
But, as usual, I think it will take its share of a slash.
I hope I am wrong.

Ecological Problems

by ROGER WEINBERG

Kansas State University
Manhattan, Kansas

Man has exploited nature during the 19th and 20th centuries. In America, as a result, he has changed many productive Indian systems of life: the Oklahoma plains of the Kiowa, rich in grass and buffalo, into a dust bowl; the Washington salmon streams of the Haida into a sequence of DDT-poisoned reservoirs; the British Columbian Kootenay Lake of the Tlingit, filled with fish, into a receptacle for fertilizers.
By 1971 he had gone further, filling the atmosphere with carbon monoxide and other noxious gases so that
breathing in New York City was equivalent to smoking
a pack and a half of cigarettes a day. He has even
polluted the vast life-giving ocean with death-dealing
poisons. Mercury contaminated swordfish, which became dangerous for humans to eat. DDT slowed the
ocean plants' ability to capture the energy of the sun
in the vital first step of a long food chain leading to man.
As man was developing a technology which created these problems, he was, with the same technology, developing the means for solving them. By 1946 he had built thinking machines (electronic computers), and by 1971 he had used them to model ecosystems, to optimize the results of resource management, and to coordinate the research efforts of teams of individual research workers scattered by distance.
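The ecosystem modeling mentioned above can likewise be sketched compactly. The fragment below uses the classic Lotka-Volterra predator-prey equations as a stand-in, since the text names no particular model, and every parameter value is an illustrative assumption.

def simulate(prey=40.0, pred=9.0, years=30, dt=0.01,
             growth=1.1, predation=0.4, death=0.4, conversion=0.1):
    """Euler-integrate dP/dt = growth*P - predation*P*Q and
    dQ/dt = conversion*P*Q - death*Q, sampling once per simulated year."""
    yearly = []
    for i in range(int(years / dt)):
        dp = (growth * prey - predation * prey * pred) * dt
        dq = (conversion * prey * pred - death * pred) * dt
        prey, pred = max(prey + dp, 0.0), max(pred + dq, 0.0)
        if i % int(1.0 / dt) == 0:
            yearly.append((round(prey, 1), round(pred, 1)))
    return yearly

# Compare the unmanaged system with a hypothetical management policy,
# modeled crudely here as a five percent reduction in prey growth.
print(simulate()[:5])
print(simulate(growth=1.05)[:5])

Running the same model under alternate policies before any policy is implemented is precisely the use listed under item c of the list that follows.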
Future developments such as small, cheap minicomputers can provide the means for gathering weather data at remote Pacific island stations, while powerful parallel-processing computers will be capable of running models of large, complex weather systems.
Along with new computers, a new technique such as
microprogramming will enable a programmer to set up
computer circuits tailored for his particular program,
and a new concept such as the computer utility will enable groups of programmers to communicate with
each other, and to utilize the power of a large central
computer.
These new computers, techniques, and concepts are man's genie. And man, having become Aladdin, will be able to:
a. Improve weather forecasting, and be forewarned against natural disasters.
b. Plan the optimal use of scarce natural resources.
c. Simulate to improve decision making, testing alternate ecological policies in order to choose the best one before it is implemented, thereby avoiding dangerous mistakes.
d. Plan and coordinate measures in pollution control.
e. Build information retrieval systems which make scientific and technical data accessible to interdisciplinary teams studying world-wide environmental systems.
Computers provide man with the power of vision into
alternate future worlds, and the option of choice among
these worlds. What choice he makes is his decision. Whatever his choice, he will live in the heaven, or the hell, that he creates.

AMERICAN FEDERATION OF INFORMATION
PROCESSING SOCIETIES, INC. (AFIPS)
AFIPS OFFICERS and BOARD OF DIRECTORS

President

Vice President

Mr. Keith Uncapher
The RAND Corporation
1700 Main Street
Santa Monica, California 90406

Mr. Walter L. Anderson
General Kinetics, Inc.
11425 Isaac Newton Square, South
Reston, Virginia 22070

Secretary

Treasurer

Dr. Donald Walker
Artificial Intelligence Group
Stanford Research Institute
Menlo Park, California 94025

Dr. Robert W. Rector
University of California
6115 Mathematical Sciences Building
Los Angeles, California 90024

Executive Director
Dr. Bruce Gilchrist
AFIPS
210 Summit Avenue
Montvale, New Jersey 07645

ACM Directors
Mr. Donn B. Parker
Stanford Research Institute
Menlo Park, California 94025

Mr. Walter Carlson
IBM Corporation
Armonk, New York, 10504

Dr. Ward Sangren
The University of California
521 University Hall
2200 University Avenue
Berkeley, California 94720

IEEE Directors
Mr. L. C. Hobbs
Hobbs Associates, Inc.
P.O. Box 686
Corona del Mar, California 92625

Dr. Robert A. Kudlich
Raytheon Co., Equipment Division
Wayland Laboratory
Boston Post Road
Wayland, Massachusetts 01778

Professor Edward J. McCluskey
Stanford University
Department of Electrical Engineering
Palo Alto, California 94305

Simulations Council Director

Association for Computational Linguistics Director

Mr. James E. Wolle
General Electric Company (VFSTC)
Space Division
P.O. Box 8555
Philadelphia, Pa. 19101

Dr. A. Hood Roberts
Center for Applied Linguistics
1717 Massachusetts Avenue, N.W.
Washington, D.C. 20036

American Institute of Aeronautics and
Astronautics Director

American Institute of Certified Public Accountants Director

Dr. Eugene Levin
Culler-Harrison Company
5770 Thornwood Drive
Goleta, California 93017

Mr. Noel Zakin
Computer Technical Services
AICPA-666 Fifth Avenue
New York, New York 10019

American Statistical Association Director

American Society for Information Science Director

Dr. Martin Schatzoff
IBM Cambridge Scientific Center
545 Technology Square
Cambridge, Massachusetts 02130

Mr. Herbert Koller
ASIS
1140 Connecticut Avenue, N.W. Suite 804
Washington, D.C. 20036

Instrument Society of America Director

Society for Industrial and Applied Mathematics Director

Mr. Theodore J. Williams
Purdue Laboratory for Applied Industrial Control
Purdue University
Lafayette, Indiana 47907

Dr. D. L. Thomsen, Jr.
IBM Corporation
Armonk, New York 10504

Society for Information Display Director

Special Libraries Association Director

Mr. William Bethke
RADC (EME, W. Bethke)
Griffiss Air Force Base
Rome, New York 13440

Mr. Burton E. Lamkin
Office of Education-Room 5901
7th and D Streets, S.W.
Washington, D.C. 20202

JOINT COMPUTER CONFERENCE BOARD
President

ACM Representative

Mr. Keith W. Uncapher
The RAND Corporation
1700 Main Street
Santa Monica, California 90406

Mr. Richard B. Blue Sr.
1320 Victoria Avenue
Los Angeles, California 90019

Vice President

IEEE Representative

Mr. Walter L. Anderson
General Kinetics, Incorporated
11425 Isaac Newton Square, South
Reston, Virginia 22070

Dr. Robert A. Kudlich
Raytheon Company, Equipment Division
Wayland Laboratory
Boston Post Road
Wayland, Massachusetts 01778

Treasurer

SCI Representative

Dr. Robert W. Rector
University of California
6115 Mathematical Sciences Building
Los Angeles, California 90024

Mr. John E. Sherman
Lockheed Missiles and Space Co
Org. 19-30, Building 102,
P.O. Box 504
Sunnyvale, California 94088

JOINT COMPUTER CONFERENCE
COMMITTEE

JOINT COMPUTER CONFERENCE TECHNICAL
PROGRAM COMMITTEE

Dr. A. S. Hoagland, Chairman
IBM Research Center
P.O. Box 218
Yorktown Heights, New York 10508

Mr. David R. Brown, Chairman
Stanford Research Institute
333 Ravenswood Avenue
Menlo Park, California 94025

FUTURE JCC CONFERENCE CHAIRMEN
1972 SJCC

1972 FJCC

Mr. Jack E. Bertram
IBM Corporation
P.O. Box 37
Armonk, New York 10504

Dr. Robert Spinrad
Xerox Data Systems
701 South Aviation Blvd.
El Segundo, California 90245

1971 FJCC STEERING COMMITTEE
General Chairman

Ralph R. Wheeler
Lockheed Missiles and Space Company
Vice Chairman

Albert C. Porter
California Public Utilities Commission
Secretary

Joseph M. Crosslin
Control Data Corporation

Exhibits

Jack Miller-Chairman
Ampex Corporation
Clyde Cornwell-Vice Chairman
Ampex Computer Products Division
Special Activities

Norman Kristovich-Chairman
Dept. of Industrial Relations
David Wilkinson-Vice Chairman
Hewlett Packard International Corp.

Treasurer

Corydon Hurtado
Cyberrary International Company
Technical Program

Dr. Martin Y. Silberberg-Chairman
IBM Corporation
Robert Blumenthal-Vice Chairman
IBM Corporation
Local Arrangements

Thomas Bieg-Chairman
IBM Corporation
Kenneth W. Charshaf-Vice Chairman
Bank of America
Registration

Robert Borkenhagen-Chairman
ISI Corporation
Gary Thomasson-Vice Chairman
Trans-A-File Systems Company

Public Relations

Frederick M. Hoar-Chairman
Fairchild Camera & Instrument Corp.
Ronald R. Batiste-Vice Chairman
System Development Corporation

ACM Representative

Thomas E. Murray
Del Monte Corporation
IEEE Computer Society Representative

Terry Ruster
Fairchild Corporation
SCI Representative

J. E. Sherman
Lockheed Missiles and Space Company

Printing and Mailing

Jeffrey D. Stein-Chairman
On-Line Business Systems, Inc.
Eckart Sellinger-Vice Chairman
Bank of America

JCC Committee Liaison

Dr. Morton M. Astrahan
IBM Corporation

REVIEWERS, PANELISTS, AND SESSION CHAIRMEN
SESSION CHAIRMEN
Baran, Paul
Bell, C. Gordon
Bennett, John L.
Blois, Marsden S.
Borko, Harold
Coate, Robert E.
Farber, Dave
Frederickson, A. Anton, Jr.
Gould, Kent
Hamming, Richard
Haynes, Herb
Hisey, Bradner L.

Hoffman, Lance J.
Howard, John
King, Warren
Kuney, Joseph
La Riviere, Dave
Lipton, Harry
Madden, John D.
Mason, Maughan S.
Newton, Carol
Nigh, Max T.
Nilsen, Raymond N.

Ponder, Leonard H.
Purdy, Gerry
Ross, L. W.
Sackman, H.
Schroeder, David L.
Schwetman, Herb
Warlick, Charles
Weiss, Donald H.
Wilkinson, Dave
Williams, Robert
Wormeli, Paul

PANELISTS
Barg, Benjamin
Bekey, G. A.
Blease, Thomas
Boudreau, P. E.
Brandt, Gil
Brooks, F. P.
Burke, Robert
Caine, S. H.
Chien, R. T.
Cohen, N. D.
Courtney, Robert
Cserhalmi, N.
Dalkey, Norman
Davis, Dan
Edwards, D. B. G.
Engelbart, D. C.
Epple, Ken
Everett, R. R.
Foster, J. E.
Frank, A. A.
Frazer, J. W.
Freeman, R. B.
Griswold, R. E.

Harding, P. A.
Hawley, C. L.
Hugo, F. M.
Jeffries, S. B.
Kamnitzer, P.
Karplus, W. J.
Katter, R. V.
Kay, A. C.
Korn, G. A.
Kristy, N. F.
Lalchandani, A.
Lamson, B.
Larson, Dave
Levinthal, C.
McLeod, J.
McClure, R. M.
Maginniss, F. J.
Maley, G. A.
Mallender, Ian
Martin, D. C.
Merritt, M.
Mitchell, Kent
Morris, J. B., Jr.

Morton, M. S.
Morton, N. E.
Nanus, B.
Nathan, R.
Nielson, N. R.
Norberg, G. R.
Parker, E. B.
Paxson, E. W.
Pollard, B. W.
Post, C.
Raub, W.
Rosen, Saul
Ryan, Frank
Sibley, E. H.
Smith, C. L.
Tatum, Liston
Van Brunt, E. E.
Walker, D. E.
Weinberg, R.
Weissman, Clark
Yamamoto, W.
Yarrington, A.

REVIEWERS
Abbott, Robert P.
Acton, Forman S.
Adams, Edward N.
Aiken, Robert M.
Alcorn, Bruce K.
Allen, Roy P.
Amarel, Saul
Anderson, James P.
Anderson, Robert H.
Anderson, Thomas C.
Anzelmo, Frank

Arndt, Fred R.
Arnovick, George N.
Axsom, Larry E.
Badger, George F., Jr.
Ball, N. Addison
Barcelo, Wayne R.
Barlow, Allen E.
Barnes, Ben B.
Barnett, Robert M.
Bartlett, James P.
Bayles, Richard U.

Belady, L. A.
Bell, Thomas E.
Berglass, Gilbert R.
Berning, Paul T.
Bethke, William P.
Beyer, William A.
Black, Donald V.
Bodoia, Morris J.
Bolton, Gordon R.
Borko, Harold
Bratman, Harvey

Bredt, Thomas H.
Bremer, John W.
Brennan, Robert D.
Brown, Ralph R.
Browne, J. C.
Bryan, G. Edward
Burkhard, Walter A.
Caine, Stephen H.
Calhoun, Kenneth J.
Calhoun, Myron A.
Calingaert, Peter
Calvert, Thomas W.
Canaday, R. H.
Cardwell, David W.
Carmon, James L.
Chaitin, Leonard J.
Chandler, John P.
Chapman, R. G., Jr.
Cheydleur, Benjamin F.
Chow, W. F.
Clymer, A. Ben
Cocanower, Alfred B.
Coles, L. Stephen
Collmeyer, Arthur J.
Condon, S. F.
Connors, Michael M.
Constant, Robert N.
Cook, Jeffrey D.
Cooke, Walter F.
Corduan, Alfred E.
Corwin, Barnet C.
Cotton, Ira W.
Coulman, George A.
Cowan, Robert
Critchlow, Arthur J.
Critchlow, Dale L.
Cserhalmi, Nicholas
Csuri, Charles A.
Curtis, Kent K.
Dale, A. G.
Daniel, Walter E., Jr.
Darms, Donald A.
DeJong, S. Peter
Denes, John E.
Denning, Peter J.
Deveber, Jeffrey L.
Dimmler, D. Gerd
Dodd, George G.
Douglas, John R.
Dove, Richard K.
Duffendack, John C.
Duggan, Michael A.
Durney, Arnold I.
Earnest, Lester D.
Eisenstark, Raymond
Ellin, Everett

Estes, Samuel E.
Farmer, Nick A.
Feurzeig, Wallace
Feustel, Edward A.
Firschein, Oscar
Flanagan, J. L.
Foster, John E.
Fox, Margaret R.
Frank, Amalie J.
Fraser, A. G.
Futterweit, Adolf
Gardner, Reed M.
Geyer, James B.
Gilliand, B. E.
Goetz, Martin A.
Gold, Michael M.
Gotterer, Malcolm H.
Greenawalt, Eddie M.
Greenfield, Martin N.
Haberman, Eugene J.
Haibt, Luther H.
Haims, Murray J.
Hammer, Carl
Haney, Frederick M.
Hanlon, A. G.
Hansard, Robert M.
Harding, Philip A.
Harrison, Joseph O., Sr.
Hartwick, R. Dean
Hathaway, Allen W.
Hedrick, George Ellwood, III
Heilweil, Melvin F.
Hermann, Paul J.
Herzog, Bertram
Hinrichs, Joe
Hodes, Louis
Hollander, Gerhard L.
Hodper, Robert L.
Humphrey, Thomas A.
Hyatt, Gilbert P.
Jameson, Wm. J., Jr.
Jeffries, Ronald E.
Jessep, Donald C.
Kain, Richard Y.
Kaitz, Marvin J.
Kalin, Richard B.
Keenan, Thomas A.
Keller, Roy F.
Kahalil, Hatem M.
King, Robert E.
King, Willis K.
Kinney, Edward S.
Klerer, Melvin
Klir, George J.
Knupp, John L., Jr.
Koen, Henry R., Jr.

Koller, Herbert R.
Koory, Jerry L.
Kopf, John O.
Kovach, Ladis D.
Kurtz, Thomas E.
Lambert, Robert J.
Lampson, Butler
Landoll, James R.
Larkin, Robert C.
Leathrum, James F.
Lenahan, John J.
Lett, A. S.
Lewis, William E.
Lindahl, Charles E.
Lindenmeyer, Leonard R.
Linville, Thomas P.
Liu, Ho-Nein
Livdahl, Richard C.
Lomet, David B.
Long, Henry A.
Mason, Maughan S.
McClure, Robert M.
McCoy, Maurice E., Jr.
McFarland, Clay
McKnight, Randy S.
McLeod, John
Machover, Carl
Main, Walter
Malone, Charles M.
Marcotty, Michael
Meadows, H. E.
Miles, E. P., Jr.
Miller, William G.
Moe, Maynard L.
Morrison, James F.
Myers, Robert P.
Nanus, Burt
Nassir, Andrew M.
Neilsen, Norman R.
Notz, William A.
O'Brien, Joseph A.
Paden, Douglas R.
Page, Carl Victor
Parker, Donn B.
Passaretti, Anthony
Pattee, Harold E.
Pearson, Karl M., Jr.
Pomerance, Richard M.
Pounds, Kenneth
Pritchard, J. Paul, Jr.
Rahe, George A.
Ralston, Anthony
Ramamoorthy, C. V.
Remson, Irwin
Rigney, Joseph W.
Rubey, Raymond J.

Ruffing, Linus F.
Sanborn, Jere L.
Schafer, Ronald W.
Schischa, Eywin
Schwenker, J. E.
Sedelow, Sally Yeates
Seed, John C.
Sheldon, Robert C.
Shipman, Jerome S.
Shuey, Richard L.
Slaughter, Barbara G.
Slutz, Donald R.
Smith, Cecil L.
Springe, Fred W.

Starkweather, John A.
Stewart, David H.
Stewart, Robert M.
Sturm, Walter A.
Summit, Roger K.
Tan, Chung-Jen
Uber, Gordon T.
Van Brink, Herbert F.
Van Tassel, Dennie
Vemuri, V.
Vichnevetsky, Robert
Wadia, Aspi B.
Wait, John V.

Walker, P. Duane
Wallace, John B., Jr.
Warheit, I. A.
Weissman, Clark
Wigington, Ronald L.
Wilborn, R. C.
Wilcox, Lyle C.
Wilkov, Robert S.
Willard, Donald A.
Williams, Theodore J.
Wolle, James E.
Wyman, John C.
Yau, Stephen S.

PRELIMINARY LIST OF EXHIBITORS
Addison-Wesley Publishing Company
Addressograph Multigraph Corp.
AFIPS Press
American Telephone & Telegraph Co.
Ampex Corporation
Applied Magnetics Corp. (with Standard Memories)
Auerbach Info, Inc.
Auricord Div.-Scovill Mfg. Co.
Automata Corporation
Bell & Howell, E & IG
The Bendix Corporation
Benwill Publishing Corp.
Boeing Computer Services, Inc.
Bryant Computer Products
Bucode
Bunker Ramo
Caelus Memories
California Computer Products, Inc.
Cambridge Memories, Inc.
Canada: Dept. of Industry, Trade & Commerce
Canberra Industries
Centronics Data Computer Corp.
Century Data Systems, Inc.
Cincinnati Milacron
Cipher Data Products, Inc.
Clasco Systems, Inc.
Codex Corp.
Collins Radio Company
Com Data Corporation
Com-Mark, Inc.
Compucorp (A Div. of Computer Design)
Computer Automation, Inc.
Computer Communications, Inc.
Computer Decisions
Computer Design Publishing Corp.
Computer Intelligence Corp.
Computer Investors Group, Inc.
Computer Terminal Corp.
Computer Transceiver Systems, Inc.
Computerworld
Control Devices Inc.
Courier Terminal Systems, Inc.
Cybercom Corporation
Data Disc, Inc.
Data General Corporation
Datamation

Data Printer Corp.
Datapro Research Corporation
Data Products Corporation
Dataram
Datawest Corporation
Diablo Systems Inc.
A. B. Dick Company
Dicom Industries
Digi-Data Corporation
Digital Computer Controls
Digital Development Corp.
Digital Equipment Corporation
Digitronics Corporation
Eastman Kodak Company
E-H Research Laboratories, Inc.
Electronic News-Fairchild Publications
Electronic Processors, Inc.
Fabri-Tek, Inc. Memory Products Div.
Facit-Odhner, Inc.
Gould Inc., Brush Div.
Grumman Data Systems Corp.
GTE Information Systems Inc.
GTE Lenkurt Inc.
GTE Sylvania
Hewlett-Packard
Hitchcock Publishing
Houston Instrument
IEEE Computer Society
Incoterm Corporation
Inforex, Inc.
Information Control Corporation
Input Output Computer Services, Inc.
Instronics Limited
Interdata, Inc.
International Data Corp.
International Teleprinter Corp.
I/O Devices, Inc.
Itel Corp., Information Storage Systems Div.
Kanematsu-Gosho (U.S.A.) Inc.
Kennedy Company
Keuffel & Esser Company
Kybe Corporation
Licon Div., I.T.W.
Lipps, Inc.
Litton ABS OEM Products
Litton Industries

Lorain Products Corp.
Lundy Electronics & Systems Inc.
3M Company Instrument & Data Products
Magnusonic Devices, Inc.
Marshall Data Systems
Memory Systems, Inc.
Microdata Corporation
Micro Switch, A Div. of Honeywell
Milgo Electronic Corporation (ICC)
Miratel Div.-BBRC
Modern Data Services, Inc.
Mohawk Data Sciences Corp.
Nashua Corporation
NCR
Nortronics Company, Inc.
Numeridex Tape Systems, Inc.
Optical Business Machines, Inc.
Optical Scanning Corporation
Pacific Micronetics, Inc.
Panasonic
Paradyne Corporation
Penril Data Communications, Inc.
Peripheral Data Machines, Inc.
Peripheral Equipment Corporation
Phonocopy, Inc.
Pioneer Magnetics, Inc.
Potter Instrument Company, Inc.
Precision Instrument
Prentice Hall, Inc.
Princeton Electronic Products, Inc.
Quadri Corporation
Raytheon Company
Remex, A Unit of Ex-Cell-O Corp.

Sangamo Electric Company
Signal Galaxies, Inc.
The Singer Company (Librascope Div.)
Singer-Micrographics Systems
Sola Electric
Sorbus, Inc.
Spartan Books
Storage Technology Corporation
The Superior Electric Company
Sycor, Inc.
Sykes Datatronics, Inc.
Tally Corp.
Tandberg of America, Inc.
Tee, Incorporated
Techtran Industries, Inc.
Tektronix, Inc.
Teletype Corp.
Telex/Communications Div.
Thomson-CSF Electron Tubes, Inc.
Timeplex, Inc.
Time Share Peripherals Corp.
Tracor Data Systems
United Telecontrol Electronics, Inc.
Unicomp, Inc.
Van San Corporation
Varian Data Systems
Video Systems Corp.
Wang Computer Products, Inc.
Warner Electric
Western Union Data Services Co.
Western Union Telegraph Company
John Wiley & Sons, Inc.
Xerox Corporation-Xerox Data Systems

AUTHOR INDEX
Adams, M. C., 477
Adelman, A. G., 455
Amiot, L., 31
Aschenbrenner, R. A., 31
Asman, E. Z., 233
Aus, H. M., 379
Austin, J. E., 541
Baca, R. L., 309
Barney, G. O., 631
Bateman, B. L., 89
Bekey, G. A., 401
Bell, C. G., 387
Bennett, J. L., 197
Berg, R. O., 177
Berman, R. A., 369
Boehm, B. W., 669
Boehm, S. C., 309
Boudreau, P. E., 9
Brandt, G., 397
Bravdica, S. A., 225
Brooks, F. P., Jr., 395
Carroll, J. M., 571
Chamberlin, D. D., 263
Chambers, M. G., 309
Chappell, S. G., 651
Chew, P., 233
Clark, R. L., 369
Cleveland, W. B., 213
Cohen, N. D., 670
Coleman, N. L., 65
Covvey, H. D. J., 455
Crawford, P. B., 89
Deland, E. C., 369
Drew, D. D., 89
Edwards, D. B. G., 395
Elliott, W. D., 533
Eppele, L., 397
Epstein, G., 663
Felderhof, C. H., 455
Forman, E. H., 51
Frank, A. A., 357
Frank, A. J., 135
Freeman, R. B., 1
Gack, G., 295
Gallati, R. R. J., 303
Gilmore, P. A., 411
Gottlieb, S. E., 603
Gracon, T. J., 549
Graves, G. W., 123
Groner, G. F., 369
Hansen, M. H., 579
Hansen, W. J., 523

Hinkelman, K. W., 65
Hirschsohn, I., 501
Hodges, J. D., Jr., 281
Hoehenwarter, W. P., 639
Hoffman, L. J., 587
Jackson, R. S., 225
Jen, T. S., 171
Kamman, A. B., 17
Kamnitzer, P., 675
Kay, A., 395
Kennicott, P. R., 423
Klopfenstein, C. E., 435
Kolechta, W. J., 65
Korn, G. A., 379
Kristy, N. F., 675
Laga, E., 477
Lalchandani, A., 398
Lamson, B. G., 195
Langlois, W. E., 97
Learman, I., 469
Levinthal, C., 199
Lifshin, E., 423
Loeber, N. C., 79
McHardy, L., 571
McLeod, J., 676
Maholick, A. W., 1
Martin, D. C., 361
Martin, R., 571
Medak, G. M., 295
Mendler, P., 455
Menninga, L. D., 145
Merritt, M. J., 351
Mesquita, A. L., 27
Mishelevich, D. J., 271
Mitchell, K., 399
Moravec, H., 571
Morey, R., 477
Morton, N. E., 199
Nakamura, G., 57
Nanus, B., 671
Natarajan, N. K., 31
Nathan, R., 200
Newbery, A. C. R., 419
Newell, A., 387
Nielsen, N. R., 671
Nolby, R. A., 549
O'Connor, D. G., 203
Olsen, D. J., 115
Parker, E. B., 672
Patterson, A. C., 575
Paxson, E. W., 677
Peck, P. L., 561

Pendleton, J. C., 491
Perone, S. P., 441
Pingry, D. E., 123
Post, C. T., Jr., 195
Potas, W. A., 533
Prerau, D. S., 153
Pringle, W. L., 309
Purdy, K. G., 399
Raub, W. F., 201
Reich, K. E., 639
Rodriguez, E. J., 71
Ross, L. W., 105
Ryan, F. B., 400
Sammet, J. E., 243
Sanford, J. E., 233
Sansom, F. J., 549
Scavullo, V. P., 423
Schumacker, B., 619
Sheridan, T. B., 327
Sicko, J. S., 423
Sinclair, R., 351

Smith, C. C., 609
Steen, R. F., 9
Strauss, J. C., 39
Tang, C. K., 163
Taylor, K. W., 455
Thurber, K. J., 177
Turoff, M., 317
Umpleby, S., 337
Ung, M. T., 401
Van Brundt, E. E., 196
Van Dam, A., 533
Wegbreit, B., 253
Weinberg, R., 677
Whinston, A., 123
Whisenand, P. M., 295
White, M. S., Jr., 609
Wigle, E. D., 455
Wilkins, C. L., 435
Wood, D. C., 51
Yamamato, W. S., 201

